In the following code snippet I intend to do the following:
(1) Multiply each element of the identity by the d optimization variable.
(2) Add a vector of ones to a CVXPY affine expression, which is also a vector of 24 elements.
(3) Create a constraint which compares two vectors element-wise.
import numpy as np
import cvxpy as cp
weights = cp.Variable(5)
d = cp.Variable(1)
meas = np.random.rand(8, 3)
det = np.random.rand(24, 5)
dm = d * np.eye(3) # (1)
beh = np.ones([24, 1]) + cp.reshape((dm @ meas.T).T, [24, 1]) # (2)
constrs = [beh == det @ weights] # (3)
My questions are:
Q1: Did I code what I wanted?
Q2: At (2), I get the following error:
/usr/lib/python3.8/site-packages/cvxpy/utilities/shape.py in sum_shapes(shapes)
45 # Only allow broadcasting for 0D arrays or summation of scalars.
46 if shape != t and len(squeezed(shape)) != 0 and len(squeezed(t)) != 0:
---> 47 raise ValueError(
48 "Cannot broadcast dimensions " +
49 len(shapes)*" %s" % tuple(shapes))
ValueError: Cannot broadcast dimensions (24, 1) [24, 1]
What exactly does this mean, and how do I fix it?
Q3: When I do det @ weights, at (3), I get an Expression(AFFINE, UNKNOWN, (24,)). In the constraint, I'll compare it with beh, which I'm guessing will be an Expression(AFFINE, UNKNOWN, (24, 1)). Will this comparison also cause an issue?
When I started using cvxpy, I also had some trouble making dimensions fit. In my experience, it is a good idea to use arrays with as few dimensions as possible, so if you have a 2-dimensional array where one of the dimensions has length 1, see if you can drop that dimension (see below).
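For example, in plain numpy (the same idea carries over to cvxpy expressions):
import numpy as np

a = np.ones((24, 1))     # 2-D column vector
b = a.ravel()            # (24,) -- same values, one dimension fewer
print(a.shape, b.shape)  # (24, 1) (24,)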
The problem in (2) is solved by changing the brackets you use when reshaping the cvxpy expression: pass the target shape as the tuple (24, 1) rather than a list, like this:
beh = np.ones([24, 1]) + cp.reshape((dm @ meas.T).T, (24, 1)) # (2)
You could also avoid your problem by simply doing:
beh = 1 + cp.reshape((dm @ meas.T).T, (24, 1)) # (2)
which will do the same: add 1 to each entry of the cvxpy array.
After this is done, you will have a problem with your final line: "ValueError: Cannot broadcast dimensions (24, 1) (24,)"
This can be remedied by making beh of dimension (24,) too (reducing the dimension solves your problem here, as mentioned before). The full working code would be:
import numpy as np
import cvxpy as cp
weights = cp.Variable(5)
d = cp.Variable(1)
meas = np.random.rand(8, 3)
det = np.random.rand(24, 5)
dm = d * np.eye(3) # (1)
beh = 1 + cp.reshape((dm @ meas.T).T, (24,)) # (2)
constrs = [beh == det @ weights] # (3)
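If you want to double-check that the dimensions line up before solving, you can print the shapes of the expressions above (expected values in the comments):
print(dm.shape)               # (3, 3)
print(beh.shape)              # (24,)
print((det @ weights).shape)  # (24,)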
Hope this helps!
I was reading about attention and came across this equation:
import einops
from fancy_einsum import einsum
import torch
x = torch.rand((200, 10, 768))
y = torch.rand((20, 768, 64))
res = einsum("batch query_pos d_model, n_heads d_model d_head -> batch query_pos n_heads d_head", x, y)
And I am not able to understand the underlying operations that give the result res
I thought it might be matmul and tried this:
import torch
x_ = x.unsqueeze(dim = 2).unsqueeze(dim = 2)
y_ = torch.broadcast_to(y, (1, 1, 20, 768, 64))
res2 = x_ @ y_
res2 = res2.squeeze(dim = -2)
(res == res2).all() # Prints False
But that does not seem to be right.
Any help regarding this is greatly appreciated
Whenever you use einsum, it is best to think about the meaning of the dimensions. Basically, we perform a multiplication between the two inputs here. The signature passed to einsum shows which dimensions will be preserved and which ones will be "summed away". I simplified the signature with single letters here:
res = einsum("b q m, n m h -> b q n h", x, y)
We can read from this that both x and y have three dimensions. Furthermore, both have a dimension called m, which doesn't appear in the output, so we can conclude that it gets "summed away". For each entry of the output we then have the following formula (for simplicity I reused the dimension names as indices), so for every b, q, n, h we get:
res[b,q,n,h] = sum_m x[b,q,m] * y[n,m,h]
To do this with any function other than einsum is usually more cumbersome. First we need to reorder and unsqueeze the dimensions so that they are compatible to be multiplied, so we can do the following (shapes annotated in the comments):
# (b, q, m, n, h) = (b, q, m, 1, 1) * (m, n, h)
product = x[:, :, :, None, None] * y.permute([1, 0, 2])
Due to the broadcasting rules, the second (y-) term will implicitly get the required leading dummy dimensions.
Then we can "sum away" the dimension m:
res = product.sum(dim=2) # (b,q,n,h)
So you can interpret it as a matrix multiplication if you want, or just as a scalar product, but of course with many "batch" dimensions.
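As a sanity check, here is a small sketch comparing the two approaches. Note that bitwise equality like (res == res2).all() can print False even when both results are correct, because the 768 products are accumulated in a different order; comparing with a tolerance is the right test:
import torch
from fancy_einsum import einsum

x = torch.rand((200, 10, 768))
y = torch.rand((20, 768, 64))

res = einsum("batch query_pos d_model, n_heads d_model d_head -> "
             "batch query_pos n_heads d_head", x, y)

# broadcast-multiply, then sum away the shared d_model dimension
product = x[:, :, :, None, None] * y.permute([1, 0, 2])  # (b, q, m, n, h)
res2 = product.sum(dim=2)                                # (b, q, n, h)

print(torch.allclose(res, res2, atol=1e-4))  # True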
I am attempting to do a nested loop in order to find the mean-squared for a variety of different sized distributions. I keep getting an error that reads: "ValueError: could not broadcast input array from shape (0,) into shape (1000,)".
I am a beginner coder so I know this may be trivial for some...
My code:
#%% Initialize variables.
import numpy as np

rng = np.random.default_rng()
rand = rng.random
num_steps = 1000
num_walks = 1000
x_step = np.zeros((num_steps, num_walks))
y_step = np.zeros((num_steps, num_walks))
x_final = np.zeros((1, num_walks))
y_final = np.zeros((1, num_walks))
displacement = np.zeros((num_walks, 1))
mean_squared_displacement = np.zeros(10)
#%% Find the mean-squared displacement for a variety of step numbers.
step_variation = np.linspace(0, 10000, 11)
for n in range(np.size(step_variation)-1):
    for m in range(num_walks):
        x_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1) # ERROR APPEARS ON THIS LINE
        y_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1)
        x_final[0,m] = x_step[-1,m]
        y_final[0,m] = y_step[-1,m]
        displacement[m,0] = np.sqrt(x_final[0,m]**2 + y_final[0,m]**2)
    mean_squared_displacement[n] = np.mean(displacement[m,0]**2)
What steps did you take to debug this? Any? Or did you just throw your hands up in despair, not understanding what the error means? Did you examine the problem line? Test pieces of it?
x_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1)
The first value of step_variation is 0 (from linspace). rand(0) produces a (0,) shape array. The rest of that expression is thus also (0,) shape.
In [13]: rand(0)
Out[13]: array([], dtype=float64)
x_step is (1000,1000), so x_step[:,m] is (1000,) shape. The error tells us/you that it can't put a (0,) (no values) array into that (1000,) shape slot.
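One way out, as a minimal sketch (assuming the goal is the mean-squared displacement over all walks for each step count): start `step_variation` at 1000 so `rand(0)` never happens, and note that the final position is just the sum of the ±1 steps, so the big pre-allocated (1000, 1000) arrays aren't needed at all:
import numpy as np

rng = np.random.default_rng()
num_walks = 1000
step_variation = np.linspace(1000, 10000, 10)  # start at 1000, avoiding rand(0)

mean_squared_displacement = np.zeros(step_variation.size)
for n, steps in enumerate(step_variation.astype(int)):
    # +/-1 steps for all walks at once; the final position is the column sum
    x_final = (2 * (rng.random((steps, num_walks)) < .5) - 1).sum(axis=0)
    y_final = (2 * (rng.random((steps, num_walks)) < .5) - 1).sum(axis=0)
    mean_squared_displacement[n] = np.mean(x_final**2 + y_final**2)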
I have a for loop with a range of 2000. In this for loop I have to create an array called Array out of two other arrays: ArrayOfPositionSatellite, with a size of (3, 38), and ArrayOfPositionMassPoint, with a size of (38, 3, 4412). The size of Array is (38, 3, 4412), and the sizes of PositionOfSatellite and PositionOfMassPoint are (3,). My attempt to overwrite ArrayOfPositionMassPoint with two for-loops:
ArrayOfPositionSatellite = ArrayOfPositionSatellite.T
Array = ArrayOfPositionMassPoint

for i in range(38):
    for k in range(4412):
        PositionOfSatellite = ArrayOfPositionSatellite[:,i]
        PositionOfMassPoint = ArrayOfPositionMassPoint[i,:,k]
        ElementOfArray = -Gravitationalconstant * (PositionOfSatellite - PositionOfMassPoint) / (np.linalg.norm(PositionOfSatellite - PositionOfMassPoint)**3)
        Array[i,:,k] = ElementOfArray
Problem
My problem is that it takes around 3 hours to run the code and this is too long. Is there some way to make it more time-efficient?
If something is unclear please leave a comment and I will add more details.
You can vectorize your calculations. Like:
import numpy as np
ArrayOfPositionSatellite = np.random.randn(3, 38)
ArrayOfPositionMassPoint = np.random.randn(38, 3, 4412)
Gravitationalconstant = 6.67430e-11
# Difference vector (satellite minus mass point, matching the loop)
v = ArrayOfPositionSatellite.T[:, :, None] - ArrayOfPositionMassPoint
# Cubed norm of the difference vector
norm = np.linalg.norm(v, axis=1) ** 3
# Difference vector divided by the cubed norm
norm_v = v / norm[:, None, :]
# This is the result
array = norm_v * -Gravitationalconstant

print(array.shape)  # (38, 3, 4412)
This takes around 40 ms on my machine, instead of 3 hours.
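To convince yourself the vectorized version reproduces the loop, you can spot-check a single (i, k) pair against the original formula:
# spot-check one entry against the loop's formula
i, k = 5, 100
sat = ArrayOfPositionSatellite[:, i]
mass = ArrayOfPositionMassPoint[i, :, k]
element = -Gravitationalconstant * (sat - mass) / np.linalg.norm(sat - mass) ** 3
print(np.allclose(array[i, :, k], element))  # True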
Question
I have a Brownian motion vectorized path given below that I am trying to replicate in Python. My problem is that one of the lines (the first highlighted line in the MATLAB code) is not working. Instead it gives me an error (operands could not be broadcast together with shapes (1000,499) (500,1000)) due to the dimensions of the two arrays being added.
How can I recreate the array U using python?
The MATLAB code is from An Algorithmic Introduction to Numerical Simulation of stochastic Differential Equations and is not my own code.
MATLAB code (posted as an image; reproduced in Octave in the answer below)
Python attempt, not working:
import numpy as np
import matplotlib.pyplot as plt
from numpy import matlib as mb
# BPATH 3 Function along a Brownian path
T = 1
N = 500
dt = float(T/N)
M = 1000 # ntraj
t = np.arange(dt,T,dt)
## == Mean of 1000 paths == ##
dW = np.sqrt(dt)* np.random.normal(0.0, 1.0, (N,M)) # all general increments
W = np.cumsum(dW, axis=1) # cumsum by column
a = mb.repmat(t,M,1)
U = np.exp(a+0.5*W) # Error is here
Error message
17 W = np.cumsum(dW, axis=1) # cumsum by column
18 a = mb.repmat(t,M,1)
---> 19 U = np.exp(a+0.5*W)
20
21
ValueError: operands could not be broadcast together with shapes (1000,499) (500,1000)
You shouldn't need to import and use mb.repmat. That's used in MATLAB because it doesn't do broadcasting (or at least didn't until recently). To add to the (500,1000) shape W, t needs to be (500,), expanded to (500,1). I'd suggest using linspace for generating t.
Test:
t = np.linspace(dt,T, N) # (500,) shape
...
U = t[:,None] + 0.5*W # add (500,1) with (500,1000)
np.arange doesn't include the end point so your t was only (499,) shaped.
===
The MATLAB code, reproduced in Octave
>> T=1;N=500;dt=T/N;t=[dt:dt:1];
>> M=1000;
>> x=repmat(t,[M 1]);
>> W=randn(M,N);
>> Umean=mean(x+W);
t is (1,500); x and W are (1000,500) size, and thus can add elementwise. Umean is (1,500) - the mean over the 1st dimension.
The numpy code:
In [1]: T = 1
...: N = 500
...: dt = float(T/N)
...: M = 1000 # ntraj
...: t = np.arange(dt,T,dt)
In [2]: t.shape
Out[2]: (499,)
In [3]: t = np.linspace(dt,T,N)
In [4]: t.shape
Out[4]: (500,)
In [6]: W=np.random.normal(0.0, 1.0, (N,M))
In [7]: W.shape
Out[7]: (500, 1000)
In [8]: (t[:,None]+W).shape
Out[8]: (500, 1000)
t[:,None] is (500,1) shape, which broadcasts to (500,1000) without using repmat.
To get the mean over the size 1000 dimension we have to use:
In [9]: np.mean((t[:,None]+W), axis=1).shape
Out[9]: (500,)
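Putting the pieces together, a minimal corrected version of the Python attempt (assuming each column is one path, so the cumulative sum runs along axis 0):
import numpy as np

T, N, M = 1, 500, 1000
dt = T / N
t = np.linspace(dt, T, N)           # (500,) -- includes the endpoint, unlike arange

dW = np.sqrt(dt) * np.random.normal(0.0, 1.0, (N, M))
W = np.cumsum(dW, axis=0)           # cumulative sum along time, one path per column
U = np.exp(t[:, None] + 0.5 * W)    # (500, 1) broadcasts against (500, 1000)
Umean = U.mean(axis=1)              # (500,) -- mean over the 1000 paths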
The easiest thing might be for me to just post the numpy code that I'm trying to perform directly in Theano, if that's possible:
import numpy as np
from theano import shared

tensor = shared(np.random.randn(7, 16, 16)).eval()
tensor2 = tensor[0,:,:]
tensor2[tensor2 < 1] = 0.0
tensor2[tensor2 > 0] = 1.0
new_tensor = [tensor2]
for i in range(1, tensor.shape[0]):
    new_tensor.append(np.multiply(tensor2, tensor[i,:,:]))
output = np.array(new_tensor).reshape(7,16,16)
If it's not immediately obvious, what I'm trying to do is use the values from one matrix of a tensor made up of 7 different matrices and apply that to the other matrices in the tensor.
Really, the problem I'm solving is doing conditional statements in an objective function for a fully convolutional network in Keras. Basically, the loss for some of the feature map values is going to be calculated (and subsequently weighted) differently from others, depending on some of the values in one of the feature maps.
You can easily implement conditionals with the switch statement.
Here would be the equivalent code:
import theano
from theano import tensor as T
import numpy as np
def _check_new(var):
    shape = var.shape[0]
    t_1, t_2 = T.split(var, [1, shape-1], 2, axis=0)
    ones = T.ones_like(t_1)
    cond = T.gt(t_1, ones)
    mask = T.repeat(cond, t_2.shape[0], axis=0)
    out = T.switch(mask, t_2, T.zeros_like(t_2))
    output = T.join(0, cond, out)
    return output

def _check_old(var):
    tensor = var.eval()
    tensor2 = tensor[0,:,:]
    tensor2[tensor2 < 1] = 0.0
    tensor2[tensor2 > 0] = 1.0
    new_tensor = [tensor2]
    for i in range(1, tensor.shape[0]):
        new_tensor.append(np.multiply(tensor2, tensor[i,:,:]))
    output = theano.shared(np.array(new_tensor).reshape(7,16,16))
    return output
tensor = theano.shared(np.random.randn(7, 16, 16))
out1 = _check_new(tensor).eval()
out2 = _check_old(tensor).eval()
print(out1)
print('----------------')
print(((out1 - out2) ** 2).mean())
Note: since you're masking on the first filter, I needed to use the split and join operations.
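For reference, T.switch(cond, a, b) selects elementwise from a where cond is true and from b otherwise; a tiny standalone sketch:
import numpy as np
import theano
from theano import tensor as T

x = T.vector('x')
# keep positive entries, zero out the rest
f = theano.function([x], T.switch(T.gt(x, 0), x, T.zeros_like(x)))
print(f(np.array([-1.0, 2.0, -3.0], dtype=theano.config.floatX)))  # [ 0.  2.  0.]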