I am attempting to write a nested loop to find the mean-squared displacement of random walks for a range of step counts. I keep getting an error that reads: "ValueError: could not broadcast input array from shape (0,) into shape (1000,)".
I am a beginner coder so I know this may be trivial for some...
My code:
#%% Initialize variables.
rng = np.random.default_rng()
rand = rng.random
num_steps = 1000
num_walks = 1000
x_step = np.zeros((num_steps, num_walks))
y_step = np.zeros((num_steps, num_walks))
x_final = np.zeros((1, num_walks))
y_final = np.zeros((1, num_walks))
displacement = np.zeros((num_walks, 1))
mean_squared_displacement = np.zeros(10)
#%% Find the mean-squared displacement for a variety of step numbers.
step_variation = np.linspace(0, 10000, 11)
for n in range(np.size(step_variation)-1):
    for m in range(num_walks):
        x_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1) # ERROR APPEARS ON THIS LINE
        y_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1)
        x_final[0,m] = x_step[-1,m]
        y_final[0,m] = y_step[-1,m]
        displacement[m,0] = np.sqrt(x_final[0,m]**2 + y_final[0,m]**2)
    mean_squared_displacement[n] = np.mean(displacement[m,0]**2)
What steps did you take to debug this? Any? Or did you just throw your hands up in despair, not understanding what the error means?
Did you examine the problem line? Test pieces in it?
x_step[:,m] = np.cumsum(2*(rand(int(step_variation[n]))<.5)-1)
The first value of step_variation is 0 (from linspace). rand(0) produces a (0,) shape array. The rest of that expression is thus also (0,) shape.
In [13]: rand(0)
Out[13]: array([], dtype=float64)
x_step is (1000,1000), so x_step[:,m] is (1000,) shape. The error tells us/you that it can't put a (0,) (no values) array into that (1000,) shape slot.
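One way to avoid the shape mismatch is to skip the zero-step case and not store the full paths at all, since only the final positions are needed. The following is a minimal sketch under those assumptions, not the asker's intended design: the function name, the seed, and the choice to start the step counts at 1000 are all illustrative. Because each step moves by ±1 in both x and y, the mean-squared displacement after n steps should come out close to 2n.

```python
import numpy as np


def mean_squared_displacement(step_counts, num_walks, rng):
    """Simulate num_walks 2D walks for each step count and return the MSD of each."""
    msd = np.zeros(len(step_counts))
    for i, n in enumerate(step_counts):
        # One row per step, one column per walk; every entry is -1 or +1.
        x_steps = 2 * (rng.random((n, num_walks)) < 0.5) - 1
        y_steps = 2 * (rng.random((n, num_walks)) < 0.5) - 1
        # The final position of each walk is the sum of its steps.
        x_final = x_steps.sum(axis=0)
        y_final = y_steps.sum(axis=0)
        # Average the squared displacement over all walks.
        msd[i] = np.mean(x_final**2 + y_final**2)
    return msd


rng = np.random.default_rng(0)  # fixed seed (assumption) so runs are repeatable
# Start at 1000 rather than 0 so that no walk has zero steps.
step_counts = np.linspace(1000, 10000, 10).astype(int)
msd = mean_squared_displacement(step_counts, num_walks=1000, rng=rng)
print(msd / step_counts)  # each ratio should be near 2
```

Generating a whole (n, num_walks) block per step count also removes the inner loop over walks.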
Related
I am trying to create a plot with numpy and I am running into a ValueError. I have checked the shapes of the arrays before they enter the loop, and somehow they come out with mismatched shapes. Here is my current code:
sol = np.zeros((partition, partition))
solA = np.zeros((partition, partition))
for i in range(partition):
    sol[:, i] = odeint(SolveMe, ic, timeSpace, args=(zeemanSpace[i],))[:, 1]
    solA[:, i] = (1/(time[-1]))*It.cumtrapz(sol[i:, 1], timeSpace)
With the previous declarations:
partition = 100
time = [ti, tf]
zman = [qi, qf]
zeemanSpace = np.linspace(zman[0], zman[-1], partition)
timeSpace = np.linspace(time[0], time[-1], partition)
I commented out the solA[:, i] line in my for loop and ran this test:
print(sol.shape)
print(solA.shape)
Which produces
runfile('/Users/taylor/Library/Mobile Documents/com~apple~CloudDocs/Documents/Academia/Student Material/OU/Research/DPT of BEC/Work/Numerical/Gen X. & P/Zhang et al. Results/Scratch.py', wdir='/Users/taylor/Library/Mobile Documents/com~apple~CloudDocs/Documents/Academia/Student Material/OU/Research/DPT of BEC/Work/Numerical/Gen X. & P/Zhang et al. Results')
(100, 100)
(100, 100)
When the previous line in the for loop is uncommented, the following error is produced:
ValueError: could not broadcast input array from shape (99,) into shape (100,)
Any help would be greatly appreciated!
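The (99,) shape is consistent with how cumulative trapezoidal integration works: N samples have only N-1 intervals between them, so the running integral has N-1 entries unless a starting value is prepended (SciPy's cumtrapz accepts initial=0 for this). The slice sol[i:, 1] also looks like it was meant to be sol[:, i], although that is a guess about the asker's intent. The sketch below uses a NumPy-only stand-in, a hypothetical helper rather than SciPy's implementation, to show where the missing element goes.

```python
import numpy as np


def cumulative_trapezoid(y, x, initial=None):
    """NumPy-only stand-in for a cumulative trapezoidal integral (illustrative)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    # One trapezoid per interval: N samples give N - 1 areas.
    areas = 0.5 * (y[1:] + y[:-1]) * np.diff(x)
    result = np.cumsum(areas)
    if initial is not None:
        # Prepending a starting value restores the original length N.
        result = np.concatenate(([initial], result))
    return result


x = np.linspace(0.0, 1.0, 100)
y = x**2
short = cumulative_trapezoid(y, x)
full = cumulative_trapezoid(y, x, initial=0.0)
print(short.shape)  # (99,)
print(full.shape)   # (100,)
```

Only the second form can be assigned into a column of a (100, 100) array.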
I have three 1D arrays: T with 100k elements, and f and df with 200 elements each:
T = [T0, T1, ..., T100k]
f = [f0, f1, ..., f200]
df = [df0, df1, ..., df200]
For each combination of elements of f and df, I have to calculate a function such as the following:
P = T*f + T**2 *df
My first instinct was to use the NumPy outer to find the function with each combination of f and df
P1 = np.outer(f,T)
P2 = np.outer(df,T**2)
P = np.add.outer(P1, P2)
However, in this case, I run out of RAM and receive the following error:
Unable to allocate 2.23 PiB for an array with shape (200, 100000, 200, 100000) and data type float64
Is there a good way that I can calculate this?
My attempt using for loops
n=100
f_range = 5e-7
df_range = 1.5e-15
fsrc = np.arange(f - n * f_range, f + n * f_range, f_range) #array of 200
dfsrc = np.arange(df - n * df_range, df + n * df_range, df_range) #array of 200
dfnus=pd.DataFrame(fsrc)
numf=dfnus.shape[0]
dfnudots=pd.DataFrame(dfsrc)
numfdot=dfnudots.shape[0]
test2D = np.zeros([numf,(numfdot)])
for indexf, f in enumerate(fsrc):
    for indexfd, fd in enumerate(dfsrc):
        a=make_phase(T,f,fd) #--> this is just a function that performs T*f + T**2 *df
        zlauf2d=z_n(a, n=1, norm=1) #---> And this is just another function that takes this 1D "a" and gives another 1D element array
        test2D[indexf, indexfd]=np.copy(zlauf2d) #---> I do this so I could make a contour plot at the end. It just copies the same thing to 2D
Now my test2D has the shape (200, 200). This is what I want; however, the for loop is taking ages and I want to somehow reduce the two for loops to at least one.
Using broadcasting:
P1 = (f[:, np.newaxis] * T).sum(axis=-1)
P2 = (df[:, np.newaxis] * T**2).sum(axis=-1)
P = P1[:, np.newaxis] + P2
Alternatively, using outer:
P1 = (np.outer(f, T)).sum(axis=-1)
P2 = (np.outer(df, T**2)).sum(axis=-1)
P = P1[..., np.newaxis] + P2
This produces an array of shape (f.size, df.size) == (200, 200).
Generally speaking, if the final output array size is very large, one can either:
Reduce the size of the datatypes. One way is to change the datatypes of the arrays used to calculate the final output via P1.astype(np.float32). Alternatively, some operations allow one to pass in a dtype=np.float32 as a parameter.
Chunk the computation and work with smaller subsections of the result.
Based on the most recent edit, compute an array a with shape (200, 200, 100000). Then, take its element-wise norm along the last axis to produce an array z with shape (200, 200).
a = (
f[:, np.newaxis, np.newaxis] * T
+ df[np.newaxis, :, np.newaxis] * T**2
)
# L1 norm along last axis.
z = np.abs(a).sum(axis=-1)
This produces an array of shape (f.size, df.size) == (200, 200).
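Note that an intermediate array of shape (200, 200, 100000) in float64 takes about 32 GB, so the chunking idea mentioned above may be needed in practice. The following is a minimal sketch of that approach, assuming the L1 norm stands in for the asker's z_n function (which is not shown); l1_norm_grid and chunk_size are hypothetical names, and the data sizes are deliberately small.

```python
import numpy as np


def l1_norm_grid(T, f, df, chunk_size=4):
    """Return z[i, j] = sum(|f[i]*T + df[j]*T**2|), built chunk_size rows at a time."""
    T = np.asarray(T, dtype=float)
    f = np.asarray(f, dtype=float)
    df = np.asarray(df, dtype=float)
    T_squared = T**2
    z = np.empty((f.size, df.size))
    for start in range(0, f.size, chunk_size):
        stop = min(start + chunk_size, f.size)
        # Shape (stop - start, df.size, T.size): only this slab is held in memory.
        a = f[start:stop, None, None] * T + df[None, :, None] * T_squared
        z[start:stop] = np.abs(a).sum(axis=-1)
    return z


# Small stand-in data (assumed sizes, far below the 100k / 200 / 200 case).
rng = np.random.default_rng(0)
T = rng.random(1000)
f = rng.random(20)
df = rng.random(15)
z = l1_norm_grid(T, f, df, chunk_size=6)
print(z.shape)  # (20, 15)
```

A smaller chunk_size lowers peak memory at the cost of more loop iterations; the result is the same either way.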
I have a for loop with a range of 2000. Inside this loop I have to create an array called Array out of two other arrays: ArrayOfPositionSatellite, with shape (3, 38), and ArrayOfPositionMassPoint, with shape (38, 3, 4412). Array has shape (38, 3, 4412), and PositionOfSatellite and PositionOfMassPoint each have shape (3,). My attempt, which overwrites ArrayOfPositionMassPoint using two for loops:
ArrayOfPositionSatellite = ArrayOfPositionSatellite.T  # now shape (38, 3)
Array = ArrayOfPositionMassPoint  # same object, so the loop overwrites the input in place
for i in range(38):
    for k in range(4412):
        PositionOfSatellite = ArrayOfPositionSatellite[i, :]
        PositionOfMassPoint = ArrayOfPositionMassPoint[i, :, k]
        ElementOfArray = -Gravitationalconstant * (PositionOfSatellite - PositionOfMassPoint) / (np.linalg.norm(PositionOfSatellite - PositionOfMassPoint)**3)
        Array[i, :, k] = ElementOfArray
Problem
My problem is that it takes around 3 hours to run the code and this is too long. Is there some way to make it more time-efficient?
If something is unclear please leave a comment and I will add more details.
You can vectorize your calculations. Like:
import numpy as np
ArrayOfPositionSatellite = np.random.randn(3, 38)
ArrayOfPositionMassPoint = np.random.randn(38, 3, 4412)
Gravitationalconstant = 6.67430e-11
# Difference vector (satellite minus mass point, the same sign as in the question's loop)
v = ArrayOfPositionSatellite.T[:,:,None] - ArrayOfPositionMassPoint
# Cube of the norm of the difference vector
norm = np.linalg.norm(v, axis=1) ** 3
# Difference vector divided by its cubed norm
norm_v = v / norm[:, None, :]
# This is the result
array = norm_v * -Gravitationalconstant
array.shape
>>> (38, 3, 4412)
This takes about 40 ms on my machine, instead of 3 hours.
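A quick way to gain confidence in a vectorized rewrite like this is to compare it against the explicit double loop on a small problem. The sketch below does that with random stand-in data, following the sign convention of the question's loop (satellite minus mass point). The shortened names and the reduced size of 50 mass points are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
G = 6.67430e-11
sat = rng.standard_normal((3, 38))        # stand-in for ArrayOfPositionSatellite
mass = rng.standard_normal((38, 3, 50))   # stand-in for ArrayOfPositionMassPoint

# Reference result from the explicit double loop.
expected = np.empty_like(mass)
for i in range(mass.shape[0]):
    for k in range(mass.shape[2]):
        d = sat[:, i] - mass[i, :, k]
        expected[i, :, k] = -G * d / np.linalg.norm(d) ** 3

# Vectorized version: broadcast (38, 3, 1) against (38, 3, 50).
diff = sat.T[:, :, None] - mass
vectorized = -G * diff / np.linalg.norm(diff, axis=1, keepdims=True) ** 3

# The values are tiny (scaled by G), so compare with a relative tolerance only.
print(np.allclose(vectorized, expected, rtol=1e-10, atol=0.0))
```

Setting atol=0.0 matters here: with the default absolute tolerance, numbers on the order of 1e-11 would compare as equal regardless of whether they actually agree.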
Question
I have a Brownian motion vectorized path given below that I am trying to replicate in python. My problem is that one of the functions, highlighted as the first function in yellow is not working. Instead it is giving me an error (operands could not be broadcast together with shapes (1000,499) (500,1000)) due to the dimensions of the two arrays being added.
How can I recreate the array U using python?
The MATLAB code is from "An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations" and is not my own code.
MATLAB code
Python attempt not working
import numpy as np
import matplotlib.pyplot as plt
from numpy import matlib as mb
# BPATH 3 Function along a Brownian path
T = 1
N = 500
dt = float(T/N)
M = 1000 # ntraj
t = np.arange(dt,T,dt)
## == Mean of 1000 paths == ##
dW = np.sqrt(dt)* np.random.normal(0.0, 1.0, (N,M)) # all general increments
W = np.cumsum(dW, axis=1) # cumsum by column
a = mb.repmat(t,M,1)
U = np.exp(a+0.5*W) # Error is here
Error message
17 W = np.cumsum(dW, axis=1) # cumsum by column
18 a = mb.repmat(t,M,1)
---> 19 U = np.exp(a+0.5*W)
20
21
ValueError: operands could not be broadcast together with shapes (1000,499) (500,1000)
You shouldn't need to import and use mb.repmat. That's used in MATLAB because it doesn't do broadcasting (or at least didn't until recently). To add to the (500,1000) shape W, t needs to be (500,), expanded to (500,1). I'd suggest using linspace for generating t.
Test:
t = np.linspace(dt,T, N) # (500,) shape
...
U = np.exp(t[:,None] + 0.5*W) # add (500,1) with (500,1000), then exponentiate
np.arange doesn't include the end point so your t was only (499,) shaped.
===
The MATLAB code, reproduced in Octave
>> T=1;N=500;dt=T/N;t=[dt:dt:1];
>> M=1000;
>> x=repmat(t,[M 1]);
>> W=randn(M,N);
>> Umean=mean(x+W);
t is (1,500); x and W are (1000,500) size, and thus can add elementwise. Umean is (1,500) - the mean over the 1st dimension.
The numpy code:
In [1]: T = 1
...: N = 500
...: dt = float(T/N)
...: M = 1000 # ntraj
...: t = np.arange(dt,T,dt)
In [2]: t.shape
Out[2]: (499,)
In [3]: t = np.linspace(dt,T,N)
In [4]: t.shape
Out[4]: (500,)
In [6]: W=np.random.normal(0.0, 1.0, (N,M))
In [7]: W.shape
Out[7]: (500, 1000)
In [8]: (t[:,None]+W).shape
Out[8]: (500, 1000)
t[:,None] is (500,1) shape, which broadcasts to (500,1000) without using repmat.
To get the mean over the size 1000 dimension we have to use:
In [9]: np.mean((t[:,None]+W), axis=1).shape
Out[9]: (500,)
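Putting the pieces together, a complete version of the calculation might look like the sketch below. It assumes rows are time steps and columns are paths, so the cumulative sum runs along axis 0, and the seed is arbitrary. Since E[exp(0.5*W(t))] = exp(t/8) for a standard Brownian motion, the mean of U should track exp(9*t/8), which gives a simple sanity check.

```python
import numpy as np

rng = np.random.default_rng(100)  # arbitrary seed (assumption) for repeatability
T, N, M = 1.0, 500, 1000
dt = T / N

t = np.linspace(dt, T, N)                        # (N,), includes the end point
dW = np.sqrt(dt) * rng.standard_normal((N, M))   # rows: time steps, columns: paths
W = np.cumsum(dW, axis=0)                        # accumulate along the time axis
U = np.exp(t[:, None] + 0.5 * W)                 # (N, 1) broadcasts against (N, M)
Umean = U.mean(axis=1)                           # average over the M paths -> (N,)

# Largest deviation of the sample mean from the exact mean exp(9t/8).
averr = np.max(np.abs(Umean - np.exp(9 * t / 8)))
print(U.shape, Umean.shape)
print(averr)
```

With M = 1000 paths, averr should be small compared with the values of Umean, which grow from about 1 to about 3 over the interval.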
In the following code snippet I intend to do the following:
(1) Multiply each element of the identity by the d optimization variable.
(2) Add a vector of ones to a CVXPY affine expression, which is also a vector of 24 elements.
(3) Create a constraint which compares two vectors element-wise.
import numpy as np
import cvxpy as cp
weights = cp.Variable(5)
d = cp.Variable(1)
meas = np.random.rand(8, 3)
det = np.random.rand(24, 5)
dm = d * np.eye(3) # (1)
beh = np.ones([24, 1]) + cp.reshape((dm @ meas.T).T, [24, 1]) # (2)
constrs = [beh == det @ weights] #(3)
My questions are:
Q1: Did I code what I wanted?
Q2: At (2), I get the following error:
/usr/lib/python3.8/site-packages/cvxpy/utilities/shape.py in sum_shapes(shapes)
45 # Only allow broadcasting for 0D arrays or summation of scalars.
46 if shape != t and len(squeezed(shape)) != 0 and len(squeezed(t)) != 0:
---> 47 raise ValueError(
48 "Cannot broadcast dimensions " +
49 len(shapes)*" %s" % tuple(shapes))
ValueError: Cannot broadcast dimensions (24, 1) [24, 1]
What exactly does this mean, and how do I fix it?
Q3: When I do det @ weights, at (3), I get an Expression(AFFINE, UNKNOWN, (24,)). In the constraint, I'll compare it with beh, which I'm guessing will be an Expression(AFFINE, UNKNOWN, (24, 1)). Will this comparison also bring an issue?
When I started using cvxpy, I also had some trouble making dimensions fit. In my experience, it is a good idea to use arrays with as few dimensions as possible. So if you have a 2-dimensional array where 1 of the dimensions only has length 1, see if you can reduce the dimension. (see below)
The problem in (2) is solved by passing the shape to cp.reshape as a tuple, (24, 1), instead of a list, like this:
beh = np.ones([24, 1]) + cp.reshape((dm @ meas.T).T, (24, 1)) # (2)
You could also avoid your problem by simply doing:
beh = 1 + cp.reshape((dm @ meas.T).T, (24, 1)) # (2)
which will do the same: add 1 to each entry of the cvxpy array.
After this is done, you will have a problem with your final line: "ValueError: Cannot broadcast dimensions (24, 1) (24,)"
This can be remedied by giving beh the shape (24,) as well (reducing the dimension solves the problem here, as mentioned before). The full working code would be:
import numpy as np
import cvxpy as cp
weights = cp.Variable(5)
d = cp.Variable(1)
meas = np.random.rand(8, 3)
det = np.random.rand(24, 5)
dm = d * np.eye(3) # (1)
beh = 1 + cp.reshape((dm @ meas.T).T, (24, )) # (2)
constrs = [beh == det @ weights] #(3)
Hope this helps!