Python: Repetition of 2-d random walk simulation

I'm simulating a 2-dimensional random walk, with direction 0 < θ < 2π and T=1000 steps. I already have:
a = np.zeros((1000, 2), dtype=float)  # np.float was removed from recent NumPy; plain float is equivalent
def randwalk(x, y):
    theta = 2*math.pi*rd.rand()  # Theta is a random angle between 0 and 2pi
    x += math.cos(theta)  # Since spatial unit = 1
    y += math.sin(theta)  # Since spatial unit = 1
    return (x, y)
x, y = 0., 0.
for i in range(1000):
    x, y = randwalk(x, y)
    a[i,:] = x, y
This generates a single walk, and stores all intermediate coordinates in the numpy array a. How can I edit my code to repeat the walk 12 times (using a new random seed every time) and then save each run in a separate text file? Do I need a while loop within my randwalk function?
Guess:
rwalkrepeat = []
for _ in range(12):
    a = np.zeros((1000, 2), dtype=float)
    x, y = 0., 0.
    for i in range(1000):
        x, y = randwalk(x, y)
        a[i,:] = x, y
    rwalkrepeat.append(a)
print(rwalkrepeat)

You don't need any explicit loops. The entire solution can be vectorized (untested):
nsteps = 1000
nwalks = 12
theta = 2 * np.pi * np.random.rand(nwalks, nsteps - 1)
xy = np.dstack((np.cos(theta), np.sin(theta)))
a = np.hstack((np.zeros((nwalks, 1, 2)), np.cumsum(xy, axis=1)))
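To also cover the seeding and file-saving part of the question, the same cumulative-sum idea can be applied one walk at a time. This is only a sketch: simulate_walk, the walk_%i.txt file names and the temporary output directory are illustrative assumptions, and it uses NumPy's newer np.random.default_rng generator instead of np.random.rand.

```python
import os
import tempfile

import numpy as np


def simulate_walk(nsteps, seed):
    """Return an (nsteps, 2) array of positions, starting at the origin."""
    rng = np.random.default_rng(seed)            # independent, reproducible generator
    theta = 2 * np.pi * rng.random(nsteps - 1)   # one random angle per unit step
    steps = np.column_stack((np.cos(theta), np.sin(theta)))
    # Prepend the origin, then accumulate the unit steps into positions.
    return np.vstack((np.zeros((1, 2)), np.cumsum(steps, axis=0)))


out_dir = tempfile.mkdtemp()   # assumption: replace with your own output directory
for j in range(12):
    walk = simulate_walk(1000, seed=j)           # a different seed for every run
    np.savetxt(os.path.join(out_dir, "walk_%i.txt" % j), walk)
```

Because each run gets its own seed, any single file can be regenerated later without rerunning the others.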

An approach to this which sticks with the general form of your code is:
import numpy as np
import matplotlib.pyplot as plt
import random as rd
import math
a = np.zeros((1000, 2), dtype=float)
def randwalk(x, y):
    theta = 2*math.pi*rd.random()
    x += math.cos(theta)
    y += math.sin(theta)
    return (x, y)
fn_base = "my_random_walk_%i.txt"
for j in range(12):
    rd.seed(j)  # one fixed seed per run, so each file is repeatable
    x, y = 0., 0.
    for i in range(1000):
        x, y = randwalk(x, y)
        a[i,:] = x, y
    fn = fn_base % j
    np.savetxt(fn, a)
For the basic calculation, panda-34's and NPE's answers are also good, and take advantage of numpy's vectorization.
Here I used seed(j) to explicitly set the seed for the random numbers. The advantage of this is that each result will be repeatable as long as the seed is the same, even if, say, the runs are not done in sequence, or you change the array length, etc. This isn't necessary if you don't want repeatable runs, though: without an explicit seed, random seeds itself from the system (for example from the current time), and the random numbers in all runs will be different.
Explanation for file names: since OP requested saving each of multiple runs to different files, I thought it would be good to have numbered files, eg, here my_random_walk_0.txt, my_random_walk_1.txt, etc. In my example I used the name fn_base as a variable to hold the general format of the filename, so that, say, the code fn = fn_base % 17 would set fn equal to my_random_walk_17.txt (this is a bit old school for python, read about "string formatting" in python for more).
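For reference, here is a small sketch of the same file name built with the three common Python string-formatting styles. The variable names are only illustrative.

```python
j = 17

old_style = "my_random_walk_%i.txt" % j          # printf-style, as used above
with_format = "my_random_walk_{}.txt".format(j)  # str.format
with_fstring = f"my_random_walk_{j}.txt"         # f-string (Python 3.6+)

print(old_style)
```

All three produce the same string, so the choice is mostly a matter of taste and Python version.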

If you use numpy, why aren't you using numpy?
I'd do it this way:
n_moves = 1000
a = np.zeros((n_moves, 2))
for i in range(12):
    thetas = (2*np.pi) * np.random.rand(n_moves-1)
    a[1:,0] = np.cos(thetas)
    a[1:,1] = np.sin(thetas)
    a = np.add.accumulate(a, 0)

Related

ask for help for a sum (sigma) function

I need help to calculate this:
The total number of y values is equal to the number of x values, and each y is calculated from one x and several a.
My code is listed below; it gives the correct results for a0. What is a simpler way to calculate this? A different version could also help verify the results.
Thanks a lot.
import numpy as np
import matplotlib.pyplot as plt
a = np.array([1,2,3,4],float) # here we can give several a
b = np.asarray(list(enumerate(a)))
x = np.linspace(0.0,1.0,10)
y1 = []
for r in x:
    y1.append(np.exp(np.sum((1-r)**2*a*((2*b[:,0]+1)*r-1+r)*(r-1+r)**(b[:,0]-1))))
y1 = np.asarray(y1)

You can write almost literally the same in numpy:
def f(x, a):
    x, a = np.asanyarray(x), np.asanyarray(a)
    x = x[:, None]  # create new dimension to sum along
    i = np.arange(len(a))  # create counter
    return np.sum((1-x)**2 * a * ((2*i + 1) * x - (1-x)) * (x - (1-x))**(i-1), axis=-1)
As a side note: there are obvious algebraic simplifications you could take advantage of.
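One way to check the vectorized version against the question's loop is sketched below. It assumes the exponential from the loop is applied to the sum afterwards (f itself returns only the sum), and f_simplified is a hypothetical name for one possible simplification, using x - (1-x) = 2x - 1 and (2i+1)x - (1-x) = 2(i+1)x - 1.

```python
import numpy as np


def f(x, a):
    """Vectorized sum from the answer above (without the exponential)."""
    x, a = np.asanyarray(x), np.asanyarray(a)
    x = x[:, None]
    i = np.arange(len(a))
    return np.sum((1-x)**2 * a * ((2*i + 1) * x - (1-x)) * (x - (1-x))**(i-1), axis=-1)


def f_simplified(x, a):
    """Same sum after collecting terms (hypothetical helper)."""
    x, a = np.asanyarray(x)[:, None], np.asanyarray(a)
    i = np.arange(len(a))
    return np.sum((1-x)**2 * a * (2*(i + 1)*x - 1) * (2*x - 1)**(i-1), axis=-1)


a = np.array([1, 2, 3, 4], float)
x = np.linspace(0.0, 1.0, 10)
i = np.arange(len(a))

# Reference values: the loop from the question, written as a comprehension.
y_loop = np.array([np.exp(np.sum((1-r)**2 * a * ((2*i + 1)*r - 1 + r) * (r - 1 + r)**(i-1)))
                   for r in x])
y_vec = np.exp(f(x, a))
y_simple = np.exp(f_simplified(x, a))

print(np.allclose(y_loop, y_vec), np.allclose(y_loop, y_simple))
```

Note that the i = 0 term contains (2x - 1)**(-1), so any x equal to 0.5 would divide by zero. The grid used here does not contain that value.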

Solving an ordinary differential equation on a fixed grid (preferably in python)

I have a differential equation of the form
dy(x)/dx = f(y,x)
that I would like to solve for y.
I have an array xs containing all of the values of x for which I need ys.
For only those values of x, I can evaluate f(y,x) for any y.
How can I solve for ys, preferably in python?
MWE
import numpy as np
# these are the only x values that are legal
xs = np.array([0.15, 0.383, 0.99, 1.0001])
# some made up function --- I don't actually have an analytic form like this
def f(y, x):
    if not np.any(np.isclose(x, xs)):
        return np.nan
    return np.sin(y + x**2)
# now I want to know which array of ys satisfies dy(x)/dx = f(y,x)

Assuming you can use something simple like Forward Euler...
Numerical solutions will rely on approximate solutions at previous times. So if you want a solution at t = 1 it is likely you will need the approximate solution at t<1.
My advice is to figure out what step size will allow you to hit the times you need, and then find the approximate solution on an interval containing those times.
import numpy as np
# from your example, the smallest step size required to hit all points would be 0.0001
a = 0  # start point
b = 1.5  # possible end point
h = 0.0001
n = int(round((b - a) / h)) + 1  # number of grid points, so that the spacing is exactly h
y = np.zeros(n)
t = np.linspace(a, b, n)
y[0] = 0.1  # initial condition here
for i in range(1, n):
    y[i] = y[i-1] + h*f(y[i-1], t[i-1])  # argument order f(y, x), as in the question
Alternatively, you could use an adaptive step method (which I am not prepared to explain right now) to take larger steps between the times you need.
Or, you could find an approximate solution over an interval using a coarser mesh and interpolate the solution.
Any of these should work.
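If f really can only be evaluated at the listed x values, a cruder variant is Forward Euler with a variable step, stepping directly from one allowed x to the next. The sketch below is only an illustration: euler_on_grid is a hypothetical helper, the initial condition y0 = 0.1 is assumed to hold at the first grid point, and with steps this large the accuracy will be poor.

```python
import numpy as np

# these are the only x values that are legal (from the question)
xs = np.array([0.15, 0.383, 0.99, 1.0001])


def f(y, x):
    """Right-hand side from the question; NaN away from the allowed grid."""
    if not np.any(np.isclose(x, xs)):
        return np.nan
    return np.sin(y + x**2)


def euler_on_grid(f, xs, y0):
    """Forward Euler using the grid spacing itself as the (variable) step."""
    ys = np.empty(len(xs))
    ys[0] = y0                       # assumed: y0 is the value at xs[0]
    for i in range(1, len(xs)):
        h = xs[i] - xs[i-1]          # step to the next allowed x
        ys[i] = ys[i-1] + h * f(ys[i-1], xs[i-1])
    return ys


ys = euler_on_grid(f, xs, y0=0.1)
print(ys)
```

Since f is only ever called at grid points, it never returns NaN here. The price is that the error is controlled by the largest gap in xs.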

I think you should first solve ODE on a regular grid, and then interpolate solution on your fixed grid. The approximate code for your problem
import numpy as np
from scipy.integrate import odeint
from scipy import interpolate
xs = np.array([0.15, 0.383, 0.99, 1.0001])
# dy/dx = f(x,y)
def dy_dx(y, x):
    return np.sin(y + x ** 2)
y0 = 0.0 # init condition
x = np.linspace(0, 10, 200)# here you can control an accuracy
sol = odeint(dy_dx, y0, x)
f = interpolate.interp1d(x, np.ravel(sol))
ys = f(xs)
But dy_dx(y, x) should always return something reasonable (not np.nan), because odeint also evaluates it at x values between the grid points.
Here is the drawing for this case

Specify the shift for numpy.correlate

I wonder whether it is possible to specify the shift, expressed by the variable k, for the cross-correlation of two 1D arrays. With the numpy.correlate function and its mode parameter set to 'full', I get cross-correlation coefficients for every shift k over the whole length of the array (assuming that both arrays are the same size). Let me show what I mean with the example below:
import numpy as np
# Define signal 1.
signal_1 = np.array([1, 2 ,3])
# Define signal 2.
signal_2 = np.array([1, 2, 3])
# Other definitions.
Xi = signal_1
Yi = signal_2
N = np.size(Xi)
k = 3
Xs = np.average(Xi)
Ys = np.average(Yi)
# Cross-covariance coefficient function.
def crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = False):
    autoCov = 0
    for i in np.arange(0, N-k):
        autoCov += ((Xi[i+k])-Xs)*(Yi[i]-Ys)
    if forCorrelation == True:
        return autoCov/N
    else:
        return (1/(N-1))*autoCov
# Expected value function.
def E(X, P):
    expectedValue = 0
    for i in np.arange(0, np.size(X)):
        expectedValue += X[i] * (P[i] / np.size(X))
    return expectedValue
# Cross-correlation coefficient function.
def crossCorrelation(Xi, Yi, k):
    # Calculate the covariance coefficient.
    cov = crossCovariance(Xi, Yi, N, k, Xs, Ys, forCorrelation = True)
    # Calculate standard deviations.
    EX = E(Xi, np.ones(np.size(Xi)))
    SDX = (E((Xi - EX) ** 2, np.ones(np.size(Xi)))) ** (1/2)
    EY = E(Yi, np.ones(np.size(Yi)))
    SDY = (E((Yi - EY) ** 2, np.ones(np.size(Yi)))) ** (1/2)
    # Calculate correlation coefficient.
    return cov / (SDX * SDY)
# Express cross-covariance or cross-correlation function in a form of a 1D vector.
def array(k, norm = True):
    # If norm = True, return array of autocorrelation coefficients.
    # If norm = False, return array of autocovariance coefficients.
    vector = np.array([])
    shifts = np.abs(np.arange(-k, k+1, 1))
    for i in shifts:
        if norm == True:
            vector = np.append(crossCorrelation(Xi, Yi, i), vector)
        else:
            vector = np.append(crossCovariance(Xi, Yi, N, i, Xs, Ys), vector)
    return vector
In my example, calling array(k, norm = True) for different values of k gives the results shown below:
k = 3, [ 0. -0.5 0. 1. 0. -0.5 0. ]
k = 2, [-0.5 0. 1. 0. -0.5]
k = 1, [ 0. 1. 0.]
k = 0, [ 1.]
My approach is fine for learning purposes, but I need to move to native numpy functions in order to speed up my analysis. How can one specify the shift k when using the native numpy.correlate function? PS: the k parameter specifies the "time" shift between the two arrays. Thank you in advance.

Whilst I'm not aware of any built-in function for computing the cross-correlation for a particular range of signal lags, you can speed your version up a lot by vectorization, i.e. performing operations on arrays rather than single elements in an array.
This version uses only a single Python loop over the lags:
import numpy as np
def xcorr(x, y, k, normalize=True):
    n = x.shape[0]
    # initialize the output array
    out = np.empty((2 * k) + 1, dtype=np.double)
    lags = np.arange(-k, k + 1)
    # pre-compute E(x), E(y)
    mu_x = x.mean()
    mu_y = y.mean()
    # loop over lags
    for ii, lag in enumerate(lags):
        # use slice indexing to get 'shifted' views of the two input signals
        if lag < 0:
            xi = x[:lag]
            yi = y[-lag:]
        elif lag > 0:
            xi = x[:-lag]
            yi = y[lag:]
        else:
            xi = x
            yi = y
        # x - mu_x; y - mu_y
        xdiff = xi - mu_x
        ydiff = yi - mu_y
        # E[(x - mu_x) * (y - mu_y)]
        out[ii] = xdiff.dot(ydiff) / n
        # NB: xdiff.dot(ydiff) == (xdiff * ydiff).sum()
    if normalize:
        # E[(x - mu_x) * (y - mu_y)] / (sigma_x * sigma_y)
        out /= np.std(x) * np.std(y)
    return lags, out
Some more general points of advice:
As I mentioned in the comments, you should try to give your functions names that are informative, and that aren't likely to conflict with other things in your namespace (e.g. array vs np.array).
It's much better to make your functions self-contained. In your version, N, k, Xs and Ys are defined outside the main function. In this situation you might accidentally modify or overwrite one of these variables, and it can get tricky to debug errors caused by this sort of thing.
Appending to numpy arrays (e.g. using np.append or np.concatenate) is slow, so avoid it whenever you can. If, as in this case, you know the size of the output ahead of time, it's much faster to pre-allocate the output array (e.g. using np.empty or np.zeros), then fill in the elements. If you absolutely have to do concatenation, it's often faster to append to a normal Python list, then convert it to a numpy array at the end.
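If computing every lag is affordable, another option is to let np.correlate do the work in 'full' mode and then slice out only the lags from -k to k. The following is a sketch under a few assumptions: xcorr_lags is a hypothetical name, both signals must have the same length, k must be smaller than that length, and the lag sign convention is the one np.correlate uses (worth double-checking on an asymmetric example).

```python
import numpy as np


def xcorr_lags(x, y, k):
    """Normalized cross-correlation for lags -k..k via np.correlate."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.size
    if y.size != n or not 0 <= k < n:
        raise ValueError("need equal-length signals and 0 <= k < len(x)")
    # 'full' mode returns 2n - 1 values; zero lag sits at index n - 1.
    full = np.correlate(x - x.mean(), y - y.mean(), mode="full")
    window = full[n - 1 - k : n + k]             # keep lags -k..k only
    lags = np.arange(-k, k + 1)
    return lags, window / (n * x.std() * y.std())


lags, coeffs = xcorr_lags([1, 2, 3], [1, 2, 3], k=2)
print(lags, coeffs)
```

For the signals in the question this gives the same coefficients as the k = 2 row listed there.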

It's available by specifying maxlags:
import matplotlib.pyplot as plt
xcorr = plt.xcorr(signal_1, signal_2, maxlags=1)
Documentation can be found here. This implementation is based on np.correlate.

Storing intermediate values in a numpy array

I'm trying to simulate a 2-d random walk, with direction 0 < θ < 2π and T=1000 steps.
a=np.zeros((1000,1000))
def randwalk(x, y):
    theta = 2*math.pi*rd.rand()
    x += math.cos(theta)
    y += math.sin(theta)
    return (x, y)
How can I store all the intermediate coordinates in a? I was initially trying something of the form:
for i in range(1000):
    for j in range(1000):
        a[i,j] = randwalk(x,y)
But this doesn't seem to work at all.

The main obvious problem is that you want a 2D array of 1000 points, not a 1000x1000 array. For example, you say you want to take 1000 steps, but your nested loop takes 1,000,000.
import numpy as np
import matplotlib.pyplot as plt
import random as rd
import math
a = np.zeros((1000, 2), dtype=float)
def randwalk(x, y):
    theta = 2*math.pi*rd.random()
    x += math.cos(theta)
    y += math.sin(theta)
    return (x, y)
x, y = 0., 0.
for i in range(1000):
    x, y = randwalk(x, y)
    a[i,:] = x, y
plt.figure()
plt.plot(a[:,0], a[:,1])
plt.show()

You probably want something like
T = 1000
a = [(0,0)] * T
for i in range(1, len(a)):
    a[i] = randwalk(*a[i - 1])
No need for numpy here.

You've got a type error. randwalk is returning a 2-tuple, and you're trying to set an array element where a float is expected.
First of all, you don't want a 1000 by 1000 array. This would give a million data points, and you only need 2000. I think what you want is something like this:
xs = np.zeros((1000))
ys = np.zeros((1000))
x = 0
y = 0
for i in range(1000):
    xs[i], ys[i] = randwalk()
Also, you should change the definition of randwalk to take no parameters, and make x and y global variables:
def randwalk():
    global x, y
As you have it, you're modifying the values of the parameters, but they aren't accumulated from call to call.

Strange behaviour with Gaussian random distribution

I'm running a bit of code whose purpose is to take a list/array of floats and an associated list/array of the same length acting as an "error" and shuffle the first list around according to a Gaussian distribution.
This is a MWE of the code:
import random
import numpy as np
def random_data(N, a, b):
    # Generate some random data.
    return np.random.uniform(a, b, N).tolist()
# Obtain values for x.
x = random_data(100, 0., 1.)
# Obtain error/sigma values for x.
x_sigma = random_data(100, 0., 0.2)
# Generate new x values shuffling each float around a
# Gaussian distribution with a given sigma.
x_gauss = random.gauss(np.array(x), np.array(x_sigma))
print(x - x_gauss)
What I find is that the result of doing x-x_gauss is a list of floats that is always either positive or negative. This means the random.gauss call is always assigning either a larger new value for each float in x or a smaller one for all values in x.
I would expect the random.gauss call to shuffle the floats in x around its values both to the right and to the left, since this is a random process.
Why is this not happening? Am I understanding something wrong about the process?

This is the definition of random.gauss:
def gauss(self, mu, sigma):
    random = self.random
    z = self.gauss_next
    self.gauss_next = None
    if z is None:
        x2pi = random() * TWOPI
        g2rad = _sqrt(-2.0 * _log(1.0 - random()))
        z = _cos(x2pi) * g2rad
        self.gauss_next = _sin(x2pi) * g2rad
    return mu + z*sigma
Notice that it generates a single value for z and returns mu + z*sigma.
Since mu and sigma are numpy arrays, this calculation is done element-wise. Since sigma is positive, the shift z*sigma is either always positive or always negative, depending on the sign of z.
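A small sketch makes the single draw visible. The array values and the seed are arbitrary, and it relies on random.gauss ending in mu + z*sigma as in the source shown above: dividing the shifts by sigma recovers the same z for every element.

```python
import random

import numpy as np

random.seed(1)                         # arbitrary seed, only for repeatability
x = np.array([0.1, 0.5, 0.9])
sigma = np.array([0.05, 0.10, 0.20])

shifted = random.gauss(x, sigma)       # one scalar z is drawn and broadcast over the arrays
z_per_element = (shifted - x) / sigma  # recover the z used for each element

print(z_per_element)                   # all entries are the same number
```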
If you are using NumPy, unless there is a specific reason to do otherwise, I would use the np.random module to generate these values. It would be quicker than using a Python loop with calls to random.gauss:
import numpy as np
N = 100
x = np.random.uniform(0., 1., size=N)
x_sigma = np.random.uniform(0., 0.2, size=N)
z = np.random.normal(0, 1, size=N)
x_gauss = x + z*x_sigma
print(x - x_gauss)
