Simulating an elementary stochastic process in Python - python

I'm trying to simulate a simple stochastic process in Python, but with no success. The process is the following:
x(t + δt) = r(t) * x(t)
where r(t) is a Bernoulli random variable that can assume the values 1.5 or 0.6.
I've tried the following:
n = 10
r = np.zeros( (1,n))
for i in range(0, n, 1):
    if r[1,i] == r[1,0]:
        r[1,i] = 1
    else:
        B = bernoulli.rvs(0.5, size=1)
        if B == 0:
            r[1,i] = r[1,i-1] * 0.6
        else:
            r[1,i] = r[1,i-1] * 1.5
Can you explain why this is wrong and a possible solution?

First, the process should be viewed as evolving over time, so you need to consider the time discretization rather than just giving the number of steps through n.
Essentially, what you are asking for is a simple multiplicative random walk driven by a Bernoulli random variable taking the values 0.6 and 1.5, instead of a Gaussian (standard normal) random variable.
So I have created an answer here, using NumPy to generate the Bernoulli random variable for efficiency (for single draws it has less call overhead than scipy.stats), then running the simulation with a step size of 0.01 and plotting the solution using matplotlib.
One thing to note is that this process is one-dimensional, so we can just store the state and time in separate vectors and plot them at the end.
# Function generating the Bernoulli trial (your r(t))
# (needs numpy imported as np)
def get_bernoulli(p=0.5):
    '''
    Function using numpy (faster than scipy.stats)
    to generate a Bernoulli random variable with values 0.6 or 1.5
    '''
    B = np.random.binomial(1, p, 1)
    if B == 0:
        return 0.6
    else:
        return 1.5
This is then used in the simulation as
import numpy as np
import matplotlib.pyplot as plt
dt = 0.01 #step size
x0 = 1# initialize
tfinal = 1
sqrtdt = np.sqrt(dt)
n = int(tfinal/dt)
# State and time vectors
xtraj = np.zeros(n+1, float)
trange = np.linspace(start=0,stop=tfinal ,num=n+1)
# initialized
xtraj[0] = x0
for i in range(n):
    xtraj[i+1] = xtraj[i] * get_bernoulli(p=0.5)
plt.plot(trange,xtraj,label=r'$x(t)$')
plt.xlabel("time")
plt.ylabel(r"$X$")
plt.legend()
plt.show()
Here we assumed the Bernoulli trial is fair (p = 0.5), but p can be changed to add some more variation.
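As for why the original attempt fails: r is created with shape (1, n), so the only valid row index is 0, and r[1, i] raises an IndexError. The recursion itself can also be written without an explicit loop. The following is a minimal sketch, assuming a fair trial and an initial value of 1; the names rng, r and x are chosen for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded generator so the run is repeatable
n = 100                          # number of time steps
x0 = 1.0                         # assumed initial value

# Draw all multipliers r(t) at once: 0.6 or 1.5, each with probability 0.5
r = rng.choice([0.6, 1.5], size=n, p=[0.5, 0.5])

# x(t + dt) = r(t) * x(t)  =>  x_k = x0 * r_0 * r_1 * ... * r_{k-1}
x = x0 * np.concatenate(([1.0], np.cumprod(r)))

print(x[:5])
```

The cumulative product replaces the step-by-step update, which tends to matter once many trajectories are simulated.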

Related

How can I optimize this code in python? For solving stochastic differential equations

I am developing a code that uses a method called Platen to solve stochastic differential equations. Then I must solve that stochastic differential equation many times (on the order of 10,000 times) to average all the results. My code is:
import numpy as np
import random
import numba
#numba.jit(nopython=True)
def integrador2(y,t,h): #this is the integrator of the function that solves the SDE
    m = 6.6551079E-26 #parameters
    gamma=0.05
    T = 5E-3
    k_b = 1.3806488E-23
    b=np.sqrt(2*m*gamma*T*k_b)
    c=np.sqrt(h)
    for i in range(len(t)):
        dW=c*random.gauss(0,1)
        A=np.array([y[i,-1]/m,-gamma*y[i,-1]]) #this is the platen method that is applied at
        B_dW=np.array([0,b*dW]) #each time step
        z=y[i]+A*h+B_dW
        Az=np.array([z[-1]/m,-gamma*z[-1]])
        y[i+1]=y[i]+1/2*(Az+A)*h+B_dW
    return y
def media(args): #args is a tuple with the parameters
    y = args[0]
    t = args[1]
    k = args[2]
    x=0
    p=0
    for n in range(k): #k=number of trajectories
        y=integrador2(y,t,h)
        x=(1./(n+1))*(n*x+y[:,0]) #I do the average like this so as not to have to save all the
        p=(1./(n+1))*(n*p+y[:,1]) #solutions in memory
    return x,p
The variables y, t and h are:
y0 = np.array([initial position, initial moment]) #initial conditions
t = np.linspace(initial time, final time, number of time intervals) #time array
y = np.zeros((len(t)+1,len(y0))) #array of positions and moments
y[0,:]=np.array(y0) #I keep the initial condition
h = (final time-initial time)/(number of time intervals) #time increment
I need to be able to run the program for a number of time intervals of 10 ** 7 and solve it 10 ** 4 times (k = 10 ** 4).
I feel that I have reached a dead end: I already accelerate the function that calculates the result with Numba, and (although I do not show it here) I parallelize the "media" function across the four cores of my computer. Even so, my program takes an hour and a half to run for 10 ** 6 time intervals and k = 10 ** 4. I have not dared to run it for 10 ** 7 time intervals because I expect it would take more than 10 hours.
I would really appreciate it if someone could advise me on how to make some parts of the code faster.
Finally, I apologize if I have not expressed myself clearly in any part of the question; I am a physicist, not a computer scientist, and my English is far from perfect.
I can save about 75% of compute time by simplifying the math in the loop:
def integrador2(y,t,h): #this is the integrator of the function that solves the SDE
    m = 6.6551079E-26 #parameters
    gamma=0.05
    T = 5E-3
    k_b = 1.3806488E-23
    b=np.sqrt(2*m*gamma*T*k_b)
    c=np.sqrt(h)
    h = h * 1.
    coeff0 = h/m - gamma*h**2/(2.*m)
    coeff1 = (1. - gamma*h + gamma**2*h**2/2.)
    coeffd = c*b*(1. - gamma*h/2.)
    coeffx = c*b*h/(2.*m) #noise that the Platen corrector also adds to the position
    for i in range(len(t)):
        dW=np.random.normal()
        # Method 2
        y[i+1] = np.array([y[i][0] + y[i][1]*coeff0 + dW*coeffx, y[i][1]*coeff1 + dW*coeffd])
    return y
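The collapsed coefficients come from substituting the predictor z into the corrector of the question's scheme. The sketch below checks that algebra for a single step and a single noise draw. The parameter values are illustrative (not the physical ones from the question), and platen_step, collapsed_step and coeffx are names introduced here, where coeffx is the noise contribution to the position that the corrector produces:

```python
import numpy as np

def platen_step(y, h, xi, m, gamma, b):
    """One step written as in the question: predictor z, then corrector."""
    dW = np.sqrt(h) * xi
    A = np.array([y[1] / m, -gamma * y[1]])
    B_dW = np.array([0.0, b * dW])
    z = y + A * h + B_dW
    Az = np.array([z[1] / m, -gamma * z[1]])
    return y + 0.5 * (Az + A) * h + B_dW

def collapsed_step(y, h, xi, m, gamma, b):
    """The same step with predictor and corrector merged into constants."""
    c = np.sqrt(h)
    coeff0 = h / m - gamma * h**2 / (2.0 * m)
    coeff1 = 1.0 - gamma * h + gamma**2 * h**2 / 2.0
    coeffd = c * b * (1.0 - gamma * h / 2.0)
    coeffx = c * b * h / (2.0 * m)
    return np.array([y[0] + y[1] * coeff0 + xi * coeffx,
                     y[1] * coeff1 + xi * coeffd])

# Illustrative values only
m, gamma, b, h = 2.0, 0.05, 0.3, 0.01
y = np.array([1.0, -0.7])   # [position, momentum]
xi = 0.4                    # one standard normal draw, fixed for the comparison

same = np.allclose(platen_step(y, h, xi, m, gamma, b),
                   collapsed_step(y, h, xi, m, gamma, b),
                   rtol=1e-12, atol=0.0)
print(same)
```

Because the constants are computed once outside the loop, each step reduces to a handful of multiplications and additions.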
Here's a method using filters with scipy, which I don't think is compatible with Numba, but is slightly faster than the solution above:
from scipy import signal
# #numba.jit(nopython=True)
def integrador2(y,t,h): #this is the integrator of the function that solves the SDE
    m = 6.6551079E-26 #parameters
    gamma=0.05
    T = 5E-3
    k_b = 1.3806488E-23
    b=np.sqrt(2*m*gamma*T*k_b)
    c=np.sqrt(h)
    h = h * 1.
    coeff0a = 1.
    coeff0b = h/m - gamma*h**2/(2.*m)
    coeff1 = (1. - gamma*h + gamma**2*h**2/2.)
    coeffd = c*b*(1. - gamma*h/2.)
    coeffx = c*b*h/(2.*m) #noise that the Platen corrector also adds to the position
    xi = np.random.normal(0.,1.,y.shape[0]-1) #one standard normal draw per step
    noise = np.zeros(y.shape[0])
    noise[1:] = coeffd*xi
    noise[0] = y[0,1]
    a = [1, -coeff1]
    b = [1]
    y[1:,1] = signal.lfilter(b,a,noise)[1:]
    # the position update uses the momentum at the start of each step
    # and starts from the initial position y[0,0]
    a = [1, -coeff0a]
    b = [1]
    y[1:,0] = y[0,0] + signal.lfilter(b,a,coeff0b*y[:-1,1] + coeffx*xi)
    return y
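When replacing a loop with a filter, it is easy to shift the recursion by one step, so it helps to compare both forms on the same noise draws. The following sketch does that under illustrative parameter values; coefficients, loop_path and filter_path are hypothetical helper names, and the position is accumulated with np.cumsum, which is equivalent to a filter with a pole at 1:

```python
import numpy as np
from scipy import signal

def coefficients(h, m, gamma, b):
    """Constants of the collapsed Platen step (coeffx: position noise term)."""
    c = np.sqrt(h)
    coeff0 = h / m - gamma * h**2 / (2.0 * m)
    coeff1 = 1.0 - gamma * h + gamma**2 * h**2 / 2.0
    coeffd = c * b * (1.0 - gamma * h / 2.0)
    coeffx = c * b * h / (2.0 * m)
    return coeff0, coeff1, coeffd, coeffx

def loop_path(y0, xi, h, m, gamma, b):
    """Reference: explicit step-by-step recursion."""
    coeff0, coeff1, coeffd, coeffx = coefficients(h, m, gamma, b)
    y = np.zeros((len(xi) + 1, 2))
    y[0] = y0
    for i in range(len(xi)):
        y[i + 1, 0] = y[i, 0] + coeff0 * y[i, 1] + coeffx * xi[i]
        y[i + 1, 1] = coeff1 * y[i, 1] + coeffd * xi[i]
    return y

def filter_path(y0, xi, h, m, gamma, b):
    """Same recursion: first-order filter for momentum, cumulative sum for position."""
    coeff0, coeff1, coeffd, coeffx = coefficients(h, m, gamma, b)
    y = np.zeros((len(xi) + 1, 2))
    # p[k] = coeff1 * p[k-1] + drive[k], with drive[0] carrying the initial momentum
    drive = np.concatenate(([y0[1]], coeffd * xi))
    y[:, 1] = signal.lfilter([1.0], [1.0, -coeff1], drive)
    # x[k+1] = x[k] + coeff0 * p[k] + coeffx * xi[k]
    y[0, 0] = y0[0]
    y[1:, 0] = y0[0] + np.cumsum(coeff0 * y[:-1, 1] + coeffx * xi)
    return y

rng = np.random.default_rng(1)
xi = rng.standard_normal(200)
args = (np.array([0.5, -0.7]), xi, 0.01, 2.0, 0.05, 0.3)  # y0, xi, h, m, gamma, b
paths_match = np.allclose(loop_path(*args), filter_path(*args), rtol=1e-9, atol=1e-12)
print(paths_match)
```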

efficient sampling from beta-binomial distribution in python

For a stochastic simulation I need to draw a lot of random numbers that follow a beta-binomial distribution.
At the moment I have implemented it this way (using Python):
import scipy as scp
from scipy.stats import rv_discrete
class beta_binomial(rv_discrete):
    """
    creating betabinomial distribution by defining its pmf
    """
    def _pmf(self, k, a, b, n):
        return scp.special.binom(n,k)*scp.special.beta(k+a,n-k+b)/scp.special.beta(a,b)
so sampling a random number x can be done by:
betabinomial = beta_binomial(name="betabinomial")
x = betabinomial.rvs(0.5,0.5,3) # with some parameter
The problem is that sampling one random number takes about 0.5 ms, which in my case dominates the whole simulation time. The limiting factor is the evaluation of the beta functions (or the gamma functions within them).
Does anyone have a good idea for how to speed up the sampling?
Well, here is working and lightly tested code that seems to be faster, using the compound-distribution property of the beta-binomial.
We sample p from a beta distribution and then use it as the success probability of a binomial. If you sample large vectors at once, it is even faster.
import numpy as np
def sample_Beta_Binomial(a, b, n, size=None):
    p = np.random.beta(a, b, size=size)
    r = np.random.binomial(n, p)
    return r
np.random.seed(777777)
q = sample_Beta_Binomial(0.5, 0.5, 3, size=10)
print(q)
Output is
[3 1 3 2 0 0 0 3 0 3]
Quick test
np.random.seed(777777)
n = 10
a = 2.
b = 2.
N = 100000
q = sample_Beta_Binomial(a, b, n, size=N)
h = np.zeros(n+1, dtype=np.float64) # histogram
for v in q: # fill it
    h[v] += 1.0
h /= np.float64(N) # normalization
print(h)
prints histogram
[0.03752 0.07096 0.09314 0.1114 0.12286 0.12569 0.12254 0.1127 0.09548 0.06967 0.03804]
which is quite similar to green graph in the Wiki page on Beta-Binomial
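Rather than comparing against a graph by eye, the sampled frequencies can be checked against the exact pmf. This is a sketch that assumes SciPy 1.4 or newer, where scipy.stats.betabinom is available; the function name sample_beta_binomial and the seed are illustrative:

```python
import numpy as np
from scipy.stats import betabinom

def sample_beta_binomial(a, b, n, size=None, rng=None):
    """Draw p ~ Beta(a, b), then k ~ Binomial(n, p)."""
    rng = np.random.default_rng() if rng is None else rng
    p = rng.beta(a, b, size=size)
    return rng.binomial(n, p)

rng = np.random.default_rng(12345)
n, a, b, N = 10, 2.0, 2.0, 100_000

q = sample_beta_binomial(a, b, n, size=N, rng=rng)
freq = np.bincount(q, minlength=n + 1) / N        # empirical frequencies of 0..n
pmf = betabinom.pmf(np.arange(n + 1), n, a, b)    # exact beta-binomial pmf

max_abs_err = np.max(np.abs(freq - pmf))
print(max_abs_err)
```

With 100,000 draws the largest deviation should be on the order of the sampling noise, roughly 1e-3 for these parameters.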

Scipy Minimize Not Working

I'm running the minimization below:
from scipy.optimize import minimize
import numpy as np
import math
import matplotlib.pyplot as plt
### objective function ###
def Rlzd_Vol1(w1, S):
    L = len(S) - 1
    m = len(S[0])
    # Compute log returns, size (L, m)
    LR = np.array([np.diff(np.log(S[:,j])) for j in range(m)]).T
    # Compute weighted returns
    w = np.array([w1, 1.0 - w1])
    R = np.array([np.sum(w*LR[i,:]) for i in range(L)]) # size L
    # Compute Realized Vol.
    vol = np.std(R) * math.sqrt(260)
    return vol
# stock prices
S = np.exp(np.random.normal(size=(50,2)))
### optimization ###
obj_fun = lambda w1: Rlzd_Vol1(w1, S)
w1_0 = 0.1
res = minimize(obj_fun, w1_0)
print(res)
### Plot objective function ###
fig_obj = plt.figure()
ax_obj = fig_obj.add_subplot(111)
n = 100
w1 = np.linspace(0.0, 1.0, n)
y_obj = np.zeros(n)
for i in range(n):
    y_obj[i] = obj_fun(w1[i])
ax_obj.plot(w1, y_obj)
plt.show()
The objective function shows an obvious minimum (it's quadratic):
But the minimization output tells me the minimum is at 0.1, the initial point:
I cannot figure out what's going wrong. Any thoughts?
w1 is passed in by the minimize routine as a (single-entry) vector, not as a scalar. Try defining w1 = np.array([0.2]) and then calculating w = np.array([w1, 1.0 - w1]): you get a 2x1 matrix instead of a 2-entry vector. The product w*LR[i,:] then broadcasts to a 2x2 array whose sum is LR[i,0] + LR[i,1] regardless of w1, so the objective is flat and the optimizer stops at the initial point.
To make your objective function able to handle w1 being an array, you can simply put an explicit conversion to float, w1 = float(w1), as the first line of Rlzd_Vol1 (recent NumPy versions deprecate float() on a one-element array, so float(np.ravel(w1)[0]) is a safer spelling). Doing so, I obtain the correct minimum.
Note that you might want to use scipy.optimize.minimize_scalar instead, especially if you can bracket where your minimum will be.
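Both points can be seen in a few lines. This sketch uses synthetic prices as in the question, a vectorized objective named realized_vol (a name introduced here), and assumes the weight is restricted to the interval [0, 1] for the bounded scalar minimizer:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
S = np.exp(rng.normal(size=(50, 2)))   # synthetic prices, as in the question
LR = np.diff(np.log(S), axis=0)        # log returns, shape (49, 2)

def realized_vol(w1, LR):
    # Accept either a scalar or a one-element array for w1
    w1 = float(np.ravel(w1)[0])
    w = np.array([w1, 1.0 - w1])
    return np.std(LR @ w) * np.sqrt(260)

# The shape problem: a one-element array yields a (2, 1) weight matrix
w1_vec = np.array([0.2])
bad_w = np.array([w1_vec, 1.0 - w1_vec])
print(bad_w.shape)

# Scalar minimization over an assumed interval [0, 1]
res = minimize_scalar(realized_vol, bounds=(0.0, 1.0), args=(LR,), method="bounded")
print(res.x, res.fun)
```

The bounded method needs no starting point, only the interval, which suits a one-dimensional weight.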

Exercise on calculating and plotting cumulated empirical distribution

I was trying to finish an exercise in John Stachurski's book (a textbook devoted to teaching economists how to use Python). One of the exercises is about how to calculate and plot the empirical cumulative distribution function. The book provides a class called ECDF to calculate the empirical distribution function:
# Filename: ecdf.py
# Author: John Stachurski
# Date: December 2008
# Corresponds to: Listing 6.3
class ECDF:
    def __init__(self, observations):
        self.observations = observations
    def __call__(self, x):
        counter = 0.0
        for obs in self.observations:
            if obs <= x:
                counter += 1
        return counter / len(self.observations)
And the exercise reads:
【Exercise 6.1.12】 Add a method to the ECDF class that uses Matplotlib to plot the empirical distribution over a specified interval. Replicate the four graphs in figure 6.3 (modulo randomness).
The figure to be replicated is figure 6.3 of the book, which comes with an illustration of the algorithm (images not included here).
The following is my initial attempt
from ecdf import ECDF
import numpy as np
import matplotlib.pyplot as plt
from srs import SRS
from math import sqrt
from random import lognormvariate
# =========================
# parameters and arguments
# =========================
alpha, sigma2, s, delta = 0.3, 0.2, 0.5, 0.1
# numbers of draws
n = 1000
# length of each markov chain
t = 20
num_simu = [4,25,100,5000]
# Define F(k, z) = s k^alpha z + (1 - delta) k
F = lambda k, z: s * (k**alpha) * z + (1 - delta) * k
lognorm = lambda: lognormvariate(0, sqrt(sigma2))
# =====================
# create empirical distribution
# =====================
# different draw numbers
k = np.linspace(0,25,500)
for n in num_simu:
    for x in range(n):
        # list used to store capital stock (kt) in the last periods (t=20)
        kt = []
        solow_srs = SRS(F=F, phi=lognorm, X=1.0)
        px = solow_srs.sample_path(t)
        kt.append(px[-1])
    # generate the empirical distribution function
    F = ECDF(kt)
    prob_kt_n = [F(i) for i in k] # need to determine range
    # n refers to the n-th draw
# ==================================
# use for-loop to create subplots
# ==================================
#k = np.linspace(0,25,500)
#num_rows,num_cols = 2,2
My difficulties are: 1) how to store the list/array of empirical distribution results for the different numbers of draws so that they can be plotted as in the given graph, and 2) how to create subplots using a for-loop. I also encountered some other small errors.
Thank you for your suggestions.
About (1), my advice is to create a dictionary (i.e. something like d = {} and then d[n] = ECDF(data) for each number n of observations).
Dunno about (2).
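Both points can be sketched together: a dictionary keyed by the number of draws, and a loop over the axes of a 2x2 grid. The code below is a self-contained illustration, not the book's solution. It replaces the SRS class with a small helper (final_capital, a hypothetical name) that iterates the same law of motion, and it evaluates the empirical distribution with np.searchsorted instead of the ECDF class:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")   # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

alpha, sigma2, s, delta = 0.3, 0.2, 0.5, 0.1
t = 20                              # length of each Markov chain
num_simu = [4, 25, 100, 5000]       # numbers of draws, one per panel
rng = np.random.default_rng(0)

def final_capital(n_paths, t, k0=1.0):
    """k_t after t steps of k' = s k^alpha z + (1 - delta) k, z lognormal."""
    k = np.full(n_paths, k0)
    for _ in range(t):
        z = rng.lognormal(mean=0.0, sigma=np.sqrt(sigma2), size=n_paths)
        k = s * k**alpha * z + (1 - delta) * k
    return k

def ecdf_values(observations, grid):
    """Fraction of observations <= x for every x in grid."""
    obs = np.sort(observations)
    return np.searchsorted(obs, grid, side="right") / len(obs)

grid = np.linspace(0, 25, 500)

# (1) one entry per number of draws
results = {n: ecdf_values(final_capital(n, t), grid) for n in num_simu}

# (2) subplots filled in a for-loop
fig, axes = plt.subplots(2, 2, figsize=(8, 6), sharex=True, sharey=True)
for ax, n in zip(axes.flat, num_simu):
    ax.step(grid, results[n], where="post")
    ax.set_title(f"n = {n}")
fig.tight_layout()
```

For interactive use, drop the matplotlib.use("Agg") line and call plt.show(), or save the figure with fig.savefig.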

Python Code: Geometric Brownian Motion - what's wrong?

I'm pretty new to Python, but for a university paper I need to apply some models, preferably using Python. I spent a couple of days on the code attached below, but I can't figure out what's wrong: it does not produce a random process that looks like standard Brownian motion with drift. Parameters such as mu and sigma (expected return or drift, and volatility) seem to change nothing but the slope of the noise process. That's my problem: it all looks like noise. I hope my problem is specific enough. Here is my code:
import math
from matplotlib.pyplot import *
from numpy import *
from numpy.random import standard_normal
'''
geometric brownian motion with drift!
Specifications:
mu=drift factor [assumption of risk neutrality]
sigma: volatility in %
T: time span
dt: length of steps
S0: Stock Price in t=0
W: Brownian Motion with Drift N[0,1]
'''
T=1
mu=0.025
sigma=0.1
S0=20
dt=0.01
Steps=round(T/dt)
t=(arange(0, Steps))
x=arange(0, Steps)
W=(standard_normal(size=Steps)+mu*t)### standard brownian motion###
X=(mu-0.5*sigma**2)*dt+(sigma*sqrt(dt)*W) ###geometric brownian motion####
y=S0*math.e**(X)
plot(t,y)
show()
According to Wikipedia, the solution of geometric Brownian motion is S_t = S_0 * exp((mu - sigma^2/2) * t + sigma * W_t).
So it appears that
X=(mu-0.5*sigma**2)*t+(sigma*W) ###geometric brownian motion####
rather than
X=(mu-0.5*sigma**2)*dt+(sigma*sqrt(dt)*W)
Since T represents the time horizon, I think t should be
t = np.linspace(0, T, N)
Now, according to these Matlab examples (here and here), it appears
W = np.random.standard_normal(size = N)
W = np.cumsum(W)*np.sqrt(dt) ### standard brownian motion ###
not,
W=(standard_normal(size=Steps)+mu*t)
Please check the math, however, I could be wrong.
So, putting it all together:
import matplotlib.pyplot as plt
import numpy as np
T = 2
mu = 0.1
sigma = 0.01
S0 = 20
dt = 0.01
N = round(T/dt)
t = np.linspace(0, T, N)
W = np.random.standard_normal(size = N)
W = np.cumsum(W)*np.sqrt(dt) ### standard brownian motion ###
X = (mu-0.5*sigma**2)*t + sigma*W
S = S0*np.exp(X) ### geometric brownian motion ###
plt.plot(t, S)
plt.show()
yields
Here is an additional, slightly shorter implementation that parametrizes the Gaussian law through the normal function (instead of standard_normal).
import numpy as np
T = 2
mu = 0.1
sigma = 0.01
S0 = 20
dt = 0.01
N = round(T/dt)
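# conversely, you can specify N and then compute dt, which is more common in the financial literature
# log-return increments: mean (mu - sigma^2/2)*dt, standard deviation sigma*sqrt(dt)
X = np.random.normal((mu - 0.5 * sigma**2) * dt, sigma * np.sqrt(dt), N)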
X = np.cumsum(X)
S = S0 * np.exp(X)
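A quick way to check the math is to simulate many paths and compare the sample mean of S_T with the theoretical value E[S_T] = S0 * exp(mu * T). The sketch below does this with a hypothetical helper gbm_paths; it also starts each path exactly at S0 by setting W_0 = 0 and uses a time grid with n_steps + 1 points so that the spacing equals dt:

```python
import numpy as np

def gbm_paths(S0, mu, sigma, T, n_steps, n_paths, rng):
    """Simulate geometric Brownian motion on a grid of n_steps + 1 points."""
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    # Brownian increments ~ N(0, dt); W starts at 0
    dW = rng.standard_normal((n_paths, n_steps)) * np.sqrt(dt)
    W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)
    S = S0 * np.exp((mu - 0.5 * sigma**2) * t + sigma * W)
    return t, S

rng = np.random.default_rng(42)
t, S = gbm_paths(S0=20.0, mu=0.1, sigma=0.01, T=2.0,
                 n_steps=200, n_paths=10_000, rng=rng)

mean_ST = S[:, -1].mean()
expected = 20.0 * np.exp(0.1 * 2.0)   # E[S_T] = S0 * exp(mu * T)
print(mean_ST, expected)
```

A single row of S can be plotted against t to get one sample path, as in the answers above.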
