Error somewhere in my RNG for Monte-Carlo Simulation? - python

So, for a Monte-Carlo class, I created a uniform random number generator to simulate a normal distribution and use it for MC option pricing, and I'm going terribly wrong somewhere. It uses a simple linear congruential generator (LCG) which generates a random vector that is fed into a numerical approximation of the inverse normal CDF (the Beasley-Springer-Moro algorithm) to generate standard normally distributed values (an elaboration on the exact process can be found here).
This is my code so far.
RNG:
def lcrm_generator(num, a, seed, c):
    mod = 4294967296 # 2^32
    val = int(seed) # int() is to ensure seeds with a leading zero still get accepted by the program
    rng = []
    for i in range(num):
        val = (a * val + c) % mod
        rng.append(val/(mod-1))
    return rng
Inverse normal approximator:
def bsm_algorithm(u):
    # These are my necessary initial constants
    a0 = 2.50662823884; a1 = -18.61500062529; a2 = 41.39119773534; a3 = -25.44106049637;
    b0 = -8.47351093090; b1 = 23.08336743743; b2 = -21.06224101826; b3 = 3.13082909833;
    c0 = 0.3374754822726147; c1 = 0.9761690190917186; c2 = 0.1607979714918209; c3 = 0.0276438810333863;
    c4 = 0.0038405729373609; c5 = 0.0003951896511919; c6 = 0.0000321767881768; c7 = 0.0000002888167364;
    c8 = 0.0000003960315187;
    x = [0]*len(u)
    for i in range(len(u)):
        y = u[i] - 0.5
        if abs(y) < 0.42:
            r = y**2
            x[i] = y*(((a3*r+a2)*r+a1)*r+a0)/((((b3*r+b2)*r+b1)*r+b0)*r+1)
        else:
            r = u[i]
            if y > 0:
                r = 1 - u[i]
            r = log(-log(r))
            x[i] = c0+r*(c1+r*(c2+r*(c3+r*(c4+r*(c5+r*(c6+r*(c7+r*c8)))))))
            if y < 0:
                x[i] = -x[i]
    return x
Combining these two with the following and drawing the histogram shows that the data looks correctly normal:
a=lcrm_generator(100000,301,"0",21)
b = bsm_algorithm(a)
plt.hist(b, bins=100)
plt.show()
And the option pricing function:
def LookbackOP(S,K,r,sigma,intervals,sims,Call_Put=1):
    ## My objects that will determine the option prices.
    path = [0]*intervals
    values = [0]*sims
    ## Objects to hold the random nums used for simulation.
    randuni = [0]*sims
    randnorm = [0]*sims
    for i in range(sims):
        randuni[i] = lcrm_generator(intervals,301,i,21)
        randnorm[i] = bsm_algorithm(randuni[i])
    # Generating the simulations one by one.
    for i in range(sims):
        path[0] = 1
        ## Below is to generate one whole simulated path.
        ################## MY INCORRECT WAY ##################
        for j in range(1,intervals):
            path[j] = path[j-1]*exp((r - .5*sigma**2)*(1/intervals) + sqrt(1/intervals)*randnorm[i][j])
            ################## CORRECT BUILT-IN WAY ##################
            # path[j] = path[j-1]*exp((r - .5*sigma**2)*(1/intervals) + sqrt(1/intervals)*np.random.normal(0,1))
        ## For each separate simulation, price the option either as a call or a put.
        if Call_Put == 1:
            values[i] = max(S*max(path)-K,0)
        elif Call_Put == 0:
            values[i] = max(K-S*min(path),0)
        else:
            print("Error: You inputted something other than '1 = Call', or '0 = Put'")
    plt.hist(values,bins=30)
    plt.show()
    ## To get the expected value of the option by taking the average.
    option_value = np.mean(values)
    print(option_value)
    return option_value
In the last block of code, my error is pointed out, which can seemingly be fixed by simply using numpy's normal RNG. Using one versus the other produces drastically different answers, and I'm tempted to trust numpy over myself, but my code looks normal, so where am I going wrong?

First,
a=lcrm_generator(100000,301,"0",21)
this looks weird - why do you need a string for the seed? Anyway, good parameters are in the table here: https://en.wikipedia.org/wiki/Linear_congruential_generator. But the problem is not the LCG, I believe; rather, there might be a systematic difference in your Gaussian.
I ran the code
from scipy.stats import norm
q = lcrm_generator(100000, 301, "0", 21)
g = bsm_algorithm(q)
param = norm.fit(g)
print(param)
for 100 000, 1 000 000 and 10 000 000 samples, and my output is
(0.0009370998731855792, 0.9982155665317061)
(-0.0006429485100838258, 0.9996875045073682)
(-0.0007464875819397183, 1.0002711142625116)
and there is no improvement between 1 000 000 and 10 000 000 samples. Basically, you use an approximation of the inverse Gaussian CDF, and those are just artifacts of the approximation; there is nothing to be done about it.
NumPy, I believe, is using one of the exact normal sampling methods (Box-Muller, I think).
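For comparison (this check is not from the original answer, just an illustration), fitting the same number of samples drawn with numpy's own generator shows what the fitted parameters look like without the approximation artifacts:
import numpy as np
from scipy.stats import norm
# Draw 100000 standard normal samples with numpy and fit them, mirroring the
# norm.fit check above; the fitted (mean, std) should be very close to (0, 1).
g_np = np.random.normal(0, 1, 100000)
print(norm.fit(g_np))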

Related

While loop for Python odeint solver?

I have a mathematical model of differential equations that begins as linear and then uses correctional coefficients after reaching a certain value (1).
Currently, I solve the linear function independently, find out where the array goes from less than 1 to greater than 1, and then use that value from the array as the new initial condition. I also correct the time scale.
def vttmodel_linear(m,t,tm,tv,M_max):
    n = 1/(7*tm)
    dMdt = n
    return dMdt
M_0 = 0
M_max = 1 + 7*((RH_crit-RH)/(RH_crit-100)) - 2*np.square((RH_crit-RH)/(RH_crit-100))
print(M_max)
# tm = days
# M = weeks so 7*tm
t = np.arange(0,104+1)
tm = np.exp(-0.68*np.log(T) - 13.9*np.log(RH) + 0.14*W - 0.33*SQ + 66.02)
tv = np.exp(-0.74*np.log(T) - 12.72*np.log(RH) + 0.06*W + 61.50)
m = odeint(vttmodel_linear, M_0, t, args=(tm,tv,M_max))
M_0 = m[(np.where(m>1)[0][0])-1]
t = np.where(m>1)[0]
Then I use the new initial condition, M_0 and the updated time scale to solve the non-linear portion of the model.
def vttmodel(M,t,tm,tv,M_max):
    n = 1/(7*tm)
    k1 = 2/((tv/tm)-1)
    k2 = np.max([1-np.exp(2.3*(M-M_max)), 0])
    dMdt = n*k1*k2
    return dMdt
M = odeint(vttmodel, M_0, t, args=(tm,tv,M_max))
I then splice the arrays m and M at the location I found earlier and graph the result.
I would like to find a simplified way to do this. I have tried using if statements within the odeint function and also a while loop when calling the two functions, but have not had any luck interrupting odeint. Suggestions would be helpful. Thank you.
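One possibility (a sketch, not from the original post) is to switch from odeint to scipy.integrate.solve_ivp, whose terminal events can stop the linear phase exactly at the M = 1 crossing; tm, tv and M_max are assumed to be computed as in the code above:
import numpy as np
from scipy.integrate import solve_ivp

def linear_rhs(t, M, tm, tv, M_max):
    return [1/(7*tm)]

def nonlinear_rhs(t, M, tm, tv, M_max):
    n = 1/(7*tm)
    k1 = 2/((tv/tm) - 1)
    k2 = max(1 - np.exp(2.3*(M[0] - M_max)), 0)
    return [n*k1*k2]

def reaches_one(t, M, tm, tv, M_max):
    return M[0] - 1

reaches_one.terminal = True   # stop the linear phase when M crosses 1
reaches_one.direction = 1

sol1 = solve_ivp(linear_rhs, (0, 104), [0], args=(tm, tv, M_max), events=reaches_one)
t_switch, M_switch = sol1.t[-1], sol1.y[0, -1]
sol2 = solve_ivp(nonlinear_rhs, (t_switch, 104), [M_switch], args=(tm, tv, M_max))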

Invalid Syntax error on addition when running FEniCS in Windows subsystem

I am currently working on a project where we are solving a system of PDEs in FEniCS. I have created the following code in order to solve the system, but I get an invalid syntax error on
a = a0 + a1
I am not that good at Python and I have never used FEniCS before. I am also using a Windows subsystem to run it, which makes it extra complicated for me to understand any errors I might have made. I appreciate any suggestions you may have, and I apologize in advance if I ask obvious questions!
from fenics import *
# Create mesh and define function space
mesh = Mesh("circle.xml")
# Construct the finite element space
V = VectorFunctionSpace(mesh, 'P', 1)
# Define parameters:
T = 150
dt = 0.5
alpha = 0.4
beta = 2
gamma = 0.8
delta = 1
# Class representing the initial conditions
class InitialConditions(UserExpression):
    def eval(self, values, x):
        values[0] = Expression(("(4/25)-2*pow(10,-7)*(x[0]-0.1*x[1]-225)*(x[0]-0.1*x[1]-675)"), degree=2)
        values[1] = Expression(("(22/45)-3*pow(10,-5)*(x[0]-450)-1.2*pow(10,-4)*(x[1]-150)"), degree=2)
    def value_shape(self):
        return (2,)
# Define initial condition
indata = InitialConditions(degree=2)
u0 = Function(V)
u0 = interpolate(indata, V)
# Test and trial functions
u, v = TrialFunction(V), TestFunction(V)
# Create bilinear and linear forms
a0 = (u[0]*v[0]*dx) + (0.5*delta*dt*inner(grad(u[0]),grad(v[0]))*dx)
a1 = (u[1]*v[1]*dx) + (0.5*delta*dt*inner(grad(u[1]),grad(v[1]))*dx)
L0 = (u0[0]*v[0]*dx) - (0.5*delta*dt*inner(grad(u0[0]),grad(v[0]))*dx) - (dt*u0[0]*v[0]*dx*(((u0[0]*u0[1])/(u0[0]+alpha))-u0[0]*(1-u0[0]))*dx)
L1 = (u0[1]*v[1]*dx) - (0.5*delta*dt*inner(grad(u0[1]),grad(v[1]))*dx) - (dt*u0[1]*v[1]*dx*(-beta*((u0[0]*u0[1])/(u0[0]+alpha))-gamma*u0[1]*dx)
a = a0 + a1
L = L0 + L1
# Set up boundary condition
g = Constant([0.0, 0.0])
bc = DirichletBC(V, u_initial, DirichletBoundary())
bc = []  # NEUMANN
# Assemble matrix
A = assemble(a)
# Set an output file
out_file = File("Results.pvd", "compressed")
# Set initial condition
u = Function(V)
u.assign(u0)
t = 0.0
out_file << (u, t)
u_initial = Function(V)
u_initial.assign(u0)
t_save = 0
num_samples = 20
# Time-stepping
while t < T:
    # assign u0
    u0.assign(u)
    # Assemble vector and apply boundary conditions
    A = assemble(a)
    b = asseble(L)
    t_save += dt
    if t_save > T/num_samples or t >= T-dt:
        print("Saving!")
        # Save the solution to file
        out_file << (uv, t)
        t_save = 0
    # Move to next interval and adjust boundary condition
    t += dt
There's a typo in your line b = asseble(L) --> b=assemble(L)
Perhaps this tiny error is giving you the issue? Although I'd imagine the error message would be more descriptive.

How can I stop my Runge-Kutta2 (Heun) method from exploding?

I am currently trying to write some python code to solve an arbitrary system of first order ODEs, using a general explicit Runge-Kutta method defined by the values alpha, gamma (both vectors of dimension m) and beta (lower triangular matrix of dimension m x m) of the Butcher table which are passed in by the user. My code appears to work for single ODEs, having tested it on a few different examples, but I'm struggling to generalise my code to vector valued ODEs (i.e. systems).
In particular, I try to solve a Van der Pol oscillator ODE (reduced to a first order system) using Heun's method defined by the Butcher Tableau values given in my code, but I receive the errors
"RuntimeWarning: overflow encountered in double_scalars f = lambda t,u: np.array(... etc)" and
"RuntimeWarning: invalid value encountered in add kvec[i] = f(t+alpha[i]*h,y+h*sum)"
followed by my solution vector that is clearly blowing up. Note that the commented out code below is one of the examples of single ODEs that I tried and is solved correctly. Could anyone please help? Here is my code:
import numpy as np

def rk(t,y,h,f,alpha,beta,gamma):
    '''Runge-Kutta iteration'''
    return y + h*phi(t,y,h,f,alpha,beta,gamma)

def phi(t,y,h,f,alpha,beta,gamma):
    '''Phi function for the Runge-Kutta iteration'''
    m = len(alpha)
    count = np.zeros(len(f(t,y)))
    kvec = k(t,y,h,f,alpha,beta,gamma)
    for i in range(1,m+1):
        count = count + gamma[i-1]*kvec[i-1]
    return count

def k(t,y,h,f,alpha,beta,gamma):
    '''returns a vector containing each stage k_{i} in the m-stage Runge-Kutta method'''
    m = len(alpha)
    kvec = np.zeros((m,len(f(t,y))))
    kvec[0] = f(t,y)
    for i in range(1,m):
        sum = np.zeros(len(f(t,y)))
        for l in range(1,i+1):
            sum = sum + beta[i][l-1]*kvec[l-1]
        kvec[i] = f(t+alpha[i]*h,y+h*sum)
    return kvec

def timeLoop(y0,N,f,alpha,beta,gamma,h,rk):
    '''function that loops through time using the RK method'''
    t = np.zeros([N+1])
    y = np.zeros([N+1,len(y0)])
    y[0] = y0
    t[0] = 0
    for i in range(1,N+1):
        y[i] = rk(t[i-1],y[i-1], h, f,alpha,beta,gamma)
        t[i] = t[i-1]+h
    return t,y
#################################################################
'''f = lambda t,y: (c-y)**2
Y = lambda t: np.array([(1+t*c*(c-1))/(1+t*(c-1))])
h0 = 1
c = 1.5
T = 10
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
eff_rk = compute(h0,Y(0),T,f,alpha,beta,gamma,rk, Y,11)'''
#constants
mu = 100
T = 1000
h = 0.01
N = int(T/h)
#initial conditions
y0 = 0.02
d0 = 0
init = np.array([y0,d0])
#Butcher Tableau for Heun's method
alpha = np.array([0,1])
gamma = np.array([0.5,0.5])
beta = np.array([[0,0],[1,0]])
#rhs of the ode system
f = lambda t,u: np.array([u[1],mu*(1-u[0]**2)*u[1]-u[0]])
#solving the system
time, sol = timeLoop(init,N,f,alpha,beta,gamma,h,rk)
print(sol)
Your step size is not small enough. The Van der Pol oscillator with mu=100 is a fast-slow system with very sharp turns at the switching of the modes, so rather stiff. With explicit methods this requires small step sizes, the smallest sensible step size is 1e-5 to 1e-6. You get a solution on the limit cycle already for h=0.001, with resulting velocities up to 150.
You can reduce some of that stiffness by using a different velocity/impulse variable. In the equation
x'' - mu*(1-x^2)*x' + x = 0
you can combine the first two terms into a derivative,
mu*v = x' - mu*(1-x^2/3)*x
so that
x' = mu*(v+(1-x^2/3)*x)
v' = -x/mu
The second equation is now uniformly slow close to the limit cycle, while the first has long relatively straight jumps when v leaves the cubic v=x^3/3-x.
This integrates nicely with the original h=0.01, keeping the solution inside the box [-3,3]x[-2,2], even if it shows some strange oscillations that are not present for smaller step sizes and the exact solution.
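As a sketch of how that substitution could be fed into the code above (f_lienard, x0 and v0 are my names, not from the original answer; v0 is chosen so that x'(0) = 0 matches the original d0 = 0, and timeLoop, alpha, beta, gamma, h and N are as defined in the question):
import numpy as np
# Van der Pol system in the transformed variables: x' = mu*(v+(1-x^2/3)*x), v' = -x/mu
mu = 100
f_lienard = lambda t, u: np.array([mu*(u[1] + (1 - u[0]**2/3)*u[0]),
                                   -u[0]/mu])
# initial state: x0 as in the question; v0 chosen so that x'(0) = 0
x0 = 0.02
v0 = -(1 - x0**2/3)*x0
# reuse the Heun tableau and time loop from the question with the original h = 0.01
time, sol = timeLoop(np.array([x0, v0]), N, f_lienard, alpha, beta, gamma, h, rk)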

Have I implemented Milstein's method/Euler-Maruyama correctly?

I have an stochastic differential equation (SDE) that I am trying to solve using Milsteins method but am getting results that disagree with experiment.
The SDE is

q'' + (Gamma0 - Omega0*eta*q^2)*q' + Omega0^2*q = sqrt(2*Gamma0*k_b*T_0/m) * dW/dt

which I have broken up into 2 first-order equations:

eq1: dp = [-(Gamma0 - Omega0*eta*q^2)*p - Omega0^2*q]*dt + sqrt(2*Gamma0*k_b*T_0/m)*dW

eq2: dq = p*dt

Then I have used the Ito form dX = a(t,X)*dt + b(t,X)*dW, so that for eq1 the drift and diffusion coefficients are a_p and b_p (the dt and dW coefficients above), and for eq2 they are a_q = p and b_q = 0.
My Python code used to attempt to solve this is as follows:
# set constants from real data
Gamma0 = 4000 # defines environmental damping
Omega0 = 75e3*2*np.pi # defines the angular frequency of the motion
eta = 0 # set eta 0 => no effect from non-linear p*q**2 term
T_0 = 300 # temperature of environment
k_b = scipy.constants.Boltzmann
m = 3.1e-19 # mass of oscillator

# set a and b functions for these 2 equations
def a_p(t, p, q):
    return -(Gamma0 - Omega0*eta*q**2)*p

def b_p(t, p, q):
    return np.sqrt(2*Gamma0*k_b*T_0/m)

def a_q(t, p, q):
    return p

# generate time data
dt = 10e-11
tArray = np.arange(0, 200e-6, dt)

# initialise q and p arrays and set initial conditions to 0, 0
q0 = 0
p0 = 0
q = np.zeros_like(tArray)
p = np.zeros_like(tArray)
q[0] = q0
p[0] = p0

# generate normally distributed random numbers
dwArray = np.random.normal(0, np.sqrt(dt), len(tArray)) # independent and identically distributed normal random variables with expected value 0 and variance dt

# iterate through, implementing Milstein's method (technically Euler-Maruyama since b' = 0)
for n, t in enumerate(tArray[:-1]):
    dw = dwArray[n]
    p[n+1] = p[n] + a_p(t, p[n], q[n])*dt + b_p(t, p[n], q[n])*dw + 0
    q[n+1] = q[n] + a_q(t, p[n], q[n])*dt + 0
Where in this case p is velocity and q is position.
I then get the following plots of q and p:
I expected the resulting plot of position to look something like the following, which I get from experimental data (from which the constants used in the model are determined):
Have I implemented Milstein's method correctly?
If I have, what else might be wrong with my process of solving the SDE that could be causing this disagreement with the experiment?
You missed a term in the drift coefficient; note that to the right of dp there are two dt terms. Thus
def a_p(t, p, q):
    return -(Gamma0 - Omega0*eta*q**2)*p - Omega0**2*q
which is actually the part that makes the oscillator into an oscillator. With that corrected the solution looks like
And no, you did not implement the Milstein method, as there are no derivatives of b_p, which are what distinguishes Milstein from Euler-Maruyama; the missing term is +0.5*b'(X)*b(X)*(dW**2-dt).
There is also a derivative-free version of Milstein's method as a two-stage kind-of Runge-Kutta method, documented in Wikipedia or the original on arxiv.org (PDF).
The step there is (vector based, duplicate into X=[p,q], K1=[k1_p,k1_q] etc. to be close to your conventions)
S = random_choice_of ([-1,1])
K1 = a(X )*dt + b(X )*(dW - S*sqrt(dt))
Xh = X + K1
K2 = a(Xh)*dt + b(Xh)*(dW + S*sqrt(dt))
X = X + 0.5 * (K1+K2)
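A minimal sketch of that two-stage step applied to the system above (the helper names a_vec and b_vec are mine, not from the original answer; a_p, b_p, a_q, the arrays p and q, dwArray, tArray and the step dt are assumed to be defined as in the question, with a_p corrected as above):
def a_vec(t, X):
    # drift of the state X = [p, q]
    pp, qq = X
    return np.array([a_p(t, pp, qq), a_q(t, pp, qq)])

def b_vec(t, X):
    # diffusion of X = [p, q]; the position equation has no noise term
    pp, qq = X
    return np.array([b_p(t, pp, qq), 0.0])

X = np.array([p[0], q[0]])
for n, t in enumerate(tArray[:-1]):
    dW = dwArray[n]
    S = np.random.choice([-1.0, 1.0])
    K1 = a_vec(t, X)*dt + b_vec(t, X)*(dW - S*np.sqrt(dt))
    Xh = X + K1
    K2 = a_vec(t, Xh)*dt + b_vec(t, Xh)*(dW + S*np.sqrt(dt))
    X = X + 0.5*(K1 + K2)
    p[n+1], q[n+1] = X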

Population Monte Carlo implementation

I am trying to implement the Population Monte Carlo algorithm as described in this paper (see page 78 Fig.3) for a simple model (see function model()) with one parameter using Python. Unfortunately, the algorithm doesn't work and I can't figure out what's wrong. See my implementation below. The actual function is called abc(). All other functions can be seen as helper-functions and seem to work fine.
To check whether the algorithm works, I first generate observed data with the only parameter of the model set to param = 8. Therefore, the posterior resulting from the ABC algorithm should be centered around 8. This is not the case and I'm wondering why.
I would appreciate any help or comments.
# imports
from math import exp
from math import log
from math import sqrt
import numpy as np
import random
from scipy.stats import norm
# globals
N = 300 # sample size
N_PARTICLE = 300 # number of particles
ITERS = 5 # number of decreasing thresholds
M = 10 # number of words to remember
MEAN = 7 # prior mean of parameter
SD = 2 # prior sd of parameter
def model(param):
    recall_prob_all = 1/(1 + np.exp(M - param))
    recall_prob_one_item = np.exp(np.log(recall_prob_all) / float(M))
    return sum([1 if random.random() < recall_prob_one_item else 0 for item in range(M)])
## example
print "Output of model function: \n" + str(model(10)) + "\n"
# generate data from model
def generate(param):
    out = np.empty(N)
    for i in range(N):
        out[i] = model(param)
    return out
## example
print "Output of generate function: \n" + str(generate(10)) + "\n"
# distance function (sum of squared error)
def distance(obsData,simData):
    out = 0.0
    for i in range(len(obsData)):
        out += (obsData[i] - simData[i]) * (obsData[i] - simData[i])
    return out
## example
print "Output of distance function: \n" + str(distance([1,2,3],[4,5,6])) + "\n"
# sample new particles based on weights
def sample(particles, weights):
    return np.random.choice(particles, 1, p=weights)
## example
print "Output of sample function: \n" + str(sample([1,2,3],[0.1,0.1,0.8])) + "\n"
# perturbance function
def perturb(variance):
    return np.random.normal(0,sqrt(variance),1)[0]
## example
print "Output of perturb function: \n" + str(perturb(1)) + "\n"
# compute new weight
def computeWeight(prevWeights,prevParticles,prevVariance,currentParticle):
    denom = 0.0
    proposal = norm(currentParticle, sqrt(prevVariance))
    prior = norm(MEAN,SD)
    for i in range(len(prevParticles)):
        denom += prevWeights[i] * proposal.pdf(prevParticles[i])
    return prior.pdf(currentParticle)/denom
## example
prevWeights = [0.2,0.3,0.5]
prevParticles = [1,2,3]
prevVariance = 1
currentParticle = 2.5
print "Output of computeWeight function: \n" + str(computeWeight(prevWeights,prevParticles,prevVariance,currentParticle)) + "\n"
# normalize weights
def normalize(weights):
    return weights/np.sum(weights)
## example
print "Output of normalize function: \n" + str(normalize([3.,5.,9.])) + "\n"
# sampling from prior distribution
def rprior():
    return np.random.normal(MEAN,SD,1)[0]
## example
print "Output of rprior function: \n" + str(rprior()) + "\n"
# ABC using Population Monte Carlo sampling
def abc(obsData,eps):
    draw = 0
    Distance = 1e9
    variance = np.empty(ITERS)
    simData = np.empty(N)
    particles = np.empty([ITERS,N_PARTICLE])
    weights = np.empty([ITERS,N_PARTICLE])

    for t in range(ITERS):
        if t == 0:
            for i in range(N_PARTICLE):
                while(Distance > eps[t]):
                    draw = rprior()
                    simData = generate(draw)
                    Distance = distance(obsData,simData)
                Distance = 1e9
                particles[t][i] = draw
                weights[t][i] = 1./N_PARTICLE
            variance[t] = 2 * np.var(particles[t])
            continue

        for i in range(N_PARTICLE):
            while(Distance > eps[t]):
                draw = sample(particles[t-1],weights[t-1])
                draw += perturb(variance[t-1])
                simData = generate(draw)
                Distance = distance(obsData,simData)
            Distance = 1e9
            particles[t][i] = draw
            weights[t][i] = computeWeight(weights[t-1],particles[t-1],variance[t-1],particles[t][i])

        weights[t] = normalize(weights[t])
        variance[t] = 2 * np.var(particles[t])
    return particles[ITERS-1]
true_param = 9
obsData = generate(true_param)
eps = [15000,10000,8000,6000,3000]
posterior = abc(obsData,eps)
#print posterior
I stumbled upon this question as I was looking for pythonic implementations of PMC algorithms, since, quite coincidentally, I'm currently in the process of applying the techniques in this exact paper to my own research.
Can you post the results you're getting? My guess is that 1) you're using a poor choice of distance function (and/or similarity thresholds), or 2) you're not using enough particles. I may be wrong here (I'm not very well-versed in sample statistics), but your distance function implicitly suggests to me that the ordering of your random draws matters. I'd have to think about this more to determine whether it actually has any effect on the convergence properties (it may not), but why don't you simply use the mean or median as your sample statistic?
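For example, a summary-statistic distance along these lines (a sketch of that suggestion, not code from the original post) makes the comparison independent of the ordering of the draws:
# compare sample means instead of paired observations
def distance(obsData, simData):
    return abs(np.mean(obsData) - np.mean(simData))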
I ran your code with 1000 particles and a true parameter value of 8, while using the absolute difference between sample means as my distance function, for three iterations with epsilons of [0.5, 0.3, 0.1]; the peak of my estimated posterior distribution seems to be approaching 8 just like it should on each iteration, alongside a reduction in the population variance. Note that there is still a noticeable rightward bias, but this is because of the asymmetry of your model (parameter values of 8 or less can never result in more than 8 observed successes, while all parameter values greater than 8 can, leading to a rightward skew in the distribution).
Here's the plot of my results:
