Integrate function of random variables in pymc3

Integrate function of random variables in pymc3 - python

I am trying to construct a model in pymc3 which requires me to integrate a function of random variables. The basic idea using actual numbers is this:
from scipy.integrate import quad
def PowerLaw(x,N0,alpha):
""" A PowerLaw distribution"""
return N0 * x**-alpha
print quad(PowerLaw,0.1,10,args=(1e-4,2)) #(0.0009900000000000002, 3.008094474110274e-12)
I can also do this in theano:
from theano import function,tensor as tt
xt = tt.dscalar('x')
N0 = tt.dscalar('N0')
alpha = tt.dscalar('alpha')
y = PowerLaw(xt,N0,alpha)
func = function([xt,N0,alpha],y)
print quad(func,0.1,10,args=(1e-4,2)) #Same answer as before
Here is an example of what I want to do:
with pm.Model() as myModel:
N0 = pm.Uniform("N0",1e-5,1e-1)
alpha = pm.Uniform("alpha",1,5)
yval = quad(PowerLaw,0.1,10,args=(N0,alpha))
But of course when I try this I get a TypeError because N0 and alpha are not floats. Of course, in this simple case, I know the analytical solution of the integral; my actual model requires more complicated integrals where I do not know the closed form. Is there any way to accomplish this in pymc3?

Related

Solving coupled differential equations with sympy

I am trying to solve the following system of first order coupled differential equations:
- dR/dt=k2*Y(t)-k1*R(t)*L(t)+k4*X(t)-k3*R(t)*I(t)
- dL/dt=k2*Y(t)-k1*R(t)*L(t)
- dI/dt=k4*X(t)-k3*R(t)*I(t)
- dX/dt=k1*R(t)*L(t)-k2*Y(t)
- dY/dt=k3*R(t)*I(t)-k4*X(t)
The knowed initial conditions are: X(0)=0, Y(0)=0
The knowed constants values are: k1=6500, k2=0.9
This equations defines a kinetick model and I need to solve them to get the Y(t) function to fit my data and find k3 and k4 values. In order to that, I have tried to solve the system simbologically with sympy. There is my code:
import matplotlib.pyplot as plt
import numpy as np
import sympy
from sympy.solvers.ode.systems import dsolve_system
from scipy.integrate import solve_ivp
from scipy.integrate import odeint
k1 = sympy.Symbol('k1', real=True, positive=True)
k2 = sympy.Symbol('k2', real=True, positive=True)
k3 = sympy.Symbol('k3', real=True, positive=True)
k4 = sympy.Symbol('k4', real=True, positive=True)
t = sympy.Symbol('t',real=True, positive=True)
L = sympy.Function('L')
R = sympy.Function('R')
I = sympy.Function('I')
Y = sympy.Function('Y')
X = sympy.Function('X')
f1=k2*Y(t)-k1*R(t)*L(t)+k4*X(t)-k3*R(t)*I(t)
f2=k2*Y(t)-k1*R(t)*L(t)
f3=k4*X(t)-k3*R(t)*I(t)
f4=-f2
f5=-f3
eq1=sympy.Eq(sympy.Derivative(R(t),t),f1)
eq2=sympy.Eq(sympy.Derivative(L(t),t),f2)
eq3=sympy.Eq(sympy.Derivative(I(t),t),f3)
eq4=sympy.Eq(sympy.Derivative(Y(t),t),f4)
eq5=sympy.Eq(sympy.Derivative(X(t),t),f5)
Sys=(eq1,eq2,eq3,eq4,eq5])
solsys=dsolve_system(eqs=Sys,funcs=[X(t),Y(t),R(t),L(t),I(t)], t=t, ics={Y(0):0, X(0):0})
There is the answer:
NotImplementedError:
The system of ODEs passed cannot be solved by dsolve_system.
I have tried with dsolve too, but I get the same.
Is there any other solver I can use or some way of doing this that will allow me to get the function for the fitting? I'm using python 3.8 in Spider with Anaconda in windows64.
Thank you!
# Update
Following
You are saying "experiment". So you have data and want to fit the model to them, find appropriate values for k3 and k4 at least, and perhaps for all coefficients and the initial conditions (the first measured data point might not be the initial condition for the best fit)? See stackoverflow.com/questions/71722061/… for a recent attempt on such a task. –
Lutz Lehmann
23 hours
There is my new code:
t=[0,0.25,0.5,0.75,1.5,2.27,3.05,3.82,4.6,5.37,6.15,6.92,7.7,8.47,13.42,18.42,23.42,28.42,33.42,38.42,43.42,48.42,53.42,58.42,63.42,68.42,83.4,98.4,113.4,128.4,143.4,158.4,173.4,188.4,203.4,218.4,233.4,248.4]
yexp=[832.49,1028.01,1098.12,1190.08,1188.97,1377.09,1407.47,1529.35,1431.72,1556.66,1634.59,1679.09,1692.05,1681.89,1621.88,1716.77,1717.91,1686.7,1753.5,1722.98,1630.14,1724.16,1670.45,1677.16,1614.98,1671.16,1654.03,1661.84,1675.31,1626.76,1638.29,1614.41,1594.31,1584.73,1599.22,1587.85,1567.74,1602.69]
def derivative(S, t, k3, k4):
k1=1798931
k2=0.2629
x, y,r,l,i = S
doty = k1*r*l+k2*y
dotx = k3*r*i-k4*x
dotr = k2*y-k1*r*l+k4*x-k3*r*i
dotl = k2*y-k1*r*l
doti = k4*x-k3*r*i
return np.array([doty, dotx, dotr, dotl, doti])
def solver(XY,t,para):
return odeint(derivative, XY, t, args = para, atol=1e-8, rtol=1e-11)
def integration(XY_arr,*para):
XY0 = para[:5]
para = para[5:]
T = np.arange(len(XY_arr))
res0 = solver(XY0,T, para)
res1 = [ solver(XY0,[t,t+1],para)[-1]
for t,XY in enumerate(XY_arr[:-1]) ]
return np.concatenate([res0,res1]).flatten()
XData =yexp
YData = np.concatenate([ yexp,yexp,yexp,yexp,yexp,yexp[1:],yexp[1:],yexp[1:],yexp[1:],yexp[1:]]).flatten()
p0 =[0,0,100,10,10,1e8,0.01]
params, info = curve_fit(integration,XData,YData,p0=p0, maxfev=5000)
XY0, para = params[:5], params[5:]
print(XY0,tuple(para))
t_plot = np.linspace(0,len(t),500)
x_plot = solver(XY0, t_plot, tuple(para))
But the output are not correct, as are the same as initial condition p0:
[ 0. 0. 100. 10. 10.] (100000000.0, 0.01)
Graph
I understand that the function 'integration' gives packed values of y for each function at each instant of time, but I don't know how to unpack them to make the curve_fitt separately. Maybe I don't quite understand how it works.
Thank you!

As you observed, sympy is not able to solve this system. This might mean that
the procedure to classify ODE in sympy is not complete enough, or
some trick/method is needed above the standard set of methods that is implemented in sympy, or
that there just is no symbolic solution.
The last case is the generic one, take a symbolically solvable ODE, add some random term, and almost certainly the resulting ODE is no longer symbolically solvable.
As I understand with the comments, you have an model via ODE system with state space (cX,cY,cR,cL,cI) with equations with 4 parameters k1,k2,k3,k4 and, by the structure of a reaction system R+I <-> X, R+L <-> Y, the sums cR+cX+cY, cL+cY, cI+cX are all constant.
For some other process that is approximately represented by the model, you have time series data t[k],y[k] for the Y component. Also you have partial information on the initial state and the parameter set. If there are sufficiently many data points one could also forget about these, fit for all parameters, and compare how far away the given parameters are to the computed ones.
There are several modules and packages that solve this fitting task in a more or less abstract fashion. I think pyomo and gekko can both be used. More directly one can use the facilities of scipy.odr or scipy.optimize.
Define the forward function that transforms time and parameters
def model(t,u,k1,k2,k3,k4):
X,Y,R,L,I = u
dL = k2*Y - k1*R*L
dI = k4*X - k3*R*I
dR = dL+dI
dX = -dI
dY = -dL
return dX,dY,dR,dL,dI
def solver(t,u0,k):
res = solve_ivp(model, [0, t[-1]], u0, args=tuple(k), t_eval=t,
method="DOP853", atol=1e-7, rtol=1e-10)
return res.y
Prepare some data plus noise
k1o = 6.500; k2o=0.9
T = np.linspace(0,0.05,21)
U = solver(T, [0,0,50,40,25], [k1o, k2o, 5.400, 0.7])
Y = U[1] # equilibrium slightly above 30
Y += np.random.uniform(high=0.05, size=Y.shape)
Prepare the function that splits the combined parameter vector in initial state and coefficients, call the curve fitting function
from scipy.optimize import curve_fit
def partial(t,R,L,I,k3,k4):
print(R,L,I,k3,k4)
U = solver(t,[0,0,R,L,I],[k1o,k2o,k3,k4])
return U[1]
params, info = curve_fit(partial,T,Y, p0=[30,20,10, 0.3,3.000])
R,L,I, k3,k4 = params
print(R,L,I, k3,k4)
It turns out that curve_fit goes into strange regions with large negative values. A likely reason is that the Y component is, in the end, not coupled strongly enough to all the other components, meaning that large changes in some of the parameters have minimal influence on Y, so that minimal noise in Y can lead to large deviations in these parameters. Here this apparently happens (first) to k3.

System of First Order ODEs in Python

I have seen how to solve systems of ODEs in Python, but all of the examples I have seen were "standard" equations. What I mean by standard is that the equations do not say "derivative of one function = expression that contains derivative of another function".
Here is a sample system I am trying to solve numerically. Initial conditions are x(0) = 5, y(0) = 3, z(0) = 2, and all initial derivatives are 0:
x'(t) + 4y(t) = -3y'(t)
y'(t) + ty(t) = -2z'(t)
z'(t) = -2y(t) + x'(t)
I am not 100% sure how to code this. Here is what I have tried:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
import math
def ODESystem(f,t):
x = f[0]
y = f[1]
z = f[2]
Now, what do I define first: dydt, dxdt or dzdt. Is there a way for me to define one expression that "hangs around" before I use it to define another expression?

You do not need to solve anything manually, you can just as well do
def ODESystem(f,t):
x,y,z = f
return np.linalg.solve([[1,3,0],[0,1,2],[-1,0,1]], [-4, -t, -2])*y

Nevermind; I am stupid. I can keep on substituting into the third equation until I get an equation for z'(t) that does not include any other derivatives.

python: Initial condition in solving differential equation

I want to solve this differential equation:
y′′+2y′+2y=cos(2x) with initial conditions:
y(1)=2,y′(2)=0.5
y′(1)=1,y′(2)=0.8
y(1)=0,y(2)=1
and it's code is:
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
def dU_dx(U, x):
return [U[1], -2*U[1] - 2*U[0] + np.cos(2*x)]
U0 = [1,0]
xs = np.linspace(0, 10, 200)
Us = odeint(dU_dx, U0, xs)
ys = Us[:,0]
plt.xlabel("x")
plt.ylabel("y")
plt.title("Damped harmonic oscillator")
plt.plot(xs,ys);
how can I fulfill it?

Your initial conditions are not, as they give values at two different points. These are all boundary conditions.
def bc1(u1,u2): return [u1[0]-2.0,u2[1]-0.5]
def bc2(u1,u2): return [u1[1]-1.0,u2[1]-0.8]
def bc3(u1,u2): return [u1[0]-0.0,u2[0]-1.0]
You need a BVP solver to solve these boundary value problems.
You can either make your own solver using the shooting method, in case 1 as
def shoot(b): return odeint(dU_dx,[2,b],[1,2])[-1,1]-0.5
b = fsolve(shoot,0)
T = linspace(1,2,N)
U = odeint(dU_dx,[2,b],T)
or use the secant method instead of scipy.optimize.fsolve, as the problem is linear this should converge in 1, at most 2 steps.
Or you can use the scipy.integrate.solve_bvp solver (which is perhaps newer than the question?). Your task is similar to the documented examples. Note that the argument order in the ODE function is switched in all other solvers, even in odeint you can give the option tfirst=True.
def dudx(x,u): return [u[1], np.cos(2*x)-2*(u[1]+u[0])]
Solutions generated with solve_bvp, the nodes are the automatically generated subdivision of the integration interval, their density tells how "non-flat" the ODE is in that region.
xplot=np.linspace(1,2,161)
for k,bc in enumerate([bc1,bc2,bc3]):
res = solve_bvp(dudx, bc, [1.0,2.0], [[0,0],[0,0]], tol=1e-5)
print res.message
l,=plt.plot(res.x,res.y[0],'x')
c = l.get_color()
plt.plot(xplot, res.sol(xplot)[0],c=c, label="%d."%(k+1))
Solutions generated using the shooting method using the initial values at x=0 as unknown parameters to then obtain the solution trajectories for the interval [0,3].
x = np.linspace(0,3,301)
for k,bc in enumerate([bc1,bc2,bc3]):
def shoot(u0): u = odeint(dudx,u0,[0,1,2],tfirst=True); return bc(u[1],u[2])
u0 = fsolve(shoot,[0,0])
u = odeint(dudx,u0,x,tfirst=True);
l, = plt.plot(x, u[:,0], label="%d."%(k+1))
c = l.get_color()
plt.plot(x[::100],u[::100,0],'x',c=c)

You can use the scipy.integrate.ode function this is similar to scipy.integrate.odeint but allows a jac parameter which is df/dy or in the case of your given ODE df/dx

The shape variable in pymc3.DensityDist does not work properly

I am trying to define a multivariate custom distribution through pymc3.DensityDist(); however, I keep getting the following error that dimensions do not match:
"LinAlgError: 0-dimensional array given. Array must be two-dimensional"
I have already seen https://github.com/pymc-devs/pymc3/issues/535 but I could not find the answer to my question. Just for clarity, here is my simple example
import numpy as np
import pymc3 as pm
def pdf(x):
y = 0
print(x)
sigma = np.identity(2)
isigma = sigma
mu = np.array([[1,2],[3,4]])
for i in range(2):
x0 = x- mu[i,:]
xsinv = np.linalg.multi_dot([x0,isigma,x0])
y = y + np.exp(-0.5*xsinv)
return y
logp = lambda x: np.log(pdf(x))
with pm.Model() as model:
pm.DensityDist('x',logp, shape=2)
step = pm.Metropolis(tune=False, S=np.identity(2))
trace = pm.sample(100000, step=step, chain=1, tune=0,progressbar=False)
result = trace['x']
In this simple code I want to define an unnormilized pdf function, which is sum of two unnormalized normal distributions, and take samples from this pdf through Metropolis algorithm.
Thanks,

Try replacing numpy for theano in the following lines:
xsinv = tt.dot(tt.dot(x0, isigma), x0)
y = y + tt.exp(-0.5 * xsinv)
as a side note, try using NUTS instead of metropolis and let PyMC3 choose the sampling method for you, just do
trace = pm.sample(1000)
For future reference you can also ask questions here

Using adaptive step sizes with scipy.integrate.ode

The (brief) documentation for scipy.integrate.ode says that two methods (dopri5 and dop853) have stepsize control and dense output. Looking at the examples and the code itself, I can only see a very simple way to get output from an integrator. Namely, it looks like you just step the integrator forward by some fixed dt, get the function value(s) at that time, and repeat.
My problem has pretty variable timescales, so I'd like to just get the values at whatever time steps it needs to evaluate to achieve the required tolerances. That is, early on, things are changing slowly, so the output time steps can be big. But as things get interesting, the output time steps have to be smaller. I don't actually want dense output at equal intervals, I just want the time steps the adaptive function uses.
EDIT: Dense output
A related notion (almost the opposite) is "dense output", whereby the steps taken are as large as the stepper cares to take, but the values of the function are interpolated (usually with accuracy comparable to the accuracy of the stepper) to whatever you want. The fortran underlying scipy.integrate.ode is apparently capable of this, but ode does not have the interface. odeint, on the other hand, is based on a different code, and does evidently do dense output. (You can output every time your right-hand-side is called to see when that happens, and see that it has nothing to do with the output times.)
So I could still take advantage of adaptivity, as long as I could decide on the output time steps I want ahead of time. Unfortunately, for my favorite system, I don't even know what the approximate timescales are as functions of time, until I run the integration. So I'll have to combine the idea of taking one integrator step with this notion of dense output.
EDIT 2: Dense output again
Apparently, scipy 1.0.0 introduced support for dense output through a new interface. In particular, they recommend moving away from scipy.integrate.odeint and towards scipy.integrate.solve_ivp, which as a keyword dense_output. If set to True, the returned object has an attribute sol that you can call with an array of times, which then returns the integrated functions values at those times. That still doesn't solve the problem for this question, but it is useful in many cases.

Since SciPy 0.13.0,
The intermediate results from the dopri family of ODE solvers can
now be accessed by a solout callback function.
import numpy as np
from scipy.integrate import ode
import matplotlib.pyplot as plt
def logistic(t, y, r):
return r * y * (1.0 - y)
r = .01
t0 = 0
y0 = 1e-5
t1 = 5000.0
backend = 'dopri5'
# backend = 'dop853'
solver = ode(logistic).set_integrator(backend)
sol = []
def solout(t, y):
sol.append([t, *y])
solver.set_solout(solout)
solver.set_initial_value(y0, t0).set_f_params(r)
solver.integrate(t1)
sol = np.array(sol)
plt.plot(sol[:,0], sol[:,1], 'b.-')
plt.show()
Result:
The result seems to be slightly different from Tim D's, although they both use the same backend. I suspect this having to do with FSAL property of dopri5. In Tim's approach, I think the result k7 from the seventh stage is discarded, so k1 is calculated afresh.
Note: There's a known bug with set_solout not working if you set it after setting initial values. It was fixed as of SciPy 0.17.0.

I've been looking at this to try to get the same result. It turns out you can use a hack to get the step-by-step results by setting nsteps=1 in the ode instantiation. It will generate a UserWarning at every step (this can be caught and suppressed).
import numpy as np
from scipy.integrate import ode
import matplotlib.pyplot as plt
import warnings
def logistic(t, y, r):
return r * y * (1.0 - y)
r = .01
t0 = 0
y0 = 1e-5
t1 = 5000.0
#backend = 'vode'
backend = 'dopri5'
#backend = 'dop853'
solver = ode(logistic).set_integrator(backend, nsteps=1)
solver.set_initial_value(y0, t0).set_f_params(r)
# suppress Fortran-printed warning
solver._integrator.iwork[2] = -1
sol = []
warnings.filterwarnings("ignore", category=UserWarning)
while solver.t < t1:
solver.integrate(t1, step=True)
sol.append([solver.t, solver.y])
warnings.resetwarnings()
sol = np.array(sol)
plt.plot(sol[:,0], sol[:,1], 'b.-')
plt.show()
result:

The integrate method accepts a boolean argument step that tells the method to return a single internal step. However, it appears that the 'dopri5' and 'dop853' solvers do not support it.
The following code shows how you can get the internal steps taken by the solver when the 'vode' solver is used:
import numpy as np
from scipy.integrate import ode
import matplotlib.pyplot as plt
def logistic(t, y, r):
return r * y * (1.0 - y)
r = .01
t0 = 0
y0 = 1e-5
t1 = 5000.0
backend = 'vode'
#backend = 'dopri5'
#backend = 'dop853'
solver = ode(logistic).set_integrator(backend)
solver.set_initial_value(y0, t0).set_f_params(r)
sol = []
while solver.successful() and solver.t < t1:
solver.integrate(t1, step=True)
sol.append([solver.t, solver.y])
sol = np.array(sol)
plt.plot(sol[:,0], sol[:,1], 'b.-')
plt.show()
Result:

FYI, although an answer has been accepted already, I should point out for the historical record that dense output and arbitrary sampling from anywhere along the computed trajectory is natively supported in PyDSTool. This also includes a record of all the adaptively-determined time steps used internally by the solver. This interfaces with both dopri853 and radau5 and auto-generates the C code necessary to interface with them rather than relying on (much slower) python function callbacks for the right-hand side definition. None of these features are natively or efficiently provided in any other python-focused solver, to my knowledge.

Here's another option that should also work with dopri5 and dop853. Basically, the solver will call the logistic() function as often as needed to calculate intermediate values so that's where we store the results:
import numpy as np
from scipy.integrate import ode
import matplotlib.pyplot as plt
sol = []
def logistic(t, y, r):
sol.append([t, y])
return r * y * (1.0 - y)
r = .01
t0 = 0
y0 = 1e-5
t1 = 5000.0
# Maximum number of steps that the integrator is allowed
# to do along the whole interval [t0, t1].
N = 10000
#backend = 'vode'
backend = 'dopri5'
#backend = 'dop853'
solver = ode(logistic).set_integrator(backend, nsteps=N)
solver.set_initial_value(y0, t0).set_f_params(r)
# Single call to solver.integrate()
solver.integrate(t1)
sol = np.array(sol)
plt.plot(sol[:,0], sol[:,1], 'b.-')
plt.show()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Integrate function of random variables in pymc3 - python

Related

Solving coupled differential equations with sympy

System of First Order ODEs in Python

python: Initial condition in solving differential equation

The shape variable in pymc3.DensityDist does not work properly

Using adaptive step sizes with scipy.integrate.ode

Categories

Resources