Fitting data to numerical solution of an ode in python - python

I have a system of two first order ODEs, which are nonlinear, and hence difficult to solve analytically in a closed form. I want to fit the numerical solution to this system of ODEs to a data set. My data set is for only one of the two variables that are part of the ODE system. How do I go about this?
This didn't help because there's only one variable there.
My code which is currently leading to an error is:
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit
def f(y, t, a, b, g):
S, I = y # S, I are supposed to be my variables
Sdot = -a * S * I
Idot = (a - b) * S * I + (b - g - b * I) * I
dydt = [Sdot, Idot]
return dydt
def y(t, a, b, g, y0):
y = odeint(f, y0, t, args=(a, b, g))
return y.ravel()
I_data =[] # I have data only for I, not for S
file = open('./ratings_showdown.csv')
for e_raw in file.read().split('\r\n'):
try:
e=float(e_raw); I_data.append(e)
except ValueError:
continue
data_t = range(len(I_data))
popt, cov = curve_fit(y, data_t, I_data, [.05, 0.02, 0.01, [0.99,0.01]])
#want to fit I part of solution to data for variable I
#ERROR here, ValueError: setting an array element with a sequence
a_opt, b_opt, g_opt, y0_opt = popt
print("a = %g" % a_opt)
print("b = %g" % b_opt)
print("g = %g" % g_opt)
print("y0 = %g" % y0_opt)
import matplotlib.pyplot as plt
t = np.linspace(0, len(data_y), 2000)
plt.plot(data_t, data_y, '.',
t, y(t, a_opt, b_opt, g_opt, y0_opt), '-')
plt.gcf().set_size_inches(6, 4)
#plt.savefig('out.png', dpi=96) #to save the fit result
plt.show()

This type of ODE fitting becomes a lot easier in symfit, which I wrote specifically as a user friendly wrapper to scipy. I think it will be very useful for your situation because the decreased amount of boiler-plate code simplifies things a lot.
From the docs and applied roughly to your problem:
from symfit import variables, parameters, Fit, D, ODEModel
S, I, t = variables('S, I, t')
a, b, g = parameters('a, b, g')
model_dict = {
D(S, t): -a * S * I,
D(I, t): (a - b) * S * I + (b - g - b * I) * I
}
ode_model = ODEModel(model_dict, initial={t: 0.0, S: 0.99, I: 0.01})
fit = Fit(ode_model, t=tdata, I=I_data, S=None)
fit_result = fit.execute()
Check out the docs for more :)

So I figured out the problem.
The curve_fit() function apparently returns a list as it's second return value. So, instead of passing the initial conditions as a list [0.99,0.01], I passed them separately as 0.99 and 0.01.

Related

Python solve ODE

How to solve a ODE,I have tried to use scipy.integrate.odeint, but what should I do if I don't know the initial value,
That's what I've defined:
alpha=0.204
beta=0.196
b=5.853
c=241.38
def Left_equation(x):
return alpha * x
def Right_equation(x):
return (b * (x ** 2)) + c
def diff_equation(x, t):
return (t ** beta) * (Right_equation(x) - Left_equation(x))
I also want to get the graph of the result,I don't know if I need to get the analytic solution first.
If the initial condition is not known, then the integration would need to be done symbolically. The Python package sympy can be used for symbolic integration of ordinary differential equations, as follows (using the function sympy.dsolve):
"""How to integrate symbolically an ordinary differential equation."""
import sympy
def main():
alpha = 0.204
beta = 0.196
b = 5.853
c = 241.38
t = sympy.symbols('t')
x = sympy.Function('x')(t)
dxdt = sympy.Derivative(x, t)
e = (t**beta) * ((b * (x**2)) + c - alpha * x)
x_eq = sympy.dsolve(dxdt - e)
if __name__ == '__main__':
main()
For this example, the function sympy.solvers.ode.ode.dsolve raises the exception PolynomialDivisionFailed. But the above code shows how to do symbolic integration.
An alternative is to numerically solve the differential equation (for specific initial condition), and compute solutions for a range of initial conditions, to explore how the solution depends on the initial condition. Using the function scipy.integrate.odepack.odeint of the Python package scipy, and the Python packages matplotlib and numpy, this can be done as follows:
"""How to integrate numerically an ordinary differential equation."""
from matplotlib import pyplot as plt
import numpy as np
import scipy.integrate
def main():
x0 = np.linspace(0, 0.2, 10) # multiple alternative initial conditions
t = np.linspace(0, 0.01, 100) # where to solve
x = scipy.integrate.odeint(deriv, x0, t)
# plot results
plt.plot(t, x)
plt.xlabel('t')
plt.ylabel('x')
plt.grid(True)
plt.show()
def deriv(x, t):
alpha = 0.204
beta = 0.196
b = 5.853
c = 241.38
return (t ** beta) * ((b * (x**2)) + c - alpha * x)
if __name__ == '__main__':
main()
As noted though in the documentation of the function scipy.integrate.odeint:
For new code, use scipy.integrate.solve_ivp to solve a differential equation.
The function scipy.integrate.solve_ivp can be used as follows:
"""How to integrate numerically an ordinary differential equation."""
from matplotlib import pyplot as plt
import numpy as np
import scipy.integrate
def main():
x0 = np.linspace(0, 0.2, 10) # multiple alternative initial conditions
t = (0.0, 0.01) # where to solve: (start, end)
res = scipy.integrate.solve_ivp(deriv, t, x0) # different order of
# arguments than for the function `scipy.integrate.odeint`
# plot results
plt.plot(res.t, res.y.T)
plt.xlabel('t')
plt.ylabel('x')
plt.grid(True)
plt.show()
def deriv(t, x): # not x, t as for the function `scipy.integrate.odeint`
alpha = 0.204
beta = 0.196
b = 5.853
c = 241.38
return (t ** beta) * ((b * (x**2)) + c - alpha * x)
if __name__ == '__main__':
main()
Please note the differences in arguments between the function scipy.integrate.odeint and the function scipy.integrate.solve_ivp.
The above code produces the following plot:

curve_fit to vary only the last three parametres while two values are known

I'm trying to fit 3 model parameters to curve_fit with 2 known parameters. Here's an example (that doesn't work but I think this is pretty much what I should be doing)
def richards(t, beta, l_f, nu, k, t_m):
denom = 1 + nu * np.exp(-k * (t - t_m))
return beta + l_f / np.power(denom, 1/nu)
for i, (r, c) in enumerate(coords):
params[i, :] = curve_fit(
f = richards,
xdata = ts,
ydata = plates[0, r, c]
)[0]
So the question is how do I incorporate the known variables so that only 3 variables need to be fitted?
Say you know that beta = 2 and nu = 3. Then you can define an auxiliary function
def f(t, l_f, k, t_m):
return richards(t=t, beta=2, l_f=l_f, nu=3, k=k, t_m=t_m)
and pass this function to the modelling function:
... curve_fit(f=f, ...)

Curve fitting of complex data

I want to fit complex data set with a two functions which shared the same parameters. For this I used
def funcReal(x,a,b,c,d):
return np.real((a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x)))
def funcImag(x,a,b,c,d):
return np.imag((a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x)))`
poptReal, pcovReal = curve_fit(funcReal, x, yReal)
poptImag, pcovImag = curve_fit(funcImag, x, yImag)
Here funcReal is the real part of my model, funcImag the imaginary part, yReal the real part of the data and yImag the imaginary part of the data.
However, both fits does not give me the same parameters for the real and imaginary part.
My question is there a package or a method such that I can realized multi fits for multiple data sets and multiple functions with shared parameters?
To fit both the complex function given above, we can treat the real and imaginary components as a coordinate point, or as a vector. Since curve_fit doesn't care about the order at which data points are inserted in the vectors x (independent data) and y (dependent data), we can simply split the complex data and stack the real and imaginary components using hstack. See the example below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
kappa1 = np.pi
kappa2 = -0.01
def long_function(x, a, b, c, d):
return (a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x))
def funcBoth(x, a, b, c, d):
N = len(x)
x_real = x[:N//2]
x_imag = x[N//2:]
y_real = np.real(long_function(x_real, a, b, c, d))
y_imag = np.imag(long_function(x_imag, a, b, c, d))
return np.hstack([y_real, y_imag])
# Create an independent variable with 100 measurements
N = 100
x = np.linspace(0, 10, N)
# True values of the dependent variable
y = long_function(x, a=1.1, b=0.3, c=-0.2, d=0.23)
# Add uniform complex noise (real + imaginary)
noise = (np.random.rand(N) + 1j * np.random.rand(N) - 0.5 - 0.5j) * 0.1
yNoisy = y + noise
# Split the measurements into a real and imaginary part
yReal = np.real(yNoisy)
yImag = np.imag(yNoisy)
yBoth = np.hstack([yReal, yImag])
# Find the best-fit solution
poptBoth, pcovBoth = curve_fit(funcBoth, np.hstack([x, x]), yBoth)
# Compute the best-fit solution
yFit = long_function(x, *poptBoth)
print(poptBoth)
# Plot the results
plt.figure(figsize=(9, 4))
plt.subplot(121)
plt.plot(x, np.real(yNoisy), "k.", label="Noisy y")
plt.plot(x, np.real(y), "r--", label="True y")
plt.plot(x, np.real(yFit), label="Best fit")
plt.ylabel("Real part of y")
plt.xlabel("x")
plt.legend()
plt.subplot(122)
plt.plot(x, np.imag(yNoisy), "k.")
plt.plot(x, np.imag(y), "r--")
plt.plot(x, np.imag(yFit))
plt.ylabel("Imaginary part of y")
plt.xlabel("x")
plt.tight_layout()
plt.show()
Result:
The best-fit parameters that were found in this example were a = 1.14, b = 0.375, c = -0.236, and d = 0.163, which are close enough to the true parameter values given the amplitude of the noise that I inserted here.

Scipy.optimize.curve_fit does not fit

Say I want to fit a sine function using scipy.optimize.curve_fit. I don't know any parameters of the function. To get the frequency, I do Fourier transform and guess all the other parameters - amplitude, phase, and offset. When running my program, I do get a fit but it does not make sense. What is the problem? Any help will be appreciated.
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
ampl = 1
freq = 24.5
phase = np.pi/2
offset = 0.05
t = np.arange(0,10,0.001)
func = np.sin(2*np.pi*t*freq + phase) + offset
fastfft = np.fft.fft(func)
freq_array = np.fft.fftfreq(len(t),t[0]-t[1])
max_value_index = np.argmax(abs(fastfft))
frequency = abs(freq_array[max_value_index])
def fit(a, f, p, o, t):
return a * np.sin(2*np.pi*t*f + p) + o
guess = (0.9, frequency, np.pi/4, 0.1)
params, fit = sp.optimize.curve_fit(fit, t, func, p0=guess)
a, f, p, o = params
fitfunc = lambda t: a * np.sin(2*np.pi*t*f + p) + o
plt.plot(t, func, 'r-', t, fitfunc(t), 'b-')
The main problem in your program was a misunderstanding, how scipy.optimize.curve_fit is designed and its assumption of the fit function:
ydata = f(xdata, *params) + eps
This means that the fit function has to have the array for the x values as the first parameter followed by the function parameters in no particular order and must return an array for the y values. Here is an example, how to do this:
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize
#t has to be the first parameter of the fit function
def fit(t, a, f, p, o):
return a * np.sin(2*np.pi*t*f + p) + o
ampl = 1
freq = 2
phase = np.pi/2
offset = 0.5
t = np.arange(0,10,0.01)
#is the same as fit(t, ampl, freq, phase, offset)
func = np.sin(2*np.pi*t*freq + phase) + offset
fastfft = np.fft.fft(func)
freq_array = np.fft.fftfreq(len(t),t[0]-t[1])
max_value_index = np.argmax(abs(fastfft))
frequency = abs(freq_array[max_value_index])
guess = (0.9, frequency, np.pi/4, 0.1)
#renamed the covariance matrix
params, pcov = scipy.optimize.curve_fit(fit, t, func, p0=guess)
a, f, p, o = params
#calculate the fit plot using the fit function
plt.plot(t, func, 'r-', t, fit(t, *params), 'b-')
plt.show()
As you can see, I have also changed the way the fit function for the plot is calculated. You don't need another function - just utilise the fit function with the parameter list, the fit procedure gives you back.
The other problem was that you called the covariance array fit - overwriting the previously defined function fit. I fixed that as well.
P.S.: Of course now you only see one curve, because the perfect fit covers your data points.

Getting standard error associated with parameter estimates from scipy.optimize.curve_fit

I am using scipy.optimize.curve_fit to fit a curve to some data i have. The curves, for the most part, seem to fit very well. For some reason, pcov = inf when i print it off.
What i really need is to calculate the error associated with the parameters i'm fitting, and am not sure how exactly to do this even if it does give me the covariance matrix.
The model being fit to is:
def intensity(x,R_out,R_in,K_in,K_out,a,b,c):
K_in,K_out = abs(0.0),abs(K_out)
if x<=R_in:
return 2*R_out*(K_out*np.sqrt(1-x**2/R_out**2)-
(K_out-0.0)*np.sqrt(R_in**2/R_out**2-x**2/R_out**2)) + c
elif x>=R_in and x<=R_out:
return K_out*2*R_out*np.sqrt(1-x**2/R_out**2) + c
elif x>R_out:
return c
intensity_vec = np.vectorize(intensity)
def intensity_vec_self(x,R_out,R_in,K_in,K_out,a,b,c):
y = np.zeros(x.shape)
for i in range(len(y)):
y[i]=intensity_vec(x[i],R_out,R_in,K_in,K_out,a,b,c)
return y
and there are 400 data points, i can put that on here if you think it will help.
To summarize, i can't get curve_fit to print off my pcov and need help as to figure out why and if i can get it to do so.
Also, if it is a quick explanation i would like to know how to use the pcov array to attain the errors associated with my fit.
Thanks
The variance of parameters are the diagonal elements of the variance-co variance matrix, and the standard error is the square root of it. np.sqrt(np.diag(pcov))
Regarding getting inf, see and compare these two examples:
In [129]:
import numpy as np
def func(x, a, b, c, d):
return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5, 1)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print np.sqrt(np.diag(pcov))
[ inf inf inf inf]
And:
In [130]:
def func(x, a, b, c):
return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print np.sqrt(np.diag(pcov))
[ 0.11097646 0.11849107 0.05230711]
In this extreme example, d has no effect on the function func, hence it will be associated with variance of +inf, or in another word, it can be just about any value. Removing d from func will get what will make sense.
In reality, if parameters are of very different scale, say:
def func(x, a, b, c, d):
#return a * np.exp(-b * x) + c
return a * np.exp(-b * x) + c + d*1e-10
You will also get inf due to float point overflow/underflow.
In your case, I think you never used a and b. So it is just like the first example here.

Categories

Resources