I'm trying to estimate two parameter values A and B of an ODE using curve_fit, and then fit the solution to this ODE to my data set, plotting the results.
My code:
def model(I,t,A,B):
    dIdt = A*(2000 - I) + B*(2000 - I)*(I/2000)
    return dIdt
xData = # this is an np.array of my x values
yData = # this is an np.array of my y values
plt.plot(xData, yData, 'r.-', label='experimental-data') #This part of the code seems to work
initialGuess = [1.0,1.0]
popt, pcov = curve_fit(model, xData, yData, initialGuess) #This is where the error is
print(popt)
xFit = np.arange(0.0, 5.0, 0.01)
I0 = 0
t = np.linspace(0,60)
I = odeint(model,I0,t) #This is where i integrate the ODE to obtain I(t).
plt.plot(xFit, I(xFit, *popt), 'r', label='fit params: a=%5.3f, b=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
The error I am getting is
model() missing 1 required positional argument: 'B'.
I roughly understand what's going on: my model() function takes in 4 arguments at the beginning: I,t,A and B. However, somewhere along the line, the code only recognizes these first 3 arguments, and leaves out B. I am not sure how to fix this.
I have tried a few things:
Taking 'initialGuess' out of the error line, so that curve_fit receives only 3 arguments. This gave me a new error:
Improper input: N=3 must not exceed M=1
which makes me think that the initialGuess entry isn't the problem.
Changed model in the error line to model(), which gave me the error
model() missing 4 required positional arguments: 'I', 't', 'A', and 'B'
Working off this, I changed model to model(I,t,A,B), which ends up giving me name 'A' is not defined
And now I am lost.
All of these errors happen on the same line, so I've tried changing things there, but perhaps I am missing something else. Most of the online sources that touch on this error mention having to instantiate a class instance, but I'm unsure what this means in this context, since I have not defined a class in the code.
I hope I've made my confusion clear, any guidance would be appreciated.
Perform curve_fit from scipy.optimize with model function (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html):
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
def model(i, a, b):
    return a * (2_000 - i)\
        + b * (2_000 - i)\
        * (i / 2_000)
xData = np.array(range(10))
yData = model(xData, 1, 1)
initialGuess = [1.0, 1.0]
popt, pcov = curve_fit(f=model,
                       xdata=xData,
                       ydata=yData,
                       p0=initialGuess
                       )
print(popt)
Returns:
[1. 1.]
Next, perform the integration using odeint from scipy.integrate:
from scipy.integrate import odeint
xFit = np.arange(0.0, 5.0, 0.1)
I0 = 0
t = np.linspace(0, 60)
a, b = 1, 1
def model(i, t, a, b):
    return a * (2_000 - i)\
        + b * (2_000 - i)\
        * (i / 2_000)
I = odeint(model, I0, t, args=(a, b))
plt.plot(xFit, I[:, 0], 'b', label= 'fit params: a=%5.3f, b=%5.3f' % tuple(popt))
plt.xlabel('x')
plt.ylabel('y')
plt.legend()
plt.show()
Reveals the plot (see https://docs.scipy.org/doc/scipy/reference/generated/scipy.integrate.odeint.html):
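For what it's worth, the original error comes from how curve_fit calls its function: it always calls f(xdata, *params). With two entries in p0 that means model(xData, 1.0, 1.0), so I gets the x values, t and A get the two guesses, and nothing is left for B. To estimate A and B from the ODE itself, one option is to hand curve_fit a wrapper that integrates the ODE and returns I(t). Below is a minimal sketch of that idea. The names rhs and ode_solution, the synthetic "true" values, the tightened odeint tolerances and the non-negativity bounds are all my own assumptions for illustration, not something from the question.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

I0 = 0.0  # initial condition I(0), as in the question


def rhs(I, t, A, B):
    """Right-hand side dI/dt of the ODE from the question."""
    return A * (2000 - I) + B * (2000 - I) * (I / 2000)


def ode_solution(t, A, B):
    """Integrate the ODE and return I(t); this is the function curve_fit calls."""
    # odeint treats t[0] as the initial time, so t must start where I0 applies.
    # Tight tolerances (an assumption) keep the finite-difference Jacobian smooth.
    return odeint(rhs, I0, t, args=(A, B), rtol=1e-10, atol=1e-10)[:, 0]


# Synthetic stand-in for the experimental data (assumed "true" values)
A_true, B_true = 0.05, 0.1
t_data = np.linspace(0, 60, 40)
I_data = ode_solution(t_data, A_true, B_true)

# Bounds keep A and B non-negative (an assumption); negative trial values
# can make the solution blow up before the optimizer recovers.
popt, pcov = curve_fit(ode_solution, t_data, I_data,
                       p0=[0.1, 0.1], bounds=(0, np.inf))
print(popt)
```

With real measurements, t_data and I_data would be replaced by xData and yData, provided xData starts at the time where I0 holds.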
Related
I'm doing a curve fit in python using scipy.curve_fit, and the fit itself looks great, however the parameters that are generated don't make sense.
The equation is (ax)^b + cx, but with the params python finds a = -c and b = 1, so the whole equation just equals 0 for every value of x.
here is the plot
https://i.stack.imgur.com/fBfg7.png
here is the experimental raw data I used: https://pastebin.com/CR2BCJji
xdata = cfu_u
ydata = OD_u
min_cfu = 0.1
max_cfu = 9.1
x_vec = pow(10,np.arange(min_cfu,max_cfu,0.1))
def func(x,a, b, c):
    return (a*x)**b + c*x
popt, pcov = curve_fit(func, xdata, ydata)
plt.plot(x_vec, func(x_vec, *popt), label = 'curve fit',color='slateblue',linewidth = 2.2)
plt.plot(cfu_u,OD_u,'-',label = 'experimental data',marker='.',markersize=8,color='deepskyblue',linewidth = 1.4)
plt.legend(loc='upper left',fontsize=12)
plt.ylabel("Y",fontsize=12)
plt.xlabel("X",fontsize=12)
plt.xscale("log")
plt.gcf().set_size_inches(7, 5)
plt.show()
print(popt)
[ 1.44930871e+03 1.00000000e+00 -1.44930871e+03]
I used the curve_fit function from scipy to fit an exponential curve to some data. The fit looks very good, so that part was a success.
However, the parameters output by the curve_fit function do not make sense, and solving f(x) with them results in f(x)=0 for every value of x, which is clearly not what is happening in the curve.
Modify your model to show what's actually happening:
def func(x: np.ndarray, a: float, b: float, c: float) -> np.ndarray:
    return (a*x)**(1 - b) + (c - a)*x
producing optimized parameters
[3.49003332e-04 6.60420171e-06 3.13366557e-08]
This is likely to be numerically unstable. Try optimizing in the log domain instead.
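A rough sketch of what fitting in the log domain could look like, under my own assumptions: the data are strictly positive, the original form (a*x)**b + c*x is used, and the synthetic values, starting guess and bounds below are made up for illustration. The idea is to fit log(y) against log(model), so that every decade of x carries comparable weight.

```python
import numpy as np
from scipy.optimize import curve_fit


def func(x, a, b, c):
    return (a * x) ** b + c * x


def log_func(x, a, b, c):
    # Same model, but returning log(y) instead of y
    return np.log(func(x, a, b, c))


# Synthetic positive data spanning several decades (assumed values)
true_params = (2.0, 0.5, 1e-3)
x = np.geomspace(1.0, 1e6, 40)
y = func(x, *true_params)

popt, pcov = curve_fit(
    log_func, x, np.log(y),
    p0=(1.0, 0.6, 2e-3),
    bounds=(1e-12, np.inf),  # keep a, b, c positive so the log stays defined
)
print(popt)
```

Whether this helps on the real data depends on the noise: a log-domain fit effectively minimizes relative rather than absolute errors.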
When I run your example (after adding imports, etc.), I get NaNs for popt, and I eventually realized you were allowing general, real b with negative x. If I fit to the positive x only, I get a popt of [1.89176133e+01 5.66689997e+00 1.29380532e+08]. The fit isn't too bad (see below), but perhaps you need to restrict b to be an integer to fit the whole set. I'm not sure how to do that in Scipy (I assume you need mixed integer-real optimization, and I haven't investigated if Scipy supports that.)
Code:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt
cfu_u, OD_u = np.loadtxt('data.txt', skiprows=1).T
# fit to positive x only
posmask = cfu_u > 0
xdata = cfu_u[posmask]
ydata = OD_u[posmask]
def func(x, a, b, c):
    return (a*x)**b + c*x
popt, pcov = curve_fit(func, xdata, ydata, p0=[1000,2,1])
x_vec = np.geomspace(xdata.min(), xdata.max())
plt.plot(x_vec, func(x_vec, *popt), label = 'curve fit',color='slateblue',linewidth = 2.2)
plt.plot(cfu_u,OD_u,'-',label = 'experimental data', marker='x',markersize=8,color='deepskyblue',linewidth = 1.4)
plt.legend(loc='upper left',fontsize=12)
plt.ylabel("Y",fontsize=12)
plt.xlabel("X",fontsize=12)
plt.yscale("log")
plt.xscale("symlog")
plt.show()
print(popt)
#[1.89176133e+01 5.66689997e+00 1.29380532e+08]
I want to solve the equation in python over the time Interval I = [0,10] with initial condition (x_0, y_0) = (1,0) and the parameter values μ ∈ {-2, -1, 0, 1, 2} using the function
scipy.integrate.odeint
Then I want to plot the solutions (x(t;x_0,y_0), y(t;x_0,y_0)) in the xy-plane.
The originally given linear system is
dx/dt = y, x(0) = x_0
dy/dt = - x - μy, y(0) = y_0
Please see my code below:
import numpy as np
from scipy.integrate import odeint
sol = odeint(myode, y0, t , args=(mu,1)) #mu and 1 are the coefficients when set equation to 0
y0 = 0
myode(y, t, mu) = -x-mu*y
def t = np.linspace(0,10, 101) #time interval
dydt = [y[1], -y[0] - mu*y[1]]
return dydt
Could anyone check if I defined the callable function myode correctly? This function evaluates the right hand side of the ODE.
Also, a syntax error message showed up for this line of code
def t = np.linspace(0,10, 101) #time interval
saying there is invalid syntax. Should I somehow use
for * in **
to get rid of the error message? If yes, how exactly?
I am very new to Python and ODE. Could anyone help me with this question? Thank you very much!
myode should be a function definition, thus
def myode(u, t, mu): x,y = u; return [ y, -x-mu*y]
The time array is a simple variable declaration/assignment, there should be no def there. As the system is two-dimensional, the initial value also needs to have dimension two
sol = odeint(myode, [x0,y0], t, args=(mu,) )
Thus a minimal modification of your script is
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def myode(u, t, mu): x,y = u; return [ y, -x-mu*y]
t = np.linspace(0,10, 101) #time interval
x0,y0 = 1,0 # initial conditions
for mu in [-2,-1,0,1,2]:
    sol = odeint(myode, [x0,y0], t, args=(mu,) )
    x,y = sol.T
    plt.plot(x,y)
a=5; plt.xlim(-a,a); plt.ylim(-a,a)
plt.grid(); plt.show()
giving the plot
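One way to sanity-check that myode is defined correctly is the case μ = 0, where the system reduces to dx/dt = y, dy/dt = -x and the exact solution for (x_0, y_0) = (1, 0) is x = cos(t), y = -sin(t). A small sketch of that comparison (the variable max_err is just my name for the largest deviation):

```python
import numpy as np
from scipy.integrate import odeint


def myode(u, t, mu):
    x, y = u
    return [y, -x - mu * y]


t = np.linspace(0, 10, 101)
sol = odeint(myode, [1, 0], t, args=(0,))  # mu = 0
x, y = sol.T

# For mu = 0 the exact solution is x = cos(t), y = -sin(t)
max_err = max(np.abs(x - np.cos(t)).max(), np.abs(y + np.sin(t)).max())
print(max_err)
```

If the right-hand side had a sign error or the components were swapped, this deviation would be of order one instead of close to the solver tolerance.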
Try using the solve_ivp method.
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
import numpy as np
mus = [-2,-1,0,1,2]
for mu in mus:
    # solve_ivp expects the right-hand side as f(t, y)
    def rhs2(t,y):
        return [y[1], -1*y[0] - mu*y[1]]
    res2 = solve_ivp(rhs2, [0,10], [1,0] , t_eval=[0,1,2,3,4,5,6,7,8,9,10], method = 'RK45')
    t = res2.t
    x = res2.y[0]
    y = res2.y[1]
    fig = plt.figure()
    plt.plot(t, x, 'b-', label='X(t)')
    plt.plot(t, y, 'g-', label='Y(t)')
    plt.title("mu = {}".format(mu))
    plt.legend(loc='lower right')
    plt.show()
Here is the solve_ivp method Documentation
Here is a very similar problem with a better explanation.
Hello, I have a problem fitting some data with Python. I have only just started fitting data with Python, so I am running into some problems. This is my code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import linalg as LA
def f(x,a,b,c):
    return a*np.power(x,b)+c
x = np.arange(1, 80)  # 1, 2, ..., 79
y = np.array([7200,7925,8050,8200,8000,7550,7500,6800,6400,8150,6566,6280,6105,5963,5673,5495,5395,4800,4550,4558,4228,4087,3951,3817,3721,3612,3498,3416,3359,3269,3163,3241,2984,4475,2757,2644,2555,2600,3163,2720,2630,2543,2454,2441,2389,2339,2293,2261,2212,2180,2143,2450,2065,2032,1994,1960,1930,1897,1870,1838,1821,1785,1763,1741,1718,1689,1676,1662,1635,1635,1667,1633,1617,1615,1599,1581,1565,1547,1547])
params, extras = curve_fit(f, x, y)
plt.plot(x,y, 'o')
plt.plot(x, f(x, params[0], params[1], params[2]))
plt.title('Fit')
plt.legend(['data','fit'],loc='best')
plt.show()
What I actually want is to fit my data with the function f(x) = a*x^b + c, finding the values of a, b and c that best fit the data.
Do you know what is wrong here?
Thank you for your help.
Three caveats:
Your model is not very good.
It diverges at x=0 (when b is negative): don't use the first points.
You must give initial parameter estimates.
An example:
p0=[50000,-1,0]
x=x[10:]
y=y[10:]
params, cov = curve_fit(f, x, y,p0) #params=[3.16e+04 -5.83e-01 -1.00e+03]
plt.plot(x,y, 'o')
plt.plot(x, f(x, *params))
plt.title('Fit')
plt.legend(['data','fit'],loc='best')
plt.show()
You can estimate the quality of the model by
In [178]: np.sqrt(np.diag(cov))/params
Out[178]: array([ 0.12066005, -0.12537714, -0.53450057])
which shows that the estimated relative standard error of each parameter is greater than 10% in magnitude.
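To make the origin of those numbers explicit: the diagonal of the covariance matrix returned by curve_fit holds the parameter variances, so its square root is the one-sigma standard error. Here is a self-contained sketch on synthetic data. The noise level and random seed are assumptions of mine; the "true" parameters are simply the fitted values quoted above.

```python
import numpy as np
from scipy.optimize import curve_fit


def f(x, a, b, c):
    return a * np.power(x, b) + c


# Synthetic data resembling the question's (assumed noise level and seed)
rng = np.random.default_rng(0)
x = np.arange(11, 80, dtype=float)
y = f(x, 3.16e4, -0.583, -1000.0) + rng.normal(scale=100.0, size=x.size)

params, cov = curve_fit(f, x, y, p0=[50000, -1, 0])

perr = np.sqrt(np.diag(cov))       # one-sigma standard error per parameter
rel_err = perr / np.abs(params)    # relative error, sign-free
print(rel_err)
```

Dividing by np.abs(params) rather than params avoids the negative "errors" seen in the output above, which only reflect the sign of the parameter.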
The problem is the function you use for fitting. Consider using something like
def f(x, a, b, c):
    return a*x + b*np.power(x, 2) + c
EDIT: accidentally posted the original function instead of the one I wanted to suggest.
Why is this fit so bad?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fit(x, a, b, c, d):
    return a * np.sin(b * x + c) + d
xdata = np.linspace(0, 360, 1000)
ydata = 89.9535 + 60.9535 * np.sin(0.0174 * xdata - 1.5708)
popt, pcov = curve_fit(fit, xdata, ydata)
plt.plot(xdata, 89.9535 + 60.9535 * np.sin(0.0174 * xdata - 1.5708))
plt.plot(xdata, fit(xdata, popt[0], popt[1], popt[2], popt[3]))
plt.show()
The fitted curve looks very strange, or maybe I am misusing curve_fit. Thanks for any help.
This is the result:
curve_fit finds a local minimum for the least-squares problem. In this case, there are many local minima.
One way around this is to use as good an initial guess as possible. For problems with multiple local minima, curve_fit's default of all ones for the initial guess can be pretty bad. For your function, the crucial parameter is b, the frequency. If you know that value will be small, i.e. on the order of 0.01, use 0.01 as the initial guess:
In [77]: (a, b, c, d), pcov = curve_fit(fit, xdata, ydata, p0=[1, .01, 1, 1])
In [78]: a
Out[78]: 60.953499999999998
In [79]: b
Out[79]: 0.017399999999999999
In [80]: c
Out[80]: -102.10176491487339
In [81]: ((c + np.pi) % (2*np.pi)) - np.pi
Out[81]: -1.570800000000002
In [82]: d
Out[82]: 89.953500000000005
As an alternative, plot the original data alone and use it to make initial guesses of the parameters. For a periodic function it can be easy to estimate the period and the amplitude. In this case the guesses need not be too close.
Then I used these in curve_fit:
popt, pcov = curve_fit(fit, xdata, ydata, [ 80., np.pi/330, 1., 1. ])
The results it returned are essentially the original values.
array([ 6.09535000e+01, 1.74000000e-02, -1.57080000e+00,
8.99535000e+01])
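If reading the guesses off a plot is not practical, they can also be estimated from the data. The sketch below is one possible way, not the method used above: the offset and amplitude come from the mean and range of the data, and the frequency and phase come from the strongest non-constant FFT component. It assumes evenly spaced x values that start at 0 and cover at least roughly one period. The names a0, b0, c0, d0 and k are mine.

```python
import numpy as np
from scipy.optimize import curve_fit


def fit(x, a, b, c, d):
    return a * np.sin(b * x + c) + d


xdata = np.linspace(0, 360, 1000)
ydata = 89.9535 + 60.9535 * np.sin(0.0174 * xdata - 1.5708)

# Offset and amplitude guesses from the mean and range of the data
d0 = ydata.mean()
a0 = (ydata.max() - ydata.min()) / 2

# Frequency and phase guesses from the strongest non-constant FFT component
dx = xdata[1] - xdata[0]
spectrum = np.fft.rfft(ydata - d0)
freqs = np.fft.rfftfreq(xdata.size, d=dx)    # cycles per unit of x
k = 1 + np.argmax(np.abs(spectrum[1:]))      # skip the constant term
b0 = 2 * np.pi * freqs[k]                    # angular frequency
c0 = np.angle(spectrum[k]) + np.pi / 2       # sine phase; valid since xdata starts at 0

popt, pcov = curve_fit(fit, xdata, ydata, p0=[a0, b0, c0, d0])
print(popt)
```

The fitted phase may differ from the original by a multiple of 2π, which describes the same curve.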
I need to determine the values of the coefficients in my equation. For that I decided to use the least squares method. The equation is presented below:
The equation presents a connection between stress and time to failure of a tested product at different temperature levels. The data that I've used is made up, but presents the structure of the actual data, that I will use later on.
For better understanding I also included a graphical correlation:
I am fairly new to Python, so I didn't know that there are so many ways/functions available for this method. I decided to try out a few:
Input data
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from lmfit import minimize, Parameters, fit_report
# data
temp =np.array([650, 700, 750, 720, 680]) # temperature
xdata = np.array([500, 525, 540, 534, 490]) # time
ydata = np.array([330, 332, 315, 325, 335]) # stress
T = temp[0]
plt.plot(xdata,ydata,'*')
plt.xlabel('xdata')
plt.ylabel('ydata')
1. Using the curve_fit function
def func(logS, a_0, a_1, a_2, T_a, logt_a):
    return logt_a + (T - T_a) * (a_0 + a_1 * logS + a_2 * logS**2)
popt, pcov = curve_fit(func, xdata, ydata, p0=(1, 1, 1, 1, 1))
popt
zapis = 'a_0: {0:1.5e}\na_1: {1:1.5e}\na_2: {2:1.5e}\nT_a: {3:1.5e}\nlogt_a: {4:1.5e}'.format(popt[0], popt[1], popt[2], popt[3], popt[4])
print(zapis)
a_0 = popt[0]
a_1 = popt[1]
a_2 = popt[2]
T_a = popt[3]
logt_a = popt[4]
residuals = ydata - func(xdata, a_0, a_1, a_2, T_a, logt_a)
fres = sum(residuals**2)
print(fres)
curvex=np.linspace(np.min(xdata)-np.min(xdata)/10, np.max(xdata)+50, int(np.max(xdata)/10))
curvey=func(curvex, a_0, a_1, a_2, T_a, logt_a)
plt.plot(xdata,ydata,'*')
plt.plot(curvex,curvey, 'r')
plt.xlabel('xdata')
plt.ylabel('ydata')
2. Using the leastsq function
from scipy.optimize import leastsq
def function(parameters, logS):
    a_0, a_1, a_2, T_a, logt_a = parameters
    model = logt_a + (T - T_a) * (a_0 + a_1 * logS + a_2 * logS**2)
    return model
def objective(pars, t_r, logS):
    err = t_r - function(pars, logS)
    return err
x0 = [ 1.0, 1.0, 1.0, 1.0, 1.0 ] #initial guess of parameters
plsq = leastsq(objective, x0, args=(ydata, xdata))
print('Fitted parameters = {0}'.format(plsq[0]))
plt.plot(xdata, ydata, 'ro')
#plot the fitted curve on top
x = np.linspace(min(xdata), max(xdata), 50)
y = function(plsq[0], x)
plt.plot(x, y, 'k-')
plt.xlabel('x')
plt.ylabel('y')
In both cases I got these results:
a_0: -5.95683e+02
a_1: 2.65405e-02
a_2: -2.63017e-05
T_a: 1.21502e+02
logt_a: 3.11614e+05
Question 1: What is the best way of determining the initial values of the coefficients I am looking for?
Question 2: Which of the least-squares-based methods in Python is best for equations like the one in my case?
Question 3: Is there a way to make the process of determining the coefficients more automated? I will also have to try out higher-order polynomials, which will lead to more coefficients (a_3, a_4, a_5,...). The idea would be to specify the order of the polynomial and have everything else calculated and set up automatically.
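Regarding the idea in Question 3, one possible direction (only a sketch, not a tested recipe for this data) is a model that takes the polynomial coefficients as a variable-length argument list, so the order is set by how many values are passed. The factory make_model and the parameter ordering (T_a, logt_a, a_0, a_1, ...) are hypothetical choices of mine.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def make_model(T):
    """Build a model whose polynomial order follows the number of a_i passed."""
    def model(logS, T_a, logt_a, *a):
        # P.polyval evaluates a[0] + a[1]*logS + a[2]*logS**2 + ...
        return logt_a + (T - T_a) * P.polyval(logS, a)
    return model


order = 2
model = make_model(T=650)
p0 = np.ones(order + 3)   # T_a, logt_a, a_0 ... a_order

xdata = np.array([500, 525, 540, 534, 490], dtype=float)

# Check against the hand-written second-order formula from the question
by_hand = p0[1] + (650 - p0[0]) * (p0[2] + p0[3] * xdata + p0[4] * xdata**2)
print(np.allclose(model(xdata, *p0), by_hand))
```

Because the signature uses *a, curve_fit cannot work out the number of parameters by introspection, so p0 has to be passed explicitly, for example curve_fit(model, xdata, ydata, p0=p0). The length of p0 then determines the polynomial order.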