I have some data points to fit with a model. My model is not defined as a closed-form equation but as the numerical solution of a system of 3 equations.
The model is defined as follows:
def eq(q):
    z1=q[0]
    z2=q[1]
    H=q[2]
    F=empty((3))
    F[0] = ((J*(1-(D*(1-(1-8*a*T/D**3)**(1/3)))**(b)/Lx))*sin(z1-z2))+(H*sin(z1-pi/4))+(((3.6*10**5)/2)*sin(2*z1))
    F[1] = ((-J*(1-(D*(1-(1-8*a*T/D**3)**(1/3)))**(b)/Lx))*sin(z1-z2))+(H*sin(z2-pi/4))+(((3.6*10**5)/2)*sin(2*z2))
    F[2] = cos(z1-pi/4)
    return F
guess1=array([2.35,0.2,125000])
z=fsolve(eq,guess1)
Hc=z[2]*(1-(T/Tb)**(1/2))
where
D=10**(-8)
a=2.2*10**(-28)
Lx=4.28*10**(-9)
and J, b, Tb are the parameters to fit, while z1, z2, H are the unknowns of the system.
My data points are:
T=[10, 60, 110, 160, 210, 260, 300]
Hc=[0.58933, 0.5783, 0.57938, 0.58884, 0.60588, 0.62788, 0.6474]
How can I find J, b and Tb by fitting this model to the data points?
You can use scipy.optimize.curve_fit. Note that curve_fit only cares about the input and the output of your function, so you need to define it this way:
def func(x, *params):
    ....
    return y
Then you can apply curve_fit (https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html):
popt, pcov = curve_fit(func, x, y_data)
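As a minimal sketch of that calling convention, here is a fit of a straight line to synthetic data. The model name `line` and the data are made up for illustration and are not part of the question:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical model: first argument is x, the rest are the fit parameters.
def line(x, slope, intercept):
    return slope * x + intercept

# Synthetic, noise-free data generated from known parameters.
x = np.linspace(0.0, 10.0, 20)
y_data = line(x, 2.0, -1.0)

# curve_fit only sees x going in and y coming out of the model.
popt, pcov = curve_fit(line, x, y_data)
print(popt)
```

The same pattern applies to any model, including one that runs a numerical solver internally, as long as it returns one y value per x value.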
I wrote an example based on your case. There are a hundred ways to write it more efficiently, but I tried to keep it clear so you can see the link between what you gave and what I propose.
import numpy as np
from scipy.optimize import fsolve, curve_fit
def eq(q, T, J, b):
    # Unknowns that fsolve solves for
    z1 = q[0]
    z2 = q[1]
    H = q[2]
    # Fixed constants from the question
    D=10**(-8)
    a=2.2*10**(-28)
    Lx=4.28*10**(-9)
    F = np.empty((3))
    F[0] = ((J*(1-(D*(1-(1-8*a*T/D**3)**(1/3)))**(b)/Lx))*np.sin(z1-z2))+(H*np.sin(z1-np.pi/4))+(((3.6*10**5)/2)*np.sin(2*z1))
    F[1] = ((-J*(1-(D*(1-(1-8*a*T/D**3)**(1/3)))**(b)/Lx))*np.sin(z1-z2))+(H*np.sin(z2-np.pi/4))+(((3.6*10**5)/2)*np.sin(2*z2))
    F[2] = np.cos(z1-np.pi/4)
    return F
def func(T,J,b,Tb):
    # Solve the system at each temperature and collect the model Hc
    Hc = []
    guess1 = np.array([2.35,0.2,125000])
    for t in T :
        z = fsolve(eq, guess1, args = (t, J, b))
        Hc.append(z[2]*(1-(t/Tb)**(1/2)))
    return np.array(Hc)
T = [10, 60, 110, 160, 210, 260, 300]
Hc_exp = [0.58933, 0.5783, 0.57938, 0.58884, 0.60588, 0.62788, 0.6474]
p0 = (1,1,100)
popt, pcov = curve_fit(func, T, Hc_exp, p0)
J = popt[0]
b = popt[1]
Tb = popt[2]
The fit appears to be hard to converge. To improve that, you can tune the initial guess for the parameters through p0, or add bounds on the parameters (see the curve_fit documentation). Once it converges, you can get the errors on the estimated parameters from pcov.
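A minimal sketch of combining p0 with bounds, and of reading parameter errors from pcov. The model `decay`, the synthetic data, and the bound values are assumptions chosen for illustration, not values suited to the Hc model above:

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical stand-in model (not the fsolve-based model above).
def decay(x, amp, rate, offset):
    return amp * np.exp(-rate * x) + offset

# Synthetic data with a little noise; the seed is fixed for repeatability.
rng = np.random.default_rng(0)
x = np.linspace(0, 4, 50)
true_params = (2.5, 1.3, 0.5)
y = decay(x, *true_params) + 0.01 * rng.normal(size=x.size)

# Initial guess and (lower, upper) bounds, one entry per parameter.
p0 = (1.0, 1.0, 0.0)
bounds = ([0.0, 0.0, -1.0], [10.0, 5.0, 1.0])

popt, pcov = curve_fit(decay, x, y, p0=p0, bounds=bounds)

# One-standard-deviation errors on the fitted parameters.
perr = np.sqrt(np.diag(pcov))
print(popt, perr)
```

The initial guess has to lie inside the bounds, otherwise curve_fit raises an error before the fit starts.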
I have a system of two first order ODEs, which are nonlinear, and hence difficult to solve analytically in a closed form. I want to fit the numerical solution to this system of ODEs to a data set. My data set is for only one of the two variables that are part of the ODE system. How do I go about this?
This didn't help because there's only one variable there.
My code which is currently leading to an error is:
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit
def f(y, t, a, b, g):
    S, I = y # S, I are supposed to be my variables
    Sdot = -a * S * I
    Idot = (a - b) * S * I + (b - g - b * I) * I
    dydt = [Sdot, Idot]
    return dydt
def y(t, a, b, g, y0):
    y = odeint(f, y0, t, args=(a, b, g))
    return y.ravel()
I_data =[] # I have data only for I, not for S
file = open('./ratings_showdown.csv')
for e_raw in file.read().split('\r\n'):
    try:
        e=float(e_raw); I_data.append(e)
    except ValueError:
        continue
data_t = range(len(I_data))
popt, cov = curve_fit(y, data_t, I_data, [.05, 0.02, 0.01, [0.99,0.01]])
#want to fit I part of solution to data for variable I
#ERROR here, ValueError: setting an array element with a sequence
a_opt, b_opt, g_opt, y0_opt = popt
print("a = %g" % a_opt)
print("b = %g" % b_opt)
print("g = %g" % g_opt)
print("y0 = %g" % y0_opt)
import matplotlib.pyplot as plt
t = np.linspace(0, len(data_y), 2000)
plt.plot(data_t, data_y, '.',
t, y(t, a_opt, b_opt, g_opt, y0_opt), '-')
plt.gcf().set_size_inches(6, 4)
#plt.savefig('out.png', dpi=96) #to save the fit result
plt.show()
This type of ODE fitting becomes a lot easier in symfit, which I wrote specifically as a user friendly wrapper to scipy. I think it will be very useful for your situation because the decreased amount of boiler-plate code simplifies things a lot.
From the docs and applied roughly to your problem:
from symfit import variables, parameters, Fit, D, ODEModel
S, I, t = variables('S, I, t')
a, b, g = parameters('a, b, g')
model_dict = {
    D(S, t): -a * S * I,
    D(I, t): (a - b) * S * I + (b - g - b * I) * I
}
ode_model = ODEModel(model_dict, initial={t: 0.0, S: 0.99, I: 0.01})
fit = Fit(ode_model, t=tdata, I=I_data, S=None)
fit_result = fit.execute()
Check out the docs for more :)
So I figured out the problem.
curve_fit() expects its initial guess p0 to be a flat sequence of scalars, so the nested list [0.99,0.01] inside p0 triggers the ValueError. So, instead of passing the initial conditions as a list [0.99,0.01], I passed them separately as 0.99 and 0.01.
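A sketch of what that fix can look like, assuming the same ODE system as in the question. The names `rhs` and `I_model`, the synthetic data, and the tolerance settings are my own choices for illustration; the model returns only the I column because data exist only for I:

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

def rhs(y, t, a, b, g):
    S, I = y
    return [-a * S * I, (a - b) * S * I + (b - g - b * I) * I]

# The initial conditions S0 and I0 are separate scalar fit parameters.
def I_model(t, a, b, g, S0, I0):
    sol = odeint(rhs, [S0, I0], t, args=(a, b, g), rtol=1e-10, atol=1e-10)
    return sol[:, 1]  # only the I component is compared with the data

# Synthetic stand-in for the measured I values (assumed, not real data).
t_data = np.linspace(0, 30, 40)
true_params = (0.5, 0.1, 0.05, 0.99, 0.01)
I_data = I_model(t_data, *true_params)

# p0 is a flat list of five scalars, with no nested list.
p0 = [0.4, 0.08, 0.04, 0.9, 0.02]
popt, pcov = curve_fit(I_model, t_data, I_data, p0=p0, maxfev=5000)
print(popt)
```

With real, noisy data the fit may need a better starting guess or bounds, and convergence is not guaranteed.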
I have a list of ints x, like [43, 43, 46, ....., 487, 496, 502] (just an example).
x is a list of word counts. I want to convert this list of word counts into a list of penalty scores for training a text classification model.
I'd like to use a curve function (maybe something like math.log?) to map values from x to y. The min value in x (43) should map to y = 0.8 and the max value in x (502) to y = 0.08, with the other values in x mapped to y by the same function.
For example:
x = [43, 43, 46, ....., 487, 496, 502]
y_bounds = [0.8, 0.08]
def creat_curve_func(x, y_bounds, curve_shape='log'):
    ...
func = creat_curve_func(x, y_bounds)
assert func(43) == 0.8
assert func(502) == 0.08
func(46)
>>> 0.78652 (just a fake result for example)
func(479)
>>> 0.097 (just a fake result for example)
I quickly found that I had to try parameters by hand, again and again, to get a curve function that fits my purpose.
Then I looked for a library to do this work and found scipy.optimize.curve_fit. But it needs at least three arguments: f (the function I want to generate), xdata, and ydata. I only have xdata and the y bounds (0.8, 0.08).
Is there any good solution?
Update
I thought this was easy to understand, so I didn't post the failing curve_fit code. Is this the reason for the downvote?
The reason why I can't just use curve_fit:
x = sorted([43, 43, 46, ....., 487, 496, 502])
y = np.linspace(0.8, 0.08, len(x)) # can not set y as this way which lead to the wrong result
def func(x, a, b):
    return a * x +b # I want a curve function in fact, linear is simple to understand here
popt, pcov = curve_fit(func, x, y)
func(42, *popt)
0.47056348146450089 # I want 0.8 here
How about this way?
EDIT: added weights. If you don't need to put your end points exactly on the curve you could use weights:
import scipy.optimize as opti
import numpy as np
xdata = np.array([43, 56, 234, 502], float)
ydata = np.linspace(0.8, 0.08, len(xdata))
weights = np.ones_like(xdata, float)
weights[0] = 0.001
weights[-1] = 0.001
def fun(x, a, b, z):
    return np.log(z/x + a) + b
popt, pcov = opti.curve_fit(fun, xdata, ydata, sigma=weights)
print(fun(xdata, *popt))
>>> [ 0.79999994 ... 0.08000009]
EDIT:
You can also play with these parameters, of course:
import scipy.optimize as opti
import numpy as np
xdata = np.array([43, 56, 234, 502], float)
xdata = np.round(np.sort(np.random.rand(100) * (502-43) + 43))
ydata = np.linspace(0.8, 0.08, len(xdata))
weights = np.ones_like(xdata, float)
weights[0] = 0.00001
weights[-1] = 0.00001
def fun(x, a, b, z):
    return np.log(z/x + a) + b
popt, pcov = opti.curve_fit(fun, xdata, ydata, sigma=weights)
print(fun(xdata, *popt))
>>>[ 0.8 ... 0.08 ]
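If the end points do have to sit exactly on the curve, a two-parameter curve y = slope*log(x) + intercept can be solved directly from the two end points, with no fitting at all. A sketch under that assumption; `make_log_map` and `penalty` are hypothetical names and the pure log shape is only one possible choice:

```python
import math

def make_log_map(x_min, x_max, y_at_min, y_at_max):
    """Return f(x) = slope*log(x) + intercept passing through both end points."""
    slope = (y_at_max - y_at_min) / (math.log(x_max) - math.log(x_min))
    intercept = y_at_min - slope * math.log(x_min)

    def mapping(x):
        return slope * math.log(x) + intercept

    return mapping

# End points taken from the question: 43 -> 0.8 and 502 -> 0.08.
penalty = make_log_map(43, 502, 0.8, 0.08)
print(penalty(43), penalty(46), penalty(502))
```

Because of floating-point rounding, compare the end points with a tolerance (for example math.isclose) rather than with ==.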
I am just wondering if there is an easy way to fit Gaussians/Lorentzians to 10 peaks, extract the FWHM of each, and determine the position of the FWHM on the x-axis. The complicated way is to separate the peaks, fit each one, and extract the FWHM.
Data is [https://drive.google.com/file/d/0B6sUnnbyNGuOT2RZb2UwYXU4dlE/view?usp=sharing].
Any advice is greatly appreciated. Thanks.
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('data.txt', delimiter=',')
x, y = data
plt.plot(x,y)
plt.show()
def func(x, *params):
    y = np.zeros_like(x)
    print(len(params))
    for i in range(0, len(params), 3):
        ctr = params[i]
        amp = params[i+1]
        wid = params[i+2]
        y = y + amp * np.exp( -((x - ctr)/wid)**2)
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
    guess += [60+80*i, 46000, 25]
popt, pcov = curve_fit(func, x, y, p0=guess)
print(popt)
fit = func(x, *popt)
plt.plot(x, y)
plt.plot(x, fit , 'r-')
plt.show()
Traceback (most recent call last):
File "C:\Users\test.py", line 33, in <module>
popt, pcov = curve_fit(func, x, y, p0=guess)
File "C:\Python27\lib\site-packages\scipy\optimize\minpack.py", line 533, in curve_fit
res = leastsq(func, p0, args=args, full_output=1, **kw)
File "C:\Python27\lib\site-packages\scipy\optimize\minpack.py", line 368, in leastsq
shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
File "C:\Python27\lib\site-packages\scipy\optimize\minpack.py", line 19, in _check_func
res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
File "C:\Python27\lib\site-packages\scipy\optimize\minpack.py", line 444, in _general_function
return function(xdata, *params) - ydata
TypeError: unsupported operand type(s) for -: 'NoneType' and 'float'
This requires a non-linear fit. A good tool for this is scipy's curve_fit function.
To use curve_fit, we need a model function, call it func, that takes x and our (guessed) parameters as arguments and returns the corresponding values for y. As our model, we use a sum of gaussians:
from scipy.optimize import curve_fit
import numpy as np
def func(x, *params):
    y = np.zeros_like(x)
    for i in range(0, len(params), 3):
        ctr = params[i]
        amp = params[i+1]
        wid = params[i+2]
        y = y + amp * np.exp( -((x - ctr)/wid)**2)
    return y
Now, let's create an initial guess for our parameters. This guess starts with peaks at x=0 and x=1,000 with amplitude 60,000 and e-folding widths of 80. Then, we add candidate peaks at x=60, 140, 220, ... with amplitude 46,000 and width of 25:
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
    guess += [60+80*i, 46000, 25]
Now, we are ready to perform the fit:
popt, pcov = curve_fit(func, x, y, p0=guess)
fit = func(x, *popt)
To see how well we did, let's plot the actual y values (solid black curve) and the fit (dashed red curve) against x:
As you can see, the fit is fairly good.
Complete working code
from scipy.optimize import curve_fit
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('data.txt', delimiter=',')
x, y = data
plt.plot(x,y)
plt.show()
def func(x, *params):
    y = np.zeros_like(x)
    for i in range(0, len(params), 3):
        ctr = params[i]
        amp = params[i+1]
        wid = params[i+2]
        y = y + amp * np.exp( -((x - ctr)/wid)**2)
    return y
guess = [0, 60000, 80, 1000, 60000, 80]
for i in range(12):
    guess += [60+80*i, 46000, 25]
popt, pcov = curve_fit(func, x, y, p0=guess)
print(popt)
fit = func(x, *popt)
plt.plot(x, y)
plt.plot(x, fit , 'r-')
plt.show()
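The question also asked for the FWHM. For the peak shape used here, amp * exp(-((x - ctr)/wid)**2), the half-maximum points lie at ctr ± wid*sqrt(ln 2), so FWHM = 2*sqrt(ln 2)*wid, roughly 1.665*wid. A sketch of turning a fitted parameter vector into a table; `fwhm_table` is a hypothetical helper and `popt_example` is a made-up vector, not the fit result for the posted data:

```python
import numpy as np

def fwhm_table(popt):
    """Rows of (center, FWHM, left half-max x, right half-max x) per peak."""
    params = np.asarray(popt, dtype=float).reshape(-1, 3)
    ctr, amp, wid = params.T
    # exp(-(d/wid)**2) = 1/2  =>  d = wid*sqrt(ln 2); FWHM is twice that.
    fwhm = 2.0 * np.sqrt(np.log(2.0)) * np.abs(wid)
    half = fwhm / 2.0
    return np.column_stack([ctr, fwhm, ctr - half, ctr + half])

# Made-up (ctr, amp, wid) triples in the same layout as popt.
popt_example = [60.0, 46000.0, 25.0, 140.0, 46000.0, 25.0]
table = fwhm_table(popt_example)
print(table)
```

For heavily overlapping peaks these are the widths of the individual components, not of the merged profile.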
@john1024's answer is good, but requires a manual process to generate the initial guess. Here's an easy way to automate the starting guess. Replace the relevant 3 lines of john1024's code with the following (Npks is the expected number of peaks):
import scipy.signal
i_pk = scipy.signal.find_peaks_cwt(y, widths=range(3,len(x)//Npks))
DX = (np.max(x)-np.min(x))/float(Npks) # starting guess for component width
guess = np.ravel([[x[i], y[i], DX] for i in i_pk]) # starting guess for (x, amp, width) for each component
IMHO it is always advisable to plot the residual (data - model) in problems such as this. You will also want to look at the chi-square of the fit.
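A small sketch of those two diagnostics. `fit_diagnostics` is a hypothetical helper, the numbers are made up, and a constant measurement uncertainty sigma = 1 is assumed unless one is supplied:

```python
import numpy as np

def fit_diagnostics(y, fit, n_params, sigma=1.0):
    """Return residual, chi-square, and reduced chi-square of a fit."""
    y = np.asarray(y, dtype=float)
    fit = np.asarray(fit, dtype=float)
    residual = y - fit                      # data - model
    chi_sq = np.sum((residual / sigma) ** 2)
    dof = y.size - n_params                 # degrees of freedom
    return residual, chi_sq, chi_sq / dof

# Toy numbers only, to show the call.
residual, chi_sq, reduced = fit_diagnostics(
    [1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8], n_params=2)
print(residual, chi_sq, reduced)
```

With the multi-peak fit above, the call would be fit_diagnostics(y, fit, len(popt)), and the residual can be plotted against x to spot missed peaks.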
I am using scipy.optimize.curve_fit to fit a curve to some data I have. The curves, for the most part, seem to fit very well. For some reason, pcov = inf when I print it.
What I really need is to calculate the error associated with the parameters I'm fitting, and I am not sure exactly how to do this even if it does give me the covariance matrix.
The model being fit to is:
def intensity(x,R_out,R_in,K_in,K_out,a,b,c):
    K_in,K_out = abs(0.0),abs(K_out)
    if x<=R_in:
        return 2*R_out*(K_out*np.sqrt(1-x**2/R_out**2)-
                        (K_out-0.0)*np.sqrt(R_in**2/R_out**2-x**2/R_out**2)) + c
    elif x>=R_in and x<=R_out:
        return K_out*2*R_out*np.sqrt(1-x**2/R_out**2) + c
    elif x>R_out:
        return c
intensity_vec = np.vectorize(intensity)
def intensity_vec_self(x,R_out,R_in,K_in,K_out,a,b,c):
    y = np.zeros(x.shape)
    for i in range(len(y)):
        y[i]=intensity_vec(x[i],R_out,R_in,K_in,K_out,a,b,c)
    return y
There are 400 data points; I can post them here if you think it will help.
To summarize, I can't get curve_fit to give me a usable pcov and need help figuring out why and whether I can get it to do so.
Also, if it is a quick explanation, I would like to know how to use the pcov array to obtain the errors associated with my fit.
Thanks
The variances of the parameters are the diagonal elements of the variance-covariance matrix, and the standard errors are their square roots: np.sqrt(np.diag(pcov))
Regarding getting inf, see and compare these two examples:
In [129]:
import numpy as np
import scipy.optimize as so
def func(x, a, b, c, d):
    return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5, 1)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print(np.sqrt(np.diag(pcov)))
[ inf inf inf inf]
And:
In [130]:
def func(x, a, b, c):
    return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print(np.sqrt(np.diag(pcov)))
[ 0.11097646 0.11849107 0.05230711]
In this extreme example, d has no effect on the function func, hence it will be associated with a variance of +inf; in other words, it can take just about any value. Removing d from func gives a result that makes sense.
In reality, if parameters are of very different scale, say:
def func(x, a, b, c, d):
    #return a * np.exp(-b * x) + c
    return a * np.exp(-b * x) + c + d*1e-10
You will also get inf due to floating-point overflow/underflow.
In your case, I think you never used a and b. So it is just like the first example here.
I need to do a simple curve fitting using scipy's curve_fit function. However, my data is in the form of a matrix. I can easily do this in numpy but I wanted to see the goodness of fit for scipy.
Problem:
AX = B --> given A, find X for least square error.
from scipy.optimize import curve_fit
def getXval():
    a = 4; b = 3, c = 1;
    f0 = a*pow(b, 2)*c
    f1 = a*b/c
    return [f0, f1]
def fit(x, a0, a1):
    res = a0*x[0] + a1*x[1]
    return [res]
x = getXval()
y = [0.15]
popt, pcov = curve_fit(fit, x, y)
This is, however, not working. Can someone point what is going on here?
Your code has a few problems.
1) Use numpy arrays instead of Python lists.
2) You are missing values for y.
This works for me:
from scipy.optimize import curve_fit
import numpy as np
def getXval():
    a = 4; b = 3; c = 1;
    f0 = a*pow(b, 2)*c
    f1 = a*b/c
    return np.array([f0, f1])
def fit(x, a0, a1):
    res = a0*x[0] + a1*x[1]
    return np.array([res])
x = getXval()
y = np.array([0.15, 0.34])
popt, pcov = curve_fit(fit, x, y)
print(popt, pcov)
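For the general AX = B case with several rows, one way to set it up is to pass the columns of A as xdata and compare against numpy's own least-squares solver. This is a sketch with a made-up 4x2 matrix; `linear_model`, `A`, `B` and `x_true` are illustrative names, not taken from the question:

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up system: 4 observations, 2 unknown coefficients.
A = np.array([[1.0, 2.0], [3.0, 1.0], [2.0, 5.0], [4.0, 3.0]])
x_true = np.array([0.5, -1.5])
B = A @ x_true

# xdata is A transposed, so cols[k] is the k-th column of A.
def linear_model(cols, x0, x1):
    return x0 * cols[0] + x1 * cols[1]

popt, pcov = curve_fit(linear_model, A.T, B)

# Reference solution from numpy's linear least-squares routine.
x_lstsq, *_ = np.linalg.lstsq(A, B, rcond=None)
print(popt, x_lstsq)
```

The model returns one value per row of A, so its output length matches the length of B, which is what curve_fit needs to form the residuals.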