I have a set of data made up of (2-dimensional) observations of multiple objects. The observations can be described by a general function plus an offset that is unique to each object. I want to use curve_fit to simultaneously recover the general function and the offsets for each object (with associated errors). I do not know in advance how many objects the data-set will be made up of, only that there are likely to be multiple observations of each.
So a generalised data set of 7 observations might look like this:
[[x[0], y1[0], y2[0], lab='A'],
[x[1], y1[1], y2[1], lab='B'],
[x[2], y1[2], y2[2], lab='A'],
[x[3], y1[3], y2[3], lab='A'],
[x[4], y1[4], y2[4], lab='B'],
[x[5], y1[5], y2[5], lab='C'],
[x[6], y1[6], y2[6], lab='A']]
I could do the task by passing the parameters of the general function (say g = [g0, g1, g2]) and the object offsets offsets = n x [o1, o2] to fit_func and then using an object label to decide which of the n offsets needs to be added to the general function, except that I can't figure out how to pass the label.
def fit_func(x, g, offsets, lab):
    y1 = g[0] * cos(2*(x - g[1])) + offsets[lab, 0] + g[2]
    y2 = g[0] * sin(2*(x - g[1])) + offsets[lab, 1] + g[2]
    return [y1, y2]
The problem is that lab is not a float to be fit, so I can't figure out how to pass it. From reading some other threads I believe I will need a wrapper function, but I can't figure out what form it should take, and then how to call it in such a way that I can specify sigma and p0.
Can anyone point me in the right direction?
Edit: I managed to produce a function that I thought would work. It used a global flag to choose between options inside the function. For example, I interleaved the y1 and y2 arrays and had the function evaluate the second equation on every second call, toggling the flag with global getEven() and setEven(bool) calls. However, curve_fit really didn't like that: the fit values were nonsensical.
At the moment I am fitting the equation for y1 and the equation for y2 separately and taking the rms to determine g0 and g1 (this also gives me offsets['A',0] and offsets['A',1] respectively). I could just repeat this for each different object in the set, but I can't fit the g2 parameter this way, since in any given call to the y1 or y2 function it is degenerate with the corresponding offset.
Here is example code that fits two different equations with a shared parameter, using 'A' or 'B' to decode which applies. It appears to do what you need for decoding the lab type, but I have personally never done this before, and while it appears to work per your post, the text-to-float conversion inside the function seems clunky to me. But it works.
import numpy
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

# single array with all "X" data to pass around
num = numpy.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
ids = numpy.array(['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'])
xdata = numpy.array([num, ids])  # combine data; numpy auto-converts to 'text' type

# ydata is a numeric single array
ydata = [9.0, 8.0, 7.0, 6.0, 4.0, 3.0, 2.0, 1.0]

def fitFunction(data, commonParameter, pA, pB):
    numericDataAsText = data[0]
    textData = data[1]
    returnArray = []
    for i in range(len(textData)):
        x = float(numericDataAsText[i])
        if textData[i] == 'A':
            val = commonParameter + x * pA
        elif textData[i] == 'B':
            val = commonParameter + x * pB
        else:
            raise Exception('Error: must use A or B')
        returnArray.append(val)
    return returnArray
initialParameters = [1.0, 1.0, 1.0]

# curve fit both equations simultaneously against the combined data
params, pcov = curve_fit(fitFunction, xdata, ydata, initialParameters)
commonParameter, pA, pB = params

# for plotting the fitting results
y_fit = fitFunction(xdata, commonParameter, pA, pB)
plt.plot(xdata[0], ydata, 'D')  # plot the raw data as a scatterplot
plt.plot(xdata[0][:4], y_fit[:4])
plt.plot(xdata[0][4:], y_fit[4:])
plt.show()

print('fitted parameters:', params)
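If the text-to-float conversion bothers you too, a possible alternative (my sketch, not part of the answer above) is to keep the numeric x and the labels in separate arrays and select the per-label slope with a numpy boolean mask instead of a Python loop:
import numpy as np
from scipy.optimize import curve_fit

num = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
ids = np.array(['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'])
ydata = np.array([9.0, 8.0, 7.0, 6.0, 4.0, 3.0, 2.0, 1.0])

is_a = (ids == 'A')  # boolean mask computed once, outside the model

def fit_function(x, commonParameter, pA, pB):
    slope = np.where(is_a, pA, pB)  # pA where the label is 'A', pB elsewhere
    return commonParameter + x * slope

params, pcov = curve_fit(fit_function, num, ydata, p0=[1.0, 1.0, 1.0])
print(params)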
It would be helpful to show a more complete example of what you are trying, including the call to scipy.optimize.curve_fit. But, if I understand the question correctly, you want to have an argument for your model function that is not treated as a variable in the fit. I believe that curve_fit cannot do this, and treats all arguments after the first as variables.
In fact, I think that your model function will not work for curve_fit because you expect g to be a sequence of values. With curve_fit, each argument after the first will get a single float value. So you probably want something like
def func(x, g0, g1, g2, offsets):
    y1 = g0 * cos(2*(x - g1)) + offsets['lab', 0] + g2
    ...
Anyway, I have two suggestions to work around this limitation of curve_fit:
First, you could overload x. Now, curve_fit will internally apply numpy.asarray() to the x you pass in, but it will otherwise just pass it along to your model function. So, if you turn x into a list containing your real x and your offsets (or labels), you should be able to unpack this in your model function, say like
xhack = [x, offsets]

def func(x, g0, g1, g2):
    x, offsets = x
    ...

out = curve_fit(func, xhack, ...)
Personally, I think that's kind of ugly, but it might work.
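To make that concrete, here is a minimal runnable sketch of the overloading idea (an illustration of mine, not the original poster's data; g2 is left out since a global constant is degenerate with the offsets in a single equation):
import numpy as np
from scipy.optimize import curve_fit

# toy data: 7 observations of two objects 'A' and 'B' (values are made up)
x = np.linspace(0.0, 3.0, 7)
labels = np.array(['A', 'B', 'A', 'A', 'B', 'A', 'B'])
idx = (labels == 'B').astype(float)  # 0.0 for 'A', 1.0 for 'B'

def func(xpacked, g0, g1, oA, oB):
    x, idx = xpacked                       # unpack the overloaded "x"
    offsets = np.where(idx < 0.5, oA, oB)  # per-observation offset
    return g0 * np.cos(2.0*(x - g1)) + offsets

rng = np.random.default_rng(0)
ytrue = func(np.array([x, idx]), 1.0, 0.3, 0.2, -0.4)
y = ytrue + 0.01*rng.standard_normal(x.size)

popt, pcov = curve_fit(func, np.array([x, idx]), y, p0=[1.0, 0.0, 0.0, 0.0])
print(popt)  # should land near (1.0, 0.3, 0.2, -0.4) from this starting point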
Second, you could use lmfit (https://lmfit.github.io/lmfit-py/), which provides a higher level interface to curve fitting and fixes many of the shortcomings of curve_fit. For your question in particular, lmfit's Model class for curve fitting examines the model function more carefully to turn function arguments into fitting parameters. Specifically:
- keyword arguments with non-numerical defaults will not be turned into fit parameters.
- you can specify more than one "independent variable", and they do not have to be the first argument of the function.
That is, you could either write:
from lmfit import Model

def func(x, g0, g1, g2, offsets=None):
    y1 = g0 * cos(2*(x - g1)) + offsets['lab', 0] + g2

mymodel = Model(func)
or explicitly tell Model what the independent variables are:
from lmfit import Model

def func(x, g0, g1, g2, offsets):
    y1 = g0 * cos(2*(x - g1)) + offsets['lab', 0] + g2

mymodel = Model(func, independent_vars=['x', 'offsets'])
Either way, offsets can be any complex object, and you would use this mymodel for curve fitting with:
# create parameter objects for this model, with initial values:
params = mymodel.make_params(g0=0, g1=0.5, g2=2.0)
# run the fit
result = mymodel.fit(ydata, params, x=x, offsets=offsets)
There are lots of other conveniences we added to lmfit (I am one of the developers) for building curve fitting models and working with parameters as high-level objects, but this might be enough to get you started.
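Putting those pieces together for the label problem in the question, a minimal sketch might look like this (an illustration of mine: the label array and the np.where lookup are my assumptions, not lmfit API; g2 is omitted because a global constant is degenerate with the per-object offsets):
import numpy as np
from lmfit import Model

# hypothetical data: x values, one label per observation, and measurements
x = np.linspace(0, 3, 7)
labels = np.array(['A', 'B', 'A', 'A', 'B', 'A', 'B'])
ydata = 1.0*np.cos(2*(x - 0.3)) + np.where(labels == 'A', 0.2, -0.4)

def func(x, g0, g1, oA, oB, labels=None):
    off = np.where(labels == 'A', oA, oB)  # per-observation offset by label
    return g0 * np.cos(2*(x - g1)) + off

mymodel = Model(func, independent_vars=['x', 'labels'])
params = mymodel.make_params(g0=1.0, g1=0.0, oA=0.0, oB=0.0)
result = mymodel.fit(ydata, params, x=x, labels=labels)
print(result.fit_report())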
I am trying to solve the following system of first order coupled differential equations:
- dR/dt=k2*Y(t)-k1*R(t)*L(t)+k4*X(t)-k3*R(t)*I(t)
- dL/dt=k2*Y(t)-k1*R(t)*L(t)
- dI/dt=k4*X(t)-k3*R(t)*I(t)
- dX/dt=k1*R(t)*L(t)-k2*Y(t)
- dY/dt=k3*R(t)*I(t)-k4*X(t)
The known initial conditions are: X(0)=0, Y(0)=0
The known constant values are: k1=6500, k2=0.9
These equations define a kinetic model, and I need to solve them to get the Y(t) function so I can fit my data and find the k3 and k4 values. To that end, I have tried to solve the system symbolically with sympy. Here is my code:
import matplotlib.pyplot as plt
import numpy as np
import sympy
from sympy.solvers.ode.systems import dsolve_system
from scipy.integrate import solve_ivp
from scipy.integrate import odeint
k1 = sympy.Symbol('k1', real=True, positive=True)
k2 = sympy.Symbol('k2', real=True, positive=True)
k3 = sympy.Symbol('k3', real=True, positive=True)
k4 = sympy.Symbol('k4', real=True, positive=True)
t = sympy.Symbol('t',real=True, positive=True)
L = sympy.Function('L')
R = sympy.Function('R')
I = sympy.Function('I')
Y = sympy.Function('Y')
X = sympy.Function('X')
f1 = k2*Y(t) - k1*R(t)*L(t) + k4*X(t) - k3*R(t)*I(t)
f2 = k2*Y(t) - k1*R(t)*L(t)
f3 = k4*X(t) - k3*R(t)*I(t)
f4 = -f2
f5 = -f3

eq1 = sympy.Eq(sympy.Derivative(R(t), t), f1)
eq2 = sympy.Eq(sympy.Derivative(L(t), t), f2)
eq3 = sympy.Eq(sympy.Derivative(I(t), t), f3)
eq4 = sympy.Eq(sympy.Derivative(Y(t), t), f4)
eq5 = sympy.Eq(sympy.Derivative(X(t), t), f5)

Sys = [eq1, eq2, eq3, eq4, eq5]
solsys = dsolve_system(eqs=Sys, funcs=[X(t), Y(t), R(t), L(t), I(t)], t=t, ics={Y(0): 0, X(0): 0})
This is the error I get:
NotImplementedError:
The system of ODEs passed cannot be solved by dsolve_system.
I have tried dsolve too, but I get the same error.
Is there any other solver I can use, or some way of doing this that will let me get a function for the fitting? I'm using Python 3.8 in Spyder with Anaconda on 64-bit Windows.
Thank you!
# Update
Following this comment from Lutz Lehmann:
"You are saying "experiment". So you have data and want to fit the model to them, find appropriate values for k3 and k4 at least, and perhaps for all coefficients and the initial conditions (the first measured data point might not be the initial condition for the best fit)? See stackoverflow.com/questions/71722061/… for a recent attempt on such a task." – Lutz Lehmann
There is my new code:
t=[0,0.25,0.5,0.75,1.5,2.27,3.05,3.82,4.6,5.37,6.15,6.92,7.7,8.47,13.42,18.42,23.42,28.42,33.42,38.42,43.42,48.42,53.42,58.42,63.42,68.42,83.4,98.4,113.4,128.4,143.4,158.4,173.4,188.4,203.4,218.4,233.4,248.4]
yexp=[832.49,1028.01,1098.12,1190.08,1188.97,1377.09,1407.47,1529.35,1431.72,1556.66,1634.59,1679.09,1692.05,1681.89,1621.88,1716.77,1717.91,1686.7,1753.5,1722.98,1630.14,1724.16,1670.45,1677.16,1614.98,1671.16,1654.03,1661.84,1675.31,1626.76,1638.29,1614.41,1594.31,1584.73,1599.22,1587.85,1567.74,1602.69]
from scipy.optimize import curve_fit

def derivative(S, t, k3, k4):
    k1 = 1798931
    k2 = 0.2629
    x, y, r, l, i = S
    doty = k1*r*l + k2*y
    dotx = k3*r*i - k4*x
    dotr = k2*y - k1*r*l + k4*x - k3*r*i
    dotl = k2*y - k1*r*l
    doti = k4*x - k3*r*i
    return np.array([doty, dotx, dotr, dotl, doti])

def solver(XY, t, para):
    return odeint(derivative, XY, t, args=para, atol=1e-8, rtol=1e-11)

def integration(XY_arr, *para):
    XY0 = para[:5]
    para = para[5:]
    T = np.arange(len(XY_arr))
    res0 = solver(XY0, T, para)
    res1 = [solver(XY0, [t, t+1], para)[-1]
            for t, XY in enumerate(XY_arr[:-1])]
    return np.concatenate([res0, res1]).flatten()

XData = yexp
YData = np.concatenate([yexp, yexp, yexp, yexp, yexp, yexp[1:], yexp[1:], yexp[1:], yexp[1:], yexp[1:]]).flatten()
p0 = [0, 0, 100, 10, 10, 1e8, 0.01]

params, info = curve_fit(integration, XData, YData, p0=p0, maxfev=5000)
XY0, para = params[:5], params[5:]
print(XY0, tuple(para))

t_plot = np.linspace(0, len(t), 500)
x_plot = solver(XY0, t_plot, tuple(para))
But the output is not correct; it is just the same as the initial guess p0:
[ 0. 0. 100. 10. 10.] (100000000.0, 0.01)
I understand that the function 'integration' gives packed values of y for each function at each instant of time, but I don't know how to unpack them to run curve_fit separately. Maybe I don't quite understand how it works.
Thank you!
As you observed, sympy is not able to solve this system. This might mean that
- the procedure to classify ODEs in sympy is not complete enough, or
- some trick/method is needed beyond the standard set of methods implemented in sympy, or
- there just is no symbolic solution.
The last case is the generic one: take a symbolically solvable ODE, add some random term, and almost certainly the resulting ODE is no longer symbolically solvable.
As I understand from the comments, you have a model given by an ODE system with state space (cX, cY, cR, cL, cI), equations with 4 parameters k1, k2, k3, k4 and, by the structure of the reaction system R+I <-> X, R+L <-> Y, the sums cR+cX+cY, cL+cY, cI+cX all constant.
For some other process that is approximately represented by the model, you have time series data t[k], y[k] for the Y component. You also have partial information on the initial state and the parameter set. If there are sufficiently many data points, one could also forget about these, fit all parameters, and compare how far the given parameters are from the computed ones.
There are several modules and packages that solve this fitting task in a more or less abstract fashion. I think pyomo and gekko can both be used. More directly one can use the facilities of scipy.odr or scipy.optimize.
Define the forward machinery: the ODE right-hand side, plus a solver that maps time points and parameters to states:
import numpy as np
from scipy.integrate import solve_ivp

def model(t, u, k1, k2, k3, k4):
    X, Y, R, L, I = u
    dL = k2*Y - k1*R*L
    dI = k4*X - k3*R*I
    dR = dL + dI
    dX = -dI
    dY = -dL
    return dX, dY, dR, dL, dI

def solver(t, u0, k):
    res = solve_ivp(model, [0, t[-1]], u0, args=tuple(k), t_eval=t,
                    method="DOP853", atol=1e-7, rtol=1e-10)
    return res.y
Prepare some synthetic data plus noise:
k1o = 6.500; k2o = 0.9
T = np.linspace(0, 0.05, 21)
U = solver(T, [0, 0, 50, 40, 25], [k1o, k2o, 5.400, 0.7])
Y = U[1]  # equilibrium slightly above 30
Y += np.random.uniform(high=0.05, size=Y.shape)
Prepare the function that splits the combined parameter vector into initial state and coefficients, then call the curve-fitting function:
from scipy.optimize import curve_fit
def partial(t, R, L, I, k3, k4):
    print(R, L, I, k3, k4)
    U = solver(t, [0, 0, R, L, I], [k1o, k2o, k3, k4])
    return U[1]

params, info = curve_fit(partial, T, Y, p0=[30, 20, 10, 0.3, 3.000])
R, L, I, k3, k4 = params
print(R, L, I, k3, k4)
It turns out that curve_fit goes into strange regions with large negative values. A likely reason is that the Y component is, in the end, not coupled strongly enough to all the other components, meaning that large changes in some of the parameters have minimal influence on Y, so that minimal noise in Y can lead to large deviations in these parameters. Here this apparently happens (first) to k3.
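One way to keep the optimizer out of those regions (an addition of mine, not part of the original answer) is to pass bounds to curve_fit, forcing all initial concentrations and rate constants to stay non-negative:
# continues the snippet above; `partial`, T, and Y are defined there
params, info = curve_fit(partial, T, Y, p0=[30, 20, 10, 0.3, 3.000],
                         bounds=(0, np.inf))  # all parameters >= 0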
I have a problem: I have two distinct equations, one linear, the other exponential. However, the two equations should not both be valid at the same time, meaning that there are two distinct regimes.
Equation 1 (x < a): E*x
Equation 2 (x >=a): a+b*x+c*(1-np.exp(-d*np.array(x)))
Meaning the first part of the data should be fit with just a linear equation and the rest should be fit with equation 2 above.
The data I'm trying to fit looks like this (I have also added some sample data, if people wanna have a go):
I have tried several things already, from just defining one fit function with a heaviside function:
def fit_fun(x, a, b, c, d, E):
    funktion1 = E*np.array(x)
    funktion2 = a + b*x + c*(1 - np.exp(-d*np.array(x)))
    return np.heaviside(x+a, 0)*funktion2 + (1 - np.heaviside(x+a, 0))*funktion1
defining a piecewise function:
def fit_fun(x, a, b, c, d, E):
    return np.piecewise(x, [x <= a, x > a], [lambda x: E*np.array(x), lambda x: a + b*x + c*(1 - np.exp(-d*np.array(x)))])
to lastly this (which unfortunately yields me some form of function error?):
def plast_fun(x, a, b, c, d, E):
    out = E*x
    out[np.where(x >= a)] = a + b*x + c*(1 - np.exp(-d+x))
    return out
Don't get me wrong, I do get "some" fits, but they seem to use either one or the other equation and not really both. I have also tried several bounds and initial guesses, but nothing changes.
Any input would be greatly appreciated!
Data:
0.000000 -1.570670
0.000434 83.292677
0.000867 108.909402
0.001301 124.121676
0.001734 138.187659
0.002168 151.278839
0.002601 163.160478
0.003035 174.255626
0.003468 185.035092
0.003902 195.629820
0.004336 205.887161
0.004769 215.611995
0.005203 224.752083
0.005636 233.436680
0.006070 241.897851
0.006503 250.352697
0.006937 258.915168
0.007370 267.569337
0.007804 276.199005
0.008237 284.646778
0.008671 292.772349
0.009105 300.489611
0.009538 307.776858
0.009972 314.666291
0.010405 321.224211
0.010839 327.531594
0.011272 333.669261
0.011706 339.706420
0.012139 345.689265
0.012573 351.628362
0.013007 357.488150
0.013440 363.185771
0.013874 368.606298
0.014307 373.635696
0.014741 378.203192
0.015174 382.315634
0.015608 386.064126
0.016041 389.592120
0.016475 393.033854
0.016908 396.454226
0.017342 399.831519
0.017776 403.107084
0.018209 406.277016
0.018643 409.441119
0.019076 412.710982
0.019510 415.987331
0.019943 418.873140
0.020377 421.178098
0.020810 423.756827
So far I have found these two questions, but I couldn't figure it out:
Fit of two different functions with boarder as fit parameter
Fit a curve for data made up of two distinct regimes
I suspect you are making a mistake in the second equation, where you do a+b*x+c*(1-np.exp(-d+x)), with a being the value of x where you change from one curve to the other. I think you should use the value of y instead, which is a*E. Also, it is very important to define initial parameters for the fit. I ran the following code with your data in a .txt file, and the fit seems pretty good as you can see below:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import optimize, stats

def fit_fun(x, a, b, c, d, E):
    return np.piecewise(x, [x <= a, x > a], [lambda x: E*x, lambda x: a*E + b*x + c*(1 - np.exp(-d*x))])

df = pd.read_csv('teste.txt', delimiter=r'\s+', header=None)
df.columns = ['x', 'y']
xdata = df['x']
ydata = df['y']

p0 = [0.001, 1, 1, 1, 100000]
popt, pcov = optimize.curve_fit(fit_fun, xdata.values, ydata.values, p0=p0, maxfev=10000, absolute_sigma=True, method='trf')
print(popt)

plt.plot(xdata, ydata, '*')
plt.plot(xdata, fit_fun(xdata.values, *popt), 'r')
plt.show()
I am trying to fit some experimental data to a nonlinear function with one parameter. The function includes an arcus cosine, which is therefore limited in its domain of definition to [-1, 1]. I use scipy's curve_fit to find the parameter of the function, but it returns the following error:
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 400.
The function I want to fit is this one:
def fitfunc(x, a):
    y = np.rad2deg(np.arccos(x*np.cos(np.deg2rad(a))))
    return y
For the fitting, I provide numpy arrays for x and y respectively, which contain values in degrees (which is why the function converts to and from radians).
param, param_cov = curve_fit(fitfunc, xs, ys)
When I use other fit functions, for example a polynomial, curve_fit returns some values; the error mentioned above only occurs when I use this function, which includes an arcus cosine.
I suspect that it cannot fit the data points because, depending on the parameter of the arcus cosine, some data points do not lie inside its domain of definition. I have tried raising the number of iterations (maxfev), but without success.
Sample data:
ys = np.array([113.46125, 129.4225, 140.88125, 145.80375, 145.4425,
146.97125, 97.8025, 112.91125, 114.4325, 119.16125,
130.13875, 134.63125, 129.4375, 141.99, 139.86,
138.77875, 137.91875, 140.71375])
xs = np.array([2.786427013, 3.325624466, 3.473013087, 3.598247534, 4.304280248,
4.958273121, 2.679526725, 2.409388637, 2.606306639, 3.661558062,
4.569923009, 4.836843789, 3.377013596, 3.664550526, 4.335401233,
3.064199519, 3.97155254, 4.100567011])
As HS-nebula mentioned in the comments, you need to define an initial value a0 of a as a starting guess for the curve fitting. Moreover, you need to be careful when choosing a0, as np.arccos() is only defined on [-1, 1] and choosing the wrong a0 results in an error.
import numpy as np
from scipy.optimize import curve_fit
ys = np.array([113.46125, 129.4225, 140.88125, 145.80375, 145.4425, 146.97125,
97.8025, 112.91125, 114.4325, 119.16125, 130.13875, 134.63125,
129.4375, 141.99, 139.86, 138.77875, 137.91875, 140.71375])
xs = np.array([2.786427013, 3.325624466, 3.473013087, 3.598247534, 4.304280248, 4.958273121,
2.679526725, 2.409388637, 2.606306639, 3.661558062, 4.569923009, 4.836843789,
3.377013596, 3.664550526, 4.335401233, 3.064199519, 3.97155254, 4.100567011])
def fit_func(x, a):
    a_in_rad = np.deg2rad(a)
    cos_a_in_rad = np.cos(a_in_rad)
    arccos_xa_product = np.arccos(x * cos_a_in_rad)
    return np.rad2deg(arccos_xa_product)

a0 = 80
param, param_cov = curve_fit(fit_func, xs, ys, a0, bounds=(0, 360))
print('Using curve_fit we retrieve a value of a =', param[0])
Output:
Using curve_fit we retrieve a value of a = 100.05275506147824
However if you choose a0=60, you get the following error:
ValueError: Residuals are not finite in the initial point.
To be able to use the data with all possible values of a, a normalization, as HS-nebula suggested, is a good idea.
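For illustration, here is a minimal sketch of one such normalization (my reading of the suggestion, not code from the comments): rescale xs by its maximum so x*cos(a) always stays inside the domain of arccos. Note that the fitted a then refers to the rescaled x, so it is not directly comparable to a fit on the raw data:
x_scale = xs.max()       # assumes xs, ys as defined above
xs_norm = xs / x_scale   # xs_norm now lies in (0, 1]

def fit_func_norm(x, a):
    # |x*cos(a)| <= 1 for any a, so arccos is always defined
    return np.rad2deg(np.arccos(x * np.cos(np.deg2rad(a))))

param, param_cov = curve_fit(fit_func_norm, xs_norm, ys, p0=60)
print(param[0])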
I have a function Imaginary which describes a physics process and I want to fit this to a dataset x_interpolate, y_interpolate. The function is a form of a Lorentzian peak function and I have some initial values that are user given, except for f_peak (the peak location) which I find using a peak finding algorithm. All of the fit parameters, except for the offset, are expected to be positive and thus I have set bounds_I accordingly.
def Imaginary(freq, alpha, res, Ms, off):
    numerator = 2*alpha*freq*res**2
    denominator = (4*(alpha*res*freq)**2) + (res**2 - freq**2)**2
    Im = Ms*(numerator/denominator) + off
    return Im

pI = np.array([alpha_init, f_peak, Ms_init, 0])
bounds_I = ([0, 0, 0, -np.inf], [np.inf, np.inf, np.inf, np.inf])
poptI, pcovI = curve_fit(Imaginary, x_interpolate, y_interpolate, pI, bounds=bounds_I)
In some situations I want to keep the parameter f_peak fixed during the fitting process. I tried an easy solution by changing bounds_I to:
bounds_I = ([0, f_peak-0.001, 0, -np.inf], [np.inf, f_peak+0.001, np.inf, np.inf])
This is for many reasons not an optimal way of doing this so I was wondering if there is a more Pythonic way of doing this? Thank you for your help
If a parameter is fixed, it is not really a parameter, so it should be removed from the list of parameters. Define a model that has that parameter replaced by a fixed value, and fit that. Example below, simplified for brevity and to be self-contained:
import numpy as np
from scipy.optimize import curve_fit

x = np.arange(10)
y = np.sqrt(x)

def parabola(x, a, b, c):
    return a*x**2 + b*x + c

fit1 = curve_fit(parabola, x, y)  # [-0.02989396, 0.56204598, 0.25337086]

b_fixed = 0.5
fit2 = curve_fit(lambda x, a, c: parabola(x, a, b_fixed, c), x, y)
The second call to fit returns [-0.02350478, 0.35048631], which are the optimal values of a and c. The value of b was fixed at 0.5.
Of course, the parameter should be removed from the initial vector pI and the bounds as well.
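Applied to the Imaginary function from the question, the same trick might look like this (a sketch using the names from your post; f_peak is the known value to hold fixed):
# fix `res` at the known peak location and fit only alpha, Ms, and off
model_res_fixed = lambda freq, alpha, Ms, off: Imaginary(freq, alpha, f_peak, Ms, off)

pI_fixed = np.array([alpha_init, Ms_init, 0])               # `res` entry removed
bounds_fixed = ([0, 0, -np.inf], [np.inf, np.inf, np.inf])  # `res` bounds removed
poptI, pcovI = curve_fit(model_res_fixed, x_interpolate, y_interpolate,
                         pI_fixed, bounds=bounds_fixed)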
You might find lmfit (https://lmfit.github.io/lmfit-py/) helpful. This library adds a higher-level interface to the scipy optimization routines, aiming for a more Pythonic approach to optimization and curve fitting. For example, it uses Parameter objects that allow setting bounds and fixing parameters without having to modify the objective or model function. For curve fitting, it defines a high-level Model class that can be used.
For your example, you could use your Imaginary function as you've written it with
from lmfit import Model
lmodel = Model(Imaginary)
and then create Parameters (lmfit will name the Parameter objects according to your function signature), providing initial values:
params = lmodel.make_params(alpha=alpha_init, res=f_peak, Ms=Ms_init, off=0)
By default all Parameters are unbound and will vary in the fit, but you can modify these attributes (without rewriting the model function):
params['alpha'].min = 0
params['res'].min = 0
params['Ms'].min = 0
You can set one (or more) of the parameters to not vary in the fit with:
params['res'].vary = False
To be clear: this does not require altering the model function, making it much easier to change what is fixed, what bounds might be imposed, and so forth.
You would then perform the fit with the model and these parameters:
result = lmodel.fit(y_interpolate, params, freq=x_interpolate)
You can get a report of fit statistics, best-fit values, and uncertainties for parameters with
print(result.fit_report())
The best fit Parameters will be held in result.params.
FWIW, lmfit also has builtin Models for many common forms, including Lorentzian and a Constant offset. So, you could construct this model as
from lmfit.models import LorentzianModel, ConstantModel
mymodel = LorentzianModel(prefix='l_') + ConstantModel()
params = mymodel.make_params()
which will have Parameters named l_amplitude, l_center, l_sigma, and c (where c is the constant) and the model will use the name x for the independent variable (your freq). This approach can become very convenient when you may want to change the functional form of the peaks or background, or when fitting multiple peaks to a spectrum.
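A brief sketch of using that composite model (the initial values here are illustrative guesses of mine, not from the original answer):
# seed the composite model with rough starting values, then fit
params = mymodel.make_params(l_amplitude=1, l_center=f_peak, l_sigma=0.1, c=0)
result = mymodel.fit(y_interpolate, params, x=x_interpolate)
print(result.fit_report())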
I was able to solve this issue for an arbitrary number of parameters and arbitrary positioning of the fixed parameters:
from numpy import asarray, append
from scipy.optimize import curve_fit

def d_fit(x, y, param, boundMi, boundMx, listparam):
    # split the full parameter list into fitted (S*) and fixed (N*) pieces
    Sparam, SboundMi, SboundMx = asarray([]), asarray([]), asarray([])
    Nparam = asarray([])
    for i in range(len(param)):
        if listparam[i] == 1:
            Sparam = append(Sparam, asarray(param[i]))
            SboundMi = append(SboundMi, asarray(boundMi[i]))
            SboundMx = append(SboundMx, asarray(boundMx[i]))
        else:
            Nparam = append(Nparam, asarray(param[i]))

    def funF(x, Sparam):
        # reassemble the full parameter list from fitted and fixed values
        j = 0
        for i in range(len(param)):
            if listparam[i] == 1:
                param[i] = Sparam[i-j]
            else:
                param[i] = Nparam[j]
                j = j + 1
        return fun(x, param)

    return curve_fit(lambda x, *Sparam: funF(x, Sparam), x, y, p0=Sparam, bounds=(SboundMi, SboundMx))
In this case:
param = [a,b,c,...] # parameters array (any size)
boundMi = [min_a, min_b, min_c,...] # minimum allowable value of each parameter
boundMx = [max_a, max_b, max_c,...] # maximum allowable value of each parameter
listparam = [0,1,1,0,...] # 1 = fit and 0 = fix the corresponding parameter in the fit routine
and the root function is defined as
def fun(x, param):
    a, b, c, d, ... = param
    return a*b/c...  # any function of the params a, b, c, d, ...
This way, you can change the root function and the number of parameters without changing the fit routine.
And, at any time, you can fix or free any parameter by changing "listparam".
Use it like this:
popt, pcov = d_fit(x, y, param, boundMi, boundMx, listparam)
"popt" and "pcov" are 1D arrays of the size of the number of "1" in "listparam" bringing the results of the fitted parameters (best value and err matrix)
"param" will ramain an 1D array of the same size of the original (input) "param", HOWEVER IT WILL BE UPDATED AUTOMATICALLY TO THE FITTED VALUES (same as "popt") for the fitted values, keeping the fixed values according to "listparam"
Hope can be usefull!
Obs1: x = 1D-array independent values and y = 1D-array dependent values
Obs2: This is my first post. Please let me know if I can improove it!
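For illustration (an example of mine, with made-up data), here is how d_fit could be driven with a three-parameter root function, fitting a and c while keeping b fixed:
import numpy as np

def fun(x, param):
    # root function used by d_fit via funF
    a, b, c = param
    return a*np.exp(-b*x) + c

x = np.linspace(0, 5, 50)
y = fun(x, [2.0, 1.3, 0.5]) + 0.01*np.random.randn(x.size)

param = [1.0, 1.3, 0.0]      # initial values; b starts at its fixed value
boundMi = [0.0, 0.0, -10.0]
boundMx = [10.0, 10.0, 10.0]
listparam = [1, 0, 1]        # fit a and c, keep b fixed

popt, pcov = d_fit(x, y, param, boundMi, boundMx, listparam)
print(popt)   # best-fit a and c
print(param)  # full parameter list, updated in place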
I want to fit a curve to my data:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

x = np.array([24, 25, 28, 37, 58, 104, 200, 235, 235])
y = np.array([340, 350, 370, 400, 430, 460, 490, 520, 550])
xerr = np.array([1.1, 1, 0.8, 1.4, 1.4, 2.6, 3.8, 2, 2])

def fit_fc(x, a, b, c):
    return a*x**b + c

popt, pcov = curve_fit(fit_fc, x, y, maxfev=5000)
plt.plot(x, fit_fc(x, popt[0], popt[1], popt[2]))
plt.errorbar(x, y, xerr=xerr, fmt='-o')
but I want to put some constraints on a, b, and c. For example, I want them to be in some range, let's say between 0 and 20. How can I achieve that? I'm new to Python, so any help would be appreciated.
You could use lmfit to constrain your parameters. For the following plot, I constrained your parameters a and b to the range [0, 20] (which you mentioned in your post) and c to the range [0, 400]. The parameters you get are:
a: 19.9999991
b: 0.46769173
c: 274.074071
and the corresponding plot looks as follows:
As you can see, the model reproduces the data reasonably well and the parameters are in the given ranges.
Here is the code that reproduces the results with additional comments:
from lmfit import minimize, Parameters, Parameter, report_fit
import numpy as np

x = [24, 25, 28, 37, 58, 104, 200, 235, 235]
y = [340, 350, 370, 400, 430, 460, 490, 520, 550]

def fit_fc(params, x, data):
    a = params['a'].value
    b = params['b'].value
    c = params['c'].value
    model = np.power(x, b)*a + c
    return model - data  # that's what you want to minimize

# create a set of Parameters
# 'value' is the initial condition
# 'min' and 'max' define your boundaries
params = Parameters()
params.add('a', value=2, min=0, max=20)
params.add('b', value=0.5, min=0, max=20)
params.add('c', value=300.0, min=0, max=400)

# do fit, here with the leastsq method
result = minimize(fit_fc, params, args=(x, y))

# calculate final result
final = y + result.residual

# write error report
report_fit(params)

# plot results
try:
    import pylab
    pylab.plot(x, y, 'k+')
    pylab.plot(x, final, 'r')
    pylab.show()
except ImportError:
    pass
If you constrain all of your parameters to the range [0,20], the plot looks rather bad:
It depends on what you want to have happen if the variables are out of range. You can use a simple if statement (in this case the program exits):
x = 21
if not (0 <= x <= 20):
    print("var x is out of range")
    exit()
Another way is to assert that the variable must be in the range. In this case, it's wrapped in a try/except block that handles the problem gracefully, and also exits like above:
try:
    assert 0 <= x <= 20
except AssertionError:
    print("variable x is out of range")
    exit()
Scipy uses unconstrained least squares to fit curve parameters, so it won't be that straightforward: https://github.com/scipy/scipy/blob/v0.16.0/scipy/optimize/minpack.py#L454 (true as of scipy 0.16, the version linked; newer releases added a bounds argument to curve_fit).
What you'd probably like to do is solve a constrained (non-linear, given what you're trying to fit) least squares problem. For instance, take a look at these discussions:
Constrained least-squares estimation in Python ( leastsq_bounds: https://gist.github.com/denis-bz/65da931bdbf92c49e4d0 )
scipy.optimize.leastsq with bound constraints
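For what it's worth, on newer SciPy (0.17+) the bounds argument mentioned above answers the original question directly; a minimal sketch with the question's data:
import numpy as np
from scipy.optimize import curve_fit

x = np.array([24, 25, 28, 37, 58, 104, 200, 235, 235], dtype=float)
y = np.array([340, 350, 370, 400, 430, 460, 490, 520, 550], dtype=float)

def fit_fc(x, a, b, c):
    return a*x**b + c

# constrain each of a, b, c to [0, 20]; scalar bounds broadcast to all parameters
# (as noted above, forcing all three into [0, 20] gives a poor fit; this just
# shows the mechanics)
popt, pcov = curve_fit(fit_fc, x, y, p0=[1.0, 0.5, 10.0], bounds=(0, 20))
print(popt)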