How to improve curve fit for data with multiphasic exponential equation? - python

I'm working on a project and I need to fit some data to the equation
y = bmax_1 * np.exp(-koff_1 * x) + bmax_2 * np.exp(-koff_2 * x), with bmax_1, koff_1, bmax_2, koff_2 being the parameters. I have tried using curve_fit, but the result is quite poor and gives an R-squared value of 0.16. I'm wondering if there is something I can do to improve the fit?
import warnings
import numpy as np
from scipy.optimize import curve_fit, differential_evolution

# define the model function
def func(x, bmax_1, koff_1, bmax_2, koff_2):
    return bmax_1 * np.exp(-koff_1 * x) + bmax_2 * np.exp(-koff_2 * x)

# function for the genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore")  # do not print warnings raised by the genetic algorithm
    val = func(xData, *parameterTuple)
    return np.sum((yData - val) ** 2.0)

def generate_Initial_Parameters():
    parameterBounds = []
    parameterBounds.append([0.0, 200.0])  # search bounds for bmax_1
    parameterBounds.append([0.0, 10.0])   # search bounds for koff_1
    parameterBounds.append([0.0, 200.0])  # search bounds for bmax_2
    parameterBounds.append([0.0, 10.0])   # search bounds for koff_2
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=4)
    return result.x

column = input('Column to be analysed: ')
xData = df.loc[:, 'T']       # df is a pandas DataFrame loaded elsewhere
yData = df.loc[:, column]
geneticParameters = generate_Initial_Parameters()
# reuse the genetic-algorithm search bounds for curve_fit
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters,
                                   bounds=([0.0, 0.0, 0.0, 0.0], [200.0, 10.0, 200.0, 10.0]))
print('Fitted parameters:', fittedParameters)
The fitted parameters are [1.24066146e+02 1.48240328e-02 1.34805335e+01 8.26108828e-01]
Result of the fit: (plot not shown)

Since the initial values required by the iterative method are difficult to evaluate, why not try a non-iterative method which doesn't require initial values?
The method below comes from https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales .
If an iterative method is nevertheless necessary to achieve some special specification, the results from the method below can be used as very good initial values.
If you want to fit the function without an additive constant parameter a, which is your case, the same approach applies; a sketch follows.
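Here is a minimal sketch of that integral-equation approach as I read it from the linked document, applied to your model y = bmax_1*exp(-koff_1*x) + bmax_2*exp(-koff_2*x). The function name and internals below are my own illustration, so please check the details against the original paper:
import numpy as np

def double_exp_noniterative(x, y):
    # sort by x and compute cumulative integrals S = int y dx and SS = int S dx (trapezoid rule)
    order = np.argsort(x)
    x = np.asarray(x, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]
    S = np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))
    SS = np.concatenate(([0.0], np.cumsum(0.5 * (S[1:] + S[:-1]) * np.diff(x))))
    # first linear regression: y ~ A*SS + B*S + C*(x - x[0]) + D, with A = -p*q and B = p + q
    M = np.column_stack([SS, S, x - x[0], np.ones_like(x)])
    A, B, _, _ = np.linalg.lstsq(M, y, rcond=None)[0]
    # p and q are the roots of t**2 - B*t - A = 0 (the discriminant can go negative for very noisy data)
    disc = np.sqrt(B * B + 4.0 * A)
    p, q = 0.5 * (B + disc), 0.5 * (B - disc)
    # second linear regression gives the two amplitudes
    N = np.column_stack([np.exp(p * x), np.exp(q * x)])
    b1, b2 = np.linalg.lstsq(N, y, rcond=None)[0]
    return b1, -p, b2, -q   # bmax_1, koff_1, bmax_2, koff_2
The returned values can be used directly, or handed to curve_fit as initial values.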

Fitting two peaks with Gaussians in Python

curve_fit is not fitting properly. I'm trying to fit experimental data with curve_fit. The data is imported from a .txt file into an array:
d = np.loadtxt("data.txt")
data_x = np.array(d[:, 0])
data_y = np.array(d[:, 2])
data_y_err = np.array(d[:, 3])
Since I know there must be two peaks, my model is a sum of two Gaussian curves:
def model_dGauss(x, xc, A, y0, w, dx):
    P = A / (w * np.sqrt(2 * np.pi))
    mu1 = (x - (xc - dx/3)) / (2 * w**2)
    mu2 = (x - (xc + 2*dx/3)) / (2 * w**2)
    return y0 + P * (np.exp(-mu1**2) + 0.5 * np.exp(-mu2**2))
The fit is very sensitive to my guess values. What is the point of fitting data if only nearly perfect guess parameters produce a result? Or am I doing something completely wrong?
t = np.linspace(8.4, 10, 300)
guess_dG = [32, 1, 10, 0.1, 0.2]
popt, pcov = curve_fit(model_dGauss, data_x, data_y, p0=guess_dG, sigma=data_y_err, absolute_sigma=True)
xc, A, y0, w, dx = popt  # same order as the model signature
Plotting the data
plt.scatter(data_x, data_y)
plt.plot(t, model_dGauss(t, *popt))
plt.errorbar(data_x, data_y, yerr=data_y_err)
yields:
Plot result (image not shown)
The result is just a straight line at the bottom of my graph while the evaluated parameters are not that bad. How can that be?
A complete example of code is always appreciated (and, ahem, usually expected here on SO). To remove much of the confusion about using curve_fit here, allow me to suggest that you will have an easier time using lmfit (https://lmfit.github.io/lmfit-py) and especially its builtin model functions and its use of named parameters. With lmfit, your code for two Gaussians plus a constant offset might look like this:
from lmfit.models import GaussianModel, ConstantModel
# start with 1 Gaussian + Constant offset:
model = GaussianModel(prefix='p1_') + ConstantModel()
# this model will have parameters named:
# p1_amplitude, p1_center, p1_sigma, and c.
# here we give initial values to these parameters
params = model.make_params(p1_amplitude=10, p1_center=32, p1_sigma=0.5, c=10)
# optionally place bounds on parameters (probably not needed here):
params['p1_amplitude'].min = 0.
## params['p1_center'].vary = False # fix a parameter from varying in fit
# now do the fit (including weighting residual by 1/y_err):
result = model.fit(data_y, params, x=data_x, weights=1.0/data_y_err)
# print out param values, uncertainties, and fit statistics, or get best-fit
# parameters from `result.params`
print(result.fit_report())
# plot results
plt.errorbar(data_x, data_y, yerr=data_y_err, label='data')
plt.plot(data_x, result.best_fit, label='best fit')
plt.legend()
plt.show()
To add a second Gaussian, you could just do
model = GaussianModel(prefix='p1_') + GaussianModel(prefix='p2_') + ConstantModel()
# and then:
params = model.make_params(p1_amplitude=10, p1_center=32, p1_sigma=0.5, c=10,
                           p2_amplitude=2, p2_center=31.75, p2_sigma=0.5)
and so on.
Your model has the two Gaussians sharing, or at least having "linked", values - the sigma values should be the same for the two peaks and the amplitude of the 2nd should be half that of the 1st. As defined so far, the 2-Gaussian model has all parameters independent. But lmfit has a mechanism for setting constraints on any parameter by giving an algebraic expression in terms of other parameters. So, for example, you could say
params['p2_sigma'].expr = 'p1_sigma'
params['p2_amplitude'].expr = 'p1_amplitude / 2.0'
Now, p2_amplitude and p2_sigma will not be independently varied in the fit but will be constrained to have those values.
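Putting those pieces together, a minimal sketch of the constrained two-Gaussian fit might look like this (assuming the data_x, data_y, and data_y_err arrays from your code):
from lmfit.models import GaussianModel, ConstantModel

model = GaussianModel(prefix='p1_') + GaussianModel(prefix='p2_') + ConstantModel()
params = model.make_params(p1_amplitude=10, p1_center=32, p1_sigma=0.5,
                           p2_amplitude=5, p2_center=31.75, p2_sigma=0.5, c=10)

# tie the second peak to the first: same width, half the amplitude
params['p2_sigma'].expr = 'p1_sigma'
params['p2_amplitude'].expr = 'p1_amplitude / 2.0'

result = model.fit(data_y, params, x=data_x, weights=1.0/data_y_err)
print(result.fit_report())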

Curve fitting with nth order polynomial having sine ripples

I'm modeling measurement errors in a certain measuring device. This is how the data looks: high frequency sine ripples on a low frequency polynomial. My model should capture the ripples too.
The curve that fits the error should be of the form: error(x) = a0 + a1*x + a2*x^2 + ... + an*x^n + A*sin(x/lambda). The order n of the polynomial is not known. My plan is to iterate n from 1 to 9 and select the one that has the highest F-value.
I've played with numpy.polyfit and scipy.optimize.curve_fit so far. numpy.polyfit is only for polynomials, so while I can generate the "best fit" polynomial, there's no way to determine the parameters A and lambda for the sine term. scipy.optimize.curve_fit would have worked great if I already knew the order of the polynomial for the polynomial part of error(x).
Is there a clever way to use both numpy.polyfit and scipy.optimize.curve_fit to get this done? Or another library-function perhaps?
Here's the code for how I'm using numpy.polyfit to select the best polynomial:
def GetErrorPolynomial(X, Y):
    maxFval = 0.0
    for i in range(1, 10):  # i is the order of the polynomial (max order = 9)
        error_func = np.polyfit(X, Y, i)
        error_func = np.poly1d(error_func)
        # F-test (looking for the largest F value)
        numerator = np.sum(np.square(error_func(X) - np.mean(Y))) / i
        denominator = np.sum(np.square(Y - error_func(X))) / (Y.size - i - 1)
        Fval = numerator / denominator
        if Fval > maxFval:
            maxFval = Fval
            maxFvalPolynomial = error_func
    return maxFvalPolynomial
And here's the code for how I'm using curve_fit:
def poly_sine_fit(x, a, b, c, d, l):
    return a*np.square(x) + b*x + c + d*np.sin(x/l)

param, _ = curve_fit(poly_sine_fit, x_data, y_data)
It's "hardcoded" to a quadratic function, but I want to select the "best" order as I'm doing above with np.polyfit
I finally found a way to model the ripples and can answer my own question. This 2006 paper does curve-fitting on ripples that resemble my dataset.
First off, I did a least squares polynomial fit and then subtracted this polynomial curve from the original data. This left me with only the ripples. Applying the Fourier transform, I picked out the dominant frequencies which let me reconstruct the sine ripples. Then I simply added these ripples to the polynomial curve I had obtained in the beginning. That did it.
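A rough sketch of that procedure (a hypothetical illustration, assuming x_data and y_data are the measurement arrays, roughly uniformly spaced in x, and that a polynomial order n has already been selected):
import numpy as np

n = 3  # polynomial order chosen by the F-test loop
poly = np.poly1d(np.polyfit(x_data, y_data, n))

# subtract the polynomial trend; only the ripples remain
ripples = y_data - poly(x_data)

# Fourier transform of the ripples to pick out the dominant frequency
dx = np.mean(np.diff(x_data))
spectrum = np.fft.rfft(ripples)
freqs = np.fft.rfftfreq(len(ripples), d=dx)
k = 1 + np.argmax(np.abs(spectrum[1:]))   # skip the zero-frequency bin

# reconstruct the dominant sine ripple and add it back onto the polynomial
amp = 2.0 * np.abs(spectrum[k]) / len(ripples)
phase = np.angle(spectrum[k])
t = x_data - x_data[0]
y_model = poly(x_data) + amp * np.cos(2.0 * np.pi * freqs[k] * t + phase)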
Use Scikit-learn Linear Regression
Here is a code sample I used to perform a linear regression with a degree-3 polynomial that passes through the point x = 0 with value 1 and zero derivative. You just have to adapt the function create_vector to the function you want.
from sklearn import linear_model
import numpy as np

def create_vector(x):
    # currently representing a polynomial Y = a*X^3 + b*X^2
    x3 = np.power(x, 3)
    x2 = np.power(x, 2)
    X = np.append(x3, x2, axis=1)
    return X

data_x = [some_data_input]
data_y = [some_data_output]
x = np.array(data_x).reshape(-1, 1)
y_data = np.array(data_y).reshape(-1, 1) - 1  # -1 to pass through the point (0, 1)
X = create_vector(x)
regr = linear_model.LinearRegression(fit_intercept=False)
regr.fit(X, y_data)
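To get the fitted coefficients back out and evaluate the model, a small usage sketch under the same assumptions as above:
a, b = regr.coef_[0]                                   # y_data was shaped (-1, 1), so coef_ is 2-D
y_fit = a * np.power(x, 3) + b * np.power(x, 2) + 1    # the +1 undoes the earlier shift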
I extracted data from the scatterplot for analysis and found that a polynomial + sine did not seem to be an optimal model, because lower order polynomials were not following the shape of the data very well and higher order polynomials were exhibiting Runge's phenomenon of high curvature at the data extremes. I performed an equation search to find what the high-frequency sine wave might be imposed upon, and a good candidate seemed to be the Extreme Value peak equation "a * exp(-1.0 * exp(-1.0 * ((x-b)/c))-((x-b)/c) + 1.0) + offset" as shown below.
Here is a graphical Python curve fitter for this equation, at the top of the file I load the data I had extracted so you would need to replace this with the actual data. This fitter uses scipy's differential_evolution genetic algorithm module to estimate initial parameter values for the non-linear fitter, which uses the Latin Hypercube algorithm to ensure a thorough search of parameter space and requires bounds within which to search. Here those bounds are taken from the data maximum and minimum values.
Subtracting the model predictions from this fitted curve should leave you with only the sine component to be modeled. I noted that there seems to be an additional narrow, low-amplitude peak at approximately x = 275.
import numpy, scipy, matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import differential_evolution
import warnings
##########################################################
# load data section
f = open('/home/zunzun/temp/temp.dat')
textData = f.read()
f.close()
xData = []
yData = []
for line in textData.split('\n'):
    if line:  # ignore blank lines
        spl = line.split()
        xData.append(float(spl[0]))
        yData.append(float(spl[1]))
xData = numpy.array(xData)
yData = numpy.array(yData)
##########################################################
# model to be fitted
def func(x, a, b, c, offset):  # Extreme Value Peak equation from zunzun.com
    return a * numpy.exp(-1.0 * numpy.exp(-1.0 * ((x-b)/c)) - ((x-b)/c) + 1.0) + offset
##########################################################
# fitting section
# function for genetic algorithm to minimize (sum of squared error)
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore")  # do not print warnings by genetic algorithm
    val = func(xData, *parameterTuple)
    return numpy.sum((yData - val) ** 2.0)
def generate_Initial_Parameters():
    # min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    minData = min(minX, minY)
    maxData = max(maxX, maxY)
    parameterBounds = []
    parameterBounds.append([minData, maxData])  # search bounds for a
    parameterBounds.append([minData, maxData])  # search bounds for b
    parameterBounds.append([minData, maxData])  # search bounds for c
    parameterBounds.append([minY, maxY])        # search bounds for offset
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x

# differential_evolution yields parameter estimates within the search bounds
geneticParameters = generate_Initial_Parameters()
# now call curve_fit without passing bounds from the genetic algorithm,
# just in case the best fit parameters are outside those bounds
fittedParameters, pcov = curve_fit(func, xData, yData, geneticParameters)
print('Fitted parameters:', fittedParameters)
print()
modelPredictions = func(xData, *fittedParameters)
absError = modelPredictions - yData
SE = numpy.square(absError) # squared errors
MSE = numpy.mean(SE) # mean squared errors
RMSE = numpy.sqrt(MSE) # Root Mean Squared Error, RMSE
Rsquared = 1.0 - (numpy.var(absError) / numpy.var(yData))
print()
print('RMSE:', RMSE)
print('R-squared:', Rsquared)
print()
##########################################################
# graphics output section
def ModelAndScatterPlot(graphWidth, graphHeight):
    f = plt.figure(figsize=(graphWidth/100.0, graphHeight/100.0), dpi=100)
    axes = f.add_subplot(111)
    # first the raw data as a scatter plot
    axes.plot(xData, yData, 'D')
    # create data for the fitted equation plot
    xModel = numpy.linspace(min(xData), max(xData))
    yModel = func(xModel, *fittedParameters)
    # now the model as a line plot
    axes.plot(xModel, yModel)
    axes.set_xlabel('X Data')  # X axis data label
    axes.set_ylabel('Y Data')  # Y axis data label
    plt.show()
    plt.close('all')  # clean up after using pyplot
graphWidth = 800
graphHeight = 600
ModelAndScatterPlot(graphWidth, graphHeight)
UPDATE -------
If the high-frequency sine component is constant (which I do not know) then modeling a small portion of the data with only a few cycles will be sufficient to determine the equation and initial parameter estimates for fitting the sine wave portion of the model. Here I have done this with the following result (plot not shown), obtained from the following equation:
amplitude = -1.0362957093184177E+00
center = 3.6632754608370377E+01
width = 5.0813421718648293E+00
Offset = 5.1940843481496088E+00
pi = 3.14159265358979323846 # constant not fitted
y = amplitude * sin(pi * (x - center) / width) + Offset
Combining these two models using the actual data, rather than my scatterplot-extracted data, should be close to what you need.
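For reference, a minimal sketch of what that combined model might look like for curve_fit (the parameter names are illustrative, not fitted values):
def combined_model(x, a, b, c, offset, amplitude, center, width):
    # Extreme Value peak plus the sine ripple from the small-section fit
    peak = a * numpy.exp(-1.0 * numpy.exp(-1.0 * ((x - b) / c)) - ((x - b) / c) + 1.0)
    ripple = amplitude * numpy.sin(numpy.pi * (x - center) / width)
    return peak + ripple + offset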

How to get errors of parameters from maximum likelihood estimation with known likelihood function in python?

I have some data and want to fit a given psychometric function p.
I'm interested in the fit parameters and the errors as well. With the 'classical' method using the curve_fit function from the scipy package it's easy to get the parameters of p and the errors. However, I want to do the same using a maximum likelihood estimation (MLE). From the output and the figure you can see that both methods give slightly different parameters. Implementing the MLE is not the problem, but I don't know how to get the errors using this method. Is there an easy way to get them? My likelihood function L is:
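(The likelihood was posted as an image; judging from the neg_loglike code below, it appears to be L = prod_i p(x_i)^(5*y_i) * (1 - p(x_i))^(5*(1 - y_i)).)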
I was not able to adapt the code described here http://rlhick.people.wm.edu/posts/estimating-custom-mle.html but this is probably a solution. How can I implement it? Or is there any other way?
A similar function is fitted here using scipy stats models: https://stats.stackexchange.com/questions/66199/maximum-likelihood-curve-model-fitting-in-python. However, the errors of the parameters are not calculated there either.
The negative log-likelihood function is correct, since it yields the right parameters, but I was wondering whether this function should depend on the y-data? The negative log-likelihood function l is simply l = -ln(L).
Here is my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
## libary
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import minimize
def p(x, x50, s50):
    """return y value of psychometric function p"""
    return 1. / (1 + np.exp(4. * s50 * (x50 - x)))

def initialparams(x, y):
    """return initial fit parameters for function p with given dataset"""
    midpoint = np.mean(x)
    slope = (np.max(y) - np.min(y)) / (np.max(x) - np.min(x))
    return [midpoint, slope]

def cfit_error(pcov):
    """return errors of fit from covariance matrix"""
    return np.sqrt(np.diag(pcov))

def neg_loglike(params):
    """analytical negative log likelihood function, dependent on the dataset (x and y) and the two parameters x50 and s50"""
    x50 = params[0]
    s50 = params[1]
    n = len(xdata)
    prod = 1.
    for i in range(n):
        prod *= p(xdata[i], x50, s50)**(ydata[i]*5) * (1 - p(xdata[i], x50, s50))**((1. - ydata[i])*5)
    return -np.log(prod)
xdata = [0.,-7.5,-9.,-13.500001,-12.436171,-16.208617,-13.533123,-12.998025,-13.377527,-12.570075,-13.320075,-13.070075,-11.820075,-12.070075,-12.820075,-13.070075,-12.320075,-12.570075,-11.320075,-12.070075]
ydata = [1.,0.6,0.8,0.4,1.,0.,0.4,0.6,0.2,0.8,0.4,0.,0.6,0.8,0.6,0.2,0.6,0.,0.8,0.6]
intparams = initialparams(xdata, ydata)## guess some initial parameters
## normal curve fit using least squares algorithm
popt, pcov = curve_fit(p, xdata, ydata, p0=intparams)
print('scipy.optimize.curve_fit:')
print('x50 = {:f} +- {:f}'.format(popt[0], cfit_error(pcov)[0]))
print('s50 = {:f} +- {:f}\n'.format(popt[1], cfit_error(pcov)[1]))
## fitting using maximum likelihood estimation
results = minimize(neg_loglike, initialparams(xdata,ydata), method='Nelder-Mead')
print('MLE with self defined likelihood-function:')
print('x50 = {:f}'.format(results.x[0]))
print('s50 = {:f}'.format(results.x[1]))
#print results
## plotting the data and results
xfit = np.arange(-20,1,0.1)
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.plot(xdata, ydata, 'xb', label='measured data')
ax.plot(xfit, p(xfit, *popt), '-r', label='curve fit')
ax.plot(xfit, p(xfit, *results.x), '-g', label='MLE')
plt.legend()
plt.show()
The output is:
scipy.optimize.curve_fit:
x50 = -12.681586 +- 0.252561
s50 = 0.264371 +- 0.117911
MLE with self defined likelihood-function:
x50 = -12.406544
s50 = 0.107389
Both fits and measured data can be seen here:
My Python version is 2.7 on Debian Stretch. Thank you for your help.
Finally the method described by Rob Hicks (http://rlhick.people.wm.edu/posts/estimating-custom-mle.html) worked out. After installing numdifftools, I could calculate the errors of estimated parameters from the hessian matrix.
Installing numdifftools on Linux with su rights:
apt-get install python-pip
pip install numdifftools
A complete code example of my program from above is here:
#!/usr/bin/env python
# -*- coding: utf-8 -*-
## libary
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import minimize
import numdifftools as ndt
def p(x, x50, s50):
    """return y value of psychometric function p"""
    return 1. / (1 + np.exp(4. * s50 * (x50 - x)))

def initialparams(x, y):
    """return initial fit parameters for function p with given dataset"""
    midpoint = np.mean(x)
    slope = (np.max(y) - np.min(y)) / (np.max(x) - np.min(x))
    return [midpoint, slope]

def cfit_error(pcov):
    """return errors of fit from covariance matrix"""
    return np.sqrt(np.diag(pcov))

def neg_loglike(params):
    """analytical negative log likelihood function, dependent on the dataset (x and y) and the two parameters x50 and s50"""
    x50 = params[0]
    s50 = params[1]
    n = len(xdata)
    prod = 1.
    for i in range(n):
        prod *= p(xdata[i], x50, s50)**(ydata[i]*5) * (1 - p(xdata[i], x50, s50))**((1. - ydata[i])*5)
    return -np.log(prod)
xdata = [0.,-7.5,-9.,-13.500001,-12.436171,-16.208617,-13.533123,-12.998025,-13.377527,-12.570075,-13.320075,-13.070075,-11.820075,-12.070075,-12.820075,-13.070075,-12.320075,-12.570075,-11.320075,-12.070075]
ydata = [1.,0.6,0.8,0.4,1.,0.,0.4,0.6,0.2,0.8,0.4,0.,0.6,0.8,0.6,0.2,0.6,0.,0.8,0.6]
intparams = initialparams(xdata, ydata)## guess some initial parameters
## normal curve fit using least squares algorithm
popt, pcov = curve_fit(p, xdata, ydata, p0=intparams)
print('scipy.optimize.curve_fit:')
print('x50 = {:f} +- {:f}'.format(popt[0], cfit_error(pcov)[0]))
print('s50 = {:f} +- {:f}\n'.format(popt[1], cfit_error(pcov)[1]))
## fitting using maximum likelihood estimation
results = minimize(neg_loglike, initialparams(xdata,ydata), method='Nelder-Mead')
## calculating errors from hessian matrix using numdifftools
Hfun = ndt.Hessian(neg_loglike, full_output=True)
hessian_ndt, info = Hfun(results.x)
se = np.sqrt(np.diag(np.linalg.inv(hessian_ndt)))
print('MLE with self defined likelihood-function:')
print('x50 = {:f} +- {:f}'.format(results.x[0], se[0]))
print('s50 = {:f} +- {:f}'.format(results.x[1], se[1]))
Generates the following output:
scipy.optimize.curve_fit:
x50 = -18.702375 +- 1.246728
s50 = 0.063620 +- 0.041207
MLE with self defined likelihood-function:
x50 = -18.572181 +- 0.779847
s50 = 0.078935 +- 0.028783
However, some RuntimeErrors occur when calculating the Hessian matrix with numdifftools (a division by zero). This may be caused by my self-defined neg_loglike function. In the end there are still results for the errors. The method using "Extending Statsmodels" is probably more elegant, but I couldn't figure it out.
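If those warnings do come from the running product in neg_loglike underflowing, a log-domain version might help; a sketch, using the same p, xdata and ydata as above and clipping the probabilities to avoid log(0):
def neg_loglike(params):
    """negative log-likelihood computed as a sum of logs instead of a product"""
    x50, s50 = params
    px = p(np.asarray(xdata), x50, s50)
    px = np.clip(px, 1e-12, 1.0 - 1e-12)  # keep log() away from exactly 0 or 1
    yd = np.asarray(ydata)
    return -np.sum(5.0 * yd * np.log(px) + 5.0 * (1.0 - yd) * np.log(1.0 - px))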

Issue fitting curves:

I'm trying to fit my data to a certain function, but when I try to plot it I always get double lines, as shown in the figure below. This is the code I'm using:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import warnings
from scipy.optimize import differential_evolution
# bounds on parameters are set in generate_Initial_Parameters() below
def func_original(x, a, b, c):
    return a/(x**2) + b/x + c

# bounds on parameters are set in generate_Initial_Parameters() below
def func_recommended(x, a, b, c):
    return 1/(a*x**2 + b*x + c)

# select peak function here
#func = func_original
func = func_recommended

# function for genetic algorithm to minimize (sum of squared error)
# bounds on parameters are set in generate_Initial_Parameters() below
def sumOfSquaredError(parameterTuple):
    warnings.filterwarnings("ignore")  # do not print warnings by genetic algorithm
    return np.sum((yData - func(xData, *parameterTuple)) ** 2)

def generate_Initial_Parameters():
    # data min and max used for bounds
    maxX = max(xData)
    minX = min(xData)
    maxY = max(yData)
    minY = min(yData)
    minSearch = min([minX, minY])
    maxSearch = max([maxX, maxY])
    parameterBounds = []
    parameterBounds.append([minSearch, maxSearch])  # parameter bounds for a
    parameterBounds.append([minSearch, maxSearch])  # parameter bounds for b
    parameterBounds.append([minSearch, maxSearch])  # parameter bounds for c
    # "seed" the numpy random number generator for repeatable results
    result = differential_evolution(sumOfSquaredError, parameterBounds, seed=3)
    return result.x
# load data from text file
data=np.loadtxt('gammaoh.txt')
use=np.transpose(data)
yData=use[0]
xData=use[2]
# generate initial parameter values
initialParameters = generate_Initial_Parameters()
# curve fit the data
fittedParameters, niepewnosci = curve_fit(func, xData, yData, initialParameters)
# create values for display of fitted peak function
a, b, c = fittedParameters
y_fit = func(xData, a, b, c)
plt.plot(xData, yData, 'bo', label='Puntos experimentais $\gamma_{OH}$', markersize=5)
plt.plot(xData, (1/(xData**2*0.5998-2.29255*xData+1.7988)) , 'b-',label='Axuste $\gamma_{OH}$')
plt.title('Axustes coeficientes de actividade ')
plt.xlabel('$\chi_{H_2O}$ ')
plt.ylabel('$\gamma$')
plt.grid(True)
plt.legend(loc=2)
plt.savefig('gammaoh.png')
I would be very grateful if someone could tell me how to fix this, thank you in advance. Also, if anyone knows a better way of fitting data to a given function, it would be nice if you could tell me.
I have no deeper knowledge of the problem you are solving, but to avoid the extra line in the plot, it works if the lists are sorted according to x. I did this:
xData.sort()
tt = (1/(xData**2*0.5998-2.29255*xData+1.7988))
plt.plot(xData, tt , 'b-',label='Axuste $\gamma_{OH}$')
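If you also need yData to stay aligned with the sorted x values (for example to plot the raw points after sorting), sorting both arrays with numpy.argsort is a safer variant:
order = np.argsort(xData)
xSorted, ySorted = xData[order], yData[order]
tt = 1/(xSorted**2*0.5998 - 2.29255*xSorted + 1.7988)
plt.plot(xSorted, tt, 'b-', label='Axuste $\gamma_{OH}$')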

Non-linear fitting with weighted errorbars - Minimizer/scipy.curve_fit/model.fit

I am working on Python fitting code for Michaelis-Menten, a non-linear equation, that should be able to include weighted error bars. At the moment I have tried using Minimizer and model.fit from lmfit, though Minimizer does not seem to include weighted error bars and model.fit seems to be less statistical than Minimizer.
Is there a way to include weighted errorbars in Minimizer?
Would scipy.optimize.curve_fit be a better way to fit this code?
Or is there another fitting program that would be better?
My code is below
def michealies_menten(path, conc, substrate):
    os.chdir(path)
    h = open('average_STD_data.dat', 'r')
    f = open('concentration.dat', 'r')
    x = []
    y = []
    std = []
    std_1 = []
    for line in h:
        line = line.split(',')
        y.append(line[0])
        std.append(line[1])
    for line in f:
        x = line.split(',')
    for i in range(len(std)):
        new = 1.0/(float(std[i])**2.0)
        std_1.append(float(new))
    std.insert(0, '0')
    std_1.insert(0, '0')
    x.insert(0, '0')
    y.insert(0, '0')
    y = map(float, y)
    x = map(float, x)
    std = map(float, std)
    std_1 = map(float, std_1)
    x = np.array(x)
    y = np.array(y)
    std_1 = np.array(std_1)

    #### Model.fit code:
    def my_model(x, Vmax, Km):
        return Vmax * x / (Km + x)
    gmodel = Model(my_model)
    result = gmodel.fit(y, x=x, Vmax=4000.0, Km=3.0, weights=std_1)
    print result.fit_report()
    Vmax_1 = result.params['Vmax'].value
    Km_1 = result.params['Km'].value
    model = (Vmax_1*x/(Km_1+x))

    ### Minimizer code:
    def get_residual(params, x, y):
        Vmax = params['Vmax']
        Km = params['Km']
        model = Vmax * x / (Km + x)
        return model - y

    # Parameters definition for lmfit
    params = Parameters()
    params.add('Vmax', value=4000., min=0)
    params.add('Km', value=3., min=0)
    # Produces the Km and Vmax values, which are then extracted
    minner = Minimizer(get_residual, params, fcn_args=(x, y))
    result = minner.minimize()
    print "Result of minimization, deviation from y value:", result.residual
    # Resulting in the final y-data, which gives the fitted data.
    final = y + result.residual
    print "I am here, Final:", final
    # Gives report on the minimize function
    print "Result.params:"
    result.params.pretty_print()
    print "Report_fit:"
    report_fit(result)
    # Transfer lmfit output of minimize(result) to variables for further use
    Vmax = result.params['Vmax'].value
    Km = result.params['Km'].value
    print "Fitted - Here's Km", Km
    print "Fitted - Here's Vmax", Vmax
    # Draw the different graphs
    #plt.plot(x, fitfunc_michment(x, *popt), 'b', label='curve_fit')
    plt.plot(x, final, 'r', label='lmfit')
    plt.plot(x, model, 'b', label='model')
    plt.plot(x, y, 'rs', label='Raw')
    plt.errorbar(x, y, yerr=std, ecolor='r', linestyle="None")
    # assuming conc holds the unit string (the original used undefined names s, subst, conc_unit)
    plt.xlabel(''.join(['Concentration of ', substrate, ' [', conc, ']']), fontsize=12)
    plt.ylabel('Intensity [A.U.]', fontsize=12)
    plt.savefig('Michaelis_Menten_plot.png', bbox_inches='tight')
    plt.show()
    print 'Hello World, i am the Km value: ', Km
    print 'Vmax value: ', Vmax
Hope you can help me!
Cheers
If I understand correctly, you want to fit the model described in my_model to data y(x) (in the arrays y and x) and use the uncertainty in y, std, to weight the fit -- minimizing (data - model)/uncertainty rather than just data - model.
To do this with lmfit.Model, you want to pass in a weight of 1./std (probably checking for divide-by-zero), as with:
result = gmodel.fit(y, x=x, Vmax=4000.0, Km=3.0, weights=1.0/std)
(it wasn't clear to me why there were both std and std_1.)
To do this with Minimizer, add std to the fcn_args tuple (arguments to be passed to your objective function), and change the objective function to replace
return model - y
with
return (model - y)/std
With that, you should be ready to go.
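Put together, a minimal sketch of the weighted Minimizer setup (assuming x, y, and std are already numpy arrays, and Parameters, Minimizer, and report_fit are imported from lmfit as in your code):
def get_residual(params, x, y, std):
    model = params['Vmax'] * x / (params['Km'] + x)
    return (model - y) / std

params = Parameters()
params.add('Vmax', value=4000., min=0)
params.add('Km', value=3., min=0)

minner = Minimizer(get_residual, params, fcn_args=(x, y, std))
result = minner.minimize()
report_fit(result)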
FWIW, Model.fit uses Minimizer, so it's not really "less statistical", it's just a different emphasis.
As an aside, there are probably more efficient ways to load the data (perhaps some variation of numpy.loadtxt) but that's not the main question here.
