Iterative reconvolution fitting with a measured IRF using Python and lmfit

I am trying to fit an exponential decay function with convolution to a measured instrument response using python and lmfit.
I am new to python and I am trying to follow the code in https://groups.google.com/group/lmfit-py/attach/90f51c25ebb39a52/deconvol_exp2.py?part=0.1&authuser=0.
import numpy as np
from lmfit import Model
import matplotlib.pyplot as plt
import requests
# Load data
url = requests.get('https://groups.google.com/group/lmfit-py/attach/73a983d40ad945b1/tcspcdatashifted.csv?part=0.1&authuser=0')
x,decay1,irf=np.loadtxt(url.iter_lines(),delimiter=',',unpack=True,dtype='float')
plt.figure(1)
plt.semilogy(x,decay1,x,irf)
plt.show()
# Define weights
wWeights=1/np.sqrt(decay1+1)
# define the double exponential model
def jumpexpmodel(x,tau1,ampl1,tau2,ampl2,y0,x0,args=(irf)):
    ymodel=np.zeros(x.size)
    t=x
    c=x0
    n=len(irf)
    irf_s1=np.remainder(np.remainder(t-np.floor(c)-1, n)+n,n)
    irf_s11=(1-c+np.floor(c))*irf[irf_s1.astype(int)]
    irf_s2=np.remainder(np.remainder(t-np.ceil(c)-1,n)+n,n)
    irf_s22=(c-np.floor(c))*irf[irf_s2.astype(int)]
    irf_shift=irf_s11+irf_s22
    irf_reshaped_norm=irf_shift/sum(irf_shift)
    ymodel = ampl1*np.exp(-(x)/tau1)
    ymodel+= ampl2*np.exp(-(x)/tau2)
    z=Convol(ymodel,irf_reshaped_norm)
    z+=y0
    return z
# convolution using fft (x and h of equal length)
def Convol(x,h):
    X=np.fft.fft(x)
    H=np.fft.fft(h)
    xch=np.real(np.fft.ifft(X*H))
    return xch
# assign the model for fitting
mod=Model(jumpexpmodel)
When defining the initial parameters for the fit, I get the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
#initialize the parameters - showing error
pars = mod.make_params(tau1=10,ampl1=1000,tau2=10,ampl2=1000,y0=0,x0=10,args=irf)
pars['x0'].vary =True
pars['y0'].vary =True
print(pars)
# fit this model with weights, initial parameters
result = mod.fit(decay1,params=pars,weights=wWeights,method='leastsq',x=x)
# print results
print(result.fit_report())
# plot results
plt.figure(5)
plt.subplot(2,1,1)
plt.semilogy(x,decay1,'r-',x,result.best_fit,'b')
plt.subplot(2,1,2)
plt.plot(x,result.residual)
plt.show()
Based on the documentation for lmfit.model, I suspect this is because of how the argument irf is defined in the model as args=(irf). I have tried to pass irf to the model instead of params. I have also tried to use **kwargs.
What is the correct way to incorporate irf into the model for convolution and fit the data?

I believe that you want to consider irf as an additional independent variable of the model function - a value that you pass in to the function but is not treated as a variable in the fit.
To do that, just modify the signature of your model function jumpexpmodel() to be the simpler
def jumpexpmodel(x, tau1, ampl1, tau2, ampl2, y0, x0, irf):
The body of the function is fine (in fact, the args=(irf) would not have worked because you would have needed to unpack args -- the signature here is really what you wanted anyway).
Then tell lmfit.Model() that irf is an independent variable - the default is that the first argument is the only independent variable:
mod = Model(jumpexpmodel, independent_vars=('x', 'irf'))
Then, when making the parameters, do not include irf or args:
pars = mod.make_params(tau1=10, ampl1=1000, tau2=10, ampl2=1000, y0=0, x0=10)
but rather now pass in irf along with x to mod.fit():
result = mod.fit(decay1, params=pars, weights=wWeights, method='leastsq', x=x, irf=irf)
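Putting those pieces together, a minimal sketch of the corrected setup might look like the following (the model body and the FFT convolution are reused from the question unchanged; x, decay1, irf, and wWeights are assumed to have been loaded as in the question):
import numpy as np
from lmfit import Model

def Convol(x, h):
    # circular convolution via FFT, as defined in the question
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

def jumpexpmodel(x, tau1, ampl1, tau2, ampl2, y0, x0, irf):
    # same body as in the question; irf is now just another argument
    c = x0
    n = len(irf)
    irf_s1 = np.remainder(np.remainder(x - np.floor(c) - 1, n) + n, n)
    irf_s11 = (1 - c + np.floor(c)) * irf[irf_s1.astype(int)]
    irf_s2 = np.remainder(np.remainder(x - np.ceil(c) - 1, n) + n, n)
    irf_s22 = (c - np.floor(c)) * irf[irf_s2.astype(int)]
    irf_shift = irf_s11 + irf_s22
    irf_norm = irf_shift / sum(irf_shift)
    ymodel = ampl1 * np.exp(-x / tau1) + ampl2 * np.exp(-x / tau2)
    return Convol(ymodel, irf_norm) + y0

mod = Model(jumpexpmodel, independent_vars=('x', 'irf'))
pars = mod.make_params(tau1=10, ampl1=1000, tau2=10, ampl2=1000, y0=0, x0=10)
result = mod.fit(decay1, params=pars, weights=wWeights, method='leastsq',
                 x=x, irf=irf)
print(result.fit_report())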
The rest of your program looks fine and the resulting fit will work reasonably well, giving a report of
[[Model]]
Model(jumpexpmodel)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 138
# data points = 2797
# variables = 6
chi-square = 3795.52585
reduced chi-square = 1.35991610
Akaike info crit = 865.855713
Bayesian info crit = 901.473529
[[Variables]]
tau1: 50.4330421 +/- 0.68246203 (1.35%) (init = 10)
ampl1: 2630.30664 +/- 20.1552948 (0.77%) (init = 1000)
tau2: 225.392872 +/- 2.75674753 (1.22%) (init = 10)
ampl2: 523.257894 +/- 12.4451921 (2.38%) (init = 1000)
y0: 20.7975212 +/- 0.14165429 (0.68%) (init = 0)
x0: -9.70588133 +/- 0.12597936 (1.30%) (init = 10)
[[Correlations]] (unreported correlations are < 0.100)
C(tau2, ampl2) = -0.947
C(tau1, ampl2) = -0.805
C(tau1, tau2) = 0.706
C(tau1, x0) = -0.562
C(ampl1, x0) = 0.514
C(tau1, ampl1) = -0.453
C(tau2, y0) = -0.426
C(ampl2, y0) = 0.314
C(ampl2, x0) = 0.291
C(tau2, x0) = -0.260
C(tau1, y0) = -0.212
C(ampl1, tau2) = 0.119
and a plot like this:

Related

LMFIT not properly fitting where scipy does with same starting parameter values

I have a complicated curve fitting function:
def corr_function(tau: np.ndarray, BG: float, avg_C: float, vz: float):
    wxc = 8.3
    wy = 2.5
    wz = 3.35
    D = 4.4e1
    return 1/((math.pi)**(3/2)*wxc*wy*wz*avg_C)*(1 + 4*D*tau/(wxc**2))**(-1/2)*(1 + 4*D*tau/(wy**2))**(-1/2)*(1 + 4*D*tau/(wz**2))**(-1/2)*np.exp(-((vz*tau)**2/(wz**2 + 4*D*tau))) + BG
I tried to fit this with scipy:
popt, pcov = curve_fit(corr_function, tau, corr, [0, 1e-12, 2e5])
and lmfit
model = Model(corr_function, independent_vars=['tau'])
result = model.fit(
    corr,
    tau=tau,
    BG=Parameter('BG', value=0, min=0),
    avg_C=Parameter('avg_C', value=1e-12, min=0),
    vz=Parameter('vz', value=2e5, min=0),
)
And while scipy converges to a proper answer (blue), lmfit does not (orange); the lmfit parameters hardly change at all during the fit:
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 61
# data points = 400
# variables = 3
chi-square = 1.5370e+12
reduced chi-square = 3.8714e+09
Akaike info crit = 8833.74620
Bayesian info crit = 8845.72059
## Warning: uncertainties could not be estimated:
BG_guess: at boundary
avg_C_guess: at initial value
avg_C_guess: at boundary
[[Variables]]
BG: 0.00000000 (init = 0)
avg_C: 3.9999e-12 (init = 4e-12)
vz: 8831416.63 (init = 200000)
I think I need lmfit to sample a larger parameter space (or run more iterations); does anyone know how to do this?
Also, note that I need the input parameters to be static (I can't bring them closer to the proper fit by hand), as I'll need to automate the fitting over large parameter spaces.

How to use python's lmfit for retrieving best constrained coefficient rates of ordinary differential equations (ODE)?

I am trying to minimize a loss function over an Ordinary differential equation (ODE) problem, using python.
This loss function is meant to retrieve the best coefficient rates of a user defined ODE.
Despite my efforts and the examples I found on the internet, all results ended up with ODEs whose solutions returned coefficient values outside of their respective bounds.
I know that this can happen for several reasons: choice of minimizer, bad ODE definition, integration step-size, among other things.
Nevertheless, my major problem is to find an example where the minimizer can support "breaks" in the ODE integration.
For example, for a simple ODE model such as the prey-predator (Lotka-Volterra) ODE, one rule that must be set is that no population can take negative values.
Therefore, if the ODE minimizer for some reason returns a negative value for either of the two populations modeled (prey and predator), the minimizer must stop its integration.
Nevertheless, as it appears, this kind of rule is not supported by common minimizations, at least for the lmfit Minimizer objects that I tested.
Even with the bounding rules set for each Parameter object, the Minimizers continue to interpolate the data beyond the given thresholds (i.e., negative populations).
For example:
Let's assume that I want to find the best coefficient rates of a Prey-Predator ODE with respect to some empirical data.
In this case, if I use the python's lmfit library for my problem, I could do something like below:
from lmfit import Parameters, report_fit, Minimizer
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import ode

end = '\n'*2 + '-'*50 + '\n'*2

def ode_f(t, xs, ps):
    """Lotka-Volterra predator-prey model."""
    if isinstance(ps, Parameters):
        Prey_Natality = ps['Prey_Natality'].value
        Predation_rate = ps['Predation_rate'].value
        Predator_Natality = ps['Predator_Natality'].value
        Predator_Mortality = ps['Predator_Mortality'].value
        Prey_natural_mortality = ps['Prey_natural_mortality'].value
    else:
        (Prey_Natality, Predation_rate, Predator_Natality,
         Predator_Mortality, Prey_natural_mortality) = ps
    Prey, Predator = xs
    dPrey_dt = (+ Prey_Natality*Prey
                - Predation_rate*Prey*Predator
                - Prey_natural_mortality*Prey)
    dPred_dt = (+ Predation_rate*Prey*Predator
                + Predator_Natality*Predator
                - Predator_Mortality*Predator)
    return [dPrey_dt, dPred_dt]

def solout(y):
    if np.any(y <= 0):
        return -1  # stop integration
    else:
        return 0

def ode_solver(t,
               x0,
               ps,
               mode='vode',
               nsteps=500,
               method='bdf'):
    """
    Solution to the ODE x'(t) = f(t,x,k) with initial condition x(0) = x0
    """
    r = ode(ode_f).set_integrator(mode,
                                  nsteps=nsteps,
                                  method=method)
    t0 = t.min()
    tmax = t.max()
    dt = np.diff(t)[0]
    r.set_initial_value(x0, t0).set_f_params(ps)
    y = []
    times = []
    integration_time = r.t
    while r.successful() and integration_time < tmax:
        r.integrate(integration_time + dt)
        integration_time = r.t
        yi = r.y
        y.append(yi)
        times.append(integration_time)
        if solout(yi) == -1:
            print('Time stoped at: {0}'.format(integration_time))
            break
    return (np.array(y)).astype(float), times

def ODE_solver_residual_evaluator(ps, ts, data):
    x0 = [ps['Prey_Pop'].value, ps['Predator_Pop'].value]
    model, times = ode_solver(ts, x0, ps)
    # if data.shape[0] <= model.shape[0]:
    #     data.resize((model.shape[0], data.shape[1]), refcheck=False)
    # else:
    #     model.resize((data.shape[0], data.shape[1]), refcheck=False)
    return (model[:len(data)] - data[:len(model)])

def residual_dim_reducer(residual_array):
    return np.square(residual_array).sum()

if '__main__' == __name__:
    ########
    dt = 10**(-4)
    t = np.arange(0, 100, dt)
    Prey_initial_pop = 1200
    Predator_initial_pop = 50
    x0 = np.array([Prey_initial_pop,
                   Predator_initial_pop])
    Prey_Natality = 2.6
    Predation_rate = 0.12
    Predator_Natality = 0.401
    Predator_Mortality = 0.0025
    Prey_natural_mortality = 0.001
    true_params = np.array((Prey_Natality,
                            Predation_rate,
                            Predator_Natality,
                            Predator_Mortality,
                            Prey_natural_mortality))
    data, times = ode_solver(t, x0, true_params)
    data += np.random.lognormal(size=data.shape)*0.5
    Param_Names = ['Prey_Natality',
                   'Predation',
                   'Predator_Natality',
                   'Predator_Mortality',
                   'Prey_natural_mortality']
    Populations = ['Prey population',
                   'Predator population']
    for i in range(data.shape[1]):
        plt.plot(times, np.real(data[:,i]), 'o', label=r'original {0}'.format(Populations[i]))
    plt.legend()
    plt.show()
    import os
    to_save = os.path.join(os.getcwd(), 'original data.png')
    plt.savefig(to_save)
    print('Creating the minizer object', end=end)
    ################
    # set parameters incluing bounds
    params = Parameters()
    params.add('Prey_Pop', value=100, min=1, max=600)
    params.add('Predator_Pop', value=10, min=1, max=400)
    params.add('Prey_Natality', value=0.03, min=0.000001, max=3.5)
    params.add('Predation_rate', value=0.02, min=0.00003, max=3.5)
    params.add('Predator_Natality', value=0.0004, min=0.00001, max=3.2)
    params.add('Predator_Mortality', value=0.003, min=0.000001, max=3.2)
    params.add('Prey_natural_mortality', value=0.001, min=0.0000001, max=0.2)
    fitter = Minimizer(ODE_solver_residual_evaluator,
                       params,
                       fcn_args=(t, data),
                       iter_cb=None,
                       scale_covar=True,
                       nan_policy='propagate',
                       reduce_fcn=residual_dim_reducer,
                       calc_covar=True,
                       )
    fitted_ODE = fitter.minimize(method='Nelder-Mead')
    Optimum_params = fitted_ODE.params.valuesdict()
    x0_optimum = np.array([Optimum_params.pop('Prey_Pop'),
                           Optimum_params.pop('Predator_Pop')])
    Y_fitted, times_fitted = ode_solver(t, x0_optimum, Optimum_params.values())
    Param_Names = list(params.keys())
    print(end, 'Param_Names: {0}'.format(Param_Names))
    ##################
    data = data[:len(Y_fitted)]
    Y_fitted = Y_fitted[:len(data)]
    from sklearn.metrics import (explained_variance_score,
                                 r2_score,
                                 mean_squared_error)
    explained_variance_score = explained_variance_score(Y_fitted, data)
    R2 = r2_score(Y_fitted, data)
    RMSE = np.sqrt(mean_squared_error(Y_fitted, data))
    print('Explained variance: {0} \n R2: {1} \n RMSE: {2}'.format(explained_variance_score,
                                                                   R2, RMSE))
    print(end, 'Fitting by Minizer is complete', end=end)
    # display fitted statistics
    report_fit(fitted_ODE)
    print(end)
    ######################
    fig2 = plt.figure()
    n_ref_markers = 12
    markers_on = np.linspace(0, data[:,i].size-1, n_ref_markers).astype(int).tolist()
    for i in range(Y_fitted.shape[1]):
        plt.plot(times_fitted[:len(times)],
                 Y_fitted[:len(times),i],
                 '-',
                 linewidth=1.1,
                 label=r'fitted {0} '.format(Param_Names[i]))
        plt.plot(np.arange(len(data)), data[:,i], 'o',
                 markevery=markers_on,
                 label=r'original {0}'.format(Param_Names[i]))
    fig2.legend()
    fig2.show()
The returned values are:
Two figures:
A) the original data
B) the original data plus the ODE model evaluated with the best coefficient rates returned by lmfit
A report containing fit statistics and statistical descriptions of the parameters.
Returned output:
--------------------------------------------------
Fitting by Minizer is complete
--------------------------------------------------
Models fitting error:
Explained variance: -90.1682809468072
R2: -3125.4358694840084
RMSE: 785.9581933129715
--------------------------------------------------
[[Fit Statistics]]
# fitting method = Nelder-Mead
# function evals = 66437
# data points = 364
# variables = 7
chi-square = 2.2485e+08
reduced chi-square = 629842.649
Akaike info crit = 4867.50583
Bayesian info crit = 4894.78590
## Warning: uncertainties could not be estimated:
[[Variables]]
Prey_Pop: 117.479436 +/- 16.3998313 (13.96%) (init = 100)
Predator_Pop: 397.552948 +/- nan (nan%) (init = 10)
Prey_Natality: 1.05567443 +/- nan (nan%) (init = 0.03)
Predation_rate: 3.46543190 +/- 0.05124925 (1.48%) (init = 0.02)
Predator_Natality: 0.48528830 +/- nan (nan%) (init = 0.0004)
Predator_Mortality: 2.76733581 +/- 0.06777831 (2.45%) (init = 0.003)
Prey_natural_mortality: 0.03928761 +/- 0.00503378 (12.81%) (init = 0.001)
[[Correlations]] (unreported correlations are < 0.100)
C(Predation_rate, Prey_natural_mortality) = -0.596
C(Prey_Pop, Predator_Mortality) = 0.179
C(Predation_rate, Predator_Mortality) = 0.141
C(Prey_Pop, Prey_natural_mortality) = 0.127
C(Predator_Mortality, Prey_natural_mortality) = -0.112
This problem came from a didactic example extracted from here:
Any help on the subject would be appreciated.
Sincerely,

Fitting Voigt function to data in Python

I recently got a script running to fit a Gaussian to my absorption profile with help from SO. My hope was that things would work fine if I simply replaced the Gauss function with a Voigt one, but this seems not to be the case, I think mainly because it is a shifted Voigt.
Edit: The profiles are absorption lines that vary in optical thickness. In practice they will be a mix between optically thick and thin features. Like the bottom part in this diagram. The current data will be more like the top image, but maybe the bottom is already flattened a bit. (And we only see the left side of the profile, a bit beyond the center)
For a Gauss it looks like this and as predicted the bottom seems to be less deep than the fit wants it to be, but still quite close. The profile itself should still be a voigt though. But now I realize that the central points might throw off the fit. So maybe a weight should be added based on wing position?
I'm mostly wondering if the shifted function could be mis-defined or if its my starting values.
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
from scipy.special import wofz
x = np.arange(13)
xx = np.linspace(0, 13, 100)
y = np.array([19699.959 , 21679.445 , 21143.195 , 20602.875 , 16246.769 ,
11635.25 , 8602.465 , 7035.493 , 6697.0337, 6510.092 ,
7717.772 , 12270.446 , 16807.81 ])
# weighted arithmetic mean (corrected - check the section below)
#mean = 2.4
sigma = 2.4
gamma = 2.4
def Gauss(x, y0, a, x0, sigma):
    return y0 + a * np.exp(-(x - x0)**2 / (2 * sigma**2))

def Voigt(x, x0, y0, a, sigma, gamma):
    #sigma = alpha / np.sqrt(2 * np.log(2))
    return y0 + a * np.real(wofz((x - x0 + 1j*gamma)/sigma/np.sqrt(2))) / sigma / np.sqrt(2*np.pi)
popt, pcov = curve_fit(Voigt, x, y, p0=[8, np.max(y), -(np.max(y)-np.min(y)), sigma, gamma])
#p0=[8, np.max(y), -(np.max(y)-np.min(y)), mean, sigma])
plt.plot(x, y, 'b+:', label='data')
plt.plot(xx, Voigt(xx, *popt), 'r-', label='fit')
plt.legend()
plt.show()
I may be misunderstanding the model you're using, but I think you would need to include some sort of constant or linear background.
To do that with lmfit (which has Voigt, Gaussian, and many other models built in, and tries very hard to make these interchangeable), I would suggest starting with something like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import GaussianModel, VoigtModel, LinearModel, ConstantModel
x = np.arange(13)
xx = np.linspace(0, 13, 100)
y = np.array([19699.959 , 21679.445 , 21143.195 , 20602.875 , 16246.769 ,
11635.25 , 8602.465 , 7035.493 , 6697.0337, 6510.092 ,
7717.772 , 12270.446 , 16807.81 ])
# build model as Voigt + Constant
## model = GaussianModel() + ConstantModel()
model = VoigtModel() + ConstantModel()
# create parameters with initial values
params = model.make_params(amplitude=-1e5, center=8,
sigma=2, gamma=2, c=25000)
# maybe place bounds on some parameters
params['center'].min = 2
params['center'].max = 12
params['amplitude'].max = 0.
# do the fit, print out report with results
result = model.fit(y, params, x=x)
print(result.fit_report())
# plot data, best fit, fit interpolated to `xx`
plt.plot(x, y, 'b+:', label='data')
plt.plot(x, result.best_fit, 'ko', label='fitted points')
plt.plot(xx, result.eval(x=xx), 'r-', label='interpolated fit')
plt.legend()
plt.show()
And, yes, you can simply replace VoigtModel() with GaussianModel() or LorentzianModel() and redo the fit and compare the fit statistics to see which model is better.
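For instance, a small sketch of that comparison (reusing the x, y, and background setup above, and leaving out the parameter bounds for brevity) could loop over the candidate lineshapes and compare statistics such as chi-square or AIC:
from lmfit.models import GaussianModel, LorentzianModel, VoigtModel, ConstantModel

# try several peak shapes on top of the same constant background
for peak in (GaussianModel(), LorentzianModel(), VoigtModel()):
    model = peak + ConstantModel()
    params = model.make_params(amplitude=-1e5, center=8, sigma=2, c=25000)
    result = model.fit(y, params, x=x)
    print('{:16s} chi-square = {:.1f}   aic = {:.1f}'.format(
        peak.__class__.__name__, result.chisqr, result.aic))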
For the Voigt model fit, the printed report would be
[[Model]]
(Model(voigt) + Model(constant))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 41
# data points = 13
# variables = 4
chi-square = 17548672.8
reduced chi-square = 1949852.54
Akaike info crit = 191.502014
Bayesian info crit = 193.761811
[[Variables]]
amplitude: -173004.338 +/- 30031.4068 (17.36%) (init = -100000)
center: 8.06574198 +/- 0.16209266 (2.01%) (init = 8)
sigma: 1.96247322 +/- 0.23522096 (11.99%) (init = 2)
c: 23800.6655 +/- 1474.58991 (6.20%) (init = 25000)
gamma: 1.96247322 +/- 0.23522096 (11.99%) == 'sigma'
fwhm: 7.06743644 +/- 0.51511574 (7.29%) == '1.0692*gamma+sqrt(0.8664*gamma**2+5.545083*sigma**2)'
height: -18399.0337 +/- 2273.61672 (12.36%) == '(amplitude/(max(2.220446049250313e-16, sigma*sqrt(2*pi))))*wofz((1j*gamma)/(max(2.220446049250313e-16, sigma*sqrt(2)))).real'
[[Correlations]] (unreported correlations are < 0.100)
C(amplitude, c) = -0.957
C(amplitude, sigma) = -0.916
C(sigma, c) = 0.831
C(center, c) = -0.151
Note that by default gamma is constrained to be the same value as sigma. This constraint can be lifted and gamma made to vary independently with params['gamma'].set(expr=None, vary=True, min=1.e-9). I think that you may not have enough data points in this data set to robustly and independently determine gamma.
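If you do want to try that, a minimal sketch (reusing model, params, x, and y from the fit above) would be:
# let gamma vary independently of sigma and refit
params['gamma'].set(expr=None, vary=True, min=1.e-9)
result2 = model.fit(y, params, x=x)
print(result2.fit_report())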
The plot for the original fit (with gamma tied to sigma) would look like this:
I managed to get something, but it is not very satisfying. If you remove the offset as a parameter and add 20000 directly in the Voigt function, with starting values [8, 126000, 0.71, 2] (the particular values don't matter much), you'll get something like
Now, the fit produces a value for gamma which is negative, which I cannot really justify. I would expect gamma to always be positive, but maybe I'm wrong and it's completely fine.
One thing you could try is to mirror your data so that it's a "positive" peak (and, while at it, remove the background) and/or normalize the values. That might help with the convergence.
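A rough sketch of that preprocessing idea (hypothetical variable names, assuming the x and y arrays from the question):
import numpy as np

# mirror the absorption dip into a positive peak, removing most of the offset
y_flipped = np.max(y) - y
# optionally normalize so the peak height is of order 1
y_scaled = y_flipped / np.max(y_flipped)
# fit y_scaled (or y_flipped) with a positive peak, then undo the flip and
# scaling on the fitted amplitude and offset afterwards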
I have no idea why, when using the offset as a parameter, the solver has problems finding an optimum. Maybe you need a different optimizer routine.
Maybe it would be a better option to use the lmfit package, which is a wrapper over scipy for fitting nonlinear functions with many prebuilt lineshapes. There is even an example of fitting a Voigt profile.

Lmfit gives -1 correlation and large uncertainty (python)

I am trying to fit a model function to a curve using the lmfit module.
The curve that I am fitting is set up as follows:
e(x) = exp(-(x-X)/x0) for x greater than or equal to X, and 0 otherwise.
G(x) = 1/(sqrt(2*pi)*sigma) * exp(-x^2/(2*sigma^2))
The model fit is M(x) = E * conv(e,G)(x) + B
where e is a truncated exponential, G is a Gaussian, and E and B are constants. The operator between e and G is a convolution.
When I try to fit this function to my data I get a good fit. However, the fit is very sensitive to the initial value that I provide for X. This is also reflected in the uncertainty in the parameters:
[[Model]]
((Model(totemiss) * (Model(exptruncated) <function convolve at 0x7f139e2dcde8> Model(gaussian))) + Model(background))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 67
# data points = 54
# variables = 5
chi-square = 120558969110355112544642583094864038386991104.00000
reduced chi-square = 2460387124701124853181382654239391973638144.00000
Akaike info crit = 5275.63336
Bayesian info crit = 5285.57828
[[Variables]]
E: 9.7316e+28 +/- 2.41e+33 (2475007.74%) (init= 1.2e+29)
x0: 5.9420e+06 +/- 9.52e+04 (1.60%) (init= 5000000)
X: 4.9049e+05 +/- 1.47e+11 (29978575.17%) (init= 100000)
sigma: 2.6258e+06 +/- 5.74e+04 (2.19%) (init= 2000000)
center: 0 (fixed)
amplitude: 1 (fixed)
B: 3.9017e+22 +/- 3.75e+20 (0.96%) (init= 4.5e+22)
[[Correlations]] (unreported correlations are < 0.100)
C(E, X) = -1.000
C(sigma, B) = -0.429
C(x0, sigma) = -0.283
C(x0, B) = -0.266
C(E, x0) = -0.105
C(x0, X) = 0.105
I suspect this has something to do with the correlation between E and X being -1.00, which does not make any sense. I am trying to find out why I get this error and I believe it might be in the definition of the model:
def exptruncated(x, x0, X):
    return np.exp(-(x-X)/x0) * (x > X)

# Define convolution operator
def convolve(arr, kernel):
    npts = min(len(arr), len(kernel))
    pad = np.ones(npts)
    tmp = np.concatenate((pad*arr[0], arr, pad*arr[-1]))
    out = np.convolve(tmp, kernel, mode='valid')
    noff = int((len(out) - npts)/2)
    return out[noff:noff+npts]

# Constant value for total emissions
def totemiss(x, E):
    return E

# Constant value for background
def background(x, B):
    return B

# create Composite Model using the custom convolution operator
# M(x) = E * conv(exp, gauss) + B
mod = Model(totemiss) * CompositeModel(Model(exptruncated), Model(gaussian), convolve) + Model(background)
mod.set_param_hint('x0', value=50*1e5, min=0, max=60*1e5)
mod.set_param_hint('amplitude', value=1.0)
mod.set_param_hint('center', value=0.0)
mod.set_param_hint('sigma', value=20*1e5, min=0, max=100*1e5)
mod.set_param_hint('X', value=1.0*1e5, min=0, max=5.0*1e5)
mod.set_param_hint('B', value=0.45*1e23, min=0.3*1e23, max=1.0*1e23)
mod.set_param_hint('E', value=1.2*1e29, min=1.2*1e26, max=1.0*1e32)
pars = mod.make_params()
pars['amplitude'].vary = False
pars['center'].vary = False
result = mod.fit(y, params=pars, x=x)
comps = result.eval_components(x=x)
Although I believe the model is the reason, I am not able to find where the error comes from. Perhaps somebody can help me out!
Why not just remove E from the model -- the X parameter is serving as a constant offset.
I'd also advise having parameters that are more reasonably scaled, closer to order of unity (roughly 1e-6 to 1e6, say). You can add scales of 1e10 and so on as needed in the model calculation, but it generally helps the calculations of covariance (used to determine how to update values in the fit) to have parameters more uniformly scaled.
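As a rough illustration of the rescaling suggestion (the factors 1e6 and 1e22 below are assumptions picked to match the magnitudes of the hints above, not part of the original answer):
import numpy as np

# work in units where the fitted numbers are closer to order 1
x_scaled = x / 1e6      # x0, X and sigma are then expected to be O(1)
y_scaled = y / 1e22     # B is then O(1)

# same composite model as before, but with rescaled hints and bounds
mod.set_param_hint('x0', value=5.0, min=0, max=6.0)       # was 50*1e5 ... 60*1e5
mod.set_param_hint('sigma', value=2.0, min=0, max=10.0)   # was 20*1e5 ... 100*1e5
mod.set_param_hint('X', value=0.1, min=0, max=0.5)        # was 1.0*1e5 ... 5.0*1e5
mod.set_param_hint('B', value=4.5, min=3.0, max=10.0)     # was 0.45*1e23 ... 1.0*1e23
# (if E is kept rather than removed, it needs a similarly rescaled hint)
pars = mod.make_params()
result = mod.fit(y_scaled, params=pars, x=x_scaled)
# convert back afterwards, e.g. x0_in_original_units = result.params['x0'].value * 1e6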

How to convert a user input string into an equation which can be used to find the best fit to some data

I have some x and y data which I plot in a graph window. I would like a user to define an equation and then use something like SciPy to find the best values for that equation.
As an example equation
user input => y = ((m^2 / c^4) * 2)^0.5
How can I put this string into curve_fitting or something similar and find the missing values? I thought I could use an anonymous function, but that does not seem to be working for me.
You might find lmfit (http://lmfit.github.io/lmfit-py/) useful for this purpose. As part of its high-level approach to curve-fitting, it has an ExpressionModel class that supports user-defined model functions taken from Python expressions. More details can be found at
http://lmfit.github.io/lmfit-py/builtin_models.html#user-defined-models. As a simple example (taken from the example folder in the github repo):
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import ExpressionModel
x = np.linspace(-10, 10, 201)
amp, cen, wid = 3.4, 1.8, 0.5
y = amp * np.exp(-(x-cen)**2 / (2*wid**2)) / (np.sqrt(2*np.pi)*wid)
y = y + np.random.normal(size=len(x), scale=0.01)
gmod = ExpressionModel('amp * exp(-(x-cen)**2 /(2*wid**2))/(sqrt(2*pi)*wid)')
result = gmod.fit(y, x=x, amp=5, cen=5, wid=1)
print(result.fit_report())
plt.plot(x, y, 'bo')
plt.plot(x, result.init_fit, 'k--')
plt.plot(x, result.best_fit, 'r-')
plt.show()
will print out the results of
[[Model]]
Model(_eval)
[[Fit Statistics]]
# function evals = 54
# data points = 201
# variables = 3
chi-square = 0.019
reduced chi-square = 0.000
Akaike info crit = -1856.580
Bayesian info crit = -1846.670
[[Variables]]
amp: 3.40478705 +/- 0.005053 (0.15%) (init= 5)
cen: 1.79930413 +/- 0.000858 (0.05%) (init= 5)
wid: 0.50051059 +/- 0.000858 (0.17%) (init= 1)
[[Correlations]] (unreported correlations are < 0.100)
C(amp, wid) = 0.577
and produce a plot of
Just to be clear: this uses the asteval module (https://newville.github.io/asteval/) to parse and evaluate the user input in a way that tries to be as safe as possible from malicious user input that would be exposed using a plain eval.
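For what it's worth, a tiny sketch of asteval on its own (the expression here is made up):
from asteval import Interpreter

aeval = Interpreter()
aeval.symtable['x'] = 3.0          # expose a value to the expression
print(aeval('sqrt(x**2 + 1)'))     # many numpy functions are preloaded in the symbol table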
