Python data fitting using curve_fit

(This task uses the jupyter notebook system)
This is not
Fit the Higgs mass - given a fitter(xvalues, data, init) function below, write a function fitfunc(...) that describes the combined background and signal model to fit the data. Create two pictures:
(a) plot the data with cross markers ('+' symbol) and the best fit curve as red line on the first plot and
(b) draw the residuals with cross markers on the second plot where residuals are defined as the difference between best fit model and pure background model, see below.
The fit function is composed of a background model with 3 parameters
𝑏(𝑚)=𝐴 * exp(𝑏1(𝑚−105.5)+𝑏2(𝑚−105.5)^2)
The signal is added to the background and its model is
𝑠(𝑚)=𝑅/(𝜎√(2𝜋)) * exp(−(𝑚−𝜇)^2/(2𝜎^2))
The equations are not an issue, it is easy to put them into code, as I have done below:
# YOUR CODE HERE
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
#
def fitter(xval, yval, initial):
    ''' function to fit the given data using a 'fitfunc' TBD.
    The curve_fit function is called. Only the best fit values
    are returned to be utilized in a main script.
    '''
    best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
    return best
# Use functions with script below for plotting parts (a) and (b)
The fitter method was already provided, so I don't think it is to be changed.
This is my code for plotting the results:
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = (125.8, 1.4, 470.0, 5000.0, -0.04, -1.5e-4)
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
# YOUR CODE HERE
def main():
    arr = np.ndarray(init)
    fitt = fitfunc(xvalues, init[0], init[1], init[2], init[3], init[4], init[5])
    def plota(xval, yval):
        fig = plt.figure()
        axis1 = fig.add_axes([0.12, 0.1, 0.85, 0.85])
        axis1.plot(xval, yval, marker="+", color="red")
        axis1.set_title("Combined", size=12)
        axis1.set_xlabel("Mass [GeV]", size=12)
        plt.show()
        return
    plota(xvalues, fitt)
    plota(xvalues, fitter(xvalues, fitt, arr))
main()
In this second block, my code starts after the "#YOUR CODE HERE", the rest was already provided.
At the end, the first call of plota() plots the curve of points computed by fitfunc from the initial values, and the second call is my attempt at the "best fit curve" asked for in (a). The first call plots just fine, but it is not what the question is asking for. Running this gives a TypeError: "'float' object cannot be interpreted as an integer". I tried rounding the values to integers as well, and then I get this error instead: "fitfunc() missing 6 required positional arguments: 'mu', 'sigma', 'R', 'A', 'b1', and 'b2'". I think I am on the right lines with the second call, but I don't know what the third parameter of the fitter function is supposed to be. The notes I have been given say it is supposed to be some sort of initial guess, but I don't know what this would have to be.
As for part (b), I am not sure how I would get the residuals. I think I can just take the "best" array returned from the fitter function, calculate the b(m) values separately and subtract, but I am unsure about the wording of the question.
Thank you for any help.
TypeError Traceback (most recent call last)
<ipython-input-2-30fd8d6062a3> in <module>
27 plota(xvalues, fitt)
28 plota(xvalues, fitter(xvalues, fitt, arr))
---> 29 main()
30
<ipython-input-2-30fd8d6062a3> in main()
26 return
27 plota(xvalues, fitt)
---> 28 plota(xvalues, fitter(xvalues, fitt, arr))
29 main()
30
<ipython-input-1-ac8e97799a28> in fitter(xval, yval, initial)
22 are returned to be utilized in a main script.
23 '''
---> 24 best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
25 return best
26
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in curve_fit(f, xdata, ydata, p0, sigma, absolute_sigma, check_finite, bounds, method, jac, **kwargs)
750 # Remove full_output from kwargs, otherwise we're passing it in twice.
751 return_full = kwargs.pop('full_output', False)
--> 752 res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
753 popt, pcov, infodict, errmsg, ier = res
754 cost = np.sum(infodict['fvec'] ** 2)
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
381 if not isinstance(args, tuple):
382 args = (args,)
--> 383 shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
384 m = shape[0]
385
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
24 def _check_func(checker, argname, thefunc, x0, args, numinputs,
25 output_shape=None):
---> 26 res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
27 if (output_shape is not None) and (shape(res) != output_shape):
28 if (output_shape[0] != 1):
C:\ProgramData\Anaconda3\lib\site-packages\scipy\optimize\minpack.py in func_wrapped(params)
456 if transform is None:
457 def func_wrapped(params):
--> 458 return func(xdata, *params) - ydata
459 elif transform.ndim == 1:
460 def func_wrapped(params):
TypeError: fitfunc() missing 6 required positional arguments: 'mu', 'sigma', 'R', 'A', 'b1', and 'b2'

I think you're close but for two things:
positive values for b1 and b2 can make the exponential overflow to infinity
curve_fit returns the best-fit parameter values (and their covariance), not the best-fit curve. You'll have to calculate the curve yourself.
You also probably want to fit the data array, right? I think this might be what you're looking for
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    """comment about Higgs mass here"""
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = np.array([125.8, 2, 470, 5000., -0.05, -0.001])
init_fit = fitfunc(xvalues, *init)
best, _ = curve_fit(fitfunc, xvalues, data, p0=init)
print(best)
best_fit = fitfunc(xvalues, *best)
plt.plot(xvalues, data, color='red', marker='+', label='data')
plt.plot(xvalues, init_fit, color='black', label='init')
plt.plot(xvalues, best_fit, color='blue', label='fit')
plt.gca().set_title("Combined", size=12)
plt.gca().set_xlabel("Mass [GeV]", size=12)
plt.legend()
plt.show()
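For part (b), which the code above does not cover, here is a minimal sketch of the residuals as the question defines them: best-fit model minus pure background model. The helper names background and signal are my own, and the parameter values are copied from the lmfit report quoted further down rather than recomputed here, so treat them as illustrative.

```python
import numpy as np

def background(m, A, b1, b2):
    """Pure background model b(m)."""
    return A * np.exp(b1 * (m - 105.5) + b2 * (m - 105.5) ** 2)

def signal(m, mu, sigma, R):
    """Gaussian signal model s(m)."""
    return R / (sigma * np.sqrt(2 * np.pi)) * np.exp(-(m - mu) ** 2 / (2 * sigma ** 2))

def fitfunc(m, mu, sigma, R, A, b1, b2):
    """Combined model: background plus signal."""
    return background(m, A, b1, b2) + signal(m, mu, sigma, R)

xvalues = np.arange(start=105.5, stop=160.5, step=1)

# Assumed best-fit values (mu, sigma, R, A, b1, b2), copied from the fit report.
best = (125.940465, 1.52638256, 677.016219, 4660.71073, -0.04279037, 1.7476e-04)
mu, sigma, R, A, b1, b2 = best

# Residuals as defined in the task: best-fit model minus background-only model.
residuals = fitfunc(xvalues, *best) - background(xvalues, A, b1, b2)
```

These can be drawn with cross markers via plt.plot(xvalues, residuals, '+'). With this definition the residuals are just the fitted signal term; if the task actually intends data minus background, replacing fitfunc(xvalues, *best) with data should be the only change needed.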
If you'll allow, I'd also suggest using lmfit (http://lmfit.github.io/lmfit-py/) (disclosure: I am one of the authors) for this. Using this library, the code above with curve_fit would transform to
from lmfit import Model
h_model = Model(fitfunc)
params = h_model.make_params(mu=125.8, sigma=2, R=470,
                             A=5000, b1=-0.05, b2=-0.001)
result = h_model.fit(data, params, m=xvalues)
print(result.fit_report())
plt.plot(xvalues, data, color='red', marker='+', label='data')
plt.plot(xvalues, result.init_fit, color='black', label='init')
plt.plot(xvalues, result.best_fit, color='blue', label='fit')
plt.gca().set_title("Combined", size=12)
plt.gca().set_xlabel("Mass [GeV]", size=12)
plt.legend()
plt.show()
Note here that with lmfit, Parameters are named using your function arguments. In lmfit all parameters can have bounds, so you could do something like
params['b1'].max = 0.0
to ensure that b1 never becomes positive. You can also fix any of the parameter values, and there are many other features.
The printed report for this fit would include estimates of uncertainties and correlations as well as fit statistics:
[[Model]]
Model(fitfunc)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 100
# data points = 55
# variables = 6
chi-square = 106329.424
reduced chi-square = 2169.98824
Akaike info crit = 428.183028
Bayesian info crit = 440.227027
[[Variables]]
mu: 125.940465 +/- 0.34609625 (0.27%) (init = 125.8)
sigma: 1.52638256 +/- 0.37354633 (24.47%) (init = 2)
R: 677.016219 +/- 163.585050 (24.16%) (init = 470)
A: 4660.71073 +/- 24.3437093 (0.52%) (init = 5000)
b1: -0.04279037 +/- 7.7658e-04 (1.81%) (init = -0.05)
b2: 1.7476e-04 +/- 1.7587e-05 (10.06%) (init = -0.001)
[[Correlations]] (unreported correlations are < 0.100)
C(b1, b2) = -0.952
C(A, b1) = -0.775
C(sigma, R) = 0.655
C(A, b2) = 0.650
C(R, b1) = -0.492
C(R, b2) = 0.445
C(sigma, b1) = -0.317
C(sigma, b2) = 0.287
C(R, A) = 0.230
C(sigma, A) = 0.146

I modified your code to run, so your init array has changed for me here.
"""."""
# YOUR CODE HERE
import math
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def fitfunc(m, mu, sigma, R, A, b1, b2):
    """."""
    tb1 = b1 * (m - 105.5)
    tb2 = b2 * ((m-105.5)**2)
    b = A * np.exp(tb1 + tb2)
    ts1 = R / (sigma * np.sqrt(2 * np.pi))
    ts2 = -(((m - mu)**2) / (2 * (sigma**2)))
    s = ts1 * np.exp(ts2)
    tot = b + s
    return tot
def fitter(xval, yval, initial):
    """
    Function to fit the given data using a 'fitfunc' TBD.
    The curve_fit function is called. Only the best fit values
    are returned to be utilized in a main script.
    """
    best, _ = curve_fit(fitfunc, xval, yval, p0=initial)
    return best
# Use functions with script below for plotting parts (a) and (b)
# start value parameter definitions, see equations for s(m) and b(m).
# init[0] = mu
# init[1] = sigma
# init[2] = R
# init[3] = A
# init[4] = b1
# init[5] = b2
init = (126, 2, 470, 5000, 1, 5)
xvalues = np.arange(start=105.5, stop=160.5, step=1)
data = np.array([4780, 4440, 4205, 4150, 3920, 3890, 3590, 3460, 3300, 3200, 3000,
2950, 2830, 2700, 2620, 2610, 2510, 2280, 2330, 2345, 2300, 2190,
2080, 1990, 1840, 1830, 1730, 1680, 1620, 1600, 1540, 1505, 1450,
1410, 1380, 1380, 1250, 1230, 1220, 1110, 1110, 1080, 1055, 1050,
940, 920, 950, 880, 870, 850, 800, 820, 810, 770, 760])
def main():
    """."""
    arr = np.ndarray(init)
    fitt = fitfunc(xvalues, init[0], init[1], init[2], init[3], init[4], init[5])
    def plota(xval, yval):
        fig = plt.figure()
        axis1 = fig.add_axes([0.12, 0.1, 0.85, 0.85])
        axis1.plot(xval, yval, marker="+", color="red")
        axis1.set_title("Combined", size=12)
        axis1.set_xlabel("Mass [GeV]", size=12)
        plt.show()
        return
    plota(xvalues, fitt)
    plota(xvalues, fitter(xvalues, fitt, arr))
main()
Note the indentation on main is off by 1 tab/space grouping.
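One more thing worth checking in the code above, as far as I can tell: np.ndarray(init) does not convert the tuple into an array of those six numbers. The np.ndarray constructor treats its first argument as a shape, which would explain both errors in the question; np.array(init), or simply passing init itself, gives the six-element initial guess that curve_fit expects. A small sketch, assuming the rounding mentioned in the question was a plain round() of each value:

```python
import numpy as np

init = (125.8, 1.4, 470.0, 5000.0, -0.04, -1.5e-4)

# np.ndarray(shape) reads the tuple as array dimensions, so floats are rejected.
try:
    np.ndarray(init)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Rounding makes every entry a valid dimension, but two of them become 0,
# so the result is an empty array and curve_fit receives no parameters at all.
rounded = tuple(int(round(v)) for v in init)
empty_guess = np.ndarray(rounded)

# np.array(values) is the call that stores the numbers themselves.
p0 = np.array(init)

print(error_message)
print(rounded, empty_guess.size)
print(p0.shape)
```

An empty p0 means fitfunc gets called with only the x values, which matches the "missing 6 required positional arguments" message.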

Related

ValueError: too many values to unpack, while using LMfit with solve_ivp

I am facing ValueError: too many values to unpack (expected 2) while optimizing the parameters of a system of ODEs using solve_ivp. In fact, I get the same error when I try to use solve_ivp instead of odeint in this SO answer, which you may use as a minimal working example since, as far as I can tell, it has the same problem. The only changes I made to that code were to swap the positions of y and t in the arguments of f and to solve it with solve_ivp, like so: x = solve_ivp(f, t, x0, args=(paras,)), instead of using odeint in g.
Here's the full code for the sake of convenience:
# import libraries
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint, solve_ivp
from lmfit import minimize, Parameters, Parameter, report_fit
def f(t, y, paras):
    """
    Your system of differential equations
    """
    x1 = y[0]
    x2 = y[1]
    x3 = y[2]
    try:
        k0 = paras['k0'].value
        k1 = paras['k1'].value
    except KeyError:
        k0, k1 = paras
    # the model equations
    f0 = -k0 * x1
    f1 = k0 * x1 - k1 * x2
    f2 = k1 * x2
    return [f0, f1, f2]
def g(t, x0, paras):
    """
    Solution to the ODE x'(t) = f(t,x,k) with initial condition x(0) = x0
    """
    x = solve_ivp(f, t, x0, args=(paras,))
    return x
def residual(paras, t, data):
    """
    compute the residual between actual data and fitted data
    """
    x0 = paras['x10'].value, paras['x20'].value, paras['x30'].value
    model = g(t, x0, paras)
    # you only have data for one of your variables
    x2_model = model[:, 1]
    return (x2_model - data).ravel()
# initial conditions
x10 = 5.
x20 = 0
x30 = 0
y0 = [x10, x20, x30]
# measured data
t_measured = np.linspace(0, 9, 10)
x2_measured = np.array([0.000, 0.416, 0.489, 0.595, 0.506, 0.493, 0.458, 0.394, 0.335, 0.309])
plt.figure()
plt.scatter(t_measured, x2_measured, marker='o', color='b', label='measured data', s=75)
# set parameters including bounds; you can also fix parameters (use vary=False)
params = Parameters()
params.add('x10', value=x10, vary=False)
params.add('x20', value=x20, vary=False)
params.add('x30', value=x30, vary=False)
params.add('k0', value=0.2, min=0.0001, max=2.)
params.add('k1', value=0.3, min=0.0001, max=2.)
# fit model
result = minimize(residual, params, args=(t_measured, x2_measured), method='leastsq') # leastsq nelder
# check results of the fit
data_fitted = g(np.linspace(0., 9., 100), y0, result.params)
# plot fitted data
plt.plot(np.linspace(0., 9., 100), data_fitted[:, 1], '-', linewidth=2, color='red', label='fitted data')
plt.legend()
plt.xlim([0, max(t_measured)])
plt.ylim([0, 1.1 * max(data_fitted[:, 1])])
# display fitted statistics
report_fit(result)
plt.show()
Here's the error traceback:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/home/swami/work/scrap/lmfit_example1.ipynb Cell 4 in <cell line: 67>()
64 params.add('k1', value=0.3, min=0.0001, max=2.)
66 # fit model
---> 67 result = minimize(residual, params, args=(t_measured, x2_measured), method='leastsq') # leastsq nelder
68 # check results of the fit
69 data_fitted = g(np.linspace(0., 9., 100), y0, result.params)
File ~/miniconda3/envs/dynamical/lib/python3.10/site-packages/lmfit/minimizer.py:2600, in minimize(fcn, params, method, args, kws, iter_cb, scale_covar, nan_policy, reduce_fcn, calc_covar, max_nfev, **fit_kws)
2460 """Perform the minimization of the objective function.
2461
2462 The minimize function takes an objective function to be minimized,
(...)
2594
2595 """
2596 fitter = Minimizer(fcn, params, fcn_args=args, fcn_kws=kws,
2597 iter_cb=iter_cb, scale_covar=scale_covar,
2598 nan_policy=nan_policy, reduce_fcn=reduce_fcn,
2599 calc_covar=calc_covar, max_nfev=max_nfev, **fit_kws)
-> 2600 return fitter.minimize(method=method)
File ~/miniconda3/envs/dynamical/lib/python3.10/site-packages/lmfit/minimizer.py:2369, in Minimizer.minimize(self, method, params, **kws)
2366 if (key.lower().startswith(user_method) or
2367 val.lower().startswith(user_method)):
2368 kwargs['method'] = val
-> 2369 return function(**kwargs)
File ~/miniconda3/envs/dynamical/lib/python3.10/site-packages/lmfit/minimizer.py:1693, in Minimizer.leastsq(self, params, max_nfev, **kws)
1691 result.call_kws = lskws
1692 try:
-> 1693 lsout = scipy_leastsq(self.__residual, variables, **lskws)
1694 except AbortFitException:
1695 pass
File ~/.local/lib/python3.10/site-packages/scipy/optimize/_minpack_py.py:410, in leastsq(func, x0, args, Dfun, full_output, col_deriv, ftol, xtol, gtol, maxfev, epsfcn, factor, diag)
408 if not isinstance(args, tuple):
409 args = (args,)
--> 410 shape, dtype = _check_func('leastsq', 'func', func, x0, args, n)
411 m = shape[0]
413 if n > m:
File ~/.local/lib/python3.10/site-packages/scipy/optimize/_minpack_py.py:24, in _check_func(checker, argname, thefunc, x0, args, numinputs, output_shape)
22 def _check_func(checker, argname, thefunc, x0, args, numinputs,
23 output_shape=None):
---> 24 res = atleast_1d(thefunc(*((x0[:numinputs],) + args)))
25 if (output_shape is not None) and (shape(res) != output_shape):
26 if (output_shape[0] != 1):
File ~/miniconda3/envs/dynamical/lib/python3.10/site-packages/lmfit/minimizer.py:586, in Minimizer.__residual(self, fvars, apply_bounds_transformation)
583 self.result.success = False
584 raise AbortFitException(f"fit aborted: too many function evaluations {self.max_nfev}")
--> 586 out = self.userfcn(params, *self.userargs, **self.userkws)
588 if callable(self.iter_cb):
589 abort = self.iter_cb(params, self.result.nfev, out,
590 *self.userargs, **self.userkws)
/home/swami/work/scrap/lmfit_example1.ipynb Cell 4 in residual(paras, t, data)
33 """
34 compute the residual between actual data and fitted data
35 """
37 x0 = paras['x10'].value, paras['x20'].value, paras['x30'].value
---> 38 model = g(t, x0, paras)
40 # you only have data for one of your variables
41 x2_model = model[:, 1]
/home/swami/work/scrap/lmfit_example1.ipynb Cell 4 in g(t, x0, paras)
23 def g(t, x0, paras):
24 """
25 Solution to the ODE x'(t) = f(t,x,k) with initial condition x(0) = x0
26 """
---> 27 x = solve_ivp(f, t, x0, args=(paras,))
28 return x
File ~/.local/lib/python3.10/site-packages/scipy/integrate/_ivp/ivp.py:512, in solve_ivp(fun, t_span, y0, method, t_eval, dense_output, events, vectorized, args, **options)
507 if method not in METHODS and not (
508 inspect.isclass(method) and issubclass(method, OdeSolver)):
509 raise ValueError("`method` must be one of {} or OdeSolver class."
510 .format(METHODS))
--> 512 t0, tf = map(float, t_span)
514 if args is not None:
515 # Wrap the user's fun (and jac, if given) in lambdas to hide the
516 # additional parameters. Pass in the original fun as a keyword
517 # argument to keep it in the scope of the lambda.
518 try:
ValueError: too many values to unpack (expected 2)
Any idea what the problem might be?
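Judging from the last frame of the traceback (t0, tf = map(float, t_span)), solve_ivp expects its second argument to be a two-element (start, end) span, while the code passes the whole array of ten measurement times. Below is a minimal sketch of one way g could be adapted. To keep it short it assumes plain k0 and k1 floats instead of an lmfit Parameters object: the span goes in t_span, the sample times go in t_eval, and the .y attribute of the result (shape: states by times) is transposed so that model[:, 1] keeps working as it did with odeint.

```python
import numpy as np
from scipy.integrate import solve_ivp

def f(t, y, k0, k1):
    """Right-hand side of the three-state system (t first, as solve_ivp expects)."""
    x1, x2, x3 = y
    return [-k0 * x1, k0 * x1 - k1 * x2, k1 * x2]

def g(t, x0, k0, k1):
    """Integrate over (t[0], t[-1]) and report the solution at the times in t."""
    sol = solve_ivp(f, (t[0], t[-1]), x0, t_eval=t, args=(k0, k1))
    # sol.y has shape (n_states, n_times); transpose to the odeint layout.
    return sol.y.T

t_measured = np.linspace(0, 9, 10)
model = g(t_measured, [5.0, 0.0, 0.0], 0.2, 0.3)
x2_model = model[:, 1]
```

With lmfit, k0 and k1 would still be read from paras inside residual or f as in the original code; only the call to solve_ivp and the handling of its return value need to change.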

How do I fix 'x and y must be the same size' error on python?

I'm brand new to Python and struggling with this error 'x and y must be the same size'
Here is the code for my scatter plot
def plotNumericalConvergence(paramArr, GrArr, Label):
    plt.figure()
    x = paramArr
    y = GrArr
    plt.scatter(x=x,y=y)
    plt.xlabel(Label)
    plt.ylabel('Gr')
    plt.title('title')
    plt.show()
and here is the code for what its taking in to plot:
def numericalConvergence(Position, Velocity, Charge, Mass, dt, B):
    gyroArr = np.array([])
    gyroArr2 = np.array([])
    gyroArr3 = np.array([])
    dtArr = np.array([])
    fieldArr = np.array([])
    chargeArr = np.array([])
    dtArr = np.append(dtArr, [dt])
    gyroArr = np.append(gyroArr, [6.324555320336759])
    gyroArr2 = np.append(gyroArr, [6.324555320336759])
    gyroArr3 = np.append(gyroArr, [6.324555320336759])
    fieldArr = np.append(fieldArr, [[0,0,1]])
    chargeArr = np.append(chargeArr, Charge)
    # Incrementing timestep
    for i in range (10):
        start = time.time()
        dt = dt + 0.1000
        print('\n'"Timestep", i+1)
        trv= pstep(qom,Position,Velocity,0.0,dt,N_t)
        Gr = MeasuredGr(trv)
        PredGr = GyroRadius(Position, Velocity, Charge, Mass, dt, B)
        gyroArr = np.append(gyroArr, [Gr])
        dtArr = np.append(dtArr, [dt])
        end = time.time()
        print("Predicted gyro radius =", PredGr)
        print("Measured gryo radius =", Gr)
        print("Timestep =", dt)
        print("Magnetic Field =", B)
        print("Charge =", Charge)
        print("nt =",(end - start)/dt) # need to fix this## Predicted gyro radius
    Label = "DT"
    plotNumericalConvergence(dtArr, gyroArr, Label)
    # Incrementing magnetic field
    for i in range (10):
        start = time.time()
        dt=0.001
        B = [float(x) + 1 for x in B] # Increments all numbers in magnetic field array by 1
        print('\n'"Magnetic Field", i+1)
        trv = pstep(qom,Position,Velocity,0.0,dt,N_t)
        Gr = MeasuredGr(trv)
        PredGr = GyroRadius(Position, Velocity, Charge, Mass, dt, B)
        gyroArr2 = np.append(gyroArr2, [Gr])
        fieldArr = np.append(fieldArr, [[B]])
        end = time.time()
        print("Predicted gyro radius =", PredGr)
        print("Measured gryo radius =", Gr)
        print("Timestep =", dt)
        print("Magnetic Field =", B)
        print("Charge =", Charge)
        print("nt =",(end - start)/dt)
    Label = "Magnetic Field"
    plotNumericalConvergence(fieldArr, gyroArr2, Label)
    # Incrementing Charge
    for i in range (10):
        start = time.time()
        B = [0,0,1]
        Charge = Charge + 0.1
        print('\n'"Charge", i+1)
        # add label param for y, new gr array each loop - no 2nd method needed
        trv=pstep(qom,Position,Velocity,0.0,dt,N_t)
        Gr = MeasuredGr(trv)
        PredGr = GyroRadius(Position, Velocity, Charge, Mass, dt, B)
        gyroArr3 = np.append(gyroArr3, [Gr])
        chargeArr = np.append(chargeArr, [Charge])
        print("Predicted gyro radius =", PredGr)
        print("Measured gryo radius =", Gr)
        print("Timestep =", dt)
        print("Magnetic Field =", B)
        print("Charge =", Charge)
        print("nt =",(end - start)/dt)
    Label = "Charge"
    print(gyroArr3)
    print(chargeArr)
    plotNumericalConvergence(chargeArr, gyroArr3, Label)
The plot works for the dt, but not the magnetic field or charge. I've seen stuff on here about reshaping arrays and something along the lines of [:,0] kind of thing but I am really stuck and don't understand Python 100% lol. Thanks!
EDIT - Full traceback:
ValueError Traceback (most recent call last)
Cell In [249], line 25
22 bf=EvalB(ipos)
23 vel = Boris(qom,ivel,ef,bf,-0.5*dt)
---> 25 numericalConvergence(ipos, vel, Charge, Mass, dt, B)
26 #print(gyroArr)
27 #print(dtArr)
28 #plotNumericalConvergence(dtArr, gyroArr)
Cell In [246], line 101, in numericalConvergence(Position, Velocity, Charge, Mass, dt, B)
99 print(gyroArr3)
100 print(chargeArr)
--> 101 plotNumericalConvergence(chargeArr, gyroArr3, Label)
103 return gyroArr, dtArr
Cell In [247], line 8, in plotNumericalConvergence(paramArr, GrArr, Label)
5 x = paramArr
6 y = GrArr
----> 8 plt.scatter(x=x,y=y)
10 plt.xlabel(Label)
11 plt.ylabel('Gr')
File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/matplotlib/pyplot.py:2790, in scatter(x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, edgecolors, plotnonfinite, data, **kwargs)
2785 #_copy_docstring_and_deprecators(Axes.scatter)
2786 def scatter(
2787 x, y, s=None, c=None, marker=None, cmap=None, norm=None,
2788 vmin=None, vmax=None, alpha=None, linewidths=None, *,
2789 edgecolors=None, plotnonfinite=False, data=None, **kwargs):
-> 2790 __ret = gca().scatter(
2791 x, y, s=s, c=c, marker=marker, cmap=cmap, norm=norm,
2792 vmin=vmin, vmax=vmax, alpha=alpha, linewidths=linewidths,
2793 edgecolors=edgecolors, plotnonfinite=plotnonfinite,
2794 **({"data": data} if data is not None else {}), **kwargs)
2795 sci(__ret)
2796 return __ret
File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/matplotlib/__init__.py:1423, in _preprocess_data.<locals>.inner(ax, data, *args, **kwargs)
1420 #functools.wraps(func)
1421 def inner(ax, *args, data=None, **kwargs):
1422 if data is None:
-> 1423 return func(ax, *map(sanitize_sequence, args), **kwargs)
1425 bound = new_sig.bind(ax, *args, **kwargs)
1426 auto_label = (bound.arguments.get(label_namer)
1427 or bound.kwargs.get(label_namer))
File /Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages/matplotlib/axes/_axes.py:4520, in Axes.scatter(self, x, y, s, c, marker, cmap, norm, vmin, vmax, alpha, linewidths, edgecolors, plotnonfinite, **kwargs)
4518 y = np.ma.ravel(y)
4519 if x.size != y.size:
-> 4520 raise ValueError("x and y must be the same size")
4522 if s is None:
4523 s = (20 if mpl.rcParams['_internal.classic_mode'] else
4524 mpl.rcParams['lines.markersize'] ** 2.0)
ValueError: x and y must be the same size

When you generate a scatter plot, both x and y should be 1-D arrays of equal size.
Check the sizes of x and y; they are probably different.
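In the code from the question, the mismatch seems to come from how the arrays are built rather than from the plotting itself. Here is a minimal sketch with placeholder values instead of the simulated radii (the numbers appended in the loop are made up): gyroArr3 starts from gyroArr, which already holds one element, and np.append without an axis flattens the three-component field vector into three separate entries.

```python
import numpy as np

gyro = np.append(np.array([]), [6.324555320336759])   # one element
gyro3 = np.append(gyro, [6.324555320336759])          # starts with two elements
charge = np.append(np.array([]), 1.0)                 # one element
field = np.append(np.array([]), [[0, 0, 1]])          # flattened to three elements

for i in range(10):
    gyro3 = np.append(gyro3, [0.0])                   # placeholder for Gr
    charge = np.append(charge, [1.0 + 0.1 * (i + 1)])
    field = np.append(field, [[1.0, 1.0, 2.0]])       # placeholder for B

print(gyro3.size, charge.size, field.size)
```

Starting gyroArr2 and gyroArr3 from an empty array plus one initial value, and appending a single scalar per step for the field (for instance one component or the magnitude), would be one way to make the sizes match.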

Fitting 2 experimental datasets using scipy - chemical reaction

Long time lurking, first time posting.
I am working with a chemical system that is detected only for a certain period of time, so I will have the reaction and the decay of the signal. The equation is given by:
Derivative(GL, t): (-k*GL) - GL/a,
Derivative(GM, t): (k*GL) - GM/b,
I have managed to fit my data using the symfit package (image below to give an idea of the system); however, since I will need to do Monte Carlo simulation, I need to fit my data using scipy.
[Image: Chemical reaction and fitting using symfit]
I have tried to define the equation in this way:
def f(C, xdata):
    GL = ydataScaled
    GM = ydataScaled2
    dGLdt = -k*GL - GL/a
    dGMdt = k*GL - GM/b
    return [dGLdt, dGMdt]
However, I am not able to fit the data using either optimize.minimize or odeint. What would be the right approach in this case to fit two datasets in y that share some parameters?
Full code:
import nmrglue as ng
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
from scipy import integrate
from scipy.optimize import curve_fit
from scipy.integrate import odeint
from symfit import variables, parameters, Fit, ODEModel, Derivative, D, exp, sin, Model, cos, integrate
# read in the bruker formatted data
dic,data = ng.bruker.read_pdata('/opt/topspin4.1.0/NMR/2021_09_27_Glutamine/90/pdata/1')
#Bruker to NMRPipe data
C = ng.convert.converter()
C.from_bruker(dic, data)
pdic, ppdata = C.to_pipe()
#process the spectrum
ZF_Number = 16384
ppdata = ng.proc_base.di(ppdata) # discard the imaginaries
show = ppdata[2] #show the spectra number X
# determind the ppm scale
udic = ng.bruker.guess_udic(dic, data)
uc = ng.fileiobase.uc_from_udic(udic)
ppm_scale = uc.ppm_scale()
ppms = uc.ppm_scale()
#Plot the spectra
fig1 = plt.figure()
bx = fig1.add_subplot(111)
bx.plot(ppms, show)
plt.xlabel('Chemical Shift (ppm)')
plt.ylabel('Intensity')
First = 0
End = 80
#Integration for every i in the range
Area = []
Area2 = []
Area3 = [] #noise measurement, using the same chemical shift lenght as the product-peak.
#limits = [(176, 180), (180, 183)]
for i in range(First,End):
    Area.append(ng.analysis.integration.integrate(ppdata[i], uc, (177.15, 177.80), unit = "ppm", noise_limits = None, norm_to_range = None, calibrate = 1.0))
NP_Area = np.asarray(Area)
for i in range(First, End):
    Area2.append(ng.analysis.integration.integrate(ppdata[i], uc, (180.80, 181.10), unit = "ppm", noise_limits = None, norm_to_range = None, calibrate = 1.0))
NP_Area2 = np.asarray(Area2)
for i in range(First, End):
    Area3.append(ng.analysis.integration.integrate(ppdata[i], uc, (20.0, 20.3), unit = "ppm", noise_limits = None, norm_to_range = None, calibrate = 1.0))
NP_Area3 = np.asarray(Area3)
#Plot the buildUP
fig2 = plt.figure()
cx = fig2.add_subplot(111)
cx.plot(NP_Area)
cx.plot(NP_Area2)
plt.xlabel('Time (seconds)')
plt.ylabel('Intensity')
#Fitting
d1 = dic['acqus']['D'][1]
xdata = (np.arange(First, End) - First)*d1
ydata = NP_Area[:,0]
ydata2 = NP_Area2[:,0]
ydataScaled = ydata/max(ydata) #normalized to the initial value of the Glu signal to compensate for any variations in the polarization level
ydataScaled2 = ydata2/max(ydata) # same as above
#GL, GM, t = variables('GL, GM, t')
a, b, k = parameters('a, b, k')
# Define the equation considering the enzymatic reaction Gl -> Gm with the HP decay.
def f(C, xdata):
    GL = ydataScaled
    GM = ydataScaled2
    dGLdt = -k*GL - GL/a
    dGMdt = k*GL - GM/b
    return [dGLdt, dGMdt]
C0 = [1, 0]
popt, pcov = sp.optimize.minimize(f, xdata, args = (ydataScaled, ydataScaled2))
And the error:
runfile('/Users/karensantos/Desktop/Codes/Stack_question.py', wdir='/Users/karensantos/Desktop/Codes')
2
(512, 32768)
float64
/opt/anaconda3/lib/python3.8/site-packages/nmrglue/fileio/convert.py:68: UserWarning: Incompatible dtypes, conversion not recommended
warn("Incompatible dtypes, conversion not recommended")
Traceback (most recent call last):
File "/Users/karensantos/Desktop/Codes/Stack_question.py", line 112, in <module>
popt, pcov = sp.optimize.minimize(f, xdata, args = (ydataScaled, ydataScaled2))
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/_minimize.py", line 612, in minimize
return _minimize_bfgs(fun, x0, args, jac, callback, **options)
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/optimize.py", line 1101, in _minimize_bfgs
sf = _prepare_scalar_function(fun, x0, jac, args=args, epsilon=eps,
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/optimize.py", line 261, in _prepare_scalar_function
sf = ScalarFunction(fun, x0, args, grad, hess,
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 76, in __init__
self._update_fun()
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 166, in _update_fun
self._update_fun_impl()
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 73, in update_fun
self.f = fun_wrapped(self.x)
File "/opt/anaconda3/lib/python3.8/site-packages/scipy/optimize/_differentiable_functions.py", line 70, in fun_wrapped
return fun(x, *args)
TypeError: f() takes 2 positional arguments but 3 were given
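The TypeError itself follows from how scipy.optimize.minimize calls the objective: as f(x, *args), where x is the vector of parameters being optimized, and it expects a single scalar back. A hedged sketch of an objective with that shape is below. It fits both curves with shared k, a, b by summing their squared residuals. The closed-form solution in model is my own derivation for this linear system (valid only when k + 1/a differs from 1/b), the initial conditions GL(0) = 1, GM(0) = 0 are taken from C0 in the question, and the data are synthetic rather than the NMR integrals.

```python
import numpy as np

def model(params, t, gl0=1.0, gm0=0.0):
    """Closed-form solution of dGL/dt = -k*GL - GL/a, dGM/dt = k*GL - GM/b."""
    k, a, b = params
    alpha = k + 1.0 / a   # total loss rate of GL
    beta = 1.0 / b        # decay rate of GM
    gl = gl0 * np.exp(-alpha * t)
    # Assumes alpha != beta; the equal-rate case needs a separate formula.
    gm = (gm0 * np.exp(-beta * t)
          + k * gl0 * (np.exp(-alpha * t) - np.exp(-beta * t)) / (beta - alpha))
    return gl, gm

def objective(params, t, gl_data, gm_data):
    """Scalar cost: squared residuals of both datasets, sharing k, a and b."""
    gl, gm = model(params, t)
    return np.sum((gl - gl_data) ** 2) + np.sum((gm - gm_data) ** 2)

# Synthetic stand-in data generated from assumed "true" parameters.
t = np.linspace(0.0, 80.0, 81)
true_params = np.array([0.05, 30.0, 20.0])
gl_data, gm_data = model(true_params, t)

cost_true = objective(true_params, t, gl_data, gm_data)
cost_off = objective(np.array([0.08, 25.0, 15.0]), t, gl_data, gm_data)
print(cost_true, cost_off)
```

A function like this could then be passed as scipy.optimize.minimize(objective, x0, args=(t, gl_data, gm_data)), with x0 an initial guess for (k, a, b). Note that minimize returns a single result object whose .x attribute holds the fitted parameters, so it cannot be unpacked into popt, pcov.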

how can I fix this ''name is not defined''

I am a newbie in Python. I have this code, in which I want to subclass lmfit.models and implement a guess method:
class DecayingSineModel():
    def __init__(self, *args, **kwargs):
        def decaying_sine(self, x, ampl, offset, freq, x0, tau):
            return ampl * np.sin((x - x0)*freq) * np.exp(-x/tau) + offset
        super(DecayingSineModel, self).__init__(decaying_sine, *args, **kwargs)
    def pset(param, value):
        params["%s%s" % (self.prefix, param)].set(value=value)
    def guess(self, data, **kwargs):
        params = self.make_params()
        pset("ampl", np.max(data) - np.min(data))
        pset("offset", np.mean(data))
        pset("freq", 1)
        pset("x0", 0)
        pset("tau", 1)
        return lmfit.models.update_param_vals(params, self.prefix, **kwargs)
sp = DecayingSineModel()
params = sp.guess(y, x=x)
fit = sp.fit(y, params, x=x)
and I am receiving the following error
(the image of the error that I received is at this address)

You almost certainly want your DecayingSineModel to inherit from lmfit.Model. As others have pointed out, there are a number of other problems with your code, including the fact that your pset references self but doesn't have it passed in, and yet you call pset, not self.pset. Your model function decaying_sine should not take self.
A cleaned-up version:
import numpy as np
import lmfit
import matplotlib.pyplot as plt
class DecayingSineModel(lmfit.Model):
    def __init__(self, *args, **kwargs):
        def decaying_sine(x, ampl, offset, freq, x0, tau):
            return ampl * np.sin((x - x0)*freq) * np.exp(-x/tau) + offset
        super(DecayingSineModel, self).__init__(decaying_sine, *args, **kwargs)
    def guess(self, data, x=None, **kwargs):
        ampl = np.max(data) - np.min(data)
        offset = np.mean(data)
        params = self.make_params(ampl=ampl, offset=offset, freq=1, x0=0, tau=1)
        return lmfit.models.update_param_vals(params, self.prefix, **kwargs)
sp = DecayingSineModel()
x = np.linspace(0, 25, 201)
noise = np.random.normal(size=len(x), scale=0.25)
y = 2 + 7*np.sin(1.6*(x-0.2)) * np.exp(-x/18) + noise
params = sp.guess(y, x=x)
result = sp.fit(y, params, x=x)
print(result.fit_report())
plt.plot(x, y, 'bo')
plt.plot(x, result.best_fit, 'r-')
plt.show()
gives a report of:
[[Model]]
Model(decaying_sine)
[[Fit Statistics]]
# function evals = 83
# data points = 201
# variables = 5
chi-square = 39.266
reduced chi-square = 0.200
Akaike info crit = -318.220
Bayesian info crit = -301.703
[[Variables]]
ampl: 6.92483967 +/- 0.123863 (1.79%) (init= 12.59529)
offset: 1.96307863 +/- 0.031684 (1.61%) (init= 2.139916)
freq: 1.60060819 +/- 0.001775 (0.11%) (init= 1)
x0: 0.19650313 +/- 0.010267 (5.23%) (init= 0)
tau: 18.3528781 +/- 0.614576 (3.35%) (init= 1)
[[Correlations]] (unreported correlations are < 0.100)
C(ampl, tau) = -0.781
C(freq, x0) = 0.750

IndexError returned on curve_fit: error on function call?

I am trying to use curve_fit given this function
def F(xy,*p):
    x,y = xy
    c = np.array(p).ravel()
    n = (len(c)-1)/4
    omega = pi/180.0
    z = c[0]
    for t in range(n):
        z += c[4*t+1] * (cos((t+1)*omega*x))
        z += c[4*t+2] * (cos((t+1)*omega*y))
        z += c[4*t+3] * (sin((t+1)*omega*x))
        z += c[4*t+4] * (sin((t+1)*omega*y))
    return z
def G(xy,*p):
    x,y = xy
    c = np.array(p).ravel()
    ngm = (len(c))/7
    z = 0
    for t in range(ngm):
        a = c[7*t]
        cx = c[7*t+1]
        mx = c[7*t+2]
        sx = c[7*t+3]
        cy = c[7*t+4]
        my = c[7*t+5]
        sy = c[7*t+6]
        z += a * np.exp(-((cx*(x-mx)**2)/(2*(sx**2)))-((cy*(y-my)**2)/(2*(sy**2))))
    return z
def FG(xy,*p):
    x,y = xy
    c = np.array(p).ravel()
    nf = int(c[0])
    ng = int(c[1])
    print nf,ng
    pf = [c[i] for i in range(2,4*nf+3)]
    pg = [c[i] for i in range(4*nf+3,4*nf+7*ng+3)]
    z1 = F(xy,pf)
    z2 = G(xy,pg)
    return z1+z2
pfit,cov = opt.curve_fit(FG,xy,z,p,bounds=bounds)
I am sure that the shapes of both p and bounds are appropriate. I tried printing nf and ng, and they are printed correctly until, after some number of iterations (around the 20th function call, not the same in every run), the values change significantly.
After the 20th (or later) call, it returns the following error:
File "/Users/pensieve/calcs/3D_AA/0_codes/fitpkgs.py", line 144, in FGfit
pfit,cov = opt.curve_fit(FG,xy,z,p,bounds=bounds)
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/minpack.py", line 683, in curve_fit
**kwargs)
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/_lsq/least_squares.py", line 878, in least_squares
tr_options.copy(), verbose)
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/_lsq/trf.py", line 128, in trf
loss_function, tr_solver, tr_options, verbose)
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/_lsq/trf.py", line 341, in trf_bounds
f_new = fun(x_new)
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/_lsq/least_squares.py", line 764, in fun_wrapped
return np.atleast_1d(fun(x, *args, **kwargs))
File "/Library/Python/2.7/site-packages/scipy-0.18.1-py2.7-macosx-10.10-intel.egg/scipy/optimize/minpack.py", line 455, in func_wrapped
return func(xdata, *params) - ydata
File "/Users/pensieve/calcs/3D_AA/0_codes/fitfunctions.py", line 65, in FG
pgm = [c[i] for i in range(4*nf+3,4*nf+7*ng+3)]
IndexError: index out of bounds
For reference, I use scipy 0.18.1.
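A likely cause, judging from the code rather than from a run: nf and ng are read from c[0] and c[1], which are part of the parameter vector that curve_fit varies, so int(c[0]) can change between calls and the index ranges then run past the end of c. One possible workaround is to keep the term counts out of the fitted parameters by fixing them in a closure. The sketch below is written for Python 3 (hence the // integer division), and make_FG is a hypothetical helper name, not something from scipy.

```python
import numpy as np

def F(xy, *p):
    """Fourier-like series: p = (c0, then 4 coefficients per harmonic)."""
    x, y = xy
    c = np.asarray(p, dtype=float).ravel()
    n = (len(c) - 1) // 4
    omega = np.pi / 180.0
    z = c[0]
    for t in range(n):
        z = z + c[4*t + 1] * np.cos((t + 1) * omega * x)
        z = z + c[4*t + 2] * np.cos((t + 1) * omega * y)
        z = z + c[4*t + 3] * np.sin((t + 1) * omega * x)
        z = z + c[4*t + 4] * np.sin((t + 1) * omega * y)
    return z

def G(xy, *p):
    """Sum of 2-D Gaussian-like terms: 7 coefficients per term."""
    x, y = xy
    c = np.asarray(p, dtype=float).ravel()
    z = 0.0
    for t in range(len(c) // 7):
        a, cx, mx, sx, cy, my, sy = c[7*t:7*t + 7]
        z = z + a * np.exp(-(cx * (x - mx)**2) / (2 * sx**2)
                           - (cy * (y - my)**2) / (2 * sy**2))
    return z

def make_FG(nf, ng):
    """Return a model with nf and ng fixed, so only coefficients are fitted."""
    n_f = 4 * nf + 1
    def FG(xy, *p):
        c = np.asarray(p, dtype=float).ravel()
        return F(xy, *c[:n_f]) + G(xy, *c[n_f:n_f + 7 * ng])
    return FG

# Quick check with one harmonic and one Gaussian term at x = y = 0.
FG = make_FG(nf=1, ng=1)
xy = (np.zeros(3), np.zeros(3))
p = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0]
z = FG(xy, *p)
```

The fit would then be called as opt.curve_fit(make_FG(nf, ng), xy, z, p0, bounds=bounds), where p0 and bounds contain only the 4*nf + 1 + 7*ng coefficients and no longer the two counts.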
