Python - fitting data with exponential function

I am aware that there are a few questions on a similar subject, but I couldn't find a proper answer.
I would like to fit some data with a function (called Bastenaire) and get the parameter values. Here is the code:
import numpy as np
from matplotlib import pyplot as plt
from scipy import optimize
def bastenaire(s, A, B, C, sd):
    logNB = np.log(A) - C*(s - sd) - np.log(s - sd)
    return np.exp(logNB) - B
S=np.array([659,646,634,623,613,595,580,565,551,535,515,493,473,452,432,413,394,374,355,345])
N=np.array([46963,52934,59975,65522,74241,87237,101977,116751,133665,157067,189426,233260,281321,355558,428815,522582,630257,768067,902506,1017280])
fitmb,fitmob=optimize.curve_fit(bastenaire,S,N,p0=(30000,2000000000,0.2,250))
plt.scatter(N,S)
plt.plot(bastenaire(S,*fitmb),S,label='bastenaire')
plt.legend()
plt.show()
However, curve_fit cannot identify the correct parameters and I get: OptimizeWarning: Covariance of the parameters could not be estimated.
I get the same result when I give no initial parameter values.
Figure
Is there any way to tweak something and get results? Should my dataset cover a wider range of values?
Thank you!
Broc

Fitting is tough: you need to restrict the parameter space using bounds and (often) take some care with your initial values.
To make it work, I searched for initial values where the function had the right shape, then estimated some constraints:
bounds = np.array([(1e4, 1e12), (-np.inf, np.inf), (1e-20, 1e-2), (-2000., 20000)]).T
fitmb, fitmob = optimize.curve_fit(bastenaire,S, N,p0=(1e7,-100.,1e-5,250.), bounds=bounds)
returns
(array([ 1.00000000e+10,  1.03174824e+04,  7.53169772e-03, -7.32901325e+01]),
 array([[ 2.24128391e-06,  6.17858390e+00, -1.44693602e-07, -5.72040842e-03],
        [ 6.17858390e+00,  1.70326029e+07, -3.98881486e-01, -1.57696515e+04],
        [-1.44693602e-07, -3.98881486e-01,  1.14650323e-08,  4.68707940e-04],
        [-5.72040842e-03, -1.57696515e+04,  4.68707940e-04,  1.93358414e+01]]))
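As a minimal sketch of the same idea on a simpler problem: the block below fits a plain exponential decay with `curve_fit`, passing `bounds` as a `(lower, upper)` pair with one entry per parameter. The model `decay`, the synthetic data and the bound values are assumptions made up for illustration; they are not the Bastenaire data from the question.

```python
import numpy as np
from scipy.optimize import curve_fit

def decay(x, a, b, c):
    # Hypothetical model for illustration: a * exp(-b * x) + c
    return a * np.exp(-b * x) + c

# Synthetic, noise-free data generated from known parameters (assumed values)
true_params = (2.5, 1.3, 0.5)
x = np.linspace(0.0, 4.0, 50)
y = decay(x, *true_params)

# Bounds are given as (lower_bounds, upper_bounds), one entry per parameter
lower = [0.0, 0.0, -1.0]
upper = [10.0, 5.0, 2.0]

# The initial guess must lie inside the bounds
popt, pcov = curve_fit(decay, x, y, p0=(1.0, 1.0, 0.0), bounds=(lower, upper))
print(popt)
```

When bounds are given, `curve_fit` switches from the default Levenberg-Marquardt method to a bounded least-squares solver, so it is worth checking afterwards whether any fitted value sits exactly on a bound; that usually means the bound, not the data, decided the result.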

Related

Scipy curve fit doesn't perform a fit and raises "Covariance of the parameters could not be estimated" error

I am trying to do a simple linear curve fit with scipy. Normally this method works fine for me, but this time, for a reason unknown to me, it doesn't.
(I suspect that maybe the numbers are so big that it reaches the limit of what can be stored under a given data type.)
Regardless of the reason, the idea is to make a plot that looks like this:
As you can see on the axes there, the numbers are of an ordinary order of magnitude. This time, however, I tried to fit much bigger data points, on the order of 1E10, using the following code (I show only the code for making a scatter plot and fitting a single data set).
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
ucrt_T = 2/np.sqrt(3)
ucrt_U = 0.1/np.sqrt(3)
T = [314.1, 325.1, 335.1, 345.1, 355.1, 365.1, 374.1, 384.1, 393.1]
T_to_4th = [9733560790.61, 11170378213.80, 12609495509.84, 14183383217.88, 15900203737.92, 17768359469.96, 19586229219.65, 21765930026.49, 23878782252.31]
ucrt_T_lst = [143130823.11, 158701221.00, 173801148.95, 189829733.26, 206814686.75, 224783722.22, 241820148.88, 261735288.93, 280568229.17]
UBlack = [1.9,3.1, 4.4, 5.6, 7.0, 8.7, 10.2, 11.8, 13.4]
def lin_function(x,a,b):
    return a*x + b
def line_fit_2():
    # Add the remaining points to the plot
    plt.scatter(UBlack, T_to_4th, color='blue')
    plt.errorbar(UBlack, T_to_4th, yerr=ucrt_T, fmt='o')
    # BLACK series
    VltBlack = np.array(UBlack)
    Tt4 = np.array(T_to_4th)
    popt, pcov = curve_fit(lin_function, VltBlack, Tt4, absolute_sigma=False)
    perr = np.sqrt(np.diag(pcov))
    y = lin_function(VltBlack, *popt)
    # Plot styling and appearance
    #plt.plot(Pressure1, y, '--', color = 'g', label="fit with: $a={:.3f}\pm{:.3f}$, $b={:.3f}\pm{:.3f}$" .format(popt[0], perr[0], popt[1], perr[1]))
    plt.plot(VltBlack, y, '--', color='green')
    plt.ylabel(r'$T^4$ in $[K^4]$')
    plt.xlabel(r'Thermometer voltage U in [mV]')
    plt.legend(['Fit', 'Data points'])
    plt.grid()
    plt.show()
line_fit_2()
If you run it, you will find that the scatter plot is created but the fit isn't executed properly: only a horizontal line is added. Additionally, the warning OptimizeWarning: Covariance of the parameters could not be estimated is raised.
I would be very happy to know what I am doing wrong or how to resolve this problem. All help is appreciated!

You've pretty much already answered your question, so I'll just confirm your suspicion: the OptimizeWarning is raised because the underlying optimization algorithm doesn't work properly (or diverges) when the values involved are this large.
The solution is very simple: scale your input data before using the fitting tool. Just keep the scaling in mind when you add labels to your x/y axes:
T_to_4th = np.array([9733560790.61, 11170378213.80, 12609495509.84, 14183383217.88, 15900203737.92, 17768359469.96, 19586229219.65, 21765930026.49, 23878782252.31])/1e6
ucrt_T_lst = np.array([143130823.11, 158701221.00, 173801148.95, 189829733.26, 206814686.75, 224783722.22, 241820148.88, 261735288.93, 280568229.17])/1e6
What I did is just divide the lists with big numbers by 1e6. If the values were in kPa, for example, they would now be in mega-kPa (that is, GPa).
To divide an entire list by the same value, first convert it to a numpy array.
Hope this helps :)
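Here is a minimal sketch of the scaling approach applied to the data from the question, including the step of converting the fitted parameters back to the original units. The scale factor `1e9` is an assumption (any factor that brings the y values to an ordinary magnitude should do), and `np.polyfit` is used only as an independent cross-check of the result.

```python
import numpy as np
from scipy.optimize import curve_fit

def lin_function(x, a, b):
    return a * x + b

U = np.array([1.9, 3.1, 4.4, 5.6, 7.0, 8.7, 10.2, 11.8, 13.4])
T4 = np.array([9733560790.61, 11170378213.80, 12609495509.84,
               14183383217.88, 15900203737.92, 17768359469.96,
               19586229219.65, 21765930026.49, 23878782252.31])

scale = 1e9  # assumed scale factor, chosen so that T4 / scale is of order 10

# Fit the scaled data; the default initial guess (all ones) is now reasonable
popt_scaled, pcov_scaled = curve_fit(lin_function, U, T4 / scale)

# For a linear model both parameters scale with y, so undo the scaling
a, b = popt_scaled * scale
a_err, b_err = np.sqrt(np.diag(pcov_scaled)) * scale

# Independent cross-check: closed-form linear least squares on the raw data
a_ref, b_ref = np.polyfit(U, T4, 1)
print(a, b, a_ref, b_ref)
```

An alternative that leaves the data untouched is to pass a sensible `p0` to `curve_fit` (for example a slope and intercept of order 1e9), since the default guess of 1 for every parameter is very far from the solution here.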

Fitting two voigt curves, one after the other using lmfit

I have the following emission spectra of Neon collected on a Raman (background subtracted data):
x=np.array([[1114.120887, 1114.682293, 1115.243641, 1115.80493 , 1116.366161, 1116.927334, 1117.488449, 1118.049505, 1118.610503, 1119.171443, 1119.732324, 1120.293147, 1120.853912, 1121.414619, 1121.975267, 1122.535857, 1123.096389, 1123.656863, 1124.217278, 1124.777635, 1125.337934, 1125.898175, 1126.458357, 1127.018482, 1127.578548, 1128.138556, 1128.698505, 1129.258397, 1129.81823 , 1130.378005, 1130.937722, 1131.497381, 1132.056981]])
y=np.array([[-4.89046878e+00, -4.90985832e+00, -5.92924587e+00, -3.28194437e+00, -1.96801488e+00, -3.32070938e+00, -5.34008887e+00, -3.59466330e-01, -2.04552879e+00, -1.06490224e+00, 8.24910035e+00, 5.32297309e+01, 1.11543677e+02, 8.98576241e+01, 2.18504948e+02, 7.15152212e+02, 7.62799601e+02, 2.89446870e+02, 7.24275144e+01, 1.94081610e+01, 1.72212272e+00, 7.02773412e-01, -3.16573861e-01, 4.99745483e+00, 7.97811157e+00, 6.25396305e-01, 6.27274408e+00, -4.41328018e+00, -7.76592840e+00, 3.88142539e+00, 6.52872017e+00, 1.50939096e+00, -8.43249208e-01]])
I have fitted a single Voigt function using lmfit, specifically:
model = VoigtModel()+ ConstantModel()
params=model.make_params(center=1123.096389, amplitude=1000, sigma=0.27)
result = model.fit(y.flatten(), params, x=x.flatten())
There is a second peak on the left-hand shoulder (sorry, I can't post an image). People using commercial peak-fitting software fit the first Voigt, then add the second, and the software then adjusts the fits of both. How would I do this in Python?
A related question: is there a way to optimize how many points to include in the peak fit? Right now, I am only feeding in x and y data covering a set spectral range. But commercial software optimizes how much range to include in a given peak fit (I presume using residuals). How would I recreate this?
Thanks!

You can do it manually like so:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import VoigtModel, ConstantModel
x=np.array([1114.120887, 1114.682293, 1115.243641, 1115.80493 , 1116.366161, 1116.927334, 1117.488449, 1118.049505, 1118.610503, 1119.171443, 1119.732324, 1120.293147, 1120.853912, 1121.414619, 1121.975267, 1122.535857, 1123.096389, 1123.656863, 1124.217278, 1124.777635, 1125.337934, 1125.898175, 1126.458357, 1127.018482, 1127.578548, 1128.138556, 1128.698505, 1129.258397, 1129.81823 , 1130.378005, 1130.937722, 1131.497381, 1132.056981])
y=np.array([-4.89046878e+00, -4.90985832e+00, -5.92924587e+00, -3.28194437e+00, -1.96801488e+00, -3.32070938e+00, -5.34008887e+00, -3.59466330e-01, -2.04552879e+00, -1.06490224e+00, 8.24910035e+00, 5.32297309e+01, 1.11543677e+02, 8.98576241e+01, 2.18504948e+02, 7.15152212e+02, 7.62799601e+02, 2.89446870e+02, 7.24275144e+01, 1.94081610e+01, 1.72212272e+00, 7.02773412e-01, -3.16573861e-01, 4.99745483e+00, 7.97811157e+00, 6.25396305e-01, 6.27274408e+00, -4.41328018e+00, -7.76592840e+00, 3.88142539e+00, 6.52872017e+00, 1.50939096e+00, -8.43249208e-01])
model = VoigtModel() + ConstantModel()
params=model.make_params(center=1123.0, amplitude=1000, sigma=0.27)
result1 = model.fit(y.flatten(), params, x=x.flatten())
rest = y-result1.best_fit
model = VoigtModel() + ConstantModel()
params=model.make_params(center=1120.5, amplitude=200, sigma=0.27)
result2 = model.fit(rest, params, x=x.flatten())
rest -= result2.best_fit
plt.plot(x, y, label='Original')
plt.plot(x, result1.best_fit, label='1123.0')
plt.plot(x, result2.best_fit, label='1120.5')
plt.plot(x, rest, label='residual')
plt.legend()
plt.show()
You have to make sure that the residual makes sense. In this case it is quite close to 0, so I'd argue that it is fine.
lmfit does optimize the fit, so it is not necessary to pinpoint the exact value of the peak position. It is also worth pointing out that, because of the resolution of this data (and of spectroscopy in general), the highest points are not necessarily the centre of the peak. For the same reason, some apparent shoulders might not be real shoulders, though in this case it looks like it is one.
For your related question: judging by the lmfit documentation, it uses the whole range you pass in. Residuals don't seem to be a solution, since you run into the same problem (what range to consider). I believe that the commercial software you mention uses Multivariate Curve Resolution (MCR). These deconvolution problems have been a hot topic for decades; if you are interested in this kind of solution, I suggest reading about MCR.
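The sequential approach above fits the second peak to the residual of the first. To let both peaks adjust together, as the question describes, one option is to fit the sum of two Voigt profiles in a single call. The sketch below does this with `scipy.optimize.curve_fit` and `scipy.special.voigt_profile` rather than lmfit, purely to keep the example dependency-light; in lmfit the analogous step would be to add two `VoigtModel` instances with distinct `prefix` values. The data are synthetic stand-ins, and the peak parameters, initial guesses and bounds are all assumed values.

```python
import numpy as np
from scipy.optimize import curve_fit
from scipy.special import voigt_profile

def voigt(x, amp, cen, sigma):
    # Area-normalised Voigt profile scaled by amp, with the Lorentzian
    # width gamma tied to sigma (lmfit's VoigtModel does the same by default)
    return amp * voigt_profile(x - cen, sigma, sigma)

def two_voigts(x, a1, c1, s1, a2, c2, s2, const):
    # Main peak + shoulder peak + constant background
    return voigt(x, a1, c1, s1) + voigt(x, a2, c2, s2) + const

# Synthetic stand-in for the spectrum (assumed parameter values)
true_params = (1500.0, 1123.0, 0.4, 250.0, 1120.8, 0.4, 1.0)
x = np.linspace(1114.0, 1132.0, 200)
y = two_voigts(x, *true_params)

# Rough initial guesses; the centre bounds stop the two peaks swapping roles
p0 = (1000.0, 1123.2, 0.5, 200.0, 1120.5, 0.5, 0.0)
lower = [0.0, 1121.5, 1e-3, 0.0, 1119.0, 1e-3, -50.0]
upper = [1e5, 1125.0, 5.0, 1e5, 1121.5, 5.0, 50.0]

popt, pcov = curve_fit(two_voigts, x, y, p0=p0, bounds=(lower, upper))
print("main centre:", popt[1], "shoulder centre:", popt[4])
```

A reasonable workflow is to use the sequential fit to obtain starting values and then pass them as `p0` to a joint fit like this one.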

Improper input: N=3 must not exceed M=1 (error trying to fit a gaussian function)

I'm pretty new to Python and curve fitting, and currently I'm trying to fit the graph below with a Gaussian.
I'm following this tutorial, and my code looks like this:
import numpy as np
import matplotlib.pyplot as plt
from pylab import genfromtxt
from matplotlib import pyplot
from numpy import sqrt, pi, exp, linspace,loadtxt
from lmfit import Model
def gaussian(x,amp,cen,wid):
    "1-d gaussian: gaussian(x,amp,cen,wid)"
    return (amp/(sqrt(2*pi)*wid))*exp(-(x-cen)**2/(2*wid**2))
filelist=[]
time=[0.00,-1.33,-2.67,-4.00,-5.33,-6.67,1.13,2.67,4.00,5.33,6.67]
index=0
offset=0
filelist.append('0.asc')
for i in range(1,6):
    filelist.append("-%s00.asc" %(i))
for i in range(1,6):
    filelist.append("+%s00.asc" %(i))
sfgpeaks=[]
for fname in filelist:
    data=np.genfromtxt(fname,delimiter=',',unpack=True,skip_footer=20)
    SFGX=data[0,500:530]
    SFGY=data[1,500:530]
    SFGpeakY=np.max(SFGY)
    sfgpeaks.append(SFGpeakY)
    gmodel = Model(gaussian)
    result = gmodel.fit(SFGpeakY, x=time[index], amp=5,cen=5,wid=3)
    plt.plot(time[index],sfgpeaks[index],'ro')
    plt.plot(time[index],result.init_fit, 'k--',label="Gaussian Fit")
    plt.xticks(time)
    index=index+1
print(pump2SHGX)
pyplot.title("Time Delay-SFG peak")
plt.xlabel("Timedelay[ps]")
plt.ylabel("Counts[arb.unit]")
plt.savefig("796and804nmtimesfg")
plt.legend(bbox_to_anchor=(1.0,0.5))
plt.show()
However, I'm getting an error when I try to pass my data (the time delay and the Y value of the graph above) to the Gaussian fit.
The error I'm getting is this:
TypeError: Improper input: N=3 must not exceed M=1
Is this error raised because I'm trying to pass a single value from an array as the parameter?
Any help is much appreciated.

You have
result = gmodel.fit(SFGpeakY, x=time[index], amp=5,cen=5,wid=3)
which passes 1 value as x and 1 value as the data. The model is then evaluated at that 1 point. The error message is the fit complaining that you have 3 variables (N=3) but only 1 data value (M=1).
You probably want to fit the data array SFGY with x set to SFGX,
result = gmodel.fit(SFGY, x=SFGX, amp=5,cen=5,wid=3)
though it wasn't clear to me what data is used in the plot you attached.
Also: you probably want to give initial values for amp, cen, and wid based on the data. Your SFGpeakY is probably a decent guess for amp, and SFGX.mean() and SFGX.std() are probably decent guesses for cen and wid.
Also: you plot result.init_fit labeled as "Gaussian Fit". result.init_fit is the model evaluated with the initial values of the parameters. The best fit with the refined parameters is in result.best_fit.
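To illustrate both points (too few data values, and initial guesses taken from the data), here is a rough sketch using `scipy.optimize.curve_fit` instead of lmfit; the underlying least-squares check on the number of points is the same kind of check that produces the error above. The arrays are synthetic stand-ins for one `SFGX`/`SFGY` slice and the true parameter values are assumed. Note that with this normalised Gaussian the peak height is `amp/(sqrt(2*pi)*wid)`, so a height-based guess for `amp` has to be multiplied by `sqrt(2*pi)*wid`.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, amp, cen, wid):
    """1-d gaussian: gaussian(x, amp, cen, wid)"""
    return (amp / (np.sqrt(2 * np.pi) * wid)) * np.exp(-(x - cen)**2 / (2 * wid**2))

# Synthetic stand-in for one (SFGX, SFGY) slice; parameter values are assumed
x = np.linspace(-5.0, 5.0, 30)
true_params = (10.0, 0.5, 1.2)
y = gaussian(x, *true_params)

# Fitting 3 parameters to fewer than 3 data points is rejected with a TypeError
try:
    curve_fit(gaussian, x[:2], y[:2], p0=(5, 5, 3))
    too_few_points_error = None
except TypeError as err:
    too_few_points_error = err
print("too few points:", too_few_points_error)

# Initial guesses derived from the data itself
cen0 = x[np.argmax(y)]                      # position of the highest sample
wid0 = x.std()                              # crude width estimate
amp0 = y.max() * np.sqrt(2 * np.pi) * wid0  # converts peak height to amp

popt, pcov = curve_fit(gaussian, x, y, p0=(amp0, cen0, wid0))
print("best-fit amp, cen, wid:", popt)
```

The curve to plot as the "Gaussian Fit" is then `gaussian(x, *popt)`, which plays the role of `result.best_fit` in lmfit.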

Can I use scipy.curve fit in python when one of the fitted parameters changes the xdata input array values?

This is my first time posting a question and I'm going to try to make it as clear as I can but feel free to ask questions.
I'm trying to fit a model to a curve using scipy.optimize.curve_fit, as below:
import numpy as np
import matplotlib.pyplot as pyplot
import scipy
from scipy.optimize import curve_fit
def func2(x,EM):
    return (((4.0*EM*(np.sqrt(8*10**-9)))/(3.0*(1.0-(0.5**2))*8*10**-9))*(((((x))*1*10**-9)**((3.0/2.0)))))
ydata=[-0.003428768, -0.009050058, -0.0037997673999999996, -0.0003833233, -0.007557649, -0.0034860994, -0.0009856887, -0.0017508664, -0.00036931394999999996,
-0.0040713947, -0.005737315000000001, 0.0005120568, -0.007336486, -0.00719302, -0.0039941817, -0.0029785274, -0.0013044578, -0.008190335, -0.00833507,
-0.0074282060000000006, -0.009629990000000001, -0.009425125, -0.008662485999999999, -0.0019445216, -0.008331748, -0.009513038, -0.0047609017, -0.004364422,
-0.010325097, -0.0036570733, -0.0060091914, -0.005655772, -0.0045517069999999995, -0.00066998035, 0.006374902, 0.006445733, 0.0019101816,
0.010262737999999999, 0.011139007, 0.018161469, 0.016963122, 0.022915895, 0.027177791, 0.028707139, 0.040105638, 0.044088004, 0.041657403,
0.052325636999999994, 0.062399405, 0.07020844, 0.076979915, 0.08888523, 0.099634745, 0.10961602, 0.12188646, 0.13677225, 0.15639512, 0.16833586,
0.18849944000000002, 0.21515548, 0.23989769000000002, 0.26319308, 0.29388397, 0.321042, 0.35637776, 0.38564656999999997, 0.4185209, 0.44986692,
0.48931552999999994, 0.52583893, 0.5626885, 0.6051665, 0.6461075, 0.69644346, 0.7447817, 0.7931281, 0.8381386000000001, 0.8883482, 0.9395609999999999,
0.9853629, 1.0377034, 1.0889026, 1.1334094]
xdata=[34.51388, 33.963736999999995,
33.510695, 33.04127, 32.477253, 32.013624, 31.536019999999997, 31.02925, 30.541649999999997,
30.008646, 29.493828, 29.049707, 28.479668, 27.980956, 27.509590000000003, 27.018721, 26.533737, 25.972296,
25.471065, 24.979228000000003, 24.459624, 23.961517, 23.46839, 23.028454, 22.471411, 21.960924, 21.503428000000003,
21.007033, 20.453855, 20.013475, 19.492528, 18.995746999999998, 18.505670000000002, 18.040403, 17.603387, 17.104082,
16.563634, 16.138298000000002, 15.646187, 15.20897, 14.69833, 14.25156, 13.789688, 13.303409, 12.905278, 12.440909, 11.919262,
11.514609, 11.104646, 10.674512, 10.235055, 9.84145, 9.437523, 9.026733, 8.63639, 8.2694065, 7.944733, 7.551445, 7.231599999999999,
6.9697434, 6.690793299999999, 6.3989780000000005, 6.173159, 5.9157856, 5.731453, 5.4929328, 5.2866156, 5.066648000000001, 4.9190496,
4.745381399999999, 4.574569599999999, 4.4540283, 4.3197597000000005, 4.2694026, 4.2012034, 4.133134, 4.035212, 3.9837262, 3.9412007, 3.8503475999999996,
3.8178950000000005, 3.7753053999999997, 3.6728842]
dstart=20.0
xdata=np.array(xdata[::-1])
xdata=xdata-dstart
xdata=list(xdata)
xdata1=[]
ydata1=[]
for i in range(len(xdata)):
    if xdata[i]>0:
        xdata1.append(xdata[i])
        ydata1.append(ydata[i])
xdata=np.array(xdata1)
ydata=np.array(ydata1)
popt, pcov = curve_fit(func2, xdata, ydata)
a=popt[0]
print("E=", popt[0]/10**6)
t=func2(xdata,a)
ax=pyplot.figure().add_subplot(1,1,1)
ax.plot(xdata,t, color="blue",mew=2.0,label="Hertz Fit")
ax.plot(xdata,ydata,ls="",marker="x",color="red",mew=2.0,label="Data")
ax.legend(loc=2)
pyplot.show()
The "dstart" value basically cuts off the lower portion of the data, which I don't want to fit because it is negative and the model doesn't like negative numbers. Currently I have to set "dstart" manually before running the code, and only then do I see the final result.
I started by doing this fit in Excel with Solver, varying both the "EM" variable and the "dstart" variable simultaneously. There, the step that shifts the xdata by "dstart" and cuts off the negative values is nested inside the function being fit.
Essentially, what I want is:
import numpy as np
import matplotlib.pyplot as pyplot
import scipy
from scipy.optimize import curve_fit
def func2(x,EM,dstart):
    xdata=np.array(x[::-1])
    xdata=dstart-xdata
    xdata=list(xdata)
    xdata1=[]
    for i in range(len(xdata)):
        if xdata[i]>0:
            xdata1.append(xdata[i])
    global xdata2
    xdata2=np.array(xdata1)
    return (((4.0*EM*(np.sqrt(8*10**-9)))/(3.0*(1.0-(0.5**2))*8*10**-9))*(((((xdata2))*1*10**-9)**((3.0/2.0)))))
ydata=[-0.003428768, -0.009050058, -0.0037997673999999996, -0.0003833233, -0.007557649, -0.0034860994, -0.0009856887, -0.0017508664, -0.00036931394999999996,
-0.0040713947, -0.005737315000000001, 0.0005120568, -0.007336486, -0.00719302, -0.0039941817, -0.0029785274, -0.0013044578, -0.008190335, -0.00833507,
-0.0074282060000000006, -0.009629990000000001, -0.009425125, -0.008662485999999999, -0.0019445216, -0.008331748, -0.009513038, -0.0047609017, -0.004364422,
-0.010325097, -0.0036570733, -0.0060091914, -0.005655772, -0.0045517069999999995, -0.00066998035, 0.006374902, 0.006445733, 0.0019101816,
0.010262737999999999, 0.011139007, 0.018161469, 0.016963122, 0.022915895, 0.027177791, 0.028707139, 0.040105638, 0.044088004, 0.041657403,
0.052325636999999994, 0.062399405, 0.07020844, 0.076979915, 0.08888523, 0.099634745, 0.10961602, 0.12188646, 0.13677225, 0.15639512, 0.16833586,
0.18849944000000002, 0.21515548, 0.23989769000000002, 0.26319308, 0.29388397, 0.321042, 0.35637776, 0.38564656999999997, 0.4185209, 0.44986692,
0.48931552999999994, 0.52583893, 0.5626885, 0.6051665, 0.6461075, 0.69644346, 0.7447817, 0.7931281, 0.8381386000000001, 0.8883482, 0.9395609999999999,
0.9853629, 1.0377034, 1.0889026, 1.1334094]
xdata=[34.51388, 33.963736999999995,
33.510695, 33.04127, 32.477253, 32.013624, 31.536019999999997, 31.02925, 30.541649999999997,
30.008646, 29.493828, 29.049707, 28.479668, 27.980956, 27.509590000000003, 27.018721, 26.533737, 25.972296,
25.471065, 24.979228000000003, 24.459624, 23.961517, 23.46839, 23.028454, 22.471411, 21.960924, 21.503428000000003,
21.007033, 20.453855, 20.013475, 19.492528, 18.995746999999998, 18.505670000000002, 18.040403, 17.603387, 17.104082,
16.563634, 16.138298000000002, 15.646187, 15.20897, 14.69833, 14.25156, 13.789688, 13.303409, 12.905278, 12.440909, 11.919262,
11.514609, 11.104646, 10.674512, 10.235055, 9.84145, 9.437523, 9.026733, 8.63639, 8.2694065, 7.944733, 7.551445, 7.231599999999999,
6.9697434, 6.690793299999999, 6.3989780000000005, 6.173159, 5.9157856, 5.731453, 5.4929328, 5.2866156, 5.066648000000001, 4.9190496,
4.745381399999999, 4.574569599999999, 4.4540283, 4.3197597000000005, 4.2694026, 4.2012034, 4.133134, 4.035212, 3.9837262, 3.9412007, 3.8503475999999996,
3.8178950000000005, 3.7753053999999997, 3.6728842]
xdata2=list(xdata2)
ydata1=[]
for i in range(len(xdata2)):
    if xdata2[i]>0:
        ydata1.append(ydata[i])
popt, pcov = curve_fit(func2, xdata, ydata)
But this doesn't work: I get the error "ValueError: operands could not be broadcast together with shapes (28,) (30,)". I think what I need is for curve_fit to take the xdata, shift it by the first guess for "dstart", guess EM and check the fit error, then try a new "dstart" to shift the xdata, guess EM again and check the fit error, and so on. As I'm still fairly new to Python, I'm definitely out of my element with curve_fit, and I would just use Excel if I didn't have potentially thousands of curves to run.
Any help would be appreciated!

I'll split this in two: conceptual and coding-related.
Conceptual:
Let's start by rephrasing your question. As it stands the answer is: Yes, obviously. Simply absorb the parameter-dependent change of x in the target function. But that won't solve your problem. What you really seem to be interested in is what to do with parameters for which some of the x cannot be processed by your function. There is no one-size-fits-all for that.
You could choose to deem such parameters as unacceptable in which case you'd have to resort to constrained optimisation. There are a few solvers in scipy that can do that.
You could choose to remove the difficult points from the data set before fitting.
You could introduce soft constraints and penalise bad values instead of ruling them out completely.
Programming style:
Avoid for loops in numerical programs. There are gazillions of posts on that on this site, so I'll only give one example:
xdata2=list(xdata2)
ydata1=[]
for i in range(len(xdata2)):
    if xdata2[i]>0:
        ydata1.append(ydata[i])
can be written in one line that will execute much faster and return an array instead of a list (assuming ydata and xdata2 are numpy arrays of the same length):
ydata1 = ydata[xdata2 > 0]
Look at the numpy tutorial/docs or search this site for "vectorization" if you want to learn this technique.
Apart from that, no complaints.
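As a small illustration of the vectorised form, the sketch below builds one boolean mask and applies it to both arrays, so x and y are guaranteed to keep matching shapes. The six data values are made up for the example, and the sign convention `dstart - xdata` follows the second program in the question.

```python
import numpy as np

# Made-up values for illustration only
xdata = np.array([34.5, 27.0, 21.0, 19.5, 12.4, 3.7])
ydata = np.array([-0.003, -0.003, -0.004, 0.006, 0.041, 1.133])
dstart = 20.0

shifted = dstart - xdata   # shift the x values, as in the second program
mask = shifted > 0         # boolean array, True where the point is kept

# Apply the same mask to both arrays so their shapes always agree
x_fit = shifted[mask]
y_fit = ydata[mask]
print(x_fit, y_fit)
```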
Why your second program doesn't work:
You are filtering both your x and your y, so they should end up with the same shape. But you then pass the old, unfiltered y to curve_fit, whereas func2 uses the new x. That's why the shapes don't match.
By the way, the way you've set it up (modifying x within func2) more or less implements the absorb strategy I mentioned earlier. The catch is that, since func2 has no access to y, it must not change the shape of x.
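One way to absorb dstart into the model without changing the shape of x is to clip the shifted values at zero instead of dropping them, so points before the start simply contribute a model value of 0. The sketch below is an illustration under stated assumptions, not the asker's exact model: `hertz_like` is a hypothetical name, the constant prefactor of the original `func2` is folded into `EM` for brevity, and the data are synthetic and noise-free.

```python
import numpy as np
from scipy.optimize import curve_fit

def hertz_like(x, EM, dstart):
    # Shift x by dstart and clip negative values to zero, so the output
    # always has the same shape as x (no points are removed)
    indentation = np.clip(dstart - x, 0.0, None)
    return EM * indentation**1.5

# Synthetic data generated from known (assumed) parameters
x = np.linspace(35.0, 3.0, 80)
true_EM, true_dstart = 2.0, 20.0
y = hertz_like(x, true_EM, true_dstart)

# Both EM and dstart are now ordinary fit parameters
popt, pcov = curve_fit(hertz_like, x, y, p0=(1.0, 18.0))
print("EM =", popt[0], "dstart =", popt[1])
```

With real data the flat region before contact is noisy rather than exactly zero, so the fitted dstart is best checked visually against the curve.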

scipy: finding optimal parameters with fmin and odeint, bad fit

Below I solve a second-order ODE that describes a spring-mass-dashpot system: u'' + cu' + ku = 0. I have no problems with the odeint solver: the odeint function correctly solves for the position U(t) over the specified time.
#modeling spring mass system
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy import integrate
#Make the following substitution to make system first order
#Y[1]=y′(t) and Y[0]=y(t),
#system: Y[0]'=Y[1] and Y[1]'=-c*Y[1]-k*Y[0]
#=======================================================
def eq(par,initial_cond,start_t,end_t,incr):
    #-time-grid-----------------------------------
    t = np.linspace(start_t, end_t,incr)
    #differential-eq-system----------------------
    def funct(y,t):
        ut=y[0]
        ut_dt=y[1]
        c,k=par
        # the model equations u'=Y[1], u''=-k*Y[0]-c*Y[1] from u''+c*u'+k*u=0
        f0 =ut_dt
        f1 =-k*ut-c*ut_dt
        return [f0, f1]
    #integrate------------------------------------
    ds = integrate.odeint(funct,initial_cond,t)
    return (ds[:,0],ds[:,1],t)
#=======================================================
#parameters
c=2. #damping coefficient (multiplies u')
k=10. #spring constant (multiplies u)
#collect parameters in tuple
coefs=(c,k)
# initial conditions
u0=6.
ud0=0.
y0=[u0,ud0]
start,stop,incr=0,20,100
#Solve and plot solution
F0,F1,T=eq(coefs,y0,start,stop,incr)
plt.figure()
plt.plot(T,F0,'-b',T,F1,'-r')
plt.legend(('u0', 'u1'), loc='upper center')
plt.title('Mass-Spring System')
However, I would like to use scipy.optimize.fmin() to find the optimal fitting parameters (c,k) for this system if given simulated measurements. So I use the solution from above where c=2, and k=10 and add random noise.
rand_i=np.random.randn(incr)
#noiselevel
nl=.05
noisy_data=F0+nl*rand_i
plt.plot(noisy_data,label="noisy_data:c=2,k=10")
plt.legend()
Next, I set up a scoring function for fmin() to minimize. I use a guess for the parameters, c=1,k=1.
from scipy.optimize import fmin
#1.Get 'Real' Data
#====================================================
nd=noisy_data#solution with parameters: c=2,k=10
#====================================================
#2.Set up Info for Model System
#===================================================
# guess parameters
c=1 #damping coefficient (multiplies u')
k=1 #spring constant (multiplies u)
#collect parameters in tuple
coefs=(c,k)
# initial conditions
u0=6.
ud0=0.
y0=[u0,ud0]
# model steps
#---------------------------------------------------
start_time=0
end_time=20
intervals=100
mt=np.linspace(start_time,end_time,intervals)
#3.Score Fit of System
#=========================================================
def score(parms):
    #a.Get Solution to system
    F0,F1,T=eq(coefs,y0,start_time,end_time,intervals)
    #b.Pick of Model Points to Compare
    um=F0
    #c.Score Difference between model(ode output) and data points (noisy data)
    ss=lambda data,model:((data-model)**2).sum()
    return ss(nd,um)
#========================================================
#4.Optimize Fit
#=======================================================
fit_score=score(coefs)
answ=fmin(score,(coefs))
The problem is that fmin doesn't find the correct parameters. It finds that the guess parameters are the best, even though the score function is high. Below I print the fmin solution answ and show that it is the same as the initial guess even after fmin() has been called.
print(answ==[c,k])
Does anyone know why fmin() doesn't find the correct parameters, c=2, k=10?

There is a trivial bug in your code: you define score with input parameter parms, but then refer to said variable as coefs. Fix:
def score(coefs): #changed
    #a.Get Solution to system
    F0,F1,T=eq(coefs,y0,start_time,end_time,intervals)
    #b.Pick of Model Points to Compare
    um=F0
    #c.Score Difference between model(ode output) and data points (noisy data)
    ss=lambda data,model:((data-model)**2).sum()
    return ss(nd,um)
Before:
In [369]: answ
Out[369]: array([ 1., 1.])
After:
In [373]: answ
Out[373]: array([ 2.0425695 , 9.96937966])
However, note that answ==(c,k) will always be False, even for a perfect fit: you're working with floating-point numbers. Any meaningful comparison should look like max(abs(answ-[2,10])/abs(answ))<tol or something similar. (I know your original question used this to show that the values didn't change, but still.)
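Putting the fix and the tolerance-based comparison together, here is a condensed sketch of the whole fit. It is a compact rewrite under assumptions rather than the asker's exact script: the helper names `rhs` and `solve` are hypothetical, the noise comes from a seeded generator so the run is repeatable, and the 10% tolerance passed to `np.allclose` is an arbitrary choice.

```python
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import fmin

def rhs(y, t, c, k):
    # First-order form of u'' + c*u' + k*u = 0
    u, du = y
    return [du, -k * u - c * du]

def solve(params, y0, t):
    # Position u(t) for the given (c, k)
    c, k = params
    return odeint(rhs, y0, t, args=(c, k))[:, 0]

t = np.linspace(0.0, 20.0, 100)
y0 = [6.0, 0.0]

# Simulated measurements: true parameters c=2, k=10 plus Gaussian noise
rng = np.random.default_rng(0)
noisy_data = solve((2.0, 10.0), y0, t) + 0.05 * rng.standard_normal(t.size)

def score(params):
    # Sum of squared differences; note that it uses its own argument
    return ((noisy_data - solve(params, y0, t)) ** 2).sum()

answ = fmin(score, (1.0, 1.0), disp=False)

# Compare floating-point results with a tolerance, never with ==
fit_is_close = np.allclose(answ, [2.0, 10.0], rtol=0.1)
print(answ, fit_is_close)
```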
