Python fitting sinus cardinal and LMFIT library - python

I have a set of data from a physic experiment (simple-slit experiment) in university and i am trying to fit this data to a model that i build from the lmfit library.
I want a sinus cardinal square, in this form:
I(X)= I0.sincĀ²(pi.a.X/(lambda.D))
with a : the width of the slit,
lambda : wavelenght of the light
D : distance camera/slit
I0 : original intensity
import csv as csv
from math import pi
import matplotlib.pyplot as plt
import numpy as np
from lmfit import *
# create data to be fitted
with open('data_1.csv', 'r') as f:
values = list(csv.reader(f, delimiter=','))
values = np.array(values[1:], dtype=np.float)
position = values[:, 0]
intensity = values[:, 1]
#define function model
def fct(x, I0, a, D, b):
return I0 * np.square(np.sinc(pi * a * (x + b) / (0.00000063 * D)))
#b is for the horizontal shift because my experience
#was centered on 700 due to the camera
# do fit
vmodel = Model(fct)
vmodel.set_param_hint('I0', min=0., max=300.)
vmodel.set_param_hint('a', value=0.0005, min=0.0, max=1.)
vmodel.set_param_hint('D', value=0.53, min=0.0, max=1.)
vmodel.set_param_hint('b', min=0., max=2000.)
pars = vmodel.make_params()
result = vmodel.fit(intensity, pars, x=position)
# write report
print(result.fit_report())
#after we plot the data, with position on x and intensity on y
It returns totally wrong values and an error :
RuntimeWarning: invalid value encountered in double_scalars spercent =
'({0:.2%})'.format(abs(par.stderr/par.value))
[[Model]]
Model(fct)
[[Fit Statistics]]
# function evals = 7
# data points = 1280
# variables = 4
chi-square = 4058147.794
reduced chi-square = 3180.367
Akaike info crit = 10326.876
Bayesian info crit = 10347.494
[[Variables]]
I0: 0 +/- 0 (nan%) (init= 0)
a: 0.00050000 +/- 0 (0.00%) (init= 0.0005)
D: 0.50000000 +/- 0 (0.00%) (init= 0.5)
b: 400 +/- 0 (0.00%) (init= 400)
Could you help me please ? I tried lots of type models from this library but nothing work correctly and i really need it. I already solved 2D problems with a np.square, and other reading things, the major problem is the model.
Waiting for answers,
Thanks,

You probably want to provide reasonable starting values for all your parameter values. As you wrote it, there are no initial values for I0 or b but conveniently(?) these parameters have bounds set, so that initial values can be inferred (probably poorly) as the lower bound -- I don't know how b became 400. Maybe a typo?
Anyway, I would recommend trying
pars = vmodel.make_params(I0=150, b=400)
and then try the fit again.

Related

Trying to fit a trig function to data with scipy

I am trying to fit some data using scipy.optimize.curve_fit. I have read the documentation and also this StackOverflow post, but neither seem to answer my question.
I have some data which is simple, 2D data which looks approximately like a trig function. I want to fit it with a general trig function
using scipy.
My approach is as follows:
from __future__ import division
import numpy as np
from scipy.optimize import curve_fit
#Load the data
data = np.loadtxt('example_data.txt')
t = data[:,0]
y = data[:,1]
#define the function to fit
def func_cos(t,A,omega,dphi,C):
# A is the amplitude, omega the frequency, dphi and C the horizontal/vertical shifts
return A*np.cos(omega*t + dphi) + C
#do a scipy fit
popt, pcov = curve_fit(func_cos, t,y)
#Plot fit data and original data
fig = plt.figure(figsize=(14,10))
ax1 = plt.subplot2grid((1,1), (0,0))
ax1.plot(t,y)
ax1.plot(t,func_cos(t,*popt))
This outputs:
where blue is the data orange is the fit. Clearly I am doing something wrong. Any pointers?
If no values are provided for initial guess of the parameters p0 then a value of 1 is assumed for each of them. From the docs:
p0 : array_like, optional
Initial guess for the parameters (length N). If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).
Since your data has very large x-values and very small y-values an initial guess of 1 is far from the actual solution and hence the optimizer does not converge. You can help the optimizer by providing suitable initial parameter values that can be guessed / approximated from the data:
Amplitude: A = (y.max() - y.min()) / 2
Offset: C = (y.max() + y.min()) / 2
Frequency: Here we can estimate the number of zero crossing by multiplying consecutive y-values and check which products are smaller than zero. This number divided by the total x-range gives the frequency and in order to get it in units of pi we can multiply that number by pi: y_shifted = y - offset; oemga = np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min())
Phase shift: can be set to zero, dphi = 0
So in summary, the following initial parameter guess can be used:
offset = (y.max() + y.min()) / 2
y_shifted = y - offset
p0 = (
(y.max() - y.min()) / 2,
np.pi * np.sum(y_shifted[:-1] * y_shifted[1:] < 0) / (t.max() - t.min()),
0,
offset
)
popt, pcov = curve_fit(func_cos, t, y, p0=p0)
Which gives me the following fit function:

Fitting Voigt function to data in Python

I recently got a script running to fit a gaussian to my absorption profile with help of SO. My hope was that things would work fine if I simply replace the Gauss function by a Voigt one, but this seems not to be the case. I think mainly due to the fact that it is a shifted voigt.
Edit: The profiles are absorption lines that vary in optical thickness. In practice they will be a mix between optically thick and thin features. Like the bottom part in this diagram. The current data will be more like the top image, but maybe the bottom is already flattened a bit. (And we only see the left side of the profile, a bit beyond the center)
For a Gauss it looks like this and as predicted the bottom seems to be less deep than the fit wants it to be, but still quite close. The profile itself should still be a voigt though. But now I realize that the central points might throw off the fit. So maybe a weight should be added based on wing position?
I'm mostly wondering if the shifted function could be mis-defined or if its my starting values.
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
import numpy as np
from scipy.special import wofz
x = np.arange(13)
xx = xx = np.linspace(0, 13, 100)
y = np.array([19699.959 , 21679.445 , 21143.195 , 20602.875 , 16246.769 ,
11635.25 , 8602.465 , 7035.493 , 6697.0337, 6510.092 ,
7717.772 , 12270.446 , 16807.81 ])
# weighted arithmetic mean (corrected - check the section below)
#mean = 2.4
sigma = 2.4
gamma = 2.4
def Gauss(x, y0, a, x0, sigma):
return y0 + a * np.exp(-(x - x0)**2 / (2 * sigma**2))
def Voigt(x, x0, y0, a, sigma, gamma):
#sigma = alpha / np.sqrt(2 * np.log(2))
return y0 + a * np.real(wofz((x - x0 + 1j*gamma)/sigma/np.sqrt(2))) / sigma /np.sqrt(2*np.pi)
popt, pcov = curve_fit(Voigt, x, y, p0=[8, np.max(y), -(np.max(y)-np.min(y)), sigma, gamma])
#p0=[8, np.max(y), -(np.max(y)-np.min(y)), mean, sigma])
plt.plot(x, y, 'b+:', label='data')
plt.plot(xx, Voigt(xx, *popt), 'r-', label='fit')
plt.legend()
plt.show()
I may be misunderstanding the model you're using, but I think you would need to include some sort of constant or linear background.
To do that with lmfit (which has Voigt, Gaussian, and many other models built in, and tries very hard to make these interchangeable), I would suggest starting with something like this:
import numpy as np
import matplotlib.pyplot as plt
from lmfit.models import GaussianModel, VoigtModel, LinearModel, ConstantModel
x = np.arange(13)
xx = np.linspace(0, 13, 100)
y = np.array([19699.959 , 21679.445 , 21143.195 , 20602.875 , 16246.769 ,
11635.25 , 8602.465 , 7035.493 , 6697.0337, 6510.092 ,
7717.772 , 12270.446 , 16807.81 ])
# build model as Voigt + Constant
## model = GaussianModel() + ConstantModel()
model = VoigtModel() + ConstantModel()
# create parameters with initial values
params = model.make_params(amplitude=-1e5, center=8,
sigma=2, gamma=2, c=25000)
# maybe place bounds on some parameters
params['center'].min = 2
params['center'].max = 12
params['amplitude'].max = 0.
# do the fit, print out report with results
result = model.fit(y, params, x=x)
print(result.fit_report())
# plot data, best fit, fit interpolated to `xx`
plt.plot(x, y, 'b+:', label='data')
plt.plot(x, result.best_fit, 'ko', label='fitted points')
plt.plot(xx, result.eval(x=xx), 'r-', label='interpolated fit')
plt.legend()
plt.show()
And, yes, you can simply replace VoigtModel() with GaussianModel() or LorentzianModel() and redo the fit and compare the fit statistics to see which model is better.
For the Voigt model fit, the printed report would be
[[Model]]
(Model(voigt) + Model(constant))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 41
# data points = 13
# variables = 4
chi-square = 17548672.8
reduced chi-square = 1949852.54
Akaike info crit = 191.502014
Bayesian info crit = 193.761811
[[Variables]]
amplitude: -173004.338 +/- 30031.4068 (17.36%) (init = -100000)
center: 8.06574198 +/- 0.16209266 (2.01%) (init = 8)
sigma: 1.96247322 +/- 0.23522096 (11.99%) (init = 2)
c: 23800.6655 +/- 1474.58991 (6.20%) (init = 25000)
gamma: 1.96247322 +/- 0.23522096 (11.99%) == 'sigma'
fwhm: 7.06743644 +/- 0.51511574 (7.29%) == '1.0692*gamma+sqrt(0.8664*gamma**2+5.545083*sigma**2)'
height: -18399.0337 +/- 2273.61672 (12.36%) == '(amplitude/(max(2.220446049250313e-16, sigma*sqrt(2*pi))))*wofz((1j*gamma)/(max(2.220446049250313e-16, sigma*sqrt(2)))).real'
[[Correlations]] (unreported correlations are < 0.100)
C(amplitude, c) = -0.957
C(amplitude, sigma) = -0.916
C(sigma, c) = 0.831
C(center, c) = -0.151
Note that by default gamma is constrained to be the same value as sigma. This constraint can be lifted and gamma made to vary independently with params['gamma'].set(expr=None, vary=True, min=1.e-9). I think that you may not have enough data points in this data set to robustly and independently determine gamma.
The plot for that fit would look like this:
I managed to get something, but not very satisfying. If you remove the offset as a parameter and add 20000 directly in the Voigt function, with starting values [8, 126000, 0.71, 2] (the particular values don't' matter much) you'll get something like
Now, the fit produces a value for gamma which is negative which I cannot really justify. I would expect gamma to always be positive, but maybe I'm wrong and it's completely fine.
One thing you could try is to mirror your data so that its a "positive" peak (and while at it removing the background) and/or normalize the values. That might help you in the convergence.
I have no idea why when using the offset as a parameter the solver has problems finding an optimum. Maybe you need a different optimizer routine.
Maybe it'll be a better option to use the lmfit package that it's a wrapper over scipy to fit nonlinear functions with many prebuilt lineshapes. There is even an example of fitting a Voigt profile.

Illogical parameters returned by scipy.curve_fit

I'm modelling a ball falling through fluid in Python and fitting the model function to a set of data points using the damping coefficients (a and b) and the density of the fluid, but the fitted value for the fluid density is coming back negative and I have no idea what is wrong in the code. My code is below:
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
from scipy.optimize import curve_fit
##%%Parameters and constants
m = 0.1 #mass of object in kg
g = 9.81 #acceleration due to gravity in m/s**2
rho = 700 #density of object in kg/m**3
v0 = 0 #velocity at t=0
y0 = 0 #position at t=0
V = m / rho #volume in cubic meters
r = ((3/4)*(V/np.pi))**(1/3) #radius of the sphere
asample = 0.0001 #sample value for a
bsample = 0.0001 #sample value for b
#%%Defining integrating function
##fuction => y'' = g*(1-(rhof/rho))-((a/m)y'+(b/m)y'**2)
## y' = v
## v' = g*(1-rhof/rho)-((a/m)v+(b/m)v**2)
def sinkingball(Y, time, a, b, rhof):
return [Y[1],(1/m)*(V*g*(rho-rhof)-a*Y[1]-b*(Y[1]**2))]
def balldepth(time, a, b, rhof):
solutions = odeint(sinkingball, [y0,v0], time, args=(a, b, rhof))
return solutions[:,0]
time = np.linspace(0,15,151)
# imported some experimental values and named the array data
a, b, rhof = curve_fit(balldepth, time, data, p0=(asample, bsample, 100))[0]
print(a,b,rhof)
Providing the output you actually get would be helpful, and the comment about time not being used by sinkingball() is worth following.
You might find lmfit (https://lmfit.github.io/lmfit-py) useful. This provides a higher-level interface to curve-fitting that allows, among other things, placing bounds on parameters so that they can remain physically sensible. I think your problem would translate from curve_fit to lmfit as:
from lmfit import Model
def balldepth(time, a, b, rhof):
solutions = odeint(sinkingball, [y0,v0], time, args=(a, b, rhof))
return solutions[:,0]
# create a model based on the model function "balldepth"
ballmodel = Model(balldepth)
# create parameters, which will be named using the names of the
# function arguments, and provide initial values
params = bollmodel.make_params(a=0.001, b=0.001, rhof=100)
# you wanted rhof to **not** vary in the fit:
params['rhof'].vary = False
# set min/max values on `a` and `b`:
params['a'].min = 0
params['b'].min = 0
# run the fit
result = ballmodel.fit(data, params, time=time)
# print out full report of results
print(result.fit_report())
# get / print out best-fit parameters:
for parname, param in result.params.items():
print("%s = %f +/- %f" % (parname, param.value, param.stderr))

Lmfit gives -1 correlation and large uncertainty (python)

I am trying to fit a model function to a curve using the lmfit module.
The curve that I am fitting is set up as follows:
e(x) = exp(-(x-X)/x0) for x larger or equal than X, 0 otherwise.
G(x) = (1/sqrt(2*pi)*sigma) * exp(-x^2/2*sigma^2)
The model fit M(x) = E * conv(e,G)(x) + B
Where e is a truncated exponential, G is a gaussian, and E and B are constants. The operator between e and G is a convolution.
When I try to fit this function to my data I get a good fit. However, the fit is very sensitive to my initial value that I provide for X. This is also reflected in the uncertainty in the parameters:
[[Model]]
((Model(totemiss) * (Model(exptruncated) <function convolve at 0x7f139e2dcde8> Model(gaussian))) + Model(background))
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 67
# data points = 54
# variables = 5
chi-square = 120558969110355112544642583094864038386991104.00000
reduced chi-square = 2460387124701124853181382654239391973638144.00000
Akaike info crit = 5275.63336
Bayesian info crit = 5285.57828
[[Variables]]
E: 9.7316e+28 +/- 2.41e+33 (2475007.74%) (init= 1.2e+29)
x0: 5.9420e+06 +/- 9.52e+04 (1.60%) (init= 5000000)
X: 4.9049e+05 +/- 1.47e+11 (29978575.17%) (init= 100000)
sigma: 2.6258e+06 +/- 5.74e+04 (2.19%) (init= 2000000)
center: 0 (fixed)
amplitude: 1 (fixed)
B: 3.9017e+22 +/- 3.75e+20 (0.96%) (init= 4.5e+22)
[[Correlations]] (unreported correlations are < 0.100)
C(E, X) = -1.000
C(sigma, B) = -0.429
C(x0, sigma) = -0.283
C(x0, B) = -0.266
C(E, x0) = -0.105
C(x0, X) = 0.105
I suspect this has something to do due to the correlation between E and X being -1.00, which does not make any sense. I am trying to find out why I get this error and I believe it might be in the definition of the model:
def exptruncated(x, x0, X):
return np.exp(-(x-X)/x0)* (x > X)
#Define convolution operator
def convolve(arr, kernel):
npts = min(len(arr), len(kernel))
pad = np.ones(npts)
tmp = np.concatenate((pad*arr[0], arr, pad*arr[-1]))
out = np.convolve(tmp, kernel, mode='valid')
noff = int((len(out) - npts)/2)
return out[noff:noff+npts]
#Constant value for total emissions#
def totemiss(x,E):
return E
#Constant value for background value
def background(x,B):
return B
# create Composite Model using the custom convolution operator
# M(x) = E + conv(exp,gauss) + B
mod = Model(totemiss)* CompositeModel(Model(exptruncated), Model(gaussian), convolve) + Model(background)
mod.set_param_hint('x0',value=50*1e5,min=0,max=60*1e5)
mod.set_param_hint('amplitude',value=1.0)
mod.set_param_hint('center',value=0.0)
mod.set_param_hint('sigma',value=20*1e5,min=0,max=100*1e5)
mod.set_param_hint('X',value=1.0*1e5,min=0, max=5.0*1e5)
mod.set_param_hint('B',value=0.45*1e23,min=0.3*1e23,max=1.0*1e23)
mod.set_param_hint('E',value=1.2*1e29,min=1.2*1e26,max=1.0*1e32)
pars = mod.make_params()
pars['amplitude'].vary = False
pars['center'].vary = False
result = mod.fit(y, params=pars, x=x)
comps = result.eval_components(x=x)
Although I believe the model is the reason I am not able to find where the error comes from. Perhaps somebody of you can help me out!
Why not just remove E from the model -- the X parameter is serving as a constant offset.
I'd also advise having parameters that are more reasonably scaled, closer to order of unity (roughly 1e-6 to 1e6, say). You can add scales of 1e10 and so on as needed in the model calculation, but it generally helps the calculations of covariance (used to determine how to update values in the fit) to have parameters more uniformly scaled.

Failure of non linear fit to sine curve

I've been trying to fit the amplitude, frequency and phase of a sine curve given some generated two dimensional toy data. (Code at the end)
To get estimates for the three parameters, I first perform an FFT. I use the values from the FFT as initial guesses for the actual frequency and phase and then fit for them (row by row). I wrote my code such that I input which bin of the FFT I want the frequency to be in, so I can check if the fitting is working well. But there's some pretty strange behaviour. If my input bin is say 3.1 (a non integral bin, so the FFT won't give me the right frequency) then the fit works wonderfully. But if the input bin is 3 (so the FFT outputs the exact frequency) then my fit fails, and I'm trying to understand why.
Here's the output when I give the input bins (in the X and Y direction) as 3.0 and 2.1 respectively:
(The plot on the right is data - fit)
Here's the output when I give the input bins as 3.0 and 2.0:
Question: Why does the non linear fit fail when I input the exact frequency of the curve?
Code:
#! /usr/bin/python
# For the purposes of this code, it's easier to think of the X-Y axes as transposed,
# so the X axis is vertical and the Y axis is horizontal
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
import itertools
import sys
PI = np.pi
# Function which accepts paramters to define a sin curve
# Used for the non linear fit
def sineFit(t, a, f, p):
return a * np.sin(2.0 * PI * f*t + p)
xSize = 18
ySize = 60
npt = xSize * ySize
# Get frequency bin from user input
xFreq = float(sys.argv[1])
yFreq = float(sys.argv[2])
xPeriod = xSize/xFreq
yPeriod = ySize/yFreq
# arrays should be defined here
# Generate the 2D sine curve
for jj in range (0, xSize):
for ii in range(0, ySize):
sineGen[jj, ii] = np.cos(2.0*PI*(ii/xPeriod + jj/yPeriod))
# Compute 2dim FFT as well as freq bins along each axis
fftData = np.fft.fft2(sineGen)
fftMean = np.mean(fftData)
fftRMS = np.std(fftData)
xFreqArr = np.fft.fftfreq(fftData.shape[1]) # Frequency bins along x
yFreqArr = np.fft.fftfreq(fftData.shape[0]) # Frequency bins along y
# Find peak of FFT, and position of peak
maxVal = np.amax(np.abs(fftData))
maxPos = np.where(np.abs(fftData) == maxVal)
# Iterate through peaks in the FFT
# For this example, number of loops will always be only one
prevPhase = -1000
for col, row in itertools.izip(maxPos[0], maxPos[1]):
# Initial guesses for fit parameters from FFT
init_phase = np.angle(fftData[col,row])
init_amp = 2.0 * maxVal/npt
init_freqY = yFreqArr[col]
init_freqX = xFreqArr[row]
cntr = 0
if prevPhase == -1000:
prevPhase = init_phase
guess = [init_amp, init_freqX, prevPhase]
# Fit each row of the 2D sine curve independently
for rr in sineGen:
(amp, freq, phs), pcov = optimize.curve_fit(sineFit, xDat, rr, guess)
# xDat is an linspace array, containing a list of numbers from 0 to xSize-1
# Subtract fit from original data and plot
fitData = sineFit(xDat, amp, freq, phs)
sub1 = rr - fitData
# Plot
fig1 = plt.figure()
ax1 = fig1.add_subplot(121)
p1, = ax1.plot(rr, 'g')
p2, = ax1.plot(fitData, 'b')
plt.legend([p1,p2], ["data", "fit"])
ax2 = fig1.add_subplot(122)
p3, = ax2.plot(sub1)
plt.legend([p3], ['residual1'])
fig1.tight_layout()
plt.show()
cntr += 1
prevPhase = phs # Update guess for phase of sine curve
I've tried to distill the important parts of your question into this answer.
First of all, try fitting a single block of data, not an array. Once you are confident that your model is sufficient you can move on.
Your fit is only going to be as good as your model, if you move on to something not "sine"-like you'll need to adjust accordingly.
Fitting is an "art", in that the initial conditions can greatly change the convergence of the error function. In addition there may be more than one minima in your fits, so you often have to worry about the uniqueness of your proposed solution.
While you were on the right track with your FFT idea, I think your implementation wasn't quite correct. The code below should be a great toy system. It generates random data of the type f(x) = a0*sin(a1*x+a2). Sometimes a random initial guess will work, sometimes it will fail spectacularly. However, using the FFT guess for the frequency the convergence should always work for this system. An example output:
import numpy as np
import pylab as plt
import scipy.optimize as optimize
# This is your target function
def sineFit(t, (a, f, p)):
return a * np.sin(2.0*np.pi*f*t + p)
# This is our "error" function
def err_func(p0, X, Y, target_function):
err = ((Y - target_function(X, p0))**2).sum()
return err
# Try out different parameters, sometimes the random guess works
# sometimes it fails. The FFT solution should always work for this problem
inital_args = np.random.random(3)
X = np.linspace(0, 10, 1000)
Y = sineFit(X, inital_args)
# Use a random inital guess
inital_guess = np.random.random(3)
# Fit
sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit))
# Plot the fit
Y2 = sineFit(X, sol)
plt.figure(figsize=(15,10))
plt.subplot(211)
plt.title("Random Inital Guess: Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
# Use an improved "fft" guess for the frequency
# this will be the max in k-space
timestep = X[1]-X[0]
guess_k = np.argmax( np.fft.rfft(Y) )
guess_f = np.fft.fftfreq(X.size, timestep)[guess_k]
inital_guess[1] = guess_f
# Guess the amplitiude by taking the max of the absolute values
inital_guess[0] = np.abs(Y).max()
sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit))
Y2 = sineFit(X, sol)
plt.subplot(212)
plt.title("FFT Guess : Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
plt.show()
The problem is due to a bad initial guess of the phase, not the frequency. While cycling through the rows of genSine (inner loop) you use the fit result of the previous line as initial guess for the next row which does not work always. If you determine the phase from an fft of the current row and use that as initial guess the fit will succeed.
You could change the inner loop as follows:
for n,rr in enumerate(sineGen):
fftx = np.fft.fft(rr)
fftx = fftx[:len(fftx)/2]
idx = np.argmax(np.abs(fftx))
init_phase = np.angle(fftx[idx])
print fftx[idx], init_phase
...
Also you need to change
def sineFit(t, a, f, p):
return a * np.sin(2.0 * np.pi * f*t + p)
to
def sineFit(t, a, f, p):
return a * np.cos(2.0 * np.pi * f*t + p)
since phase=0 means that the imaginary part of the fft is zero and thus the function is cosine like.
Btw. your sample above is still lacking definitions of sineGen and xDat.
Without understanding much of your code, according to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
sub1, guess2)
should become:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
sub1, p0=guess2)
Assuming that tDat and sub1 are x and y, that should do the trick. But, once again, it is quite difficult to understand such a complex code with so many interlinked variables and no comments at all. A code should always be build from bottom up, meaning that you don't do a loop of fits when a single one is not working, you don't add noise until the code works to fit the non-noisy examples... Good luck!
By "nothing fancy" I meant something like removing EVERYTHING that is not related with the fit, and doing a simplified mock example such as:
import numpy as np
import scipy.optimize as optimize
def sineFit(t, a, f, p):
return a * np.sin(2.0 * np.pi * f*t + p)
# Create array of x and y with given parameters
x = np.asarray(range(100))
y = sineFit(x, 1, 0.05, 0)
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.05, 0.]
print optimize.curve_fit(sineFit, x, y, guess)[0]
The result of this is exactly the answer:
[1. 0.05 0.]
But if you change guess not too much, just enough:
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.06, 0.]
print optimize.curve_fit(sineFit, x, y, guess)[0]
the result gives absurdly wrong numbers:
[ 0.00823701 0.06391323 -1.20382787]
Can you explain this behavior?
You can use curve_fit with a series of trigonometric functions, usually very robust and ajustable to the precision that you need just by increasing the number of terms... here is an example:
from scipy import sin, cos, linspace
def f(x, a0,s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12,
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12):
return a0 + s1*sin(1*x) + c1*cos(1*x) \
+ s2*sin(2*x) + c2*cos(2*x) \
+ s3*sin(3*x) + c3*cos(3*x) \
+ s4*sin(4*x) + c4*cos(4*x) \
+ s5*sin(5*x) + c5*cos(5*x) \
+ s6*sin(6*x) + c6*cos(6*x) \
+ s7*sin(7*x) + c7*cos(7*x) \
+ s8*sin(8*x) + c8*cos(8*x) \
+ s9*sin(9*x) + c9*cos(9*x) \
+ s10*sin(9*x) + c10*cos(9*x) \
+ s11*sin(9*x) + c11*cos(9*x) \
+ s12*sin(9*x) + c12*cos(9*x)
from scipy.optimize import curve_fit
pi/2. / (x.max() - x.min())
x_norm *= norm_factor
popt, pcov = curve_fit(f, x_norm, y)
x_fit = linspace(x_norm.min(), x_norm.max(), 1000)
y_fit = f(x_fit, *popt)
plt.plot( x_fit/x_norm, y_fit )

Categories

Resources