I would like to use the curve_fit function from the scipy.optimize module to determine amplitudes, frequencies, phases of sum of sine functions (and one y0). It's easy to do when I know a number of sines to use. For example when I know two frequencies from the DFT (Discrete Fourier Transform): 1.152 and 0.432 I can define a function:
def func(x, amp1, amp2, freq1 , freq2, phase1, phase2, y0):
return amp1*np.sin(freq1*x + phase1) + amp2*np.sin(freq2*x + phase2) + y0
Then, using the curve_fit and constraining intervals of frequencies I can find a good fitting:
param, _ = curve_fit(func, t, data, bounds=([-np.inf, -np.inf, 1.14, 0.43, -np.inf, -np.inf, -np.inf], [np.inf, np.inf, 1.16, 0.44, np.inf, np.inf, np.inf]))
It looks great:
But in this case I've prepared the data and I've known a number of frequencies. Do you know how to define the func only once and handle all cases (for example five sine functions)? I've tried to put the parameters into lists, e.g. amp = [amp1, amp2, ... ] and I've iterated over their length. But there is a problem to define bounds for parameter lists. bounds is very important to ensure reality model.
The solution does not have to based on curve_fit.
Assuming you know the frequencies beforehand the problem is simple. You can set the lower bound to 0 and set the upper bound to 2 * pi * freq for frequency. For amps, set any number (or np.inf if you want no boundary).
You can formulate the function in the form lambda x, amp1, phase1, amp2, phase2... : y, curve_fit can accept a function of undefined number of arguments as long as you supply a proper initial guess.
A sample code for five frequencies:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
x = np.linspace(0,10,60)
w = [1,2,3,4,5]
a = [1,4,2,3,0.1]
x0 = [0,1,0,1,0.5]
y = np.sum(a_i * np.sin(w_i * x - x0_i) for w_i, a_i, x0_i in zip(w,a, x0)) #base_data
yr = y + np.random.normal(0,0.5, size=x.size) #noisy data
def func(x, *args):
""" function of the form lambda x, amp1, phase1, amp2, phase2...."""
return np.sum(a_i * np.sin(w_i * (x-x0)) for w_i, a_i, x0
in zip(w,args[::2], args[1::2]))
ubounds = np.zeros(len(w) * 2)
ubounds[::2] = 10 #setting amp max value to 10 (arbitrary)
ubounds[1::2] = np.asarray(w) * 2 * np.pi
p0 = [0] * 10 # note p0 size
popt, pcov = curve_fit(func, x, yr, p0, bounds=(0, ubounds))
amps, phases = popt[::2], popt[1::2]
plt.plot(x,func(x, *popt))
plt.plot(x,yr, 'go')
Related
I'm trying to fit a curve with a Gaussian plus a Lorentzian function, using the curve_fit function from scipy.
def gaussian(x, a, x0, sig):
return a * np.exp(-1/2 * (x - x0)**2 / sig**2)
def lorentzian(x, a, b, c):
return a*c**2/((x-b)**2+c**2)
def decompose(x, z, n, b, *par):
hb_n = gaussian(x, par[0], 4861.3*(1+z), n)
hb_b = lorentzian(x, par[1], 4861.3*(1+z), b)
return hb_b + hb_n
And when I set the p0 parameter, I can get a reasonable result, which fits the curve well.
guess = [0.0001, 2, 10, 3e-16, 3e-16]
p, c = curve_fit(decompose, wave, residual, guess)
fitting parameters
the fitting model and data figure when I set the p0 parameter
But if I set the p0 and bounds parameters simultaneously, the curve_fit function gives the initial guess as the final fitting result, which is rather deviated from the data.
guess = [0.0001, 2, 10, 3e-16, 3e-16]
p, c = curve_fit(decompose, wave, residual, guess, bounds=([-0.001, 0, 0, 0, 0], [0.001, 10, 100, 1e-15, 1e-15]))
fitting parameters
the fitting model and data figure when I set the p0 and bounds parameters simultaneously
I have tried many different combinations of boundaries for the parameters, but the fitting results invariably return the initial guess values. I've been stuck in this problem for a long time. I would be very grateful if anyone can give me some advice to solve this problem.
This happens due to a combination of the optimization algorithm and its parameters.
From the official documentation:
method{‘lm’, ‘trf’, ‘dogbox’}, optional
Method to use for optimization. See least_squares for more details.
Default is ‘lm’ for unconstrained problems and ‘trf’ if bounds are
provided. The method ‘lm’ won’t work when the number of observations
is less than the number of variables, use ‘trf’ or ‘dogbox’ in this
case.
So when you add bound constraints curve_fit will use different optimization algorithm (trust region instead of Levenberg-Marquardt).
To debug the problem you can try to set full_output=True as Warren Weckesser noted in the comments.
In the case of the fit with bounds you will see something similar to:
'nfev': 1
`gtol` termination condition is satisfied.
So the optimization stopped after the first iteration. That's why founed parameters similar to the initial guess.
To fix this you can specify lower gtol parameter. Full list of available parameters you can find here: https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.least_squares.html#scipy.optimize.least_squares
Example:
Code:
import numpy as np
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
def gaussian(x, a, x0, sig):
return a * np.exp(-1 / 2 * (x - x0) ** 2 / sig**2)
def lorentzian(x, a, b, c):
return a * c**2 / ((x - b) ** 2 + c**2)
def decompose(x, z, n, b, *par):
hb_n = gaussian(x, par[0], 4861.3 * (1 + z), n)
hb_b = lorentzian(x, par[1], 4861.3 * (1 + z), b)
return hb_b + hb_n
gt_parameters = [-2.42688295e-4, 2.3477827, 1.56977708e1, 4.47455820e-16, 2.2193466e-16]
wave = np.linspace(4750, 5000, num=400)
gt_curve = decompose(wave, *gt_parameters)
noisy_curve = gt_curve + np.random.normal(0, 2e-17, size=len(wave))
guess = [0.0001, 2, 10, 3e-16, 3e-16]
bounds = ([-0.001, 0, 0, 0, 0], [0.001, 10, 100, 1e-15, 1e-15])
options = [
("Levenberg-Marquardt without bounds", dict(method="lm")),
("Trust Region without bounds", dict(method="trf")),
("Trust Region with bounds", dict(method="trf", bounds=bounds)),
(
"Trust Region with bounds + fixed tolerance",
dict(method="trf", bounds=bounds, gtol=1e-36),
),
]
fig, axs = plt.subplots(len(options))
for (title, fit_params), ax in zip(options, axs):
ax.set_title(title)
p, c = curve_fit(decompose, wave, noisy_curve, guess, **fit_params)
fitted_curve = decompose(wave, *p)
ax.plot(wave, gt_curve, label="gt_curve")
ax.plot(wave, noisy_curve, label="noisy")
ax.plot(wave, fitted_curve, label="fitted_curve")
handles, labels = ax.get_legend_handles_labels()
fig.legend(handles, labels)
plt.show()
I am trying to use scipy to numerically solve the following differential equation
x''+x=\sum_{k=1}^{20}\delta(t-k\pi), y(0)=y'(0)=0.
Here is the code
from scipy.integrate import odeint
import numpy as np
import matplotlib.pyplot as plt
from sympy import DiracDelta
def f(t):
sum = 0
for i in range(20):
sum = sum + 1.0*DiracDelta(t-(i+1)*np.pi)
return sum
def ode(X, t):
x = X[0]
y = X[1]
dxdt = y
dydt = -x + f(t)
return [dxdt, dydt]
X0 = [0, 0]
t = np.linspace(0, 80, 500)
sol = odeint(ode, X0, t)
x = sol[:, 0]
y = sol[:, 1]
plt.plot(t,x, t, y)
plt.xlabel('t')
plt.legend(('x', 'y'))
# phase portrait
plt.figure()
plt.plot(x,y)
plt.plot(x[0], y[0], 'ro')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
However what I got from python is zero solution, which is different from what I got from Mathematica. Here are the mathematica code and the graph
so=NDSolve[{x''(t)+x(t)=\sum _{i=1}^{20} DiraDelta (t-i \pi ),x(0)=0,x'(0)=0},x(t),{t,0,80}]
It seems to me that scipy ignores the Dirac delta function. Where am I wrong? Any help is appreciated.
Dirac delta is not a function. Writing it as density in an integral is still only a symbolic representation. It is, as mathematical object, a functional on the space of continuous functions. delta(t0,f)=f(t0), not more, not less.
One can approximate the evaluation, or "sifting" effect of the delta operator by continuous functions. The usual good approximations have the form N*phi(N*t) where N is a large number and phi a non-negative function, usually with a somewhat compact shape, that has integral one. Popular examples are box functions, tent functions, the Gauß bell curve, ... So you could take
def tentfunc(t): return max(0,1-abs(t))
N = 10.0
def rhs(t): return sum( N*tentfunc(N*(t-(i+1)*np.pi)) for i in range(20))
X0 = [0, 0]
t = np.linspace(0, 80, 1000)
sol = odeint(lambda x,t: [ x[1], rhs(t)-x[0]], X0, t, tcrit=np.pi*np.arange(21), atol=1e-8, rtol=1e-10)
x,v = sol.T
plt.plot(t,x, t, v)
which gives
Note that the density of the t array also influences the accuracy, while the tcrit critical points did not do much.
Another way is to remember that delta is the second derivative of max(0,x), so one can construct a function that is the twice primitive of the right side,
def u(t): return sum(np.maximum(0,t-(i+1)*np.pi) for i in range(20))
so that now the equation is equivalent to
(x(t)-u(t))'' + x(t) = 0
set y = x-u then
y''(t) + y(t) = -u(t)
which now has a continuous right side.
X0 = [0, 0]
t = np.linspace(0, 80, 1000)
sol = odeint(lambda y,t: [ y[1], -u(t)-y[0]], X0, t, atol=1e-8, rtol=1e-10)
y,v = sol.T
x=y+u(t)
plt.plot(t,x)
odeint :
does not handle sympy symbolic objects
it's unlikely it can ever handle Dirac Delta terms.
The best bet is probably to turn dirac deltas into boundary conditions: assume that the function is continuous at the location of the Dirac delta, but the first derivative jumps. Integrating over infinitesimal interval around the location of the delta function gives you the boundary condition for the derivative just left and just right from the delta.
Say I want to fit a sine function using scipy.optimize.curve_fit. I don't know any parameters of the function. To get the frequency, I do Fourier transform and guess all the other parameters - amplitude, phase, and offset. When running my program, I do get a fit but it does not make sense. What is the problem? Any help will be appreciated.
import numpy as np
import matplotlib.pyplot as plt
import scipy as sp
ampl = 1
freq = 24.5
phase = np.pi/2
offset = 0.05
t = np.arange(0,10,0.001)
func = np.sin(2*np.pi*t*freq + phase) + offset
fastfft = np.fft.fft(func)
freq_array = np.fft.fftfreq(len(t),t[0]-t[1])
max_value_index = np.argmax(abs(fastfft))
frequency = abs(freq_array[max_value_index])
def fit(a, f, p, o, t):
return a * np.sin(2*np.pi*t*f + p) + o
guess = (0.9, frequency, np.pi/4, 0.1)
params, fit = sp.optimize.curve_fit(fit, t, func, p0=guess)
a, f, p, o = params
fitfunc = lambda t: a * np.sin(2*np.pi*t*f + p) + o
plt.plot(t, func, 'r-', t, fitfunc(t), 'b-')
The main problem in your program was a misunderstanding, how scipy.optimize.curve_fit is designed and its assumption of the fit function:
ydata = f(xdata, *params) + eps
This means that the fit function has to have the array for the x values as the first parameter followed by the function parameters in no particular order and must return an array for the y values. Here is an example, how to do this:
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize
#t has to be the first parameter of the fit function
def fit(t, a, f, p, o):
return a * np.sin(2*np.pi*t*f + p) + o
ampl = 1
freq = 2
phase = np.pi/2
offset = 0.5
t = np.arange(0,10,0.01)
#is the same as fit(t, ampl, freq, phase, offset)
func = np.sin(2*np.pi*t*freq + phase) + offset
fastfft = np.fft.fft(func)
freq_array = np.fft.fftfreq(len(t),t[0]-t[1])
max_value_index = np.argmax(abs(fastfft))
frequency = abs(freq_array[max_value_index])
guess = (0.9, frequency, np.pi/4, 0.1)
#renamed the covariance matrix
params, pcov = scipy.optimize.curve_fit(fit, t, func, p0=guess)
a, f, p, o = params
#calculate the fit plot using the fit function
plt.plot(t, func, 'r-', t, fit(t, *params), 'b-')
plt.show()
As you can see, I have also changed the way the fit function for the plot is calculated. You don't need another function - just utilise the fit function with the parameter list, the fit procedure gives you back.
The other problem was that you called the covariance array fit - overwriting the previously defined function fit. I fixed that as well.
P.S.: Of course now you only see one curve, because the perfect fit covers your data points.
I am using scipy.optimize.curve_fit to fit a curve to some data i have. The curves, for the most part, seem to fit very well. For some reason, pcov = inf when i print it off.
What i really need is to calculate the error associated with the parameters i'm fitting, and am not sure how exactly to do this even if it does give me the covariance matrix.
The model being fit to is:
def intensity(x,R_out,R_in,K_in,K_out,a,b,c):
K_in,K_out = abs(0.0),abs(K_out)
if x<=R_in:
return 2*R_out*(K_out*np.sqrt(1-x**2/R_out**2)-
(K_out-0.0)*np.sqrt(R_in**2/R_out**2-x**2/R_out**2)) + c
elif x>=R_in and x<=R_out:
return K_out*2*R_out*np.sqrt(1-x**2/R_out**2) + c
elif x>R_out:
return c
intensity_vec = np.vectorize(intensity)
def intensity_vec_self(x,R_out,R_in,K_in,K_out,a,b,c):
y = np.zeros(x.shape)
for i in range(len(y)):
y[i]=intensity_vec(x[i],R_out,R_in,K_in,K_out,a,b,c)
return y
and there are 400 data points, i can put that on here if you think it will help.
To summarize, i can't get curve_fit to print off my pcov and need help as to figure out why and if i can get it to do so.
Also, if it is a quick explanation i would like to know how to use the pcov array to attain the errors associated with my fit.
Thanks
The variance of parameters are the diagonal elements of the variance-co variance matrix, and the standard error is the square root of it. np.sqrt(np.diag(pcov))
Regarding getting inf, see and compare these two examples:
In [129]:
import numpy as np
def func(x, a, b, c, d):
return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5, 1)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print np.sqrt(np.diag(pcov))
[ inf inf inf inf]
And:
In [130]:
def func(x, a, b, c):
return a * np.exp(-b * x) + c
xdata = np.linspace(0, 4, 50)
y = func(xdata, 2.5, 1.3, 0.5)
ydata = y + 0.2 * np.random.normal(size=len(xdata))
popt, pcov = so.curve_fit(func, xdata, ydata)
print np.sqrt(np.diag(pcov))
[ 0.11097646 0.11849107 0.05230711]
In this extreme example, d has no effect on the function func, hence it will be associated with variance of +inf, or in another word, it can be just about any value. Removing d from func will get what will make sense.
In reality, if parameters are of very different scale, say:
def func(x, a, b, c, d):
#return a * np.exp(-b * x) + c
return a * np.exp(-b * x) + c + d*1e-10
You will also get inf due to float point overflow/underflow.
In your case, I think you never used a and b. So it is just like the first example here.
I've been trying to fit the amplitude, frequency and phase of a sine curve given some generated two dimensional toy data. (Code at the end)
To get estimates for the three parameters, I first perform an FFT. I use the values from the FFT as initial guesses for the actual frequency and phase and then fit for them (row by row). I wrote my code such that I input which bin of the FFT I want the frequency to be in, so I can check if the fitting is working well. But there's some pretty strange behaviour. If my input bin is say 3.1 (a non integral bin, so the FFT won't give me the right frequency) then the fit works wonderfully. But if the input bin is 3 (so the FFT outputs the exact frequency) then my fit fails, and I'm trying to understand why.
Here's the output when I give the input bins (in the X and Y direction) as 3.0 and 2.1 respectively:
(The plot on the right is data - fit)
Here's the output when I give the input bins as 3.0 and 2.0:
Question: Why does the non linear fit fail when I input the exact frequency of the curve?
Code:
#! /usr/bin/python
# For the purposes of this code, it's easier to think of the X-Y axes as transposed,
# so the X axis is vertical and the Y axis is horizontal
import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as optimize
import itertools
import sys
PI = np.pi
# Function which accepts paramters to define a sin curve
# Used for the non linear fit
def sineFit(t, a, f, p):
return a * np.sin(2.0 * PI * f*t + p)
xSize = 18
ySize = 60
npt = xSize * ySize
# Get frequency bin from user input
xFreq = float(sys.argv[1])
yFreq = float(sys.argv[2])
xPeriod = xSize/xFreq
yPeriod = ySize/yFreq
# arrays should be defined here
# Generate the 2D sine curve
for jj in range (0, xSize):
for ii in range(0, ySize):
sineGen[jj, ii] = np.cos(2.0*PI*(ii/xPeriod + jj/yPeriod))
# Compute 2dim FFT as well as freq bins along each axis
fftData = np.fft.fft2(sineGen)
fftMean = np.mean(fftData)
fftRMS = np.std(fftData)
xFreqArr = np.fft.fftfreq(fftData.shape[1]) # Frequency bins along x
yFreqArr = np.fft.fftfreq(fftData.shape[0]) # Frequency bins along y
# Find peak of FFT, and position of peak
maxVal = np.amax(np.abs(fftData))
maxPos = np.where(np.abs(fftData) == maxVal)
# Iterate through peaks in the FFT
# For this example, number of loops will always be only one
prevPhase = -1000
for col, row in itertools.izip(maxPos[0], maxPos[1]):
# Initial guesses for fit parameters from FFT
init_phase = np.angle(fftData[col,row])
init_amp = 2.0 * maxVal/npt
init_freqY = yFreqArr[col]
init_freqX = xFreqArr[row]
cntr = 0
if prevPhase == -1000:
prevPhase = init_phase
guess = [init_amp, init_freqX, prevPhase]
# Fit each row of the 2D sine curve independently
for rr in sineGen:
(amp, freq, phs), pcov = optimize.curve_fit(sineFit, xDat, rr, guess)
# xDat is an linspace array, containing a list of numbers from 0 to xSize-1
# Subtract fit from original data and plot
fitData = sineFit(xDat, amp, freq, phs)
sub1 = rr - fitData
# Plot
fig1 = plt.figure()
ax1 = fig1.add_subplot(121)
p1, = ax1.plot(rr, 'g')
p2, = ax1.plot(fitData, 'b')
plt.legend([p1,p2], ["data", "fit"])
ax2 = fig1.add_subplot(122)
p3, = ax2.plot(sub1)
plt.legend([p3], ['residual1'])
fig1.tight_layout()
plt.show()
cntr += 1
prevPhase = phs # Update guess for phase of sine curve
I've tried to distill the important parts of your question into this answer.
First of all, try fitting a single block of data, not an array. Once you are confident that your model is sufficient you can move on.
Your fit is only going to be as good as your model, if you move on to something not "sine"-like you'll need to adjust accordingly.
Fitting is an "art", in that the initial conditions can greatly change the convergence of the error function. In addition there may be more than one minima in your fits, so you often have to worry about the uniqueness of your proposed solution.
While you were on the right track with your FFT idea, I think your implementation wasn't quite correct. The code below should be a great toy system. It generates random data of the type f(x) = a0*sin(a1*x+a2). Sometimes a random initial guess will work, sometimes it will fail spectacularly. However, using the FFT guess for the frequency the convergence should always work for this system. An example output:
import numpy as np
import pylab as plt
import scipy.optimize as optimize
# This is your target function
def sineFit(t, (a, f, p)):
return a * np.sin(2.0*np.pi*f*t + p)
# This is our "error" function
def err_func(p0, X, Y, target_function):
err = ((Y - target_function(X, p0))**2).sum()
return err
# Try out different parameters, sometimes the random guess works
# sometimes it fails. The FFT solution should always work for this problem
inital_args = np.random.random(3)
X = np.linspace(0, 10, 1000)
Y = sineFit(X, inital_args)
# Use a random inital guess
inital_guess = np.random.random(3)
# Fit
sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit))
# Plot the fit
Y2 = sineFit(X, sol)
plt.figure(figsize=(15,10))
plt.subplot(211)
plt.title("Random Inital Guess: Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
# Use an improved "fft" guess for the frequency
# this will be the max in k-space
timestep = X[1]-X[0]
guess_k = np.argmax( np.fft.rfft(Y) )
guess_f = np.fft.fftfreq(X.size, timestep)[guess_k]
inital_guess[1] = guess_f
# Guess the amplitiude by taking the max of the absolute values
inital_guess[0] = np.abs(Y).max()
sol = optimize.fmin(err_func, inital_guess, args=(X,Y,sineFit))
Y2 = sineFit(X, sol)
plt.subplot(212)
plt.title("FFT Guess : Final Parameters: %s"%sol)
plt.plot(X,Y)
plt.plot(X,Y2,'r',alpha=.5,lw=10)
plt.show()
The problem is due to a bad initial guess of the phase, not the frequency. While cycling through the rows of genSine (inner loop) you use the fit result of the previous line as initial guess for the next row which does not work always. If you determine the phase from an fft of the current row and use that as initial guess the fit will succeed.
You could change the inner loop as follows:
for n,rr in enumerate(sineGen):
fftx = np.fft.fft(rr)
fftx = fftx[:len(fftx)/2]
idx = np.argmax(np.abs(fftx))
init_phase = np.angle(fftx[idx])
print fftx[idx], init_phase
...
Also you need to change
def sineFit(t, a, f, p):
return a * np.sin(2.0 * np.pi * f*t + p)
to
def sineFit(t, a, f, p):
return a * np.cos(2.0 * np.pi * f*t + p)
since phase=0 means that the imaginary part of the fft is zero and thus the function is cosine like.
Btw. your sample above is still lacking definitions of sineGen and xDat.
Without understanding much of your code, according to http://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.curve_fit.html:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
sub1, guess2)
should become:
(amp2, freq2, phs2), pcov = optimize.curve_fit(sineFit, tDat,
sub1, p0=guess2)
Assuming that tDat and sub1 are x and y, that should do the trick. But, once again, it is quite difficult to understand such a complex code with so many interlinked variables and no comments at all. A code should always be build from bottom up, meaning that you don't do a loop of fits when a single one is not working, you don't add noise until the code works to fit the non-noisy examples... Good luck!
By "nothing fancy" I meant something like removing EVERYTHING that is not related with the fit, and doing a simplified mock example such as:
import numpy as np
import scipy.optimize as optimize
def sineFit(t, a, f, p):
return a * np.sin(2.0 * np.pi * f*t + p)
# Create array of x and y with given parameters
x = np.asarray(range(100))
y = sineFit(x, 1, 0.05, 0)
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.05, 0.]
print optimize.curve_fit(sineFit, x, y, guess)[0]
The result of this is exactly the answer:
[1. 0.05 0.]
But if you change guess not too much, just enough:
# Give a guess and fit, printing result of the fitted values
guess = [1., 0.06, 0.]
print optimize.curve_fit(sineFit, x, y, guess)[0]
the result gives absurdly wrong numbers:
[ 0.00823701 0.06391323 -1.20382787]
Can you explain this behavior?
You can use curve_fit with a series of trigonometric functions, usually very robust and ajustable to the precision that you need just by increasing the number of terms... here is an example:
from scipy import sin, cos, linspace
def f(x, a0,s1,s2,s3,s4,s5,s6,s7,s8,s9,s10,s11,s12,
c1,c2,c3,c4,c5,c6,c7,c8,c9,c10,c11,c12):
return a0 + s1*sin(1*x) + c1*cos(1*x) \
+ s2*sin(2*x) + c2*cos(2*x) \
+ s3*sin(3*x) + c3*cos(3*x) \
+ s4*sin(4*x) + c4*cos(4*x) \
+ s5*sin(5*x) + c5*cos(5*x) \
+ s6*sin(6*x) + c6*cos(6*x) \
+ s7*sin(7*x) + c7*cos(7*x) \
+ s8*sin(8*x) + c8*cos(8*x) \
+ s9*sin(9*x) + c9*cos(9*x) \
+ s10*sin(9*x) + c10*cos(9*x) \
+ s11*sin(9*x) + c11*cos(9*x) \
+ s12*sin(9*x) + c12*cos(9*x)
from scipy.optimize import curve_fit
pi/2. / (x.max() - x.min())
x_norm *= norm_factor
popt, pcov = curve_fit(f, x_norm, y)
x_fit = linspace(x_norm.min(), x_norm.max(), 1000)
y_fit = f(x_fit, *popt)
plt.plot( x_fit/x_norm, y_fit )