I've got the following bit of Python (v2.7.14) code, which uses curve_fit from SciPy (v1.0.1) to find parameters for an exponential decay function. Most of the time, I get reasonable results. Occasionally though, I'll get some results which are completely out of my expected range, even though the found parameters will look fine when plotted against the original graph.
First, my understanding of the exponential decay formula comes from https://en.wikipedia.org/wiki/Exponential_decay which I've translated to Python as:
y = a * numpy.exp(-b * x) + c
Where:
a is the initial value of the data
b is the decay rate, the inverse of the time it takes the signal to fall to 1/e of its initial value
c is an offset, as I am dealing with non-negative values in my data which never reach zero
x is the current time
The script takes into account that non-negative data is being fitted and offsets the initial guess appropriately. But even without guessing, not offsetting, using max/min (instead of first/last values) and other random things I've tried, I cannot seem to get curve_fit to produce sensible values on the troublesome datasets.
My hypothesis is that the troublesome datasets don't have enough of a curve that can be fit without going way outside the realm of the data. I've looked at the bounds argument for curve_fit, and thought that might be a reasonable option. I'm unsure as to what would make good lower and upper bounds for the calculation, or if it is actually the option I am looking for.
Here is the code. The commented-out lines are other things I've tried.
#!/usr/local/bin/python
import numpy as numpy
from scipy.optimize import curve_fit
import matplotlib.pyplot as pyplot
def exponential_decay(x, a, b, c):
    return a * numpy.exp(-b * x) + c

def fit_exponential(decay_data, time_data, decay_time):
    # The start of the curve is offset by the last point, so subtract
    guess_a = decay_data[0] - decay_data[-1]
    #guess_a = max(decay_data) - min(decay_data)
    # The inverse of the time it takes the signal to reach 1/e becomes guess_b
    guess_b = 1/decay_time
    # Since this is non-negative data, above 0, we use the last data point as the baseline (c)
    guess_c = decay_data[-1]
    #guess_c = min(decay_data)
    guess = [guess_a, guess_b, guess_c]
    print "guess: {0}".format(guess)

    #popt, pcov = curve_fit(exponential_decay, time_data, decay_data, maxfev=20000)
    popt, pcov = curve_fit(exponential_decay, time_data, decay_data, p0=guess, maxfev=20000)
    #bound_lower = [0.05, 0.05, 0.05]
    #bound_upper = [decay_data[0]*2, guess_b * 10, decay_data[-1]]
    #print "bound_lower: {0}".format(bound_lower)
    #print "bound_upper: {0}".format(bound_upper)
    #popt, pcov = curve_fit(exponential_decay, time_data, decay_data, p0=guess, bounds=[bound_lower, bound_upper], maxfev=20000)

    a, b, c = popt
    print "a: {0}".format(a)
    print "b: {0}".format(b)
    print "c: {0}".format(c)

    plot_fit = exponential_decay(time_data, a, b, c)
    pyplot.plot(time_data, decay_data, 'g', label='Data')
    pyplot.plot(time_data, plot_fit, 'r', label='Fit')
    pyplot.legend()
    pyplot.show()
print "Gives reasonable results"
time_data = numpy.array([0.0,0.040000000000000036,0.08100000000000018,0.12200000000000011,0.16200000000000014,0.20300000000000007,0.2430000000000001,0.28400000000000003,0.32400000000000007,0.365,0.405,0.44599999999999995,0.486,0.5269999999999999,0.567,0.6079999999999999,0.6490000000000002,0.6889999999999998,0.7300000000000002,0.7700000000000002,0.8110000000000002,0.8510000000000002,0.8920000000000001,0.9320000000000002,0.9730000000000001])
decay_data = numpy.array([1.342146870531986,1.405586070225509,1.3439802492549762,1.3567811728250267,1.2666276377825874,1.1686375326985337,1.216119360088685,1.2022841507836042,1.1926979408026064,1.1544395213303447,1.1904416926531907,1.1054720201415882,1.112100683833435,1.0811434035632939,1.1221671794680403,1.0673295063196415,1.0036146509494743,0.9984005680821595,1.0134498134883763,0.9996920772051201,0.929782730581616,0.9646581154122312,0.9290690593684447,0.8907360533169936,0.9121560047238627])
fit_exponential(decay_data, time_data, 0.567)
print
print "Gives results that are way outside my expectations"
time_data = numpy.array([0.0,0.040000000000000036,0.08099999999999996,0.121,0.16199999999999992,0.20199999999999996,0.24300000000000033,0.28300000000000036,0.32399999999999984,0.3650000000000002,0.40500000000000025,0.44599999999999973,0.48599999999999977,0.5270000000000001,0.5670000000000002,0.6079999999999997,0.6479999999999997,0.6890000000000001,0.7290000000000001,0.7700000000000005,0.8100000000000005,0.851,0.8920000000000003,0.9320000000000004,0.9729999999999999,1.013,1.0540000000000003])
decay_data = numpy.array([1.4401611921948776,1.3720688158534153,1.3793465463227048,1.2939909686762128,1.3376345321949346,1.3352710161631154,1.3413634841956348,1.248705138603995,1.2914294791901497,1.2581763134585313,1.246975264018646,1.2006447776495062,1.188232179689515,1.1032789127515186,1.163294324147017,1.1686263160765304,1.1434009568472243,1.0511578409946472,1.0814520440570896,1.1035953824496334,1.0626893599266163,1.0645580326776076,0.994855722989818,0.9959891485338087,0.9394584009825916,0.949504060086646,0.9278639431146273])
fit_exponential(decay_data, time_data, 0.6890000000000001)
And here is the text output:
Gives reasonable results
guess: [0.4299908658081232, 1.7636684303350971, 0.9121560047238627]
a: 1.10498934435
b: 0.583046565885
c: 0.274503681044
Gives results that are way outside my expectations
guess: [0.5122972490802503, 1.4513788098693758, 0.9278639431146273]
a: 742.824622191
b: 0.000606308344957
c: -741.41398516
Most notably, in the second set of results the value for a is very large, c is equally large in the negative direction, and b is a very small decimal number.
Here is the graph of the first dataset, which gives reasonable results.
Here is the graph of the second dataset, which does not give good results.
Note that the graph itself plots correctly, though the line does not really have a good curve to it.
My questions:
Is my implementation of the exponential decay algorithm with curve_fit correct?
Are my initial guess parameters good enough?
Is the bounds parameter the correct solution for this problem? If so, what is a good way to determine lower and upper bounds?
Have I missed something here?
Again, thank you!
When you say that the second fit gives results that are "way outside" your expectations, and that although the second graph "plots correctly" the line "does not really have a good curve to it", you are on the right track to understanding what is going on. I think you are just missing a piece of the puzzle.
The second graph is fit pretty well by a curve that does look linear. That probably means that you don't really have enough change in your data (well, perhaps below the noise level) to detect that it is an exponential decay.
I would bet that if you printed out not only the best-fit values but also the uncertainties and correlations for the variables, you would see that the uncertainties are huge and some of the correlations are very close to 1. That may mean that, taking the uncertainties into account (and measurements always have uncertainties), the results might actually fit your expectation. It may also tell you that your data does not support an exponential decay very well.
You might also try other models for this data ("linear" comes to mind ;)) and compare goodness-of-fit statistics such as chi-square and Akaike information criterion.
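For instance, here is a rough sketch of such a comparison, assuming popt comes from the curve_fit call in the question (so this would sit inside fit_exponential) and using a straight line from numpy.polyfit as the competing model; the least-squares AIC form n*ln(RSS/n) + 2k is used:

# Rough model comparison: residual sum of squares and AIC for the exponential
# fit versus a straight line; smaller AIC is (loosely) better
resid_exp = decay_data - exponential_decay(time_data, *popt)
rss_exp = numpy.sum(resid_exp**2)

lin_coeffs = numpy.polyfit(time_data, decay_data, 1)
resid_lin = decay_data - numpy.polyval(lin_coeffs, time_data)
rss_lin = numpy.sum(resid_lin**2)

n = len(decay_data)
aic_exp = n * numpy.log(rss_exp / n) + 2 * 3  # exponential model has 3 parameters
aic_lin = n * numpy.log(rss_lin / n) + 2 * 2  # straight line has 2 parameters
print("RSS exp: {0}, RSS lin: {1}".format(rss_exp, rss_lin))
print("AIC exp: {0}, AIC lin: {1}".format(aic_exp, aic_lin))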
scipy.curve_fit does return the covariance matrix -- the pcov that you did not use in your example. Unfortunately, scipy.curve_fit does not convert these values into uncertainties and correlation values, and it does not attempt to return any goodness-of-fit statistics at all.
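As a minimal sketch, assuming popt and pcov come from the curve_fit call in fit_exponential, you can do that conversion yourself:

# Standard errors are the square roots of the diagonal of the covariance matrix
perr = numpy.sqrt(numpy.diag(pcov))
# Correlation matrix: covariance scaled by the outer product of the standard errors
pcorr = pcov / numpy.outer(perr, perr)
print("uncertainties: {0}".format(perr))
print("correlations:\n{0}".format(pcorr))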
To fully explain any fit to data, you need not only the best-fit values but also an estimate of the uncertainties for the variable parameters. And you need the goodness-of-fit statistics in order to determine if a fit is good, or at least whether one fit is better than another.
Related
I have this set of experimental data:
x_data = np.array([0, 2, 5, 10, 15, 30, 60, 120])
y_data = np.array([1.00, 0.71, 0.41, 0.31, 0.29, 0.36, 0.26, 0.35])
t = np.linspace(min(x_data), max(x_data), 151)
[scatter plot of the data]
I want to fit them with a curve that follows an exponential behaviour for t < t_lim and a linear behaviour for t > t_lim, where t_lim is a value that I can set as I want. I want to use curve_fit to find the best fit. I would like to find the best fit meeting these two conditions:
The end point of the first behaviour (exponential) must be the starting point of the second behaviour (linear): in other words, I don't want the jump discontinuity in the middle.
I would like the second behaviour (linear) to be descending.
I solved in this way:
t_lim = 15

def y(t, k, m, q):
    return np.concatenate((np.exp(-k*t)[t < t_lim], (m*t + q)[t >= t_lim]))

popt, pcov = curve_fit(y, x_data, y_data, p0=[0.5, -0.005, 0.005])
k_opt, m_opt, q_opt = popt
y_model = y(t, k_opt, m_opt, q_opt)
I obtain this kind of curve:
[plot of the resulting curve]
I don't know how to tell Python to find the best values of m, k, q that meet the two conditions (no jump discontinuity, and m < 0).
Instead of trying to add these conditions as explicit constraints, I'd go about modifying the form of y so that these conditions are always satisfied.
For example, try replacing m with -m**2. That way, the coefficient in the linear part will always be negative.
For the continuity condition, how about this: for an exponential with a given decay factor and a linear curve with a given slope, which are supposed to meet at a given t_lim, there's exactly one value of q that will satisfy that condition. You can explicitly compute that value and just plug it in.
Basically, q won't be a fit parameter anymore; instead, inside of y, you'd compute the correct q value based on k, m, t_lim.
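A minimal sketch of that reparameterisation (the name y_repar is made up here; it assumes t_lim, x_data and y_data as defined in the question, and uses np.where instead of np.concatenate so the ordering of t does not matter):

import numpy as np
from scipy.optimize import curve_fit

t_lim = 15

def y_repar(t, k, m):
    slope = -m**2                       # forces the linear part to descend
    q = np.exp(-k*t_lim) - slope*t_lim  # continuity: exp(-k*t_lim) == slope*t_lim + q
    return np.where(t < t_lim, np.exp(-k*t), slope*t + q)

popt, pcov = curve_fit(y_repar, x_data, y_data, p0=[0.5, 0.07])
k_opt, m_opt = popt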
This post is not a direct answer to the question. This is a preliminary study.
First: fitting to a simple exponential function plus a constant (without a decreasing or increasing linear part):
The result is not bad considering the wide scatter on the right part.
Second: fitting to an exponential function plus a linear function (without taking into account the expected decrease on the right).
The slope of the linear part is very low: 0.000361.
But the slope is positive, which is not what is wanted.
Since the scatter is very large, one suspects that the slope of the linear function might be governed mainly by the scatter. In order to check this hypothesis, the same fitting calculation was done without one point. Taking only the first seven points (that is, leaving out the eighth point), the result is:
Now the slope is negative, as wanted. But this is not a trustworthy result.
Of course, if some technical reason implies that the slope is necessarily negative, one could use a piecewise function made of an exponential and a linear function. But what is the credibility of such a model?
This does not directly answer the question. Nevertheless, I hope that this inspection will be of interest.
For information:
The usual nonlinear regression methods are often non-convergent in cases of large scatter, due to the difficulty of setting initial values of the parameters sufficiently close to the unknown correct values. To avoid this difficulty, the above fittings were made with an unusual method which does not require "guessed" initial values. For the principle, refer to: https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
In the referenced document, the case of an exponential plus a linear function is not fully treated. To overcome this deficiency, the method is shown below with the numerical calculation (MathCAD).
If more accuracy is needed, use nonlinear regression software with the values of p, a, b, c found above as initial values to start the iterative calculation.
I have some data which I want to apply a fit to and then perform a chi-squared test to get the goodness of the fit. It is obvious that the fit I'm applying doesn't fit the data very well (that in and of itself isn't a problem, I'm not necessarily expecting it to), but the values scipy.stats.chisquare is returning would suggest an almost perfect fit, which is clearly wrong.
What I've done so far is define a function describing the fit I'm applying (a sinusoidal fit), then using scipy.optimize.curve_fit to fit this function to my data by getting the fit parameters from popt then using them in the previously defined function to generate a fit.
I'm then taking the measured data and the fitted data and putting them into scipy.stats.chisquare in an attempt to get a goodness-of-fit measure, but it returns a p-value of 1.0, which cannot be right. My assumption is that there is some problem with using the values generated by scipy.optimize.curve_fit in scipy.stats.chisquare, but if that is the case I don't understand why that's a problem or how to work around it.
I have my measured data in two lists which I'm calling "time" and "rate" below
import numpy as np
import math
%matplotlib inline
import matplotlib.pyplot as plt
from statistics import stdev
import scipy.stats
from scipy.optimize import curve_fit

time = [309.6666666666667, 326.3333333333333, 334.6666666666667, 399.9166666666667, 416.5833333333333, 433.25, 449.91666666666663, 466.58333333333337, 483.25, 499.91666666666663]
rate = [0.298168, 0.29317, 0.306496, 0.249861, 0.241532, 0.241532, 0.206552, 0.249861, 0.253193, 0.239867]

def oscillation(t, A, C):
    return A*np.cos((2*np.pi*(t-x0))/t0) + C

t0 = 365.25
A = 0.35/2
x0 = 152.5
C = 0.475

popt, pcov = curve_fit(oscillation, time, rate, p0=[A, C])

rate_fit = []
for t in time:
    r = oscillation(t, popt[0], popt[1])
    rate_fit.append(r)

print(scipy.stats.chisquare(rate, f_exp=rate_fit))

plt.plot(time, rate, '.')
plt.plot(time, rate_fit, '--')
The output of the above is a fit which does look like the best fit to the data when plotted, but it is clearly not a perfect fit, which makes the other output, a p-value of 0.99999999999458533, clearly wrong.
You are only fitting for two parameters, A and C, thus forcing the phase and period.
If you also fit for the phase and period, you get a much better fit:
Also in this case, my p-value is 1.0.
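A minimal sketch of such a four-parameter fit, assuming time and rate from the question (the name oscillation4 is made up, and the starting values are just the question's guesses):

import numpy as np
from scipy.optimize import curve_fit

def oscillation4(t, A, C, x0, t0):
    # same model as in the question, but with the phase (x0) and period (t0) free
    return A*np.cos((2*np.pi*(t - x0))/t0) + C

popt, pcov = curve_fit(oscillation4, time, rate, p0=[0.35/2, 0.475, 152.5, 365.25])
A_fit, C_fit, x0_fit, t0_fit = popt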
The reason why your p-value is 1.0 when x0 and t0 are fixed is that your result is the best fit that can be made with those values for x0 and t0. Forcing those values will very likely produce an overall worse fit. For comparison, with x0 and t0 free, I get
A = -3.45840427e-02
C = 2.65142203e-01
x0 = 1.88838771e+02
t0 = 2.61112538e+02
Compare that to t0 = 365.25 and x0 = 152.5.
Of course, there are (physical) reasons that you want to fix e.g. t0 to a year, but in such a case, you should worry less that the plot looks bad; your p-value still takes this into account.
The more likely reason, however, is that you are also forgetting the ddof parameter in scipy.stats.chisquare. Its default is ddof=0, which is not what you have: in your case it's len(rate) - 2; in my case above, it would be len(rate) - 4.
For your fit (t0 and x0 fixed), that results in p = 0.902. With all parameters free, it results in 0.999887 (i.e., 1 again).
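A minimal sketch of the corrected call for the two-parameter fit, assuming rate and rate_fit from the question and using the ddof value suggested above:

from scipy import stats

# scipy.stats.chisquare computes the p-value with k - 1 - ddof degrees of freedom,
# where k = len(rate); the ddof here follows the suggestion above
print(stats.chisquare(rate, f_exp=rate_fit, ddof=len(rate) - 2))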
Bonus: output when I fix the period t0 to 365.25:
A = -4.05218922e-02
C = 2.74772524e-01
x0 = 8.69008279e+01
p = 0.997
and the plotted fit:
I know that there are some similar questions, but since none of them brought me any further, I decided to ask one of my own.
I am sorry, if the answer to my problem is already somewhere out there, but I really couldn't find it.
I tried fitting f(x) = a*x**b to rather linear data using curve_fit. It runs without errors, but the result is way off, as shown below:
The thing is that I don't really know what I am doing; on the other hand, fitting is always more of an art than a science, and there has been at least one general bug in scipy.optimize.
My data looks like this:
x-values:
[16.8, 2.97, 0.157, 0.0394, 14.000000000000002, 8.03, 0.378, 0.192, 0.0428, 0.029799999999999997, 0.000781, 0.0007890000000000001]
y-values:
[14561.766666666666, 7154.7950000000001, 661.53750000000002, 104.51446666666668, 40307.949999999997, 15993.933333333332, 1798.1166666666666, 1015.0476666666667, 194.93800000000002, 136.82833333333332, 9.9531566666666684, 12.073133333333333]
That's my code (using a really nice example in the last answer to that question):
import numpy as np
from scipy.optimize import curve_fit

# xvalues and yvalues are the x- and y-values listed above, as numpy arrays
xvalues = np.array([16.8, 2.97, 0.157, 0.0394, 14.000000000000002, 8.03, 0.378, 0.192, 0.0428, 0.029799999999999997, 0.000781, 0.0007890000000000001])
yvalues = np.array([14561.766666666666, 7154.7950000000001, 661.53750000000002, 104.51446666666668, 40307.949999999997, 15993.933333333332, 1798.1166666666666, 1015.0476666666667, 194.93800000000002, 136.82833333333332, 9.9531566666666684, 12.073133333333333])

def func(x, p0, p1): # HERE WE DEFINE A FUNCTION THAT WE THINK WILL FOLLOW THE DATA DISTRIBUTION
    return p0*(x**p1)

# Here you give the initial parameters for p0 which Python then iterates over to find the best fit
popt, pcov = curve_fit(func, xvalues, yvalues, p0=(1.0,1.0)) #p0=(3107,0.944)) #THESE PARAMETERS ARE USER DEFINED
print(popt) # This contains your two best fit parameters

# Performing sum of squares
p0 = popt[0]
p1 = popt[1]
residuals = yvalues - func(xvalues, p0, p1)
fres = sum(residuals**2)
print 'chi-square'
print(fres) #THIS IS YOUR CHI-SQUARE VALUE!

xaxis = np.linspace(5e-4, 20) # we can plot with xdata, but fit will not look good
curve_y = func(xaxis, p0, p1)
The starting values come from a fit with gnuplot that is plausible, but I need to cross-check.
This is printed output (first fitted p0, p1, then chi-square):
[ 4.67885857e+03 6.24149549e-01]
chi-square
424707043.407
I guess this is a difficult question, therefore much thanks in advance!
When fitting, curve_fit minimizes the sum of (data - model)^2 / (error)^2.
If you don't pass in errors (as you are doing here) curve_fit assumes that all of the points have an error of 1.
In this case, as your data spans many orders of magnitude, the points with the largest y values dominate the objective function, causing curve_fit to fit them at the expense of the others.
The best way of fixing this would be to include the errors on your yvalues in the fit (it looks like you have them, as there are error bars in the plot you have made!). You can do this by passing them in as the sigma parameter of curve_fit.
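A minimal sketch of that, assuming func, xvalues and yvalues as in the question; y_errors is a hypothetical array holding one standard deviation per point:

from scipy.optimize import curve_fit

# y_errors: hypothetical per-point uncertainties on yvalues
popt, pcov = curve_fit(func, xvalues, yvalues, p0=(1.0, 1.0),
                       sigma=y_errors, absolute_sigma=True)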
I would rethink the experimental part. Two datapoints are questionable:
The image you showed us looks pretty good because you took the log:
You could do a linear fit on log(x) and log(y). In this way you might limit the impact of the largest residuals. Another approach would be robust regression (RANSAC from sklearn or least_squares from scipy).
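For example, a minimal sketch of the log-log straight-line fit, assuming xvalues and yvalues as in the question and using np.polyfit for the line:

import numpy as np

# Straight-line fit in log-log space: log(y) = p1*log(x) + log(p0)
slope, intercept = np.polyfit(np.log(xvalues), np.log(yvalues), 1)
p1_est = slope              # exponent in y = p0 * x**p1
p0_est = np.exp(intercept)  # prefactor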
Nevertheless you should either gather more datapoints or repeat the measurements.
I want to fit a histogram by the sum of two Gaussians, both with different amplitude, mean and deviation. To do that, I have used scipy's curve_fit, but the KS-test afterwards was awful. That was mostly because the first few values (as in the most negative x values) were not very accurate, and therefore the cumulative function was way off. I also noted that the cumulative function was off by 20%, and therefore an accurate outcome of the KS-test is impossible.
Then I tried to make a constraint to the integrand, following this question. The relevant code I got is the following (without importing and plotting part):
def residuals(p, x, y):
    integral = quad(gauss2, -300, 300, args=(p[0], p[1], p[2], p[3], p[4], p[5]))[0]
    penalization = abs(1-integral)*10000
    print penalization
    return y - gauss2(x, p[0], p[1], p[2], p[3], p[4], p[5]) - penalization

def gauss2(x, A, mu, sigma, A2, mu2, sigma2):
    if A2 < 0:
        return 1000
    return A*np.exp(-(x-mu)**2/(2.*sigma**2)) + A2*np.exp(-(x-mu2)**2/(2.*sigma2**2))

hist, bin_edges = np.histogram(data, normed=True, bins=bins)
hist_cm = np.cumsum(hist)
bin_centres = (bin_edges[:-1] + bin_edges[1:])/2

coeff, pcov2 = leastsq(residuals, x0=(0.01, 0., 60., 0.01, 150., 40.), args=(bin_centres, hist))

hist_fit = gauss2(bin_centres, *coeff)
hist_fit_cm = np.cumsum(hist_fit)
KStest = stats.ks_2samp(hist_cm, hist_fit_cm)
This results in a pretty good estimate, and a P-value of 0.629. As far as I know, this means that the histogram and the fit have a 62.9% chance of coming from the same data; is this correct?
Now I thought that I could improve the answer by not penalising for the integrand, but for the P-value. For this I replaced the residuals function with the following:
def residuals(p, x, y):
    global bin_centres  # it's defined globally, so this should be fine
    iets = np.cumsum(gauss2(bin_centres, p[0], p[1], p[2], p[3], p[4], p[5]))
    pizza = stats.ks_2samp(np.cumsum(y), iets)[1]
    penalization = 1000*(1-pizza)
    return y - gauss2(x, p[0], p[1], p[2], p[3], p[4], p[5]) - penalization
Since the P-value (which I call pizza) should reach as close to 1 as possible, the penalization becomes smaller with a higher P-value. But this gives results which make less sense: the P-value turns out to be 0.160. When plotting it's even worse: two spikes, instead of the smooth fit I obtained with the first method.
Is a KS-test a good penalisation method, instead of the integrand? How can implement it in a good way then?
(A brief answer, as far as I understand from reading the code.)
The first penalization penalization = abs(1-integral)*10000 is a constraint on the total integral. I think this is the same as imposing A + A2 == 1 so the mixture in gauss2 integrates to one. An alternative without constraints would be to impose this directly by, for example, using a Logit function for the mixing probability.
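A minimal sketch of what that could look like (the name gauss2_mix is made up; unlike gauss2 in the question, each component here is a normalised density, so the mixture integrates to one by construction):

import numpy as np

def gauss2_mix(x, logit_w, mu, sigma, mu2, sigma2):
    # mixing weight kept in (0, 1) by the logistic (inverse-logit) transform
    w = 1.0/(1.0 + np.exp(-logit_w))
    g1 = np.exp(-(x - mu)**2/(2.*sigma**2))/(np.sqrt(2*np.pi)*sigma)
    g2 = np.exp(-(x - mu2)**2/(2.*sigma2**2))/(np.sqrt(2*np.pi)*sigma2)
    # each component is a normalised density, so the mixture integrates to one
    return w*g1 + (1 - w)*g2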
The Kolmogorov-Smirnov penalization uses an L1 distance and penalizes the largest deviation between the empirical and the parametric cdf, approximately (*)
L1 = np.max(np.abs(np.cumsum(y) - iets))
The p-value is just a monotonic transformation of the L1 distance, but will have a different curvature and will penalize differently.
(*) The actual calculation looks at all the step points directly.
As aside: The Kolmogorov-Smirnov test is designed for continuous not for discrete or binned variables. The appropriate distance measure would be based on chi-square test or power divergence. However, this only affects ks_2samp as a hypothesis test, and not if we just use it as a distance measure.
Another aside: the use of integrate.quad could be replaced by using norm.cdf directly.
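For example, a minimal sketch of that replacement (the name gauss2_integral is made up; it computes the same integral that quad() evaluated for gauss2, but in closed form):

import numpy as np
from scipy.stats import norm

def gauss2_integral(A, mu, sigma, A2, mu2, sigma2, lo=-300, hi=300):
    # integral of A*exp(-(x-mu)^2/(2*sigma^2)) over (lo, hi) is
    # A*sigma*sqrt(2*pi)*(Phi(hi) - Phi(lo)), and likewise for the second component
    part1 = A*sigma*np.sqrt(2*np.pi)*(norm.cdf(hi, mu, sigma) - norm.cdf(lo, mu, sigma))
    part2 = A2*sigma2*np.sqrt(2*np.pi)*(norm.cdf(hi, mu2, sigma2) - norm.cdf(lo, mu2, sigma2))
    return part1 + part2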
I've been trying to fit some histogram data with scipy.optimize.curve_fit, but so far I haven't once been able to produce fit parameters that differ significantly from my guess parameters.
I wouldn't be terribly surprised to find that the more arcane parameters in my fit get stuck in local minima, but even linear coefficients won't move from my initial guesses!
If you've seen anything like this before, I'd love some advice. Do least-squares minimization routines just not work for certain classes of functions?
I tried this:
import numpy as np
from matplotlib.pyplot import *
from scipy.optimize import curve_fit

def grating_hist(x,frac,xmax,x0):
    # model data to be turned into a histogram
    dx = x[1]-x[0]
    z = np.linspace(0,1,20000,endpoint=True)
    grating = np.cos(frac*np.pi*z)
    norm_grating = xmax*(grating-grating[-1])/(1-grating[-1])+x0
    # produce the histogram
    bin_edges = np.append(x,x[-1]+x[1]-x[0])
    hist,bin_edges = np.histogram(norm_grating,bins=bin_edges)
    return hist

x = np.linspace(0,5,512)
p_data = [0.7,1.1,0.8]
pct = grating_hist(x,*p_data)
p_guess = [1,1,1]
p_fit,pcov = curve_fit(grating_hist,x,pct,p0=p_guess)

plot(x,pct,label='Data')
plot(x,grating_hist(x,*p_fit),label='Fit')
legend()
show()

print 'Data Parameters:', p_data
print 'Guess Parameters:', p_guess
print 'Fit Parameters:', p_fit
print 'Covariance:',pcov
and I see this: http://i.stack.imgur.com/GwXzJ.png (I'm new here, so I can't post images)
Data Parameters: [0.7, 1.1, 0.8]
Guess Parameters: [1, 1, 1]
Fit Parameters: [ 0.97600854 0.99458336 1.00366634]
Covariance: [[ 3.50047574e-06 -5.34574971e-07 2.99306123e-07]
[ -5.34574971e-07 9.78688795e-07 -6.94780671e-07]
[ 2.99306123e-07 -6.94780671e-07 7.17068753e-07]]
Whaaa? I'm pretty sure this isn't a local minimum for variations in xmax and x0, and it's a long way from the global minimum best fit. The fit parameters still don't change, even with better guesses. Different choices for curve functions (e.g. the sum of two normal distributions) do produce new parameters for the same data, so I know it's not the data itself. I also tried the same thing with scipy.optimize.leastsq itself just in case, but no dice; the parameters still don't move. If you have any thoughts on this, I'd love to hear them!
The problem you're facing is actually not due to curve_fit (or leastsq). It is due to the landscape of the objective of your optimisation problem. In your case the objective is the sum of squared residuals, which you are trying to minimise. Now, if you look closely at your objective in the immediate vicinity of your initial conditions, for example using the code below, which only focuses on the first parameter:
import matplotlib.pyplot as py

p_ind = 0
eps = 1e-6
n_points = 100
frac_surroundings = np.linspace(p_guess[p_ind] - eps, p_guess[p_ind] + eps, n_points)
obj = []
temp_guess = list(p_guess)  # a copy of the initial guess
for p in frac_surroundings:
    temp_guess[0] = p
    obj.append(((grating_hist(x, *p_data) - grating_hist(x, *temp_guess))**2.0).sum())
py.plot(frac_surroundings, obj)
py.show()
you will notice that the landscape is piecewise constant (you can easily check that the situation is the same for the other parameters). The problem is that these pieces are of the order of 10^-6, whereas the initial step of the fitting procedure is somewhere around 10^-8, hence the procedure ends quickly, concluding that you cannot improve on the given initial condition. You could try to fix it by changing the epsfcn parameter in curve_fit, but you would quickly notice that the landscape, on top of being piecewise constant, is also very "rugged". In other words, curve_fit is simply not well suited for such a problem, which is difficult for gradient-based methods, as it is highly non-convex. Probably some stochastic optimisation methods could do a better job. That is, however, a different question/problem.
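For reference, a minimal sketch of the epsfcn experiment mentioned above (the value 1e-4 is an arbitrary illustration; curve_fit simply forwards epsfcn to leastsq):

from scipy.optimize import curve_fit

# larger finite-difference step so the numerical gradient steps over the flat plateaus
p_fit, pcov = curve_fit(grating_hist, x, pct, p0=p_guess, epsfcn=1e-4)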
I think it is a local minimum, or the algorithm fails for a non-trivial reason. It is far easier to fit the data to the input, instead of fitting the statistical description of the data to the statistical description of the input.
Here's a modified version of the code doing so:
z = np.linspace(0,1,20000,endpoint=True)

def grating_hist_indicator(x,frac,xmax,x0):
    # model data, without turning it into a histogram
    dx = x[1]-x[0]
    grating = np.cos(frac*np.pi*z)
    norm_grating = xmax*(grating-grating[-1])/(1-grating[-1])+x0
    return norm_grating

x = np.linspace(0,5,512)
p_data = [0.7,1.1,0.8]
pct = grating_hist(x,*p_data)
pct_indicator = grating_hist_indicator(x,*p_data)
p_guess = [1,1,1]
p_fit,pcov = curve_fit(grating_hist_indicator,x,pct_indicator,p0=p_guess)

plot(x,pct,label='Data')
plot(x,grating_hist(x,*p_fit),label='Fit')
legend()
show()