I want to fit a Gaussian to a curve using Python. I found a solution on this site, but it only seems to work for an n-shaped Gaussian, not for a u-shaped one.
Here is the code:
import pylab, numpy
from scipy.optimize import curve_fit
x=numpy.array(range(10))
y=numpy.array([0,1,2,3,4,5,4,3,2,1])
n=len(x)
mean=sum(y)/n
sigma=sum(y-mean)**2/n
def gaus(x,a,x0,sigma):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2))
popt, pcov=curve_fit(gaus,x,y,p0=[1,mean,sigma])
pylab.plot(x,y,'r-',x,y,'ro')
pylab.plot(x,gaus(x,*popt),'k-',x,gaus(x,*popt),'ko')
pylab.show()
The code fits a Gaussian to an n-shaped curve, but if I change y to y=numpy.array([5,4,3,2,1,2,3,4,5,6]) it raises an error: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
What do I have to change in the code to fit a u-shaped Gaussian?
Thanks.
The functional form of your fit is wrong. A Gaussian is expected to go to 0 in the tails regardless of whether it is n- or u-shaped, but your data go to ~5 there.
If you introduce an offset into your equation and choose reasonable initial values, it works. See the code below:
import pylab, numpy
from scipy.optimize import curve_fit
x=numpy.array(range(10))
y=numpy.array([5,4,3,2,1,2,3,4,5,6])
n=len(x)
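# floor division keeps these starting values nonzero; with true division
# mean is 3.5, sum(y-mean) is 0, and sigma would start at 0
mean=sum(y)//n
sigma=sum(y-mean)**2//n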
def gaus(x,a,x0,sigma,c):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2))+c
popt, pcov=curve_fit(gaus,x,y,p0=[-1,mean,sigma,-5])
pylab.plot(x,y,'r-',x,y,'ro')
pylab.plot(x,gaus(x,*popt),'k-',x,gaus(x,*popt),'ko')
pylab.show()
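A minimal sketch of the same offset idea with starting values taken from the data instead of hard-coded. The synthetic data, the helper name guess_dip, and its heuristics (offset from the maximum, depth from minimum minus maximum, width as a quarter of the x range) are assumptions for illustration, not part of the answer above:

```python
import numpy as np
from scipy.optimize import curve_fit

def gaus(x, a, x0, sigma, c):
    # Gaussian of amplitude a (negative for a u shape) on a constant offset c
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + c

def guess_dip(x, y):
    # Hypothetical helper: rough starting values for a u-shaped curve
    c0 = y.max()                         # tails sit near the largest values
    a0 = y.min() - c0                    # depth of the dip (negative)
    x00 = x[np.argmin(y)]                # position of the dip
    sigma0 = (x.max() - x.min()) / 4.0   # crude width estimate (assumption)
    return [a0, x00, sigma0, c0]

# Synthetic, noise-free u-shaped data (assumed values, not the question's data)
x = np.linspace(0, 9, 50)
y = gaus(x, -4.0, 4.0, 1.5, 5.0)

popt, pcov = curve_fit(gaus, x, y, p0=guess_dip(x, y))
print(popt)
```

The sign of the fitted sigma is not constrained by the model, so it is safer to compare abs(popt[2]) than popt[2] itself.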
Perhaps you could invert your values, fit to a 'n' shaped gaussian, and then invert the gaussian.
Related
I have tried to implement a Gaussian fit in Python with the given data. However, I am unable to obtain the desired fit. Any suggestions would help.
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt
from scipy.optimize import curve_fit
from numpy import asarray as ar, exp
xData=ar([-7.66E-06,-7.60E-06,-7.53E-06,-7.46E-06,-7.40E-06,-7.33E-06,-7.26E-06,-7.19E-06,-7.13E-06,-7.06E-06,-6.99E-06,
-6.93E-06,-6.86E-06,-6.79E-06,-6.73E-06,-6.66E-06,-6.59E-06,-6.52E-06,-6.46E-06,-6.39E-06,-6.32E-06,-6.26E-06,-6.19E-06,
-6.12E-06,-6.06E-06,-5.99E-06,-5.92E-06,-5.85E-06,-5.79E-06,-5.72E-06])
yData=ar([17763,2853,3694,4203,4614,4984,5080,7038,6905,8729,11687,13339,14667,16175,15953,15342,14340,15707,13001,10982,8867,6827,5262,4760,3869,3232,2835,2746,2552,2576])
#plot the data points
plt.plot(xData,yData,'bo',label='experimental_data')
plt.show()
#define the function we want to fit the plot into
# Define the Gaussian function
n = len(xData)
mean = sum(xData*yData)/n
sigma = np.sqrt(sum(yData*(xData-mean)**2)/n)
def Gauss(x,I0,x0,sigma,Background):
    return I0*exp(-(x-x0)**2/(2*sigma**2))+Background
popt,pcov = curve_fit(Gauss,xData,yData,p0=[1,mean,sigma, 0.0])
print(popt)
plt.plot(xData,yData,'b+:',label='data')
plt.plot(xData,Gauss(xData,*popt),'ro:',label='fit')
plt.legend()
plt.title('Gaussian_Fit')
plt.xlabel('x-axis')
plt.ylabel('PL Intensity')
plt.show()
When computing mean and sigma, divide by sum(yData), not n.
mean = sum(xData*yData)/sum(yData)
sigma = np.sqrt(sum(yData*(xData-mean)**2)/sum(yData))
The reason is that the mean, for example, has to be the average of xData weighted by yData. For this, yData must be normalized to sum to 1, i.e., you multiply xData by yData / sum(yData) and take the sum.
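As a small worked check of the normalization described above (a sketch; the function name weighted_moments and the three-point data are made up for illustration):

```python
import numpy as np

def weighted_moments(x, w):
    # Normalize the weights to sum to 1, then take the first two moments
    p = w / np.sum(w)
    mean = np.sum(x * p)
    sigma = np.sqrt(np.sum(p * (x - mean)**2))
    return mean, sigma

x = np.array([1.0, 2.0, 3.0])
w = np.array([1.0, 2.0, 1.0])   # plays the role of yData

mean, sigma = weighted_moments(x, w)
naive_mean = np.sum(x * w) / len(x)   # dividing by n instead of sum(w)

print(mean, sigma, naive_mean)
```

Here the weighted mean is 2 (the centre of the symmetric weights), while dividing by n gives 8/3, which is not even the centre of the data.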
With the correction by j1-lee and removing the first point which clearly doesn't agree with the Gaussian model, the fit looks like this:
Removing the bin that clearly doesn't belong in the fit reduces the fitted width by some 20% and the (fitted) noise to background ratio by some 30%. The mean is only marginally affected.
I have access to SciPy and want to compute the FFT of a simple Gaussian function, exp(-t^2). It is well known that the Fourier transform of exp(−t^2) is √π exp(−π^2*k^2), but the FFT of exp(-t^2) that I get is not the same as √π exp(−π^2*k^2).
I have tried the following code:
import scipy.fftpack as fft
from scipy import integrate
import numpy as np
import matplotlib.pyplot as plt
#FFT
N=int(1e+3)
T=0.01 #sample period
t = np.linspace(0,N*T, N)
h=np.exp(-t**2)
H_shift=2*np.abs(fft.fftshift(np.fft.fft(h)/N))
freq=fft.fftshift(fft.fftfreq(h.shape[0],t[1]-t[0]))
#Comparing FFT with fourier transform
def f(x):
    return np.exp(-x**2)
def F(k):
    return (np.pi**0.5)*np.exp((-np.pi**2)*(k**2))
plt.figure(num=1)
plt.plot(freq,F(freq),label=("Fourier Transform"))
plt.legend()
plt.figure(num=2)
plt.plot(freq,H_shift,label=("FFT"))
plt.legend()
plt.show()
#Checking Parseval's theorem
S_h=integrate.simps(h**2,t)
#0.62665690150683084
S_H_s=integrate.simps(H_shift**2,freq)
#0.025215875346935791
S_F=integrate.simps(F(freq)**2,freq)
#1.2533141373154999
The two graphs are not the same, and the FFT values do not satisfy Parseval's theorem. It should be S_H_s=S_h*2, but my result is not. I think S_H_s, the result of the FFT, is the wrong value, because S_F=S_h*2.
Is there a problem in my code? Any help is greatly appreciated. Thanks in advance.
I suggest you plot your input signal h and verify that it looks like a Gaussian.
Spoiler alert: it doesn't, it is half a Gaussian!
By cutting it like this, you introduce a lot of high frequencies that you see in your plot.
To do this experiment correctly, follow this recipe to create your input signal:
t = np.linspace(-(N/2)*T,(N/2-1)*T, N)
h = np.exp(-t**2)
h = fft.ifftshift(h)
The ifftshift function serves to move the t=0 location to the leftmost array element. Note that t here is constructed carefully such that t=0 is exactly in the right place for this to work correctly, assuming an even-sized N. You can verify that fft.ifftshift(t)[0] is 0.0.
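Putting the recipe together, here is a hedged sketch that compares the FFT with the analytic transform. It assumes the convention F(k) = ∫ f(t) exp(-2πikt) dt, under which the transform of exp(-t^2) is √π exp(-π^2 k^2), and it scales the FFT by the sample period T (rather than by 2/N) so that the sum approximates the integral:

```python
import numpy as np

N = 1000    # number of samples (even), as in the question
T = 0.01    # sample period, as in the question

# Symmetric time axis with t = 0 at index N//2
t = np.linspace(-(N / 2) * T, (N / 2 - 1) * T, N)
h = np.exp(-t**2)

# Move t = 0 to index 0, transform, and scale by T to approximate the integral
H = T * np.fft.fft(np.fft.ifftshift(h))
freq = np.fft.fftfreq(N, d=T)

# Analytic transform under the convention stated above
F = np.sqrt(np.pi) * np.exp(-(np.pi * freq)**2)
max_err = np.max(np.abs(H - F))

# Parseval check: energy in time vs. energy in frequency (bin width 1/(N*T))
energy_t = T * np.sum(h**2)
energy_f = np.sum(np.abs(H)**2) / (N * T)

print(max_err, energy_t, energy_f)
```

With this scaling both energies come out close to sqrt(pi/2) ≈ 1.2533, which is the S_F value quoted in the question. For plotting, np.fft.fftshift can be applied to both freq and H.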
I have a set of points in the first quadrant that look like a Gaussian, and I am trying to fit them with a Gaussian in Python. My code is as follows:
import pylab as plb
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from numpy import asarray as ar,exp
import math
x=ar([37,69,157,238,274,319,391,495,533,626,1366,1855,2821,3615,4130,4374,6453,6863,7021,
7951,8646,9656,10464,11400])
y=ar([1.77,1.67,1.65,1.17,1.34,1.46,0.75,1,0.8,1.02,0.65,0.69,0.44,0.44,0.55,0.43,0.75,0.27,0.26,
0.44,0.04,0.44,0.26,0.04])
n = 24 #the number of data
mean = sum(x*y)/n #note this correction
sigma = math.sqrt(sum(y*(x-mean)**2)/n) #note this correction
def gaus(x,a,x0,sigma):
    return a*exp(-(x-x0)**2/(2*sigma**2))
popt,pcov = curve_fit(gaus,x,y,p0=None, sigma=None) #'''p0=[1,mean,sigma]'''
plt.plot(x,y,'b+:',label='data')
plt.plot(x,gaus(x,*popt),'ro:',label='fit')
plt.legend()
plt.title('Fig. 3 - Fit for Time Constant')
plt.xlabel('Time (s)')
plt.ylabel('Voltage (V)')
plt.show()
And the output is this figure:
http://s2.postimg.org/wevggkc95/Workspace_1_022.png
Why are all the red points at the bottom? Note that I am interested in a half Gaussian, because my data look like that: my y values are large at first and then decrease like one side of the Gaussian bell. Can anyone tell me how to fit this curve in Python (or what to use instead if it cannot be fit with a Gaussian)? In other words, I want code that fits one half of a Gaussian to my points (in the first quadrant only). Note that my points cannot be fit by an exponentially decreasing curve; I tried that earlier, and it does not fit well at low x values.
Apparently your data do not fit well or easily to a Gaussian function. You use the default initial guesses for p0 = [1,1,1] which is so far away from any kind of optimal choice that curve_fit gives up before it gets started (check the values of popt=[1,1,1] and pcov=[inf, inf, inf]). You could try with better guesses (e.g. p0 = [2,0, 2000]), but on my system it won't converge: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
To fit a "half-Gaussian", don't float the centre position x0 (just leave it equal to 0):
def gaus(x,a,sigma):
    return a*exp(-(x)**2/(2*sigma**2))
p0 = [1.2, 4000]
popt,pcov = curve_fit(gaus,x,y,p0=p0)
Unless you have a particular reason for wanting to fit a Gaussian, why not do a more robust linear least squares fit to a polynomial, e.g.:
pfit = np.polyfit(x, y, 3)
poly = np.poly1d(pfit)
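To compare the two suggestions on the data from the question, one could look at the residual sum of squares of each fit. This is only a sketch: the name half_gaus, the rescaling of x to thousands for the polynomial (to keep the powers of x well conditioned), and the use of the residual sum as the yardstick are my assumptions:

```python
import numpy as np
from scipy.optimize import curve_fit

x = np.array([37, 69, 157, 238, 274, 319, 391, 495, 533, 626, 1366, 1855,
              2821, 3615, 4130, 4374, 6453, 6863, 7021, 7951, 8646, 9656,
              10464, 11400], dtype=float)
y = np.array([1.77, 1.67, 1.65, 1.17, 1.34, 1.46, 0.75, 1, 0.8, 1.02, 0.65,
              0.69, 0.44, 0.44, 0.55, 0.43, 0.75, 0.27, 0.26, 0.44, 0.04,
              0.44, 0.26, 0.04])

def half_gaus(x, a, sigma):
    # Gaussian with its centre fixed at x = 0
    return a * np.exp(-x**2 / (2 * sigma**2))

# Half-Gaussian fit with the starting values suggested above
popt, pcov = curve_fit(half_gaus, x, y, p0=[1.2, 4000])
rss_gauss = np.sum((y - half_gaus(x, *popt))**2)

# Cubic polynomial fit on rescaled x (assumption: units of 1000)
xs = x / 1000.0
poly = np.poly1d(np.polyfit(xs, y, 3))
rss_poly = np.sum((y - poly(xs))**2)

print(popt, rss_gauss, rss_poly)
```

Keep in mind that the cubic has four free parameters and the half-Gaussian only two, so a smaller residual for the polynomial does not by itself mean it is the better model.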
I've been working on this for the last few days and I still can't see where the problem is.
I'm trying to weight a function of two variables, f(q,r), with a Gaussian distribution g(r) that has a specific mean value (R0) and standard deviation (sigma). This is needed because the theoretical function f(q) has a certain dispersity in its r variable when analyzed experimentally. Therefore, we use a probability density function to weight our function in the r variable.
I include the code, which runs but doesn't give the expected result: the weighted curve should become smoother as the polydispersity grows (higher sigma). As you can see, I integrate the product of the two functions, f(r,q)*g(r), from r = 0 to r = +inf.
The result is plotted to compare the weighted result with the unweighted function:
from scipy.integrate import quad
import numpy as np
import math as m
import matplotlib.pyplot as plt
#function weighted with a probability density function (gaussian)
def integrand(r,q):
    #gaussian function normalized
    def gauss_nor(r):
        #gaussian function
        def gauss(r):
            return m.exp(-((r-R0)**2)/(2*sigma**2))
        return (m.exp(-((r-R0)**2)/(2*sigma**2)))/(quad(gauss,0,np.inf)[0])
    #function f(r,q)
    def f(r,q):
        return 3*(np.sin(q*r)-q*r*np.cos(q*r))/((r*q)**3)
    return gauss_nor(r)*f(r,q)
#quadratic integration of the integrand (from 0 to +inf)
#integrand is function*density_function (gauss)
def function(q):
    return quad(integrand, 0, np.inf, args=(q))[0]
#parameters used in the function
R0=20
sigma=5
#range to plot q
q=np.arange(0.001,2.0,0.005)
#vector where the result of the integral will be saved
function_vec = np.vectorize(function)
#vector for the squared power of the integral
I=[]
I=(function_vec(q))**2
#function without density function
I0=[]
I0=(3*(np.sin(q*R0)-q*R0*np.cos(q*R0))/((R0*q)**3))**2
#plot of weighted and non-weighted functions
p1,=plt.plot(q,I,'b')
p3,=plt.plot(q,I0,'r')
plt.legend([p1,p3],('Weighted','No weighted'))
plt.yscale('log')
plt.xscale('log')
plt.show()
Thank you very much. I've been stuck on this problem for some days already and I haven't found the mistake.
Maybe somebody knows an easier way to weight a function with a PDF.
I simplified your code; the output is the same as yours. I think it's already very smooth. There are some very sharp peaks in the log-log graph, but only because the curve has zeros. So it's not smooth in a log-log graph, but it is smooth in a normal X-Y graph.
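import numpy as np
import matplotlib.pyplot as plt
def gauss(r):
    return np.exp(-((r-R0)**2)/(2*sigma**2))
def f(r,q):
    return 3*(np.sin(q*r)-q*r*np.cos(q*r))/((r*q)**3)
R0=20
sigma=5
#q along rows, r along columns, so f(rm, qm) broadcasts to a 2-D grid
qm, rm = np.ogrid[0.001:2.0:0.005, 0.001:40:1000j]
gr = gauss(rm)
gr /= np.sum(gr)  #normalize the weights to sum to 1
fm = f(rm, qm)
fm *= gr
#sum over r, then square the weighted amplitude
plt.plot(qm.ravel(), fm.sum(axis=1)**2)
plt.yscale('log')
plt.xscale('log')
plt.show()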
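To see how the result changes with sigma, the same weighted sum can be wrapped in a function. This is a sketch, not a definitive implementation: the names form_factor and weighted_intensity, the r grid spanning R0 ± 4 sigma (clipped to stay positive), and the choice of sigma values are assumptions. Like the code above, it averages the amplitude over r and squares afterwards:

```python
import numpy as np

def form_factor(r, q):
    # f(r, q) from the question, written in terms of x = q*r
    x = q * r
    return 3 * (np.sin(x) - x * np.cos(x)) / x**3

def weighted_intensity(q, R0, sigma, n_r=1000):
    # r grid: assumed to span R0 +/- 4 sigma, clipped to stay positive
    r = np.linspace(max(R0 - 4 * sigma, 1e-3), R0 + 4 * sigma, n_r)
    w = np.exp(-(r - R0)**2 / (2 * sigma**2))
    w /= w.sum()                      # discrete weights that sum to 1
    # rows index q, columns index r; sum over r gives the weighted amplitude
    amp = (form_factor(r[None, :], q[:, None]) * w).sum(axis=1)
    return amp**2

q = np.arange(0.001, 2.0, 0.005)
I_narrow = weighted_intensity(q, R0=20, sigma=1)
I_wide = weighted_intensity(q, R0=20, sigma=5)
```

Both curves can then be drawn on the same log-log axes with plt.plot(q, I_narrow) and plt.plot(q, I_wide). Since f tends to 1 as q*r goes to 0, both should start close to 1 at the smallest q, which is a quick sanity check on the normalization.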
numpy.random.multivariate_normal does not give me a plot of a normal distribution. This is the example given on the site:
import numpy as np
import matplotlib.pyplot as plt
x,y = np.random.multivariate_normal(mean,cov,5000).T
plt.plot(x,y,'x'); plt.axis('equal'); plt.show()
When plotted, it does not give the normal distribution curve. I am new to NumPy and I want to get a normal distribution curve, so please help.
I want to plot x, y and the normal pdf in two dimensions. That is, I want to show that x and y follow a "multivariate" normal distribution.
numpy.random.multivariate_normal() samples from a multivariate normal distribution. Plotting the two coordinates from these samples against each other will not show you a 1D normal distribution curve. numpy itself does not have a function that will compute the 1D normal distribution curve itself. It's easy enough to compute yourself, though, if that's what you really want:
import numpy
def normpdf(x, mean, std):
    z = (x - mean) / std
    return numpy.exp(-z**2/2.0)/numpy.sqrt(2*numpy.pi)/std
I think for the bivariate case, as yours is, you may look at the formula given on Wikipedia:
http://en.wikipedia.org/wiki/Multivariate_normal_distribution
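For reference, the bivariate density from that page, exp(-(x-μ)ᵀ Σ⁻¹ (x-μ)/2) / (2π sqrt(det Σ)), can be written in a few lines of NumPy. This is a sketch: the function name bivariate_normal_pdf and the example mean and cov are made up for illustration:

```python
import numpy as np

def bivariate_normal_pdf(points, mean, cov):
    # points: array of shape (n, 2); mean: shape (2,); cov: shape (2, 2)
    points = np.atleast_2d(points)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    diff = points - mean
    inv = np.linalg.inv(cov)
    # Quadratic form (x - mu)^T Sigma^-1 (x - mu) for every row
    maha = np.einsum('ij,jk,ik->i', diff, inv, diff)
    norm = 2 * np.pi * np.sqrt(np.linalg.det(cov))
    return np.exp(-0.5 * maha) / norm

mean = [0.0, 0.0]                  # assumed example values
cov = [[1.0, 0.0], [0.0, 1.0]]
peak = bivariate_normal_pdf([[0.0, 0.0]], mean, cov)[0]
off_peak = bivariate_normal_pdf([[1.0, 0.0]], mean, cov)[0]
print(peak, off_peak)
```

Evaluating this on a grid built with np.meshgrid and passing the result to plt.contour on top of the scatter plot of the samples is one way to show that the points follow the density.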