I have two numpy arrays x and y and would like to fit a curve to the data. The fitting function is an exponential in the independent variable ex (another numpy array), with a and t as the fitting parameters.
import numpy as np
import scipy
import scipy.optimize as op
kb=1.38e-23  # Boltzmann constant (J/K)
h=6.63e-34   # Planck constant (J s)
c=3e8        # speed of light (m/s)
def func(ex,a,t):
    return a*np.exp(-h*c/(ex*1e-9*kb*t))  # ex*1e-9 converts nm to m
t0=300 #initial guess
print(op.curve_fit(func,x,y,t0))
Your initial guess must contain one value per fitting parameter, in the order they appear in func, e.g. t0=(1., 300.) for (a, t), since you have two fitting parameters.
You also need to define the points you want to fit, i.e. x and y must exist before you call curve_fit().
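Putting both fixes together, a minimal self-contained sketch might look like this (the data here is synthetic, generated purely for illustration, and the wavelength range is chosen so the exponent is of order one at t = 300 K):
import numpy as np
import scipy.optimize as op

kb, h, c = 1.38e-23, 6.63e-34, 3e8

def func(ex, a, t):
    return a * np.exp(-h * c / (ex * 1e-9 * kb * t))

# synthetic stand-in data: wavelengths in nm, model values for (a, t) = (2.0, 300)
x = np.linspace(2e4, 1e5, 50)
y = func(x, 2.0, 300.0)

p0 = (1.0, 250.0)  # one guess per parameter, in (a, t) order
popt, pcov = op.curve_fit(func, x, y, p0=p0)
print(popt)  # should recover approximately (2.0, 300.0)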
I would like to get piecewise linear function from set of points. Here is visual example:
import matplotlib.pyplot as plt
x = [1,2,7,9,11]
y = [2,5,9,1,11]
plt.plot(x, y)
plt.show()
So I need a function that would take two lists and would return piecewise linear function back. I do not need regression or any kind of least square fit.
I can try to write it myself, but wonder if there is something already written. So far, I only found code returning regression
Try np.interp. It performs piecewise linear interpolation of the values.
Here is a small example.
>>> import matplotlib.pyplot as plt
>>> import numpy as np
>>> x = [1,2,7,9,11]
>>> y = [2,5,9,1,11]
>>> np.interp([1.5, 3], x, y)
array([ 3.5, 5.8])
One caveat: make sure the sample points in x are increasing; np.interp does not check this and returns nonsense for unsorted input.
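Since the question asks for something that returns a function, you can wrap np.interp in a closure; a small sketch (make_piecewise is a name of my own choosing):
import numpy as np

def make_piecewise(x, y):
    # returns a callable piecewise linear interpolant through (x, y);
    # x must be sorted in increasing order, as np.interp requires
    x, y = np.asarray(x), np.asarray(y)
    return lambda v: np.interp(v, x, y)

f = make_piecewise([1, 2, 7, 9, 11], [2, 5, 9, 1, 11])
print(f([1.5, 3]))  # [ 3.5  5.8]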
I have 200k data points and I'm trying to obtain derivative of fitted polynomial. I divided my data set into smaller ones every 0.5 K, the data is Voltage vs Temperature. My code roughly looks like this:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
testset=pd.read_csv('150615H0.csv',sep='\t')
x=np.linspace(1,220,219)
ub=min(testset['T(K)'])    # upper edge of the first temperature window
lb=min(testset['T(K)'])-1  # lower edge of the first window (1 K wide)
q={i:testset[(testset['T(K)'] < ub+i) & (testset['T(K)'] > lb+i)] for i in x}  # split data into windows
f={j:np.polyfit(q[j]['T(K)'],q[j]['Vol(V)'],4) for j in q}  # quartic fit per window
fs={k:np.poly1d(f[k]) for k in f}        # polynomial objects
fsd={l:np.polyder(fs[l],1) for l in fs}  # first derivative of each local fit
for kk in q:
    plt.plot(q[kk]['T(K)'],fsd[kk](q[kk]['T(K)']),color='blue',linewidth=2,label='fit')
Unsurprisingly, the derivative is discontinuous at the window boundaries, and I don't like that. Is there another way to fit polynomials locally and get a continuous derivative at the same time?
Have a look at the Savitzky-Golay filter for efficient local polynomial fitting.
It is implemented, for instance, in scipy.signal.savgol_filter. The derivative of the fitted polynomial can be obtained with the deriv=1 argument.
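A minimal sketch of that approach (note that savgol_filter assumes uniformly spaced samples, so delta must be set to the sampling step; the window length and polynomial order below are arbitrary illustrative choices, and the data is synthetic):
import numpy as np
from scipy.signal import savgol_filter

# synthetic, uniformly spaced stand-in for the V(T) data
t = np.linspace(1, 220, 2000)
v = np.sin(t / 20.0) + np.random.normal(0, 0.01, t.size)

dt = t[1] - t[0]
# fit a 4th-order polynomial in a 51-point sliding window and
# evaluate the first derivative of each local fit, i.e. dV/dT
dv_dt = savgol_filter(v, window_length=51, polyorder=4, deriv=1, delta=dt)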
I am trying to find the root(s) of a line which is defined by data like:
x = [1,2,3,4,5]
y = [-2,4,6,8,4]
I have started by using interpolation, but I have been told I can then use the brentq function. How can I use brentq with two lists? I thought it requires a continuous function.
As the documentation of brentq says, the first argument must be a continuous function. Therefore, you must first generate, from your data, a function that will return a value for each parameter passed to it. You can do that with interp1d:
import numpy as np
from scipy.interpolate import interp1d
from scipy.optimize import brentq
x, y = np.array([1,2,3,4,5]), np.array([-2,4,6,8,4])
f = interp1d(x,y, kind='linear') # change kind to something different if you want e.g. smoother interpolation
brentq(f, x.min(), x.max()) # returns: 1.33333
You could also use splines to generate the continuous function needed for brentq.
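For instance, a cubic InterpolatedUnivariateSpline is a continuous function you can hand to brentq just like the interp1d object above, and for cubic splines it also exposes its zeros directly via roots(); a small sketch:
import numpy as np
from scipy.interpolate import InterpolatedUnivariateSpline

x = np.array([1, 2, 3, 4, 5])
y = np.array([-2, 4, 6, 8, 4])

spline = InterpolatedUnivariateSpline(x, y, k=3)  # cubic interpolating spline
print(spline.roots())  # zeros of the spline, no bracketing needed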
To get the correlation between two arrays in python, I am using:
from scipy.stats import pearsonr
x, y = [1,2,3], [1,5,7]
cor, p = pearsonr(x, y)
However, as stated in the docs, the p-value returned by pearsonr() is only reasonable for datasets larger than 500 or so. So how can I get a p-value that is meaningful for small datasets?
My temporary solution:
After reading up on linear regression, I have come up with my own small script, which basically uses the Fisher transformation to get the z-score, from which the p-value is calculated:
import numpy as np
from scipy.stats import zprob  # note: zprob has been removed from recent SciPy versions

n = len(x)
z = np.log((1+cor)/(1-cor))*0.5*np.sqrt(n-3)  # Fisher transformation
p = zprob(-z)
It works. However, I am not sure whether it is more reasonable than the p-value given by pearsonr(). Is there a python module which already has this functionality? I have not been able to find it in SciPy or Statsmodels.
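For reference, here is the same computation rewritten against current SciPy, where zprob no longer exists (scipy.stats.norm covers it: norm.sf(z) equals 1 - norm.cdf(z)). This sketch reports a two-sided p-value and requires n > 3:
import numpy as np
from scipy.stats import norm

def fisher_pvalue(cor, n):
    # two-sided p-value for a Pearson correlation via the Fisher transformation
    z = 0.5 * np.log((1 + cor) / (1 - cor)) * np.sqrt(n - 3)
    return 2 * norm.sf(abs(z))

print(fisher_pvalue(0.8, 20))  # hypothetical r = 0.8 from n = 20 points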
Edit to clarify:
The dataset in my example is simplified. My real dataset is two arrays of 10-50 values.
I want to fit a gaussian to a curve using python. I found a solution here somewhere, but it only seems to work for an n-shaped gaussian, not for a u-shaped gaussian.
Here is the code:
import pylab, numpy
from scipy.optimize import curve_fit
x=numpy.array(range(10))
y=numpy.array([0,1,2,3,4,5,4,3,2,1])
n=len(x)
mean=sum(y)/n
sigma=numpy.sqrt(sum((y-mean)**2)/n)  # initial width estimate
def gaus(x,a,x0,sigma):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2))
popt, pcov=curve_fit(gaus,x,y,p0=[1,mean,sigma])
pylab.plot(x,y,'r-',x,y,'ro')
pylab.plot(x,gaus(x,*popt),'k-',x,gaus(x,*popt),'ko')
pylab.show()
pylab.show()
The code fits a gaussian to an n-shaped curve, but if I change y to y=numpy.array([5,4,3,2,1,2,3,4,5,6]) it returns an error: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
What do I have to change/adjust in the code to fit a u-shaped gaussian?
Thanks.
The functional form of your fit is wrong. Gaussians are expected to go to 0 at the tails regardless of whether the shape is an n or a u, but yours levels off at ~5.
If you introduce an offset into your equation and choose reasonable initial values, it works. See the code below:
import pylab, numpy
from scipy.optimize import curve_fit
x=numpy.array(range(10))
y=numpy.array([5,4,3,2,1,2,3,4,5,6])
n=len(x)
mean=sum(y)/n
sigma=numpy.sqrt(sum((y-mean)**2)/n)  # initial width estimate
def gaus(x,a,x0,sigma,c):
    return a*numpy.exp(-(x-x0)**2/(2*sigma**2))+c  # c is the vertical offset
popt, pcov=curve_fit(gaus,x,y,p0=[-1,mean,sigma,-5])
pylab.plot(x,y,'r-',x,y,'ro')
pylab.plot(x,gaus(x,*popt),'k-',x,gaus(x,*popt),'ko')
pylab.show()
Perhaps you could invert your values, fit an n-shaped gaussian, and then invert the fitted gaussian.
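A small sketch of that idea, reusing the gaus from the question (this assumes the tails sit near y.max(), so the offset is handled by the reflection rather than by a fit parameter):
import numpy
from scipy.optimize import curve_fit

def gaus(x, a, x0, sigma):
    return a * numpy.exp(-(x - x0)**2 / (2 * sigma**2))

x = numpy.arange(10)
y = numpy.array([5, 4, 3, 2, 1, 2, 3, 4, 5, 6])

top = y.max()
y_flipped = top - y  # the u shape becomes an n shape
popt, pcov = curve_fit(gaus, x, y_flipped, p0=[5, 4, 2])
fitted = top - gaus(x, *popt)  # flip the fitted model back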