I am trying to do an exponential fit to some experimental data. I have 6999 rows in each of my two columns, with the time assigned to the x axis and the measured results to the y axis. I tried using scipy.optimize.curve_fit() with my data arrays, but it produces a runtime error. I would appreciate knowing how to get rid of the error, or how to code this differently if my parameters are what causes it.
Here's my code:
import pandas as pd
import numpy as np
import scipy as sp
import scipy.optimize
df = pd.read_csv("output.txt")    # read the data frame into our program
print(df)                         # to make sure this is what we want
x = np.array(df["t.timestep"])    # the timestep column is the x axis
print(x)                          # just checking
y = np.array(df["t.cTNF"])        # the measured results are the y axis
print(y)                          # just checking
sp.optimize.curve_fit(lambda t, a, b: a * np.exp(b * t), x, y)  # should fit the curve
It produces the following error message:
File "C:\Users\kimst\anaconda3\lib\site-packages\scipy\optimize\minpack.py", line 789, in curve_fit
raise RuntimeError("Optimal parameters not found: " + errmsg)
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 600.
Please help, I have a deadline and am completely stuck, thanks for reading :)
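For reference, the usual way past this particular error is to give curve_fit a starting point on the scale of the data (its default guess of all ones is often far off for exponentials) and, if needed, a larger maxfev. A minimal sketch along those lines, where the p0 values are only illustrative and should be chosen from your own data:
import numpy as np
from scipy.optimize import curve_fit

# illustrative starting guesses: a near the first y value, b a small rate
p0 = (y[0], -1e-3)
popt, pcov = curve_fit(lambda t, a, b: a * np.exp(b * t), x, y,
                       p0=p0, maxfev=5000)
print(popt)  # fitted a and b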
I have a function of the form (y1, y2, y3) = x*(a1, a2, a3) + (b1, b2, b3), where x, y1, y2, y3 are measured values and a1, a2, a3, b1, b2, b3 are parameters I want to fit for. I also have measurement errors associated with x, y1, y2, y3. I would like to fit this function for a1, a2, a3, b1, b2, b3 and obtain an error on each of these parameters, while taking the errors on x, y1, y2, y3 into account. How can I do this? I looked into scipy and lmfit, but I didn't find anything that lets me both pass the errors on the measured points and get back the errors on the fitted parameters. Here is some code for the data I need to fit:
import numpy as np
x = np.array([1,2,3,4,5])
err_x = np.array([0.1,0.1,0.2,0.2,0.1])
y = []
for i in range(len(x)):
    y.append(x[i] * np.array([3, 4, 5]) + np.array([-2, 3, 1]))
y = np.array(y)
err_y = np.array([[0.2,0.2,0.1],[0.2,0.2,0.1],[0.2,0.2,0.1],[0.2,0.2,0.1],[0.2,0.2,0.1]])
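One approach that matches this setup is scipy.odr (orthogonal distance regression), which accepts uncertainties on both x and y and reports standard errors on the fitted parameters. A minimal sketch, assuming the arrays above and packing the parameters as [a1, a2, a3, b1, b2, b3] (this layout is just one way to set it up):
from scipy import odr

def linear_model(beta, x):
    # beta = [a1, a2, a3, b1, b2, b3]; returns an array of shape (3, len(x))
    a = np.asarray(beta[:3]).reshape(3, 1)
    b = np.asarray(beta[3:]).reshape(3, 1)
    return a * x + b

model = odr.Model(linear_model)
# y and err_y above have shape (n, 3); ODR wants the response as (3, n)
data = odr.RealData(x, y.T, sx=err_x, sy=err_y.T)
fit = odr.ODR(data, model, beta0=[1, 1, 1, 0, 0, 0])
out = fit.run()
print(out.beta)     # fitted a1, a2, a3, b1, b2, b3
print(out.sd_beta)  # standard errors on those parameters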
I was trying to solve a cubic equation and fit it to my experimental data set, but there is some problem in my code regarding curve_fit. Though both functions f and del_y are defined correctly (checked by evaluating them with trial parameter values), curve_fit is not working and shows me this error:
ValueError: setting an array element with a sequence.
Can somebody help me out with this?
import numpy as np
from scipy.optimize import curve_fit
def f(G0, H0, k1, k2):
    a = (k1 + 2*k1*k2*H0 - k1*k2*G0) / (k1*k2)
    b = (1 + k1*H0 - k1*G0) / (k1*k2)
    c = -G0 / (k1*k2)
    cf = [1, a, b, c]
    k = np.roots(cf)
    return abs(k[2])

def del_y(G0, H0, A, k1, k2, n):
    return A * (0.04*k1*f(G0, H0, k1, k2)**n)

popt, pcov = curve_fit(del_y, x_data, y_data)
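For what it's worth, this ValueError typically appears because curve_fit hands the model the whole x_data array at once, while np.roots only accepts scalar coefficients. A hedged sketch of one way around that, evaluating the root point by point (x_data and y_data are the arrays from the question, assumed 1-D, and the p0 values are only illustrative):
import numpy as np
from scipy.optimize import curve_fit

def f_scalar(G0, H0, k1, k2):
    # same cubic as above, but for a single G0 value
    a = (k1 + 2*k1*k2*H0 - k1*k2*G0) / (k1*k2)
    b = (1 + k1*H0 - k1*G0) / (k1*k2)
    c = -G0 / (k1*k2)
    return abs(np.roots([1, a, b, c])[2])

def del_y(G0, H0, A, k1, k2, n):
    # loop so that np.roots never sees array-valued coefficients
    k = np.array([f_scalar(g, H0, k1, k2) for g in np.atleast_1d(G0)])
    return A * (0.04 * k1 * k**n)

popt, pcov = curve_fit(del_y, x_data, y_data, p0=[1.0, 1.0, 1.0, 1.0, 1.0])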
I'm pretty new to Python and curve fitting, and I'm currently trying to fit the graph below with a Gaussian.
I'm following this tutorial, and my code looks like this:
import numpy as np
import matplotlib.pyplot as plt
from numpy import sqrt, pi, exp
from lmfit import Model

def gaussian(x, amp, cen, wid):
    "1-d gaussian: gaussian(x, amp, cen, wid)"
    return (amp/(sqrt(2*pi)*wid)) * exp(-(x-cen)**2/(2*wid**2))

filelist = []
time = [0.00, -1.33, -2.67, -4.00, -5.33, -6.67, 1.13, 2.67, 4.00, 5.33, 6.67]
index = 0
offset = 0

filelist.append('0.asc')
for i in range(1, 6):
    filelist.append("-%s00.asc" % i)
for i in range(1, 6):
    filelist.append("+%s00.asc" % i)

sfgpeaks = []
for fname in filelist:
    data = np.genfromtxt(fname, delimiter=',', unpack=True, skip_footer=20)
    SFGX = data[0, 500:530]
    SFGY = data[1, 500:530]
    SFGpeakY = np.max(SFGY)
    sfgpeaks.append(SFGpeakY)

    gmodel = Model(gaussian)
    result = gmodel.fit(SFGpeakY, x=time[index], amp=5, cen=5, wid=3)

    plt.plot(time[index], sfgpeaks[index], 'ro')
    plt.plot(time[index], result.init_fit, 'k--', label="Gaussian Fit")
    plt.xticks(time)
    index = index + 1

print(pump2SHGX)
plt.title("Time Delay-SFG peak")
plt.xlabel("Timedelay[ps]")
plt.ylabel("Counts[arb.unit]")
plt.savefig("796and804nmtimesfg")
plt.legend(bbox_to_anchor=(1.0, 0.5))
plt.show()
However, I'm getting an error when I try to feed the data I have (the time delay and the Y values of the graph above) into the Gaussian parameters.
The error I'm getting is this
TypeError: Improper input: N=3 must not exceed M=1
Is this error happening because I'm trying to insert a value from an array into the parameter?
Any help is much appreciated.
You have
result = gmodel.fit(SFGpeakY, x=time[index], amp=5,cen=5,wid=3)
which passes 1 value as x and 1 value as the data. The model is then evaluated at that 1 point. The error message is the fit complaining that you have 3 variables and only 1 data value.
You probably want to fit the data array SFGY with x set to SFGX,
result = gmodel.fit(SFGY, x=SFGX, amp=5,cen=5,wid=3)
though it wasn't clear to me what data is used in the plot you attached.
Also: you probably want to give initial values for amp, cen, and wid based on the data. Your SFGpeakY is probably a decent guess for amp, and SFGX.mean() and SFGX.std() are probably decent guesses for cen and wid.
Also: you plot result.init_fit labeled as "Gaussian Fit". result.init_fit will be the model evaluated with the initial values for the parameters. The best fit with the refined parameters will be in result.best_fit.
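Putting those pieces together, a sketch of the fit for one file might look roughly like this (SFGX, SFGY, gaussian, Model, and plt all come from the code in the question; the initial guesses are just data-driven starting points):
gmodel = Model(gaussian)
result = gmodel.fit(SFGY, x=SFGX,
                    amp=SFGY.max(),    # rough guess: peak height
                    cen=SFGX.mean(),   # rough guess: centre of the x window
                    wid=SFGX.std())    # rough guess: width of the x window
print(result.fit_report())

plt.plot(SFGX, SFGY, 'ro', label='data')
plt.plot(SFGX, result.best_fit, 'k--', label='Gaussian fit')  # refined fit, not init_fit
plt.legend()
plt.show()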
I'm trying to fit the Einstein approximation of resistivity in a solid to a set of experimental data.
I have resistivity vs. temperature (from 200 K down to 4 K).
import xlrd as xd
import matplotlib.pyplot as plt
import numpy as np
import pylab as pl
import scipy as sp
from scipy.optimize import curve_fit
#retrieve data from file
data = pl.loadtxt('salita.txt')
Temp = data[:, 1]
Res = data[:, 2]
#define fitting function
def einstein_func(T, ro0, AE, TE):
    nl = np.sinh(TE/(2*T))
    return ro0 + AE*nl*T
p0 = sp.array([1 , 1, 1])
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0)
But I get these warnings
crio.py:14: RuntimeWarning: divide by zero encountered in divide
nl = np.sinh(TE/(2*T))
crio.py:14: RuntimeWarning: overflow encountered in sinh
nl = np.sinh(TE/(2*T))
crio.py:15: RuntimeWarning: divide by zero encountered in divide
return ro0 + AE*np.sinh(TE/(2*T))*T
crio.py:15: RuntimeWarning: overflow encountered in sinh
return ro0 + AE*np.sinh(TE/(2*T))*T
crio.py:15: RuntimeWarning: invalid value encountered in multiply
return ro0 + AE*np.sinh(TE/(2*T))*T
Traceback (most recent call last):
File "crio.py", line 19, in <module>
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/Extras/lib/python/scipy/optimize/minpack.py", line 511, in curve_fit
raise RuntimeError(msg)
RuntimeError: Optimal parameters not found: Number of calls to function has reached maxfev = 800.
I don't understand why it keeps saying that there is a divide by zero in sinh, since I have strictly positive values. Varying my starting guess has no effect on it.
EDIT: My dataset is organized like this:
4.39531E+0 1.16083E-7
4.39555E+0 -5.92258E-8
4.39554E+0 -3.79045E-8
4.39525E+0 -2.13213E-8
4.39619E+0 -4.02736E-8
4.43130E+0 -1.42142E-8
4.45900E+0 -2.60594E-8
4.46129E+0 -9.00232E-8
4.46181E+0 1.42142E-7
4.46195E+0 -2.13213E-8
4.46225E+0 4.26426E-8
4.46864E+0 -2.60594E-8
4.47628E+0 1.37404E-7
4.47747E+0 9.47612E-9
4.48008E+0 2.84284E-8
4.48795E+0 1.35035E-7
4.49804E+0 1.39773E-7
4.51151E+0 -1.75308E-7
4.54916E+0 -1.63463E-7
4.59176E+0 -2.36902E-9
where the first column is temperature and the second one is resistivity (the negative values are due to noise in the probe current, since the sample is a PbIn alloy that becomes superconducting below about 6.7-6.9 K; here we are at 4.5 K).
The arguments I'm providing to sinh are NumPy arrays, and with a linear function ro0 + AE*T my code works. I've tried scipy.optimize.minimize, but the result is the same.
Now I see that I have almost nine hundred values in my file; might that be the problem?
I have edited my dataset, removing some lines, and now the only warning showing is:
RuntimeWarning: overflow encountered in sinh
How can I work around it?
Here are a couple of observations that could help:
You could try the least-squares fit directly with leastsq, providing the Jacobian, which might help tame it.
I'm guessing you don't want the superconducting temperatures in your data set at all if you're fitting to an Einstein model (do you have a source for this eqn, btw?)
Do make sure your initial guesses are as good as they could possibly be (ro0=AE=TE=1 probably won't cut it).
Plot your data and make sure there aren't any weird artefacts
You seem to be indexing your data array in the wrong way in your code example: if the data is structured as you say, you want:
Temp = data[:, 0]
Res = data[:, 1]
(Python indices start at 0.) A rough sketch pulling these suggestions together follows.
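This assumes the two-column layout from the edit and an arbitrary cutoff just above the superconducting transition; the cutoff temperature and starting guesses here are illustrative only:
import numpy as np
from scipy.optimize import curve_fit

data = np.loadtxt('salita.txt')
Temp = data[:, 0]              # first column: temperature (K)
Res = data[:, 1]               # second column: resistivity

mask = Temp > 7.0              # drop the superconducting region (Tc ~ 6.7-6.9 K)
Temp, Res = Temp[mask], Res[mask]

def einstein_func(T, ro0, AE, TE):
    return ro0 + AE * T * np.sinh(TE / (2 * T))

# data-driven starting guesses instead of (1, 1, 1); TE = 100 K is a placeholder
p0 = [Res.min(), (Res.max() - Res.min()) / Temp.max(), 100.0]
coeffs, cov = curve_fit(einstein_func, Temp, Res, p0=p0)
print(coeffs)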
I'm trying to get my first power spectral density graph plotted using actual data instead of something that's purely theoretical and generated within Python. I'm having problems getting anything to work, however. Code is attached below, followed by the error I get in my console after line 19.
Don't know if it makes a difference, but I'm transitioning to Python from mostly working in MATLAB. I am not counting on having access to a license forever, so I really want to learn how to start doing everything in Python. But it's hard.
Code:
import numpy as np
from scipy import signal
import scipy.io
import matplotlib.pyplot as plt
#import data from a .mat file using the loadmat command
mat = scipy.io.loadmat('Mic_Data_Sums.mat')
# 1 x 1 array, sampling frequency of 22050 Hz
fs = mat['Fs']
# Attempted fix: change data type to 8-point float?
# fs = fs.astype('f8')
# 13 x 1323000 array - 13 separate time series of data, 60 seconds each
data = mat['Mic_Data_Sums']
# Welch function - transpose 'data' and use the 2nd time series
f, Pxx_spec = signal.welch(data.T[1], fs, window='hanning', nperseg=fs,
                           noverlap=fs/2, scaling='spectrum')
Console:
/Users/******/anaconda/lib/python3.4/site-packages/scipy/signal/spectral.py:297: RuntimeWarning: divide by zero encountered in double_scalars
scale = 1.0 / win.sum()**2
Traceback (most recent call last):
File "plotPSDs.py", line 20, in <module>
noverlap = fs/2, scaling = 'spectrum')
File "/Users/******/anaconda/lib/python3.4/site-packages/scipy/signal/spectral.py", line 333, in welch
xft = fftpack.rfft(x_dt*win, nfft)
ValueError: operands could not be broadcast together with shapes (22050,) (0,22051)
Note how the ValueError tag gives me weird shape (dimension) results: I have no idea where the 22051 is coming from.
Edit: As a workaround, I commented out the line fs = mat['Fs'] and simply replaced it with fs = 22050, which made the code execute successfully. However, the question still remains: why can't I simply reference the variable as it was stored in the .mat file?
[from the comments above] If you know fs is 1x1, try passing fs[0,0] to welch. The docstring for welch says fs should be a float, so it might behave unpredictably if you give it a two-dimensional array. – Warren Weckesser
This worked well. The code I implemented is:
# 1 x 1 array, sampling frequency (22050 Hz)
fs = mat['Fs']
fs = fs[0,0]
then using the code from before,
f, Pxx_spec = signal.welch(data.T[1], fs, window='hanning', nperseg=fs,
                           noverlap=fs/2, scaling='spectrum')
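An alternative that avoids the indexing altogether, if it helps: loadmat can collapse 1x1 MATLAB matrices into scalars with squeeze_me=True (nperseg is cast to int here, since welch expects an integer segment length):
import scipy.io
from scipy import signal

mat = scipy.io.loadmat('Mic_Data_Sums.mat', squeeze_me=True)
fs = float(mat['Fs'])                # now a plain Python scalar
data = mat['Mic_Data_Sums']
f, Pxx_spec = signal.welch(data.T[1], fs, window='hanning', nperseg=int(fs),
                           noverlap=int(fs)//2, scaling='spectrum')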