I am trying to get a polynomial fit for my data. Currently I am using numpy's polyfit to get the best fit on a loglog plot, but my goal is to fit the data on a semilogy plot. My code looks as follows:
import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import curve_fit
import scipy.optimize as optimization
l = [ 0.006, 0.01, 0.014, 0.024, 0.0346, 0.049, 0.0535, 0.0736, 0.11 ]
f = [5.3375903383330048, 60.531976422513054, 89.111502526131474, 47.132498501584969, 17.447001214543118, 5.2583622688081455, 3.7779565652126865, 1.0621247249682186, 0.1922152085619766]
logx = np.log(l)
logy = np.log(f)
coeffs = np.polyfit(logx,logy,deg=3)
poly = np.poly1d(coeffs)
yfit = lambda x: np.exp(poly(np.log(x)))
plt.loglog(l,yfit(l), ':')
plt.loglog(l,f, 'o')
plt.show()
I would appreciate suggestions on what changes I have to make to get a semilogy best-fit curve. Also, if there is another Python package suited to this, please mention it too.
I think this works:
# log scale
x2 = np.linspace(np.min(l), np.max(l), 1000)
y2log = poly(np.log(x2))
plt.loglog(x2,np.exp(y2log), ':')
plt.loglog(l,f, 'o')
plt.show()
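If the goal is specifically a best-fit curve on a semilogy plot, one option (a sketch of the same idea, not the only possible model) is to fit the polynomial to log(f) as a function of l itself, rather than of log(l), and then plot on semilogy axes:
# Hedged sketch: fit log(f) as a polynomial in l (a semilog fit)
coeffs_semilog = np.polyfit(l, np.log(f), deg=3)
poly_semilog = np.poly1d(coeffs_semilog)
x2 = np.linspace(np.min(l), np.max(l), 1000)
plt.semilogy(x2, np.exp(poly_semilog(x2)), ':')  # fitted curve
plt.semilogy(l, f, 'o')                          # original data
plt.show()
As for other packages: scipy.optimize.curve_fit (already imported above) is an option if you would rather fit an explicit model function than a polynomial in log space.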
I am trying to fit a curve to a set of points using the numpy and scipy libraries, but I am getting a closed curve, as shown below.
Could anyone let me know how to fit the curve without closing it?
The code I followed is:
import numpy as np
from scipy.interpolate import splprep, splev
import matplotlib.pyplot as plt
coords = np.array([(3,8),(3,9),(4,10),(5,11),(6,11), (7,13), (9,13),(10,14),(11,14),(12,14),(14,16),(16,17),(17,18),(18,18),(19,18), (20,19),
(21,19),(22,20),(23,20),(24,21),(26,21),(27,21),(28,21),(30,21),(32,20),(33,20),(32,17),(33,16),(33,15),(34,12), (34,10),(33,10),
(33,9),(33,8),(33,6),(34,6),(34,5)])
tck, u = splprep(coords.T, u=None, s=0.0, per=1)
u_new = np.linspace(u.min(), u.max(), 1000)
x_new, y_new = splev(u_new, tck, der=0)
plt.plot(coords[:,1], coords[:,0], 'ro')
plt.plot(y_new, x_new, 'b--')
plt.show()
Output:
I need the output without joining the first and last points.
Thank you.
Just set the per parameter to 0 in scipy.interpolate.splprep:
tck, u = splprep(coords.T, u=None, s=0.0, per=0)
I am trying to fit some data with a Gaussian fit. The data comes from a lateral flow image. The fitted line (red) does not cover the data. Please check my code. In the code, x is just an index; y is the actual data.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
y = np.array([2.22097081, 2.24776432, 2.35519896, 2.43780396, 2.49708355,
2.54224971, 2.58350984, 2.62965057, 2.68644093, 2.75454015,
2.82912617, 2.90423835, 2.97921199, 3.05864617, 3.14649922,
3.2430853 , 3.3471892 , 3.45919857, 3.58109399, 3.71275641,
3.84604379, 3.94884214, 3.94108998, 3.72148453, 3.28407665,
2.7651018 ])
x = np.linspace(1,np.mean(y),len(y))
n = len(x)
mean = sum(x*y)/n
sigma = np.sqrt(sum(y*(x-mean)**2)/n)
def gaus(x, a, x0, sigma):
    return a*np.exp(-(x-x0)**2/(2*sigma**2))/(sigma*np.sqrt(2*np.pi))
popt,pcov = curve_fit(gaus,x,y,p0=[1,mean,sigma])
plt.figure()
plt.plot(x,y,'b+:',label='data')
plt.plot(x,gaus(x,*popt),'ro:',label='fit')
plt.legend()
plt.xlabel('Index')
plt.ylabel('Row Mean')
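One likely reason the red curve misses the data is that the model has no baseline term: the data never drops below about 2.2, while the pure normalized Gaussian goes to zero in the tails. Below is a minimal sketch of one possible fix, adding a constant offset c and using a plain integer index for x as described in the question; the model and starting guesses are assumptions, not the original code:
# Hedged sketch: Gaussian plus a constant baseline, with x as a simple index
# (reuses y, numpy, matplotlib and curve_fit from the imports above)
x = np.arange(len(y))

def gaus_offset(x, a, x0, sigma, c):
    return a * np.exp(-(x - x0)**2 / (2 * sigma**2)) + c

# rough initial guesses taken from the data itself
p0 = [y.max() - y.min(), np.argmax(y), len(y) / 4, y.min()]
popt, pcov = curve_fit(gaus_offset, x, y, p0=p0)

plt.plot(x, y, 'b+:', label='data')
plt.plot(x, gaus_offset(x, *popt), 'r-', label='fit with offset')
plt.legend()
plt.show()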
I was wondering if there's a way to find tangents to curve from discrete data.
For example:
x = np.linspace(-100, 100, 100001)
y = np.sin(x)
so here x takes only discrete sampled values, but what if we want to find the tangent at something like x = 67.875?
I've been trying to figure out if numpy.interp would work, but so far no luck.
I also found a couple of similar examples, such as this one, but haven't been able to apply the techniques to my case :(
I'm new to Python and don't entirely know how everything works yet, so any help would be appreciated...
This is what I have so far:
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-100,100,10000)
y = np.sin(x)
tck, u = interpolate.splprep([y])
ti = np.linspace(-100,100,10000)
dydx = interpolate.splev(ti,tck,der=1)
plt.plot(x,y)
plt.plot(ti,dydx[0])
plt.show()
There is a comment in this answer, which tells you that there is a difference between splrep and splprep. For the 1D case you have here, splrep is completely sufficient.
You may also want to limit your x range a bit to be able to see the oscillations.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
tck = interpolate.splrep(x,y)
dydx = interpolate.splev(x,tck,der=1)
plt.plot(x,y)
plt.plot(x,dydx, label="derivative")
plt.legend()
plt.show()
While this is how the code above can be made runnable, it does not yet provide a tangent. For the tangent you only need the derivative at a single point; however, you then need to write down the equation of the tangent line and actually use it, so this is more of a math question.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
tck = interpolate.splrep(x,y)
x0 = 7.3
y0 = interpolate.splev(x0,tck)
dydx = interpolate.splev(x0,tck,der=1)
tngnt = lambda x: dydx*x + (y0-dydx*x0)
plt.plot(x,y)
plt.plot(x0,y0, "or")
plt.plot(x,tngnt(x), label="tangent")
plt.legend()
plt.show()
It should be noted that you do not need to use splines at all if the points you have are dense enough. In that case obtaining the derivative is just taking the differences between the nearest points.
from scipy import interpolate
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(-15,15,1000)
y = np.sin(x)
x0 = 7.3
i0 = np.argmin(np.abs(x-x0))
x1 = x[i0:i0+2]
y1 = y[i0:i0+2]
dydx, = np.diff(y1)/np.diff(x1)
tngnt = lambda x: dydx*x + (y1[0]-dydx*x1[0])
plt.plot(x,y)
plt.plot(x1[0],y1[0], "or")
plt.plot(x,tngnt(x), label="tangent")
plt.legend()
plt.show()
The result will be visually identical to the one above.
I've never tried implementing error bars based on confidence intervals, and since that's what I want to do, I'm unsure how to proceed.
I have a large data array of ~1000 elements. From plotting a histogram of this data, it looks reasonably like a Maxwell-Boltzmann distribution.
Let's say my data is called x, to which I apply the fit as follows:
import scipy.stats as stats
import numpy as np
import matplotlib.pyplot as plt
maxwell = stats.maxwell
## Scale Parameter
params = maxwell.fit(x, floc=0)
print params
## mean
mean = 2*params[1]*np.sqrt(2/np.pi)
print mean
## Variance
sig = (params[1])**(3*np.pi-8)/np.pi
print sig
>>> (0, 178.17597215151301)
>>> 284.327714571
>>> 512.637498406
Then, when plotting it:
fig = plt.figure(figsize=(7,7))
ax = fig.add_subplot(111)
xd = np.argsort(x)
ax.plot(x[xd], maxwell.pdf(x, *params)[xd])
ax.hist(x[xd], bins=75, histtype="stepfilled", linewidth=1.5, facecolor='none', alpha=0.55, edgecolor='black',
normed=True)
How on earth do you go about implementing confidence intervals with the curve fit?
I can use
conf = maxwell.interval(0.90,loc=mean,scale=sig)
>>> (588.40702793225228, 1717.3973740895271)
But I have no clue what to do with this.
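maxwell.interval returns the range of the variable (here, the x axis) that contains the given probability mass, so it is usually evaluated with the fitted loc and scale rather than with the mean and variance. Here is a minimal sketch of one way to overlay that interval on the existing axes; this is only one interpretation of "error bars" and an assumption rather than a complete answer:
# Hedged sketch: 90% interval of the fitted Maxwell distribution,
# drawn as vertical reference lines on the histogram/PDF plot above
lo, hi = maxwell.interval(0.90, loc=params[0], scale=params[1])
ax.axvline(lo, color='grey', linestyle='--')
ax.axvline(hi, color='grey', linestyle='--')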
I've got the following simple script that plots a graph:
import matplotlib.pyplot as plt
import numpy as np
T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])
plt.plot(T,power)
plt.show()
As it is now, the line goes straight from point to point which looks ok, but could be better in my opinion. What I want is to smooth the line between the points. In Gnuplot I would have plotted with smooth cplines.
Is there an easy way to do this in PyPlot? I've found some tutorials, but they all seem rather complex.
You could use scipy.interpolate.spline to smooth out your data yourself:
from scipy.interpolate import spline
# 300 represents number of points to make between T.min and T.max
xnew = np.linspace(T.min(), T.max(), 300)
power_smooth = spline(T, power, xnew)
plt.plot(xnew,power_smooth)
plt.show()
spline is deprecated in scipy 0.19.0, use BSpline class instead.
Switching from spline to BSpline isn't a straightforward copy/paste and requires a little tweaking:
from scipy.interpolate import make_interp_spline, BSpline
# 300 represents number of points to make between T.min and T.max
xnew = np.linspace(T.min(), T.max(), 300)
spl = make_interp_spline(T, power, k=3) # type: BSpline
power_smooth = spl(xnew)
plt.plot(xnew, power_smooth)
plt.show()
Before:
After:
The spline approach works well for this example, but if the function is not inherently smooth and you want a smoothed version, you can also try:
from scipy.ndimage import gaussian_filter1d
ysmoothed = gaussian_filter1d(y, sigma=2)
plt.plot(x, ysmoothed)
plt.show()
If you increase sigma, you get a more strongly smoothed function.
Proceed with caution with this one. It modifies the original values and may not be what you want.
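To see the effect on the data from the question directly, here is a small sketch (the sigma value is arbitrary); note that the smoothed curve no longer passes exactly through the original points:
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import gaussian_filter1d

T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])

# larger sigma -> smoother curve, but values drift further from the data
power_smoothed = gaussian_filter1d(power, sigma=2)

plt.plot(T, power, 'o-', label='original')
plt.plot(T, power_smoothed, '--', label='gaussian_filter1d')
plt.legend()
plt.show()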
See the scipy.interpolate documentation for some examples.
The following example demonstrates its use, for linear and cubic spline interpolation:
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import interp1d
# Define x, y, and xnew to resample at.
x = np.linspace(0, 10, num=11, endpoint=True)
y = np.cos(-x**2/9.0)
xnew = np.linspace(0, 10, num=41, endpoint=True)
# Define interpolators.
f_linear = interp1d(x, y)
f_cubic = interp1d(x, y, kind='cubic')
# Plot.
plt.plot(x, y, 'o', label='data')
plt.plot(xnew, f_linear(xnew), '-', label='linear')
plt.plot(xnew, f_cubic(xnew), '--', label='cubic')
plt.legend(loc='best')
plt.show()
Slightly modified for increased readability.
One of the easiest implementations I found is the exponential moving average that TensorBoard uses:
from typing import List

def smooth(scalars: List[float], weight: float) -> List[float]:  # weight between 0 and 1
    last = scalars[0]  # first value in the plot (first timestep)
    smoothed = list()
    for point in scalars:
        smoothed_val = last * weight + (1 - weight) * point  # calculate smoothed value
        smoothed.append(smoothed_val)  # save it
        last = smoothed_val  # anchor the last smoothed value
    return smoothed
ax.plot(x_labels, smooth(train_data, .9), x_labels, train_data)
From the context of your question, I presume you mean curve fitting and not anti-aliasing. PyPlot doesn't have any built-in support for this, but you can easily implement some basic curve fitting yourself, like the code seen here; alternatively, if you're using GuiQwt, it has a curve fitting module. (You could probably also borrow the code from SciPy to do this.)
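As a rough illustration of rolling your own basic curve fit, here is a sketch using numpy.polyfit on the data from the question (a cubic polynomial is just an assumed model, not necessarily the right one for this data):
import numpy as np
import matplotlib.pyplot as plt

T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])

coeffs = np.polyfit(T, power, deg=3)         # fit a cubic polynomial
poly = np.poly1d(coeffs)
T_fine = np.linspace(T.min(), T.max(), 300)  # dense grid for a smooth curve

plt.plot(T, power, 'o', label='data')
plt.plot(T_fine, poly(T_fine), '-', label='cubic fit')
plt.legend()
plt.show()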
Here is a simple solution for dates:
from scipy.interpolate import make_interp_spline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as dates
from datetime import datetime
data = {
datetime(2016, 9, 26, 0, 0): 26060, datetime(2016, 9, 27, 0, 0): 23243,
datetime(2016, 9, 28, 0, 0): 22534, datetime(2016, 9, 29, 0, 0): 22841,
datetime(2016, 9, 30, 0, 0): 22441, datetime(2016, 10, 1, 0, 0): 23248
}
#create data
date_np = np.array(list(data.keys()))
value_np = np.array(list(data.values()))
date_num = dates.date2num(date_np)
# smooth
date_num_smooth = np.linspace(date_num.min(), date_num.max(), 100)
spl = make_interp_spline(date_num, value_np, k=3)
value_np_smooth = spl(date_num_smooth)
# plot
plt.plot(date_np, value_np)
plt.plot(dates.num2date(date_num_smooth), value_np_smooth)
plt.show()
It's worth your time looking at seaborn for plotting smoothed lines.
The seaborn lmplot function will plot data and regression model fits.
The following illustrates both polynomial and lowess fits:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
T = np.array([6, 7, 8, 9, 10, 11, 12])
power = np.array([1.53E+03, 5.92E+02, 2.04E+02, 7.24E+01, 2.72E+01, 1.10E+01, 4.70E+00])
df = pd.DataFrame(data = {'T': T, 'power': power})
sns.lmplot(x='T', y='power', data=df, ci=None, order=4, truncate=False)
sns.lmplot(x='T', y='power', data=df, ci=None, lowess=True, truncate=False)
The order = 4 polynomial fit is overfitting this toy dataset. I don't show it here but order = 2 and order = 3 gave worse results.
The lowess = True fit is underfitting this tiny dataset but may give better results on larger datasets.
Check the seaborn regression tutorial for more examples.
Another way to go, which slightly modifies the function depending on the parameters you use:
from statsmodels.nonparametric.smoothers_lowess import lowess
def smoothing(x, y):
    lowess_frac = 0.15  # size of data (%) used for estimation =~ smoothing window
    lowess_it = 0
    x_smooth = x
    y_smooth = lowess(y, x, is_sorted=False, frac=lowess_frac, it=lowess_it, return_sorted=False)
    return x_smooth, y_smooth
This was better suited to my specific use case than the other answers.