How to fit part of a Cosine curve to data in Python? - python

I have written this code to try to fit a curve of the form y = a(1 + cos(bx - pi)) + c to the data we collected, but when I use np.cos the fit forces an entire cycle of the cosine onto the data, which doesn't match our results. Any help on how to fit only a section of the curve to our data would be fab!
I tried to avoid cos by using a Maclaurin series expansion, but this still doesn't work.
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
x_data = w
y_data = mean
e = error
def test_func(x, a, b, c):
    # Fourth-order series approximation of cos(b*x - pi), expanded about b*x - pi = 0
    y = (a/2)*(1 + (1 - (1/2)*(b*x - np.pi)**2 + (1/24)*(b*x - np.pi)**4)) + c
    return y
params, params_covariance = optimize.curve_fit(test_func, x_data, y_data)
print(params)
a = params[0]
b = params[1]
c = params[2]
plt.figure(num=None, figsize=(12, 6), dpi=80, facecolor='w', edgecolor='k')
plt.errorbar(x_data, y_data, yerr=e, fmt='o', marker='o', label='Data', markersize=3, color='k', elinewidth=1, capsize=2, markeredgewidth=1)
plt.plot(x_data, test_func(x_data, params[0], params[1], params[2]), label='Fitted function')
plt.legend(loc='best')
plt.ylabel('Interference intensity, $I$')
plt.xlabel('Rotational velocity of interferometer, $w$')
plt.show()

Your question is how to fit only a section of a curve to your data. One approach is to define a piecewise function and fit each section of the data to the corresponding piece; you need to choose the cut-off values that separate the sections and decide which function to fit to each one.
To fit a curve to only one section of the data, pass only that portion of the data to curve_fit. Here are working examples of fitting the data to both a cosine function and a Maclaurin series:
import numpy as np
import matplotlib.pyplot as plt
from scipy import optimize
# Generate sample data
np.random.seed(0)
x_data = np.linspace(-np.pi,3*np.pi,101)
y_data = np.cos(x_data) + np.random.rand(len(x_data))/4
idx = (x_data < 0) | (x_data > 2*np.pi)
y_data[idx] = 1 + np.random.rand(sum(idx))/4
e = np.random.rand(len(x_data))/10
# Select part of data to fit
fit_part = ~idx
x_data_to_fit = x_data[fit_part]
y_data_to_fit = y_data[fit_part]
Cosine Function:
def test_func(x, a, b):
    y = a*np.cos(b*x)
    return y
params, params_covariance = optimize.curve_fit(test_func, x_data_to_fit, y_data_to_fit)
print(params)
a = params[0]
b = params[1]
plt.figure(num=None, figsize=(12, 6), dpi=80, facecolor='w', edgecolor='k')
plt.title('Cosine Function Fit')
plt.errorbar(x_data, y_data, yerr=e, fmt='o', marker='o', label='Data', markersize=3, color='k', elinewidth=1, capsize=2, markeredgewidth=1)
plt.plot(x_data_to_fit, test_func(x_data_to_fit, a, b), label='Fitted function')
plt.legend(loc='best')
plt.ylabel('Interference intensity, $I$')
plt.xlabel('Rotational velocity of interferometer, $w$')
plt.show()
Maclaurin Series:
def test_func(x, a, b, c):
    y = (a/2)*(1 + (1 - (1/2)*(b*x - np.pi)**2 + (1/24)*(b*x - np.pi)**4)) + c
    return y
params, params_covariance = optimize.curve_fit(test_func, x_data_to_fit, y_data_to_fit)
print(params)
a = params[0]
b = params[1]
c = params[2]
plt.figure(num=None, figsize=(12, 6), dpi=80, facecolor='w', edgecolor='k')
plt.title('MacLaurin Series Fit')
plt.errorbar(x_data, y_data, yerr=e, fmt='o', marker='o', label='Data', markersize=3, color='k', elinewidth=1, capsize=2, markeredgewidth=1)
plt.plot(x_data_to_fit, test_func(x_data_to_fit, a, b, c), label='Fitted function')
plt.legend(loc='best')
plt.ylabel('Interference intensity, $I$')
plt.xlabel('Rotational velocity of interferometer, $w$')
plt.show()
The cosine function matches the data better than the Maclaurin series in this case because the data was generated using a cosine function.
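The same idea can be applied directly to the model from the question. The sketch below is only an illustration under stated assumptions: the data are synthetic stand-ins (the names true_params and p0 and all numeric values are made up), and they cover roughly a third of a cycle. When p0 is omitted, curve_fit starts every parameter at 1, so supplying a rough guess for b that reflects how much of the cycle the data actually span may be what keeps the optimizer from squeezing a whole period onto the points.

```python
import numpy as np
from scipy import optimize

def model(x, a, b, c):
    # Model from the question: y = a*(1 + cos(b*x - pi)) + c
    return a * (1 + np.cos(b * x - np.pi)) + c

# Synthetic stand-in data (an assumption): b*x runs from 0 to 2 rad,
# i.e. only part of one cycle is sampled.
rng = np.random.default_rng(0)
x_data = np.linspace(0.0, 4.0, 40)
true_params = (2.0, 0.5, 1.0)            # hypothetical "true" a, b, c
e = np.full_like(x_data, 0.05)           # hypothetical measurement errors
y_data = model(x_data, *true_params) + rng.normal(0.0, e)

# Rough initial guess; b is chosen so that b*x stays below one full cycle.
p0 = (1.5, 0.4, 0.5)
params, params_covariance = optimize.curve_fit(
    model, x_data, y_data, p0=p0, sigma=e, absolute_sigma=True
)
print(params)
print(np.sqrt(np.diag(params_covariance)))  # 1-sigma parameter uncertainties
```

Passing sigma=e also lets the measured error bars weight the fit. With only part of a cycle, a and b tend to be strongly correlated, so the printed uncertainties are worth checking before trusting either value on its own.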

Related

Plotting histogram of probability function and results of inverse transform sampling

I am trying to draw samples from the probability function P(s)=C/s by inverse transform sampling, then plot a histogram of the samples together with the expected probability function:
import numpy as np
import matplotlib.pyplot as plt
s_min = 1
s_max = 1000
# calculate the normalization constant
C = 1 / (np.log(s_max) - np.log(s_min))
u = np.random.rand(int(1000000))
s = s_min * np.exp(u * (np.log(s_max) - np.log(s_min)))
a = np.log10(min(s))
b = np.log10(max(s))
mybins = np.logspace(a, b, num=17)
plt.hist(s, bins=mybins, density=True, histtype='step', log=True, label='Random Numbers')
x = np.logspace(a, b, num=100)
y = C / x
plt.plot(x, y, 'r', label='Expected Distribution')
plt.xlabel('s')
plt.ylabel('P(s)')
plt.xscale('log')
plt.yscale('log')
plt.legend()
plt.show()
But the code generates an empty plot with only the labels.
I tried adding %matplotlib inline, but nothing changed.
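One way to separate a sampling problem from a plotting problem is to check the samples numerically, without matplotlib. The sketch below is a minimal check under assumptions of its own (a smaller sample size, a seeded generator, and bin edges fixed at s_min and s_max; the names edges, expected and rel_err are illustrative). It compares the normalized histogram with the exact average of C/s over each bin, which is C*ln(hi/lo)/(hi - lo).

```python
import numpy as np

s_min, s_max = 1.0, 1000.0
C = 1.0 / np.log(s_max / s_min)          # normalization constant of P(s) = C/s

rng = np.random.default_rng(0)
u = rng.random(200_000)
s = s_min * np.exp(u / C)                # inverse CDF of P(s) on [s_min, s_max)

# Logarithmically spaced bins and the normalized histogram
edges = np.logspace(np.log10(s_min), np.log10(s_max), num=17)
density, _ = np.histogram(s, bins=edges, density=True)

# Exact mean of C/s over each bin: integral of C/s divided by the bin width
lo, hi = edges[:-1], edges[1:]
expected = C * np.log(hi / lo) / (hi - lo)

rel_err = np.abs(density - expected) / expected
print(rel_err.max())
```

If the relative errors are small, the sampling itself is behaving as intended and the empty figure is more likely a plotting or environment issue.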

Trying to fit data into sine cosine curve fit using scipy

I am new to signal processing in Python.
I am trying to fit data to a sine-cosine curve using the equation
A * np.sin(x) + B * np.cos(x) + C.
Here is a snippet of the code:
def func(x, A, B, C):
    return A * np.sin(x) + B * np.cos(x) + C
p0 = 0, 25, 10
popt, pcov = curve_fit(func, x, y)
times = np.linspace(x[0], x[-1], num=21)
plt.plot(x, y, 'o', color='red', label="data")
plt.plot(times, func(times, *popt), '--', color='blue', label="optimized data")
# plt.plot(x, func(x, *popt), '--', color='blue', label="optimized data")
plt.legend()
plt.show()
I am getting the output below (in the image).
Could anyone help me spot the mistake, or offer any suggestions on the code?
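Two things stand out in the snippet: p0 is defined but never passed to curve_fit, and the model has no frequency parameter, so it can only describe an oscillation with angular frequency 1. Without seeing the data this is only a guess, but if the data oscillate at a different frequency, a variant like the following might help. It is a sketch on synthetic data; func_w, the extra parameter w, and every numeric value are assumptions introduced here for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def func_w(x, A, B, C, w):
    # Same model as in the question, with an adjustable angular frequency w
    return A * np.sin(w * x) + B * np.cos(w * x) + C

# Synthetic stand-in data (an assumption): true frequency 2, offset 10
rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 200)
y = func_w(x, 3.0, 1.5, 10.0, 2.0) + rng.normal(0.0, 0.2, x.size)

# The initial guess is actually passed in; the frequency guess should be
# reasonably close to the true value for the fit to converge.
p0 = (1.0, 1.0, y.mean(), 1.9)
popt, pcov = curve_fit(func_w, x, y, p0=p0)
print(popt)

# Evaluate the fitted curve on a dense grid so that it looks smooth when plotted
times = np.linspace(x[0], x[-1], num=500)
fitted = func_w(times, *popt)
```

The dense grid matters as well: evaluating the fitted function at only 21 points, as in the snippet, can make a perfectly good fit look jagged.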

How to create a confidence interval with plt.fill_between inside a scatter plot

I created a scatter plot that uses data from two sources: x = [] and y = []. In a second step, I added a linear regression line for the two lists of data above using the following code:
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
The result of that is a standard scatterplot as shown below.
Now I would like to add a 95% confidence interval to the black regression line, using plt.fill_between. I know that there are many topics on this and I have read through many of them, but I cannot solve the problem, i.e., adapt the code from those answers to my particular code and regression line.
Adding
CI = 1.96 * np.std(y) / np.mean(y)
plt.fill_between(y, (y-CI), (y+CI), color='blue', alpha=0.1)
to my code results in the following output below.
The blueish confidence interval by plt.fill_between is somewhere drawn on the left side of the image, but not around the regression line. What I would like to achieve is that the confidence interval draws around the black regression line. The full code is shown subsequently:
import numpy as np
import matplotlib.pyplot as plt
# Scatter plot
x = [0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
0.699367, 0.384102, 0.561493]
y = [0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
0.133973, 0.146611, 0.129792]
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
CI = 1.96 * np.std(y) / np.mean(y)
plt.fill_between(y, (y-CI), (y+CI), color='blue', alpha=0.1)
plt.scatter(x, y, s=250, linewidths=2, zorder=2)
plt.show()
You should draw the band around the predicted values Y_Polyval rather than the true values y, pass x (not y) as the first argument of fill_between, and sort the (x, y) pairs by x so that the area is filled correctly:
plt.fill_between(x, (Y_Polyval-CI), (Y_Polyval+CI), color='blue', alpha=0.1)
Full Example
import numpy as np
import matplotlib.pyplot as plt
# Scatter plot
x = [0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
0.699367, 0.384102, 0.561493]
y = [0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
0.133973, 0.146611, 0.129792]
# Sort coordinate pairs by x so that fill_between draws a clean band
coords = sorted(zip(x, y), key=lambda p: p[0])
x, y = zip(*coords)
(m, b) = np.polyfit(x, y, 1)
Y_Polyval = np.polyval([m, b], x)
# Same band half-width as in the question
CI = 1.96 * np.std(y) / np.mean(y)
plt.plot(x, Y_Polyval, linewidth=3, c="black")
plt.scatter(x, y, s=250, linewidths=2, zorder=2)
plt.fill_between(x, (Y_Polyval-CI), (Y_Polyval+CI), color='blue', alpha=0.1)
plt.show()
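Note that 1.96 * np.std(y) / np.mean(y) is a scaled coefficient of variation of y rather than a confidence interval for the regression line. As a hedged alternative, the sketch below uses the standard textbook formula for the 95% confidence band of the mean response in simple linear regression, assuming independent, normally distributed residuals with constant variance. The names s_err, t_crit, half_width, lower and upper are introduced here for illustration.

```python
import numpy as np
from scipy import stats

x = np.array([0.472202, 0.685151, 0.287613, 0.546364, 0.518002, 0.675128, 0.462418, 0.61817, 0.692822, 0.23433,
              0.194009, 0.720232, 0.597321, 0.625955, 0.660571, 0.737754, 0.436876, 0.689937, 0.483067, 0.646723,
              0.699367, 0.384102, 0.561493])
y = np.array([0.131113, 0.123865, 0.150355, 0.138914, 0.140417, 0.119358, 0.130019, 0.129782, 0.113508, 0.13434,
              0.15162, 0.125768, 0.128473, 0.128056, 0.114403, 0.142878, 0.139192, 0.118033, 0.132616, 0.133043,
              0.133973, 0.146611, 0.129792])

# Sort by x so the band can be drawn left to right
order = np.argsort(x)
x, y = x[order], y[order]

# Least-squares line and residual standard error (n - 2 degrees of freedom)
m, b = np.polyfit(x, y, 1)
y_hat = m * x + b
n = x.size
s_err = np.sqrt(np.sum((y - y_hat) ** 2) / (n - 2))

# Two-sided 95% critical value of Student's t distribution
t_crit = stats.t.ppf(0.975, df=n - 2)

# Half-width of the band: narrowest at mean(x), wider towards the ends
sxx = np.sum((x - x.mean()) ** 2)
half_width = t_crit * s_err * np.sqrt(1.0 / n + (x - x.mean()) ** 2 / sxx)
lower, upper = y_hat - half_width, y_hat + half_width
```

The arrays can then be passed to plt.fill_between(x, lower, upper, color='blue', alpha=0.1) in place of the constant-width band.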

Plot a model with multiple curve_fit parameters

I have a model that describes a sum of Gaussian distributions:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
s1 = np.random.normal(2, 0.5, size = (1000, 1))
s2 = np.random.normal(5, 0.5, size = (1000, 1))
mb = (np.concatenate((s1, s2), axis=0)).max()
Xi = np.arange(0,mb,0.1) #bins
#histogram population 1
Y11, bins1 = np.histogram(s1, Xi)
Y1 = Y11/Y11.sum()
X1 = bins1[:-1]
#histogram population 2
Y22, bins2 = np.histogram(s2, Xi)
Y2 = Y22/Y22.sum()
X2 = bins2[:-1]
#universe, with all mixed populations
S = np.concatenate((s1, s2), axis=0)
Yi, bins = np.histogram(S, Xi)
Y = Yi/Yi.sum()
X = bins[:-1]
def gaussians(X, amp1, mean1, SD1, amp2, mean2, SD2):
    A = amp1 * np.exp(-0.5*((X - mean1)/SD1)**2)
    B = amp2 * np.exp(-0.5*((X - mean2)/SD2)**2)
    return A + B
params, pcov = curve_fit(gaussians, X,Y, p0=(1,2,1,1,5,1), maxfev=4000)
j = np.arange(0.1, mb, 0.1)
plt.figure(figsize=(10, 6)) #size of graph
plt.plot(X, Y, 'o', linewidth=2)
plt.plot(X, gaussians(X ,params[0], params[1],params[2], params[3], params[4], params[5]),'b', linewidth=2)
plt.xlim([-.01, mb])
plt.ylim([0, 0.1])
plt.show()
This code plots a nice graph, as follows:
I wonder how to plot each Gaussian separately, overlaid on the same graph, using the fitted parameters of my model function. I mean something like this (made by hand):
For anyone looking for the answer, I figured out how to do it. You only need to set to zero the amplitude of each Gaussian that you don't want to plot:
plt.plot(X, gaussians(X ,params[0], params[1],params[2], params[3], params[4], params[5]),'b', linewidth=8, alpha=0.1)
plt.plot(X, gaussians(X ,0, params[1],params[2], params[3], params[4], params[5]),'r', linewidth=1 )
plt.plot(X, gaussians(X ,params[0], params[1],params[2], 0, params[4], params[5]),'g', linewidth=1)
plt.xlim([-.01, mb])
plt.ylim([0, 0.1])
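An alternative to zeroing amplitudes is to factor the single-peak formula into its own function and build the two-peak model from it, so that each component can be evaluated from its own slice of the fitted parameters. This is only a sketch: the helper name gaussian and the seeded synthetic samples are assumptions introduced here, not part of the original code.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(X, amp, mean, SD):
    # One Gaussian-shaped peak
    return amp * np.exp(-0.5 * ((X - mean) / SD) ** 2)

def gaussians(X, amp1, mean1, SD1, amp2, mean2, SD2):
    # Two-peak model expressed as the sum of two single peaks
    return gaussian(X, amp1, mean1, SD1) + gaussian(X, amp2, mean2, SD2)

# Synthetic stand-in for the two mixed populations
rng = np.random.default_rng(0)
S = np.concatenate((rng.normal(2, 0.5, 1000), rng.normal(5, 0.5, 1000)))
Xi = np.arange(0, S.max(), 0.1)          # bin edges
Yi, bins = np.histogram(S, Xi)
Y = Yi / Yi.sum()
X = bins[:-1]

params, pcov = curve_fit(gaussians, X, Y, p0=(1, 2, 1, 1, 5, 1), maxfev=4000)

# Each component uses its own three fitted parameters
comp1 = gaussian(X, *params[:3])
comp2 = gaussian(X, *params[3:])
```

Plotting comp1 and comp2 against X with plt.plot then gives the individual curves, and their sum reproduces the full fitted model.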

How to take into account the data's uncertainty (standard deviation) when fitting with scipy.linalg.lstsq?

I am trying to fit a surface to 3D data (z is a function of x and y). I have asymmetrical error bars for each point, and I would like the fit to take this uncertainty into account.
I am using scipy.linalg.lstsq(). It does not have any option for uncertainties in its arguments.
I am trying to adapt some code found on this page.
import numpy as np
import scipy.linalg
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt
# Create data with x and y random over [-1, 1], and z an exponential function of x and y.
np.random.seed(12345)
x = 2 * (np.random.random(500) - 0.5)
y = 2 * (np.random.random(500) - 0.5)
def f(x, y):
    return np.exp(-(x + y ** 2))
z = f(x, y)
data = np.c_[x,y,z]
# regular grid covering the domain of the data
mn = np.min(data, axis=0)
mx = np.max(data, axis=0)
X,Y = np.meshgrid(np.linspace(mn[0], mx[0], 20), np.linspace(mn[1], mx[1], 20))
XX = X.flatten()
YY = Y.flatten()
# best-fit quadratic curve (2nd-order)
A = np.c_[np.ones(data.shape[0]), data[:,:2], np.prod(data[:,:2], axis=1), data[:,:2]**2]
C,_,_,_ = scipy.linalg.lstsq(A, data[:,2])
# evaluate it on a grid
Z = np.dot(np.c_[np.ones(XX.shape), XX, YY, XX*YY, XX**2, YY**2], C).reshape(X.shape)
# plot points and fitted surface using Matplotlib
fig = plt.figure(figsize=(10, 10))
ax = fig.add_subplot(projection='3d')  # fig.gca(projection='3d') was removed in newer Matplotlib releases
ax.plot_surface(X, Y, Z, rstride=1, cstride=1, alpha=0.2)
ax.scatter(data[:,0], data[:,1], data[:,2], c='r', s=50)
plt.xlabel('X')
plt.ylabel('Y')
ax.set_zlabel('Z')
ax.axis('equal')
ax.axis('tight')
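scipy.linalg.lstsq has no weights argument, but a weighted least-squares fit can be obtained by scaling each row of the design matrix and the matching target value by 1/sigma before solving. The sketch below assumes one symmetric standard deviation per point, stored in a hypothetical array sigma. Asymmetric error bars would first have to be reduced to a single value per point, for example by averaging the upper and lower errors, which is an approximation and not something the code above prescribes.

```python
import numpy as np
import scipy.linalg

# Synthetic stand-in data with a hypothetical per-point standard deviation
rng = np.random.default_rng(12345)
x = 2 * (rng.random(500) - 0.5)
y = 2 * (rng.random(500) - 0.5)
sigma = rng.uniform(0.05, 0.5, size=x.size)
z = np.exp(-(x + y ** 2)) + rng.normal(0.0, sigma)

# Design matrix of the quadratic surface: 1, x, y, x*y, x^2, y^2
A = np.c_[np.ones(x.size), x, y, x * y, x ** 2, y ** 2]

# Weighted fit: minimizes sum(((z - A @ C) / sigma)**2)
w = 1.0 / sigma
C_w, _, _, _ = scipy.linalg.lstsq(A * w[:, None], z * w)

# Unweighted fit, for comparison
C_u, _, _, _ = scipy.linalg.lstsq(A, z)
print(C_w)
print(C_u)
```

The coefficients in C_w can be used in place of C when evaluating the surface on the grid; points with large sigma then pull on the surface less than points with small sigma.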
