I have two equations that originate from one single matrix equation:
[x,y] = [[cos(n), -sin(n)],[sin(n), cos(n)]]*[x', y'],
where x' = Acos(w1*t+ p1) and y' = Bcos(w2*t + p2).
This is just a single matrix equation for the vector [x,y], but it can be decomposed into 2 scalar equations: x = A*cos(s)*cos(w1*t+ p1)*x' - B*sin(s)*sin(w2*t + p2)*y' and y = A*sin(s)*cos(w1*t+ p1)*x' + B*cos(s)*sin(w2*t + p2)*y'.
So I am fitting two datasets, x vs. t and y vs. t, but there are some shared parameters in these fits, namely A, B and s.
1) Can I fit directly the matrix equation, or do I have to decompose it into scalar ones? The former would be more elegant.
2) Can I fit with shared parameters on curve_fit? All relevant questions use other packages.
The below example code fits two different equations with one shared parameter using curve_fit. This is not an answer to your question, but I cannot format code in a comment so I'm posting it here.
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
y1 = np.array([ 16.00, 18.42, 20.84, 23.26, 25.68])
y2 = np.array([-20.00, -25.50, -31.00, -36.50, -42.00])
comboY = np.append(y1, y2)
h = np.array([5.0, 6.1, 7.2, 8.3, 9.4])
comboX = np.append(h, h)
def mod1(data, a, b, c): # not all parameters are used here
return a * data + c
def mod2(data, a, b, c): # not all parameters are used here
return b * data + c
def comboFunc(comboData, a, b, c):
# single data set passed in, extract separate data
extract1 = comboData[:len(y1)] # first data
extract2 = comboData[len(y2):] # second data
result1 = mod1(extract1, a, b, c)
result2 = mod2(extract2, a, b, c)
return np.append(result1, result2)
# some initial parameter values
initialParameters = np.array([1.0, 1.0, 1.0])
# curve fit the combined data to the combined function
fittedParameters, pcov = curve_fit(comboFunc, comboX, comboY, initialParameters)
# values for display of fitted function
a, b, c = fittedParameters
y_fit_1 = mod1(h, a, b, c) # first data set, first equation
y_fit_2 = mod2(h, a, b, c) # second data set, second equation
plt.plot(comboX, comboY, 'D') # plot the raw data
plt.plot(h, y_fit_1) # plot the equation using the fitted parameters
plt.plot(h, y_fit_2) # plot the equation using the fitted parameters
plt.show()
print(fittedParameters)
Related
I was following a tutorial for data fitting, and when I just changed original data to my data the fit became not quadratic.
Here's my code, thanks a lot for help:
# fit a second degree polynomial to the economic data
import numpy as np
from numpy import arange
from pandas import read_csv
from scipy.optimize import curve_fit
from matplotlib import pyplot
x = np.array([1,2,3,4,5,6])
y = np.array([1,4,12,29,54,104])
# define the true objective function
def objective(x, a, b, c):
return a * x + b * x**2 + c
# load the dataset
url = 'https://raw.githubusercontent.com/jbrownlee/Datasets/master/longley.csv'
dataframe = read_csv(url, header=None)
data = dataframe.values
# choose the input and output variables
# curve fit
popt, _ = curve_fit(objective, x, y)
# summarize the parameter values
a, b, c = popt
print('y = %.5f * x + %.5f * x^2 + %.5f' % (a, b, c))
# plot input vs output
pyplot.scatter(x, y)
# define a sequence of inputs between the smallest and largest known inputs
x_line = arange(min(x), max(x), 1)
# calculate the output for the range
y_line = objective(x_line, a, b, c)
# create a line plot for the mapping function
pyplot.plot(x_line, y_line, '--', color='red')
pyplot.show()```
I tried python matplotlib quadratic data fit, and I expected quadratic function but visually it's not.
I am fitting a set of experimental data (sample) within two different experimental regions and can be expressed with two mathematical functions as follows:
1st region:
y = m*x + c ( the slope can be constrained to zero)
2nd region:
y = d*exp(-k*x)
the experimental data is shown below and I coded it in python as follows:
def func(x, m, c, d, k):
return m*x+ c + d*np.exp(-k*x)
popt, pcov = curve_fit(func, t, y)
Unfortunately, my data is not fitting properly and fitted (returned) parameters do not make sense (see picture below).
Any assistance will be appreciated.
Very interesting question. As said by a_guest, you will have to fit to the two regions separately. However, I think you probably also want the two regions to connect smoothly at the point t0, the point where we switch from one model to the other. In order to do this, we need to add the constraint that y1 == y2 at the point t0.
In order to do this with scipy, look at scipy.optimize.minimize with the SLSQP method. However, I wrote a scipy wrapper to make this kind of thing easier, called symfit. I will show you how to do this with symfit, because I think it's better suited to the task, but with this example you should also be able to implement it with pure scipy if you prefer.
from symfit import parameters, variables, Fit, Piecewise, exp, Eq
import numpy as np
import matplotlib.pyplot as plt
t, y = variables('t, y')
m, c, d, k, t0 = parameters('m, c, d, k, t0')
# Help the fit by bounding the switchpoint between the models
t0.min = 0.6
t0.max = 0.9
# Make a piecewise model
y1 = m * t + c
y2 = d * exp(- k * t)
model = {y: Piecewise((y1, t <= t0), (y2, t > t0))}
# As a constraint, we demand equality between the two models at the point t0
# to do this, we substitute t -> t0 and demand equality using `Eq`
constraints = [Eq(y1.subs({t: t0}), y2.subs({t: t0}))]
# Read the data
tdata, ydata = np.genfromtxt('Experimental Data.csv', delimiter=',', skip_header=1).T
fit = Fit(model, t=tdata, y=ydata, constraints=constraints)
fit_result = fit.execute()
print(fit_result)
plt.scatter(tdata, ydata)
plt.plot(tdata, fit.model(t=tdata, **fit_result.params).y)
plt.show()
Since your data shows different behavior in different regions you also need to fit the data on these different regions. That is instead of making a sum of the two models (functions) you should fit one the left region with y = m*x + c and separately on the right region with y = d*exp(-k*x). If you have trouble finding the boundary of the two regions you could assess this by comparing the goodness of fit.
popt_1, pcov_1 = curve_fit(lambda x, m, c: m*x + c, t[t < 0.8], y[t < 0.8], p0=(1, 0))
popt_2, pcov_2 = curve_fit(lambda x, d, k: d*exp(-k*x), t[t >= 0.8], y[t >= 0.8], p0=(400, 1))
Edit
Example code:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit
df = pd.read_csv('test.csv', index_col=None)
t = df.t.values
y = df.Y.values
boundary = t[y.argmax()]
t1 = t[t < boundary]
y1 = y[t < boundary]
t2 = t[t >= boundary]
y2 = y[t >= boundary]
f1 = lambda x, m, c: m*x + c
f2 = lambda x, d, k: d*np.exp(-k*x)
popt_1 ,pcov_1 = curve_fit(f1, t1, y1, p0=((y1[-1] - y1[0]) / (t1[-1] - t1[0]), y1[0]))
popt_2 ,pcov_2 = curve_fit(f2, t2, y2, p0=(y2[0], 1))
plt.title('Fitted data on two different domains')
plt.xlabel('t [a.u.]')
plt.ylabel('y [a.u.]')
plt.plot(t, y, '-o', label='Data')
plt.plot(t1, f1(t1, *popt_1), '--', color='#ff7f0e', lw=3, label='Fit')
plt.plot(t2, f2(t2, *popt_2), '--', color='#ff7f0e', lw=3, label='_nolegend_')
plt.grid()
plt.legend()
plt.show()
Which produces the following plot:
Note that the resulting "compound" function is not continuous at the boundary. If that is undesired you can resolve it by fixing one the fit parameters (e.g.k) before fitting the other domain (one way or the other). Alternatively you could fit both regions separately, then determine the value at the boundary as the average of the two separate functions (i.e. y_b = (f1(t1[-1], *popt_1) + f2(t2[0], *popt_2)) / 2) and then repeat the fitting by constraining the parameters such that this boundary condition is fulfilled.
For example fitting the linear function first and then fixing the d parameter of the exponential in order to have a continuous transition at the boundary (note that the linear function f1 is extrapolated outside its domain at t2[0] in order to ensure the continuity):
f1 = lambda x, m, c: m*x + c
popt_1, pcov_1 = curve_fit(f1, t1, y1, p0=((y1[-1] - y1[0]) / (t1[-1] - t1[0]), y1[0]))
d = f1(t2[0], *popt_1)
f2 = lambda x, k: d*np.exp(-k*(x - boundary))
popt_2, pcov_2 = curve_fit(f2, t2, y2, p0=(1,))
Which produces the following plot:
If you would prefer to use a single equation, I found that the Hocket-Sherby equation "y = b - (b-a) * exp(-c * (x**d))" seems like an OK fit to your data, yielding an R-squared of 0.99 and RMSE of 11.2 with parameters a = 1.1262189756312683E+01, b = 3.2040596733114870E+02, c = 3.9385197507261771E-01, and d = -4.7723382040098095E+00
I was wondering if there is a way I can use LMfit to fit datasets that have different sizes with the same model with a some shared and some independent parameters.
This link is the closest I found to a similar question but it assumes the same x for all y.
Thanks for all advice and input
Per your comment that a scipy example is OK:
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
y1 = np.array([ 16.00, 18.42, 20.84, 23.26])
y2 = np.array([-20.00, -25.50, -31.00, -36.50, -42.00])
comboY = np.append(y1, y2)
x1 = np.array([5.0, 6.1, 7.2, 8.3])
x2 = np.array([15.0, 16.1, 17.2, 18.3, 19.4])
comboX = np.append(x1, x2)
if len(y1) != len(x1):
raise(Exception('Unequal x1 and y1 data length'))
if len(y2) != len(x2):
raise(Exception('Unequal x2 and y2 data length'))
def function1(data, a, b, c): # not all parameters are used here, c is shared
return a * data + c
def function2(data, a, b, c): # not all parameters are used here, c is shared
return b * data + c
def combinedFunction(comboData, a, b, c):
# single data reference passed in, extract separate data
extract1 = comboData[:len(x1)] # first data
extract2 = comboData[len(x1):] # second data
result1 = function1(extract1, a, b, c)
result2 = function2(extract2, a, b, c)
return np.append(result1, result2)
# some initial parameter values
initialParameters = np.array([1.0, 1.0, 1.0])
# curve fit the combined data to the combined function
fittedParameters, pcov = curve_fit(combinedFunction, comboX, comboY, initialParameters)
# values for display of fitted function
a, b, c = fittedParameters
y_fit_1 = function1(x1, a, b, c) # first data set, first equation
y_fit_2 = function2(x2, a, b, c) # second data set, second equation
plt.plot(comboX, comboY, 'D') # plot the raw data
plt.plot(x1, y_fit_1) # plot the equation using the fitted parameters
plt.plot(x2, y_fit_2) # plot the equation using the fitted parameters
plt.show()
print('a, b, c:', fittedParameters)
I want to fit complex data set with a two functions which shared the same parameters. For this I used
def funcReal(x,a,b,c,d):
return np.real((a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x)))
def funcImag(x,a,b,c,d):
return np.imag((a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x)))`
poptReal, pcovReal = curve_fit(funcReal, x, yReal)
poptImag, pcovImag = curve_fit(funcImag, x, yImag)
Here funcReal is the real part of my model, funcImag the imaginary part, yReal the real part of the data and yImag the imaginary part of the data.
However, both fits does not give me the same parameters for the real and imaginary part.
My question is there a package or a method such that I can realized multi fits for multiple data sets and multiple functions with shared parameters?
To fit both the complex function given above, we can treat the real and imaginary components as a coordinate point, or as a vector. Since curve_fit doesn't care about the order at which data points are inserted in the vectors x (independent data) and y (dependent data), we can simply split the complex data and stack the real and imaginary components using hstack. See the example below.
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
kappa1 = np.pi
kappa2 = -0.01
def long_function(x, a, b, c, d):
return (a + 1j*b)*(np.exp(1j*k*x - kappa1*x) - np.exp(kappa2*x)) + (c + 1j*d)*(np.exp(-1j*k*x - kappa1*x) - np.exp(-kappa2*x))
def funcBoth(x, a, b, c, d):
N = len(x)
x_real = x[:N//2]
x_imag = x[N//2:]
y_real = np.real(long_function(x_real, a, b, c, d))
y_imag = np.imag(long_function(x_imag, a, b, c, d))
return np.hstack([y_real, y_imag])
# Create an independent variable with 100 measurements
N = 100
x = np.linspace(0, 10, N)
# True values of the dependent variable
y = long_function(x, a=1.1, b=0.3, c=-0.2, d=0.23)
# Add uniform complex noise (real + imaginary)
noise = (np.random.rand(N) + 1j * np.random.rand(N) - 0.5 - 0.5j) * 0.1
yNoisy = y + noise
# Split the measurements into a real and imaginary part
yReal = np.real(yNoisy)
yImag = np.imag(yNoisy)
yBoth = np.hstack([yReal, yImag])
# Find the best-fit solution
poptBoth, pcovBoth = curve_fit(funcBoth, np.hstack([x, x]), yBoth)
# Compute the best-fit solution
yFit = long_function(x, *poptBoth)
print(poptBoth)
# Plot the results
plt.figure(figsize=(9, 4))
plt.subplot(121)
plt.plot(x, np.real(yNoisy), "k.", label="Noisy y")
plt.plot(x, np.real(y), "r--", label="True y")
plt.plot(x, np.real(yFit), label="Best fit")
plt.ylabel("Real part of y")
plt.xlabel("x")
plt.legend()
plt.subplot(122)
plt.plot(x, np.imag(yNoisy), "k.")
plt.plot(x, np.imag(y), "r--")
plt.plot(x, np.imag(yFit))
plt.ylabel("Imaginary part of y")
plt.xlabel("x")
plt.tight_layout()
plt.show()
Result:
The best-fit parameters that were found in this example were a = 1.14, b = 0.375, c = -0.236, and d = 0.163, which are close enough to the true parameter values given the amplitude of the noise that I inserted here.
I'm stuck trying to fit a bipolar sigmoid curve - I'd like to have the following curve:
but I need it shifted and stretched. I have the following inputs:
x[0] = 8, x[48] = 2
So over 48 periods I need to drop from 8 to 2 using a bipolar sigmoid function to approximate a nice smooth dropoff. Any ideas how I could derive the curve that would fit those parameters?
Here's what I have so far, but I need to change the sigmoid function:
import math
def sigmoid(x):
return 1 / (1 + math.exp(-x))
plt.plot([sigmoid(float(z)) for z in range(1,48)])
You could redefine the sigmoid function like so
def sigmoid(x, a, b, c, d):
""" General sigmoid function
a adjusts amplitude
b adjusts y offset
c adjusts x offset
d adjusts slope """
y = ((a-b) / (1 + np.exp(x-(c/2))**d)) + b
return y
x = np.arange(49)
y = sigmoid(x, 8, 2, 48, 0.3)
plt.plot(x, y)
Severin's answer is likely more robust, but this should be fine if all you want is a quick and dirty solution.
In [2]: y[0]
Out[2]: 7.9955238269969806
In [3]: y[48]
Out[3]: 2.0044761730030203
From generic bipolar sigmoid function:
f(x,m,b)= 2/(1+exp(-b*(x-m))) - 1
there are two parameters and two unknowns - shift m and scale b
You have two condition:f(0) = 8, f(48) = 2
take first condition, express b vs m, together with second condition write non-linear function to solve, and then use fsolve from SciPy to solve it numerically, and recover back b and m.
Here related by similar method question and answer: How to random sample lognormal data in Python using the inverse CDF and specify target percentiles?
Alternatively, you could also use curve_fit which might come in handy if you have more than just two datapoints. The output looks like this:
As you can see, the graph contains the desired data points. I used #lanery's function for the fit; you can of course choose any function you like. This is the code with some inline comments:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def sigmoid(x, a, b, c, d):
return ((a - b) / (1. + np.exp(x - (c / 2)) ** d)) + b
# one needs at least as many data points as parameters, so I just duplicate the data
xdata = [0., 48.] * 2
ydata = [8., 2.] * 2
# plot data
plt.plot(xdata, ydata, 'bo', label='data')
# fit the data
popt, pcov = curve_fit(sigmoid, xdata, ydata, p0=[1., 1., 50., 0.5])
# plot the result
xdata_new = np.linspace(0, 50, 100)
plt.plot(xdata_new, sigmoid(xdata_new, *popt), 'r-', label='fit')
plt.legend(loc='best')
plt.show()