How to compute standard deviation errors with scipy.optimize.least_squares


I am comparing fits done with optimize.curve_fit and optimize.least_squares. With curve_fit I get the covariance matrix pcov as an output, and I can calculate the standard deviation errors of my fitted variables from it:
perr = np.sqrt(np.diag(pcov))
If I do the fitting with least_squares, I do not get any covariance matrix output and I am not able to calculate the standard deviation errors for my variables.
Here's my example:
#import modules
import matplotlib
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import least_squares
noise = 0.5
N = 100
t = np.linspace(0, 4*np.pi, N)
# generate data
def generate_data(t, freq, amplitude, phase, offset, noise=0, n_outliers=0, random_state=0):
    # formula for data generation with noise and outliers
    y = np.sin(t * freq + phase) * amplitude + offset
    rnd = np.random.RandomState(random_state)
    error = noise * rnd.randn(t.size)
    outliers = rnd.randint(0, t.size, n_outliers)
    error[outliers] *= 10
    return y + error
#generate data
data = generate_data(t, 1, 3, 0.001, 0.5, noise, n_outliers=10)
#initial guesses
p0=np.ones(4)
x0=np.ones(4)
# create the function we want to fit
def my_sin(x, freq, amplitude, phase, offset):
    return np.sin(x * freq + phase) * amplitude + offset
# create the function we want to fit for least-square
def my_sin_lsq(x, t, y):
    # freq=x[0]
    # amplitude=x[1]
    # phase=x[2]
    # offset=x[3]
    return (np.sin(t*x[0]+x[2])*x[1]+ x[3]) - y
# now do the fit for curve_fit
fit = curve_fit(my_sin, t, data, p0=p0)
print('Curve fit output:'+str(fit[0]))
#now do the fit for least_square
res_lsq = least_squares(my_sin_lsq, x0, args=(t, data))
print('Least_squares output:'+str(res_lsq.x))
# we'll use this to plot our first estimate. This might already be good enough for you
data_first_guess = my_sin(t, *p0)
#data_first_guess_lsq = x0[2]*np.sin(t*x0[0]+x0[1])+x0[3]
data_first_guess_lsq = my_sin(t, *x0)
# recreate the fitted curve using the optimized parameters
data_fit = my_sin(t, *fit[0])
data_fit_lsq = my_sin(t, *res_lsq.x)
#calculation of residuals
residuals = data - data_fit
residuals_lsq = data - data_fit_lsq
ss_res = np.sum(residuals**2)
ss_tot = np.sum((data-np.mean(data))**2)
ss_res_lsq = np.sum(residuals_lsq**2)
ss_tot_lsq = np.sum((data-np.mean(data))**2)
#R squared
r_squared = 1 - (ss_res/ss_tot)
r_squared_lsq = 1 - (ss_res_lsq/ss_tot_lsq)
print('R squared curve_fit is:'+str(r_squared))
print('R squared least_squares is:'+str(r_squared_lsq))
plt.figure()
plt.plot(t, data)
plt.title('curve_fit')
plt.plot(t, data_first_guess)
plt.plot(t, data_fit)
plt.plot(t, residuals)
plt.figure()
plt.plot(t, data)
plt.title('lsq')
plt.plot(t, data_first_guess_lsq)
plt.plot(t, data_fit_lsq)
plt.plot(t, residuals_lsq)
#error
perr = np.sqrt(np.diag(fit[1]))
print('The standard deviation errors for curve_fit are:' +str(perr))
I would be very thankful for any help, best wishes
ps: I got a lot of input from this source and used part of the code Robust regression

The result object returned by optimize.least_squares has an attribute called jac. From the documentation:
jac : ndarray, sparse matrix or LinearOperator, shape (m, n)
Modified Jacobian matrix at the solution, in the sense that J^T J is a Gauss-Newton approximation of the Hessian of the cost function. The type is the same as the one used by the algorithm.
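This can be used to estimate the covariance matrix of the parameters with the formula Sigma = (J'J)^-1. The formula holds as written when the residuals are already divided by the data uncertainties; for unweighted residuals, multiply it by the residual variance sum(res_lsq.fun**2)/(m - n).
J = res_lsq.jac
cov = np.linalg.inv(J.T.dot(J))
The standard deviations of the parameters are then the square roots of the diagonal:
perr = np.sqrt(np.diagonal(cov))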
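A minimal end-to-end sketch of this recipe, assuming unweighted residuals (so the covariance is scaled by the residual variance) and a starting point close to the true parameters. The synthetic data, the starting values, and the names s_sq, perr_lsq and perr_cf are illustrative and not taken from the question; the result is compared with what curve_fit reports for the same data.

```python
import numpy as np
from scipy.optimize import curve_fit, least_squares


def model(t, freq, amplitude, phase, offset):
    return np.sin(t * freq + phase) * amplitude + offset


def residuals(x, t, y):
    # least_squares expects a vector of residuals
    return model(t, *x) - y


# synthetic data (illustrative values)
rng = np.random.RandomState(0)
t = np.linspace(0, 4 * np.pi, 100)
y = model(t, 1.0, 3.0, 0.001, 0.5) + 0.5 * rng.randn(t.size)

x0 = np.array([1.02, 2.8, 0.05, 0.4])  # assumed starting guess near the truth

# fit with least_squares and build the covariance from the Jacobian
res = least_squares(residuals, x0, args=(t, y))
J = res.jac
dof = res.fun.size - res.x.size          # degrees of freedom
s_sq = np.sum(res.fun ** 2) / dof        # residual variance
cov_lsq = np.linalg.inv(J.T @ J) * s_sq
perr_lsq = np.sqrt(np.diag(cov_lsq))

# same fit with curve_fit for comparison
popt, pcov = curve_fit(model, t, y, p0=x0)
perr_cf = np.sqrt(np.diag(pcov))

print("least_squares errors:", perr_lsq)
print("curve_fit errors:    ", perr_cf)
```

If the residual function already divides by trusted per-point uncertainties, the s_sq factor should be dropped.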

The SciPy function optimize.least_squares requires the user to supply a function fun(...) which returns a vector of residuals. This is typically defined as
residuals = (data - model)/sigma
where data and model are vectors with the data to fit and the corresponding model predictions for each data point, while sigma is the 1σ uncertainty in each data value.
In this situation, and assuming one can trust the input sigma uncertainties, one can use the output Jacobian matrix jac returned by least_squares to estimate the covariance matrix. Moreover, assuming the covariance matrix is diagonal, or simply ignoring non-diagonal terms, one can also obtain the 1σ uncertainty perr in the model parameters (often called "formal errors") as follows (see Section 15.4.2 of Numerical Recipes 3rd ed.)
import numpy as np
from scipy import linalg, optimize
res = optimize.least_squares(...)
U, s, Vh = linalg.svd(res.jac, full_matrices=False)
tol = np.finfo(float).eps*s[0]*max(res.jac.shape)
w = s > tol
cov = (Vh[w].T/s[w]**2) @ Vh[w] # robust covariance matrix
perr = np.sqrt(np.diag(cov)) # 1sigma uncertainty on fitted parameters
The above code to obtain the covariance matrix is formally the same as the following simpler one (as suggested by Alex), but the above has the major advantage that it works even when the Jacobian is close to degenerate, which is a common occurrence in real-world least-squares fits
cov = linalg.inv(res.jac.T @ res.jac) # covariance matrix when jac not degenerate
If one does not trust the input uncertainties sigma, one can still assume that the fit is good, to estimate the data uncertainties from the fit itself. This corresponds to assuming chi**2/DOF=1, where DOF is the number of degrees of freedom. In this case, one can use the following lines to rescale the covariance matrix before computing the uncertainties
chi2dof = np.sum(res.fun**2)/(res.fun.size - res.x.size)
cov *= chi2dof
perr = np.sqrt(np.diag(cov)) # 1sigma uncertainty on fitted parameters
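To illustrate the claim that the two covariance formulas agree for a well-conditioned Jacobian while only the SVD version survives a degenerate one, here is a small NumPy-only sketch. The helper name cov_from_jacobian and the random test matrices are assumptions made for the illustration; in a real fit the input would be res.jac.

```python
import numpy as np


def cov_from_jacobian(jac):
    """Covariance (J^T J)^-1 via SVD, discarding negligible singular values."""
    _, s, Vh = np.linalg.svd(jac, full_matrices=False)
    tol = np.finfo(float).eps * s[0] * max(jac.shape)
    keep = s > tol
    # V * diag(1/s^2) * V^T restricted to the retained singular values
    return (Vh[keep].T / s[keep] ** 2) @ Vh[keep]


rng = np.random.default_rng(0)

# well-conditioned case: both formulas agree
J = rng.normal(size=(50, 3))
cov_svd = cov_from_jacobian(J)
cov_inv = np.linalg.inv(J.T @ J)
print(np.allclose(cov_svd, cov_inv))

# degenerate case: the last column duplicates the first, so J^T J is singular
J_deg = np.column_stack([J, J[:, 0]])
cov_deg = cov_from_jacobian(J_deg)
print(np.abs(cov_deg).max())  # stays bounded; a plain inverse would fail or blow up
```

The truncation effectively returns a pseudo-inverse, so the uncertainties of the degenerate parameter combination are not reported as huge; they are simply excluded.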

Related

Super Gaussian fit

I have to study a laser beam profile. To this end, I need to find a Super Gaussian curve fit for my data.
Super Gaussian equation:
I * exp(- 2 * ((x - x0) /sigma)^P)
where P takes into account the flat-top laser beam curve characteristics.
I started doing a simple Gaussian fit of my curve, in Python. The fit returns a Gaussian curve where the values of I, x0 and sigma are optimized. (I used the function curve_fit)
Gaussian curve equation:
I * exp(-(x - x0)^2 / (2 * sigma^2))
Now I would like to go one step further and do the Super Gaussian curve fit, because I need to consider the flat-top characteristics of the beam. Thus, I need a fit which also optimizes the P parameter.
Does someone know how to do a Super Gaussian curve fit with Python?
I know that there is a way to do a Super Gaussian fit with wolfram mathematica which is not opensource. I do not have it. Thus, I would like also to know if someone knows an open source software thanks to which it is possible to do a Super Gaussian curve fit or to execute wolfram mathematica.
Thank you

Well, you would need to write a function that calculates a parameterized super-Gaussian and use that to model data, say with scipy.optimize.curve_fit. As a lead author of LMFIT (https://lmfit.github.io/lmfit-py/), which provides a high-level interface to fitting and curve-fitting, I would recommend trying that library. With that approach, a model function for a super-Gaussian and its use to fit data might look like this:
import numpy as np
from lmfit import Model
def super_gaussian(x, amplitude=1.0, center=0.0, sigma=1.0, expon=2.0):
    """super-Gaussian distribution
    super_gaussian(x, amplitude, center, sigma, expon) =
        (amplitude/(sqrt(2*pi)*sigma)) * exp(-abs(x-center)**expon / (2*sigma**expon))
    """
    sigma = max(1.e-15, sigma)
    return ((amplitude/(np.sqrt(2*np.pi)*sigma))
            * np.exp(-abs(x-center)**expon / 2*sigma**expon))
# generate some test data
x = np.linspace(0, 10, 101)
y = super_gaussian(x, amplitude=7.1, center=4.5, sigma=2.5, expon=1.5)
y += np.random.normal(size=len(x), scale=0.015)
# make Model from the super_gaussian function
model = Model(super_gaussian)
# build a set of Parameters to be adjusted in fit, named from the arguments
# of the model function (super_gaussian), and providing initial values
params = model.make_params(amplitude=1, center=5, sigma=2., expon=2)
# you can place min/max bounds on parameters
params['amplitude'].min = 0
params['sigma'].min = 0
params['expon'].min = 0
params['expon'].max = 100
# note: if you wanted to make this strictly Gaussian, you could set
# expon=2 and prevent it from varying in the fit:
### params['expon'].value = 2.0
### params['expon'].vary = False
# now do the fit
result = model.fit(y, params, x=x)
# print out the fit statistics, best-fit parameter values and uncertainties
print(result.fit_report())
# plot results
import matplotlib.pyplot as plt
plt.plot(x, y, label='data')
plt.plot(x, result.best_fit, label='fit')
plt.legend()
plt.show()
This will print a report like
[[Model]]
Model(super_gaussian)
[[Fit Statistics]]
# fitting method = leastsq
# function evals = 53
# data points = 101
# variables = 4
chi-square = 0.02110713
reduced chi-square = 2.1760e-04
Akaike info crit = -847.799755
Bayesian info crit = -837.339273
[[Variables]]
amplitude: 6.96892162 +/- 0.09939812 (1.43%) (init = 1)
center: 4.50181661 +/- 0.00217719 (0.05%) (init = 5)
sigma: 2.48339218 +/- 0.02134446 (0.86%) (init = 2)
expon: 3.25148164 +/- 0.08379431 (2.58%) (init = 2)
[[Correlations]] (unreported correlations are < 0.100)
C(amplitude, sigma) = 0.939
C(sigma, expon) = -0.774
C(amplitude, expon) = -0.745
and generate a plot like this

This is the function for the super gaussian
import numpy as np
def super_gaussian(x, amp, x0, sigma):
    rank = 2
    return amp * ((np.exp(-(2 ** (2 * rank - 1)) * np.log(2) * (((x - x0) ** 2) / ((sigma) ** 2)) ** (rank))) ** 2)
And then you need to call it with scipy optimize curve fit like this:
from scipy import optimize
opt, _ = optimize.curve_fit(super_gaussian, x, y)
vals = super_gaussian(x, *opt)
'vals' is what you need to plot, that is the fitted super gaussian function.
(The original answer showed three plots here: the fitted curve for rank=1, rank=2 and rank=3.)
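The function above fixes the exponent through rank. A minimal sketch that instead treats the exponent as a free fit parameter, using the form from the question, I * exp(-2 * (|x - x0| / sigma)**P). The absolute value (so that non-integer P works), the synthetic data, the starting values and the bounds are all assumptions made for the illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def super_gaussian(x, amp, x0, sigma, p):
    # abs() keeps the base non-negative so that non-integer p is valid
    return amp * np.exp(-2.0 * (np.abs(x - x0) / sigma) ** p)


# synthetic flat-top profile (illustrative values)
rng = np.random.default_rng(1)
x = np.linspace(-5, 5, 201)
true_params = (2.0, 0.3, 1.5, 4.0)
y = super_gaussian(x, *true_params) + rng.normal(scale=0.01, size=x.size)

# start from a plain Gaussian-like guess (p = 2) and let p vary
p0 = (y.max(), x[np.argmax(y)], 1.0, 2.0)
bounds = ([0.0, x.min(), 1e-6, 0.5], [np.inf, x.max(), np.inf, 20.0])

popt, pcov = curve_fit(super_gaussian, x, y, p0=p0, bounds=bounds)
perr = np.sqrt(np.diag(pcov))  # 1-sigma uncertainties

for name, val, err in zip(("amp", "x0", "sigma", "p"), popt, perr):
    print(f"{name} = {val:.3f} +/- {err:.3f}")
```

Bounding p away from zero is a design choice: it keeps the optimizer out of a region where the model becomes nearly constant and the parameters are poorly determined.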

The answer of @M Newville works perfectly for me.
But be careful! Parentheses have been forgotten in the quotient inside the exponential in the definition of the super_gaussian function
def super_gaussian(x, amplitude=1.0, center=0.0, sigma=1.0, expon=2.0):
    ...
    return ((amplitude/(np.sqrt(2*np.pi)*sigma))
            * np.exp(-abs(x-center)**expon / 2*sigma**expon))
should be replaced by
def super_gaussian(x, amplitude=1.0, center=0.0, sigma=1.0, expon=2.0):
    ...
    return ((amplitude/(np.sqrt(2*np.pi)*sigma))
            * np.exp(-abs(x-center)**expon / (2*sigma**expon)))
The FWHM of the super-Gaussian function, given by
FWHM = 2.*sigma*(2.*np.log(2.))**(1/expon)
is then computed correctly and agrees very well with the plot.
I am sorry to post this as an answer, but my reputation score is too low to comment on M Newville's post.
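The FWHM expression can be checked numerically: at a distance of FWHM/2 from the center, the corrected function should drop to half of its peak value. A short sketch of that check; the helper name fwhm and the parameter values (taken from the test data in the earlier answer) are only for illustration.

```python
import numpy as np


def super_gaussian(x, amplitude=1.0, center=0.0, sigma=1.0, expon=2.0):
    # corrected version, with parentheses around the denominator
    return ((amplitude / (np.sqrt(2 * np.pi) * sigma))
            * np.exp(-np.abs(x - center) ** expon / (2 * sigma ** expon)))


def fwhm(sigma, expon):
    return 2.0 * sigma * (2.0 * np.log(2.0)) ** (1.0 / expon)


amplitude, center, sigma, expon = 7.1, 4.5, 2.5, 1.5
peak = super_gaussian(center, amplitude, center, sigma, expon)
half_width = fwhm(sigma, expon) / 2.0
value = super_gaussian(center + half_width, amplitude, center, sigma, expon)

ratio = value / peak
print(ratio)  # ratio of the value at center + FWHM/2 to the peak value
```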

Fitting of y(x)=a *exp(-b *(x-c)**p) to data for parameters a,b,c,p.
The example of numerical calculus below shows a non-iterative method which does not require an initial guess of the parameters.
This is an application of the general principle explained in the paper: https://fr.scribd.com/doc/14674814/Regressions-et-equations-integrales
In the present version of the paper the case of the super-Gaussian is not explicitly treated. It is not necessary to read the paper, since the screen copy below shows the calculation in full detail.
Note that the numerical results a,b,c,p can be used as initial values for classical iterative methods of regression.
Note:
The linear equation considered is :
A,B,C,D are the parameters to be computed thanks to linear regression. Numerical values S(k) of the integral are directly computed by numerical integration from the given data (As shown in the above example).

python scipy.optimize curve fitting with only two points

I want to fit a power-law model (x**m * c) to only two data points to find the slope m. I am using the curve_fit function from scipy.optimize for this problem. Now when I run the following code
import numpy as np
from scipy.optimize import curve_fit
func = lambda x, m, c: x**m * c
xdata = np.array([235e6, 610e6])
ydata = np.array([0.077, 0.054])
err = np.array([0.0086, 0.0055])
coeff, var = curve_fit(func, xdata, ydata, sigma=err)
print(coeff, var)
It successfully returns the value of m, i.e. coeff[0]. But the value of var is [[ inf inf] [ inf inf]]. Is this a problem caused by having just two data points? Can it not calculate the covariance of the best-fit parameter values? If so, how do I calculate the error in m?

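You have two free parameters and two data points, so the problem is exactly determined: there are zero degrees of freedom left. The fitted curve passes exactly through both points with no residual, so curve_fit, which by default scales the covariance by the residual variance chi2/(N - p), cannot estimate a covariance for the parameters and fills it with inf.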
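With exactly two points the power law can also be solved in closed form, and an uncertainty on m then follows from first-order error propagation of the ydata errors. A minimal sketch, assuming the values in err are independent 1-sigma uncertainties that are small compared with ydata; the names log_x_ratio, rel_err and m_err are illustrative.

```python
import numpy as np

xdata = np.array([235e6, 610e6])
ydata = np.array([0.077, 0.054])
err = np.array([0.0086, 0.0055])

# y = c * x**m through both points  =>  m = ln(y2/y1) / ln(x2/x1)
log_x_ratio = np.log(xdata[1] / xdata[0])
m = np.log(ydata[1] / ydata[0]) / log_x_ratio
c = ydata[0] / xdata[0] ** m

# first-order propagation: var(ln y) ~ (err / y)**2 for each point
rel_err = err / ydata
m_err = np.sqrt(np.sum(rel_err ** 2)) / abs(log_x_ratio)

print(f"m = {m:.3f} +/- {m_err:.3f}")
```

This does not replace a fit with more data points: with only two points there is no way to test whether the power law describes the data at all.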

How to do linear regression, taking errorbars into account?

I am running a computer simulation of a physical system of finite size, and afterwards I extrapolate to infinite size (the thermodynamic limit). Theory says that the data should scale linearly with system size, so I am doing a linear regression.
The data I have are noisy, but for each data point I can estimate error bars. So, for example, the data points look like:
x_list = [0.3333333333333333, 0.2886751345948129, 0.25, 0.23570226039551587, 0.22360679774997896, 0.20412414523193154, 0.2, 0.16666666666666666]
y_list = [0.13250359351851854, 0.12098339583333334, 0.12398501145833334, 0.09152715, 0.11167239583333334, 0.10876248333333333, 0.09814170444444444, 0.08560799305555555]
y_err = [0.003306749165349316, 0.003818446389148108, 0.0056036878203831785, 0.0036635292592592595, 0.0037034897788415424, 0.007576672222222223, 0.002981084130692832, 0.0034913019065973983]
Let's say I am trying to do this in Python.
First way that I know is:
m, c, r_value, p_value, std_err = scipy.stats.linregress(x_list, y_list)
I understand this gives me errorbars of the result, but this does not take into account errorbars of the initial data.
Second way that I know is:
c, m = numpy.polynomial.polynomial.polyfit(x_list, y_list, 1, w = [1.0 / ty for ty in y_err], full=False)
Here we use the inverse of the error bar of each point as a weight in the least-squares approximation (this polyfit returns the coefficients from lowest to highest degree, so the intercept comes first). So if a point is not very reliable it will not influence the result much, which is reasonable.
But I can not figure out how to get something that combines both these methods.
What I really want is what second method does, meaning use regression when every point influences the result with different weight. But at the same time I want to know how accurate my result is, meaning, I want to know what are errorbars of the resulting coefficients.
How can I do this?

Not entirely sure if this is what you mean, but…using pandas, statsmodels, and patsy, we can compare an ordinary least-squares fit and a weighted least-squares fit which uses the inverse of the noise you provided as a weight matrix (statsmodels will complain about sample sizes < 20, by the way).
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.dpi'] = 300
import statsmodels.formula.api as sm
x_list = [0.3333333333333333, 0.2886751345948129, 0.25, 0.23570226039551587, 0.22360679774997896, 0.20412414523193154, 0.2, 0.16666666666666666]
y_list = [0.13250359351851854, 0.12098339583333334, 0.12398501145833334, 0.09152715, 0.11167239583333334, 0.10876248333333333, 0.09814170444444444, 0.08560799305555555]
y_err = [0.003306749165349316, 0.003818446389148108, 0.0056036878203831785, 0.0036635292592592595, 0.0037034897788415424, 0.007576672222222223, 0.002981084130692832, 0.0034913019065973983]
# put x and y into a pandas DataFrame, and the weights into a Series
ws = pd.DataFrame({
    'x': x_list,
    'y': y_list
})
weights = pd.Series(y_err)
wls_fit = sm.wls('x ~ y', data=ws, weights=1 / weights).fit()
ols_fit = sm.ols('x ~ y', data=ws).fit()
# show the fit summary by calling wls_fit.summary()
# wls fit r-squared is 0.754
# ols fit r-squared is 0.701
# let's plot our data
plt.clf()
fig = plt.figure()
ax = fig.add_subplot(111, facecolor='w')
ws.plot(
    kind='scatter',
    x='x',
    y='y',
    style='o',
    alpha=1.,
    ax=ax,
    title='x vs y scatter',
    edgecolor='#ff8300',
    s=40
)
# weighted prediction
wp, = ax.plot(
    wls_fit.predict(),
    ws['y'],
    color='#e55ea2',
    lw=1.,
    alpha=1.0,
)
# unweighted prediction
op, = ax.plot(
    ols_fit.predict(),
    ws['y'],
    color='k',
    ls='solid',
    lw=1,
    alpha=1.0,
)
leg = plt.legend(
    (op, wp),
    ('Ordinary Least Squares', 'Weighted Least Squares'),
    loc='upper left',
    fontsize=8)
plt.tight_layout()
fig.set_size_inches(6.40, 5.12)
plt.show()
WLS residuals:
[0.025624005084707302,
0.013611438189866154,
-0.033569595462217161,
0.044110895217014695,
-0.025071632845910546,
-0.036308252199571928,
-0.010335514810672464,
-0.0081511479431851663]
The mean squared error of the residuals for the weighted fit (wls_fit.mse_resid or wls_fit.scale) is 0.22964802498892287, and the r-squared value of the fit is 0.754.
You can obtain a wealth of data about the fits by calling their summary() method, and/or doing dir(wls_fit), if you need a list of every available property and method.

I wrote a concise function to perform the weighted linear regression of a data set, which is a direct translation of GSL's "gsl_fit_wlinear" function. This is useful if you want to know exactly what your function is doing when it performs the fit
import numpy as np
def wlinear_fit (x,y,w) :
    """
    Fit (x,y,w) to a linear function, using exact formulae for weighted linear
    regression. This code was translated from the GNU Scientific Library (GSL),
    it is an exact copy of the function gsl_fit_wlinear.
    """
    # compute the weighted means and weighted deviations from the means
    # wm denotes a "weighted mean", wm(f) = (sum_i w_i f_i) / (sum_i w_i)
    W = np.sum(w)
    wm_x = np.average(x,weights=w)
    wm_y = np.average(y,weights=w)
    dx = x-wm_x
    dy = y-wm_y
    wm_dx2 = np.average(dx**2,weights=w)
    wm_dxdy = np.average(dx*dy,weights=w)
    # In terms of y = a + b x
    b = wm_dxdy / wm_dx2
    a = wm_y - wm_x*b
    cov_00 = (1.0/W) * (1.0 + wm_x**2/wm_dx2)
    cov_11 = 1.0 / (W*wm_dx2)
    cov_01 = -wm_x / (W*wm_dx2)
    # Compute chi^2 = \sum w_i (y_i - (a + b * x_i))^2
    chi2 = np.sum (w * (y-(a+b*x))**2)
    return a,b,cov_00,cov_11,cov_01,chi2
To perform your fit, you would do
a,b,cov_00,cov_11,cov_01,chi2 = wlinear_fit(np.asarray(x_list), np.asarray(y_list), 1.0/np.asarray(y_err)**2)
This returns the best estimates for the coefficients a (the intercept) and b (the slope) of the linear regression, along with the elements of the covariance matrix cov_00, cov_01 and cov_11. The best estimate of the error on a is then the square root of cov_00, and the one on b is the square root of cov_11. The weighted sum of squared residuals is returned in the chi2 variable.
IMPORTANT: this function accepts inverse variances, not the inverse standard deviations as the weights for the data points.
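As a cross-check on the closed-form approach, NumPy's np.polyfit can return the same kind of covariance matrix. A minimal sketch with the data from the question, assuming the y_err values are reliable 1-sigma uncertainties. Note that np.polyfit expects weights of 1/sigma (it squares them internally), unlike the inverse variances used above, and that cov="unscaled" skips the rescaling by chi2/dof.

```python
import numpy as np

x = np.array([0.3333333333333333, 0.2886751345948129, 0.25,
              0.23570226039551587, 0.22360679774997896,
              0.20412414523193154, 0.2, 0.16666666666666666])
y = np.array([0.13250359351851854, 0.12098339583333334,
              0.12398501145833334, 0.09152715, 0.11167239583333334,
              0.10876248333333333, 0.09814170444444444,
              0.08560799305555555])
y_err = np.array([0.003306749165349316, 0.003818446389148108,
                  0.0056036878203831785, 0.0036635292592592595,
                  0.0037034897788415424, 0.007576672222222223,
                  0.002981084130692832, 0.0034913019065973983])

# weights are 1/sigma here; coefficients come back highest degree first
coeffs, cov = np.polyfit(x, y, 1, w=1.0 / y_err, cov="unscaled")
slope, intercept = coeffs
slope_err, intercept_err = np.sqrt(np.diag(cov))

print(f"slope     = {slope:.4f} +/- {slope_err:.4f}")
print(f"intercept = {intercept:.4f} +/- {intercept_err:.4f}")
```

Leaving cov=True instead would rescale the covariance by the reduced chi-square, which is the appropriate choice when the error bars are only trusted up to an overall factor.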

sklearn.linear_model.LinearRegression supports specification of weights during fit:
import numpy as np
from sklearn.linear_model import LinearRegression
x_data = np.array(x_list).reshape(-1, 1) # The model expects shape (n_samples, n_features).
y_data = np.array(y_list)
y_err = np.array(y_err)
model = LinearRegression()
model.fit(x_data, y_data, sample_weight=1/y_err)
Here the sample weight is specified as 1 / y_err. Other choices are possible: sample_weight multiplies the squared residual of each point, so 1 / y_err**2 corresponds to inverse-variance weighting. It is often a good idea to clip these sample weights to a maximum value in case y_err varies strongly or has small outliers:
sample_weight = 1 / y_err
sample_weight = np.minimum(sample_weight, MAX_WEIGHT)
where MAX_WEIGHT should be determined from your data (by looking at the y_err or 1 / y_err distributions, e.g. if they have outliers they can be clipped).

I found this document helpful in understanding and setting up my own weighted least squares routine (applicable for any programming language).
Typically learning and using optimized routines is the best way to go but there are times where understanding the guts of a routine is important.

Scipy LeastSq errorbars

I'm fitting an experimental spectrum to a theoretical expectation using LeastSq from SciPy. There are of course errors associated with the experimental values. How can I feed these to the LeastSq or do I need a different routine? I find nothing in the documentation.

The scipy.optimize.leastsq function does not have a built-in way to incorporate weights. However, the scipy.optimize.curve_fit function has a sigma parameter which takes the standard deviation of each y-data point.
curve_fit uses 1.0/sigma as the weight on each residual, where sigma can be an array of length N (the same length as ydata).
So you have to estimate the standard deviation of each ydata point from the size of its error bar and pass that as sigma.
For example, if you declare that half the length of the error bar represents 1 standard deviation, then
sigma = length_of_error_bar/2
Reference:
Wikipedia page on Weighted Least-Squares
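A minimal sketch of passing per-point error bars to curve_fit, assuming they are 1-sigma standard deviations. The Gaussian peak model, the synthetic "spectrum" and the names peak and y_std are illustrative. absolute_sigma=True tells curve_fit to treat sigma as absolute uncertainties, so the returned covariance is not rescaled by the reduced chi-square.

```python
import numpy as np
from scipy.optimize import curve_fit


def peak(x, height, center, width):
    # simple Gaussian line profile used as the theoretical expectation
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)


# synthetic spectrum with point-dependent noise (illustrative values)
rng = np.random.default_rng(42)
x = np.linspace(-5, 5, 80)
true_params = np.array([10.0, 0.5, 1.2])
y_std = 0.2 + 0.05 * np.abs(x)  # per-point standard deviations
y = peak(x, *true_params) + rng.normal(scale=y_std)

popt, pcov = curve_fit(peak, x, y, p0=[8.0, 0.0, 1.0],
                       sigma=y_std, absolute_sigma=True)
perr = np.sqrt(np.diag(pcov))  # 1-sigma parameter uncertainties

for name, val, err in zip(("height", "center", "width"), popt, perr):
    print(f"{name} = {val:.3f} +/- {err:.3f}")
```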

I'm in the middle of doing this myself so I will share what I've done and perhaps we can get some comments from the community. I have a collection of data points taken at definite time intervals from which I've calculated standard deviations. I would like to fit these points with a sin function. Leastsq does this by minimizing the sum of squared residuals, where a residual is the difference between a data point and the fit function for a set of parameters, p. We may weight our residuals by dividing them by the standard deviation; because leastsq squares the residuals, each squared residual is then weighted by the inverse variance.
As follows:
from scipy.optimize import leastsq
import numpy as np
from matplotlib import pyplot as plt
def sin_func(t, p):
    """ Returns the sin function for the parameters:
    p[0] := amplitude
    p[1] := period/wavelength
    p[2] := phase offset
    p[3] := amplitude offset
    """
    y = p[0]*np.sin(2*np.pi/p[1]*t+p[2])+p[3]
    return y
def sin_residuals(p, y, t, std):
    # divide by std, not std**2: leastsq squares the residuals itself
    err = (y - p[0]*np.sin(2*np.pi/p[1]*t+p[2])-p[3])/std
    return err
def sin_fit(t, ydata, std, p0):
    """ Fits a set of data, ydata, on a domain, t, with individual standard
    deviations, std, to a sin curve given the initial parameters, p0, of the form:
    p[0] := amplitude
    p[1] := period/wavelength
    p[2] := phase offset
    p[3] := amplitude offset
    """
    # optimization #
    pbest = leastsq(sin_residuals, p0, args=(ydata, t, std), full_output=1)
    p_fit = pbest[0]
    # fit to data #
    fit = p_fit[0]*np.sin(2*np.pi/p_fit[1]*t+p_fit[2])+p_fit[3]
    return p_fit
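The function above returns only the best-fit parameters. With full_output, leastsq also returns cov_x, from which parameter uncertainties can be estimated. A minimal sketch, assuming the residuals are divided by trusted 1-sigma standard deviations so that cov_x can be used directly; the synthetic data and the names weighted_residuals, chi2_red and perr_rescaled are illustrative.

```python
import numpy as np
from scipy.optimize import leastsq


def sin_func(t, p):
    return p[0] * np.sin(2 * np.pi / p[1] * t + p[2]) + p[3]


def weighted_residuals(p, y, t, std):
    # residuals in units of the standard deviation of each point
    return (y - sin_func(t, p)) / std


# synthetic data with two different noise levels (illustrative values)
rng = np.random.default_rng(3)
t = np.linspace(0, 10, 200)
p_true = np.array([2.0, 3.0, 0.5, 1.0])
std = np.full(t.size, 0.2)
std[::4] = 0.5
y = sin_func(t, p_true) + rng.normal(scale=std)

p0 = [1.8, 3.1, 0.4, 0.9]  # assumed starting guess near the truth
p_fit, cov_x, infodict, mesg, ier = leastsq(
    weighted_residuals, p0, args=(y, t, std), full_output=True)

# with residuals scaled by std, cov_x is the parameter covariance
perr = np.sqrt(np.diag(cov_x))

# if the error bars are only trusted up to a common factor, rescale
dof = t.size - len(p_fit)
chi2_red = np.sum(infodict["fvec"] ** 2) / dof
perr_rescaled = perr * np.sqrt(chi2_red)

print("parameters:", p_fit)
print("errors:    ", perr)
print("reduced chi-square:", chi2_red)
```

cov_x can be None when the Jacobian is singular at the solution, so production code should check for that before taking the square root.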

How to calculate error for polynomial fitting (in slope and intercept)

Hi, I want to calculate the errors in the slope and intercept computed by the polyfit function. I have a (+/-) uncertainty for ydata, so how can I include it when calculating the uncertainty in the slope and intercept? My code is,
import pylab as plt
from numpy import loadtxt, log10, polyfit
data = loadtxt("data.txt")
xdata,ydata = data[:,0],data[:,1]
x_d,y_d = log10(xdata),log10(ydata)
polycoef = polyfit(x_d, y_d, 1)
yfit = 10**( polycoef[0]*x_d+polycoef[1] )
plt.subplot(111)
plt.loglog(xdata,ydata,'.k',xdata,yfit,'-r')
plt.show()
Thanks a lot

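You could use scipy.optimize.curve_fit instead of polyfit. It has a parameter sigma for the errors of the y data. Because the fit is done on y_d = log10(ydata), the errors must be those of y_d, i.e. the ydata uncertainty divided by ydata*ln(10). If you have that error for every y value in a sequence yerror (so that yerror has the same length as your y_d sequence) you can do:
polycoef, pcov = scipy.optimize.curve_fit(lambda x, a, b: a*x+b, x_d, y_d, sigma=yerror)
The square roots of the diagonal elements of the returned covariance matrix pcov are the uncertainties of the slope and intercept.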
For an alternative see the paragraph Fitting a power-law to data with errors in the Scipy Cookbook.
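A minimal sketch of the curve_fit approach for a log-log fit, including the propagation of the ydata uncertainties into log10 space. Since data.txt is not available, synthetic data with assumed 5% errors stand in for it; the names line, y_d_err, slope_true and log_c_true are illustrative.

```python
import numpy as np
from scipy.optimize import curve_fit

# synthetic power-law data standing in for data.txt (illustrative values)
rng = np.random.default_rng(7)
xdata = np.logspace(0, 2, 30)
slope_true, log_c_true = -1.5, 2.0
y_clean = 10 ** log_c_true * xdata ** slope_true
yerror = 0.05 * y_clean  # assumed 5% uncertainties on ydata
ydata = y_clean + rng.normal(scale=yerror)

# work in log10 space; d(log10 y) = dy / (y * ln 10)
x_d, y_d = np.log10(xdata), np.log10(ydata)
y_d_err = yerror / (ydata * np.log(10))


def line(x, a, b):
    return a * x + b


polycoef, pcov = curve_fit(line, x_d, y_d, sigma=y_d_err, absolute_sigma=True)
slope_err, intercept_err = np.sqrt(np.diag(pcov))

print(f"slope     = {polycoef[0]:.3f} +/- {slope_err:.3f}")
print(f"intercept = {polycoef[1]:.3f} +/- {intercept_err:.3f}")
```

The first-order propagation used for y_d_err is only a good approximation when the relative errors on ydata are small.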
