Scipy: difference between optimize.fmin and optimize.leastsq

What's the difference between scipy's optimize.fmin and optimize.leastsq? They seem to be used in pretty much the same way in this example page. The only difference I can see is that leastsq actually calculates the sum of squares on its own (as its name would suggest) while when using fmin one has to do this manually. Other than that, are the two functions equivalent?

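The algorithms underneath are different.
fmin uses the Nelder-Mead downhill simplex method, a derivative-free minimizer for a general scalar objective, which is why you have to compute the sum of squares yourself. leastsq wraps MINPACK's Levenberg-Marquardt routine, which takes the vector of residuals and exploits the least-squares structure of the problem, so it usually needs far fewer function evaluations on fitting problems.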
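To make the difference in calling conventions concrete, here is a minimal sketch on a made-up two-parameter exponential decay (the model, data and starting point are assumptions chosen for illustration, not taken from the question). fmin receives a function returning one scalar, leastsq a function returning the residual vector:

```python
import numpy as np
from scipy.optimize import fmin, leastsq

def model(params, x):
    # Hypothetical test model: a * exp(-k * x)
    a, k = params
    return a * np.exp(-k * x)

def residuals(params, x, y):
    # leastsq wants the vector of residuals and squares/sums it internally
    return y - model(params, x)

def sum_of_squares(params, x, y):
    # fmin wants a single scalar, so the sum of squares is formed by hand
    return np.sum(residuals(params, x, y) ** 2)

true_params = np.array([2.0, 0.5])
x = np.linspace(0.0, 5.0, 30)
y = model(true_params, x)          # noiseless data keeps the demo deterministic
start = np.array([1.0, 1.0])

p_fmin, f_min, n_iter, n_calls_fmin, warnflag = fmin(
    sum_of_squares, start, args=(x, y), full_output=True, disp=False)
p_lsq, cov_x, info, mesg, ier = leastsq(
    residuals, start, args=(x, y), full_output=True)

print("fmin   :", p_fmin, "function calls:", n_calls_fmin)
print("leastsq:", p_lsq, "function calls:", info["nfev"])
```

Both should land near the true parameters; comparing the two call counts on your own problem is a quick way to see how much the least-squares structure helps.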

Just to add some information: I am developing a module to fit a biexponential function, and leastsq turns out to be much faster than minimize (roughly 36 times faster in the run shown below). Have a look at the code below for more details.
I used a biexponential curve, which is a sum of two exponentials; the model function has 4 parameters to fit: S, f, D_star and D.
All default parameters for fitting were used.
S [f e^(-x * D_star) + (1 - f) e^(-x * D)]
Time taken for minimize: 0.011617898941040039
Time taken for leastsq : 0.0003180503845214844
The code used:
import numpy as np
from scipy.optimize import minimize, leastsq
from time import time
def ivim_function(params, bvals):
    r"""The Intravoxel incoherent motion (IVIM) model function.
    S(b) = S_0[f*e^{(-b*D\*)} + (1-f)e^{(-b*D)}]
    S_0, f, D\* and D are the IVIM parameters.
    Parameters
    ----------
    params : array
        parameters S0, f, D_star and D of the model
    bvals : array
        bvalues
    References
    ----------
    .. [1] Le Bihan, Denis, et al. "Separation of diffusion
           and perfusion in intravoxel incoherent motion MR
           imaging." Radiology 168.2 (1988): 497-505.
    .. [2] Federau, Christian, et al. "Quantitative measurement
           of brain perfusion with intravoxel incoherent motion
           MR imaging." Radiology 265.3 (2012): 874-881.
    """
    S0, f, D_star, D = params
    S = S0 * (f * np.exp(-bvals * D_star) + (1 - f) * np.exp(-bvals * D))
    return S
def _ivim_error(params, bvals, signal):
    """Error function to be used in fitting the IVIM model
    """
    return (signal - ivim_function(params, bvals))
def sum_sq(params, bvals, signal):
    """Sum of squares of the errors. This function is minimized"""
    return np.sum(_ivim_error(params, bvals, signal)**2)
x0 = np.array([100., 0.20, 0.008, 0.0009])
bvals = np.array([0., 10., 20., 30., 40., 60., 80., 100.,
                  120., 140., 160., 180., 200., 220., 240.,
                  260., 280., 300., 350., 400., 500., 600.,
                  700., 800., 900., 1000.])
data = ivim_function(x0, bvals)
# minimize works on the scalar sum of squares
optstart = time()
opt = minimize(sum_sq, x0, args=(bvals, data))
optend = time()
time_taken = optend - optstart
print("Time taken for minimize:", time_taken)
# leastsq works on the vector of residuals
lstart = time()
lst = leastsq(_ivim_error,
              x0,
              args=(bvals, data),)
lend = time()
time_taken = lend - lstart
print("Time taken for leastsq :", time_taken)
print('Parameters estimated using minimize :', opt.x)
print('Parameters estimated using leastsq :', lst[0])

Related

Applying bounds to scipy.optimize.curve_fit() leads to "ValueError: `x0` must have at most 1 dimension."

Somehow the following code raises the error "ValueError: `x0` must have at most 1 dimension." as soon as I add bounds to my fit. I have absolutely no idea what I'm doing wrong here.
The goal is to constrain the fit of the 8 Lorentzian curves to the given bounds.
The code as presented probably won't produce a good fit yet, but that is a problem I should be able to solve myself.
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize
from scipy.signal import find_peaks, peak_widths
import time
# Functions needed for fitting the model
def lorentzian(x, amp, cen, wid):
    return amp*wid**2/((x-cen)**2+wid**2)
def multi_lorentzian(x, params, *args):
    # curve_fit passes every parameter as a separate positional argument
    if args:
        params = [params] + list(args)
    try:
        # reshape(-1, 3) accepts both a flat sequence and an (n_peaks, 3) array
        params = np.array(params).reshape(-1, 3)
    except ValueError:
        raise ValueError("Parameter dimensions don't fit the model!")
    total_curve = 0
    for amp, cen, wid in params:
        total_curve += lorentzian(x, amp, cen, wid)
    return total_curve
##############################################################################
# create data
samples = 200
start = 2.75
stop = 3
x_incr = (stop-start)/samples
x_array = np.linspace(start, stop, samples) # frequency in GHz
amp_array = np.random.uniform(0.03, 0.1, 8) # 3 to 10% contrast
cen_array = [2.81, 2.829, 2.831, 2.848, 2.897, 2.914, 2.9165, 2.932]
# cen_array = np.random.uniform(start, stop, 8)
wid_array = [0.003, 0.003, 0.003, 0.003, 0.003, 0.003, 0.003, 0.003]
y_array = 1-multi_lorentzian(x_array,
                             np.array([amp_array, cen_array, wid_array]).T)
y_noise = y_array + np.random.normal(0, 1, samples)*1e-3
# mirroring to get maxima instead of minima
y_noise_inv = -y_noise+1
##############################################################################
# prepare guessing of start values
heights = np.random.uniform(0.03, 0.1, 8)
widths = np.random.uniform(0.002, 0.004, 8)
center_guess = cen_array+np.random.normal(0, 1, 8)*1e-3
p0_array = np.array([heights, center_guess, widths]).T
bounds_array = ([0., 2.75, 0.], [1., 3., 0.5])
# this call raises the ValueError: p0 is 2D and bounds has 3 entries instead of 24
popt_y, pcov_y = scipy.optimize.curve_fit(multi_lorentzian, x_array, y_noise_inv,
                                          p0=p0_array, bounds=bounds_array)
popt_y = popt_y.reshape(len(popt_y)//3, 3)
single_peaks = [lorentzian(x_array, i, j, k) for i,j,k in popt_y]
perr_y = np.sqrt(np.diag(pcov_y))
residual_y = y_noise_inv - multi_lorentzian(x_array, popt_y)
ss_res = np.sum(residual_y**2)
ss_tot = np.sum((y_noise_inv-np.mean(y_noise_inv))**2)
r_squared = 1 - (ss_res / ss_tot)

OK, after some digging, the issue turned out to be quite simple: p0 is supposed to be flat (1-D), not the 2D array that you supplied. When bounds are given, curve_fit hands p0 to least_squares as x0, and that function rejects anything with more than one dimension. I only had to change two lines to make things work.
1st, the bounds array. You need as many lower and upper bounds as you have parameters, and since you have 3*8 parameters, I simply repeated the three bounds 8 times:
bounds_array = ([0., 2.75, 0.]*8, [1., 3., 0.5]*8)
2nd, I flattened p0 when calling curve_fit. Flattening the (8, 3) array row by row yields amp, cen, wid for each peak in turn, which matches the order of the repeated bounds.
popt_y, pcov_y = scipy.optimize.curve_fit(multi_lorentzian, x_array, y_noise_inv, p0=p0_array.flatten(), bounds=bounds_array)
[Figure: the resulting fit]
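The two changes can be tried in isolation with a smaller, deterministic sketch. It uses two peaks, noiseless data and hand-picked starting values, all of which are assumptions for illustration rather than the data from the question, and the model function here takes the flat parameters directly via *flat_params:

```python
import numpy as np
from scipy.optimize import curve_fit

def lorentzian(x, amp, cen, wid):
    return amp * wid**2 / ((x - cen)**2 + wid**2)

def multi_lorentzian(x, *flat_params):
    # curve_fit passes the parameters as separate scalars:
    # amp0, cen0, wid0, amp1, cen1, wid1, ...
    params = np.asarray(flat_params).reshape(-1, 3)
    return sum(lorentzian(x, amp, cen, wid) for amp, cen, wid in params)

# Illustrative "true" peaks, one row per peak: amp, cen, wid
true_params = np.array([[0.06, 2.80, 0.010],
                        [0.08, 2.90, 0.010]])
x = np.linspace(2.75, 3.0, 200)
y = multi_lorentzian(x, *true_params.ravel())

# Starting guess kept as an (n_peaks, 3) array for readability ...
p0 = np.array([[0.05, 2.801, 0.012],
               [0.05, 2.899, 0.012]])
n_peaks = len(p0)
# ... and one (lower, upper) pair per parameter, in the same flattened order
lower = [0.0, 2.75, 0.0] * n_peaks
upper = [1.0, 3.00, 0.5] * n_peaks

# p0 must be passed flat; the bounds have 3 * n_peaks entries each
popt, pcov = curve_fit(multi_lorentzian, x, y, p0=p0.ravel(), bounds=(lower, upper))
popt = popt.reshape(-1, 3)
print(popt)
```

Keeping the guess as a 2D array and calling ravel() only at the curve_fit boundary keeps the per-peak layout readable while satisfying the 1-D requirement.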

Scipy ODR results with huge relative errors for sd_beta

When running the ODR algorithm on some experimental data, I've been asked to use the following model (func1 in the code below):
y = a0 * (x + a1) / (a3 * (x + a1) + a1 * x) + a2
It is clear that this fitting function contains a redundant degree of freedom.
When I run the fit on my experimental data I get enormous relative errors on beta, in the thousands of percent and above.
When I run the fit again with a fitting function that doesn't have a redundant degree of freedom, such as (func2 in the code below):
y = a0 / (x + a1) + a2
I don't get this kind of problem.
Why is this happening? Why is the ODR algorithm so sensitive to redundant degrees of freedom? I wasn't able to answer these questions for my supervisors. An answer would be much appreciated.
Reproducing code example:
from scipy.odr import RealData, Model, ODR
def func1(a, x):
    return a[0] * (x + a[1]) / (a[3] * (x + a[1]) + a[1] * x) + a[2]
def func2(a, x):
    return a[0] / (x + a[1]) + a[2]
# fmt: off
zx = [
1911.125, 2216.95, 2707.71, 3010.225, 3410.612, 3906.015, 4575.105, 5517.548,
6918.481,
]
dx = [
0.291112577, 0.321695254, 0.370771197, 0.401026507, 0.441068641, 0.490601621,
0.557573268, 0.651755155, 0.79184836,
]
zy = [
0.000998056, 0.000905647, 0.000800098, 0.000751041, 0.000699982, 0.000650532,
0.000600444, 0.000550005, 0.000500201,
]
dy = [
5.49029e-07, 5.02824e-07, 4.5005e-07, 4.25532e-07, 3.99991e-07, 3.75266e-07,
3.50222e-07, 3.25003e-07, 3.00101e-07,
]
# fmt: on
data = RealData(x=zx, y=zy, sx=dx, sy=dy)
print("Func 1")
print("======")
beta01 = [
1.46,
4775.4,
0.01,
1000,
]
model1 = Model(func1)
odr1 = ODR(data, model1, beta0=beta01)
result1 = odr1.run()
print("beta", result1.beta)
print("sd beta", result1.sd_beta)
print("relative", result1.sd_beta / result1.beta * 100)
print()
print()
print("Func 2")
print("======")
beta02 = [
1,
1,
1,
]
model2 = Model(func2)
odr2 = ODR(data, model2, beta0=beta02)
result2 = odr2.run()
print("beta", result2.beta)
print("sd beta", result2.sd_beta)
print("relative", result2.sd_beta / result2.beta * 100)
This prints out:
Func 1
======
beta [ 1.30884537e+00 -2.82585952e+03 7.79755196e-04 9.47943376e+01]
sd beta [1.16144608e+02 3.73765816e+06 6.12613738e-01 4.20775596e+03]
relative [ 8873.82193523 -132266.24068473 78564.88054498 4438.82627453]
Func 2
======
beta [1.40128121e+00 9.80844274e+01 3.00511669e-04]
sd beta [2.73990552e-03 3.22344713e+00 3.74538794e-07]
relative [0.1955286 3.28640051 0.12463369]
Scipy/Numpy/Python version information:
Versions are:
Scipy - 1.4.1
Numpy - 1.18.2
Python - 3.7.2

The problem is not with the degrees of freedom as such.
The number of degrees of freedom is the difference between the number of data points and the number of fitting parameters.
With 9 data points, the first formula (4 parameters) leaves 5 degrees of freedom and the second (3 parameters) leaves 6.
Both have more data points than parameters, which is good news: it means that either model can potentially be fitted.
However, you are right that the first expression has a problem: the parameters you are trying to fit are not independent.
This is probably better understood with a simpler example.
Consider the following expression:
y = x + b + c
which you try to fit, given n data points for x and y with n >> 2.
The question is: what are the optimal values for b and c? This cannot be answered. All the x and y data can tell you is the value of the combination b + c. Therefore, if b + c is 0, the fit cannot tell us whether b = 1000, c = -1000 or b = 1, c = -1, but at least we can say that given b we can determine c.
What is the error on a given b? Potentially infinite. That is why the fit reports such large relative errors.
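Both points can be checked numerically. The sketch below first shows that the Jacobian of the toy model y = x + b + c has rank 1 rather than 2. It then shows that func1 from the question traces exactly the same curve as func2 under a three-parameter mapping. That mapping (the helper func1_to_func2) was worked out by hand from func1 as written in the question, so treat it as an illustration rather than part of the original answer:

```python
import numpy as np

# Toy model y = x + b + c: the derivative with respect to b and with respect
# to c is 1 at every data point, so the two Jacobian columns are identical.
x = np.linspace(0.0, 10.0, 50)
jac_toy = np.column_stack([np.ones_like(x), np.ones_like(x)])
rank_toy = np.linalg.matrix_rank(jac_toy)
print("Jacobian rank for (b, c):", rank_toy)   # 1, although there are 2 parameters

def func1(a, x):
    return a[0] * (x + a[1]) / (a[3] * (x + a[1]) + a[1] * x) + a[2]

def func2(a, x):
    return a[0] / (x + a[1]) + a[2]

def func1_to_func2(a):
    # Hypothetical helper: rewrite the 4 parameters of func1 as the
    # 3 parameters of func2, using a3*(x + a1) + a1*x = (a1 + a3)*x + a1*a3
    s = a[1] + a[3]
    return np.array([a[0] * a[1]**2 / s**2,   # numerator of func2
                     a[1] * a[3] / s,         # shift in the denominator
                     a[2] + a[0] / s])        # constant offset

a = np.array([1.46, 4775.4, 0.01, 1000.0])    # beta01 from the question
zx = np.array([1911.125, 2216.95, 2707.71, 3010.225, 3410.612,
               3906.015, 4575.105, 5517.548, 6918.481])
same_curve = np.allclose(func1(a, zx), func2(func1_to_func2(a), zx),
                         rtol=1e-10, atol=0.0)
print("func1 and func2 give the same curve:", same_curve)
```

Since four numbers are squeezed into three effective parameters, a whole family of func1 parameter vectors produces the same curve, which is the same situation as b and c in the toy model.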

Wrong P_value given by ttest_1samp

Here is a one sample t-test example:
from scipy.stats import ttest_1samp
import numpy as np
ages = [32., 34., 29., 29., 22., 39., 38., 37.,38, 36, 30, 26, 22, 22.]
ages_mean = np.mean(ages)
ages_std = np.std(ages, ddof=1)
print(ages_mean)
print(ages_std)
ttest, pval = ttest_1samp(ages, 30)
print("ttest: ", ttest)
print("p_value: ", pval)
#31.0
#6.2634470725607025
#ttest: 0.5973799001456603
#p_value: 0.5605155888171379
# check analytically:
my_ttest = (ages_mean - 30.0)/(ages_std/np.sqrt(len(ages)))
print(my_ttest)
#0.5973799001456603
Check the p_value:
By definition, p_value = P(t >= 0.59) = 1 - P(t <= 0.59).
Using the Z-table, we get p_value = 1 - 0.7224 = 0.2776, which is not 0.56!

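If you check the documentation of ttest_1samp, you will see that it performs a two-sided test.
So it returns a two-sided p-value, meaning the probability of getting a t-statistic whose absolute value is at least as extreme as the observed one.
The t distribution is symmetric, so we can take -abs(t statistic), evaluate the CDF with n - 1 = 13 degrees of freedom and multiply by two for a two-sided test, and the p-value will be:
from scipy.stats import t
2*t.cdf(-0.5973799001456603, 13)
0.5605155888171379
Your derived value is approximately what a one-sided t-test gives :) The exact one-sided value is half of the above, about 0.2803; the Z-table gives 0.2776 because it uses the normal distribution rather than the t distribution with 13 degrees of freedom.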
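As a minimal sketch, the two-sided and one-sided p-values can be computed side by side from the data in the question. It uses t.sf (the survival function, 1 - cdf) instead of the Z-table; the variable names are my own:

```python
import numpy as np
from scipy.stats import t, ttest_1samp

ages = [32., 34., 29., 29., 22., 39., 38., 37., 38, 36, 30, 26, 22, 22.]
popmean = 30.0

n = len(ages)
df = n - 1                                   # degrees of freedom of the t distribution
t_stat = (np.mean(ages) - popmean) / (np.std(ages, ddof=1) / np.sqrt(n))

p_two_sided = 2 * t.sf(abs(t_stat), df)      # P(|T| >= |t_stat|)
p_greater = t.sf(t_stat, df)                 # one-sided: P(T >= t_stat)

stat_ref, p_ref = ttest_1samp(ages, popmean) # scipy's two-sided result
print(t_stat, p_two_sided, p_greater, p_ref)
```

Recent SciPy releases also accept an alternative argument on ttest_1samp for one-sided tests (to my knowledge since version 1.6); check the documentation of your installed version before relying on it.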

Fast b-spline algorithm with numpy/scipy

I need to compute B-spline curves in Python. I looked into scipy.interpolate.splprep and a few other scipy modules but couldn't find anything that readily gave me what I needed. So I wrote my own module below. The code works fine, but it is slow (the test function runs in 0.03s, which seems like a lot considering I'm only asking for 100 samples with 6 control vertices).
Is there a way to simplify the code below with a few scipy module calls, which presumably would speed it up? And if not, what could I do to my code to improve its performance?
import numpy as np
# cv = np.array of 3d control vertices
# n = number of samples (default: 100)
# d = curve degree (default: cubic)
# closed = is the curve closed (periodic) or open? (default: open)
def bspline(cv, n=100, d=3, closed=False):
    # Create a range of u values
    count = len(cv)
    knots = None
    u = None
    if not closed:
        u = np.arange(0,n,dtype='float')/(n-1) * (count-d)
        knots = np.array([0]*d + list(range(count-d+1)) + [count-d]*d,dtype='int')
    else:
        u = ((np.arange(0,n,dtype='float')/(n-1) * count) - (0.5 * (d-1))) % count # keep u=0 relative to 1st cv
        knots = np.arange(0-d,count+d+d-1,dtype='int')
    # Simple Cox - DeBoor recursion
    def coxDeBoor(u, k, d):
        # Test for end conditions
        if (d == 0):
            if (knots[k] <= u and u < knots[k+1]):
                return 1
            return 0
        Den1 = knots[k+d] - knots[k]
        Den2 = knots[k+d+1] - knots[k+1]
        Eq1 = 0
        Eq2 = 0
        if Den1 > 0:
            Eq1 = ((u-knots[k]) / Den1) * coxDeBoor(u,k,(d-1))
        if Den2 > 0:
            Eq2 = ((knots[k+d+1]-u) / Den2) * coxDeBoor(u,(k+1),(d-1))
        return Eq1 + Eq2
    # Sample the curve at each u value
    samples = np.zeros((n,3))
    for i in range(n):
        if not closed:
            if u[i] == count-d:
                # the last sample sits exactly on the last control vertex
                samples[i] = np.array(cv[-1])
            else:
                for k in range(count):
                    samples[i] += coxDeBoor(u[i],k,d) * cv[k]
        else:
            for k in range(count+d):
                samples[i] += coxDeBoor(u[i],k,d) * cv[k%count]
    return samples
if __name__ == "__main__":
    import matplotlib.pyplot as plt
    def test(closed):
        cv = np.array([[ 50., 25., -0.],
                       [ 59., 12., -0.],
                       [ 50., 10., 0.],
                       [ 57., 2., 0.],
                       [ 40., 4., 0.],
                       [ 40., 14., -0.]])
        p = bspline(cv,closed=closed)
        x,y,z = p.T
        cv = cv.T
        plt.plot(cv[0],cv[1], 'o-', label='Control Points')
        plt.plot(x,y,'k-',label='Curve')
        plt.minorticks_on()
        plt.legend()
        plt.xlabel('x')
        plt.ylabel('y')
        plt.xlim(35, 70)
        plt.ylim(0, 30)
        plt.gca().set_aspect('equal', adjustable='box')
        plt.show()
    test(False)
[Two figures: what my code returns with closed=False and with closed=True]

So after obsessing a lot about my question, and much research, I finally have my answer. Everything is available in scipy, and I'm putting my code here so hopefully someone else can find it useful.
The function takes an array of N-d points, a curve degree and a periodic state (open or closed), and returns n samples along that curve. There are ways to make sure the curve samples are equidistant, but for the time being I'll focus on this question, as it is all about speed.
Worth noting: I can't seem to go beyond a curve of 20th degree. Granted, that's overkill already, but I figured it's worth mentioning.
Also worth noting: on my machine the code below can calculate 100,000 samples in 0.017s.
import numpy as np
import scipy.interpolate as si
def bspline(cv, n=100, degree=3, periodic=False):
    """ Calculate n samples on a bspline
        cv :      Array of control vertices
        n  :      Number of samples to return
        degree:   Curve degree
        periodic: True - Curve is closed
                  False - Curve is open
    """
    # If periodic, extend the point array by count+degree+1
    cv = np.asarray(cv)
    count = len(cv)
    if periodic:
        factor, fraction = divmod(count+degree+1, count)
        cv = np.concatenate((cv,) * factor + (cv[:fraction],))
        count = len(cv)
        degree = np.clip(degree,1,degree)
    # If opened, prevent degree from exceeding count-1
    else:
        degree = np.clip(degree,1,count-1)
    # Calculate knot vector
    kv = None
    if periodic:
        kv = np.arange(0-degree,count+degree+degree-1)
    else:
        kv = np.clip(np.arange(count+degree+1)-degree,0,count-degree)
    # Calculate query range
    u = np.linspace(periodic,(count-degree),n)
    # Calculate result
    return np.array(si.splev(u, (kv,cv.T,degree))).T
To test it:
import matplotlib.pyplot as plt
colors = ('b', 'g', 'r', 'c', 'm', 'y', 'k')
cv = np.array([[ 50., 25.],
[ 59., 12.],
[ 50., 10.],
[ 57., 2.],
[ 40., 4.],
[ 40., 14.]])
plt.plot(cv[:,0],cv[:,1], 'o-', label='Control Points')
for d in range(1,21):
    p = bspline(cv,n=100,degree=d,periodic=True)
    x,y = p.T
    # '-' instead of 'k-': the color is already set by the color keyword
    plt.plot(x,y,'-',label='Degree %s'%d,color=colors[d%len(colors)])
plt.minorticks_on()
plt.legend()
plt.xlabel('x')
plt.ylabel('y')
plt.xlim(35, 70)
plt.ylim(0, 30)
plt.gca().set_aspect('equal', adjustable='box')
plt.show()
[Figures: results for both open and periodic curves]
ADDENDUM
As of scipy-0.19.0 there is a new scipy.interpolate.BSpline class that can be used.
import numpy as np
import scipy.interpolate as si
def scipy_bspline(cv, n=100, degree=3, periodic=False):
    """ Calculate n samples on a bspline
        cv :      Array of control vertices
        n  :      Number of samples to return
        degree:   Curve degree
        periodic: True - Curve is closed
    """
    cv = np.asarray(cv)
    count = cv.shape[0]
    # Closed curve
    if periodic:
        kv = np.arange(-degree,count+degree+1)
        factor, fraction = divmod(count+degree+1, count)
        cv = np.roll(np.concatenate((cv,) * factor + (cv[:fraction],)),-1,axis=0)
        degree = np.clip(degree,1,degree)
    # Opened curve
    else:
        degree = np.clip(degree,1,count-1)
        kv = np.clip(np.arange(count+degree+1)-degree,0,count-degree)
    # Return samples
    max_param = count - (degree * (1-periodic))
    spl = si.BSpline(kv, cv, degree)
    return spl(np.linspace(0,max_param,n))
Testing for equivalency:
p1 = bspline(cv,n=10**6,degree=3,periodic=True) # 1 million samples: 0.0882 sec
p2 = scipy_bspline(cv,n=10**6,degree=3,periodic=True) # 1 million samples: 0.0789 sec
print(np.allclose(p1,p2)) # prints True

Giving optimization tips without profiling data is a bit like shooting in the dark. However, the function coxDeBoor seems to be called very often. This is where I would start optimizing.
Function calls in Python are expensive. You should try to replace the coxDeBoor recursion with iteration to avoid excessive function calls. Some general information on how to do this can be found in answers to this question. As a stack or queue you can use collections.deque.
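One way to get rid of the recursion is de Boor's algorithm, which evaluates the curve at a point with two nested loops over the control points that influence it. Note that this swaps in a different technique from an explicit stack or deque. The sketch below is an assumption-laden illustration: the function name de_boor and the span lookup are my own, and the result is checked against scipy.interpolate.BSpline (which needs scipy 0.19 or later) on a clamped cubic curve:

```python
import numpy as np
from scipy.interpolate import BSpline

def de_boor(x, t, c, p):
    """Evaluate a degree-p B-spline with knots t and control points c at x."""
    t = np.asarray(t, dtype=float)
    c = np.asarray(c, dtype=float)
    # Index k of the knot span with t[k] <= x < t[k+1], kept inside the
    # valid range so that the right end point uses the last non-empty span
    k = int(np.clip(np.searchsorted(t, x, side="right") - 1, p, len(c) - 1))
    # The p + 1 control points that influence this span
    d = [c[j + k - p] for j in range(p + 1)]
    # Repeated linear interpolation, no recursion and no basis functions
    for r in range(1, p + 1):
        for j in range(p, r - 1, -1):
            alpha = (x - t[j + k - p]) / (t[j + 1 + k - r] - t[j + k - p])
            d[j] = (1.0 - alpha) * d[j - 1] + alpha * d[j]
    return d[p]

cv = np.array([[50., 25.], [59., 12.], [50., 10.],
               [57., 2.], [40., 4.], [40., 14.]])
degree = 3
count = len(cv)
# Clamped (open) knot vector, built the same way as in the scipy-based answer
kv = np.clip(np.arange(count + degree + 1) - degree, 0, count - degree).astype(float)

u = np.linspace(0, count - degree, 25)
curve = np.array([de_boor(x, kv, cv, degree) for x in u])
reference = BSpline(kv, cv, degree)(u)
print(np.allclose(curve, reference))
```

This still loops in Python for every sample, so it mainly removes the call overhead of the recursion; the vectorized scipy routines remain the faster option for many samples.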

Fitting negative binomial in python

In scipy there is no support for fitting a negative binomial distribution to data
(maybe due to the fact that the negative binomial in scipy is only discrete).
For a normal distribution I would just do:
from scipy.stats import norm
param = norm.fit(samp)
Is there a similar 'ready to use' function in any other library?

Statsmodels has discrete.discrete_model.NegativeBinomial.fit(), see here:
https://www.statsmodels.org/dev/generated/statsmodels.discrete.discrete_model.NegativeBinomial.fit.html#statsmodels.discrete.discrete_model.NegativeBinomial.fit

It is not only because it is discrete; it is also because a maximum likelihood fit of a negative binomial can be quite involved, especially with an additional location parameter. That would be the reason why a .fit() method is not provided for it (or for the other discrete distributions in Scipy). Here is an example:
In [163]:
import numpy as np
import matplotlib.pyplot as plt
import scipy.stats as ss
import scipy.optimize as so
In [164]:
#define a log-likelihood function (pass neg=-1 to get the negative log-likelihood for minimization)
def likelihood_f(P, x, neg=1):
    n=np.round(P[0]) #by definition, it should be an integer
    p=P[1]
    loc=np.round(P[2])
    return neg*(np.log(ss.nbinom.pmf(x, n, p, loc))).sum()
In [165]:
#generate a random variable
X=ss.nbinom.rvs(n=100, p=0.4, loc=0, size=1000)
In [166]:
#The log-likelihood at the true parameters
likelihood_f([100,0.4,0], X)
Out[166]:
-4400.3696690513316
In [167]:
#A simple fit, the fit is not good and the parameter estimate is way off
result=so.fmin(likelihood_f, [50, 1, 1], args=(X,-1), full_output=True, disp=False)
P1=result[0]
(result[1], result[0])
Out[167]:
(4418.599495886474, array([ 59.61196161, 0.28650831, 1.15141838]))
In [168]:
#Try a different set of start parameters, the fit is still not good and the parameter estimate is still way off
result=so.fmin(likelihood_f, [50, 0.5, 0], args=(X,-1), full_output=True, disp=False)
P1=result[0]
(result[1], result[0])
Out[168]:
(4417.1495981801972,
array([ 6.24809397e+01, 2.91877405e-01, 6.63343536e-04]))
In [169]:
#In this case we need a loop over starting values of n to get it right
result=[]
for i in range(40, 120): #in fact (80, 120) should probably be enough
    _=so.fmin(likelihood_f, [i, 0.5, 0], args=(X,-1), full_output=True, disp=False)
    result.append((_[1], _[0]))
In [170]:
#get the MLE: the run with the smallest negative log-likelihood
P2=sorted(result, key=lambda x: x[0])[0][1]
sorted(result, key=lambda x: x[0])[0]
Out[170]:
(4399.780263084549,
array([ 9.37289361e+01, 3.84587087e-01, 3.36856705e-04]))
In [171]:
#Which one is visually better?
plt.hist(X, bins=20, density=True) #density replaces the removed normed argument
plt.plot(range(260), ss.nbinom.pmf(range(260), np.round(P1[0]), P1[1], np.round(P1[2])), 'g-')
plt.plot(range(260), ss.nbinom.pmf(range(260), np.round(P2[0]), P2[1], np.round(P2[2])), 'r-')
Out[171]:
[<matplotlib.lines.Line2D at 0x109776c10>]

I know this thread is quite old, but current readers may want to look at this repo which is made for this purpose: https://github.com/gokceneraslan/fit_nbinom
There's also an implementation here, though part of a larger package: https://github.com/ernstlab/ChromTime/blob/master/optimize.py

I stumbled across this thread, and found an answer for anyone else wondering.
If you simply need the n, p parameterisation used by scipy.stats.nbinom you can convert the mean and variance estimates:
mu = np.mean(sample)
sigma_sqr = np.var(sample)
n = mu**2 / (sigma_sqr - mu)
p = mu / sigma_sqr
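As a quick sanity check of the moment conversion above, the following sketch draws a large sample with known parameters and recovers them. The function name nbinom_moments_fit, the chosen true values and the seed are assumptions for illustration. The guard reflects that the formulas only make sense for overdispersed data (variance larger than the mean):

```python
import numpy as np
from scipy.stats import nbinom

def nbinom_moments_fit(sample):
    """Method-of-moments estimates (n, p) in scipy.stats.nbinom's parameterization."""
    mu = np.mean(sample)
    sigma_sqr = np.var(sample)
    if sigma_sqr <= mu:
        # n = mu**2 / (sigma_sqr - mu) would be negative or infinite here
        raise ValueError("variance must exceed the mean for a negative binomial")
    n = mu**2 / (sigma_sqr - mu)
    p = mu / sigma_sqr
    return n, p

true_n, true_p = 5, 0.3
sample = nbinom.rvs(n=true_n, p=true_p, size=100_000, random_state=0)
n_est, p_est = nbinom_moments_fit(sample)
print("n estimate:", n_est, "p estimate:", p_est)
```

These are moment estimates, not maximum likelihood estimates, so for small samples they can differ noticeably from what an MLE fit returns.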
If you want the dispersion parameter, you can use a negative binomial regression model from statsmodels with just an intercept term. This will find the dispersion parameter alpha using MLE.
# Data processing
import pandas as pd
import numpy as np
# Analysis models
import statsmodels.formula.api as smf
from scipy.stats import nbinom
def convert_params(mu, alpha):
    """
    Convert mean/dispersion parameterization of a negative binomial to the ones scipy supports
    Parameters
    ----------
    mu : float
        Mean of NB distribution.
    alpha : float
        Overdispersion parameter used for variance calculation.
    See https://en.wikipedia.org/wiki/Negative_binomial_distribution#Alternative_formulations
    """
    var = mu + alpha * mu ** 2
    p = mu / var
    r = mu ** 2 / (var - mu)
    return r, p
# Generate sample data
n = 2
p = 0.9
sample = nbinom.rvs(n=n, p=p, size=10000)
# Estimate parameters
## Mean estimates expectation parameter for negative binomial distribution
mu = np.mean(sample)
## Dispersion parameter from nb model with only an intercept term
nbfit = smf.negativebinomial("nbdata ~ 1", data=pd.DataFrame({"nbdata": sample})).fit()
alpha = nbfit.params.iloc[1] # Dispersion parameter (second entry, after the intercept)
# Convert parameters to n, p parameterization
n_est, p_est = convert_params(mu, alpha)
# Check that estimates are close to the true values:
print("""
                 {:<3} {:<3}
True parameters: {:<3} {:<3}
Estimates      : {:<3} {:<3}""".format('n', 'p', n, p,
                                      np.round(n_est, 2), np.round(p_est, 2)))
