Extending the examples from http://implicit-layers-tutorial.org/neural_odes/, I am trying to mimic the curve-fitting function in SciPy, scipy.optimize.curve_fit, using Google JAX. The function to be fitted is a first-order ODE.
#Generate toy data for first order ode.
import jax.numpy as jnp
import jax
import numpy as np

#input data
u = np.zeros(100)
u[10:50] = 1
t = np.arange(len(u))
u = jnp.array(u)

#first order ODE
def f(y, t, k, tau, u):
    return (k*u[t] - y)/tau

#Euler integration
def odeint_euler(f, y0, t, *args):
    def step(state, t):
        y_prev, t_prev = state
        dt = t - t_prev
        y = y_prev + dt * f(y_prev, t_prev, *args)
        return (y, t), y
    _, ys = jax.lax.scan(step, (y0, t[0]), t[1:])
    return ys

pred = odeint_euler(f, jnp.array([0.0]), t, 2., 5., u)
pred_noise = pred.reshape(-1) + 0.05*np.random.randn(len(pred))  # this is the data to be fitted

# define loss function
def loss_function(params, u, targets):
    k, tau = params
    pred = odeint_euler(f, jnp.array([0.0]), t, k, tau, u)
    return jnp.sum((pred - targets)**2)

def update(params, u, targets):
    grads = jax.grad(loss_function)(params, u, targets)
    return [w - 0.0001*dw for w, dw in zip(params, grads)]

updated_params = jnp.array([1.0, 2.0])  # initial parameters
for i in range(100):
    updated_params = update(updated_params, u, pred_noise)
print(updated_params)
The code works fine. However, it runs pretty slowly compared to SciPy's curve fit, and the accuracy of the solution is not good even after 500 or 1000 iterations.
What is wrong with the above code? Any idea how to make it run faster and get a more accurate solution? Is there a better way of doing the curve fitting with JAX?
I see two overall issues with your approach:
The reason your code is running slowly is that you are doing your looping in Python, which incurs JAX's dispatch overhead on every iteration. I'd recommend using JAX's built-in tools for minimizing loss functions; for example:
from jax.scipy.optimize import minimize

result = minimize(
    loss_function, x0=jnp.array([1.0, 2.0]),
    method='BFGS', args=(u, pred_noise))
The reason your accuracy does not approach that of SciPy is likely that JAX defaults to 32-bit computations (see Double (64 bit) Precision in the JAX docs). To run your code in 64-bit, you can run this block before any other imports:
from jax import config
config.update('jax_enable_x64', True)
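If you would rather keep the explicit gradient-descent loop from the question, one more option (a sketch of my own, not part of the original answer; jitted_update is a made-up name) is to jit-compile the update step so the per-iteration dispatch overhead largely disappears:

# assumes f, odeint_euler, loss_function, t, u and pred_noise from the question are defined
@jax.jit
def jitted_update(params, u, targets):
    # same plain gradient-descent step as the question's update(), compiled once and reused
    grads = jax.grad(loss_function)(params, u, targets)
    return params - 0.0001*grads

params = jnp.array([1.0, 2.0])
for i in range(1000):
    params = jitted_update(params, u, pred_noise)
print(params)

The first call pays the compilation cost; the remaining iterations are cheap.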
I am trying to fit the two equations below using Python's leastsq method, but I am not sure whether this is the right approach. The first equation contains an incomplete gamma function, while the second one is slightly more complex: along with an exponential, it contains a term that is obtained from a separate fitting formula.
J_mg = T_incomplete(hw/T_mag)
J_nmg = e^(-hw/T)*g(w,T)
Here g is a function of w and T and is calculated using a given fitting formula.
I am following the steps outlined in this question.
Here is what I have done
import numpy as np
from scipy.optimize import leastsq
from scipy.special import gammaincc
from scipy.special import gamma
from matplotlib.pyplot import plot

# generating data
NPTS = 10
hw = np.linspace(0.5, 10, NPTS)
j1 = np.linspace(0.001, 10, NPTS)
j2 = np.linspace(0.003, 10, NPTS)
T_mag = np.linspace(0.3, 0.5, NPTS)

# defining functions
def calc_gaunt_factor(hw, T):
    fitting_coeff = np.loadtxt('fitting_coeff.txt', skiprows=1)
    # T is in KeV
    # K_b = 8.6173303(50)e-5 eV/K
    g = 0
    gamma = 0.0136/T
    theta = hw/T
    A = (np.log10(gamma**2) + 0.5)*0.4
    B = (np.log10(theta) + 1.5)*0.4
    for i in range(11):
        for j in range(11):
            g_ij = fitting_coeff[i][j]*(A**i)*(B**j)
            g = g_ij + g
    return g

def j_w_mag(hw, T_mag):
    order = 0.001
    return np.sqrt(1/T_mag)*gamma(order)*gammaincc(order, hw/T_mag)

def j_w_nonmag(hw, T):
    gamma = 0.0136/T
    theta = hw/T
    return np.sqrt(1/T)*np.exp((-hw)/T)*calc_gaunt_factor(hw, T)

def residual_func(T, T_mag, hw, j1, j2):
    err_unmag = np.nan_to_num(j1 - j_w_nonmag(hw, T))
    err_mag = np.nan_to_num(j2 - j_w_mag(hw, T_mag))
    err = np.concatenate((err_unmag, err_mag))
    return err

par_init = np.array([.35])
best, cov, info, message, ler = leastsq(residual_func, par_init, args=(T_mag, hw, j1, j2), full_output=True)
print("Best-Fit Parameters:")
print("T=%s" % (best[0]))
I am getting a weird value for my fitting parameter, T. Is this the right approach? Thanks.
I have some data and want to fit a given psychometric function p.
I'm interested in the fit parameters and the errors as well. With the 'classical' method using the curve_fit function from the scipy package it's easy to get the parameters of p and the errors. However, I want to do the same using a maximum likelihood estimation (MLE). From the output and the figure you can see that both methods give slightly different parameters. Implementing the MLE is not the problem, but I don't know how to get the errors using this method. Is there an easy way to get them? My likelihood function L is:
L = prod_i p(x_i; x50, s50)^(5*y_i) * (1 - p(x_i; x50, s50))^(5*(1 - y_i))
I was not able to adapt the code described here http://rlhick.people.wm.edu/posts/estimating-custom-mle.html but this is probably a solution. How can I implement this? Or is there any other way?
A similar function is fitted here using scipy stats models: https://stats.stackexchange.com/questions/66199/maximum-likelihood-curve-model-fitting-in-python. However, the errors of the parameters are not calculated there either.
The negative log-likelihood function is correct, since it yields the right parameters, but I was wondering whether this function depends on the y-data. The negative log-likelihood function l is obviously l = -ln(L).
Here is my code:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

## library
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import minimize

def p(x, x50, s50):
    """return y value of psychometric function p"""
    return 1./(1+np.exp(4.*s50*(x50-x)))

def initialparams(x, y):
    """return initial fit parameters for function p with given dataset"""
    midpoint = np.mean(x)
    slope = (np.max(y)-np.min(y))/(np.max(x)-np.min(x))
    return [midpoint, slope]

def cfit_error(pcov):
    """return errors of fit from covariance matrix"""
    return np.sqrt(np.diag(pcov))

def neg_loglike(params):
    """analytical negative log likelihood function. Depends on the dataset (x and y) and the two parameters x50 and s50."""
    x50 = params[0]
    s50 = params[1]
    n = len(xdata)
    prod = 1.
    for i in range(n):
        #print prod
        prod *= p(xdata[i], x50, s50)**(ydata[i]*5) * (1-p(xdata[i], x50, s50))**((1.-ydata[i])*5)
    return -np.log(prod)

xdata = [0.,-7.5,-9.,-13.500001,-12.436171,-16.208617,-13.533123,-12.998025,-13.377527,-12.570075,-13.320075,-13.070075,-11.820075,-12.070075,-12.820075,-13.070075,-12.320075,-12.570075,-11.320075,-12.070075]
ydata = [1.,0.6,0.8,0.4,1.,0.,0.4,0.6,0.2,0.8,0.4,0.,0.6,0.8,0.6,0.2,0.6,0.,0.8,0.6]
intparams = initialparams(xdata, ydata)  ## guess some initial parameters

## normal curve fit using least squares algorithm
popt, pcov = curve_fit(p, xdata, ydata, p0=intparams)
print('scipy.optimize.curve_fit:')
print('x50 = {:f} +- {:f}'.format(popt[0], cfit_error(pcov)[0]))
print('s50 = {:f} +- {:f}\n'.format(popt[1], cfit_error(pcov)[1]))

## fitting using maximum likelihood estimation
results = minimize(neg_loglike, initialparams(xdata, ydata), method='Nelder-Mead')
print('MLE with self defined likelihood-function:')
print('x50 = {:f}'.format(results.x[0]))
print('s50 = {:f}'.format(results.x[1]))
#print results

## plotting the data and results
xfit = np.arange(-20, 1, 0.1)
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(xdata, ydata, 'xb', label='measured data')
ax.plot(xfit, p(xfit, *popt), '-r', label='curve fit')
ax.plot(xfit, p(xfit, *results.x), '-g', label='MLE')
plt.legend()
plt.show()
The output is:
scipy.optimize.curve_fit:
x50 = -12.681586 +- 0.252561
s50 = 0.264371 +- 0.117911
MLE with self defined likelihood-function:
x50 = -12.406544
s50 = 0.107389
Both fits and measured data can be seen here:
My Python version is 2.7 on Debian Stretch. Thank you for your help.
Finally, the method described by Rob Hicks (http://rlhick.people.wm.edu/posts/estimating-custom-mle.html) worked out. After installing numdifftools, I could calculate the errors of the estimated parameters from the Hessian matrix.
Installing numdifftools on Linux with su rights:
apt-get install python-pip
pip install numdifftools
A complete code example of my program from above is here:
#!/usr/bin/env python
# -*- coding: utf-8 -*-

## library
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
from scipy.optimize import minimize
import numdifftools as ndt

def p(x, x50, s50):
    """return y value of psychometric function p"""
    return 1./(1+np.exp(4.*s50*(x50-x)))

def initialparams(x, y):
    """return initial fit parameters for function p with given dataset"""
    midpoint = np.mean(x)
    slope = (np.max(y)-np.min(y))/(np.max(x)-np.min(x))
    return [midpoint, slope]

def cfit_error(pcov):
    """return errors of fit from covariance matrix"""
    return np.sqrt(np.diag(pcov))

def neg_loglike(params):
    """analytical negative log likelihood function. Depends on the dataset (x and y) and the two parameters x50 and s50."""
    x50 = params[0]
    s50 = params[1]
    n = len(xdata)
    prod = 1.
    for i in range(n):
        #print prod
        prod *= p(xdata[i], x50, s50)**(ydata[i]*5) * (1-p(xdata[i], x50, s50))**((1.-ydata[i])*5)
    return -np.log(prod)

xdata = [0.,-7.5,-9.,-13.500001,-12.436171,-16.208617,-13.533123,-12.998025,-13.377527,-12.570075,-13.320075,-13.070075,-11.820075,-12.070075,-12.820075,-13.070075,-12.320075,-12.570075,-11.320075,-12.070075]
ydata = [1.,0.6,0.8,0.4,1.,0.,0.4,0.6,0.2,0.8,0.4,0.,0.6,0.8,0.6,0.2,0.6,0.,0.8,0.6]
intparams = initialparams(xdata, ydata)  ## guess some initial parameters

## normal curve fit using least squares algorithm
popt, pcov = curve_fit(p, xdata, ydata, p0=intparams)
print('scipy.optimize.curve_fit:')
print('x50 = {:f} +- {:f}'.format(popt[0], cfit_error(pcov)[0]))
print('s50 = {:f} +- {:f}\n'.format(popt[1], cfit_error(pcov)[1]))

## fitting using maximum likelihood estimation
results = minimize(neg_loglike, initialparams(xdata, ydata), method='Nelder-Mead')

## calculating errors from hessian matrix using numdifftools
Hfun = ndt.Hessian(neg_loglike, full_output=True)
hessian_ndt, info = Hfun(results.x)
se = np.sqrt(np.diag(np.linalg.inv(hessian_ndt)))

print('MLE with self defined likelihood-function:')
print('x50 = {:f} +- {:f}'.format(results.x[0], se[0]))
print('s50 = {:f} +- {:f}'.format(results.x[1], se[1]))
Generates the following output:
scipy.optimize.curve_fit:
x50 = -18.702375 +- 1.246728
s50 = 0.063620 +- 0.041207
MLE with self defined likelihood-function:
x50 = -18.572181 +- 0.779847
s50 = 0.078935 +- 0.028783
However, some runtime errors (division by zero) occur when calculating the Hessian matrix with numdifftools. This is maybe because of my self-defined neg_loglike function. In the end there are still results for the errors. The method using "Extending Statsmodels" is probably more elegant, but I couldn't figure it out.
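One way to avoid the division-by-zero problem (a sketch of my own, not part of the original approach; neg_loglike_stable and eps are made-up names, while p, xdata, ydata and initialparams come from the script above) is to sum logarithms instead of multiplying the per-point probabilities, so the product never underflows before the log is taken:

def neg_loglike_stable(params, xdata, ydata, eps=1e-12):
    """same likelihood as above, written as a sum of logs; eps keeps probabilities away from 0 and 1"""
    x50, s50 = params
    x = np.asarray(xdata)
    y = np.asarray(ydata)
    prob = np.clip(p(x, x50, s50), eps, 1. - eps)
    return -np.sum(5.*y*np.log(prob) + 5.*(1. - y)*np.log(1. - prob))

results = minimize(lambda params: neg_loglike_stable(params, xdata, ydata),
                   initialparams(xdata, ydata), method='Nelder-Mead')

Since ndt.Hessian expects a function of the parameters only, the same lambda can be passed to it as well.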
I am solving this system of equations with tensorflow:
f1 = y - x*x = 0
f2 = x - (y - 2)*(y - 2) + 1.1 = 0
If I choose a bad starting point (x,y)=(-1.3,2), then I get stuck in a local minimum when optimising f1^2+f2^2 with this code:
f1 = y - x*x
f2 = x - (y - 2)*(y - 2) + 1.1
sq = f1*f1 + f2*f2
o = tf.train.AdamOptimizer(1e-1).minimize(sq)
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run([init])
    for i in range(50):
        sess.run([o])
    r = sess.run([x, y, f1, f2])
    print("x", r)
How can I escape this local minimum with built-in TensorFlow tools? Maybe there is another TF approach I can use to solve this equation starting from this bad point?
At the moment, there is no global optimization method built into TensorFlow. There is a window onto the scipy world via ScipyOptimizerInterface, but it (currently?) only wraps scipy's minimize, which is a local minimizer.
However, you can still treat TensorFlow's execution result like any other function and feed it to the optimizer of your choice. Say you want to experiment with scipy's basinhopping global optimizer. You could write:
import numpy as np
from scipy.optimize import basinhopping
import tensorflow as tf

v = tf.placeholder(dtype=tf.float32, shape=(2,))
x = v[0]
y = v[1]
f1 = y - x*x
f2 = x - (y - 2)*(y - 2) + 1.1
sq = f1 * f1 + f2 * f2

starting_point = np.array([-1.3, 2.0], np.float32)

with tf.Session() as sess:
    o = basinhopping(lambda x: sess.run(sq, {v: x}), x0=starting_point, T=10, niter=1000)
print(o.x)
# [0.76925635 0.63757862]
(I had to tweak basinhopping's temperature and number of iterations, as the default values would often not let the solution get out of the basin of the local minimum taken as the starting point here.)
What you lose by treating TensorFlow as a black box is that the optimizer does not have access to the gradients that are automatically computed by TensorFlow. In that sense, it is not optimal -- though you still benefit from GPU acceleration when computing your function.
EDIT
Since you can explicitly provide the gradients to the local minimizer used by basinhopping, you can feed in the result of TensorFlow's gradients:
import numpy as np
from scipy.optimize import basinhopping
import tensorflow as tf

v = tf.placeholder(dtype=tf.float32, shape=(2,))
x = v[0]
y = v[1]
f1 = y - x*x
f2 = x - (y - 2)*(y - 2) + 1.1
sq = f1 * f1 + f2 * f2
sq_grad = tf.gradients(sq, v)[0]

init_value = np.array([-1.3, 2.0], np.float32)

with tf.Session() as sess:
    def f(x):
        return sess.run(sq, {v: x})
    def g(x):
        return sess.run(sq_grad, {v: x})
    o = basinhopping(f, x0=init_value, T=10.0, niter=1000, minimizer_kwargs={'jac': g})
print(o.x)
# [0.79057982 0.62501636]
For some reason, this is much slower than without providing the gradient -- however, it could be that, when gradients are provided, the minimization algorithm is not the same, so the comparison may not make sense.
TensorFlow (TF) does not include built-in global optimization methods. Depending on the initialization, all gradient-based methods (such as Adam) in TF can converge to a local minimum for non-convex loss functions. This is generally acceptable (if not desirable) for large neural networks due to over-fitting issues when approaching the global minimum.
For this particular problem what you may want is root-solving routines from scipy:
https://docs.scipy.org/doc/scipy/reference/optimize.html#root-finding
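As a rough illustration of that suggestion (my own sketch, not part of the original answer), the system from the question can be handed directly to scipy.optimize.root:

from scipy.optimize import root

def system(v):
    x, y = v
    # the two residuals from the question
    return [y - x*x, x - (y - 2.)*(y - 2.) + 1.1]

sol = root(system, x0=[-1.3, 2.0])  # same "bad" starting point
print(sol.success, sol.x)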
Can anyone provide an example of passing a Jacobian to the integrate.odeint function in SciPy?
I tried to run this code from the SciPy tutorial's odeint example, but it seems that Dfun() (the Jacobian function) is never called.
from numpy import *  # added
from scipy.integrate import odeint
from scipy.special import gamma, airy

y1_0 = 1.0/3**(2.0/3.0)/gamma(2.0/3.0)
y0_0 = -1.0/3**(1.0/3.0)/gamma(1.0/3.0)
y0 = [y0_0, y1_0]

def func(y, t):
    return [t*y[1], y[0]]

def gradient(y, t):
    print 'jacobian'  # added
    return [[0, t], [1, 0]]

x = arange(0, 4.0, 0.01)
t = x
ychk = airy(x)[0]
y = odeint(func, y0, t)
y2 = odeint(func, y0, t, Dfun=gradient)
print y2  # added
Under the hood, scipy.integrate.odeint uses the LSODA solver from the ODEPACK FORTRAN library. In order to deal with situations where the function you are trying to integrate is stiff, LSODA switches adaptively between two different methods for computing the integral - Adams' method, which is faster but unsuitable for stiff systems, and BDF, which is slower but robust to stiffness.
The particular function you're trying to integrate is non-stiff, so LSODA will use Adams on every iteration. You can check this by returning the infodict (...,full_output=True) and checking infodict['mused'].
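As a small illustration of that check (my own snippet, not in the original answer), using the code from the question:

y2, infodict = odeint(func, y0, t, Dfun=gradient, full_output=True)
print infodict['mused']  # all 1s for this non-stiff problem: 1 = Adams, 2 = BDF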
Since Adams' method does not use the Jacobian, your gradient function never gets called. However if you give odeint a stiff function to integrate, such as the Van der Pol equation:
def vanderpol(y, t, mu=1000.):
    return [y[1], mu*(1. - y[0]**2)*y[1] - y[0]]

def vanderpol_jac(y, t, mu=1000.):
    return [[0, 1], [-2*y[0]*y[1]*mu - 1, mu*(1 - y[0]**2)]]

y0 = [2, 0]
t = arange(0, 5000, 1)
y, info = odeint(vanderpol, y0, t, Dfun=vanderpol_jac, full_output=True)

print info['mused']  # method used (1=adams, 2=bdf)
print info['nje']    # cumulative number of jacobian evaluations
plot(t, y[:, 0])
you should see that odeint switches to using BDF, and the Jacobian function now gets called.
If you want more control over the solver, you should look into scipy.integrate.ode, which is a much more flexible object-oriented interface to multiple different integrators.
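For completeness, here is a minimal sketch of that object-oriented interface for the same Van der Pol system (my own illustration, not part of the original answer; f_ode and jac_ode are wrappers I introduce around vanderpol and vanderpol_jac from above):

from scipy.integrate import ode

# scipy.integrate.ode expects f(t, y) rather than f(y, t), so swap the argument order
f_ode = lambda t, y: vanderpol(y, t)
jac_ode = lambda t, y: vanderpol_jac(y, t)

solver = ode(f_ode, jac_ode)
solver.set_integrator('vode', method='bdf')  # a stiff (BDF) method that uses the supplied Jacobian
solver.set_initial_value([2, 0], 0)
while solver.successful() and solver.t < 5000:
    solver.integrate(solver.t + 1)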
I am a little out of my depth in terms of the math involved in my problem, so I apologise for any incorrect nomenclature.
I was looking at using the scipy function leastsq, but am not sure if it is the correct function.
I have the following equation:
eq = lambda PLP,p0,l0,kd : 0.5*(-1-((p0+l0)/kd) + np.sqrt(4*(l0/kd)+(((l0-p0)/kd)-1)**2))
I have data (8 sets) for all the terms except for kd (PLP,p0,l0). I need to find the value of kd by non-linear regression of the above equation.
From the examples I have read, leastsq does not seem to allow for inputting the data to get the output I need.
Thank you for your help
This is a bare-bones example of how to use scipy.optimize.leastsq:
import numpy as np
import scipy.optimize as optimize
import matplotlib.pylab as plt
def func(kd, p0, l0):
    return 0.5*(-1-((p0+l0)/kd) + np.sqrt(4*(l0/kd)+(((l0-p0)/kd)-1)**2))
The sum of the squares of the residuals is the function of kd we're trying to minimize:
def residuals(kd, p0, l0, PLP):
    return PLP - func(kd, p0, l0)
Here I generate some random data. You'd want to load your real data here instead.
N = 1000
kd_guess = 3.5  # <-- You have to supply a guess for kd
p0 = np.linspace(0, 10, N)
l0 = np.linspace(0, 10, N)
PLP = func(kd_guess, p0, l0) + (np.random.random(N) - 0.5)*0.1

kd, cov, infodict, mesg, ier = optimize.leastsq(
    residuals, kd_guess, args=(p0, l0, PLP), full_output=True)
print(kd)
yields something like
3.49914274899
This is the best fit value for kd found by optimize.leastsq.
Here we generate the value of PLP using the value for kd we just found:
PLP_fit=func(kd,p0,l0)
Below is a plot of PLP versus p0. The blue line is from data, the red line is the best fit curve.
plt.plot(p0,PLP,'-b',p0,PLP_fit,'-r')
plt.show()
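If you also want an uncertainty estimate for kd (my own addition, following the usual leastsq recipe rather than anything in the original answer), the returned cov has to be scaled by the variance of the residuals:

res = residuals(kd, p0, l0, PLP)
s_sq = (res**2).sum() / (len(PLP) - len(kd))  # residual variance
kd_err = np.sqrt(cov[0][0] * s_sq)
print(kd[0], '+-', kd_err)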
Another option is to use lmfit.
They provide a great example to get you started:
#!/usr/bin/env python
#<examples/doc_basic.py>
from lmfit import minimize, Minimizer, Parameters, Parameter, report_fit
import numpy as np

# create data to be fitted
x = np.linspace(0, 15, 301)
data = (5. * np.sin(2*x - 0.1) * np.exp(-x*x*0.025) +
        np.random.normal(size=len(x), scale=0.2))

# define objective function: returns the array to be minimized
def fcn2min(params, x, data):
    """ model decaying sine wave, subtract data"""
    amp = params['amp']
    shift = params['shift']
    omega = params['omega']
    decay = params['decay']
    model = amp * np.sin(x * omega + shift) * np.exp(-x*x*decay)
    return model - data

# create a set of Parameters
params = Parameters()
params.add('amp', value=10, min=0)
params.add('decay', value=0.1)
params.add('shift', value=0.0, min=-np.pi/2., max=np.pi/2)
params.add('omega', value=3.0)

# do fit, here with leastsq model
minner = Minimizer(fcn2min, params, fcn_args=(x, data))
kws = {'options': {'maxiter': 10}}
result = minner.minimize()

# calculate final result
final = data + result.residual

# write error report
report_fit(result)

# try to plot results
try:
    import pylab
    pylab.plot(x, data, 'k+')
    pylab.plot(x, final, 'r')
    pylab.show()
except:
    pass
#<end of examples/doc_basic.py>
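As a small follow-up (my own note, assuming the usual lmfit API): report_fit already prints the estimated uncertainties, and you can also read them off each parameter programmatically:

for name, par in result.params.items():
    print(name, par.value, par.stderr)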