python curve_fit doesn't work with stiff model - python

I am trying to find the x0parameter which fits as much as possible the blue model on the green curve (x0control the width of the crenel; see below).
Here is my attempt:
from pylab import *
from scipy.optimize import curve_fit
x=linspace(0,2*pi,1000)
def crenel(x):return sign(sin(x))
def inverter(x,x0): return (crenel(x-x0)+crenel(x+x0))/2
p,e = curve_fit(inverter,x,sin(x),1)
plot(x,inverter(x,*p),x,sin(x))
ylim(-1.5,1.5)
By hand, the optimal value is x0 = arcsin(1/2) # 0.523598, but curve_fit doesn't estimate any value ( "OptimizeWarning: Covariance of the parameters could not be estimated") . I suspect the stiffness of the model. The docs inform :
The algorithm uses the Levenberg-Marquardt algorithm through leastsq. Additional keyword arguments are passed directly to that algorithm.
So my question is : Is there keyword arguments that can help curve_fit to estimate the parameter in this case ? or another approach ?
Thanks for any advice.

The problem is that the objective function that curve_fit tries to minimize is not continuous. x0 controls the location of the discontinuities in the inverter function. When a discontinuity crosses one of the grid points in x, there is a jump in the objective function. Between these points, the objective function is constant. curve_fit (actually, leastsq, the function used by curve_fit) is not designed to handle such a function.
The following function sse is (in effect) the function that curve_fit tries to minimize, with x being the same x defined in your example, and y = sin(x):
def sse(x0, x, y):
f = inverter(x, x0)
diff = y - f
s = (diff**2).sum()
return s
If you plot this function on a fine grid with code such as
xx = np.linspace(0, 1, 10000)
yy = [sse(x0, x, y) for x0 in xx]
plot(xx, yy)
and zoom in, you'll see
To use scipy to find your optimal value, you can use fmin with a smooth objective function. For example, here's the continuous objective function, using only the interval [0, pi/2] (quad is scipy.integrate.quad):
def func(x0):
s0, e0 = quad(lambda x: np.sin(x)**2, 0, x0)
s1, e0 = quad(lambda x: (1 - np.sin(x))**2, x0, 0.5*np.pi)
return s0 + s1
scipy.optimize.fmin can be used to find the minimum of that function, as in this snippet from an ipython session:
In [202]: fmin(func, 0.3, xtol=1e-8)
Optimization terminated successfully.
Current function value: 0.100545
Iterations: 28
Function evaluations: 56
Out[202]: array([ 0.52359878])
In [203]: np.arcsin(0.5)
Out[203]: 0.52359877559829882

Related

Solver does the integration without calling the derivative callback function

I have a python code (example from Cantera.org) that uses scipy.integrate.ode to solve a system of ODE. The code works fine and the results are reasnoable. However, I noticed something about the ode solver that does not make sense to me.
I have a put a print function inside (print("t inside ODE function", t)) the function the calculates the derivative vector (__call__(self, t, y)), and outside that function in the while loop (print("t outside ODE function", solver.t);).
I expect that inside print has to be called when the solver does the time integration, and then the outside print is called. In other words, two "t outside ODE function" cannot appear right after another without "t inside ODE function" in between. However, this occurs in some of iterations in the while loop, which mean the solver does the integration without calculating the derivatives.
I am wondering how this is possible
import cantera as ct
import numpy as np
import scipy.integrate
class ReactorOde:
def __init__(self, gas):
# Parameters of the ODE system and auxiliary data are stored in the
# ReactorOde object.
self.gas = gas
self.P = gas.P
def __call__(self, t, y):
"""the ODE function, y' = f(t,y) """
# State vector is [T, Y_1, Y_2, ... Y_K]
self.gas.set_unnormalized_mass_fractions(y[1:])
self.gas.TP = y[0], self.P
rho = self.gas.density
print("t inside ODE function", t)
wdot = self.gas.net_production_rates
dTdt = - (np.dot(self.gas.partial_molar_enthalpies, wdot) /
(rho * self.gas.cp))
dYdt = wdot * self.gas.molecular_weights / rho
return np.hstack((dTdt, dYdt))
gas = ct.Solution('gri30.yaml')
# Initial condition
P = ct.one_atm
gas.TPX = 1001, P, 'H2:2,O2:1,N2:4'
y0 = np.hstack((gas.T, gas.Y))
# Set up objects representing the ODE and the solver
ode = ReactorOde(gas)
solver = scipy.integrate.ode(ode)
solver.set_integrator('vode', method='bdf', with_jacobian=True)
solver.set_initial_value(y0, 0.0)
# Integrate the equations, keeping T(t) and Y(k,t)
t_end = 1e-3
states = ct.SolutionArray(gas, 1, extra={'t': [0.0]})
dt = 1e-5
while solver.successful() and solver.t < t_end:
solver.integrate(solver.t + dt)
gas.TPY = solver.y[0], P, solver.y[1:]
states.append(gas.state, t=solver.t)
print("t outside ODE function", solver.t);
print("\n")
The solver has an adaptive step size. Which means that it proceeds in internal steps that are adapted to the given error tolerances. In the segment from one step point to the next, the solution values get interpolated. Thus it can happen that a sequence of the external steps of the time loop falls into the same internal segment. If you set the error tolerances to smaller levels as the default ones, it can happen that the situation reverses, that several internal steps are required per external value request.

How to solve a simple boundary value problem for TISE on python

I am trying to solve the TISE for an infinite potential well V=0 on the interval [0,L]. The exercise gives us that the value of the wavefunction and its derivative at 0 is 0,1 respectively. This allows us to using the scipy.integrate.odeint function in order to solve the problem for a given energy value.
The task is to now find the energy eigenvalues given the further boundary condition that the wavefunction at L is 0, using a root finding function on python. I have done some research and could only find something called the 'shooting method' which I cannot figure out how to implement. Also, I have come across the solve BVP scipy function, however I can't seem to understand what exactly goes in the second input for this function (boundary condition residuals)
m_el = 9.1094e-31 # mass of electron in [kg]
hbar = 1.0546e-34 # Planck's constant over 2 pi [Js]
e_el = 1.6022e-19 # electron charge in [C]
L_bohr = 5.2918e-11 # Bohr radius [m]
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import odeint
def eqn(y, x, energy): #array of first order ODE's
y0 = y[1]
y1 = -2*m_el*energy*y[0]/hbar**2
return np.array([y0,y1])
def solve(energy, func): #use of odeint
p0 = 0
dp0 = 1
x = np.linspace(0,L_bohr,1000)
init = np.array([p0,dp0])
ysolve = odeint(func, init, x, args=(energy,))
return ysolve[-1,0]
The method here is to input eqn as func in solve(energy,func). L_bohr is the L value in this problem. We are trying to numerically find the energy eigenvalues using some scipy method
For all the other solvers in scipy the argument order x,y, and even in odeint one can use this order by giving the option tfirst=True. Thus change to
def eqn(x, y, energy): #array of first order ODE's
y0, y1 = y
y2 = -2*m_el*energy*y0/hbar**2
return [y1,y2]
For the BVP solver you have to think of the energy parameter as an
extra state component with zero derivative, thus adding a third slot
in the boundary conditions. Scipy's solve_bvp allows to keep it as parameter,
so that you get 3 slots in the boundary conditions, allowing to fix the first derivative at x=0 to select one non-trivial solution from the eigenspace.
def bc(y0, yL, E):
return [ y0[0], y0[1]-1, yL[0] ]
Next construct an initial state that is close to the suspected ground state and call the solver
x0 = np.linspace(0,L_bohr,6);
y0 = [ x0*(1-x0/L_bohr), 1-2*x0/L_bohr ]
E0 = 134*e_el
sol = solve_bvp(eqn, bc, x0, y0, p=[E0])
print(sol.message, " E=", sol.p[0]/e_el," eV")
and then produce the plot
x = np.linspace(0,L_bohr,1000)
plt.plot(x/L_bohr, sol.sol(x)[0]/L_bohr,'-+', ms=1)
plt.grid()
The algorithm converged to the desired accuracy. E= 134.29310361903723 eV

Python: scipy.optimize.minimize fails with "ValueError: setting an array element with a sequence." when calling on a function with x and y args

As stated in title, scipy.optimize.minimize fails with "ValueError: setting an array element with a sequence." when calling minimize.
I'm applying scipy.optimize.minimize to a function that uses the variables coef (coefficients that I'm optimizing) and xData and yData (data variables).
I'll provide an example code below. I am aware through searching on how to use minimize that the error stems from the function being minimized returning an array when it should return a scalar. I'm not sure why it is returning an array, though.
Importantly, scipy.optimize.least_squares works and it seems to share the same syntax as scipy.optimize.minimize. scipy.optimize.fmin does not work either and it's included as well - it's the same as minimize with the Nelder-Mead method, which I'm calling.
Here is some generalized example code that has the error on Python 3:
import numpy as np
from scipy.optimize import least_squares
from scipy.optimize import minimize
from scipy.optimize import fmin
import matplotlib.pyplot as plt
xData = np.linspace(50,94,334);
yData = (xData-75)**2 + (np.random.random((334,))-.5)*600;
fun = lambda coef, x : coef[0] + coef[1]*x + coef[2]*x**2 ; #create a "lambda" function whatever that is that has a tuple for the polynomial coefficients in it
#function is y = coef0 + coef1*x + coef2*x^2 where y is lambda
funError = lambda coef, x, y: fun(coef,x) - y; #create a "lambda" function for the error between the real data y and the fit data y
#function is yError = y(coef,x) - yReal where yError is the lambda now
#expanded fully: yError = coef0 + coef1*x + coef2*x^2 - yReal
coef_init = (5,10,15); #initial coefficient guess
#coef0 is const (order 0)
#coef1 is order 1 coef
#coef2 is order 2 coef
coef = least_squares(funError,coef_init, args=(xData,yData) ); #calculate the polynomial coefficients to fit the data
yFit_lq = fun(coef.x,xData); #calc the guessed values
plt.figure();
plt.scatter( xData , yData , 20 , "r" );
plt.scatter( xData , yFit_lq , 20 );
plt.title("Least Squares");
plt.show();
coef = minimize(funError,coef_init, args=(xData,yData),method="Nelder-Mead" ); #calculate the polynomial coefficients to fit the data
yFit_min = fun(coef.x,xData); #calc the guessed values
plt.figure();
plt.scatter( xData , yData , 20 , "r" );
plt.scatter( xData , yFit_min , 20 );
plt.title("Minimize with Nelder-Mead");
plt.show();
coef = fmin(funError,coef_init, args=(xData,yData) ); #calculate the polynomial coefficients to fit the data
yFit_fmin = fun(coef.x,xData); #calc the guessed values
plt.figure();
plt.scatter( xData , yData , 20 , "r" );
plt.scatter( xData , yFit_fmin , 20 );
plt.title("fmin, equiv to min. w/ neldy");
plt.show();
I call least_squares, minimize, and fmin the same way and their pages just ask for args=(). I'm not sure what is going wrong in calling minimize and fmin that the "ValueError: setting an array element with a sequence." error occurs while least_squares is perfectly happy with the formatting.
I would also prefer to avoid excess function defs - the clean and simple lambda function should be able to handle this simple case.
least_squares and minimize have different requirements for the objective function.
least_squares expects your function to return a vector. The docstring describes this vector as the "vector of residuals". least_squares takes this vector and sums the squares of the elements to form the actual objective function that is minimized.
minimize expects your objective function to return a scalar. It tries to find the vector input that minimizes the scalar output of your function.
You can solve the least squares optimization problem with minimize by modifying your existing function so that it computes and returns the sum of the squared residuals:
def funError(coef, x, y):
residuals = fun(coef,x) - y
objective = (residuals**2).sum()
return objective
But then that function is not set up to use with least_squares. So instead, you could use two functions:
def funError(coef, x, y):
residuals = fun(coef,x) - y
return residuals
def funErrorSSR(coef, x, y):
residuals = funError(coef, x, y)
objective = (residuals**2).sum()
return objective
Use funError with least_squares, and funErrorSSR with minimize (or fmin).

scipy.optimize on high frequency sine function

I am using Python 2.7. I am wondering why the optimize function of SciPy doesn't converge to the right function when the target is a high frequency sinus wave.
import numpy as np
from scipy import optimize
test_func = lambda x: 5*np.sin(15*x+3)+1
t = linspace(0,25,100000)
y_t = test_func(t)
plot(t,y_t)
fitfunc = lambda p, x: p[0]*np.sin(p[1]*x+p[2])+p[3]
errfunc = lambda p, x, y: fitfunc(p, x) - y
p0 = [max(y_t),10,2,0]
p1, success = optimize.leastsq(errfunc, p0, args=(t,y_t))
plot(t,fitfunc(p1,t))
One can clearly see that the end solution diverges from the target clearly. Am I doing something wrong ? Is the error function ill adapted here ?
Thanks for any input
Your problem is that there are a large number of local minima in your residuals function as the phase and frequency shift with respect to their true values; without really good initial guesses for the phase and frequency you will converge into one instead of falling into the much deeper, global minimum:
If you don't have any more information about the phase and frequency, you can either estimate them from a FFT of the data or rewrite your formula as
Asin(bx + phi) + d = Acos(phi)sin(bx) + Asin(phi)cos(bx) + d
which has only one nonlinear parameter (b): you can use a grid-search for b and much faster and more reliable linear least-squares fitting for the rest (a1 = Acos(phi), a2 = Asin(phi) and d).
Here's a plot of the rms residual as the frequency, b varies, showing the various minima:

SciPy + Numpy: Finding the slope of a sigmoid curve

I have some data that follow a sigmoid distribution as you can see in the following image:
After normalizing and scaling my data, I have adjusted the curve at the bottom using scipy.optimize.curve_fit and some initial parameters:
popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0 = [0.05, 0.05, 0.05])
>>> print popt
[ 2.82019932e+02 -1.90996563e-01 5.00000000e-02]
So popt, according to the documentation, returns *"Optimal values for the parameters so that the sum of the squared error of f(xdata, popt) - ydata is minimized". I understand here that there is no calculation of the slope with curve_fit, because I do not think the slope of this gentle curve is 282, neither is negative.
Then I tried with scipy.optimize.leastsq, because the documentation says it returns "The solution (or the result of the last iteration for an unsuccessful call).", so I thought the slope would be returned. Like this:
p, cov, infodict, mesg, ier = leastsq(residuals, p_guess, args = (nxdata, nydata), full_output=True)
>>> print p
Param(x0=281.73193626250207, y0=-0.012731420027056234, c=1.0069006606656596, k=0.18836680131910222)
But again, I did not get what I expected. curve_fit and leastsq returned almost the same values, with is not surprising I guess, as curve_fit is using an implementation of the least squares method within to find the curve. But no slope back...unless I overlooked something.
So, how to calculate the slope in a point, say, where X = 285 and Y = 0.5?
I am trying to avoid manual methods, like calculating the derivative in, say, (285.5, 0.55) and (284.5, 0.45) and subtract and divide results and so. I would like to know if there is a more automatic method for this.
Thank you all!
EDIT #1
This is my "sigmoid_function", used by curve_fit and leastsq methods:
def sigmoid_function(xdata, x0, k, p0): # p0 not used anymore, only its components (x0, k)
# This function is called by two different methods: curve_fit and leastsq,
# this last one through function "residuals". I don't know if it makes sense
# to use a single function for two (somewhat similar) methods, but there
# it goes.
# p0:
# + Is the initial parameter for scipy.optimize.curve_fit.
# + For residuals calculation is left empty
# + It is initialized to [0.05, 0.05, 0.05]
# x0:
# + Is the convergence parameter in X-axis and also the shift
# + It starts with 0.05 and ends up being around ~282 (days in a year)
# k:
# + Set up either by curve_fit or leastsq
# + In least squares it is initially fixed at 0.5 and in curve_fit
# + to 0.05. Why? Just did this approach in two different ways and
# + it seems it is working.
# + But honestly, I have no clue on what it represents
# xdata:
# + Positions in X-axis. In this case from 240 to 365
# Finally I changed those parameters as suggested in the answer.
# Sigmoid curve has 2 degrees of freedom, therefore, the initial
# guess only needs to be this size. In this case, p0 = [282, 0.5]
y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
return y
def residuals(p_guess, xdata, ydata):
# For the residuals calculation, there is no need of setting up the initial parameters
# After fixing the initial guess and sigmoid_function header, remove []
# return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1], [])
return ydata - sigmoid_function(xdata, p_guess[0], p_guess[1], [])
I am sorry if I made mistakes while describing the parameters or confused technical terms. I am very new with numpy and I have not studied maths for years, so I am catching up again.
So, again, what is your advice to calculate the slope of X = 285, Y = 0.5 (more or less the midpoint) for this dataset? Thanks!!
EDIT #2
Thanks to Oliver W., I updated my code as he suggested and understood a bit better the problem.
There is a final detail I do not fully get. Apparently, curve_fit returns a popt array (x0, k) with the optimum parameters for the fitting:
x0 seems to be how shifted is the curve by indicating the central point of the curve
k parameter is the slope when y = 0.5, also in the center of the curve (I think!)
Why if the sigmoid function is a growing one, the derivative/slope in popt is negative? Does it make sense?
I used sigmoid_derivative to calculate the slope and, yes, I obtained the same results that popt but with positive sign.
# Year 2003, 2005, 2007. Slope in midpoint.
k = [-0.1910, -0.2545, -0.2259] # Values coming from popt
slope = [0.1910, 0.2545, 0.2259] # Values coming from sigmoid_derivative function
I know this is being a bit peaky because I could use both. The relevant data is in there but with negative sign, but I was wondering why is this happening.
So, the calculation of the derivative function as you suggested, is only required if I need to know the slope in other points than y = 0.5. Only for midpoint, I can use popt.
Thanks for your help, it saved me a lot of time. :-)
You're never using the parameter p0 you're passing to your sigmoid function. Hence, curve fitting will not have any good measure to find convergence, because it can take any value for this parameter. You should first rewrite your sigmoid function like this:
def sigmoid_function(xdata, x0, k):
y = np.exp(-k*(xdata-x0)) / (1 + np.exp(-k*(xdata-x0)))
return y
This means your model (the sigmoid) has only two degrees of freedom. This will be returned in popt:
initial_guess = [282, 1] # (x0, k): at x0, the sigmoid reaches 50%, k is slope related
popt, pcov = curve_fit(sigmoid_function, xdata, ydata, p0=initial_guess)
Now popt will be a tuple (or array of 2 values), being the best possible x0 and k.
To get the slope of this function at any point, to be honest, I would just calculate the derivative symbolically as the sigmoid is not such a hard function. You will end up with:
def sigmoid_derivative(x, x0, k):
f = np.exp(-k*(x-x0))
return -k / f
If you have the results from your curve fitting stored in popt, you could pass this easily to this function:
print(sigmoid_derivative(285, *popt))
which will return for you the derivative at x=285. But, because you ask specifically for the midpoint, so when x==x0 and y==.5, you'll see (from the sigmoid_derivative) that the derivative there is just -k, which can be observed immediately from the curve_fit output you've already obtained. In the output you've shown, that's about 0.19.

Categories

Resources