Optimization algorithm (dog-leg trust-region) in Matlab and Python

I'm trying to solve a set of nonlinear equations using the dog-leg trust-region algorithm in Matlab and Python.
In Matlab there is fsolve, where this algorithm is the default, whereas for Python we specify 'dogleg' as the method in scipy.optimize.minimize. I don't need to specify a Jacobian or Hessian for the Matlab version, whereas the Python method needs at least one of them to solve the problem.
I don't have the Jacobian/Hessian, so is there a way around this issue in Python? Or is there another function that performs the equivalent of Matlab's dog-leg method in fsolve?

In newer versions of SciPy there is the approx_fprime function. It computes a numerical approximation of the Jacobian of a function f at position xk using forward-step finite differences, and returns an ndarray with the partial derivatives of f at xk.
If you can't upgrade your version of scipy, you can always copy the implementation from scipy's source.
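For illustration, here is a minimal sketch of approx_fprime on a toy quadratic (the function and step size below are just placeholders):
import numpy as np
from scipy.optimize import approx_fprime

def fun(x):
    return (x[0] - 1.0)**2 + (x[1] + 2.0)**2

x0 = np.array([0.0, 0.0])
eps = np.sqrt(np.finfo(float).eps)  # a common choice of finite-difference step
grad = approx_fprime(x0, fun, eps)  # approximately [-2., 4.]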
Edit:
scipy.optimize.minimize calls approx_fprime internally if the input jac=False. So in your case, it should be enough to do the following:
scipy.optimize.minimize(fun, x0, args, method='dogleg', jac=False)
Edit
scipy does not seem to handle the jac=False condition properly, so it is necessary to build a callable jac using approx_fprime, as follows:
jac = lambda x, *args: scipy.optimize.approx_fprime(x, fun, epsilon, *args)
scipy.optimize.minimize(fun, x0, args, method='dogleg', jac=jac)
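As an illustration only (not from the original answer), here is a toy sketch of that workaround. Note that in the SciPy versions I have used, 'dogleg' also expects a Hessian, so an exact Hessian of the toy objective is supplied alongside the finite-difference jac:
import numpy as np
import scipy.optimize

def fun(x):
    # toy objective standing in for the real system
    return (x[0] - 1.0)**2 + 2.0 * (x[1] + 0.5)**2

x0 = np.array([0.0, 0.0])
epsilon = np.sqrt(np.finfo(float).eps)

jac = lambda x, *args: scipy.optimize.approx_fprime(x, fun, epsilon, *args)
hess = lambda x, *args: np.array([[2.0, 0.0], [0.0, 4.0]])  # exact Hessian of the toy objective

res = scipy.optimize.minimize(fun, x0, method='dogleg', jac=jac, hess=hess)
print(res.x)  # close to [1.0, -0.5]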

Related

Evaluating uncertainty in SciPy root-finding results using Levenberg-Marquardt

I've written a Python script to solve the Time Difference of Arrival (TDoA) angular reconstruction problem in 3-dimensions. To do so, I'm using SciPy's scipy.optimize.root root finding algorithm to solve a system of nonlinear equations. I find that the Levenberg-Marquardt method is the only supported method capable of reliably producing accurate results (most others simply fail).
I'd like to assess the uncertainty in the resulting solution. For most methods (including the default hybr method), SciPy returns the inverse Hessian of the objective function (i.e. the covariance matrix), from which one may begin to calculate the uncertainties in the found roots. Unfortunately this is not the case for the Levenberg-Marquardt method (which I'm admittedly much less familiar with on a mathematical level than the other methods... it just seems to work).
How (in general) can I estimate the uncertainties in the solution returned by scipy.optimize.root when using the lm method?
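No answer is reproduced here, but one common approach (not from the original post) is to solve the same system with scipy.optimize.least_squares (which also offers an 'lm' method) and build a covariance estimate from the Jacobian at the solution, in the same spirit as scipy.optimize.curve_fit. The residual function below is only a placeholder:
import numpy as np
from scipy.optimize import least_squares

def residuals(x):
    # placeholder for the TDoA residual vector
    return np.array([x[0] + 2.0 * x[1] - 2.0, x[0]**2 + 4.0 * x[1]**2 - 4.0])

res = least_squares(residuals, np.array([1.0, 1.0]), method='lm')
J = res.jac                    # Jacobian at the solution
cov = np.linalg.inv(J.T @ J)   # covariance, still to be scaled by the measurement variance
sigma = np.sqrt(np.diag(cov))  # multiply by the data's standard deviation for 1-sigma uncertainties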

Solving large-scale nonlinear system using exact Newton's method in SciPy

I am trying to solve a large-scale nonlinear system using the exact Newton method in SciPy. In my application, the Jacobian is easy to assemble (and factorize) as a sparse matrix.
It seems that all methods available in scipy.optimize.root approximate the Jacobian in one way or another, and I can't find a way to use Newton's method using the API that is discussed in SciPy's documentation.
Nonetheless, using the internal API, I have managed to use Newton's method with the following code:
from scipy.optimize.nonlin import nonlin_solve
x, info = nonlin_solve(f, x0, jac, line_search=False)
where f(x) is the residual and jac(x) is a callable that returns the Jacobian at x as a sparse matrix.
However, I am not sure whether this function is meant to be used outside SciPy or whether it is subject to change without notice.
Would this be the recommended approach?
It is meant to be used.
SciPy's private functions that are not meant to be used from the outside start with an underscore (_).
This was confirmed by the SciPy team in an issue I raised recently: see https://github.com/scipy/scipy/issues/17510
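For concreteness, here is a minimal sketch (not from the original post) of calling nonlin_solve with an exact sparse Jacobian on a small discretized nonlinear Poisson-type system; the problem itself is only an illustration:
import numpy as np
import scipy.sparse as sp
from scipy.optimize.nonlin import nonlin_solve  # import path used above

n = 100
h = 1.0 / (n + 1)
main = -2.0 * np.ones(n) / h**2
off = np.ones(n - 1) / h**2
A = sp.diags([off, main, off], [-1, 0, 1], format='csc')  # discrete 1-D Laplacian

def f(u):
    return A @ u - np.exp(u)                               # residual F(u) = 0

def jac(u):
    return A - sp.diags(np.exp(u), 0, format='csc')        # exact sparse Jacobian

x, info = nonlin_solve(f, np.zeros(n), jac, line_search=False, full_output=True)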

When searching for minimum of a function, how to set minimum change in variables for finite-difference gradients?

I want to find the minimum of a function in python y = f(x)
Problem: the solver tries to compute the gradient with very close x values (delta x around 1e-8), but my function f is not sensitive to such a small step (we only see y vary for delta x around 1e-1).
Hence the gradient is 0 as far as the solver is concerned, and it cannot find the proper solution.
I've tried the following solvers from scipy, but I can't find the option I'm looking for:
scipy.optimize.minimize
scipy.optimize.fmin
In Matlab's fmincon, there is an option that does the job: 'DiffMinChange', the minimum change in variables for finite-difference gradients (a positive scalar).
You may want to try and use L-BFGS-B from scipy:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html
And provide the "epsilon" parameter at around 0.1 or 0.05 and see if it makes things better. I am of course assuming that you will let the solver compute the gradient for you by numerical differentiation (i.e., you pass fprime=None and approx_grad=True to the routine).
I personally despise the “minimize” interface to various solvers so I prefer to deal with the actual solvers themselves.
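As a rough sketch of that suggestion (the objective below is only a placeholder standing in for a function that is insensitive to tiny steps):
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def fun(x):
    # toy objective that only changes for steps of roughly 0.1
    return float(np.sum(np.round(x, 1) ** 2))

x0 = np.array([2.0, -3.0])
x_opt, f_opt, info = fmin_l_bfgs_b(fun, x0, fprime=None, approx_grad=True, epsilon=0.1)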

Solve differential equation in Python when I don't know the derivative analytically

I'm trying to solve a first-order ODE in Python:
dGamma/dt = u(t) Gamma(t)
where Gamma and u are square matrices.
I don't explicitly know u(t) at all times, but I do know it at discrete timesteps from doing an earlier calculation.
Every example of Python's solvers that I found online (e.g. this one for scipy.integrate.odeint and scipy.integrate.ode) knows the expression for the derivative analytically as a function of time.
Is there a way to call these (or other differential equation solvers) without knowing an analytic expression for the derivative?
For now, I've written my own Runge-Kutta solver and jitted it with numba.
You can use any of the SciPy interpolation methods, such as interp1d, to create a callable function based on your discrete data, and pass it to odeint. Cubic spline interpolation,
from scipy.interpolate import interp1d
f = interp1d(x, y, kind='cubic')
should be good enough.
Is there a way to call these (or other differential equation solvers) without knowing an analytic expression for the derivative?
Yes, none of the solvers you mentioned (nor most other solvers) require an analytic expression for the derivative. Instead they call a function you supply that has to evaluate the derivative for a given time and state. So, your code would roughly look something like:
def my_derivative(time, flat_Gamma):
    Gamma = flat_Gamma.reshape(dim_1, dim_2)
    u = get_u_from_time(time)
    dGamma_dt = u.dot(Gamma)
    return dGamma_dt.flatten()
from scipy.integrate import ode
my_integrator = ode(my_derivative)
…
The difficulty in your situation is rather that you have to ensure that get_u_from_time provides an appropriate result for every time with which it is called. Probably the most robust and easy solution is to use interpolation (see the other answer).
You can also try to match your integration steps to the data you have, but at least for scipy.integrate.odeint and scipy.integrate.ode this will be very tedious as all the integrators use internal steps that are inconvenient for this purpose. For example, the fifth-order Dormand–Prince method (DoPri5) uses internal steps of 1/5, 3/10, 4/5, 8/9, and 1. This means that if you have temporally equidistant data for u, you would need 90 data points for each integration step (as 1/90 is the greatest common divisor of the internal steps). The only integrator that could make this remotely feasible is the Bogacki–Shampine integrator (RK23) from scipy.integrate.solve_ivp with internal steps of 1/2, 3/4, and 1.
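Putting the two suggestions together, a minimal sketch might look like the following; the dimensions, sample times, and random u samples are purely illustrative placeholders for your precomputed data:
import numpy as np
from scipy.interpolate import interp1d
from scipy.integrate import solve_ivp

dim = 3
t_samples = np.linspace(0.0, 1.0, 101)                 # times at which u is known
u_samples = np.random.rand(len(t_samples), dim, dim)   # placeholder for the precomputed u(t)

# interpolate each matrix entry over time (axis=0 interpolates along the time axis)
get_u_from_time = interp1d(t_samples, u_samples, axis=0, kind='cubic')

def my_derivative(time, flat_Gamma):
    Gamma = flat_Gamma.reshape(dim, dim)
    u = get_u_from_time(time)
    return (u @ Gamma).flatten()

Gamma0 = np.eye(dim)
sol = solve_ivp(my_derivative, (0.0, 1.0), Gamma0.flatten(), method='RK45')
Gamma_final = sol.y[:, -1].reshape(dim, dim)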

Python function minimisation without derivative

I am familiar with some of the functions in scipy.optimize.optimize and have in the past used fmin_cg to minimize a function where I knew the derivative. However, I now have a formula which is not easily differentiated.
Several of the functions in that module (fmin_cg, for instance) do not actually require the derivative to be provided. I assume that they then calculate a quasi-derivative by adding a small value to each of the parameters in turn - is that correct?
My main question is this: Which of the functions (or one from elsewhere) is the best to use when minimising a function over multiple parameters with no given derivative?
Yes, calling fmin_bfgs or fmin_cg as
fmin_xx( func, x0, fprime=None, epsilon=.001 ... )
estimates the gradient at x component by component, as (func(x + epsilon*e_i) - func(x)) / epsilon. (fmin_powell is derivative-free and does not use a gradient at all.)
Which is "best" for your application, though,
depends strongly on how smooth your function is, and how many variables.
Plain Nelder-Mead, fmin, is a good first choice -- slow but sure;
unfortunately the scipy Nelder-Mead starts off with a fixed-size simplex, .05 / .00025 regardless of the scale of x.
I've heard that fmin_tnc in scipy.optimize.tnc is good:
fmin_tnc( func, x0, approx_grad=True, epsilon=.001 ... ) or
fmin_tnc( func_and_grad, x0 ... ) # func, your own estimated gradient
(fmin_tnc is ~ fmin_ncg with bound constraints, nice messages to see what's happening, somewhat different args.)
I'm not too familiar with what's available in SciPy, but the Downhill Simplex method (aka Nelder-Mead or the Amoeba method) frequently works well for multidimensional optimization.
Looking now at the scipy documentation, it looks like it is available as an option in the minimize() function using the method='Nelder-Mead' argument.
Don't confuse it with the Simplex (Dantzig) algorithm for Linear Programming...
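As a small sketch of that option (the objective is only an illustration):
import numpy as np
from scipy.optimize import minimize

def objective(x):
    return (x[0] - 1.0)**2 + 100.0 * (x[1] - x[0]**2)**2   # Rosenbrock-style test function

res = minimize(objective, np.array([-1.2, 1.0]), method='Nelder-Mead',
               options={'xatol': 1e-6, 'fatol': 1e-6})
print(res.x)  # close to [1.0, 1.0]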
