I'm using the ODRPACK library in Python to fit some 1d data. It works quite well, but I have one question: is there any way to put constraints on the fitting parameters? For example, suppose I have a model y = a * x + b and, for physical reasons, parameter a can only be in the range (-1, 1). I've found that such constraints are supported in the original Fortran implementation of the ODRPACK95 library, but I can't find how to do that in Python.
Of course, I can implement my functions such that they will return very big values, if the fitting parameters are out of bounds and chi squared will be big too, but I wonder if there is a right way to do that.
I'm afraid that the older FORTRAN-77 version of ODRPACK wrapped by scipy.odr does not incorporate constraints. ODRPACK95 is a later extension of the original ODRPACK library that predates the scipy.odr wrappers, and it is unclear that we could legally include it in scipy. There is no explicit licensing information for ODRPACK95, only the general ACM TOMS non-commercial license.
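One workaround that stays within scipy.odr is to reparameterize the model so the bounded parameter cannot leave its range. The sketch below is only an illustration on synthetic data: the function name linear_bounded and the choice a = tanh(u) are assumptions made here, not part of ODRPACK. The fit runs over an unconstrained internal value u, and the physical slope is recovered afterwards.

```python
import numpy as np
from scipy import odr


def linear_bounded(beta, x):
    # beta[0] is an unconstrained internal value u (hypothetical name);
    # the physical slope a = tanh(u) always lies in (-1, 1).
    a = np.tanh(beta[0])
    return a * x + beta[1]


# Synthetic data with noise on both x and y (true a = 0.6, b = 2.0).
rng = np.random.default_rng(0)
x_true = np.linspace(0.0, 10.0, 50)
x_obs = x_true + rng.normal(scale=0.1, size=x_true.size)
y_obs = 0.6 * x_true + 2.0 + rng.normal(scale=0.1, size=x_true.size)

data = odr.RealData(
    x_obs,
    y_obs,
    sx=np.full_like(x_obs, 0.1),
    sy=np.full_like(y_obs, 0.1),
)
fit = odr.ODR(data, odr.Model(linear_bounded), beta0=[0.0, 0.0]).run()

# Map the internal parameter back to the physical slope.
a_fit = np.tanh(fit.beta[0])
b_fit = fit.beta[1]
print(a_fit, b_fit)
```

Note that fit.sd_beta[0] is then the uncertainty of u, not of a, so it would need to be propagated through the tanh transform before being reported.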
Related
I am fitting datasets with some broken power laws. The data has asymmetrical errors in X and Y, and I'd like to be able to introduce constraints on the fitted parameters (e.g. not below 0, or within a certain range).
Using Scipy.ODR, I can fit the data well, including the asymmetrical errors on both axes. However, I can't find any way in the documentation to introduce bounds on my fitted parameters, and discussions online seem to suggest this is flat out impossible with this module: https://stackoverflow.com/a/17786438/19086741
Using Lmfit, I can also fit the data well and can introduce bounds to the fitted parameters. However, discussions online once again state that Lmfit is not able to handle asymmetrical errors, and errors on both axes.
Is there some module, or perhaps I am missing something with one of these modules, that would allow me to meet both of my requirements in this case? Many thanks.
Sorry, I don't have a good answer for you. As you note, Lmfit does not support ODR regression which allows for uncertainties in the (single) independent variable as well as uncertainties in the dependent variables.
I think this would be possible in principle. Unfortunately, ODR has a very different interface to the other minimization routines making a wrapper as "another possible solving algorithm for lmfit" a bit challenging. I am sure that none of the developers would object to someone trying this, but it would take some effort.
FWIW, you say "both axes" as if you are certain there are exactly 2 axes. ODR supports exactly 1 independent variable: lmfit is not limited to this assumption.
You also say that lmfit cannot handle asymmetric uncertainties. That is only partially true. The lmfit.Model interface allows only a single uncertainty value per data point. But with the lmfit.minimize interface, you write your own objective function to calculate the array you want minimized, and so can weight some residual of "data" and "model" any way you want.
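The idea of writing your own weighted residual can be sketched as follows. To keep the example dependent on SciPy only, it uses scipy.optimize.least_squares with bounds rather than lmfit; the same residual logic could be adapted to an lmfit.minimize objective. The weighting rule (use the upper error bar when the model lies above the data point, the lower one otherwise) is an assumption chosen for illustration, and the sketch handles uncertainties in y only, not in x.

```python
import numpy as np
from scipy.optimize import least_squares


def asymmetric_residual(params, x, y, sigma_lo, sigma_hi):
    a, b = params
    diff = (a * x + b) - y
    # Assumed rule: if the model lies above the data point, scale by the
    # upper error bar; otherwise scale by the lower error bar.
    sigma = np.where(diff > 0, sigma_hi, sigma_lo)
    return diff / sigma


# Synthetic data (true a = 0.5, b = 1.0) with different lower/upper error bars.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 40)
sigma_lo = np.full_like(x, 0.2)
sigma_hi = np.full_like(x, 0.4)
y = 0.5 * x + 1.0 + rng.normal(scale=0.2, size=x.size)

result = least_squares(
    asymmetric_residual,
    x0=[0.0, 0.0],
    bounds=([-1.0, -np.inf], [1.0, np.inf]),  # slope restricted to [-1, 1]
    args=(x, y, sigma_lo, sigma_hi),
)
print(result.x)
```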
I'm trying to fit my data and have so far used sp.optimize.leastsq. I changed to sp.optimize.least_squares to add bounds to the parameters, but both when I use bounds and when I don't the search doesn't converge, even in data sets sp.optimize.leastsq fitted just fine.
Shouldn't these functions work the same?
What difference between them could keep the newer one from finding solutions that the older one did?
leastsq is a wrapper around MINPACK’s lmdif and lmder algorithms.
least_squares implements other methods in addition to the MINPACK algorithm.
method : {‘trf’, ‘dogbox’, ‘lm’}, optional
Algorithm to perform minimization.
‘trf’ : Trust Region Reflective algorithm, particularly suitable for large sparse problems with bounds. Generally robust method.
‘dogbox’ : dogleg algorithm with rectangular trust regions, typical use case is small problems with bounds. Not recommended for problems with rank-deficient Jacobian.
‘lm’ : Levenberg-Marquardt algorithm as implemented in MINPACK. Doesn’t handle bounds and sparse Jacobians. Usually the most efficient method for small unconstrained problems.
Default is ‘trf’. See Notes for more information.
For some problems, the lm method may fail to converge while trf converges.
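Since least_squares defaults to ‘trf’, a call without a method argument does not run the same algorithm as leastsq. A minimal comparison, using a made-up exponential model and noise-free data purely for illustration, might look like this:

```python
import numpy as np
from scipy.optimize import least_squares, leastsq


def residuals(p, x, y):
    # Illustrative model: p[0] * exp(-p[1] * x)
    return p[0] * np.exp(-p[1] * x) - y


x = np.linspace(0.0, 4.0, 30)
y = 2.5 * np.exp(-1.3 * x)  # noise-free synthetic data
p0 = [1.0, 1.0]

# Legacy interface: MINPACK Levenberg-Marquardt.
p_old, ier = leastsq(residuals, p0, args=(x, y))

# Same MINPACK algorithm through the newer interface (no bounds allowed).
res_lm = least_squares(residuals, p0, args=(x, y), method="lm")

# Default Trust Region Reflective method, which accepts bounds.
res_trf = least_squares(
    residuals, p0, args=(x, y), method="trf", bounds=([0.0, 0.0], [10.0, 10.0])
)
print(p_old, res_lm.x, res_trf.x)
```

If a data set that leastsq handled no longer converges, trying method="lm" first is one way to check whether the difference comes from the algorithm or from other settings such as tolerances or the starting point.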
I am having trouble solving an optimisation problem in python, involving ~20,000 decision variables. The problem is non-linear and I wish to apply both bounds and constraints to the problem. In addition to this, the gradient with respect to each of the decision variables may be calculated.
The bounds are simply that each decision variable must lie in the interval [0, 1] and there is a monotonic constraint placed upon the variables, i.e each decision variable must be greater than the previous one.
I initially intended to use the L-BFGS-B method provided by the scipy.optimize package however I found out that, while it supports bounds, it does not support constraints.
I then tried using the SLSQP method, which does support both constraints and bounds. However, because it requires more memory than L-BFGS-B and I have a large number of decision variables, I ran into memory errors fairly quickly.
The paper which this problem comes from used the fmincon solver in Matlab to optimise the function, which, to my knowledge, supports both bounds and constraints in addition to being more memory efficient than the SLSQP method provided by scipy. I do not have access to Matlab, however.
Does anyone know of an alternative I could use to solve this problem?
Any help would be much appreciated.
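For reference, the bounds and the monotonic constraint described in the question can be written as a Bounds object plus a sparse LinearConstraint and passed to the trust-constr method of scipy.optimize.minimize. The sketch below uses a small n and a placeholder quadratic objective (both assumptions for illustration); it expresses "greater than" as x[i+1] - x[i] >= 0, and whether it scales acceptably to ~20,000 variables has not been verified here.

```python
import numpy as np
from scipy import sparse
from scipy.optimize import Bounds, LinearConstraint, minimize

n = 50  # small stand-in for the ~20,000 variables in the question

# Placeholder objective: squared distance to a non-monotone target.
target = np.clip(np.linspace(0.0, 1.0, n) + 0.2 * np.sin(np.arange(n)), 0.0, 1.0)


def objective(x):
    return 0.5 * np.sum((x - target) ** 2)


def gradient(x):
    return x - target


# Sparse difference operator: (D @ x)[i] = x[i + 1] - x[i].
D = sparse.diags(
    [-np.ones(n - 1), np.ones(n - 1)], offsets=[0, 1], shape=(n - 1, n), format="csr"
)
monotone = LinearConstraint(D, lb=0.0, ub=np.inf)  # non-decreasing sequence
bounds = Bounds(0.0, 1.0)  # every variable in [0, 1]

x0 = np.linspace(0.1, 0.9, n)  # feasible starting point
res = minimize(
    objective,
    x0,
    jac=gradient,
    method="trust-constr",
    bounds=bounds,
    constraints=[monotone],
)
print(res.x[:5], res.fun)
```

With no hess argument, trust-constr falls back to a quasi-Newton approximation of the Hessian; for very large problems, supplying hessp or a sparse hess may be needed to keep memory use down.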
I have a group of data points that I need to fit a curve to, extract the coefficients of the polynomial, and use them to determine the roots of the polynomial. I have a python library, SCIPY Optimize Curve Fit, that is able to extract the coefficients, but I need a C++ version. I noticed the python library is based on minpack, so I downloaded minpack but haven't been able to figure out how to use it. I also looked at the John Burkardt version found here. It is a pretty compact version of minpack, but again I haven't figured out how to use it.
The python library leads me to believe the polynomial is of the form AX^2 + BX + C + D/X.
I thought maybe I could port the SCIPY minpack to C++, but after looking at it I realized this was a bad idea, and my python skills aren't good enough anyway. Does anyone have any related code examples for using the C++ version of minpack, links to read, anything?
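If the model really has the form AX^2 + BX + C + D/X, it is linear in the coefficients A, B, C and D, so an ordinary linear least-squares solve is sufficient and a nonlinear MINPACK fit is not strictly required. The sketch below shows the algebra in Python/NumPy on synthetic data (the coefficient values are made up); the same steps could be reproduced with any C++ linear algebra library. Multiplying the model by X turns the root problem into the cubic AX^3 + BX^2 + CX + D = 0.

```python
import numpy as np

# Synthetic data from made-up coefficients A, B, C, D.
true_coeffs = np.array([1.0, -3.0, 2.0, 0.5])
x = np.linspace(0.5, 4.0, 25)  # x must not contain 0 because of the D/x term
y = (
    true_coeffs[0] * x**2
    + true_coeffs[1] * x
    + true_coeffs[2]
    + true_coeffs[3] / x
)

# Design matrix: one column per basis function x^2, x, 1, 1/x.
design = np.column_stack([x**2, x, np.ones_like(x), 1.0 / x])
coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
A, B, C, D = coeffs

# For x != 0, A x^2 + B x + C + D/x = 0 is equivalent to
# A x^3 + B x^2 + C x + D = 0, whose roots np.roots returns.
roots = np.roots([A, B, C, D])
print(coeffs, roots)
```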
I would look into Minuit!
They offer a standalone version, and it's also packaged in ROOT (a much larger data analysis framework).
I've only ever used it through ROOT's interface.
I'm trying to perform a constrained least-squares estimation using Scipy such that all of the coefficients are in the range (0,1) and sum to 1 (this functionality is implemented in Matlab's LSQLIN function).
Does anybody have tips for setting up this calculation using Python/Scipy? I believe I should be using scipy.optimize.fmin_slsqp(), but am not entirely sure what parameters I should be passing to it.[1]
Many thanks for the help,
Nick
[1] The one example in the documentation for fmin_slsqp is a bit difficult for me to parse without the referenced text -- and I'm new to using Scipy.
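The setup described in the question can be sketched with scipy.optimize.minimize and method="SLSQP", the newer interface to the same solver as fmin_slsqp. The matrix A, the vector b and the true coefficients below are synthetic placeholders, and the bounds are the closed interval [0, 1] rather than the open range mentioned in the question.

```python
import numpy as np
from scipy.optimize import minimize

# Synthetic problem: b = A @ x_true with coefficients that sum to 1.
rng = np.random.default_rng(2)
A = rng.normal(size=(30, 4))
x_true = np.array([0.1, 0.2, 0.3, 0.4])
b = A @ x_true


def objective(x):
    r = A @ x - b
    return 0.5 * r @ r


def gradient(x):
    return A.T @ (A @ x - b)


n = A.shape[1]
constraints = [{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}]  # sum to 1
bounds = [(0.0, 1.0)] * n  # each coefficient in [0, 1]
x0 = np.full(n, 1.0 / n)  # feasible starting point

res = minimize(
    objective,
    x0,
    jac=gradient,
    method="SLSQP",
    bounds=bounds,
    constraints=constraints,
)
print(res.x, res.x.sum())
```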
scipy-optimize-leastsq-with-bound-constraints on SO gives leastsq_bounds, which is leastsq with bound constraints such as 0 <= x_i <= 1.
The constraint that they sum to 1 can be added in the same way.
(I've found leastsq_bounds / MINPACK to be good on synthetic test functions in 5d, 10d, 20d; how many variables do you have?)
Have a look at this tutorial, it seems pretty clear.
Since MATLAB's lsqlin is a bounded linear least squares solver, you would want to check out scipy.optimize.lsq_linear.
Non-negative least squares optimization using scipy.optimize.nnls is a robust way of doing it. Note that if the coefficients are constrained to be non-negative and to sum to unity, they are automatically limited to the interval [0,1]; that is, one need not additionally constrain them from above.
scipy.optimize.nnls automatically keeps the variables non-negative using the Lawson and Hanson algorithm, whereas the sum constraint can be taken care of as discussed in this thread and this one.
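One common way to impose the sum constraint approximately with nnls (not necessarily the approach taken in the linked threads) is to append a heavily weighted extra row that asks the coefficients to sum to 1. The data and the weight value below are assumptions for illustration; a larger weight enforces the constraint more tightly at the cost of a worse-conditioned system.

```python
import numpy as np
from scipy.optimize import nnls

# Synthetic problem with coefficients that are non-negative and sum to 1.
rng = np.random.default_rng(3)
A = rng.normal(size=(30, 4))
x_true = np.array([0.1, 0.2, 0.3, 0.4])
b = A @ x_true + rng.normal(scale=0.01, size=30)

# Extra row: weight * (x_1 + ... + x_n) should equal weight * 1.
weight = 1e3  # assumed value; tune for the problem at hand
A_aug = np.vstack([A, weight * np.ones((1, A.shape[1]))])
b_aug = np.append(b, weight)

x, rnorm = nnls(A_aug, b_aug)
print(x, x.sum())
```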
Scipy nnls uses an old Fortran backend, which is apparently widely used in equivalent implementations of nnls by other software.