How can I find the value of x at which the following gamma function reaches its maximum (see function)?
I wonder if there is a simple method in one of the libraries that could do this easily for me.
PS: I'm considering the transformation z = x/(x+c)
Thanks
Are you looking for a library that can find (an approximation of) a local minimum in a given interval? scipy.optimize.minimize_scalar can do that. (For the maximum, just negate your function.)
Edit: minimize_scalar is better suited than minimize in your case, since you only have one variable.
For the function to pass in, instead of integrating yourself, I believe you can use scipy.stats.gamma.cdf.
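For instance, a minimal sketch along those lines, using the gamma pdf as a stand-in since I don't know the exact form of your function (swap in your own, e.g. one built from scipy.stats.gamma.cdf):

from scipy import stats
from scipy.optimize import minimize_scalar

a, c = 3.0, 2.0  # assumed shape and scale parameters, replace with yours

def neg_f(x):
    # negate the function so that minimizing it maximizes the original
    return -stats.gamma.pdf(x, a, scale=c)

res = minimize_scalar(neg_f, bounds=(0, 50), method="bounded")
print(res.x)  # location of the maximum within the interval (0, 50)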
In the release notes for SciPy version 1.9.0, I found the following about the optimisation module, in the section "scipy.optimize improvements", 4th point:
Add a vectorized parameter to call a vectorized objective function only once per iteration.
However, I have already checked the documentation of the minimisation functions (minimize and minimize_scalar) for such a parameter and couldn't find any hint of it. Searching the internet, I only found posts with suggestions or GitHub issues about implementing such a thing (or workarounds for it).
Where can I find this parameter, and can I use it?
Those release notes have a more specific entry for scipy.optimize.differential_evolution, where that parameter is explained. I've also come across it in other SO questions, but I don't recall which functions use it.
Basically, for functions that allow it, you can write the objective function, or another callable (jacobian, boundary?), so that it takes a 2d array of values. Normally the function just takes a 1d array, the current "state". But with vectorized=True, the function should be prepared to accept a set of "state" arrays and return a value for each.
So instead of calling the objective k times to get a range of values, such as when calculating a gradient, it can call it once, with an (n, k) argument, and get back all k results in one call.
I tried to explain how solve_ivp uses this at
scipy.integrate.solve_ivp vectorized
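A rough sketch of the idea with differential_evolution itself (the Rosenbrock objective here is just a placeholder of my own, not from your code):

import numpy as np
from scipy.optimize import differential_evolution

def rosen_vectorized(x):
    # with vectorized=True, x arrives with shape (n_parameters, n_candidates)
    # and the function must return one objective value per candidate
    return np.sum(100.0 * (x[1:] - x[:-1] ** 2) ** 2 + (1.0 - x[:-1]) ** 2, axis=0)

bounds = [(-5, 5)] * 3
res = differential_evolution(rosen_vectorized, bounds, vectorized=True, polish=False)
print(res.x, res.fun)  # polish=False skips the final non-vectorized L-BFGS-B step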
I am trying to rewrite something similar to the following SAS optimization code in Python. The goal of the code is to find parameters for a continuous (in this case normal) distribution that best fits some empirical data. I am familiar with Python but very new to SAS.
proc nlin data=mydata outest=est METHOD=MARQUARDT SMETHOD=GOLDEN SAVE;
parms m = -1.0 to 1.0 by 0.1
s = 1.0 to 2.0 by 0.01;
bounds -10<m<10;
bounds 0<s<10;
model NCDF = CDF('NORMAL',XValue,m,s);
run;
In Python, I have something set up like this:
import pandas as pd
from scipy import stats
from scipy.optimize import minimize
def distance_empirical_fitted_cdf(params: list, data: pd.Series):
    m, s = params
    # empirical CDF from the data (assumed to hold percentages, hence the /100)
    empirical_cdf = (data / 100).cumsum()
    cost = 0
    for point in range(10):
        empirical_cdf_at_point = empirical_cdf.iloc[point]
        fitted_cdf_at_point = stats.norm.cdf(x=point, loc=m, scale=s)
        cost += (fitted_cdf_at_point - empirical_cdf_at_point) ** 2
    return cost

# note: args must be a tuple, hence the trailing comma
result = minimize(distance_empirical_fitted_cdf, x0=[0, 1.5], args=(distribution,),
                  bounds=[(-10, 10), (0, 10)])
fitted_m, fitted_s = result.x
The code I have now gets me fairly close to the existing code's output in most cases, but not in all. Ideally, I could get them to match or be as close as possible and understand why they don't.
As far as I can tell, there are two sources of discrepancy. First, the SAS code is able to take a grid of possible starting values (in this case -1 to 1 for m and 1 to 2 for s) from which to initialize the parameters. Is there an equivalent of this in Python?
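Something like the following grid scan over starting values is roughly what I imagine the SAS parms statement amounts to (just a sketch reusing the objective above; I haven't verified this is what SAS actually does):

import itertools
import numpy as np

# grid of starting values mirroring the SAS parms statement
m_grid = np.arange(-1.0, 1.0 + 1e-9, 0.1)
s_grid = np.arange(1.0, 2.0 + 1e-9, 0.01)

# evaluate the cost on the grid and start the optimizer from the best point
starts = itertools.product(m_grid, s_grid)
best_start = min(starts, key=lambda p: distance_empirical_fitted_cdf(p, distribution))

result = minimize(distance_empirical_fitted_cdf, x0=list(best_start), args=(distribution,),
                  bounds=[(-10, 10), (0, 10)])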
Second, the SAS code is specifically using the Marquardt optimization method and the Golden step size search method. The only Python code I could find referencing the Marquardt method is scipy.optimize.least_squares with method="lm", but this doesn't support bounds (and is much further off compared to scipy.optimize.minimize when I try without bounds).
The only Python code I could find referencing the golden step size search method is scipy.optimize.golden, but the documentation says that this is for minimizing functions of one variable and doesn't seem to support bounds either.
Any insight on getting the Python output closer to the SAS output would be greatly appreciated, thanks!
Not an answer, as it's still inconclusive why (or how) the two sets of code produce different results.
So this will likely change as more info is introduced.
But in the meantime, here are some observations that might be useful but are too much to fit into the comments section.
Algorithm
The Marquardt method as called by SAS is more commonly referred to as the Levenberg-Marquardt algorithm (at least to me), abbreviated LMA or LM.
Bounds
The math of LMA is not defined to handle bounds, which is why no bound options are provided in scipy.
The MINPACK-1 implementation used in scipy.optimize.leastsq for the Levenberg-Marquardt algorithm does not explicitly support bounds on parameters, and expects to be able to fully explore the available range of values for any Parameter. Simply placing hard constraints (that is, resetting the value when it exceeds the desired bounds) prevents the algorithm from determining the partial derivatives, and leads to unstable results.
The discrepancy in results may arise from here, depending on how SAS decides to handle the provided bounds params.
It may be the case that SAS is ignoring the provided bound values, re-running until a solution within the bounds is found, or using other methods entirely.
Another possible cause is that a better solution exists outside of your provided bounds.
Then either SAS limits its solutions to within the bounds but python doesn't, or vice versa.
This would result in one code returning a local minimum within the bounds, and the other returning a better (possibly global) minimum outside the bounds.
However, I can't find where the SAS docs for NLIN explain how this is handled, so this is still inconclusive.
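If keeping the bounds on the Python side matters, one thing worth trying (not LMA, but a related trust-region least-squares method that does accept bounds) is scipy.optimize.least_squares with method="trf". A rough sketch against the question's setup, reusing its data variable:

import numpy as np
from scipy import stats
from scipy.optimize import least_squares

def residuals(params, data):
    # vector of differences between fitted and empirical CDF at the first 10 points
    m, s = params
    empirical_cdf = (data / 100).cumsum()
    points = np.arange(10)
    return stats.norm.cdf(points, loc=m, scale=s) - empirical_cdf.iloc[:10].to_numpy()

# 'trf' supports bounds, unlike method='lm'
res = least_squares(residuals, x0=[0, 1.5], args=(distribution,),
                    bounds=([-10, 0], [10, 10]), method="trf")
m_fit, s_fit = res.x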
Step-Size Search
Note that the NLIN procedure's SMETHOD is used to search for an optimal step size.
It's unclear from the SAS docs on SMETHOD what the "step size" is exactly,
but I believe in this context it could refer to the damping parameter lambda.
As this parameter is quite important to the performance of LMA, different ways of determining the parameter could also affect convergence (and thus final results.)
Again, all of this depends on how the two results differ.
If the two results are significantly different (completely different output CDFs), then chances are only one CDF will match the actual data.
The code that doesn't match the actual data is probably doing something wrong and needs to be scrutinized.
I want to find the minimum of a function y = f(x) in Python.
Problem: the solver tries to compute the gradient with super close x values (delta x around 1e-8), and my function f is not sensitive to such a small step (i.e. we only see y vary when delta x is around 1e-1).
Hence the gradient is 0 to the solver, and it cannot find the proper solution.
I've tried the following solvers from scipy, but I can't find the option I'm looking for:
scipy.optimize.minimize
scipy.optimize.fmin
In Matlab's fmincon, there is an option that does the job: 'DiffMinChange', the minimum change in variables for finite-difference gradients (a positive scalar).
You may want to try and use L-BFGS-B from scipy:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.fmin_l_bfgs_b.html
And provide the "epsilon" parameter to be around 0.1/0.05 and see if it makes things better. I am of course assuming that you will let the solver compute the gradient for you by numerical differentiation (i.e., you pass fprime=None and approx_grad=True to the routine).
I personally despise the “minimize” interface to various solvers so I prefer to deal with the actual solvers themselves.
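A minimal sketch of that call; the step-like objective here is just a stand-in of mine for your insensitive f:

import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def f(x):
    # stand-in objective that only changes on a coarse grid of x (steps of 0.1)
    return float((np.round(x[0], 1) - 2.0) ** 2)

# approx_grad=True lets the solver build the gradient by finite differences,
# and epsilon sets the finite-difference step (much larger than the default 1e-8)
x_best, f_best, info = fmin_l_bfgs_b(f, x0=[0.0], approx_grad=True, epsilon=0.1)
print(x_best, f_best)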
Usually I use Mathematica, but now trying to shift to python, so this question might be a trivial one, so I am sorry about that.
Anyway, is there any built-in function in Python which is similar to the function named Interval[{min,max}] in Mathematica? The link is: http://reference.wolfram.com/language/ref/Interval.html
What I am trying to do is, I have a function and I am trying to minimize it, but it is a constrained minimization, by that I mean, the parameters of the function are only allowed within some particular interval.
For a very simple example, let's say f(x) is a function with parameter x and I am looking for the value of x which minimizes the function, but x is constrained to an interval (min, max). [Obviously the actual problem is not one-dimensional but a multi-dimensional optimization, so different parameters may have different intervals.]
Since it is an optimization problem, I of course do not want to pick the parameters randomly from an interval.
Any help will be highly appreciated, thanks!
If it's a highly non-linear problem, you'll need to use an algorithm such as the Generalized Reduced Gradient (GRG) Method.
The idea of the generalized reduced gradient algorithm (GRG) is to solve a sequence of subproblems, each of which uses a linear approximation of the constraints. (Ref)
You'll need to ensure that certain conditions known as the KKT conditions are met, etc. but for most continuous problems with reasonable constraints, you'll be able to apply this algorithm.
This is a good reference for such problems with a few examples provided. Ref. pg. 104.
Regarding implementation:
While I am not familiar with Python, I have built solver libraries in C++ using templates as well as function pointers, so you can pass functions (for the objective as well as the constraints) as arguments to the solver and get your result, hopefully in polynomial time for convex problems or in cases where the initial values are reasonable.
If an ability to do that exists in Python, it shouldn't be difficult to build a generalized GRG solver.
The Python Solution:
Edit: Here is the python solution to your problem: Python constrained non-linear optimization
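Based on that linked answer, the per-parameter intervals map onto the bounds argument of scipy.optimize.minimize; a toy sketch (the objective and numbers here are made up):

from scipy.optimize import minimize

def objective(params):
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2  # toy objective, stands in for the real one

# each parameter gets its own (min, max) interval, like Interval[{min, max}]
bounds = [(0.0, 5.0), (-3.0, 3.0)]

res = minimize(objective, x0=[2.0, 0.0], bounds=bounds, method="L-BFGS-B")
print(res.x)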
I'm trying to perform a constrained least-squares estimation using Scipy such that all of the coefficients are in the range (0,1) and sum to 1 (this functionality is implemented in Matlab's LSQLIN function).
Does anybody have tips for setting up this calculation using Python/SciPy? I believe I should be using scipy.optimize.fmin_slsqp(), but am not entirely sure what parameters I should be passing to it.[1]
Many thanks for the help,
Nick
[1] The one example in the documentation for fmin_slsqp is a bit difficult for me to parse without the referenced text -- and I'm new to using Scipy.
scipy-optimize-leastsq-with-bound-constraints on SO gives leastsq_bounds, which is leastsq with bound constraints such as 0 <= x_i <= 1.
The constraint that they sum to 1 can be added in the same way.
(I've found leastsq_bounds / MINPACK to be good on synthetic test functions in 5d, 10d, 20d; how many variables do you have?)
Have a look at this tutorial, it seems pretty clear.
Since MATLAB's lsqlin is a bounded linear least squares solver, you would want to check out scipy.optimize.lsq_linear.
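Roughly like this (A and b stand for your design matrix and target; note that lsq_linear only handles the box constraint, so the sum-to-one condition would still need separate handling, e.g. the trick in the nnls answer below):

import numpy as np
from scipy.optimize import lsq_linear

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))   # stand-in design matrix
b = rng.normal(size=50)        # stand-in target vector

res = lsq_linear(A, b, bounds=(0, 1))  # box constraints 0 <= x_i <= 1
print(res.x)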
Non-negative least squares optimization using scipy.optimize.nnls is a robust way of doing it. Note that if the coefficients are constrained to be positive and sum to unity, they are automatically limited to the interval [0,1]; that is, one need not additionally constrain them from above.
scipy.optimize.nnls makes variables non-negative using the Lawson-Hanson algorithm, whereas the sum constraint can be taken care of as discussed in this thread and this one.
SciPy's nnls uses an old Fortran backend, which is apparently widely used in equivalent implementations of nnls in other software.
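A sketch of that combination; the heavily weighted extra row is one common soft way to impose the sum-to-one constraint (A and b are stand-ins):

import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 4))   # stand-in design matrix
b = rng.normal(size=50)        # stand-in target vector

# append a heavily weighted row of ones so that sum(x) ~ 1 in the least-squares sense
w = 1e6
A_aug = np.vstack([A, w * np.ones((1, A.shape[1]))])
b_aug = np.append(b, w * 1.0)

x, rnorm = nnls(A_aug, b_aug)
print(x, x.sum())  # coefficients are >= 0 and sum very close to 1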