I'm using the 'COBYLA' algorithm in scipy's optimize.minimize function (v0.11 build for cygwin). I observed that the bounds parameter seems to be ignored in this case. For instance, the simple example:
from scipy.optimize import minimize
def f(x):
    return -sum(x)
minimize(f, x0=1, method='COBYLA', bounds=(-2,2))
returns:
status: 2.0
nfev: 1000
maxcv: 0.0
success: False
fun: -1000.0
x: array(1000.0)
message: 'Maximum number of function evaluations has been exceeded.'
instead of the expected 2 for x.
Has anyone else seen the same problem? Is there a known bug or a documentation error? The scipy 0.11 documentation does not exclude this option for the COBYLA algorithm, yet the function fmin_cobyla itself doesn't have a bounds parameter.
Thanks for any hint.
You can formulate the bounds as constraints:
import scipy.optimize
# function to minimize
def f(x):
    return -sum(x)
# initial values
initial_point = [1., 1., 1.]
# lower and upper bound for each variable
bounds = [[-2, 2], [-1, 1], [-3, 3]]
# construct the bounds in the form of inequality constraints
cons = []
for factor in range(len(bounds)):
    lower, upper = bounds[factor]
    # default arguments bind the current bound and index at definition time
    l = {'type': 'ineq',
         'fun': lambda x, lb=lower, i=factor: x[i] - lb}
    u = {'type': 'ineq',
         'fun': lambda x, ub=upper, i=factor: ub - x[i]}
    cons.append(l)
    cons.append(u)
# additional constraints can be added in the same way
# run optimization
res = scipy.optimize.minimize(f, initial_point, constraints=cons, method='COBYLA')
# print result
print(res)
Note that minimize passes the design variables to the function as a single array. In this case there are 3 input variables, each with its own lower and upper bound. The result is:
fun: -6.0
maxcv: -0.0
message: 'Optimization terminated successfully.'
nfev: 21
status: 1
success: True
x: array([ 2., 1., 3.])
The original COBYLA(2) FORTRAN algorithm does not support variable bounds explicitly; you have to formulate the bounds as part of the general constraints.
Looking at the current source code for the SciPy minimize interface here, it is apparent that no measures have yet been taken in SciPy to handle this limitation.
Thus, in order to apply bounds for the COBYLA algorithm in the SciPy minimize function, you need to formulate the variable bounds as inequality constraints and pass them in the constraints parameter.
(source code excerpt)
# bounds set to anything other than None yields a warning
if meth == 'cobyla' and bounds is not None:
    warn('Method %s cannot handle bounds.' % method,
         RuntimeWarning)
...
# no bounds argument in the internal call to the COBYLA function
elif meth == 'cobyla':
    return _minimize_cobyla(fun, x0, args, constraints, **options)
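For the one-variable example from the question, the same idea can be wrapped in a small helper. This is only a sketch: bounds_to_constraints is a hypothetical name (it is not a scipy function), and it assumes every bound is finite.

```python
from scipy.optimize import minimize

def bounds_to_constraints(bounds):
    """Turn (lower, upper) pairs into COBYLA-style inequality constraints.

    Hypothetical helper (not part of scipy); assumes all bounds are finite.
    """
    cons = []
    for i, (lower, upper) in enumerate(bounds):
        # 'ineq' means the constraint function must be non-negative
        cons.append({'type': 'ineq', 'fun': lambda x, lb=lower, i=i: x[i] - lb})
        cons.append({'type': 'ineq', 'fun': lambda x, ub=upper, i=i: ub - x[i]})
    return cons

def f(x):
    return -sum(x)

res = minimize(f, x0=[1.0], method='COBYLA',
               constraints=bounds_to_constraints([(-2, 2)]))
print(res.x)  # should end up close to the upper bound of 2
```

Binding lower, upper and i as default arguments matters here: a plain closure would see only the values from the last loop iteration.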
Related
I have a Python script where I compute the value of a normal log-likelihood function for a sample of bivariate data using scipy's multivariate_normal.logpdf. I am assuming the values of the sample means and variances, leaving only the sample correlation between the variables as the unknown:
from scipy.stats import multivariate_normal
from scipy.optimize import minimize
VAR_X = 0.4
VAR_Y = 0.32
MEAN_X = 1
MEAN_Y = 1.2
def log_likelihood_function(x, data):
    log_likelihood = 0
    sigma = [ [VAR_X, x[0]], [x[0], VAR_Y] ]
    mu = [ MEAN_X, MEAN_Y ]
    for point in data:
        log_likelihood += multivariate_normal.logpdf(x=point, mean=mu, cov=sigma)
    return log_likelihood
if __name__ == "__main__":
    some_data = [ [1.1, 2.0], [1.2, 1.9], [0.8, 0.2], [0.7, 1.3] ]
    guess = [ 0 ]
    # maximize log-likelihood by minimizing the negative
    likelihood = lambda x: (-1)*log_likelihood_function(x, some_data)
    result = minimize(fun = likelihood, x0 = guess, options = {'disp': True}, method="SLSQP")
    print(result)
No matter what I set as my guess, this script reliably throws a ValueError,
ValueError: the input matrix must be positive semidefinite
Now, the problem, by my estimation, seems to be that scipy.optimize.minimize is trying values that create a covariance matrix that is not positive definite. So I need a way to make sure the minimization algorithm throws away values that are outside the domain of the problem. I thought to add a constraint to the minimize call,
## make the determinant always positive
def positive_definite_constraint(x):
    return VAR_X*VAR_Y - x*x
which is basically Sylvester's criterion for the covariance matrix and would ensure the matrix is positive definite (since we know the variances are always positive, that condition doesn't need to be checked). But it seems like scipy.optimize.minimize evaluates the objective function before it determines whether the constraints are satisfied. That seems like a design flaw: wouldn't it be faster to search for a solution in a restricted domain, instead of searching all possible solutions and then determining if the constraints are satisfied? I might be mistaken about the order of evaluation, though.
I am not sure how to proceed. I realize I am stretching the purpose of scipy.optimize here a bit by parameterizing the covariance matrix and then minimizing with respect to that parameterization, and I know there are better ways to calculate the correlation for a normal sample, but I am interested in this problem because of its generalization to distributions that are not normal.
Any suggestions? Is there a better way to solve this problem?
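You are on the right track. Note that your definiteness constraint reduces to a simple bound on the optimization variable: VAR_X*VAR_Y - x[0]**2 > 0 is equivalent to |x[0]| < sqrt(VAR_X*VAR_Y). Variable bounds are better handled internally than the more general constraints. The snippet below imposes only the one-sided bound -∞ <= x[0] <= VAR_X*VAR_Y, which is stricter than necessary from above and leaves negative values unbounded, but it is enough for this data set: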
bounds = [(None, VAR_X*VAR_Y)]
res = minimize(fun = likelihood, x0 = guess, bounds=bounds, options = {'disp': True}, method="SLSQP")
This gives me:
fun: 6.610504611834715
jac: array([-0.0063166])
message: 'Optimization terminated successfully'
nfev: 9
nit: 4
njev: 4
status: 0
success: True
x: array([0.12090069])
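Since the determinant condition VAR_X*VAR_Y - x[0]**2 > 0 is the same as |x[0]| < sqrt(VAR_X*VAR_Y), a two-sided bound is another option. The following is a sketch under assumptions, not a tested recommendation: the 0.999 safety margin is an arbitrary choice to stay strictly inside the positive definite region, and the data and constants are copied from the question.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import multivariate_normal

VAR_X, VAR_Y = 0.4, 0.32
MEAN_X, MEAN_Y = 1, 1.2
some_data = np.array([[1.1, 2.0], [1.2, 1.9], [0.8, 0.2], [0.7, 1.3]])

def negative_log_likelihood(x):
    sigma = [[VAR_X, x[0]], [x[0], VAR_Y]]
    # logpdf accepts all sample points at once and returns one value per point
    return -np.sum(multivariate_normal.logpdf(some_data, mean=[MEAN_X, MEAN_Y], cov=sigma))

# det(sigma) > 0  <=>  |x[0]| < sqrt(VAR_X * VAR_Y); the 0.999 margin is an
# arbitrary choice that keeps the matrix strictly positive definite
limit = 0.999 * np.sqrt(VAR_X * VAR_Y)
res = minimize(negative_log_likelihood, x0=[0.0],
               bounds=[(-limit, limit)], method="SLSQP")
print(res.x, res.fun)
```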
The problem at hand is the optimization of a multivariate function with nonlinear constraints.
There is a differential equation (in oversimplified form)
dy/dx = y(x)*t(x) + g(x)
I need to minimize the solution y(x) of the DE by varying t(x).
Since there is physics under the hood, there are constraints on t(x). I successfully implemented all of them except one:
0 < t(x) < 1 for any x in range [a,b]
For concreteness, t(x) is a general polynomial:
t(x) = a0 + a1*x + a2*x**2 + a3*x**3 + a4*x**4 + a5*x**5
Here x is a fixed numpy.ndarray of floats and the optimization is over the coefficients a. I use scipy.optimize with trust-constr.
What I have tried so far:
Root finding at each step: determining the minimal/maximal value of the function using optimize.root and checking for sign changes, then returning 0.5 if the constraints are satisfied and numpy.inf or -1 or anything else outside the [0;1] range if they are not. The optimizer stops early and the function is not minimized properly.
Since x is fixed-length and known, I tried to define a constraint for each point, so I got N constraints where N = len(x). This works (or at least looks like it does) but takes forever even for not-so-large N. Also, since x is discrete and non-uniform, I can't be sure that no constraint is violated for some x in [a,b].
EDIT #1: the minimal reproducible example
import scipy.optimize as optimize
from scipy.optimize import Bounds
import numpy as np
# some function y(x)
x = np.linspace(-np.pi,np.pi,100)
y = np.sin(x)
# polynomial t(z)
def t(a,z):
    v = 0.0
    for ii in range(len(a)):
        v += a[ii]*z**ii
    return v
# let's minimize the sum
def targetFn(a):
    return np.sum(y*t(a,x))
# polynomial order
polyord = 3
# simple bounds to have reliable results,
# otherwise the solution will grow toward +-infinity
bnd = 10.0
bounds = Bounds([-bnd for i in range(polyord+1)],
                [bnd for i in range(polyord+1)])
res = optimize.minimize(targetFn, [1.0 for i in range(polyord+1)],
                        bounds = bounds)
if np.max(t(res.x,x))>200:
    print('max constraint violated!')
if np.min(t(res.x,x))<-100:
    print('min constraint violated!')
In the reproducible example given above, let the constraint be that the value of the polynomial t(a,x) stays in the range [-100;200] for the given x.
So the question is: how does one properly define a constraint to tell the optimizer that the function's values must be constrained for the given range of arguments?
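One possible approach, sketched here for the reproducible example only: because t(a, x) is linear in the coefficients a, the N pointwise constraints can be passed as a single vectorised LinearConstraint instead of N separate Python callbacks. The names V, c, target_fn, target_grad, target_hess and poly_range are illustrative, and the sketch assumes the grid x is dense enough that the polynomial does not leave the range between grid points, which it does not guarantee.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, minimize

x = np.linspace(-np.pi, np.pi, 100)
y = np.sin(x)
polyord = 3

# Vandermonde matrix: row j is [1, x_j, x_j**2, ...], so t(a, x) == V @ a
V = np.vander(x, polyord + 1, increasing=True)
c = y @ V  # the target sum(y * t(a, x)) equals c @ a

def target_fn(a):
    return c @ a

def target_grad(a):
    return c

def target_hess(a):
    # the target is linear in a, so its Hessian is zero
    return np.zeros((a.size, a.size))

bnd = 10.0
bounds = Bounds([-bnd] * (polyord + 1), [bnd] * (polyord + 1))
# one vectorised constraint: -100 <= t(a, x_j) <= 200 at every grid point
poly_range = LinearConstraint(V, -100.0, 200.0)

res = minimize(target_fn, np.ones(polyord + 1), jac=target_grad, hess=target_hess,
               bounds=bounds, constraints=[poly_range], method='trust-constr')
t_values = V @ res.x
print(res.x, t_values.min(), t_values.max())
```

For a t(x) that is not linear in the optimisation variables, scipy.optimize.NonlinearConstraint with a function returning the whole vector of t values would play the same role.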
I am asked to write an implementation of gradient descent in Python with the signature gradient(f, P0, gamma, epsilon), where f is an unknown and possibly multivariate function, P0 is the starting point for the gradient descent, gamma is the constant step size and epsilon is the stopping criterion.
What I find tricky is how to evaluate the gradient of f at the point P0 without knowing anything about f. I know there is numpy.gradient, but I don't know how to use it when I don't know the dimension of f. Also, numpy.gradient works with samples of the function, so how do I choose the right samples to compute the gradient at a point without any information about the function and the point?
I'm assuming here that your question, "So how can I choose a generic set of samples each time I need to compute the gradient at a given point?", means that the dimension of the function is fixed and can be deduced from your start point.
Consider this a demo using scipy's approx_fprime, which is an easier-to-use wrapper for numerical differentiation and is also used in scipy's optimizers when a Jacobian is needed but not given.
Of course, you can't ignore the parameter epsilon, which can make a difference depending on the data.
(This code also ignores optimize's args parameter, which is usually a good idea to use; I'm relying on the fact that A and b are in the enclosing scope here, which is surely not best practice.)
import numpy as np
from scipy.optimize import approx_fprime, minimize
np.random.seed(1)
# Synthetic data
A = np.random.random(size=(1000, 20))
noiseless_x = np.random.random(size=20)
b = A.dot(noiseless_x) + np.random.random(size=1000) * 0.01
# Loss function
def fun(x):
    return np.linalg.norm(A.dot(x) - b, 2)
# Optimize without any explicit jacobian
x0 = np.zeros(len(noiseless_x))
res = minimize(fun, x0)
print(res.message)
print(res.fun)
# Get numerical-gradient function
eps = np.sqrt(np.finfo(float).eps)
my_gradient = lambda x: approx_fprime(x, fun, eps)
# Optimize with our gradient
res = minimize(fun, x0, jac=my_gradient)
print(res.message)
print(res.fun)
# Eval gradient at some point
print(my_gradient(np.ones(len(noiseless_x))))
Output:
Optimization terminated successfully.
0.09272331925776327
Optimization terminated successfully.
0.09272331925776327
[15.77418041 16.43476772 15.40369129 15.79804516 15.61699104 15.52977276
15.60408688 16.29286766 16.13469887 16.29916573 15.57258797 15.75262356
16.3483305 15.40844536 16.8921814 15.18487358 15.95994091 15.45903492
16.2035532 16.68831635]
Using:
# Get numerical-gradient function with a way too big eps-value
eps = 1e-3
my_gradient = lambda x: approx_fprime(x, fun, eps)
shows that eps is a critical parameter resulting in:
Desired error not necessarily achieved due to precision loss.
0.09323354898565098
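For the requested signature itself, a minimal sketch could look like the following. It swaps approx_fprime for a hand-written central difference so that it depends on numpy only. The helper numerical_gradient, the step h, the max_iter guard and the reading of epsilon as a threshold on the gradient norm are all assumptions, not part of the assignment; the dimension is simply taken from P0.

```python
import numpy as np

def numerical_gradient(f, p, h=1e-6):
    """Central-difference estimate of the gradient of f at point p."""
    p = np.asarray(p, dtype=float)
    grad = np.zeros_like(p)
    for i in range(p.size):
        step = np.zeros_like(p)
        step[i] = h
        grad[i] = (f(p + step) - f(p - step)) / (2.0 * h)
    return grad

def gradient(f, P0, gamma, epsilon, max_iter=10000):
    """Fixed-step gradient descent starting from P0.

    Stops when the gradient norm drops below epsilon (one possible reading
    of the stopping criterion) or after max_iter steps (a safety guard).
    """
    p = np.atleast_1d(np.asarray(P0, dtype=float))
    for _ in range(max_iter):
        g = numerical_gradient(f, p)
        if np.linalg.norm(g) < epsilon:
            break
        p = p - gamma * g
    return p

def bowl(p):
    # simple test function with its minimum at (1, -2)
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

p_min = gradient(bowl, [0.0, 0.0], gamma=0.1, epsilon=1e-6)
print(p_min)
```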
I want to fit a four-parameter model (a, g, N and k) to data by minimizing a chi-square loss function with a Python implementation of the Simplex algorithm (scipy.optimize.fmin).
Preliminary simulations suggest the following range for each parameter: a = [5, 50], g = [0.05, 1.5], N = [5, 200], and k = [0, 0.05].
It looks like the scipy.optimize.fmin function treats the parameters as if they were all in the same range (presumably [0, 1]). Should I rescale them? Below is my code:
import numpy as np
from scipy.optimize import fmin
# determine starting point (x0) for each parameter
a = np.random.uniform(5, 50)
g = np.random.uniform(0.05, 1.5)
N = np.random.uniform(5, 200)
k = np.random.uniform(0, 0.05)
x0 = np.array([a, g, N, k])  # initial guess for Simplex
# chis is the chi-square loss function (defined elsewhere)
xopt = fmin(chis, x0, maxiter=1000)  # call Simplex
Imagine that you want to minimize the following bivariate function:
def to_min1(p):
    x, y = p
    return abs(1e-15 - x) + abs(1e15 - y)
Even if this example is not realistic, it highlights the main point. fmin will hardly move in x (if x0=0), because the term in x is already very close to zero.
To get objectives that carry equal weight within the optimization, one can express them as relative deviations rather than absolute differences (with the arguments in the numerators, so that a zero argument cannot raise a ZeroDivisionError):
def to_min2(p):
    x, y = p
    return abs(-1+x/1e-15) + abs(-1+y/1e15)
Note that this is an ftol concern: by doing so, one wants the convergence test on the function value to weight all arguments equally.
What follows does not exactly answer your question, but rather this one:
Does scipy.optimize.fmin (Simplex) deal with parameters associated with different magnitudes?
Apparently no, since
>>> fmin(to_min1, (0,0))
Optimization terminated successfully.
Current function value: 1000000000000000.000000
Iterations: 3
Function evaluations: 11
array([ 0., 0.])
while
>>> fmin(to_min2, (0,0))
Optimization terminated successfully.
Current function value: 1.000000
Iterations: 118
Function evaluations: 213
array([ 1.00000000e-15, 8.98437500e-05])
Admittedly, the second optimization did not really terminate successfully either (y is still nowhere near 1e15); that could be addressed by increasing fmin's maxiter argument, etc. Still, the two cases are clearly not handled the same way.
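Applied to the four-parameter fit from the question, one way to rescale is to let fmin work on dimensionless parameters and convert back inside a wrapper. The sketch below is an assumption-laden illustration: chis_demo and target are hypothetical stand-ins for the real chi-square loss, and using the upper end of each range as the scale factor is just one reasonable choice.

```python
import numpy as np
from scipy.optimize import fmin

# Upper ends of the ranges given in the question, used as scale factors
scale = np.array([50.0, 1.5, 200.0, 0.05])

# Hypothetical stand-in for the real chi-square loss: relative squared
# distance to an arbitrary "true" parameter set
target = np.array([20.0, 0.8, 100.0, 0.02])

def chis_demo(params):
    return np.sum(((params - target) / target) ** 2)

def chis_scaled(u):
    # u holds dimensionless parameters; convert back before evaluating
    return chis_demo(u * scale)

u0 = np.full(4, 0.5)  # a starting point well inside every range
u_opt = fmin(chis_scaled, u0, xtol=1e-8, ftol=1e-8,
             maxiter=5000, maxfun=5000, disp=False)
x_opt = u_opt * scale  # back to the original units (a, g, N, k)
print(x_opt)
```

With this wrapper the absolute xtol applies to values of order one for every parameter, rather than to quantities as different as k (about 0.01) and N (about 100).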
I'm using scipy to maximize a likelihood function by using 'minimize' from scipy.optimize to minimize the negative of the function value. I'm using the BFGS method and have written functions for the likelihood and its first derivative.
I have been able to minimize the function by estimating the gradient numerically (not providing an argument for the jacobian). However when I try to pass my gradient function as an argument, no iterations are performed to improve my initial guess of the function input values.
EDIT: Using check_grad from scipy I have figured out that my gradient function is flawed. This causes the line search step of the first iteration to fail so no iterations are carried out.
Here are the function and gradient:
def f(X):
    X = X.reshape((N,Q))
    cov = kern2.compute_noisy(X,X)
    inv_cov = np.linalg.inv(cov)
    YYt = np.dot(Y, Y.T)
    log_l = (-0.5*D*N*np.log(2*math.pi))-(0.5*D*np.log(np.linalg.det(cov))) - (0.5*np.matrix.trace(np.dot(inv_cov,YYt)))
    return -log_l
def grad(X):
    X = X.reshape(N,-1)
    cov = kern2.compute_noisy(X,X)
    inv_cov = np.linalg.inv(cov)
    YYt = np.dot(Y, Y.T)
    dlogl_dK = np.dot(np.dot(inv_cov,YYt),inv_cov) - D*inv_cov
    dK_dX = np.empty((X.shape[0], X.shape[0], X.shape[1]))
    Q = int(X.shape[1])
    for j in range(0,X.shape[0]):
        for i in range(0,X.shape[0]):
            for k in range(0,X.shape[1]):
                dK_dX[i,j,k] = (X[i][k] - X[j][k]) * kern.K(X[i,:][None],X[j,:][None])
    dK_dX = np.sum(dK_dX, axis=1)
    dlogl_dX = np.dot(dlogl_dK, dK_dX)
    return -dlogl_dX.flatten(1)
Checking the initial function value:
print(f(X))
>>6597.80198798
Estimating the gradient numerically seems to be ok (the function is not minimized but at least something happens). X is my initial guess at the input:
from scipy.optimize import minimize
test = minimize(f, X, method='BFGS', options={'disp': True})
>>Warning: Desired error not necessarily achieved due to precision loss.
>> Current function value: 6215.446492
>> Iterations: 289
>> Function evaluations: 67671
>> Gradient evaluations: 335
This is what happens when I try to include the gradient function. No iterations are performed and the function value doesn't change:
test2 = minimize(f, X, method='BFGS', jac=grad, options={'disp': True})
>>Warning: Desired error not necessarily achieved due to precision loss.
>> Current function value: 6597.801988
>> Iterations: 0
>> Function evaluations: 43
>> Gradient evaluations: 32
I have looked at the documentation and can't work out why no iterations are being performed. I think I am using minimize correctly and I don't think my initial guess is at a minimum already as I have the same problem with different sets of values. Help would be much appreciated!
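For reference, the check_grad test mentioned in the edit above can be sketched on a toy problem as follows. The functions f_demo, grad_ok and grad_wrong are made up for illustration and have nothing to do with the kernel code in the question. check_grad returns the 2-norm of the difference between the supplied gradient and a finite-difference estimate.

```python
import numpy as np
from scipy.optimize import check_grad

def f_demo(x):
    return np.sum(x ** 2) + np.sin(x[0])

def grad_ok(x):
    g = 2.0 * x
    g[0] += np.cos(x[0])
    return g

def grad_wrong(x):
    # deliberately wrong sign, to mimic a buggy derivation
    return -grad_ok(x)

x0 = np.array([0.3, -1.2, 2.0])
err_ok = check_grad(f_demo, grad_ok, x0)
err_wrong = check_grad(f_demo, grad_wrong, x0)
print(err_ok)     # small: analytic and numerical gradients agree
print(err_wrong)  # large: the gradient is flawed
```

Running the same check on the real f and grad at a few random points is a cheap test to do before handing jac=grad to BFGS.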