How can I optimize a function with fixed steps? I have developed a function that takes five thresholds as input and that I want to optimize. I have tried several solvers, but the steps they take are so tiny that the function never converges to a good solution.
The thresholds vary from 0 to 1, and I want the solver to move them in steps of 0.01. For example, threshold_0 should go from the initial guess of 0.6 to 0.61 or 0.59, etc., depending on the resulting error.
from scipy import optimize

initial_guess = [0.6, 0.3, 0.6, 0.5, 0.5]

def get_sobel3d_accuracy_from_thresholds(thresholds, array_dicts, ponderation_dict):
    ...
    return error

result = optimize.minimize(
    get_sobel3d_accuracy_from_thresholds,   # function to optimize
    initial_guess,
    args=(array_dicts, ponderation_dict),   # extra fixed args
    method='nelder-mead',
    options={'xatol': 1e-8, 'disp': True})
What I want is a solution that minimizes the error returned by get_sobel3d_accuracy_from_thresholds, for example:
optimized_thresholds = [0.61, 0.3, 0.81, 0.52, 0.44]
I would also like to fix bounds of 0 to 1 for the thresholds, but I think that can only be done with some solvers, right?
bounds = [(0, 1) for n in range(0,5)]
thank you all.
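For reference, fixed 0.01 steps are not something a continuous solver such as Nelder-Mead supports directly; one simple alternative is to search on the 0.01 grid itself. Below is a minimal sketch of a greedy coordinate search; the error function here is a made-up stand-in, since get_sobel3d_accuracy_from_thresholds and its data arguments are not shown above.

import numpy as np

# Hypothetical stand-in for get_sobel3d_accuracy_from_thresholds: replace it
# with the real error function and its array_dicts / ponderation_dict args.
def error_function(thresholds):
    target = np.array([0.61, 0.3, 0.81, 0.52, 0.44])
    return float(np.sum((np.asarray(thresholds) - target) ** 2))

def coordinate_search(error_function, initial_guess, step=0.01, lo=0.0, hi=1.0):
    """Greedy coordinate search that only ever moves one threshold by +/- step."""
    x = np.round(np.asarray(initial_guess, dtype=float) / step) * step
    best = error_function(x)
    improved = True
    while improved:
        improved = False
        for i in range(len(x)):
            for direction in (step, -step):
                candidate = x.copy()
                candidate[i] = np.clip(candidate[i] + direction, lo, hi)
                err = error_function(candidate)
                if err < best - 1e-12:
                    x, best, improved = candidate, err, True
    return x, best

thresholds, err = coordinate_search(error_function, [0.6, 0.3, 0.6, 0.5, 0.5])
print(np.round(thresholds, 2), err)

Because the search never proposes anything finer than 0.01 and clips to [0, 1], the tiny-step and bounds issues do not arise; the price is that it only finds a local optimum on the grid.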
I have a Python script where I compute the value of a normal log-likelihood function for a sample of bivariate data using scipy's multivariate_normal.logpdf. I am assuming the values of the sample means and variances are known, leaving only the covariance between the variables (and hence their correlation) as the unknown,
from scipy.stats import multivariate_normal
from scipy.optimize import minimize

VAR_X = 0.4
VAR_Y = 0.32
MEAN_X = 1
MEAN_Y = 1.2

def log_likelihood_function(x, data):
    log_likelihood = 0
    sigma = [[VAR_X, x[0]], [x[0], VAR_Y]]
    mu = [MEAN_X, MEAN_Y]
    for point in data:
        log_likelihood += multivariate_normal.logpdf(x=point, mean=mu, cov=sigma)
    return log_likelihood

if __name__ == "__main__":
    some_data = [[1.1, 2.0], [1.2, 1.9], [0.8, 0.2], [0.7, 1.3]]
    guess = [0]

    # maximize the log-likelihood by minimizing its negative
    likelihood = lambda x: (-1) * log_likelihood_function(x, some_data)

    result = minimize(fun=likelihood, x0=guess, options={'disp': True}, method="SLSQP")
    print(result)
No matter what I set as my guess, this script reliably throws a ValueError,
ValueError: the input matrix must be positive semidefinite
Now, the problem, by my estimation, seems to be that scipy.optimize.minimize is guessing values that create a covariance matrix that is not positive definite. So I need a way to make sure the minimization algorithm throws away values that are outside the domain of the problem. I thought to add a constraint to the minimize call,
## make the determinant always positive
def positive_definite_constraint(x):
    return VAR_X*VAR_Y - x*x
Which is basically Sylvester's criterion for the covariance matrix and would ensure the matrix is positive definite (since we know the variances are always positive, that condition doesn't need to be checked). But it seems like scipy.optimize.minimize evaluates the objective function before it determines whether the constraints are satisfied (which seems like a design flaw; wouldn't it be faster to search for a solution in a restricted domain, instead of searching all possible solutions and then determining if the constraints are satisfied? I might be mistaken about the order of evaluation, though).
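For reference, this is roughly how such a constraint would be wired into the call, reusing the likelihood and guess defined above (a sketch only; for scipy.optimize.minimize an 'ineq' constraint means the function value must stay non-negative):

# Sketch: pass the determinant condition as an inequality constraint (>= 0)
def positive_definite_constraint(x):
    return VAR_X * VAR_Y - x[0] * x[0]

constraints = [{'type': 'ineq', 'fun': positive_definite_constraint}]
result = minimize(fun=likelihood, x0=guess, method="SLSQP",
                  constraints=constraints, options={'disp': True})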
I am not sure how to proceed. I realize I am stretching the purpose of scipy.optimize here a bit by parameterizing the covariance matrix and then minimizing with respect to that parameterization, and I know there are better ways to calculate the correlation for a normal sample, but I am interested in this problem because of its generalization to distributions that are not normal.
Any suggestions? Is there a better way to solve this problem?
You are on the right track. Note that your definiteness constraint reduces to a simple bound on the optimization variable: x[0]*x[0] <= VAR_X*VAR_Y, i.e. -sqrt(VAR_X*VAR_Y) <= x[0] <= sqrt(VAR_X*VAR_Y). Variable bounds are handled better internally than the more general constraints, so I'd recommend something like this:
bound = (VAR_X * VAR_Y) ** 0.5
bounds = [(-bound, bound)]
res = minimize(fun=likelihood, x0=guess, bounds=bounds, options={'disp': True}, method="SLSQP")
This gives me:
fun: 6.610504611834715
jac: array([-0.0063166])
message: 'Optimization terminated successfully'
nfev: 9
nit: 4
njev: 4
status: 0
success: True
x: array([0.12090069])
I am trying to minimize a function with scipy.optimize with three input variables, two of which are bounded and one has to be chosen from a set of values. To ensure that the third variable is chosen from a predefined set of values, I introduced the following constraint:
from scipy.optimize import rosen, shgo
import numpy as np

# Set of values the third variable to be optimized can take
Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def Rosen_Test(x):  # arbitrary objective function
    print(x)
    return rosen(x)**2 - np.sin(x[0])

def Cond_1(x):
    if x[2] in Z:
        return 1
    else:
        return -1

bounds = [(-512, 512), ] * 3
conds = ({'type': 'ineq', 'fun': Cond_1})

result = shgo(Rosen_Test, bounds, constraints=conds)
print(result)
However, when looking at the print output from Rosen_Test, it is evident that the condition is not being enforced - perhaps the condition is not defined correctly?
I was wondering if anyone has any ideas to ensure that the third variable can be chosen from a set.
Note: the shgo method was chosen because constraints can be introduced and changed easily. Also, I am open to using other optimization packages as long as this condition can be met.
The inequality constraints do not work like that.
As mentioned in the docs, they are defined as
g(x) <= 0
and you need to write g(x) to work like that. In your case that is not what happens: you are returning a single scalar for one dimension, while you need to return a vector of shape (3,), one entry per dimension.
In your case you could try to use an equality constraint instead, as this allows a slightly better hack. But I am still not sure it will work, as these optimizers are not really designed for this kind of condition.
And the whole thing will probably leave the optimizer with a rather bumpy and discontinuous objective function. You can read up on Mixed-Integer Nonlinear Programming (MINLP), maybe start here.
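A plain alternative, in the spirit of the MINLP remark, would be to simply enumerate the values in Z and optimize only the two continuous variables for each one; a rough sketch, reusing the rosen-based objective from the question:

import numpy as np
from scipy.optimize import rosen, shgo

Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def objective_2d(x01, z):
    # same objective as Rosen_Test, but with the third variable fixed to z
    x = np.append(x01, z)
    return rosen(x) ** 2 - np.sin(x[0])

best = None
for z in Z:
    res = shgo(objective_2d, bounds=[(-512, 512)] * 2, args=(z,))
    if best is None or res.fun < best[1]:
        best = (np.append(res.x, z), res.fun)

print(best)   # best (x0, x1, z) found and its objective value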
There is one more reason why your approach won't work as expected:
optimizers work with floating-point numbers, so when guessing new solutions they will almost never land exactly on a value that is in your array.
This illustrates the issue:
import numpy as np
Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])
print(0.7999999 in Z) # False, this is what the optimizer will find
print(0.8 in Z) # True, this is what you want
Maybe you should try to define your problem in a way that allows you to use an inequality constraint on the whole range of Z.
But let's see how it could work.
An equality constraint is defined as
h(x) == 0
So you could use
def Cond_1(x):
    if x[2] in Z:
        return np.zeros_like(x)
    else:
        return np.ones_like(x) * 1.0  # maybe multiply by some scalar?
The idea is to return an array [0.0, 0.0, 0.0] that satisfies the equality constraint if the number is found. Else return [1.0, 1.0, 1.0] to show that it is not satisfied.
Caveats:
1.)
You might have to tune this to return an array like [0.0, 0.0, 1.0], to show the optimizer which dimension you are unhappy about, so that it can make better guesses by adjusting only a single dimension.
2.)
You might have to return a value larger than 1.0 to signal a non-satisfied equality constraint. This depends on the implementation; the optimizer could decide that 1.0 is fine because it is close to 0.0. So maybe you have to try something like [0.0, 0.0, 999.0].
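Putting both caveats together, the constraint might look roughly like this (the 999 scaling is only a guess, as said above):

import numpy as np

Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def Cond_1(x):
    violation = np.zeros_like(x)
    if x[2] not in Z:
        # flag only the offending dimension, scaled up so the optimizer
        # cannot mistake it for "almost satisfied"
        violation[2] = 999.0
    return violation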
This solves the problem with the dimension, but it still will not find any of your numbers, due to the floating-point issue mentioned above.
But we can try to hack around that like this:
import numpy as np

Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def Cond_1(x):
    # how close you want to get to the numbers in your array
    tolerance = 0.001
    delta = np.abs(x[2] - Z)
    print(delta)
    print(np.min(delta) < tolerance)
    if np.min(delta) < tolerance:
        return np.zeros_like(x)
    else:
        # maybe you have to multiply this with some scalar
        # I have no clue how it is implemented
        # we need a value stating to the optimizer "NOT THIS ONE!!!"
        return np.ones_like(x) * 1.0

sol = np.array([0.5123, 0.234, 0.2])
print(Cond_1(sol))    # constraint satisfied: prints True above, returns zeros

sol = np.array([0.5123, 0.234, 0.202])
print(Cond_1(sol))    # constraint not satisfied: prints False above, returns ones
Here are some recommendations on optimization. To make sure it works in a reliable way, try to start the optimization from different initial values. Global optimization algorithms might not take initial values when used with bounds; the optimizer somehow discretizes the space itself.
What you could do to check the reliability of your optimization and get better overall results (a rough sketch follows this list):
Optimize on the complete region [-512, 512] (for all three dimensions)
Try 1/2 of that: [-512, 0] and [0, 512] (8 sub-optimizations, 2 for each dimension)
Try 1/3 of that: [-512, -171], [-171, 170], [170, 512] (27 sub-optimizations, 3 for each dimension)
Now compare the converged results to see if the complete global optimization found the same result
If the global optimizer did not find the "real" minimum but one of the sub-optimizations did:
your objective function is too difficult on the whole domain
try a different global optimizer
tune the parameters (maybe the 999 for the equality constraint)
I often use the sub-optimization as part of the normal process, not only for testing. Especially for blackbox problems.
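A rough sketch of that splitting scheme, with a placeholder objective standing in for the real one and shgo used as in the question:

import itertools
import numpy as np
from scipy.optimize import rosen, shgo

def objective(x):
    # placeholder objective; stands in for whatever you are really optimizing
    return rosen(x) ** 2 - np.sin(x[0])

def split(lo, hi, parts):
    edges = np.linspace(lo, hi, parts + 1)
    return list(zip(edges[:-1], edges[1:]))

results = []
for parts in (1, 2, 3):                      # whole domain, halves, thirds
    intervals = split(-512, 512, parts)      # per-dimension sub-intervals
    for box in itertools.product(intervals, repeat=3):
        res = shgo(objective, bounds=list(box))   # 1 + 8 + 27 = 36 runs, can be slow
        results.append((res.fun, res.x))

# compare the best sub-optimization result with the full-domain run
print(sorted(results, key=lambda r: r[0])[0])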
Please also see these answers:
Scipy.optimize Inequality Constraint - Which side of the inequality is considered?
Scipy minimize constrained function
I'm searching for a minimum in a 100-dimensional space. I'm using gp_minimize from skopt (Python 3.6).
space = [(0., 1.) for _ in range(100)]
res = gp_minimize(f, space)
However, I also have a constraint that the value in each subsequent dimension is not larger than the value in the previous dimension. For example, in the 5-dimensional case the point [1, 0.9, 0.9, 0.8, 0.7] is ok, while the point [1, 0.3, 0.5, 0.4, 0.2] is not.
How to add this constraint using skopt?
The best way I found is to modify the function f. Choose an upper bound for f, and everywhere in the domain where f is not supposed to be evaluated, have it return this upper bound.
It is straightforward to see that this is a mathematically sound approach, as it doesn't change the minimum and constrains your search space kind of in the same way that Lagrange multipliers do. However, I don't know how well it plays with the algorithm, since I don't really know how Bayesian optimization handles wide plateaus.
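A sketch of that idea for the monotonicity constraint; f here is a made-up stand-in for the real 100-d objective, and UPPER_BOUND is an assumed value you would pick above anything f can return on the feasible set:

import numpy as np
from skopt import gp_minimize

UPPER_BOUND = 1e6   # assumed: larger than any feasible value of f

def f(x):
    # hypothetical objective, stands in for the real function
    return float(np.sum(np.square(x)))

def f_constrained(x):
    x = np.asarray(x)
    if np.any(np.diff(x) > 0):   # some dimension increased: outside the allowed region
        return UPPER_BOUND
    return f(x)

space = [(0., 1.) for _ in range(100)]
res = gp_minimize(f_constrained, space, n_calls=50)
print(res.x, res.fun)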
I'm trying to calculate the maximum likelihood estimate (MLE) for the parameters of the probability density function (PDF) implemented in the code below.
I'm computing it by minimising the objective function (the negative log-likelihood) without relying on any predefined log-likelihood Python modules whatsoever. The code is:
import os
import numpy as np
import scipy.io
from scipy import optimize

# Alpha Distribution (PDF)
def AD(z, *params):
    a, scale = z
    diameters = params
    return -np.sum(np.log((((diameters)/(a**2) * np.exp(-diameters/a))) / scale))

# load data
currpath = ('path')
os.chdir(currpath)
diameters = scipy.io.loadmat('data.mat')["m1"]

# minimise
x0 = [1, 1]  # initial guesses
res = optimize.minimize(AD, x0, args=diameters, method='Nelder-Mead', tol=1e-6)
print(res.x)
My data vector (here already sorted) comprises a number of diameters in the following form (0.19, 0.19, 0.19, 0.2, 0.21, 0.21, 0.22, 0.22, 0.22, 0.25, 0.27 ...).
First question: Since I'm fairly new to the topic of MLE, is the form of my data vector correct? I'm not completely sure whether I should use a data vector containing every observed diameter (as shown above), a data vector which only contains the "possible" diameters (which would be: 0.19, 0.2, 0.21, 0.22, 0.25, 0.27 ...), or just the frequencies of the observed diameters (which would be: 3, 1, 2, 3, 1, 1 ...). I think the first option is the right one, but I just wanted to be completely sure.
Second question: If I wish to use a cumulative distribution function (CDF) instead of a PDF to perform my MLE on, I would have to change my PDF function to a CDF, right? I was just wondering if I could alternatively somehow modify my data vector and still use the PDF.
However, for the minimisation in Python (if I understood it correctly) I had to rethink the definition of my variables. That means, normally I would assume that the parameters of my PDF (here "a" and "scale") are the variables which should be passed to "args" in "optimize.minimize". However, the documentation states that args should contain the "constant" parameters, so I used my data vector as a constant "parameter vector" for the minimisation.
Third question: Is this assumption an error in reasoning?
Fourth question: Is the optimisation method "Nelder-Mead" appropriate? I'm not really familiar with optimisation methods and not sure which of the options I should use/is the best.
Finally, the program returns the error "TypeError: bad operand type for unary -: 'tuple'", and I have no clue how to deal with it, since I'm not passing any tuples to the minimisation function ...
Fifth question: Where does the tuple come from and how can I solve this error?
I'd appreciate any help you could give me very much!
Best regards!
PS: Since this post is kind of a mixture between general math and programming, I wasn't completely sure if this is the right place to put the question. Sorry if I'm mistaken!
First, apart from the factor in front of the exponential, what we are discussing is generally called maximum likelihood estimation (MLE) for the exponential distribution. It has just been reparameterised in terms of something called a.
We want to estimate this single parameter based on a sample of diameters; there is no scale parameter. Under MLE, we pretend that the sample is fixed and treat the parameter as something that can be varied. We form the likelihood of the sample by taking the product of the density functions (not the cdfs) where each density function is to be calculated for one element of the sample.
(Likelihood is, in concept, like throwing a die twice. In ultra ugly terms, we could say that the likelihood of getting two ones in a row might be (1/6)(1/6).)
We want to maximise this likelihood. However, to make the optimisation problem mathematically and/or computationally tractable, we take the function's logarithm. Since all of its constituent functions are densities less than one, this function must be everywhere less than zero. Thus, the maximisation problem becomes one of minimisation.
If you want to avoid almost all of the algebra then you would:
Write a function to calculate the density function for a given diameter and parameter value.
Write another function that accepts a density function parameter value as its first Python parameter and the sample as its second. Make it call the first function once for each sample value, take the log of each of these, and return the sum of the logs.
Call minimize with the second function as its first argument, some reasonable guess for the density function parameter (in a list) as the second argument, and the sample as args. Nelder-Mead is probably ok.
Edit: In a nutshell:
from scipy.optimize import minimize
from math import exp, log

diameters = [0.19, 0.19, 0.19, 0.2, 0.21, 0.21, 0.22, 0.22, 0.22, 0.25, 0.27]

def pdf(d, a):
    result = d*exp(-d/a)/a**2
    return result

def log_L(a, diameters):
    result = sum(log(pdf(d, a)) for d in diameters)
    return result

res = minimize(log_L, [1], args=diameters)
print(res)
Output:
fun: -337.80985348524604
hess_inv: array([[ 8.71770021e+10]])
jac: array([ -7.62939453e-06])
message: 'Optimization terminated successfully.'
nfev: 93
nit: 30
njev: 31
status: 0
success: True
x: array([ 2157576.39996697])
Addendum:
The Wikipedia article gives the pdf of the exponential distribution in the form f(x) = lambda * exp(-lambda * x).
The constant lambda can be viewed as the value that scales the integral of the remainder of the expression, from zero to infinity, to one. We can ignore it and equate the exponent of your pdf, without the scaling factor, with the exponent of the exponential. We have to remember that d takes the role of x.
Solving for lambda gives lambda = 1/a.
We see that this is the normalising expression in your pdf. In other words, the alpha is an exponential expressed with different parameters.
Here is another approach, assuming that you're analysing data and not simply working out the details of MLE.
scipy provides the means for generating samples from arbitrary distributions. Here I define just the pdf for your alpha. Your parameter a becomes p, because a is used by rv_continuous as the lower limit of the distribution's support, which I define to be zero.
I draw a sample of size 100 with p set, somewhat arbitrarily, to 0.4. I did a little experimentation, trying to find a value that would give me a sample whose lowest 11 values would approximate those in your sample.
The scipy rv_continuous object has a method called fit that will attempt to calculate MLE estimates of location, scale and 'shape'. In this case, the value for the shape, about 0.36, is not all that far from 0.4.
from scipy.stats import rv_continuous
import numpy as np

class Alpha(rv_continuous):
    'alpha distribution'
    def _pdf(self, x, p):
        return x*np.exp(-x/p)/p**2

alpha = Alpha(a=0, shapes='p')
sample = sorted(alpha.rvs(size=100, p=0.4))

for a in sample[:12]:
    print('{:10.2f}'.format(a))

print(Alpha(a=0, shapes='p').fit(sample))
I don't believe that your sample is alpha-distributed. The values seem too 'uniform' compared with what I could generate. But I've been wrong before.
I would suggest plotting your sample cdf to see if you can recognise what it is (a minimal plotting sketch follows the output below).
Incidentally, when I changed the sign of the log-likelihood in the other answer the code croaked. I suspect that the alpha is just a poor fit.
0.00
0.03
0.04
0.04
0.08
0.09
0.09
0.11
0.12
0.14
0.19
0.20
(1.0902616847853124, -0.039102949269294023, 0.35922022997329517)
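For the plotting suggestion above, a minimal empirical-CDF sketch, assuming diameters is the 1-d sample from the question:

import numpy as np
import matplotlib.pyplot as plt

diameters = np.array([0.19, 0.19, 0.19, 0.2, 0.21, 0.21, 0.22, 0.22, 0.22, 0.25, 0.27])

x = np.sort(diameters)
y = np.arange(1, len(x) + 1) / len(x)   # empirical CDF values

plt.step(x, y, where='post')
plt.xlabel('diameter')
plt.ylabel('empirical CDF')
plt.show()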
Background: A ship is berthed at a jetty using 24 mooring lines and 4 fenders. These mooring lines need to be pre-tensioned to a design value by experienced engineers. Pre-tensioning is done by setting the appropriate length of each mooring line. A static simulation is run to obtain the tension in the lines and the compression on the fenders. This is an iterative process, as a small change in mooring line length may cause a significant variation in the tension.
Problem Description:
An objective function is set up to take the mooring line lengths as an input array and return the sum of absolute differences between the target and achieved pretension values.
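For context, such an objective looks roughly like the sketch below; run_static_simulation and the target values are hypothetical stand-ins for the real mooring analysis and design pretensions.

import numpy as np

TARGET_PRETENSIONS = np.full(24, 100.0)   # assumed design pretensions, e.g. in kN

def run_static_simulation(line_lengths):
    # hypothetical stand-in for the real static mooring analysis: a toy linear
    # model in which shortening a line increases its tension
    nominal_lengths = np.linspace(8.0, 50.0, 24)
    return 100.0 + 25.0 * (nominal_lengths - np.asarray(line_lengths))

def objfn(line_lengths):
    achieved = run_static_simulation(line_lengths)
    return float(np.sum(np.abs(TARGET_PRETENSIONS - achieved)))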
Now, I am using the scipy.optimize.minimize function with the following options:
target_wire_lenghts = {'Line1': (48.0, 49.0),'Line2': (48.0, 49.0),'Line3': (45.0,46.0),
'Line4': (10.0,11.0),'Line5': (8.0,9.0),'Line6': (7.0,8.0),
'Line7': (46.0,47.0),'Line8': (48.0,49.0),'Line9': (50.0,51.0),
'Line10': (33.0,34.0),'Line11': (31.0,32.0),'Line12': (29.0,30.0),
'Line13': (32.0,33.0),'Line14': (34.0,35.0),'Line15': (36.0,37.0),
'Line16': (48.0,49.0),'Line17': (46.0,47.0),'Line18': (45.0,46.0),
'Line19': (8.0,9.0), 'Line20': (8.0,9.0), 'Line21': (9.0,10.0),
'Line22': (44.0,45.0),'Line23': (45.0,46.0), 'Line24': (46.0,48.0)}
# Bounds
bounds = list(target_wire_lenghts.values())
# Initial guess
x0 = [np.mean([lo, hi], axis=0) for lo, hi in bounds]
# Options
options = {'ftol': 0.1,
           'xtol': 0.1,
           'gtol': 0.1,
           'maxiter': 100,
           'accuracy': 0.1}
result = minimize(objfn, x0, method = 'TNC', bounds = bounds, options = options)
print(result)
However, the optimizer is not varying the input array; the results are the same as the initial input array x0 (see the length column below). I tried playing around with the optional tolerance parameters of the 'TNC' solver, but do not see any improvement. Also, notice that even though I have set maxiter = 100, the iteration count went to 130.
Please suggest what mistake I am making when calling the minimize function.
EDIT: I figured out that the optimization was running, but it was changing the variables by only 0.000001 at a time. When the option parameter eps (the step size used for numerical approximation of the Jacobian) was set to 0.01, the optimization appeared to work. Unfortunately, it still was not able to reach a reasonable solution. I tried doing an unbounded optimization, with the initial guess x0 very close to the answer (which I found by manually altering each variable), and then the optimizer was able to give a better solution than my manual one.
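For reference, the tweak described in this edit amounts to adding eps to the options, keeping the same objfn, x0 and bounds as above:

from scipy.optimize import minimize

options = {'ftol': 0.1, 'xtol': 0.1, 'gtol': 0.1,
           'maxiter': 100, 'accuracy': 0.1,
           'eps': 0.01}   # coarser step for the finite-difference Jacobian

result = minimize(objfn, x0, method='TNC', bounds=bounds, options=options)
print(result)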
So the question now is: how can a 24-variable optimization be done quickly with a bad initial guess? Could multi-objective optimization be the answer, where reaching each line's pre-tension is a separate objective?