Choosing variables for scipy.optimize from a pre-defined set - python

I am trying to minimize a function with scipy.optimize with three input variables, two of which are bounded and one has to be chosen from a set of values. To ensure that the third variable is chosen from a predefined set of values, I introduced the following constraint:
from scipy.optimize import rosen, shgo
import numpy as np

# Set of values from which the third variable must be chosen
Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def Rosen_Test(x):  # arbitrary objective function
    print(x)
    return rosen(x)**2 - np.sin(x[0])

def Cond_1(x):
    if x[2] in Z:
        return 1
    else:
        return -1

bounds = [(-512, 512),]*3
conds = ({'type': 'ineq', 'fun': Cond_1})

result = shgo(Rosen_Test, bounds, constraints=conds)
print(result)
However, when looking at the printed results from Rosen_Test, it is evident that the condition is not being enforced - perhaps the condition is not defined correctly?
I was wondering if anyone has any ideas to ensure that the third variable can be chosen from a set.
Note: The shgo method was chosen because constraints can be introduced and changed. Also, I am open to using other optimization packages if this condition can be handled.

The inequality constraints do not work like that.
As mentioned in the docs, they are defined as
g(x) <= 0
and you need to write g(x) to work that way. In your case it does not: you are only returning a single scalar for one dimension. You need to return a vector with three dimensions, of shape (3,).
In your case you could try to use the equality constraints instead, as this allows a slightly better hack. But I am still not sure it will work, as these optimizers are not designed for that kind of discrete condition.
And the whole thing will probably leave the optimizer with a rather bumpy and discontinuous objective function. You can read up on Mixed-Integer Nonlinear Programming (MINLP), maybe start here.
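Since Z holds only eleven values, one simple alternative (my own sketch, not part of the equality-constraint hack discussed below) is to treat the discrete variable as a parameter: run the continuous optimization once per value in Z and keep the best result.
from scipy.optimize import rosen, shgo
import numpy as np

Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])

def objective_2d(x01, z):
    # same objective as Rosen_Test, with the third variable fixed to z
    x = np.append(x01, z)
    return rosen(x)**2 - np.sin(x[0])

best = None
for z in Z:
    res = shgo(objective_2d, [(-512, 512)]*2, args=(z,))
    if best is None or res.fun < best[0]:
        best = (res.fun, np.append(res.x, z))

print(best)  # (best objective value, [x0, x1, z])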
There is one more reason why your approach won't work as expected.
As optimizers work with floating point numbers it will likely never find a number in your array when optimizing and guessing new solutions.
This illustrates the issue:
import numpy as np
Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])
print(0.7999999 in Z) # False, this is what the optimizer will find
print(0.8 in Z) # True, this is what you want
Maybe you should try to define your problem in a way that allows you to use an inequality constraint on the whole range of Z.
But let's see how it could work.
An equality constraint is defined as
h(x) == 0
So you could use
def Cond_1(x):
    if x[2] in Z:
        return np.zeros_like(x)
    else:
        return np.ones_like(x) * 1.0  # maybe multiply with some scalar?
The idea is to return an array [0.0, 0.0, 0.0] that satisfies the equality constraint if the number is found. Else return [1.0, 1.0, 1.0] to show that it is not satisfied.
Caveats:
1.) You might have to tune this to return an array like [0.0, 0.0, 1.0] to show the optimizer which dimension you are unhappy about, so it can make better guesses by adjusting only a single dimension.
2.) You might have to return a larger value than 1.0 to signal a non-satisfied equality constraint. This depends on the implementation: the optimizer could think that 1.0 is fine as it is close to 0.0, so maybe you have to try something like [0.0, 0.0, 999.0].
This solves the problem with the dimension, but it will still not find any numbers because of the floating point issue mentioned above.
We can try to hack around that like this:
import numpy as np
Z = np.array([-1, -0.8, -0.6, -0.4, -0.2, 0, 0.2, 0.4, 0.6, 0.8, 1])
def Cond_1(x):
    # how close you want to get to the numbers in your array
    tolerance = 0.001
    delta = np.abs(x[2] - Z)
    print(delta)
    print(np.min(delta) < tolerance)
    if np.min(delta) < tolerance:
        return np.zeros_like(x)
    else:
        # maybe you have to multiply this with some scalar
        # I have no clue how it is implemented
        # we need a value stating to the optimizer "NOT THIS ONE!!!"
        return np.ones_like(x) * 1.0

sol = np.array([0.5123, 0.234, 0.2])
print(Cond_1(sol))    # returns [0. 0. 0.]: constraint satisfied

sol = np.array([0.5123, 0.234, 0.202])
print(Cond_1(sol))    # returns [1. 1. 1.]: constraint not satisfied
Here are some recommendations on optimization. To make sure it works reliably, try to start the optimization at different initial values. Global optimization algorithms might not take initial values when used with boundaries; the optimizer somehow discretizes the space.
What you could do to check the reliability of your optimization and get better overall results (a minimal sketch follows after this list):
Optimize on the complete region [-512, 512] (for all three dimensions)
Try 1/2 of that: [-512, 0] and [0, 512] (8 sub-optimizations, 2 for each dimension)
Try 1/3 of that: [-512, -171], [-171, 170], [170, 512] (27 sub-optimizations, 3 for each dimension)
Now compare the converged results to see if the complete global optimization found the same result
If the global optimizer did not find the "real" minima but the sub-optimizations did:
your objective function is too difficult on the whole domain
try a different global optimizer
tune the parameters (maybe the 999 for the equality constraint)
I often use the sub-optimization as part of the normal process, not only for testing. Especially for blackbox problems.
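A minimal sketch of that sub-region comparison, assuming the same Rosen_Test objective and shgo from above (the splitting helpers are my own, hypothetical names):
import numpy as np
from itertools import product
from scipy.optimize import shgo

def split_bounds(lo, hi, parts):
    # split [lo, hi] into `parts` equal sub-intervals
    edges = np.linspace(lo, hi, parts + 1)
    return [(edges[i], edges[i + 1]) for i in range(parts)]

def sub_optimize(objective, lo, hi, parts, dim=3):
    # run shgo on every combination of sub-intervals and keep the best result
    best = None
    for combo in product(split_bounds(lo, hi, parts), repeat=dim):
        res = shgo(objective, list(combo))
        if best is None or res.fun < best.fun:
            best = res
    return best

# full = shgo(Rosen_Test, [(-512, 512)]*3)                # whole domain
# halves = sub_optimize(Rosen_Test, -512, 512, parts=2)   # 8 sub-optimizations
# thirds = sub_optimize(Rosen_Test, -512, 512, parts=3)   # 27 sub-optimizations
# then compare full.fun with halves.fun and thirds.fun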
Please also see these answers:
Scipy.optimize Inequality Constraint - Which side of the inequality is considered?
Scipy minimize constrained function

Related

Find linear combination of vectors that is the best fit for a target vector

I am trying to find weights across a number of forecasts to give a result that is as close as possible (say, mean squared error) to a known target.
Here is a simplified example showing three different types of forecast across four data points:
target = [1.0, 1.02, 1.01, 1.04] # all approx 1.0
forecasts = [
[0.9, 0.91, 0.92, 0.91], # all approx 0.9
[1.1, 1.11, 1.13, 1.11], # all approx 1.1
[1.21, 1.23, 1.21, 1.23] # all approx 1.2
]
where one forecast is always approximately 0.9, one is always approximately 1.1 and one is always approximately 1.2.
I'd like an automated way of finding weights of approximately [0.5, 0.5, 0.0] for the three forecasts because averaging the first two forecasts and ignoring the third is very close to the target. Ideally the weights would be constrained to be non-negative and sum to 1.
I think I need to use some form of linear programming or quadratic programming to do this. I have installed the Python quadprog library, but I'm not sure how to translate this problem into the form that solvers like this require. Can anyone point me in the right direction?
If I understand you correctly, you want to model some optimization problem and solve it. If you are interested in the general case (without any constraints), your problem is pretty close to a regular least squares problem (which you can solve with scikit-learn, for example).
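For reference, a minimal sketch of that unconstrained baseline with plain NumPy (my own illustration, not part of the answer); it ignores the non-negativity and sum-to-one constraints:
import numpy as np

forecasts = np.array([
    [0.9, 0.91, 0.92, 0.91],
    [1.1, 1.11, 1.13, 1.11],
    [1.21, 1.23, 1.21, 1.23],
])
target = np.array([1.0, 1.02, 1.01, 1.04])

# Solve min_w ||forecasts.T w - target||^2 with no constraints at all
w, residuals, rank, sv = np.linalg.lstsq(forecasts.T, target, rcond=None)
print(w)  # unconstrained weights; they may be negative and need not sum to 1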
I recommend the cvxpy library for modeling the optimization problem. It's a convenient way to model a convex optimization problem, and you can choose which solver runs in the background.
Expanding the cvxpy least squares example by adding the constraints you mentioned:
# Import packages.
import cvxpy as cp
import numpy as np
# Generate data.
m = 20
n = 15
np.random.seed(1)
A = np.random.randn(m, n)
b = np.random.randn(m)
# Define and solve the CVXPY problem.
x = cp.Variable(n)
cost = cp.sum_squares(A @ x - b)
prob = cp.Problem(cp.Minimize(cost), [x>=0, cp.sum(x)==1])
prob.solve()
# Print result.
print("\nThe optimal value is", prob.value)
print("The optimal x is")
print(x.value)
print("The norm of the residual is ", cp.norm(A # x - b, p=2).value)
In this example, A (the matrix) is a matrix of all your vectors, x (the variable) holds the weights, and b is the known target.
EDIT:
example with your data:
forecasts = np.array([
[0.9, 0.91, 0.92, 0.91],
[1.1, 1.11, 1.13, 1.11],
[1.21, 1.23, 1.21, 1.23]
])
target = np.array([1.0, 1.02, 1.01, 1.04])
x = cp.Variable(forecasts.shape[0])
cost = cp.sum_squares(forecasts.T @ x - target)
prob = cp.Problem(cp.Minimize(cost), [x >= 0, cp.sum(x) == 1])
prob.solve()
print("\nThe optimal value is", prob.value)
print("The optimal x is")
print(x.value)
Output:
The optimal value is 0.0005306233766233817
The optimal x is
[ 6.52207792e-01 -1.45736370e-24 3.47792208e-01]
The results are approximately [0.65, 0, 0.35], which is different from the [0.5, 0.5, 0.0] you mentioned, but that depends on how you define your problem. This is the solution with the least squares error.
We can see this problem as a least-squares problem, which is indeed equivalent to quadratic programming. If I understand correctly, the weight vector you are looking for is a convex combination, so in least-squares form the problem is:
minimize || [w0 w1 w2] * forecasts - target ||^2
s.t. w0 >= 0, w1 >= 0, w2 >= 0
w0 + w1 + w2 == 1
There is a least-squares function you can use out of the box in the qpsolvers package:
import numpy as np
from qpsolvers import solve_ls
target = np.array(target)
forecasts = np.array(forecasts)
w = solve_ls(forecasts.T, target, G=-np.eye(3), h=np.zeros(3), A=np.array([1, 1., 1]), b=np.array([1.]))
You can check in the documentation that the matrices G, h, A and b correspond to the problem above. Using quadprog as the backend solver, I get the following solution on my machine:
In [6]: w
Out[6]: array([6.52207792e-01, 9.94041282e-15, 3.47792208e-01])
In [7]: np.dot(w, forecasts)
Out[7]: array([1.00781558, 1.02129351, 1.02085974, 1.02129351])
Which is the same solution as in Roim's answer. (CVXPY is indeed a great way to start!)

Is there a way to optimize function thresholds with fixed steps?

How can I optimize a function with fixed steps? I have developed a function with five thresholds as inputs that I want to optimize. I have tried to optimize them with different solvers, but the steps the solvers take are so tiny that the function never converges to a good solution.
The thresholds vary from 0 to 1, and I want them to move in steps of 0.01. For example, in the case of threshold_0 I want it to go from the initial guess 0.6 to 0.61 or 0.59, etc., depending on the error result.
from scipy import optimize

initial_guess = [0.6, 0.3, 0.6, 0.5, 0.5]

def get_sobel3d_accuracy_from_thresholds(thresholds, array_dicts, ponderation_dict):
    ...
    return error

result = optimize.minimize(
    get_sobel3d_accuracy_from_thresholds,  # function to optimize
    initial_guess,
    args=(array_dicts, ponderation_dict),  # extra fixed args
    method='nelder-mead',
    options={'xatol': 1e-8, 'disp': True})
What I want to get is a solution that minimizes the error returned by get_sobel3d_accuracy_from_thresholds, something like:
optimized_thresholds = [0.61, 0.3, 0.81, 0.52, 0.44]
I would also like to fix boundaries for the thresholds from 0 to 1, but I think that can only be done with some solvers, right?
bounds = [(0, 1) for n in range(0,5)]
Thank you all.
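A minimal sketch of one common workaround (my own illustration, assuming the function and data from the question exist): snap the thresholds to the 0.01 grid inside a wrapper before evaluating, and pass bounds to a solver that supports them.
from scipy import optimize
import numpy as np

def snapped_objective(thresholds, array_dicts, ponderation_dict):
    # hypothetical wrapper: round to the 0.01 grid before evaluating
    snapped = np.round(thresholds, 2)
    return get_sobel3d_accuracy_from_thresholds(snapped, array_dicts, ponderation_dict)

bounds = [(0, 1)] * 5
result = optimize.minimize(
    snapped_objective,
    [0.6, 0.3, 0.6, 0.5, 0.5],
    args=(array_dicts, ponderation_dict),
    method='Powell',   # Powell (and recent Nelder-Mead) accepts bounds
    bounds=bounds)
Note that the rounded objective is piecewise constant, so a derivative-free solver may need larger initial steps to move off a plateau.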

Constraints on search space - python - scikit

I'm searching for a minimum in a 100d space. I'm using gp_minimize from skopt (Python 3.6).
space = [(0., 1.) for _ in range(100)]
res = gp_minimize(f, space)
However, I also have a constraint that value in each subsequent dimension is not larger than in the previous dimensions. For example for the case of 5d, point [1, 0.9, 0.9, 0.8, 0.7] is ok, while point [1, 0.3, 0.5, 0.4, 0.2] is not.
How to add this constraint using skopt?
The best way I found is to modify the function f. Choose an upper bound for f, and everywhere in the domain where f is not supposed to be evaluated, have it return this upper bound.
It is straightforward to see that this is a mathematically sound approach, as it doesn't change the minimum and constrains your search space somewhat in the same way that Lagrange multipliers do. However, I don't know how well it plays with the algorithm, since I don't really know how Bayesian optimization handles wide plateaus.
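A minimal sketch of that wrapper, assuming f is the objective from the question and UPPER_BOUND is a constant chosen larger than any value f takes on the feasible region:
from skopt import gp_minimize

UPPER_BOUND = 1e3  # assumed to exceed f everywhere on the feasible set

def f_constrained(x):
    # infeasible if any coordinate is larger than the previous one
    if any(x[i + 1] > x[i] for i in range(len(x) - 1)):
        return UPPER_BOUND
    return f(x)  # the original objective

space = [(0., 1.) for _ in range(100)]
res = gp_minimize(f_constrained, space)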

Is there a function in tensorflow for doing transformations that a functions of the indices?

I'm looking for (but have been completely unable to find) a function in tensorflow that will allow me to do a 'map' on a tensor.
map
Firstly, I'm not even sure if there is a 'map' function? By this I mean something that lets me apply a given f(x) to every entry in a tensor, e.g. I want something like this:
def f(x):
    return x**2

X = tf.Variable(np.array([[1.0, 2.0],
                          [3.0, 4.0]]))
Y = tf.map_function(X, f)
producing (after suitably running in a session, obviously) a tensor with values
Y = [[1.0, 4.0],
[9.0, 16.0]]
Does this exist (for general f; I realise tf.nn.relu and tf.nn.sigmoid cover some specific cases)? On one hand, it seems like it should, since map is a pretty fundamental operation. On the other hand, it would involve taking the supplied Python function and somehow converting it to be executed on the GPU, and that sounds like something that might not be possible.
Am I asking for the moon on a stick here?
mapi
If such a function exists, is there a version that allows me to use an index-aware f? e.g.
def f(x, i):
    if i != [0, 0]:
        k2 = np.sum([k**2 for k in i])
    else:
        k2 = 1.0  # To avoid division by zero
    return x / k2

X = tf.Variable(np.ones(shape=(2, 3)))
Y = tf.mapi_function(X, f)
producing
Y = [[1.0, 1.0, 0.25],
[1.0, 0.5, 0.2]]
If such functions don't exist, would it be possible (for fixed f) for me to add them by building tensorflow from (slightly modified) source?
Why I need such a function
The reason I'm asking is that I'm trying to use tensorflow to numerically integrate a PDE. As part of that I need to compute the Laplacian (d^2/dx^2 + d^2/dy^2 + d^2/dz^2) u(x,y,z). In a Fourier-transformed representation of the field, u(k_x, k_y, k_z), this involves dividing by k_x^2 + k_y^2 + k_z^2.
I could precompute a tensor of inverse squared wavenumber values and do an element-wise multiply. But this would use up a lot of memory, and I suspect it would also be slower to load those values from memory.
In your specific example of wanting to map individually to each of the x,y,z coordinates, you can accomplish this readily with tf.split() and tf.stack(). That is, I presume you have an input tensor (call it K) that is of size [n,m,...,3]; that is, where the last dimension indexes the x,y,z coordinates. If so, then use tf.split() to break up K into Kx,Ky,Kz. Then apply your map operation (I use tf.map_fn() for this purpose typically), and then finally stack things back together with tf.stack().
If I understand the setup correctly that should do it. If not, please provide a minimal working example that will make the problem concrete; otherwise we are at best guessing at a solution.
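A minimal sketch of that split/map/stack pattern (my own illustration, not code from the answer), assuming a wavenumber tensor K with the x, y, z components on the last axis and the zero mode handled by a safe denominator:
import tensorflow as tf

K = tf.constant([[[1.0, 2.0, 2.0], [0.0, 0.0, 0.0]]])  # shape [1, 2, 3]

# split the last axis into the three components
kx, ky, kz = [tf.squeeze(t, axis=-1) for t in tf.split(K, 3, axis=-1)]

k2 = kx**2 + ky**2 + kz**2                        # element-wise, shape [1, 2]
k2_safe = tf.where(k2 > 0, k2, tf.ones_like(k2))  # avoid dividing by zero

u = tf.ones_like(k2)      # stand-in for the Fourier-space field
u_scaled = u / k2_safe    # the division described in the question

# a general per-element f could use tf.map_fn here instead of plain tensor ops
K_rebuilt = tf.stack([kx, ky, kz], axis=-1)       # back to shape [1, 2, 3]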

Discontinuity in results when using scipy.integrate.quad

I've discovered a strange behavior when using scipy.integrate.quad. This behavior also shows up in Octave's quad function, which leads me to believe that it may have something to do with QUADPACK itself. Interestingly enough, using the exact same Octave code, this behavior does not show up in MATLAB.
On to the question. I'm numerically integrating a function based on the lognormal distribution over various bounds. With F the integrand defined below, a the lower bound and b the upper bound, I find that under some conditions,
integral(F, a, b) = 0 when b is a "very large number," while
integral(F, a, b) = the correct limit when b is np.inf. (or just Inf for Octave.)
Here's some example code to show it in action:
from __future__ import division
import numpy as np
import scipy.stats as stats
from scipy.integrate import quad
# Set up the probability space:
sigma = 0.1
mu = -0.5*(sigma**2) # To get E[X] = 1
N = 7
z = stats.lognorm(sigma, 0, np.exp(mu))
# Set up F for integration:
F = lambda x: x*z.pdf(x)
# An example that appears to work correctly:
a, b = 1.0, 10
quad(F, a, b)
# (0.5199388..., 5.0097567e-11)
# But if we push it higher, we get a value which drops to 0:
quad(F, 1.0, 1000)
# (1.54400e-11, 3.0699e-11)
# HOWEVER, if we shove np.inf in there, we get correct answer again:
quad(F, 1.0, np.inf)
# (0.5199388..., 3.00668e-09)
# If we play around we can see where it "breaks:"
quad(F, 1.0, 500) # Ok
quad(F, 1.0, 831) # Ok
quad(F, 1.0, 832) # Here we suddenly hit close to zero.
quad(F, 1.0, np.inf) # Ok again
What is going on here? Why does quad(F, 1.0, 500) evaluate to approximately the correct thing, but quad(F, 1.0, b) goes to zero for all values 832 <= b < np.inf?
While I'm not exactly familiar with QUADPACK, adaptive integration generally works by increasing resolution until the answer no longer improves. Your function is so close to 0 for most of the interval (with F(10)==9.356e-116) that the improvement is negligible for the initial grid points that quad chooses, and it decides that the integral must be close to 0. Basically, if your data hides in a very narrow subinterval of the range of integration, quad eventually won't be able to find it.
For integration from 0 to inf, the interval obviously cannot be subdivided into a finite number of intervals, so quad will need some preprocessing before computing the integral. For example, a change of variables like y=1/(1+x) would map the interval 0..inf to 0..1. Subdividing that interval will sample more points near zero from the original function, enabling quad to find your data.
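Two practical workarounds for the finite-interval case (my own sketch, assuming F from the question's snippet): split the range by hand so the mass-carrying part gets its own subinterval, or pass the points argument so quad subdivides near it.
from scipy.integrate import quad

# split manually: the integrand only has appreciable mass near x = 1
split_result = quad(F, 1.0, 10)[0] + quad(F, 10, 1000)[0]

# or hint quad where local difficulties occur on the finite interval
hinted_result = quad(F, 1.0, 1000, points=[2.0, 10.0])[0]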
Try lowering the error tolerance:
>>> quad(F, a, 1000, epsabs=1.49e-11)
(0.5199388058383727, 2.6133800952484582e-11)
I guess numerical integration is just sensitive to certain configurations. You can try to debug it by calling quad(..., full_output=1) and analyzing the verbose output carefully. Sorry if the answer is not satisfactory.
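For example, a minimal sketch of pulling that extra diagnostic information out of quad (my own illustration, reusing F from the question):
from scipy.integrate import quad

out = quad(F, 1.0, 1000, full_output=1)
info = out[2]   # infodict; quad may also append a message when it struggles
print(out[0], info['neval'], info['last'])  # estimate, evaluations, subintervals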
