Using LinearOperator as nonlinear constraints in Scipy optimize - python

I'm trying to use the optimization module in SciPy to solve a constrained optimization problem, and I need to implement the 'hess' argument. In SciPy's documentation and tutorial, the Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]]. However, my Hessians are something like [[(-24)*x[0]**2 + 48*x[0]-16, 0], [0, 0]] and [[(-48)*x[0]**2 + 192*x[0]-176, 0], [0, 0]], so I cannot simply use numpy.array for the multiplication. It seems I should pass a LinearOperator object to the 'hess' argument, but the examples of using LinearOperator in both the scipy.optimize tutorial and the LinearOperator documentation are unclear, since they only show low-dimensional cases. How do I use it correctly?
The problem formulation is: maximize x[0] + x[1] subject to
-2*x[0]**4 + 8*x[0]**3 - 8*x[0]**2 + x[1] <= 2
-4*x[0]**4 + 32*x[0]**3 - 88*x[0]**2 + 96*x[0] + x[1] <= 36
0 <= x[0] <= 3, 0 <= x[1] <= 4
my code is:

import numpy as np
from scipy.optimize import Bounds
from scipy.optimize import NonlinearConstraint
from scipy.optimize import minimize

def f(x):
    return (-x[0]-x[1])

def grad(x):
    return np.array([-1, -1])

def hess(x):
    return np.array([[0, 0], [0, 0]])

def cons_f(x):
    return [(-2)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2,
            (-4)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36]

def cons_Jacobian(x):
    return [[(-8)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1],
            [(-16)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]]

def cons_Hessian(x, v):
    # TODO
    return v[0]*[[(-24)*x[0]**2 + 48*x[0]-16, 0], [0, 0]] + v[1]*[[(-48)*x[0]**2 + 192*x[0]-176, 0], [0, 0]]

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 0, jac=cons_Jacobian, hess=cons_Hessian)
bounds = Bounds([0, 0], [3.0, 4.0])
x0 = np.array([0.5, 1])
res = minimize(f, x0, method='trust-constr', jac=grad, hess=hess,
               constraints=[nonlinear_constraint], bounds=bounds)
The cons_Hessian(x, v) is absolutely wrong in my code.
In their example, although the Hessians are simply [[2, 0], [0, 0]] and [[2, 0], [0, 0]], the usage is confusing. I don't understand where p comes in.
from scipy.sparse.linalg import LinearOperator

def cons_H_linear_operator(x, v):
    def matvec(p):
        return np.array([p[0]*2*(v[0]+v[1]), 0])
    return LinearOperator((2, 2), matvec=matvec)

nonlinear_constraint = NonlinearConstraint(cons_f, -np.inf, 1,
                                           jac=cons_J, hess=cons_H_linear_operator)

There's no need to use a LinearOperator. You only need to ensure that cons_f, cons_Jacobian and cons_Hessian return np.ndarrays; that's why your cons_Hessian can't be evaluated. Additionally, it's highly recommended to use float literals instead of integers, i.e. -2.0 instead of -2, to prevent the functions from returning np.ndarrays with an integer dtype.
Your example works for me by writing these functions as follows:
def cons_f(x):
    con1 = (-2.0)*x[0]**4 + 8*x[0]**3 + (-8)*x[0]**2 + x[1] - 2
    con2 = (-4)*x[0]**4 + 32*x[0]**3 + (-88)*x[0]**2 + 96*x[0] + x[1] - 36
    return np.array([con1, con2])

def cons_Jacobian(x):
    con1_grad = [(-8.0)*x[0]**3 + 24*x[0]**2 - 16*x[0], 1]
    con2_grad = [(-16)*x[0]**3 + 96*x[0]**2 - 176*x[0] + 96, 1]
    return np.array([con1_grad, con2_grad])

def cons_Hessian(x, v):
    con1_hess = np.array([[(-24.0)*x[0]**2 + 48*x[0] - 16, 0], [0, 0]])
    con2_hess = np.array([[(-48)*x[0]**2 + 192*x[0] - 176, 0], [0, 0]])
    return v[0]*con1_hess + v[1]*con2_hess
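That said, if you do want the LinearOperator form: the p in matvec is simply the vector the operator is asked to multiply. The solver only ever needs Hessian-vector products, so it calls matvec(p) instead of materializing the full matrix. A minimal sketch for the constraint Hessians above (equivalent to the dense version; cons_Hessian_op is a name made up for illustration):

from scipy.sparse.linalg import LinearOperator

def cons_Hessian_op(x, v):
    # H(x, v) = v[0]*H1(x) + v[1]*H2(x); only the (0, 0) entry is nonzero
    h00 = v[0]*((-24.0)*x[0]**2 + 48*x[0] - 16) + v[1]*((-48.0)*x[0]**2 + 192*x[0] - 176)

    def matvec(p):
        # return H @ p without ever building H
        return np.array([h00*p[0], 0.0])

    return LinearOperator((2, 2), matvec=matvec)

Passing hess=cons_Hessian_op to NonlinearConstraint should then behave the same as the dense np.ndarray version.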

Related

Is there a workaround for dividing two variables in a DCP optimization problem?

I am new to the world of CVXPY and have run into an issue with an optimization problem. In it I need to derive an expression based on the optimized variable and divide the variable by that expression - simplified, but like this:
import cvxpy as cvx
x = cvx.Variable(1,nonneg=True)
y = cvx.sqrt(x)
print("y is DCP:" + str(y.is_dcp()))
z = x/y # y is x dependent so not dcp
print("z is DCP:" + str(z.is_dcp()))
objective = cvx.Maximize(cvx.sum(z))
probl = cvx.Problem(objective, [x<=10])
probl.solve(verbose=True)
Looking at the rules for DCP optimization, I realize that variable/variable division is not DCP. My question is therefore whether someone has a solution or workaround for this issue?
Putting a constant in place of y in z obviously fixes the issue. However, I need to optimize on an expression based on the variable. Is there a way to do this?
I added the example above for simplicity, but my problem is more in line with the following:
import numpy as np
import cvxpy as cvx
import warnings
warnings.simplefilter(action='ignore')
ratio = np.array([-1.95844359, -7.14519994, 0.08811825, 2.92089828, 2.87685278,
                  -3.13022284, -1.12513724, 3.72449473, -2.68733876, 2.31347068,
                  4.06927235, -5.38002868, 2.18026303, -2.95228569, -7.00564848,
                  -3.19870931, -2.1249305])
category = np.array([[0, 0, 1, 0],
                     [0, 0, 0, 1],
                     [0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 1, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0],
                     [1, 0, 0, 0],
                     [0, 0, 1, 0],
                     [0, 0, 0, 1],
                     [0, 0, 1, 0],
                     [0, 0, 1, 0]])
x = cvx.Variable(17, nonneg=True)
constraints = [cvx.sum(x) == 1]
constraints += [cvx.max(x.T*category) <= 0.34]
x2 = x.copy()
category_weight = x2.T*category  # category weights
category_weight.is_dcp()
category_weight_x = category_weight*category.T
category_weight_x.is_dcp()  # category weight for each x
category_weight_x = cvx.sum(category_weight_x, axis=1)
# sum over rows to get (len(x),)
category_weight_x_inv = cvx.inv_pos(category_weight_x)
category_weight_x_inv.is_dcp()  # 1/n
# PROBLEM:
x_category_weight = x2/category_weight_x  # category_weight_x is not constant - not allowed!
x_category_weight.is_dcp()
#
ratio_weighted_opt = ratio*x_category_weight.T  # get ratio value for x in category
ratio_category_opt = ratio_weighted_opt.T*category  # split ratio to category columns
ratio_category_opt_cap = cvx.pos(ratio_category_opt)  # set negative to 0
ratio_category_opt_cap.is_dcp()
ratio_category_opt_cap = cvx.pos(1 - ratio_category_opt)  # set bigger than 1 to 1
ratio_category_opt_cap += 1
ratio_category_opt_cap.is_dcp()
ratio_category_opt_cap_category = ratio_category_opt_cap*category_weight  # multiply with category weight to total
objective = cvx.Maximize(cvx.sum(ratio_weighted_opt))
probl = cvx.Problem(objective, constraints)
probl.solve(verbose=True)
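Note that the simplified example at the top does admit a DCP rewrite, since x/sqrt(x) is mathematically just sqrt(x) for nonnegative x. This is only a sketch for that toy case and does not resolve the general variable/expression division, which the DCP rules disallow:

import cvxpy as cvx

x = cvx.Variable(1, nonneg=True)
z = cvx.sqrt(x)  # x / sqrt(x) simplifies to sqrt(x), which is concave and DCP
objective = cvx.Maximize(cvx.sum(z))
probl = cvx.Problem(objective, [x <= 10])
probl.solve(verbose=True)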

Factoring a polynomial with respect to specific terms

import numpy as np
import sympy as sp
from sympy import *
init_printing()
uVars = list(symbols(', '.join([f'u{n}' for n in range(1, 3 + 1)])))
aVars = list(symbols(', '.join([f'a{n}' for n in range(1, 3 + 1)])))
lambda1, mu = symbols('lambda, mu')
U = np.array([ [0, -uVars[2], uVars[1]], [uVars[2], 0, -uVars[0]], [-uVars[1], uVars[0], 0] ])
a = np.array([ [aVars[0], 0, 0], [0, aVars[1], 0], [0, 0, aVars[2]] ])
I = np.eye(3)
L = a*lambda1 + U
preCharPoly = L - mu*I
preCharPoly_sym = sp.Matrix(preCharPoly)
factor(preCharPoly_sym.det())
The above code outputs the following polynomial:
However, I require the polynomial to be factored with respect to the variables lambda and mu as shown here:
I have been examining the documentation at https://docs.sympy.org/latest/modules/simplify/simplify.html but cannot figure out how to do what is desired. How do I specify factor() or simplify() to perform their tasks with respect to lambda and mu?
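If the goal is to group terms by powers of lambda and mu rather than to fully factor, sympy's collect is probably the right tool. A minimal sketch building on the code above (collecting over both symbols is an assumption about the desired output):

charPoly = sp.expand(preCharPoly_sym.det())
grouped = sp.collect(charPoly, [lambda1, mu])  # group terms by powers of lambda and mu
print(grouped)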

Python-How to multiply matrix with symbols and 0s

I am brand new to python, but is there any way to multiply matrices with both 0's and symbols? For example, see below:
import sympy as sym
import numpy as np

teams = np.matrix([[1, 2], [3, 4]])
teams = teams - 1
n = 4
x, a, b = sym.symbols('x a b')
X = np.empty((n, n), dtype=object)
Y = np.empty((n, n), dtype=object)
Z = np.empty((n, n), dtype=object)
for i in range(n):
    for j in range(n):
        if j == i:
            X[i, j] = x
        elif [i, j] in teams.tolist():
            Y[i, j] = a
        elif [j, i] in teams.tolist():
            Y[i, j] = a
        else:
            Z[i, j] = b
for i in range(n):
    for j in range(n):
        if X[i, j] is None:
            X[i, j] = 0
        if Y[i, j] is None:
            Y[i, j] = 0
        if Z[i, j] is None:
            Z[i, j] = 0
print(np.matmul(X, Y))
TypeError                                 Traceback (most recent call last)
<ipython-input-189-00b753462a2d> in <module>
      2 print(Y)
      3 print(Z)
----> 4 print(np.matmul(X,Y))

TypeError: ufunc 'matmul' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
I know why it is messing up: I am trying to multiply a symbol by a number. But I was wondering if there is any way to make this recognize that a symbol times 0 is just zero, and should be disregarded when added to another symbol.
The problem isn't specifically with the symbols, but with the object dtype. matmul doesn't (or didn't) work with object dtype arrays. The fast version uses BLAS library functions, which only work with C numeric types - float and integers. np.dot does have a slower branch that does work with non-numeric dtypes.
In an isympy session:
In [4]: X
Out[4]:
array([[x, 0, 0, 0],
       [0, x, 0, 0],
       [0, 0, x, 0],
       [0, 0, 0, x]], dtype=object)

In [5]: Y
Out[5]:
array([[0, a, 0, 0],
       [a, 0, 0, 0],
       [0, 0, 0, a],
       [0, 0, a, 0]], dtype=object)

In [6]: np.dot(X,Y)
Out[6]:
array([[0, a*x, 0, 0],
       [a*x, 0, 0, 0],
       [0, 0, 0, a*x],
       [0, 0, a*x, 0]], dtype=object)
BUT, matmul does work for me. I wonder if that's because of my numpy version?
In [7]: np.matmul(X,Y)
Out[7]:
array([[0, a*x, 0, 0],
       [a*x, 0, 0, 0],
       [0, 0, 0, a*x],
       [0, 0, a*x, 0]], dtype=object)

In [8]: np.__version__
Out[8]: '1.17.4'
As a general rule mixing sympy and numpy is not a good idea. numpy arrays containing symbols are necessarily object dtype. Math on object dtype depends on delegating the action to methods. The result is hit-or-miss. Multiplication and addition may work (x+x), but np.sin does not, because x.sin() fails. It's best to use sympy.lambdify if you want to use sympy expressions in numpy. Otherwise, try to use pure sympy.
In [12]: X*X
Out[12]:
array([[x**2, 0, 0, 0],
       [0, x**2, 0, 0],
       [0, 0, x**2, 0],
       [0, 0, 0, x**2]], dtype=object)

In [13]: np.sin(X)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
AttributeError: 'Symbol' object has no attribute 'sin'
===
From the numpy 1.17.0 release notes:
Support of object arrays in matmul
It is now possible to use matmul (or the @ operator) with object arrays. For instance, it is now possible to do:

from fractions import Fraction
a = np.array([[Fraction(1, 2), Fraction(1, 3)], [Fraction(1, 3), Fraction(1, 2)]])
b = a @ a
Whenever you are working with symbolic math, you should leave out numpy and keep everything inside sympy. Numpy doesn't understand sympy's symbols. You can get lucky a few times with multiplying by zero, but it doesn't make much sense in general. Numpy works with arrays of numbers, preferably everything of the same type.
However, you can use lambdify to bridge the gap and convert sympy expressions to be used by numpy.
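For instance, a minimal lambdify sketch (the expression here is only an illustration):

import numpy as np
import sympy as sym

x = sym.symbols('x')
expr = x**2 + sym.sin(x)            # a symbolic expression
f = sym.lambdify(x, expr, 'numpy')  # compile it into a numpy-aware function
print(f(np.linspace(0.0, 1.0, 5)))  # evaluates elementwise on a float array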
Here is your code with sympy's matrices:

import sympy as sym

teams = sym.Matrix([[1, 2], [3, 4]])
teams = teams - sym.ones(2, 2)
n = 4
x, a, b = sym.symbols('x a b')
X = sym.zeros(n, n)
Y = sym.zeros(n, n)
Z = sym.zeros(n, n)
for i in range(n):
    for j in range(n):
        if j == i:
            X[i, j] = x
        elif [i, j] in teams.tolist() or [j, i] in teams.tolist():
            Y[i, j] = a
        else:
            Z[i, j] = b
for i in range(n):
    for j in range(n):
        if X[i, j] is None:
            X[i, j] = 0
        if Y[i, j] is None:
            Y[i, j] = 0
        if Z[i, j] is None:
            Z[i, j] = 0
print(X * Y)
Result:
Matrix([[0, a*x, 0, 0],
        [a*x, 0, 0, 0],
        [0, 0, 0, a*x],
        [0, 0, a*x, 0]])
I tested your code with print(np.dot(X,Y)) instead of print(np.matmul(X,Y)) and it worked. According to the documentation, np.matmul is preferred over np.dot for matrix multiplication, but I wasn't able to get it working with np.matmul. I tried np.matmul(X, Y, casting='unsafe'), but the same error resulted. I don't think the error is caused by adding 0 or multiplying by 0; sympy is able to do those simplifications.
E.g.
x = sym.symbols('x')
print(x + 0)
print(x*0)
print(3*x + 5*x)
returns, just as expected, x, 0 and 8*x.
Hopefully this helps you out.

Extract sub arrays based on kernel in numpy

I would like to know if there is an efficient method to get sub-arrays from a larger numpy array.
What I have is an application of np.where. I iterate 'manually' over x and y as offsets and apply where with a kernel to each rectangle extracted from the larger array with proper dimensions.
But is there a more direct approach in numpy's collection of methods?
import numpy as np

example = np.arange(20).reshape((5, 4))

# e.g. a cross kernel
a_kernel = np.asarray([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
np.where(a_kernel, example[1:4, 1:4], 0)
# returns
# array([[ 0,  6,  0],
#        [ 9, 10, 11],
#        [ 0, 14,  0]])

def arrays_from_kernel(a, a_kernel):
    width, height = a_kernel.shape
    y_max, x_max = a.shape
    return [np.where(a_kernel, a[y:(y + height), x:(x + width)], 0)
            for y in range(y_max - height + 1)
            for x in range(x_max - width + 1)]

sub_arrays = arrays_from_kernel(example, a_kernel)
This returns the arrays I need for further processing.
# [array([[0, 1, 0],
#         [4, 5, 6],
#         [0, 9, 0]]),
#  array([[ 0,  2,  0],
#         [ 5,  6,  7],
#         [ 0, 10,  0]]),
#  ...
#  array([[ 0,  9,  0],
#         [12, 13, 14],
#         [ 0, 17,  0]]),
#  array([[ 0, 10,  0],
#         [13, 14, 15],
#         [ 0, 18,  0]])]
The context: similar to 2D convolution I would like to apply a custom function on each of the subarrays (e.g. product of squared numbers).
At the moment, you're manually advancing a sliding window over the data - stride tricks to the rescue! (And no, I didn't just make that up - there's actually a submodule called stride_tricks in numpy!) Instead of manually building windows into the data and calling np.where() on them, if you had the windows in an array, you could call np.where() just once. Stride tricks allow you to create such an array without even having to copy the data.
Let me explain. Normal slices in numpy create views into the original data instead of copies. This is done by referring to the original data, but changing the strides used to access the data (i.e. how much to jump between two elements or two rows, and so on). Stride tricks allow you to modify those strides more freely than just slicing and reshaping does, so you can, e.g., iterate over the same data more than once, which is useful here.
Let me demonstrate:
import numpy as np

example = np.arange(20).reshape((5, 4))
a_kernel = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])

def sliding_window(data, win_shape, **kwargs):
    assert data.ndim == len(win_shape)
    shape = tuple(dn - wn + 1 for dn, wn in zip(data.shape, win_shape)) + win_shape
    strides = data.strides * 2
    return np.lib.stride_tricks.as_strided(data, shape=shape, strides=strides, **kwargs)

def arrays_from_kernel(a, a_kernel):
    windows = sliding_window(a, a_kernel.shape)
    return np.where(a_kernel, windows, 0)

sub_arrays = arrays_from_kernel(example, a_kernel)
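As a side note, on numpy 1.20+ the same windows can be built with the safer np.lib.stride_tricks.sliding_window_view instead of as_strided; a sketch:

import numpy as np

example = np.arange(20).reshape((5, 4))
a_kernel = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])

# windows has shape (3, 2, 3, 3): one 3x3 view per valid offset
windows = np.lib.stride_tricks.sliding_window_view(example, a_kernel.shape)
sub_arrays = np.where(a_kernel, windows, 0)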
The scipy.ndimage module offers a number of filters -- one of which might meet your needs. If none of those filters do what you want, you could use ndimage.generic_filter to call a custom function on each subarray. ndimage.generic_filter is not as fast as the other ndimage filters, however.
For example,
import numpy as np
import scipy.ndimage as ndimage

example = np.arange(20).reshape((5, 4))
a_kernel = np.asarray([[0, 1, 0], [1, 1, 1], [0, 1, 0]])

# def arrays_from_kernel(a, a_kernel):
#     width, height = a_kernel.shape
#     y_max, x_max = a.shape
#     return [np.where(a_kernel, a[y:(y + height), x:(x + width)], 0)
#             for y in range(y_max - height + 1)
#             for x in range(x_max - width + 1)]
# sub_arrays = arrays_from_kernel(example, a_kernel)
# for arr in sub_arrays:
#     print(arr)
#     print('-'*80)

def func(x):
    # reject subarrays that extend beyond the border of the `example` array
    if not np.isnan(x).any():
        y = np.zeros_like(a_kernel, dtype=example.dtype)
        np.put(y, np.flatnonzero(a_kernel), x)
        print(y)
    # Instead of returning 0, you can perform your desired computation on the subarray here.
    # Note that you may not need the 2D array y; often, you only need the values in the 1D array x.
    return 0

result = ndimage.generic_filter(example, func, footprint=a_kernel, mode='constant', cval=np.nan)
For the particular problem of computing the product of squares for each subarray, you could convert the product into a sum by taking advantage of the fact that A * B = exp(log(A) + log(B)) (assuming positive entries). This lets you express the computation as a normal convolution, and using ndimage.convolve can improve performance a lot. The amount of the improvement depends on the size of example:
import numpy as np
import scipy.ndimage as ndimage
import perfplot

a_kernel = np.asarray([[0, 1, 0], [1, 1, 1], [0, 1, 0]])

def orig(example, a_kernel=a_kernel):
    def arrays_from_kernel(a, a_kernel):
        width, height = a_kernel.shape
        y_max, x_max = a.shape
        return [
            np.where(a_kernel, a[y : (y + height), x : (x + width)], 1)
            for y in range(y_max - height + 1)
            for x in range(x_max - width + 1)
        ]
    return [np.prod(x) ** 2 for x in arrays_from_kernel(example, a_kernel)]

def alt(example, a_kernel=a_kernel):
    logged = np.log(example)
    result = ndimage.convolve(logged, a_kernel, mode="constant", cval=0)[1:-1, 1:-1]
    return (np.exp(result) ** 2).ravel()

def make_example(N):
    return np.random.random(size=(N, N))

def check(A, B):
    return np.allclose(A, B)

perfplot.show(
    setup=make_example,
    kernels=[orig, alt],
    n_range=[2 ** k for k in range(2, 11)],
    logx=True,
    logy=True,
    xlabel="len(example)",
    equality_check=check,
)

Python linprog minimization--simplex method

I'm using scipy.optimize.linprog library to calculate the minimization using the simplex method. I'm working on this problem in my textbook and I'm hoping someone can point me in the right direction because I'm not getting the output I expect. The problem is:
Minimize w = 10*y1 + 15*y2 + 25*y3
Subject to: y1 + y2 + y3 >= 1000
y1 - 2*y2 >= 0
y3 >= 340
with y1 >= 0, y2 >= 0
The code I wrote for this is:
import numpy as np
from scipy.optimize import linprog

A = np.array([
    [1,  1, 1],
    [1, -2, 0],
    [0,  0, 1]])
b = np.array([1000, 0, 340])
c = np.array([-10, -15, -25])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
print('Optimal value:', res.fun, '\nX:', res.x)
Which gives the output:
Optimal value: -18400.0
X: [ 0. 660. 340.]
I expect it to be:
Optimal value: -15100.0
X: [ 660. 0. 340.]
I can't seem to find consistency with this function but maybe it's the way I'm using it.
You've set up the inputs slightly wrong; see the manual. Specifically, you have a number of sign errors.
Your vector c has the wrong sign; linprog minimizes c x so c should just be the coefficients in w = c x
Your vector b and matrix A have the wrong sign. Their signs should be inverted to switch from your form of constraint f(x) >= const to the desired form for the linprog method, which is a less-than-or-equal, i.e. -f(x) <= - const
You are missing the final two constraints.
Your proposed minimum is < 0, which is obviously impossible: w = 10*y1 + 15*y2 + 25*y3 is always non-negative under your constraints, since y1, y2, y3 >= 0.
The correct code reads:

import numpy as np
from scipy.optimize import linprog

A = np.array([[-1, -1, -1], [-1, 2, 0], [0, 0, -1], [-1, 0, 0], [0, -1, 0]])
b = np.array([-1000, 0, -340, 0, 0])
c = np.array([10, 15, 25])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
print('Optimal value:', res.fun, '\nX:', res.x)
# python2
# ('Optimal value:', 15100.0, '\nX:', array([ 660., 0., 340.]))
# python3
# Optimal value: 15099.999961403426
# X: [6.59999996e+02 1.00009440e-07 3.40000000e+02]
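As a quick sanity check, the reported optimum can be re-evaluated against the objective and constraints:

y = np.array([660.0, 0.0, 340.0])
print(c @ y)               # 15100.0 = 10*660 + 15*0 + 25*340
print((A @ y <= b).all())  # True: all inequality constraints hold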
As the positivity of y1 and y2 is already guaranteed by bounds=(0, None), a simplified version of the code is shown below:
import numpy as np
from scipy.optimize import linprog

A = np.array([[-1, -1, -1], [-1, 2, 0], [0, 0, -1]])
b = np.array([-1000, 0, -340])
c = np.array([10, 15, 25])

res = linprog(c, A_ub=A, b_ub=b, bounds=(0, None))
print('Optimal value:', res.fun, '\nX:', res.x)
Output:
Optimal value: 15099.999961403195
X: [6.59999996e+02 1.00009440e-07 3.40000000e+02]
