Python - Optimizing a non-convex function that is "convex" when adding constraints

Suppose I have an objective function f(x) that is non-convex (x can be a vector), but once I add constraints it becomes a convex problem. To show what I mean, consider this trivial example: let f(x) = cos(x). Clearly, cos(x) is not convex, but if I only consider x in [pi/2, 3*pi/2], then the function is convex when restricted to these values.
CVXPY does not accept such a problem because it does not satisfy the DCP rules. One option, in the previous example, is to minimize f(x) + g(x), where g is an indicator function such that g(x) = 0 for x in [pi/2, 3*pi/2] and g(x) = +infinity otherwise, but I don't know how to implement that in CVXPY. What can I use in Python to take advantage of the "convexity" of the problem?
Thanks.
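One way to exploit the restriction (a minimal sketch on my part, not part of the original post): since the DCP rules cannot certify cos(x) on a sub-interval, drop CVXPY and hand the bound-constrained problem to a smooth local solver such as scipy.optimize.minimize. On the restricted interval the problem is convex, so any local minimizer the solver finds is also the global one.

import numpy as np
from scipy.optimize import minimize

# f(x) = cos(x), minimized over the interval on which it is convex.
res = minimize(lambda x: np.cos(x[0]),
               x0=[2.0],                             # any starting point inside the interval
               bounds=[(np.pi / 2, 3 * np.pi / 2)],  # the restriction that makes f convex
               method="L-BFGS-B")
print(res.x, res.fun)                                # approximately x = pi, f(x) = -1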

Related

Imposing monotonicity with scipy.optimize.minimize

I am trying to minimize a function of a vector of length 20, but I want to constrain the solution to be monotonic, i.e.
x[1] <= x[2] <= ... <= x[20]
I have tried to implement this in the following way using "constraints" for this routine:
cons = tuple([{'type':'ineq', 'fun': lambda x: x[i]- x[i-1]} for i in range(1, len(node_vals))])
res = sp.optimize.minimize(localisation, b, args=(d), constraints = cons) #optimize
However, the results I get are not monotonic, even when the initial guess b is; it seems the optimizer is completely ignoring the constraints. What could be going wrong? I have also tried changing the constraint to x[i]**3 - x[i+1]**3 to make it "smoother", but it didn't help at all. My objective function, localisation, is the integral of the solution to an eigenvalue problem whose parameters are defined beforehand:
def localisation(node_vals, domain): #calculate localisation for solutions with piecewise linear grading
    f = piecewise(node_vals, domain) #create piecewise linear function using given values at nodes
    #plt.plot(domain, f(domain))
    M = diff_matrix(f(domain)) #differentiation matrix created from piecewise linear function
    m = np.concatenate(([0], get_solutions(M)[1][:, 0], [0]))
    integral = num_int(domain, m)
    return integral
You didn't post a minimal reproducible example that we can run. However, did you try to specify which optimization algorithm SciPy should use? Something like this:
res = sp.optimize.minimize(localisation, b, args=(d), constraints=cons, method='SLSQP')
I'm having a very similar problem but with additional upper and lower bounds on the monotonicity property. I'm tackling the problem like this (maybe it helps you):
Use the trust-region constrained algorithm provided by SciPy (method='trust-constr'). It gives us a way of dealing with linear constraints in matrix form:
lb <= A.dot(x) <= ub
where lb and ub are the lower and upper bounds of the constraint problem and A is the matrix representing the linear constraints.
Every row of the matrix A is a linear term which defines one constraint.
If, for example, you require x[0] <= x[1], this can be rewritten as x[0] - x[1] <= 0, which corresponds to a row [1, -1, 0, ...] in A, provided the upper bound vector has a 0 in the corresponding entry (the mirrored formulation with the lower bound also works; having at least one of the two bounds per row keeps this easy).
Setting up enough of these inequalities, and possibly merging a couple of them into a single one, gives a matrix sufficient to solve the problem; a sketch follows below.
Hope this helps a bit; it did the job for my problem.
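A minimal sketch of this approach (the objective and sizes are illustrative, not from the original posts): build one row of A per adjacent pair to enforce x[0] <= x[1] <= ... <= x[n-1], wrap it in a LinearConstraint, and solve with method='trust-constr'.

import numpy as np
from scipy.optimize import minimize, LinearConstraint

n = 5
A = np.zeros((n - 1, n))
for i in range(n - 1):
    A[i, i] = 1.0        # row encodes x[i] - x[i+1] <= 0, i.e. x[i] <= x[i+1]
    A[i, i + 1] = -1.0
monotone = LinearConstraint(A, lb=-np.inf, ub=0.0)

target = np.array([3.0, 1.0, 4.0, 1.0, 5.0])          # toy objective: least-squares fit
objective = lambda x: np.sum((x - target) ** 2)

res = minimize(objective, x0=np.zeros(n), method="trust-constr",
               constraints=[monotone])
print(res.x)             # a monotone approximation of target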

What is the best way to find the inverse image of a function in Python?

So I have a function that takes some constants and the value of my variable c and returns the values of my variables x and y, like this:
def fun(*constants, c):
    # Calculates some stuff to get x and y
    return x, y

(x, y) = fun(*constants, c=c)
All variables are real numbers
c lies between 0 and a positive value cmax
The x,y points are ordered with respect to c
The function produces a curve that is continuous in the x-y plane
What is the best way to approximate the value of c given a specific value of y?
[Edited]
Tim Roberts suggests using scipy.optimize.fsolve, and this almost works for me. Is there a way to tell fsolve to look only for roots within a given range of c, in my case between 0 and cmax?
from scipy.optimize import fsolve

def fun(*constants, c):
    # Calculates some stuff to get x and y
    return x, y

def func(c):
    return fun(*constants, c=c)[1] - y_objective

guess0 = cmax / 2
y_objective = 10
c_wanted = fsolve(func, [guess0])
print(c_wanted)
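One way to keep the root search inside [0, cmax] (my suggestion, not part of the original post) is a bracketing root finder such as scipy.optimize.brentq, which only evaluates the function inside the given interval. This assumes func changes sign between 0 and cmax, i.e. the curve crosses y_objective inside that range; the fun below is only a stand-in for the real one.

import numpy as np
from scipy.optimize import brentq

cmax = 1.5
y_objective = 10.0

def fun(c):
    return c, 20.0 * np.sin(c)       # stand-in returning (x, y)

def func(c):
    return fun(c)[1] - y_objective

c_wanted = brentq(func, 0.0, cmax)   # the root is constrained to [0, cmax]
print(c_wanted)                      # ~0.5236 (= pi/6) for this toy curve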
The question as stated is quite broad and can delve into some deep mathematical results. I will attempt to answer your question as reasonably as possible below.
The set of assumptions you listed is, AFAICT, not strong enough to guarantee that an inverse exists, even in a neighborhood of some region of interest.
However, let us instead assume that the conditions required by the inverse function theorem hold (see https://en.wikipedia.org/wiki/Inverse_function_theorem). The IFT gives a formula for the derivative of the inverse within a region where the conditions hold. You can then use the fundamental theorem of calculus to compute the inverse function in this region. See https://en.wikipedia.org/wiki/Fundamental_theorem_of_calculus.
The integration will need to be done either symbolically (very advanced) or numerically using quadrature; see https://en.wikipedia.org/wiki/Numerical_integration. A rough numerical sketch follows below.
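One rough numerical rendering of that idea (all names here are illustrative stand-ins, not from the original post): the inverse function theorem gives dc/dy = 1 / y'(c), which can be integrated as an ODE from a known point (y(c0), c0) up to the target y value.

import numpy as np
from scipy.integrate import solve_ivp

def y_of_c(c):
    return np.sin(c)                 # stand-in for fun(*constants, c=c)[1]

def dy_dc(c, h=1e-6):
    return (y_of_c(c + h) - y_of_c(c - h)) / (2 * h)   # finite-difference derivative

def invert(y_target, c0=0.0):
    # dc/dy = 1 / y'(c); integrate in y from y(c0) to y_target.
    y0 = y_of_c(c0)
    sol = solve_ivp(lambda y, c: 1.0 / dy_dc(c), (y0, y_target), [c0], rtol=1e-8)
    return sol.y[0, -1]

print(invert(0.5))                   # ~0.5236, i.e. arcsin(0.5)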

How do I understand the gradient for complex functions that do not satisfy the Cauchy-Riemann equations

Let us suppose that my function f : C -> C is
f(z) = x - i*y, where z = x + i*y.
Now here the real part is
u(x, y) = x
and the imaginary part is
v(x, y) = -y
So, when we take the partial derivatives, we find
d_u_x(x,y) = 1 # derivative of u wrt x
d_u_y(x,y) = 0
d_v_x(x, y) = 0
d_v_y(x, y) = -1
So here,
d_u_x != d_v_y
and thus the function does not satisfy the Cauchy-Riemann equations.
But then comes Wirtinger calculus, which says I could write my function in terms of z and z.conj(), using
x = ((x + i*y) + (x - i*y))/2
  = (z + z.conj())/2
y = ((x + i*y) - (x - i*y))/(2*i)
  = (z - z.conj())/(2*i)
But what comes after this? How do I find the gradient?
Also, in PyTorch, what is the correct way to specify such a function? If I do:
import torch
a = torch.randn(1, dtype=torch.cfloat, requires_grad=True)
f = a.conj()
f.backward()
print(a.grad)
Is this a correct way?
You may find the following page of interest:
When you use PyTorch to differentiate any function f(z) with complex domain and/or codomain, the gradients are computed under the assumption that the function is a part of a larger real-valued loss function g(input)=L. The gradient computed is ∂L/∂z* (note the conjugation of z), the negative of which is precisely the direction of steepest descent used in Gradient Descent algorithm. Thus, all the existing optimizers work out of the box with complex parameters.
This convention matches TensorFlow’s convention for complex differentiation, but is different from JAX (which computes ∂L/∂z).
If you have a real-to-real function which internally uses complex operations, the convention here doesn’t matter: you will always get the same result that you would have gotten if it had been implemented with only real operations.
...
For optimization problems, only real valued objective functions are used in the research community since complex numbers are not part of any ordered field and so having complex valued loss does not make much sense.
It also turns out that no interesting real-valued objective fulfills the Cauchy-Riemann equations. So the theory of holomorphic functions cannot be used for optimization, and most people therefore use Wirtinger calculus.
https://pytorch.org/docs/stable/notes/autograd.html
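As a small illustration of the convention quoted above (a toy example, not taken from the PyTorch docs): build a real-valued loss out of complex operations, call backward(), and z.grad then holds the documented dL/dz* gradient, which existing optimizers can consume directly.

import torch

z = torch.randn(1, dtype=torch.cfloat, requires_grad=True)
loss = (z * z.conj()).real.sum()   # real-valued loss L = |z|^2
loss.backward()
print(z.grad)                      # dL/dz* in the convention quoted above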

SciPy multivariate optimization with summation bound

I am attempting to perform a multivariate optimization using scipy.optimize.minimize with a constraint, but the constraint is not on each individual variable; rather, it is on the summation of variables.
Here is the quadratic objective:
f(A) = sum over (x, y) in S of (x - y)^T A (x - y)
where A is a symmetric m-by-m matrix (m is the dimensionality of the points x and y).
The derivative of this function is very nice; A vanishes completely, making the gradient a constant that I can precompute. This is the gradient:
df/dA = sum over (x, y) in S of (x - y)(x - y)^T
Here's the Python code I'm using to perform the optimization:
retval = scipy.optimize.minimize(f, A.flatten(),
                                 args=(S, dAi.flatten(), A.shape[0]),
                                 jac=True, method='SLSQP')
where A is the matrix (flattened), S is the set containing pairs of points x and y, and dAi is the precomputed gradient matrix (also flattened). The objective function f looks like this:
def f(A, S, dfA, k):
    A = A.reshape((k, k))
    return [np.sum([np.dot(x - y, A).dot(x - y) for x, y in S]), dfA]
However, this implementation spins off into infinity and never completes. I haven't been able to specify the summation constraint anywhere because the optimization method expects either bounds or inequality constraints on each variable, rather than on an aggregation.
Is there a way to do this that I'm missing? This question seemed close but never got a solution. This question involves multivariate optimization but was just an issue of an incorrect derivation, and this question seems analogous to my problem but involves Pandas, which I'm not using.
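The original post does not state what the summation bound actually is, so as an illustration assume the flattened entries of A must sum to 1. SciPy's constraint dictionaries accept any function of the full variable vector, so an aggregate constraint can be expressed directly; elementwise bounds are added here only because the objective is linear in A, and the toy problem would otherwise be unbounded.

import numpy as np
from scipy.optimize import minimize

k = 3
rng = np.random.default_rng(0)
S = [(rng.standard_normal(k), rng.standard_normal(k)) for _ in range(10)]  # toy data

def f(A_flat, S, k):
    A = A_flat.reshape((k, k))
    return sum(np.dot(x - y, A).dot(x - y) for x, y in S)

# Aggregate constraint on the whole vector: the entries of A sum to 1.
cons = [{'type': 'eq', 'fun': lambda a: np.sum(a) - 1.0}]
# Nonnegativity bounds keep this illustrative problem bounded below.
bnds = [(0.0, None)] * (k * k)

res = minimize(f, x0=np.full(k * k, 1.0 / (k * k)), args=(S, k),
               method='SLSQP', constraints=cons, bounds=bnds)
print(res.x.reshape((k, k)))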

Fastest way to solve long polynomial with lots of different powers

I'm looking for the fastest solution, x, to this polynomial equation:
Let m be an element in set M.
sum over all m {a_m * x^(b_m) - c_m * x^(b_m - 1)} = 0, where a_m, b_m, c_m are all different for each m. The set M has ~15-20 elements.
If the solution is > 4, it will return 4. If the solution is < 0, it will return 0.
What is the fastest way to do this? Doing it numerically?
I would prefer a solution in Python, and other languages only if it's very beneficial to switch.
Note this is the derivative of an objective function. I am just trying to maximize the objective function, so if there's a better way to do it aside from solving this polynomial, that would work too! The solution should be fairly fast, as I am trying to solve many of these objective functions.
If you're only looking for one root and not all roots, you can use Newton's Method, which I expect is reasonably fast for the polynomials you've described.
let f(x) = sum over all m {a_m * x^(b_m) - c_m * x^(b_m - 1)}
then f'(x), the derivative of f(x), is the sum over all m {(a_m * b_m) * x^(b_m - 1) - (c_m * (b_m - 1)) * x^(b_m - 2)}.
def newton(f, fprime, firstguess, epsilon):
    x = firstguess
    while abs(f(x)) > epsilon:
        x = x - (f(x) / fprime(x))
    return x
This will return an approximate root to your polynomial. If it's not accurate enough, pass in a smaller epsilon until it is accurate enough.
Note that this function may diverge, and run forever, or throw a ZeroDivisionError. Handle with caution.
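A small concrete sketch of this answer (coefficients and sizes are made up for illustration): build f and f' from coefficient arrays, run Newton's method (here SciPy's implementation, scipy.optimize.newton, in place of the hand-rolled one above), and clamp the result to [0, 4] as the question specifies.

import numpy as np
from scipy.optimize import newton

a = np.array([2.0, 1.5, 3.0])      # a_m
b = np.array([3, 2, 5])            # b_m
c = np.array([1.0, 0.5, 2.0])      # c_m

def f(x):
    return np.sum(a * x**b - c * x**(b - 1))

def fprime(x):
    return np.sum(a * b * x**(b - 1) - c * (b - 1) * x**(b - 2))

root = newton(f, x0=1.0, fprime=fprime)
root = min(max(root, 0.0), 4.0)    # clamp to [0, 4] as required
print(root)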
