I want to solve the following (convex) minimization problem:
min ||x||_1 subject to sgn(A[x,R]) = y and ||x||_2 = 1,
where A is an m x (N+1) matrix, x in R^N a vector, and [x,R] the vector created by appending a given number R to x. The objective is to find the optimal value of x.
A is a Fourier matrix, so fast algorithms for matrix-vector products, inversion, etc. are available. Since this matrix is really big, I need an optimization algorithm that exploits this structure.
Currently, I use the following implementation in cvxpy, which is way too slow:
import cvxpy as cvx

# rewrite the problem in the form x = x^+ - x^-, with x^+, x^- >= 0
n = A.shape[1] - 1
vx = cvx.Variable(2 * n)
objective = cvx.Minimize(cvx.pnorm(vx, 1))  # min ||x||_1
constraints = [vx >= 0,
               cvx.multiply(A[:, :n] @ vx[:n] - A[:, :n] @ vx[n:] + A[:, n] * R, y) >= 0,  # sgn(A[x,R]) = y
               cvx.norm(vx, 2) <= R]                                                       # ||x||_2 <= R
prob = cvx.Problem(objective, constraints)
prob.solve()
x = vx.value
solution = x[:n] - x[n:]
Is there a way to use fast matrix computations in cvxpy? Or is there a better library? I found a few implementations that do this for one specific algorithm, but not in the general case, so I could not express my problem with them.
No. The solvers will not call your matrix multiplication code; they do their own linear algebra, which is organized very differently. In a sense, your matrix multiplication is just notation for the problem statement.
Regarding performance, it depends heavily on where the bottleneck is. Is it in generating the model (in cvxpy itself) or in the solver? What solver are you using? Consider using a different solver. Obviously, we don't have enough information (and no reproducible example) to answer this question.
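To narrow down the bottleneck, a minimal sketch (reusing objective and constraints from the question; solver_stats.solve_time may be unavailable for some solvers) that separates model generation from solver time could look like this:

import time
import cvxpy as cvx

prob = cvx.Problem(objective, constraints)

t0 = time.time()
prob.solve(solver=cvx.SCS, verbose=True)   # also try cvx.ECOS or a commercial solver
total = time.time() - t0

# solver_stats.solve_time is the time spent inside the solver itself; the rest
# of `total` is mostly cvxpy's model generation (canonicalization).
print("solver:", prob.solver_stats.solve_time, "total:", total)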
I am trying to minimize a function of a vector of length 20, but I want to constrain the solution to be monotonic, i.e.
x[1] <= x[2] <= ... <= x[20]
I have tried to implement this in the following way using "constraints" for this routine:
cons = tuple([{'type':'ineq', 'fun': lambda x: x[i]- x[i-1]} for i in range(1, len(node_vals))])
res = sp.optimize.minimize(localisation, b, args=(d), constraints = cons) #optimize
However, the results I get are not monotonic, even when the initial guess b is; it seems that the optimizer is completely ignoring the constraints. What could be going wrong? I have also tried changing the constraint to x[i]**3 - x[i+1]**3 to make it "smoother", but that did not help. My objective function, localisation, is the integral of the solution to an eigenvalue problem whose parameters are defined beforehand:
def localisation(node_vals, domain):  # calculate localisation for solutions with piecewise linear grading
    f = piecewise(node_vals, domain)  # create piecewise linear function using given values at nodes
    # plt.plot(domain, f(domain))
    M = diff_matrix(f(domain))        # differentiation matrix created from piecewise linear function
    m = np.concatenate(([0], get_solutions(M)[1][:, 0], [0]))
    integral = num_int(domain, m)
    return integral
You didn't post a minimal reproducible example that we can run. However, did you try specifying which optimization algorithm SciPy should use? Something like this:
res = sp.optimize.minimize(localisation, b, args=(d), constraints=cons, method='SLSQP')
I'm having a very similar problem but with additional upper and lower bounds on the monotonicity property. I'm tackling the problem like this (maybe it helps you):
Use the trust-region constrained algorithm provided by SciPy (method='trust-constr'). It gives us a way of dealing with linear constraints in matrix form:
lb <= A.dot(x) <= ub
where lb and ub are the lower and upper bounds of the constraints and A is the matrix representing the linear constraints.
Every row of the matrix A is a linear term that defines one constraint.
If, for example, x[0] <= x[1], this can be rewritten as x[0] - x[1] <= 0, which as a row of the linear constraint matrix A looks like [1, -1, 0, ...], provided the corresponding entry of the upper bound vector is 0 (the reverse sign works as well; having at least one of the two bounds makes this easy).
Setting up enough of these inequalities, and where possible merging several of them into a single one, gives a constraint matrix sufficient to solve the problem; a sketch of this setup is shown below.
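Here is a minimal sketch of that setup for a monotonicity constraint on a length-20 vector (the objective below is only a placeholder, not the asker's localisation function):

import numpy as np
from scipy.optimize import minimize, LinearConstraint

n = 20
# Row i encodes x[i] - x[i+1] <= 0, i.e. [..., 1, -1, ...] with upper bound 0.
A = np.zeros((n - 1, n))
for i in range(n - 1):
    A[i, i] = 1.0
    A[i, i + 1] = -1.0
monotone = LinearConstraint(A, lb=-np.inf, ub=0.0)

objective = lambda x: np.sum((x - np.linspace(0.0, 1.0, n))**2)  # placeholder objective
x0 = np.zeros(n)
res = minimize(objective, x0, method='trust-constr', constraints=[monotone])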
Hope this helps a bit; it did the job for my problem.
I'm trying to solve this equation using the nsolve function. Unfortunately, this error appears:
ValueError: Could not find root within given tolerance. (435239733.760000060718 > 2.16840434497100886801e-19)
Try another starting point or tweak arguments.
The code is:
import sympy
d=[0.3, 32.6, 33.4, 241.7, 396.2, 444.4, 480.8, 588.9, 1043.9, 1136.1, 1288.1, 1408.1, 1439.4, 1604.8]
N=len(d)
x = sympy.Symbol('x', real=True)
expr2 = sympy.Eq(d[13] + N * sympy.Pow(x, -1) - N * d[13] * sympy.Pow(1 - sympy.exp(-d[13] * N), -1), 0)
expr_2 = sympy.simplify(expr=expr2)
solution = sympy.nsolve(expr_2, -0.01)
s = round(solution, 6)
print(s)
The system you are trying to solve has large derivatives and very abrupt changes, including a singularity at x == 0 (plotting the equation, e.g. in Mathematica, makes this clear).
Numerical solvers struggle with such functions because most of them assume some amount of smoothness around the solution and can be confused by singularities. Almost all of them (solvers in general, not just SymPy's) benefit from regularization or a reformulation of the problem.
I would suggest simplifying the equation by multiplying both sides by x, which removes the division by x and leads to a smoother function (linear in this case) that numerical solvers handle well.
With this reformulation, you should find that the solution is 0.000671064.
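A minimal sketch of that reformulation (same data as in the question; the multiplication by x is done symbolically before calling nsolve):

import sympy

d = [0.3, 32.6, 33.4, 241.7, 396.2, 444.4, 480.8, 588.9,
     1043.9, 1136.1, 1288.1, 1408.1, 1439.4, 1604.8]
N = len(d)
x = sympy.Symbol('x', real=True)

lhs = d[13] + N / x - N * d[13] / (1 - sympy.exp(-d[13] * N))
expr = sympy.Eq(sympy.expand(lhs * x), 0)  # multiply both sides by x: linear in x

solution = sympy.nsolve(expr, 0.01)        # any nonzero starting point works now
print(solution)                            # ~0.000671064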
Moreover, I would also suggest rescaling the coefficients so that they all lie in [-1, 1]; this generally helps solvers as well. In your case the solver finds the solution easily because the reformulated equation is linear, but more complex equations might cause problems.
There might be an answer for that floating around somewhere but I wasn't able to find it.
I want to minimize ||A X - B||_F^2 + lambd * ||X D^T||_F^2 with variable X >= 0 and first-derivative matrix D, so that X is smooth in the column direction, and with relatively large data.
In the past I have used various ways of solving this (e.g. with scipy.optimize.lsq_linear or pylops) and now wanted to try out cvxpy with the generic approach below (using SCS, because the problem wouldn't fit into memory otherwise):
import numpy as np
import cvxpy as cp

def cvxpy_solve_smooth(A, B, lambd=0):
    D = get_D(B.shape[1])
    X = cp.Variable((A.shape[1], B.shape[1]))
    cost1 = cp.sum_squares(A @ X - B)
    cost2 = lambd * cp.sum_squares(X @ D.T)
    constr = [0 <= X]
    prob = cp.Problem(cp.Minimize(cost1 + cost2), constr)
    prob.solve(solver=cp.SCS)
    return X.value

def get_D(n):
    # first-derivative (finite difference) matrix of shape (n-1, n)
    D = np.diag(np.ones(n - 1), 1)
    np.fill_diagonal(D, -1)
    D = D[:-1]
    return D
However, this is considerably slower than scipy.optimize.lsq_linear with sparse matrices. What can I do in terms of problem formulation, solver options, cvxpy advanced features etc. to improve performance?
SCS-based
Check what SCS is configured with in your option-free call. I suspect it starts in direct mode; you should also try the indirect mode (use_indirect=True). Turning on verbosity (verbose=True) shows what is currently being used.
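For example (a sketch reusing prob from the question's function; use_indirect is an SCS option that cvxpy passes through, available at least in SCS 2.x):

prob.solve(solver=cp.SCS, verbose=True)                     # log shows which mode is used
prob.solve(solver=cp.SCS, use_indirect=True, verbose=True)  # force the indirect (CG-based) solver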
Alternatives
This looks like a smooth optimization problem with bound constraints only. I doubt that SCS is the right approach here; it is more powerful and general than you need.
I would fire up L-BFGS-B (which supports bound constraints; see the reference implementation), e.g. through scipy. For a prototype you might get away with automatic numerical differentiation, but for your final version you should provide a custom gradient. I suspect it will be much more efficient than SCS, but that also depends on the accuracy you are striving for.
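As a minimal sketch of that route (assuming the same A, B, lambd, and get_D as in the question; the gradient is derived by hand and should be double-checked):

import numpy as np
from scipy.optimize import minimize

def solve_lbfgsb(A, B, lambd=0.0):
    m, n = A.shape[1], B.shape[1]
    D = get_D(n)                      # same first-derivative matrix as in the question

    def fun_and_grad(x):
        X = x.reshape(m, n)
        R = A @ X - B                 # data-fit residual
        S = X @ D.T                   # smoothness residual
        f = np.sum(R**2) + lambd * np.sum(S**2)
        G = 2 * A.T @ R + 2 * lambd * S @ D   # gradient of f with respect to X
        return f, G.ravel()

    x0 = np.zeros(m * n)
    bounds = [(0, None)] * (m * n)    # enforce X >= 0
    res = minimize(fun_and_grad, x0, jac=True, method='L-BFGS-B', bounds=bounds)
    return res.x.reshape(m, n)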
I have a large system of linear inequalities I am trying to solve for in python, ideally with non-negative integers.
Eg:
I have a matrix A of size (1000 x 700) and a vector b of size (700), where each line in the system is set up such that A*C <= b. I am trying to solve for C. Multiple solutions online use the SciPy optimization module; however, this requires an objective function. Putting all 0's for the coefficients of the objective function and supplying the constraints as the system of inequalities gives me a solution that is wrong. Part of my code is below.
import numpy as np
from scipy.optimize import linprog

coefficients_min_y = [0] * len(A[0])
res = linprog(coefficients_min_y, A_ub=A, b_ub=b, bounds=(0, None))
if res.success:
    coefficients = res.x
else:
    print("Solution is infeasible")
matrixMult = np.dot(A, coefficients)
Given that linprog reports a feasible solution, I use matrixMult to check whether the solution satisfies the conditions, but it does not. Is this because of the objective function, or am I doing something else wrong?
I am trying to use the "brute" method to minimize a function of 20 variables. It is failing with a mysterious error. Here is the complete code:
import random
import numpy as np
import lmfit

def progress_update(params, iter, resid, *args, **kws):
    pass
    # print(resid)

def score(params, data=None):
    parvals = params.valuesdict()
    M = data
    X_params = []
    Y_params = []
    for i in range(M.shape[0]):
        X_params.append(parvals['x' + str(i)])
    for j in range(M.shape[1]):
        Y_params.append(parvals['y' + str(j)])
    return diff(M, X_params, Y_params)

def diff(M, X_params, Y_params):
    total = 0
    for i in range(M.shape[0]):
        for j in range(M.shape[1]):
            total += abs(M[i, j] - (X_params[i] - Y_params[j])**2)
    return total

dim = 10
random.seed(0)
M = np.empty((dim, dim))
for i in range(M.shape[0]):
    for j in range(M.shape[1]):
        M[i, j] = i * random.random() + j**2

params = lmfit.Parameters()
for i in range(M.shape[0]):
    params.add('x' + str(i), value=random.random() * 10, min=0, max=10)
for j in range(M.shape[1]):
    params.add('y' + str(j), value=random.random() * 10, min=0, max=10)

result = lmfit.minimize(score, params, method='brute', kws={'data': M},
                        iter_cb=progress_update)
However, this fails with:
ValueError: array is too big; `arr.size * arr.dtype.itemsize` is larger than the maximum possible size.
What is causing this problem?
"What is causing this problem"
Math
You can't brute-force a high-dimensional problem, because brute-force methods require exponential work (time, and memory if implemented naively).
More directly, lmfit uses numpy (*) under the hood, which has a limit on how much data it can allocate. Your initial data structure isn't too big (10x10); it's the combinatorial table required for the brute force that causes the problem. A rough sense of the scale is given below.
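As an illustration (assuming 20 grid points per parameter, the default Ns of scipy.optimize.brute, which lmfit's brute method wraps):

# Size of a full brute-force grid over 20 parameters with 20 points per dimension.
Ns, n_params = 20, 20
print(Ns ** n_params)   # 104857600000000000000000000, i.e. ~1e26 grid points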
If you're willing to hack the implementation, you could switch to a sparse memory structure. But this doesn't solve the math problem.
On High Dimensional Optimization
Try a different minimizer, but be warned: it's very difficult to minimize globally in high-dimensional space. "Local minima" methods like fixed point / gradient descent might be more productive.
I hate to be pessimistic, but high-dimensional global optimization is very hard when approached generally, and I'm afraid it is beyond the scope of an SO question. Here is a survey.
Practical Alternatives
Gradient descent is supported a little in sklearn but more for machine learning than general optimization; scipy actually has pretty good optimization coverage, and great documentation. I'd start there. It's possible to do gradient descent there too, but not necessary.
From scipy's docs on unconstrained minimization, you have many options:
Method Nelder-Mead uses the Simplex algorithm. This algorithm is robust in many applications. However, if numerical computation of derivative can be trusted, other algorithms using the first and/or second derivatives information might be preferred for their better performance in general.
Method Powell is a modification of Powell's method, which is a conjugate direction method. It performs sequential one-dimensional minimizations along each vector of the directions set (direc field in options and info), which is updated at each iteration of the main minimization loop. The function need not be differentiable, and no derivatives are taken.
and many more derivative-based methods are available. (In general, you do better when you have derivative information available.)
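As a hedged sketch of that route (not a drop-in replacement for the lmfit code: it reuses dim and M from the question and flattens the parameters into a single vector for scipy):

import numpy as np
from scipy.optimize import minimize

def score_flat(v, M):
    # First half of v holds the x parameters, second half the y parameters.
    n = M.shape[0]
    X, Y = v[:n], v[n:]
    return np.sum(np.abs(M - (X[:, None] - Y[None, :])**2))

v0 = np.random.uniform(0, 10, size=2 * dim)   # dim and M as defined in the question
res = minimize(score_flat, v0, args=(M,), method='Nelder-Mead')
print(res.fun)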
Footnotes/Looking at the Source Code
(*) the actual error is thrown here, based on your numpy implementation. Quoted:
if (npy_mul_with_overflow_intp(&nbytes, nbytes, dim)) {
    PyErr_SetString(PyExc_ValueError,
        "array is too big; `arr.size * arr.dtype.itemsize` "
        "is larger than the maximum possible size.");
    Py_DECREF(descr);
    return NULL;
}