L-BFGS-B does not satisfy given constraint - python

I am trying to find optimized weight values for my model by using the minimize function in scipy. As seen in the code below, I define my error function to return one minus the F1 score of the model.
def err_func(weights, x, y):
    undetected = 0
    correct = 0
    incorrect = 0
    results = fun(weights, x)
    for i in range(0, len(results)):
        if(results[i] == y[i]):
            correct += 1
        elif(not (results[i] == y[i])):
            incorrect += 1
    undetected = len(y) - (correct + incorrect)
    precision = float(correct) / float(correct + incorrect)
    recall = float(correct) / float(correct + incorrect + undetected)
    f1 = 2 * precision * recall / (precision + recall)
    return 1.0 - f1
I use constraints so that each value in weights is between zero and one and the sum of the weights equals one. These definitions are as below:
cons = ({'type': 'eq', 'fun': lambda x: 1 - sum(x)})
bnds = tuple((0.0, 1.0) for x in weights)
eps=1e-2
But when running the minimize method, the result does not satisfy the constraint.
from scipy.optimize import minimize
res = minimize(err_func, weights, method='L-BFGS-B', args=(x, y),
               constraints=cons, bounds=bnds,
               options={'eps': eps, 'maxiter': 100})
print res
test_weights=res.x
print sum(test_weights)
I get the following output; the sum of the weights is larger than one. What am I missing?
      fun: 0.4955555555555555
 hess_inv: <11x11 LbfgsInvHessProduct with dtype=float64>
      jac: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
  message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 24
      nit: 1
   status: 0
  success: True
        x: array([ 0.        ,  0.22222222,  0.        ,  1.        ,  1.        ,
                   0.11111111,  1.        ,  1.        ,  1.        ,  0.        ,  1.        ])

6.33333333333

L-BFGS-B only supports bound constraints (that is what the second 'B' means). General constraints are not supported by this method.
Excerpt from scipy docs:
Parameters:
...
constraints : dict or sequence of dict, optional
...
Constraints definition (only for COBYLA and SLSQP)
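If you need both the bounds and the equality constraint, one option is to switch to a method that does support general constraints, such as SLSQP. A minimal sketch, assuming weights, x, y and err_func are defined as in the question:
from scipy.optimize import minimize

# SLSQP handles bounds and general (in)equality constraints.
cons = ({'type': 'eq', 'fun': lambda w: 1 - sum(w)},)
bnds = tuple((0.0, 1.0) for _ in weights)

res = minimize(err_func, weights, method='SLSQP', args=(x, y),
               bounds=bnds, constraints=cons,
               options={'eps': 1e-2, 'maxiter': 100})
print res.x, sum(res.x)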


Why does np.convolve shift the resulting signal by 1

I have the following two signals:
X0 = array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
rbf_kernel = array([2.40369476e-04, 4.82794999e-03, 4.97870684e-02, 2.63597138e-01,
7.16531311e-01, 1.00000000e+00, 7.16531311e-01, 2.63597138e-01,
4.97870684e-02, 4.82794999e-03])
I tried to convolve the two signals using np.convolve(X0, rbf_kernel, mode='same'), but the resulting convolution is shifted by one to the right, as shown below. The green, orange, and blue curves are X0, rbf_kernel, and the result of the last command, respectively. I expected to see the maximum of the convolution where the two signals are aligned (i.e., at point 5), but that did not happen.
The result is shifted because of the padding used for 'same' convolution. Convolution is the process of sliding the flipped kernel over the input and taking a dot product at each step.
For 'valid' convolution the kernel has to overlap the input fully at every stride, so the output size is n - m + 1 (n = len(input), m = len(kernel), assuming m <= n). For 'same' convolution the output size is max(m, n); to achieve that, (m - 1) zeros of padding are applied to the input before performing a valid convolution.
In your example n = m = 10, so the 'same' output size is max(10, 10) = 10. That requires zero padding of m - 1 = 9, which is 5 zeros of left padding and 4 of right padding. The padded input (X0) looks like:
padded_x = [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.] with length 19.
flipped kernel = [4.82794999e-03 4.97870684e-02 2.63597138e-01 7.16531311e-01
                  1.00000000e+00 7.16531311e-01 2.63597138e-01 4.97870684e-02
                  4.82794999e-03 2.40369476e-04]
So the convolution output is maximal at the 6th step (counting from 0).
Here's a sample SAME convolution code:
import numpy as np
import matplotlib.pyplot as plt

def same_conv(x, k):
    if len(k) > len(x):
        # treat the longer array as the input and the other as the kernel
        x, k = k, x
    n = x.shape[0]
    m = k.shape[0]
    padding = m - 1
    left_pad = int(np.ceil(padding / 2.0))
    right_pad = padding - left_pad
    x = np.pad(x, (left_pad, right_pad), 'constant')
    out = []
    # flip the kernel
    k = k[::-1]
    for i in range(n):
        out.append(np.dot(x[i: i + m], k))
    return np.array(out)

X0 = np.array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
rbf_kernel = np.array([2.40369476e-04, 4.82794999e-03, 4.97870684e-02, 2.63597138e-01,
                       7.16531311e-01, 1.00000000e+00, 7.16531311e-01, 2.63597138e-01,
                       4.97870684e-02, 4.82794999e-03])

convolved = same_conv(X0, rbf_kernel)

plt.plot(X0)
plt.plot(rbf_kernel)
plt.plot(convolved)
plt.show()
which results in the same shifted output as yours.
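The shift therefore comes from the kernel having an even length (10 samples) with its peak off-centre. A minimal sketch of one workaround, assuming an odd-length kernel is acceptable: trim the kernel so its peak sits exactly at its centre, and np.convolve(..., mode='same') stays aligned with the input.
import numpy as np

X0 = np.array([0., 0., 0., 0., 0., 1., 0., 0., 0., 0.])
rbf_kernel = np.array([2.40369476e-04, 4.82794999e-03, 4.97870684e-02, 2.63597138e-01,
                       7.16531311e-01, 1.00000000e+00, 7.16531311e-01, 2.63597138e-01,
                       4.97870684e-02, 4.82794999e-03])

# Drop the first sample so the kernel has odd length (9) with its peak at the centre.
symmetric_kernel = rbf_kernel[1:]
centred = np.convolve(X0, symmetric_kernel, mode='same')
print np.argmax(centred)   # 5, i.e. aligned with the spike in X0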

A loop to write equations to be used with odeint

I have an initial value problem that needs to be solved; the differential equations are derived from a dictionary that looks like:
eqs = {'a': array([-1., 2., 4., 0., ...]),
       'b': array([ 1., -10.,  0.,  0., ...]),
       'c': array([ 0.,   3., -4.,  0., ...]),
       'd': array([ 0.,   5.,  0., -0., ...]),
       ...}
The differential equation da/dt is given as -1*[a]+2*[b]+4*[c]+0*[d]....
Using the dictionary above, I write a function dXdt as:
def dXdt(X, t):
    sys_a, sys_b, sys_c, sys_d, ... = eqs['a'], eqs['b'], eqs['c'], eqs['d'], ...
    dadt = sys_a[0]*X[0] + sys_a[1]*X[1] + sys_a[2]*X[2] + sys_a[3]*X[3] + ...
    dbdt = sys_b[0]*X[0] + sys_b[1]*X[1] + sys_b[2]*X[2] + sys_b[3]*X[3] + ...
    dcdt = sys_c[0]*X[0] + sys_c[1]*X[1] + sys_c[2]*X[2] + sys_c[3]*X[3] + ...
    dddt = sys_d[0]*X[0] + sys_d[1]*X[1] + sys_d[2]*X[2] + sys_d[3]*X[3] + ...
    ...
    return [dadt, dbdt, dcdt, dddt, ...]
The initial conditions are:
X0 = [1, 0, 0, 0, ...]
and the solution is given as:
X = integrate.odeint(dXdt, X0, np.linspace(0,10,11))
This works well for a small system, where I can write the equations by hand. However, I have a system that has ~150 differential equations, and I need to automate the way I write dXdt to be used with scipy.integrate.odeint, given the dictionary of eqs. Is there a way to do so?
Any time something follows a simple linear pattern, you can use an iteration or a comprehension to express it. If you have multiple such patterns, you can just nest them. So this:
sys_a, sys_b, sys_c, sys_d,... = eqs['a'], eqs['b'], eqs['c'], eqs['d'],...
dadt = sys_a[0]*X[0]+sys_a[1]*X[1]+sys_a[2]*X[2]+sys_a[3]*X[3]+...
dbdt = sys_b[0]*X[0]+sys_b[1]*X[1]+sys_b[2]*X[2]+sys_b[3]*X[3]+...
dcdt = sys_c[0]*X[0]+sys_c[1]*X[1]+sys_c[2]*X[2]+sys_c[3]*X[3]+...
dddt = sys_d[0]*X[0]+sys_d[1]*X[1]+sys_d[2]*X[2]+sys_d[3]*X[3]+...
...
[dadt, dbdt, dcdt, dddt, ...]
can be expressed simply as:
[sum(eqs[char][i] * X[i] for i in range(len(X))) for char in eqs.keys()]
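Since the right-hand side is linear in X, another option is to stack the coefficient rows into a matrix once and let NumPy do the sums. A sketch, assuming the dictionary keys are ordered to match the entries of X (the small four-variable dict below stands in for the ~150-equation system):
import numpy as np
from scipy import integrate

# Hypothetical small system standing in for the full dictionary.
eqs = {'a': np.array([-1.,   2.,  4.,  0.]),
       'b': np.array([ 1., -10.,  0.,  0.]),
       'c': np.array([ 0.,   3., -4.,  0.]),
       'd': np.array([ 0.,   5.,  0., -0.])}

keys = sorted(eqs)                      # fixes the variable ordering: a, b, c, d
M = np.array([eqs[k] for k in keys])    # coefficient matrix, one row per equation

def dXdt(X, t):
    return M.dot(X)                     # same result as the hand-written sums

X0 = [1, 0, 0, 0]
X = integrate.odeint(dXdt, X0, np.linspace(0, 10, 11))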

Best way to scale the matrix variables in SCIPY linear programming scheme

I have the following optimization scheme implemented with NNLS in scipy.
import numpy as np
from scipy.optimize import nnls
from scipy import stats

# Define problem
A = np.array([[60., 90., 120.],
              [30., 120., 90.]])
b = np.array([6700.5, 699.])

# Add ones to ensure the solution sums to 1
b = np.hstack([b, 1.0])
A = np.vstack([A, np.ones(3)])

x, rnorm = nnls(A, b)
print x
# the solution is
# [ 93.97933792   0.   0. ]
# we expect it to sum to 1 if it's not skewed
As you can see, the values in the b vector are much larger than those in A.
My question is: what is the best/most reasonable way to scale A and b so that the solution is not skewed?
Note that both A and b are raw gene expression data without pre-processing.
If you want to include the equality constraint, you can't really use the nnls routine, since it doesn't cater for equalities. If you are limited to what's on offer in scipy, you can use this:
import numpy as np
from scipy.optimize import minimize

# Define problem
A = np.array([[60., 90., 120.],
              [30., 120., 90.]])
b = np.array([6700.5, 699.])

#-----------------------------
# I tried rescaling the data by adding these two lines,
# so that A and b are on the same scale.
# But why is the solution different?
# x: array([ 1., 0., 0.])
# What's the correct way to go?
#-----------------------------
# A = A/np.linalg.norm(A, axis=0)
# b = b/np.linalg.norm(b)

def f(x):
    return np.linalg.norm(A.dot(x) - b)

cons = {'type': 'eq',
        'fun': lambda x: sum(x) - 1}

x0 = [1, 0, 0]  # initial guess
minimize(f, x0, method='SLSQP', bounds=((0, np.inf),)*3, constraints=cons)
Output:
status: 0
success: True
njev: 2
nfev: 10
fun: 6608.620222860367
x: array([ 0., 0., 1.])
message: 'Optimization terminated successfully.'
jac: array([ -62.50927734, -100.675354 , -127.78314209, 0. ])
nit: 2
This minimises the objective function directly while also imposing the equality constraint you're interested in.
If speed is important, you can supply the Jacobian (and Hessian) information, or, even better, use a proper QP solver, such as the one supplied by cvxopt.
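A rough sketch of the first suggestion: minimising the squared residual has the same minimiser as minimising the norm, and its gradient 2 * A^T (A x - b) is cheap and exact, so it can be passed to SLSQP via jac=.
import numpy as np
from scipy.optimize import minimize

A = np.array([[60., 90., 120.],
              [30., 120., 90.]])
b = np.array([6700.5, 699.])

def f(x):
    r = A.dot(x) - b
    return r.dot(r)               # squared residual, same minimiser as the norm

def jac(x):
    return 2.0 * A.T.dot(A.dot(x) - b)

cons = {'type': 'eq',
        'fun': lambda x: sum(x) - 1,
        'jac': lambda x: np.ones_like(x)}

res = minimize(f, [1, 0, 0], jac=jac, method='SLSQP',
               bounds=((0, np.inf),) * 3, constraints=cons)
print res.x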

Implementing gradient operator in Python

I'm working on a computer vision system and this is giving me a serious headache. I'm having trouble re-implementing an old gradient operator more efficiently. I'm working with numpy and OpenCV 2.
This is what I had:
def gradientX(img):
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for y in range(rows - 1):
        Mr = img[y]
        Or = out[y]
        Or[0] = Mr[1] - Mr[0]
        for x in xrange(1, cols - 2):
            Or[x] = (Mr[x+1] - Mr[x-1]) / 2.0
        Or[cols-1] = Mr[cols-1] - Mr[cols-2]
    return out

def gradient(img):
    return [gradientX(img), (gradientX(img.T).T)]
I've tried using numpy's gradient operator but the result is not the same.
For this input
array([[  3,   4,   5],
       [255,   0,  12],
       [ 25,  15, 200]])
Using my gradient returns
[array([[   1.,    0.,    1.],
        [-255.,    0.,   12.],
        [   0.,    0.,    0.]]),
 array([[ 252.,   -4.,    0.],
        [   0.,    0.,    0.],
        [-230.,   15.,    0.]])]
While using numpy's np.gradient returns
[array([[ 252. ,   -4. ,    7. ],
        [  11. ,    5.5,   97.5],
        [-230. ,   15. ,  188. ]]),
 array([[   1. ,    1. ,    1. ],
        [-255. , -121.5,   12. ],
        [ -10. ,   87.5,  185. ]])]
There are clearly some similarities between the results, but they're definitely not the same. So either I'm missing something here or the two operators aren't meant to produce the same results. In that case, I'd like to know how to re-implement my gradientX function so that it doesn't use that awful-looking double loop to traverse the 2-D array, relying mostly on numpy's strengths.
I've been working on this a bit more, only to find my mistake: I was skipping the last row and the last column when iterating. As @wflynny noted, the result was identical except for a row and a column of zeros.
Given that, the result could not be the same as np.gradient's, but with that change the results are identical, so there's no need to look for any other numpy implementation of this.
Answering my own question, a good numpy implementation of my gradient algorithm would be
import numpy as np

def gradientX(img):
    return np.gradient(img)[::-1]
I'm also posting the working code, just because it shows how numpy's gradient operator works
def computeMatXGradient(img):
    rows, cols = img.shape
    out = np.zeros((rows, cols))
    for y in range(rows):
        Mr = img[y]
        Or = out[y]
        Or[0] = float(Mr[1]) - float(Mr[0])
        for x in xrange(1, cols - 1):
            Or[x] = (float(Mr[x+1]) - float(Mr[x-1])) / 2.0
        Or[cols-1] = float(Mr[cols-1]) - float(Mr[cols-2])
    return out
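For completeness, the corrected row-wise gradient can also be written without the double loop using array slicing. A sketch that follows the same central/one-sided difference convention as computeMatXGradient above (assuming at least two columns):
import numpy as np

def gradientX_vectorized(img):
    img = img.astype(float)
    out = np.empty_like(img)
    # central differences in the interior, one-sided differences at the edges
    out[:, 1:-1] = (img[:, 2:] - img[:, :-2]) / 2.0
    out[:, 0] = img[:, 1] - img[:, 0]
    out[:, -1] = img[:, -1] - img[:, -2]
    return out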

Quadratic Program (QP) Solver that only depends on NumPy/SciPy?

I would like students to solve a quadratic program in an assignment without them having to install extra software like cvxopt etc. Is there a python implementation available that only depends on NumPy/SciPy?
I'm not very familiar with quadratic programming, but I think you can solve this sort of problem just using scipy.optimize's constrained minimization algorithms. Here's an example:
import numpy as np
from scipy import optimize
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d.axes3d import Axes3D

# minimize
#     F = x[1]^2 + 4x[2]^2 - 32x[2] + 64
#
# subject to:
#      x[1] + x[2] <= 7
#     -x[1] + 2x[2] <= 4
#      x[1] >= 0
#      x[2] >= 0
#      x[2] <= 4
#
# in matrix notation:
#     F = (1/2)*x.T*H*x + c*x + c0
#
# subject to:
#     Ax <= b
#
# where:
#     H = [[2, 0],
#          [0, 8]]
#     c = [0, -32]
#     c0 = 64
#     A = [[ 1,  1],
#          [-1,  2],
#          [-1,  0],
#          [ 0, -1],
#          [ 0,  1]]
#     b = [7, 4, 0, 0, 4]

H = np.array([[2., 0.],
              [0., 8.]])
c = np.array([0, -32])
c0 = 64

A = np.array([[ 1.,  1.],
              [-1.,  2.],
              [-1.,  0.],
              [ 0., -1.],
              [ 0.,  1.]])
b = np.array([7., 4., 0., 0., 4.])

x0 = np.random.randn(2)

def loss(x, sign=1.):
    return sign * (0.5 * np.dot(x.T, np.dot(H, x)) + np.dot(c, x) + c0)

def jac(x, sign=1.):
    return sign * (np.dot(x.T, H) + c)

cons = {'type': 'ineq',
        'fun': lambda x: b - np.dot(A, x),
        'jac': lambda x: -A}

opt = {'disp': False}

def solve():
    res_cons = optimize.minimize(loss, x0, jac=jac, constraints=cons,
                                 method='SLSQP', options=opt)
    res_uncons = optimize.minimize(loss, x0, jac=jac, method='SLSQP',
                                   options=opt)

    print '\nConstrained:'
    print res_cons
    print '\nUnconstrained:'
    print res_uncons

    x1, x2 = res_cons['x']
    f = res_cons['fun']
    x1_unc, x2_unc = res_uncons['x']
    f_unc = res_uncons['fun']

    # plotting
    xgrid = np.mgrid[-2:4:0.1, 1.5:5.5:0.1]
    xvec = xgrid.reshape(2, -1).T
    F = np.vstack([loss(xi) for xi in xvec]).reshape(xgrid.shape[1:])

    ax = plt.axes(projection='3d')
    ax.hold(True)
    ax.plot_surface(xgrid[0], xgrid[1], F, rstride=1, cstride=1,
                    cmap=plt.cm.jet, shade=True, alpha=0.9, linewidth=0)
    ax.plot3D([x1], [x2], [f], 'og', mec='w', label='Constrained minimum')
    ax.plot3D([x1_unc], [x2_unc], [f_unc], 'oy', mec='w',
              label='Unconstrained minimum')
    ax.legend(fancybox=True, numpoints=1)
    ax.set_xlabel('x1')
    ax.set_ylabel('x2')
    ax.set_zlabel('F')
Output:
Constrained:
status: 0
success: True
njev: 4
nfev: 4
fun: 7.9999999999997584
x: array([ 2., 3.])
message: 'Optimization terminated successfully.'
jac: array([ 4., -8., 0.])
nit: 4
Unconstrained:
status: 0
success: True
njev: 3
nfev: 5
fun: 0.0
x: array([ -2.66453526e-15, 4.00000000e+00])
message: 'Optimization terminated successfully.'
jac: array([ -5.32907052e-15, -3.55271368e-15, 0.00000000e+00])
nit: 3
This might be a late answer, but I found CVXOPT (http://cvxopt.org/) to be the commonly used free Python library for quadratic programming. However, it is not easy to install, as it requires the installation of other dependencies.
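For reference, the QP from the scipy example above can be handed to cvxopt directly. A minimal sketch, assuming cvxopt is installed:
import numpy as np
from cvxopt import matrix, solvers

# Same problem as above: min (1/2) x'Hx + c'x  subject to  Ax <= b
H = matrix(np.array([[2., 0.],
                     [0., 8.]]))
c = matrix([0., -32.])
A = matrix(np.array([[ 1.,  1.],
                     [-1.,  2.],
                     [-1.,  0.],
                     [ 0., -1.],
                     [ 0.,  1.]]))
b = matrix([7., 4., 0., 0., 4.])

sol = solvers.qp(H, c, A, b)
print np.array(sol['x']).ravel()   # expect roughly [2., 3.], as in the SLSQP run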
I ran across a good solution and wanted to get it out there. There is a python implementation of LOQO in the ELEFANT machine learning toolkit out of NICTA (http://elefant.forge.nicta.com.au as of this posting). Have a look at optimization.intpointsolver. This was coded by Alex Smola, and I've used a C-version of the same code with great success.
mystic provides a pure python implementation of nonlinear/non-convex optimization algorithms with advanced constraints functionality that typically is only found in QP solvers. mystic actually provides more robust constraints than most QP solvers. However, if you are looking for optimization algorithmic speed, then the following is not for you. mystic is not slow, but it's pure python as opposed to python bindings to C. If you are looking for flexibility and QP constraints functionality in a nonlinear solver, then you might be interested.
"""
Maximize: f = 2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2
Subject to: -2*x[0] + 2*x[1] <= -2
2*x[0] - 4*x[1] <= 0
x[0]**3 -x[1] == 0
where: 0 <= x[0] <= inf
1 <= x[1] <= inf
"""
import numpy as np
import mystic.symbolic as ms
import mystic.solvers as my
import mystic.math as mm
# generate constraints and penalty for a nonlinear system of equations
ieqn = '''
-2*x0 + 2*x1 <= -2
2*x0 - 4*x1 <= 0'''
eqn = '''
x0**3 - x1 == 0'''
cons = ms.generate_constraint(ms.generate_solvers(ms.simplify(eqn,target='x1')))
pens = ms.generate_penalty(ms.generate_conditions(ieqn), k=1e3)
bounds = [(0., None), (1., None)]
# get the objective
def objective(x, sign=1):
x = np.asarray(x)
return sign * (2*x[0]*x[1] + 2*x[0] - x[0]**2 - 2*x[1]**2)
# solve
x0 = np.random.rand(2)
sol = my.fmin_powell(objective, x0, constraint=cons, penalty=pens, disp=True,
bounds=bounds, gtol=3, ftol=1e-6, full_output=True,
args=(-1,))
print 'x* = %s; f(x*) = %s' % (sol[0], -sol[1])
Things to note: mystic can generically apply LP, QP, and higher-order equality and inequality constraints to any given optimizer, not just a special QP solver. Secondly, mystic can digest symbolic math, so defining/entering the constraints is a bit nicer than working with the matrices and derivatives of functions. mystic depends on numpy, and will use scipy if it is installed (however, scipy is not required). mystic uses sympy to handle symbolic constraints, but it's also not required for optimization in general.
Output:
Optimization terminated successfully.
Current function value: -2.000000
Iterations: 3
Function evaluations: 103
x* = [ 2. 1.]; f(x*) = 2.0
Get mystic here: https://github.com/uqfoundation
The qpsolvers package also seems to fit the bill. It only depends on NumPy and can be installed by pip install qpsolvers. Then, you can do:
from numpy import array, dot
from qpsolvers import solve_qp
M = array([[1., 2., 0.], [-8., 3., 2.], [0., 1., 1.]])
P = dot(M.T, M) # quick way to build a symmetric matrix
q = dot(array([3., 2., 3.]), M).reshape((3,))
G = array([[1., 2., 1.], [2., 0., 1.], [-1., 2., -1.]])
h = array([3., 2., -2.]).reshape((3,))
# min. 1/2 x^T P x + q^T x with G x <= h
print "QP solution:", solve_qp(P, q, G, h)
You can also try different QP solvers (such as CVXOPT mentioned by Curious) by changing the solver keyword argument, for example solver='cvxopt' or solver='osqp'.
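For instance, a sketch reusing P, q, G, h from above and assuming the corresponding backends are installed alongside qpsolvers:
print "QP solution (OSQP):  ", solve_qp(P, q, G, h, solver="osqp")
print "QP solution (CVXOPT):", solve_qp(P, q, G, h, solver="cvxopt")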
