I'm trying to solve a minimization problem using the minimize function of SciPy. The objective function is simply the ratio of two multivariate normal densities with different means and variances. I want to find the maximum of the function g_func, which is equivalent to finding the minimum of the function g_optimization. I also added the constraint x[0] = 0, where x is a vector with 8 elements. The objective function g_optimization is as follows:
import numpy as np
from scipy.optimize import minimize
# Set up mean and variance for two MVN distributions
n_trait = 8
sigma = np.full((n_trait, n_trait),0.0005)
np.fill_diagonal(sigma,0.005)
omega = np.full((n_trait, n_trait),0.0000236)
np.fill_diagonal(omega,0.0486)
sigma_pos = np.linalg.inv(np.linalg.inv(sigma)+np.linalg.inv(omega))
mu_pos = np.array([-0.01288244,0.08732091,0.01049617,0.0860966,0.10055626,0.07952922,0.04363669,-0.0061975])
mu_pri = 0
sigma_pri = omega
#objective function
def g_func(beta, mu_sim_pos):
    g1 = ((np.linalg.det(sigma_pri))**(1/2)) / ((np.linalg.det(sigma_pos))**(1/2))
    g2 = (-1/2) * np.linalg.multi_dot([np.transpose(beta - mu_sim_pos), np.linalg.inv(sigma_pos), beta - mu_sim_pos])
    g3 = (1/2) * np.linalg.multi_dot([np.transpose(beta - mu_pri), np.linalg.inv(sigma_pri), beta - mu_pri])
    g = g1 * np.exp(g2 + g3)
    return g
def g_optimization(beta, mu_sim_pos):
    return -1 * g_func(beta, mu_sim_pos)
#optimization
start_point = np.full(8,0)
cons = ({'type': 'eq',
'fun' : lambda x: np.array([x[0]])})
anws = minimize (g_optimization, [start_point], args=(mu_pos),
constraints=cons, options={'maxiter': 50}, tol=0.001)
anws
The optimization stops after two iterations, and the minimum value the function reports is 0, at the point np.array([0, 10.32837891, -1.62396508, 10.13790152, 12.38752653, 9.11615259, 3.53201544, -4.22115517]). This cannot be right, because if we plug even the starting point np.zeros(8) into the g_optimization function, the result is -657.0041125829354, which is smaller than 0. So the solution provided is definitely not a minimum.
g_optimization(np.zeros(8),mu_pos) #gives solution of -657.0041125829354
I'm not sure where I went wrong.
I would try a different solver; for example, L-BFGS-B works well.
You can look at all available methods in the scipy.optimize.minimize documentation.
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
constraints=cons, options={'maxiter': 50}, tol=0.001)
print(anws)
# success: True
# message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
# fun: -21688.00879938617
# x: array([-0.0101048, 0.09937778, 0.01543875, 0.0980401, 0.11383878, 0.09086455, 0.05164822, -0.00280081])
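As a quick sanity check (using only the numbers already shown above), you can compare the objective at the start point and at the returned solution:
print(g_optimization(np.zeros(8), mu_pos))   # about -657, the value at the start point
print(g_optimization(anws.x, mu_pos))        # about -21688, matches anws.fun above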
EDIT:
L-BFGS-B cannot handle general constraints h(x) = 0, only bounding boxes on the variables. From the documentation:
Bounds on variables for L-BFGS-B, TNC, SLSQP, Powell, and trust-constr methods. There are two ways to specify the bounds:
Instance of Bounds class.
Sequence of (min, max) pairs for each element in x. None is used to specify no bound.
In your case you have to define 8 pairs of lower and upper limits.
For x[0] you have to use a tight bound, as the method cannot handle x_low == x_high.
bounds = [(None, None)] * 8
bounds[0] = (0, 0.00001)
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B', bounds=bounds,
options={'maxiter': 50}, tol=0.001)
# fun: -21467.48153792194
# x: array([0., 0.10039832, 0.01641271, 0.0990599, 0.11486735, 0.09188037, 0.05264228, -0.00183697])
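The same bounds could also be passed as an instance of the Bounds class mentioned in the documentation quote above; a minimal equivalent sketch:
from scipy.optimize import Bounds
lb = np.full(8, -np.inf)
ub = np.full(8, np.inf)
lb[0], ub[0] = 0, 0.00001   # tight bound on x[0], all other variables unbounded
bounds = Bounds(lb, ub)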
Another alternative is to exclude the value x[0] from your optimisation problem:
def g_optimization(beta, mu_sim_pos):
    beta2 = np.empty(8)
    beta2[0] = 0
    beta2[1:] = beta
    return -1 * g_func(beta2, mu_sim_pos)
start_point = np.zeros(7) # exclude x[0]
anws = minimize(g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
options={'maxiter': 50}, tol=0.001)
# fun: -21467.47686079844
# x: array([0.10041797, 0.01648995, 0.09908046, 0.11487707, 0.09190585, 0.05269467, -0.00174722])
# ^ missing x[0]
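If you need the full 8-element solution vector afterwards, you can re-insert the fixed value; a small sketch (assuming anws is the result above):
full_x = np.insert(anws.x, 0, 0.0)   # prepend the fixed x[0] = 0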
Now I face a problem: for a data sample (let's say 10 continuous variables and one dependent variable), I need to fit a model for prediction. I would like to constrain the weights of all the variables to within a particular number, like abs(0.2), which means no weight should be more than 0.2 or less than -0.2. I tried lasso and ridge regression in sklearn.linear_model (and also ElasticNet) to control the weights of the variables, but it didn't work well: there were always one or two extremely large weights, and sometimes, when I used a large alpha, the R squared showed the model was really bad. I tried to write my own methods, but I could only constrain the sum of the weights, not each individual weight. SVR gives a pretty close answer, but I still want to ask whether there are good choices for multivariate regression with self-defined constraints?
import numpy as np
from scipy.optimize import shgo
def my_general_linear_model_func(A1, b1):
    num_x = np.shape(A1)[1]
    def my_func(x):
        ls = 0.5*(b1 - np.dot(A1, x))**2
        result = np.sum(ls)
        return result
    def g1(x):
        return np.sum(x)      # sum of x >= 0
    def g2(x):
        return 1 - np.sum(x)  # sum of x <= 1
    cons = ({'type': 'ineq', 'fun': g1},
            {'type': 'ineq', 'fun': g2})
    x0 = np.zeros(num_x)
    bnds = [(0, 1)]
    for i in range(num_x-1):
        bnds.append((0, 1))
    res1 = shgo(my_func,
                bounds=bnds,
                constraints=cons)
    return res1
A1 = np.array([[0.12,5.96,3.14],[0.68,7.89,4.56]])
b1 = np.array([3,5])
my_general_linear_model_func(A1,b1)
The result:
fun: 0.07651391974288956
funl: array([0.07651392, 0.11079534, 0.2564125 ])
message: 'Optimization terminated successfully.'
nfev: 53
nit: 2
nlfev: 49
nlhev: 0
nljev: 12
success: True
x: array([1.12339358e-16, 5.62146099e-02, 9.43785390e-01])
xl: array([[1.12339358e-16, 5.62146099e-02, 9.43785390e-01],
[3.90241087e-01, 5.00000000e-01, 1.09758913e-01],
[5.00000000e-01, 5.00000000e-01, 0.00000000e+00]])
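If the actual requirement is the abs(0.2) limit from the question, the same setup applies with the bounds swapped; a sketch of the relevant change inside my_general_linear_model_func (everything else stays as above):
bnds = [(-0.2, 0.2)] * num_x   # limit every individual weight to [-0.2, 0.2]
res1 = shgo(my_func, bounds=bnds, constraints=cons)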
Recently I have been working with the Nelson Siegel Svensson yield curve model, but I ran into a situation: searching for the best-fit parameters of the model.
Given the above, I use a simple dataset, represented by a period vector (const) and a yield vector (y_real), to calibrate the parameters b0, b1, b2, b3, t1 and t2 of the original function (referred to in the model objective function as x[0], x[1], x[2], x[3], x[4] and x[5] respectively). The goal is to minimize the difference between the original yield values and the yield values estimated with the NSS yield curve model, and then to use the calibrated parameters to interpolate and extrapolate yields for specific periods.
Note: the constraint function is defined with the premise fun(x) == 0 (type 'eq') to drive the difference between y_real and the model result (in vector form) towards zero. For example:
If y_real = [1.1, 1.4, 1.3] and the model result is [0.2, 1.7, 3.3], then the difference is [0.9, -0.3, -2], and it is necessary to iterate again until the difference vector is approximately zero, for example [1.0e-22, 1.0e-21, 1.0e-25].
I developed a trial solution with SciPy's least_squares (Calibrate parameters of Yield Curve Nelson Siegel Svensson), but it is a very simple form and I need more accuracy; for this, some people recommended the SLSQP method of scipy.optimize.minimize.
This is my code for searching for calibrated parameters to then use in an NSS yield curve:
from numpy import array, append
from scipy.optimize import minimize
from math import exp as EXP
const = [30,90,180,270,365,730,1095,1460,1825,2190,2555,2920,3285,3650,4015,4380,4745,5110,5475,5840,6205,6570,6935,7300]
y_real = [3.11826,3.71463,3.74677,3.83900,4.00049,4.40666,4.52346,4.64026,4.75706,4.87386,4.99066,5.10746,5.22426,
5.34106,5.44522,5.54669,5.64816,5.74963,5.85110,5.88607,5.91162,5.93717,5.96272,5.98827]
def model(x, const, y_real):
    arr = array([])
    for val in const:
        arr = append(arr, (x[0]) + (x[1]*((1-EXP(-val/x[4]))/(val/x[4]))) + (x[2]*((((1-EXP(-val/x[4]))/(val/x[4]))) - (EXP(-val/x[4])))) + x[3]*((((1-EXP(-val/x[5]))/(val/x[5]))) - (EXP(-val/x[5]))))
    return array(y_real) - arr
def fun(x, const, y_real):
    eval = model(x, const, y_real)
    leval = array([0 if val < 1.0e-20 else 1 for val in eval])
    return leval
con = {'type': 'eq', 'fun': fun, 'args' : (const, y_real)}
x0 = array([0.001, 0.001, 0.001, 0.001, 1.0e-10, 1.0e-10])
#bounds_ = [(0.001,8),(0.001,8),(0.001,8),(0.001,8),(1.0e-15,3),(1.0e-15,3)] bounds=bounds_
res = minimize(model, x0, method='SLSQP', constraints=[con] , args=(const, y_real))
print(res)
but I got this error:
in _minimize_slsqp
w = zeros(len_w)
ValueError: negative dimensions are not allowed
How can I reach a solution with the SLSQP method (or another, better option) in a comprehensive way without this error?
Thanks in advance.
First of all, your problem is probably that your model doesn't return a scalar value.
It's highly recommended to make yourself familiar with numpy. For instance, given const and y_real are numpy arrays, you don't need loops and append to implement your model function. It can be written as follows (note that it returns the sum of the squared differences):
import numpy as np
const = np.array([ # your values here ])
y_real = np.array([ # your values here ])
def model(x, const, y_real):
    arr = (
        x[0] + (x[1] * ((1 - np.exp(-const / x[4])) / (const / x[4])))
        + (x[2] * ((((1 - np.exp(-const / x[4])) / (const / x[4])))
                   - (np.exp(-const / x[4]))))
        + x[3] * ((((1 - np.exp(-const / x[5])) / (const / x[5])))
                  - (np.exp(-const / x[5])))
    )
    return np.sum((y_real - arr)**2)
In addition, there are a few more things that do not make sense and it's still not 100% clear to me what you are trying to achieve:
Subtracting a vector of zeros from the evaluated model inside fun
Subtracting a vector of zeros from leval inside fun
Adding the constraint fun(x) >= 0. What's the purpose of this constraint? The function returns a vector consisting of zeros or ones, so this constraint is always fulfilled.
By ignoring the constraint fun (which, by the way, is not differentiable and contradicts the mathematical assumptions of the SLSQP algorithm), you can write:
from scipy.optimize import minimize
x0 = np.array([0.001, 0.001, 0.001, 0.001, 1.0e-10, 1.0e-10])
res = minimize(lambda x: model(x, const, y_real), x0, method="SLSQP")
Team
For this case, I worked out part of a functional solution using the perspective and some good tips from #joni in the previous answer (very kind for that). It is only partial because it works approximately for the first third of the dataset; for the rest of the dataset you can use the code from the previous case (Calibrate parameters of Yield Curve Nelson Siegel Svensson).
I share it with you to reuse and to give an idea of one possible two-part solution; please comment if you have a more sophisticated solution.
from scipy.optimize import minimize
import numpy as np
from math import trunc
from nelson_siegel_svensson import NelsonSiegelSvenssonCurve
import pandas as pd
import matplotlib.pyplot as plt
# days
const = np.array([30,90,270,548,913,1278,1643,2008,2373,2738,3103,3468,3833,4198,4563,4928,5293,5658,6023,6388,6935,7300,7665,8030])/365
# empty values represented with 0
y_real = np.array([3.33156,3.44928,3.62778,3.74313,3.96015,4.384,4.4705,4.55701,4.63817,4.69949,4.76081,4.82213,4.87285,4.8681,4.86336,4.85861,4.85387,4.84912,4.87039,4.89833,4.94286,4.98739,5.03192,5.07645])
def model(x, const, y_real):
    arr = np.array([])
    for val in const:
        arr = np.append(arr, (x[0]) + (x[1]*((1-np.exp(-val/x[4]))/(val/x[4]))) + (x[2]*((((1-np.exp(-val/x[4]))/(val/x[4]))) - (np.exp(-val/x[4])))) + x[3]*((((1-np.exp(-val/x[5]))/(val/x[5]))) - (np.exp(-val/x[5]))))
    return np.sum((y_real - arr)**2)
# initial coefficients of Nelson Siegel Svensson
x0 = np.array([0.001, 0.001, 0.001, 0.001, 1.0e-10, 1.0e-10])
res = minimize(model, x0, method='SLSQP', args=(const, y_real),
options={'maxiter': 10000, 'ftol': 1e-190})
print(res.x)
X_fix = np.linspace(start=const[0], stop=const[-1], num=(const.size*20))
NSS = NelsonSiegelSvenssonCurve(beta0=res.x[0], beta1=res.x[1], beta2=res.x[2], beta3=res.x[3], tau1=res.x[4], tau2=res.x[5])
pd_interpolation = pd.DataFrame(columns=['Period','Value'])
font = {'family': 'serif',
'color': '#1F618D',
'weight': 'bold',
'size': 14,
}
font_x_y = {'family': 'serif',
'color': '#C70039',
'weight': 'bold',
'size': 13,
'style': 'oblique'
}
fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(10, 6))
minx = -const[1]*3
config_manager = plt.get_current_fig_manager()
config_manager.set_window_title("Visualización " )
screen_x, screen_y = config_manager.window.wm_maxsize()
anchura = str(trunc(screen_x/12))
altura = str(trunc(screen_y/8))
middle_window = "+" + anchura + "+" + altura
config_manager.window.wm_geometry(middle_window)
plt.title('Interpolación ', fontdict=font, loc='center')
plt.xlabel('Periodo', fontdict=font_x_y)
plt.ylabel('Aproximación', fontdict=font_x_y, rotation=90, labelpad=10)
ax.set_ylim(ymin=0, ymax=(np.amax(y_real)*1.1))
ax.set_xlim(xmin=minx, xmax=(np.amax(const)*1.03))
ax.plot(const, y_real, 'ro', label='Dato real')
ax.plot(X_fix, NSS(X_fix),'--', label='Dato interpolado')
ax.legend(loc='lower right', frameon=False)
plt.show()
And the resulting graph is this:
I have a piece of code that worked well when I optimized an advertising budget with 2 variables (channels), but when I added additional channels, it stopped optimizing with no error messages.
import numpy as np
import scipy.optimize as sco
# setup variables
media_budget = 100000 # total media budget
media_labels = ['launchvideoviews', 'conversion', 'traffic', 'videoviews', 'reach'] # channel names
media_coefs = [0.3524764781, 5.606903166, -0.1761937775, 5.678596017, 10.50445914]  # model coefficients
media_drs = [-1.15, 2.09, 6.7, -0.201, 1.21] # diminishing returns
const = -243.1018144
# the function for our model
def model_function(x, media_coefs, media_drs, const):
    # transform variables and multiply them by coefficients to get contributions
    channel_1_contrib = media_coefs[0] * x[0]**media_drs[0]
    channel_2_contrib = media_coefs[1] * x[1]**media_drs[1]
    channel_3_contrib = media_coefs[2] * x[2]**media_drs[2]
    channel_4_contrib = media_coefs[3] * x[3]**media_drs[3]
    channel_5_contrib = media_coefs[4] * x[4]**media_drs[4]
    # sum contributions and add constant
    y = channel_1_contrib + channel_2_contrib + channel_3_contrib + channel_4_contrib + channel_5_contrib + const
    # return negative conversions for the minimize function to work
    return -y
# set up guesses, constraints and bounds
num_media_vars = len(media_labels)
guesses = num_media_vars*[media_budget/num_media_vars,] # starting guesses: divide budget evenly
args = (media_coefs, media_drs, const) # pass non-optimized values into model_function
con_1 = {'type': 'eq', 'fun': lambda x: np.sum(x) - media_budget} # so we can't go over budget
constraints = (con_1)
bound = (0, media_budget) # spend for a channel can't be negative or higher than budget
bounds = tuple(bound for x in range(5))
# run the SciPy Optimizer
solution = sco.minimize(model_function, x0=guesses, args=args, method='SLSQP', constraints=constraints, bounds=bounds)
# print out the solution
print(f"Spend: ${round(float(media_budget),2)}\n")
print(f"Optimized CPA: ${round(media_budget/(-1 * solution.fun),2)}")
print("Allocation:")
for i in range(len(media_labels)):
    print(f"-{media_labels[i]}: ${round(solution.x[i],2)} ({round(solution.x[i]/media_budget*100,2)}%)")
And the result is
Spend: $100000.0
Optimized CPA: $-0.0
Allocation:
-launchvideoviews: $20000.0 (20.0%)
-conversion: $20000.0 (20.0%)
-traffic: $20000.0 (20.0%)
-videoviews: $20000.0 (20.0%)
-reach: $20000.0 (20.0%)
Which is the same as the initial guesses argument.
Thank you very much!
Update: Following #joni's comment, I passed the gradient function explicitly, but still no result.
I don't know how to change the constraints to test #chthonicdaemon's comment yet.
import numpy as np
import scipy.optimize as sco
# setup variables
media_budget = 100000 # total media budget
media_labels = ['launchvideoviews', 'conversion', 'traffic', 'videoviews', 'reach'] # channel names
media_coefs = [0.3524764781, 5.606903166, -0.1761937775, 5.678596017, 10.50445914]  # model coefficients
media_drs = [-1.15, 2.09, 6.7, -0.201, 1.21] # diminishing returns
const = -243.1018144
# the function for our model
def model_function(x, media_coefs, media_drs, const):
    # transform variables and multiply them by coefficients to get contributions
    channel_1_contrib = media_coefs[0] * x[0]**media_drs[0]
    channel_2_contrib = media_coefs[1] * x[1]**media_drs[1]
    channel_3_contrib = media_coefs[2] * x[2]**media_drs[2]
    channel_4_contrib = media_coefs[3] * x[3]**media_drs[3]
    channel_5_contrib = media_coefs[4] * x[4]**media_drs[4]
    # sum contributions and add constant (objective function)
    y = channel_1_contrib + channel_2_contrib + channel_3_contrib + channel_4_contrib + channel_5_contrib + const
    # return negative conversions for the minimize function to work
    return -y
# partial derivative of the objective function
def fun_der(x, media_coefs, media_drs, const):
    d_chan1 = 1
    d_chan2 = 1
    d_chan3 = 1
    d_chan4 = 1
    d_chan5 = 1
    return np.array([d_chan1, d_chan2, d_chan3, d_chan4, d_chan5])
# set up guesses, constraints and bounds
num_media_vars = len(media_labels)
guesses = num_media_vars*[media_budget/num_media_vars,] # starting guesses: divide budget evenly
args = (media_coefs, media_drs, const) # pass non-optimized values into model_function
con_1 = {'type': 'eq', 'fun': lambda x: np.sum(x) - media_budget} # so we can't go over budget
constraints = (con_1)
bound = (0, media_budget) # spend for a channel can't be negative or higher than budget
bounds = tuple(bound for x in range(5))
# run the SciPy Optimizer
solution = sco.minimize(model_function, x0=guesses, args=args, method='SLSQP', constraints=constraints, bounds=bounds, jac=fun_der)
# print out the solution
print(f"Spend: ${round(float(media_budget),2)}\n")
print(f"Optimized CPA: ${round(media_budget/(-1 * solution.fun),2)}")
print("Allocation:")
for i in range(len(media_labels)):
    print(f"-{media_labels[i]}: ${round(solution.x[i],2)} ({round(solution.x[i]/media_budget*100,2)}%)")
The reason you are not able to solve this exact problem turns out to be all about the specific coefficients you have. For the problem as specified, the optimum appears to be near allocations where some spends are zero. However, at spends near zero, due to the negative exponents in media_drs, the objective function rapidly becomes infinite. I believe this is what is causing the issues you are experiencing. I can get a solution with success = True by changing the 6.7 to 0.7 in the coefficients and setting a lower bound larger than 0 to stop the objective function from exploding. So this isn't so much a programming issue as a problem-formulation issue.
I cannot imagine it would be true that you would see more payoff when you reduce the budget on a particular item, so all the negative powers in media_drs seem off to me.
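To see the blow-up concretely, here is a tiny illustration using only the first channel's coefficient and exponent from the question:
coef, dr = 0.3524764781, -1.15   # positive coefficient, negative exponent
for spend in (1000.0, 10.0, 0.1, 0.001):
    print(spend, coef * spend**dr)
# the contribution grows without bound as the spend approaches zero,
# so the negated objective that SLSQP minimizes is unbounded near x[i] = 0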
I will also post here some improvements I made while debugging this issue. Notice that I'm using numpy arrays more to make some of the functions easier to read. Also notice how I have calculated a correct jacobian:
import numpy as np
import scipy.optimize as sco
# setup variables
media_budget = 100000 # total media budget
media_labels = ['launchvideoviews', 'conversion', 'traffic', 'videoviews', 'reach'] # channel names
media_coefs = np.array([0.3524764781, 5.606903166, -0.1761937775, 5.678596017, 10.50445914])  # model coefficients
media_drs = np.array([-1.15, 2.09, 1.7, -0.201, 1.21]) # diminishing returns
const = -243.1018144
# the function for our model
def model_function(x, media_coefs, media_drs, const):
    # transform variables and multiply them by coefficients to get contributions
    channel_contrib = media_coefs * x**media_drs
    # sum contributions and add constant
    y = channel_contrib.sum() + const
    # return negative conversions for the minimize function to work
    return -y
def model_function_jac(x, media_coefs, media_drs, const):
    dy_dx = media_coefs * media_drs * x**(media_drs-1)
    return -dy_dx
# set up guesses, constraints and bounds
num_media_vars = len(media_labels)
guesses = num_media_vars*[media_budget/num_media_vars,] # starting guesses: divide budget evenly
args = (media_coefs, media_drs, const) # pass non-optimized values into model_function
con_1 = {'type': 'ineq', 'fun': lambda x: media_budget - sum(x)} # so we can't go over budget
constraints = (con_1,)
bound = (10, media_budget) # spend for a channel can't be negative or higher than budget
bounds = tuple(bound for x in range(5))
# run the SciPy Optimizer
solution = sco.minimize(
model_function, x0=guesses, args=args,
method='SLSQP',
jac=model_function_jac,
constraints=constraints,
bounds=bounds
)
# print out the solution
print(solution)
print(f"Spend: ${round(float(media_budget),2)}\n")
print(f"Optimized CPA: ${round(media_budget/(-1 * solution.fun),2)}")
print("Allocation:")
for i in range(len(media_labels)):
    print(f"-{media_labels[i]}: ${round(solution.x[i],2)} ({round(solution.x[i]/media_budget*100,2)}%)")
This solution at least "works" in the sense that it reports a successful solve and returns an answer different from the initial guess.
I have a set of equations of the form: Y = aA + bB
where Y is a known vector of floats (only this one is known!), a and b are unknown scalars (floats), and A and B are unknown vectors of floats. Each equation has its own Y, a, b, whereas all equations share the same unknown vectors A and B.
I have a set of such equations, so my problem is to minimize the function:
(Y-aA-bB)+(Y'-a'A-b'B)+....
I also have many inequality constraints of the type: Ai > Aj (Ai is the i-th element of vector A), Bi >= Bk, Bi > 0, a > a', ...
Is there any software or library (ideally for Python) which can handle this problem?
General remarks
This is a linear problem (at least in the linear least-squares sense, continue reading)!
It's also incompletely specified, as it's not clear whether there should always be a feasible solution in your case or whether you want to minimize some given loss in general. Your text sounds like the latter, but in that case one has to choose the loss (which makes a difference in regard to possible algorithms). Let's take the Euclidean norm (probably the best pick here)!
Ignoring constraints for a moment, we can view this problem as a basic least-squares solution to a linear matrix equation (Euclidean norm vs. squared Euclidean norm does not make a difference!):
min || b - Ax ||^2
Here:
M = number of Y's
N = size of Y
b = (Y0,
Y1,
...) -> shape: M*N (flattened: Y_x = (y_x_0, y_x_1).T)
A = ((a0, 0, 0, ..., b0, 0, 0, ...),
(0, a0, 0, ..., 0, b0, 0, ...),
(0, 0, a0, ..., 0, 0, b0, ...),
...
(a1, 0, 0, ..., b1, 0, 0, ...)) -> shape: (M*N, N*2)
x = (A0, A1, A2, ... B0, B1, B2, ...) -> shape: N*2 (one for A, one for B)
What you should do
If unconstrained:
Convert to standard form and use numpy's lstsq (see the sketch right after this list)
If constrained:
Either use customized optimization algorithms, or:
Linear-programming (if minimizing absolute-differences / l1-norm)
I'm too lazy to formulate it for scipy's linprog
Not that hard, but l1-norm is non-trivial using scipy's API
Much easier to formulate with cvxpy (obj=cvxpy.norm(X, 1))
Quadratic-programming / Second-order-cone-programming (if minimizing euclidean norm / l2-norm)
Again, too lazy to formulate it; no specialized solver available in scipy yet
Could be easily formulated with cvxpy (obj=cvxpy.norm(X, 2))
Emergency: use general-purpose constrained nonlinear-optimization algorithms like SLSQP -> see code
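For the unconstrained case, a minimal sketch of the lstsq formulation described above (it assumes list_partials is a list of (Y, a, b) tuples, as in the demo code further down):
import numpy as np
def solve_unconstrained(list_partials, N):
    # stack the Y's into one long right-hand side and build the block matrix A
    b_vec = np.concatenate([Y for Y, a, b in list_partials])              # shape (M*N,)
    A_mat = np.vstack([np.hstack([a * np.eye(N), b * np.eye(N)])          # shape (M*N, 2*N)
                       for Y, a, b in list_partials])
    x, *_ = np.linalg.lstsq(A_mat, b_vec, rcond=None)
    return x[:N], x[N:]                                                   # estimated A and B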
Some hacky code (not the best approach!)
This code:
Is just a demo!
Uses general nonlinear optimization algorithms from scipy
Therefore:
easier to formulate
Less fast & robust than LP, QP, SOCP
But will achieve approximately the same result as convergence on convex optimization problems is guaranteed
Uses numerical differentiation whenever needed
(author too lazy to add gradients)
this can really hurt if performance is important
Is really ugly in terms of np.repeat vs. broadcasting!
Code:
import numpy as np
from scipy.optimize import minimize

np.random.seed(1)

""" Fake-problem (usually the job of the question-author!) """

def get_partial(N=10):
    Y = np.random.uniform(size=N)
    a, b = np.random.uniform(size=2)
    return Y, a, b

""" Optimization """

def optimize(list_partials, N, M):
    """ General approach:
        This is a linear system of equations (with constraints)
        Basic (unconstrained) form: min || b - Ax ||^2
    """
    Y_all = np.vstack(list(map(lambda x: x[0], list_partials))).ravel()        # flat 1d
    a_all = np.hstack(list(map(lambda x: np.repeat(x[1], N), list_partials)))  # repeat to be of same shape
    b_all = np.hstack(list(map(lambda x: np.repeat(x[2], N), list_partials)))  # """

    def func(x):
        A = x[:N]
        B = x[N:]
        return np.linalg.norm(Y_all - a_all * np.repeat(A, M) - b_all * np.repeat(B, M))

    """ Example constraints: A >= B element-wise """
    cons = ({'type': 'ineq',
             'fun': lambda x: x[:N] - x[N:]})

    res = minimize(func, np.zeros(N*2), constraints=cons, method='SLSQP', options={'disp': True})
    print(res)
    print(Y_all - a_all * np.repeat(res.x[:N], M) - b_all * np.repeat(res.x[N:], M))

""" Test """
M = 4
N = 3
list_partials = [get_partial(N) for i in range(M)]
optimize(list_partials, N, M)
Output:
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.9019356096498999
Iterations: 12
Function evaluations: 96
Gradient evaluations: 12
fun: 0.9019356096498999
jac: array([ 1.03786588e-04, 4.84041870e-04, 2.08129734e-01,
1.57609582e-04, 2.87599862e-04, -2.07959406e-01])
message: 'Optimization terminated successfully.'
nfev: 96
nit: 12
njev: 12
status: 0
success: True
x: array([ 1.82177105, 0.62803449, 0.63815278, -1.16960281, 0.03147683,
0.63815278])
[ 3.78873785e-02 3.41189867e-01 -3.79020251e-01 -2.79338679e-04
-7.98836875e-02 7.94168282e-02 -1.33155595e-01 1.32869391e-01
-3.73398306e-01 4.54460178e-01 2.01297470e-01 3.42682496e-01]
I did not check the result! If there is an error, it's an implementation error, not a conceptual one (my opinion)!
I agree with sascha that this is a linear problem. As I do not like constraints very much, I actually prefer to make it non-linear and without constraints. I do so by setting the vector A = (a1**2, a1**2+a2**2, a1**2+a2**2+a3**2, ...); this ensures that it is all positive and that A_i > A_j for i > j. That makes errors a bit problematic, as you now have to consider error propagation to get A1, A2, etc., including correlation, but I will make an important point on that at the end. The "simple" solution would look as follows:
import numpy as np
from scipy.optimize import leastsq
from random import random

np.set_printoptions(linewidth=190)

def generate_random_vector(n, sortIt=True):
    out = np.fromiter((random() for x in range(n)), float)
    if sortIt:
        out.sort()
    return out

def residuals(parameters, dataVec, dataLength, vecDims):
    aParams = parameters[:dataLength]
    bParams = parameters[dataLength:2*dataLength]
    AParams = parameters[-2*vecDims:-vecDims]
    BParams = parameters[-vecDims:]
    YList = dataVec
    AVec = [a**2 for a in AParams]   # assures A_i > 0
    BVec = [b**2 for b in BParams]
    AAVec = np.cumsum(AVec)          # assures A_i > A_j for i > j
    BBVec = np.cumsum(BVec)
    dist = [np.array(Y) - a*np.array(AAVec) - b*np.array(BBVec) for Y, a, b in zip(YList, aParams, bParams)]
    dist = np.ravel(dist)
    return dist

if __name__ == "__main__":
    aList = generate_random_vector(20, sortIt=False)
    bList = generate_random_vector(20, sortIt=False)
    AVec = generate_random_vector(5)
    BVec = generate_random_vector(5)
    YList = [a*AVec + b*BVec for a, b in zip(aList, bList)]
    aGuess = 20*[.2]
    bGuess = 20*[.3]
    AGuess = 5*[.4]
    BGuess = 5*[.5]
    bestFitValues, covMX, infoDict, messages, ier = leastsq(residuals, aGuess+bGuess+AGuess+BGuess, args=(YList, 20, 5), full_output=True)
    print("a")
    print(aList)
    besta = bestFitValues[:20]
    print(besta)
    print("b")
    print(bList)
    bestb = bestFitValues[20:40]
    print(bestb)
    print("A")
    print(AVec)
    bestA = bestFitValues[-2*5:-5]
    realBestA = np.cumsum([x**2 for x in bestA])
    print(realBestA)
    print("B")
    print(BVec)
    bestB = bestFitValues[-5:]
    realBestB = np.cumsum([x**2 for x in bestB])
    print(realBestB)
    print(covMX)
The problem with errors and correlation is that the solution to the problem is not unique. If Y = a A + b B is a solution and we, e.g., rotate such that A = c E + s F and B = -s E + c F, then Y = (ac-bs) E + (as+bc) F = e E + f F is also a solution. The parameter space is, hence, completely flat at "the solution", resulting in huge errors and apocalyptic correlations.
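A quick numeric illustration of the rotation argument above (arbitrary example values):
import numpy as np
rng = np.random.default_rng(1)
A, B = rng.random(5), rng.random(5)
a, b = 0.7, 1.3
c, s = np.cos(0.4), np.sin(0.4)
E, F = c*A - s*B, s*A + c*B               # chosen so that A = c*E + s*F and B = -s*E + c*F
e, f = a*c - b*s, a*s + b*c
print(np.allclose(a*A + b*B, e*E + f*F))  # True: a different parameter set reproduces the same Y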
I have used Python to perform optimization in the past; however, I am now trying to use a matrix as the input for the objective function as well as set bounds on the individual element values and the sum of the value of each row in the matrix, and I am encountering problems.
Specifically, I would like to pass the objective function ObjFunc three parameters - w, p, ret - and then minimize the value of this function (technically I am trying to maximize the function by minimizing the value of -1*ObjFunc) by adjusting the value of w subject to the bound that all elements of w should fall within the range [0, 1] and the constraint that sum of each row in w should sum to 1.
I have included a simplified piece of example code below to demonstrate the issue I'm encountering. As you can see, I am using the minimize function from scipy.optimize. The problems begin in the first line of the objective function, x = np.dot(p, w), because the optimization procedure has flattened the matrix into a one-dimensional vector - a problem that does not occur when the function is called without performing optimization. The bounds = b and constraints = c are both producing errors as well.
I know that I am making an elementary mistake in how I am approaching this optimization and would appreciate any insight that can be offered.
import numpy as np
from scipy.optimize import minimize
def objFunc(w, p, ret):
    x = np.dot(p, w)
    y = np.multiply(x, ret)
    z = np.sum(y, axis=1)
    r = z.mean()
    s = z.std()
    ratio = r/s
    return -1 * ratio
# CREATE MATRICES
# returns, ret, of each of the three assets in the 5 periods
ret = np.matrix([[0.10, 0.05, -0.03], [0.05, 0.05, 0.50], [0.01, 0.05, -0.10], [0.01, 0.05, 0.40], [1.00, 0.05, -0.20]])
# probability, p, of being in each state {X, Y, Z} in each of the 5 periods
p = np.matrix([[0,0.5,0.5], [0,0.6,0.4], [0.2,0.4,0.4], [0.3,0.3,0.4], [1,0,0]])
# initial equal weights, w
w = np.matrix([[0.33333,0.33333,0.33333],[0.33333,0.33333,0.33333],[0.33333,0.33333,0.33333]])
# OPTIMIZATION
b = [(0, 1)]
c = ({'type': 'eq', 'fun': lambda w_: np.sum(w, 1) - 1})
result = minimize(objFunc, w, (p, ret), method = 'SLSQP', bounds = b, constraints = c)
Digging into the code a bit: minimize calls optimize._minimize._minimize_slsqp. One of the first things it does is:
x = asfarray(x0).flatten()
So you need to design your objFunc to work with the flattened version of w. It may be enough to reshape it at the start of that function.
I read the code from an IPython session, but you can also find it in your scipy directory:
/usr/local/lib/python3.5/dist-packages/scipy/optimize/_minimize.py
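A minimal sketch of that reshaping idea, adapted from the question's objFunc (untested; it assumes the 3x3 shape of w from the question, and the bounds and constraint then have to be stated on the flattened 9-element vector):
def objFunc(w_flat, p, ret):
    w = np.asarray(w_flat).reshape(3, 3)   # undo the flattening done by minimize
    x = np.dot(p, w)
    y = np.multiply(x, ret)
    z = np.sum(y, axis=1)
    return -1 * (z.mean() / z.std())
b = [(0, 1)] * 9                           # one (min, max) pair per flattened element
c = ({'type': 'eq', 'fun': lambda w_: np.asarray(np.sum(w_.reshape(3, 3), axis=1)).ravel() - 1})
result = minimize(objFunc, np.asarray(w).ravel(), (p, ret), method='SLSQP', bounds=b, constraints=c)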