How to constrain the weight of characteristic variables in regression - python

I have a data sample (let's say 10 continuous variables and one dependent variable) and need to fit a model for prediction. I would like to constrain the weight of every variable to a fixed range, e.g. an absolute value of at most 0.2, so that no weight is larger than 0.2 or smaller than -0.2. I tried lasso and ridge regression from sklearn.linear_model (and also ElasticNet) to control the weights, but the results are not good: there are always one or two extremely large weights, and when I use a large alpha the R-squared shows that the model is really bad. I tried to write my own method, but I could only constrain the sum of the weights, not each individual weight. SVR gives a pretty close answer, but I would still like to ask whether there are good options for multiple regression with self-defined constraints.
import numpy as np
from scipy.optimize import shgo
def my_general_linear_model_func(A1,b1):
    num_x = np.shape(A1)[1]
    def my_func(x):
        ls = 0.5*(b1-np.dot(A1,x))**2
        result = np.sum(ls)
        return result
    def g1(x):
        return np.sum(x) #sum of X >= 0
    def g2(x):
        return 1-np.sum(x) #sum of X <= 1
    cons = ({'type': 'ineq', 'fun': g1}
            ,{'type': 'ineq', 'fun': g2})
    x0 = np.zeros(num_x)
    bnds = [(0,1)]
    for i in range(num_x-1):
        bnds.append((0,1))
    res1 = shgo(my_func,
                bounds = bnds,
                constraints=cons)
    return res1
A1 = np.array([[0.12,5.96,3.14],[0.68,7.89,4.56]])
b1 = np.array([3,5])
my_general_linear_model_func(A1,b1)
The result:
fun: 0.07651391974288956
funl: array([0.07651392, 0.11079534, 0.2564125 ])
message: 'Optimization terminated successfully.'
nfev: 53
nit: 2
nlfev: 49
nlhev: 0
nljev: 12
success: True
x: array([1.12339358e-16, 5.62146099e-02, 9.43785390e-01])
xl: array([[1.12339358e-16, 5.62146099e-02, 9.43785390e-01],
[3.90241087e-01, 5.00000000e-01, 1.09758913e-01],
[5.00000000e-01, 5.00000000e-01, 0.00000000e+00]])
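For the per-weight limit described in the question (every weight between -0.2 and 0.2), one option that may be worth trying is scipy.optimize.lsq_linear, which solves a linear least-squares problem with box bounds on each coefficient. The following is only a minimal sketch: the data, true_w and the noise level are made up for illustration, and no intercept is fitted.

```python
import numpy as np
from scipy.optimize import lsq_linear

# Made-up data: 200 samples, 10 continuous variables (illustration only)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.array([0.5, -0.4, 0.15, 0.1, -0.05, 0.0, 0.3, -0.1, 0.05, 0.2])
y = X @ true_w + 0.1 * rng.normal(size=200)

# Least squares with every weight restricted to [-0.2, 0.2]
res = lsq_linear(X, y, bounds=(-0.2, 0.2))
weights = res.x

print(weights)                  # fitted weights, each within the bounds
print(np.abs(weights).max())    # never exceeds 0.2
```

Weights whose unconstrained estimate lies outside the box end up on the boundary, so the fit is worse than ordinary least squares by construction; the bounds only guarantee the limit on each weight.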

Related

Minimizing function using scipy with constraints

I have the following constraints for the problem,
I want to minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:
1. w_i is at maximum, ul
2. w_i is, at least, 0.05
3. The sum of all w for a sector code can not be bigger than 0.50
So basically I want to generate a w_i for each row, but I don't know how to implement the third restriction with scipy.
With scipy.optimize.lsq_linear I can force the first two conditions with bound = (0.05, ul), but I don't know how to force the third one.
import pandas as pd
import scipy
import numpy as np
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
df
I think you are trying to do something like this:
import pandas as pd
import scipy
import numpy as np
'''
Minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:
Constraints:
1. w_i is at maximum, ul [NOTE: I think this should say 'minimum']
2. w_i is, at least, 0.05 [NOTE: I think this should say 'at most']
3. The sum of all w for a sector code can not be bigger than 0.50
'''
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
print(df)
gb = df.groupby('Sector Code')['ul']
codeCounts = gb.count().to_list()
cumCounts = [0] + [sum(codeCounts[:i + 1]) for i in range(len(codeCounts))]
newIdx = []
for code, dfGp in gb:
    newIdx += list(dfGp.index)
df = df.reindex(newIdx)
# For each unique Sector Code, create constraint that 0.50 minus the sum of w for that code must be non-negative:
def foo(i, c):
    # return a closure that acts as a constraint for the i'th interval defined by c[i-1]:c[i]
    def bar(x):
        return 0.50 - sum(x[c[i-1]:c[i]])
    return bar
cons = [{'type': 'ineq', 'fun': foo(i, cumCounts)} for i in range(1, len(cumCounts))]
# Value of bounds argument to enforce ul <= w_i <= 0.05
bnds = tuple((ul_i, 0.05) for code, ul_group in gb for ul_i in ul_group)
# Initial guess
n = len(df.index)
w_i = np.ones(n) * (1 / n)
# The objective function to be minimized
uw_sum = df.uw.sum()
def fun(w):
    return (pd.Series(w) - df.uw).pow(2).sum() / uw_sum
# Optimize using scipy minimize() function
from scipy.optimize import minimize
res = minimize(fun, w_i, method='SLSQP', bounds=bnds, constraints=cons)
print(res)
df['w'] = res.x
df = df.reindex(range(len(df.index)))
print(df)
Explanation:
Use groupby() to get the row count for each unique Sector Code value and also to construct an index ordered by Sector Code, which we use to re-order the original input df
create a list of constraint dictionaries to be passed to the optimizer, one for each Sector Code, which will use python closures to constrain the sum of the corresponding solution elements to be <= 0.50
create a sequence of bounds to constrain solution elements w_i to be between ul and 0.05
create the objective function to return the sum of squared differences of w_i, uw_i divided by sum(uw)
call minimize() from scipy.optimize with the above constraints, bounds, objective function and an initial guess
add a column to the dataframe with the result and call reindex() to restore the original row order.
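The reason for building the constraints through a factory function rather than a bare lambda in a loop is Python's late binding of closure variables. A small sketch of the difference, with made-up group boundaries (cum_counts, x and make_constraint are illustrative names, not taken from the answer's code):

```python
import numpy as np

cum_counts = [0, 2, 5]   # hypothetical boundaries: rows 0-1 and rows 2-4
x = np.array([0.1, 0.2, 0.05, 0.05, 0.05])

def make_constraint(i, c):
    # i is bound when make_constraint is called, so each closure keeps its own slice
    def constraint(x):
        return 0.50 - np.sum(x[c[i - 1]:c[i]])
    return constraint

good = [make_constraint(i, cum_counts) for i in range(1, len(cum_counts))]

# Lambdas created in a loop all look up i when they are called,
# i.e. after the loop has finished, so they all use the last slice
late = [lambda x: 0.50 - np.sum(x[cum_counts[i - 1]:cum_counts[i]])
        for i in range(1, len(cum_counts))]

good_vals = [g(x) for g in good]   # one value per group
late_vals = [g(x) for g in late]   # the last group's value, repeated
print(good_vals, late_vals)
```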
Output:
uw ul Sector Code
0 0.006822 0.050000 40
1 0.017949 0.050000 40
2 0.001906 0.031289 40
3 0.000904 0.040318 20
4 0.001147 0.046904 15
... ... ... ...
1226 0.003653 0.033553 10
1227 0.002556 0.031094 10
1228 0.002816 0.041031 10
1229 0.010216 0.050000 40
1230 0.001559 0.033480 55
[1231 rows x 3 columns]
fun: 0.4487707682194904
jac: array([0.02089997, 0.00466947, 0.01358654, ..., 0.02070332, nan,
0.02188896])
message: 'Positive directional derivative for linesearch'
nfev: 919
nit: 5
njev: 1
status: 8
success: False
x: array([0.03730054, 0.0247585 , 0.02171931, ..., 0.03300862, 0.05 ,
0.03348039])
uw ul Sector Code w
0 0.006822 0.050000 40 0.050000
1 0.017949 0.050000 40 0.050000
2 0.001906 0.031289 40 0.031289
3 0.000904 0.040318 20 0.040318
4 0.001147 0.046904 15 0.046904
... ... ... ... ...
1226 0.003653 0.033553 10 0.033553
1227 0.002556 0.031094 10 0.031094
1228 0.002816 0.041031 10 0.041031
1229 0.010216 0.050000 40 0.050000
1230 0.001559 0.033480 55 0.033480
[1231 rows x 4 columns]
Note that success is False, so perhaps some work remains. Hopefully the dataframe related manipulations are helpful in addressing your question.
You already got a working answer from @constantstranger. IMO, there's just one problem: it's quite slow. More precisely, it took more than a minute to solve the problem on my machine.
Here are some notes on what can be done to speed up the solver:
Since Python has a noticeable overhead when calling functions, it's a good idea to keep the number of function calls low. For instance, evaluating one vector-valued constraint function is faster than evaluating multiple scalar constraint functions.
At the moment, all derivatives (the objective gradient and the constraint Jacobian) are approximated by finite differences. This is a real bottleneck because each evaluation of an approximated derivative goes hand in hand with multiple objective/constraint function evaluations. Instead, it's highly recommended to provide the exact derivatives or use algorithmic differentiation.
Last but not least, scipy.optimize.minimize is only suited for small to mid-sized problems at most. If you are willing to use another package, you could use IPOPT, the state-of-the-art NLP solver. The cyipopt package provides a scipy-like interface, so it isn't hard to switch from scipy.optimize.minimize.
Besides that, your problem is a (convex) quadratic optimization problem and can be formulated as follows:
min f(w) s.t. A*w <= 0.5, u_l <= w <= 0.05
with
f(w) = (1/sum(u_w)) * ||w - u_w||^2_2 = (1/sum(u_w)) * (w'Iw - 2u_w'*w + u_w'u_w)
where A[i,j] = 1 if w[j] belongs to sector i and 0 otherwise.
Then, solving the problem with IPOPT (note that we pass the exact derivatives) looks like this:
import numpy as np
import pandas as pd
from cyipopt import minimize_ipopt
# dataframe
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
# sectors
sectors = df["Sector Code"].unique()
# building the matrix A
A = np.zeros((sectors.size, len(df)))
for i, sec in enumerate(sectors):
    indices = df[df["Sector Code"] == sec].index.values
    A[i, indices] = 1
uw = df['uw'].values
uw_sum = uw.sum()
# objective
def obj(w):
    return np.sum((w - uw)**2) / uw_sum
# objective gradient
def grad(w):
    return (2*w - 2*uw) / uw_sum
# Linear Constraint A @ w <= 0.5 <=> 0.5 - A @ w >= 0
cons = [{'type': 'ineq', 'fun': lambda w: 0.5 - A @ w, 'jac': lambda w: -A}]
# variable bounds
bounds = [(u_i, 0.05) for u_i in df.ul.values]
# feasible initial guess
w0 = np.ones(len(df)) / len(df)
# solve the problem
res = minimize_ipopt(obj, x0=w0, jac=grad, bounds=bounds, constraints=cons)
print(res)
On my machine, this terminates in less than 2 seconds and yields
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
fun: 0.4306218505716169
info: {'x': array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038]), 'g': array([-3.51687688, -9.45217602, -7.88799127, -1.78825803, -1.86650095,
-5.09092925, -2.11181422, -1.35485327, -1.15847276, 0.35 ]), 'obj_val': 0.4306218505716169, 'mult_g': array([-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -2.857166e-09]), 'mult_x_L': array([1000.02960821, 1000.02197802, 1000.00000005, ..., 1000.00000011,
1000.02728049, 1000.00000006]), 'mult_x_U': array([0.00000000e+00, 0.00000000e+00, 5.34457820e-08, ...,
1.11498931e-07, 0.00000000e+00, 6.05340266e-08]), 'status': 2, 'status_msg': b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'}
message: b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'
nfev: 13
nit: 9
njev: 7
status: 2
success: False
x: array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038])
[Finished in 1.9s]
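The first two notes (one vector-valued constraint, exact derivatives) can also be tried with scipy.optimize.minimize alone. The sketch below uses made-up data instead of the CSV from the question, and the bounds (0, 0.2) are chosen only so that the toy problem is feasible; sector, uw and the sizes are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up problem: 3 sectors with 4 rows each (illustration only)
rng = np.random.default_rng(0)
n_sectors, per_sector = 3, 4
n = n_sectors * per_sector
sector = np.repeat(np.arange(n_sectors), per_sector)
uw = rng.uniform(0.0, 0.2, size=n)
uw_sum = uw.sum()

# A[i, j] = 1 if row j belongs to sector i, else 0
A = (sector[None, :] == np.arange(n_sectors)[:, None]).astype(float)

def obj(w):
    return np.sum((w - uw) ** 2) / uw_sum

def grad(w):
    # exact gradient of obj
    return 2.0 * (w - uw) / uw_sum

# One vector-valued inequality (0.5 - A @ w >= 0) with its exact Jacobian
cons = [{'type': 'ineq', 'fun': lambda w: 0.5 - A @ w, 'jac': lambda w: -A}]
bounds = [(0.0, 0.2)] * n
w0 = np.full(n, 0.05)   # feasible start: each sector sums to 0.2

res = minimize(obj, w0, jac=grad, method='SLSQP', bounds=bounds, constraints=cons)
print(res.success, A @ res.x)   # sector sums stay at or below 0.5
```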

The minimize function of scipy.optimize doesn't give the right answer

I'm trying to solve a minimization problem using the minimize function of SciPy. The objective function is simply the ratio of two multivariate normal distributions with different means and variances. I want to find the maximum of the function g_func, which is equivalent to finding the minimum of the function g_optimization. I also added the constraint x[0] = 0. Here, x is a vector with 8 elements. The objective function g_optimization is as follows:
import numpy as np
from scipy.optimize import minimize
# Set up mean and variance for two MVN distributions
n_trait = 8
sigma = np.full((n_trait, n_trait),0.0005)
np.fill_diagonal(sigma,0.005)
omega = np.full((n_trait, n_trait),0.0000236)
np.fill_diagonal(omega,0.0486)
sigma_pos = np.linalg.inv(np.linalg.inv(sigma)+np.linalg.inv(omega))
mu_pos = np.array([-0.01288244,0.08732091,0.01049617,0.0860966,0.10055626,0.07952922,0.04363669,-0.0061975])
mu_pri = 0
sigma_pri = omega
#objective function
def g_func(beta,mu_sim_pos):
    g1 = ((np.linalg.det(sigma_pri))**(1/2))/((np.linalg.det(sigma_pos))**(1/2))
    g2 = (-1/2)*np.linalg.multi_dot([np.transpose(beta-mu_sim_pos),np.linalg.inv(sigma_pos),beta-mu_sim_pos])
    g3 = (1/2)*np.linalg.multi_dot([np.transpose(beta-mu_pri),np.linalg.inv(sigma_pri),beta-mu_pri])
    g = g1*np.exp(g2+g3)
    return g
def g_optimization(beta,mu_sim_pos):
    return -1*g_func(beta,mu_sim_pos)
#optimization
start_point = np.full(8,0)
cons = ({'type': 'eq',
         'fun' : lambda x: np.array([x[0]])})
anws = minimize (g_optimization, [start_point], args=(mu_pos),
                 constraints=cons, options={'maxiter': 50}, tol=0.001)
anws
The optimization stops after two iterations, and the minimum value that the function gives is 0, at the point np.array([0,10.32837891,-1.62396508,10.13790152,12.38752653,9.11615259,3.53201544,-4.22115517]). This cannot be right, because even if we plug the starting point np.zeros(8) into g_optimization, the result is -657.0041125829354, which is smaller than 0. So the solution provided is definitely not a minimum.
g_optimization(np.zeros(8),mu_pos) #gives solution of -657.0041125829354
I'm not sure where I went wrong.
I would try a different solver. For example L-BFGS-B works well.
You can look at all options here.
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
constraints=cons, options={'maxiter': 50}, tol=0.001)
print(anws)
# success: True
# message: b'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
# fun: -21688.00879938617
# x: array([-0.0101048, 0.09937778, 0.01543875, 0.0980401, 0.11383878, 0.09086455, 0.05164822, -0.00280081])
EDIT:
L-BFGS-B can not handle general constraints h(x)=0, only bounding boxes on the variables:
Bounds on variables for L-BFGS-B, TNC, SLSQP, Powell, and trust-constr methods. There are two ways to specify the bounds:
Instance of Bounds class.
Sequence of (min, max) pairs for each element in x. None is used to specify no bound.
In your case you have to define 8 pairs of lower and upper limits.
For x[0] you have to make a tight bound as the method can not handle x_low == x_high.
bounds = [(None, None)] * 8
bounds[0] = (0, 0.00001)
anws = minimize (g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B', bounds=bounds,
options={'maxiter': 50}, tol=0.001)
# fun: -21467.48153792194
# x: array([0., 0.10039832, 0.01641271, 0.0990599, 0.11486735, 0.09188037, 0.05264228, -0.00183697])
Another alternative is to exclude the value x[0] from your optimisation problem:
def g_optimization(beta,mu_sim_pos):
    beta2 = np.empty(8)
    beta2[0] = 0
    beta2[1:] = beta
    return -1*g_func(beta2, mu_sim_pos)
start_point = np.zeros(7) # exclude x[0]
anws = minimize(g_optimization, [start_point], args=(mu_pos), method='L-BFGS-B',
options={'maxiter': 50}, tol=0.001)
# fun: -21467.47686079844
# x: array([0.10041797, 0.01648995, 0.09908046, 0.11487707, 0.09190585, 0.05269467, -0.00174722])
# ^ missing x[0]
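Something else that may be worth trying, independently of the solver: because exp is monotonic and g1 is a positive constant, maximizing g_func is equivalent to maximizing its exponent g2 + g3, which is a quadratic in beta and does not overflow or underflow. The sketch below assumes the same sigma, omega and mu_pos as in the question; neg_log_ratio and neg_log_ratio_grad are names introduced here, and SLSQP is chosen because it accepts the equality constraint x[0] = 0 directly.

```python
import numpy as np
from scipy.optimize import minimize

n_trait = 8
sigma = np.full((n_trait, n_trait), 0.0005)
np.fill_diagonal(sigma, 0.005)
omega = np.full((n_trait, n_trait), 0.0000236)
np.fill_diagonal(omega, 0.0486)
omega_inv = np.linalg.inv(omega)
sigma_pos_inv = np.linalg.inv(sigma) + omega_inv   # inverse of sigma_pos
mu_pos = np.array([-0.01288244, 0.08732091, 0.01049617, 0.0860966,
                   0.10055626, 0.07952922, 0.04363669, -0.0061975])

def neg_log_ratio(beta):
    # -(g2 + g3) from g_func with mu_pri = 0; the constant g1 is dropped
    d = beta - mu_pos
    return 0.5 * d @ sigma_pos_inv @ d - 0.5 * beta @ omega_inv @ beta

def neg_log_ratio_grad(beta):
    # exact gradient of neg_log_ratio
    return sigma_pos_inv @ (beta - mu_pos) - omega_inv @ beta

# Equality constraint x[0] = 0 with its (1 x 8) Jacobian
cons = {'type': 'eq',
        'fun': lambda x: np.array([x[0]]),
        'jac': lambda x: np.eye(n_trait)[:1]}

res = minimize(neg_log_ratio, np.zeros(n_trait), jac=neg_log_ratio_grad,
               method='SLSQP', constraints=cons, tol=1e-10)
print(res.x)
```

The Hessian of neg_log_ratio is the inverse of sigma, which is positive definite here, so this reformulated problem is convex and has a single minimum.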

machine learning and optimizing scipy

I am coding a machine learning exercise and used scipy.optimize.minimize to optimize my cost function, but SciPy doesn't return the right answer. What should I do?
code:
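import numpy as np
import pandas as pd
import scipy as sc
import scipy.optimize
import scipy.special
data1 = pd.read_csv('ex2data1.txt', header = None, names = ['exam1','exam2', 'y'])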
data1['ones'] = pd.Series(np.ones(100), dtype = int)
data1 = data1[['ones', 'exam1', 'exam2', 'y']]
X = np.matrix(data1.iloc[:, 0:3])
y = np.matrix(data1.iloc[:, 3:])
def gFunction(z):
    return sc.special.expit(-z)
def hFunction(theta, X):
    theta = np.matrix(theta).T
    h = np.matrix(gFunction(X.dot(theta)))
    return h
def costFunction(theta, X, y):
    m = y.size
    h = hFunction(theta, X).T
    j = (-1 / m) * (np.dot(np.log(h), y) + np.dot(np.log(1-h), (1-y)))
    return j
def gradientDescent(theta, X, y):
    theta = np.matrix(theta)
    m = y.size
    h = hFunction(theta, X)
    gradient = (1 / m) * X.T.dot(h - y)
    return gradient.flatten()
initial_theta = np.zeros(X.shape[1])
cost = costFunction(initial_theta, X, y)
grad = gradientDescent(initial_theta, X, y)
print('Cost: \n', cost)
print('Grad: \n', grad)
Cost:
[[ 0.69314718]]
Grad:
[[ -0.1 -12.00921659 -11.26284221]]
def optimizer(costFunction, theta, X, y, gradientDescent):
    optimum = sc.optimize.minimize(costFunction, theta, args = (X, y),
                                   method = None, jac = gradientDescent, options={'maxiter':400})
    return optimum
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.693147
Iterations: 0
Function evaluations: 106
Gradient evaluations: 94
Out[46]:
fun: matrix([[ 0.69314718]])
hess_inv: array([[1, 0, 0],
[0, 1, 0],
[0, 0, 1]])
jac: matrix([[ -0.1 , -12.00921659, -11.26284221]])
message: 'Desired error not necessarily achieved due to precision loss.'
nfev: 106
nit: 0
njev: 94
status: 2
success: False
x: array([ 0., 0., 0.])
This is the message, which says success: False.
I think I have done everything right; I don't know what is happening.
It's hard to debug something like this when:
code is not reproducible because of external data
question does not even try to explain what is optimized here
There are some strange design-decisions:
use of np.matrix -> do use np.array!
don't call the jacobian gradientDescent
And then in regards to your observation:
Iterations: 0
Function evaluations: 106
Gradient evaluations: 94
zero iterations while doing so many function-evaluations is a very bad sign. Something is very broken. Probably line-search is going crazy above, but that's just a guess.
Now what's broken?:
your jacobian is definitely broken!
i did not check the math, but:
your jacobian-shape is dependent on the number of samples when number of variables is fixed -> no ! that does not make sense!
Steps to do:
run with jac=False
If working: your cost function looks ok
If not working: your trouble probably (no proof) starts even there
repair the jacobian!
check the jacobian against check_grad
I wonder why you don't get any shape errors here. I do, when trying to mimic your input shapes and playing around with sample-size!
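The steps above (plain np.array, a gradient whose shape depends only on the number of parameters, and a comparison against check_grad) could look roughly like the following sketch. It uses made-up data because ex2data1.txt is not available here; cost, cost_grad and true_theta are names chosen for this illustration, and the cost is the usual logistic-regression negative log-likelihood written in a numerically stable form.

```python
import numpy as np
from scipy.optimize import check_grad, minimize
from scipy.special import expit

# Made-up data: intercept column plus two features (illustration only)
rng = np.random.default_rng(0)
m = 100
X = np.column_stack([np.ones(m), rng.normal(size=(m, 2))])
true_theta = np.array([0.5, 2.0, -1.0])
y = (rng.uniform(size=m) < expit(X @ true_theta)).astype(float)

def cost(theta, X, y):
    # mean of log(1 + exp(z)) - y*z, i.e. the logistic negative log-likelihood
    z = X @ theta
    return np.mean(np.logaddexp(0.0, z) - y * z)

def cost_grad(theta, X, y):
    # shape (n_params,), independent of the number of samples
    return X.T @ (expit(X @ theta) - y) / y.size

theta0 = np.zeros(X.shape[1])
# Difference between the analytic gradient and a finite-difference estimate
err = check_grad(cost, cost_grad, theta0, X, y)
print('gradient check error:', err)

res = minimize(cost, theta0, args=(X, y), jac=cost_grad, method='BFGS')
print(res.x, res.fun)
```

A check_grad value far from zero would point at the gradient function; a small value means cost and gradient are at least consistent with each other.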

Optimization using scipy

I am trying to build an efficient frontier as in the Markowitz problem.
I have written the code below, but I get the error "ValueError: Objective function must return a scalar". I have tested 'fun' with some values, for example, I input to the console:
W = np.ones([n])/n # start optimization with equal weights
cov_matrix = returns.cov()
fun = 0.5*np.dot(np.dot(W, cov_matrix), W) # variance of the portfolio
fun
The output is 0.00015337622774133828, which is a scalar.
I don't know what might be wrong. Any help is appreciated.
Code:
from scipy.optimize import minimize
import pandas as pd
import numpy as np
from openpyxl import load_workbook
wb = load_workbook('path/Assets_3.xlsx') # in this workbook there is data for returns.
# The next lines clean unnecessary first column and first row.
ws = wb.active
df = pd.DataFrame(ws.values)
df1 = df.drop(0,axis=1)
df1 = df1.drop(0)
df1 = df1.astype(float)
rf = 0.05
r_bar = 0.05
returns = df1.copy()
def efficient_frontier(rf, r_bar, returns):
    n = len(returns.transpose())
    W = np.ones([n])/n # start optimization with equal weights
    exp_ret = returns.mean()
    cov_matrix = returns.cov()
    fun = 0.5*np.dot(np.dot(W, cov_matrix), W) # variance of the portfolio
    cons = ({'type': 'eq', 'fun': lambda W: sum(W) - 1. },
            {'type': 'ineq', 'fun': lambda W: np.dot(exp_ret,W) - r_bar })
    bnds = [(0.,1.) for i in range(n)] # weights between 0..1.
    res = minimize(fun, W, (returns, cov_matrix, rf),
                   method='SLSQP', bounds = bnds, constraints = cons)
    return res
x= efficient_frontier(rf,r_bar,returns)
x
Some Data
1 2 3
1 0.060206 0.005781 0.001117
2 0.006463 -0.007390 0.001133
3 -0.003211 -0.015730 0.001167
4 0.044227 -0.006250 0.001225
5 -0.040571 -0.006910 0.001292
6 -0.007900 -0.006160 0.001208
7 0.068702 0.013836 0.001300
8 0.039286 0.009854 0.001350
9 0.012457 -0.007950 0.001358
10 -0.013758 0.001021 0.001283
11 -0.002616 -0.013600 0.001300
12 0.059004 -0.006090 0.001442
13 0.015566 0.002818 0.001308
14 -0.036454 0.001395 0.001283
15 0.058899 0.011072 0.001325
16 -0.043086 0.017070 0.001308
17 0.023156 -0.003350 0.001392
18 0.063705 0.000301 0.001417
19 0.017628 -0.001960 0.001508
20 -0.014567 -0.006990 0.001525
21 -0.007191 -0.013000 0.001425
22 -0.000815 0.014773 0.001450
23 0.046493 -0.001540 0.001542
24 0.051832 -0.008580 0.001742
25 -0.007151 0.001177 0.001633
26 -0.018196 -0.008680 0.001642
27 -0.013513 -0.008810 0.001675
28 -0.026493 -0.010510 0.001825
29 -0.003249 -0.014750 0.001800
30 0.001222 0.022258 0.001758
This code is a mess, and while I can show you something which runs, that does not mean anything.
You will see convergence to your starting-point, whatever that means in your task! It's a strong indicator that something is still very wrong (might be the underlying theory)!
Some additional remarks:
scipy's optimizers are built to work with numpy-arrays, not pandas DataFrames or Series objects!
the only things in your original question which hinted at pandas-usage were a var-name df and returns.cov(), which does not exist for numpy-arrays!
rf is never used anywhere!
there are multiple things in optimize's args which are not used!
it does not feel like a problem one should use scipy's optimizers for! (but it's possible; here we are paying for numerical-differentiation for example)
cvxpy would probably be a much, much better approach (cleaner, faster, more accurate) if I interpret the problem correctly (did not analyze much)
but the same rules apply: some python-knowledge is needed!
Code:
from scipy.optimize import minimize
import numpy as np
import pandas as pd
rf = 0.05
r_bar = 0.05
returns = pd.DataFrame(np.random.randn(30, 3), columns=list('ABC')) # PANDAS DF
cov_matrix = returns.cov().to_numpy() # use PANDAS one last time
                                      # but result = np.array!
returns = returns.to_numpy() # From now on: np-only!
def fun(x, returns, cov_matrix, rf):
    return 0.5*np.dot(np.dot(x, cov_matrix), x)
def efficient_frontier(rf, r_bar, returns):
    n = len(returns.transpose())
    W = np.ones([n])/n # start optimization with equal weights
    exp_ret = returns.mean()
    cons = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1. }, # let's use numpy here
            {'type': 'ineq', 'fun': lambda x: np.dot(exp_ret, x) - r_bar })
    bnds = [(0.,1.) for i in range(n)] # weights between 0..1.
    res = minimize(fun, W, (returns, cov_matrix, rf),
                   method='SLSQP', bounds = bnds, constraints = cons)
    return res
x= efficient_frontier(rf,r_bar,returns)
print(x)
Output:
A B C
A 0.813375 -0.001370 0.173901
B -0.001370 1.482756 0.380514
C 0.173901 0.380514 1.285936
fun: 0.2604530793556774
jac: array([ 0.32863522, 0.62063321, 0.61345008])
message: 'Optimization terminated successfully.'
nfev: 35
nit: 7
njev: 3
status: 0
success: True
x: array([ 0.33333333, 0.33333333, 0.33333333])
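For comparison, here is a hedged sketch of a minimum-variance problem of the same shape that stays in numpy throughout. It is not the code above: the returns are randomly generated, the expected return is taken per asset (mean over axis 0), r_bar is set to a value that equal weights can reach, and ftol is tightened because portfolio variances are tiny compared with SLSQP's default tolerance. All of these choices are assumptions made for illustration.

```python
import numpy as np
from scipy.optimize import minimize

# Made-up returns for 3 assets over 30 periods (illustration only)
rng = np.random.default_rng(0)
returns = rng.normal(loc=[0.01, 0.005, 0.002],
                     scale=[0.04, 0.02, 0.005], size=(30, 3))
n = returns.shape[1]
exp_ret = returns.mean(axis=0)          # one expected return per asset
cov = np.cov(returns, rowvar=False)     # (n, n) covariance matrix
r_bar = exp_ret.mean()                  # reachable with equal weights

def variance(w):
    return 0.5 * w @ cov @ w

def variance_grad(w):
    return cov @ w

cons = ({'type': 'eq', 'fun': lambda w: np.sum(w) - 1.0},
        {'type': 'ineq', 'fun': lambda w: exp_ret @ w - r_bar})
bnds = [(0.0, 1.0)] * n
w0 = np.full(n, 1.0 / n)

res = minimize(variance, w0, jac=variance_grad, method='SLSQP',
               bounds=bnds, constraints=cons, options={'ftol': 1e-12})
print(res.x, variance(res.x))
```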

nonlinear optimization with vectors, scalars and inequality constraints

I have a set of equations of the form: Y=aA+bB
where Y is a known vector of floats (only this one is known!), a and b are unknown scalars (floats), and A and B are unknown vectors of floats. Each equation has its own Y, a and b, whereas all equations share the same unknown vectors A and B.
I have a set of such equations, so my problem is to minimize the function:
(Y-aA-bB)+(Y'-a'A-b'B)+....
I also have many inequality constraints of the type: Ai>Aj (Ai is the i-th element of vector A), Bi>=Bk, Bi>0, a>a', ...
Is there any software or library (ideally for python) which can handle this problem?
General remarks
This is a linear problem (at least in the linear least-squares sense, continue reading)!
It's also incompletely specified as it's not clear if there should be always a feasible solution in your case or if you want to minimize some given loss in general. Your text sounds like the latter, but in this case one has to chose the loss (which makes a difference in regards to possible algorithms). Let's take the euclidean-norm (probably the best pick here)!
Ignoring constraints for a moment, we can view this problem as basic least-squares solution to a linear matrix equation problem (euclidean-norm vs. squared euclidean-norm does not make a difference!).
min || b - Ax ||^2
Here:
M = number of Y's
N = size of Y
b = (Y0,
Y1,
...) -> shape: M*N (flattened: Y_x = (y_x_0, y_x_1).T)
A = ((a0, 0, 0, ..., b0, 0, 0, ...),
(0, a0, 0, ..., 0, b0, 0, ...),
(0, 0, a0, ..., 0, 0, b0, ...),
...
(a1, 0, 0, ..., b1, 0, 0, ...)) -> shape: (M*N, N*2)
x = (A0, A1, A2, ... B0, B1, B2, ...) -> shape: N*2 (one for A, one for B)
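The block structure above can be assembled with np.kron, after which the unconstrained case is a single call to np.linalg.lstsq. This is a minimal sketch under the same assumption as the layout above, namely that the scalars a_i and b_i are treated as given; the sizes and the random data (A_true, B_true, a, b) are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 4, 3                      # M equations, vectors of length N
A_true = rng.uniform(size=N)     # made-up unknowns used to generate Y
B_true = rng.uniform(size=N)
a = rng.uniform(size=M)          # one (a_i, b_i) pair per equation
b = rng.uniform(size=M)
Y = a[:, None] * A_true + b[:, None] * B_true   # shape (M, N)

# Row block i of the system matrix is (a_i * I_N, b_i * I_N)
A_mat = np.hstack([np.kron(a[:, None], np.eye(N)),
                   np.kron(b[:, None], np.eye(N))])   # shape (M*N, 2*N)
b_vec = Y.ravel()                                     # Y0, Y1, ... flattened

x, residuals, rank, sv = np.linalg.lstsq(A_mat, b_vec, rcond=None)
A_est, B_est = x[:N], x[N:]
print(A_est, B_est)
```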
What you should do
If unconstrained:
Convert to standard-form and use numpy's lstsq
If constrained:
Either use customized optimization algorithms, or:
Linear-programming (if minimizing absolute-differences / l1-norm)
I'm too lazy to formulate it for scipy's linprog
Not that hard, but l1-norm is non-trivial using scipy's API
Much easier to formulate with cvxpy (obj=cvxpy.norm(X, 1))
Quadratic-programming / Second-order-cone-programming (if minimizing euclidean norm / l2-norm)
Again, too lazy to formulate it; no special solver available at scipy yet
Could be easily formulated with cvxpy (obj=cvxpy.norm(X, 2))
Emergency: use general-purpose constrained nonlinear-optimization algorithms like SLSQP -> see code
Some hacky code (not the best approach!)
This code:
Is just a demo!
Uses general nonlinear optimization algorithms from scipy
Therefore:
easier to formulate
Less fast & robust than LP, QP, SOCP
But will achieve approximately the same result as convergence on convex optimization problems is guaranteed
Uses numerical differentiation (finite differences) whenever needed
(author too lazy to add gradients)
this can really hurt if performance is important
Is really ugly in terms of np.repeat vs. broadcasting!
Code:
import numpy as np
from scipy.optimize import minimize
np.random.seed(1)
""" Fake-problem (usually the job of the question-author!) """
def get_partial(N=10):
    Y = np.random.uniform(size=N)
    a, b = np.random.uniform(size=2)
    return Y, a, b
""" Optimization """
def optimize(list_partials, N, M):
    """ General approach:
        This is a linear system of equations (with constraints)
        Basic (unconstrained) form: min || b - Ax ||^2
    """
    # lists instead of map(): newer numpy versions need sequences for vstack/hstack
    Y_all = np.vstack([x[0] for x in list_partials]).ravel() # flat 1d
    a_all = np.hstack([np.repeat(x[1], N) for x in list_partials]) # repeat to be of same shape
    b_all = np.hstack([np.repeat(x[2], N) for x in list_partials]) # """
    def func(x):
        A = x[:N]
        B = x[N:]
        # NOTE: np.repeat(A, M) pairs position k with A[k // M]; np.tile(A, M) would follow the Y_all layout
        return np.linalg.norm(Y_all - a_all * np.repeat(A, M) - b_all * np.repeat(B, M))
    """ Example constraints: A >= B element-wise """
    cons = ({'type': 'ineq',
             'fun' : lambda x: x[:N] - x[N:]})
    res = minimize(func, np.zeros(N*2), constraints=cons, method='SLSQP', options={'disp': True})
    print(res)
    print(Y_all - a_all * np.repeat(res.x[:N], M) - b_all * np.repeat(res.x[N:], M))
""" Test """
M = 4
N = 3
list_partials = [get_partial(N) for i in range(M)]
optimize(list_partials, N, M)
Output:
Optimization terminated successfully. (Exit mode 0)
Current function value: 0.9019356096498999
Iterations: 12
Function evaluations: 96
Gradient evaluations: 12
fun: 0.9019356096498999
jac: array([ 1.03786588e-04, 4.84041870e-04, 2.08129734e-01,
1.57609582e-04, 2.87599862e-04, -2.07959406e-01])
message: 'Optimization terminated successfully.'
nfev: 96
nit: 12
njev: 12
status: 0
success: True
x: array([ 1.82177105, 0.62803449, 0.63815278, -1.16960281, 0.03147683,
0.63815278])
[ 3.78873785e-02 3.41189867e-01 -3.79020251e-01 -2.79338679e-04
-7.98836875e-02 7.94168282e-02 -1.33155595e-01 1.32869391e-01
-3.73398306e-01 4.54460178e-01 2.01297470e-01 3.42682496e-01]
I did not check the result! If there is an error it's an implementation-error, not a conceptional one (my opinion)!
I agree with sascha that this is a linear problem. As I do not like constraints very much, I actually prefer to turn it into a non-linear problem without constraints. I do so by setting the vector A=(a1**2, a1**2+a2**2, a1**2+a2**2+a3**2, ...); this ensures that all entries are positive and A_i > A_j for i>j. That makes errors a bit problematic, as you now have to consider error propagation, including correlation, to get A1, A2, etc., but I will make an important point on that at the end. The "simple" solution would look as follows:
import numpy as np
from scipy.optimize import leastsq
from random import random
np.set_printoptions(linewidth=190)
def generate_random_vector(n, sortIt=True):
    out=np.fromiter( (random() for x in range(n) ),float)
    if sortIt:
        out.sort()
    return out
def residuals(parameters,dataVec,dataLength,vecDims):
    aParams=parameters[:dataLength]
    bParams=parameters[dataLength:2*dataLength]
    AParams=parameters[-2*vecDims:-vecDims]
    BParams=parameters[-vecDims:]
    YList=dataVec
    AVec=[a**2 for a in AParams]##assures A_i > 0
    BVec=[b**2 for b in BParams]
    AAVec=np.cumsum(AVec)##assures A_i>A_j for i>j
    BBVec=np.cumsum(BVec)
    dist=[ np.array(Y)-a*np.array(AAVec)-b*np.array(BBVec) for Y,a,b in zip(YList,aParams,bParams) ]
    dist=np.ravel(dist)
    return dist
if __name__=="__main__":
    aList=generate_random_vector(20, sortIt=False)
    bList=generate_random_vector(20, sortIt=False)
    AVec=generate_random_vector(5)
    BVec=generate_random_vector(5)
    YList=[a*AVec+b*BVec for a,b in zip(aList,bList)]
    aGuess=20*[.2]
    bGuess=20*[.3]
    AGuess=5*[.4]
    BGuess=5*[.5]
    bestFitValues, covMX, infoDict, messages ,ier = leastsq(residuals, aGuess+bGuess+AGuess+BGuess ,args=(YList,20,5) ,full_output=True)
    print("a")
    print(aList)
    besta = bestFitValues[:20]
    print(besta)
    print("b")
    print(bList)
    bestb = bestFitValues[20:40]
    print(bestb)
    print("A")
    print(AVec)
    bestA = bestFitValues[-2*5:-5]
    realBestA = np.cumsum([x**2 for x in bestA])
    print(realBestA)
    print("B")
    print(BVec)
    bestB = bestFitValues[-5:]
    realBestB = np.cumsum([x**2 for x in bestB])
    print(realBestB)
    print(covMX)
The problem on errors and correlation is that the solution to the problem is not unique. If Y = a A + b B is a solution and we, e.g., rotate such that A = c E + s F and B = -s E + c F then also Y = (ac-bs) E + (as+bc) F =e E + f F is a solution. The parameter space is, hence, completely flat at "the solution" resulting in huge errors and apocalyptic correlations.
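The rotation argument can be checked numerically. In this small sketch the vectors E and F, the scalars a and b and the angle are arbitrary made-up values; it only verifies that both parameter sets reproduce the same Y.

```python
import numpy as np

rng = np.random.default_rng(0)
E = rng.uniform(size=5)          # arbitrary basis vectors (illustration only)
F = rng.uniform(size=5)
a, b = 0.7, 0.3                  # arbitrary scalars
angle = 0.4                      # arbitrary rotation angle
c, s = np.cos(angle), np.sin(angle)

# Rotated pair of vectors
A = c * E + s * F
B = -s * E + c * F

Y_from_AB = a * A + b * B
Y_from_EF = (a * c - b * s) * E + (a * s + b * c) * F

# Both descriptions give the same data vector
print(np.allclose(Y_from_AB, Y_from_EF))
```

Since every angle gives an equally good fit, a least-squares routine has no preferred point along that direction, which is what shows up as the huge errors and correlations in the covariance matrix.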
