I have the following constraints for my problem.
I want to minimize the sum of squared differences between w_i and uw_i, divided by SUM(uw), subject to these restrictions:
1. w_i is, at maximum, ul
2. w_i is, at least, 0.05
3. The sum of all w for a sector code cannot be bigger than 0.50
So basically I want to generate all w_i for each row; however, I don't know how to implement the third restriction with scipy.
With scipy.optimize.lsq_linear I can enforce the first two conditions with bounds=(0.05, ul), but I don't know how to enforce the third one.
import pandas as pd
import scipy
import numpy as np
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
df
I think you are trying to do something like this:
import pandas as pd
import scipy
import numpy as np
'''
Minimize the sum of squared differences of w_i, uw_i divided by SUM(uw) following these restrictions:
Constraints:
1. w_i is at maximum, ul [NOTE: I think this should say 'minimum']
2. w_i is, at least, 0.05 [NOTE: I think this should say 'at most']
3. The sum of all w for a sector code can not be bigger than 0.50
'''
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
print(df)
gb = df.groupby('Sector Code')['ul']
codeCounts = gb.count().to_list()
cumCounts = [0] + [sum(codeCounts[:i + 1]) for i in range(len(codeCounts))]
newIdx = []
for code, dfGp in gb:
    newIdx += list(dfGp.index)
df = df.reindex(newIdx)
# For each unique Sector Code, create constraint that 0.50 minus the sum of w for that code must be non-negative:
def foo(i, c):
    # return a closure that acts as a constraint for the i'th interval defined by c[i-1]:c[i]
    def bar(x):
        return 0.50 - sum(x[c[i-1]:c[i]])
    return bar
cons = [{'type': 'ineq', 'fun': foo(i, cumCounts)} for i in range(1, len(cumCounts))]
# Value of bounds argument to enforce ul <= w_i <= 0.05
bnds = tuple((ul_i, 0.05) for code, ul_group in gb for ul_i in ul_group)
# Initial guess
n = len(df.index)
w_i = np.ones(n) * (1 / n)
# The objective function to be minimized
uw_sum = df.uw.sum()
def fun(w):
    return (pd.Series(w) - df.uw).pow(2).sum() / uw_sum
# Optimize using scipy minimize() function
from scipy.optimize import minimize
res = minimize(fun, w_i, method='SLSQP', bounds=bnds, constraints=cons)
print(res)
df['w'] = res.x
df = df.reindex(range(len(df.index)))
print(df)
Explanation:
Use groupby() to get the row count for each unique Sector Code value and to construct an index ordered by Sector Code, which is used to re-order the original input df.
Create a list of constraint dictionaries to be passed to the optimizer, one for each Sector Code, using Python closures to constrain the sum of the corresponding solution elements to be <= 0.50.
Create a sequence of bounds to constrain each solution element w_i to lie between ul and 0.05.
Create the objective function to return the sum of squared differences of w_i and uw_i divided by sum(uw).
Call minimize() from scipy.optimize with the above constraints, bounds, objective function and an initial guess.
Add a column to the dataframe with the result and call reindex() to restore the original row order.
Output:
uw ul Sector Code
0 0.006822 0.050000 40
1 0.017949 0.050000 40
2 0.001906 0.031289 40
3 0.000904 0.040318 20
4 0.001147 0.046904 15
... ... ... ...
1226 0.003653 0.033553 10
1227 0.002556 0.031094 10
1228 0.002816 0.041031 10
1229 0.010216 0.050000 40
1230 0.001559 0.033480 55
[1231 rows x 3 columns]
fun: 0.4487707682194904
jac: array([0.02089997, 0.00466947, 0.01358654, ..., 0.02070332, nan,
0.02188896])
message: 'Positive directional derivative for linesearch'
nfev: 919
nit: 5
njev: 1
status: 8
success: False
x: array([0.03730054, 0.0247585 , 0.02171931, ..., 0.03300862, 0.05 ,
0.03348039])
uw ul Sector Code w
0 0.006822 0.050000 40 0.050000
1 0.017949 0.050000 40 0.050000
2 0.001906 0.031289 40 0.031289
3 0.000904 0.040318 20 0.040318
4 0.001147 0.046904 15 0.046904
... ... ... ... ...
1226 0.003653 0.033553 10 0.033553
1227 0.002556 0.031094 10 0.031094
1228 0.002816 0.041031 10 0.041031
1229 0.010216 0.050000 40 0.050000
1230 0.001559 0.033480 55 0.033480
[1231 rows x 4 columns]
Note that success is False, so perhaps some work remains. Hopefully the dataframe related manipulations are helpful in addressing your question.
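One possible follow-up (my sketch, not part of the answer above): because the sector constraints are linear, they can also be expressed as a single scipy.optimize.LinearConstraint and handed to method='trust-constr'. The snippet below reuses df, cumCounts, w_i and bnds from the code above and assumes the dataframe is still in its re-ordered (sector-grouped) state.
import numpy as np
from scipy.optimize import minimize, LinearConstraint

# Sector-membership matrix: row i has ones in the columns of sector i
# (the rows of df between cumCounts[i-1] and cumCounts[i] after re-ordering)
A = np.zeros((len(cumCounts) - 1, len(df)))
for i in range(1, len(cumCounts)):
    A[i - 1, cumCounts[i - 1]:cumCounts[i]] = 1

# sum of w per sector must stay <= 0.50
sector_cons = LinearConstraint(A, -np.inf, 0.50)

# objective written on plain numpy arrays
uw = df.uw.to_numpy()
def obj(w):
    return np.sum((w - uw)**2) / uw.sum()

res = minimize(obj, w_i, method='trust-constr',
               bounds=bnds, constraints=[sector_cons])
print(res.success, res.fun)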
You already got a working answer from @constantstranger. IMO there's just one problem: it's quite slow. More precisely, it took more than a minute to solve the problem on my machine.
Therefore, here are some notes on what could be done to speed up the solver:
Since Python has a noticeable overhead when calling functions, it's a good idea to implement all functions as fast as possible. For instance, evaluating one vectorial constraint function is faster than evaluating multiple scalar constraint functions.
At the moment, all derivatives (the objective gradient and the constraint Jacobian) are approximated by finite differences. This is a real bottleneck because each evaluation of the approximated derivative goes in hand with multiple objective/constraint function evaluations. Instead, it's highly recommended to provide the exact derivatives or use algorithmic differentiation.
Last but not least, scipy.optimize.minimize is really only suited for small to mid-sized problems. If you are willing to use another package, you could use IPOPT, a state-of-the-art NLP solver. The cyipopt package provides a scipy-like interface, so switching from scipy.optimize.minimize isn't hard.
Apart from that, your problem is a (convex) quadratic optimization problem and can be formulated as follows:
min f(w) s.t. A*w <= 0.5, u_l <= w <= 0.05
with
f(w) = (1/sum(u_w)) * ||w - u_w||^2_2 = (1/sum(u_w)) * (w'Iw - 2u_w'*w + u_w'u_w)
where A[i,j] = 1 if w[j] belongs to sector i and 0 otherwise.
Then, solving the problem with IPOPT (note that we pass the exact derivatives) looks like this:
import numpy as np
import pandas as pd
from cyipopt import minimize_ipopt
# dataframe
df = pd.read_csv("https://raw.githubusercontent.com/norhther/datasets/main/data(1).csv")
df = df.drop("Unnamed: 0", axis = 1)
# sectors
sectors = df["Sector Code"].unique()
# building the matrix A
A = np.zeros((sectors.size, len(df)))
for i, sec in enumerate(sectors):
    indices = df[df["Sector Code"] == sec].index.values
    A[i, indices] = 1
uw = df['uw'].values
uw_sum = uw.sum()
# objective
def obj(w):
    return np.sum((w - uw)**2) / uw_sum
# objective gradient
def grad(w):
    return (2*w - 2*uw) / uw_sum
# Linear constraint A @ w <= 0.5  <=>  0.5 - A @ w >= 0
cons = [{'type': 'ineq', 'fun': lambda w: 0.5 - A @ w, 'jac': lambda w: -A}]
# variable bounds
bounds = [(u_i, 0.05) for u_i in df.ul.values]
# feasible initial guess
w0 = np.ones(len(df)) / len(df)
# solve the problem
res = minimize_ipopt(obj, x0=w0, jac=grad, bounds=bounds, constraints=cons)
print(res)
On my machine, this terminates in less than 2 seconds and yields
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
fun: 0.4306218505716169
info: {'x': array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038]), 'g': array([-3.51687688, -9.45217602, -7.88799127, -1.78825803, -1.86650095,
-5.09092925, -2.11181422, -1.35485327, -1.15847276, 0.35 ]), 'obj_val': 0.4306218505716169, 'mult_g': array([-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -1.000000e+03, -1.000000e+03, -1.000000e+03,
-1.000000e+03, -2.857166e-09]), 'mult_x_L': array([1000.02960821, 1000.02197802, 1000.00000005, ..., 1000.00000011,
1000.02728049, 1000.00000006]), 'mult_x_U': array([0.00000000e+00, 0.00000000e+00, 5.34457820e-08, ...,
1.11498931e-07, 0.00000000e+00, 6.05340266e-08]), 'status': 2, 'status_msg': b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'}
message: b'Algorithm converged to a point of local infeasibility. Problem may be infeasible.'
nfev: 13
nit: 9
njev: 7
status: 2
success: False
x: array([0.05 , 0.05 , 0.03128946, ..., 0.04103131, 0.05 ,
0.03348038])
[Finished in 1.9s]
Related
Now I am facing a problem: for a data sample (let's say 10 continuous variables and one dependent variable), I need to fit a model for prediction, and I would like to constrain the weights of all the variables to within a particular bound, like abs(0.2). That means every weight should be no more than 0.2 and no less than -0.2. I tried lasso and ridge regression in sklearn.linear_model (and also ElasticNet) to control the weights of the variables, but it's not quite good: there are always one or two extremely large weights, and when I give a large alpha the R squared shows the model is really bad. I tried to write my own method, but I could only constrain the sum of the weights, not every individual weight. SVR gives a pretty close answer, but I still want to ask: are there good choices for multi-regression with self-defined constraints?
import numpy as np
from scipy.optimize import shgo

def my_general_linear_model_func(A1, b1):
    num_x = np.shape(A1)[1]

    def my_func(x):
        ls = 0.5*(b1 - np.dot(A1, x))**2
        result = np.sum(ls)
        return result

    def g1(x):
        return np.sum(x)      # sum of X >= 0

    def g2(x):
        return 1 - np.sum(x)  # sum of X <= 1

    cons = ({'type': 'ineq', 'fun': g1},
            {'type': 'ineq', 'fun': g2})
    x0 = np.zeros(num_x)
    bnds = [(0, 1)]
    for i in range(num_x - 1):
        bnds.append((0, 1))
    res1 = shgo(my_func,
                bounds=bnds,
                constraints=cons)
    return res1

A1 = np.array([[0.12, 5.96, 3.14], [0.68, 7.89, 4.56]])
b1 = np.array([3, 5])
my_general_linear_model_func(A1, b1)
The result:
fun: 0.07651391974288956
funl: array([0.07651392, 0.11079534, 0.2564125 ])
message: 'Optimization terminated successfully.'
nfev: 53
nit: 2
nlfev: 49
nlhev: 0
nljev: 12
success: True
x: array([1.12339358e-16, 5.62146099e-02, 9.43785390e-01])
xl: array([[1.12339358e-16, 5.62146099e-02, 9.43785390e-01],
[3.90241087e-01, 5.00000000e-01, 1.09758913e-01],
[5.00000000e-01, 5.00000000e-01, 0.00000000e+00]])
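As a side note (my addition, not part of the code above): if the goal is really to bound every coefficient to [-0.2, 0.2], the bounds argument already does that per variable, and for a linear model scipy.optimize.lsq_linear is a compact way to write it. A minimal sketch on the same toy data:
import numpy as np
from scipy.optimize import lsq_linear

A1 = np.array([[0.12, 5.96, 3.14], [0.68, 7.89, 4.56]])
b1 = np.array([3, 5])

# scalar bounds are applied to every coefficient: -0.2 <= x_i <= 0.2
res = lsq_linear(A1, b1, bounds=(-0.2, 0.2))
print(res.x, res.cost)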
I am having trouble understanding why my code below is not producing the optimal result. I am attempting to create a step function but this clearly isn't working as the solution value for model.Z isn't one of the range points specified.
Any help in understanding/correcting this is greatly appreciated.
What I am trying to do
Maximize Z * X, subject to:
/ 20.5 , X <= 5
Z(X) = | 10 , 5 <= X <= 10
\ 9 , 10 <= X <= 11
Even better, I'd like to solve under the following conditions (disjoint at breakpoints):
/ 20.5 , X <= 5
Z(X) = | 10 , 5 < X <= 10
\ 9 , 10 < X <= 11
where X and Z are floating point numbers.
I would expect X to be 5 and Z to be 20.5, however model results are 7.37 and 15.53.
Code
from pyomo.core import *
# Break points for step-function
DOMAIN_PTS = [5., 10., 11.]
RANGE_PTS = [20.5, 10., 9.]
# Define model and variables
model = ConcreteModel()
model.X = Var(bounds=(5,11))
model.Z = Var()
# Set piecewise constraint
model.con = Piecewise(model.Z, model.X,
                      pw_pts=DOMAIN_PTS,
                      pw_constr_type='EQ',
                      f_rule=RANGE_PTS,
                      force_pw=True,
                      pw_repn='SOS2')
model.obj = Objective(expr= model.Z * model.X, sense=maximize)
opt = SolverFactory('gurobi')
opt.options['NonConvex'] = 2
obj_val = opt.solve(model)
print(value(model.X))
print(value(model.Z))
print(model.obj())
I would never piecewise-linearize z, but always z*x. If you have a piecewise linear expression for z only, then z*x is nonlinear (and in a nasty way). If you instead write down a piecewise linear expression for z*x, then the whole thing becomes linear. Note that discontinuities in the piecewise function require attention.
It is important to understand mathematically what you are writing down before you pass it on to a solver.
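A rough sketch of that idea (my code, not the original answer's): introduce a hypothetical variable model.ZX for the product Z(X)*X and piecewise-linearize it directly, using repeated breakpoints for the jumps as in the step-function trick shown further below. The objective then becomes linear.
from pyomo.core import *

model = ConcreteModel()
model.X = Var(bounds=(5, 11))
model.ZX = Var()   # hypothetical variable standing in for Z(X)*X

# values of Z(X)*X on either side of the jumps at X=5 and X=10
DOMAIN_PTS = [5., 5., 10., 10., 11.]
RANGE_PTS = [20.5*5., 10.*5., 10.*10., 9.*10., 9.*11.]   # 102.5, 50, 100, 90, 99

model.con = Piecewise(model.ZX, model.X,
                      pw_pts=DOMAIN_PTS,
                      pw_constr_type='EQ',
                      f_rule=RANGE_PTS,
                      force_pw=True,
                      pw_repn='SOS2')

# linear objective now; solve with SolverFactory('gurobi') as before
model.obj = Objective(expr=model.ZX, sense=maximize)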
Piecewise in Pyomo is intended to interpolate linearly between the breakpoints you give for another variable. This means that with the breakpoints you are passing, you are interpolating: between x=5 and x=10 you are placing the line Z = 31 - 2.1X, and another line between 10 and 11. In fact, Gurobi is computing the optimal result, since X is a NonNegativeReal and on the line Z = 31 - 2.1X, x=7.37 gives Z=15.53.
Now, I understand that you want a step function rather than an interpolation. In that case you need to change your DOMAIN_PTS and RANGE_PTS in order to correctly model what you want:
# Break points for step-function
DOMAIN_PTS = [5.,5., 10., 10.,11.]
RANGE_PTS = [20.5,10., 10., 9., 9.]
In this way, you're interpolating between f(5)=20.5 and f(5)=10 at the jump, then f(x)=10 for 5<=x<=10, and so on.
I have a DataFrame as follows :
Name Volatility Return
a 0.0243 0.212
b 0.0321 0.431
c 0.0323 0.443
d 0.0391 0.2123
e 0.0433 0.3123
I'd like to have a Volatility of 0.035 and the maximum Return for that volatility.
That is, in a new DataFrame I'd like the Name and the percentage of each asset in my portfolio that give the maximum Return for a Volatility equal to 0.035.
Therefore, I need to solve a system of equations with multiple conditions to obtain the best solution (highest Return) for a fixed outcome (Volatility == 0.035).
The conditions are:
Each asset has a weight between 0 and 1.
The sum of the weights is 1.
The sum of the weights times the volatility of each asset is the "Desired Volatility".
The sum of the weights times the return of each asset is the "Total Return". This should be maximized.
Here is an approach using Z3Py, an open source SAT/SMT solver.
In a SAT/SMT solver you write your code simply as a list of conditions, and the program finds an optimal solution (or just a solution that satisfies all the conditions when Z3 is used as a plain solver).
Originally SAT solvers only worked with pure boolean expressions, but modern SAT/SMT solvers also allow fixed-width bit-vectors, unbounded integers, rationals, reals and even functions as variables.
To express the given equations in Z3, they are converted quite literally into Z3 expressions. The code below comments each of the steps.
import pandas as pd
from z3 import *
DesiredVolatility = 0.035
df = pd.DataFrame(columns=['Name', 'Volatility', 'Return'],
                  data=[['a', 0.0243, 0.212],
                        ['b', 0.0321, 0.431],
                        ['c', 0.0323, 0.443],
                        ['d', 0.0391, 0.2123],
                        ['e', 0.0433, 0.3123]])
# create a Z3 instance to optimize something
s = Optimize()
# the weight of each asset, as a Z3 variable
W = [Real(row.Name) for row in df.itertuples()]
# the total volatility
TotVol = Real('TotVol')
# the total return, to be maximized
TotReturn = Real('TotReturn')
# weights between 0 and 1, and sum to 1
s.add(And([And(w >= 0, w <= 1) for w in W]))
s.add(Sum([w for w in W]) == 1)
# the total return is calculated as the weighted sum of the asset returns
s.add(TotReturn == Sum([w * row.Return for w, row in zip(W, df.itertuples())]))
# the volatility is calculated as the weighted sum of the asset volatility
s.add(TotVol == Sum([w * row.Volatility for w, row in zip(W, df.itertuples())]))
# the volatility should be equal to the desired volatility
s.add(TotVol == DesiredVolatility)
# we're maximizing the total return
h1 = s.maximize(TotReturn)
# we ask Z3 to do its magick
res = s.check()
# we check the result, hoping for 'sat': all conditions satisfied, a maximum is found
if res == sat:
    s.upper(h1)
    m = s.model()
    #for w in W:
    #    print(f'asset {w}: {m[w]} = {m[w].numerator_as_long() / m[w].denominator_as_long():.6f}')
    # output the total return
    print(f'Total Return: {m[TotReturn]} = {m[TotReturn].numerator_as_long() / m[TotReturn].denominator_as_long():.6f}')
    # get the proportions out of the Z3 model
    proportions = [m[w].numerator_as_long() / m[w].denominator_as_long() for w in W]
    # create a dataframe with the result
    df_result = pd.DataFrame({'Name': df.Name, 'Proportion': proportions})
    print(df_result)
else:
    print("No satisfiable solution found")
Result:
Total Return: 452011/1100000 = 0.410919
Name Proportion
0 a 0.000000
1 b 0.000000
2 c 0.754545
3 d 0.000000
4 e 0.245455
You can easily add additional constraints, for example "no asset can have more than 30% of the total":
# change
s.add(And([And(w >= 0, w <= 1) for w in W]))
# to
s.add(And([And(w >= 0, w <= 0.3) for w in W]))
Which would result in:
Total Return: 558101/1480000 = 0.377095
Name Proportion
0 a 0.082432
1 b 0.300000
2 c 0.300000
3 d 0.017568
4 e 0.300000
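For comparison (my addition, not part of the Z3 answer): the conditions are all linear, so the same problem can also be handed to scipy.optimize.linprog as a small linear program.
import numpy as np
import pandas as pd
from scipy.optimize import linprog

df = pd.DataFrame({'Name': list('abcde'),
                   'Volatility': [0.0243, 0.0321, 0.0323, 0.0391, 0.0433],
                   'Return': [0.212, 0.431, 0.443, 0.2123, 0.3123]})

# maximize total return <=> minimize -return, subject to
# sum(w) == 1 and sum(w * volatility) == 0.035, with 0 <= w_i <= 1
A_eq = np.vstack([np.ones(len(df)), df['Volatility'].to_numpy()])
b_eq = [1.0, 0.035]
res = linprog(c=-df['Return'].to_numpy(), A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, 1)] * len(df))
print(res.x, -res.fun)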
I am trying to build an efficient frontier as in the Markowitz problem.
I have written the code below, but I get the error "ValueError: Objective function must return a scalar". I have tested 'fun' with some values, for example, I input to the console:
W = np.ones([n])/n # start optimization with equal weights
cov_matrix = returns.cov()
fun = 0.5*np.dot(np.dot(W, cov_matrix), W) # variance of the portfolio
fun
The output is 0.00015337622774133828, which is a scalar.
I don't know what might be wrong. Any help is appreciated.
Code:
from scipy.optimize import minimize
import pandas as pd
import numpy as np
from openpyxl import load_workbook
wb = load_workbook('path/Assets_3.xlsx') # in this workbook there is data for returns.
# The next lines clean unnecessary first column and first row.
ws = wb.active
df = pd.DataFrame(ws.values)
df1 = df.drop(0,axis=1)
df1 = df1.drop(0)
df1 = df1.astype(float)
rf = 0.05
r_bar = 0.05
returns = df1.copy()
def efficient_frontier(rf, r_bar, returns):
    n = len(returns.transpose())
    W = np.ones([n])/n  # start optimization with equal weights
    exp_ret = returns.mean()
    cov_matrix = returns.cov()
    fun = 0.5*np.dot(np.dot(W, cov_matrix), W)  # variance of the portfolio
    cons = ({'type': 'eq', 'fun': lambda W: sum(W) - 1. },
            {'type': 'ineq', 'fun': lambda W: np.dot(exp_ret, W) - r_bar })
    bnds = [(0., 1.) for i in range(n)]  # weights between 0..1.
    res = minimize(fun, W, (returns, cov_matrix, rf),
                   method='SLSQP', bounds=bnds, constraints=cons)
    return res
x= efficient_frontier(rf,r_bar,returns)
x
Some Data
1 2 3
1 0.060206 0.005781 0.001117
2 0.006463 -0.007390 0.001133
3 -0.003211 -0.015730 0.001167
4 0.044227 -0.006250 0.001225
5 -0.040571 -0.006910 0.001292
6 -0.007900 -0.006160 0.001208
7 0.068702 0.013836 0.001300
8 0.039286 0.009854 0.001350
9 0.012457 -0.007950 0.001358
10 -0.013758 0.001021 0.001283
11 -0.002616 -0.013600 0.001300
12 0.059004 -0.006090 0.001442
13 0.015566 0.002818 0.001308
14 -0.036454 0.001395 0.001283
15 0.058899 0.011072 0.001325
16 -0.043086 0.017070 0.001308
17 0.023156 -0.003350 0.001392
18 0.063705 0.000301 0.001417
19 0.017628 -0.001960 0.001508
20 -0.014567 -0.006990 0.001525
21 -0.007191 -0.013000 0.001425
22 -0.000815 0.014773 0.001450
23 0.046493 -0.001540 0.001542
24 0.051832 -0.008580 0.001742
25 -0.007151 0.001177 0.001633
26 -0.018196 -0.008680 0.001642
27 -0.013513 -0.008810 0.001675
28 -0.026493 -0.010510 0.001825
29 -0.003249 -0.014750 0.001800
30 0.001222 0.022258 0.001758
This code is a mess, and while I can show you something which runs, that does not mean anything.
You will see convergence to your starting point, whatever that means for your task! It's a strong indicator that something is still very wrong (it might be the underlying theory)!
Some additional remarks:
scipy's optimizers are built to work with numpy arrays, not pandas DataFrame or Series objects!
the only things in your original question that hinted at pandas usage were a variable name df and returns.cov(), which does not exist for numpy arrays!
rf is never used anywhere!
there are multiple things in minimize's args which are not used!
it does not feel like a problem one should use scipy's optimizers for! (but it's possible; here we are paying for numerical differentiation, for example)
cvxpy would probably be a much, much better approach (cleaner, faster, more accurate) if I interpret the problem correctly (I did not analyze it much)
but the same rules apply: some Python knowledge is needed!
Code:
from scipy.optimize import minimize
import numpy as np
import pandas as pd

rf = 0.05
r_bar = 0.05
returns = pd.DataFrame(np.random.randn(30, 3), columns=list('ABC'))  # PANDAS DF

cov_matrix = returns.cov().to_numpy()  # use PANDAS one last time
                                       # but result = np.array!
returns = returns.to_numpy()           # From now on: np-only!

def fun(x, returns, cov_matrix, rf):
    return 0.5*np.dot(np.dot(x, cov_matrix), x)

def efficient_frontier(rf, r_bar, returns):
    n = len(returns.transpose())
    W = np.ones([n])/n  # start optimization with equal weights
    exp_ret = returns.mean()
    cons = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1. },  # let's use numpy here
            {'type': 'ineq', 'fun': lambda x: np.dot(exp_ret, x) - r_bar })
    bnds = [(0., 1.) for i in range(n)]  # weights between 0..1.
    res = minimize(fun, W, (returns, cov_matrix, rf),
                   method='SLSQP', bounds=bnds, constraints=cons)
    return res

x = efficient_frontier(rf, r_bar, returns)
print(x)
Output:
A B C
A 0.813375 -0.001370 0.173901
B -0.001370 1.482756 0.380514
C 0.173901 0.380514 1.285936
fun: 0.2604530793556774
jac: array([ 0.32863522, 0.62063321, 0.61345008])
message: 'Optimization terminated successfully.'
nfev: 35
nit: 7
njev: 3
status: 0
success: True
x: array([ 0.33333333, 0.33333333, 0.33333333])
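As mentioned in the remarks above, cvxpy would be a natural fit here. A hedged sketch of the same quadratic program (my code, again on random toy data, so whether the return constraint is feasible depends on the draw):
import cvxpy as cp
import numpy as np
import pandas as pd

r_bar = 0.05
returns = pd.DataFrame(np.random.randn(30, 3), columns=list('ABC'))
cov_matrix = returns.cov().to_numpy()
exp_ret = returns.mean().to_numpy()
n = returns.shape[1]

w = cp.Variable(n)
objective = cp.Minimize(0.5 * cp.quad_form(w, cov_matrix))  # portfolio variance
constraints = [cp.sum(w) == 1,         # fully invested
               exp_ret @ w >= r_bar,   # minimum expected return
               w >= 0, w <= 1]         # long-only weights
prob = cp.Problem(objective, constraints)
prob.solve()
print(prob.status, w.value)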
I use scipy.optimize to minimize a function of 12 arguments.
I started the optimization a while ago and still waiting for results.
Is there a way to force scipy.optimize to display its progress (like how much is already done, what are the current best point)?
As mg007 suggested, some of the scipy.optimize routines allow for a callback function (unfortunately leastsq does not permit this at the moment). Below is an example using the "fmin_bfgs" routine where I use a callback function to display the current value of the arguments and the value of the objective function at each iteration.
import numpy as np
from scipy.optimize import fmin_bfgs

Nfeval = 1

def rosen(X):  # Rosenbrock function
    return (1.0 - X[0])**2 + 100.0 * (X[1] - X[0]**2)**2 + \
           (1.0 - X[1])**2 + 100.0 * (X[2] - X[1]**2)**2

def callbackF(Xi):
    global Nfeval
    print('{0:4d} {1: 3.6f} {2: 3.6f} {3: 3.6f} {4: 3.6f}'.format(Nfeval, Xi[0], Xi[1], Xi[2], rosen(Xi)))
    Nfeval += 1

print('{0:4s} {1:9s} {2:9s} {3:9s} {4:9s}'.format('Iter', ' X1', ' X2', ' X3', 'f(X)'))
x0 = np.array([1.1, 1.1, 1.1], dtype=np.double)
[xopt, fopt, gopt, Bopt, func_calls, grad_calls, warnflg] = \
    fmin_bfgs(rosen,
              x0,
              callback=callbackF,
              maxiter=2000,
              full_output=True,
              retall=False)
The output looks like this:
Iter X1 X2 X3 f(X)
1 1.031582 1.062553 1.130971 0.005550
2 1.031100 1.063194 1.130732 0.004973
3 1.027805 1.055917 1.114717 0.003927
4 1.020343 1.040319 1.081299 0.002193
5 1.005098 1.009236 1.016252 0.000739
6 1.004867 1.009274 1.017836 0.000197
7 1.001201 1.002372 1.004708 0.000007
8 1.000124 1.000249 1.000483 0.000000
9 0.999999 0.999999 0.999998 0.000000
10 0.999997 0.999995 0.999989 0.000000
11 0.999997 0.999995 0.999989 0.000000
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 11
Function evaluations: 85
Gradient evaluations: 17
At least this way you can watch as the optimizer tracks the minimum
Following @joel's example, there is a neat and efficient way to do a similar thing. The following example shows how to get rid of the global variable, the callback function, and re-evaluating the target function multiple times.
import numpy as np
from scipy.optimize import fmin_bfgs

def rosen(X, info):  # Rosenbrock function
    res = (1.0 - X[0])**2 + 100.0 * (X[1] - X[0]**2)**2 + \
          (1.0 - X[1])**2 + 100.0 * (X[2] - X[1]**2)**2
    # display information
    if info['Nfeval'] % 100 == 0:
        print('{0:4d} {1: 3.6f} {2: 3.6f} {3: 3.6f} {4: 3.6f}'.format(info['Nfeval'], X[0], X[1], X[2], res))
    info['Nfeval'] += 1
    return res

print('{0:4s} {1:9s} {2:9s} {3:9s} {4:9s}'.format('Iter', ' X1', ' X2', ' X3', 'f(X)'))
x0 = np.array([1.1, 1.1, 1.1], dtype=np.double)
[xopt, fopt, gopt, Bopt, func_calls, grad_calls, warnflg] = \
    fmin_bfgs(rosen,
              x0,
              args=({'Nfeval': 0},),
              maxiter=1000,
              full_output=True,
              retall=False,
              )
This will generate output like
Iter X1 X2 X3 f(X)
0 1.100000 1.100000 1.100000 2.440000
100 1.000000 0.999999 0.999998 0.000000
200 1.000000 0.999999 0.999998 0.000000
300 1.000000 0.999999 0.999998 0.000000
400 1.000000 0.999999 0.999998 0.000000
500 1.000000 0.999999 0.999998 0.000000
Warning: Desired error not necessarily achieved due to precision loss.
Current function value: 0.000000
Iterations: 12
Function evaluations: 502
Gradient evaluations: 98
However, there is no free lunch: here I used the number of function evaluations as a counter instead of the number of algorithmic iterations. Some algorithms may evaluate the target function multiple times in a single iteration.
Try using:
options={'disp': True}
to force scipy.optimize.minimize to print intermediate results.
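A minimal usage sketch (my example); how much is printed, and whether it is per iteration or only a final summary, depends on the chosen method:
import numpy as np
from scipy.optimize import minimize, rosen

x0 = np.array([1.1, 1.1, 1.1])
# disp=True asks the solver to print its own progress/summary output
res = minimize(rosen, x0, method='L-BFGS-B', options={'disp': True})
print(res.x)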
Many of the optimizers in scipy indeed lack verbose output (the 'trust-constr' method of scipy.optimize.minimize being an exception). I faced a similar issue and solved it by creating a wrapper around the objective function and using the callback function. No additional function evaluations are performed here, so this should be an efficient solution.
import numpy as np

class Simulator:
    def __init__(self, function):
        self.f = function                    # actual objective function
        self.num_calls = 0                   # how many times f has been called
        self.callback_count = 0              # number of times callback has been called, also measures iteration count
        self.list_calls_inp = []             # input of all calls
        self.list_calls_res = []             # result of all calls
        self.decreasing_list_calls_inp = []  # input of calls that resulted in decrease
        self.decreasing_list_calls_res = []  # result of calls that resulted in decrease
        self.list_callback_inp = []          # only appends inputs on callback, as such they correspond to the iterations
        self.list_callback_res = []          # only appends results on callback, as such they correspond to the iterations

    def simulate(self, x, *args):
        """Executes the actual simulation and returns the result, while
        updating the lists too. Pass to optimizer without arguments or
        parentheses."""
        result = self.f(x, *args)  # the actual evaluation of the function
        if not self.num_calls:     # first call is stored in all lists
            self.decreasing_list_calls_inp.append(x)
            self.decreasing_list_calls_res.append(result)
            self.list_callback_inp.append(x)
            self.list_callback_res.append(result)
        elif result < self.decreasing_list_calls_res[-1]:
            self.decreasing_list_calls_inp.append(x)
            self.decreasing_list_calls_res.append(result)
        self.list_calls_inp.append(x)
        self.list_calls_res.append(result)
        self.num_calls += 1
        return result

    def callback(self, xk, *_):
        """Callback function that can be used by optimizers of scipy.optimize.
        The third argument "*_" makes sure that it still works when the
        optimizer calls the callback function with more than one argument. Pass
        to optimizer without arguments or parentheses."""
        s1 = ""
        xk = np.atleast_1d(xk)
        # search backwards in input list for input corresponding to xk
        for i, x in reversed(list(enumerate(self.list_calls_inp))):
            x = np.atleast_1d(x)
            if np.allclose(x, xk):
                break
        for comp in xk:
            s1 += f"{comp:10.5e}\t"
        s1 += f"{self.list_calls_res[i]:10.5e}"
        self.list_callback_inp.append(xk)
        self.list_callback_res.append(self.list_calls_res[i])
        if not self.callback_count:
            s0 = ""
            for j, _ in enumerate(xk):
                tmp = f"Comp-{j+1}"
                s0 += f"{tmp:10s}\t"
            s0 += "Objective"
            print(s0)
        print(s1)
        self.callback_count += 1
A simple test can be defined
from scipy.optimize import minimize, rosen
ros_sim = Simulator(rosen)
minimize(ros_sim.simulate, [0, 0], method='BFGS', callback=ros_sim.callback, options={"disp": True})
print(f"Number of calls to Simulator instance {ros_sim.num_calls}")
resulting in:
Comp-1 Comp-2 Objective
1.76348e-01 -1.31390e-07 7.75116e-01
2.85778e-01 4.49433e-02 6.44992e-01
3.14130e-01 9.14198e-02 4.75685e-01
4.26061e-01 1.66413e-01 3.52251e-01
5.47657e-01 2.69948e-01 2.94496e-01
5.59299e-01 3.00400e-01 2.09631e-01
6.49988e-01 4.12880e-01 1.31733e-01
7.29661e-01 5.21348e-01 8.53096e-02
7.97441e-01 6.39950e-01 4.26607e-02
8.43948e-01 7.08872e-01 2.54921e-02
8.73649e-01 7.56823e-01 2.01121e-02
9.05079e-01 8.12892e-01 1.29502e-02
9.38085e-01 8.78276e-01 4.13206e-03
9.73116e-01 9.44072e-01 1.55308e-03
9.86552e-01 9.73498e-01 1.85366e-04
9.99529e-01 9.98598e-01 2.14298e-05
9.99114e-01 9.98178e-01 1.04837e-06
9.99913e-01 9.99825e-01 7.61051e-09
9.99995e-01 9.99989e-01 2.83979e-11
Optimization terminated successfully.
Current function value: 0.000000
Iterations: 19
Function evaluations: 96
Gradient evaluations: 24
Number of calls to Simulator instance 96
Of course this is just a template; it can be adjusted to your needs. It does not provide all information about the status of the optimizer (like e.g. the Optimization Toolbox of MATLAB does), but at least you have some idea of the progress of the optimization.
A similar approach can be found here, without using the callback function. In my approach the callback function is used to print output exactly when the optimizer has finished an iteration, and not every single function call.
Which minimization function are you using, exactly?
Most of the functions have progress reporting built in, including multiple levels of reports showing exactly the data you want, via the disp flag (for example, see scipy.optimize.fmin_l_bfgs_b).
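For instance, a short sketch with fmin_l_bfgs_b (my example), where iprint controls how often a progress line is printed:
import numpy as np
from scipy.optimize import fmin_l_bfgs_b, rosen, rosen_der

x0 = np.array([1.1, 1.1, 1.1])
# iprint=1 prints progress at every iteration; a positive disp would override iprint
x, f, info = fmin_l_bfgs_b(rosen, x0, fprime=rosen_der, iprint=1)
print(x, f, info['warnflag'])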
Below is a solution that works for me:
import numpy as np
from scipy import optimize

def f_(x):  # the Rosenbrock function
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def conjugate_gradient(x0, f):
    all_x_i = [x0[0]]
    all_y_i = [x0[1]]
    all_f_i = [f(x0)]

    def store(X):
        x, y = X
        all_x_i.append(x)
        all_y_i.append(y)
        all_f_i.append(f(X))

    optimize.minimize(f, x0, method="CG", callback=store, options={"gtol": 1e-12})
    return all_x_i, all_y_i, all_f_i
For example:
conjugate_gradient([2, -1], f_)
It is also possible to include a simple print() statement in the function to be minimized. If you import the function, you can create a wrapper.
import numpy as np
from scipy.optimize import minimize

def rosen(X):  # Rosenbrock function
    print(X)
    return (1.0 - X[0])**2 + 100.0 * (X[1] - X[0]**2)**2 + \
           (1.0 - X[1])**2 + 100.0 * (X[2] - X[1]**2)**2

x0 = np.array([1.1, 1.1, 1.1], dtype=np.double)
minimize(rosen, x0)
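If the function comes from a library and cannot be edited, a small wrapper does the same job; a minimal sketch using scipy's built-in rosen (my example):
import numpy as np
from scipy.optimize import minimize, rosen

def rosen_verbose(X):
    print(X)  # log every point at which the objective is evaluated
    return rosen(X)

x0 = np.array([1.1, 1.1, 1.1], dtype=np.double)
minimize(rosen_verbose, x0)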
There you go! (Beware: Most of the time, global variables are bad practice.)
from scipy.optimize import minimize
import numpy as np

f = lambda x, b=.1: x[0]**2 + b * x[1]**2
x0 = np.array([2., 2.])
P = [x0]

def save(x):
    global P
    P.append(x)

minimize(f, x0=x0, callback=save)
fun: 4.608946876190852e-13
hess_inv: array([[4.99995194e-01, 3.78976566e-04],
[3.78976566e-04, 4.97011817e+00]])
jac: array([ 5.42429092e-08, -4.27698767e-07])
message: 'Optimization terminated successfully.'
nfev: 24
nit: 7
njev: 8
status: 0
success: True
x: array([ 1.96708740e-08, -2.14594442e-06])
print(P)
[array([2., 2.]),
array([0.99501244, 1.89950125]),
array([-0.0143533, 1.4353279]),
array([-0.0511755 , 1.11283405]),
array([-0.03556007, 0.39608524]),
array([-0.00393046, -0.00085631]),
array([-0.00053407, -0.00042556]),
array([ 1.96708740e-08, -2.14594442e-06])]