Sorry, the topic is a little finance-related, but this is also a scipy/Python question. For context, what I am trying to do is essentially the same as in these two blog posts:
https://quantdare.com/risk-parity-in-python/
https://thequantmba.wordpress.com/2016/12/14/risk-parityrisk-budgeting-portfolio-in-python/
So I have a bunch of returns on stocks, and I want to equalize the risk contributions of each stock. To do this I will need to solve for the weights that will give me an equal risk contribution for each using the scipy minimize optimizer.
I pass my target risk contributions and an initial guess into the optimizer. For example, with 6 stocks, my initial guess is simply 1/6 of the total 100% weight for each stock.
initial_weight = [0.16666666666667, 0.16666666666667, 0.16666666666667,
                  0.16666666666667, 0.16666666666667, 0.16666666666667]
risk_contrib_target = [0.16666666666667, 0.16666666666667, 0.16666666666667,
                       0.16666666666667, 0.16666666666667, 0.16666666666667]
This was taken from the quantmba link, so all credit to that guy. It looks right to me.
import numpy as np
from scipy.optimize import minimize, basinhopping

# risk budgeting optimization
def calculate_portfolio_var(w, V):
    # function that calculates portfolio risk
    w = np.matrix(w)
    return (w * V * w.T)[0, 0]

def calculate_risk_contribution(w, V):
    # function that calculates asset contribution to total risk
    w = np.matrix(w, dtype=object)
    sigma = np.sqrt(calculate_portfolio_var(w, V))
    # Marginal Risk Contribution
    MRC = V * w.T
    # Risk Contribution
    RC = np.multiply(MRC, w.T) / sigma
    RC = RC / sum(RC)
    return RC

def risk_budget_objective(x, pars):
    # calculate portfolio risk
    V = pars[0]    # covariance table
    x_t = pars[1]  # risk target in percent of portfolio risk
    sig_p = np.sqrt(calculate_portfolio_var(x, V))  # portfolio sigma
    risk_target = np.asmatrix(x_t, dtype=object)
    asset_RC = calculate_risk_contribution(x, V)
    J = sum(np.square(asset_RC - risk_target.T))[0, 0] * 1000  # sum of squared errors
    return J
I also have a list of dates that I am running through to solve this many times over a time period.
rebalance_dates = my_list_of_dates
I noticed that sometimes it doesn't solve this correctly. That is easy to check because, the way the problem is set up, the objective should reach 0 at the solution, and I can also check the risk contributions afterwards to see whether they hit my target. To get around this, I kick it over to basin hopping if SLSQP does not find that 0 solution; I think it is finding a local minimum rather than the global minimum, and I read that basin hopping is one way around that problem.
The get_returns_matrix function is just getting the data that I want from one of my files. This part is not important.
returns_matrix = get_returns_matrix(asset_returns, 60, date, components)
This is the optimization.
for date in rebalance_dates:
    print(date)
    returns_matrix = get_returns_matrix(asset_returns, 60, date, components)
    covariance = np.cov(returns_matrix)
    annual_covar = [map(lambda x: x * 260, group) for group in covariance]
    annual_covar = [list(x) for x in annual_covar]

    cons = ({'type': 'eq', 'fun': lambda x: np.sum(x) - 1.0},
            {'type': 'ineq', 'fun': lambda x: x})

    res = minimize(risk_budget_objective, initial_weight,
                   args=[annual_covar, risk_contrib_target], method='SLSQP',
                   constraints=cons,
                   options={'disp': False, 'ftol': 1e-11, 'eps': 5e-16, 'maxiter': 1000})

    if res.fun > 1e-11:
        print("Kick to basin hopping")
        minimizer_kwargs = dict(method="SLSQP", constraints=cons,
                                args=[annual_covar, risk_contrib_target],
                                options={'ftol': 1e-21, 'eps': 5e-16, 'maxiter': 100})
        res = basinhopping(risk_budget_objective, initial_weight, niter=50,
                           minimizer_kwargs=minimizer_kwargs)
I have two constraints: the weights need to sum to 100%, and all weights should be positive.
This solves correctly about 75% of the time; the other times it gets stuck at what I believe is a local minimum. A correct result looks like this:
| Category | Stock 1     | Stock 2     | Stock 3     | Stock 4     | Stock 5     | Stock 6     |
|----------|-------------|-------------|-------------|-------------|-------------|-------------|
| Weights  | 0.121465654 | 0.17829418  | 0.091558469 | 0.105659033 | 0.156959021 | 0.346063642 |
| Risk Con | 0.166666667 | 0.166666667 | 0.166666667 | 0.166666667 | 0.166666667 | 0.166666667 |
Function return val 0.0000000000
But occasionally (about 25% of the time) I will get a result that does not solve the problem, like this:
| Category | Stock 1     | Stock 2     | Stock 3     | Stock 4     | Stock 5     | Stock 6     |
|----------|-------------|-------------|-------------|-------------|-------------|-------------|
| Weights  | 0.159442825 | 0.166949713 | 0.235404372 | 0.175430619 | 0.262772472 | 0.000000000 |
| Risk Con | 0.199661774 | 0.199803048 | 0.200448716 | 0.199943667 | 0.200142796 | 0.000000000 |
Function return val 33.33371143
When it is wrong, it seems to completely disregard stock 6, giving it both a 0 weight and a 0 risk contribution.
Is there any parameter I am not using correctly in the solver? Sorry, this might be difficult to diagnose without the data I'm using, but I'm just wondering whether there is anything obviously wrong with my approach.
I also happen to know there is a solution to the cases scipy doesn't solve correctly, because I can do the same thing in an Excel spreadsheet with the GRG Nonlinear solver.
Thanks so much!
Basinhopping is a stochastic global optimizer. There is no guarantee that it will find the global optimum within the specified number of iterations.
From your description it sounds like you have a way of checking whether a solution is the global optimum. In that case you can use the callback parameter to cut the search short:
callback : callable, callback(x, f, accept), optional
A callback function which will be called for all minima found. x and f are the coordinates and function value of the trial minimum, and accept is whether or not that minimum was accepted. This can be used, for example, to save the lowest N minima found. Also, callback can be used to specify a user defined stop criterion by optionally returning True to stop the basinhopping routine.
def my_callback(x, f, accept):
    return minimum_is_global_minimum(x)
Then you can set niter to some large number and it will stop as soon as it finds the global minimum.
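For this particular problem the check can simply test the trial minimum's objective value, since a correct solution drives it to (numerically) zero. A sketch along those lines (the 1e-10 tolerance and the stop_when_solved name are my own, not from the question):

def stop_when_solved(x, f, accept):
    # f is the objective value at the trial minimum; (near) zero means the
    # risk contributions match the targets, so basinhopping can stop.
    return f < 1e-10

res = basinhopping(risk_budget_objective, initial_weight, niter=1000,
                   minimizer_kwargs=minimizer_kwargs,
                   callback=stop_when_solved)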
While going through an article, I came across the polynomial equation below.
For reference, below is the equation.
15446 = 537.06/(1+r) + 612.25/(1+r)**2 + 697.86/(1+r)**3 + 795.67/(1+r)**4 + 907.07/(1+r)**5
These are discounted cash flow values, which we use in finance to get the present value of future cash flows after applying the appropriate discount rate.
From the above equation I need to calculate the variable r in a Python programming environment. I hope there is some library that can be used to solve such equations.
To solve this, I thought of using the numpy.npv API.
import numpy as np

presentValue = 15446
futureValueList = [537.06, 612.25, 697.86, 795.67, 907.07]
# I know it is not possible to get r from the call below; I just put
# it like this to describe my intention.
presentValue = np.npv(r, futureValueList)
print(r)
You can multiply your NPV formula by the highest power of (1+r) and then find the roots of the resulting polynomial with polyroots (just take the only real root and disregard the complex ones):
import numpy as np
presentValue = 15446
futureValueList = [537.06, 612.25, 697.86,795.67, 907.07]
roots = np.polynomial.polynomial.polyroots(futureValueList[::-1]+[-presentValue])
r = roots[np.argwhere(roots.imag==0)].real[0,0] - 1
print(r)
#-0.3332398877886278
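As a cross-check (an addition on my part, assuming the numpy_financial package is available; older numpy versions exposed the same function as np.irr), an IRR routine should recover the same rate, since the equation is exactly an NPV = 0 condition:

import numpy_financial as npf

# Present value as an outflow at t=0, then the future cash flows.
cashflows = [-15446, 537.06, 612.25, 697.86, 795.67, 907.07]
print(npf.irr(cashflows))  # expected to be about -0.333, the same root as above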
As it turns out the formula given is incomplete, see p. 14 of the linked article. The correct equation can be solved with standard optimization procedures, e.g. optimize.root providing a sensible initial guess:
from scipy import optimize

def fun(r):
    r1 = 1 + r
    return 537.06/r1 + 612.25/r1**2 + 697.86/r1**3 + 795.67/r1**4 + 907.07/r1**5 * (1 + 1.0676/(r - .0676)) - 15446

roots = optimize.root(fun, [.1])
print(roots.x if roots.success else roots.message)
#[0.11177762]
The objective function is described below.
The idea is that a mean-variance optimization has already been done on a universe of securities. This gives us the weights for a target portfolio. Now suppose the investor already is holding a portfolio and does not want to change their entire portfolio to the target one.
Let w_0 = [w_0(1),w_0(2),...,w_0(N)] be the initial portfolio, where w_0(i) is the fraction of the portfolio invested in
stock i = 1,...,N. Let w_t = [w_t(1), w_t(2),...,w_t(N)] be the target portfolio, i.e., the portfolio
that it is desirable to own after rebalancing. This target portfolio may be constructed using quadratic optimization techniques such as variance minimization.
The objective is to decide the final portfolio w_f = [w_f (1), w_f (2),..., w_f(N)] that satisfies the
following characteristics:
(1) The final portfolio is close to our target portfolio
(2) The number of transactions from our initial portfolio is sufficiently small
(3) The return of the final portfolio is high
(4) The final portfolio does not hold many more securities than our initial portfolio
An objective function which is to be minimized is created by summing together the characteristic terms 1 through 4.
The first term is captured by summing the absolute difference in weights from the final and the target portfolio.
The second term is captured by the sum of an indicator function multiplied by a user specified penalty. The indicator function is y_{transactions}(i) where it is 1 if the weight of security i is different in the initial portfolio and the final portfolio, and 0 otherwise.
The third term is captured by the total final portfolio return multiplied by a negative user specified penalty since the objective is minimization.
The final term is the count of assets in the final portfolio (ie. sum of an indicator function counting the number of positive weights in the final portfolio), multiplied by a user specified penalty.
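Written out (in the notation above, with p_trans, p_ret and p_count standing in for the user-specified penalties; this is just a restatement of terms 1 through 4):

minimize over w_f:  sum_i |w_f(i) - w_t(i)|  +  p_trans * sum_i 1{w_f(i) != w_0(i)}  -  p_ret * sum_i r(i) * w_f(i)  +  p_count * sum_i 1{w_f(i) > 0}

where r(i) is the expected return of security i and 1{...} is an indicator that equals 1 when its condition holds.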
Assuming that we already have the target weights as target_w, how do I set up this optimization problem with the docplex Python library? Or, if anyone is familiar with mixed integer programming in NAG, it would be helpful to know how to set up such a problem there as well.
final_w = [0.] * n
final_w = np.array(final_w)

obj1 = np.sum(np.absolute(final_w - target_w))

pen_trans = 1.2
def ind_trans(final, inital):
    list_trans = []
    for i in range(len(final)):
        if abs(final[i] - inital[i]) == 0:
            list_trans.append(0)
        else:
            list_trans.append(1)
    return list_trans
obj2 = pen_trans * sum(ind_trans(final_w, initial_w))

pen_returns = 0.6
returns_np = np.array(df_secs['Return'])
obj3 = (-1) * np.dot(returns_np, final_w)

pen_count = 1.
def ind_count(final):
    list_count = []
    for i in range(len(final)):
        if final[i] == 0:
            list_count.append(0)
        else:
            list_count.append(1)
    return list_count
obj4 = sum(ind_count(final_w))

objective = obj1 + obj2 + obj3 + obj4
The main issue in your code is that final_w is not an array of variables but an array of data, so there is nothing to optimize. To create an array of variables in docplex you have to do something like this:
from docplex.mp.model import Model

with Model() as m:
    final = m.continuous_var_list(n, 0.0, 1.0)
That creates n variables that can take values between 0 and 1. With that in hand you can start things. For example:
obj1 = m.sum(m.abs(initial[i] - final[i]) for i in range(n))
For the next objective things become harder since you need indicator constraints. To simplify the definition of these constraints, first define helper variables delta[i] that give the absolute difference between the initial and final weight of each stock:
delta = m.continuous_var_list(n, 0.0, 1.0)
m.add_constraints(delta[i] == m.abs(initial[i] - final[i]) for i in range(n))
Next you need an indicator variable that is 1 if a transaction is required to adjust stock i:
needtrans = m.binary_var_list(n)
for i in range(n):
    # If needtrans[i] is 0 then delta[i] must be 0.
    # Since needtrans[i] is penalized in the objective, the solver will
    # try hard to set it to 0. It will only set it to 1 if delta[i] != 0.
    # That is exactly what we want.
    m.add_indicator(needtrans[i], delta[i] == 0, 0)
With that you can define the second objective:
obj2 = pen_trans * m.sum(needtrans)
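The remaining two terms can be handled the same way. A possible sketch (not from the original answer; it assumes pen_returns, pen_count and returns_np from the question are in scope, and it applies pen_returns to the return term as the question's prose describes):

# Return term: subtract (reward) the expected return of the final portfolio.
# float() avoids mixing numpy scalars with docplex expressions.
obj3 = -pen_returns * m.sum(float(returns_np[i]) * final[i] for i in range(n))

# Count term: a binary variable per security that must be 1 whenever final[i] > 0.
isheld = m.binary_var_list(n)
for i in range(n):
    # If isheld[i] is 0, final[i] is forced to 0; holding a security therefore
    # costs pen_count in the objective.
    m.add_indicator(isheld[i], final[i] == 0, 0)
obj4 = pen_count * m.sum(isheld)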
Once all four objectives have been defined, you can add their sum to the model:
m.minimize(obj1 + obj2 + obj3 + obj4)
and then solve the model and display its solution:
m.solve()
print(m.solution.get_values(final))
If any of the above is not (yet) clear to you then I suggest you take a look at the many examples that ship with docplex and also at the (reference) documentation.
I am working on a code to solve for the optimum combination of diameter size of number of pipelines. The objective function is to find the least sum of pressure drops in six pipelines.
As I have 15 choices of discrete diameter sizes, [2, 4, 6, 8, 12, 16, 20, 24, 30, 36, 40, 42, 50, 60, 80], that can be used for any of the six pipelines in the system, the number of possible solutions is 15^6, which is equal to 11,390,625.
To solve the problem, I am using Mixed-Integer Linear Programming with the PuLP package. I am able to find the solution for combinations of identical diameters (e.g. [2,2,2,2,2,2] or [4,4,4,4,4,4]), but what I need is to go through all combinations (e.g. [2,4,2,2,4,2] or [4,2,4,2,4,2]) to find the minimum. I attempted to do this, but the process takes a very long time to go through all combinations. Is there a faster way to do this?
Note that I cannot calculate the pressure drop for each pipeline in isolation, as the choice of diameter affects the total pressure drop in the system. Therefore, at any time, I need to calculate the pressure drop of each combination in the system.
I also need to constrain the problem such that the rate/cross section of pipeline area > 2.
Your help is much appreciated.
The first attempt for my code is the following:
from pulp import *
import random
import itertools
import numpy

rate = 5000
numberOfPipelines = 15

def pressure(diameter):
    diameterList = numpy.tile(diameter, numberOfPipelines)
    pressure = 0.0
    for pipeline in range(numberOfPipelines):
        pressure += rate / diameterList[pipeline]
    return pressure

diameterList = [2, 4, 6, 8, 12, 16, 20, 24, 30, 36, 40, 42, 50, 60, 80]
pipelineIds = range(0, numberOfPipelines)

pipelinePressures = {}
for diameter in diameterList:
    pressures = []
    for pipeline in range(numberOfPipelines):
        pressures.append(pressure(diameter))
    pressureList = dict(zip(pipelineIds, pressures))
    pipelinePressures[diameter] = pressureList
print('pipepressure', pipelinePressures)

prob = LpProblem("Warehouse Allocation", LpMinimize)

use_diameter = LpVariable.dicts("UseDiameter", diameterList, cat=LpBinary)
use_pipeline = LpVariable.dicts("UsePipeline", [(i, j) for i in pipelineIds for j in diameterList], cat=LpBinary)

## Objective Function:
prob += lpSum(pipelinePressures[j][i] * use_pipeline[(i, j)] for i in pipelineIds for j in diameterList)

## Each pipeline must be connected to exactly one diameter:
for i in pipelineIds:
    prob += lpSum(use_pipeline[(i, j)] for j in diameterList) == 1

## The diameter is activated if at least one pipeline is assigned to it:
for j in diameterList:
    for i in pipelineIds:
        prob += use_diameter[j] >= lpSum(use_pipeline[(i, j)])

## Run the solution
prob.solve()
print("Status:", LpStatus[prob.status])

for i in diameterList:
    if use_diameter[i].varValue > pressureTest:
        print("Diameter Size", i)
for v in prob.variables():
    print(v.name, "=", v.varValue)
This is what I did for the combination part, which took a really long time.
xList = np.array(list(itertools.product(diameterList, repeat=numberOfPipelines)))
print(len(xList))

for combination in xList:
    pressures = []
    for pipeline in range(numberOfPipelines):
        pressures.append(pressure(combination))
    pressureList = dict(zip(pipelineIds, pressures))
    pipelinePressures[combination] = pressureList
print('pipelinePressures', pipelinePressures)
I would iterate through all combinations; I think you would otherwise run into memory problems trying to model ALL combinations in a MIP.
If you iterate through the combinations, perhaps using the multiprocessing library to use all cores, it shouldn't take long. Just remember to keep only the best combination found so far, rather than generating all combinations at once and then evaluating them.
If the problem gets bigger you should consider dynamic programming algorithms or use PuLP with column generation.
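A rough sketch of that brute-force approach (illustrative only: the evaluate function and its placeholder pressure formula are mine, and repeat=6 corresponds to the six pipelines in the question; the real system-wide pressure-drop calculation would go inside evaluate):

import itertools
from multiprocessing import Pool

diameterList = [2, 4, 6, 8, 12, 16, 20, 24, 30, 36, 40, 42, 50, 60, 80]
rate = 5000

def evaluate(combination):
    # Placeholder objective: replace with the real pressure-drop model,
    # which depends on the whole combination at once.
    total_pressure = sum(rate / d for d in combination)
    return total_pressure, combination

if __name__ == "__main__":
    combos = itertools.product(diameterList, repeat=6)  # generated lazily, never stored
    with Pool() as pool:
        # imap_unordered streams results across all cores; min() keeps only the best.
        best_pressure, best_combo = min(pool.imap_unordered(evaluate, combos, chunksize=10000))
    print(best_pressure, best_combo)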
We are trying to run a cluster analysis on a large amount of data. We are fairly new to Python and found out that an iterative function is much more efficient than a recursive one. Now we are trying to make that change, but it is much harder than we thought.
The code below is the heart of our clustering function, and it takes over 90 percent of the run time. Can you help us change it into an iterative one?
Some extra information: the taunach function gets the neighbours of a point, which will later form the clusters. The problem is that we have very many points.
def taunach(tau, delta, i, s, nach, anz):
    dis = tabelle[s].dist
    # delta = tau
    x = data[i]
    y = Skalarprodukt(data[tabelle[s].index] - x)
    a = tau - abs(dis)
    # LA.norm(data[tabelle[s].index]-x)
    if y < a * abs(a):
        nach.update({item.index for item in tabelle[tabelle[s].inner:tabelle[s].outer - 1]})
        anz = anzahl(delta, i, tabelle[s].inner, anz)
        if dis > -1:
            b = dis - tau
            if y >= b * abs(b):  # *(1-0.001):
                nach, anz = taunach(tau, delta, i, tabelle[s].outer, nach, anz)
    else:
        if y < tau ** 2:
            nach.add(tabelle[s].index)
            if y < delta:
                anz += 1
        if tabelle[s].dist > -4:
            b = dis - tau
            if y >= b * abs(b):  # *(1-0.001)):
                nach, anz = taunach(tau, delta, i, tabelle[s].outer, nach, anz)
        if tabelle[s].dist > -1:
            if y <= (dis + tau) ** 2:
                nach, anz = taunach(tau, delta, i, tabelle[s].inner, nach, anz)
    return nach, anz
With the comments from the answer, I rewrote the code below (math.log1p(x) -> math.log(x)), which now should work and give a good approximation of the volatility.
I am trying to create a short code to calculate the implied volatility of a European Call option. I wrote the code below:
from scipy.stats import norm
import math

norm.cdf(1.96)

#c_p - Call(+1) or Put(-1) option
#P - Price of option
#S - Stock (spot) price
#E - Exercise (strike) price
#T - Time to expiration
#r - Risk-free rate
#C = SN(d_1) - Ee^{-rT}N(D_2)
def implied_volatility(Price, Stock, Exercise, Time, Rf):
    P = float(Price)
    S = float(Stock)
    E = float(Exercise)
    T = float(Time)
    r = float(Rf)
    sigma = 0.01
    print(P, S, E, T, r)
    while sigma < 1:
        d_1 = float(float((math.log(S/E)+(r+(sigma**2)/2)*T))/float((sigma*(math.sqrt(T)))))
        d_2 = float(float((math.log(S/E)+(r-(sigma**2)/2)*T))/float((sigma*(math.sqrt(T)))))
        P_implied = float(S*norm.cdf(d_1) - E*math.exp(-r*T)*norm.cdf(d_2))
        if P-(P_implied) < 0.001:
            return sigma
        sigma += 0.001
    return "could not find the right volatility"

print(implied_volatility(15, 100, 100, 1, 0.05))
This yields a volatility of 0.595, which should be somewhere around 0.3203. That is a huge difference...
I know this is not a fast method by any means, I just want to demonstrate how the principle works, but I am not able to calculate a good approximation.
For some reason, when I call the function it gives me a really bad approximation of the actual implied volatility, which I calculated using a Matlab program and the following webpage: Implied Volatility. Could anyone please help me figure out where I made the mistake?
There are two problems I see, none of which are directly python related:
You are using log1p(x), which is the natural logarithm of 1+x, while you actually want log(x), which is the natural logarithm of x (cf. Wikipedia).
An option price of 100 is way too high considering the other parameters. Try to calculate the implied volatility for a price of 10 - which should be about 0.18 both by your program and the calculator you linked.
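For reference, calling the question's function with those inputs (after the log fix) would look like this; the ~0.18 figure is the one quoted above, and the exact value depends on the 0.001 step size of the loop:

print(implied_volatility(10, 100, 100, 1, 0.05))  # about 0.18, per the answer above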
In Python2, the result of 5 / 2 is 2. It uses floor division. To fix that, make every number a float. In your implied_volatility function, change P = Price to P = float(Price), S = Stock to S = float(Stock), etc.
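A quick illustration of the division behaviour being described (under Python 2; Python 3 already gives 2.5 on the first line):

print(5 / 2)         # 2 in Python 2: integer / integer uses floor division
print(float(5) / 2)  # 2.5 once either operand is a float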