Python - solving for a risk budget portfolio using CVXPY

I'm looking to write code that lets me set risk budget constraints on individual positions in a portfolio, i.e. each position contributes a set amount of risk to the portfolio. I want to do this specifically in CVXPY, because I have noticed that SciPy sometimes breaks the constraints.
I have the code below. Could you give me some direction? I have run into CVXPY's "The objective is not DCP" error and I'm not sure how to correct it.
Please also let me know if I should rephrase my question to make it clearer.
import cvxpy as cp
import numpy as np
# covmat is a (31,31) numpy array, covariance matrix calculated from monthly returns
risk_budget = np.repeat(1/31, 31).reshape(-1,1)
def risk_budget_objective(risk_budget, covmat):
    n = covmat.shape[0]
    # set equal weight
    equal_wts = np.repeat(1 / n, n)
    # weights vertical
    wts = cp.Variable((n,1))
    constraints = [cp.sum(wts) == 1.0] # weight constraints
    port_variance = cp.square(cp.quad_form(wts, covmat)) # portfolio variance, not volatility
    mrc = covmat @ wts * 12 # vector # marginal risk contribution annualised
    risk_contrib = cp.multiply(mrc, wts) / port_variance # calculate risk contribution
    mean_square_diff = cp.sum(cp.square(risk_contrib - risk_budget)) # squared difference and summed
    prob = cp.Problem(cp.Minimize(mean_square_diff), constraints) # minimise squared difference
    prob.solve() # this is where the DCP error is raised
    if prob.status not in ["infeasible", "unbounded"]:
        solution = wts.value
        return solution
    print('Problem not feasible... resorting to equal weight...')
    return equal_wts
Looking into the objects, it seems that the problem lies in the following line:
risk_contrib = cp.multiply(mrc, wts) / port_variance
Its curvature is "UNKNOWN" rather than convex.
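To make the quantity in question concrete, here is a minimal NumPy sketch of the risk contributions for a fixed weight vector. The function name risk_contributions and the 3-asset covariance matrix are made up for illustration. With numeric weights this is plain arithmetic. With a CVXPY variable, the same expression multiplies two variable expressions and divides by a third, and as far as I understand the DCP rules, neither operation is allowed.

```python
import numpy as np

def risk_contributions(weights, covmat):
    """Return each position's fractional contribution to portfolio variance."""
    weights = np.asarray(weights, dtype=float)
    marginal = covmat @ weights                 # marginal risk contribution, Sigma w
    port_variance = weights @ marginal          # portfolio variance, w' Sigma w
    return weights * marginal / port_variance   # fractions that sum to 1

# Hypothetical 3-asset covariance matrix, for illustration only
covmat = np.array([[0.04, 0.01, 0.00],
                   [0.01, 0.09, 0.02],
                   [0.00, 0.02, 0.16]])
rc = risk_contributions(np.repeat(1 / 3, 3), covmat)
print(rc, rc.sum())
```

A risk budget portfolio is one where this vector matches the target budget. The sketch only evaluates the vector and does not solve for the weights.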


Why is ifft2 not working when fft2 is fine?

I want to implement ifft2 using DFT matrix. The following code works for fft2.
import numpy as np
def DFT_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp( - 2 * np.pi * 1j / N )
    W = np.power( omega, i * j ) # Normalization by sqrt(N) Not included
    return W
# Matrix multiply the 3 matrices together
mA = dftMtxM @ rA @ dftMtxN
print(np.allclose(np.abs(mA), np.abs(rAfft)))
print(np.allclose(np.angle(mA), np.angle(rAfft)))
To get to ifft2, I assumed I only need to change the DFT matrix to its transpose, so I expected the following to work, but I got False for the last two prints. Any suggestions, please?
import numpy as np
def DFT_matrix(N):
    i, j = np.meshgrid(np.arange(N), np.arange(N))
    omega = np.exp( - 2 * np.pi * 1j / N )
    W = np.power( omega, i * j ) # Normalization by sqrt(N) Not included
    return W
# Matrix multiply the 3 matrices together
mA = dftMtxM @ rA @ dftMtxN
print(np.allclose(np.abs(mA), np.abs(rAfft)))
print(np.allclose(np.angle(mA), np.angle(rAfft)))
I am going to be building on some things from my answer to your previous question. Please note that I will try to distinguish between the terms Discrete Fourier Transform (DFT) and Fast Fourier Transform (FFT). Remember that the DFT is the transform, while the FFT is only an efficient algorithm for computing it. People, including myself, nevertheless very commonly refer to the DFT as the FFT, since it is practically the only algorithm used for computing the DFT.
The problem here is again the normalization of the data. It's interesting that this is such a fundamental and confusing part of any DFT operation, yet I couldn't find a good explanation on the internet. I will try to provide a summary of DFT normalization at the end; however, I think the best way to understand this is by working through some examples yourself.
Why the comparisons fail?
It's important to note that even though both of the allclose tests seemingly fail, they are actually not a very good method of comparing two complex number arrays.
Difference between two angles
In particular, the problem is when it comes to comparing angles. If you just take the difference of two close angles that are on the border between -pi and pi, you can get a value that is around 2*pi. allclose just takes differences between values and checks that they are below some threshold. Thus, in our case, it can report a false negative.
A better way to compare angles is something along the lines of this function:
def angle_difference(a, b):
    diff = a - b
    diff[diff < -np.pi] += 2*np.pi
    diff[diff > np.pi] -= 2*np.pi
    return diff
You can then take the maximum absolute value and check that it's below some threshold:
np.max(np.abs(angle_difference(np.angle(mA), np.angle(rAfft)))) < threshold
In the case of your example, the maximum difference was 3.072209153742733e-12.
So the angles are actually correct!
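As an alternative sketch of the same idea, the wrap-around can be delegated to the complex exponential: np.angle always returns a value in (-pi, pi], so taking the angle of exp(1j * (a - b)) wraps the difference automatically. The name wrapped_angle_difference and the two test angles are made up for this illustration.

```python
import numpy as np

def wrapped_angle_difference(a, b):
    """Elementwise difference a - b, wrapped into (-pi, pi]."""
    return np.angle(np.exp(1j * (np.asarray(a) - np.asarray(b))))

# Two angles that are nearly identical but sit on opposite sides of the +-pi border
a = np.array([np.pi - 1e-9, 0.5])
b = np.array([-np.pi + 1e-9, 0.5])

print(np.allclose(a, b))                        # naive comparison of raw angles
max_diff = np.max(np.abs(wrapped_angle_difference(a, b)))
print(max_diff)                                 # wrapped difference
```

Unlike the in-place version above, this does not modify its inputs and also accepts scalars.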
Magnitude scaling
We can get an idea of what the issue is when we look at the magnitude ratio between the matrix iDFT and the library iFFT.
We find that every ratio is 800, which means that our absolute values are 800 times larger than those computed by the library. Suspiciously, 800 = 40 * 20, the dimensions of our data! I think you can see where I am going with this.
Confusing DFT normalization
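We can see why this is the case when we look at the FFT formulas from the NumPy FFT documentation:
$$A_k = \sum_{m=0}^{n-1} a_m \exp\left(-2\pi i \frac{mk}{n}\right), \quad k = 0, \ldots, n-1$$
$$a_m = \frac{1}{n} \sum_{k=0}^{n-1} A_k \exp\left(2\pi i \frac{mk}{n}\right), \quad m = 0, \ldots, n-1$$
You will notice that the forward transform doesn't normalize by anything, while the inverse transform multiplies the output by 1/N (written 1/n above). These are the 1D FFTs, but the exact same thing applies in the 2D case: the inverse transform multiplies everything by 1/(N*M).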
So in our example, if we update this line, we will get the magnitudes to agree:
mA = dftMtxM @ rA/(sizeM * sizeN) @ dftMtxN
A side note on comparing the outputs: an alternative way to compare complex numbers is to compare their real and imaginary components:
print(np.allclose(mA.real, rAfft.real))
print(np.allclose(mA.imag, rAfft.imag))
And we find that now indeed both methods agree.
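For completeness, here is a self-contained sketch that checks both directions against NumPy. It assumes a 40 x 20 random real input (the dimensions mentioned above; the seed is arbitrary) and uses a hypothetical helper dft_matrix with an inverse flag. Following NumPy's definition, the inverse uses the opposite sign in the exponent (the complex conjugate of the forward matrix) and is divided by M*N.

```python
import numpy as np

def dft_matrix(n, inverse=False):
    """Unnormalized DFT matrix; uses the conjugate exponent when inverse=True."""
    i, j = np.meshgrid(np.arange(n), np.arange(n))
    sign = 1 if inverse else -1
    return np.exp(sign * 2j * np.pi * i * j / n)

size_m, size_n = 40, 20
rng = np.random.default_rng(0)           # arbitrary seed, for reproducibility only
r_a = rng.random((size_m, size_n))

# Forward 2D transform: no normalization
forward = dft_matrix(size_m) @ r_a @ dft_matrix(size_n)

# Inverse 2D transform: conjugate matrices and division by M*N
inverse = (dft_matrix(size_m, inverse=True) @ r_a
           @ dft_matrix(size_n, inverse=True)) / (size_m * size_n)

print(np.allclose(forward, np.fft.fft2(r_a)))
print(np.allclose(inverse, np.fft.ifft2(r_a)))
```

Comparing the complex arrays directly with allclose sidesteps the angle wrap-around problem described earlier.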
Why all this normalization mess and which should I use?
The fundamental property that the DFT must satisfy is that iDFT(DFT(x)) = x. When you work through the math, you find that the product of the two coefficients before the sums has to be 1/N.
There is also something called Parseval's theorem. In simple terms, it states that the energy of a signal is the sum of the squared absolute values, in both the time domain and the frequency domain. For the FFT (with NumPy's default normalization) this boils down to this relationship:
$$\sum_{n=0}^{N-1} |x[n]|^2 = \frac{1}{N} \sum_{k=0}^{N-1} |X[k]|^2$$
Here is the function for computing the energy of a signal:
def energy(x):
return np.sum(np.abs(x)**2)
You are basically faced with a choice about the 1/N factor:
You can put the 1/N before the DFT sum. This makes sense, as then the k=0 DC component will be equal to the average of the time domain values. However, you will have to multiply the energy in the frequency domain by N in order to match it with the time domain energy.
N = len(x)
X = np.fft.fft(x)/N # Compute the FFT scaled by `1/N`
# Energy related by `N`
np.allclose(energy(x), energy(X) * N) == True
# Perform some processing...
Y = X * H
y = np.fft.ifft(Y*N) # Compute the iFFT, remember to cancel out the built in `1/N` of ifft
You can put the 1/N before the iDFT. This is, slightly counterintuitively, what most implementations, including NumPy, do. I could not find a definitive consensus on the reasoning behind this, but I think it has something to do with implementation efficiency. (If anyone has a better explanation for this, please leave it in the comments.) As shown in the equations earlier, the energy in the frequency domain has to be divided by N to match the time domain energy.
N = len(x)
X = np.fft.fft(x) # Compute the FFT without scaling
# Energy, related by 1/N
np.allclose(energy(x), energy(X) / N) == True
# Perform some processing...
Y = X * H
y = np.fft.ifft(Y) # Compute the iFFT with the built-in `1/N`
You can split the 1/N by placing 1/sqrt(N) before each of the transforms, making them perfectly symmetric. In NumPy, you can pass the parameter norm="ortho" to the fft functions, which makes them use the 1/sqrt(N) normalization instead: np.fft.fft(x, norm="ortho"). The nice property here is that the energy now matches in both domains.
X = np.fft.fft(x, norm='ortho') # Compute the FFT scaled by `1/sqrt(N)`
# Perform some processing...
# Energy are equal:
np.allclose(energy(x), energy(X)) == True
Y = X * H
y = np.fft.ifft(Y, norm='ortho') # Compute the iFFT, with scaling by `1/sqrt(N)`
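The three conventions above can be checked side by side. This is a small sketch on a random 64-sample signal (the length and seed are arbitrary), with the filter step H left out so that only the scaling is tested.

```python
import numpy as np

def energy(x):
    return np.sum(np.abs(x) ** 2)

rng = np.random.default_rng(1)            # arbitrary seed
x = rng.standard_normal(64)
N = len(x)

X_forward = np.fft.fft(x) / N             # 1/N applied by hand on the forward transform
X_default = np.fft.fft(x)                 # NumPy default: 1/N lives in ifft
X_ortho = np.fft.fft(x, norm="ortho")     # 1/sqrt(N) on both transforms

print(np.allclose(energy(x), energy(X_forward) * N))
print(np.allclose(energy(x), energy(X_default) / N))
print(np.allclose(energy(x), energy(X_ortho)))

# Each convention round-trips as long as the inverse uses the matching scaling
print(np.allclose(np.fft.ifft(X_forward * N), x))
print(np.allclose(np.fft.ifft(X_default), x))
print(np.allclose(np.fft.ifft(X_ortho, norm="ortho"), x))
```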
In the end it boils down to what you need. Most of the time the absolute magnitude of your DFT is actually not that important. You are mostly interested in the ratio of various components, or you want to perform some operation in the frequency domain and then transform back to the time domain, or you are interested in the phase (angles). In all of these cases, the normalization does not really play an important role, as long as you stay consistent.

CVXPY failing randomly on basic quadratic problem

I'm finding CVXPY is randomly failing with the following error:
ArpackError: ARPACK error 3: No shifts could be applied during a cycle of the Implicitly restarted
Arnoldi iteration. One possibility is to increase the size of NCV relative to NEV.
The code below is a minimal example where it is just trying to do mean variance optimisation with no constraints, identity correlation matrix, and normally distributed mean vector. Roughly once in every thousand runs this fails. It doesn't seem to matter which solver I ask it to use, which makes me think it is failing setting up the problem?
import cvxpy as cp
import numpy as np
n = 199
mu = np.random.normal(size = n)
C = np.eye(n)
for repeat in range(1000):
    x = cp.Variable(n)
    mean = x.T @ mu
    variance = cp.quad_form(x, C)
    objective = cp.Maximize(mean - variance)
    constraints = []
    prob = cp.Problem(objective, constraints)
    result = prob.solve()
    print(repeat, end = " ")

How do I find the percentage error in a Monte Carlo algorithm?

I have written a Monte Carlo program to integrate a function f(x).
I have now been asked to calculate the percentage error.
Having done a quick literature search, I found that this can be given with the equation %error = (sqrt(var[f(x)]/n))*100, where n is the number of random points I used to derive my answer.
However, when I run my integration code, my percentage error is greater than that given by this formula.
Do I have the correct formula?
Any help would be greatly appreciated. Thanks x
Here is a quick example: estimate the integral of a linear function on the interval [0, 1] using Monte Carlo. To estimate the error you have to collect the second moment (values squared), then compute the variance, the standard deviation, and (assuming the CLT) the error of the simulation in the original units as well as in %.
Code (Python 3.7, Anaconda, Windows 10 x64):
import numpy as np
def f(x): # linear function to integrate
    return x
N = 100000
x = np.random.random(N)
q = f(x) # first moment
q2 = q*q # second moment
mean = np.sum(q) / float(N) # compute mean explicitly, not using np.mean
var = np.sum(q2) / float(N) - mean * mean # variance as E[X^2] - E[X]^2
sd = np.sqrt(var) # std.deviation
print(mean) # should be 1/2
print(var) # should be 1/12
print(sd) # should be 0.5/sqrt(3)
sigma = sd / np.sqrt(float(N)) # assuming CLT, error estimation in original units
print("result = {0} with error +- {1}".format(mean, sigma))
err_pct = sigma / mean * 100.0 # error estimate in percents
print("result = {0} with error +- {1}%".format(mean, err_pct))
Be aware that we computed a one-sigma error, and (even setting aside that it is a random value itself) the true result is within the printed mean +- error for only about 68% of the runs. You could print mean +- 2*error, which would contain the true result in about 95% of cases, or mean +- 3*error, which would contain it in about 99.7% of the runs, and so on.
For the sampling variance estimate, there is a known problem: bias in the estimator. Basically, we slightly underestimate the sampling variance, so a proper correction (Bessel's correction) should be applied:
var = np.sum(q2) / float(N) - mean * mean # variance as E[X^2] - E[X]^2
var *= float(N)/float(N-1)
In many cases (and many examples) it is omitted because N is very large, which makes the correction pretty much invisible. For example, if you have a statistical error of 1% but N is in the millions, the correction is of no practical use.
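The same calculation can be packaged into a small helper. This is only a sketch: mc_integrate is a made-up name, the integration interval is assumed to be [0, 1], and the seed is arbitrary. Passing ddof=1 to std applies Bessel's correction directly.

```python
import numpy as np

def mc_integrate(f, n, rng):
    """Monte Carlo estimate of the integral of f over [0, 1] and its one-sigma error."""
    samples = f(rng.random(n))
    estimate = samples.mean()
    std_error = samples.std(ddof=1) / np.sqrt(n)   # ddof=1 applies Bessel's correction
    return estimate, std_error

rng = np.random.default_rng(42)                    # arbitrary seed
estimate, std_error = mc_integrate(lambda x: x, 100_000, rng)
pct_error = 100.0 * std_error / estimate           # error relative to the estimate, in %
print(f"result = {estimate:.5f} +- {std_error:.5f} ({pct_error:.3f}%)")
```

Note that the percentage is taken relative to the estimate itself, which is the division by mean in the code above.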

High-frequency noise when solving a differential equation

I'm trying to simulate a simple diffusion based on Fick's 2nd law.
from pylab import *
import numpy as np
gridpoints = 128
def profile(x):
    range = 2.
    straggle = .1576
    dose = 1
    return dose/(sqrt(2*pi)*straggle)*exp(-(x-range)**2/2/straggle**2)
x = linspace(0,4,gridpoints)
nx = profile(x)
dx = x[1] - x[0] # use np.diff(x) if x is not uniform
dxdx = dx**2
timestep = 0.5
steps = 21
diffusion_coefficient = 0.002
for i in range(steps):
    coefficients = [-1.785714e-3, 2.539683e-2, -0.2e0, 1.6e0,
                    -2.847222e0, # central coefficient (-205/72) of the 9-point stencil
                    1.6e0, -0.2e0, 2.539683e-2, -1.785714e-3]
    ccf = (np.convolve(nx, coefficients) / dxdx)[4:-4] # second order derivative
    nx = timestep*diffusion_coefficient*ccf + nx
For the first few time steps everything looks fine, but then I start to get high-frequency noise, due to the build-up of numerical errors which are amplified through the second derivative. Since it seems to be hard to increase the float precision, I'm hoping that there is something else I can do to suppress this. I already increased the number of points that are used to construct the second derivative.
I don't have the time to study your solution in detail, but it seems that you are solving the partial differential equation with a forward Euler scheme. This is pretty easy to implement, as you show, but it can become numerically unstable if your timestep is too large. Your only options are to reduce the timestep or to reduce the spatial resolution (increase dx).
The easiest way to explain this is for the 1-D case: assume your concentration is a function of spatial coordinate x and timestep i. If you do all the math (write down your equations, substitute the partial derivatives with finite differences, should be pretty easy), you will probably get something like this:
C(x, i+1) = [1 - 2 * k] * C(x, i) + k * [C(x - 1, i) + C(x + 1, i)]
so the concentration of a point on the next step depends on its previous value and the ones of its two neighbors. It is not too hard to see that when k = 0.5, every point gets replaced by the average of its two neighbors, so a concentration profile of [...,0,1,0,1,0,...] will become [...,1,0,1,0,1,...] on the next step. If k > 0.5, such a profile will blow up exponentially. You calculate your second order derivative with a longer convolution (I effectively use [1,-2,1]), but I guess that does not change anything for the instability problem.
I don't know about normal diffusion, but based on experience with thermal diffusion, I would guess that k scales with dt * diffusion_coeff / dx^2. You thus have to choose your timestep small enough that your simulation does not become unstable. To make the simulation stable, but still as fast as possible, choose your parameters so that k is a bit smaller than 0.5. Something similar can be derived for 2-D and 3-D cases. The easiest way to achieve this is to increase dx, since your total calculation time will scale with 1/dx^3 for a 1-D problem, 1/dx^4 for 2-D problems, and even 1/dx^5 for 3-D problems.
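As a rough check of this argument, the sketch below computes k for the parameters in the question and runs the simple [1, -2, 1] scheme (not the question's nine-point stencil) on either side of k = 0.5. The helper names ftcs_step and max_after are made up for this illustration, and holding the end points fixed is an assumption.

```python
import numpy as np

def ftcs_step(c, k):
    """One forward Euler step with the [1, -2, 1] stencil; end points held fixed."""
    new = c.copy()
    new[1:-1] = (1 - 2 * k) * c[1:-1] + k * (c[:-2] + c[2:])
    return new

# Parameters taken from the question
gridpoints, length = 128, 4.0
diffusion_coefficient, timestep = 0.002, 0.5
dx = length / (gridpoints - 1)
k = diffusion_coefficient * timestep / dx**2
print(k)                                   # compare against the 0.5 limit

def max_after(k, steps=200):
    """Largest absolute value after `steps` updates, starting from a single spike."""
    c = np.zeros(gridpoints)
    c[gridpoints // 2] = 1.0               # a spike excites all spatial frequencies
    for _ in range(steps):
        c = ftcs_step(c, k)
    return np.max(np.abs(c))

print(max_after(0.45), max_after(0.55))    # bounded below the limit, growing above it
```

With the question's parameters k comes out above 0.5, so halving the timestep or using fewer grid points would be the first thing to try.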
There are better methods to solve diffusion equations. I believe that Crank-Nicolson is at least standard for solving heat equations (which are also diffusion problems). The 'problem' is that this is an implicit method, which means that you have to solve a set of equations to calculate your 'concentration' at the next timestep, which is a bit of a pain to implement. But this method is guaranteed to be numerically stable, even for big timesteps.

Truncated multivariate normal in SciPy?

I'm trying to automate a process that at some point needs to draw samples from a truncated multivariate normal. That is, it's an ordinary multivariate normal distribution (i.e. Gaussian), but the variables are constrained to a cuboid. My given inputs are the mean and covariance of the full multivariate normal, but I need samples in my box.
Up to now, I'd just been rejecting samples outside the box and resampling as necessary, but I'm starting to find that my process sometimes gives me (a) large covariances and (b) means that are close to the edges. These two events conspire against the speed of my system.
So what I'd like to do is sample the distribution correctly in the first place. Googling led only to this discussion or the truncnorm distribution in scipy.stats. The former is inconclusive and the latter seems to be for one variable. Is there any native multivariate truncated normal? And is it going to be any better than rejecting samples, or should I do something smarter?
I'm going to start working on my own solution, which would be to rotate the untruncated Gaussian to its principal axes (with an SVD or something), use a product of truncated Gaussians to sample the distribution, then rotate that sample back, and reject/resample as necessary. If the truncated sampling is more efficient, I think this should sample the desired distribution faster.
So, according to the Wikipedia article, sampling a multivariate truncated normal distribution (MTND) is more difficult. I ended up taking a relatively easy way out and using an MCMC sampler to relax an initial guess towards the MTND as follows.
I used emcee to do the MCMC work. I find this package phenomenally easy-to-use. It only requires a function that returns the log-probability of the desired distribution. So I defined this function
import numpy as np
from numpy.linalg import inv
def lnprob_trunc_norm(x, mean, bounds, C):
    if np.any(x < bounds[:,0]) or np.any(x > bounds[:,1]):
        return -np.inf
    return -0.5*(x-mean).dot(inv(C)).dot(x-mean)
Here, C is the covariance matrix of the multivariate normal. Then, you can run something like
S = emcee.EnsembleSampler(Nwalkers, Ndim, lnprob_trunc_norm, args = (mean, bounds, C))
pos, prob, state = S.run_mcmc(pos, Nsteps)
for given mean, bounds and C. You need an initial guess for the walkers' positions pos, which could be a ball around the mean,
pos = emcee.utils.sample_ball(mean, np.sqrt(np.diag(C)), size=Nwalkers)
or sampled from an untruncated multivariate normal,
pos = np.random.multivariate_normal(mean, C, size=Nwalkers)
and so on. I personally do several thousand steps of sample discarding first, because it's fast, then force the remaining outliers back within the bounds, then run the MCMC sampling.
The number of steps for convergence is up to you.
Note also that emcee easily supports basic parallelization by adding the argument threads=Nthreads to the EnsembleSampler initialization. So you can make this blazing fast.
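One small refinement, as a sketch: the log-probability above inverts C on every call, and the sampler calls it many times. The hypothetical factory make_lnprob_trunc_norm below computes the inverse once and returns a function of x alone, so no extra args would be needed. The 2-D mean, bounds and covariance are made-up test values.

```python
import numpy as np

def make_lnprob_trunc_norm(mean, bounds, C):
    """Build a log-probability function with the inverse covariance computed once."""
    mean = np.asarray(mean, dtype=float)
    bounds = np.asarray(bounds, dtype=float)
    precision = np.linalg.inv(C)

    def lnprob(x):
        # Outside the box the density is zero
        if np.any(x < bounds[:, 0]) or np.any(x > bounds[:, 1]):
            return -np.inf
        d = x - mean
        return -0.5 * d @ precision @ d

    return lnprob

# Made-up 2-D example
mean = np.array([0.0, 0.0])
bounds = np.array([[-1.0, 1.0], [-1.0, 1.0]])
C = np.array([[1.0, 0.5], [0.5, 1.0]])
lnprob = make_lnprob_trunc_norm(mean, bounds, C)
inside = lnprob(np.array([0.5, -0.5]))
outside = lnprob(np.array([2.0, 0.0]))
print(inside, outside)
```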
I have reimplemented an algorithm which does not depend on MCMC but creates independent and identically distributed (iid) samples from the truncated multivariate normal distribution. Having iid samples can be very useful! I used to also use emcee as described in the answer by Warrick, but for convergence the number of samples needed exploded in higher dimensions, making it impractical for my use case.
The algorithm was introduced by Botev (2016) and uses an accept-reject algorithm based on minimax exponential tilting. It was originally implemented in MATLAB but reimplementing it for Python increased the performance significantly compared to running it using the MATLAB engine in Python. It also works well and is fast at higher dimensions.
The code is available at:
An Example:
d = 10 # dimensions
# random mu and cov
mu = np.random.rand(d)
cov = 0.5 - np.random.rand(d ** 2).reshape((d, d))
cov = np.triu(cov)
cov += cov.T - np.diag(cov.diagonal())
cov = np.dot(cov, cov)
# constraints
lb = np.zeros_like(mu) - 1
ub = np.ones_like(mu) * np.inf
# create truncated normal and sample from it
n_samples = 100000
tmvn = TruncatedMVN(mu, cov, lb, ub)
samples = tmvn.sample(n_samples)
Plotting the first dimension results in:
Botev, Z. I., (2016), The normal law under linear restrictions: simulation and estimation via minimax tilting, Journal of the Royal Statistical Society Series B, 79, issue 1, p. 125-148
Simulating truncated multivariate normal can be tricky and usually involves some conditional sampling by MCMC.
My short answer is: you can use my code. It implements a Gibbs sampler algorithm that can handle general linear constraints of the form lower < D X < upper, even when D is not full rank and there are more constraints than dimensions.
import numpy as np
from trun_mvnt import rtmvn, rtmvt
########## Traditional problem, probably what you need... ##########
##### lower < X < upper #####
# So D = identity matrix
D = np.diag(np.ones(4))
lower = np.array([-1,-2,-3,-4])
upper = -lower
Mean = np.zeros(4)
Sigma = np.diag([1,2,3,4])
n = 10 # want 10 final samples
burn = 100 # burn-in first 100 iterates
thin = 1 # thinning for Gibbs
random_sample = rtmvn(n, Mean, Sigma, D, lower, upper, burn, thin)
# Numpy array n-by-p as result!
########## Non-full rank problem (more constraints than dimension) ##########
Mean = np.array([0,0])
Sigma = np.array([1, 0.5, 0.5, 1]).reshape((2,2)) # bivariate normal
D = np.array([1,0,0,1,1,-1]).reshape((3,2)) # non-full rank problem
lower = np.array([-2,-1,-2])
upper = np.array([2,3,5])
n = 500 # want 500 final sample
burn = 100 # burn-in first 100 iterates
thin = 1 # thinning for Gibbs
random_sample = rtmvn(n, Mean, Sigma, D, lower, upper, burn, thin) # Numpy array n-by-p as result!
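To show what the conditional sampling looks like in the simplest case, here is a sketch of a Gibbs sampler for box constraints only (D equal to the identity). It is not the algorithm from the package above; the function name gibbs_truncated_mvn, its parameters, and the bounds of -1 and 1 are made up for this illustration. Each coordinate is drawn from its one-dimensional truncated normal conditional by inverse-CDF sampling, using NormalDist from the standard library.

```python
import numpy as np
from statistics import NormalDist

def gibbs_truncated_mvn(mean, cov, lower, upper, n_samples, burn=100, seed=0):
    """Gibbs sampler for a multivariate normal restricted to the box [lower, upper]."""
    rng = np.random.default_rng(seed)
    std_normal = NormalDist()
    mean = np.asarray(mean, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    precision = np.linalg.inv(cov)
    x = np.clip(mean, lower, upper)            # start from a point inside the box
    out = np.empty((n_samples, len(mean)))
    for it in range(burn + n_samples):
        for i in range(len(mean)):
            # x[i] given the other coordinates is normal with these moments
            cond_sd = 1.0 / np.sqrt(precision[i, i])
            others = precision[i] @ (x - mean) - precision[i, i] * (x[i] - mean[i])
            cond_mean = mean[i] - others / precision[i, i]
            # Inverse-CDF draw restricted to [lower[i], upper[i]]
            u_lo = std_normal.cdf((lower[i] - cond_mean) / cond_sd)
            u_hi = std_normal.cdf((upper[i] - cond_mean) / cond_sd)
            u = rng.uniform(u_lo, u_hi)
            u = min(max(u, 1e-12), 1 - 1e-12)  # inv_cdf requires 0 < u < 1
            draw = cond_mean + cond_sd * std_normal.inv_cdf(u)
            x[i] = np.clip(draw, lower[i], upper[i])   # guard against rounding
        if it >= burn:
            out[it - burn] = x
    return out

samples = gibbs_truncated_mvn([0.0, 0.0], np.array([[1.0, 0.5], [0.5, 1.0]]),
                              lower=[-1.0, -1.0], upper=[1.0, 1.0], n_samples=500)
print(samples.min(axis=0), samples.max(axis=0))
```

Consecutive samples are correlated, so thinning or a longer run may be needed. The inverse-CDF step also loses accuracy when the box lies far out in a tail.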
A little late I guess, but for the record, you could use Hamiltonian Monte Carlo. A MATLAB module named HMC exact exists. It shouldn't be too difficult to translate to Python.

