Image reconstruction with compressed sensing - python

I'm trying to code a demonstration of compressed sensing for my final year project but am getting poor image reconstruction when using the Lasso algorithm. I've relied on the following as a reference:
However my code has some differences:
I use scikit-learn to perform a lasso optimisation (basis pursuit) as opposed to using cvxpy to perform an l_1 minimisation with an equality constraint as in the article.
I construct psi differently/more simply, testing seems to show that it's correct.
I use a different package to read and write the image.
import numpy as np
import scipy.fftpack as spfft
import scipy.ndimage as spimg
import imageio
from sklearn.linear_model import Lasso
x_orig = imageio.imread('gt40.jpg', pilmode='L') # read in grayscale
x = spimg.zoom(x_orig, 0.2) #zoom for speed
ny,nx = x.shape
k = round(nx * ny * 0.5) #50% sample
ri = np.random.choice(nx * ny, k, replace=False)
y = x.T.flat[ri] #y is the measured sample
# y = np.expand_dims(y, axis=1) ---- this doesn't seem to make a difference, was presumably required with cvxpy
psi = spfft.idct(np.identity(nx*ny), norm='ortho', axis=0) #my construction of psi
# psi = np.kron(
# spfft.idct(np.identity(nx), norm='ortho', axis=0),
# spfft.idct(np.identity(ny), norm='ortho', axis=0)
# )
# psi = 2*np.random.random_sample((nx*ny,nx*ny)) - 1
theta = psi[ri,:] #equivalent to phi*psi
lasso = Lasso(alpha=0.001, max_iter=10000)
lasso.fit(theta, y)
s = np.array(lasso.coef_)
x_recovered = psi@s
x_recovered = x_recovered.reshape(nx, ny).T
x_recovered_final = x_recovered.astype('uint8') #recovered image is float64 and has negative values..
imageio.imwrite('gt40_recovered.jpg', x_recovered_final)
Unfortunately I'm not allowed to post images yet so here is a link to the original zoomed image, the image recovered with lasso and the image recovered with cvxpy (described later):
As you can see, not only is the recovery poor but the image is completely corrupted: the colours seem to be negative and the detail from the 50% sample is lost. I think I've tracked the problem down to the Lasso regression: it returns a vector that, when inverse transformed, has values that are not necessarily in the 0-255 range expected for the image. So the conversion from dtype float64 to uint8 is rather random (e.g. -55 can wrap around to 256-55=201).
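A minimal sketch of one way to avoid the wraparound, assuming out-of-range values should simply be saturated: clip to 0-255 and round before casting. The array below is made-up data standing in for x_recovered, not output from the code above.

```python
import numpy as np

# Made-up reconstruction values; some fall outside the valid 0-255 range.
x_recovered = np.array([-55.0, 0.0, 127.6, 255.0, 300.0])

# Saturate to the valid range first, then round and cast.
x_clipped = np.clip(x_recovered, 0, 255)
x_uint8 = np.rint(x_clipped).astype(np.uint8)
print(x_uint8)  # [  0   0 128 255 255]
```

This only removes the wraparound when writing the image; it does not change what the optimisation returns.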
Following this I tried swapping out lasso for the same optimisation as in the article (minimising the l_1 norm subject to theta*s=y using cvxpy):
import cvxpy as cvx
x_orig = imageio.imread('gt40.jpg', pilmode='L') # read in grayscale
x = spimg.zoom(x_orig, 0.2)
ny,nx = x.shape
k = round(nx * ny * 0.5)
ri = np.random.choice(nx * ny, k, replace=False)
y = x.T.flat[ri]
psi = spfft.idct(np.identity(nx*ny), norm='ortho', axis=0)
theta = psi[ri,:] #equivalent to phi*psi
vx = cvx.Variable(nx * ny)
objective = cvx.Minimize(cvx.norm(vx, 1))
constraints = [theta@vx == y]
prob = cvx.Problem(objective, constraints)
result = prob.solve(verbose=True)
s = np.array(vx.value).squeeze()
x_recovered = psi@s
x_recovered = x_recovered.reshape(nx, ny).T
x_recovered_final = x_recovered.astype('uint8')
imageio.imwrite('gt40_recovered_altopt.jpg', x_recovered_final)
This took nearly 6 hours but finally I got a somewhat satisfactory result. However I would like to perform a demonstration of lasso if possible. Any help in getting the lasso to return appropriate values or somehow converting its result appropriately would be very much appreciated.


Perlin noise in Python's noise library

I have a problem with generating Perlin noise for my project. As I wanted to understand how to use the library properly, I tried to follow this page step by step:
In the first part, there is this code:
import noise
import numpy as np
from scipy.misc import toimage
shape = (1024,1024)
scale = 100.0
octaves = 6
persistence = 0.5
lacunarity = 2.0
world = np.zeros(shape)
for i in range(shape[0]):
    for j in range(shape[1]):
        world[i][j] = noise.pnoise2(i/scale,
                                    j/scale,
                                    octaves=octaves,
                                    persistence=persistence,
                                    lacunarity=lacunarity)
toimage(world).show()
I copy-pasted it with a small change at the end (toimage is obsolete), so I have:
import noise
import numpy as np
from PIL import Image
shape = (1024,1024)
scale = 100
octaves = 6
persistence = 0.5
lacunarity = 2.0
seed = np.random.randint(0,100)
world = np.zeros(shape)
for i in range(shape[0]):
    for j in range(shape[1]):
        world[i][j] = noise.pnoise2(i/scale,
                                    j/scale,
                                    octaves=octaves,
                                    persistence=persistence,
                                    lacunarity=lacunarity,
                                    base=seed)
Image.fromarray(world, mode='L').show()
I tried a lot of different modes, but this noise is not even close to coherent noise. My result is something like this (mode='L'). Could someone explain what I am doing wrong?
Here is the working code. I took the liberty of cleaning it up a little. See the comments for details. As a final piece of advice: when testing code, use matplotlib for visualization. Its imshow() function is far more robust than PIL.
import noise
import numpy as np
from PIL import Image
shape = (1024,1024)
scale = .5
octaves = 6
persistence = 0.5
lacunarity = 2.0
seed = np.random.randint(0,100)
world = np.zeros(shape)
# make coordinate grid on [0,1]^2
x_idx = np.linspace(0, 1, shape[0])
y_idx = np.linspace(0, 1, shape[1])
world_x, world_y = np.meshgrid(x_idx, y_idx)
# apply perlin noise, instead of np.vectorize, consider using itertools.starmap()
world = np.vectorize(noise.pnoise2)(world_x/scale,
                                    world_y/scale,
                                    octaves=octaves,
                                    persistence=persistence,
                                    lacunarity=lacunarity,
                                    base=seed)
# here was the error: one needs to normalize the image first. Could be done without copying the array, though
img = np.floor((world + .5) * 255).astype(np.uint8) # <- Normalize world first
Image.fromarray(img, mode='L').show()
For anyone who comes after me: with the noise library you should rather normalize with
img = np.floor((world + 1) * 127).astype(np.uint8)
This way there will be no spots whose colour is the opposite of what it should be.
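The difference between the two normalizations can be checked on a few numbers. This is only a sketch, assuming the noise values lie in [-1, 1]; the sample values are made up rather than taken from pnoise2.

```python
import numpy as np

# Made-up samples spanning the assumed [-1, 1] range of the noise values.
world = np.array([-1.0, -0.5, 0.0, 0.5, 1.0])

# (world + 0.5) * 255 leaves the 0-255 range for samples beyond +/-0.5 ...
shifted = (world + 0.5) * 255
print(shifted.min(), shifted.max())   # -127.5 382.5

# ... while (world + 1) * 127 stays inside it for every sample in [-1, 1].
img = np.floor((world + 1) * 127).astype(np.uint8)
print(img)                            # [  0  63 127 190 254]
```

Values outside 0-255 wrap around when cast to uint8, which is where the inverted spots come from.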

Creating a 2D Gaussian random field from a given 2D variance

I've been trying to create a 2D map of blobs of matter (a Gaussian random field) using a variance I have calculated. This variance is a 2D array. I have tried using numpy.random.normal since it allows for a 2D input of the variance, but it doesn't really create a map with the trend I expect from the input parameters. One of the important input constants, lambda_c, should manifest itself as the physical size (diameter) of the blobs. However, when I change lambda_c, the size of the blobs barely changes, if at all. For example, if I set lambda_c = 40 parsecs, the map needs blobs that are 40 parsecs in diameter. An MWE to produce the map using my variance:
import numpy as np
import random
import matplotlib.pyplot as plt
from matplotlib.pyplot import show, plot
import scipy.integrate as integrate
from scipy.interpolate import RectBivariateSpline
n = 300
c = 3e8
G = 6.67e-11
M_sun = 1.989e30
pc = 3.086e16 # parsec
Dds = 1097.07889283e6*pc
Ds = 1726.62069147e6*pc
Dd = 1259e6*pc
FOV_arcsec_original = 5.
FOV_arcmin = FOV_arcsec_original/60.
pix2rad = ((FOV_arcmin/60.)/float(n))*np.pi/180.
rad2pix = 1./pix2rad
x_pix = np.linspace(-FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,n)
y_pix = np.linspace(-FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,FOV_arcsec_original/2/pix2rad/180.*np.pi/3600.,n)
X_pix,Y_pix = np.meshgrid(x_pix,y_pix)
conc = 10.
M = 1e13*M_sun
r_s = 18*1e3*pc
lambda_c = 40*pc ### The important parameter that doesn't seem to manifest itself in the map when changed
rho_s = M/((4*np.pi*r_s**3)*(np.log(1+conc) - (conc/(1+conc))))
sigma_crit = (c**2*Ds)/(4*np.pi*G*Dd*Dds)
k_s = rho_s*r_s/sigma_crit
theta_s = r_s/Dd
Renorm = (4*G/c**2)*(Dds/(Dd*Ds))
#### Here I just interpolate and zoom into my field of view to get better resolutions
A = np.sqrt(X_pix**2 + Y_pix**2)*pix2rad/theta_s
A_1 = A[100:200,0:100]
n_x = n_y = 100
FOV_arcsec_x = FOV_arcsec_original*(100./300)
FOV_arcmin_x = FOV_arcsec_x/60.
pix2rad_x = ((FOV_arcmin_x/60.)/float(n_x))*np.pi/180.
rad2pix_x = 1./pix2rad_x
FOV_arcsec_y = FOV_arcsec_original*(100./300)
FOV_arcmin_y = FOV_arcsec_y/60.
pix2rad_y = ((FOV_arcmin_y/60.)/float(n_y))*np.pi/180.
rad2pix_y = 1./pix2rad_y
x1 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x)
y1 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y)
X1,Y1 = np.meshgrid(x1,y1)
n_x_2 = 500
n_y_2 = 500
x2 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_2)
y2 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_2)
X2,Y2 = np.meshgrid(x2,y2)
interp_spline = RectBivariateSpline(y1,x1,A_1)
A_2 = interp_spline(y2,x2)
A_3 = A_2[50:450,0:400]
n_x_3 = n_y_3 = 400
FOV_arcsec_x = FOV_arcsec_original*(100./300)*400./500.
FOV_arcmin_x = FOV_arcsec_x/60.
pix2rad_x = ((FOV_arcmin_x/60.)/float(n_x_3))*np.pi/180.
rad2pix_x = 1./pix2rad_x
FOV_arcsec_y = FOV_arcsec_original*(100./300)*400./500.
FOV_arcmin_y = FOV_arcsec_y/60.
pix2rad_y = ((FOV_arcmin_y/60.)/float(n_y_3))*np.pi/180.
rad2pix_y = 1./pix2rad_y
x3 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_3)
y3 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_3)
X3,Y3 = np.meshgrid(x3,y3)
n_x_4 = 1000
n_y_4 = 1000
x4 = np.linspace(-FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,FOV_arcsec_x/2/pix2rad_x/180.*np.pi/3600.,n_x_4)
y4 = np.linspace(-FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,FOV_arcsec_y/2/pix2rad_y/180.*np.pi/3600.,n_y_4)
X4,Y4 = np.meshgrid(x4,y4)
interp_spline = RectBivariateSpline(y3,x3,A_3)
A_4 = interp_spline(y4,x4)
############### Function to calculate variance
variance = np.zeros((len(A_4),len(A_4)))
def variance_fluctuations(x):
    # fills the module-level `variance` array in place
    for i in range(len(x)):
        for j in range(len(x)):
            if x[j][i] < 1.:
                variance[j][i] = (k_s**2)*(lambda_c/r_s)*((np.pi/x[j][i]) - (1./(x[j][i]**2 -1)**3.)*(((6.*x[j][i]**4. - 17.*x[j][i]**2. + 26)/3.)+ (((2.*x[j][i]**6. - 7.*x[j][i]**4. + 8.*x[j][i]**2. - 8)*np.arccosh(1./x[j][i]))/(np.sqrt(1-x[j][i]**2.)))))
            elif x[j][i] > 1.:
                variance[j][i] = (k_s**2)*(lambda_c/r_s)*((np.pi/x[j][i]) - (1./(x[j][i]**2 -1)**3.)*(((6.*x[j][i]**4. - 17.*x[j][i]**2. + 26)/3.)+ (((2.*x[j][i]**6. - 7.*x[j][i]**4. + 8.*x[j][i]**2. - 8)*np.arccos(1./x[j][i]))/(np.sqrt(x[j][i]**2.-1)))))
variance_fluctuations(A_4)
#### Creating the map
mean = 0
delta_kappa = np.random.normal(0,variance,A_4.shape)
xfinal = np.linspace(-FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,1000)
yfinal = np.linspace(-FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,FOV_arcsec_x*np.pi/180./3600.*Dd/pc/2,1000)
Xfinal, Yfinal = np.meshgrid(xfinal,yfinal)
The map looks like this, with the density of blobs increasing towards the right. However, the size of the blobs doesn't change, and the map looks virtually the same whether I use lambda_c = 40*pc or lambda_c = 400*pc.
I'm wondering if the np.random.normal function isn't really doing what I expect it to do? I feel like the pixel scale of the map and the way samples are drawn make no link to the size of the blobs. Maybe there is a better way to create the map using the variance, would appreciate any insight.
I expect the map to look something like this, with the blob sizes changing based on the input parameters for my variance:
This is quite a well visited problem in (surprise surprise) astronomy and cosmology.
You could use lenstool:
You could also try here:
Not to mention:
I am not reproducing code here because all credit goes to the above authors. However, they all came straight out of a Google search :/
Easiest of all is probably a python module FyeldGenerator, apparently designed for this exact purpose:
So (adapted from github example):
pip install FyeldGenerator
from FyeldGenerator import generate_field
from matplotlib import use
import matplotlib.pyplot as plt
import numpy as np
# Helper that generates power-law power spectrum
def Pkgen(n):
    def Pk(k):
        return np.power(k, -n)
    return Pk
# Draw samples from a normal distribution
def distrib(shape):
    a = np.random.normal(loc=0, scale=1, size=shape)
    b = np.random.normal(loc=0, scale=1, size=shape)
    return a + 1j * b
shape = (512, 512)
field = generate_field(distrib, Pkgen(2), shape)
plt.imshow(field, cmap='jet')
This gives:
Looks pretty straightforward to me :)
PS: FoV implied a telescope observation of the gaussian random field :)
A completely different and much quicker way may be to simply blur the delta_kappa array with a Gaussian filter. Try adjusting the sigma parameter to alter the blob size.
from scipy.ndimage import gaussian_filter
dk_gf = gaussian_filter(delta_kappa, sigma=20)
Xfinal, Yfinal = np.meshgrid(xfinal,yfinal)
plt.contourf(Xfinal,Yfinal,dk_gf,100, cmap='jet');
This is the image with sigma=20
This is the image with sigma=2.5
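The effect of sigma on blob size can also be checked numerically. The sketch below is numpy-only and is not the scipy call above: it blurs white noise with a Gaussian kernel applied in Fourier space (so the edges wrap around periodically), and the helper names smoothed_noise and neighbour_correlation are made up for illustration.

```python
import numpy as np

def smoothed_noise(shape, sigma, rng):
    """White Gaussian noise blurred by a Gaussian kernel of width sigma (in pixels)."""
    white = rng.normal(size=shape)
    ky = np.fft.fftfreq(shape[0])[:, None]   # frequencies in cycles per pixel
    kx = np.fft.fftfreq(shape[1])[None, :]
    # Fourier transform of a unit-area Gaussian with standard deviation sigma
    kernel_ft = np.exp(-2.0 * (np.pi * sigma) ** 2 * (kx ** 2 + ky ** 2))
    return np.fft.ifft2(np.fft.fft2(white) * kernel_ft).real

def neighbour_correlation(field):
    """Correlation between horizontally adjacent pixels (closer to 1 = larger blobs)."""
    return np.corrcoef(field[:, :-1].ravel(), field[:, 1:].ravel())[0, 1]

rng = np.random.default_rng(0)
small = smoothed_noise((256, 256), sigma=2.5, rng=rng)
large = smoothed_noise((256, 256), sigma=20, rng=rng)
corr_small = neighbour_correlation(small)
corr_large = neighbour_correlation(large)
print(corr_small, corr_large)
```

A larger sigma gives a neighbour correlation closer to 1, i.e. larger blobs. To tie sigma to a physical size such as lambda_c, it would have to be converted from parsecs to pixels using the pixel scale of the map.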
ThunderFlash, try this code to draw the map:
# function to produce blobs:
from scipy.stats import multivariate_normal
def blob (positions, mean=(0,0), var=1):
    cov = [[var,0],[0,var]]
    return multivariate_normal(mean, cov).pdf(positions)
Now prepare for blob generation.
Note that I use a less dense grid to pick the blob centers (regulated by `step`);
this makes the blobs more pronounced and saves calculation time.
Use this part instead of the section of your code below the comment #### Creating the map
delta_kappa = np.random.normal(0,variance,A_4.shape) # same
step = 10 #
dk2 = delta_kappa[::step,::step] # taking every 10th element
x2, y2 = xfinal[::step],yfinal[::step]
field = np.dstack((Xfinal,Yfinal))
print (field.shape, dk2.shape, x2.shape, y2.shape)
>> (1000, 1000, 2), (100, 100), (100,), (100,)
result = np.zeros(field.shape[:2])
for x in range (len(x2)):
    for y in range (len(y2)):
        res2 = blob(field, mean = (x2[x], y2[y]), var=10000)*dk2[x,y]
        result += res2
# the cycle above took over 20 minutes on Ryzen 2700X. It could be accelerated by vectorization presumably.
You may want to play with the var parameter in blob() to smooth the image, and with step to make it more compressed.
Here is the image that I got using your code (somehow axes are flipped and more dense areas on the top):

Fitting a quadratic function in python without numpy polyfit

I am trying to fit a quadratic function to some data, and I'm trying to do this without using numpy's polyfit function.
Mathematically I tried to follow this website, but somehow I don't think that I'm doing it right. If anyone could assist me that would be great, or if you could suggest another way to do it that would also be awesome.
What I've tried so far:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
ones = np.ones(3)
A = np.array( ((0,1),(1,1),(2,1)))
xfeature = A.T[0]
squaredfeature = A.T[0] ** 2
b = np.array( (1,2,0), ndmin=2 ).T
b = b.reshape(3)
features = np.concatenate((np.vstack(ones), np.vstack(xfeature), np.vstack(squaredfeature)), axis = 1)
featuresc = features.copy()
m_det = np.linalg.det(features)
determinants = []
for i in range(3):
    featuresc.T[i] = b
    det = np.linalg.det(featuresc)
    determinants.append(det)
    featuresc = features.copy()
determinants = determinants / m_det
u = np.linspace(0,3,100)
plt.plot(u, u**2*determinants[2] + u*determinants[1] + determinants[0] )
p2 = np.polyfit(A.T[0],b,2)
plt.plot(u, np.polyval(p2,u), 'b--')
As you can see, my curve doesn't compare well to numpy's polyfit curve.
I went through my code and removed all the stupid mistakes, and now it works when I try to fit over 3 points, but I have no idea how to fit over more than three points.
This is the new code:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
ones = np.ones(3)
A = np.array( ((0,1),(1,1),(2,1)))
xfeature = A.T[0]
squaredfeature = A.T[0] ** 2
b = np.array( (1,2,0), ndmin=2 ).T
b = b.reshape(3)
features = np.concatenate((np.vstack(ones), np.vstack(xfeature), np.vstack(squaredfeature)), axis = 1)
featuresc = features.copy()
m_det = np.linalg.det(features)
determinants = []
for i in range(3):
    featuresc.T[i] = b
    det = np.linalg.det(featuresc)
    determinants.append(det)
    featuresc = features.copy()
determinants = determinants / m_det
u = np.linspace(0,3,100)
plt.plot(u, u**2*determinants[2] + u*determinants[1] + determinants[0] )
p2 = np.polyfit(A.T[0],b,2)
plt.plot(u, np.polyval(p2,u), 'r--')
Instead of using Cramer's Rule, actually solve the system using least squares. Remember that Cramer's Rule will only work if the total number of points you have equals the desired order of the polynomial plus 1.
If you don't have this, then Cramer's Rule will not work, as you're trying to find an exact solution to the problem. If you have more points, the method is unsuitable because the system of equations becomes overdetermined.
To adapt this to more points, numpy.linalg.lstsq would be a better fit, as it solves Ax = b by computing the vector x that minimizes the Euclidean norm of Ax - b. Therefore, remove the y values from the last column of the features matrix and use numpy.linalg.lstsq to solve for the coefficients:
import numpy as np
import matplotlib.pyplot as plt
ones = np.ones(4)
xfeature = np.asarray([0,1,2,3])
squaredfeature = xfeature ** 2
b = np.asarray([1,2,0,3])
features = np.concatenate((np.vstack(ones),np.vstack(xfeature),np.vstack(squaredfeature)), axis = 1) # Change - remove the y values
determinants = np.linalg.lstsq(features, b, rcond=None)[0] # Change - use least squares
u = np.linspace(0,3,100)
plt.plot(u, u**2*determinants[2] + u*determinants[1] + determinants[0] )
I get this plot now, which matches what the dashed curve is in your graph, also matching what numpy.polyfit gives you:
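The agreement with numpy.polyfit can also be checked numerically rather than by eye. A small sketch using the same four points; np.vander is used here only as a shortcut for building the same features matrix.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, 0.0, 3.0])

# Columns are 1, x, x**2 (lowest power first), like the features matrix.
features = np.vander(x, 3, increasing=True)
coeffs, residuals, rank, _ = np.linalg.lstsq(features, b, rcond=None)
print(coeffs)  # approximately [ 1.4 -1.1  0.5]

# np.polyfit returns the highest power first, so reverse it to compare.
print(np.allclose(coeffs, np.polyfit(x, b, 2)[::-1]))  # True
```

With four points and three unknowns the system is overdetermined, so the fitted curve does not pass through every point exactly.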

Multiple regression with pykalman?

I'm looking for a way to generalize regression using pykalman from 1 to N regressors. We will not bother about online regression initially - I just want a toy example to set up the Kalman filter for 2 regressors instead of 1, i.e. Y = c1 * x1 + c2 * x2 + const.
For the single regressor case, the following code works. My question is how to change the filter setup so it works for two regressors:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pykalman import KalmanFilter
if __name__ == "__main__":
    file_name = '<path>\KalmanExample.txt'
    df = pd.read_csv(file_name, index_col = 0)
    prices = df[['ETF', 'ASSET_1']] #, 'ASSET_2']]
    delta = 1e-5
    trans_cov = delta / (1 - delta) * np.eye(2)
    obs_mat = np.vstack( [prices['ETF'],
                          np.ones(prices['ETF'].shape)]).T[:, np.newaxis]
    kf = KalmanFilter(
        n_dim_obs=1,
        n_dim_state=2,
        initial_state_mean=np.zeros(2),
        initial_state_covariance=np.ones((2, 2)),
        transition_matrices=np.eye(2),
        observation_matrices=obs_mat,
        observation_covariance=1.0,
        transition_covariance=trans_cov
    )
    state_means, state_covs = kf.filter(prices['ASSET_1'].values)
    # Draw slope and intercept...
    pd.DataFrame(
        dict(
            slope=state_means[:, 0],
            intercept=state_means[:, 1]
        ), index=prices.index
    ).plot(subplots=True)
    plt.show()
The example file KalmanExample.txt contains the following data:
The single regressor case provides the following output and for the two-regressor case I want a second "slope"-plot representing C2.
Answer edited to reflect my revised understanding of the question.
If I understand correctly, you wish to model an observable output variable Y = ETF as a linear combination of two observable values: ASSET_1 and ASSET_2.
The coefficients of this regression are to be treated as the system states, i.e. ETF = x1*ASSET_1 + x2*ASSET_2 + x3, where x1 and x2 are the coefficients of assets 1 and 2 respectively, and x3 is the intercept. These coefficients are assumed to evolve slowly.
Code implementing this is given below; note that it just extends the existing example to have one more regressor.
Note also that you can get quite different results by playing with the delta parameter. If this is set large (far from zero), then the coefficients will change more rapidly, and the reconstruction of the regressand will be near-perfect. If it is set small (very close to zero) then the coefficients will evolve more slowly and the reconstruction of the regressand will be less perfect. You might want to look into the Expectation Maximisation algorithm - supported by pykalman.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pykalman import KalmanFilter
if __name__ == "__main__":
    file_name = 'KalmanExample.txt'
    df = pd.read_csv(file_name, index_col = 0)
    prices = df[['ETF', 'ASSET_1', 'ASSET_2']]
    delta = 1e-3
    trans_cov = delta / (1 - delta) * np.eye(3)
    obs_mat = np.vstack( [prices['ASSET_1'], prices['ASSET_2'],
                          np.ones(prices['ASSET_1'].shape)]).T[:, np.newaxis]
    kf = KalmanFilter(
        n_dim_obs=1,
        n_dim_state=3,
        initial_state_mean=np.zeros(3),
        initial_state_covariance=np.ones((3, 3)),
        transition_matrices=np.eye(3),
        observation_matrices=obs_mat,
        observation_covariance=1.0,
        transition_covariance=trans_cov
    )
    # state_means, state_covs = kf.em(prices['ETF'].values).smooth(prices['ETF'].values)
    state_means, state_covs = kf.filter(prices['ETF'].values)
    # Re-construct ETF from coefficients and 'ASSET_1' and ASSET_2 values:
    ETF_est = np.array([a.dot(b) for a, b in zip(np.squeeze(obs_mat), state_means)])
    # Draw slope and intercept...
    pd.DataFrame(
        dict(
            slope1=state_means[:, 0],
            slope2=state_means[:, 1],
            intercept=state_means[:, 2],
        ), index=prices.index
    ).plot(subplots=True)
    plt.show()
    # Draw actual y, and estimated y:
    pd.DataFrame(
        dict(
            ETF_est=ETF_est,
            ETF_act=prices['ETF'].values
        ), index=prices.index
    ).plot()
    plt.show()
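For intuition about what the filter does with this setup, the recursion can be sketched in plain numpy. This is not pykalman's implementation: the function name kalman_regression, the identity prior covariance and the synthetic data are all assumptions made for illustration. The states are the coefficients [c1, c2, const], they follow a random walk, and each observation row is [x1, x2, 1].

```python
import numpy as np

def kalman_regression(y, regressors, delta=1e-3, obs_var=1.0):
    """Filtered means of [c1, ..., cN, const] under a random-walk state model."""
    n_obs, n_reg = regressors.shape
    n_state = n_reg + 1
    H = np.column_stack([regressors, np.ones(n_obs)])  # one observation row per step
    Q = delta / (1 - delta) * np.eye(n_state)          # transition covariance
    mean = np.zeros(n_state)
    cov = np.eye(n_state)                              # assumed prior covariance
    means = np.empty((n_obs, n_state))
    for t in range(n_obs):
        h = H[t]
        # Update step: correct the state with the new observation y[t]
        innovation = y[t] - h @ mean
        innovation_var = h @ cov @ h + obs_var
        gain = cov @ h / innovation_var
        mean = mean + gain * innovation
        cov = cov - np.outer(gain, h @ cov)
        means[t] = mean
        # Predict step: identity transition, so only the covariance grows
        cov = cov + Q
    return means

# Synthetic data with constant true coefficients 2.0, -1.0 and intercept 0.5
rng = np.random.default_rng(0)
regressors = rng.normal(size=(500, 2))
y = 2.0 * regressors[:, 0] - 1.0 * regressors[:, 1] + 0.5 + 0.1 * rng.normal(size=500)
means = kalman_regression(y, regressors, delta=1e-5)
print(means[-1])
```

Adding a regressor only adds a column to the observation rows and one dimension to the state, which is the same change made to obs_mat and the 3x3 matrices above.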

Modified BPMF in PyMC3 using `LKJCorr` priors: PositiveDefiniteError using `NUTS`

I previously implemented the original Bayesian Probabilistic Matrix Factorization (BPMF) model in pymc3. See my previous question for reference, data source, and problem setup. Per the answer to that question from @twiecki, I've implemented a variation of the model using LKJCorr priors for the correlation matrices and uniform priors for the standard deviations. In the original model, the covariance matrices are drawn from Wishart distributions, but due to current limitations of pymc3, the Wishart distribution cannot be sampled from properly. This answer to a loosely related question provides a succinct explanation for the choice of LKJCorr priors. The new model is below.
import pymc3 as pm
import numpy as np
import theano.tensor as t
n, m = train.shape
dim = 10 # dimensionality
beta_0 = 1 # scaling factor for lambdas; unclear on its use
alpha = 2 # fixed precision for likelihood function
std = .05 # how much noise to use for model initialization
# We will use separate priors for sigma and correlation matrix.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
n_elem = dim * (dim - 1) // 2
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)
print('building the BPMF model')
with pm.Model() as bpmf:
    # Specify user feature matrix
    sigma_u = pm.Uniform('sigma_u', shape=dim)
    corr_triangle_u = pm.LKJCorr(
        'corr_u', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)
    corr_matrix_u = corr_triangle_u[tri_index]
    corr_matrix_u = t.fill_diagonal(corr_matrix_u, 1)
    cov_matrix_u = t.diag(sigma_u).dot(corr_matrix_u.dot(t.diag(sigma_u)))
    lambda_u = t.nlinalg.matrix_inverse(cov_matrix_u)
    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
        testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u,
        shape=(n, dim), testval=np.random.randn(n, dim) * std)
    # Specify item feature matrix
    sigma_v = pm.Uniform('sigma_v', shape=dim)
    corr_triangle_v = pm.LKJCorr(
        'corr_v', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)
    corr_matrix_v = corr_triangle_v[tri_index]
    corr_matrix_v = t.fill_diagonal(corr_matrix_v, 1)
    cov_matrix_v = t.diag(sigma_v).dot(corr_matrix_v.dot(t.diag(sigma_v)))
    lambda_v = t.nlinalg.matrix_inverse(cov_matrix_v)
    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
        testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v,
        shape=(m, dim), testval=np.random.randn(m, dim) * std)
    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=t.dot(U, V.T), tau=alpha * np.ones((n, m)),
        observed=train)
# `start` is the start dictionary obtained from running find_MAP for PMF.
# See the previous post for PMF code.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]
with bpmf:
    step = pm.NUTS(scaling=start)
The goal with this reimplementation was to produce a model that could be estimated using the NUTS sampler. Unfortunately, I'm still getting the same error at the last line:
PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [ 0 1 2 3 ... 1030 1031 1032 1033 1034 ]
I've made all the code for PMF, BPMF, and this modified BPMF available in this gist to make it simple to replicate the error. All you need to do is download the data (also referenced in the gist).
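To show what the index-matrix construction in the model does, here is a small numpy-only sketch with dim = 3. numpy stands in for theano here, and the correlation and sigma values are made up; it only illustrates how the flat vector of off-diagonal correlations becomes a full correlation matrix and then a covariance matrix.

```python
import numpy as np

dim = 3
n_elem = dim * (dim - 1) // 2

# Index matrix: entry (i, j) holds the position of corr(i, j) in the flat vector.
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)

corr_triangle = np.array([0.1, 0.2, 0.3])   # made-up off-diagonal correlations
corr_matrix = corr_triangle[tri_index]      # symmetric; the diagonal is still wrong
np.fill_diagonal(corr_matrix, 1.0)

sigma = np.array([1.0, 2.0, 3.0])           # made-up standard deviations
cov_matrix = np.diag(sigma) @ corr_matrix @ np.diag(sigma)
print(cov_matrix)
```

The diagonal of the resulting covariance matrix holds the squared standard deviations, and each off-diagonal entry is the correlation scaled by the two corresponding standard deviations.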
It looks like you are passing the complete precision matrix into the normal distribution:
mu_u = pm.Normal(
'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
testval=np.random.randn(dim) * std)
I assume you only want to pass the diagonal values:
mu_u = pm.Normal(
'mu_u', mu=0, tau=beta_0 * t.diag(lambda_u), shape=dim,
testval=np.random.randn(dim) * std)
Does this change to mu_u and mu_v fix it for you?

