I am having trouble fitting a multivariate Gaussian distribution to my dataset; more specifically, with finding a mean vector (or multiple mean vectors). My dataset is an N x 8 matrix, and currently I am using this code:
muVector = np.mean(Xtrain, axis=0) where Xtrain is my training data set.
For the covariance, I am building it with an arbitrary variance value (0.5):
covariance = np.dot(.5, np.eye(N, N)) where N is the number of observations.
But when I construct my Phi matrix, I am getting all zeros. Here is my code:
muVector = np.mean(Xtrain, axis=0)
# get covariance matrix from Xtrain
cov = np.dot(var, np.eye(N,N))
cov = np.linalg.inv(cov)
# build Xtrain Phi
Phi = np.ones((N,M))
for row in range(N):
    temp = Xtrain[row,:] - muVector
    temp.shape = (1,M)
    temp = np.dot((-.5), temp)
    temp = np.dot(temp, cov)
    temp = np.dot(temp, (Xtrain[row,:] - muVector))
    Phi[row,:] = np.exp(temp)
Any help is appreciated. I think I might have to use np.random.multivariate_normal()? But I do not know how to use it in this case.
By "Phi" I believe that you mean the probability density function (pdf) that you want to estimate. In this case, the covariance matrix should be MxM and the output Phi will be Nx1:
# -*- coding: utf-8 -*-
import numpy as np
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix and its inverse.
cov = np.eye(M,M)*var
inv_cov = np.linalg.inv(cov)
# Normalization factor from the pdf.
norm_factor = 1/np.sqrt((2*np.pi)**M * np.linalg.det(cov))
# Estimating the pdf.
Phi = np.ones((N,1))
for row in range(N):
    temp = Xtrain[row,:] - muVector
    temp.shape = (1,M)
    temp = np.dot(-0.5*temp, inv_cov)
    temp = np.dot(temp, (Xtrain[row,:] - muVector))
    Phi[row] = norm_factor*np.exp(temp)
Alternatively, you can use the pdf method from scipy.stats.multivariate_normal:
# -*- coding: utf-8 -*-
import numpy as np
from scipy.stats import multivariate_normal
N = 1024
M = 8
var = 0.5
# Creating a Xtrain NxM observation matrix.
# Its muVector is [0, 1, 2, 3, 4, 5, 6, 7] and the variance for all
# independent random variables is 0.5.
Xtrain = np.random.multivariate_normal(np.arange(8), np.eye(8,8)*var, N)
# Estimating the mean vector.
muVector = np.mean(Xtrain, axis=0)
# Creating the estimated covariance matrix.
cov = np.eye(M,M)*var
Phi2 = multivariate_normal.pdf(Xtrain, mean=muVector, cov=cov)
Both Phi and Phi2 output arrays will be equal.
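A quick way to check this (assuming both snippets above are run against the same Xtrain) is to compare the two results directly; note that Phi has shape (N, 1) while Phi2 has shape (N,):

# Phi comes from the explicit loop, Phi2 from scipy.stats.multivariate_normal.
print(np.allclose(Phi.ravel(), Phi2))  # should print True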
I am implementing PCA in Python with this code:
def OWN_PCA(X, num_components):
    # Step 1: center the data
    X_meaned = X - np.mean(X, axis=0)
    # creating covariance matrix
    cov_mat = np.cov(X_meaned, rowvar=False)
    # calculating eigenvalues and eigenvectors
    eigen_values, eigen_vectors = np.linalg.eigh(cov_mat)
    # sorting the vectors based on eigenvalues
    sorted_index = np.argsort(eigen_values)[::-1]
    sorted_eigenvalue = eigen_values[sorted_index]
    sorted_eigenvectors = eigen_vectors[:, sorted_index]
    # choosing the number of components
    eigenvector_subset = sorted_eigenvectors[:, 0:num_components]
    X_reduced = np.dot(eigenvector_subset.transpose(), X.transpose()).transpose()
    return X_reduced
The problem is that when I apply it to the iris dataset and plot it, I get this:
and when I use PCA in sklearn the image is reversed:
What is wrong with my code?
I am trying to implement PCA using numpy to mimic the results from sklearn's decomposition.PCA.
As input I am using N flattened images of fixed size M = 128x192 (the image dimensions), joined horizontally into a single matrix D of dimensions MxN.
I am aiming to use the snapshot method, since other implementations (see here and here) crash my build while computing np.cov: the covariance matrix C = D(D^T) would be MxM.
The snapshot method first computes C_acute = (D^T)D, then computes the (acute) eigenvectors and values of this NxN matrix. This gives eigenvectors that are (D^T)v, and eigenvalues that are the same.
To retrieve the eigenvectors v from the (acute) eigenvectors, we simply do v = (1/eigenvalue) * (D(v_acute)).
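As a minimal sketch of that recovery step (a toy random matrix of my own, not the image data), the relation can be checked directly:

import numpy as np

rng = np.random.default_rng(0)
M, N = 6, 4                                # M features, N observations (M > N)
D = rng.standard_normal((M, N))

C_acute = D.T @ D                          # N x N instead of M x M
vals, vecs_acute = np.linalg.eigh(C_acute)

# Recover the M-dimensional eigenvectors column by column, then renormalize.
vecs = D @ vecs_acute / vals
vecs /= np.linalg.norm(vecs, axis=0)

# Check: these are eigenvectors of D @ D.T with the same (nonzero) eigenvalues.
C = D @ D.T
print(np.allclose(C @ vecs, vecs * vals))  # True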
Here is the reference implementation I am using adapted from this SO post (which is known to work):
class TemplatePCA:
    def __init__(self, n_components=None):
        self.n_components = n_components

    def fit_transform(self, X):
        X -= np.mean(X, axis=0)
        R = np.cov(X, rowvar=False)
        # calculate eigenvectors & eigenvalues of the covariance matrix
        evals, evecs = np.linalg.eig(R)
        # sort eigenvalues in decreasing order
        idx = np.argsort(evals)[::-1]
        evecs = evecs[:, idx]
        # sort eigenvectors according to the same index
        evals = evals[idx]
        # select the first n eigenvectors (n is the desired dimension
        # of the rescaled data array, or dims_rescaled_data)
        evecs = evecs[:, :self.n_components]
        # carry out the transformation on the data using eigenvectors
        # and return the re-scaled data
        return -1 * np.dot(X, evecs)
Here is the implementation I have so far.
class MyPCA:
    def __init__(self, n_components=None):
        self.n_components = n_components

    def fit_transform(self, X):
        X -= np.mean(X, axis=0)
        D = X.T
        M, N = D.shape
        D_T = X  # D.T == (X.T).T == X
        C_acute = np.dot(D_T, D)
        eigen_values, eigen_vectors_acute = np.linalg.eig(C_acute)
        eigen_vectors = []
        for i in range(eigen_vectors_acute.shape[0]):  # for each eigenvector
            v = np.dot(D, eigen_vectors_acute[i]) / eigen_values[i]
            eigen_vectors.append(v)
        eigen_vectors = np.array(eigen_vectors)
        # sort eigenvalues and eigenvectors in decreasing order
        idx = np.argsort(eigen_values)[::-1]
        eigen_vectors = eigen_vectors[:, idx]
        eigen_values = eigen_values[idx]
        # select the first n_components eigenvectors
        eigen_vectors = eigen_vectors[:, :self.n_components]
        # carry out the transformation on the data using eigenvectors
        # return the re-scaled data (projection)
        return np.dot(C_acute, eigen_vectors)
The reference text I am using notes that:
The eigenvector is now (D^T)v, so to do face detection we first multiply our test image vector by (D^T) before projecting onto the eigenimages.
I am not sure whether it is possible to retrieve the exact same principal components (i.e. eigenvectors) using this method, and it would seem impossible to even get the same eigenvectors back, since the size of the eigen_vectors_acute is only (4, 6) (meaning there are only 4 vectors), compared to the other method where it is (6, 6) (there are 6).
Running both on an input:
x = np.array([
    [0.387, 123, 789, 256, 4878, 5.42],
    [0.723, 9.78, 1.90, 1234, 12104, 5.25],
    [1, 123, 67.98, 7.91, 12756, 5.52],
    [1.524, 1.34, 23.456, 1.23, 6787, 3.94],
])
# These two are the same
print(sklearn.decomposition.PCA(n_components=3).fit_transform(x))
print(TemplatePCA(n_components=3).fit_transform(x))
# This one is different
print(MyPCA(n_components=3).fit_transform(x))
Output:
[[ 4282.20163145 147.84415964 -267.73483211]
[-3025.62452358 683.58580386 67.76941319]
[-3599.15380006 -569.33984612 -148.62757658]
[ 2342.57669218 -262.09011737 348.5929955 ]]
[[-4282.20163145 -147.84415964 267.73483211]
[ 3025.62452358 -683.58580386 -67.76941319]
[ 3599.15380006 569.33984612 148.62757658]
[-2342.57669218 262.09011737 -348.5929955 ]]
[[ 3.35535639e+15, -5.70493660e+17, -8.57482740e+17],
[-2.45510474e+15, 4.17428591e+17, 6.27417685e+17],
[-2.82475918e+15, 4.80278997e+17, 7.21885236e+17],
[ 1.92450753e+15, -3.27213928e+17, -4.91820181e+17]]
import numpy as np
from numpy import sin, cos, pi
from matplotlib.pyplot import *
rng = np.random.default_rng(42)
N = 200
center = 10, 15
sigmas = 10, 2
theta = 20 / 180 * pi
# covariance matrix
rotmat = np.array([[cos(theta), -sin(theta)],[sin(theta), cos(theta)]])
diagmat = np.diagflat(sigmas)
mean = np.array([-1, -2, -3])
# covar = rotmat @ diagmat @ rotmat.T
covar = np.array([[2, 2, 0], [2, 3, 1], [0, 1, 19]])
print('covariance matrix:')
print(covar)
eigval, eigvec = np.linalg.eigh(covar)
print(f'eigenvalues: {eigval}\neigenvectors:\n{eigvec}')
print('angle of eigvector corresponding to larger eigenvalue:',
180 /pi * np.arctan2(eigvec[1,1], eigvec[0,1]))
# PCA
mean = data.mean(axis=0)
print('mean:', mean)
# S1: explicit sum
S1 = np.zeros((2,2), dtype=float)
print(len(data))
for i in range(len(data)):
    S1 += np.outer(data[i] - mean, data[i] - mean)
S1 /= len(data)
print(f'S1= (explicit sum)\n{S1}')
# S2:
S2 = np.cov(data, rowvar=False, bias=True)
print(f'S2= (np.cov)\n{S2}')
# PCA:
lambdas, u = np.linalg.eigh(S2)
print(f'\nPCA\nlambda={lambdas}\nu=\n{u}')
u1 = u[:,1] # largest
print('u1=\n',u1)
print(f'first principal component angle: {180/pi*np.arctan2(u1[1], u1[0])}')
After that, I need to perform PCA on the above data with one principal component and with two principal components. What is the fractional explained variance in these two cases?
For generating the data, you need two tricks:
Compute a "square root" of covariance matrix S using eigenvalue-eigenvector factorization
Use the standard formula for generating a random normal with given mean and covariance. With Numpy it works on vectors (quoting from help(np.random.randn)):
For random samples from :math:`N(\mu, \sigma^2)`, use:
``sigma * np.random.randn(...) + mu``
Example:
import numpy as np
import matplotlib.pyplot as plt
# Part 1: generating random normal data with the given mean and covariance
N = 200
# covariance matrix
S = np.array([[2, 2, 0], [2, 3, 1], [0, 1, 19]])
# mean
mu = np.array([[-1, -2, -3]]).T
# get "square root" of covariance matrix via eigenfactorization
w, v = np.linalg.eig(S)
sigma = np.sqrt(w) * v
# ready, set, go!
A = sigma @ np.random.randn(3, N) + mu
print(f'sample covariance:\n{np.cov(A)}')
# sample covariance:
# [[ 1.70899164 1.74288639 0.21190326]
# [ 1.74288639 2.59595547 1.2822817 ]
# [ 0.21190326 1.2822817 22.04077608]]
print(f'sample mean:\n{A.mean(axis=1)}')
# sample mean:
# [-1.02385787 -1.87783415 -2.96077204]
# --------------------------------------------
# Part 2: principal component analysis on random data A
# estimate the sample covariance
R = np.cov(A)
# do the PCA
lam, u = np.linalg.eig(R)
# fractional explained variance is the relative magnitude of
# the accumulated eigenvalues
# reorder the eigenvalues & vectors with hottest eigenvalues first
col_order = np.argsort(lam)[::-1]
lam = lam[col_order]
u = u[:, col_order]
print(f'eigenvalues: {lam}')
# eigenvalues: [22.13020272 3.87946467 0.3360558 ]
var_explained = lam.cumsum() / lam.sum()
print(f'fractional explained variance: {var_explained}')
# fractional explained variance: [0.83999223 0.98724439 1. ]
# ^^ 84% in first dimension alone,
# 99% in first two dimensions
# do the projection
B = u.T @ A
# now the variance in B is concentrated in the first two dimensions
print(f'covariance after PCA projection:\n{np.cov(B)}')
# covariance after PCA projection:
# [[ 2.21302027e+01 -2.68545720e-15 -1.60675493e-15]
#  [-2.68545720e-15  3.87946467e+00 -1.19613978e-15]
#  [-1.60675493e-15 -1.19613978e-15  3.36055802e-01]]
# scatter plot
plt.plot(B[0], B[1], '.')
plt.axis('equal')
plt.grid('on')
plt.xlabel('principal axis 0')
plt.ylabel('principal axis 1')
plt.title('Random data projected onto two principal axes')
# project back using ONLY a two dimensional subspace of B
# i.e. drop the last eigenvector
A_approx = u[:, :2] @ B[:2, :]
# error analysis
err3 = A - A_approx
mse = (err3**2).sum(axis=0).mean()
print(f'predicted error variance: {lam[-1]}')
print(f'measured error variance: {mse}')
# predicted error variance: 0.3360558019705344
# measured error variance: 0.41137559916273914
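For what it's worth, the same samples could also be drawn in a single call with NumPy's built-in generator (a sketch reusing the S, mu, and N defined above; note it returns an N x 3 array, i.e. the transpose of A):

rng = np.random.default_rng()
A2 = rng.multivariate_normal(mu.ravel(), S, N).T  # 3 x N, same distribution
print(np.cov(A2))                                 # should again be close to S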
Below I used numpy-generated draws to compare the empirical moments of the Pareto distribution with the analytical ones (I used the formulas for the mean, variance, skewness, and excess kurtosis from https://en.wikipedia.org/wiki/Pareto_distribution). The results for the mean and variance are similar; however, the results for skewness and kurtosis are very different. What am I doing wrong?
Thank you in advance.
import numpy as np
import pandas as pd
from scipy.stats import skew
from scipy.stats import kurtosis
from prettytable import PrettyTable
x_m = [1, 2, 3, 4]
alpha = [5, 6, 7, 8]
#drawing samples from distribution
for a, x in zip(alpha, x_m):
    print(a, x)
    data = (np.random.default_rng().pareto(a, 10000000) + 1) * x
    mean = np.mean(data)
    var = np.var(data)
    skew = skew(data)
    kurt = kurtosis(data)
#Analytical estimation
for a, x in zip(alpha, x_m):
    a_mean = (a*x)/(a-1)
    a_var = (a*x**2)/((a-1)**2*(a-2))
    a_skew = (2*(1+a)/(a-3))*(np.sqrt(a-2/a))
    a_kurt = (6*(a**3+a**2-6*a-2))/(a*(a-3)*(a-4))
#Table
header = ['Moments', 'Simulated', 'Analytical']
Moments = ['Mean', 'Variance', 'Skewness', 'Excess Kurtosis']
Simulated = [round(mean,4), round(var,4), round(skew,4), round(kurt,4)]
Analytical = [round(a_mean,4), round(a_var,4), round(a_skew,4), round(a_kurt,4)]
table = PrettyTable()
table.add_column(header[0], Moments)
table.add_column(header[1], Simulated)
table.add_column(header[2], Analytical)
print(table)
It's a small typo in the analytical skewness line:
a_skew = (2*(1+a)/(a-3))*(np.sqrt(a-2/a))
should be
a_skew = (2*(1+a)/(a-3))*(np.sqrt((a-2)/a))
The Kurtosis is correct with a sufficiently large sample.
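As a quick sanity check of the corrected formula for one parameter pair (here alpha = 8, x_m = 4, using the same sampling scheme as in the question):

import numpy as np
from scipy.stats import skew

a, x = 8, 4
data = (np.random.default_rng(0).pareto(a, 10_000_000) + 1) * x
print(skew(data))                                # simulated skewness
print((2*(1 + a)/(a - 3)) * np.sqrt((a - 2)/a))  # analytical: about 3.12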
I previously implemented the original Bayesian Probabilistic Matrix Factorization (BPMF) model in pymc3. See my previous question for reference, data source, and problem setup. Per the answer to that question from @twiecki, I've implemented a variation of the model using LKJCorr priors for the correlation matrices and uniform priors for the standard deviations. In the original model, the covariance matrices are drawn from Wishart distributions, but due to current limitations of pymc3, the Wishart distribution cannot be sampled from properly. This answer to a loosely related question provides a succinct explanation for the choice of LKJCorr priors. The new model is below.
import logging

import pymc3 as pm
import numpy as np
import theano.tensor as t
n, m = train.shape
dim = 10 # dimensionality
beta_0 = 1 # scaling factor for lambdas; unclear on its use
alpha = 2 # fixed precision for likelihood function
std = .05 # how much noise to use for model initialization
# We will use separate priors for sigma and correlation matrix.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
n_elem = dim * (dim - 1) / 2
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)
logging.info('building the BPMF model')
with pm.Model() as bpmf:
    # Specify user feature matrix
    sigma_u = pm.Uniform('sigma_u', shape=dim)
    corr_triangle_u = pm.LKJCorr(
        'corr_u', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)
    corr_matrix_u = corr_triangle_u[tri_index]
    corr_matrix_u = t.fill_diagonal(corr_matrix_u, 1)
    cov_matrix_u = t.diag(sigma_u).dot(corr_matrix_u.dot(t.diag(sigma_u)))
    lambda_u = t.nlinalg.matrix_inverse(cov_matrix_u)

    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
        testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u,
        shape=(n, dim), testval=np.random.randn(n, dim) * std)

    # Specify item feature matrix
    sigma_v = pm.Uniform('sigma_v', shape=dim)
    corr_triangle_v = pm.LKJCorr(
        'corr_v', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)
    corr_matrix_v = corr_triangle_v[tri_index]
    corr_matrix_v = t.fill_diagonal(corr_matrix_v, 1)
    cov_matrix_v = t.diag(sigma_v).dot(corr_matrix_v.dot(t.diag(sigma_v)))
    lambda_v = t.nlinalg.matrix_inverse(cov_matrix_v)

    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
        testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v,
        testval=np.random.randn(m, dim) * std)

    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=t.dot(U, V.T), tau=alpha * np.ones((n, m)),
        observed=train)
# `start` is the start dictionary obtained from running find_MAP for PMF.
# See the previous post for PMF code.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]
with bpmf:
    step = pm.NUTS(scaling=start)
The goal with this reimplementation was to produce a model that could be estimated using the NUTS sampler. Unfortunately, I'm still getting the same error at the last line:
PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [ 0 1 2 3 ... 1030 1031 1032 1033 1034 ]
I've made all the code for PMF, BPMF, and this modified BPMF available in this gist to make it simple to replicate the error. All you need to do is download the data (also referenced in the gist).
It looks like you are passing the complete precision matrix into the normal distribution:
mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
    testval=np.random.randn(dim) * std)
I assume you only want to pass the diagonal values:
mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * t.diag(lambda_u), shape=dim,
    testval=np.random.randn(dim) * std)
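Presumably the same change would apply to mu_v (my extrapolation of the suggestion, not something tested against the gist):

mu_v = pm.Normal(
    'mu_v', mu=0, tau=beta_0 * t.diag(lambda_v), shape=dim,
    testval=np.random.randn(dim) * std)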
Does this change to mu_u and mu_v fix it for you?