Using Numpy (np.linalg.svd) for Singular Value Decomposition - python

I'm reading Abdi & Williams (2010), "Principal Component Analysis", and I'm trying to redo the SVD to obtain values for further PCA.
The article states that following SVD:
X = P D Q^t
I load my data in a np.array X.
X = np.array(data)
P, D, Q = np.linalg.svd(X, full_matrices=False)
D = np.diag(D)
But I do not get the above equality when checking with
X_a = np.dot(np.dot(P, D), Q.T)
X_a and X are the same dimensions, but the values are not the same. Am I missing something, or is the functionality of the np.linalg.svd function not compatible somehow with the equation in the paper?

TL;DR: numpy's SVD computes X = PDQ, so the Q is already transposed.
SVD decomposes the matrix X effectively into rotations P and Q and the diagonal matrix D. The version of linalg.svd() I have returns forward rotations for P and Q. You don't want to transpose Q when you calculate X_a.
import numpy as np
X = np.random.normal(size=[20,18])
P, D, Q = np.linalg.svd(X, full_matrices=False)
X_a = np.matmul(np.matmul(P, np.diag(D)), Q)
print(np.std(X), np.std(X_a), np.std(X - X_a))
I get: 1.02, 1.02, 1.8e-15, showing that X_a very accurately reconstructs X.
If you are using Python 3, the @ operator implements matrix multiplication and makes the code easier to follow:
import numpy as np
X = np.random.normal(size=[20,18])
P, D, Q = np.linalg.svd(X, full_matrices=False)
X_a = P @ np.diag(D) @ Q
print(np.std(X), np.std(X_a), np.std(X - X_a))
print('Is X close to X_a?', np.isclose(X, X_a).all())

I think there are still some important points for those who use SVD via numpy's linalg library. Firstly, https://docs.scipy.org/doc/numpy/reference/generated/numpy.linalg.svd.html is a good reference for the SVD computation function.
Taking the SVD computation as A = U D V^T: for U, D, V = np.linalg.svd(A), this function returns V in V^T form already. Also D contains only the singular values, as a 1-D array, so it has to be shaped into a diagonal matrix. Hence the reconstruction can be formed with
import numpy as np
U, D, V = np.linalg.svd(A)
A_reconstructed = U @ np.diag(D) @ V
The point is that if A is not a square but a rectangular matrix, this won't work; you can use this instead:
import numpy as np
U, D, V = np.linalg.svd(A)
m, n = A.shape
A_reconstructed = U[:, :n] @ np.diag(D) @ V[:m, :]
or you may use the full_matrices=False option of the SVD function:
import numpy as np
U, D, V = np.linalg.svd(A, full_matrices=False)
A_reconstructed = U @ np.diag(D) @ V

From the scipy.linalg.svd docstring, where (M,N) is the shape of the input matrix, and K is the lesser of the two:
Returns
-------
U : ndarray
    Unitary matrix having left singular vectors as columns.
    Of shape ``(M,M)`` or ``(M,K)``, depending on `full_matrices`.
s : ndarray
    The singular values, sorted in non-increasing order.
    Of shape (K,), with ``K = min(M, N)``.
Vh : ndarray
    Unitary matrix having right singular vectors as rows.
    Of shape ``(N,N)`` or ``(K,N)`` depending on `full_matrices`.
Vh, as described, is the transpose of the Q used in the Abdi and Williams paper. So just
X_a = P.dot(D).dot(Q)
should give you your answer.
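A minimal sketch of that check on random data, using scipy.linalg.svd (whose docstring is quoted above):
import numpy as np
from scipy.linalg import svd

X = np.random.normal(size=(20, 18))
P, d, Vh = svd(X, full_matrices=False)
D = np.diag(d)

X_a = P.dot(D).dot(Vh)        # Vh plays the role of Q: no extra transpose needed
print(np.allclose(X, X_a))    # True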

Though this post is quite old, I thought it deserves a crucial update. In the above answers, the right singular vectors (typically placed in the columns of the matrix V) are said to be given directly as columns by np.linalg.svd(). However, this is incorrect. The matrix returned by np.linalg.svd() is Vh, the Hermitian (conjugate) transpose of V, so the right singular vectors are in fact the rows of Vh. Be careful with this: the matrix itself is square, so you cannot tell the difference from its shape, but you can use a reconstruction test to check whether you are interpreting the matrix correctly.
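A quick way to see this (an illustrative check, not from the quoted answers): reconstruct X treating the rows of Vh as the right singular vectors, and compare with the result of treating them as columns.
import numpy as np

X = np.random.normal(size=(6, 4))
U, s, Vh = np.linalg.svd(X, full_matrices=False)

# Correct: the right singular vectors are the rows of Vh
print(np.allclose(X, U @ np.diag(s) @ Vh))      # True

# Wrong: treating them as columns (i.e. using Vh.T in its place)
print(np.allclose(X, U @ np.diag(s) @ Vh.T))    # False in general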

Related

In which scenario would one use another matrix than the identity matrix for finding eigenvalues?

The scipy.linalg.eigh function can take two matrices as arguments: first the matrix a, of which we will find the eigenvalues and eigenvectors, and also the optional matrix b, which defaults to the identity matrix when omitted.
In what scenario would someone like to use this b matrix?
Some more context: I am trying to use xdawn covariances from the pyRiemann package. This uses the scipy.linalg.eigh function with a covariance matrix a and a baseline covariance matrix b. You can find the implementation here. This yields an error, as the b matrix in my case is not positive definite and thus not usable in the scipy.linalg.eigh function. Removing this matrix and just using the identity matrix, however, solves this problem and yields relatively nice results... The problem is that I do not really understand what I changed, and maybe I am doing something I should not be doing.
This is the code from the pyRiemann package I am using (modified to avoid using functions defined in other parts of the package):
# X are samples (EEG data), y are labels
# shape of X is (1000, 64, 2459)
# shape of y is (1000,)
import numpy as np
import sklearn.covariance
from scipy.linalg import eigh

Ne, Ns, Nt = X.shape
tmp = X.transpose((1, 2, 0))
b = np.matrix(sklearn.covariance.empirical_covariance(tmp.reshape(Ne, Ns * Nt).T))
for c in self.classes_:
    # Prototyped response for each class
    P = np.mean(X[y == c, :, :], axis=0)
    # Covariance matrix of the prototyped response & signal
    a = np.matrix(sklearn.covariance.empirical_covariance(P.T))
    # Spatial filters
    evals, evecs = eigh(a, b)
    # and I am now using the following, disregarding the b matrix:
    # evals, evecs = eigh(a)
If A and B are both symmetric matrices, that doesn't necessarily imply that inv(A)*B is a symmetric matrix. And so, if I had to solve a generalised eigenvalue problem Ax = lambda Bx, I would use eig(A, B) rather than eig(inv(A)*B), so that the symmetry isn't lost.
One practical application is in finding the natural frequencies of a dynamic mechanical system from differential equations of the form M (d²x/dt²) = -Kx, where M is a positive definite matrix known as the mass matrix, K is the stiffness matrix, x is the displacement vector and d²x/dt² is the acceleration vector. To find the natural frequencies, x can be substituted with x0 sin(ωt), where ω is the natural frequency. The equation then reduces to Kx = ω²Mx. Now, one could use eig(inv(K)*M), but that might break the symmetry of the resultant matrix, so I would use eig(K, M) instead.
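As an illustrative sketch (the two-degree-of-freedom system below is made up, not from the original post), solving K x = ω² M x with eigh(K, M) keeps the symmetric structure and gives the same eigenvalues as the unsymmetric product inv(M) @ K:
import numpy as np
from scipy.linalg import eigh

M = np.array([[2.0, 0.0],
              [0.0, 1.0]])      # mass matrix (positive definite)
K = np.array([[ 6.0, -2.0],
              [-2.0,  4.0]])    # stiffness matrix (symmetric)

# Generalised symmetric eigenvalue problem: K x = w^2 M x
w2, modes = eigh(K, M)
print("natural frequencies:", np.sqrt(w2))                      # sqrt(2), sqrt(5)

# Same eigenvalues, but inv(M) @ K is generally not symmetric
print(np.sort(np.linalg.eigvals(np.linalg.inv(M) @ K).real))    # [2., 5.]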
In the generalised problem A x = lambda B x, the eigenvectors x are no longer expressed in the same basis as the covariance matrix; the matrix B defines the metric they are measured against.
If B is not positive definite, it means that there are vectors that can be flipped by your B.
I hope it was helpful.

Given A = S*D*S.T (D is a diagonal matrix, S/S.T an arbitrary nxn matrix), shouldn't the eigenvalues of A correspond to the diagonal entries of D?

I was asked to write a function that generates a random symmetric positive definite 2D matrix.
Here is my attempt:
import numpy as np
from numpy import linalg as la

def random_spd(n):
    """Generates random 2D SPD matrix (symmetric positive definite)"""
    while True:
        S = np.random.rand(n, n)
        if la.matrix_rank(S) == n:  # make sure that S has full rank
            break
    D = np.diag(np.random.randint(0, 10, size=n))
    print(f"S:\n{S}\n\nD:\n{D}\n")  # only for debugging
    return S @ D @ S.T

A = random_spd(2)
print(f"A:\n{A}\n")
ei_vals, ei_vecs = la.eig(A)
print(f"Eigenvalues:\n{ei_vals}\n\nEigenvectors:\n{ei_vecs}")
Output:
D:
[[6 0]
[0 5]]
A:
[[1.97478191 1.71620628]
[1.71620628 2.37372465]]
Eigenvalues:
[0.4464938 3.90201276]
Eigenvectors:
[[-0.74681018 -0.66503726]
[ 0.66503726 -0.74681018]]
As far as I know, the function works.
Now, if I try to calculate the eigenvalues of a randomly generated matrix, shouldn't they be the same as
the diagonal entries of the matrix D?
Can someone help me understand my misconception or mistake?
Thank you very much!
Best regards, Max :)
What you are applying is a congruence transform; it preserves definiteness, but not the eigenvalues.
A positive definite matrix P is one for which, for any (non-null) vector x of shape (N, 1), x.T @ P @ x > 0.
Now if you substitute x = S @ y in the above condition, you get y.T @ S.T @ P @ S @ y > 0; comparing the two, you conclude that S.T @ P @ S is positive definite as well (positive semidefinite if S is not full rank).
The eigenvalues, on the other hand, are defined by the equation
A @ v = lambda * v
If you substitute v = S @ u, the equation you get is
A @ S @ u = lambda * S @ u
To place this in the same form as the eigenvalue equation, left-multiply by inv(S):
(inv(S) @ A @ S) @ u = lambda * u
We say that the matrix inv(S) @ A @ S obtained this way is similar to A, and we call this a similarity transformation; it is this transform, not the congruence transform S @ D @ S.T, that preserves the eigenvalues. That is why the eigenvalues of A do not match the diagonal entries of D.
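A small numerical illustration of this, using the names from your code (my own sketch): the congruence transform S @ D @ S.T keeps positive definiteness but changes the eigenvalues, while the similarity transform inv(S) @ D @ S keeps them.
import numpy as np

S = np.random.rand(3, 3)             # assume full rank here
D = np.diag([6.0, 5.0, 2.0])

congruent = S @ D @ S.T              # what random_spd() builds
similar = np.linalg.inv(S) @ D @ S   # a similarity transform

print(np.sort(np.linalg.eigvalsh(congruent)))     # all positive, but not 2, 5, 6
print(np.sort(np.linalg.eigvals(similar).real))   # ~ [2. 5. 6.]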
There are simpler ways to create a positive definite matrix. One simple way:
S = np.random.rand(n, n)
A = S.T @ S + eps * np.eye(n)
S.T @ S can be seen as a congruence transform of the identity matrix, and is therefore positive semidefinite; adding eps * np.eye(n) ensures that all eigenvalues are at least eps. No matrix inversions, no eigendecomposition needed.
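A quick sanity check of that recipe (eps here is an arbitrary small number):
import numpy as np

n, eps = 4, 1e-6
S = np.random.rand(n, n)
A = S.T @ S + eps * np.eye(n)

print(np.allclose(A, A.T))           # symmetric
print(np.linalg.eigvalsh(A).min())   # no smaller than (roughly) eps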

N-dimensional high order polynomial interpolation

I am searching for some clues about a complex issue that I am facing, regarding interpolation in a 4D space.
I have a dataset composed of 340 points in a 3-dimensional space (three variables, A, B, C, each defined by 340 elements). Each point is identified by a certain value of an output variable. So, generally, I have
f(A,B,C) = D
I need to interpolate the dataset in order to predict the value of D for each point in the design space. What I did was write a small script to obtain the coefficients m of the polynomial through the numpy function np.linalg.lstsq:
import itertools
import numpy as np

def polyfit4d(x, y, z, metric, order):
    ncols = (order + 1)**3
    G = np.zeros((x.size, ncols))
    ij = itertools.product(range(order + 1), repeat=3)
    for w, (i, j, k) in enumerate(ij):
        G[:, w] = x**i * y**j * z**k
    m, residuals, rank, s = np.linalg.lstsq(G, metric)
    return m, residuals
Then, I used an evaluating function to obtain the values of the function at all the points of the design space.
import itertools
import math
import numpy as np

def polyval4d(x, y, z, m):
    order = int(math.ceil((len(m))**(1/3.0) - 1))
    ij = itertools.product(range(order + 1), repeat=3)
    f = np.zeros_like(x)
    for a, (i, j, k) in zip(m, ij):
        f += a * x**i * y**j * z**k
    return f
Since my design space is 3-dimensional, I passed to the polyval function three 3D matrices with all the points X, Y, Z of the design space.
The result f is a 3D matrix of outputs D: each entry is the value of D obtained by evaluating the polynomial, found with polyfit, at the corresponding point of the design space (sorry for the tricky sentence).
What I then do is plot a contour plot of a slice of this 3D design space: I choose one value of Z, and plot the 2D plane formed by X and Y with the contour levels based on the values of D, roughly as in the sketch below.
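Roughly, the calling pattern looks like this (the random stand-in data and the grid below are just placeholders for my real 340-point dataset and design space):
import numpy as np
import matplotlib.pyplot as plt

# Stand-ins for the real dataset: a, b, c are the coordinates, d is the output D
a, b, c, d = np.random.rand(4, 340)

# Fit the polynomial coefficients with the function defined above
m, residuals = polyfit4d(a, b, c, d, order=2)

# Evaluate on a regular grid covering the design space
grid = np.linspace(0.0, 1.0, 30)
Xg, Yg, Zg = np.meshgrid(grid, grid, grid, indexing='ij')
f = polyval4d(Xg, Yg, Zg, m)

# Contour plot of one Z-slice of the design space
k = 15
plt.contourf(Xg[:, :, k], Yg[:, :, k], f[:, :, k])
plt.colorbar()
plt.show()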
The problem is that the result is not what I expect: the contour plot is almost all one colour, with some variation in one corner.
I searched everywhere on the internet, and the Python wiki also suggests functions that work only for the 2D case.
Has anyone ever faced this kind of problem? Do I miss something in the evaluation/definition of this N-dimensional polynomial?
Thanks a lot for your kind attention.
Federico

Find SVD of a symmetric matrix in Python

I know np.linalg.svd(A) would return the SVD of matrix A.
A=u * np.diag(s) * v
However if it is a symmetric matrix you only need one unitary matrix:
A=v.T * np.diag(s) * v
In R we can use La.svd(A, nu=0), but are there any functions to accelerate the SVD process in Python for a symmetric matrix?
In the SVD of symmetric matrices, U = V would be valid only if the eigenvalues of A are all positive. Otherwise V would be almost equal to U but not exactly, as the columns of V corresponding to negative eigenvalues would be the negatives of the corresponding columns of U. And so, A = v.T * np.diag(s) * v is valid only if A has all its eigenvalues positive.
Assuming the symmetric matrix is a real matrix, the U matrix of the singular value decomposition is the eigenvector matrix itself, the singular values are the absolute values of the eigenvalues of the matrix, and the V matrix is the same as U except that the columns corresponding to negative eigenvalues are the negatives of the corresponding columns of U.
And so, by finding the eigenvalues and the eigenvectors, the SVD can be computed.
Python code:
import numpy as np

def svd_symmetric(A):
    s, u = np.linalg.eig(A)      # eigenvalues and eigenvectors
    v = u.copy()
    v[:, s < 0] = -u[:, s < 0]   # negate the columns corresponding to negative eigenvalues
    s = abs(s)
    return [u, s, v.T]

n = 5
A = np.matrix(np.random.rand(n, n))  # a matrix of random numbers
A = A.T + A                          # making it a symmetric matrix

[u, s, vT] = svd_symmetric(A)
print(u * np.diag(s) * vT)
This prints the same matrix as A.
(Note: I don't know whether it works for complex matrices or not.)

How to whiten matrix in PCA

I'm working with Python and I've implemented the PCA using this tutorial.
Everything works great: I got the covariance matrix, did a successful transform, and brought it back to the original dimensions, no problem.
But how do I perform whitening? I tried dividing the eigenvectors by the eigenvalues:
S, V = numpy.linalg.eig(cov)
V = V / S[:, numpy.newaxis]
and used V to transform the data but this led to weird data values.
Could someone please shed some light on this?
Here's a numpy implementation of some Matlab code for matrix whitening I got from here.
import numpy as np

def whiten(X, fudge=1E-18):
    # the matrix X should be observations-by-components

    # get the covariance matrix
    Xcov = np.dot(X.T, X)

    # eigenvalue decomposition of the covariance matrix
    d, V = np.linalg.eigh(Xcov)

    # a fudge factor can be used so that eigenvectors associated with
    # small eigenvalues do not get overamplified.
    D = np.diag(1. / np.sqrt(d + fudge))

    # whitening matrix
    W = np.dot(np.dot(V, D), V.T)

    # multiply by the whitening matrix
    X_white = np.dot(X, W)

    return X_white, W
You can also whiten a matrix using SVD:
def svd_whiten(X):
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # U and Vt are the singular matrices, and s contains the singular values.
    # Since the columns of U and the rows of Vt are orthonormal vectors, U * Vt
    # will be white
    X_white = np.dot(U, Vt)

    return X_white
The second way is a bit slower, but probably more numerically stable.
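A quick way to compare the two (my own check, relying on the two functions defined above): after whitening, X.T @ X should be close to the identity. Note that both functions normalise X.T @ X rather than the covariance X.T @ X / (n - 1).
import numpy as np

X = np.random.normal(size=(500, 10))
X -= X.mean(axis=0)                 # centre the columns first

Xw_eig, W = whiten(X)
Xw_svd = svd_whiten(X)

print(np.allclose(Xw_eig.T @ Xw_eig, np.eye(10), atol=1e-6))   # True
print(np.allclose(Xw_svd.T @ Xw_svd, np.eye(10), atol=1e-6))   # True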
If you use Python's scikit-learn library for this, you can just set the built-in parameter:
from sklearn.decomposition import PCA
pca = PCA(whiten=True)
whitened = pca.fit_transform(X)
check the documentation.
I think you need to transpose V and take the square root of S. So the formula is
matrix_to_multiply_with_data = transpose(v) * s^(-1/2)
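In NumPy terms, one possible reading of that formula (my own sketch with toy data as a stand-in; the data must be centred first, and eigh is used here since the covariance matrix is symmetric):
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(500, 5)) @ rng.normal(size=(5, 5))   # correlated toy data
data -= data.mean(axis=0)                                    # centre first

cov = np.cov(data, rowvar=False)
S, V = np.linalg.eigh(cov)       # eigenvalues S, eigenvectors in the columns of V

# transpose(v) * s^(-1/2), applied to row-vector data from the right:
whitened = data @ (V / np.sqrt(S))

print(np.cov(whitened, rowvar=False).round(6))   # ~ identity matrix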
Use ZCA mapping instead
function [Xw] = whiten(X)
    % Compute and apply the ZCA mapping
    mu_X = mean(X, 1);
    X = bsxfun(@minus, X, mu_X);
    Xw = X / sqrtm(cov(X));
end
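For reference, a rough NumPy translation of that MATLAB function (my own sketch; scipy.linalg.sqrtm plays the role of MATLAB's sqrtm):
import numpy as np
from scipy.linalg import sqrtm

def zca_whiten(X):
    """Compute and apply the ZCA mapping (X is observations-by-features)."""
    X = X - X.mean(axis=0)                 # bsxfun(@minus, X, mu_X)
    C = np.cov(X, rowvar=False)            # cov(X)
    return X @ np.linalg.inv(sqrtm(C))     # X / sqrtm(cov(X))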
