Matrix multiplication should converge, but it doesn't

I'm trying to calculate the eigenvalues of multiplication between matrices of this kind:
import numpy as np
μ=1.5
σ=0.5
m=np.random.normal(μ,σ)
P=[[-1, m, 0.1, 0.1],[1,0,0,0],[0,1,0,0],[0,0,1,0]] #matrix
n=200
for i in range(n):
    m=np.random.normal(μ,σ)
    T=[[-1, m, 0.1, 0.1],[1,0,0,0],[0,1,0,0],[0,0,1,0]]
    P=np.dot(T,P)
l,v=np.linalg.eig(T)
λ,w=np.linalg.eig(P)
print(l)
print(λ)
Since three of the eigenvalues of the matrix T are lower than 1 (in modulus), I expect something similar from the eigenvalues of the matrix product P: the three eigenvalues lower than 1 should correspond to eigenvalues of P that decrease and converge to 0 as n increases. This is in fact true up to about n=40. Beyond that it no longer holds.
I'm sure there is something wrong in the algorithm and not in the math, because for σ=0 P would be the product of n identical matrices with eigenvalues lower than 1. Yet the eigenvalues of P diverge.

Your T[i] is not guaranteed to have eigenvalues less than 1 in modulus.
Add an assertion and you will see when the assumption is violated:
import numpy as np
μ=1.5
σ=0.5
m=np.random.normal(μ,σ)
P=[[-1, m, 0.1, 0.1],[1,0,0,0],[0,1,0,0],[0,0,1,0]] #matrix
n=200
for i in range(n):
    m=np.random.normal(μ,σ)
    T=[[-1, m, 0.1, 0.1],[1,0,0,0],[0,1,0,0],[0,0,1,0]]
    assert np.all(abs(np.linalg.eigvals(T)) < 1)
    P=np.dot(T,P)
l,v=np.linalg.eig(T)
λ,w=np.linalg.eig(P)
print(abs(l))
print(λ)
Because this is a stochastic system, the product can still converge even if the constraint is occasionally violated, but showing that requires a careful analysis of the expected values of the sequence.
The matrix structure resembles the controllable canonical form in linear dynamic systems [1].
But since the matrix T sometimes has eigenvalues larger than 1 in modulus, the reasoning you suggested does not apply in this case.
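As a rough way to see how often the assumption fails, one could sample many values of m and record the spectral radius of each T. This is only a sketch: the helper name `companion`, the sample count, and the fixed seed are choices made here for illustration, not part of the original code.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed fixed here only for reproducibility
mu, sigma = 1.5, 0.5            # same parameters as in the question

def companion(m):
    """Build the matrix T from the question for a given m (hypothetical helper)."""
    return np.array([[-1, m, 0.1, 0.1],
                     [1, 0, 0, 0],
                     [0, 1, 0, 0],
                     [0, 0, 1, 0]], dtype=float)

samples = rng.normal(mu, sigma, size=1000)
# Spectral radius = largest eigenvalue modulus of each sampled T
radii = np.array([np.max(np.abs(np.linalg.eigvals(companion(m)))) for m in samples])
fraction_unstable = np.mean(radii >= 1)
print(f"fraction of sampled T with spectral radius >= 1: {fraction_unstable:.3f}")
```

If the printed fraction is not zero, the product of such matrices cannot be assumed to shrink in every direction.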

Related

In which scenario would one use another matrix than the identity matrix for finding eigenvalues?

The scipy.linalg.eigh function can take two matrices as arguments: first the matrix a, of which we will find eigenvalues and eigenvectors, but also the matrix b, which is optional and chosen as the identity matrix in case it is left blank.
In what scenario would someone like to use this b matrix?
Some more context: I am trying to use xdawn covariances from the pyRiemann package. This uses the scipy.linalg.eigh function with a covariance matrix a and a baseline covariance matrix b. You can find the implementation here. This yields an error, as the b matrix in my case is not positive definite and thus not usable in the scipy.linalg.eigh function. Removing this matrix and just using the identity matrix, however, solves this problem and yields relatively nice results. The problem is that I do not really understand what I changed, and maybe I am doing something I should not be doing.
This is the code from the pyRiemann package I am using (modified to avoid using functions defined in other parts of the package):
# X are samples (EEG data), y are labels
# shape of X is (1000, 64, 2459)
# shape of y is (1000,)
import numpy as np
import sklearn.covariance
from scipy.linalg import eigh
Ne, Ns, Nt = X.shape
tmp = X.transpose((1, 2, 0))
# Baseline covariance matrix of the whole signal
b = np.matrix(sklearn.covariance.empirical_covariance(tmp.reshape(Ne, Ns * Nt).T))
for c in self.classes_:
    # Prototyped response for each class
    P = np.mean(X[y == c, :, :], axis=0)
    # Covariance matrix of the prototyped response & signal
    a = np.matrix(sklearn.covariance.empirical_covariance(P.T))
    # Spatial filters
    evals, evecs = eigh(a, b)
    # and I am now using the following, disregarding the b matrix:
    # evals, evecs = eigh(a)

If A and B are both symmetric matrices, that does not imply that inv(B)*A is a symmetric matrix. So, if I had to solve a generalised eigenvalue problem Ax = lambda Bx, I would use eig(A,B) rather than eig(inv(B)*A), so that the symmetry isn't lost.
One practical application is finding the natural frequencies of a dynamic mechanical system from differential equations of the form M (d²x/dt²) = -Kx, where M is a positive definite matrix known as the mass matrix, K is the stiffness matrix, x is the displacement vector, and d²x/dt² is the acceleration vector (the second derivative of the displacement vector). To find the natural frequencies, x can be substituted with x0 sin(ωt), where ω is the natural frequency. The equation reduces to Kx = ω²Mx. Now, one could use eig(inv(M)*K), but that might break the symmetry of the resulting matrix, and so I would use eig(K,M) instead.
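The mass-spring problem Kx = ω²Mx can be sketched with scipy.linalg.eigh as follows. The 2x2 matrices M and K below are made-up illustrative values, not taken from any real system.

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative (made-up) mass and stiffness matrices; both symmetric, M positive definite
M = np.array([[2.0, 0.0],
              [0.0, 1.0]])
K = np.array([[3.0, -1.0],
              [-1.0, 1.0]])

# Generalized symmetric eigenproblem K x = omega^2 M x
omega_sq, modes = eigh(K, M)

# The equivalent ordinary eigenproblem uses inv(M) @ K, which is not symmetric here
A = np.linalg.inv(M) @ K
is_symmetric = np.allclose(A, A.T)
omega_sq_direct = np.sort(np.linalg.eigvals(A).real)

print(omega_sq, omega_sq_direct, is_symmetric)
```

Both routes give the same eigenvalues for this small example, but only the two-matrix form lets the solver exploit symmetry, and it returns eigenvectors normalized so that modes.T @ M @ modes is the identity.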

In the generalised problem (A - lambda B) x = 0, x is not expressed in the same basis as the covariance matrix.
If B is not positive definite, it means that there are vectors that can be flipped by your B.
I hope this was helpful.

How to remove discontinuities from complex angle of NumPy eigenvector components?

I am using NumPy's linalg.eig on square matrices. My square matrices are a function of a 2D domain, and I am looking at its eigenvectors' complex angles along a parameterized circle on this domain. As long as the path I am considering is smooth, I expect the complex angles of each eigenvector's components to be smooth. However, for some cases, this is not the case with Python (although it is with other programming languages). For the parameter M=0 (some argument in my matrix that appears on its diagonal), I have components that look like:
when they should ideally look like (M=0.1):
What I have tried:
I verified that the matrices are Hermitian in both cases.
When I use linalg.eigh, M=0.1 becomes discontinuous while M=0 sometimes becomes continuous.
Using np.unwrap did nothing.
The difference between component phases (i.e. np.angle(v1-v2) for eigenvector v=[[v1],[v2]]) is smooth/continuous, but this is not what I want.
Fixing the NumPy seed before solving did nothing for different values of the seed. For example: np.random.seed(1).
What else can I do? I am trying to use Sympy's eigenvects just because I am running out of options, and I asked another question asking about another potential approach here: How do I force first component of NumPy eigenvectors to be real? . But, I do not know what else I can try.
Here is a minimal working example that works nicely in a Jupyter notebook:
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt
M = 0.01  # nonzero M is okay
M = 0.0  # M=0 causes problems
def matrix_generator(kx,ky,M):
    a = 2.46; t = 1; k = np.array((kx,ky))
    # nearest-neighbour vectors
    d1 = (a/2)*np.array((1,np.sqrt(3))); d2 = (a/2)*np.array((1,-np.sqrt(3))); d3 = -a*np.array((1,0))
    # Pauli matrices
    sx = np.matrix([[0,1],[1,0]]); sy = np.matrix([[0,-1j],[1j,0]]); sz = np.matrix([[1,0],[0,-1]])
    hx = np.cos(k@d1)+np.cos(k@d2)+np.cos(k@d3); hy = np.sin(k@d1)+np.sin(k@d2)+np.sin(k@d3)
    return -t*(hx*sx - hy*sy + M*sz)
n_segs = 200  # number of segments in (kx,ky) loop
evecs_along_loop = np.zeros((n_segs,2,2),dtype=float)
# parameterize circular loop
kx0 = 0.5; ky0 = 1; r1=0.2; r2=0.2
a = np.linspace(0.0, 2*np.pi, num=n_segs+2)
kloop=np.zeros((n_segs+2,2))
for i in range(n_segs+2):
    kloop[i,:]=np.array([kx0 + r1*np.cos(a[i]), ky0 + r2*np.sin(a[i])])
# assign eigenvector complex angles
for j in np.arange(n_segs):
    np.random.seed(2)
    H = matrix_generator(kloop[j][0],kloop[j][1],M)
    eval0, psi0 = LA.eig(H)
    evecs_along_loop[j,:,:] = np.angle(psi0)
# plot eigenvector complex angles
for p in np.arange(2):
    for q in np.arange(2):
        print(f"Phase for eigenvector element {p},{q}:")
        fig = plt.figure()
        ax = plt.axes()
        ax.plot((evecs_along_loop[:,p,q]))
        plt.show()
Clarification for anon01's comment:
For M=0, a sample matrix at some value of (kx,ky) would look like:
a = np.matrix([[0.+0.j, 0.99286437+1.03026667j],
               [0.99286437-1.03026667j, 0.+0.j]])
For M =/= 0, the diagonal will be non-zero (but real).

I think that in general this is a tough problem. The fundamental issue is that eigenvectors (unlike eigenvalues) are not unambiguously defined. An eigenvector v of M with eigenvalue c is any non-zero vector for which
M*v = c*v
In particular, for any non-zero scalar s, multiplying an eigenvector by s yields an eigenvector, and even if you demand (as usual) that eigenvectors have length 1, we are still free to multiply by any scalar of absolute value 1. Even worse, if v1,..vd are orthogonal eigenvectors for c, then any non-zero linear combination of the v's is also an eigenvector for c.
Different eigendecomposition routines might well, therefore, come up with very different eigenvectors and still be doing their job. Moreover, some routines might produce eigenvectors that are far apart for matrices that are close together.
A simple tractable case is where you know that all your eigenvalues are non-degenerate (i.e. each eigenspace is of dimension 1) and you happen to know that for a particular i, the i'th component of each eigenvector will be non-zero. Then you could multiply the eigenvector v by a scalar, of absolute value 1, chosen so that after the multiplication v[i] is a positive real number. In C:
s = conj(v[i])/cabs(v[i])
where
conj(z) is the complex conjugate of the complex number z,
and cabs(z) is the absolute value of the complex number z
Note that the above supposes that we are using the same index for every eigenvector, though the factor s varies from eigenvector to eigenvector.
This would impose a uniqueness on the eigenvectors, and, one would hope, mean that they varied continuously with the parameters of your matrix.
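The same phase-fixing idea might look like this in NumPy. It is a sketch under the assumptions stated above (non-degenerate eigenvalues, component i non-zero in every eigenvector); the function name `fix_phase` is hypothetical, and the test matrix is the M=0 sample quoted in the question.

```python
import numpy as np

def fix_phase(vecs, i=0):
    """Rotate each eigenvector (column of vecs) so its i-th component is real and positive.

    Assumes vecs[i, j] != 0 for every column j.
    """
    pivot = vecs[i, :]
    s = np.conj(pivot) / np.abs(pivot)  # one unit-modulus factor per eigenvector
    return vecs * s                     # broadcasting scales column j by s[j]

# Sample Hermitian matrix for M=0, taken from the question
H = np.array([[0.0, 0.99286437 + 1.03026667j],
              [0.99286437 - 1.03026667j, 0.0]])
evals, vecs = np.linalg.eigh(H)
fixed = fix_phase(vecs, i=0)
print(np.angle(fixed))
```

Multiplying a column by a unit-modulus scalar keeps it a normalized eigenvector, so fixed can replace the raw eigenvectors before np.angle is taken at each point along the loop.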

Numpy: Raise diagonalizable square matrix to infinite power

Consider a Markovian process with a diagonalizable transition matrix A such that A=PDP^-1, where D is a diagonal matrix with eigenvalues of A, and P is a matrix whose columns are eigenvectors of A.
To compute, for each state, the likelihood of ending up in each absorbing state, I'd like to raise the transition matrix to the power of n, with n approaching infinity:
A^n=PD^nP^-1
Which is the pythonic way of doing this in Numpy? I could naively compute the eigenvalues and eigenvectors of A, raising the eigenvalues to infinity. Due to my assumption that I have a transition matrix, we would have only eigenvalues equal to one (which will remain one), and eigenvalues between 0 and 1, which will become zero (inspired by this answer):
import numpy as np
from scipy import linalg
# compute left eigenvalues and left eigenvectors
eigenvalues, leftEigenvectors = linalg.eig(transitionMatrix, right=False, left=True)
# for stationary distribution, eigenvalues and vectors are real (up to numerical precision)
eigenvalues = eigenvalues.real
leftEigenvectors = leftEigenvectors.real
# create a filter to collect the eigenvalues close to one
absoluteTolerance = 1e-10
mask = abs(eigenvalues - 1) < absoluteTolerance
# raise eigenvalues to the power of infinity
eigenvalues[mask] = 1
eigenvalues[~mask] = 0
D_raised = np.diag(eigenvalues)
A_raised = leftEigenvectors @ D_raised @ linalg.inv(leftEigenvectors)
Would this be the recommended approach, i.e., one that is both numerically stable and efficient?
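To make the idea A^n = P D^n P^-1 concrete, here is a small sketch on a made-up 3-state chain with two absorbing states. It uses right eigenvectors (the columns of P), matching the decomposition written above, and the matrix values are purely illustrative.

```python
import numpy as np

# Made-up row-stochastic transition matrix; states 0 and 2 are absorbing
A = np.array([[1.0, 0.0, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.0, 1.0]])

# Right eigendecomposition A = P D P^-1
eigenvalues, P = np.linalg.eig(A)

# In the limit n -> infinity, eigenvalues equal to 1 stay 1 and the rest go to 0
limit_eigs = np.where(np.isclose(eigenvalues, 1.0), 1.0, 0.0)
A_inf = (P @ np.diag(limit_eigs) @ np.linalg.inv(P)).real

# Cross-check against a large finite power
A_power = np.linalg.matrix_power(A, 200)
print(A_inf)
```

Comparing against np.linalg.matrix_power is a cheap sanity check, since inverting P can be ill-conditioned when eigenvectors are nearly parallel.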

Pearson's correlation coefficient between all pairs of rows from two 2D arrays using scipy.stats.pearsonr vs. numpy.corrcoeff in python 3.5

I tried to calculate the Pearson's correlation coefficients between every pair of rows from two 2D arrays, and then sort the rows/columns of the correlation matrix based on its diagonal elements. First, the correlation coefficient matrix (i.e., 'ccmtx') was calculated from one random matrix (i.e., 'randmtx') in the following code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import pearsonr
def correlation_map(x, y):
    n_row_x = x.shape[0]
    n_row_y = y.shape[0]
    ccmtx_xy = np.empty((n_row_x, n_row_y))
    for n in range(n_row_x):
        for m in range(n_row_y):
            ccmtx_xy[n, m] = pearsonr(x[n, :], y[m, :])[0]
    return ccmtx_xy
randmtx = np.random.randn(100, 1000) # generating random matrix
#ccmtx = np.corrcoef(randmtx, randmtx) # cc matrix based on numpy.corrcoef
ccmtx = correlation_map(randmtx, randmtx) # cc matrix based on scipy pearsonr
#
ccmtx_diag = np.diagonal(ccmtx)
#
ids, vals = np.argsort(ccmtx_diag, kind = 'mergesort'), np.sort(ccmtx_diag, kind = 'mergesort')
#ids, vals = np.argsort(ccmtx_diag, kind = 'quicksort'), np.sort(ccmtx_diag, kind = 'quicksort')
plt.plot(ids)
plt.show()
plt.plot(ccmtx_diag[ids])
plt.show()
vals[0]
The issue here is that when 'pearsonr' was used, the diagonal elements of 'ccmtx' are exactly 1.0, which makes sense. However, when 'corrcoef' was used, the diagonal elements of 'ccmtx' are not exactly one (slightly less than 1 for some diagonals), seemingly due to floating-point precision error.
I found it annoying that the auto-correlation matrix of a single matrix has diagonal elements that are not 1.0, since this resulted in the shuffling of rows/columns of the correlation matrix when the matrix is sorted based on the diagonal elements.
My questions are:
[1] Is there any good way to speed up the computation if I stick to the 'pearsonr' function? (e.g., a vectorized pearsonr?)
[2] Is there any good way/practice to prevent this precision error when using 'corrcoef' in numpy? (e.g. the 'decimals' option in np.around?)
I have searched for ways to calculate correlation coefficients between all pairs of rows or columns from two matrices. However, as the algorithms contain some sort of "cov / variance" operation, this kind of precision issue always seems to exist.
Minor point: the 'mergesort' option seems to provide more reliable results than 'quicksort', as quicksort shuffled a 1d array of exact 1s into random order.
Any thoughts/comments would be greatly appreciated!

For question 1 vectorized pearsonr see the comments to the question.
I will answer only question 2: how to improve the precision of np.corrcoef.
The correlation matrix R is computed from the covariance matrix C according to
R_ij = C_ij / sqrt(C_ii * C_jj).
The implementation is optimized for performance and memory usage. It computes the covariance matrix, and then performs two divisions by sqrt(C_ii) and by sqrt(C_jj). This separate square-rooting is where the imprecision comes from. For example:
np.sqrt(3 * 3) - 3            # 0.0
np.sqrt(3) * np.sqrt(3) - 3   # -4.4408920985006262e-16
We can fix this by implementing our own simple corrcoef routine:
def corrcoef(a, b):
    c = np.cov(a, b)
    d = np.diag(c)
    return c / np.sqrt(d[:, None] * d[None, :])
Note that this implementation requires more memory than the numpy implementation because it needs to store a temporary matrix with size n * n and it is slightly slower because it needs to do n^2 square roots instead of only 2 n.
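For question 1, one possible vectorized replacement for the double loop over pearsonr is to center and normalize each row and then take a single matrix product. This is a sketch, not the approach from the comments mentioned above; the function name `correlation_map_vectorized` is made up here, and the diagonal is only guaranteed to equal 1 up to rounding.

```python
import numpy as np

def correlation_map_vectorized(x, y):
    """Pearson correlation between every row of x and every row of y."""
    xc = x - x.mean(axis=1, keepdims=True)  # center each row
    yc = y - y.mean(axis=1, keepdims=True)
    xn = xc / np.linalg.norm(xc, axis=1, keepdims=True)  # scale rows to unit length
    yn = yc / np.linalg.norm(yc, axis=1, keepdims=True)
    return xn @ yn.T  # entry (n, m) is the correlation of x[n] with y[m]

rng = np.random.default_rng(0)
randmtx = rng.standard_normal((5, 50))
ccmtx = correlation_map_vectorized(randmtx, randmtx)
print(np.diagonal(ccmtx))
```

If exact ones on the diagonal matter for sorting, they could be set explicitly with np.fill_diagonal(ccmtx, 1.0) when both inputs are the same matrix.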

numpy: Possible for zero determinant matrix to be inverted?

By definition, a square matrix that has a zero determinant should not be invertible. However, for some reason, after generating a covariance matrix, I take the inverse of it successfully, but taking the determinant of the covariance matrix ends up with an output of 0.0.
What could be potentially going wrong? Should I not trust the determinant output, or should I not trust the inverse covariance matrix? Or both?
Snippet of my code:
cov_matrix = np.cov(data)
adjusted_cov = cov_matrix + weight*np.identity(cov_matrix.shape[0]) # add small weight to ensure cov_matrix is non-singular
inv_cov = np.linalg.inv(adjusted_cov) # runs with no error, outputs a matrix
det = np.linalg.det(adjusted_cov) # ends up being 0.0

The numerical inversion of matrices does not involve computing the determinant. (Cramer's formula for the inverse is not practical for large matrices.) So, the fact that determinant evaluates to 0 (due to insufficient precision of floats) is not an obstacle for the matrix inversion routine.
Following up on the comments by BobChao87, here is a simplified test case (Python 3.4 console, numpy imported as np)
A = 0.2*np.identity(500)
np.linalg.inv(A)
Output: a matrix with 5 on the main diagonal, which is the correct inverse of A.
np.linalg.det(A)
Output: 0.0, because the determinant (0.2^500) is too small to be represented in double precision.
A possible solution is a kind of pre-conditioning (here, just rescaling): before computing the determinant, multiply the matrix by a factor that will make its entries closer to 1 on average. In my example, np.linalg.det(5*A) returns 1.
Of course, using the factor of 5 here is cheating, but np.linalg.det(3*A) also returns a nonzero value (about 1.19e-111). If you try np.linalg.det(2**k*A) for k running through modest positive integers, you will likely hit one that will return nonzero. Then you will know that the determinant of the original matrix was approximately 2**(-k*n) times the output, where n is the matrix size.
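A different technique from the rescaling described above is to work with the logarithm of the determinant, which NumPy exposes as np.linalg.slogdet. The sketch below reuses the same 500x500 test matrix.

```python
import numpy as np

A = 0.2 * np.identity(500)

# slogdet returns the sign and the natural log of |det(A)|, avoiding underflow
sign, logabsdet = np.linalg.slogdet(A)

# The plain determinant underflows to zero for this matrix
det_direct = np.linalg.det(A)

print(sign, logabsdet)
print(det_direct)
```

Here logabsdet should be close to 500 * log(0.2), a perfectly representable number even though the determinant itself is not.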
