Why are these 2 ways of finding eigenvectors different in Python?

import numpy as np
#first way
A = np.array([[1, 0, 1], [-2, 1, 0]])
print(A)
B = A @ A.transpose()
print(B)
eig_val, eig_vec = np.linalg.eig(B)
print(eig_vec)
#second way
from sympy import *
G = Matrix([[2,-2], [-2,5]])
print(G.eigenvects())
Why do these two ways give different results when they are both aiming at the same goal of finding the eigenvectors?

It has already been mentioned that eigenvectors are only unique up to a scalar multiple. That's a mathematical fact. To dig into the implementations of the methods you're using: numpy.linalg.eig returns normalized eigenvectors (i.e. each vector has norm 1), whereas sympy's eigenvects() does not normalize the vectors.
In some sense, normalized vectors are pinned down (up to sign) precisely because they have unit norm: they define a unit eigendirection geometrically, just like unit vectors in coordinate geometry. (Not strictly important to know.)
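As a quick check (a minimal sketch, reusing the 2x2 matrix [[2, -2], [-2, 5]] that appears as G above and equals A @ A.transpose()), normalizing SymPy's eigenvectors reproduces NumPy's columns up to a sign:

import numpy as np
from sympy import Matrix

B = np.array([[2, -2], [-2, 5]])
eig_val, eig_vec = np.linalg.eig(B)      # columns are unit-norm eigenvectors
print(eig_vec)

for val, mult, vecs in Matrix([[2, -2], [-2, 5]]).eigenvects():
    v = np.array(vecs[0], dtype=float).ravel()
    print(val, v / np.linalg.norm(v))    # matches a column of eig_vec up to sign

The two outputs differ only in scaling (and possibly sign), which is exactly the normalization difference described above.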

Related

How to remove discontinuities from complex angle of NumPy eigenvector components?

I am using NumPy's linalg.eig on square matrices. My square matrices are a function of a 2D domain, and I am looking at their eigenvectors' complex angles along a parameterized circle on this domain. As long as the path I am considering is smooth, I expect the complex angles of each eigenvector's components to be smooth. However, for some cases this is not the case with Python (although it is with other programming languages). For the parameter M=0 (some argument in my matrix that appears on its diagonal), the component phases come out discontinuous, when they should ideally be smooth, as they are for M=0.1.
What I have tried:
I verified that the matrices are Hermitian in both cases.
When I use linalg.eigh, M=0.1 becomes discontinuous while M=0 sometimes becomes continuous.
Using np.unwrap did nothing.
The difference between component phases (i.e. np.angle(v1-v2) for eigenvector v=[[v1],[v2]]) is smooth/continuous, but this is not what I want.
Fixing the NumPy seed before solving did nothing for different values of the seed. For example: np.random.seed(1).
What else can I do? I am trying to use Sympy's eigenvects just because I am running out of options, and I asked another question about another potential approach here: How do I force first component of NumPy eigenvectors to be real? But I do not know what else I can try.
Here is a minimal working example that works nicely in a Jupyter notebook:
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt

M = 0.01  # nonzero M is okay
M = 0.0   # M=0 causes problems

def matrix_generator(kx, ky, M):
    a = 2.46; t = 1; k = np.array((kx, ky))
    d1 = (a/2)*np.array((1, np.sqrt(3))); d2 = (a/2)*np.array((1, -np.sqrt(3))); d3 = -a*np.array((1, 0))
    sx = np.matrix([[0, 1], [1, 0]]); sy = np.matrix([[0, -1j], [1j, 0]]); sz = np.matrix([[1, 0], [0, -1]])
    hx = np.cos(k@d1) + np.cos(k@d2) + np.cos(k@d3); hy = np.sin(k@d1) + np.sin(k@d2) + np.sin(k@d3)
    return -t*(hx*sx - hy*sy + M*sz)

n_segs = 200  # number of segments in (kx,ky) loop
evecs_along_loop = np.zeros((n_segs, 2, 2), dtype=float)

# parameterize circular loop
kx0 = 0.5; ky0 = 1; r1 = 0.2; r2 = 0.2
a = np.linspace(0.0, 2*np.pi, num=n_segs+2)
kloop = np.zeros((n_segs+2, 2))
for i in range(n_segs+2):
    kloop[i, :] = np.array([kx0 + r1*np.cos(a[i]), ky0 + r2*np.sin(a[i])])

# assign eigenvector complex angles
for j in np.arange(n_segs):
    np.random.seed(2)
    H = matrix_generator(kloop[j][0], kloop[j][1], M)
    eval0, psi0 = LA.eig(H)
    evecs_along_loop[j, :, :] = np.angle(psi0)

# plot eigenvector complex angles
for p in np.arange(2):
    for q in np.arange(2):
        print(f"Phase for eigenvector element {p},{q}:")
        fig = plt.figure()
        ax = plt.axes()
        ax.plot(evecs_along_loop[:, p, q])
        plt.show()
Clarification for anon01's comment:
For M=0, a sample matrix at some value of (kx,ky) would look like:
a = np.matrix([[0.+0.j, 0.99286437+1.03026667j],
               [0.99286437-1.03026667j, 0.+0.j]])
For M =/= 0, the diagonal will be non-zero (but real).
I think that in general this is a tough problem. The fundamental issue is that eigenvectors (unlike eigenvalues) are not unambiguously defined. An eigenvector v of M with eigenvalue c is any non-zero vector for which
M*v = c*v
In particular, for any non-zero scalar s, multiplying an eigenvector by s yields an eigenvector, and even if you demand (as usual) that eigenvectors have length 1, we are still free to multiply by any scalar of absolute value 1. Even worse, if v1, ..., vd are orthogonal eigenvectors for c, then any non-zero linear combination of the v's is also an eigenvector for c.
Different eigendecomposition routines might well, therefore, come up with very different eigenvectors and still be doing their job. Moreover some routines might produce eigenvectors that are far apart for matrices that are close together.
A simple tractable case is where you know that all your eigenvalues are non-degenerate (i.e. each eigenspace is of dimension 1) and you happen to know that for a particular i, the i'th component of each eigenvector will be non-zero. Then you could multiply the eigenvector v by a scalar of absolute value 1, chosen so that after the multiplication v[i] is a positive real number. In C that scalar would be
s = conj(v[i])/cabs(v[i])
where
conj(z) is the complex conjugate of the complex number z,
and cabs(z) is the absolute value of the complex number z
Note that the above supposes that we are using the same index for every eigenvector, though the factor s varies from eigenvector to eigenvector.
This would impose a uniqueness on the eigenvectors, and, one would hope, mean that they varied continuously with the parameters of your matrix.
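Here is a minimal NumPy version of that recipe (a sketch; fix_phase is a hypothetical helper, and it assumes the chosen component i is non-zero for every eigenvector):

import numpy as np

def fix_phase(vecs, i=0):
    # vecs holds eigenvectors as columns; rescale each column by the
    # unit-modulus scalar s = conj(v[i]) / |v[i]| so that component i
    # becomes a positive real number
    s = np.conj(vecs[i, :]) / np.abs(vecs[i, :])
    return vecs * s   # broadcasting multiplies column c by s[c]

# usage inside the loop of the question:
#   eval0, psi0 = LA.eig(H)
#   evecs_along_loop[j, :, :] = np.angle(fix_phase(np.asarray(psi0)))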

Eigensystem of parametrized, hermitian matrix in python

Suppose we are interested in the eigenvalues and eigenvectors of a hermitian matrix h(t) that depends on a parameter t. My matrix is large and sparse and hence needs to be treated numerically.
A naive approach is to evaluate the matrix h(t_k) at discretized parameter values t_k. Is it possible to sort the eigenvectors and eigenvalues according to the "character of the eigenvector"?
Let me illustrate what I mean by "character of the eigenvector" with the following simple example (i denotes the imaginary unit).
h(t) = {{1, it}, {-it, 1}}
The eigenvalues are 1-t and 1+t with the corresponding eigenvectors {-i, 1} and {i, 1}. Hence, sorting according to the "eigenvector character", the eigenvalues should cross at t = 0. However, most eigensolvers sort them by increasing eigenvalue, exchanging the eigenvectors when going from negative to positive t (see code and output plot).
import numpy as np
import scipy.sparse.linalg as sla
import matplotlib.pyplot as plt

def h(t):
    # parametrized hermitian matrix
    return np.array([[1, t*1j], [-t*1j, 1]])

def eigenvalues(t):
    # convert to tuple for np.vectorize to work
    return tuple(sla.eigsh(h(t), k=2, return_eigenvectors=False))

eigenvalues = np.vectorize(eigenvalues)

t = np.linspace(-1, 1, num=200)
ev0, ev1 = eigenvalues(t)

plt.plot(t, ev0, 'r')
plt.plot(t, ev1, 'g')
plt.xlabel('t')
plt.ylabel('eigenvalues')
plt.show()
Idea
Most eigensolvers iteratively approximate the eigenvectors and eigenvalues. By feeding the eigensystem of the matrix h(t_k) to the solver as an initial guess to diagonalize h(t_{k+1}) one might obtain a result ordered by the "character of the eigenvector".
Is it possible to achieve this with scipy or more generally with python? Preferentially the heavy diagonalization should be delegated to a dedicated compiled library (e.g. Lapack in scipy). Is there a suitable Lapack routine maybe already wrapped in scipy?
Is there an alternative method that achieves the same? How can it be implemented in python?
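One common way to sort by "eigenvector character" (a sketch of an overlap-matching alternative rather than the initial-guess idea above; it uses dense np.linalg.eigh for brevity, and for a large sparse matrix an eigsh call with k < dim could be substituted): diagonalize at every parameter value, then reorder the new eigenpairs so that each one has maximal overlap with an eigenvector from the previous step.

import numpy as np

def track_eigensystem(h, ts):
    # h: callable returning a hermitian matrix, ts: 1D array of parameter values
    evals, evecs = [], []
    for t in ts:
        w, v = np.linalg.eigh(h(t))             # columns of v are eigenvectors
        if evecs:
            overlap = np.abs(evecs[-1].conj().T @ v)
            order = np.argmax(overlap, axis=1)  # best new match for each old vector
            # (a one-to-one assignment, e.g. scipy.optimize.linear_sum_assignment,
            #  would be more robust near degeneracies)
            w, v = w[order], v[:, order]
        evals.append(w)
        evecs.append(v)
    return np.array(evals), np.array(evecs)

With h(t) from the snippet above, track_eigensystem(h, np.linspace(-1, 1, num=200)) gives eigenvalue curves that cross at t = 0 instead of being swapped there.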

What is the canonical representation of a point in numpy?

I'm going to be doing some geometric calculations involving 2-D and 3-D points using numpy.
What is the canonical representation of a 2-D or 3-D point? Please assume minimal familiarity with numpy, data shapes, etc.
The representation of a single point in Cartesian space is somewhat trivial. You could even use flat tuples or lists to represent them and matrix operations would still work, but if you want to add or scale them (which is fundamentally what linear spaces are for) you have to use arrays. I don't see a reason why not to use a 1d array with shape (d,) in d dimensions: you can use those both as column and row vectors on either side of a matrix using the @ matmul operator:
import numpy as np
rot90 = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 1]]) # rotate 90 degrees around z
inp = np.array([1, 0, 0]) # x
# rotate:
inp_rot = rot90 @ inp  # y
# inverse transform:
inp_invrot = inp @ rot90  # -y
A much better question is how to represent collections of points in Cartesian space. If you have N points you will probably want to use a 2d array. But which shape should it be, (N, d) or (d, N)? The answer depends on your use case but without further input you'll want to choose (N, d).
Arrays in numpy are "C-contiguous" by default, which is also called row-major memory layout. This means that on creation an array occupies a contiguous block of memory by default, and items are laid out in memory row after row, with these indices as an example:
>>> np.arange(2*3).reshape(2, 3)
array([[0, 1, 2],
       [3, 4, 5]])
One of the reasons we use numpy is that a contiguous block of memory for a given type occupies much less space than a native python container of the same size, at least for large datasets. The other reason is that we can use vectorized operations that work on slices of the input "simultaneously". The quotes are there because fundamentally the hands of the CPU are bound, but it turns out that you can achieve quite some speedup by making good use of CPU caches. And this is where memory layout comes into play: by using operations on an array that access elements close in memory you have a higher chance of making use of caching, and the reduced communication between RAM and CPU will lead to shorter runtimes.
The problem is not trivial, because vectorizing along larger non-contiguous dimensions might end up faster than vectorizing along smaller contiguous ones. However, without any additional information it's a good rule of thumb to put those dimensions last where you are likely to perform vectorized operations and reductions such as .mean() or .sum(). In case of N points in d-dimensional space it's quite likely that you will want to handle each point separately. Loops in matrix multiplications and things like scalar products and vector norms will all want you to work with one component after the other for a given point.
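For illustration (a small sketch with arbitrary data), with the (N, d) layout the per-point operations act along the last, contiguous axis:

import numpy as np

points = np.random.rand(100000, 3)        # N points in d = 3 dimensions, shape (N, d)

norms = np.linalg.norm(points, axis=-1)   # one norm per point, shape (N,)
centroid = points.mean(axis=0)            # reduce over the batch axis, shape (d,)

rot90 = np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 1.]])
rotated = points @ rot90.T                # rotate every point at once, shape (N, d)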
This is why you will see numpy and scipy functions usually assume arrays of shape (N, d): the inner dimension is second and the "batch" index is first. Consider for example numpy.linalg.eig:
Parameters:
a : (…, M, M) array
Matrices for which the eigenvalues and right eigenvectors will be computed
Returns:
w : (…, M) array
The eigenvalues, each repeated according to its multiplicity. The eigenvalues
are not necessarily ordered. The resulting array will be of complex type,
unless the imaginary part is zero in which case it will be cast to a real
type. When a is real the resulting eigenvalues will be real (0 imaginary
part) or occur in conjugate pairs
[...]
It treats multidimensional arrays as batches of matrices, where the last two indices correspond to the Cartesian indices. Similarly the returned eigenvalues and eigenvectors have batch indices first and vector space indices last.
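For example (a minimal sketch with random matrices), a stack of matrices is handled in one call:

import numpy as np

batch = np.random.rand(5, 2, 2)   # five 2x2 matrices, batch index first
w, v = np.linalg.eig(batch)
print(w.shape)                    # (5, 2): eigenvalues per matrix
print(v.shape)                    # (5, 2, 2): for each matrix, columns are its eigenvectors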
A more direct example is scipy.spatial.distance.pdist which computes the distance between pairs of points in a collection:
Parameters
X : ndarray
An m by n array of m original observations in an n-dimensional space.
[...]
Again you can see the convention that Cartesian indices are last. The same goes for scipy.interpolate.griddata and probably a bunch of other functions.
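A quick pdist illustration of that convention (a sketch with random points):

import numpy as np
from scipy.spatial.distance import pdist

pts = np.random.rand(4, 3)   # m = 4 points in n = 3 dimensions, shape (m, n)
print(pdist(pts).shape)      # (6,): condensed pairwise distances, m*(m-1)/2 entries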
So if you have a good reason to use either representation: do that. But if you don't have a good indicator (such as the results of profiling both representations) you should stick with the "batch of vectors/matrices" approach usually employed by numpy and scipy (shape (N, d)), because you might even end up using some of these functions, for which your representation will then be native.
Represent them in your source code as tuples or lists, e.g. (1, 0) or [1, 0, 1].
As per this example from scipy:
>>> from scipy.spatial import distance
>>> distance.euclidean([1, 0, 0], [0, 1, 0])
1.4142135623730951

Pearson's correlation coefficient between all pairs of rows from two 2D arrays using scipy.stats.pearsonr vs. numpy.corrcoef in python 3.5

I tried to calculate the Pearson correlation coefficients between every pair of rows from two 2D arrays, and then to sort the rows/columns of the correlation matrix based on its diagonal elements. First, the correlation coefficient matrix (i.e., 'ccmtx') was calculated from one random matrix (i.e., 'randmtx') in the following code:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import pearsonr

def correlation_map(x, y):
    n_row_x = x.shape[0]
    n_row_y = y.shape[0]
    ccmtx_xy = np.empty((n_row_x, n_row_y))
    for n in range(n_row_x):
        for m in range(n_row_y):
            ccmtx_xy[n, m] = pearsonr(x[n, :], y[m, :])[0]
    return ccmtx_xy

randmtx = np.random.randn(100, 1000)  # generating random matrix
#ccmtx = np.corrcoef(randmtx, randmtx)  # cc matrix based on numpy.corrcoef
ccmtx = correlation_map(randmtx, randmtx)  # cc matrix based on scipy pearsonr

ccmtx_diag = np.diagonal(ccmtx)

ids, vals = np.argsort(ccmtx_diag, kind='mergesort'), np.sort(ccmtx_diag, kind='mergesort')
#ids, vals = np.argsort(ccmtx_diag, kind='quicksort'), np.sort(ccmtx_diag, kind='quicksort')

plt.plot(ids)
plt.show()
plt.plot(ccmtx_diag[ids])
plt.show()
vals[0]
The issue here is that when 'pearsonr' is used, the diagonal elements of 'ccmtx' are exactly 1.0, which makes sense. However, when 'corrcoef' is used, the diagonal elements of 'ccmtx' are not exactly one (slightly less than 1 for some entries), seemingly due to floating-point precision error.
I find it annoying that the auto-correlation matrix of a single matrix has diagonal elements that are not 1.0, since this results in the shuffling of rows/columns of the correlation matrix when it is sorted based on the diagonal elements.
My questions are:
[1] is there any good way to accelerate the computation time when I stick to use the 'pearsonr' function? (e.g., vectorized pearsonr?)
[2] Is there any good way/practice to prevent this precision error when using the 'corrcoef' in numpy? (e.g. 'decimals' option in np.around?)
I have searched for ways to calculate the correlation coefficients between all pairs of rows or columns from two matrices. However, as the algorithms contain some sort of "cov / variance" operation, this kind of precision issue seems to always exist.
Minor point: the 'mergesort' option seems to provide more reliable results than 'quicksort', as quicksort shuffled a 1D array of values exactly equal to 1 into random order.
Any thoughts/comments would be greatly appreciated!
For question 1 (a vectorized pearsonr), see the comments to the question.
I will answer only question 2: how to improve the precision of np.corrcoef.
The correlation matrix R is computed from the covariance matrix C according to

R_ij = C_ij / sqrt(C_ii * C_jj).
The implementation is optimized for performance and memory usage. It computes the covariance matrix and then performs two divisions, by sqrt(C_ii) and by sqrt(C_jj). This separate square-rooting is where the imprecision comes from. For example:
np.sqrt(3 * 3) - 3 == 0.0
np.sqrt(3) * np.sqrt(3) - 3 == -4.4408920985006262e-16
We can fix this by implementing our own simple corrcoef routine:
def corrcoef(a, b):
    c = np.cov(a, b)
    d = np.diag(c)
    return c / np.sqrt(d[:, None] * d[None, :])
Note that this implementation requires more memory than the numpy implementation, because it needs to store a temporary matrix of size n * n, and it is slightly slower because it does n^2 square roots instead of only 2n.
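A quick check of that routine (a sketch reusing the question's matrix shape; corrcoef here is the function defined just above):

import numpy as np

a = np.random.randn(100, 1000)
r = corrcoef(a, a)                           # same 200 x 200 layout as np.corrcoef(a, a)
print(np.allclose(np.diag(r), 1.0))          # True
print(np.diag(r).min(), np.diag(r).max())    # the diagonal should now be exactly 1.0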

How to implement the fast fourier transform to correlate two 2d arrays?

I want to do this with larger arrays, but so that I can understand it, I'll use a smaller example.
Let's say I have the following array:
A = [[0, 0, 100],
     [0, 0, 0],
     [0, 0, 0]]
I want to calculate the correlation of this array with another by multiplying the corresponding entries. For example, A*A would have 100*100 = 10000 in the corner and zeros everywhere else. I read that fast Fourier transforms can be used to speed this up with large arrays. From what I read, I got the impression that if I wanted to multiply two arrays A and B, as in A*B, I could do it quicker with the following (using numpy in python):
a = np.conj(np.fft.fftn(A))
b = np.fft.fftn(B)
c = np.fft.ifft(a*b)
So in effect, take the fft of A, take the fft of B, multiply the two results, and then get the inverse of that result. However, I tried it with the case of A given above, multiplying itself. I had hoped the inverse multiplication would give me
[[0, 0, 10000],
[0, 0, 0 ],
[0, 0, 0 ]]
However I got something a bit different, closer to
[[10000, 0, 0],
[10000, 0, 0],
[10000, 0, 0]]
Does anyone know what's going on? Sorry, I'm guessing there's something I'm misunderstanding about the FFT.
You should use scipy.signal.fftconvolve instead.
It is already implemented and has been extensively tested, particularly regarding the handling of the boundaries. The only additional step necessary to go from the convolution to the correlation operator in 2D is to rotate the filter array by 180° (see this answer).
If you must implement your own, you should be aware that multiplication in the frequency domain corresponds to circular convolution in the time domain. To obtain the desired linear convolution you need to pad both arrays with zeros, to a length at least twice the size of your original matrix [1]:
s = [2*x for x in np.shape(A)]
a = np.conj(np.fft.fftn(A,s))
b = np.fft.fftn(B,s)
c = np.fft.ifftn(a*b)
[1] Strictly speaking, a size of 2n-1 (instead of 2n) will do, but FFTs tend to perform better when operating on sizes that are products of small prime factors.
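For the question's A (a minimal sketch), the same correlation can be obtained with fftconvolve by rotating the second array 180° with np.flip:

import numpy as np
from scipy.signal import fftconvolve

A = np.zeros((3, 3))
A[0, 2] = 100

corr = fftconvolve(A, np.flip(A), mode='full')   # correlation of A with itself
print(np.round(corr))   # a single 10000 peak at the centre (zero lag) of the 5x5 output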
