Getting l1 normalized eigenvectors from python instead of l2? - python

Consider this matrix:
[.6, .7]
[.4, .3]
This is a Markov chain matrix; the columns each sum to 1. This can represent a population distribution, transition rates, etc.
To get the population at equilibrium, take the eigenvalues and eigenvectors...
From wolfram alpha, the eigenvalues and their corresponding eigenvectors are:
l1 = 1, v1 = [7/4, 1]
l2 = -1/10, v2 = [-1,1]
For the population at equilibrium, take the eigenvector that corresponds to the eigenvalue of 1, and scale it so the total = 1.
Vector = [7/4, 1]
Total = 11/4
So multiply the vector by 4/11...
4/11 * [7/4, 1] = [7/11, 4/11]
Therefore at equilibrium the first state has 7/11 of the population and the other state has 4/11.
If you take the desired eigenvector, [7/4, 1] and l2 normalize it (so all squared values sum up to 1), you get roughly [.868, .496].
That's all fine. But when you get the eigenvectors from python...
import numpy as np

mat = np.array([[.6, .7], [.4, .3]])
vals, vecs = np.linalg.eig(mat)
vecs = vecs.T  # transpose so each row (rather than each column) is an eigenvector
One of the eigenvectors it spits out is the [.868, .496] one, i.e. the l2 normalized one. Now, you can pretty easily rescale it so that the values themselves (rather than their squares) sum to 1... just do vector * 1/sum(vector). But is there a way to skip this step? Why add the computational expense to my script of summing up the vector each time I do this? Can you get numpy, scipy, etc. to spit out the l1 normalized vector instead of the l2 normalized vector? Also, is that the correct usage of the terms l1 and l2...?
Note: I have seen previous questions asking how to get the Markov steady states in this manner. My question is different: I am asking how to get numpy to spit out a vector normalized in the way I want, and I am explaining my reasoning by including the Markov part.

I think you're assuming that np.linalg.eig computes eigenvectors and eigenvalues like you would by hand. It doesn't. Under the hood, it uses a highly optimized (and famous) FORTRAN library called LAPACK. This library uses numerical techniques that are sort of out of scope, but long story short it doesn't compute the eigenvalues for a 2x2 like you would by hand. I believe it uses the QR algorithm most of the time, and sometimes QZ, or even others. It's not all that simple: I think it even chooses different algorithms based on the matrix structure/size sometimes (I'm not a LAPACK expert, so don't quote me here). What I do know is that LAPACK has been vetted over about 40 years and it is pretty darned fast, and with great speed comes great complexity.
Wolfram Alpha, on the other hand, is using Mathematica on the backend, which is a symbolic solver (i.e. not floating point arithmetic). That's why you get the "same" result as if you'd do it by hand.
Long story short, getting the L1 normalized vector straight out of np.linalg.eig just isn't possible: if you look at the QR algorithm, each iteration produces an L2 normalized vector (that converges to an eigenvector). You'll have trouble getting it from most numerical libraries for the simple reason that a lot of them depend on LAPACK or use similar algorithms (for instance MATLAB outputs unit vectors as well).
At the end of the day, it doesn't really matter if the vector is normalized or not. It really just has to be in the right direction. If you need to scale it for a proportion, then do that. It'll be vectorized (i.e. fast) by numpy since it's a simple multiply.
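For example, a minimal sketch of that rescaling step (picking the eigenvector whose eigenvalue is closest to 1 and scaling it so its entries sum to 1):
import numpy as np

mat = np.array([[.6, .7], [.4, .3]])
vals, vecs = np.linalg.eig(mat)

# Column of `vecs` whose eigenvalue is (numerically) closest to 1.
v = vecs[:, np.argmin(np.abs(vals - 1))]

# Rescale the L2 normalized vector to the L1 normalized one.
stationary = v / v.sum()
print(stationary)  # approximately [7/11, 4/11]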
HTH.

Related

Calculating roots of multiple polynomials in numpy without using a loop

I can use the polyfit() method with a 2D array as input, to calculate polynomials on multiple data sets in a fast manner. After getting these multiple polynomials, I want to calculate the roots of all of these polynomials, in a fast manner.
There is the numpy.roots() method for finding the roots of a single polynomial, but this method does not work with 2D inputs (meaning multiple polynomials). I am working with millions of polynomials, so I would like to avoid looping over all of them using a for loop, map or comprehension because that takes minutes. I would prefer a vectorized numpy operation or a series of vectorized operations.
An example code for inefficient calculation:
import numpy as np

POLYNOMIAL_COUNT = 1000000

# Create a polynomial of second order with coefficients 2, 3 and 4
coefficients = np.array([[2, 3, 4]])

# Let's say we have the same polynomial multiple times, represented as a 2D array.
# In reality the polynomial coefficients will be different from each other,
# but they will be the same order.
coefficients = coefficients.repeat(POLYNOMIAL_COUNT, axis=0)

# Calculate roots of these same-order polynomials.
# Looping here takes too much time.
roots = []
for i in range(POLYNOMIAL_COUNT):
    roots.append(np.roots(coefficients[i]))
Is there a way to find the roots of multiple same-order polynomials using numpy, but without looping?
For the special case of polynomials up to the fourth order, you can solve in a vectorized manner. Anything higher than that does not have an analytical solution, so it requires iterative optimization, which is fundamentally unlikely to be vectorizable since different rows may require a different number of iterations. As @John Coleman suggests, you might be able to get away with using the same number of steps for each one, but will likely have to sacrifice accuracy to do so.
That being said, here is an example of how to vectorize the second order case:
d = coefficients[:, 1:-1]**2 - 4.0 * coefficients[:, ::2].prod(axis=1, keepdims=True)
roots = -0.5 * (coefficients[:, 1:-1] + [1, -1] * np.emath.sqrt(d)) / coefficients[:, :1]
If I got the order of the coefficients wrong, replace coefficients[:, :1] with coefficients[:, -1:] in the denominator of the last assignment. Using np.emath.sqrt is nice because it will automatically return a complex128 result when your discriminant d is negative anywhere, and a normal float64 result when all roots are real.
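As a quick sanity check of the coefficient order, one can compare a few rows against np.roots (which expects highest degree first):
import numpy as np

coefficients = np.array([[2.0, 3.0, 4.0],
                         [1.0, -3.0, 2.0],
                         [1.0, 0.0, -4.0]])

d = coefficients[:, 1:-1]**2 - 4.0 * coefficients[:, ::2].prod(axis=1, keepdims=True)
roots = -0.5 * (coefficients[:, 1:-1] + [1, -1] * np.emath.sqrt(d)) / coefficients[:, :1]

# Each pair of lines should print the same roots (up to ordering and rounding).
for row, vectorized in zip(coefficients, roots):
    print(np.sort_complex(np.roots(row)), np.sort_complex(vectorized))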
You can implement a third order solution or a fourth order solution in a similar manner.

How to find A in a Matrix multiplication Ax=b, with some Values of A known, and A being left stochastic

I have been trying to find an answer to this problem for a couple of hours now, but I can't find anything so far...
So I have two vectors, let's call them b and x, of which I know all values. They add up to the same amount, so sum(b) = sum(x).
I also have a matrix, let's call it A, of which I know which values are 0; all the other values are unknown (but different from 0).
Furthermore, the elements of each column of A sum to 1 (I think that means A is a left stochastic matrix).
Generally the Equation can be written in the form A*x = b.
Now I'm trying to find the missing values of A.
I have found one answer to the general problem here: https://math.stackexchange.com/questions/1170843/solving-ax-b-when-x-and-b-are-given
Furthermore, I looked at the documentation of numpy.linalg: https://docs.scipy.org/doc/numpy/reference/routines.linalg.html, but I just can't figure out how to do it.
It looks similar to a multiple linear regression problem, but I couldn't find anything on sklearn either: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LinearRegression.html#sklearn.linear_model.LinearRegression
Not a complete answer, but a bit of a more formal statement of the problem.
I think this can be solved as just a system of linear equations. Let
NZ = {(i,j)|a(i,j) is not fixed to zero}
Then write:
sum( j | (i,j) ∈ NZ, a(i,j) * x(j) ) = b(i) ∀i
sum( i | (i,j) ∈ NZ, a(i,j)) = 1 ∀j
This is just a system of linear equations in a(i,j). It may be under- (or over-) determined and it may be sparse; how best to solve it depends a bit on that. It may also be possible to think of these as constraints in a linear (or quadratic) programming problem. That would allow you to add an objective (in case of an underdetermined system, or, for an overdetermined one, minimize the sum of squared deviations or the 1-norm of deviations). In addition we can add bounds on a(i,j) (e.g. lower bounds of zero and upper bounds of one). So a linear programming approach may be what you are looking for; see the sketch below.
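A minimal sketch of that idea, assuming a small made-up example (the positions in nz and the vectors x and b are hypothetical): it stacks the row equations and the column-sum equations into one linear system in the unknown a(i,j) and solves it as a bounded least-squares problem with scipy.optimize.lsq_linear.
import numpy as np
from scipy.optimize import lsq_linear

# Hypothetical data: nz lists the (i, j) positions of A that are NOT fixed to zero.
nz = [(0, 0), (0, 1), (1, 1), (1, 2), (2, 0), (2, 2)]
n = 3                                  # A is n x n here
x = np.array([2.0, 3.0, 5.0])          # known vector x
b = np.array([4.0, 3.0, 3.0])          # known vector b, with sum(b) == sum(x)

# One unknown per nonzero position; stack the two groups of equations:
#   row equations:        sum_j a(i,j) * x(j) = b(i)
#   column-sum equations: sum_i a(i,j)        = 1
M = np.zeros((2 * n, len(nz)))
rhs = np.concatenate([b, np.ones(n)])
for k, (i, j) in enumerate(nz):
    M[i, k] = x[j]        # contribution to row equation i
    M[n + j, k] = 1.0     # contribution to column-sum equation j

# Bounded least squares keeps each a(i,j) between 0 and 1.
res = lsq_linear(M, rhs, bounds=(0.0, 1.0))

A = np.zeros((n, n))
for k, (i, j) in enumerate(nz):
    A[i, j] = res.x[k]
print(A)
print(A.sum(axis=0))   # columns should sum to (approximately) 1
print(A @ x)           # should be (approximately) b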
This problem looks a bit like matrix balancing. This is used a lot for economic data sets that come from different sources and where we want to reconcile the data to get a consistent data set usable for subsequent modeling.

Using networkx to calculate eigenvector centrality

I'm trying to use networkx to calculate the eigenvector centrality of my graph:
import networkx as nx
import pandas as pd
import numpy as np
a = nx.eigenvector_centrality(my_graph)
But I get the error:
NetworkXError: eigenvector_centrality():
power iteration failed to converge in %d iterations."%(i+1))
What is the problem with my graph?
TL/DR: try nx.eigenvector_centrality_numpy.
Here's what's going on: nx.eigenvector_centrality relies on power iteration. The actions it takes are equivalent to repeatedly multiplying a vector by the same matrix (and then normalizing the result). This usually converges to the eigenvector of the largest eigenvalue. However, it fails when there are multiple eigenvalues with the same (largest) magnitude.
Your graph is a star graph. There are multiple "largest" eigenvalues for a star graph. In the case of a star with just two "peripheral nodes" you can easily check that sqrt(2) and -sqrt(2) are both eigenvalues. More generally sqrt(N) and -sqrt(N) are both eigenvalues, and the other eigenvalues have smaller magnitude. I believe that for any bipartite network, this will happen and the standard algorithm will fail.
The mathematical reason is that after n rounds of iteration, the solution looks like the sum of c_i lambda_i^n v_i / K_n, where c_i is a constant that depends on the initial guess, lambda_i is the i-th eigenvalue, v_i is its eigenvector and K_n is a normalization factor (applied to all terms in the sum). When there is a dominant eigenvalue, lambda_i^n / K_n goes to a nonzero constant for the dominant eigenvalue and to 0 for the others.
However, in your case you have two equally large eigenvalues, one positive (lambda_1) and the other negative (lambda_2 = -lambda_1). The contribution of the smaller eigenvalues still goes to zero, but you're left with (c_1 lambda_1^n v_1 + c_2 lambda_2^n v_2) / K_n. Using lambda_2 = -lambda_1 you are left with lambda_1^n (c_1 v_1 + (-1)^n c_2 v_2) / K_n. Then K_n -> lambda_1^n and this "converges" to c_1 v_1 + (-1)^n c_2 v_2. However, each time you iterate you go from adding some multiple of v_2 to subtracting that multiple, so it doesn't really converge.
So the simple eigenvector_centrality that networkx uses won't work. You can instead use nx.eigenvector_centrality_numpy so that numpy is used. That will get you v_1.
Note: With a quick look at the documentation, I'm not 100% positive that the numpy algorithm is guaranteed to return the eigenvector of the largest (positive) eigenvalue. It uses a numpy routine to find an eigenvector, but I don't see a guarantee in its documentation that this is the dominant one. Most algorithms for finding a single eigenvector will return the dominant eigenvector, so you're probably alright.
We can add a check: as long as nx.eigenvector_centrality_numpy returns all positive values, the Perron-Frobenius theorem guarantees that this corresponds to the largest eigenvalue. If some are zero, it gets a bit trickier to be sure, and if some are negative then it is not the dominant eigenvector.
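A minimal sketch of that workaround plus the positivity check, using a star graph as a stand-in for the failing graph:
import networkx as nx

# A star graph reproduces the failure described above: its adjacency spectrum
# contains both +sqrt(N) and -sqrt(N), so plain power iteration oscillates.
my_graph = nx.star_graph(10)

centrality = nx.eigenvector_centrality_numpy(my_graph)

# Perron-Frobenius check: all-positive entries mean this really is the
# eigenvector of the largest (positive) eigenvalue.
print(all(value > 0 for value in centrality.values()))
print(centrality)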

scipy / numpy linalg.eigvals result interpretation

I am a newbie when it comes to using python libraries for numerical tasks. I was reading a paper on LexRank and wanted to know how to compute eigenvectors of a transition matrix. I used the eigvals function and got a result that I have a hard time interpreting:
import numpy
from numpy import linalg as LA

a = numpy.array([[0.333, 0.333, 0.0,   0.333],
                 [0.25,  0.25,  0.25,  0.25 ],
                 [0.5,   0.0,   0.0,   0.5  ],
                 [0.0,   0.333, 0.333, 0.333]])
print(LA.eigvals(a))
and the eigenvalues are:
[ 0.99943032+0.j
-0.13278637+0.24189178j
-0.13278637-0.24189178j
0.18214242+0.j ]
Can anyone please explain what j is doing here? Isn't the eigenvalue supposed to be a scalar quantity? How can I interpret this result broadly?
j is the imaginary number, the square root of minus one. In math it is often denoted by i; in engineering, and in Python, it is denoted by j.
A single eigenvalue is a scalar quantity, but an (m, m) matrix will have m eigenvalues (and m eigenvectors). The Wiki page on eigenvalues and eigenvectors has some examples that might help you to get your head around the concepts.
As @unutbu mentions, j denotes the imaginary unit in Python. In general, a matrix may have complex eigenvalues (i.e. with real and imaginary components) even if it contains only real values (see here, for example). Symmetric real-valued matrices are an exception, in that they are guaranteed to have only real eigenvalues.
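For the transition-matrix use case in the question, a minimal sketch of picking out the dominant (close to 1) eigenvalue and its eigenvector while ignoring the complex-conjugate pair, assuming the rows of a hold the transition probabilities (they sum to roughly 1), so the stationary distribution is a left eigenvector:
import numpy
from numpy import linalg as LA

a = numpy.array([[0.333, 0.333, 0.0,   0.333],
                 [0.25,  0.25,  0.25,  0.25 ],
                 [0.5,   0.0,   0.0,   0.5  ],
                 [0.0,   0.333, 0.333, 0.333]])

vals, vecs = LA.eig(a.T)     # left eigenvectors of a = right eigenvectors of a.T
i = numpy.argmax(vals.real)  # index of the dominant (real, ~1) eigenvalue
p = vecs[:, i].real          # its eigenvector; the imaginary part is ~0 here
p = p / p.sum()              # scale to a probability distribution
print(vals[i], p)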

pseudo inverse of sparse matrix in python

I am working with data from neuroimaging and because of the large amount of data, I would like to use sparse matrices for my code (scipy.sparse.lil_matrix or csr_matrix).
In particular, I will need to compute the pseudo-inverse of my matrix to solve a least-squares problem.
I have found the method sparse.lsqr, but it is not very efficient. Is there a method to compute the Moore-Penrose pseudo-inverse (the equivalent of pinv for dense matrices)?
The size of my matrix A is about 600,000 x 2,000 and in every row of the matrix I'll have from 0 up to 4 non-zero values. The matrix A size is given by voxel x fiber bundle (white matter fiber tracts), and we are expecting a maximum of 4 tracts to cross in a voxel. In most of the white matter voxels we expect to have at least 1 tract, but I would say that around 20% of the rows could be all zeros.
The vector b should not be sparse; actually, b contains the measure for each voxel, which is in general not zero.
I would need to minimize the error, but there are also some conditions on the vector x. As I tried the model on smaller matrices, I never needed to constrain the system in order to satisfy these conditions (in general 0
Is that of any help? Is there a way to avoid taking the pseudo-inverse of A?
Thanks
Update 1st June:
Thanks again for the help.
I can't really show you anything about my data, because the code in Python gives me some problems. However, in order to understand how I could choose a good k, I've tried to create a testing function in Matlab.
The code is as follow:
F = zeros(100000, 1000);
for k = 1:150000
    p = rand(1);
    a = 0;
    b = 0;
    while a <= 0 || b <= 0
        a = random('Binomial', 100000, p);
        b = random('Binomial', 1000, p);
    end
    F(a, b) = rand(1);
end
solution = repmat([0.5, 0.5, 0.8, 0.7, 0.9, 0.4, 0.7, 0.7, 0.9, 0.6], 1, 100);
size(solution)
solution = solution';
measure = F * solution;
% check = pinvF * measure;
k = 250;
F = sparse(F);
[U, S, V] = svds(F, k);
s = svds(F, k);
plot(s)
max(max(U*S*V' - F))
for s = 1:k
    if S(s, s) ~= 0
        S(s, s) = 1 / S(s, s);
    end
end
inv = V * S' * U';
inv * measure
max(inv*measure - solution)
Do you have any idea of what k should be compared to the size of F? I've taken 250 (out of 1000) and the results are not satisfactory (the waiting time is acceptable, but not short).
Also, now I can compare the results with the known solution, but how could one choose k in general?
I also attached the plot of the 250 singular values that I get and their squares normalized. I don't know exactly how to do a better scree plot in Matlab. I'm now proceeding with bigger k to see if suddenly the values become much smaller.
Thanks again,
Jennifer
You could study more on the alternatives offered in scipy.sparse.linalg.
Anyway, please note that a pseudo-inverse of a sparse matrix is most likely to be a (very) dense one, so it's not really a fruitful avenue (in general) to follow, when solving sparse linear systems.
You may like to describe your particular problem (dot(A, x) = b + e) in a slightly more detailed manner. At least specify:
'typical' size of A
'typical' percentage of nonzero entries in A
least-squares implies that norm(e) is minimized, but please indicate whether your main interest is on x_hat or on b_hat, where e= b- b_hat and b_hat= dot(A, x_hat)
Update: If you have some idea of the rank of A (and it's much smaller than the number of columns), you could try the total least squares method. Here is a simple implementation, where k is the number of leading singular values and vectors to use (i.e. the 'effective' rank).
from scipy.sparse import hstack
from scipy.sparse.linalg import svds

def tls(A, b, k=6):
    """A TLS (total least squares) solution of Ax = b, for sparse A."""
    u, s, v = svds(hstack([A, b]), k)
    return v[-1, :-1] / -v[-1, -1]
Regardless of the answer to my comment, I would think you could accomplish this fairly easily using the Moore-Penrose SVD representation. Find the SVD with scipy.sparse.linalg.svds, replace Sigma by its pseudoinverse, and then multiply V*Sigma_pi*U' to find the pseudoinverse of your original matrix.
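A minimal sketch of that approach (the matrix here is random and the value of k is an assumption you would tune for your own data):
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

A = sparse_random(10000, 2000, density=0.002, format='csr', random_state=0)
k = 50                                       # number of singular triplets to keep

u, s, vt = svds(A, k=k)                      # u: (m, k), s: (k,), vt: (k, n)
s_inv = np.where(s > 1e-12, 1.0 / s, 0.0)    # pseudo-invert Sigma, skipping (near) zeros
A_pinv = (vt.T * s_inv) @ u.T                # V * Sigma^+ * U', a dense (n, m) array

b = np.random.rand(A.shape[0])
x_hat = A_pinv @ b                           # least-squares solution within the rank-k subspace
print(x_hat.shape)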
