Difference between scipy.linalg.expm and a hand-coded implementation - Python

I was trying to implement the matrix exponential function as in scipy.linalg.expm. I took inspiration from kaityo256's GitHub repository and wrote the following.
from scipy.linalg import expm
from scipy.linalg import eigh
from scipy.linalg import inv
from math import exp as math_exp
from numpy import array, zeros
from numpy.random import random_sample
from numpy.testing import assert_allclose
def diag2sqr(x):
    '''Makes a square matrix from a diagonal one.

    Takes a 1d matrix. Determines its data type.
    Finds out the shape of the 1d matrix.
    Makes an empty square matrix with both
    dimensions equal to the largest (nonzero) dimension of
    the 1d matrix. It then fills the elements of the
    1d matrix into the diagonal slots of the empty
    square one.

    Parameters
    ----------
    x : ndarray
        ndarray to be converted to a square ndarray

    Returns
    -------
    xsqr : ndarray
        ndarray with diagonal the same as that of x,
        all other elements zero,
        dtype the same as that of x
    '''
    x_flat = x.ravel()
    xsqr = zeros((x_flat.shape[0], x_flat.shape[0]), dtype=x.dtype)
    # Making the empty matrix
    for i in range(x_flat.shape[0]):
        xsqr[i, i] = x_flat[i]
        # filling up the ith element
    print('xsqr', xsqr)
    return xsqr
def kaityo_expm(x):
    '''Exponentiates an ndarray (kaityo).

    Exponentiates an ndarray in the most naive way.

    Parameters
    ----------
    x : ndarray
        The ndarray to be exponentiated

    Returns
    -------
    kexpm : ndarray
        x after exponentiation
    '''
    rx, ux = eigh(x)
    # Find eigenvalues and eigenvectors
    # eigenvectors composed to form a unitary
    ux_inv = inv(ux)
    # Inverse of the unitary
    # tx = diag([array([math_exp(i) for i in rx]).ravel()])
    # tx = array([math_exp(i) for i in rx])
    tx = diag2sqr(array([math_exp(i) for i in rx]))
    # Constructing the diagonal matrix
    kexpm1 = tx @ ux_inv
    kexpm = ux @ kexpm1
    return kexpm
Afterwards, I tested the above code against scipy.linalg.expm.
x = random_sample((10, 10))
assert_allclose(expm(x), kaityo_expm(x))
This leads to the following output.
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0
Mismatch: 100%
Max absolute difference: 7.04655733
Max relative difference: 0.59875635
x: array([[18.032424, 16.224408, 12.432163, 16.614248, 12.85653 , 13.705387,
15.096966, 10.577946, 18.399573, 17.938062],
[16.352809, 17.525898, 12.79079 , 16.295562, 13.512996, 14.407979,...
y: array([[18.649103, 13.157682, 11.264763, 16.099163, 15.2293 , 17.854499,
11.691586, 13.412066, 15.023189, 15.598455],
[13.157682, 13.612502, 9.628261, 12.659313, 13.559437, 13.382417,..
Obviously, the two implementations differ.
The questions are as follows:
Is it acceptable for them to differ?
Is my implementation wrong?
If my implementation is wrong, how do I fix it?
If my implementation is correct, when is it safe to use scipy.linalg.expm?
I have seen the following questions:
Matrix exponentiation with scipy: expm, expm2 and expm3

From a mathematical point of view, the exponential of a matrix is defined through the Taylor series of the exponential:

exp(A) = sum_{k=0}^{inf} A^k / k!

If A is a diagonal square matrix, the series reduces to exponentiating each diagonal entry:

exp(A) = diag(exp(a_11), ..., exp(a_nn))

The problem arises when A is a generic square matrix, so before taking the exponential you need to diagonalize it using its eigenvalues and eigenvectors:

A = U * Lambda * U^(-1)

with U the matrix of eigenvectors and Lambda the matrix with the eigenvalues on the diagonal. At this point we are close to finding the exponential of the matrix:

exp(A) = U * exp(Lambda) * U^(-1)

Now let's implement this result in a simple script:
>>> import numpy as np
>>> import scipy.linalg as ln
>>> A = [[2/3, -4/3, 2],
...      [5/6, 4/3, -2],
...      [5/6, -2/3, 0]]
>>> A = np.matrix(A)
>>> print(A)
[[ 0.66666667 -1.33333333 2. ]
[ 0.83333333 1.33333333 -2. ]
[ 0.83333333 -0.66666667 0. ]]
>>> eigvalue, eigvectors = np.linalg.eig(A)
>>> print("eigvalue: ", eigvalue)
>>> print("eigvectors:")
>>> print(eigvectors)
eigvalue: [ 1. -1. 2.]
eigvectors:
[[ 0.81649658 0.27216553 0.87287156]
[ 0.40824829 -0.68041382 -0.21821789]
[ 0.40824829 -0.68041382 0.43643578]]
>>> e_Lambda = np.eye(np.size(A, 0))*(np.exp(eigvalue))
>>> print(e_Lambda)
[[2.71828183 0. 0. ]
[0. 0.36787944 0. ]
[0. 0. 7.3890561 ]]
>>> e_A = eigvectors*e_Lambda*eigvectors.I
>>> print(e_A)
[[ 2.3265481 -6.22769903 7.01116649]
[ 0.97933433 4.27520659 -3.51559341]
[ 0.97933433 -3.11384951 3.87346269]]
>>> e_A2 = ln.expm(A)
>>> print(e_A2)
[[ 2.3265481 -6.22769903 7.01116649]
[ 0.97933433 4.27520659 -3.51559341]
[ 0.97933433 -3.11384951 3.87346269]]
>>> np.testing.assert_allclose(e_A, e_A2)
>>> print(e_A - e_A2)
[[-1.77635684e-15 1.77635684e-15 -8.88178420e-16]
[ 4.44089210e-16 -1.77635684e-15 8.88178420e-16]
[-2.22044605e-16 0.00000000e+00 4.44089210e-16]]
We see that the results are essentially the same, so I think it is safe to use scipy.linalg.expm for matrix exponentiation.
I created a repo with the notebook for further testing.
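As a side note (not part of the original answer), the same eigendecomposition idea can be checked against the 10 x 10 random matrix from the question. The sketch below uses np.linalg.eig rather than eigh, since scipy.linalg.eigh assumes a symmetric/Hermitian input while random_sample((10, 10)) is generally not symmetric; it also assumes the random matrix happens to be diagonalizable, which is almost always the case:

import numpy as np
from scipy.linalg import expm
from numpy.testing import assert_allclose

x = np.random.random_sample((10, 10))               # generally not symmetric
w, v = np.linalg.eig(x)                              # eigenvalues may be complex
my_expm = v @ np.diag(np.exp(w)) @ np.linalg.inv(v)
# for a real input the imaginary parts should only be roundoff noise
assert_allclose(expm(x), my_expm.real, rtol=1e-6, atol=1e-8)

In most runs this passes, which suggests that the mismatch in the question comes from applying eigh to a non-symmetric matrix rather than from scipy.linalg.expm itself.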

Related

How to put one entry across an entire diagonal for a sparse matrix in Python

I am seeking to construct a matrix of which I will calculate the inverse. This will be used in an implicit method for solving a nonlinear parabolic PDE. My current calculations are giving me a singular (non-invertible) matrix, for reasons that will become obvious. For context, in reality the matrix will be of dimension 30 by 30, but in these examples I am using smaller matrices for testing purposes.
Say I want to create a large square sparse matrix. Using spdiags only allows you to input members of the main, lower and upper diagonals individually. So how do you make it so that each diagonal has one value for all its entries?
Example Code:
import numpy as np
from scipy.sparse import spdiags
from numpy.linalg import inv
updiag = -0.25
diag = 0.5
lowdiag = -0.25
Jdata = np.array([[diag], [lowdiag], [updiag]])
Diags = [0, -1, 1]
J = spdiags(Jdata, Diags, 3, 3).toarray()
print(J)
inverseJ = inv(J)
print(inverseJ)
This would produce a 3 x 3 matrix, but only with the first entry of each diagonal filled in. I wondered about using np.fill_diagonal, but that would require an existing matrix and only fills the main diagonal. Am I misunderstanding something?
The first argument of spdiags is a matrix of values to be used as the diagonals. You can use it this way:
Jdata = np.array([3 * [diag], 3 * [lowdiag], 3 * [updiag]])
Diags = [0, -1, 1]
J = spdiags(Jdata, Diags, 3, 3).toarray()
print(J)
# [[ 0.5 -0.25 0. ]
# [-0.25 0.5 -0.25]
# [ 0. -0.25 0.5 ]]
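As an aside (not in the original answer), scipy.sparse.diags broadcasts scalar values along each diagonal when a shape is given, which can read a little more directly; a minimal sketch:

from scipy.sparse import diags

# one scalar per diagonal, broadcast to the requested shape
J = diags([0.5, -0.25, -0.25], [0, -1, 1], shape=(3, 3)).toarray()
print(J)
# [[ 0.5  -0.25  0.  ]
#  [-0.25  0.5  -0.25]
#  [ 0.   -0.25  0.5 ]]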

Python: numpy.linalg.LinAlgError: Last 2 dimensions of the array must be square

I have a matrix that looks something like this:
a =
[[ 3.14333470e-02 3.11644303e-02 3.03622814e-02 2.90406252e-02
2.72220757e-02 2.49377488e-02 2.22267299e-02 1.91354055e-02
1.57166688e-02 1.20290155e-02 8.13554227e-03 4.10286765e-03
-8.19802426e-09 -4.10288390e-03 -8.13555810e-03 -1.20290306e-02
-1.57166830e-02 -1.91354185e-02 -2.22267415e-02 -2.49377588e-02
-2.72220839e-02 -2.90406315e-02 -3.03622856e-02 -3.11644324e-02
-3.14333470e-02]
[ 0.00000000e+00 2.90117128e-03 5.75270270e-03 8.50580375e-03
1.11133681e-02 1.35307796e-02 1.57166756e-02 1.76336548e-02
1.92489172e-02 2.05348252e-02 2.14693765e-02 2.20365808e-02
2.22267328e-02 2.20365792e-02 2.14693735e-02 2.05348208e-02
1.92489114e-02 1.76336477e-02 1.57166674e-02 1.35307704e-02
1.11133581e-02 8.50579304e-03 5.75269150e-03 2.90115979e-03
-1.15937571e-08]
[ 0.00000000e+00 2.90117128e-03 5.75270270e-03 8.50580375e-03
1.11133681e-02 1.35307796e-02 1.57166756e-02 1.76336548e-02
1.92489172e-02 2.05348252e-02 2.14693765e-02 2.20365808e-02
2.22267328e-02 2.20365792e-02 2.14693735e-02 2.05348208e-02
1.92489114e-02 1.76336477e-02 1.57166674e-02 1.35307704e-02
1.11133581e-02 8.50579304e-03 5.75269150e-03 2.90115979e-03
-1.15937571e-08]]
and I want to calculate the eigenvalues and eigenvectors
w, v = numpy.linalg.eig(a)
How can I do this?
You cannot directly compute the eigenvalues of the matrix since it is not square. In order to find the eigenvalues and eigenvectors, the matrix has to be diagonalized, which involves taking a matrix inversion at an intermediate step, and only square matrices are invertible.
In order to find the eigenvalues from a non-square matrix you can compute the singular value decomposition (in numpy: np.linalg.svd). You can then relate the singular values with the eigenvalues as explained here, or here. Quoting one of the answers:
Definition: The singular values of an m×n matrix A are the positive square roots of the nonzero eigenvalues of the corresponding matrix A.T*A. The corresponding eigenvectors are called the singular vectors.
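A minimal sketch (not from the original answer) illustrating that relation, reusing the 3 x 4 example array from the answer below:

import numpy as np

a = np.array([[1, 7, 3, 9],
              [3, 1, 5, 1],
              [4, 2, 6, 3]], dtype=float)

s = np.linalg.svd(a, compute_uv=False)   # singular values of a
evals = np.linalg.eigvalsh(a.T @ a)      # eigenvalues of A^T A (symmetric, one is ~0 here)

print(np.sort(s**2))                     # squares of the singular values...
print(np.sort(evals)[1:])                # ...match the nonzero eigenvalues of A^T A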
Your array is not square; just pad it with zeros to make it square.
import numpy
a = numpy.array(([1,7,3,9],[3,1,5,1],[4,2,6,3]))
# fill with zeros to get a square matrix
z = numpy.zeros((max(a.shape), max(a.shape)))
z[:a.shape[0],:a.shape[1]] = a
a = z
w, v = numpy.linalg.eig(a)
print(w)
print(v)
Out:
[10.88979431 -2.23132083 -0.65847348 0. ]
[[-0.55662903 -0.89297739 -0.8543584 -0.58834841]
[-0.50308806 0.25253601 -0.0201359 -0.58834841]
[-0.66111007 0.37258146 0.51929401 0.39223227]
[ 0. 0. 0. 0.39223227]]

Second-order cooccurrence of terms in texts

Basically, I want to reimplement this video.
Given a corpus of documents, I want to find the terms that are most similar to each other.
I was able to generate a cooccurrence matrix using this SO thread and used the video to generate an association matrix. Next, I would like to generate a second-order cooccurrence matrix.
Problem statement: Consider a matrix where the rows of the matrix correspond to a term and the entries in the rows correspond to the top k terms similar to that term. Say, k = 4, and we have n terms in our dictionary, then the matrix M has n rows and 4 columns.
HAVE:
M = [[18, 34, 54, 65],  # Term IDs similar to Term t_0
     [18, 12, 54, 65],  # Term IDs similar to Term t_1
     ...
     [21, 43, 55, 78]]  # Term IDs similar to Term t_n
So, M contains for each term ID the most similar term IDs. Now, I would like to check how many of those similar terms match. In the example of M above, terms t_0 and t_1 are quite similar, because three out of four terms match, whereas terms t_0 and t_n are not similar, because no terms match. Let's write M as a series of lists.
M = [list_0,   # Term IDs similar to Term t_0
     list_1,   # Term IDs similar to Term t_1
     ...
     list_n]   # Term IDs similar to Term t_n
WANT:
C = [[f(list_0, list_0), f(list_0, list_1), ..., f(list_0, list_n)],
     [f(list_1, list_0), f(list_1, list_1), ..., f(list_1, list_n)],
     ...
     [f(list_n, list_0), f(list_n, list_1), ..., f(list_n, list_n)]]
I'd like to find the matrix C that has, as its elements, a function f applied to the lists of M. f(a, b) measures the degree of similarity between two lists a and b. Going with the example above, the degree of similarity between t_0 and t_1 should be high, whereas the degree of similarity of t_0 and t_n should be low.
My questions:
What is a good choice for comparing the ordering of two lists? That is, what is a good choice for function f?
Is there a transformation already available that takes as an input a matrix like M and produces a matrix like C? Preferably a python package?
Thank you, r0f1
In fact, cosine similarity might not be too bad in this case. The problem is, that you don't want to use the index vectors (i.e. [18,34,54,65] and so on in your case), but you want vectors of length n that are zero everywhere except for the values in your index vector. Luckily, you don't have to create those vectors explicitly, but you can just count how many indices the two index vectors have in common:
def f(u, v):
    return len(set(u).intersection(set(v)))
Here, I omitted a constant normalization factor k. There are some more elaborate things that one could do (for example the TF-IDF kernel), but I would stay with this for the start.
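For reference, a normalized variant is a one-line change (just a sketch, with k being the fixed list length mentioned above):

def f_normalized(u, v, k=4):
    # overlap count scaled by the list length, giving a value in [0, 1]
    return len(set(u).intersection(set(v))) / k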
In order to run this efficiently using numpy, you would want to do two things:
Convert f to a ufunc, i.e. a numpy vectorized function. You can do that by uf = np.frompyfunc(f, 2, 1) (assuming that you did import numpy as np at some point).
Store M as a 1d array of lists (basically what you state in your second code listing). That's a little more tricky, because numpy is trying to be smart here, but you want something else. So here is how to do that:
n = len(M)
Marray = np.empty(n, dtype='O')  # dtype='O' allows you to have elements of type list
for i in range(n):
    Marray[i] = M[i]
Now, Marray contains essentially what you described in your second code listing. You can then use the new ufunc's outer method to get your similarity matrix. Here is how all of that would work together for your M from the example (assuming n=3):
M = [[18, 34, 54, 65],
     [18, 12, 54, 65],
     [21, 43, 55, 78]]
n = len(M)  # i.e. 3
uf = np.frompyfunc(f, 2, 1)
Marray = np.empty(n, dtype='O')
for i in range(n):
    Marray[i] = M[i]
similarities = uf.outer(Marray, Marray).astype('d')  # convert to float instead of object dtype
print(similarities)
# array([[4., 3., 0.],
#        [3., 4., 0.],
#        [0., 0., 4.]])
I hope that answers your questions.
You asked two questions, one somewhat open-ended (the first one) and the other with a definitive answer, so I will start with the second one:
Is there a transformation already available that takes as an input a
matrix like M and produces a matrix like C? Preferably, a python
package?
The answer is yes: the scipy.spatial.distance module contains a function that takes a matrix like M and produces a matrix like C. The following example shows the function:
import numpy as np
from scipy.spatial.distance import pdist, squareform
# initial data
M = [[18, 34, 54, 65],
     [18, 12, 54, 65],
     [21, 43, 55, 78]]
# convert to numpy array
arr = np.array(M)
result = squareform(pdist(arr, metric='euclidean'))
print(result)
Output
[[ 0. 22. 16.1245155 ]
[22. 0. 33.76388603]
[16.1245155 33.76388603 0. ]]
As seen from the example above, pdist takes the matrix M and generates a C matrix. Note that the output of pdist is a condensed distance matrix, so you need to convert it to square form using squareform. Now on to the other question:
What is a good choice for comparing the ordering of two lists? That
is, what is a good choice for function f?
Given that order does matter in your particular case, I suggest you look at rank correlation coefficients such as Kendall's or Spearman's; both are provided in the scipy.stats package, along with a whole bunch of other coefficients. Usage example:
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.stats import kendalltau, spearmanr
# distance functions
kendall = lambda x, y: kendalltau(x, y)[0]
spearman = lambda x, y: spearmanr(x, y)[0]
# initial data
M = [[18, 34, 54, 65],
     [18, 12, 54, 65],
     [21, 43, 55, 78]]
# convert to numpy array
arr = np.array(M)
# compute kendall C and convert to square form
kendall_result = 1 - squareform(pdist(arr, kendall))  # subtract from 1 because you want a similarity
print(kendall_result)
print()
# compute spearman C and convert to square form
spearman_result = 1 - squareform(pdist(arr, spearman))  # subtract from 1 because you want a similarity
print(spearman_result)
print()
Output
[[1. 0.33333333 0. ]
[0.33333333 1. 0.33333333]
[0. 0.33333333 1. ]]
[[1. 0.2 0. ]
[0.2 1. 0.2]
[0. 0.2 1. ]]
If those do not fit your needs you can take a look at the Hamming distance, for example:
import numpy as np
from scipy.spatial.distance import pdist, squareform
# initial data
M = [[18, 34, 54, 65],
     [18, 12, 54, 65],
     [21, 43, 55, 78]]
# convert to numpy array
arr = np.array(M)
# compute match_rank C and convert to square form
result = 1 - squareform(pdist(arr, 'hamming'))
print(result)
Output
[[1. 0.75 0. ]
[0.75 1. 0. ]
[0. 0. 1. ]]
In the end the choice of the similarity function will depend on your final application, so you will need to try out different functions and see the ones that fit your needs. Both scipy.spatial.distance and scipy.stats provide a plethora of distance and coefficient functions you can try out.
Further
The following paper contains a section on list similarity
I would suggest cosine similarity, as every list is a vector.
from sklearn.metrics.pairwise import cosine_similarity
cosine_similarity([list0], [list1])  # wrap the lists so the input is 2-D, as sklearn expects

Numpy matrix multiplication behaviour

I have a problem understanding matrix multiplication in numpy.
For example I have the following matrix (2d numpy array):
a = [ [ 1. 1. ]
[ 1. 2. ]
[ 1. 3. ] ]
And the following row vector theta:
theta = [ 1. 1. ]
The only way to multiply a with theta, I thought, would be to transform
theta into a column vector first, and then I would get the result:
result = [ [ 2. ]
[ 3. ]
[ 4. ] ]
When I multiply the matrix and the row vector (without transforming)
result = np.dot(a,theta)
I get this:
result = [ 2. 3. 4. ]
How is this even possible? I mean, I didn't transform the matrix.
Can you please tell me how this numpy multiplication works?
Thank you for your attention.
No, you're multiplying a numpy array with another numpy array (not a matrix with a vector), although it looks like that. This is because, in essence, numpy arrays are not the same as matrices, and the dot product treats them accordingly.
If you write out the array and multiply, then you will see why: the result is just the dot product (element-wise multiplication followed by a sum) of each row of the array a with the vector theta.
PS: (matrices are 2-D while arrays are not limited to any dimension)
Also, please take a look at this answer and this excellent answer
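To make the shape behaviour concrete, here is a minimal sketch (not part of the original answer) of the two cases:

import numpy as np

a = np.array([[1., 1.],
              [1., 2.],
              [1., 3.]])
theta = np.array([1., 1.])                 # 1-D array, shape (2,)

print(np.dot(a, theta))                    # shape (3,): [2. 3. 4.]
print(np.dot(a, theta.reshape(-1, 1)))     # shape (3, 1): an explicit column vector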

Eigenvectors in Numpy: Very bad numerics? Did I do something wrong?

For some calculations I need an eigenvalue decomposition. Now I tried out the numpy functions and noticed some very bad behaviour! Look at this:
import numpy as np
N = 3
A = np.matrix(np.random.random([N,N]))
A = 0.5*(A.H + A)  # Hermitian part
la, V = np.linalg.eig(A)
VI = np.matrix(np.linalg.inv(V))
V = np.matrix(V)
Edit: I chose a Hermitian matrix now, so it is normal.
The mathematics says that we should have VI * V.H = 1 and V.H * A * V = VI * A * V = D, where D is the diagonal matrix of the eigenvalues. The result I got from a random matrix was:
print(A.H*A - A*A.H)
[[ 0. 0. 0.]
[ 0. 0. 0.]
[ 0. 0. 0.]]
This shows that A is normal.
print(V.H*A*V)
[[ 1.71513832e+00 5.55111512e-17 -1.11022302e-16]
[ -1.11022302e-16 -5.17694280e-01 0.00000000e+00]
[ -7.63278329e-17 -4.51028104e-17 1.28559996e-01]]
print(VI*A*V)
[[ 1.71513832e+00 -2.77555756e-16 -2.22044605e-16]
[ 7.49400542e-16 -5.17694280e-01 -4.16333634e-17]
[ -3.33066907e-16 1.70002901e-16 1.28559996e-01]]
These two work correctly, since the off-diagonal entries are very small and the diagonal holds the eigenvalues.
print(VI*V.H)
[[ 0.50868822 -0.57398479 0.64169912]
[ 0.16362266 0.79620605 0.58248052]
[-0.84525968 -0.19130446 0.49893755]]
This should be the identity, but it is far from it.
So, what has gone wrong in computing the eigenvectors, even in this small example? Can anybody tell me when I have to be careful while using these functions, and what I can do about the large mismatch?
Quote from numpy.linalg.eig documentation:
Likewise, the (complex-valued) matrix of eigenvectors v is unitary if the matrix a is normal, i.e., if dot(a, a.H) = dot(a.H, a), where a.H denotes the conjugate transpose of a.
Obviously, in the example you have, A^H A != A A^H, so the matrix V is not unitary.
Therefore, V.T.conj() is not related to the inverse of V.
The most common case where this assumption is correct is for hermitian matrices.
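A small sketch (not part of the original answer) illustrating that statement for a real symmetric matrix, where the eigenvector matrix should come out orthogonal:

import numpy as np

A = np.random.random((3, 3))
A = 0.5 * (A + A.T)                        # real symmetric, hence normal

la, V = np.linalg.eig(A)                   # np.linalg.eigh is the better tool for symmetric input
print(np.allclose(V.T @ V, np.eye(3)))     # True: the eigenvector matrix is orthogonal
print(np.allclose(np.linalg.inv(V), V.T))  # True: V^-1 equals V^T (= V^H for a real matrix)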
