Problem in calculating the symmetric normalised Laplacian matrix - Python

I found some problems in calculating the symmetric normalised Laplacian matrix in Python.
Suppose we have the matrix S and its diagonal degree matrix D:

    [ 1 , 0.5, 0.2]        [1.7,  0 ,  0 ]
S = [0.5,  1 , 0.5]    D = [ 0 ,  2 ,  0 ]
    [0.2, 0.5,  1 ]        [ 0 ,  0 , 1.7]

When calculating L as L = I - D^(-1/2) . S . D^(-1/2), I obtain this result:
    [[ 0.41176471 -0.27116307 -0.11764706]
L =  [-0.27116307  0.5        -0.27116307]
     [-0.11764706 -0.27116307  0.41176471]]
Using this code:
import numpy as np
from numpy.linalg import inv
from scipy.linalg import sqrtm

S = np.array([[1, 0.5, 0.2], [0.5, 1, 0.5], [0.2, 0.5, 1]])
print("Similarity Matrix: \n", S)
print("\n\n")
D = np.zeros((len(S), len(S)))
# degree of each node: sum of the similarities in its row
for id, x in enumerate(S):
    D[id][id] = np.sum(x)
I = np.identity(len(S))
L = I - ((sqrtm(inv(D))).dot(S)).dot(sqrtm(inv(D)))
print("\n\n")
print("Laplacian normalized: \n", L)
This differs from the result of csgraph.laplacian(S, normed=True), which returns:
    [[ 1.         -0.5976143  -0.28571429]
L =  [-0.5976143   1.         -0.5976143 ]
     [-0.28571429 -0.5976143   1.        ]]
Why does this happen? Am I doing something wrong?

I noticed that the ratio between the unnormalized and normalized matrices returned by csgraph.laplacian is closely related to the ratio of the unnormalized matrix and your L:
In [20]: csgraph.laplacian(S, normed=False) / L - 1
Out[20]:
array([[0.7       , 0.84390889, 0.7       ],
       [0.84390889, 1.        , 0.84390889],
       [0.7       , 0.84390889, 0.7       ]])
In [21]: csgraph.laplacian(S, normed=False) / csgraph.laplacian(S, normed=True)
Out[21]:
array([[0.7       , 0.83666003, 0.7       ],
       [0.83666003, 1.        , 0.83666003],
       [0.7       , 0.83666003, 0.7       ]])
0.84390889 ≠ 0.83666003 but other numbers match. Could the difference be simply due to normalization?

That's because you have 1s on the diagonal of S:
import numpy as np

# weighted adjacency: remove the self-loops from the diagonal
S = np.array([[1, 0.5, 0.2], [0.5, 1, 0.5], [0.2, 0.5, 1]])
np.fill_diagonal(S, 0.0)
# strength (degree) diagonal matrix
D = np.diag(np.sum(S, axis=1))
# identity
I = np.identity(S.shape[0])
# D^{-1/2} matrix (elementwise sqrt is the matrix square root here, since D is diagonal)
D_inv_sqrt = np.linalg.inv(np.sqrt(D))
L = I - np.dot(D_inv_sqrt, S).dot(D_inv_sqrt)
L
array([[ 1.        , -0.5976143 , -0.28571429],
       [-0.5976143 ,  1.        , -0.5976143 ],
       [-0.28571429, -0.5976143 ,  1.        ]])
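A quick sketch to verify this explanation: csgraph.laplacian ignores self-loops, so zeroing the diagonal by hand and normalising reproduces its output exactly (np.allclose guards against floating-point noise):
import numpy as np
from scipy.sparse import csgraph

S = np.array([[1, 0.5, 0.2], [0.5, 1, 0.5], [0.2, 0.5, 1]])
L_scipy = csgraph.laplacian(S, normed=True)

# the same computation by hand, after dropping the diagonal
A = S.copy()
np.fill_diagonal(A, 0.0)
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
L_manual = np.identity(3) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

print(np.allclose(L_scipy, L_manual))   # -> True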

Related

How to keep a matrix unchanged

I am trying to calculate the inverse matrix using the Gauss-Jordan method. For that, I need to find the solution X to A.X = I (A and X being N x N matrices, and I the identity matrix).
However, for every column vector of the solution matrix X that I calculate in the first loop, I have to use the original matrix A, but I don't understand why A keeps changing even though I made a copy of it at the beginning.
import numpy as np

def SolveGaussJordanInvMatrix(A):
    N = len(A[:, 0])
    I = np.identity(N)
    X = np.zeros([N, N], float)
    A_orig = A.copy()
    for m in range(N):
        x = np.zeros(N, float)
        v = I[:, m]
        A = A_orig
        for p in range(N):                # Gauss-Jordan elimination
            A[p, :] /= A[p, p]
            v[p] /= A[p, p]
            for i in range(p):            # cancel elements above the diagonal element
                v[i] -= v[p] * A[i, p]
                A[i, p:] -= A[p, p:] * A[i, p]
            for i in range(p + 1, N):     # cancel elements below the diagonal element
                v[i] -= v[p] * A[i, p]
                A[i, p:] -= A[p, p:] * A[i, p]
        X[:, m] = v                       # add the column vector to the solution matrix
    return X

A = np.array([[2,  1,  4,  1],
              [3,  4, -1, -1],
              [1, -4,  7,  5],
              [2, -2,  1,  3]], float)
SolveGaussJordanInvMatrix(A)
Does anyone know how to turn A back to its original form after the Gauss-Jordan elimination loop?
I'm getting
array([[ 228.1,    0. ,    0. ,    0. ],
       [-219.9,    1. ,    0. ,    0. ],
       [ -14.5,    0. ,    1. ,    0. ],
       [-176.3,    0. ,    0. ,    1. ]])
and expect
[[ 1.36842105 -0.89473684 -1.05263158  1.        ]
 [-1.42105263  1.23684211  1.13157895 -1.        ]
 [ 0.42105263 -0.23684211 -0.13157895 -0.        ]
 [-2.          1.5         1.5        -1.        ]]
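A likely cause, sketched below: in numpy, A = A_orig only rebinds the name to the same underlying buffer, and v = I[:, m] is a view into I, so the in-place operations inside the loop mutate both the saved "copy" and the identity matrix. Taking explicit copies each iteration (A = A_orig.copy() and v = I[:, m].copy()) should restore the original data; here is a minimal demonstration of the aliasing:
import numpy as np

A_orig = np.array([[2., 1.],
                   [1., 3.]])
A = A_orig              # no copy: both names share one buffer
A[0, :] /= A[0, 0]      # an in-place edit ...
print(A_orig)           # ... also changes the "original"

A = A_orig.copy()       # an independent copy
A[1, :] *= 10.0
print(A_orig)           # unchanged this time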

scipy.linalg.expm of hermitian is not special unitary

If I have a (real) hermitian matrix, for instance
H = matrix([[-2. , 0.5, 0.5, 0. ],
[ 0.5, 2. , 0. , 0.5],
[ 0.5, 0. , 0. , 0.5],
[ 0. , 0.5, 0.5, 0. ]])
(This matrix is hermitian; it is the Hamiltonian of a 2-spin Ising chain with coupling to an external field.)
Then there exists a special orthogonal transformation O (it preserves the lengths of the column and row vectors of a matrix) s.t.
H = O.transpose() @ D @ O
where D is diagonal. For the matrix exponential this leads to
T = expm(1j * H) = O.transpose() @ expm(1j * D) @ O
so all column/row vectors of T must have length 1.
If I use scipy.linalg.expm this property is violated:
In [1]: import numpy as np
In [2]: from numpy import matrix
In [3]: from scipy.linalg import expm
In [4]: H = matrix([[-2. , 0.5, 0.5, 0. ],
...: [ 0.5, 2. , 0. , 0.5],
...: [ 0.5, 0. , 0. , 0.5],
...: [ 0. , 0.5, 0.5, 0. ]])
In [5]: T = expm(1j * H)
In [6]: np.sum(np.abs(T[0]))
Out[6]: 1.6099093263121051
In [7]: np.sum(np.abs(T[1]))
Out[7]: 1.609909326312105
In [8]: np.sum(np.abs(T[2]))
Out[8]: 1.7770244703003222
In [9]: np.sum(np.abs(T[3]))
Out[9]: 1.7770244703003222
Is this a bug in expm or am I making a mistake here?
You're using the wrong norm: np.sum(np.abs(T[0])) is the 1-norm of the row, not its Euclidean length. Use
np.sqrt(np.sum(np.abs(T[0])**2))
or, more concisely,
np.linalg.norm(T[0])
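As a quick sketch of that check, every row of T does have Euclidean length 1, and T is in fact unitary:
import numpy as np
from scipy.linalg import expm

H = np.array([[-2. ,  0.5,  0.5,  0. ],
              [ 0.5,  2. ,  0. ,  0.5],
              [ 0.5,  0. ,  0. ,  0.5],
              [ 0. ,  0.5,  0.5,  0. ]])
T = expm(1j * H)

print(np.linalg.norm(T, axis=1))               # -> [1. 1. 1. 1.] (up to rounding)
print(np.allclose(T @ T.conj().T, np.eye(4)))  # -> True: T is unitary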

Constructing a matrix based on different lists with Python

I am going to make the following matrix:
s = [[s11 s12 s13]
     [s21 s22 s23]
     [s31 s32 s33]]
where I can obtain each element of the matrix s by:
sii = a(i) ; for s11, s22, and s33
sij = a(i)**2 + 10 ; for s12=s21, s23=s32, and s13=s31
Here, a is a list of data:
a = [0.1, 0.25, 0.12]
So when I use the following:
import numpy as np

s = np.ones([3, 3])
def matrix(s):
    a = [0.1, 0.25, 0.12]
    s[np.diag_indices_from(s)] = ai
    s[~np.eye(s.shape[0], dtype=bool)] = ai**2 + 10
It gives me an error. How can I solve this problem? Thanks.
Here is a hint for you on how to manipulate the diagonal and non-diagonal values.
import numpy as np

s = np.ones([3, 3])

def matrix(s):
    a = [1, 2, 3]
    for i in range(len(a)):
        s[i, i] = a[i]           # sii = a(i)
        rc = (i + 1) % len(a)    # column of the cyclic neighbour
        val = a[i] ** 2 + 10
        s[i, rc] = val           # sij = a(i)**2 + 10
        s[rc, i] = val           # symmetric entry sji
    return s

print(matrix(s))
input:
[[ 1.  1.  1.]
 [ 1.  1.  1.]
 [ 1.  1.  1.]]
output:
[[  1.  11.  19.]
 [ 11.   2.  14.]
 [ 19.  14.   3.]]
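The same cyclic assignment also works without an explicit Python loop; a minimal vectorized sketch of the identical idea:
import numpy as np

a = np.array([1, 2, 3], dtype=float)
n = len(a)
i = np.arange(n)
rc = (i + 1) % n          # cyclic neighbour index, as in the loop above
s = np.ones((n, n))
s[i, i] = a               # diagonal: s_ii = a[i]
s[i, rc] = a**2 + 10      # s[i, (i+1) % n] = a[i]**2 + 10
s[rc, i] = a**2 + 10      # symmetric counterpart
print(s)                  # same output as matrix(s) above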

Broadcast through numpy array with list of arrays

Given an adjacency list:
adj_list = [np.array([0,1]), np.array([0,1,2]), np.array([0,2])]
and an array of indices:
ind_arr = np.array([0,1,2])
Goal:
A = np.zeros((3,3))
for i in ind_arr:
    A[i, list(adj_list[i])] = 1.0 / float(adj_list[i].shape[0])
Currently, I have written:
A[ind_arr[:], adj_list[:]] = 1. / len(adj_list[:])
and tried various configurations of indexing within this scaffold.
Here's one approach -
lens = np.array([len(i) for i in adj_list])       # number of neighbours per row
col_idx = np.concatenate(adj_list)                # flattened column indices
out = np.zeros((len(lens), col_idx.max() + 1))
row_idx = np.repeat(np.arange(len(lens)), lens)   # each row index, repeated per neighbour
vals = np.repeat(1.0 / lens, lens)                # 1/degree, repeated per neighbour
out[row_idx, col_idx] = vals
Sample input, output -
In [494]: adj_list = [np.array([0,2]), np.array([0,1,4])]
In [496]: out
Out[496]:
array([[ 0.5       ,  0.        ,  0.5       ,  0.        ,  0.        ],
       [ 0.33333333,  0.33333333,  0.        ,  0.        ,  0.33333333]])
Sparse matrix as output
Additionally, if you want to save memory and create a sparse matrix instead, that's an easy extension -
In [506]: from scipy.sparse import csr_matrix
In [507]: csr_matrix((vals, (row_idx, col_idx)), shape=(len(lens), col_idx.max()+1))
Out[507]:
<2x5 sparse matrix of type '<type 'numpy.float64'>'
with 5 stored elements in Compressed Sparse Row format>
In [508]: _.toarray()
Out[508]:
array([[ 0.5       ,  0.        ,  0.5       ,  0.        ,  0.        ],
       [ 0.33333333,  0.33333333,  0.        ,  0.        ,  0.33333333]])
I don't think you can completely eliminate loops due to the mixed data types, but you can reduce the nested double for loops to a single one:
A = np.zeros((2, 3))
for i, arr in enumerate(adj_list):
    arr_size = len(arr)
    A[i, :arr_size] = 1. / arr_size
A
# array([[ 0.5       ,  0.5       ,  0.        ],
#        [ 0.33333333,  0.33333333,  0.33333333]])
Or, if the numbers in the arrays are actually column positions:
A = np.zeros((2, 3))
for i, arr in enumerate(adj_list):
    A[i, arr] = 1. / len(arr)
A
# array([[ 0.5       ,  0.5       ,  0.        ],
#        [ 0.33333333,  0.33333333,  0.33333333]])
Another option uses MultiLabelBinarizer from sklearn (though it may not be as efficient):
from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
adj_list = [np.array([0,1]), np.array([0,1,2])]

sizes = np.fromiter(map(len, adj_list), dtype=int)
mlb.fit_transform(adj_list) / sizes[:, None]
# array([[ 0.5       ,  0.5       ,  0.        ],
#        [ 0.33333333,  0.33333333,  0.33333333]])

numpy linspace and mesh grid for multiple dimensions

I am porting some MATLAB code to Python using numpy, and I have the following MATLAB command:
[xgrid,ygrid]=meshgrid(linspace(-0.5,0.5, GridSize-1), ...
linspace(-0.5,0.5, GridSize-1));
Now, this is fine in 2D, but I would like to extend it to n dimensions: depending on the input data, GridSize can be a 2-, 3- or 4-element vector. In 2D this would be:
[xgrid, ygrid] = np.meshgrid(np.linspace(-0.5,0.5, GridSize[0]),
                             np.linspace(-0.5,0.5, GridSize[1]))
However, I do not know the dimensionality of the input beforehand, so is it possible to rewrite this expression so that it can generate grids with an arbitrary number of dimensions?
You could use a list comprehension to generate all the 1D arrays and then apply np.meshgrid to all of them with the * operator, which unpacks the argument list (the equivalent of MATLAB's comma-separated lists), like so -
allG = [np.linspace(-0.5,0.5, G) for G in GridSize]
out = np.meshgrid(*allG)
Sample runs
1) 2D Case:
In [27]: GridSize = [3,4]
In [28]: allG = [np.linspace(-0.5,0.5, G) for G in GridSize]
    ...: out = np.meshgrid(*allG)
    ...:
In [29]: out[0]
Out[29]:
array([[-0.5,  0. ,  0.5],
       [-0.5,  0. ,  0.5],
       [-0.5,  0. ,  0.5],
       [-0.5,  0. ,  0.5]])
In [30]: out[1]
Out[30]:
array([[-0.5       , -0.5       , -0.5       ],
       [-0.16666667, -0.16666667, -0.16666667],
       [ 0.16666667,  0.16666667,  0.16666667],
       [ 0.5       ,  0.5       ,  0.5       ]])
2) 3D Case:
In [51]: GridSize = [3,4,2]
In [52]: allG = [np.linspace(-0.5,0.5, G) for G in GridSize]
    ...: out = np.meshgrid(*allG)
    ...:
In [53]: out[0]
Out[53]:
array([[[-0.5, -0.5],
        [ 0. ,  0. ],
        [ 0.5,  0.5]], ...
       [[-0.5, -0.5],
        [ 0. ,  0. ],
        [ 0.5,  0.5]]])
In [54]: out[1]
Out[54]:
array([[[-0.5       , -0.5       ], ...
       [[ 0.16666667,  0.16666667],
        [ 0.16666667,  0.16666667],
        [ 0.16666667,  0.16666667]],
       [[ 0.5       ,  0.5       ],
        [ 0.5       ,  0.5       ],
        [ 0.5       ,  0.5       ]]])
In [55]: out[2]
Out[55]:
array([[[-0.5,  0.5], ...
       [[-0.5,  0.5],
        [-0.5,  0.5],
        [-0.5,  0.5]]])
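One detail worth noting when going beyond 2D (a small sketch using numpy's documented indexing argument): np.meshgrid defaults to Cartesian ('xy') indexing, which swaps the first two axes of the outputs; passing indexing='ij' gives MATLAB ndgrid-style ordering, so each output's shape matches GridSize directly:
import numpy as np

GridSize = [3, 4, 2]
allG = [np.linspace(-0.5, 0.5, G) for G in GridSize]

# indexing='ij' keeps out[k].shape == tuple(GridSize), with no axis swap
out = np.meshgrid(*allG, indexing='ij')
print([g.shape for g in out])   # [(3, 4, 2), (3, 4, 2), (3, 4, 2)]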
