Similar matrix computation using numpy - python

I am trying to find a matrix B similar to a 3 x 3 matrix A, using a random invertible matrix P:
B = P_inv.A.P
import numpy as np
from scipy import linalg as LA
from numpy.linalg import inv

# random 3x3 integer matrix and a random (almost surely invertible) P
A = np.random.randint(1, 10, 9).reshape(3, 3)
P = np.random.randn(3, 3)
P_inv = inv(P)

# eigenvalues of A
eig1 = LA.eigvalsh(A)
eig1 = np.sort(eig1)

# B = P^{-1} A P
B1 = P_inv.dot(A)
B = B1.dot(P)

# eigenvalues of B
eig2 = LA.eigvalsh(B)
eig2 = np.sort(eig2)

print(np.round(eig1, 3))
print(np.round(eig2, 3))
However, I notice that eig1 and eig2 are never equal.
What am I missing, or is it a numerical error?
Thanks
Kedar

You're using eigvalsh, which requires that the matrix be real symmetric (or complex Hermitian), which your randomly generated matrix is not.
Deleting the h and using eigvals instead fixes this.
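For reference, a minimal sketch of the corrected comparison using eigvals (the eigenvalues of a general real matrix may be complex; np.sort orders complex values by real part, then imaginary part):
import numpy as np
from scipy import linalg as LA
from numpy.linalg import inv

A = np.random.randint(1, 10, 9).reshape(3, 3)
P = np.random.randn(3, 3)
B = inv(P).dot(A).dot(P)

# eigvals (no h) handles general, non-symmetric matrices
eig1 = np.sort(LA.eigvals(A))
eig2 = np.sort(LA.eigvals(B))

print(np.round(eig1, 3))
print(np.round(eig2, 3))
print(np.allclose(eig1, eig2))  # expected True up to rounding error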

Using scipy.linalg.solve with symmetric coefficient matrix (assume_a='sym')

I am trying to solve a system of linear equations A * x = b for the unknown x using scipy's linalg.solve function. Here is an example that works fine:
import numpy as np
import scipy.linalg as linalg
A = np.array([[ 0.18666667, 0.06222222, -0.01777778],
              [ 0.01777778, 0.18666667,  0.01777778],
              [-0.01777778, 0.06222222,  0.18666667]])
b = np.array([0.26666667, -0.26666667, -0.4])
x = linalg.solve(A, b, assume_a='gen')
It results in x = [1.77194417, -1.4555256, -1.48892533], which is a correct solution. This can be verified by computing A.dot(x), which results in [0.26666667, -0.26666667, -0.4]. As this is the same as b, the solution is correct.
However, the matrix of coefficients A is symmetric, i.e., the values above and below the main diagonal are the same. If I understand the documentation correctly, the solve function allows you to set the argument assume_a='sym' to solve such a problem more efficiently. Unfortunately, using the following code (with the same A and b) results in an incorrect solution being found:
x = linalg.solve(A, b, assume_a='sym')
It results in x = [1.88811181, -1.88811181, -1.78321672], which is different from the solution above. Computing A.dot(x) results in [0.26666667, -0.35058274, -0.48391607]. As this is different from b, the solution seems to be incorrect.
I am wondering if there is a problem with my code, or if my understanding of symmetric matrices or of the expected result is simply wrong. Maybe the matrix must satisfy additional constraints to be used with assume_a='sym'?
I appreciate your answers. Thanks in advance!
I don't think that will happen with a genuinely symmetric matrix; note that your A is not actually symmetric (for example, A[0,1] = 0.06222222 but A[1,0] = 0.01777778). Here is a short answer about it.
Non-symmetric A
import numpy as np
import scipy.linalg as linalg
A = np.array([[ 0.18666667, 0.06222222, -0.01777778],
              [ 0.01777778, 0.18666667,  0.01777778],
              [-0.01777778, 0.06222222,  0.18666667]])
b = np.array([0.26666667, -0.26666667, -0.4])
x = linalg.solve(A, b, assume_a='gen')
np.allclose(A @ x, b)
Out:
True
Which shows the solver works well.
Symmetric A
# use the upper triangle of A to build a genuinely symmetric matrix
A_symm = np.triu(A) + np.triu(A).T - np.diag(A.diagonal())
# solve the equations
x = linalg.solve(A_symm, b, assume_a='sym')
np.allclose(A_symm @ x, b)
Out:
True
It still works.
If you pass a non-symmetric matrix A to the solver and specify assume_a='sym', the solver will only use the upper triangle of A, see below:
x = linalg.solve(A, b, assume_a='sym')
np.allclose(A @ x, b), x
Out:
(False, array([ 1.88811181, -1.88811181, -1.78321672]))
The result shows that the solver appears to work "wrong", but x is the same as the result of linalg.solve(A_symm, b, assume_a='sym'): the solver silently solved the symmetrized system built from the upper triangle of A.
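One defensive pattern (a small sketch, not from the original answer; it reuses A and b from the code above) is to verify symmetry before claiming it:
# only pass assume_a='sym' when the matrix actually is symmetric
if np.allclose(A, A.T):
    x = linalg.solve(A, b, assume_a='sym')
else:
    x = linalg.solve(A, b, assume_a='gen')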

How to find Eigenspace of a matrix using python

I have a matrix for which I have found the eigenvalues and eigenvectors, but now I want to find the eigenspaces, i.e. a basis for each of the corresponding eigenspaces. I don't know how to start: I tried finding the null space with scipy and solving via rref(), but it didn't work. Please help!
this is the code I am using
# import packages
import numpy as np
from numpy import linalg as LA
from scipy.linalg import null_space
# define matrix and vector
M = np.array([[0.82, 0.1],[0.18,0.9]])
v0 = np.array([[15000],[800]])
eigenVal, eigenVec = LA.eig(M)
print(eigenVal)
# Based on the characteristic equation: (M - lambda*I) v = 0
identity = np.identity(2, dtype=float)
lamdbdaI = eigenVal * identity
# Apply the characteristic equation to the matrix M
char_poly = M - lamdbdaI
print(char_poly)
Here I am stuck !
The np.linalg.eig function already returns the eigenvectors, which are exactly the basis vectors for your eigenspaces. More precisely:
v1 = eigenVec[:,0]
v2 = eigenVec[:,1]
span the corresponding eigenspaces for the eigenvalues lambda1 = eigenVal[0] and lambda2 = eigenVal[1].
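If you do want to compute an eigenspace basis explicitly via a null space, as the question attempted, a minimal sketch with scipy.linalg.null_space could look like this; it recovers the same directions (up to scaling) as the columns of eigenVec:
import numpy as np
from numpy import linalg as LA
from scipy.linalg import null_space

M = np.array([[0.82, 0.1], [0.18, 0.9]])
eigenVal, eigenVec = LA.eig(M)

for lam in eigenVal:
    # the null space of (M - lambda*I) is the eigenspace of lambda
    basis = null_space(M - lam * np.identity(2))
    print(lam, basis.ravel())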

Eig in Python giving different Eigenvalues?

Essentially, the problem is that the eig functions in MATLAB and Python are giving me different things. I am reproducing data from a paper in order to confirm that my numerical method is correct (so I know the answers; I have them via MATLAB).
I have tried eigh, with no improvement.
Below is the data matrix used:
2852 170.380000000000 77.3190000000000 -51.0710000000000 -191.560000000000 105.410000000000 240.950000000000 102.700000000000
2842 169.640000000000 76.6120000000000 -50.3980000000000 -191.310000000000 105.660000000000 240.850000000000 102.960000000000
2838.80000000000 176.950000000000 80.4150000000000 -51.5700000000000 -192.190000000000 104.870000000000 239.700000000000 104.110000000000
2837.40000000000 182.930000000000 88.4070000000000 -54.1410000000000 -194.460000000000 104.230000000000 238.760000000000 105.020000000000
2890.80000000000 167.270000000000 122 -67.7490000000000 -275.150000000000 160.960000000000 248.010000000000 95.9470000000000
2962.10000000000 113.910000000000 177.060000000000 -98.9930000000000 -259.270000000000 80.7860000000000 262.890000000000 80.9180000000000
3013.90000000000 72.9740000000000 225.260000000000 -135.700000000000 -233.520000000000 0.0469300000000000 272.110000000000 71.5160000000000
3026.50000000000 112.420000000000 243.020000000000 -169.460000000000 -218.060000000000 0.0465190000000000 271.250000000000 71.8280000000000
3367.10000000000 -0.310680000000000 479.870000000000 0.494350000000000 -0.603940000000000 -0.147820000000000 282.700000000000 -64.1680000000000
import numpy as np
import pandas as pd

# import data from the Excel sheet
df = pd.read_excel('DataCompanionMatrix.xlsx', header=None)
data = np.array(df)

m = data.shape[0]
n = data.shape[1]
x = data[0:-1, :]
y = data[-1, :]

A = np.dot(x, np.transpose(x))
xx = np.dot(x, np.transpose(y))
Co_values = np.dot(np.linalg.pinv(A), xx)

# build the companion matrix
C = np.zeros((n, n))
for i in range(0, n-1):
    C[i, i-1] = 1
C[:, n-1] = Co_values

eigV, eigW = np.linalg.eig(C)
print(eigV)
The data is a 9x8 matrix, x is 8x8, y is a length-8 array, A is 8x8, C is 8x8, and Co_values is a length-8 array.
In MATLAB the eigenvalues come out as a 1x8 array of complex values. In Python, I get an array of 7 zeros and a single nonzero entry.
I expect the eigenvalues to sit on the unit circle when plotted, which is what I get in MATLAB.
[Screenshots omitted: the C matrix (which looks identical in MATLAB and Python), the Python eigenvalues, and the MATLAB eigenvalues.]
The array C you create in Python does not correspond to the one you have in MATLAB.
If I modify your Python code as follows, I get the same array C and the same eigenvalues:
C = np.zeros((n, n))
for i in range(0, n-1):
    C[i+1, i] = 1   # this is where the difference is!
C[:, n-1] = Co_values
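To verify the fix visually (a sketch, assuming C has been built with the corrected loop above), you can plot the eigenvalues against the unit circle:
import numpy as np
import matplotlib.pyplot as plt

eigV, eigW = np.linalg.eig(C)

# eigenvalues of the corrected matrix should sit on (or near) the unit circle
theta = np.linspace(0, 2*np.pi, 200)
plt.plot(np.cos(theta), np.sin(theta), 'k--', label='unit circle')
plt.scatter(eigV.real, eigV.imag, label='eigenvalues')
plt.axis('equal')
plt.legend()
plt.show()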

creating a large pdf matrix efficiently

I have a dataset of 60,000 examples of the form:
mu1 mu2 std1 std2
0 -0.745 0.729 0.0127 0.0149
1 -0.711 0.332 0.1240 0.0433
...
They are essentially the parameters of 2-dimensional normal distributions. What I want to do is create an N x N matrix P such that P_ij = Normal(mu_i | mean=mu_j, cov=diag(std_j)), where mu_i is (mu1, mu2) for example i.
I can do this with the following code for example:
from scipy import stats
import numpy as np

mu_all = data[['mu1', 'mu2']].values
std_all = data[['std1', 'std2']].values

P = []
for i in range(len(data)):
    mu_i = mu_all[i, :]
    std_i = std_all[i, :]
    # pdf of every mu under the i-th distribution (one column of P)
    prob_i = stats.multivariate_normal.pdf(mu_all, mean=mu_i, cov=np.diag(std_i))
    P.append(prob_i)
P = np.array(P).T
But this is too expensive (my machine freezes). How can I do this more efficiently? My guess is that scipy cannot handle computing the pdf over 60,000 points at once. Is there an alternative?
I just realized that a matrix of that size (60,000 x 60,000) cannot reasonably be held in memory in Python: at 8 bytes per float64 entry it needs about 28.8 GB of RAM. See:
Very large matrices using Python and NumPy
So I don't think this can be done in memory.
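If the full matrix is truly needed, one possible workaround (a sketch, not from the original thread; it assumes the data DataFrame from the question) is to stream the columns into a disk-backed np.memmap so the ~28.8 GB live on disk rather than in RAM:
import numpy as np
from scipy import stats

mu_all = data[['mu1', 'mu2']].values
std_all = data[['std1', 'std2']].values
N = len(data)

# disk-backed array instead of an in-memory one
P = np.memmap('P.dat', dtype=np.float64, mode='w+', shape=(N, N))

for j in range(N):
    # j-th column: pdf of every mu_i under the j-th distribution
    # (the question uses std directly as the covariance diagonal)
    P[:, j] = stats.multivariate_normal.pdf(
        mu_all, mean=mu_all[j], cov=np.diag(std_all[j]))
P.flush()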

Is there a Python equivalent to the mahalanobis() function in R? If not, how can I implement it?

I have the following code in R that calculates the mahalanobis distance on the Iris dataset and returns a numeric vector with 150 values, one for every observation in the dataset.
x=read.csv("Iris Data.csv")
mean<-colMeans(x)
Sx<-cov(x)
D2<-mahalanobis(x,mean,Sx)
I tried to implement the same in Python using 'scipy.spatial.distance.mahalanobis(u, v, VI)' function, but it seems this function takes only one-dimensional arrays as parameters.
I used the Iris dataset from R, I suppose it is the same you are using.
First, this is my R benchmark, for comparison:
x <- read.csv("IrisData.csv")
x <- x[,c(2,3,4,5)]
mean<-colMeans(x)
Sx<-cov(x)
D2<-mahalanobis(x,mean,Sx)
Then, in python you can use:
from scipy.spatial.distance import mahalanobis
import scipy.linalg
import pandas as pd

x = pd.read_csv('IrisData.csv')
x = x.iloc[:, 1:]   # drop the index column

Sx = x.cov().values
Sx = scipy.linalg.inv(Sx)
mean = x.mean().values

def mahalanobisR(X, meanCol, IC):
    m = []
    for i in range(X.shape[0]):
        # square to match R's mahalanobis(), which returns D^2
        m.append(mahalanobis(X.iloc[i, :], meanCol, IC) ** 2)
    return m

mR = mahalanobisR(x, mean, Sx)
I defined it as a function so you can reuse it on other datasets (note that it takes pandas DataFrames as inputs).
Comparing results:
In R
> D2[c(1,2,3,4,5)]
[1] 2.134468 2.849119 2.081339 2.452382 2.462155
In Python:
In [43]: mR[0:5]
Out[45]:
[2.1344679233248431,
2.8491186861585733,
2.0813386639577991,
2.4523816316796712,
2.4621545347140477]
Just be careful that what you get in R is the squared Mahalanobis distance.
A simpler solution would be:
import numpy as np
from scipy.spatial.distance import cdist

x = ...
mean = x.mean(axis=0).reshape(1, -1)  # make sure mean is 2D
vi = np.linalg.inv(np.cov(x.T))
cdist(mean, x, 'mahalanobis', VI=vi)
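Note that cdist returns the plain (non-squared) Mahalanobis distance, whereas R's mahalanobis() returns the squared distance, so to reproduce D2 you would square the result, e.g.:
D2 = cdist(mean, x, 'mahalanobis', VI=vi)[0] ** 2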
