Consider a case where, given an MxM matrix A and a vector b, I want to solve something of the form inv(A @ A.T) @ b (where I know A is invertible).
As far as I know, it is always faster to use solve_* rather than inv. There are also variants for more efficient solving for PSD matrices (which A @ A.T must be), using a Cholesky factorization.
My question: since I'm constructing the matrix A @ A.T just to immediately throw it away, is there a more specialized procedure for solving linear equations with the Gram matrix of A without having to construct it?
You can compute the factorization of A and then use that to solve your system.
Assume we want to solve
A A^T x = b
for x.
Compute the factorization A = LU.
Then solve A y = b for y (where y = A^T x).
Then solve A^T x = y for x.
This way you don't have to compute the matrix A A^T.
Note that if one has a factorization of A=LU then one can solve Ax=b as well as A^T x=b efficiently for x.
This is because A^T = U^T L^T, which is again a factorization into a lower triangular times an upper triangular matrix.
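For concreteness, here is a minimal sketch of the two-solve approach with SciPy, assuming A is square and invertible (the toy sizes are placeholders):

import numpy as np
from scipy.linalg import lu_factor, lu_solve

rng = np.random.default_rng(0)
M = 5
A = rng.standard_normal((M, M))
b = rng.standard_normal(M)

lu, piv = lu_factor(A)                # factor A = LU once
y = lu_solve((lu, piv), b)            # solve A y = b
x = lu_solve((lu, piv), y, trans=1)   # solve A^T x = y

assert np.allclose(A @ A.T @ x, b)    # sanity check against the naive product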
I am coding a computational package in Python using NumPy. In the package, I frequently need to multiply an arbitrary large square matrix (e.g. of size 100x100) by a diagonal matrix of the same size.
I have an O(n^2) method, but I think that further improvement could be made.
"""
A is of size 100*100
B is a diagonal matrix
want to do np.dot(A,B) quickly
"""
A=np.random.rand(100,100)
diag_elements=np.random.rand(100)
B=np.diag(diag_elements)
answer1= np.dot(A,B) ###O(n^3) method, quite slow
C=np.zeros((100,100))
C=C+diag_elements
answer2=np.multiply(A,C) ##O(n^2) method, 3times faster for n=100
answer2 is O(n^2), but I think it's not good enough, because the operation C = C + diag_elements wastes about a third of the time and could possibly be avoided.
I expect that some numpy function could do the matrix multiplication more elegantly and faster. Could someone help me out?
Why don't you simply multiply A by the diagonal?
answer3 = np.multiply(A, diag_elements)
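For reference, a quick check that broadcasting reproduces both multiplication orders (a sketch with the same toy sizes as above):

import numpy as np

A = np.random.rand(100, 100)
diag_elements = np.random.rand(100)
B = np.diag(diag_elements)

# A @ B scales the columns of A, which plain broadcasting matches directly
assert np.allclose(np.dot(A, B), np.multiply(A, diag_elements))
# B @ A scales the rows of A; add an axis so row i is scaled by diag_elements[i]
assert np.allclose(np.dot(B, A), diag_elements[:, None] * A)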
I have two N x N co-occurrence matrices (484x484 and 1060x1060) that I have to analyze. The matrices are symmetric about the diagonal and contain lots of zero values. The non-zero values are integers.
I want to group together the positions that are non-zero. In other words, what I want to do is the algorithm on this link: when "order by cluster" is selected, the matrix gets rearranged in rows and columns to group the non-zero values together.
Since I am using Python for this task, I looked into SciPy Sparse Linear Algebra library, but couldn't find what I am looking for.
Any help is much appreciated. Thanks in advance.
If you have a matrix dist with pairwise distances between objects, then you can find the order in which to rearrange the matrix by applying a clustering algorithm to this matrix (http://scikit-learn.org/stable/modules/clustering.html). For example, it might be something like:
from sklearn import cluster
import numpy as np

# affinity="precomputed" requires a non-default linkage (the default "ward"
# only accepts euclidean distances); newer scikit-learn versions call this
# parameter metric= instead of affinity=
model = cluster.AgglomerativeClustering(n_clusters=20, affinity="precomputed",
                                        linkage="average").fit(dist)
new_order = np.argsort(model.labels_)
ordered_dist = dist[new_order]  # can be your original matrix instead of dist
ordered_dist = ordered_dist[:, new_order]
The order is given by the variable model.labels_, which holds the number of the cluster to which each sample belongs. A few observations:
You have to find a clustering algorithm that accepts a distance matrix as input. AgglomerativeClustering is such an algorithm (notice the affinity="precomputed" option to tell it that we are using pre-computed distances).
What you have seems to be a pairwise similarity matrix, in which case you need to transform it into a distance matrix (e.g. dist = 1 - data/data.max()).
In the example I assumed 20 clusters; you may have to play with this variable a bit. Alternatively, you might try to find the best one-dimensional representation of your data (using e.g. MDS) to describe the optimal ordering of samples, as in the sketch below.
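Continuing with the same dist matrix, a hedged sketch of the MDS alternative (ordering samples by a one-dimensional embedding of the precomputed distances):

import numpy as np
from sklearn.manifold import MDS

embedding = MDS(n_components=1, dissimilarity="precomputed").fit_transform(dist)
new_order = np.argsort(embedding[:, 0])
ordered_dist = dist[new_order][:, new_order]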
Because your data is sparse, treat it as a graph, not a matrix.
Then try the various graph clustering methods. For example, cliques are interesting on such data.
Note that not everything may cluster.
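A minimal sketch of the graph view, assuming networkx is available; the toy matrix M here is a hypothetical stand-in for the real co-occurrence matrix:

import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

rng = np.random.default_rng(0)
M = rng.integers(0, 3, size=(20, 20))
M = np.triu(M, 1) + np.triu(M, 1).T             # symmetric, zero diagonal

G = nx.from_numpy_array((M > 0).astype(int))    # edge wherever two items co-occur
cliques = list(nx.find_cliques(G))              # maximal cliques
communities = greedy_modularity_communities(G)  # one possible graph clustering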
I have two sparse matrices A (an affinity matrix) and D (a diagonal matrix), each of dimension 100000x100000. I have to compute the Laplacian matrix L = D^(-1/2)*A*D^(-1/2). I am using the scipy CSR format for the sparse matrices.
I didn't find any method to compute the inverse of a sparse matrix. How do I find L and the inverse of a sparse matrix? Also, is it efficient to do this in Python, or should I call a MATLAB function to calculate L?
In general the inverse of a sparse matrix is not sparse which is why you won't find sparse matrix inverters in linear algebra libraries. Since D is diagonal, D^(-1/2) is trivial and the Laplacian matrix calculation is thus trivial to write down. L has the same sparsity pattern as A but each value A_{ij} is multiplied by (D_i*D_j)^{-1/2}.
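A minimal sketch of that scaling in scipy, assuming A is a symmetric CSR affinity matrix; here the diagonal of D is built as the row sums of A (the usual degree vector, which is an assumption about what D holds):

import numpy as np
import scipy.sparse as sp

A = sp.random(1000, 1000, density=1e-3, format='csr')  # toy stand-in
A = A + A.T
d = np.asarray(A.sum(axis=1)).ravel()
d[d == 0] = 1.0                                        # guard empty rows

D_inv_sqrt = sp.diags(1.0 / np.sqrt(d))
L = D_inv_sqrt @ A @ D_inv_sqrt                        # same sparsity pattern as A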
Regarding the issue of the inverse, the standard approach is always to avoid calculating the inverse itself. Instead of calculating L^-1, repeatedly solve Lx = b for the unknown x. All good matrix solvers will allow you to decompose L (which is expensive) and then back-substitute (which is cheap) repeatedly for each value of b.
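Continuing the sketch above, scipy exposes exactly this decompose-once, solve-many pattern (assuming L is non-singular):

from scipy.sparse.linalg import factorized

solve = factorized(L.tocsc())   # expensive: factorize L once (CSC format required)
b = np.ones(L.shape[0])         # hypothetical right-hand side
x = solve(b)                    # cheap: reuse the factorization for each new b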
I am working with data from neuroimaging and, because of the large amount of data, I would like to use sparse matrices in my code (scipy.sparse.lil_matrix or csr_matrix).
In particular, I will need to compute the pseudo-inverse of my matrix to solve a least-squares problem.
I have found the method sparse.lsqr, but it is not very efficient. Is there a method to compute the Moore-Penrose pseudo-inverse (corresponding to pinv for dense matrices)?
The size of my matrix A is about 600'000x2000, and in every row of the matrix I'll have from 0 up to 4 non-zero values. The size of A is given by voxels x fiber bundles (white matter fiber tracts), and we expect at most 4 tracts to cross in a voxel. In most of the white matter voxels we expect to have at least 1 tract, but I would say that around 20% of the rows could be zeros.
The vector b should not be sparse; actually, b contains the measure for each voxel, which is in general not zero.
I would need to minimize the error, but there are also some conditions on the vector x. As I tried the model on smaller matrices, I never needed to constrain the system in order to satisfy these conditions (in general 0
Is that of any help? Is there a way to avoid taking the pseudo-inverse of A?
Thanks
Update, 1st June:
Thanks again for the help.
I can't really show you anything about my data, because the code in Python gives me some problems. However, in order to understand how I could choose a good k, I've tried to create a testing function in Matlab.
The code is as follows:
% Scatter ~150000 random values into F to mimic a sparse test matrix.
F = zeros(100000, 1000);
for k = 1:150000
    p = rand(1);
    a = 0;
    b = 0;
    while a <= 0 || b <= 0
        a = random('Binomial', 100000, p);
        b = random('Binomial', 1000, p);
    end
    F(a, b) = rand(1);
end

% Known solution vector: 10 values tiled to length 1000.
solution = repmat([0.5,0.5,0.8,0.7,0.9,0.4,0.7,0.7,0.9,0.6], 1, 100);
size(solution)
solution = solution';
measure = F * solution;
%check=pinvF*measure;

% Truncated SVD with the k largest singular triplets.
k = 250;
F = sparse(F);
[U, S, V] = svds(F, k);
s = svds(F, k);           % singular values again, just for the plot
plot(s)
max(max(U*S*V' - F))      % reconstruction error of the rank-k approximation

% Invert the non-zero singular values to build the truncated pseudo-inverse.
for s = 1:k               % note: reuses the name s as a loop index
    if S(s,s) ~= 0
        S(s,s) = 1/S(s,s);
    end
end
inv = V * S' * U';        % note: shadows Matlab's built-in inv()
inv * measure
max(inv*measure - solution)
Do you have any idea of what k should be, compared to the size of F? I've taken 250 (out of 1000) and the results are not satisfactory (the waiting time is acceptable, but not short).
Also, now I can compare the results with the known solution, but how could one choose k in general?
I also attached the plot of the 250 singular values that I get and their normalized squares. I don't know exactly how to do a better scree plot in Matlab. I'm now proceeding with a bigger k to see if suddenly the values become much smaller.
Thanks again,
Jennifer
You could study the alternatives offered in scipy.sparse.linalg in more detail.
Anyway, please note that a pseudo-inverse of a sparse matrix is most likely to be a (very) dense one, so it's not really a fruitful avenue (in general) to follow when solving sparse linear systems.
You may like to describe your particular problem (dot(A, x) = b + e) in a slightly more detailed manner. At least specify:
'typical' size of A
'typical' percentage of nonzero entries in A
least-squares implies that norm(e) is minimized, but please indicate whether your main interest is in x_hat or in b_hat, where e = b - b_hat and b_hat = dot(A, x_hat)
Update: If you have some idea of the rank of A (and it's much smaller than the number of columns), you could try the total least squares method. Here is a simple implementation, where k is the number of leading singular values and vectors to use (i.e. the 'effective' rank).
from scipy.sparse import hstack
from scipy.sparse.linalg import svds

def tls(A, b, k=6):
    """A TLS solution of Ax = b for sparse A; b must be a sparse column vector."""
    u, s, v = svds(hstack([A, b]), k)
    # build the solution from the last right singular vector of [A b]
    return v[-1, :-1] / -v[-1, -1]
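Hypothetical usage, assuming A is a sparse m x n matrix and b a dense length-m vector (hstack needs b as a sparse column):

from scipy.sparse import csr_matrix

x_hat = tls(A, csr_matrix(b.reshape(-1, 1)), k=6)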
Regardless of the answer to my comment, I would think you could accomplish this fairly easily using the Moore-Penrose SVD representation. Find the SVD with scipy.sparse.linalg.svds, replace Sigma by its pseudoinverse, and then multiply V*Sigma_pi*U' to find the pseudoinverse of your original matrix.
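A sketch of that recipe with scipy.sparse.linalg.svds (the sizes, density, and k below are placeholders; note the result is a dense array, as discussed above):

import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

def sparse_pinv(A, k):
    """Truncated-SVD pseudo-inverse V * Sigma^+ * U' from k singular triplets."""
    u, s, vt = svds(A, k=k)
    return vt.T @ np.diag(1.0 / s) @ u.T

A = sparse_random(600, 200, density=0.01, format='csr')  # toy stand-in
A_pinv = sparse_pinv(A, k=50)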