Find inverse of (many) sparse (but block-diagonal) matrices in TensorFlow - python

My problem is this: I have a very large number (millions) of large matrices to invert, and I would like to do it with TensorFlow. In general this could be a rather hard problem; however, my matrices are sparse – in particular, they have an (irregular) block-diagonal structure. So mathematically speaking, one can find the inverse of the full matrix by computing a bunch of inverses of much smaller matrices instead, which is much faster.
However, I want to write some general-purpose code to do this inversion, where the structure and size of the block-diagonal matrices will vary between problems. So it would be nice to have an algorithm that can figure out the block diagonal structure itself and make use of it, and which can work with sparse matrices as input and output (since there is never any need to store the zillions of off-block-diagonal elements, and it is more convenient to keep the block-diagonal "package" together, from a user perspective).
I see that TensorFlow has SparseTensor objects, but I can't find too much about algorithms that can use them. Am I lucky and a nice matrix inversion algorithm for block-SparseTensors already exists? I guess it should not be so hard to write one, but I'd rather not reinvent the wheel.
Edit: Ah, I should mention that the block-diagonal structure will be exactly the same for all the matrices to be inverted. So the plan was to stack all the matrices into one giant 3D tensor (well, as many as fit in RAM anyway).
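Since the block structure is fixed across the whole batch, one way this could look is the following minimal sketch (batched_blockdiag_inverse and block_sizes are illustrative names, not an existing TensorFlow API; it assumes the block sizes are known up front). tf.linalg.inv batches over leading dimensions, so each small block is inverted for all matrices at once:

import tensorflow as tf

def batched_blockdiag_inverse(mats, block_sizes):
    """mats: [batch, n, n] tensor; block_sizes: ints summing to n."""
    inv_blocks = []
    offset = 0
    for size in block_sizes:
        # Slice the same diagonal block out of every matrix in the batch
        block = mats[:, offset:offset + size, offset:offset + size]
        inv_blocks.append(tf.linalg.inv(block))  # batched small inverses
        offset += size
    # Keep the result implicitly block-diagonal instead of storing the zeros
    op = tf.linalg.LinearOperatorBlockDiag(
        [tf.linalg.LinearOperatorFullMatrix(b) for b in inv_blocks])
    return op  # call op.to_dense() only if you really need the full matrix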

Related

Large sparse matrix inversion on Python

I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
I chose Python (which is not the fastest) and it works pretty well. However, in my code I have to invert a large sparse symmetric matrix (non-positive definite, so I can't use Cholesky). I currently use np.linalg.inv(), which uses the LU decomposition method.
I'm pretty sure there is some optimization to be done in terms of speed.
I thought about the Cuthill-McKee algorithm to rearrange the matrix before taking its inverse. Do you have any ideas or advice?
Thank you very much for your answers!
The good news is that if you're using any of the popular Python libraries for linear algebra (namely, numpy), the speed of Python really doesn't matter for the math – it's all done natively inside the library.
For example, when you write matrix_prod = matrix_a @ matrix_b, that's not triggering a bunch of Python loops to multiply the two matrices, but using numpy's internal implementation (which calls into the Fortran BLAS/LAPACK libraries).
The scipy.sparse.linalg module has your back covered when it comes to solving sparsely stored systems of equations – which is what you actually want the inverse of a matrix for. If you want to use sparse matrices, that's your way to go. Notice that there are matrices that are sparse in mathematical terms (i.e., most entries are 0), and matrices that are stored in a sparse format, which means you avoid storing millions of zeros. Numpy itself doesn't have sparsely stored matrices, but scipy does.
If your matrix is densely stored but mathematically sparse, i.e. you're using standard numpy ndarrays to store it, then you won't gain any speed by implementing anything in pure Python. The theoretical complexity gains will be outweighed by the practical slowness of Python compared to highly optimized native inversion routines.
Inverting a sparse matrix usually destroys the sparsity anyway. Also, you never invert a matrix if you can avoid it at all! For a sparse matrix, solving the linear equation system Ax = b, with A your matrix and b a known vector, for x, is so much faster than computing A⁻¹. So,
I'm currently working with a least-squares algorithm in Python for some geodetic calculations.
since least squares doesn't need the inverse matrix, simply don't calculate it, ever. The point of LS is finding a solution that's as close as it gets, even if your matrix isn't invertible – which can very well be the case for sparse matrices!
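To make that concrete, here is a minimal sketch of solving with scipy.sparse.linalg instead of ever forming the inverse (the matrix A here is a random placeholder, not your geodetic system):

import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve, lsqr

# Toy sparse system standing in for the real one
A = (sparse.random(1000, 1000, density=0.01) + sparse.eye(1000)).tocsr()
b = np.random.rand(1000)

x = spsolve(A, b)     # direct sparse solve, no explicit inverse anywhere
x_ls = lsqr(A, b)[0]  # iterative least-squares solve, works even if A is singular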

Performing UMAP dimension reduction on inconsistently shaped data - python

First question; I will do my best to be as clear as possible.
If I can provide UMAP with a distance function that also outputs a gradient or some other relevant information, can I apply UMAP to non-traditional looking data? (I.e., a data set with points of inconsistent dimension, data points that are non-uniformly sized matrices, etc.) The closest I have gotten to finding something that looks vaguely close to my question is in the documentation here (https://umap-learn.readthedocs.io/en/latest/embedding_space.html), but this seems to be sort of the opposite process, and as far as I can tell still supposes you are starting with tuple-based data of uniform dimension.
I'm aware that one way around this is just to calculate a full pairwise distance matrix ahead of time and give that to UMAP, but from what I understand of the way UMAP is coded, it only performs a subset of all possible distance calculations, and is thus much faster for the same amount of data than if I were to take the full pre-calculation route.
I am working in python3, but if there is an implementation of UMAP dimension reduction in some other environment that permits this, I would be willing to make a detour in my workflow to obtain this greater flexibility with incoming data types.
Thank you.
Algorithmically this is quite possible, but in practice most implementations do not support anything other than fixed dimension vectors. If computing the all pairs distances is not tractable another option is to try to find a way to featurize or vectorize the data in a way that will allow for easy distance computations. This is, of course, not always possible. The final option is to implement things yourself, but this requires handling the nearest neighbour search, which is likely a non-trivial coding project in and of itself.
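If the all-pairs route does turn out to be tractable, a minimal sketch using UMAP's precomputed-metric mode looks like this (my_distance is a made-up stand-in for whatever metric actually fits your objects):

import numpy as np
import umap  # the umap-learn package

# 100 toy objects of inconsistent dimension
objects = [np.random.rand(np.random.randint(2, 6)) for _ in range(100)]

def my_distance(a, b):
    # Hypothetical metric: compare the overlapping prefix, penalize length mismatch
    n = min(len(a), len(b))
    return np.linalg.norm(a[:n] - b[:n]) + abs(len(a) - len(b))

# Full pairwise distance matrix, computed up front
D = np.array([[my_distance(a, b) for b in objects] for a in objects])
embedding = umap.UMAP(metric="precomputed").fit_transform(D)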

How to find dot product of two very large matrices to avoid memory error?

I am trying to learn ML using Kaggle datasets. In one of the problems (using logistic regression), the input and parameter matrices are of size (1110001, 8) and (2122640, 8) respectively.
I am getting a memory error while computing their product in Python. I guess this would be the same in any language, since the result is simply too big. My question is: how do they multiply matrices in real-life ML implementations (since they would usually be this big)?
Things bugging me:
Some people on SO have suggested calculating the dot product in parts and then combining them. But even then the resulting matrix would still be too big for RAM (9.42TB? in this case).
And if I write it to a file, wouldn't it be too slow for optimization algorithms to read from the file while minimizing the function?
Even if I do write it to a file, how would fmin_bfgs (or any optimization function) read from it?
Also, the Kaggle notebook shows only 1GB of storage available. I don't think anyone would allow TBs of storage space.
In my input matrix, many rows have similar values for some columns. Can I use that to my advantage to save space (like a sparse matrix does for zeros)?
Can anyone point me to a real-life sample implementation of such cases? Thanks!
I have tried many things. I will mention them here, in case anyone needs them in the future:
I had already cleaned up the data, e.g. removing duplicates and irrelevant records depending on the given problem.
I stored the large matrices, which hold mostly 0s, as sparse matrices.
I implemented gradient descent using the mini-batch method instead of the plain old batch method (theta.T dot X), as sketched below.
Now everything is working fine.
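For reference, a minimal sketch of what that mini-batch setup can look like for logistic regression, keeping the design matrix sparse so no giant dense product is ever formed (minibatch_logistic is an illustrative name and the hyperparameters are placeholders):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def minibatch_logistic(X, y, lr=0.1, batch_size=1024, epochs=5):
    """X: scipy.sparse CSR matrix (n_samples, n_features); y: 0/1 labels."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        order = np.random.permutation(n)  # shuffle rows each epoch
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            Xb, yb = X[idx], y[idx]  # small slice; X itself stays sparse
            grad = Xb.T @ (sigmoid(Xb @ theta) - yb) / len(idx)
            theta -= lr * grad
    return theta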

How to assemble large sparse matrices effectively in python/scipy

I am working on an FEM project using Scipy. Now my problem is that the assembly of the sparse matrices is too slow. I compute the contribution of every element in small dense matrices (one for each element). For the assembly of the global matrices I loop over all the small dense matrices and set the matrix entries the following way:
[i,j] = someList[k][l]
Mglobal[i,j] = Mglobal[i,j] + Mlocal[k,l]
Mglobal is a lil_matrix of appropriate size; someList maps the indexing variables.
Of course this is rather slow and consumes most of the matrix assembly time. Is there a better way to assemble a large sparse matrix from many small dense matrices? I tried scipy.weave, but it doesn't seem to work with sparse matrices.
I posted my response to the scipy mailing list; Stack Overflow is a bit easier to access, so I will post it here as well, albeit a slightly improved version.
The trick is to use the IJV storage format. This is a trio of three arrays, where the first one contains row indices, the second has column indices, and the third has the values of the matrix at those locations. This is the best way to build finite element matrices (or any sparse matrix, in my opinion), as access to this format is really fast (just filling an array).
In scipy this is called coo_matrix; the class takes the three arrays as an argument. It is really only useful for converting to another format (CSR or CSC) for fast linear algebra.
For finite elements, you can estimate the size of the three arrays by something like
size = number_of_elements * number_of_basis_functions**2
so if you have 2D quadratics (6 basis functions per element, hence 36 entries per element matrix) you would do number_of_elements * 36, for example.
This approach is convenient because if you have the local matrices you definitely have the global numbers and entry values: exactly what you need for building the three IJV arrays. Scipy is smart enough to throw out zero entries, so overestimating is fine.
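Put together, the assembly loop might look like this minimal sketch (element_dofs and local_mats are illustrative stand-ins for the question's someList and Mlocal; the sizes are arbitrary):

import numpy as np
from scipy import sparse

n_elements, n_basis, n_dofs = 1000, 6, 600  # e.g. 2D quadratic triangles
element_dofs = np.random.randint(0, n_dofs, size=(n_elements, n_basis))
local_mats = np.random.rand(n_elements, n_basis, n_basis)

size = n_elements * n_basis**2  # preallocate the three IJV arrays
I = np.empty(size, dtype=np.int32)
J = np.empty(size, dtype=np.int32)
V = np.empty(size)

for e in range(n_elements):
    dofs = element_dofs[e]
    s = slice(e * n_basis**2, (e + 1) * n_basis**2)
    I[s] = np.repeat(dofs, n_basis)  # row index of every local entry
    J[s] = np.tile(dofs, n_basis)    # column index of every local entry
    V[s] = local_mats[e].ravel()

# Duplicate (i, j) pairs are summed on conversion – exactly what assembly needs
Mglobal = sparse.coo_matrix((V, (I, J)), shape=(n_dofs, n_dofs)).tocsr()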

Python - efficient representation of pixels and associated values

I'm using python to work with large-ish (approx 2000 x 2000) matrices, where each I, J point in the matrix represents a single pixel.
The matrices themselves are sparse (i.e. a substantial portion of them will have zero values), but when they are updated they tend to be increment operations applied to a large number of adjacent pixels in a rectangular 'block', rather than to random pixels here or there (a property I do not currently use to my advantage...).
Afraid I'm a bit new to matrix arithmetic, but I've looked into a number of possible solutions, including the various flavours of scipy sparse matrices. So far coordinate (COO) matrices seem the most promising.
So for instance where I want to increment one block shape, I'd have to do something along the lines of:
>>> from scipy import sparse
>>> from numpy import array
>>> I = array([0,0,0,0])
>>> J = array([0,1,2,3])
>>> V = array([1,1,1,1])
>>> incr_matrix = sparse.coo_matrix((V,(I,J)),shape=(100,100))
>>> main_matrix += incr_matrix #where main_matrix was previously defined
In the future, I'd like to have a richer pixel value representation in any case (tuples to represent RGB, etc.), something that a numpy array doesn't support out of the box (or perhaps I need to use this).
Ultimately I'll have a number of these matrices that I would need to do simple arithmetic on, and I'd need the code to be as efficient as possible – and distributable, so I'd need to be able to persist and exchange these objects in a small-ish representation without substantial penalties. I'm wondering if this is the right way to go, or should I be looking at rolling my own structures using dicts, etc.?
The general rule is, get the code working first, then optimize if needed...
In this case, use a normal numpy 2000x2000 array, or 2000x2000x3 for RGB. This will be much easier and faster to work with, has only a small memory requirement, and has many other advantages; for example, you can use the standard image processing tools, etc.
Then, if needed, "to persist and exchange these objects", you can just compress them using gzip, pytables, jpeg, or whatever, but there's no need to limit your data manipulation based on storage requirements.
This way you get both faster processing and better compression.
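A minimal sketch of that approach (the file name and block coordinates are arbitrary placeholders):

import numpy as np

main = np.zeros((2000, 2000, 3), dtype=np.uint16)  # RGB value per pixel
main[100:300, 400:600] += 1  # increment a whole rectangular block at once
np.savez_compressed('frame.npz', main=main)  # compact on-disk form for exchange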
I would say, yes, this is the way to go. Definitely better than building something out of dictionaries! When building the array, use a structured array, i.e. define your own dtype:
rgbtype = [('r','uint8'),('g','uint8'),('b','uint8')]
When incrementing your blocks, it will look something like this:
main_matrix['r'][blk_slice] += incr_matrix['r']
main_matrix['g'][blk_slice] += incr_matrix['g']
main_matrix['b'][blk_slice] += incr_matrix['b']
Update:
It looks like you can't do matrix operations with a coo_matrix; they exist simply as a convenient way to populate a sparse matrix. You have to convert them to another (sparse) matrix type before doing the updates (see the scipy documentation).
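So the question's snippet would become something like this minimal sketch, converting to CSR before the arithmetic:

from scipy import sparse
from numpy import array

I = array([0, 0, 0, 0])
J = array([0, 1, 2, 3])
V = array([1, 1, 1, 1])

main_matrix = sparse.csr_matrix((100, 100))  # empty CSR matrix to accumulate into
incr_matrix = sparse.coo_matrix((V, (I, J)), shape=(100, 100)).tocsr()
main_matrix = main_matrix + incr_matrix  # CSR + CSR arithmetic is supported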
You might want to consider looking into a quadtree as an implementation. The quadtree structure is pretty efficient at storing sparse data, and has the added advantage that if you're working with structures composed of lots of blocks of similar data the representation can be very compact. I'm not sure if this will be particularly applicable to what you're doing, since I don't know what you mean by "working in blocks," but it's certainly worth checking out as an alternative sparse matrix implementation.
