For my current project I need to be able to calculate the rank of 64*64 matrices with entries from GF(2). I was wondering if anyone had a good solution.
I've been using pyfinite for this, but it is rather slow since it's a pure python implementation. I've also tried to cythonise the code I've been using, but have had issues due to relying on pyfinite.
My next idea would be to write my own class in Cython, but that seems a bit overkill for what I need.
I need the following functionality:
matrix = GF2Matrix(size=64) # creating a 64*64 matrix
matrix.setRow(i, [1,0,1....,1]) # set row using list
matrix += matrix2 # addition of matrices
rank(matrix) # then computing the rank
Thanks for any ideas.
One way to efficiently represent a matrix over GF(2) is to store the rows as integers, interpreting each integer as a bit-string. So for example, the 4-by-4 matrix
[0 1 1 0]
[1 0 1 1]
[0 0 1 0]
[1 0 0 1]
(which has rank 3) could be represented as a list [6, 13, 4, 9] of integers. Here I'm thinking of the first column as corresponding to the least significant bit of the integer, and the last to the most significant bit, but the reverse convention would also work.
With this representation, row operations can be performed efficiently using Python's bitwise integer operations: ^ for addition, & for multiplication.
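For example, the setRow operation from the question could be implemented by packing a list of bits into an integer; a minimal sketch, using the least-significant-bit-first convention above:

def row_to_int(bits):
    # Pack a list of 0/1 entries into an integer: the first entry
    # becomes the least significant bit.
    return sum(bit << i for i, bit in enumerate(bits))

print(row_to_int([0, 1, 1, 0]))  # 6, the first row of the example matrix
print(6 ^ 13)                    # 11: adding the first two rows over GF(2) is just XOR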
Then you can compute the rank using a standard Gaussian elimination approach.
Here's some reasonably efficient code. Given a collection rows of nonnegative integers representing a matrix as above, we repeatedly remove the last row in the list, and then use that row to eliminate all 1 entries from the column corresponding to its least significant bit. If the row is zero then it has no least significant bit and doesn't contribute to the rank, so we simply discard it and move on.
def gf2_rank(rows):
    """
    Find rank of a matrix over GF2.

    The rows of the matrix are given as nonnegative integers, thought
    of as bit-strings.

    This function modifies the input list. Use gf2_rank(rows.copy())
    instead of gf2_rank(rows) to avoid modifying rows.
    """
    rank = 0
    while rows:
        pivot_row = rows.pop()
        if pivot_row:
            rank += 1
            lsb = pivot_row & -pivot_row
            for index, row in enumerate(rows):
                if row & lsb:
                    rows[index] = row ^ pivot_row
    return rank
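As a quick check on the 4-by-4 example above, which we said has rank 3:

rows = [6, 13, 4, 9]          # the example matrix, one integer per row
print(gf2_rank(rows.copy()))  # 3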
Let's run some timings for random 64-by-64 matrices over GF2. random_matrices is a function to create a collection of random 64-by-64 matrices:
import random

def random_matrix():
    return [random.getrandbits(64) for _ in range(64)]

def random_matrices(count):
    return [random_matrix() for _ in range(count)]
and here's the timing code:
import timeit

count = 1000
number = 10

timer = timeit.Timer(
    setup="ms = random_matrices({})".format(count),
    stmt="[gf2_rank(m.copy()) for m in ms]",
    globals=globals())
print(min(timer.repeat(number=number)) / count / number)
The result printed on my machine (2.7 GHz Intel Core i7, macOS 10.14.5, Python 3.7) is 0.0001984686384, so that's a touch under 200µs for a single rank computation.
200µs is quite respectable for a pure Python rank computation, but in case this isn't fast enough, we can follow your suggestion to use Cython. Here's a Cython function that takes a 1d NumPy array of dtype np.uint64, again thinking of each element of the array as a row of your 64-by-64 matrix over GF2, and returns the rank of that matrix.
# cython: language_level=3, boundscheck=False

from libc.stdint cimport uint64_t, int64_t

def gf2_rank(uint64_t[:] rows):
    """
    Find rank of a matrix over GF2.

    The matrix can have no more than 64 columns, and is represented
    as a 1d NumPy array of dtype `np.uint64`. As before, each integer
    in the array is thought of as a bit-string to give a row of the
    matrix over GF2.

    This function modifies the input array.
    """
    cdef size_t i, j, nrows, rank
    cdef uint64_t pivot_row, row, lsb

    nrows = rows.shape[0]
    rank = 0
    for i in range(nrows):
        pivot_row = rows[i]
        if pivot_row:
            rank += 1
            lsb = pivot_row & -pivot_row
            for j in range(i + 1, nrows):
                row = rows[j]
                if row & lsb:
                    rows[j] = row ^ pivot_row
    return rank
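To call this from Python, the extension needs to be compiled first. Here's a sketch using pyximport, assuming the code above is saved as gf2.pyx (the module name is my own choice, not anything standard):

import numpy as np
import pyximport
pyximport.install()  # compiles gf2.pyx on first import

from gf2 import gf2_rank  # hypothetical module holding the Cython code above

rows = np.array([6, 13, 4, 9], dtype=np.uint64)  # the 4x4 example again
print(gf2_rank(rows))  # 3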
Running equivalent timings for 64-by-64 matrices, now represented as NumPy arrays of dtype np.uint64 and shape (64,), I get an average rank-computation time of 7.56µs, over 25 times faster than the pure Python version.
I wrote a Python package, galois, that extends NumPy arrays over Galois fields. Linear algebra on Galois field matrices is one of the intended use cases. The package is written in Python but JIT-compiled using Numba, so it is quite fast, and most linear algebra routines are also compiled. (One exception: as of 08/11/2021, the row reduction routine hasn't been JIT compiled, but that could be added.)
Here is an example using the galois library to do what you are describing.
Create a GF(2) array class and create an explicit array and a random array.
In [1]: import numpy as np
In [2]: import galois
In [3]: GF = galois.GF(2)
In [4]: A = GF([[0, 0, 1, 0], [0, 1, 1, 1], [1, 0, 1, 0], [1, 0, 1, 0]]); A
Out[4]:
GF([[0, 0, 1, 0],
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 0, 1, 0]], order=2)
In [5]: B = GF.Random((4,4)); B
Out[5]:
GF([[1, 1, 1, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0]], order=2)
You can update an entire row (as you requested) like this.
In [6]: B[0,:] = [1,0,0,0]; B
Out[6]:
GF([[1, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 0]], order=2)
Matrix arithmetic works with normal binary operators. Here is matrix addition and matrix multiplication.
In [7]: A + B
Out[7]:
GF([[1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 0, 0]], order=2)
In [8]: A @ B
Out[8]:
GF([[1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0]], order=2)
There is an added method to the NumPy arrays called row_reduce() which performs Gaussian elimination on the matrix. You can also call the standard NumPy linear algebra functions on a Galois field array and get the correct result.
In [9]: A.row_reduce()
Out[9]:
GF([[1, 0, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0]], order=2)
In [10]: np.linalg.matrix_rank(A)
Out[10]: 3
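Tying this back to the original 64-by-64 use case, a minimal sketch using the same calls shown above (the += line assumes in-place addition behaves like the binary + demonstrated earlier):

A = GF.Random((64, 64))          # random 64x64 matrix over GF(2)
A[0, :] = [1, 0] * 32            # set a row from a list, as requested
A += GF.Random((64, 64))         # matrix addition over GF(2)
rank = np.linalg.matrix_rank(A)  # rank over GF(2), not over the reals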
Hope this helps! If there is additional functionality desired, please open an issue on GitHub.
Related
First of all, I want to summarize how I arrived at this particular problem. I wanted to create a song recommender using collaborative filtering. But the problem is that I have a very large dataset at hand: 1m rows x 2.2m columns. If my understanding is correct, I need to create a sparse matrix in order to move forward with my idea, since I do not know of anything that can hold a dense matrix of size 1m x 2.2m. Hence, sparse matrix.
Now, since this matrix will only contain 1s or 0s in the cells, I've mapped out which cells should have a 1 if I were to create this hypothetical monstrous matrix. The information I have looks like this:

rows         locations
row1         [56110, 78999, 1508886, 2090010]
row2         [1123, 976554]
...          ...
row1000000   [334555, 2200100]
The problem is that I don't know how to create a sparse matrix using this information. I've checked many sources but couldn't find any viable solution. If you could help me, I would very much appreciate it. Also, if you have any notes on collaborative filtering methods that utilize sparse matrices I would also be very grateful.
There are several ways you could do this. Here is one that creates a csr_matrix, since the data that you show is close to this format. (That docstring has a terse explanation of the csr_matrix attributes data, indices and indptr.) Whether or not this is the best method (for some definition of "best") depends on the actual "raw" form of your data (among other things).
I assume you can put the data that you show in the locations column into a list of lists, called locations. It is important that there is an entry in locations for each row, even if the list is empty. I also assume that the values given in locations are 0-based indices that correspond to the column of the matrix. Here's an example, for an array that has shape (5, 8).
In [23]: locations = [[2, 3], [], [1, 3, 5], [0, 1, 7], [7]]
To form indptr, we compute the cumulative sum of the lengths of the lists, and prepend a 0:
In [28]: lengths = np.array([len(t) for t in locations])
In [29]: lengths
Out[29]: array([2, 0, 3, 3, 1])
In [30]: indptr = np.concatenate(([0], lengths.cumsum()))
In [31]: indptr
Out[31]: array([0, 2, 2, 5, 8, 9])
indices is just the flattened version of locations. Note that sum() in the following is the Python builtin sum() function, not np.sum. That function call concatenates all the lists in locations.
In [32]: indices = sum(locations, start=[])
In [33]: indices
Out[33]: [2, 3, 1, 3, 5, 0, 1, 7, 7]
The data for the array is an array of 1s that is the same length as indices:
In [38]: data = np.ones_like(indices)
We now have all the pieces we need to create a SciPy csr_matrix:
In [39]: from scipy.sparse import csr_matrix
In [40]: A = csr_matrix((data, indices, indptr))
In [41]: A
Out[41]:
<5x8 sparse matrix of type '<class 'numpy.int64'>'
        with 9 stored elements in Compressed Sparse Row format>
In [42]: A.toarray()
Out[42]:
array([[0, 0, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0, 1, 0, 0],
       [1, 1, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 1]])
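As mentioned, there are other ways. For example, here is a sketch that builds the same matrix in COO format from explicit row and column index arrays and then converts to CSR:

import numpy as np
from scipy.sparse import coo_matrix

locations = [[2, 3], [], [1, 3, 5], [0, 1, 7], [7]]
lengths = [len(t) for t in locations]
row = np.repeat(np.arange(len(locations)), lengths)  # one row index per entry
col = np.array(sum(locations, []))                   # flattened column indices
data = np.ones_like(col)
A = coo_matrix((data, (row, col)), shape=(len(locations), 8)).tocsr()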
I want to use sparse matrices to store a "map" of occupied/free space detected by a laser range finder. I think sparse matrices are a good fit to this problem because there is much more free space in the environment (denoted with 0s), than there are obstacles (denoted with 1s, or a float between 0 and 1 for probability).
After acquiring this map, as part of a matching algorithm I want to apply different spatial transformations to it, such as, but not necessarily limited to, rotations and translations.
As a proof of concept for the transformations part, I have this small SciPy sparse matrix defined in coordinates format (coo_matrix) as follows:
import numpy as np
from scipy import sparse
row = np.array([2])
col = np.array([2])
data = np.array([1])
shape = (4, 4)
m = sparse.coo_matrix((data, (row, col)), shape)
m.toarray()
array([[0, 0, 0, 0],
       [0, 0, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0]])
I want to, for example, rotate this matrix 90 degrees counter-clockwise around its center, to get this new matrix:
array([[0, 0, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 0],
       [0, 0, 0, 0]])
What I did was:

1) Get the coordinates for the non-zero point:

point = [m.col[0], m.row[0], 1]
[2, 2, 1]

2) Calculate the transformation matrix corresponding to a) putting the center at (1.5, 1.5) and b) rotating 90 degrees counter-clockwise:

rot = np.array([[0, -1, 3],[1, 0, 0],[0, 0, 1]])
array([[ 0, -1,  3],
       [ 1,  0,  0],
       [ 0,  0,  1]])

3) Apply the transformation by multiplying the point from 1) with the matrix from 2):

rot.dot(point)
array([1, 2, 1])

Which effectively has rotated the point at (2,2) to (1,2) (or to (1,2,1) in homogeneous coordinates, which is the same), with center at (1.5, 1.5).
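Collected into a single runnable sketch (my consolidation of the steps above), this transforms all stored points at once, using the same (col, row, 1) homogeneous convention, and rebuilds a COO matrix from the result:

import numpy as np
from scipy import sparse

m = sparse.coo_matrix((np.array([1]), (np.array([2]), np.array([2]))), (4, 4))

# Homogeneous (col, row, 1) coordinates for every stored point.
points = np.vstack([m.col, m.row, np.ones_like(m.row)])

rot = np.array([[0, -1, 3], [1, 0, 0], [0, 0, 1]])  # 90 deg CCW about (1.5, 1.5)
new = rot.dot(points)

# Rebuild a sparse matrix from the transformed (col, row) coordinates.
m2 = sparse.coo_matrix((m.data, (new[1], new[0])), m.shape)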
Now, this feels rather "manual". Is there a shorter way of applying this transformation? I know there is some quaternion-based rotation support in SciPy, but I'm more interested in a general approach that can include not only rotations but also translations and basically any transformation (SciPy only supports rotations for now anyway).
Is there a more direct way of transforming this matrix without having to manually a) build each non-zero point from its coordinates and b) apply the transform to each point?
Note: I'm using sparse matrices for convenience and storage, but if as an intermediate step it had to be converted to another format, transformed, and converted back to sparse, that's ok too.
Thanks!
I'm researching and trying to implement a Q-learning example. So far, I've been able to follow the code slowly by breaking it apart and figuring out how it works, but I've stumbled upon a tiny snippet whose purpose I can't figure out...
action = np.argmax(q_learning_table[state,:] + np.random.randn(1, 4))
From what I gather, an action is being chosen from the Q-learning table, but only from a specific row of the matrix, whichever row state selects. What I don't understand is the need for the np.random.randn(1, 4).
Locally, I've done the following to try and understand it:
A = np.matrix([[0, 0, 5, 0], [4, 0, 0, 0], [0, 0, 0, 9]])
a = np.argmax(A[2,:] + 100)
print(a)
My understanding is that I should see the result 103 rather than 3 (the location of the 9). So why do I still see 3? What's the purpose of adding 100?
The goal of the training phase of Q-learning is to create a Q-table that represents an optimal policy, i.e., a table that accurately predicts the cumulative reward for each potential action at a given state.
During training, it is necessary to introduce random action, so that the learner will be encouraged to explore the available state space and gain new experience. Without this randomness, the learner will quickly converge to a policy that is sub-optimal, because it will continually choose the same actions based on a limited amount of experience.
In your example, the np.random.randn() call introduces this randomness: it adds noise drawn from the standard normal distribution. The np.argmax() call then returns the index of the maximum value in the array, which here is the action whose noise-perturbed reward estimate is largest.
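As a tiny illustration (with a made-up Q-table row of all-equal values), the noise alone breaks ties between actions, so repeated calls explore different actions:

import numpy as np

q_row = np.zeros(4)  # all four actions currently look equally good
for _ in range(5):
    # argmax of (Q-values + Gaussian noise): the chosen action varies
    print(np.argmax(q_row + np.random.randn(4)))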
It's most likely random noise added to encourage exploration, so that Q-learning won't stick to a single random good solution and will keep trying to find a possibly better one.
Furthermore, np.argmax(x) returns the index of the largest element in the array. Not the value. That's np.max(x).
# Largest value is at index 2
np.argmax([1,3,9,4,5,6,3]) -> 2
# Largest value is 9
np.max([1,3,9,4,5,6,3]) -> 9
In [12]: A = np.array([[0, 0, 5, 0], [4, 0, 0, 0], [0, 0, 0, 9]])
In [13]: A
Out[13]:
array([[0, 0, 5, 0],
       [4, 0, 0, 0],
       [0, 0, 0, 9]])
argmax returns the index of the largest item in the array:
In [14]: np.argmax(A)
Out[14]: 11
In [15]: A.ravel()
Out[15]: array([0, 0, 5, 0, 4, 0, 0, 0, 0, 0, 0, 9])
Without axis it treats the array as 1d. With axis it looks by row or column:
In [16]: np.argmax(A, axis=0)
Out[16]: array([1, 0, 0, 2], dtype=int32)
In [17]: np.argmax(A, axis=1)
Out[17]: array([2, 0, 3], dtype=int32)
Adding a value, 100 or the random array, changes values in the array that argmax sees. Simply adding a scalar doesn't change the location of the maximum value. Adding a random array can change the location.
np.argmax(q_learning_table[state,:] + np.random.randn(1, 4))
is
arr = q_learning_table[state,:] + np.random.randn(1, 4)
np.argmax(arr)
That is, Python evaluates the arguments first, and passes the result to argmax. The math is not done inside argmax. It is done before argmax is even run.
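You can check the scalar case from your experiment directly: shifting every entry by the same amount leaves the index of the maximum unchanged (the random-array case is shown next).

In [20]: np.argmax(A[2,:] + 100)  # [100, 100, 100, 109]: max still at index 3
Out[20]: 3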
Adding a random array to A can change the location of the max:
In [24]: A + np.random.randint(0, 20, A.shape)
Out[24]:
array([[ 2,  2, 10,  3],
       [ 7,  9, 13,  6],
       [ 3, 14, 10, 13]])
In [25]: np.argmax(_)
Out[25]: 9
I have a SciPy csr_matrix (a vector in this case) of 1 column and x rows. In it are float values which I need to convert to the discrete class labels -1, 0 and 1. This should be done with a threshold function which maps the float values to one of these 3 class labels.
Is there no way other than iterating over the elements as described in Iterating through a scipy.sparse vector (or matrix)? I would love to have some elegant way to just somehow map(thresholdfunc()) on all elements.
Note that while it is of type csr_matrix, it isn't actually sparse as it's just the return of another function where a sparse matrix was involved.
If you have an array, you can discretize based on some condition with the np.where function. e.g.:
>>> import numpy as np
>>> x = np.arange(10)
>>> np.where(x < 5, 0, 1)
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
The syntax is np.where(BOOLEAN_ARRAY, VALUE_IF_TRUE, VALUE_IF_FALSE).
You can chain together two where statements to have multiple conditions:
>>> np.where(x < 3, -1, np.where(x > 6, 0, 1))
array([-1, -1, -1, 1, 1, 1, 1, 0, 0, 0])
To apply this to your data in a CSR or CSC sparse matrix, you can use the .data attribute, which gives you access to the internal array containing all the stored (nonzero) entries of the sparse matrix. One caveat: implicit zeros are not in .data, so they are not remapped; in the example below, the leading 0 of x is never stored and therefore stays 0 instead of becoming -1. For example:
>>> from scipy import sparse
>>> mat = sparse.csr_matrix(x.reshape(10, 1))
>>> mat.data = np.where(mat.data < 3, -1, np.where(mat.data > 6, 0, 1))
>>> mat.toarray()
array([[ 0],
       [-1],
       [-1],
       [ 1],
       [ 1],
       [ 1],
       [ 1],
       [ 0],
       [ 0],
       [ 0]])
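An alternative to chaining where calls is np.select, which takes a list of boolean conditions, a matching list of values, and a default; a sketch applying it to the same data:

>>> mat2 = sparse.csr_matrix(x.reshape(10, 1))
>>> mat2.data = np.select([mat2.data < 3, mat2.data > 6], [-1, 0], default=1)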
I want to do something similar to here (in Python):
How to convert a column or row matrix to a diagonal matrix in Python?
that is:
1) set all elements of matrix A onto the diagonal of matrix B (all other elements of B should be 0), and
2) after performing some operation on B, recreate matrix A by taking the elements off B's diagonal, in the same order as in the first step, and putting them back in A.
Can you not just unravel your matrix onto the diagonal of another?
In [29]: import numpy as np
In [30]: a = np.array([[1,2],[3,4]])
In [31]: b = np.diag(a.ravel())
In [32]: b
Out[32]:
array([[1, 0, 0, 0],
       [0, 2, 0, 0],
       [0, 0, 3, 0],
       [0, 0, 0, 4]])
Then, to go back:
In [33]: b.diagonal().reshape((2,2))
Out[33]:
array([[1, 2],
       [3, 4]])
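The round trip generalizes to any 2-D shape, since the reshape only needs the original array's shape; a small sketch:

import numpy as np

def to_diag(a):
    # Place the flattened entries of a on the diagonal of a larger matrix.
    return np.diag(a.ravel())

def from_diag(b, shape):
    # Read the diagonal back off and restore the original shape.
    return b.diagonal().reshape(shape)

a = np.array([[1, 2, 3], [4, 5, 6]])
assert (from_diag(to_diag(a), a.shape) == a).all()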