I'm researching and trying to implement a Q-Learning example. So far, I've been able to follow the code slowly by breaking it apart and figuring out how it works, however I've stumbled upon a tiny snippet that I can't figure out why it exists...
action = np.argmax(q_learning_table[state,:] + np.random.randn(1, 4))
From what I gather, an action is being chosen from the Q-Learning table but only from a specific row in the matrix, whatever value state is. What I don't understand is why the need for the np.random.randn(1, 4).
Locally, I've done the following to try and understand it:
A = np.matrix([[0, 0, 5, 0], [4, 0, 0, 0], [0, 0, 0, 9]])
a = np.argmax(A[2,:] + 100)
print(a)
My understanding is that I should see the result 103 rather than 3 (the location of 9). So why do I still see 3? What's the purpose of adding 100?
The goal of the training phase of Q-learning is to create a Q-table that represents an optimal policy, i.e., a table that accurately predicts the cumulative reward for each potential action at a given state.
During training, it is necessary to introduce random actions, so that the learner is encouraged to explore the available state space and gain new experience. Without this randomness, the learner quickly converges to a sub-optimal policy, because it keeps choosing the same actions based on a limited amount of experience.
In your example, the np.random.randn() call introduces this randomness by adding noise drawn from the standard normal distribution. The np.argmax() call then returns the index of the maximum value in the noisy array, i.e., the action whose perturbed reward estimate is highest.
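As a minimal sketch of that selection step (the table shape, seed, and generator here are my assumptions, not from the original code):

import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 16, 4                         # hypothetical sizes
q_learning_table = np.zeros((n_states, n_actions))  # hypothetical Q-table

state = 0
# Gaussian noise perturbs the Q-values before argmax; early in training,
# when the table is mostly zeros, the chosen action is effectively random.
action = np.argmax(q_learning_table[state, :] + rng.standard_normal(n_actions))

Many tutorials also shrink the noise over episodes (for example, dividing it by the episode number), so exploration fades as the table fills in.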
It's most likely random noise added to encourage exploration. It's there so that Q-learning won't stick to the first good solution it happens to find, and will keep trying to discover a possibly better one.
Furthermore, np.argmax(x) returns the index of the largest element in the array. Not the value. That's np.max(x).
# Largest value is at index 2
np.argmax([1,3,9,4,5,6,3]) -> 2
# Largest value is 9
np.max([1,3,9,4,5,6,3]) -> 9
In [12]: A = np.array([[0, 0, 5, 0], [4, 0, 0, 0], [0, 0, 0, 9]])
In [13]: A
Out[13]:
array([[0, 0, 5, 0],
[4, 0, 0, 0],
[0, 0, 0, 9]])
argmax returns the index of the largest item in the array:
In [14]: np.argmax(A)
Out[14]: 11
In [15]: A.ravel()
Out[15]: array([0, 0, 5, 0, 4, 0, 0, 0, 0, 0, 0, 9])
Without axis it treats the array as 1d. With axis it looks by row or column:
In [16]: np.argmax(A, axis=0)
Out[16]: array([1, 0, 0, 2], dtype=int32)
In [17]: np.argmax(A, axis=1)
Out[17]: array([2, 0, 3], dtype=int32)
Adding a value (the scalar 100 or the random array) changes the values that argmax sees. Adding a scalar shifts every entry equally, so it doesn't change the location of the maximum. Adding a random array can change that location.
np.argmax(q_learning_table[state,:] + np.random.randn(1, 4))
is
arr = q_learning_table[state,:] + np.random.randn(1, 4)
np.argmax(arr)
That is, Python evaluates the arguments first, and passes the result to argmax. The math is not done inside argmax. It is done before argmax is even run.
Adding a random array to A can change the location of the max:
In [24]: A + np.random.randint(0,20, A.shape)
Out[24]:
array([[ 2, 2, 10, 3],
[ 7, 9, 13, 6],
[ 3, 14, 10, 13]])
In [25]: np.argmax(_)
Out[25]: 9
Related
I am creating a Sudoku bot in python and I need to create a 3x3 matrix where each value represents a box on the board. The value will be False if there is an instance of value in the box and True if not.
Currently I have
import numpy as np
import numpy.ma as ma

temp_board = ma.masked_where(board == value, board, True)
boxes = np.full((3, 3), False)
for x in range(3):
    for y in range(3):
        boxes[x, y] = not np.any(temp_board.mask[x * 3:(x + 1) * 3, y * 3:(y + 1) * 3])
board is a 9x9 matrix containing numbers 0-9 but value can only equal 1-9
Here is an example of what the output should look like
# input
board = np.array([[4, 0, 9, 0, 7, 2, 0, 1, 3],
[7, 0, 2, 8, 3, 0, 6, 0, 0],
[0, 1, 6, 0, 4, 9, 8, 7, 0],
[2, 0, 0, 1, 0, 0, 0, 6, 0],
[5, 4, 7, 0, 0, 0, 2, 0, 0],
[6, 9, 0, 0, 0, 4, 0, 3, 5],
[8, 0, 3, 4, 0, 0, 0, 0, 6],
[0, 0, 0, 0, 0, 3, 1, 0, 0],
[0, 6, 0, 9, 0, 0, 0, 4, 0]])
value = 9
# output
[[False False True]
[False True True]
[ True False True]]
The method I am using works but is terribly inefficient. I was wondering if there is a faster way to do this.
One way, not very subtle but fast, would be to use a lookup index table:
look=np.zeros((9,9,9), dtype=bool)
look[0,:3,:3]=True
look[1,:3,3:6]=True
look[2,:3,6:]=True
look[3,3:6,:3]=True
look[4,3:6,3:6]=True
look[5,3:6,6:]=True
look[6,6:,:3]=True
look[7,6:,3:6]=True
look[8,6:,6:]=True
def inblock(board, blnum, val):
    return val in board[look[blnum]]
To get your matrix directly, you can then do
~(look*board == value).any(axis=(1,2)).reshape(3,3)
look*board is a 9x9x9 matrix, isolating one block each: (look*board)[k] contains 0 everywhere except in block k, which is a copy of the board.
(look*board == value) is a boolean version of that.
So (look*board == value).any(axis=(1,2)) is an array of 9 values, True iff block k contains value.
Since I use a one-dimensional block array, you reshape to get a 3x3 matrix.
And then negate the result to match your expected output, as checked below.
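Putting those steps together on the question's sample board (a quick check using the look table built above):

blocks = (look * board == value).any(axis=(1, 2))  # True where block k contains value
print(~blocks.reshape(3, 3))
# [[False False  True]
#  [False  True  True]
#  [ True False  True]]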
Note that there are surely more subtle ways to build the look table, for example using your own for loop, slightly modified. But that table only has to be built once.
Alternative
Just to be even more direct (and, frankly, what I was looking for when I started thinking about your question), you can do this:
look = np.zeros((3, 3, 9, 9), dtype=bool)
for i in range(3):
    for j in range(3):
        look[i, j, i*3:(i+1)*3, j*3:(j+1)*3] = True
~np.einsum('ijkl,kl', look, board==value)
The fact that I use a (3,3,9,9) shape here and avoid the reshape is not really the point (the cost is the same; it is just a matter of presentation). I could have done that for the previous solution too.
Nor is the double loop I now use to create the look table; again, I could have done that before. What changes is not look, just the use of einsum. It is not better than my previous solution, but it is what I had in mind at first: using a sort of matrix multiplication.
Timings
~(look*board == value).any(axis=(2,3)) : 18.9 μs
(Note this is my previous solution, adapted to the (3,3,9,9) shape.)
~np.einsum('ijkl,kl', look, board==value) : 13.6 μs
So my second solution is not only the one I wanted; it appears it is also faster.
First of all, I want to summarize how I arrived at this particular problem. I wanted to create a song recommender using the collaborative filtering method. But the problem is that I have a very large dataset at hand: 1m rows x 2.2m columns. If my understanding is correct, I need to create a sparse matrix to move forward with my idea, since I do not know of anything that can hold a matrix of size 1m x 2.2m. Hence, sparse matrix.
Now, since this matrix will only contain 1s or 0s in the cells, I've somehow mapped out which cells should have 1 if I were to create a hypothetical monstrous matrix. The information I have looks like this;
rows          locations
row1          [56110, 78999, 1508886, 2090010]
row2          [1123, 976554]
...           ...
row1000000    [334555, 2200100]
The problem is that I don't know how to create a sparse matrix using this information. I've checked many sources but couldn't find any viable solution. If you could help me, I would very much appreciate it. Also, if you have any notes on collaborative filtering methods that utilize sparse matrices I would also be very grateful.
There are several ways you could do this. Here is one that creates a csr_matrix, since the data that you show is close to this format. (That docstring has a terse explanation of the csr_matrix attributes data, indices and indptr.) Whether or not this is the best method (for some definition of "best") depends on the actual "raw" form of your data (among other things).
I assume you can put the data that you show in the locations column into a list of lists, called locations. It is important that there is an entry in locations for each row, even if the list is empty. I also assume that the values given in locations are 0-based indices that correspond to the column of the matrix. Here's an example, for an array that has shape (5, 8).
In [23]: locations = [[2, 3], [], [1, 3, 5], [0, 1, 7], [7]]
To form indptr, we compute the cumulative sum of the lengths of the lists, and prepend a 0:
In [28]: lengths = np.array([len(t) for t in locations])
In [29]: lengths
Out[29]: array([2, 0, 3, 3, 1])
In [30]: indptr = np.concatenate(([0], lengths.cumsum()))
In [31]: indptr
Out[31]: array([0, 2, 2, 5, 8, 9])
indices is just the flattened version of locations. Note that sum() in the following is the Python builtin sum() function, not np.sum. That function call concatenates all the lists in locations.
In [32]: indices = sum(locations, start=[])
In [33]: indices
Out[33]: [2, 3, 1, 3, 5, 0, 1, 7, 7]
The data for the array is an array of 1s that is the same length as indices:
In [38]: data = np.ones_like(indices)
We now have all the pieces we need to create a SciPy csr_matrix:
In [39]: from scipy.sparse import csr_matrix
In [40]: A = csr_matrix((data, indices, indptr))
In [41]: A
Out[41]:
<5x8 sparse matrix of type '<class 'numpy.int64'>'
with 9 stored elements in Compressed Sparse Row format>
In [42]: A.toarray()
Out[42]:
array([[0, 0, 1, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 1, 0, 1, 0, 0],
[1, 1, 0, 0, 0, 0, 0, 1],
[0, 0, 0, 0, 0, 0, 0, 1]])
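For comparison, here is a sketch of the same construction via COO triplets (my addition, not part of the answer above). For the real use case you would pass shape=(1_000_000, 2_200_000) explicitly so trailing empty rows and columns are kept:

import numpy as np
from scipy.sparse import coo_matrix

locations = [[2, 3], [], [1, 3, 5], [0, 1, 7], [7]]

# One (row, col) pair per stored 1; each row index is repeated once per entry.
row_ind = np.repeat(np.arange(len(locations)), [len(t) for t in locations])
col_ind = np.concatenate([t for t in locations if t])
data = np.ones(len(col_ind), dtype=np.int8)

B = coo_matrix((data, (row_ind, col_ind)), shape=(5, 8)).tocsr()
# B.toarray() matches the csr_matrix A built above.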
Suppose I have a numpy array
a = np.array([0,2,3,4,5,1,9,0,0,7,9,0,0,0]).reshape(7,2)
I want to find the indices of all the places where the minimum element (here 0) occurs in the 2nd column. Using argmin I can only find the index of its first occurrence. How can I do this in Python?
Using np.flatnonzero on a[:, 1]==np.min(a) is the most straightforward way:
In [3]: idxs = np.flatnonzero(a[:, 1]==np.min(a))
In [4]: idxs
Out[4]: array([3, 5, 6])
After you reshaped your array it looks like this:
array([[0, 2],
[3, 4],
[5, 1],
[9, 0],
[0, 7],
[9, 0],
[0, 0]])
You can get all elements that are equal to a given value by using np.where. In your case the following works:
np.where(a.T[-1] == a.min())
# This gives you (array([3, 5, 6]),)
Note that a.min() is the minimum value; a.argmin() would give the index of the minimum, which only coincidentally is also 0 here.
What happens here is that you create a transposed view of the array, which lets you easily access its columns. The term view here means that the array a itself is not changed. This leaves you with:
a.T
array([[0, 3, 5, 9, 0, 9, 0],
[2, 4, 1, 0, 7, 0, 0]])
From this you select the last line (i.e. the last column of a) by using the index -1. Now you have the array
array([2, 4, 1, 0, 7, 0, 0])
on which you can call np.where(condition), which gives you all indices for which the condition is True. In your case the condition is
a.T[-1] == a.min()
which gives you all entries in the selected line of the transposed array that have the same value as np.min(a), which, as you said, is 0 in your case.
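Putting it together (a self-contained check):

import numpy as np

a = np.array([0, 2, 3, 4, 5, 1, 9, 0, 0, 7, 9, 0, 0, 0]).reshape(7, 2)
# Compare against the minimum *value* of the array.
print(np.where(a.T[-1] == a.min()))   # (array([3, 5, 6]),)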
I have an array X:
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
And I wish to find the index of the row of several values in this array:
searched_values = np.array([[4, 2],
[3, 3],
[5, 6]])
For this example I would like a result like:
[0,3,4]
I have a code doing this, but I think it is overly complicated:
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
searched_values = np.array([[4, 2],
[3, 3],
[5, 6]])
result = []
for s in searched_values:
    idx = np.argwhere([np.all((X-s)==0, axis=1)])[0][1]
    result.append(idx)
print(result)
I found this answer for a similar question but it works only for 1d arrays.
Is there a way to do what I want in a simpler way?
Approach #1
One approach would be to use NumPy broadcasting, like so -
np.where((X==searched_values[:,None]).all(-1))[1]
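For the sample arrays this reproduces the expected indices (a quick check):

import numpy as np

X = np.array([[4, 2], [9, 3], [8, 5], [3, 3], [5, 6]])
searched_values = np.array([[4, 2], [3, 3], [5, 6]])
# Broadcast-compare every searched row against every row of X, then keep
# the X-row index wherever an entire row matches.
np.where((X == searched_values[:, None]).all(-1))[1]   # -> array([0, 3, 4])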
Approach #2
A memory-efficient approach would be to convert each row to its linear index equivalent and then use np.in1d, like so -
dims = X.max(0)+1
out = np.where(np.in1d(np.ravel_multi_index(X.T, dims),
                       np.ravel_multi_index(searched_values.T, dims)))[0]
Approach #3
Another memory-efficient approach, using np.searchsorted with the same philosophy of converting to linear index equivalents, would be like so -
dims = X.max(0)+1
X1D = np.ravel_multi_index(X.T,dims)
searched_valuesID = np.ravel_multi_index(searched_values.T,dims)
sidx = X1D.argsort()
out = sidx[np.searchsorted(X1D,searched_valuesID,sorter=sidx)]
Please note that this np.searchsorted method assumes there is a match in X for each row of searched_values; see the guard sketched below if that may not hold.
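One possible guard (my addition, not part of the answer) is to verify each match and flag misses:

pos = np.searchsorted(X1D, searched_valuesID, sorter=sidx)
pos = pos.clip(max=len(X1D) - 1)              # keep indices in bounds
found = X1D[sidx[pos]] == searched_valuesID   # True where a real match exists
out = np.where(found, sidx[pos], -1)          # -1 marks missing rows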
How does np.ravel_multi_index work?
This function gives us linear index equivalents. It accepts a 2D array of n-dimensional indices, set as columns, together with the shape of the n-dimensional grid onto which those indices are to be mapped, and computes the equivalent linear indices.
Let's use the inputs we have for the problem at hand; take input X and note its first row. Since we are trying to convert each row of X into its linear index equivalent, and since np.ravel_multi_index treats each column as one indexing tuple, we need to transpose X before feeding it into the function. Since the number of elements per row in X is 2 here, the n-dimensional grid to be mapped onto is 2D. With 3 elements per row in X, it would have been a 3D grid, and so on.
To see how this function would compute linear indices, consider the first row of X -
In [77]: X
Out[77]:
array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
We have the shape of the n-dimensional grid as dims -
In [78]: dims
Out[78]: array([10, 7])
Let's create the 2-dimensional grid to see how that mapping works and linear indices get computed with np.ravel_multi_index -
In [79]: out = np.zeros(dims,dtype=int)
In [80]: out
Out[80]:
array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
Let's set the first indexing tuple from X, i.e. the first row from X into the grid -
In [81]: out[4,2] = 1
In [82]: out
Out[82]:
array([[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
Now, to see the linear index equivalent of the element just set, let's flatten and use np.where to detect that 1.
In [83]: np.where(out.ravel())[0]
Out[83]: array([30])
This could also be computed by hand using row-major ordering: with dims = (10, 7), the index (4, 2) maps to 4*7 + 2 = 30.
Let's use np.ravel_multi_index and verify those linear indices -
In [84]: np.ravel_multi_index(X.T,dims)
Out[84]: array([30, 66, 61, 24, 41])
Thus, we would have linear indices corresponding to each indexing tuple from X, i.e. each row from X.
Choosing dimensions for np.ravel_multi_index to form unique linear indices
Now, the idea behind considering each row of X as an indexing tuple of an n-dimensional grid, and converting each such tuple to a scalar, is to have unique scalars corresponding to unique tuples, i.e. unique rows in X.
Let's take another look at X -
In [77]: X
Out[77]:
array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
Now, as discussed in the previous section, we are considering each row as an indexing tuple. Within each such tuple, the first element represents the first axis of the n-dim grid, the second element the second axis, and so on until the last element of each row in X. In essence, each column represents one dimension or axis of the grid. If we are to map all elements from X onto the same n-dim grid, we need to consider the maximum stretch of each axis of such a proposed grid. Assuming we are dealing with positive numbers in X, that stretch would be the maximum of each column in X + 1. The + 1 is because Python follows 0-based indexing. So, for example, X[1,0] == 9 would map to the 10th row of the proposed grid, and X[4,1] == 6 would go to the 7th column of that grid.
So, for our sample case, we had -
In [7]: dims = X.max(axis=0) + 1 # Or simply X.max(0) + 1
In [8]: dims
Out[8]: array([10, 7])
Thus, we would need a grid of at least a shape of (10,7) for our sample case. More lengths along the dimensions won't hurt and would give us unique linear indices too.
Concluding remarks: One important thing to note here is that if we have negative numbers in X, we need to add proper offsets along each column of X to make those indexing tuples non-negative before using np.ravel_multi_index.
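A hypothetical sketch of that offsetting (not from the original answer); note the offset must cover both arrays:

offset = np.minimum(X.min(0), searched_values.min(0))  # per-column minimum
X_shifted = X - offset
searched_shifted = searched_values - offset
dims = X_shifted.max(0) + 1   # then proceed as in the approaches above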
Another alternative is to use asvoid (below) to view each row as a single
value of void dtype. This reduces a 2D array to a 1D array, thus allowing you to use np.in1d as usual:
import numpy as np
def asvoid(arr):
    """
    Based on http://stackoverflow.com/a/16973510/190597 (Jaime, 2013-06)
    View the array as dtype np.void (bytes). The items along the last axis are
    viewed as one value. This allows comparisons to be performed which treat
    entire rows as one value.
    """
    arr = np.ascontiguousarray(arr)
    if np.issubdtype(arr.dtype, np.floating):
        # Care needs to be taken here since
        # np.array([-0.]).view(np.void) != np.array([0.]).view(np.void)
        # Adding 0. converts -0. to 0.
        arr += 0.
    return arr.view(np.dtype((np.void, arr.dtype.itemsize * arr.shape[-1])))
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
searched_values = np.array([[4, 2],
[3, 3],
[5, 6]])
idx = np.flatnonzero(np.in1d(asvoid(X), asvoid(searched_values)))
print(idx)
# [0 3 4]
The numpy_indexed package (disclaimer: I am its author) contains functionality for performing such operations efficiently (also uses searchsorted under the hood). In terms of functionality, it acts as a vectorized equivalent of list.index:
import numpy_indexed as npi
result = npi.indices(X, searched_values)
Note that using the 'missing' kwarg, you have full control over the behavior of missing items, and it works for nd-arrays (for instance, stacks of images) as well.
Update: using the same shapes as #Rik, X = [520000, 28, 28] and searched_values = [20000, 28, 28], it runs in 0.8064 s, using missing=-1 to detect and denote entries not present in X.
Here is a pretty fast solution that scales up well, using numpy and hashlib. It can handle large-dimensional matrices or images in seconds; I used it on a 520000 x (28 x 28) array and a 20000 x (28 x 28) array in 2 seconds on my CPU. Note that the resulting indices come back in the order np.unique imposes on the hashes, not the order of searched_values, as the output below shows.
Code:
import numpy as np
import hashlib
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
searched_values = np.array([[4, 2],
[3, 3],
[5, 6]])
#hash using sha1 appears to be efficient
xhash=[hashlib.sha1(row).digest() for row in X]
yhash=[hashlib.sha1(row).digest() for row in searched_values]
z=np.in1d(xhash,yhash)
## Use unique to get unique indices into the in1d results
_,unique=np.unique(np.array(xhash)[z],return_index=True)
##Compute unique indices by indexing an array of indices
idx=np.array(range(len(xhash)))
unique_idx=idx[z][unique]
print('unique_idx=',unique_idx)
print('X[unique_idx]=',X[unique_idx])
Output:
unique_idx= [4 3 0]
X[unique_idx]= [[5 6]
[3 3]
[4 2]]
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
S = np.array([[4, 2],
[3, 3],
[5, 6]])
result = [[i for i,row in enumerate(X) if (s==row).all()] for s in S]
or
result = [i for s in S for i,row in enumerate(X) if (s==row).all()]
if you want a flat list (assuming there is exactly one match per searched value).
Another way is to use the cdist function from scipy.spatial.distance:
from scipy.spatial.distance import cdist

np.nonzero(cdist(X, searched_values) == 0)[0]
Basically, we get the row numbers of X which have distance zero to a row in searched_values, meaning they are equal. This makes sense if you look at rows as coordinates. Note that it computes the full len(X) x len(searched_values) distance matrix, so it can be memory-hungry for large inputs.
I had a similar requirement and the following worked for me:
np.argwhere(np.isin(X, searched_values).all(axis=1))
Be aware that np.isin tests membership elementwise, so this can report a row whose elements all appear somewhere in searched_values even if they never appear together in a single row.
Here's what worked out for me:
def find_points(orig: np.ndarray, search: np.ndarray) -> np.ndarray:
    # Row-match mask per searched point: shape (len(search), len(orig))
    equals = [np.equal(orig, p).all(1) for p in search]
    exists = np.max(equals, axis=1)      # does each searched point occur at all?
    indices = np.argmax(equals, axis=1)  # index of the first match in orig
    indices[~exists] = -1                # mark absent points with -1
    return indices
test:
X = np.array([[4, 2],
[9, 3],
[8, 5],
[3, 3],
[5, 6]])
searched_values = np.array([[4, 2],
[3, 3],
[5, 6],
[0, 0]])
find_points(X, searched_values)
output:
[0,3,4,-1]
I have a matrix A = Matrix([[1, 0, 0, 20], [-1, 1, 0, 0], [-2, 1, 0, 0], [0, -1, 1, 0]]), a sympy object.
I want to know if there is a conflicting row - meaning a row that after i reduce the matrix, all the terms in the row are zero, apart from the rightmost one.
This seems easy to do on paper, but I think I misunderstand sympy.
Basically the output from rref method is not what I expected.
Notice that if we row reduce A with pen and paper, we should get Matrix([[1, 0, 0, 20], [0, 1, 0, 20], [0, 0, 0, 20], [0, 0, 1, 20]]) at a certain point.
So row number 2 is a conflicting row.
However when I use A.rref() I get something else entirely. I get Matrix([[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]) and the list [0, 1, 2, 3].
I don't understand how it reached this result, or how to interpret the list. How can I find the conflicting rows using sympy?
The answer sympy gives is correct. The matrix you reached when reducing manually is not the end of the row-reduction process, which explains the difference between your answer and sympy's.
To continue the row-reduction from your matrix, swap rows 2 and 3 (the third and fourth rows), and you get
matrix([
[ 1, 0, 0, 20],
[ 0, 1, 0, 20],
[ 0, 0, 1, 20],
[ 0, 0, 0, 20]])
Now subtract row 3 (the last row) from each of the other rows, then divide that last row by 20, and we get
matrix([
[ 1, 0, 0, 0],
[ 0, 1, 0, 0],
[ 0, 0, 1, 0],
[ 0, 0, 0, 1]])
which is sympy's answer.
There are multiple ways to interpret this result. One way is to think of a system of 4 linear equations in 3 variables: the last column of the matrix holds the constants on the right side of the equations, while the other columns hold the variable coefficients. Your original matrix represents the equations
x = 20
- x + y = 0
- 2x + y = 0
- y + z = 0
and sympy's row reduction shows this system has the same solutions as
x = 0
y = 0
z = 0
0 = 1
which, of course, has no solutions at all, thanks to the last equation.
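In sympy, this inconsistency can be detected directly from the pivot list that rref() returns; a minimal sketch:

from sympy import Matrix

A = Matrix([[1, 0, 0, 20], [-1, 1, 0, 0], [-2, 1, 0, 0], [0, -1, 1, 0]])
R, pivots = A.rref()
# For an augmented matrix, the system is inconsistent exactly when the
# last (constants) column is a pivot column.
inconsistent = (A.shape[1] - 1) in pivots
print(pivots, inconsistent)   # (0, 1, 2, 3) True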
Also, you seem to have a misunderstanding of what row-reduction can do. You ask "How can I find the conflicting rows using sympy?" and "if there is a conflicting row." Row reduction does not find which row conflicts; it finds whether the rows together conflict. The rref process cannot point to a conflicting row, since it swaps rows as needed to get a non-zero pivot into the proper place, so the rows of the starting and ending matrices do not correspond. Nor is it true that one row conflicts with the others, only that all the rows together conflict. In your matrix, you could remove any one of the first 3 rows and the result would be non-conflicting. (Removing the last row still leaves a conflicting matrix.) So which row can you say conflicts? There usually is not one conflicting row, so rref() or any other method cannot possibly find one.