I'm trying to find the row in which a 2d array appears in a 3d numpy ndarray. Here's an example of what I mean. Give:
arr = [[[0, 3], [3, 0]],
[[0, 0], [0, 0]],
[[3, 3], [3, 3]],
[[0, 3], [3, 0]]]
I'd like to find all occurrences of:
[[0, 3], [3, 0]]
The result I'd like is:
[0, 3]
I tried to use argwhere but that unfortunately got me nowhere. Any ideas?
Try
np.argwhere(np.all(arr==[[0,3], [3,0]], axis=(1,2)))
How it works:
arr == [[0,3], [3,0]] returns
array([[[ True, True],
[ True, True]],
[[ True, False],
[False, True]],
[[False, True],
[ True, False]],
[[ True, True],
[ True, True]]], dtype=bool)
This is a three dimensional array where the innermost axis is 2. The values at this axis are:
[True, True]
[True, True]
[True, False]
[False, True]
[False, True]
[True, False]
[True, True]
[True, True]
Now with np.all(arr==[[0,3], [3,0]], axis=2) you are checking if both elements on a row are True and its shape will be reduced to (4, 2) from (4, 2, 2). Like this:
array([[ True, True],
[False, False],
[False, False],
[ True, True]], dtype=bool)
You need one more step of reducing as you want both of them to be the same (both [0, 3] and [3, 0]. You can do it either by reducing on the result (now the innermost axis is 1):
np.all(np.all(test, axis = 2), axis=1)
Or you can also do it by giving a tuple for the axis parameter to do the same thing step by step (first innermost, then one step higher). The result will be:
array([ True, False, False, True], dtype=bool)
The 'contains' function in the numpy_indexed package (disclaimer: I am its author) can be used to make queries of this kind. It implements a solution similar to the one offered by Saullo.
import numpy_indexed as npi
test = [[[0, 3], [3, 0]]]
# check which elements of arr are present in test (checked along axis=0 by default)
flags = npi.contains(test, arr)
# if you want the indexes:
idx = np.flatnonzero(flags)
In you can use np.in1d after defining a new data type which will have the memory size of each row in your arr. To define such data type:
mydtype = np.dtype((np.void, arr.dtype.itemsize*arr.shape[1]*arr.shape[2]))
then you have to convert your arr to a 1-D array where each row will have arr.shape[1]*arr.shape[2] elements:
aView = np.ascontiguousarray(arr).flatten().view(mydtype)
You are now ready to look for your 2-D array pattern [[0, 3], [3, 0]] which also has to be converted to dtype:
bView = np.array([[0, 3], [3, 0]]).flatten().view(mydtype)
You can now check the occurrencies of bView in aView:
np.in1d(aView, bView)
#array([ True, False, False, True], dtype=bool)
This mask is easily converted to indices using np.where, for example.
Timings (updated)
THe following function is used to implement this approach:
def check2din3d(b, a):
"""
Return where `b` (2D array) appears in `a` (3D array) along `axis=0`
"""
mydtype = np.dtype((np.void, a.dtype.itemsize*a.shape[1]*a.shape[2]))
aView = np.ascontiguousarray(a).flatten().view(mydtype)
bView = np.ascontiguousarray(b).flatten().view(mydtype)
return np.in1d(aView, bView)
The updated timings considering #ayhan comments showed that this method can be faster the np.argwhere, but the different is not significant and for large arrays like below, #ayhan's approach is considerably faster:
arrLarge = np.concatenate([arr]*10000000)
arrLarge = np.concatenate([arrLarge]*10, axis=2)
pattern = np.ascontiguousarray([[0,3]*10, [3,0]*10])
%timeit np.argwhere(np.all(arrLarger==pattern, axis=(1,2)))
#1 loops, best of 3: 2.99 s per loop
%timeit check2din3d(pattern, arrLarger)
#1 loops, best of 3: 4.65 s per loop
Related
I want to fetch all row indicies of an 2d np.array on the basis of another 2d np.array.
Input:
all_elements = np.array([[1, 1], [2, 2], [3, 3]])
elements = np.array([[1, 1],[2, 2]])
Something like:
idx = np.row_idx(elements, all_elements, axis=0)
output:
[0, 1]
I have been trying to do this with some version of np.where(np.isin(.....)) but i can't get it to work properly.
Any1 have any tips?
IIUC, you might want to use:
np.where((all_elements==elements[:,None]).all(2).any(0))[0]
output: array([0, 1])
explanation:
# compare all elements using broadcasting
(all_elements==elements[:,None])
array([[[ True, True],
[False, False],
[False, False]],
[[False, False],
[ True, True],
[False, False]]])
# all True on the last dimension
(all_elements==elements[:,None]).all(2)
array([[ True, False, False],
[False, True, False]])
# any match per first dimension
(all_elements==elements[:,None]).all(2).any(0)
array([ True, True, False])
so let`s say I have a matrix mat= [[1,2,3,4,5,6],[1,2,3,4,5,6],[1,2,3,4,5,6]]
and a lower bound vector vector_low = [2.1,1.9,1.7] and upper bound vector vector_up = [3.1,3.5,4.1].
How do I get the values in the matrix in between the upper and lower bounds for every row?
Expected Output:
[[3],[2,3],[2,3,4]] (it`s a list #mozway)
alternatively a vector with all of them would also do...
(Extra question: get the values of the matrix that are between the upper and lower bound, but rounded down/up to the next value in the matrix..
Expected Output:
[[2,3,4],[1,2,3,4],[1,2,3,4,5]])
There should be a fast solution without loop, hope someone can help, thanks!
PS: In the end I just want to sum over the list entries, so the output format is not important...
I probably shouldn't indulge you since you haven't provided the code I asked for, but to satisfy my own curiosity, here my solution(s)
Your lists:
In [72]: alist = [[1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6], [1, 2, 3, 4, 5, 6]]
In [73]: low = [2.1,1.9,1.7]; up = [3.1,3.5,4.1]
A utility function:
In [74]: def between(row, l, u):
...: return [i for i in row if l <= i <= u]
and the straightforward list comprehension solution - VERY PYTHONIC:
In [75]: [between(row, l, u) for row, l, u in zip(alist, low, up)]
Out[75]: [[3], [2, 3], [2, 3, 4]]
A numpy solutions requires starting with arrays:
In [76]: arr = np.array(alist)
In [77]: Low = np.array(low)
...: Up = np.array(up)
We can check the bounds with:
In [79]: Low[:, None] <= arr
Out[79]:
array([[False, False, True, True, True, True],
[False, True, True, True, True, True],
[False, True, True, True, True, True]])
In [80]: (Low[:, None] <= arr) & (Up[:,None] >= arr)
Out[80]:
array([[False, False, True, False, False, False],
[False, True, True, False, False, False],
[False, True, True, True, False, False]])
Applying the mask to index arr produces a flat array of values:
In [81]: arr[_]
Out[81]: array([3, 2, 3, 2, 3, 4])
to get values by row, we still have to iterate:
In [82]: [row[mask] for row, mask in zip(arr, Out[80])]
Out[82]: [array([3]), array([2, 3]), array([2, 3, 4])]
For the small case I expect the list approach to be faster. For larger cases [81] will do better - IF we already have arrays. Creating arrays from the lists is not a time-trivial task.
This question already has answers here:
check for identical rows in different numpy arrays
(7 answers)
Closed 1 year ago.
I have two arrays:
A = np.array([[3, 1], [4, 1], [1, 4]])
B = np.array([[0, 1, 5], [2, 4, 5], [2, 3, 5]])
Is it possible to use numpy.isin rowwise for 2D arrays? I want to check if A[i,j] is in B[i] and return this result into C[i,j]. At the end I would get the following C:
np.array([[False, True], [True, False], [False, False]])
It would be great, if this is also doable with the == operator, then I could use it also with PyTorch.
Edit:
I also considered check for identical rows in different numpy arrays. The question is somehow similar, but I am not able to apply its solutions to this slightly different problem.
Not sure that my code solves your problem perfectly. please run it on more test cases to confirm. but i would do smth like i've done in the following code taking advantage of numpys vector outer operations ability (similar to vector outer product). If it works as intended it should work with pytorch as well.
import numpy as np
A = np.array([[3, 1], [4, 1], [1, 4]])
B = np.array([[0, 1, 5], [2, 4, 5], [2, 3, 5]])
AA = A.reshape(3, 2, 1)
BB = B.reshape(3, 1, 3)
(AA == BB).sum(-1).astype(bool)
output:
array([[False, True],
[ True, False],
[False, False]])
Updated answer
Here is a way to do this :
(A == B[..., None]).any(axis=1).astype(bool)
# > array([[False, True],
# [ True, False],
# [False, False]])
Previous answer
You could also do it inside a list comprehension:
[np.isin(a, b) for a,b in zip(A, B)]
# > [array([False, True]), array([ True, False]), array([False, False])]
np.array([np.isin(a, b) for a,b in zip(A, B)])
# > array([[False, True],
# [ True, False],
# [False, False]])
But, as #alex said it defeats the purpose of numpy.
I am given a predefined array
a = [[2 3 4]
[5 6 7]
[8 9 10]]
my issue is converting this array into a boolean array where all even numbers are true. Any help is appreciated!
I would use numpy, and it's pretty straightforward:
import numpy as np
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
a = np.asarray(a)
(a % 2) == 0
Do as follows using Numpy:
import numpy as np
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
aBool=(np.asarray(a)%2==0)
The variable aBool will contain True where there is an Even value and False for an Odd value.
array([[ True, False, True],
[False, True, False],
[ True, False, True]])
This is adapted from the answer found here:
a=np.array([1,2,3,4,5,6])
answer = (a[a%2==0])
print(answer)
which is essentially what Andrew said but without using NumPy
If you're interested in getting booleans and testing for certain conditions within a NumPy array you can find a neat post here
This can be done simply using one line (if you don't want to use numpy):
[[i % 2 == 0 for i in sublist] for sublist in a]
>>> [[True, False, True], [False, True, False], [True, False, True]]
Here, i % 2 denotes the modulus operator, which gives the remainder when the first number is divided by the second. In this case, for even numbers i % 2 = 0, and for odd numbers i % 2 = 1. The two = signs means that the expression is evaluated as a boolean.
The two for loops iterate over each list inside of the 2D list.
This can be extended if you find this format easier to understand, but it essentially does the same thing:
>>> newlist = []
>>> for sublist in a:
partial_list = []
for i in sublist:
partial_list.append(i % 2 == 0)
newlist.append(partial_list)
>>> newlist
[[True, False, True], [False, True, False], [True, False, True]]
Why not to mention negations:
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
>>> ~np.array(a)%2
array([[1, 0, 1],
[0, 1, 0],
[1, 0, 1]], dtype=int32)
This is a boolean form:
>>> ~(np.array(a)%2).astype(bool)
array([[ True, False, True],
[False, True, False],
[ True, False, True]])
I have a numpy matrix and want to compare every columns to a given array, like:
M = np.array([1,2,3,3,2,1,1,3,2]).reshape((3,3)).T
v = np.array([1,2,3])
Now I want to compare every columns of M with v, i.e. I want a matrix with the first column consisting of True, True, True. A second saying False, True, False. A third True, False, False.
How do I achieve this?
Thanks!
Use broadcasted comparison:
>>> M == v[:, None]
array([[ True, False, True],
[ True, True, False],
[ True, False, False]])
You might consider using np.equal column-wise:
np.array([np.equal(col, v) for col in M.T]).T
it compares the elements of two numpy arrays element-wise. The M.T makes for loop to pop your original M columns as one-dimensional arrays and the final transpose is needed to reverse it.
Here the equal/not_equal functions are described.
Alternatively, you could match each row in the matrix with the given vector using np.apply_along_axis
>>> M
array([[1, 3, 1],
[2, 2, 3],
[3, 1, 2]])
>>> v
array([1, 2, 3])
>>> np.apply_along_axis(lambda x: x==v, 1, M)
array([[ True, False, False],
[False, True, True],
[False, False, False]], dtype=bool)