Pythonically compare indices of two arrays with permuted rows in numpy [duplicate] - python

This question already has an answer here:
How can I efficiently map each pixel of a three channel image to one channel?
(1 answer)
Closed 4 years ago.
I have two identically-sized numpy ndarrays with permuted rows:
import numpy as np
a = np.array([[1,2,3],
              [4,5,6],
              [7,8,9],
              [10,11,12]])
b = np.array([[7,8,9],
              [10,11,12],
              [1,2,3],
              [4,5,6]])
I want a function that returns, for each row of the first array, the index of that row in the second array. For example:
compare_row_indices(a,b)
would return
[2,3,0,1] # 0-based indexing
What is the most pythonic way to implement this function?

Maybe not the best possible way, but this seems to work (breaking it down to multiple steps for easier visualization):
>>> cmp = a[:, None] == b
>>> cmp
array([[[False, False, False],
[False, False, False],
[ True, True, True],
[False, False, False]],
[[False, False, False],
[False, False, False],
[False, False, False],
[ True, True, True]],
[[ True, True, True],
[False, False, False],
[False, False, False],
[False, False, False]],
[[False, False, False],
[ True, True, True],
[False, False, False],
[False, False, False]]])
>>> eq = np.all(cmp, axis=-1)
>>> eq
array([[False, False, True, False],
[False, False, False, True],
[ True, False, False, False],
[False, True, False, False]])
>>> np.argwhere(eq)
array([[0, 2],
[1, 3],
[2, 0],
[3, 1]])
>>> np.argwhere(eq)[:, 1]
array([2, 3, 0, 1])
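The steps above can be folded into the compare_row_indices function the question asks for. This is only a sketch: it assumes every row of a occurs exactly once in b, and it takes argmax over the boolean matrix instead of the argwhere call shown in the answer.

```python
import numpy as np

def compare_row_indices(a, b):
    # eq[i, j] is True when row i of a equals row j of b
    eq = (a[:, None, :] == b[None, :, :]).all(axis=-1)
    # argmax over a boolean row gives the position of its first True;
    # this assumes every row of a appears somewhere in b
    return eq.argmax(axis=1)

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
b = np.array([[7, 8, 9], [10, 11, 12], [1, 2, 3], [4, 5, 6]])
result = compare_row_indices(a, b)
print(result)
```

The intermediate comparison has shape (len(a), len(b), n_columns), so memory grows with the product of the two row counts; for large arrays a different strategy may be preferable.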

Related

How can I perform the "reverse" of numpy argwhere? [duplicate]

This question already has an answer here:
Replace 2D numpy array elements based on 2D indexes [duplicate]
(1 answer)
Closed 12 months ago.
Suppose I have a boolean numpy array, and I perform np.argwhere() on it. Is there any way to easily and efficiently do the reverse operation? In other words, given the final shape of a, and the results of argwhere(), how can I find a? I've tried to use the argwhere results together with an array full of False, but can't figure out how to do it. Maybe somehow use np.where()?
>>> a = np.array([[False, True, False, True, False],
[False, False, True, False, False]])
>>> results = np.argwhere(a)
>>> results
array([[0, 1],
[0, 3],
[1, 2]], dtype=int64)
>>> recover_a = np.full(shape=a.shape, fill_value=False) # I am
>>> # guessing I could start here then do something...
Use results columns as indices to update the value in recover_a:
recover_a[results[:,0], results[:,1]] = True
recover_a
# array([[False, True, False, True, False],
# [False, False, True, False, False]])
In [233]: a = np.array([[False, True, False, True, False], [False, False, True,
...: False, False]])
In [234]: np.argwhere(a)
Out[234]:
array([[0, 1],
[0, 3],
[1, 2]])
In [235]: np.nonzero(a)
Out[235]: (array([0, 0, 1]), array([1, 3, 2]))
argwhere is just np.transpose(np.nonzero(a)). nonzero returns a tuple of arrays, one per dimension; argwhere returns a 2d array with those arrays arranged as columns.
The nonzero/where result is better for indexing, since it is a tuple of indices.
In [236]: res = np.zeros(a.shape, bool)
In [237]: res[np.nonzero(a)] = True
In [238]: res
Out[238]:
array([[False, True, False, True, False],
[False, False, True, False, False]])
In [239]: a[np.nonzero(a)]
Out[239]: array([ True, True, True])
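Since argwhere output is the transpose of the nonzero tuple, transposing it back and wrapping it in a tuple should work for any number of dimensions. A minimal sketch, where reverse_argwhere is a hypothetical helper name and not a NumPy function:

```python
import numpy as np

def reverse_argwhere(indices, shape):
    # indices: (n_hits, ndim) integer array, as returned by np.argwhere
    out = np.zeros(shape, dtype=bool)
    # indices.T has one row per dimension; the tuple makes NumPy treat
    # them as per-axis index arrays, like the output of np.nonzero
    out[tuple(indices.T)] = True
    return out

a = np.array([[False, True, False, True, False],
              [False, False, True, False, False]])
recovered = reverse_argwhere(np.argwhere(a), a.shape)

# Same round trip on a 3-D array with an arbitrary pattern
c = np.arange(24).reshape(2, 3, 4) % 5 == 0
recovered_3d = reverse_argwhere(np.argwhere(c), c.shape)
print(np.array_equal(recovered, a), np.array_equal(recovered_3d, c))
```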

How can I use numpy array elements as indices to assign values for another numpy array

I have following problem, which I want to solve using numpy array elements.
The problem is:
Matrix = np.zeros((4*4), dtype = bool) which gives this 2D matrix.
Matrix = [[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False]]
Let us suppose that we have another array a = np.array([0,1], [2,1], [3,3])
a = [[0, 1],
[2, 1],
[3, 3]]
My question is: how can I use the elements of the a array as indices to fill my matrix with True values? The output should look like this:
Matrix = [[False, True, False, False], # [0, 1]
[False, False, False, False],
[False, True, False, False], # [2, 1]
[False, False, False, True]] # [3, 3]
import numpy as np
Matrix = np.zeros((4*4), dtype = bool).reshape(4,4)
a = [[0, 1],
[2, 1],
[3, 3]]
Unroll them into a proper pair of indexing arrays for a 2d array
a = ([x[0] for x in a], [x[1] for x in a])
Matrix[a] = True
>>> Matrix
array([[False, True, False, False],
[False, False, False, False],
[False, True, False, False],
[False, False, False, True]])
Simple way to make the (4,4) bool array:
In [390]: arr = np.zeros((4,4), dtype = bool)
In [391]: arr
Out[391]:
array([[False, False, False, False],
[False, False, False, False],
[False, False, False, False],
[False, False, False, False]])
Proper syntax for making a:
In [392]: a = np.array([[0,1], [2,1], [3,3]])
In [393]: a
Out[393]:
array([[0, 1],
[2, 1],
[3, 3]])
Use the 2 columns of a as indices for the 2 dimensions of arr:
In [394]: arr[a[:,0],a[:,1]]=True
In [395]: arr
Out[395]:
array([[False, True, False, False],
[False, False, False, False],
[False, True, False, False],
[False, False, False, True]])
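It may help to see why the columns have to be split out. The sketch below compares indexing with the (3, 2) array directly against indexing with one array per axis; the names wrong and right are only illustrative.

```python
import numpy as np

a = np.array([[0, 1], [2, 1], [3, 3]])

# Passing the (3, 2) array directly indexes only the first axis, so every
# row whose number appears anywhere in a (here 0, 1, 2 and 3) is set whole
wrong = np.zeros((4, 4), dtype=bool)
wrong[a] = True

# One index array per axis pairs each row number with its column number
right = np.zeros((4, 4), dtype=bool)
right[a[:, 0], a[:, 1]] = True

print(wrong.sum(), right.sum())
```

With this particular a, the direct form sets all 16 entries, while the per-axis form sets only the three requested positions.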

Fill mask efficiently based on start indices

I have a 2D array (for this example, actually can be ND), and I would like to create a mask for it that masks the end of each row. For example:
np.random.seed(0xBEEF)
a = np.random.randint(10, size=(5, 6))
mask_indices = np.argmax(a, axis=1)
I would like to convert mask_indices to a boolean mask. Currently, I can't think of a better way than
mask = np.zeros(a.shape, dtype=bool)
for r, m in enumerate(mask_indices):
    mask[r, m:] = True
So for
a = np.array([[6, 5, 0, 2, 1, 2],
[8, 1, 3, 7, 1, 9],
[8, 7, 6, 7, 3, 6],
[2, 7, 0, 3, 1, 7],
[5, 4, 0, 7, 6, 0]])
and
mask_indices = np.array([0, 5, 0, 1, 3])
I would like to see
mask = np.array([[ True, True, True, True, True, True],
[False, False, False, False, False, True],
[ True, True, True, True, True, True],
[False, True, True, True, True, True],
[False, False, False, True, True, True]])
Is there a vectorized form of this operation?
In general, I would like to be able to do this across all the dimensions besides the one that defines the index points.
I. Ndim array-masking along last axis (rows)
For n-dim array to mask along rows, we could do -
def mask_from_start_indices(a, mask_indices):
    r = np.arange(a.shape[-1])
    return mask_indices[...,None]<=r
Sample run -
In [177]: np.random.seed(0)
...: a = np.random.randint(10, size=(2, 2, 5))
...: mask_indices = np.argmax(a, axis=-1)
In [178]: a
Out[178]:
array([[[5, 0, 3, 3, 7],
[9, 3, 5, 2, 4]],
[[7, 6, 8, 8, 1],
[6, 7, 7, 8, 1]]])
In [179]: mask_indices
Out[179]:
array([[4, 0],
[2, 3]])
In [180]: mask_from_start_indices(a, mask_indices)
Out[180]:
array([[[False, False, False, False, True],
[ True, True, True, True, True]],
[[False, False, True, True, True],
[False, False, False, True, True]]])
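As a sanity check, the broadcasting version can be compared against the explicit loop on the 2D example from the question. This sketch repeats the function so that it runs on its own; loop_mask and vectorized are illustrative names.

```python
import numpy as np

def mask_from_start_indices(a, mask_indices):
    r = np.arange(a.shape[-1])
    # Each start index is compared against every position along the last axis
    return mask_indices[..., None] <= r

a = np.array([[6, 5, 0, 2, 1, 2],
              [8, 1, 3, 7, 1, 9],
              [8, 7, 6, 7, 3, 6],
              [2, 7, 0, 3, 1, 7],
              [5, 4, 0, 7, 6, 0]])
mask_indices = np.array([0, 5, 0, 1, 3])

# Reference result built with the loop from the question
loop_mask = np.zeros(a.shape, dtype=bool)
for row, start in enumerate(mask_indices):
    loop_mask[row, start:] = True

vectorized = mask_from_start_indices(a, mask_indices)
print(np.array_equal(vectorized, loop_mask))
```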
II. Ndim array-masking along generic axis
For n-dim arrays masking along a generic axis, it would be -
def mask_from_start_indices_genericaxis(a, mask_indices, axis):
    r = np.arange(a.shape[axis]).reshape((-1,)+(1,)*(a.ndim-axis-1))
    mask_indices_nd = mask_indices.reshape(np.insert(mask_indices.shape,axis,1))
    return mask_indices_nd<=r
Sample runs -
Data array setup :
In [288]: np.random.seed(0)
...: a = np.random.randint(10, size=(2, 3, 5))
In [289]: a
Out[289]:
array([[[5, 0, 3, 3, 7],
[9, 3, 5, 2, 4],
[7, 6, 8, 8, 1]],
[[6, 7, 7, 8, 1],
[5, 9, 8, 9, 4],
[3, 0, 3, 5, 0]]])
Indices setup and masking along axis=1 -
In [290]: mask_indices = np.argmax(a, axis=1)
In [291]: mask_indices
Out[291]:
array([[1, 2, 2, 2, 0],
[0, 1, 1, 1, 1]])
In [292]: mask_from_start_indices_genericaxis(a, mask_indices, axis=1)
Out[292]:
array([[[False, False, False, False, True],
[ True, False, False, False, True],
[ True, True, True, True, True]],
[[ True, False, False, False, False],
[ True, True, True, True, True],
[ True, True, True, True, True]]])
Indices setup and masking along axis=2 -
In [293]: mask_indices = np.argmax(a, axis=2)
In [294]: mask_indices
Out[294]:
array([[4, 0, 2],
[3, 1, 3]])
In [295]: mask_from_start_indices_genericaxis(a, mask_indices, axis=2)
Out[295]:
array([[[False, False, False, False, True],
[ True, True, True, True, True],
[False, False, True, True, True]],
[[False, False, False, True, True],
[False, True, True, True, True],
[False, False, False, True, True]]])
Other scenarios
A. Extending to given end/stop-indices for masking
To extend the solutions for cases when we are given end/stop-indices for masking, i.e. we are looking to vectorize mask[r, :m] = True, we just need to edit the last step of comparison in the posted solutions to the following -
return mask_indices_nd>r
B. Outputting an integer array
There might be cases when we want an int array. In those cases, simply view the output as such: if out is the output of the posted solutions, then out.view('i1') or out.view('u1') gives an int8 or uint8 dtype output respectively.
For other datatypes, we would need to use .astype() for dtype conversions.
C. For index-inclusive masking for stop-indices
For index-inclusive masking, i.e. the index is to be included for stop-indices case, we need to simply include the equality in the comparison. Hence, the last step would be -
return mask_indices_nd>=r
D. For index-exclusive masking for start-indices
This is the case when the start indices are given and those indices are not to be masked; masking starts from the next element and runs until the end. So, following the same reasoning as in the previous section, the last step becomes -
return mask_indices_nd<r
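The four comparison variants can be put side by side on a tiny example. This is a sketch with made-up indices (1 and 3) on rows of length 4; the variable names are illustrative, and the comments give the slice each comparison is meant to reproduce.

```python
import numpy as np

idx = np.array([1, 3])[:, None]   # one index per row, shaped as a column
r = np.arange(4)                  # positions along the masked axis

start_inclusive = idx <= r   # like mask[row, m:] = True
stop_exclusive = idx > r     # like mask[row, :m] = True
stop_inclusive = idx >= r    # like mask[row, :m+1] = True
start_exclusive = idx < r    # like mask[row, m+1:] = True

# Reinterpret the boolean bytes as int8 without copying
as_int8 = start_inclusive.view('i1')
print(start_inclusive, stop_exclusive, stop_inclusive, start_exclusive, sep='\n')
```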
>>> az = np.zeros(a.shape)
>>> az[np.arange(az.shape[0]), mask_indices] = 1
>>> az.cumsum(axis=1).astype(bool) # use n-th dimension for nd case
array([[ True, True, True, True, True, True],
[False, False, False, False, False, True],
[ True, True, True, True, True, True],
[False, True, True, True, True, True],
[False, False, False, True, True, True]])
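The comment about the nd case could be realized along the following lines. This is a sketch under assumptions, not the answerer's code: it swaps the fancy-index assignment for np.put_along_axis and the cumsum for a running logical OR, and the function name is hypothetical.

```python
import numpy as np

def mask_from_start_indices_cumulative(shape, mask_indices, axis):
    out = np.zeros(shape, dtype=bool)
    # Re-insert the reduced axis so the indices line up with out
    idx = np.expand_dims(mask_indices, axis)
    # Mark the single start position in each 1-D slice along axis
    np.put_along_axis(out, idx, True, axis=axis)
    # A running logical OR keeps every position from the start onward True
    return np.logical_or.accumulate(out, axis=axis)

mask_indices = np.array([0, 5, 0, 1, 3])
mask = mask_from_start_indices_cumulative((5, 6), mask_indices, axis=1)
print(mask)
```

Staying in boolean dtype avoids the float intermediate that np.zeros(a.shape) creates by default.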

Vectorized approach for masking individual slices per column

I have a numpy array:
>>> a = np.arange(20).reshape(5, -1)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19]])
I have an array of regions going in order of columns, that I would like to create a boolean mask for:
idx = np.array([[0,2], [1,3], [2,4], [1,4]])
My desired mask for this set of indices is:
array([[ True, False, False, False],
[ True, True, False, True],
[False, True, True, True],
[False, False, True, True],
[False, False, False, False]])
So column 0 has 0:2 masked, column 1 has 1:3 masked, etc. My current approach works, but I am looking for something vectorized:
def foo(shape, idx):
    # shape is the (rows, columns) tuple of the target array
    out = np.zeros(shape, dtype=bool)
    for (i, j), k in zip(idx, np.arange(shape[1])):
        out[i:j, k] = True
    return out
In action:
foo(a.shape, idx)
array([[ True, False, False, False],
[ True, True, False, True],
[False, True, True, True],
[False, False, True, True],
[False, False, False, False]])
Using broadcasting -
In [434]: r = np.arange(a.shape[0])[:,None]
In [435]: (idx[:,0] <= r) & (idx[:,1] > r)
Out[435]:
array([[ True, False, False, False],
[ True, True, False, True],
[False, True, True, True],
[False, False, True, True],
[False, False, False, False]])
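The broadcasting one-liner can be wrapped in a function and checked against the loop from the question. A minimal sketch; mask_from_column_ranges is a hypothetical name, and each row of idx is taken to be a half-open [start, stop) row range for the matching column.

```python
import numpy as np

def mask_from_column_ranges(n_rows, idx):
    # Row numbers as a column vector, so they broadcast against the
    # per-column start/stop values
    r = np.arange(n_rows)[:, None]
    start, stop = idx[:, 0], idx[:, 1]
    return (start <= r) & (r < stop)

a = np.arange(20).reshape(5, -1)
idx = np.array([[0, 2], [1, 3], [2, 4], [1, 4]])
mask = mask_from_column_ranges(a.shape[0], idx)

# Reference result from the explicit per-column loop
loop_mask = np.zeros(a.shape, dtype=bool)
for k, (i, j) in enumerate(idx):
    loop_mask[i:j, k] = True

print(np.array_equal(mask, loop_mask))
```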

what is the difference between [[False] * 3] * 3 and [[False for i in range(3)] for j in range(3)]? [duplicate]

This question already has answers here:
List of lists changes reflected across sublists unexpectedly
(17 answers)
Closed 4 years ago.
I am trying to initialize a 2D array with False values. Both expressions below produce the same result. What is the difference between the two?
Input
[[False] * 3] * 3
[[False for i in range(3)] for j in range(3)]
Output
[[False, False, False], [False, False, False], [False, False, False]]
[[False, False, False], [False, False, False], [False, False, False]]
Let's try this:
>>> a = [[False] * 3] * 3
>>> b = [[False for i in range(3)] for j in range(3)]
>>> a
[[False, False, False], [False, False, False], [False, False, False]]
>>> b
[[False, False, False], [False, False, False], [False, False, False]]
Now let's change a single value (or do we?):
>>> a[0][0] = True
>>> a
[[True, False, False], [True, False, False], [True, False, False]]
Note that a was changed in three positions.
>>> b[0][0] = True
>>> b
[[True, False, False], [False, False, False], [False, False, False]]
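b was not: only b[0][0] changed. That's the difference. [[False] * 3] * 3 repeats three references to the same inner list, while the list comprehension builds a new inner list on each iteration.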
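An identity check with the is operator makes the sharing visible directly. A short sketch using the same two expressions; the variable names shared and independent are only illustrative.

```python
a = [[False] * 3] * 3
b = [[False for i in range(3)] for j in range(3)]

# The outer * copies references, so all three rows of a are one list object
shared = a[0] is a[1] and a[1] is a[2]

# The comprehension evaluates its inner expression once per row,
# producing three distinct list objects
independent = b[0] is not b[1] and b[1] is not b[2]

print(shared, independent)
```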
