How do I do masking in PyTorch / Numpy with different dimensions?

How do I do masking in PyTorch / Numpy with different dimensions? - python

I have a mask with a size of torch.Size([20, 1, 199]) and a tensor, reconstruct_output and inputs both with a size of torch.Size([20, 1, 161, 199]).
I want to set reconstruct_output to inputs where the mask is 0. I tried:
reconstruct_output[mask == 0] = inputs[mask == 0]
But I get an error:
IndexError: The shape of the mask [20, 1, 199] at index 2 does not match the shape of the indexed tensor [20, 1, 161, 199] at index 2

We can use advanced indexing here. To obtain the indexing arrays which we want to use to index both reconstruct_output and inputs, we need the indices along its axes where m==0. For that we can use np.where, and use the resulting indices to update reconstruct_output as:
m = mask == 0
i, _, l = np.where(m)
reconstruct_output[i, ..., l] = inputs[i, ..., l]
Here's a small example which I've checked with:
mask = np.random.randint(0,3, (2, 1, 4))
reconstruct_output = np.random.randint(0,10, (2, 1, 3, 4))
inputs = np.random.randint(0,10, (2, 1, 3, 4))
Giving for instance:
print(reconstruct_output)
array([[[[8, 9, 7, 2],
[5, 4, 6, 1],
[1, 4, 0, 3]]],
[[[4, 3, 3, 4],
[0, 9, 9, 7],
[3, 4, 9, 3]]]])
print(inputs)
array([[[[7, 3, 9, 8],
[3, 1, 0, 8],
[0, 5, 4, 8]]],
[[[3, 7, 5, 8],
[2, 5, 3, 8],
[3, 6, 7, 5]]]])
And the mask:
print(mask)
array([[[0, 1, 2, 1]],
[[1, 0, 1, 0]]])
By using np.where to find the indices where there are zeroes in mask we get:
m = mask == 0
i, _, l = np.where(m)
i
# array([0, 1, 1])
l
# array([0, 1, 3])
Hence we'll be replacing the 0th column from the first 2D array and the 1st and 3rd from the second 2D array.
We can now use these arrays to replace along the corresponding axes indexing as:
reconstruct_output[i, ..., l] = inputs[i, ..., l]
Getting:
reconstruct_output
array([[[[7, 9, 7, 2],
[3, 4, 6, 1],
[0, 4, 0, 3]]],
[[[4, 7, 3, 8],
[0, 5, 9, 8],
[3, 6, 9, 5]]]])

Related

Problem involving 'alphabetization' of sets of row elements

Consider a variable setSize (it can take value 2 or 3), and a numpy array v.
The number of columns in v is divisible by setSize. Here's a small sample:
import numpy as np
setSize = 2
# the array spaces are shown to emphasize that the rows
# are made up of sets having, in this case, 2 elements each.
v = np.array([[2,5, 3,5, 1,8],
[4,6, 2,7, 5,9],
[1,8, 2,3, 1,4],
[2,8, 1,4, 3,5],
[5,7, 2,3, 7,8],
[1,2, 4,6, 3,5],
[3,5, 2,8, 1,4]])
PROBLEM: For the rows that have all elements unique, I need to ALPHABETIZE the sets.
For example: set 1,14 would precede set 3,5, which would precede set 5,1.
As a final step, I need to eliminate any duplicated rows that may result.
In this example above, the array rows having indices 1,3,5,and 6 have unique elements,
so these rows must be alphabetized. The other rows are not changed.
Further, the rows v[3] and v[6], after alphabetization, are now identical. One of them may be dropped.
The final output looks like:
v = [[2,5, 3,5, 1,8],
[2,7, 4,6, 5,9],
[1,8, 2,3, 1,4],
[1,4, 2,8, 3,5],
[5,7, 2,3, 7,8],
[1,2, 3,5, 4,6]]
I can identify the rows having unique elements with code like below, but I stuck with the alphabetization code.
s = np.sort(v,axis=1)
v[(s[:,:-1] != s[:,1:]).all(1)]

Assuming you have unsuitable rows dropped with:
s = np.sort(v, axis=1)
idx = (s[:,:-1] != s[:,1:]).all(1)
w = v[idx]
Then you can get orders of each row with np.lexsort on a reshaped array:
w = w.reshape(-1,3,2)
s = np.lexsort((w[:,:,1], w[:,:,0]))
Then you can apply fancy indexing and reshape it back:
rows, orders = np.repeat(np.arange(len(s)), 3), s.flatten()
v[idx] = w[rows, orders].reshape((-1,6))
If you need to drop duplicated rows, you can do it like so:
u, idx = np.unique(v, return_index=True, axis=0)
output = v[np.sort(idx)]
Visualization of process:
Sample run:
>>> s
array([[1, 0, 2],
[1, 0, 2],
[0, 2, 1],
[2, 1, 0]], dtype=int64)
>>> rows
array([0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3])
>>> orders
array([1, 0, 2, 1, 0, 2, 0, 2, 1, 2, 1, 0], dtype=int64)
>>> v[idx]
array([[2, 7, 4, 6, 5, 9],
[1, 4, 2, 8, 3, 5],
[1, 2, 3, 5, 4, 6],
[1, 4, 2, 8, 3, 5]])
>>> v
array([[2, 5, 3, 5, 1, 8],
[2, 7, 4, 6, 5, 9],
[1, 8, 2, 3, 1, 4],
[1, 4, 2, 8, 3, 5],
[5, 7, 2, 3, 7, 8],
[1, 2, 3, 5, 4, 6],
[1, 4, 2, 8, 3, 5]])
>>> output
array([[2, 5, 3, 5, 1, 8],
[2, 7, 4, 6, 5, 9],
[1, 8, 2, 3, 1, 4],
[1, 4, 2, 8, 3, 5],
[5, 7, 2, 3, 7, 8],
[1, 2, 3, 5, 4, 6]])

Indexing a 3D array along 1 axis using a 2D array [duplicate]

This question already has answers here:
Indexing one array by another in numpy
(4 answers)
Closed 3 years ago.
Say that I have a numpy array a with the shape: [z, y, x], and another array b with the shape: [y, x].
Array b contains indices along z that I would like to extract from a for each y and x.
So far I have the following inelegant way of doing this:
from_a = np.full_like(b, np.nan)
for i in range(y):
for j in range(x):
des_ind = b[i,j]
from_a[i,j] = a[des_ind,i,j]
Is there a nice neat pythonic way of doing this?

You could use the array b directly as index for a:
import numpy as np
z, y, x = 3, 4, 5
a = np.random.randint(10, (z, y, x))
# array([[[2, 1, 8, 3, 0], | [[9, 3, 6, 8, 7], | [[6, 6, 2, 1, 4],
# [4, 0, 1, 2, 9], | [5, 9, 0, 7, 1], | [3, 9, 4, 4, 8],
# [0, 0, 7, 2, 6], | [7, 9, 8, 4, 9], | [9, 7, 0, 3, 0],
# [3, 5, 2, 9, 3]], | [6, 8, 9, 5, 9]], | [4, 3, 0, 1, 1]]])
b = np.random.randint(z, size=(y, x))
# array([[1, 1, 1, 0, 2],
# [1, 2, 1, 1, 2],
# [2, 0, 0, 0, 2],
# [0, 0, 1, 0, 0]])
from_a = a[b,
np.arange(y)[:, np.newaxis],
np.arange(x)[np.newaxis, :]]
# array([[9, 3, 6, 3, 4],
# [5, 9, 0, 7, 8],
# [9, 0, 7, 2, 0],
# [3, 5, 9, 9, 3]])
See also advanced indexing for numpy arrays.

numpy select values based on list of indices. Process batch at once [duplicate]

Suppose I have a matrix A with some arbitrary values:
array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
And a matrix B which contains indices of elements in A:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
How do I select values from A pointed by B, i.e.:
A[B] = [[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]]

EDIT: np.take_along_axis is a builtin function for this use case implemented since numpy 1.15. See #hpaulj 's answer below for how to use it.
You can use NumPy's advanced indexing -
A[np.arange(A.shape[0])[:,None],B]
One can also use linear indexing -
m,n = A.shape
out = np.take(A,B + n*np.arange(m)[:,None])
Sample run -
In [40]: A
Out[40]:
array([[2, 4, 5, 3],
[1, 6, 8, 9],
[8, 7, 0, 2]])
In [41]: B
Out[41]:
array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
In [42]: A[np.arange(A.shape[0])[:,None],B]
Out[42]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
In [43]: m,n = A.shape
In [44]: np.take(A,B + n*np.arange(m)[:,None])
Out[44]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])

More recent versions have added a take_along_axis function that does the job:
A = np.array([[ 2, 4, 5, 3],
[ 1, 6, 8, 9],
[ 8, 7, 0, 2]])
B = np.array([[0, 0, 1, 2],
[0, 3, 2, 1],
[3, 2, 1, 0]])
np.take_along_axis(A, B, 1)
Out[]:
array([[2, 2, 4, 5],
[1, 9, 8, 6],
[2, 0, 7, 8]])
There's also a put_along_axis.

I know this is an old question, but another way of doing it using indices is:
A[np.indices(B.shape)[0], B]
output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]

Following is the solution using for loop:
outlist = []
for i in range(len(B)):
lst = []
for j in range(len(B[i])):
lst.append(A[i][B[i][j]])
outlist.append(lst)
outarray = np.asarray(outlist)
print(outarray)
Above can also be written in more succinct list comprehension form:
outlist = [ [A[i][B[i][j]] for j in range(len(B[i]))]
for i in range(len(B)) ]
outarray = np.asarray(outlist)
print(outarray)
Output:
[[2 2 4 5]
[1 9 8 6]
[2 0 7 8]]

Get index of largest element for each submatrix in a Numpy 2D array

I have a 2D Numpy ndarray, x, that I need to split in square subregions of size s. For each subregion, I want to get the greatest element (which I do), and its position within that subregion (which I can't figure out).
Here is a minimal example:
>>> x = np.random.randint(0, 10, (6,8))
>>> x
array([[9, 4, 8, 9, 5, 7, 3, 3],
[3, 1, 8, 0, 7, 7, 5, 1],
[7, 7, 3, 6, 0, 2, 1, 0],
[7, 3, 9, 8, 1, 6, 7, 7],
[1, 6, 0, 7, 5, 1, 2, 0],
[8, 7, 9, 5, 8, 3, 6, 0]])
>>> h, w = x.shape
>>> s = 2
>>> f = x.reshape(h//s, s, w//s, s)
>>> mx = np.max(f, axis=(1, 3))
>>> mx
array([[9, 9, 7, 5],
[7, 9, 6, 7],
[8, 9, 8, 6]])
For example, the 8 in the lower left corner of mx is the greatest element from subregion [[1,6], [8, 7]] in the lower left corner of x.
What I want is to get an array similar to mx, that keeps the indices of the largest elements, like this:
[[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]]
where, for example, the 2 in the lower left corner is the index of 8 in the linear representation of [[1, 6], [8, 7]].
I could do it like this: np.argmax(f[i, :, j, :]) and iterate over i and j, but the speed difference is enormous for large amounts of computation. To give you an idea, I'm trying to use (only) Numpy for max pooling. Basically, I'm asking if there is a faster alternative than what I'm using.

Here's one approach -
# Get shape of output array
m,n = np.array(x.shape)//s
# Reshape and permute axes to bring the block as rows
x1 = x.reshape(h//s, s, w//s, s).swapaxes(1,2).reshape(-1,s**2)
# Use argmax along each row and reshape to output shape
out = x1.argmax(1).reshape(m,n)
Sample input, output -
In [362]: x
Out[362]:
array([[9, 4, 8, 9, 5, 7, 3, 3],
[3, 1, 8, 0, 7, 7, 5, 1],
[7, 7, 3, 6, 0, 2, 1, 0],
[7, 3, 9, 8, 1, 6, 7, 7],
[1, 6, 0, 7, 5, 1, 2, 0],
[8, 7, 9, 5, 8, 3, 6, 0]])
In [363]: out
Out[363]:
array([[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]])
Alternatively, to simplify things, we could use scikit-image that does the heavy work of reshaping and permuting axes for us -
In [372]: from skimage.util import view_as_blocks as viewB
In [373]: viewB(x, (s,s)).reshape(-1,s**2).argmax(1).reshape(m,n)
Out[373]:
array([[0, 1, 1, 2],
[0, 2, 3, 2],
[2, 2, 2, 2]])

argmax on 2 axis for 3-d numpy array

I'd like to obtain a 1D array of indexes from a 3D matrix.
For instance given x = np.random.randint(10, size=(10,3,3)), I'd like to do something like np.argmax(x, axis=(1,2)) just like you can do with np.max, that is, obtain a 1D array of length 10 containing the indexes (0 to 8) of the maximums of each submatrix of size (3,3).
I have not found anything helpful so far and I want to avoid looping on the first dimension (and use np.argmax(x)) as it is quite big.
Cheers!

Reshape to merge those last two axes and then use np.argmax -
idx = x.reshape(x.shape[0],-1).argmax(-1)
out = np.unravel_index(idx, x.shape[-2:])
Sample run -
In [263]: x = np.random.randint(10, size=(4,3,3))
In [264]: x
Out[264]:
array([[[0, 9, 2],
[7, 7, 8],
[2, 5, 9]],
[[1, 7, 2],
[8, 9, 0],
[2, 8, 3]],
[[7, 5, 0],
[7, 1, 6],
[5, 1, 1]],
[[0, 7, 3],
[5, 4, 1],
[9, 8, 9]]])
In [265]: idx = x.reshape(x.shape[0],-1).argmax(-1)
In [266]: np.unravel_index(idx, x.shape[-2:])
Out[266]: (array([0, 1, 0, 2]), array([1, 1, 0, 0]))
If you meant getting the merged index, then its simpler -
x.reshape(x.shape[0],-1).argmax(1)
Sample run -
In [283]: x
Out[283]:
array([[[2, 3, 7],
[8, 1, 0],
[3, 6, 9]],
[[8, 0, 5],
[2, 2, 9],
[9, 0, 9]],
[[1, 9, 2],
[5, 0, 3],
[7, 2, 1]],
[[1, 6, 5],
[2, 3, 7],
[7, 4, 6]]])
In [284]: x.reshape(x.shape[0],-1).argmax(1)
Out[284]: array([8, 5, 1, 5])

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

How do I do masking in PyTorch / Numpy with different dimensions? - python

Related

Problem involving 'alphabetization' of sets of row elements

Indexing a 3D array along 1 axis using a 2D array [duplicate]

numpy select values based on list of indices. Process batch at once [duplicate]

Get index of largest element for each submatrix in a Numpy 2D array

argmax on 2 axis for 3-d numpy array

Categories

Resources