Select multiple indices in an axis of pytorch tensor - python

My actual problem is in a higher dimension, but I am posting it in a smaller dimension to make it easy to visualize.
I have a tensor of shape (2,3,4):
x = torch.randn(2, 3, 4)
tensor([[[-0.9118,  1.4676, -0.4684, -0.6343],
         [ 1.5649,  1.0218, -1.3703,  1.8961],
         [ 0.8652,  0.2491, -0.2556,  0.1311]],

        [[ 0.5289, -1.2723,  2.3865,  0.0222],
         [-1.5528, -0.4638, -0.6954,  0.1661],
         [-1.8151, -0.4634,  1.6490,  0.6957]]])
From this tensor, I need to select rows given by a list of indices along axis-1.
Example,
indices = torch.tensor([0, 2])
Expected Output:
tensor([[[-0.9118,  1.4676, -0.4684, -0.6343]],

        [[-1.8151, -0.4634,  1.6490,  0.6957]]])
Output Shape: (2,1,4)
Explanation: select the 0th row from x[0] and the 2nd row from x[1] (the row numbers come from indices).
I tried using index_select like this:
torch.index_select(x, 1, indices)
But the problem is that it selects the 0th and 2nd rows for every item in x. It looks like it needs some modification that I have not been able to figure out.

In your case, this is quite straightforward. An easy way to navigate through two dimensions in parallel is to use a range on the first axis and your indexing tensor on the second:
>>> x[range(len(indices)), indices]
tensor([[-0.9118,  1.4676, -0.4684, -0.6343],
        [-1.8151, -0.4634,  1.6490,  0.6957]])
In more general cases though, this would require the use of torch.gather:
First expand indices such that it has enough dimensions:
index = indices[:,None,None].expand(x.size(0), -1, x.size(-1))
Then you can apply the function on x and index and squeeze dim=1:
>>> x.gather(dim=-2, index=index)[:,0]
tensor([[-0.9118,  1.4676, -0.4684, -0.6343],
        [-1.8151, -0.4634,  1.6490,  0.6957]])
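For reference, here is a minimal end-to-end sketch of both approaches (variable names chosen for illustration); the gather-based version keeps the singleton dimension until you squeeze it:

import torch

x = torch.randn(2, 3, 4)
indices = torch.tensor([0, 2])

# Approach 1: advanced indexing with a range over the batch dimension.
out1 = x[torch.arange(x.size(0)), indices]            # shape (2, 4)

# Approach 2: torch.gather along the row axis (dim=-2).
index = indices[:, None, None].expand(x.size(0), 1, x.size(-1))
out2 = x.gather(dim=-2, index=index)                  # shape (2, 1, 4)

print(torch.equal(out1, out2.squeeze(1)))             # True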

Related

Find index of the maximum value in a numpy array

I have a numpy array called prediction, as follows:
array([[3.7839172e-06, 8.0308418e-09, 2.2542761e-06, 5.9392878e-08,
        5.3137046e-07, 1.7033290e-05, 1.7738441e-07, 1.0742254e-03,
        1.8656212e-06, 9.9890006e-01]], dtype=float32)
In order to get the index of the maximum value in this array, I used the following
np.where(prediction==prediction.max())
But the result I am getting also shows index 0:
(array([0], dtype=int64), array([9], dtype=int64))
Does anyone know why it is showing index 0 as well?
Also, how can I get just the index number instead of array([9], dtype=int64)?
Use the built-in function for it:
prediction.argmax()
output:
9
Also, that index 0 is the row number, so the max is at row 0 and column 9.
The prediction array here is two-dimensional. When you call np.where with only a condition, it is the same as calling np.asarray(condition).nonzero(), which returns the indices of the non-zero elements of prediction == prediction.max(); that is a boolean array whose only non-zero element is at (0, 9).
What you are looking for is the argmax function which will give you the index of the maximum value along an axis. You effectively only have one axis (2d but only one row) here so this should be fine.
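As a quick illustration of that equivalence, here is a small sketch with a toy array (the shape and values are made up for the example):

import numpy as np

prediction = np.array([[0.1, 0.2, 0.05, 0.9]])   # toy 2D array, shape (1, 4)

mask = prediction == prediction.max()            # boolean array, True only at (0, 3)
print(np.where(mask))                            # (array([0]), array([3])) -> row and column indices
print(np.asarray(mask).nonzero())                # identical result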
As the other answers mentioned, you have a 2D array, so you end up with two indices. Since the array is just a single row, the first index is always zero. You can bypass this in a number of ways (a short demonstration follows the list):
Use prediction.argmax(). The default axis argument is None, which means operate on a flattened array. Other options that will get you the same result are prediction.argmax(-1) (last axis) and prediction.argmax(1) (second axis). Keep in mind that you will only ever get the index of the first maximum this way. That's fine if you only ever expect to have one, or only need one.
Use np.flatnonzero to get the linear indices similarly to the way you were doing:
np.flatnonzero(prediction == prediction.max())
Use np.nonzero or np.where, but extract the axis you care about:
np.nonzero(prediction == prediction.max())[1]
ravel the array on input:
np.where(prediction.ravel() == prediction.max())
Do the same thing, but with np.squeeze:
np.nonzero(prediction.squeeze() == prediction.max())
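For completeness, a minimal sketch running the options above side by side (the toy array from the earlier sketch stands in for the (1, 10) array in the question):

import numpy as np

prediction = np.array([[0.1, 0.2, 0.05, 0.9]])

print(prediction.argmax())                                    # 3
print(np.flatnonzero(prediction == prediction.max()))         # [3]
print(np.nonzero(prediction == prediction.max())[1])          # [3]
print(np.where(prediction.ravel() == prediction.max()))       # (array([3]),)
print(np.nonzero(prediction.squeeze() == prediction.max()))   # (array([3]),)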

Subsetting a two dimensional tensor with a one dimensional tensor

For each row of a two-dimensional tensor, I want to extract the element whose column index is stored in another one-dimensional tensor.
import torch
test_tensor = torch.tensor([[1, -2, 3], [-2, 7, 4]]).float()
select_tensor = torch.tensor([1, 2])
So in this particular example I would like to get the element in position 1 for the first row (so -2) and the element in position 2 for the second row (so 4).
I tried:
test_tensor[:, select_tensor]
But this selects the elements at positions 1 and 2 for every row. I suspect it is something very simple that I am missing.
You can use torch.gather
import torch
test_tensor = torch.tensor([[1,-2,3], [-2,7,4]]).float()
select_tensor = torch.tensor([1, 2], dtype=torch.int64).view(-1, 1)  # the index tensor must have the same number of dimensions as test_tensor
final_tensor = torch.gather(test_tensor, 1, select_tensor)
final_tensor
output
tensor([[-2.],
        [ 4.]])
Or use .view(-1) to flatten the output tensor: final_tensor.view(-1) will give you tensor([-2., 4.]).
If you're looking for a solution with indexing, you need to index on axis=0 as well; you can do that with torch.arange:
>>> test_tensor = torch.tensor([[1,-2,3], [-2,7,4]])
>>> select_tensor = torch.tensor([1,2])
>>> test_tensor[torch.arange(len(select_tensor)), select_tensor]
tensor([-2, 4])
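A minimal runnable sketch combining both answers (names kept from the question); both give the same values, gather just keeps an extra dimension until it is flattened:

import torch

test_tensor = torch.tensor([[1, -2, 3], [-2, 7, 4]]).float()
select_tensor = torch.tensor([1, 2])

# torch.gather: the index tensor must have the same number of dimensions as test_tensor.
gathered = torch.gather(test_tensor, 1, select_tensor.view(-1, 1)).view(-1)

# Advanced indexing: pair each row index with the corresponding column index.
indexed = test_tensor[torch.arange(len(select_tensor)), select_tensor]

print(torch.equal(gathered, indexed))   # True -> tensor([-2., 4.]) in both cases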

why does numpy array return wrong shape of sub arrays when indexing

An example is shown as follows:
>>> import numpy as np
>>> a=np.zeros((288,512))
>>> x1,x2,y1,y2=0,16,0,16
>>> p=a[x1:x2][y1:y2]
>>> p.shape
(16, 512)
>>> p=a[x1:x2,y1:y2]
>>> p.shape
(16, 16)
I am trying to query a patch from the array, covering rows 0 to 16 and columns 0 to 16. I index the array in two ways and get very different results; a[x1:x2][y1:y2] gives me the wrong one.
Why?
Thx for helping me!!!
When you do a[x1:x2][y1:y2], you are slicing by rows twice. That is, a[x1:x2] gives you an array of shape (16, 512). The second slice in a[x1:x2][y1:y2] is applied to the result of the first one and, since y1:y2 is also 0:16, it returns the same (16, 512) array.
In the second case, when you do a[x1:x2,y1:y2], you are slicing by the two dimensions of your 2-dimensional array.
Important note: If you have a 2-dimensional array and you slice like this:
a = np.zeros((10,15))
a[1:3].shape
Output:
(2, 15)
you will slice only by rows. Your resulting array will have 2 rows and the total number of columns (15 columns). If you want to slice by rows and columns, you will have to use a[1:3, 1:3].
The two methods of indexing you tried are not equivalent. In the first one (a[x1:x2][y1:y2]), you are essentially indexing the first axis twice. In the second, you are indexing the first and second axes.
a[x1:x2][y1:y2] can be rewritten as
p = a[x1:x2] # result still has two dimensions
p = p[y1:y2]
You are first indexing 0:16 in the first dimension. Then you index 0:16 in the first dimension of the result of the previous operation (which will simply return the same as a[x1:x2] because x1==y1 and x2==y2).
In the second method, you index the first and second dimensions directly. I would not write it this way, but one could write it like this to contrast it with the first method:
a[x1:x2][:, y1:y2]
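A quick sketch of the difference, using the shapes from the question:

import numpy as np

a = np.zeros((288, 512))
x1, x2, y1, y2 = 0, 16, 0, 16

print(a[x1:x2][y1:y2].shape)      # (16, 512): both slices act on the rows
print(a[x1:x2, y1:y2].shape)      # (16, 16):  rows and columns sliced together
print(a[x1:x2][:, y1:y2].shape)   # (16, 16):  chained slicing, second slice on the columns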

I don't understand this value-retrieval syntax

Here is the shape of my data set:
In [8]: train_x.shape
Out[8]: (4500, 3, 2)
Then I don't understand the following syntax:
In [9]: train_x_retrive = train_x[:, -1, :]
Thank you for your help
See,
(4500, 3, 2) means the data has 3 dimensions,
with the 1st dimension of length 4500, the 2nd dimension of length 3, and the 3rd dimension of length 2.
What train_x[:, -1, :] means is: take everything along the 1st dimension, only the last entry along the 2nd dimension, and everything along the 3rd dimension.
The result's shape will be (4500, 2).
--EDIT--
It turns out that when an axis is indexed with a single integer (here -1) rather than a slice, NumPy drops that axis from the result, so instead of an array of shape (4500, 1, 2) you get (4500, 2).
While @thisisjaymehta's answer is correct and well explained, I find it much easier to understand what's happening with a 2D array.
Consider a 3 row, 2 col random array:
import numpy as np
X = np.random.random((3,2))
print(X)
Yields:
array([[0.05809464, 0.49751321],
       [0.25815324, 0.23862334],
       [0.56815427, 0.91610693]])
We can access individual elements with subscripting. For instance to reach row (horizontal) 0, col (vertical) 1 we can use X[0,1]:
print(X[0,1])
Which yields:
0.49751320772009267
Similarly we can reach the last row, and the first (0th) column by using X[-1,0]:
0.568154265734957
The notation : is used to address the whole of that axis, so to get the last row, and all of the columns in that last row, we can use X[-1,:] to yield:
array([0.56815427, 0.91610693])
This principle extends to 3 or more dimensions as well. So train_x[:, -1, :] means "all rows (first dimension), the last column (second dimension), and all of the third dimension". This results in an array of shape (4500, 2) in your example, where you started with (4500, 3, 2).
The way I like to think of this is that you have 4500 3x2 images and you are requesting the last row of each image. The resulting array contains 4500 strips of shape (1, 2), squeezed into a (4500, 2) array.
You could also do -2 in place of -1 to reach the penultimate index.
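To make the 3D case concrete, here is a small sketch with a dummy array of the same shape as train_x (the contents are random placeholders); it also shows the -2 variant mentioned above:

import numpy as np

train_x = np.random.random((4500, 3, 2))   # dummy stand-in: 4500 "images" of shape (3, 2)

last_rows = train_x[:, -1, :]        # last row of every image
penultimate = train_x[:, -2, :]      # second-to-last row of every image
print(last_rows.shape)               # (4500, 2): the indexed axis is dropped
print(penultimate.shape)             # (4500, 2)

# Slicing instead of integer indexing keeps the middle axis:
print(train_x[:, -1:, :].shape)      # (4500, 1, 2)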

Difference in shapes of numpy array

For the array:
import numpy as np
arr2d = np.array([[1,2,3],[4,5,6],[7,8,9]])
>>> arr2d
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])
>>> arr2d[2].shape
(3,)
>>> arr2d[2:,:].shape
(1, 3)
Why do I get different shapes when both statements return the 3rd row? And shouldn't the result be (1, 3) in both cases, since we are returning a single row with 3 columns?
Why do I get different shapes when both statements return the 3rd row?
Because with the first operation you are indexing the rows and selecting just ONE element which, as described in NumPy's documentation on single-element indexing of a multidimensional array, returns an array of lower dimension (a 1D array).
In the 2nd example, you are using a slice, as evident from the colon. Slicing operations do not reduce the dimensions of an array. This is also logical: imagine the array had not 3 but 4 rows; then arr2d[2:,:].shape would be (2, 3). The developers of NumPy made slicing operations consistent, and therefore slices never reduce the number of dimensions of the array.
and shouldn't the result be (1,3) in both cases since we are returning a single row with 3 columns?
No, for the reasons given above.
When doing arr2d[2], you are taking a single row out of the array.
When doing arr2d[2:, :], you are taking a subset of rows out of the array ('slicing'), in this case the rows from the 3rd to the end, which happens to be only the 3rd; but that does not change the fact that you are taking a subset, not a single element.
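A compact sketch of the two statements side by side:

import numpy as np

arr2d = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

print(arr2d[2].shape)      # (3,)   integer index: the row axis is removed
print(arr2d[2:, :].shape)  # (1, 3) slice: the row axis is kept, with length 1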
