Subsetting a two dimensional tensor with a one dimensional tensor

I want to extract from each row of a two-dimensional tensor the column that is stored in another one dimensional tensor.
import torch
test_tensor = torch.tensor([[1,-2,3], [-2,7,4]]).float()
select_tensor = torch.tensor([1,2])
So in this particular example I would like to get the element in position 1 for the first row (so -2) and the element in position 2 for the second row (so 4).
I tried:
test_tensor[:, select_tensor]
But this selects the elements at position 1 and 2 for each row. I suspect it might be something very simple that I am missing.

You can use torch.gather:
import torch
test_tensor = torch.tensor([[1,-2,3], [-2,7,4]]).float()
select_tensor = torch.tensor([1,2], dtype=torch.int64).view(-1,1) # the index must have the same number of dimensions as test_tensor
final_tensor = torch.gather(test_tensor, 1, select_tensor)
final_tensor
Output:
tensor([[-2.],
        [ 4.]])
Alternatively, use Tensor.view to flatten the output tensor: final_tensor.view(-1) will give you tensor([-2.,  4.])

If you're looking for a solution with indexing, you need to index along axis 0 as well. You can do that with torch.arange:
>>> test_tensor = torch.tensor([[1,-2,3], [-2,7,4]])
>>> select_tensor = torch.tensor([1,2])
>>> test_tensor[torch.arange(len(select_tensor)), select_tensor]
tensor([-2, 4])
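To compare the two answers side by side, here is a small sketch that runs both on the tensors from the question and checks that they agree. The names via_gather and via_indexing are only illustrative and do not appear in the original answers.

```python
import torch

test_tensor = torch.tensor([[1, -2, 3], [-2, 7, 4]]).float()
select_tensor = torch.tensor([1, 2])

# gather: the index needs the same number of dimensions as the input,
# so add a column axis, gather along dim 1, then drop that axis again.
via_gather = torch.gather(test_tensor, 1, select_tensor.unsqueeze(1)).squeeze(1)

# Advanced indexing: pair row i with column select_tensor[i].
rows = torch.arange(test_tensor.size(0))
via_indexing = test_tensor[rows, select_tensor]

print(via_gather)
print(via_indexing)
```

Both variants pick one element per row. The indexing form is shorter, while gather keeps the result's number of dimensions until you squeeze it.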

Related

Select multiple indices in an axis of pytorch tensor

My actual problem is in a higher dimension, but I am posting it in a smaller dimension to make it easy to visualize.
I have a tensor of shape (2,3,4):
x = torch.randn(2, 3, 4)
tensor([[[-0.9118,  1.4676, -0.4684, -0.6343],
         [ 1.5649,  1.0218, -1.3703,  1.8961],
         [ 0.8652,  0.2491, -0.2556,  0.1311]],
        [[ 0.5289, -1.2723,  2.3865,  0.0222],
         [-1.5528, -0.4638, -0.6954,  0.1661],
         [-1.8151, -0.4634,  1.6490,  0.6957]]])
From this tensor, I need to select one row per item along axis 1, given by a tensor of indices.
For example:
indices = torch.tensor([0, 2])
Expected Output:
tensor([[[-0.9118,  1.4676, -0.4684, -0.6343]],
        [[-1.8151, -0.4634,  1.6490,  0.6957]]])
Output Shape: (2,1,4)
Explanation: Select 0th row from x[0], select 2nd row from x[1]. (Came from indices)
I tried using index_select like this:
torch.index_select(x, 1, indices)
But the problem is that it selects the 0th and 2nd rows for each item in x. It looks like it needs some modification, but I could not figure it out.

In your case, this is quite straightforward. An easy way to navigate through two dimensions in parallel is to use a range on the first axis and your indexing tensor on the second:
>>> x[range(len(indices)), indices]
tensor([[-0.9118,  1.4676, -0.4684, -0.6343],
        [-1.8151, -0.4634,  1.6490,  0.6957]])
In more general cases though, this would require the use of torch.gather:
First expand indices such that it has enough dimensions:
index = indices[:,None,None].expand(x.size(0), -1, x.size(-1))
Then you can apply the function on x and index and squeeze dim=1:
>>> x.gather(dim=-2, index=index)[:,0]
tensor([[-0.9118,  1.4676, -0.4684, -0.6343],
        [-1.8151, -0.4634,  1.6490,  0.6957]])
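Both snippets above return shape (2, 4), while the question asked for (2, 1, 4). As a rough sketch, the singleton axis can be kept by not indexing it away after gather, or by adding it back after indexing. The deterministic x below is an assumption made so the result is easy to check; picked and same are illustrative names.

```python
import torch

# Deterministic stand-in for the random tensor in the question.
x = torch.arange(24, dtype=torch.float32).reshape(2, 3, 4)
indices = torch.tensor([0, 2])

# One index per batch item, broadcast across the last axis: shape (2, 1, 4).
index = indices[:, None, None].expand(-1, 1, x.size(-1))
picked = x.gather(dim=1, index=index)

# Same selection via advanced indexing, with the dropped axis restored.
same = x[torch.arange(x.size(0)), indices].unsqueeze(1)

print(picked.shape)
print(picked)
```

Here picked[0, 0] is row 0 of x[0] and picked[1, 0] is row 2 of x[1].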

Array mean iteration

My question pertains to array iteration but is a bit more complicated. You see I have an array with a shape of (4, 50). What I want to do is find the mean of the arrays. I will show a simple explanation of what I mean
A = np.array([[10,5,3],[12,6,6],[9,8,7],[20,3,4]])
When this code is run, you get an array with a shape of (4,3). What I want is for the mean of each row to be found and returned.
The result should be an array like ([[6],[8],[8],[9]]), with the same number of rows and a single column.
Please explain the code and thought process behind it. Thank you very much.

Use the numpy.mean function. Parameter axis=1 means that the row-wise mean will be calculated. Parameter keepdims=True means that original array dimensions are kept.
import numpy as np
A = np.array([[10,5,3],[12,6,6],[9,8,7],[20,3,4]])
B = np.mean(A, axis=1, keepdims=True)
print(B)
# Output:
# [[6.]
# [8.]
# [8.]
# [9.]]
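Since the question asks for the thought process, here is a short sketch of how axis and keepdims change the result shape. The variable names are illustrative only.

```python
import numpy as np

A = np.array([[10, 5, 3], [12, 6, 6], [9, 8, 7], [20, 3, 4]])

# axis=1 collapses the column axis, leaving one mean per row.
row_means_flat = A.mean(axis=1)                 # shape (4,)

# keepdims=True keeps the collapsed axis with length 1.
row_means_col = A.mean(axis=1, keepdims=True)   # shape (4, 1)

# axis=0 collapses the row axis instead, leaving one mean per column.
col_means = A.mean(axis=0)                      # shape (3,)

print(row_means_flat.shape, row_means_col.shape, col_means.shape)
```

The axis argument names the axis that disappears. With axis=1 the three values in each row are averaged, which gives the (4, 1) column the question asks for once keepdims=True is set.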

Use np.mean in a list comprehension and build a new array from the result:
import numpy as np
A = np.array([[10,5,3],[12,6,6], [9,8,7],[20,3,4]])
# Use .reshape() to get 4 rows by 1 column.
new_A = np.array([np.mean(row) for row in A]).reshape(-1, 1)
Output:
array([[6.], [8.], [8.], [9.]])

why does numpy array return wrong shape of sub arrays when indexing

An example is shown as follows:
>>> import numpy as np
>>> a=np.zeros((288,512))
>>> x1,x2,y1,y2=0,16,0,16
>>> p=a[x1:x2][y1:y2]
>>> p.shape
(16, 512)
>>> p=a[x1:x2,y1:y2]
>>> p.shape
(16, 16)
I am trying to extract a patch from an array, covering rows 0 to 16 and columns 0 to 16. I index the array in two ways and get very different results: a[x1:x2][y1:y2] gives me the wrong result.
Why?
Thanks for helping me!

When you do a[x1:x2][y1:y2], you are slicing by rows twice. That is, a[x1:x2] will give you a shape (16,512). The second slice operation in a[x1:x2][y1:y2] is slicing the result of the first operation and will give you the same result.
In the second case, when you do a[x1:x2,y1:y2], you are slicing by the two dimensions of your 2-dimensional array.
Important note: If you have a 2-dimensional array and you slice like this:
a = np.zeros((10,15))
a[1:3].shape
Output:
(2, 15)
you will slice only by rows. Your resulting array will have 2 rows and the total number of columns (15 columns). If you want to slice by rows and columns, you will have to use a[1:3, 1:3].

The two methods of indexing you tried are not equivalent. In the first one (a[x1:x2][y1:y2]), you are essentially indexing the first axis twice. In the second, you are indexing the first and second axes.
a[x1:x2][y1:y2] can be rewritten as
p = a[x1:x2] # result still has two dimensions
p = p[y1:y2]
You are first indexing 0:16 in the first dimension. Then you index 0:16 in the first dimension of the result of the previous operation (which will simply return the same as a[x1:x2] because x1==y1 and x2==y2).
In the second method, you index the first and second dimensions directly. I would not write it this way, but one could write it like this to contrast it with the first method:
a[x1:x2][:, y1:y2]
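In the question both ranges are 0:16, which hides part of the effect. A small sketch with a different column range (the value y2 = 8 is an assumption chosen for illustration) makes the difference between the three forms visible:

```python
import numpy as np

a = np.zeros((288, 512))
x1, x2, y1, y2 = 0, 16, 0, 8

# Chained slices: both act on the first axis (rows 0..15, then rows 0..7).
chained = a[x1:x2][y1:y2]

# One index with two slices: rows 0..15 and columns 0..7.
joint = a[x1:x2, y1:y2]

# Chained, but the second slice explicitly skips the first axis.
stepwise = a[x1:x2][:, y1:y2]

print(chained.shape, joint.shape, stepwise.shape)
```

Only the comma form and the [:, ...] form touch the column axis.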

how to get the row and column indices of the maximum element from a 2D pytorch tensor?

Is there any way that I can retrieve the row and column indices of the greatest element contained in 2-dimensional pytorch tensor? For example, see the pytorch tensor a below:
a = torch.tensor([[1,2,3],
                  [9,5,4],
                  [6,7,8]])
The greatest element in the tensor a is 9, which is in the first column of the second row. Converted to zero-based Python indices, the column index of the element would be 0 and the row index would be 1.
Is there any way that I can retrieve the index [1,0] from the 2 dimensional pytorch tensor a?

Unfortunately there is no built-in method.
However, you could use numpy (with import numpy as np):
np.unravel_index(torch.argmax(a).item(), a.shape)
Otherwise you need to write your own logic, something like:
def unravel_index(flat_idx, shape):
    # Peel off one coordinate per axis, starting from the last (fastest-varying) axis.
    multi_idx = []
    r = int(flat_idx)
    for s in reversed(shape):
        multi_idx.append(r % s)
        r = r // s
    return multi_idx[::-1]
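For the 2D case in the question, a minimal sketch that stays in PyTorch and plain Python: torch.argmax without a dim argument returns the index into the flattened tensor, and divmod by the number of columns splits it into a row and a column. The names flat_idx, row and col are illustrative.

```python
import torch

a = torch.tensor([[1, 2, 3],
                  [9, 5, 4],
                  [6, 7, 8]])

# Index of the maximum in the flattened (row-major) tensor.
flat_idx = torch.argmax(a).item()

# In row-major order: flat index = row * number_of_columns + col.
row, col = divmod(flat_idx, a.size(1))

print([row, col])
```

This assumes a contiguous row-major view of a 2D tensor. For more dimensions the same idea has to be repeated once per axis.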

tf.ones_like(tensor) with a specific index set to 0

I am trying to mimic tf.ones_like() where "Given a single tensor (tensor), this operation returns a tensor of the same type and shape as tensor with all elements set to 1.", except I want to specify a certain column index to be set to 0. For example, I want the first column to be all 0.
If given tensor = [[1,2,3], [4,5,6]], then I would like to return [[0,1,1], [0,1,1]] if I specify the first column. Is there any way to do this with tensorflow operations?

IMHO, of the Variable helper functions only assign is of any real help in cases like this. If you don't mind using numpy, I can suggest this code:
import numpy as np
import tensorflow as tf  # uses the TensorFlow 1.x Session API
t = np.array([[1,2,3],[4,5,6]])
v = tf.Variable(t)
t = np.ones_like(t)
t[:,0] = 0
sess = tf.Session()
print(sess.run(v.assign(t)))
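If you would rather stay inside TensorFlow ops, one possible sketch (assuming TensorFlow 2.x with eager execution; column, keep and result are illustrative names) builds a 0/1 mask over the column positions and broadcasts it against tf.ones_like:

```python
import tensorflow as tf

t = tf.constant([[1, 2, 3], [4, 5, 6]])
column = 0  # column to zero out

# 1 for every column position except the chosen one, in t's dtype.
keep = tf.cast(tf.not_equal(tf.range(tf.shape(t)[1]), column), t.dtype)

# The (3,) mask broadcasts across the rows of the (2, 3) ones tensor.
result = tf.ones_like(t) * keep

print(result.numpy())
```

This swaps the Variable and assign approach for a mask multiplication, so no variable or session is needed.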
