Pytorch gather question (3D Computer Vision)

I have N groups of C-dimensional points, with M points in each group, stored in a tensor of shape (N, M, C). Let's call it features.
I computed the maximum element and its index along the M dimension to find the maximum point for each of the C channels (a max pooling operation), which gives a max tensor of shape (N, 1, C) and an index tensor of shape (N, 1, C).
I have another tensor of shape (N, M, 3) storing the geometric coordinates of those N*M high-dimensional points. Now I want to use the indices of the maximum points in each C dimension to get the coordinates of all those maximum points.
For example, N=2, M=4, C=6.
The coordinate tensor, whose shape is (2, 4, 3):
[[[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[8, 7, 6]],
[[11, 12, 13],
[14, 15, 16],
[17, 18, 19],
[18, 17, 16]]]
The indices tensor, whose shape is (2, 1, 6):
[[[0, 1, 2, 1, 2, 3]]
[[1, 2, 3, 2, 1, 0]]]
For example, the first element in indices is 0, so I want to grab [1, 2, 3] from the coordinate tensor. For the second element (1), I want to grab [4, 5, 6]. For the third element of the second group (3), I want to grab [18, 17, 16].
The result tensor will look like:
[[[1, 2, 3] # 0
[4, 5, 6] # 1
[7, 8, 9] # 2
[4, 5, 6] # 1
[7, 8, 9] # 2
[8, 7, 6]] # 3
[[14, 15, 16] # 1
[17, 18, 19] # 2
[18, 17, 16] # 3
[17, 18, 19] # 2
[14, 15, 16] # 1
[11, 12, 13]]]# 0
whose shape is (2, 6, 3).
I tried to use torch.gather but could not get it to work. I wrote a naive algorithm that loops over all N groups, but it is slow, even with TorchScript's JIT. So how can I write this efficiently in PyTorch?

You can use integer array indexing combined with broadcasting semantics to get your result.
import torch
x = torch.tensor([
[[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[8, 7, 6]],
[[11, 12, 13],
[14, 15, 16],
[17, 18, 19],
[18, 17, 16]],
])
i = torch.tensor([[[0, 1, 2, 1, 2, 3]],
[[1, 2, 3, 2, 1, 0]]])
# rows is shape [2, 1], cols is shape [2, 6]
rows = torch.arange(x.shape[0]).type_as(i).unsqueeze(1)
cols = i.squeeze(1)
# y is [2, 6, ...]
y = x[rows, cols]
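PyTorch's integer array indexing follows NumPy's rules, so the broadcasting step can be checked with a minimal NumPy sketch. The names coords, idx, picked and picked_gather are illustrative only. The second half is a gather-style variant using np.take_along_axis, which needs the index repeated across the last axis; torch.gather with an index expanded to (N, C, 3) should behave analogously, though that is not verified here.

```python
import numpy as np

coords = np.array([
    [[1, 2, 3], [4, 5, 6], [7, 8, 9], [8, 7, 6]],
    [[11, 12, 13], [14, 15, 16], [17, 18, 19], [18, 17, 16]],
])                                           # shape (2, 4, 3)
idx = np.array([[[0, 1, 2, 1, 2, 3]],
                [[1, 2, 3, 2, 1, 0]]])       # shape (2, 1, 6)

# Broadcasting approach: rows (2, 1) and cols (2, 6) broadcast to (2, 6);
# the untouched last axis of size 3 is appended to the result.
cols = idx.squeeze(1)                        # shape (2, 6)
rows = np.arange(coords.shape[0])[:, None]   # shape (2, 1)
picked = coords[rows, cols]                  # shape (2, 6, 3)

# Gather-style approach: repeat each index across the 3 coordinates.
gather_idx = np.broadcast_to(cols[:, :, None], cols.shape + (coords.shape[2],))
picked_gather = np.take_along_axis(coords, gather_idx, axis=1)

print(picked.shape)
print(picked[1, 2])
```

Both variants select whole coordinate triples; the broadcasting form avoids materialising the expanded index.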

Related

How to slice tensorflow tensor differently for each row at once?

I have simple tensor
a = tf.constant([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
and I want to slice it, but with a different slice for each row. The slices are described by another tensor
b = tf.constant([[0, 1], [2, 4], [2, 5]])
This means that from the first row of tensor a I need the elements from index 0 up to (but not including) 1, from the second row the elements from 2 up to 4, and so on. So the final result will be
[
[1],
[8, 9],
[13, 14, 15]
]
My first idea was to fill in the ranges between the beginning and end of each slice, but unfortunately that is not possible with map_fn because the rows of the result have different lengths.
Does anyone know how to do such an operation?

Basically, we have two arrays to iterate over: one with the actual data and one with the range to return.
The zip function iterates over the elements of both arrays in lockstep. Note that calling .numpy() on each row requires eager execution.
import tensorflow as tf
a = tf.constant([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
b = tf.constant([[0, 1], [2, 4], [2, 5]])
# As you iterate, provided a and b have same length
# [1, 2, 3, 4, 5] sliced as [0:1]
# [6, 7, 8, 9, 10] sliced as [2:4]
# [11, 12, 13, 14, 15] sliced as [2:5]
[data.numpy().tolist()[start:end] for data, (start, end) in zip(a,b)]
Output:
[[1], [8, 9], [13, 14, 15]]

If the size of b is known at graph compile time, then you can slice each row separately.
import tensorflow as tf
a = tf.constant([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
b = tf.constant([[0, 1], [2, 4], [2, 5]])
r = []
for i in range(3):
    bi = b[i]
    r.append(a[i][bi[0]: bi[1]])
print(r)
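Both answers above loop over rows in Python. As a hedged alternative, the per-row ranges can be turned into one boolean mask and the selected values split back into rows. The sketch below uses NumPy rather than TensorFlow and assumes every (start, end) pair satisfies 0 <= start <= end <= number of columns; the names mask, flat and rows are illustrative. In TensorFlow, tf.ragged.boolean_mask is probably the closest counterpart, but that is not shown or tested here.

```python
import numpy as np

a = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]])
b = np.array([[0, 1], [2, 4], [2, 5]])

# Column positions (5,) compared against per-row start/end (3, 1)
# broadcast to a (3, 5) mask that is True inside each row's range.
cols = np.arange(a.shape[1])
mask = (cols >= b[:, :1]) & (cols < b[:, 1:])

# Boolean indexing returns the selected values in row-major order,
# so cutting at the cumulative row lengths restores the ragged rows.
flat = a[mask]
lengths = mask.sum(axis=1)
rows = [part.tolist() for part in np.split(flat, np.cumsum(lengths)[:-1])]

print(rows)  # [[1], [8, 9], [13, 14, 15]]
```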

How to modify N columns of numpy array at the same time?

How can I modify N columns of a NumPy array at once? For example, I have a NumPy array as follows:
P = array([[1, 2, 3, 8, 6],
[4, 5, 6, 4, 5],
[0,-2, 5, 3, 0]])
How do I change the elements of the first, second and fourth columns of P?

Use indexing with a list of column indices. Here is an example:
>>> P[:, [0, 1, 3]] += 10
>>>
>>> P
array([[11, 12, 3, 18, 6],
[14, 15, 6, 14, 5],
[10, 8, 5, 13, 0]])
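The same column selection also works on the left-hand side of a plain assignment. A small sketch, using the array from the question and made-up replacement values:

```python
import numpy as np

P = np.array([[1, 2, 3, 8, 6],
              [4, 5, 6, 4, 5],
              [0, -2, 5, 3, 0]])
cols = [0, 1, 3]  # first, second and fourth columns

# A scalar is broadcast to every selected element.
P[:, cols] = 0

# A length-3 sequence supplies one value per selected column,
# broadcast down all rows.
P[:, cols] = [100, 200, 300]

print(P)
```

Columns 2 and 4 are left untouched. Note that P[:, cols] on the right-hand side of an expression is a copy, so changes to that copy do not propagate back to P.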

How to set a value to elements in a column filtered by another array

I have an m X 3 matrix and an array of length m.
I want to do the following
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
b = np.array([1, 2, 1, 3, 3])
me = np.mean(a[np.where(b==1)][:, 0])
a[np.where(b==1)][:, 0] = me
The problem is that
a[np.where(b==1)][:, 0]
returns [1, 7] instead of [4, 4].

You are combining index arrays with slices:
a[np.where(b==1)] uses an index array, and [:, 0] is a slice applied to its result. Indexing with an index array returns a copy, so the assignment sets the new values on that copy and a itself is unchanged. Index both axes in a single operation instead:
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
b = np.array([1, 2, 1, 3, 3])
me = np.mean(a[np.where(b==1)][:, 0])
a[np.where(b==1), 0] = me
Also see https://docs.scipy.org/doc/numpy/user/basics.indexing.html for combining index arrays with slices.
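The difference between the two forms can be demonstrated with a short sketch. It uses a boolean mask (b == 1) in place of np.where, which selects the same rows; the variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15]])
b = np.array([1, 2, 1, 3, 3])
mask = b == 1

# Chained indexing: a[mask] is a temporary copy, so this write is lost.
a[mask][:, 0] = -1
unchanged_first_column = a[:, 0].tolist()

# Single indexing operation: the write goes into a itself.
me = a[mask, 0].mean()   # mean of 1 and 7
a[mask, 0] = me          # stored as an integer because a has an integer dtype

print(unchanged_first_column)
print(a[:, 0])
```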

Reverse diagonal on numpy python

Let's say I have this NumPy array:
a=
[0 1 2 3],
[4 5 6 7],
[8 9 10 11]
Take the element at [1,1], which is 5. It lies on the main diagonal (offset 0): according to NumPy, a.diagonal(0) = [0, 5, 10]. How do I get the reverse (right-to-left) diagonal through [1,1], that is [2, 5, 8]? Is this possible?
My original problem is an 8 by 8 array (indices 0 to 7). I hope that helps.

Reverse each row to get a new array, then take a diagonal of that array.
>>> import numpy as np
>>> a = np.array([
... [0, 1, 2, 3],
... [4, 5, 6, 7],
... [8, 9, 10, 11]
... ])
>>> a[:, ::-1]
array([[ 3, 2, 1, 0],
[ 7, 6, 5, 4],
[11, 10, 9, 8]])
>>> a[:, ::-1].diagonal(1)
array([2, 5, 8])
or using numpy.fliplr:
>>> np.fliplr(a).diagonal(1)
array([2, 5, 8])

Flip the array upside-down and use the same:
np.flipud(a).diagonal(0)[::-1]

Another way to achieve this is to use np.rot90
import numpy as np
a = np.array([[0, 1, 2, 3],
[4, 5, 6, 7],
[8, 9, 10, 11]])
my_diag = np.rot90(a).diagonal(-1)
Result:
>>> my_diag
array([2, 5, 8])

A number of answers so far. @Akavall is closest, as you need to rotate or flip and transpose (equivalent operations). I haven't seen a response from the OP regarding the expected behavior on the "long" part of the rectangle.
Generalized solution for a square matrix:
a = array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[15, 16, 17, 18, 19],
[20, 21, 22, 23, 24]])
>>> [(i, np.rot90(a).diagonal(2*i-a.shape[0]+1)) for i in range(a.shape[0])]
[(0, array([0])),
(1, array([ 2, 6, 10])),
(2, array([ 4, 8, 12, 16, 20])),
(3, array([14, 18, 22])),
(4, array([24]))]
As a function:
def reverse_diag(arr, n):
    idx = 2*n - arr.shape[0] + 1
    return np.rot90(arr).diagonal(idx)
The original matrix can be made square with a[:np.min(a.shape), :np.min(a.shape)].
EDIT: The OP indicated the array is square, so the above is the final answer.
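The answers above hard-code the diagonal offset for the element [1,1]. As a hedged generalisation, the offset can be computed for any position: after np.fliplr, the element at (r, c) sits at column ncols - 1 - c, so its diagonal offset is (ncols - 1 - c) - r. The helper name anti_diagonal_through is hypothetical and not part of NumPy.

```python
import numpy as np

def anti_diagonal_through(arr, r, c):
    """Return the right-to-left diagonal passing through arr[r, c].

    Values are listed from the top-right end to the bottom-left end.
    """
    offset = (arr.shape[1] - 1 - c) - r
    return np.fliplr(arr).diagonal(offset)

a = np.array([[0, 1, 2, 3],
              [4, 5, 6, 7],
              [8, 9, 10, 11]])

print(anti_diagonal_through(a, 1, 1))  # [2 5 8]
print(anti_diagonal_through(a, 0, 0))  # [0]
print(anti_diagonal_through(a, 1, 3))  # [ 7 10]
```

This works for rectangular arrays as well, because the offset only depends on the number of columns.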

how is axis indexed in numpy's array?

According to NumPy's tutorial, an axis can be referred to by an integer, e.g. 0 for columns and 1 for rows, but I don't grasp why they are numbered this way. And how do I figure out each axis's index when dealing with a multidimensional array?

By definition, the axis number of the dimension is the index of that dimension within the array's shape. It is also the position used to access that dimension during indexing.
For example, if a 2D array a has shape (5,6), then you can access a[0,0] up to a[4,5]. Axis 0 is thus the first dimension (the "rows"), and axis 1 is the second dimension (the "columns"). In higher dimensions, where "row" and "column" stop really making sense, try to think of the axes in terms of the shapes and indices involved.
If you do .sum(axis=n), for example, then dimension n is collapsed and deleted, with each value in the new matrix equal to the sum of the corresponding collapsed values. For example, if b has shape (5,6,7,8), and you do c = b.sum(axis=2), then axis 2 (dimension with size 7) is collapsed, and the result has shape (5,6,8). Furthermore, c[x,y,z] is equal to the sum of all elements b[x,y,:,z].
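The shape rule in the last paragraph can be checked directly. A minimal sketch with random integer data (the seed and value range are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.integers(0, 10, size=(5, 6, 7, 8))

# Summing over axis 2 removes the dimension of size 7.
c = b.sum(axis=2)
print(c.shape)  # (5, 6, 8)

# Each entry of c is the sum over the collapsed axis.
x, y, z = 1, 2, 3
print(c[x, y, z] == b[x, y, :, z].sum())  # True

# keepdims=True keeps the collapsed axis with length 1 instead of deleting it.
kept = b.sum(axis=2, keepdims=True)
print(kept.shape)  # (5, 6, 1, 8)
```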

In case anyone needs it, here is a visual description of a shape=(3,5) array: [image not preserved]

You can grasp axis in this way:
>>> a = np.array([[[1,2,3],[2,2,3]],[[2,4,5],[1,3,6]],[[1,2,4],[2,3,4]],[[1,2,4],[1,2,6]]])
array([[[1, 2, 3],
[2, 2, 3]],
[[2, 4, 5],
[1, 3, 6]],
[[1, 2, 4],
[2, 3, 4]],
[[1, 2, 4],
[1, 2, 6]]])
>>> a.shape
(4,2,3)
I created an array whose shape has three different values, (4,2,3), so that you can tell the structure apart clearly. Each axis corresponds to a different 'layer' of nesting.
That is, axis = 0 indexes the first dimension of the shape (4,2,3). It refers to the arrays inside the outermost []. There are 4 of them, so the size of this axis is 4:
array[[1, 2, 3],
[2, 2, 3]],
array[[2, 4, 5],
[1, 3, 6]],
array[[1, 2, 4],
[2, 3, 4]],
array[[1, 2, 4],
[1, 2, 6]]
axis = 1 indexes the second dimension of the shape (4,2,3). There are 2 elements in each array of the axis = 0 layer. For example, in the array
array[[1, 2, 3],
[2, 2, 3]]
the two elements are:
array[1, 2, 3]
array[2, 2, 3]
The third shape value means that there are 3 elements in each array of the axis = 2 layer. For example, there are 3 elements in array[1, 2, 3].
You can also tell the number of axes from the number of [ at the beginning (or ] at the end). In this case the number is 3 ([[[), so you can choose from axis = 0, axis = 1 and axis = 2.
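The layer description above can be mirrored in code: each additional index peels off one level of brackets, and reducing over an axis removes that entry from the shape. A short sketch using the same (4,2,3) array:

```python
import numpy as np

a = np.array([[[1, 2, 3], [2, 2, 3]],
              [[2, 4, 5], [1, 3, 6]],
              [[1, 2, 4], [2, 3, 4]],
              [[1, 2, 4], [1, 2, 6]]])

print(a.ndim)       # 3, the number of leading brackets
print(a[0].shape)   # (2, 3): one of the 4 blocks along axis 0
print(a[0, 1])      # [2 2 3]: one of the 2 rows along axis 1
print(a[0, 1, 2])   # 3: one of the 3 scalars along axis 2

# Reducing over an axis drops that entry from the shape.
print(a.sum(axis=0).shape)  # (2, 3)
print(a.sum(axis=1).shape)  # (4, 3)
print(a.sum(axis=2).shape)  # (4, 2)
```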

In general, operating along axis = 0 means that, for each fixed combination of the 2nd, 3rd and later indices, you collect all cells in which only the first index varies.
For example, a 2-dimensional array has two corresponding axes: the first running vertically downwards across rows (axis 0), and the second running horizontally across columns (axis 1).
In 3D this gets harder to picture, so it helps to spell it out with explicit for loops:
>>> x = np.array([[[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8]],
[[ 9, 10, 11],
[12, 13, 14],
[15, 16, 17]],
[[18, 19, 20],
[21, 22, 23],
[24, 25, 26]]])
>>> x.shape  # (3, 3, 3)
# axis = 0
>>> for j in range(0, x.shape[1]):
...     for k in range(0, x.shape[2]):
...         print("element = ", (j, k), " ", [x[i, j, k] for i in range(0, x.shape[0])])
...
element = (0, 0) [0, 9, 18] #sum is 27
element = (0, 1) [1, 10, 19] #sum is 30
element = (0, 2) [2, 11, 20]
element = (1, 0) [3, 12, 21]
element = (1, 1) [4, 13, 22]
element = (1, 2) [5, 14, 23]
element = (2, 0) [6, 15, 24]
element = (2, 1) [7, 16, 25]
element = (2, 2) [8, 17, 26]
>>> x.sum(axis=0)
array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]])
# axis = 1
>>> for i in range(0, x.shape[0]):
...     for k in range(0, x.shape[2]):
...         print("element = ", (i, k), " ", [x[i, j, k] for j in range(0, x.shape[1])])
...
element = (0, 0) [0, 3, 6] #sum is 9
element = (0, 1) [1, 4, 7]
element = (0, 2) [2, 5, 8]
element = (1, 0) [9, 12, 15]
element = (1, 1) [10, 13, 16]
element = (1, 2) [11, 14, 17]
element = (2, 0) [18, 21, 24]
element = (2, 1) [19, 22, 25]
element = (2, 2) [20, 23, 26]
# for sum, axis is the first parameter, so the keyword may be omitted
>>> x.sum(0), x.sum(1), x.sum(2)
(array([[27, 30, 33],
[36, 39, 42],
[45, 48, 51]]),
array([[ 9, 12, 15],
[36, 39, 42],
[63, 66, 69]]),
array([[ 3, 12, 21],
[30, 39, 48],
[57, 66, 75]]))
