How to fetch specific rows from a tensor in TensorFlow?

I have a tensor defined as follows:
temp_var = tf.Variable(initial_value=np.asarray([[1, 2, 3],[4, 5, 6],[7, 8, 9],[10, 11, 12]]))
I also have an array of indexes of rows to be fetched from tensor:
idx = tf.constant([0, 2])
Now I want to take a subset of temp_var at those indexes i.e. idx
I know that to take a single index or a slice, we can do something like
temp_var[single_row_index, :]
or
temp_var[start:end, :]
But how to fetch rows indicated by idx array?
Something like temp_var[idx, :] ?

The tf.gather() op does exactly what you need: it selects rows from a matrix (or in general (N-1)-dimensional slices from an N-dimensional tensor). Here's how it would work in your case:
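import tensorflow as tf  # TensorFlow 1.x graph-mode API
temp_var = tf.Variable([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
idx = tf.constant([0, 2])
rows = tf.gather(temp_var, idx)  # select rows 0 and 2 along axis 0
init = tf.global_variables_initializer()  # replaces the deprecated tf.initialize_all_variables()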
sess = tf.Session()
sess.run(init)
print(sess.run(rows)) # ==> [[1, 2, 3], [7, 8, 9]]
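For intuition, the row selection that tf.gather performs along axis 0 can be mirrored in NumPy. This is only a sketch of the semantics, not TensorFlow code; the NumPy variable names below are illustrative and not part of the answer above.

```python
import numpy as np

temp = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])
idx = np.array([0, 2])

# Integer-array indexing and np.take both select whole rows along axis 0.
rows_indexing = temp[idx]
rows_take = np.take(temp, idx, axis=0)

print(rows_indexing)
```

Both forms return the rows at positions 0 and 2, which is the same selection the tf.gather call makes on temp_var.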

Related

Summing over specific indices PyTorch (similar to scatter_add)

I have some matrix where rows belong to some label, unordered. I want to sum all rows for each label.
Here is how it can be done with a loop:
labels = torch.tensor([0, 1, 0])
x = torch.tensor([[1, 2, 3],[4, 5, 6],[7, 8, 9]])
torch.stack([torch.sum(x[labels == i], dim=0) for i in torch.unique(labels)])
desired output:
tensor([[ 8, 10, 12],
[ 4, 5, 6]])
EDIT: Just to make it clear, I have the labels tensor, I know which labels repeat, I am interested in computing the final line without the use of a loop. I was thinking scatter_add_ or gather might help.
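Use the in-place torch.Tensor.index_add_ method, which adds each row of x into the row of out selected by the matching entry of labels. Sizing out with torch.unique(labels) assumes the labels are the integers 0..K-1 with none missing.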
labels = torch.tensor([0, 1, 0])
x = torch.tensor([[1, 2, 3],[4, 5, 6],[7, 8, 9]])
nrow = torch.unique(labels).size(0)
ncol = x.size(1)
out = torch.zeros((nrow, ncol), dtype=x.dtype)
out.index_add_(0, labels, x)
print(out)
# the output will be
# tensor([[ 8, 10, 12],
# [ 4, 5, 6]])
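If the labels are not the consecutive integers 0..K-1, they cannot be used directly as row positions in out. A minimal sketch, assuming arbitrary integer labels (the values 10 and 3 below are made up for illustration): torch.unique with return_inverse=True maps each label to its position among the sorted unique labels, and that position is what index_add_ needs.

```python
import torch

labels = torch.tensor([10, 3, 10])  # hypothetical non-contiguous labels
x = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# uniq holds the sorted distinct labels; inverse[i] is the row of `out`
# that x[i] should be added to.
uniq, inverse = torch.unique(labels, return_inverse=True)

out = torch.zeros((uniq.size(0), x.size(1)), dtype=x.dtype)
out.index_add_(0, inverse, x)

print(uniq)
print(out)
```

Row k of out is the sum for label uniq[k], so the output rows follow the sorted label order rather than the order of first appearance.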
1: I tried to find the repeated labels
def get_repeated_labels(label_list):
    """
    Args:
        label_list (ndarray): target list
    Return:
        (list): Repeated labels
    """
    records_array = label_list
    values, inverse, count = np.unique(records_array,
                                       return_inverse=True,
                                       return_counts=True)
    repeated = np.where(count > 1)[0]
    repeated = values[repeated]
    rows, cols = np.where(inverse == repeated[:, np.newaxis])
    _, inverse_rows = np.unique(rows, return_index=True)
    res = np.split(cols, inverse_rows[1:])
    return res
if __name__ == '__main__':
    labels = torch.tensor([0, 1, 0])
    r = get_repeated_labels(labels.numpy())
Output:
[array([0, 2])]
This means the rows at indexes 0 and 2 share a label, so those two rows need to be summed:
torch.sum(x[r[i]], dim=0)
But r has only one entry while there are two labels, so I used an if-else condition that falls back to the original row when a label has no entry in r.
Final:
print(torch.stack([torch.sum(x[r[i]], dim=0) if len(r) >= i + 1 else x[i] for i, _ in enumerate(torch.unique(labels))]))
Output:
tensor([[ 8, 10, 12],
[ 4, 5, 6]])

PyTorch slice matrix with vector

Say I have one matrix and one vector as follows:
import torch
x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
y = torch.tensor([0, 2, 1])
Is there a way to index x with y so that the result is:
res = [1, 6, 8]
So basically, for each row i of x, I take the element in the column given by y[i].
You can specify the corresponding row index as:
import torch
x = torch.tensor([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
y = torch.tensor([0, 2, 1])
x[range(x.shape[0]), y]
# tensor([1, 6, 8])
Advanced indexing in PyTorch works just like NumPy's, i.e. the indexing arrays are broadcast together across the axes, so you can do as in FBruzzesi's answer.
Alternatively, similar to np.take_along_axis, PyTorch has torch.gather, which takes values along a specific axis:
x.gather(1, y.view(-1,1)).view(-1)
# tensor([1, 6, 8])
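The NumPy counterpart mentioned above can be sketched as follows. This is a minimal illustration using the same x and y values; the result variable names are chosen here for the example.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
y = np.array([0, 2, 1])

# Paired advanced indexing: row i is matched with column y[i].
picked_indexing = x[np.arange(x.shape[0]), y]

# take_along_axis needs the index array to have the same number of
# dimensions as x, hence the extra axis before the call and ravel() after.
picked_take = np.take_along_axis(x, y[:, np.newaxis], axis=1).ravel()

print(picked_indexing, picked_take)
```

The reshape before and after plays the same role as the view(-1, 1) and view(-1) calls in the torch.gather version.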

How do you get and set a 1-D array with column indexes of a 2-D matrix?

Suppose you have a matrix:
a = np.arange(9).reshape(3,3)
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
and you want to get or set the values 1, 5, and 6. How would you do that?
For example I thought doing
# getting
b = a[:, np.array([1,2,0])]
# want b = [1,5,6]
# setting
a[:, np.array([1,2,0])] = np.array([9, 10, 11])
# want:
# a = array([[0, 9, 2],
# [3, 4, 10],
# [11, 7, 8]])
would do it, but that is not the case. Any thoughts on this?
Only a small tweak makes this work:
import numpy as np
a = np.arange(9).reshape(3,3)
# getting
b = a[range(a.shape[0]), np.array([1,2,0])]
# setting
a[range(a.shape[0]), np.array([1,2,0])] = np.array([9, 10, 11])
Your code didn't work as expected because `:` is a slice that selects every row, so a[:, [1, 2, 0]] returns all three rows with their columns reordered (a 3x3 array). Passing an explicit row index sequence alongside the column index array pairs the two element by element, which selects a[0, 1], a[1, 2] and a[2, 0].
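The difference between the two indexing forms is easiest to see from the shapes they produce. A short sketch using the same array (the names sliced, paired and cols are chosen here for illustration):

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
cols = np.array([1, 2, 0])

# Slice on axis 0: every row is kept and the columns are reordered.
sliced = a[:, cols]

# Index arrays on both axes: pairs (0, 1), (1, 2), (2, 0).
paired = a[np.arange(a.shape[0]), cols]

# The same paired form works on the left-hand side of an assignment.
a[np.arange(a.shape[0]), cols] = np.array([9, 10, 11])

print(sliced.shape, paired, a, sep="\n")
```

Here np.arange(a.shape[0]) does the same job as range(a.shape[0]) in the answer above; either one supplies one row index per column index.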

Tensorflow - Grouping placeholders by batch index

Given a network with two or more placeholders of varying dimensionality e.g.
x1 = tf.placeholder(tf.int32, [None, seq_len])
x2 = tf.placeholder(tf.int32, [None, seq_len])
xn = tf.placeholder(tf.int32, [None, None, seq_len])
The first dimension in each placeholder corresponds to the minibatch size. seq_len is the length of the inputs. The second dimension is like a list of inputs that I need to process together with x1 and x2 for each index in the minibatch. How can I group these tensors to operate on them by batch index?
For example
x1 = [[1, 2, 3], [4, 5, 6]]
x2 = [[7, 8, 9], [8, 7, 6]]
xn = [[[1, 5, 2], [7, 2, 8], [3, 2, 5]], [[8, 9, 8]]]
I need to keep x1[0] i.e. [1, 2, 3], x2[0] i.e. [7, 8, 9], and xn[0] i.e. [[1, 5, 2], [7, 2, 8], [3, 2, 5]] together, because I need to perform matrix operations between x1[i] and each element in xn[i] for all i.
Notice that the dimensionality of xn is jagged.
I'm still not sure I understand your question. If I understand correctly, the challenge comes from the jagged dimensionality of xn. Below is a way of "unrolling" along the batch index. The result is a list of length batch_size in which each element is a Tensor. You can of course perform other operations on these individual tensors before evaluating them.
I have to use tf.scan to perform the operation for each element of xn[i] because its first dimension is dynamic. There might exist better solutions though.
import numpy as np
import tensorflow as tf  # TensorFlow 1.x graph-mode API
x1 = np.array([[1, 2, 3]])
xn = np.array([[[1, 5, 2], [7, 2, 8], [3, 2, 5]]])
batch_size = x1.shape[0]
result = []
for batch_idx in range(batch_size):
    x1_i = x1[batch_idx]
    xn_i = xn[batch_idx]
    # multiply x1[i] element-wise with each row of xn[i]
    result.append(tf.scan(fn=lambda a, x: x * x1_i, elems=xn_i, initializer=x1_i))
with tf.Session() as sess:
    print(sess.run([result[0]]))
# result: x1[0] multiplied element-wise with each element of xn[0].
# feel free to plug in your own matrix operations in the `fn` arg of `tf.scan`.
[array([[ 1, 10,  6],
       [ 7,  4, 24],
       [ 3,  4, 15]])]
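The grouping idea itself does not depend on TensorFlow. As a rough sketch in plain NumPy, assuming the jagged xn is held as a Python list with one 2-D array per batch item, zip keeps x1[i], x2[i] and xn[i] together. The element-wise product is only a stand-in for whatever matrix operation is actually needed.

```python
import numpy as np

x1 = np.array([[1, 2, 3], [4, 5, 6]])
x2 = np.array([[7, 8, 9], [8, 7, 6]])
# Jagged third input: one 2-D array per batch item, with differing row counts.
xn = [np.array([[1, 5, 2], [7, 2, 8], [3, 2, 5]]),
      np.array([[8, 9, 8]])]

results = []
for x1_i, x2_i, xn_i in zip(x1, x2, xn):
    # Broadcasting multiplies x1_i with every row of xn_i at once.
    # x2_i is available here for any further per-item operation.
    results.append(xn_i * x1_i)

for r in results:
    print(r)
```

Because each batch item is processed separately, the differing first dimension of xn[i] never has to fit into a single rectangular array.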

Numpy - How to replace elements based on condition (or matching a pattern)

I have a numpy array, say:
>>> a=np.array([[0,1,2],[4,3,6],[9,5,7],[8,9,8]])
>>> a
array([[0, 1, 2],
[4, 3, 6],
[9, 5, 7],
[8, 9, 8]])
I want to replace the second and third column elements with the minimum of them (row by row), except if one of these 2 elements is < 3.
The resulting array should be:
array([[0, 1, 2],# nothing changes since 1 and 2 are <3
[4, 3, 3], #min(3,6)=3 => 6 changed to 3
[9, 5, 5], #min(5,7)=5 => 7 changed to 5
[8, 8, 8]]) #min(9,8)=8 => 9 changed to 8
I know I can use clip, for instance a[:,1:3].clip(2,6,a[:,1:3]), but
1) clip will be applied to all elements, including those <3.
2) I don't know how to set the min and max values of clip to the minimum values of the 2 related elements of each row.
Use the >= operator to first select the rows you are interested in:
b = a[:, 1:3] # select the columns (a view of a)
matching = np.all(b >= 3, axis=1) # find rows with all elements matching
Now replace the content with the row minimum. Assign through a directly: b[matching, :] on its own returns a copy, so writing into it would leave a unchanged.
# find the row minimum, kept as a column vector so that it broadcasts
a[matching, 1:3] = b[matching].min(axis=1, keepdims=True)
We first define a row_mask that selects the rows where both the second and third elements are >= 3, and then take the minimum along axis 1 for the rows in row_mask.
The newaxis part is required to broadcast the 1-D array of minimums to the 2-D target of the assignment.
a=np.array([[0,1,2],[4,3,6],[9,5,7],[8,9,8]])
row_mask = (a[:,1:]>=3).all(axis=1)
a[row_mask, 1:] = a[row_mask, 1:].min(axis=1)[...,np.newaxis]
a
=>
array([[0, 1, 2],
[4, 3, 3],
[9, 5, 5],
[8, 8, 8]])
Here's a one-liner (it uses "row sum > 3" as the row test, which happens to select the right rows for this data):
a[np.where(np.sum(a,axis=1)>3),1:3]=np.min(a[np.where(np.sum(a,axis=1)>3),1:3],axis=2).reshape(1,3,1)
Here's a breakdown:
>>> b = np.where(np.sum(a,axis=1)>3) # finds rows where, in a, row sums are > 3
(array([1, 2, 3]),)
>>> c = a[b,1:3] # the part of a that needs to change
array([[[3, 3],
[5, 5],
[8, 8]]])
>>> d = np.min(c,axis=2) # the minimum values in each row (cols 1 and 2)
array([[3, 5, 8]])
>>> e = d.reshape(1,3,1) # adjust shape for broadcast to a
array([[[3],
[5],
[8]]])
>>> a[np.where(np.sum(a,axis=1)>3),1:3] = e # set the values in a
>>> a
array([[0, 1, 2],
[4, 3, 3],
[9, 5, 5],
[8, 8, 8]])
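Another way to express the same rule without fancy-indexed assignment is np.where. This is a sketch rather than one of the answers above, and the names keep and row_min are chosen here for illustration.

```python
import numpy as np

a = np.array([[0, 1, 2], [4, 3, 6], [9, 5, 7], [8, 9, 8]])

cols = a[:, 1:3]                            # second and third columns
keep = (cols >= 3).all(axis=1)              # rows where both elements are >= 3
row_min = cols.min(axis=1, keepdims=True)   # shape (4, 1), broadcasts over columns

# Where keep is True use the row minimum, otherwise keep the original values.
a[:, 1:3] = np.where(keep[:, np.newaxis], row_min, cols)

print(a)
```

The condition is tested on the two columns themselves, so rows containing a value below 3 are left unchanged regardless of what the first column holds.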
