Index a torch tensor with an array - python

I have the following torch tensor:
tensor([[-0.2, 0.3],
[-0.5, 0.1],
[-0.4, 0.2]])
and the following numpy array: (I can convert it to something else if necessary)
[1 0 1]
I want to get the following tensor:
tensor([0.3, -0.5, 0.2])
i.e. I want the numpy array to index each sub-element of my tensor. Preferably without using a loop.
Thanks in advance

You may want to use torch.gather - "Gathers values along an axis specified by dim."
import numpy as np
import torch
t = torch.tensor([[-0.2, 0.3],
[-0.5, 0.1],
[-0.4, 0.2]])
idxs = np.array([1,0,1])
idxs = torch.from_numpy(idxs).long().unsqueeze(1)
# or torch.from_numpy(idxs).long().view(-1,1)
t.gather(1, idxs)
tensor([[ 0.3000],
[-0.5000],
[ 0.2000]])
Here, your index is a NumPy array, so you have to convert it to a LongTensor. The result has shape (3, 1); call .squeeze(1) on it to get the 1-D tensor from the question.

Simply use range(len(index)) for the first dimension.
import torch
a = torch.tensor([[-0.2, 0.3],
[-0.5, 0.1],
[-0.4, 0.2]])
c = [1, 0, 1]
b = a[range(3),c]
print(b)
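A minimal sketch of the same indexing idea for an arbitrary number of rows, starting from the NumPy index array in the question. The names rows, cols, picked and via_gather are illustrative only; the sketch assumes the index array has one entry per row of the tensor.

```python
import numpy as np
import torch

t = torch.tensor([[-0.2, 0.3],
                  [-0.5, 0.1],
                  [-0.4, 0.2]])
idxs = np.array([1, 0, 1])

# Convert the NumPy indices to an int64 tensor once.
cols = torch.from_numpy(idxs).long()
# One row index per entry of cols: 0, 1, 2, ...
rows = torch.arange(t.shape[0])

# Pairs (rows[k], cols[k]) select one element per row; result has shape (3,).
picked = t[rows, cols]

# The gather route gives the same values once the extra dimension is squeezed.
via_gather = t.gather(1, cols.unsqueeze(1)).squeeze(1)

print(picked)
```

Using torch.arange(t.shape[0]) instead of a hard-coded range(3) keeps the snippet valid when the number of rows changes.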

Related

Assigning values to a 2D tensor using indices in Tensorflow

I have a 2D tensor A, and I wish to replace its non-zero entries with the values of another tensor B as follows.
A = tf.constant([[1.0,0,1.0],[0,1.0,0],[1.0,0,1.0]],dtype=tf.float32)
B = tf.constant([1.0,2.0,3.0,4.0,5.0],dtype=tf.float32)
So I would like to have the final A as
A = tf.constant([[1.0,0.0,2.0],[0,3.0,0.0],[4.0,0.0,5.0]],dtype=tf.float32)
And I get the indices of non-zero elements of A as follows
where_nonzero = tf.not_equal(A, tf.constant(0, dtype=tf.float32))
indices = tf.where(where_nonzero)
indices = <tf.Tensor: shape=(5, 2), dtype=int64, numpy=
array([[0, 0],
[0, 2],
[1, 1],
[2, 0],
[2, 2]])>
Can someone please help with this?
IIUC, you should be able to use tf.tensor_scatter_nd_update:
import tensorflow as tf
A = tf.constant([[1.0,0,1.0],[0,1.0,0],[1.0,0,1.0]],dtype=tf.float32)
B = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0],dtype=tf.float32)
where_nonzero = tf.not_equal(A, tf.constant(0, dtype=tf.float32))
indices = tf.where(where_nonzero)
A = tf.tensor_scatter_nd_update(A, indices, B)
print(A)
tf.Tensor(
[[1. 0. 2.]
[0. 3. 0.]
[4. 0. 5.]], shape=(3, 3), dtype=float32)
You can try SparseTensor:
c = tf.constant([[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0],
[0.0, 0.0, 0.0]])
indices = [[1, 1]] # A list of coordinates to update.
values = [1.0] # A list of values corresponding to the respective
# coordinate in indices.
shape = [3, 3] # The shape of the corresponding dense tensor, same as `c`.
delta = tf.SparseTensor(indices, values, shape)
or, in TensorFlow 1.x, scatter_nd_update (here c must be a tf.Variable rather than a constant):
tf.scatter_nd_update(c, indices, values)
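The SparseTensor in this answer only describes the update; it still has to be combined with c. A minimal sketch, assuming TensorFlow 2.x with eager execution, is to densify it and add it. Note that this adds the values to c rather than overwriting them, so it matches a replacement only where c is zero. The name updated is illustrative.

```python
import tensorflow as tf

c = tf.zeros([3, 3])

# Sparse description of the update: value 1.0 at coordinate (1, 1).
delta = tf.SparseTensor(indices=[[1, 1]], values=[1.0], dense_shape=[3, 3])

# Densify the sparse update and add it to c (additive, not a replacement).
updated = c + tf.sparse.to_dense(delta)

print(updated.numpy())
```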

PyTorch: index 2D tensor with 2D tensor of row indices

I have a torch tensor a of shape (x, n) and another tensor b of shape (y, n), where y <= x. Every column of b contains a sequence of row indices for a. I would like to index a with b so that I obtain a tensor of shape (y, n) in which the ith column contains a[:, i][b[:, i]] (not quite sure if that's the correct way to express it).
Here's an example (where x = 5, y = 3 and n = 4):
import torch
a = torch.Tensor(
[[0.1, 0.2, 0.3, 0.4],
[0.6, 0.7, 0.8, 0.9],
[1.1, 1.2, 1.3, 1.4],
[1.6, 1.7, 1.8, 1.9],
[2.1, 2.2, 2.3, 2.4]]
)
b = torch.LongTensor(
[[0, 3, 1, 2],
[2, 2, 2, 0],
[1, 1, 0, 4]]
)
# How do I get from a and b to c
# (so that I can also assign to those elements in a)?
c = torch.Tensor(
[[0.1, 1.7, 0.8, 1.4],
[1.1, 1.2, 1.3, 0.4],
[0.6, 0.7, 0.3, 2.4]]
)
I can't get my head around this. What I'm looking for is a method that will not only yield the tensor c but also let me assign a tensor of the same shape as c to the elements of a that c is made up of.
I tried index_select, but it only supports a 1-D index tensor, so this applies it one column at a time:
bt = b.transpose(0, 1)
at = a.transpose(0, 1)
ct = [torch.index_select(at[i], dim=0, index=bt[i]) for i in range(len(at))]
c = torch.stack(ct).transpose(0, 1)
print(c)
"""
tensor([[0.1000, 1.7000, 0.8000, 1.4000],
[1.1000, 1.2000, 1.3000, 0.4000],
[0.6000, 0.7000, 0.3000, 2.4000]])
"""
It might not be the best solution, but I hope it helps at least.
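A loop-free alternative, sketched under the assumption that b holds valid row indices of a: torch.gather along dim 0 computes c[i, j] = a[b[i, j], j], which is the lookup asked for, and the same index pair can be used on the left-hand side for assignment. The names cols, c_indexed, new_vals and a_updated are illustrative.

```python
import torch

a = torch.tensor([[0.1, 0.2, 0.3, 0.4],
                  [0.6, 0.7, 0.8, 0.9],
                  [1.1, 1.2, 1.3, 1.4],
                  [1.6, 1.7, 1.8, 1.9],
                  [2.1, 2.2, 2.3, 2.4]])
b = torch.tensor([[0, 3, 1, 2],
                  [2, 2, 2, 0],
                  [1, 1, 0, 4]])  # integer lists become int64 tensors

# Read: c[i, j] = a[b[i, j], j]
c = a.gather(0, b)

# Equivalent advanced indexing; cols (shape (4,)) broadcasts against b (shape (3, 4)).
cols = torch.arange(a.shape[1])
c_indexed = a[b, cols]

# Write: assign a tensor shaped like c to the same positions (on a copy here).
new_vals = torch.zeros_like(c)
a_updated = a.clone()
a_updated[b, cols] = new_vals  # a_updated.scatter_(0, b, new_vals) does the same

print(c)
```

If a column of b repeats a row index, several values target the same element and which one is kept is not something to rely on.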

Get top N values from each sub-array from 2D numpy array [duplicate]

I think this is an easy question for experienced numpy users.
I have a score matrix. The row index corresponds to samples and the column index corresponds to items. For example,
score_matrix =
[[ 1. , 0.3, 0.4],
[ 0.2, 0.6, 0.8],
[ 0.1, 0.3, 0.5]]
I want to get the top-M item indices for each sample. I also want to get the top-M scores. For example,
top2_ind =
[[0, 2],
[2, 1],
[2, 1]]
top2_score =
[[1. , 0.4],
[0.8, 0.6],
[0.5, 0.3]]
What is the best way to do this using numpy?
Here's an approach using np.argpartition. The array is negated so that the M largest values of each row end up, in descending order, in the first M columns -
idx = np.argpartition(-a,range(M))[:,:M] # topM_ind
out = a[np.arange(a.shape[0])[:,None],idx] # topM_score
Sample run -
In [343]: a
Out[343]:
array([[ 1. , 0.3, 0.4],
[ 0.2, 0.6, 0.8],
[ 0.1, 0.3, 0.5]])
In [344]: M = 2
In [345]: idx = np.argpartition(-a,range(M))[:,:M]
In [346]: idx
Out[346]:
array([[0, 2],
[2, 1],
[2, 1]])
In [347]: a[np.arange(a.shape[0])[:,None],idx]
Out[347]:
array([[ 1. , 0.4],
[ 0.8, 0.6],
[ 0.5, 0.3]])
Alternatively, possibly slower, but a bit shorter code to get idx would be with np.argsort -
idx = a.argsort(1)[:,:-M-1:-1]
Here's a post containing some runtime test that compares np.argsort and np.argpartition on a similar problem.
I'd use argsort():
top2_ind = score_matrix.argsort()[:,::-1][:,:2]
That is, produce an array which contains the indices which would sort score_matrix:
array([[1, 2, 0],
[0, 1, 2],
[0, 1, 2]])
Then reverse the columns with ::-1, then take the first two columns with :2:
array([[0, 2],
[2, 1],
[2, 1]])
Then similar but with regular np.sort() to get the values:
top2_score = np.sort(score_matrix)[:,::-1][:,:2]
Which, following the same mechanics as above, gives you:
array([[ 1. , 0.4],
[ 0.8, 0.6],
[ 0.5, 0.3]])
In case someone is interested in both the values and the corresponding indices without tampering with the order, the following simple approach will be helpful. It could be computationally expensive with large data, though, since it uses a list to store (value, index) tuples.
import numpy as np
values = np.array([0.01,0.6, 0.4, 0.0, 0.1,0.7, 0.12]) # a simple array
values_indices = [] # define an empty list to store values and indices
while values.shape[0]>1:
values_indices.append((values.max(), values.argmax()))
# remove the maximum value from the array:
values = np.delete(values, values.argmax())
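The final output as a list of tuples (np.delete shrinks the array on every pass, so each index refers to the array at that step, not to the original):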
values_indices
[(0.7, 5), (0.6, 1), (0.4, 1), (0.12, 3), (0.1, 2), (0.01, 0)]
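If the indices should refer to positions in the original array, one option is a single argsort on the negated values. This is a sketch rather than a drop-in replacement for the loop above (it also keeps the smallest element); the names order and pairs are illustrative.

```python
import numpy as np

values = np.array([0.01, 0.6, 0.4, 0.0, 0.1, 0.7, 0.12])

# Indices into the original array, ordered from largest to smallest value.
order = np.argsort(-values)

# (value, original index) pairs in descending order of value.
pairs = [(float(values[i]), int(i)) for i in order]

print(pairs)
```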
Easy way would be:
To get top-2 indices
np.argsort(-score_matrix)[:, :2]
To get top-2 values
-np.sort(-score_matrix)[:, :2]
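For wide matrices, a full sort of every row does more work than needed. The sketch below uses a hypothetical helper, top_m, that partitions first and then sorts only the M selected columns; it assumes a 2-D array and 1 <= m <= number of columns.

```python
import numpy as np

def top_m(scores, m):
    """Return (indices, values) of the m largest entries per row, largest first."""
    # Unordered indices of the m largest entries in each row.
    part = np.argpartition(-scores, m - 1, axis=1)[:, :m]
    part_scores = np.take_along_axis(scores, part, axis=1)
    # Sort only those m candidates in descending order.
    order = np.argsort(-part_scores, axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    return idx, np.take_along_axis(scores, idx, axis=1)

score_matrix = np.array([[1.0, 0.3, 0.4],
                         [0.2, 0.6, 0.8],
                         [0.1, 0.3, 0.5]])
top2_ind, top2_score = top_m(score_matrix, 2)
print(top2_ind)
print(top2_score)
```

With tied scores, which of the tied indices comes first is not guaranteed.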

TensorFlow - dense vector to one-hot

Suppose I have the following tensor:
T = [[0.1, 0.3, 0.7],
[0.2, 0.5, 0.3],
[0.1, 0.1, 0.8]]
I want to transform this into a one-hot tensor, such that the index with the maximum value in each row (i.e. along axis 1) gets set to 1 and all the others get set to zero, like this:
T_onehot = [[0, 0, 1],
[0, 1, 0],
[0, 0, 1]]
I know there's tf.argmax to get the indices of the largest elements in the tensor, but is there any method which allows me to do what I want to do in one step?
I don't know if there's a way to do this in one step, but there's a one_hot function in tensorflow:
import tensorflow as tf
T = tf.constant([[0.1, 0.3, 0.7], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]])
T_onehot = tf.one_hot(tf.argmax(T, 1), T.shape[1])
tf.InteractiveSession()
print(T_onehot.eval())
# [[ 0. 0. 1.]
# [ 0. 1. 0.]
# [ 0. 0. 1.]]
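The snippet above relies on the TensorFlow 1.x session API. A minimal sketch of the same idea, assuming TensorFlow 2.x with eager execution, needs no session:

```python
import tensorflow as tf

T = tf.constant([[0.1, 0.3, 0.7],
                 [0.2, 0.5, 0.3],
                 [0.1, 0.1, 0.8]])

# Index of the largest entry in each row, expanded to a one-hot row of width 3.
T_onehot = tf.one_hot(tf.argmax(T, axis=1), depth=T.shape[1])

print(T_onehot.numpy())
```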

get max value of multiplication of column combinations and their respective index in python

I have a numpy array of M*N dimensions in which each element of the array is a float with a value between 0-1.
Input: for simplicity, let's consider a 3*4 array:
a=np.array([
[0.1, 0.2, 0.3, 0.6],
[0.3, 0.4, 0.8, 0.7],
[0.5, 0.6, 0.2, 0.1]
])
I want to consider 3 columns at a time (say col 0,1,2 for first iteration and 1,2,3 for second) and get the maximum value of multiplication of all possible combinations of the 3 columns and also get the index of their respective values.
In this case I should get max value of 0.5*0.6*0.8=0.24 and the index of the rows of values that gave the max value: (2,2,1) in this case.
Output: [[0.24,(2,2,1)],[0.336,(2,1,1)]]
I can do this using loops, but I want to avoid them as they would hurt the running time. Is there any way I can do that in numpy?
Here's an approach using NumPy strides that is supposedly very efficient for such sliding windowed operations as it creates a view into the array without actually making copies -
N = 3 # Window size
m,n = a.strides
p,q = a.shape
a3D = np.lib.stride_tricks.as_strided(a,shape=(p, q-N +1, N),strides=(m,n,n))
out1 = a3D.argmax(0)
out2 = a3D.max(0).prod(1)
Sample run -
In [69]: a
Out[69]:
array([[ 0.1, 0.2, 0.3, 0.6],
[ 0.3, 0.4, 0.8, 0.7],
[ 0.5, 0.6, 0.2, 0.1]])
In [70]: out1
Out[70]:
array([[2, 2, 1],
[2, 1, 1]])
In [71]: out2
Out[71]: array([ 0.24 , 0.336])
We can zip those two outputs together if needed in that format -
In [75]: zip(out2,map(tuple,out1))
Out[75]: [(0.23999999999999999, (2, 2, 1)), (0.33599999999999997, (2, 1, 1))]
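The same windows can be built without manual stride arithmetic. The sketch below assumes NumPy 1.20 or later (for sliding_window_view) and Python 3, where zip returns an iterator and has to be wrapped in list. As in the answer above, taking the per-column maximum gives the maximum product only because all values are non-negative. The names windows, rows, best and result are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([[0.1, 0.2, 0.3, 0.6],
              [0.3, 0.4, 0.8, 0.7],
              [0.5, 0.6, 0.2, 0.1]])
N = 3  # window size in columns

# windows[i, j, k] == a[i, j + k]; shape (rows, n_windows, N) = (3, 2, 3).
windows = sliding_window_view(a, N, axis=1)

# Row index of the largest value in each column of each window.
rows = windows.argmax(axis=0)
# Product of those per-column maxima, one value per window.
best = windows.max(axis=0).prod(axis=1)

result = list(zip(best.tolist(), map(tuple, rows.tolist())))
print(result)
```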
