Get value of variable index in particular dimension - python

Say I have a tensor:
value = torch.tensor([
[[0, 0, 0], [1, 1, 1]],
[[2, 2, 2], [3, 3, 3]],
])
with shape (2, 2, 3).
Now say I have index = [1, 0], which means I want to take:
# row 1 of [[0, 0, 0], [1, 1, 1]], giving me: [1, 1, 1]
# row 0 of [[2, 2, 2], [3, 3, 3]], giving me: [2, 2, 2]
So that the final output:
output = torch.tensor([[1, 1, 1], [2, 2, 2]])
Is there a vectorized way to achieve this?

You can use advanced indexing.
I can't find good PyTorch documentation on this, but I believe it works the same way as in NumPy, so NumPy's documentation on indexing should apply.
import torch
value = torch.tensor([
[[0, 0, 0], [1, 1, 1]],
[[2, 2, 2], [3, 3, 3]],
])
index = [1, 0]
i = range(0,2)
result = value[i, index]
# same as result = value[i, index, :]
print(result)
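Since the answer leans on NumPy's indexing rules, the same lookup can be checked in NumPy itself. This is a minimal sketch, assuming only that numpy is installed; the name batch is made up for illustration. Pairing each position along the first axis with its entry in index is what selects one row per sub-array.

```python
import numpy as np

value = np.array([
    [[0, 0, 0], [1, 1, 1]],
    [[2, 2, 2], [3, 3, 3]],
])
index = np.array([1, 0])

# One position per sub-array along axis 0: [0, 1]
batch = np.arange(value.shape[0])

# Pairs (0, 1) and (1, 0) are looked up; the last axis is kept whole.
result = value[batch, index]
print(result.tolist())  # [[1, 1, 1], [2, 2, 2]]
```

The two index arrays must have the same length (or be broadcastable), because they are matched up element by element.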

Related

Get list for duplicates on an other list python

I need help deriving one list from another:
input :
[[1, 1], [1, 1], [2, 2], [1, 1], [1, 1], [2, 2], [3, 3], [4, 4]]
output wanted :
[0, 0, 1, 0, 0, 1, 2, 3]
I tried to use enumerate but failed. Any suggestions?
Edit: Every time I meet a new element in the list, I associate it with a number (starting from 0 and adding 1 for every new element), and if I see it again later I reuse the same number. So [1,1] --> 0 because it is the first element we meet, [2,2] --> 1, etc.
Okay, I found a solution.
One more thing first: my example is bad, because the input list can also contain elements like [1,2].
The solution I found is:
line = [[1, 1], [1, 1], [2, 2], [1, 1], [2, 1], [2, 2], [3, 3], [4, 4]]
p = []
line_not = []
k = 0
for i in range(len(line)):
    if line[i] in line[:i]:
        p.append(line_not[:k].index(line[i]))
    else:
        p.append(k)
        line_not.append(line[i])
        k += 1
The output is:
[0, 0, 1, 0, 2, 1, 3, 4]
If you have a better solution, tell me!
Try building a map; this works:
inp=[[1, 1], [1, 1], [2, 2], [1, 1], [1, 1], [2, 2], [3, 3], [4, 4]]
out = [0, 0, 1, 0, 0, 1, 2, 3]
mymap={inp[0][0]:0}
output = [0]
k_count=1
for i in inp[1:]:
    if i[0] in mymap.keys():
        output.append(mymap[i[0]])
    else:
        mymap[i[0]] = k_count
        output.append(mymap[i[0]])
        k_count+=1
and then output == [0, 0, 1, 0, 0, 1, 2, 3]
First build a dictionary that associates each unique element with a number:
>>> x = [[1, 1], [1, 1], [2, 2], [1, 1], [1, 1], [2, 2], [3, 3], [4, 4]]
>>> d = {}
>>> for [i, _] in x:
...     if i not in d:
...         d[i] = len(d)
...
and then you can easily build your output list by doing lookups in that dictionary:
>>> [d[i] for [i, _] in x]
[0, 0, 1, 0, 0, 1, 2, 3]
This works for your current example, but it is not a general solution. Without context it's hard to understand what you are trying to achieve, so use with care:
import numpy as np
inp = [[1, 1], [1, 1], [2, 2], [1, 1], [1, 1], [2, 2], [3, 3], [4, 4]]
out = np.array([i[0] for i in inp]) - 1
print(out) # result: [0 0 1 0 0 1 2 3]
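The dictionary-based answers above key on the first number of each pair, so they would not tell [2, 1] apart from [2, 2], which the asker says can occur. A minimal sketch that keys on the whole pair instead, assuming each element can be converted to a tuple; the function name label_first_seen is made up for illustration:

```python
def label_first_seen(items):
    """Give each distinct item the number of distinct items seen before it."""
    labels = {}
    out = []
    for item in items:
        key = tuple(item)  # lists are unhashable, tuples can be dict keys
        if key not in labels:
            labels[key] = len(labels)  # next unused number
        out.append(labels[key])
    return out


line = [[1, 1], [1, 1], [2, 2], [1, 1], [2, 1], [2, 2], [3, 3], [4, 4]]
print(label_first_seen(line))  # [0, 0, 1, 0, 2, 1, 3, 4]
```

Each lookup is a dictionary access, so this avoids the repeated list scans (line[:i] and .index) of the asker's own solution.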

how to split a list in python based on the values of the list

I have a list having sublists of numbers and want to extract specific ones. In my simplified example I have two main sublists and each one has its own pairs of numbers:
data=[[[1, 0], [2, 0], [2, 1], [2, 2],\
[1, 0], [1, 1], [1, 2],\
[0, 1], [0, 2], [0, 3]],\
[[1, 0], [2, 0],\
[1, 0],\
[0, 1], [0, 2], [1, 2],\
[1, 0], [1, 1], [1, 1]]]
Pairs stored in data can be divided based on some rules, and I want the last pair of each division. For simplicity, I have shown each division as a row in data. Each division starts with [1, 0] or [0, 1]; these two pairs are the break points. I simply want the last pair before each break point. In some cases there is no pair between two break points, and then I only export the previous break point. Finally, I want the following list:
data=[[[2, 2],\
[1, 2],\
[0, 3]],\
[[2, 0],\
[1, 0],\
[1, 2],\
[1, 1]]]
You can do the following, using enumerate:
def fun(lst):
    return [p for i, p in enumerate(lst) if i==len(lst)-1 or set(lst[i+1])=={0,1}]
[*map(fun, data)]
# [[[2, 2], [1, 2], [0, 3]], [[2, 0], [1, 0], [1, 2], [1, 1]]]
fun filters a nested list for all elements that are either last or succeeded by [0, 1] or [1, 0].
data=[[[1, 0], [2, 0], [2, 1], [2, 2],
[1, 0], [1, 1], [1, 2],
[0, 1], [0, 2], [0, 3]],
[[1, 0], [2, 0],
[1, 0],
[0, 1], [0, 2], [1, 2],
[1, 0], [1, 1], [1, 1]]]
newData = []
for subarray in data:
    new_subarray = []
    for i,item in enumerate(subarray):
        if item == [0,1] or item == [1,0]:
            if i> 0:
                new_subarray.append(subarray[i-1])
        if i == len(subarray)-1:
            new_subarray.append(item)
    newData.append(new_subarray)
print(newData)
Here is a fun little unreadable numpy oneliner:
import numpy as np
[np.array(a)[np.roll(np.flatnonzero(np.logical_or(np.all(np.array(a)==(1, 0), axis=1), np.all(np.array(a)==(0, 1), axis=1)))-1, -1)].tolist() for a in data]
# [[[2, 2], [1, 2], [0, 3]], [[2, 0], [1, 0], [1, 2], [1, 1]]]
It works, but in practice you'd be better off using schwobaseggl's solution.
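The rule the answers share (keep a pair if it is the last one, or if the pair after it is a break point) can also be written by walking over consecutive pairs. A minimal sketch in plain Python; the names BREAKS and last_before_breaks are made up for illustration:

```python
BREAKS = ([1, 0], [0, 1])  # the two break-point pairs from the question


def last_before_breaks(pairs):
    """Keep each pair that is directly followed by a break point, plus the final pair."""
    kept = []
    for current, following in zip(pairs, pairs[1:]):
        if following in BREAKS:
            kept.append(current)
    if pairs:
        kept.append(pairs[-1])  # the last division has no break point after it
    return kept


data = [[[1, 0], [2, 0], [2, 1], [2, 2],
         [1, 0], [1, 1], [1, 2],
         [0, 1], [0, 2], [0, 3]],
        [[1, 0], [2, 0],
         [1, 0],
         [0, 1], [0, 2], [1, 2],
         [1, 0], [1, 1], [1, 1]]]

print([last_before_breaks(sub) for sub in data])
# [[[2, 2], [1, 2], [0, 3]], [[2, 0], [1, 0], [1, 2], [1, 1]]]
```

When two break points are adjacent, the first one is itself followed by a break point, so it is kept, which matches the "export the previous break point" case in the question.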

Using scatter_() to replace values in matrix

Given the following two tensors:
x = torch.tensor([[[1, 2],
[2, 0],
[0, 0]],
[[2, 2],
[2, 0],
[3, 3]]]) # [batch_size x sequence_length x subseq_length]
y = torch.tensor([[2, 1, 0],
[2, 1, 2]]) # [batch_size x sequence_length]
I would like to sort the sequences in x based on their sub-sequence lengths (0 corresponds to padding in the sequence). y corresponds to the lengths of the sub-sequences in x. I have tried the following:
y_sorted, y_sort_idx = y.sort(dim=1, descending=True)
print(x.scatter_(dim=1, index=y_sort_idx.unsqueeze(2), src=x))
This results in:
tensor([[[1, 2],
[2, 0],
[0, 0]],
[[2, 2],
[2, 0],
[2, 3]]])
However what I would like to achieve is:
tensor([[[1, 2],
[2, 0],
[0, 0]],
[[2, 2],
[3, 3],
[2, 0]]])
This should do it:
y_sorted, y_sort_idx = y.sort(dim=1, descending=True)
index = y_sort_idx.unsqueeze(2).expand_as(x)
x = x.gather(dim=1, index=index)
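The gather-based reordering above can be cross-checked without PyTorch. The sketch below uses NumPy instead, on the assumption that np.take_along_axis is an acceptable stand-in for torch.gather here; it is not the answer's code. A stable argsort on the negated lengths is used so that ties keep their original order.

```python
import numpy as np

x = np.array([[[1, 2], [2, 0], [0, 0]],
              [[2, 2], [2, 0], [3, 3]]])   # [batch_size x sequence_length x subseq_length]
y = np.array([[2, 1, 0],
              [2, 1, 2]])                  # [batch_size x sequence_length]

# Descending order of lengths per batch; kind="stable" keeps tied entries in input order.
order = np.argsort(-y, axis=1, kind="stable")

# The trailing axis lets the same row index apply to every column of x.
sorted_x = np.take_along_axis(x, order[:, :, None], axis=1)
print(sorted_x.tolist())
# [[[1, 2], [2, 0], [0, 0]], [[2, 2], [3, 3], [2, 0]]]
```

The second batch has two sub-sequences of length 2, so the result depends on how ties are ordered. In PyTorch, torch.sort accepts stable=True if that order matters.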

Combine index and value of tensor to form a new tensor

I have a tensor like a = torch.tensor([1,2,0,1,2]). I want to calculate a tensor b which has indices and values of tensor a such that:
b = tensor([ [0,1], [1,2], [2,0], [3,1], [4,2] ]).
Edit: a[i] is >= 0.
One way of doing this is:
b = torch.IntTensor(list(zip(range(0, list(a.size())[0], 1), a.numpy())))
Output:
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]], dtype=torch.int32)
Alternatively, you can also use torch.cat() as below:
a = torch.tensor([1,2,0,1,2])
indices = torch.arange(0, list(a.size())[0])
res = torch.cat([indices.view(-1, 1), a.view(-1, 1)], 1)
Output:
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]])
Another option is torch.stack():
a = torch.tensor([1,2,0,1,2])
print(a)
i = torch.arange(a.size(0))
print(i)
r = torch.stack((i, a), dim=1)
print(r)
Output:
tensor([1, 2, 0, 1, 2])
tensor([0, 1, 2, 3, 4])
tensor([[0, 1],
[1, 2],
[2, 0],
[3, 1],
[4, 2]])

How do I accept "divide by zero" as zero? (Python)

So I have the following code:
import numpy as np
array1 = np.array([[[[2, 2, 3], [0, 2, 0], [2, 0, 0]],
[[1, 2, 2], [2, 2, 0], [0, 2, 3]],
[[0, 4, 2], [2, 2, 2], [2, 2, 3]]],
[[[2, 3, 0], [3, 2, 0], [2, 0, 3]],
[[0, 2, 2], [2, 2, 0], [2, 2, 3]],
[[1, 0, 2], [2, 2, 2], [2, 2, 0]]],
[[[2, 0, 0], [0, 2, 0], [2, 0, 0]],
[[2, 2, 2], [0, 2, 0], [2, 2, 0]],
[[0, 2, 2], [2, 2, 2], [2, 2, 0]]]])
array2 = np.array([[[[2, 2, 3], [0, 2, 0], [2, 0, 0]],
[[1, 2, 2], [2, 2, 0], [0, 2, 3]],
[[0, 4, 2], [2, 2, 2], [2, 2, 3]]],
[[[2, 3, 0], [3, 2, 0], [2, 0, 3]],
[[0, 2, 2], [2, 10, 0], [2, 2, 3]],
[[1, 0, 2], [2, 2, 2], [2, 2, 0]]],
[[[2, 0, 0], [0, 2, 0], [2, 0, 0]],
[[2, 2, 2], [0, 2, 0], [2, 2, 0]],
[[0, 2, 2], [2, 2, 2], [2, 2, 0]]]])
def calc(x, y):
    result = y/x
    return result
final_result = []
for x, y in zip(array1, array2):
    final_result.append(calc(np.array(x), np.array(y)))
So all in all I have two arrays, each holding several 3D arrays, and I have defined a function. In the last part I apply the function to each pair of 3D arrays and end up with a list (final_result) of 3D arrays, one per pair from array1 and array2.
However, as you can see, array1, which supplies the x values in the function, has 0 values in some entries. And yes, mathematically, this is no good. But in this case, I really just need the entries that have a zero x-entry to be zero. So it doesn't need to run the function whenever that happens, but can just skip it and leave that entry as zero.
Can this be done?
This question has been answered before. NumPy has a specific way to suppress such warnings:
def calc( a, b ):
    """ ignore / 0, div0( [-1, 0, 1], 0 ) -> [0, 0, 0] """
    # Note: this computes a / b, so call it as calc(y, x) to get y / x as in the question.
    with np.errstate(divide='ignore', invalid='ignore'):
        c = np.true_divide( a, b )
        c[ ~ np.isfinite( c )] = 0 # -inf inf NaN
    return c
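An alternative that skips the zero entries instead of cleaning up afterwards is to pass where= and out= to np.divide. This is a minimal sketch rather than the linked answer's method; the name safe_divide is made up for illustration, and it assumes x and y have the same shape.

```python
import numpy as np


def safe_divide(y, x):
    """Return y / x elementwise, leaving 0 wherever x == 0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(y)                   # entries with x == 0 keep this 0
    np.divide(y, x, out=out, where=x != 0)   # division only runs where x != 0
    return out


print(safe_divide(np.array([4, 5, 2]), np.array([2, 0, 4])).tolist())
# [2.0, 0.0, 0.5]
```

Because the division is never evaluated at the masked positions, no divide-by-zero warning is raised and no errstate block is needed. In the question's loop this would be called as safe_divide(y, x).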
