Pytorch index with Tensor

I have a 2-dimensional tensor arr with 0 as all the entries. I have a second tensor idx. I want to set the entries of arr at the indices given in idx to 1.
arr = torch.zeros(size=(2, 10))
idx = torch.Tensor([
    [0, 2],
    [4, 5]
])
arr[idx] = 1  # This doesn't work
print(arr)
The output should look like this:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])
I was confident that I would find someone else who had asked this on SO, but I couldn't find such a question. I hope it isn't a duplicate.

Use scatter() along dim=1, which in this case is also the innermost dimension, i.e. dim=-1. Note that in place of the src tensor, I just passed the constant value 1. Also note that scatter requires an integer (int64) index tensor, hence torch.tensor([...]) below rather than torch.Tensor([...]), which creates a float tensor.
In [31]: arr = torch.zeros(size=(2, 10))

In [32]: idx = torch.tensor([
    ...:     [0, 2],
    ...:     [4, 5]
    ...: ])

In [33]: torch.scatter(arr, 1, idx, 1)
Out[33]:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])

In [34]: torch.scatter(arr, -1, idx, 1)
Out[34]:
tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])
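Alternatively, the direct-indexing assignment attempted in the question can be made to work with integer advanced indexing, pairing each row number with its column indices. A minimal sketch, assuming the same 2x10 setup as above:
import torch

arr = torch.zeros(2, 10)
idx = torch.tensor([[0, 2],
                    [4, 5]])  # must be an integer (long) tensor

# Row numbers of shape (2, 1) broadcast against the column indices of shape (2, 2)
rows = torch.arange(arr.size(0)).unsqueeze(1)
arr[rows, idx] = 1  # in-place assignment

print(arr)
# tensor([[1., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
#         [0., 0., 0., 0., 1., 1., 0., 0., 0., 0.]])
The in-place scatter form, arr.scatter_(1, idx, 1), should give the same result as the functional calls above.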

Related

How to vectorize one hot encoding loop in numpy

Is there a way to vectorize the loop in this code?
def get_onehot(y):
    categories = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    arr = np.zeros((y.shape[0], len(categories)))
    for i in range(y.shape[0]):
        n = y[i]
        arr[i][n] = 1
    return arr

>>> get_onehot(np.array([0, 2, 5]))
array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]])
I assume this is not the most efficient way of doing it, and I'm wondering whether there is a faster, vectorized improvement available.
If you don't want to use scikit-learn, here is a NumPy way:
import numpy as np

def get_onehot(y, n=10):
    return np.eye(n)[y]

get_onehot(np.array([0, 2, 5]))
# array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
#        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
#        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]])
np.eye documentation
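Another NumPy-only option, if you'd rather not build an identity matrix, is a broadcast comparison against the category range. A small sketch, assuming integer labels in 0..n-1:
def get_onehot_broadcast(y, n=10):
    # Compare each label with [0, 1, ..., n-1]; True exactly where they match.
    return (y[:, None] == np.arange(n)).astype(float)

get_onehot_broadcast(np.array([0, 2, 5]))
# Same output as above.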

`np.add.at` to 2-dimensional array

I'm looking for a 2-dimensional version of np.add.at().
The expected behavior is as follows.
augend = np.zeros((10, 10))
indices_for_dim0 = np.array([1, 5, 2])
indices_for_dim1 = np.array([5, 3, 1])
addend = np.array([1, 2, 3])
### some procedure substituting np.add.at ###
assert augend[1, 5] == 1
assert augend[5, 3] == 2
assert augend[2, 1] == 3
Any advice will help!
You can use np.add.at multidimensionally as it is. The documentation for the indices argument says the following:
... If first operand has multiple dimensions, indices can be a tuple of array like index objects or slice
So:
augend = np.zeros((10, 10))
indices_for_dim0 = np.array([1, 5, 2])
indices_for_dim1 = np.array([5, 3, 1])
addend = np.array([1, 2, 3])
np.add.at(augend, (indices_for_dim0, indices_for_dim1), addend)
More simply (equivalent here because no index pair is repeated; when indices repeat, only the unbuffered np.add.at accumulates every contribution):
augend[indices_for_dim0, indices_for_dim1] += addend
If you're really worried about the multidimensional aspect and your augend is a vanilla contiguous C order array, you can use ravel and ravel_multi_index to perform the operation on a 1D view:
indices = np.ravel_multi_index((indices_for_dim0, indices_for_dim1), augend.shape)
raveled = augend.ravel()
np.add.at(raveled, indices, addend)
One-liner:
np.add.at(augend, (indices_for_dim0, indices_for_dim1), addend)

augend
array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
       [0., 3., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 2., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])
assert augend[1, 5] == 1
assert augend[5, 3] == 2
assert augend[2, 1] == 3
# No AssertionError
When using np.add.at on a 2-D array, indices must be a tuple whose first element contains all the first (row) coordinates and whose second element contains all the second (column) coordinates.
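For reference, the difference between np.add.at and a plain fancy-indexed += only shows up when an index pair repeats. A small sketch with made-up indices illustrating this:
import numpy as np

a = np.zeros((3, 3))
b = np.zeros((3, 3))
rows = np.array([0, 0, 1])  # the pair (0, 2) appears twice
cols = np.array([2, 2, 1])
vals = np.array([1., 1., 5.])

np.add.at(a, (rows, cols), vals)  # unbuffered: duplicates accumulate
b[rows, cols] += vals             # buffered: duplicate writes collapse to one

print(a[0, 2], b[0, 2])  # 2.0 1.0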

Finding the start/stop positions and length of the longest and shortest sequence of 1s or 0s in a numpy matrix

I have a numpy matrix that looks like:
matrix = [[0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 1., 1., 1., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
           1., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
          [0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
           0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]
How would I get the length of the longest sequence of 1s or 0s? Also how would I get their start and stop positions?
Is there an easier numpy-way to get this done?
Output format is flexible as long as it denotes the inner list index, the length value, and value's list indices.
Example:
LONGEST ONES: 1, 16, 2, 17 (index of inner list, length, start index of the longest 1s sequence, end index of the longest 1s sequence).
or [1, 16, 2, 17]/(1, 16, 2, 17)
LONGEST ZEROS: 2, 45, 0, 45
Not a duplicate of these questions as this concerns a matrix:
find the start position of the longest sequence of 1's
The result (the longest) should be chosen across all inner lists.
A sequence count does not continue when it reaches the end of an inner list.
Starting from Divakar's base answer, you can adapt it with np.vectorize, setting the signature argument and doing some simple arithmetic to get what you're looking for.
Take, for instance,
m = np.array(matrix)

def get_longest_ones_matrix(b):
    # Pad with False on both sides so every run of 1s yields a (start, end) pair.
    idx_pairs = np.where(np.diff(np.hstack(([False], b == 1, [False]))))[0].reshape(-1, 2)
    if not idx_pairs.size:
        return np.array([0, 0, 0])
    d = np.diff(idx_pairs, axis=1).argmax()
    start_longest_seq = idx_pairs[d, 0]
    end_longest_seq = idx_pairs[d, 1]
    l = end_longest_seq - start_longest_seq   # run length
    p = start_longest_seq % 45                # start position within the row
    e = end_longest_seq - 1                   # inclusive end position
    return np.array([l, p, e])

s = m.shape[-1]
v = np.vectorize(get_longest_ones_matrix, signature=f'({s})->(3)')
x = v(m)
Which yields
[[ 3 26 28]
[16 2 17]
[ 0 0 0]]
Then,
a = x[:,0].argmax()
print(a,x[a])
1 [16 2 17]
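The question also asks for the longest run of 0s; the same function can be reused by flipping the matrix, since a run of zeros in m is a run of ones in 1 - m. A sketch under that assumption, reusing s and get_longest_ones_matrix from above:
def get_longest_zeros_matrix(b):
    # A run of 0s in b is a run of 1s in 1 - b, so reuse the same logic.
    return get_longest_ones_matrix(1 - b)

vz = np.vectorize(get_longest_zeros_matrix, signature=f'({s})->(3)')
xz = vz(m)
az = xz[:, 0].argmax()
print(az, xz[az])
# 2 [45  0 44]  (row 2: 45 zeros, starting at index 0, ending at index 44)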

Index into NumPy array ignoring NaNs in the indexing array

I have an array of zeros
arr = np.zeros([5,5])
array([[ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.]])
I want to assign values based on indices, so I did this:
out = np.array([[np.nan, 2., 4., 1., 1.], [np.nan, 3., 4., 4., 4.]])
arr[out[0].astype(int),np.arange(len(out[0]))] = 1
arr[out[1].astype(int),np.arange(len(out[1]))] = 1
The assignment works fine if there is a 0 instead of nan.
How can I skip the assignment where the index is nan? And is it possible to assign the values at once from a multidimensional index array rather than using a for loop?
Mask it -
mask = ~np.isnan(out)
arr[out[0,mask[0]].astype(int),np.flatnonzero(mask[0])] = 1
arr[out[1,mask[1]].astype(int),np.flatnonzero(mask[1])] = 1
Sample run -
In [171]: out
Out[171]:
array([[ nan, 2., 4., 1., 1.],
       [ nan, 3., 4., 4., 4.]])

In [172]: mask = ~np.isnan(out)
     ...: arr[out[0,mask[0]].astype(int),np.flatnonzero(mask[0])] = 1
     ...: arr[out[1,mask[1]].astype(int),np.flatnonzero(mask[1])] = 1
     ...:

In [173]: arr
Out[173]:
array([[ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 1., 1.],
       [ 0., 1., 0., 0., 0.],
       [ 0., 1., 0., 0., 0.],
       [ 0., 0., 1., 1., 1.]])
Alternatively, replace the flatnonzero calls with range-masking -
r = np.arange(arr.shape[1])
arr[out[0,mask[0]].astype(int),r[mask[0]]] = 1
arr[out[1,mask[1]].astype(int),r[mask[1]]] = 1
If you are working with many more rows than just 2 and you want to assign to them in a vectorized manner, here's one method using linear indexing -
n = arr.shape[1]
linear_idx = (out*n + np.arange(n))
np.put(arr, linear_idx[~np.isnan(linear_idx)].astype(int), 1)
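The mask-based approach can also be written as a single vectorized assignment covering every row of out at once. A sketch, assuming as in the question that each row of out holds the target row indices for columns 0..n-1:
mask = ~np.isnan(out)                            # True wherever an index is present
cols = np.tile(np.arange(arr.shape[1]), (out.shape[0], 1))
arr[out[mask].astype(int), cols[mask]] = 1       # one assignment for all rows of `out`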

How to fill numpy array of zeros with ones given indices/coordinates

Given a numpy array of zeros, say
arr = np.zeros((5, 5))
and an array of indices that represent vertices of a polygon, say
verts = np.array([[0, 2], [2, 0], [2, 4]])
1) What is an elegant way of doing
for v in verts:
    arr[v[0], v[1]] = 1
such that the resulting array is
In [108]: arr
Out[108]:
array([[ 0., 0., 1., 0., 0.],
       [ 0., 0., 0., 0., 0.],
       [ 1., 0., 0., 0., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.]])
2) How can I fill the array with ones such that the output array is
In [158]: arr
Out[158]:
array([[ 0., 0., 1., 0., 0.],
       [ 0., 1., 1., 1., 0.],
       [ 1., 1., 1., 1., 1.],
       [ 0., 0., 0., 0., 0.],
       [ 0., 0., 0., 0., 0.]])
To answer the first part of your question: arr[tuple(verts.T)] = 1
verts.T transposes your indices to a (2, n) array, where the two rows correspond to the row and column dimensions of arr. These are then unpacked into a tuple of (row_indices, col_indices), which we then use to index into arr.
We could write this a bit more verbosely as:
row_indices = verts[:, 0]
col_indices = verts[:, 1]
arr[row_indices, col_indices] = 1
For the second part, one method that will work for arbitrary polygons is matplotlib.path.Path.contains_points, as described here:
from matplotlib.path import Path
points = np.indices(arr.shape).reshape(2, -1).T
path = Path(verts)
mask = path.contains_points(points, radius=1e-9)
mask = mask.reshape(arr.shape).astype(arr.dtype)
print(repr(mask))
# array([[ 0., 0., 1., 0., 0.],
#        [ 0., 1., 1., 1., 0.],
#        [ 1., 1., 1., 1., 1.],
#        [ 0., 0., 0., 0., 0.],
#        [ 0., 0., 0., 0., 0.]])
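If scikit-image is available, skimage.draw.polygon offers a similar rasterization route; note that its boundary handling differs somewhat from Path.contains_points, so pixels exactly on the edges may not match. A sketch:
import numpy as np
from skimage.draw import polygon

arr = np.zeros((5, 5))
verts = np.array([[0, 2], [2, 0], [2, 4]])

# polygon() takes the row and column coordinates of the vertices separately
# and returns the indices of the pixels that fall inside the polygon.
rr, cc = polygon(verts[:, 0], verts[:, 1], shape=arr.shape)
arr[rr, cc] = 1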
