Set a numpy slice using np.where based on another dimension - python

How can I set values in the first 3 channels of a 4 channel numpy array based on values in the 4th channel? Is it possible to do so with a numpy slice as an l-value?
Given a 3 by 2 pixel numpy array with 4 channels
a = np.arange(24).reshape(3,2,4)
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[12, 13, 14, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
I can select the pixels where the 4th channel is divisible by 3.
px = np.where(0==a[:,:,3]%3)
(array([0, 1], dtype=int64), array([0, 1], dtype=int64))
a[px]
array([[ 0, 1, 2, 3],
[12, 13, 14, 15]])
Now I want to set the first 3 channels in those rows of a to 0 so that the result looks like:
a
array([[[ 0, 0, 0, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 0, 0, 0, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
I tried
a[px][:,0:3] = 0
but that leaves the array unchanged.
I read "Setting values in a numpy arrays indexed by a slice and two boolean arrays" but do not understand how to use a Boolean index to set only the first 3 channels.

Here is one way:
>>> px0, px1 = np.where(0==a[:,:,3]%3)
>>> a[px0, px1, :3] = 0
>>> a
array([[[ 0, 0, 0, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 0, 0, 0, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
or
>>> px = np.where(0==a[:,:,3]%3)
>>> a[..., :3][px] = 0
>>> a
array([[[ 0, 0, 0, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 0, 0, 0, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
or
>>> a[(*px, np.s_[:3])] = 0
>>> a
array([[[ 0, 0, 0, 3],
[ 4, 5, 6, 7]],
[[ 8, 9, 10, 11],
[ 0, 0, 0, 15]],
[[16, 17, 18, 19],
[20, 21, 22, 23]]])
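For reference, the attempt in the question leaves a unchanged because a[px] uses advanced (integer-array) indexing, which returns a copy; the following [:, 0:3] = 0 then writes into that temporary copy. The sketch below illustrates this and also shows a Boolean-mask variant, which is what the question asked about. The names mask, tmp and original are introduced here for illustration only.

```python
import numpy as np

a = np.arange(24).reshape(3, 2, 4)
original = a.copy()

# Boolean mask over the first two axes: True where channel 3 is divisible by 3.
mask = a[:, :, 3] % 3 == 0

# Advanced indexing returns a copy, so this writes into a temporary array.
tmp = a[mask]
tmp[:, 0:3] = 0
print(np.array_equal(a, original))  # True: a is untouched

# Combining the mask and the channel slice in ONE indexing operation
# assigns into a itself.
a[mask, :3] = 0
print(a[0, 0].tolist(), a[1, 1].tolist())  # [0, 0, 0, 3] [0, 0, 0, 15]
```

The mask has shape (3, 2), so it stands in for the first two axes, and :3 then selects the channels to overwrite.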

Related

How do I aggregate a 2D tensor while some of them have to be averaged and some of them retained?

I want to aggregate a 2D tensor. The rule is: if a number in the variable "idx" is repeated, average the corresponding rows of the tensor. Below is the problem.
idx = torch.tensor([[0, 1, 2], [1, 2, 3], [3, 4, 5]])
x = torch.tensor([[10, 10, 10], [11, 11, 11], [12, 12, 12],
[13, 13, 13], [14, 14, 14], [15, 15, 15],
[16, 16, 16], [17, 17, 17], [18, 18, 18]])
For example, since index (1, 2) is repeated, the corresponding tensors
[[11, 11, 11], [12, 12, 12]] and [[13, 13, 13], [14, 14, 14]]
will be aggregated and thus we get the tensor
[[12, 12, 12], [13, 13, 13]]
if there is no repeated number, keep the tensor. So for this problem the desired answer is
torch.tensor([[10, 10, 10], [12, 12, 12], [13, 13, 13], [15.5, 15.5, 15.5], [17, 17, 17], [18, 18, 18]])
How do I do this? Thank you in advance for helping.
It looks like you could use a scatter function.
Minimal setup in 2D
Let us explore a reduced version of your problem whereby instead of having triplets, we have single values. In that scenario, we would have:
>>> x = torch.tensor([[10, 11, 12],
[13, 14, 15],
[16, 17, 18]])
>>> idx = torch.tensor([[0, 1, 2],
[1, 2, 3],
[3, 4, 5]])
We should look at an intermediate result rather than the desired result itself. Ultimately, we know that the result will be summed or averaged along one of its dimensions. We can first define its shape: it should have as many rows as x and idx, and as many columns as there are unique indices in idx. In this particular example, the intermediate result we're looking for will have a shape of (3, 6), since the indices span the integers 0 through 5, i.e. 6 values:
tensor([[10, 11, 12, 0, 0, 0],
[ 0, 13, 14, 15, 0, 0],
[ 0, 0, 0, 16, 17, 18]])
We can see that once reduced along the first dimension, we get:
tensor([10., 12., 13., 15.5, 17., 18.])
Which is the desired result for this minimal example in 1D.
To do such an operation we can use torch.Tensor.scatter_. When applied to two-dimensional tensors with dim=1, calling this line:
out.scatter_(dim=1, index=idx, src=x)
will have the following effect on out:
out[i][idx[i][j]] = x[i][j]
If we try on our input tensors, we get:
>>> o = torch.zeros(len(idx), idx.max()+1, dtype=int)
>>> o.scatter_(1, idx, x)
tensor([[10, 11, 12, 0, 0, 0],
[ 0, 13, 14, 15, 0, 0],
[ 0, 0, 0, 16, 17, 18]])
Then we only have to average over nonzero values using torch.sum and torch.count_nonzero:
>>> o.sum(dim=0) / o.count_nonzero(dim=0)
tensor([10.0000, 12.0000, 13.0000, 15.5000, 17.0000, 18.0000])
Original problem
However, in your original use case, you have three values per element:
>>> x = torch.tensor([[10, 10, 10], [11, 11, 11], [12, 12, 12],
[13, 13, 13], [14, 14, 14], [15, 15, 15],
[16, 16, 16], [17, 17, 17], [18, 18, 18]])
>>> idx = torch.tensor([[0, 1, 2],
[1, 2, 3],
[3, 4, 5]])
Therefore, we first have to do the following on x and idx:
Reshape x to a 3D tensor with shape (*idx.shape, x.size(-1)):
>>> x_ = x.view(*idx.shape, x.size(-1))
tensor([[[10, 10, 10],
[11, 11, 11],
[12, 12, 12]],
[[13, 13, 13],
[14, 14, 14],
[15, 15, 15]],
[[16, 16, 16],
[17, 17, 17],
[18, 18, 18]]])
Unsqueeze and expand idx such that it has the same shape as x_:
>>> idx_ = idx.unsqueeze(-1).expand_as(x_)
tensor([[[0, 0, 0],
[1, 1, 1],
[2, 2, 2]],
[[1, 1, 1],
[2, 2, 2],
[3, 3, 3]],
[[3, 3, 3],
[4, 4, 4],
[5, 5, 5]]])
Then we simply have to apply the same approach with torch.scatter, but in 3D, i.e. out.scatter_(dim=1, index=idx_, src=x_), which will result in:
out[i][idx_[i][j][k]][k] = x_[i][j][k]
For our inputs:
>>> o = torch.zeros(len(idx), idx.max()+1, x.size(-1), dtype=int)
>>> o.scatter_(1, idx_, x_)
tensor([[[10, 10, 10],
[11, 11, 11],
[12, 12, 12],
[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[13, 13, 13],
[14, 14, 14],
[15, 15, 15],
[ 0, 0, 0],
[ 0, 0, 0]],
[[ 0, 0, 0],
[ 0, 0, 0],
[ 0, 0, 0],
[16, 16, 16],
[17, 17, 17],
[18, 18, 18]]])
To finish it off, reduce on dim=0, the same way as before:
>>> o.sum(dim=0) / o.count_nonzero(dim=0)
tensor([[10.0000, 10.0000, 10.0000],
[12.0000, 12.0000, 12.0000],
[13.0000, 13.0000, 13.0000],
[15.5000, 15.5000, 15.5000],
[17.0000, 17.0000, 17.0000],
[18.0000, 18.0000, 18.0000]])
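One caveat with count_nonzero is that a genuine 0 in x would not be counted. A possible alternative, sketched here under the assumptions that x can be converted to floating point and that every index from 0 to idx.max() occurs at least once, accumulates sums with torch.Tensor.index_add_ and counts occurrences with torch.bincount. The names flat_idx, n, sums and counts are illustrative and not part of the answer above.

```python
import torch

idx = torch.tensor([[0, 1, 2], [1, 2, 3], [3, 4, 5]])
x = torch.tensor([[10, 10, 10], [11, 11, 11], [12, 12, 12],
                  [13, 13, 13], [14, 14, 14], [15, 15, 15],
                  [16, 16, 16], [17, 17, 17], [18, 18, 18]], dtype=torch.float)

flat_idx = idx.reshape(-1)      # one group index per row of x
n = int(flat_idx.max()) + 1     # number of output rows

# sums[flat_idx[r]] += x[r] for every row r of x
sums = torch.zeros(n, x.size(-1)).index_add_(0, flat_idx, x)
# how many rows of x fell into each group
counts = torch.bincount(flat_idx, minlength=n).unsqueeze(-1)

result = sums / counts
print(result[:, 0].tolist())  # [10.0, 12.0, 13.0, 15.5, 17.0, 18.0]
```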

numpy is double transposition necessary in this specific case?

I have an array
xx = np.arange(24).reshape(2, 12)
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]])
and I would like to reshape it, to obtain
array([[[ 0, 1, 2, 3],
[12, 13, 14, 15]],
[[ 4, 5, 6, 7],
[16, 17, 18, 19]],
[[ 8, 9, 10, 11],
[20, 21, 22, 23]]])
I can achieve it via
xx.T.reshape(3, 4, 2).transpose(0, 2, 1)
But it has to be transposed twice, which seems unnecessary to me. Could somebody confirm that this is the only way of doing it, or otherwise provide a more readable solution?
Thanks!
It is possible to do a single transpose:
data = np.arange(24).reshape(2, 12)
data = data.reshape(2, 3, 4).transpose(1, 0, 2)
Edit:
I checked this using itertools.permutations and itertools.product:
import itertools
import numpy as np

data = np.arange(24).reshape(2, 12)
desired_data = np.array([[[ 0,  1,  2,  3],
                          [12, 13, 14, 15]],
                         [[ 4,  5,  6,  7],
                          [16, 17, 18, 19]],
                         [[ 8,  9, 10, 11],
                          [20, 21, 22, 23]]])
shapes = [2, 3, 4]
transpose_dims = [0, 1, 2]
shape_permutations = itertools.permutations(shapes)
transpose_permutations = itertools.permutations(transpose_dims)
for shape, transpose in itertools.product(
    list(shape_permutations),
    list(transpose_permutations),
):
    new_data = data.reshape(*shape).transpose(*transpose)
    # array_equal returns False (rather than raising) when the shapes differ,
    # so the loop stops only at a combination that really matches
    if np.array_equal(new_data, desired_data):
        break
print(f"{shape=}, {transpose=}")
shape=(2, 3, 4), transpose=(1, 0, 2)
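As a quick sanity check, the two forms can be compared directly. This is a minimal sketch; the names double and single are illustrative.

```python
import numpy as np

xx = np.arange(24).reshape(2, 12)

double = xx.T.reshape(3, 4, 2).transpose(0, 2, 1)   # form from the question
single = xx.reshape(2, 3, 4).transpose(1, 0, 2)     # single-transpose form

print(np.array_equal(double, single))   # True
print(np.shares_memory(single, xx))     # True: reshape + transpose gave a view
```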
I would do it this way: first, generate two arrays (shown separated for the sake of decomposition):
xx.reshape(2, -1, 4)
# Output:
# array([[[ 0, 1, 2, 3],
# [ 4, 5, 6, 7],
# [ 8, 9, 10, 11]],
#
# [[12, 13, 14, 15],
# [16, 17, 18, 19],
# [20, 21, 22, 23]]])
From here, I would then stack along the second dimension in order to combine them like you want:
np.stack(xx.reshape(2, -1, 4), axis=1)
# Output:
# array([[[ 0, 1, 2, 3],
# [12, 13, 14, 15]],
#
# [[ 4, 5, 6, 7],
# [16, 17, 18, 19]],
#
# [[ 8, 9, 10, 11],
# [20, 21, 22, 23]]])
You'd avoid the transposition. Hopefully it's more readable, but in the end, that's highly subjective, right? '^^
To add to @Paul's answer, there is some speedup from removing one of the transposes. The time gain is about 15%.
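One hypothetical way to measure this yourself is with timeit, as sketched below. The helper names double_transpose and single_transpose and the repetition count are assumptions made for illustration, and the timings will vary by machine and NumPy version, so no particular numbers are claimed here.

```python
import timeit
import numpy as np

xx = np.arange(24).reshape(2, 12)

def double_transpose():
    # form from the question
    return xx.T.reshape(3, 4, 2).transpose(0, 2, 1)

def single_transpose():
    # single-transpose form
    return xx.reshape(2, 3, 4).transpose(1, 0, 2)

n = 100_000  # number of repetitions per measurement
t_double = timeit.timeit(double_transpose, number=n)
t_single = timeit.timeit(single_transpose, number=n)
print(f"double: {t_double:.3f}s, single: {t_single:.3f}s")
```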

Masking out row and column of 3-D numpy array based on 2-D boolean mask

For some 3-D cubic numpy array like the following:
import numpy as np
a = np.array([[[1,2,3],[4,5,6],[7,8,9]],[[10,11,12],[13,14,15],[16,17,18]],[[19,20,21],[22,23,24],[25,26,27]]])
array([[[ 1, 2, 3],
[ 4, 5, 6],
[ 7, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 21],
[22, 23, 24],
[25, 26, 27]]])
and some 2-D boolean mask array like the following:
b = np.array([[0,1,1],[1,1,1],[1,1,0]])
array([[0, 1, 1],
[1, 1, 1],
[1, 1, 0]])
I'm wondering if there is a way, using numpy operations, to compute a result such that for all elements where b[i][j] = 0, then a[i,:,j] = 0 and a[i,j,:] = 0. It's guaranteed that b is n x n and a is n x n x n. In the above example the result would look like
array([[[ 0, 0, 0],
[ 0, 5, 6],
[ 0, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 0],
[22, 23, 0],
[ 0, 0, 0]]])
In [111]: b = np.array([[0,1,1],[1,1,1],[1,1,0]])
In [116]: I,J = np.nonzero(b==0)
In [117]: I,J
Out[117]: (array([0, 2]), array([0, 2]))
Test the indexing:
In [118]: a[I,:,J]
Out[118]:
array([[ 1, 4, 7],
[21, 24, 27]])
In [119]: a[I,J,:]
Out[119]:
array([[ 1, 2, 3],
[25, 26, 27]])
Apply:
In [120]: a[I,:,J]=0
In [121]: a[I,J,:]=0
In [122]: a
Out[122]:
array([[[ 0, 0, 0],
[ 0, 5, 6],
[ 0, 8, 9]],
[[10, 11, 12],
[13, 14, 15],
[16, 17, 18]],
[[19, 20, 0],
[22, 23, 0],
[ 0, 0, 0]]])
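If modifying a in place is not desired, the same result can probably be obtained with broadcasting and np.where. This is a sketch under the question's assumption that b is n x n and a is n x n x n; the names zero and result are illustrative.

```python
import numpy as np

a = np.arange(1, 28).reshape(3, 3, 3)   # same values as a in the question
b = np.array([[0, 1, 1], [1, 1, 1], [1, 1, 0]])

zero = b == 0
# zero[:, None, :] broadcasts zero[i, j] over a[i, :, j];
# zero[:, :, None] broadcasts zero[i, j] over a[i, j, :].
result = np.where(zero[:, None, :] | zero[:, :, None], 0, a)
print(result[0].tolist())  # [[0, 0, 0], [0, 5, 6], [0, 8, 9]]
```

Here a itself is left untouched and result is a new array.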

Can I combine non-adjacent dimensions in a NumPy array without copying data?

I would like to combine the first and the last dimension of a 3-D NumPy array into one dimension, without copying the data:
import numpy as np
data = np.empty((3, 4, 5))
data = data.transpose([0, 2, 1])
try:
# this fails, indicating that it is not possible:
# AttributeError: incompatible shape for a non-contiguous array
data.shape = (-1, 4)
except AttributeError:
# this creates a copy of the data:
data = data.reshape((-1, 4))
Is this possible?
In [55]: arr = np.arange(24).reshape(2,3,4)
In [56]: arr1 = arr.transpose(2,1,0)
In [57]: arr
Out[57]:
array([[[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]],
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
In [58]: arr1
Out[58]:
array([[[ 0, 12],
[ 4, 16],
[ 8, 20]],
[[ 1, 13],
[ 5, 17],
[ 9, 21]],
[[ 2, 14],
[ 6, 18],
[10, 22]],
[[ 3, 15],
[ 7, 19],
[11, 23]]])
Look at how the values are laid out in the 1d data buffer:
In [59]: arr.ravel()
Out[59]:
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 21, 22, 23])
Compare the order after the transpose:
In [60]: arr1.ravel()
Out[60]:
array([ 0, 12, 4, 16, 8, 20, 1, 13, 5, 17, 9, 21, 2, 14, 6, 18, 10,
22, 3, 15, 7, 19, 11, 23])
If the raveled values don't have the same order, you can't avoid a copy.
reshape has this note:
You can think of reshaping as first raveling the array (using the given
index order), then inserting the elements from the raveled array into the
new array using the same kind of index ordering as was used for the
raveling.
In [63]: arr1.reshape(-1,2)
Out[63]:
array([[ 0, 12],
[ 4, 16],
[ 8, 20],
[ 1, 13],
[ 5, 17],
[ 9, 21],
[ 2, 14],
[ 6, 18],
[10, 22],
[ 3, 15],
[ 7, 19],
[11, 23]])
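One way to check whether a reshape produced a view or a copy is np.shares_memory. The sketch below applies it to the layout from the question, using np.zeros instead of np.empty so the contents are defined; base and merged are illustrative names.

```python
import numpy as np

base = np.zeros((3, 4, 5))
data = base.transpose(0, 2, 1)           # a view with shape (3, 5, 4)
merged = data.reshape(-1, 4)             # shape (15, 4)

print(np.shares_memory(data, base))      # True: transpose only changes strides
print(np.shares_memory(merged, base))    # False: reshape had to copy
```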

How to slice multidimensional array with Numpy, multiple columns?

I am generating multidimensional arrays of different sizes, though they'll all have an even number of columns.
>> import numpy as np
>> x = np.arange(24).reshape((3,8))
Which results in:
array([[ 0, 1, 2, 3, 4, 5, 6, 7],
[ 8, 9, 10, 11, 12, 13, 14, 15],
[16, 17, 18, 19, 20, 21, 22, 23]])
I am able to slice with numpy and get the first two columns in an array:
>> newarr = x[0:,0:2]
array([[ 0, 1],
[ 8, 9],
[16, 17]])
However, I want to have one array that is just a list of the columns where column 1 and 2 are together, 3 and 4 are together, and so on.. For example:
array([[[ 0, 1],
[ 8, 9],
[16, 17]],
[[ 2, 3],
[10, 11],
[18, 19]],
etc....]
)
This code below works but it's clunky and my arrays are not all the same. Some arrays have 16 columns, some have 34, some have 50, etc.
>> newarr = [x[0:,0:2]]+[x[0:,2:4]]+[x[0:,4:6]]
[array([[ 0, 1],
[ 8, 9],
[16, 17]]), array([[ 2, 3],
[10, 11],
[18, 19]]), array([[ 4, 5],
[12, 13],
[20, 21]])]
There's got to be a better way to do this than
newarr = [x[0:,0:2]]+[x[0:,2:4]]+[x[0:,4:6]]+...+[x[0:,n:n+2]]
Help!
My idea is to use a loop (here, a list comprehension):
slice_len = 2
x_list = [x[0:, slice_len*i:slice_len*(i+1)] for i in range(x.shape[1] // slice_len)]
Output:
[array([[ 0, 1],
[ 8, 9],
[16, 17]]), array([[ 2, 3],
[10, 11],
[18, 19]]), array([[ 4, 5],
[12, 13],
[20, 21]]), array([[ 6, 7],
[14, 15],
[22, 23]])]
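If a single 3-D array is preferred over a list, a reshape followed by one transpose should give the same grouping without a Python loop. This is a sketch that assumes, as stated in the question, that the number of columns is even; pairs is an illustrative name.

```python
import numpy as np

x = np.arange(24).reshape((3, 8))
slice_len = 2

# List-comprehension version from above, for comparison.
x_list = [x[0:, slice_len*i:slice_len*(i+1)] for i in range(x.shape[1] // slice_len)]

# Split the columns into groups of slice_len, then move the group axis first.
pairs = x.reshape(x.shape[0], -1, slice_len).transpose(1, 0, 2)

print(pairs.shape)          # (4, 3, 2)
print(pairs[1].tolist())    # [[2, 3], [10, 11], [18, 19]]
```

Because the shape uses -1, the same line works for arrays with 16, 34 or 50 columns.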
