I don't understand what squeeze and unsqueeze do to a tensor, even after looking at the docs and related questions.
I tried to understand it by exploring it myself in Python. I first created a random tensor with:
>>> x = torch.rand(3, 2, dtype=torch.float)
>>> x
tensor([[0.3703, 0.9588],
        [0.8064, 0.9716],
        [0.9585, 0.7860]])
But regardless of how I squeeze it, I end up with the same results:
>>> torch.equal(x.squeeze(0), x.squeeze(1))
True
If I now try to unsqueeze it, I get the following:
>>> x.unsqueeze(1)
tensor([[[0.3703, 0.9588]],

        [[0.8064, 0.9716]],

        [[0.9585, 0.7860]]])
>>> x.unsqueeze(0)
tensor([[[0.3703, 0.9588],
         [0.8064, 0.9716],
         [0.9585, 0.7860]]])
>>> x.unsqueeze(-1)
tensor([[[0.3703],
         [0.9588]],

        [[0.8064],
         [0.9716]],

        [[0.9585],
         [0.7860]]])
However, if I now create a tensor x = torch.tensor([1, 2, 3, 4]) and try to unsqueeze it, it appears that 1 and -1 make it a column, whereas 0 keeps it as a row:
>>> x.unsqueeze(0)
tensor([[1, 2, 3, 4]])
>>> x.unsqueeze(1)
tensor([[1],
        [2],
        [3],
        [4]])
>>> x.unsqueeze(-1)
tensor([[1],
        [2],
        [3],
        [4]])
Can someone explain what squeeze and unsqueeze are doing to a tensor, and what the difference is between providing the arguments 0, 1 and -1?
Here is how to picture what squeeze/unsqueeze do to an effectively 2D matrix: when you unsqueeze a tensor, it is ambiguous which position the new dimension should take (should the data become a row, a column, etc.). The dim argument dictates exactly this, i.e. the position of the new dimension to be added.
Hence the resulting unsqueezed tensors hold the same information; only the indices used to access them differ.
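To make that concrete, here is a small check (my addition, reusing a random x of shape [3, 2] like the one in the question) showing that both unsqueezed tensors wrap exactly the same data:
import torch

x = torch.rand(3, 2)
row = x.unsqueeze(0)  # shape [1, 3, 2]
col = x.unsqueeze(1)  # shape [3, 1, 2]
# same values in both; only the indexing differs
assert torch.equal(row[0], x)
assert torch.equal(col[:, 0, :], x)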
Simply put, unsqueeze() "adds" a superficial dimension of size 1 to a tensor (at the specified position), while squeeze() removes all size-1 dimensions (or only the given one, if you pass dim).
You should look at the tensor's shape attribute to see it easily. In your last case it would be:
import torch
tensor = torch.tensor([1, 0, 2, 3, 4])
tensor.shape # torch.Size([5])
tensor.unsqueeze(dim=0).shape # [1, 5]
tensor.unsqueeze(dim=1).shape # [5, 1]
It is useful for providing a single sample to a network (which requires the first dimension to be the batch); for images it would be:
# 3 channels, 32 width, 32 height
tensor = torch.randn(3, 32, 32)
# 1 batch, 3 channels, 32 width, 32 height
tensor.unsqueeze(dim=0).shape # torch.Size([1, 3, 32, 32])
The effect of squeeze can be seen if you create a tensor with size-1 dimensions, e.g. like this:
# 3 channels, 32 width, 32 height, and some unnecessary size-1 dimensions
tensor = torch.randn(3, 1, 32, 1, 32, 1)
# 1 batch, 3 channels, 32 width, 32 height again
tensor.squeeze().unsqueeze(0) # [1, 3, 32, 32]
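One caveat worth adding (not in the original answer): squeeze() with no argument removes every size-1 dimension, including a batch dimension of 1, so passing an explicit dim is safer when the batch might contain a single sample:
single = torch.randn(1, 3, 32, 32)  # a batch holding one image
single.squeeze().shape   # torch.Size([3, 32, 32]) -- the batch dim is gone too
single.squeeze(0).shape  # torch.Size([3, 32, 32]) -- removes only dim 0
single.squeeze(1).shape  # torch.Size([1, 3, 32, 32]) -- dim 1 is 3, so this is a no-op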
torch.unsqueeze(input, dim) → Tensor
>>> a = torch.randn(4, 4, 4)
>>> torch.unsqueeze(a, 0).size()
torch.Size([1, 4, 4, 4])
>>> torch.unsqueeze(a, 1).size()
torch.Size([4, 1, 4, 4])
>>> torch.unsqueeze(a, 2).size()
torch.Size([4, 4, 1, 4])
>>> torch.unsqueeze(a, 3).size()
torch.Size([4, 4, 4, 1])
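Negative values of dim count from the end, which is why the asker's unsqueeze(-1) added the new axis last; for example:
>>> torch.unsqueeze(a, -1).size()  # same as dim=3 for a 3-D tensor
torch.Size([4, 4, 4, 1])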
torch.squeeze(input, dim=None, out=None) → Tensor
>>> b = torch.randn(4, 1, 4)
>>> b
tensor([[[ 1.2912, -1.9050,  1.4771,  1.5517]],

        [[-0.3359, -0.2381, -0.3590,  0.0406]],

        [[-0.2460, -0.2326,  0.4511,  0.7255]],

        [[-0.1456, -0.0857, -0.8443,  1.1423]]])
>>> b.size()
torch.Size([4, 1, 4])
>>> c = b.squeeze(1)
>>> b.size()  # b itself is unchanged
torch.Size([4, 1, 4])
>>> c
tensor([[ 1.2912, -1.9050,  1.4771,  1.5517],
        [-0.3359, -0.2381, -0.3590,  0.0406],
        [-0.2460, -0.2326,  0.4511,  0.7255],
        [-0.1456, -0.0857, -0.8443,  1.1423]])
>>> c.size()
torch.Size([4, 4])
So I have a tensor that is M x B x C, where M is the number of models, B is the batch size, and C is the number of classes; each cell is the probability of a class for a given model and batch element. Then I have a 1D tensor of the correct answers, of size B, which we'll call t. How do I use this 1D tensor of size B to return an M x B x 1 tensor, where the returned tensor holds the value at the correct class? Say the M x B x C tensor is called blah. I've tried
blah[:, :, t]
for i in range(M):
    blah[i, :, t]
blah[:, t, :]
The top two just return the values at indices t in the 3rd dimension of every slice; the last one returns the values at indices t in the 2nd dimension. How do I do this?
We can get the desired result by combining advanced and basic indexing:
import torch
# shape [2, 3, 4]
blah = torch.tensor([
    [[ 0,  1,  2,  3],
     [ 4,  5,  6,  7],
     [ 8,  9, 10, 11]],
    [[12, 13, 14, 15],
     [16, 17, 18, 19],
     [20, 21, 22, 23]]])
# shape [3]
t = torch.tensor([2, 1, 0])
b = torch.arange(blah.shape[1]).type_as(t)
# shape [2, 3, 1]
result = blah[:, b, t].unsqueeze(-1)
which results in
>>> result
tensor([[[ 2],
         [ 5],
         [ 8]],

        [[14],
         [17],
         [20]]])
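An alternative sketch (my addition, not part of the answer above): torch.gather can produce the same M x B x 1 result directly, without building the helper index b:
# gather along the class dimension; idx[m, b, 0] == t[b]
idx = t.view(1, -1, 1).expand(blah.shape[0], -1, -1)
result = blah.gather(2, idx)  # shape [2, 3, 1]; result[m, b, 0] == blah[m, b, t[b]]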
Here is one way to do it:
Suppose a is your M x B x C shaped tensor. I am taking some representative values below:
>>> M = 3
>>> B = 5
>>> C = 4
>>> a = torch.rand(M, B, C)
>>> a
tensor([[[0.6222, 0.6703, 0.0057, 0.3210],
         [0.6251, 0.3286, 0.8451, 0.5978],
         [0.0808, 0.8408, 0.3795, 0.4872],
         [0.8589, 0.8891, 0.8033, 0.8906],
         [0.5620, 0.5275, 0.4272, 0.2286]],

        [[0.2419, 0.0179, 0.2052, 0.6859],
         [0.1868, 0.7766, 0.3648, 0.9697],
         [0.6750, 0.4715, 0.9377, 0.3220],
         [0.0537, 0.1719, 0.0013, 0.0537],
         [0.2681, 0.7514, 0.6523, 0.7703]],

        [[0.5285, 0.5360, 0.7949, 0.6210],
         [0.3066, 0.1138, 0.6412, 0.4724],
         [0.3599, 0.9624, 0.0266, 0.1455],
         [0.7474, 0.2999, 0.7476, 0.2889],
         [0.1779, 0.3515, 0.8900, 0.2301]]])
Let's say the 1D class tensor is t, which gives the true class of each example in the batch. So it is a 1D tensor of shape (B, ) having class labels in the range {0, 1, 2, ..., C-1}.
>>> t = torch.randint(C, size = (B, ))
>>> t
tensor([3, 2, 1, 1, 0])
So basically you want to select the indices corresponding to t from the innermost dimension of a. This can be achieved using fancy indexing and broadcasting combined as follows:
>>> i = torch.arange(M).reshape(M, 1, 1)
>>> j = torch.arange(B).reshape(1, B, 1)
>>> k = t.reshape(1, B, 1)
Note that once you index a by (i, j, k), the three index tensors broadcast together to the shape (M, B, 1), which is the desired output shape.
Now just indexing a by i, j and k gives:
>>> a[i, j, k]
tensor([[[0.3210],
         [0.8451],
         [0.8408],
         [0.8891],
         [0.5620]],

        [[0.6859],
         [0.3648],
         [0.4715],
         [0.1719],
         [0.2681]],

        [[0.6210],
         [0.6412],
         [0.9624],
         [0.2999],
         [0.1779]]])
So essentially, if you generate the index arrays conveying your access pattern beforehand, you can directly use them to extract some slice of the tensor.
You simply need to pass:
t as the third index,
range(B) as the second index
(i.e. which element in the 2nd dim each 3rd-dim index corresponds to):
blah[:, range(B), t]
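Note that blah[:, range(B), t] has shape (M, B); if you want the M x B x 1 shape asked for, unsqueeze the result:
result = blah[:, range(B), t].unsqueeze(-1)  # shape (M, B, 1)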
If I have a batch of uniform 3D grids of coordinate locations with shape, for example, [1, 32, 32, 32, 3], what is the best way to split it into multiple even chunks, so that I end up with something such as [1, 4096, 2, 2, 2, 3]? In other words, I’m splitting that one big 32 x 32 x 32 cube, where each point is an x, y, z coordinate location, into 4096 smaller 2 x 2 x 2 cubes. Does a simple view operation make sense here, or would it throw off the coordinate values? I was looking into operations like torch.chunk, but they require a specific dimension to split along, which I’m not sure applies here.
My use case for this is that I have a smaller [1, 16, 16, 16, 3] cube, so I’m trying to match up points from this smaller shape into the corresponding cubes in the upsampled [1, 32, 32, 32, 3] shape (since a single coordinate point in the 16^3 shape corresponds to 8 points in the 32^3 shape).
For additional context, this is how I generate my 3D grid right now:
import torch

pxs = torch.linspace(-1, 1, 32)
pys = torch.linspace(-1, 1, 32)
pzs = torch.linspace(-1, 1, 32)
# shape and size are not defined in the snippet; from context they are presumably:
shape = (32, 32, 32)
size = 32 ** 3
pxs = pxs.view(-1, 1, 1).expand(*shape).contiguous().view(size)
pys = pys.view(1, -1, 1).expand(*shape).contiguous().view(size)
pzs = pzs.view(1, 1, -1).expand(*shape).contiguous().view(size)
points = torch.stack([pxs, pys, pzs], dim=1)
grid_3d = torch.reshape(points, (32, 32, 32, 3))
Just reshape it with the dimensions you'd like:
In [29]: lin = np.linspace(-1, 1, 32)
...: cube = np.stack(np.meshgrid(lin, lin, lin), axis=-1)
...: cube.shape
Out[29]: (32, 32, 32, 3)
In [30]: new_cube = cube.reshape((1, 4096, 2, 2, 2, 3))
...: new_cube.shape
Out[30]: (1, 4096, 2, 2, 2, 3)
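One caution (my addition): a plain reshape groups consecutive elements, not spatial neighborhoods, so each resulting [2, 2, 2, 3] chunk is not a spatial sub-cube of the grid. If spatially contiguous 2 x 2 x 2 cubes are what you need, a reshape-then-transpose sketch along these lines does it:
# split each 32-axis into 16 blocks of 2, then group the block indices together
blocks = cube.reshape(16, 2, 16, 2, 16, 2, 3)
blocks = blocks.transpose(0, 2, 4, 1, 3, 5, 6)  # (16, 16, 16, 2, 2, 2, 3)
new_cube = blocks.reshape(1, 4096, 2, 2, 2, 3)  # each [2, 2, 2, 3] chunk is now a spatial sub-cube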
For a given 2D tensor I want to retrieve all indices where the value is 1. I expected to be able to simply use torch.nonzero(a == 1).squeeze(), which would return tensor([1, 3, 2]). However, instead, torch.nonzero(a == 1) returns a 2D tensor (that's okay), with two values per row (that's not what I expected). The returned indices should then be used to index the second dimension (index 1) of a 3D tensor, again returning a 2D tensor.
import torch
a = torch.Tensor([[12, 1, 0, 0],
                  [4, 9, 21, 1],
                  [10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print('a_size', a.size())
# a_size torch.Size([3, 4])
print('b_size', b.size())
# b_size torch.Size([3, 4, 8])
idxs = torch.nonzero(a == 1)
print('idxs_size', idxs.size())
# idxs_size torch.Size([3, 2])
print(b.gather(1, idxs))
Evidently, this does not work, leading to a RuntimeError:
RuntimeError: invalid argument 4: Index tensor must have same dimensions as input tensor at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensorEvenMoreMath.cpp:453
It seems that idxs is not what I expect it to be, nor can I use it the way I thought. idxs is
tensor([[0, 1],
        [1, 3],
        [2, 2]])
but reading through the documentation I don't understand why I also get back the row indices in the resulting tensor. Now, I know I can get the correct idxs by slicing idxs[:, 1] but then still, I cannot use those values as indices for the 3D tensor because the same error as before is raised. Is it possible to use the 1D tensor of indices to select items across a given dimension?
You could simply slice the index tensor and pass its columns as the indices, as in:
In [193]: idxs = torch.nonzero(a == 1)
In [194]: c = b[idxs[:, 0], idxs[:, 1]]
In [195]: c
Out[195]:
tensor([[0.3411, 0.3944, 0.8108, 0.3986, 0.3917, 0.1176, 0.6252, 0.4885],
        [0.5698, 0.3140, 0.6525, 0.7724, 0.3751, 0.3376, 0.5425, 0.1062],
        [0.7780, 0.4572, 0.5645, 0.5759, 0.5957, 0.2750, 0.6429, 0.1029]])
Alternatively, an even simpler & my preferred approach would be to just use torch.where() and then directly index into the tensor b as in:
In [196]: b[torch.where(a == 1)]
Out[196]:
tensor([[0.3411, 0.3944, 0.8108, 0.3986, 0.3917, 0.1176, 0.6252, 0.4885],
        [0.5698, 0.3140, 0.6525, 0.7724, 0.3751, 0.3376, 0.5425, 0.1062],
        [0.7780, 0.4572, 0.5645, 0.5759, 0.5957, 0.2750, 0.6429, 0.1029]])
A bit more explanation about the above approach of using torch.where(): it works based on the concept of advanced indexing, i.e. indexing into the tensor with a tuple of sequence objects such as a tuple of tensors, a tuple of lists, or a tuple of tuples.
# some input tensor
In [207]: a
Out[207]:
tensor([[12.,  1.,  0.,  0.],
        [ 4.,  9., 21.,  1.],
        [10.,  2.,  1.,  0.]])
For basic slicing, we would need a tuple of integer indices:
In [212]: a[(1, 2)]
Out[212]: tensor(21.)
To achieve the same using advanced indexing, we would need a tuple of sequence objects:
# adv. indexing using a tuple of lists
In [213]: a[([1,], [2,])]
Out[213]: tensor([21.])
# adv. indexing using a tuple of tuples
In [215]: a[((1,), (2,))]
Out[215]: tensor([21.])
# adv. indexing using a tuple of tensors
In [214]: a[(torch.tensor([1,]), torch.tensor([2,]))]
Out[214]: tensor([21.])
And in these examples the returned tensor has one dimension fewer than the input tensor, since the two 1-D index sequences together consume two of the input's dimensions and contribute one.
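Applied to the tensors above, a quick shape check (my addition) confirms this:
In [216]: a[torch.where(a == 1)].shape
Out[216]: torch.Size([3])      # 2-D input -> 1-D output

In [217]: b[torch.where(a == 1)].shape
Out[217]: torch.Size([3, 8])   # 3-D input -> 2-D output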
Assuming that b's three dimensions are batch_size x sequence_length x features (b x s x feats), the expected results can be achieved as follows.
import torch
a = torch.Tensor([[12, 1, 0, 0],
                  [4, 9, 21, 1],
                  [10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print(b.size())
# b x s x feats
idxs = torch.nonzero(a == 1)[:, 1]
print(idxs.size())
# b
c = b[torch.arange(b.size(0)), idxs]
print(c.size())
# b x feats
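As a quick cross-check (my addition), this c is exactly the torch.where() result from the earlier answer:
print(torch.equal(c, b[torch.where(a == 1)]))
# True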
import torch

a = torch.Tensor([[12, 1, 0, 0],
                  [4, 9, 21, 1],
                  [10, 2, 1, 0]])
b = torch.rand(3, 4, 8)
print('a_size', a.size())
# a_size torch.Size([3, 4])
print('b_size', b.size())
# b_size torch.Size([3, 4, 8])
idxs = torch.nonzero(a == 1)
print(torch.index_select(b, 1, idxs[:, 1]))
# note: index_select picks positions 1, 3 and 2 along dim 1 for every batch
# entry, so this prints a tensor of shape [3, 3, 8] rather than [3, 8]
As a supplement to @kmario23's solution, you can still achieve the same result with
b[torch.nonzero(a == 1, as_tuple=True)]
Reading the Dynamic Graph CNN for Learning on Point Clouds code, I came across this snippet:
idx_ = tf.range(batch_size) * num_points
idx_ = tf.reshape(idx_, [batch_size, 1, 1])
point_cloud_flat = tf.reshape(point_cloud, [-1, num_dims])
point_cloud_neighbors = tf.gather(point_cloud_flat, nn_idx+idx_) <--- what happens here?
point_cloud_central = tf.expand_dims(point_cloud_central, axis=-2)
Debugging the line, I made sure that the dims are:
point_cloud_flat:(32768,3) nn_idx:(32,1024,20), idx_:(32,1,1)
// indices are (32,1024,20) after broadcasting
Reading the tf.gather doc, I couldn't understand what the function does with indices of rank higher than the input's.
An equivalent function in numpy is np.take. A simple example:
import numpy as np
params = np.array([4, 3, 5, 7, 6, 8])
# Scalar indices; (output is rank(params) - 1), i.e. 0 here.
indices = 0
print(params[indices])
# Vector indices; (output is rank(params)), i.e. 1 here.
indices = [0, 1, 4]
print(params[indices]) # [4 3 6]
# Vector indices; (output is rank(params)), i.e. 1 here.
indices = [2, 3, 4]
print(params[indices]) # [5 7 6]
# Higher rank indices; (output is rank(params) + rank(indices) - 1), i.e. 2 here
indices = np.array([[0, 1, 4], [2, 3, 4]])
print(params[indices]) # equivalent to np.take(params, indices, axis=0)
# [[4 3 6]
# [5 7 6]]
In your case, the rank of indices is higher than that of params, so the output rank is rank(params) + rank(indices) - 1 (i.e. 2 + 3 - 1 = 4, giving shape (32, 1024, 20, 3)). The - 1 is because tf.gather consumes one axis of params, here axis=0 (the axis argument had to be a scalar at the time): the indices select elements along that first dimension in a "fancy" indexing way.
EDITED:
In brief, in your case (if I didn't misunderstand the code):
point_cloud is (32, 1024, 3): 32 batches of 1024 points, each with 3 coordinates.
nn_idx is (32, 1024, 20): the indices of the 20 neighbors of each of those points, meant for indexing into point_cloud.
nn_idx + idx_ is (32, 1024, 20): the same neighbor indices, offset per batch so they index into point_cloud_flat.
point_cloud_neighbors is finally (32, 1024, 20, 3): shaped like nn_idx + idx_, except that it holds each neighbor's 3 coordinates instead of its index.
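A small numpy sketch of the flatten-plus-offset trick (my reconstruction, with tiny sizes standing in for 32/1024/20):
import numpy as np

batch_size, num_points, k, num_dims = 2, 4, 3, 3
point_cloud = np.random.randn(batch_size, num_points, num_dims)
nn_idx = np.random.randint(num_points, size=(batch_size, num_points, k))

# offset each batch's point indices so they address the flattened (b*n, d) array
idx_ = (np.arange(batch_size) * num_points).reshape(batch_size, 1, 1)
point_cloud_flat = point_cloud.reshape(-1, num_dims)     # (b*n, d)
point_cloud_neighbors = point_cloud_flat[nn_idx + idx_]  # (b, n, k, d), like tf.gather
assert point_cloud_neighbors.shape == (batch_size, num_points, k, num_dims)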
I have a 4D numpy array which represents a dataset of 3D instances.
Let's say that the shape of the array is (32, 32, 3, 73257).
How can I change the shape of the array to (73257, 32, 32, 3)?
--- Question update
It seems that both rollaxis and transpose do the trick.
Thanks for replying!
The np.transpose function does exactly what you want; you can pass an axes argument that controls how the axes are permuted:
a = np.empty((32, 32, 3, 73257))
b = np.transpose(a, (3, 0, 1, 2))
The axes of b are a permutation of those of a: axis 0 of b is axis 3 of a, axis 1 of b is axis 0 of a, and so on.
That way, you can specify which of the two size-32 axes you want in second or in third place:
b = np.transpose(a, (3, 1, 0, 2))
This also gives an array of the desired shape, but it is different from the previous one (the two size-32 axes end up swapped).
It looks like np.rollaxis(arr, axis=-1) will do what you want. Example:
>>> arr = np.empty((32, 32, 3, 73257))
>>> arr2 = np.rollaxis(arr, axis=-1)
>>> arr2.shape
(73257, 32, 32, 3)
This makes arr[i, j, k, l] == arr2[l, i, j, k] for all i, j, k, l.
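For completeness (my addition): np.moveaxis expresses the same move and is the replacement numpy recommends over np.rollaxis:
>>> arr2 = np.moveaxis(arr, -1, 0)  # move the last axis to the front
>>> arr2.shape
(73257, 32, 32, 3)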