I want to reshape a vector of shape (5,) into a matrix of shape (1, 5).
With numpy, I can do:
>>> import numpy as np
>>> a = np.array([1, 2, 3, 4, 5])
>>> a.shape
(5,)
>>> a = np.reshape(a, (1, 5))
>>> a.shape
(1, 5)
>>> a
array([[1, 2, 3, 4, 5]])
But how do I do this with PyTorch?
Use torch.unsqueeze(input, dim, out=None):
>>> import torch
>>> a = torch.Tensor([1, 2, 3, 4, 5])
>>> a
1
2
3
4
5
[torch.FloatTensor of size 5]
>>> a = a.unsqueeze(0)
>>> a
1 2 3 4 5
[torch.FloatTensor of size 1x5]
You can also use view():
a.view(1,5)
Out:
1 2 3 4 5
[torch.FloatTensor of size 1x5]
There are multiple ways of reshaping a PyTorch tensor. You can apply these methods on a tensor of any dimensionality.
Let's start with a 2-dimensional 2 x 3 tensor:
x = torch.Tensor(2, 3)
print(x.shape)
# torch.Size([2, 3])
To add some robustness to this problem, let's reshape the 2 x 3 tensor by adding a new dimension at the front and another dimension in the middle, producing a 1 x 2 x 1 x 3 tensor.
Approach 1: add dimension with None
Use NumPy-style indexing with None (a.k.a. np.newaxis) to add dimensions anywhere you want.
print(x.shape)
# torch.Size([2, 3])
y = x[None, :, None, :] # Add new dimensions at positions 0 and 2.
print(y.shape)
# torch.Size([1, 2, 1, 3])
Approach 2: unsqueeze
Use torch.Tensor.unsqueeze(i) (a.k.a. torch.unsqueeze(tensor, i) or the in-place version unsqueeze_()) to add a new dimension at position i. The returned tensor shares the same data as the original tensor. In this example, we can use unsqueeze() twice to add the two new dimensions.
print(x.shape)
# torch.Size([2, 3])
# Use unsqueeze twice.
y = x.unsqueeze(0) # Add new dimension at position 0
print(y.shape)
# torch.Size([1, 2, 3])
y = y.unsqueeze(2) # Add new dimension at position 2
print(y.shape)
# torch.Size([1, 2, 1, 3])
In practice with PyTorch, adding an extra dimension for the batch may be important, so you may often see unsqueeze(0).
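As a minimal sketch of the batch case (the 3 x 4 x 4 "image" is a made-up stand-in, not something from the question), unsqueeze(0) wraps a single sample in a batch of one without copying it:

```python
import torch

image = torch.zeros(3, 4, 4)     # hypothetical single sample: channels x height x width
batch = image.unsqueeze(0)       # shape (1, 3, 4, 4): a batch holding one sample

batch[0, 0, 0, 0] = 7.0          # write through the unsqueezed tensor
print(batch.shape)
print(image[0, 0, 0].item())     # the original tensor sees the same write
```

Because both tensors share storage, modify the batch only if you are happy for the original sample to change too.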
Approach 3: view
Use torch.Tensor.view(*shape) to specify all the dimensions. The returned tensor shares the same data as the original tensor.
print(x.shape)
# torch.Size([2, 3])
y = x.view(1, 2, 1, 3)
print(y.shape)
# torch.Size([1, 2, 1, 3])
Approach 4: reshape
Use torch.Tensor.reshape(*shape) (a.k.a. torch.reshape(tensor, shapetuple)) to specify all the dimensions. If the requested shape is compatible with the tensor's existing strides (for example, when the tensor is contiguous), the returned tensor is a view of the input (sharing the same data); otherwise it is a copy. This function is similar to the NumPy reshape() function in that it lets you define all the dimensions and can return either a view or a copy.
print(x.shape)
# torch.Size([2, 3])
y = x.reshape(1, 2, 1, 3)
print(y.shape)
# torch.Size([1, 2, 1, 3])
Furthermore, from the O'Reilly 2019 book Programming PyTorch for Deep Learning, the author writes:
Now you might wonder what the difference is between view() and reshape(). The answer is that view() operates as a view on the original tensor, so if the underlying data is changed, the view will change too (and vice versa). However, view() can throw errors if the required view is not contiguous; that is, it doesn’t share the same block of memory it would occupy if a new tensor of the required shape was created from scratch. If this happens, you have to call tensor.contiguous() before you can use view(). However, reshape() does all that behind the scenes, so in general, I recommend using reshape() rather than view().
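The difference described in the quote can be seen on a transposed tensor. This is only a small sketch; the tensor and variable names are chosen for illustration:

```python
import torch

x = torch.arange(6).reshape(2, 3)   # contiguous 2 x 3 tensor
t = x.t()                           # transpose: 3 x 2, same storage, non-contiguous

# view() refuses here, because the flat shape cannot be expressed
# with strides over the existing storage.
try:
    t.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True

flat_copy = t.reshape(6)            # reshape() falls back to a copy
flat_view = x.reshape(6)            # contiguous input, so this is a view

print(view_failed)
print(flat_copy.data_ptr() == x.data_ptr())
print(flat_view.data_ptr() == x.data_ptr())
```

Comparing data_ptr() is just one convenient way to tell whether two tensors start at the same memory address.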
Approach 5: resize_
Use the in-place function torch.Tensor.resize_(*sizes) to modify the original tensor. The documentation states:
WARNING. This is a low-level method. The storage is reinterpreted as C-contiguous, ignoring the current strides (unless the target size equals the current size, in which case the tensor is left unchanged). For most purposes, you will instead want to use view(), which checks for contiguity, or reshape(), which copies data if needed. To change the size in-place with custom strides, see set_().
print(x.shape)
# torch.Size([2, 3])
x.resize_(1, 2, 1, 3)
print(x.shape)
# torch.Size([1, 2, 1, 3])
My observations
If you want to add just one dimension (e.g. to add a 0th dimension for the batch), then use unsqueeze(0). If you want to totally change the dimensionality, use reshape().
See also:
What's the difference between reshape and view in pytorch?
What is the difference between view() and unsqueeze()?
In PyTorch 0.4, is it recommended to use reshape than view when it is possible?
For in-place modification of the shape of the tensor, you should use
tensor.resize_():
In [23]: a = torch.Tensor([1, 2, 3, 4, 5])
In [24]: a.shape
Out[24]: torch.Size([5])
# tensor.resize_((`new_shape`))
In [25]: a.resize_((1,5))
Out[25]:
1 2 3 4 5
[torch.FloatTensor of size 1x5]
In [26]: a.shape
Out[26]: torch.Size([1, 5])
In PyTorch, if there's an underscore at the end of an operation (like tensor.resize_()) then that operation does in-place modification to the original tensor.
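A quick way to see the convention is to compare add() with add_(). This is an illustrative sketch, not part of the original answer:

```python
import torch

a = torch.ones(3)
ptr_before = a.data_ptr()

b = a.add(1)      # no underscore: returns a new tensor, a is unchanged
a_after_add = a.tolist()

a.add_(1)         # trailing underscore: a itself is modified
print(a_after_add, b.tolist(), a.tolist())
print(a.data_ptr() == ptr_before)   # still the same storage
```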
Also, you can simply use np.newaxis in a torch Tensor to increase the dimension. Here is an example:
In [34]: list_ = range(5)
In [35]: a = torch.Tensor(list_)
In [36]: a.shape
Out[36]: torch.Size([5])
In [37]: new_a = a[np.newaxis, :]
In [38]: new_a.shape
Out[38]: torch.Size([1, 5])
Or you can use view() with -1, which tells PyTorch to infer that dimension from the number of elements:
In [3]: a.view(1,-1)
Out[3]:
1 2 3 4 5
[torch.FloatTensor of size 1x5]
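A short sketch of how -1 is resolved, using a 12-element tensor chosen only for illustration:

```python
import torch

a = torch.arange(12)
shape_row = tuple(a.view(1, -1).shape)     # -1 is inferred as 12
shape_3 = tuple(a.view(3, -1).shape)       # -1 is inferred as 4
shape_nd = tuple(a.view(-1, 2, 3).shape)   # -1 is inferred as 2
print(shape_row, shape_3, shape_nd)

# Only one dimension may be -1; with two the shape would be ambiguous.
try:
    a.view(-1, -1)
    ambiguous_ok = True
except RuntimeError:
    ambiguous_ok = False
print(ambiguous_ok)
```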
This question has been thoroughly answered already, but I want to add for the less experienced python developers that you might find the * operator helpful in conjunction with view().
For example if you have a particular tensor size that you want a different tensor of data to conform to, you might try:
img = torch.randn(20, 30, 3) # tensor with goal shape
flat_size = 20*30*3
X = torch.randn(50, flat_size) # data tensor
X = X.view(-1, *img.size()) # sweet maneuver
print(X.size()) # size is (50, 20, 30, 3)
This works with numpy shape too:
img = np.random.randn(20,30,3)
flat_size = 20*30*3
X = torch.randn(50, flat_size)
X = X.view(-1, *img.shape)
print(X.size()) # size is (50, 20, 30, 3)
torch.reshape() was designed to mirror NumPy's reshape().
It was added after view() and resize_(), and it is available as a function in the torch namespace (it appears in dir(torch)).
import torch
x=torch.arange(24)
print(x, x.shape)
x_view = x.view(1,2,3,4) # works on is_contiguous() tensor
print(x_view.shape)
x_reshaped = x.reshape(1,2,3,4) # works on any tensor
print(x_reshaped.shape)
x_reshaped2 = torch.reshape(x_reshaped, (-1,)) # part of torch package, while view() and resize_() are not
print(x_reshaped2.shape)
Out:
tensor([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20, 21, 22, 23]) torch.Size([24])
torch.Size([1, 2, 3, 4])
torch.Size([1, 2, 3, 4])
torch.Size([24])
But did you know it can also work as a replacement for squeeze() and unsqueeze()?
x = torch.tensor([1, 2, 3, 4])
print(x.shape)
x1 = torch.unsqueeze(x, 0)
print(x1.shape)
x2 = torch.unsqueeze(x1, 1)
print(x2.shape)
x3=x.reshape(1,1,4)
print(x3.shape)
x4=x.reshape(4)
print(x4.shape)
x5=x3.squeeze()
print(x5.shape)
Out:
torch.Size([4])
torch.Size([1, 4])
torch.Size([1, 1, 4])
torch.Size([1, 1, 4])
torch.Size([4])
torch.Size([4])
As far as I know, the best way to reshape tensors is to use einops. It solves various reshape problems by providing a simple and elegant function. In your situation, the code could be written as
from einops import rearrange
ans = rearrange(tensor,'h -> 1 h')
I highly recommend you try it.
BTW, you can use it with pytorch/tensorflow/numpy and many other libraries.
import torch
>>>a = torch.Tensor([1,2,3,4,5])
>>>a.size()
torch.Size([5])
#use view to reshape
>>>b = a.view(1,a.shape[0])
>>>b
tensor([[1., 2., 3., 4., 5.]])
>>>b.size()
torch.Size([1, 5])
>>>b.type()
'torch.FloatTensor'
Assume the following code:
import torch
import numpy as np
a = torch.tensor([1, 2, 3, 4, 5])
The following three calls have the exact same effect:
res_1 = a.unsqueeze(0)
res_2 = a.view(1, 5)
res_3 = a[np.newaxis,:]
res_1.shape == res_2.shape == res_3.shape == (1,5) # Returns true
Notice that for any of the resulting tensors, if you modify the data in them, you are also modifying the data in a, because they don't have a copy of the data, but reference the original data in a.
res_1[0,0] = 2
a[0] == res_1[0,0] == 2 # Returns true
The other way of doing it would be to use the in-place resize_ operation:
a.shape == res_1.shape # Returns false
a.resize_((1, 5))
a.shape == res_1.shape # Returns true
Be careful of using resize_ or other in-place operation with autograd. See the following discussion: https://pytorch.org/docs/stable/notes/autograd.html#in-place-operations-with-autograd
import torch
t = torch.ones((2, 3, 4))
t.size()
>>torch.Size([2, 3, 4])
a = t.view(-1,t.size()[1]*t.size()[2])
a.size()
>>torch.Size([2, 12])
Related
I'm writing a definition that needs to take slices of a tensor with an arbitrary number of dimensions. The slice will always be on the batch dimension of 0.
Here is a simple example of what I want:
def masktensor(X, array_of_indices):
    return X[array_of_indices, *] # edit * to allow variable number of dims
I want to be able to feed it various sized tensors and get the indexed slices out as a batch. Such as:
A = torch.rand(1000, 3, 32, 32)
B = torch.rand(1000, 5, 20)
indices = np.arange(10)
A_batch = masktensor(A, indices)
B_batch = masktensor(B, indices)
Thanks in advance!
After a bit of trial and error, I found that ... works as a slice wildcard:
def masktensor(X, array_of_indices):
    return X[array_of_indices, ...] # <---- '...' allows variable dim size
A = torch.rand(1000, 3, 32, 32)
B = torch.rand(1000, 5, 20)
indices = np.arange(10)
print(masktensor(A, indices).size())
print(masktensor(B, indices).size())
Yields:
>>> torch.Size([10, 3, 32, 32])
>>> torch.Size([10, 5, 20])
I have 6 tensors of shape (batch_size, S, S, 1) and I want to combine them in one python list of size (batch_size, S*S, 6) - so every element of tensor should be inside the inner list.
Can this be achieved without using loops? What's the efficient way to solve it?
Let batch_size=10 and S=4 for the purpose of this example:
>>> x = [torch.rand(10, 4, 4, 1) for _ in range(6)]
Indeed the first step is to concatenate the tensor on the last dimension axis=3:
>>> y = torch.cat(x, -1)
>>> y.shape
torch.Size([10, 4, 4, 6])
Then reshape to flatten axis=1 and axis=2. You can do so with torch.flatten here since the two axes are adjacent:
>>> y = torch.cat(x, -1).flatten(1, 2)
>>> y.shape
torch.Size([10, 16, 6])
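To double-check where each value ends up, here is a small sketch that compares one arbitrary position before and after the cat/flatten. The index values and the final tolist() call are illustrative additions:

```python
import torch

batch_size, S = 10, 4
x = [torch.rand(batch_size, S, S, 1) for _ in range(6)]
y = torch.cat(x, -1).flatten(1, 2)          # (batch_size, S*S, 6)

# Element (i, j) of tensor k lands at row i*S + j, column k.
b, i, j, k = 3, 2, 1, 5                     # arbitrary position to check
same = torch.equal(y[b, i * S + j, k], x[k][b, i, j, 0])
print(same)

as_list = y.tolist()                        # nested Python lists, if a list is really needed
```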
I'm starting off with a numpy array of an image.
In[1]:img = cv2.imread('test.jpg')
The shape is what you might expect for a 640x480 RGB image.
In[2]:img.shape
Out[2]: (480, 640, 3)
However, this image that I have is a frame of a video, which is 100 frames long. Ideally, I would like to have a single array that contains all the data from this video such that img.shape returns (480, 640, 3, 100).
What is the best way to add the next frame -- that is, the next set of image data, another 480 x 640 x 3 array -- to my initial array?
A dimension can be added to a numpy array as follows:
image = image[..., np.newaxis]
Alternatively to
image = image[..., np.newaxis]
in @dbliss' answer, you can also use numpy.expand_dims like
image = np.expand_dims(image, <your desired dimension>)
For example (taken from the link above):
x = np.array([1, 2])
print(x.shape) # prints (2,)
Then
y = np.expand_dims(x, axis=0)
yields
array([[1, 2]])
and
y.shape
gives
(1, 2)
You could just create an array of the correct size up-front and fill it:
nframes = 100
frames = np.empty((480, 640, 3, nframes), dtype=np.uint8) # cv2.imread returns uint8 by default
for k in range(nframes):
    frames[:,:,:,k] = cv2.imread('frame_{}.jpg'.format(k))
This assumes the frames are individual JPEG files named in some predictable way (in the example, frame_0.jpg, frame_1.jpg, etc.).
Just a note, you might consider using a (nframes, 480,640,3) shaped array, instead.
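One possible reason to prefer the frame-first layout is memory contiguity. This is a sketch with small made-up sizes standing in for 100 frames of 480 x 640:

```python
import numpy as np

nframes, h, w = 5, 4, 6          # small stand-ins for the real sizes
last = np.zeros((h, w, 3, nframes), dtype=np.uint8)   # frames on the last axis
first = np.zeros((nframes, h, w, 3), dtype=np.uint8)  # frames on the first axis

frame_from_last = last[..., 0]   # one frame, strided through memory
frame_from_first = first[0]      # one frame, a single contiguous block

print(frame_from_last.flags['C_CONTIGUOUS'])
print(frame_from_first.flags['C_CONTIGUOUS'])
```

With frames on the first axis, video[k] hands back a contiguous image, which is the layout most image functions expect.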
Pythonic
X = X[:, :, None]
which is equivalent to
X = X[:, :, numpy.newaxis] and
X = numpy.expand_dims(X, axis=-1)
But as you are explicitly asking about stacking images,
I would recommend going for stacking the list of images np.stack([X1, X2, X3]) that you may have collected in a loop.
If you do not like the order of the dimensions you can rearrange with np.transpose()
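A minimal sketch of the stack-then-transpose route, using tiny constant-valued frames (hypothetical data) so the result is easy to check:

```python
import numpy as np

# Five fake 4 x 6 RGB frames; frame k is filled with the value k.
frames = [np.full((4, 6, 3), k, dtype=np.uint8) for k in range(5)]

video = np.stack(frames)                          # new axis first: (5, 4, 6, 3)
video_last = np.transpose(video, (1, 2, 3, 0))    # move frames to the end: (4, 6, 3, 5)

# Stacking directly on the last axis gives the same array.
same = np.array_equal(video_last, np.stack(frames, axis=-1))
print(video.shape, video_last.shape, same)
```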
You can use np.concatenate() with the axis parameter to specify the dimension along which to concatenate. If the arrays being concatenated do not have this dimension, you can use np.newaxis to indicate where the new dimension should be added:
import numpy as np
movie = np.concatenate((img1[..., np.newaxis], img2[..., np.newaxis]), axis=3)
If you are reading from many files:
import glob
import cv2
movie = np.concatenate([cv2.imread(p)[..., np.newaxis] for p in glob.glob('*.jpg')], axis=3)
Consider Approach 1 with reshape method and Approach 2 with np.newaxis method that produce the same outcome:
#Lets suppose, we have:
x = [1,2,3,4,5,6,7,8,9]
print('I. x',x)
xNpArr = np.array(x)
print('II. xNpArr',xNpArr)
print('III. xNpArr', xNpArr.shape)
xNpArr_3x3 = xNpArr.reshape((3,3))
print('IV. xNpArr_3x3.shape', xNpArr_3x3.shape)
print('V. xNpArr_3x3', xNpArr_3x3)
#Approach 1 with reshape method
xNpArrRs_1x3x3x1 = xNpArr_3x3.reshape((1,3,3,1))
print('VI. xNpArrRs_1x3x3x1.shape', xNpArrRs_1x3x3x1.shape)
print('VII. xNpArrRs_1x3x3x1', xNpArrRs_1x3x3x1)
#Approach 2 with np.newaxis method
xNpArrNa_1x3x3x1 = xNpArr_3x3[np.newaxis, ..., np.newaxis]
print('VIII. xNpArrNa_1x3x3x1.shape', xNpArrNa_1x3x3x1.shape)
print('IX. xNpArrNa_1x3x3x1', xNpArrNa_1x3x3x1)
We have as outcome:
I. x [1, 2, 3, 4, 5, 6, 7, 8, 9]
II. xNpArr [1 2 3 4 5 6 7 8 9]
III. xNpArr (9,)
IV. xNpArr_3x3.shape (3, 3)
V. xNpArr_3x3 [[1 2 3]
[4 5 6]
[7 8 9]]
VI. xNpArrRs_1x3x3x1.shape (1, 3, 3, 1)
VII. xNpArrRs_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
VIII. xNpArrNa_1x3x3x1.shape (1, 3, 3, 1)
IX. xNpArrNa_1x3x3x1 [[[[1]
[2]
[3]]
[[4]
[5]
[6]]
[[7]
[8]
[9]]]]
a = np.expand_dims(a, axis=-1)
or
a = a[..., np.newaxis]
or
a = a.reshape(a.shape + (1,))
There is no structure in numpy that allows you to append more data later.
Instead, numpy puts all of your data into a contiguous chunk of numbers (basically a C array), and any resize requires allocating a new chunk of memory to hold it. Numpy's speed comes from being able to keep all the data in a numpy array in the same chunk of memory; e.g. mathematical operations can be parallelized for speed and you get fewer cache misses.
So you will have two kinds of solutions:
Pre-allocate the memory for the numpy array and fill in the values, like in JoshAdel's answer, or
Keep your data in a normal python list until it's actually needed to put them all together (see below)
images = []
for i in range(100):
    new_image = np.zeros((480, 640, 3), dtype=np.uint8) # placeholder: pull image from somewhere
    images.append(new_image)
images = np.stack(images, axis=3)
Note that there is no need to expand the dimensions of the individual image arrays first, nor do you need to know how many images you expect ahead of time.
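The cost of growing an array can be made visible with np.shares_memory. This is an illustrative sketch with a tiny made-up frame:

```python
import numpy as np

frame = np.zeros((4, 6, 3), dtype=np.uint8)
video = frame[..., np.newaxis]                       # (4, 6, 3, 1), a view of frame
grown = np.concatenate([video, frame[..., np.newaxis]], axis=3)

print(grown.shape)
print(np.shares_memory(video, frame))   # adding an axis costs no copy
print(np.shares_memory(grown, video))   # concatenating allocates a new array
```

Calling np.concatenate inside a loop therefore copies the whole video on every iteration, which is why collecting frames in a list and stacking once tends to be cheaper.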
You can use stack with the axis parameter:
img.shape # h,w,3
imgs = np.stack([img1,img2,img3,img4], axis=-1) # -1 = new axis is last
imgs.shape # h,w,3,nimages
For example: to convert grayscale to color:
>>> d = np.zeros((5,4), dtype=int) # 5x4
>>> d[2,3] = 1
>>> d3 = np.stack([d,d,d], axis=-1) # 5x4x3, -1 = as last axis
>>> d3.shape
Out[30]: (5, 4, 3)
>>> d3[2,3]
Out[32]: array([1, 1, 1])
I followed this approach:
import numpy as np
import cv2
ls = []
for image in image_paths:
    ls.append(cv2.imread(image))
img_np = np.array(ls) # shape (100, 480, 640, 3)
img_np = np.rollaxis(img_np, 0, 4) # shape (480, 640, 3, 100).
This worked for me:
image = image[..., None]
This will help you add axis anywhere you want
import numpy as np
signal = np.array([[0.3394572666491664, 0.3089068053925853, 0.3516359279582483], [0.33932706934615525, 0.3094755563319447, 0.3511973743219001], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256], [0.3394407172182317, 0.30889042266755573, 0.35166886011421256]])
print(signal.shape)
#(4,3)
print(signal[..., np.newaxis].shape) # or signal[..., None]
#(4, 3, 1)
print(signal[:, np.newaxis, :].shape) # or signal[:, None, :]
#(4, 1, 3)
There are three ways to add new dimensions to an ndarray.
first: using "np.newaxis" (as in @dbliss' answer)
np.newaxis is simply an alias for None, provided to make the intent easier to
understand. If you replace np.newaxis with None, it works the same
way, but np.newaxis is more explicit.
import numpy as np
my_arr = np.array([2, 3])
new_arr = my_arr[..., np.newaxis]
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
second: using "np.expand_dims()"
Specify the original ndarray in the first argument and the position
to add the dimension in the second argument axis.
my_arr = np.array([2, 3])
new_arr = np.expand_dims(my_arr, -1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
third: using "reshape()"
my_arr = np.array([2, 3])
new_arr = my_arr.reshape(*my_arr.shape, 1)
print("old shape", my_arr.shape)
print("new shape", new_arr.shape)
>>> old shape (2,)
>>> new shape (2, 1)
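As a quick cross-check (a sketch on a made-up 2 x 3 array), the three approaches give the same result, and for a contiguous input none of them copies the data:

```python
import numpy as np

arr = np.arange(6).reshape(2, 3)

a = arr[..., np.newaxis]          # first: np.newaxis
b = np.expand_dims(arr, -1)       # second: np.expand_dims
c = arr.reshape(*arr.shape, 1)    # third: reshape

print(np.newaxis is None)
print(a.shape, b.shape, c.shape)
print(np.array_equal(a, b) and np.array_equal(b, c))
print(all(np.shares_memory(arr, v) for v in (a, b, c)))
```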
I have a tensor of shape (16, 4096, 3). I have another tensor of indices of shape (16, 32768, 3). I am trying to collect the values along dim=1. This was initially done in pytorch using gather function as shown below-
# a.shape (16L, 4096L, 3L)
# idx.shape (16L, 32768L, 3L)
b = a.gather(1, idx)
# b.shape (16L, 32768L, 3L)
Please note that the size of output b is the same as that of idx. However, when I apply gather function of tensorflow, I get a completely different output. The output dimension was found mismatching as shown below-
b = tf.gather(a, idx, axis=1)
# b.shape (16, 16, 32768, 3, 3)
I also tried tf.gather_nd, but in vain. See below-
b = tf.gather_nd(a, idx)
# b.shape (16, 32768)
Why am I getting different shapes of tensors? I want to get the tensor of the same shape as calculated by pytorch.
In other words, I want to know the tensorflow equivalent of torch.gather.
For the 2D case, there is a way to do it:
# a.shape (16L, 10L)
# idx.shape (16L,1)
idx = tf.stack([tf.range(tf.shape(idx)[0]),idx[:,0]],axis=-1)
b = tf.gather_nd(a,idx)
However, for the N-D case this method may become very complex.
This "should" be a general solution using tf.gather_nd (I've only tested for rank 2 and 3 tensors along the last axis):
def torch_gather(x, indices, gather_axis):
    # if pytorch gather indices are
    # [[[0, 10, 20], [0, 10, 20], [0, 10, 20]],
    # [[0, 10, 20], [0, 10, 20], [0, 10, 20]]]
    # tf nd_gather needs to be
    # [[0,0,0], [0,0,10], [0,0,20], [0,1,0], [0,1,10], [0,1,20], [0,2,0], [0,2,10], [0,2,20],
    # [1,0,0], [1,0,10], [1,0,20], [1,1,0], [1,1,10], [1,1,20], [1,2,0], [1,2,10], [1,2,20]]
    # create a tensor containing indices of each element
    all_indices = tf.where(tf.fill(indices.shape, True))
    gather_locations = tf.reshape(indices, [indices.shape.num_elements()])
    # splice in our pytorch style index at the correct axis
    gather_indices = []
    for axis in range(len(indices.shape)):
        if axis == gather_axis:
            gather_indices.append(gather_locations)
        else:
            gather_indices.append(all_indices[:, axis])
    gather_indices = tf.stack(gather_indices, axis=-1)
    gathered = tf.gather_nd(x, gather_indices)
    reshaped = tf.reshape(gathered, indices.shape)
    return reshaped
For last-axis gathering, we can use the 2D-reshape trick for general N-D cases, and then apply @LiShaoyuan's 2D code above:
# last-axis gathering only - use 2D-reshape-trick for Torch's style nD gathering
def torch_gather(param, id_tensor):
    # 2d-gather torch equivalent from @LiShaoyuan above
    def gather2d(target, id_tensor):
        idx = tf.stack([tf.range(tf.shape(id_tensor)[0]),id_tensor[:,0]],axis=-1)
        result = tf.gather_nd(target,idx)
        return tf.expand_dims(result,axis=-1)
    target = tf.reshape(param, (-1, param.shape[-1])) # reshape 2D
    target_shape = id_tensor.shape
    id_tensor = tf.reshape(id_tensor, (-1, 1)) # also 2D-index
    result = gather2d(target, id_tensor)
    return tf.reshape(result, target_shape)
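When porting, it can help to have a framework-independent reference to test against. The sketch below is in NumPy rather than TensorFlow, and gather_reference is a hypothetical helper written for this illustration. It spells out the torch.gather rule (for dim=1, out[i][j][k] = input[i][index[i][j][k]][k]) with a plain loop and compares it with np.take_along_axis, which follows the same rule when the index array has the same number of dimensions as the input:

```python
import numpy as np

def gather_reference(a, idx, axis):
    """Loop-based reference for torch.gather semantics (hypothetical helper, for testing only)."""
    out = np.empty(idx.shape, dtype=a.dtype)
    for pos in np.ndindex(*idx.shape):
        src = list(pos)
        src[axis] = idx[pos]          # replace the coordinate on the gather axis
        out[pos] = a[tuple(src)]
    return out

rng = np.random.default_rng(0)
a = rng.random((2, 5, 3))                      # small stand-in for (16, 4096, 3)
idx = rng.integers(0, 5, size=(2, 8, 3))       # small stand-in for (16, 32768, 3)

expected = gather_reference(a, idx, axis=1)
fast = np.take_along_axis(a, idx, axis=1)
print(expected.shape, np.array_equal(expected, fast))
```

The output has the shape of idx, as in PyTorch, so the same small arrays can be fed to a TensorFlow implementation and compared against expected.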