I am using TensorFlow to implement a CNN model called DVF: https://github.com/liuziwei7/voxel-flow.
The output of the model is 'deconv4' with shape [batch_size, 256, 256, 3], from which I need to extract the optical flow with: flow = tf.slice(deconv4.outputs, [0, 0, 0, 0], [batch_size, 256, 256, 2]).
However, if the 'batch_size' is 'None', how do I slice the 'flow' tensor?
Thanks in advance.
The shape of 'deconv4' is [?,256,256,3] and I want to obtain the 'flow' with the shape of [?,256,256,2] from the 'deconv4'.
deconv4 = Conv2d(deconv3_bn_relu, 3, [5, 5], act=tf.tanh, padding='SAME', W_init=w_init, name='deconv4')
#################### Calculate Voxel Flow based on the 'deconv4' ############################
flow = tf.slice(deconv4.outputs, [0,0,0,0], [batch_size, 256, 256, 2])
The shape of 'flow' should be [?, 256,256,2]. But I am not sure how to obtain it.
You should be able to replace batch_size with -1 (in tf.slice, a size of -1 means "take all remaining elements along that dimension"), and that should do the trick.
Alternatively, tf.shape(x)[0] will give you a scalar tensor holding the runtime batch size.
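For reference, a minimal sketch of both options (my own example, assuming TF 1.x graph mode as in the voxel-flow code; deconv4_out stands in for deconv4.outputs):

import tensorflow as tf

# Placeholder with an unknown (None) batch dimension, as in the DVF model.
deconv4_out = tf.placeholder(tf.float32, shape=[None, 256, 256, 3])

# A size of -1 means "all remaining elements along this dimension",
# so the slice works regardless of the batch size at run time.
flow = tf.slice(deconv4_out, [0, 0, 0, 0], [-1, 256, 256, 2])

# Equivalent strided-slice syntax: keep all leading dims, take the first two channels.
flow_alt = deconv4_out[:, :, :, :2]

# If you need the dynamic batch size explicitly:
batch = tf.shape(deconv4_out)[0]
flow_dyn = tf.slice(deconv4_out, [0, 0, 0, 0], tf.stack([batch, 256, 256, 2]))

print(flow.get_shape())      # (?, 256, 256, 2)
print(flow_alt.get_shape())  # (?, 256, 256, 2)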
I have a PyTorch tensor with the shape [1, 3, 64, 64], and I want to convert it to the shape [1, 4, 64, 64] while setting the value of the newly added channel to be the same as the previous channel in that dimension (e.g. newtensor[0][3] = oldtensor[0][2]).
Note that my tensor has requires_grad=True, so I cannot use resize_()
How can I do this?
Get a slice of the last channel from the old tensor, and concatenate it onto the old tensor along dimension 1.
tslice = old[:,-1:,:,:]
new = torch.cat((old,tslice), dim = 1)
This will work. @DerekG's code had an error with the -1 index, but his idea is correct. Here tensor is your tensor data:
new = torch.cat((tensor, tensor[:, 0:1, :, :]), dim=1)  # copies the first channel; use tensor[:, -1:, :, :] to copy the last channel, as in the question's example
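As a quick sanity check, here is a minimal self-contained version (my own sketch) that matches the question's example, including requires_grad:

import torch

old = torch.randn(1, 3, 64, 64, requires_grad=True)

# Take the last channel while keeping the channel dimension, then
# concatenate it along dim=1 so the result has 4 channels.
tslice = old[:, -1:, :, :]
new = torch.cat((old, tslice), dim=1)

print(new.shape)                          # torch.Size([1, 4, 64, 64])
print(torch.equal(new[0, 3], old[0, 2]))  # True
print(new.requires_grad)                  # True, so gradients still flow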
I am confused on how to replicate Keras (TensorFlow) convolutions in PyTorch.
In Keras, I can do something like this (the input size is (256, 237, 1, 21) and the output size is (256, 237, 1, 1024)):
import tensorflow as tf
x = tf.random.normal((256,237,1,21))
y = tf.keras.layers.Conv1D(filters=1024, kernel_size=5,padding="same")(x)
print(y.shape)
(256, 237, 1, 1024)
However, in PyTorch, when I try to do the same thing I get a different output size:
import torch.nn as nn
x = torch.randn(256,237,1,21)
m = nn.Conv1d(in_channels=237, out_channels=1024, kernel_size=(1,5))
y = m(x)
print(y.shape)
torch.Size([256, 1024, 1, 17])
I want PyTorch to give me the same output size that Keras does:
This previous question seems to imply that Keras filters are PyTorch's out_channels, but that's what I already have. I tried adding padding in PyTorch with padding=(0, 503), but that gives me torch.Size([256, 1024, 1, 1023]), which is still not correct. This also takes much longer than Keras, so I feel that I have assigned a parameter incorrectly.
How can I replicate what Keras did with convolution in PyTorch?
In TensorFlow, tf.keras.layers.Conv1D takes in a tensor of shape (batch_shape + (steps, input_dim)). Which means that what is commonly known as channels appears on the last axis. For instance in 2D convolution you would have (batch, height, width, channels). This is different from PyTorch where the channel dimension is right after the batch axis: torch.nn.Conv1d takes in shapes of (batch, channel, length). So you will need to permute two axes.
For torch.nn.Conv1d:
in_channels is the number of channels in the input tensor
out_channels is the number of filters, i.e. the number of channels the output will have
stride is the step size of the convolution
padding is the zero-padding added to both sides
In PyTorch there is no option for padding='same' (at the time of this answer; see the other answer below), so you will need to choose the padding yourself. Here stride=1, so padding must equal kernel_size//2 (i.e. padding=2) in order to maintain the length of the tensor.
In your example, since x has a shape of (256, 237, 1, 21), in TensorFlow's terminology it will be considered as an input with:
a batch shape of (256, 237),
steps=1, so the length of your 1D input is 1,
21 input channels.
Whereas in PyTorch, x of shape (256, 237, 1, 21) would be:
batch shape of (256, 237),
1 input channel
a length of 21.
I have kept the input in both examples below (TensorFlow vs. PyTorch) as x.shape = (256, 237, 21), assuming 256 is the batch size, 237 is the length of the input sequence, and 21 is the number of channels (i.e. the input dimension, the dimension on each timestep).
In TensorFlow:
>>> x = tf.random.normal((256, 237, 21))
>>> m = tf.keras.layers.Conv1D(filters=1024, kernel_size=5, padding="same")
>>> y = m(x)
>>> y.shape
TensorShape([256, 237, 1024])
In PyTorch:
>>> x = torch.randn(256, 237, 21)
>>> m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
>>> y = m(x.permute(0, 2, 1))
>>> y.permute(0, 2, 1).shape
torch.Size([256, 237, 1024])
So in the latter, you would simply work with x = torch.randn(256, 21, 237)...
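In other words, if you keep the data channels-first from the start you don't need any permutes; a minimal sketch of that layout:

import torch
import torch.nn as nn

x = torch.randn(256, 21, 237)  # (batch, channels, length)
m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
y = m(x)
print(y.shape)  # torch.Size([256, 1024, 237])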
PyTorch now supports 'same' convolutions out of the box; take a look at this link: [Same convolution][1]
class InceptionNet(nn.Module):
    def __init__(self, in_channels, in_1x1, in_3x3reduce, in_3x3, in_5x5reduce, in_5x5, in_1x1pool):
        super(InceptionNet, self).__init__()
        # ConvBlock is a user-defined Conv2d wrapper (not shown in this excerpt)
        self.incep_1 = ConvBlock(in_channels, in_1x1, kernel_size=1, padding='same')
Note that a 'same' convolution only supports the default stride of 1; anything else won't work.
[1]: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
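For reference, a minimal sketch (assuming PyTorch >= 1.9, where string padding was introduced) using a plain nn.Conv1d with padding='same' on the setup from the question above:

import torch
import torch.nn as nn

x = torch.randn(256, 21, 237)  # (batch, channels, length)

# padding='same' keeps the output length equal to the input length;
# note that it only works with the default stride of 1.
m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding='same')
y = m(x)
print(y.shape)  # torch.Size([256, 1024, 237])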
I am new to PyTorch. I have a 3D tensor of shape (32, 10, 64) and I want a 2D tensor of shape (32, 64).
I tried view(), and I also tried squeeze() after passing the tensor through a linear layer, but that converted it to (32, 10).
Try this
t = torch.rand(32, 10, 64).permute(0, 2, 1)[:, :, -1]
or, as pointed out by Shai, you could also
t = torch.rand(32, 10, 64)[:, -1, :]
print(t.size()) # torch.Size([32, 64])
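A quick sanity check (my own sketch) that both forms pick out the same (32, 64) slice:

import torch

t = torch.rand(32, 10, 64)

a = t.permute(0, 2, 1)[:, :, -1]  # permute, then take the last step
b = t[:, -1, :]                   # index the middle dimension directly

print(a.shape, b.shape)   # torch.Size([32, 64]) torch.Size([32, 64])
print(torch.equal(a, b))  # True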
I have a problem. I have built a ConvNet. One hidden layer before the final output produces an output of shape (None, 64, 32, 32). What I want is to take the element-wise average of those 64 channels. I have tried this:
import numpy as np
import keras
from keras.layers import Input, Convolution2D, Activation
from keras.models import Model

main_inputs = []
outputs = []

def convnet(channels, rows, columns):
    input = Input(shape=(channels, rows, columns))
    main_inputs.append(input)
    conv1 = Convolution2D(kernel_size=(3, 3), filters=64, padding="same")(input)
    activation1 = Activation('relu')(conv1)
    conv2 = Convolution2D(kernel_size=(3, 3), filters=64, padding="same")(activation1)
    activation2 = Activation('relu')(conv2)
    conv3 = Convolution2D(kernel_size=(3, 3), filters=64, padding="same")(activation2)
    activation3 = Activation('relu')(conv3)
    conv4 = Convolution2D(kernel_size=(3, 3), filters=channels, padding="same")(activation3)
    out = keras.layers.Average()(conv4)
    activation4 = Activation('linear')(out)
    outputs.append(activation4)
    print(np.shape(outputs))
    model = Model(inputs=main_inputs, outputs=outputs)
    return model
But I am getting this error:
ValueError: A merge layer should be called on a list of inputs
After that, instead of keras.layers.Average, I tried the backend function from the documentation:
out=K.mean(conv4,axis=1)
But I am getting this error:
'Tensor' object has no attribute '_keras_history'
Any ideas?
Let's say conv4 is a tensor with shape (batch_size, nb_channels, 32, 32). You can average conv4 over the channels' dimension as follows:
out = Lambda(lambda x: K.mean(x, axis=1))(conv4)
The resulting tensor out will have shape (batch_size, 32, 32). You need to wrap all backend operations within a Lambda layer so that the resulting tensors are valid Keras tensors and don't lack attributes such as _keras_history.
If you want the shape of out to be (batch_size, 1, 32, 32) instead, you can do:
out = Lambda(lambda x: K.mean(x, axis=1)[:, None, :, :])(conv4)
NOTE: Not tested.
To add my few cents to rvinas' answer: there is a parameter called keepdims which prevents the shape of the tensor from being reduced after applying an operation to it.
keepdims: A boolean, whether to keep the dimensions or not. If keepdims is False, the rank of the tensor is reduced by 1. If keepdims is True, the reduced dimension is retained with length 1.
out = Lambda(lambda x: K.mean(x, axis=1, keepdims=True))(conv4)  # keepdims is an argument of K.mean, not of Lambda
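Untested minimal sketch (assuming standalone Keras with channels-first data, matching the question's (None, 64, 32, 32) shape) showing the averaging with and without keepdims:

import keras.backend as K
from keras.layers import Input, Lambda
from keras.models import Model

inp = Input(shape=(64, 32, 32))  # (channels, rows, cols); the batch axis is implicit

avg = Lambda(lambda x: K.mean(x, axis=1))(inp)                      # (None, 32, 32)
avg_keep = Lambda(lambda x: K.mean(x, axis=1, keepdims=True))(inp)  # (None, 1, 32, 32)

model = Model(inputs=inp, outputs=[avg, avg_keep])
model.summary()  # shows both output shapes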
According to this Deep Learning course http://cs231n.github.io/convolutional-networks/#conv, if an input x with shape [W, W] (where W = width = height) goes through a convolutional layer with filter shape [F, F] and stride S, the layer will return an output with shape [(W-F)/S + 1, (W-F)/S + 1].
However, when I follow the TensorFlow tutorial https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html, the function tf.nn.conv2d(inputs, filter, strides) seems to behave differently.
No matter how I change my filter size, conv2d keeps returning an output with the same shape as the input.
In my case, I am using the MNIST dataset, in which every image has size [28, 28] (ignoring channel_num = 1),
but after defining the first conv1 layer, I used conv1.get_shape() to look at its output, and it gives me [28, 28, num_of_filters].
Why is this? I thought the return value should follow the formula above.
Appendix: Code snippet
#reshape x from 2d to 4d
x_image = tf.reshape(x, [-1, 28, 28, 1]) #[num_samples, width, height, channel_num]
## define the shape of weights and bias
w_shape = [5, 5, 1, 32] #patch_w, patch_h, in_channel, output_num(out_channel)
b_shape = [32] #bias only need to be consistent with output_num
## init weights of conv1 layers
W_conv1 = weight_variable(w_shape)
b_conv1 = bias_variable(b_shape)
## first layer x_image->conv1/relu->pool1
#Our convolutions uses a stride of one
#and are zero padded
#so that the output is the same size as the input
h_conv1 = tf.nn.relu(
conv2d(x_image, W_conv1) + b_conv1
)
print 'conv1.shape=',h_conv1.get_shape()
## conv1.shape= (?, 28, 28, 32)
## I thought conv1.shape should be (?, (28-5)/1+1, 24 ,32)
h_pool1 = max_pool_2x2(h_conv1) #output 32 num
print 'pool1.shape=',h_pool1.get_shape() ## pool1.shape= (?, 14, 14, 32)
It depends on the padding parameter: 'SAME' will keep the output at W x W (assuming stride=1), while 'VALID' will shrink the output to (W-F+1) x (W-F+1).
Conv2d has a parameter called padding, see here.
If you set padding to "VALID", it will satisfy your formula. The tutorial's conv2d helper sets padding to "SAME", which pads the image with zeros (like adding a border around it) so that the output keeps the same shape as the input.
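A small sketch (TF 1.x style, matching the tutorial) illustrating the difference for the MNIST shapes from the question:

import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 28, 28, 1])   # MNIST-style input
W = tf.truncated_normal([5, 5, 1, 32], stddev=0.1)  # 5x5 patches, 32 output channels

same = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
valid = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='VALID')

print(same.get_shape())   # (?, 28, 28, 32)  -- input size preserved
print(valid.get_shape())  # (?, 24, 24, 32)  -- (28 - 5)/1 + 1 = 24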