I am trying to construct a model that looks like this.
Notice that the output shape of the padding layer is 1 * 48 * 48 * 32. The input shape to padding layer is 1 * 48 * 48 * 16. Which type of padding operation does that?
My code:
prelu3 = tf.keras.layers.PReLU(shared_axes = [1, 2])(add2)
deptconv3 = tf.keras.layers.DepthwiseConv2D(3, strides=(2, 2), padding='same')(prelu3)
conv4 = tf.keras.layers.Conv2D(32, 1, strides=(1, 1), padding='same')(deptconv3)
maxpool1 = tf.keras.layers.MaxPool2D()(prelu3)
pad1 = tf.keras.layers.ZeroPadding2D(padding=(1, 1))(maxpool1) # This is the padding layer where problem lies.
This is the part of the code that tries to replicate that block. However, I get a model that looks like this.
Am I missing something here or am I using the wrong layer?
By default, the Keras ZeroPadding2D layer takes in:
Input shape: 4D tensor with shape (batch_size, rows, cols, channels).
Output shape: (batch_size, padded_rows, padded_cols, channels)
Please have a look at the ZeroPadding2D layer docs in Keras.
In that respect, you are trying to double what is treated as the channel axis here.
Your input looks like (batch, x, y, z) and you want to end up with (batch, x, y, 2*z).
Why do you want zero-padding to double your z? I would rather suggest using a dense layer, like
tf.keras.layers.Dense(32)(maxpool1)
That would increase z shape from 16 to 32.
Edited:
I got something which can help you.
tf.keras.layers.ZeroPadding2D(
padding=(0, 8), data_format="channels_first"
)(maxpool1)
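With data_format="channels_first", the layer reads your (batch, x, y, z) input as (batch, channels, rows, cols): x is treated as the channel axis and (y, z) as the spatial axes. padding=(0, 8) then adds nothing to y and 8 zeros on each side of z, so (y, 16) becomes (y, 32).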
Demo:
import tensorflow as tf
input_shape = (4, 28, 28, 3)
x = tf.keras.layers.Input(shape=input_shape[1:])
y = tf.keras.layers.Conv2D(16, 3, activation='relu', dilation_rate=2, input_shape=input_shape[1:])(x)
x=tf.keras.layers.ZeroPadding2D(
padding=(0, 8), data_format="channels_first"
)(y)
print(y.shape, x.shape)
(None, 24, 24, 16) (None, 24, 24, 32)
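To see what that layer does to the data, the same operation can be sketched in plain NumPy. This is only an illustration, assuming a channels-last feature map of shape (1, 48, 48, 16) as in the question; pad_channels is a hypothetical helper name, not a Keras function.

```python
import numpy as np

def pad_channels(x, before, after):
    """Zero-pad only the last (channel) axis of a channels-last tensor."""
    pad_width = [(0, 0)] * (x.ndim - 1) + [(before, after)]
    return np.pad(x, pad_width, mode="constant")

feature_map = np.ones((1, 48, 48, 16), dtype=np.float32)

# Same effect as ZeroPadding2D(padding=(0, 8), data_format="channels_first"):
# 8 zero channels on each side of the original 16.
symmetric = pad_channels(feature_map, 8, 8)

# Alternative: append all 16 zero channels after the original ones.
appended = pad_channels(feature_map, 0, 16)

print(symmetric.shape, appended.shape)
```

Which of the two placements the reference model uses cannot be read off the shapes alone, so treat that choice as an assumption to check against the original graph.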
Related
I am confused on how to replicate Keras (TensorFlow) convolutions in PyTorch.
In Keras, I can do something like this (the input size is (256, 237, 1, 21) and the output size is (256, 237, 1, 1024)).
import tensorflow as tf
x = tf.random.normal((256,237,1,21))
y = tf.keras.layers.Conv1D(filters=1024, kernel_size=5,padding="same")(x)
print(y.shape)
(256, 237, 1, 1024)
However, in PyTorch, when I try to do the same thing I get a different output size:
import torch
import torch.nn as nn
x = torch.randn(256,237,1,21)
m = nn.Conv1d(in_channels=237, out_channels=1024, kernel_size=(1,5))
y = m(x)
print(y.shape)
torch.Size([256, 1024, 1, 17])
I want PyTorch to give me the same output size that Keras does.
This previous question seems to imply that Keras filters are PyTorch's out_channels, but that's what I have. I tried adding padding=(0,503) in PyTorch, but that gives me torch.Size([256, 1024, 1, 1023]), which is still not correct. It also takes much longer than Keras does, so I suspect I have assigned a parameter incorrectly.
How can I replicate what Keras did with convolution in PyTorch?
In TensorFlow, tf.keras.layers.Conv1D takes in a tensor of shape (batch_shape + (steps, input_dim)). Which means that what is commonly known as channels appears on the last axis. For instance in 2D convolution you would have (batch, height, width, channels). This is different from PyTorch where the channel dimension is right after the batch axis: torch.nn.Conv1d takes in shapes of (batch, channel, length). So you will need to permute two axes.
For torch.nn.Conv1d:
in_channels is the number of channels in the input tensor
out_channels is the number of filters, i.e. the number of channels the output will have
stride is the step size of the convolution
padding is the zero-padding added to both sides
When this answer was written, PyTorch had no option for padding='same' (newer releases accept it for stride-1 convolutions), so you need to choose the padding yourself. Here stride=1 and kernel_size is odd, so padding must equal kernel_size//2 (i.e. padding=2) in order to maintain the length of the tensor.
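The padding rule can be checked with the output-length formula given in the torch.nn.Conv1d documentation. The helper below is a minimal sketch of that arithmetic; conv1d_out_length is a hypothetical name used only for illustration.

```python
def conv1d_out_length(length, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution with symmetric zero-padding."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (length + 2 * padding - effective_kernel) // stride + 1

# Length 21, kernel 5, no padding: the 17 reported in the question.
no_padding = conv1d_out_length(21, 5)
# padding=503 overshoots: the 1023 reported in the question.
too_much = conv1d_out_length(21, 5, padding=503)
# padding = kernel_size // 2 keeps the length when stride is 1.
same_length = conv1d_out_length(237, 5, padding=5 // 2)

print(no_padding, too_much, same_length)  # 17 1023 237
```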
In your example, since x has a shape of (256, 237, 1, 21), in TensorFlow's terminology it will be considered as an input with:
a batch shape of (256, 237),
steps=1, so the length of your 1D input is 1,
21 input channels.
Whereas in PyTorch, x of shape (256, 237, 1, 21) would be:
batch shape of (256, 237),
1 input channel
a length of 21.
I have kept the input in both examples below (TensorFlow vs. PyTorch) as x.shape=(256, 237, 21), assuming 256 is the batch size, 237 is the length of the input sequence, and 21 is the number of channels (i.e. the input dimension, which I see as the dimension of each timestep).
In TensorFlow:
>>> x = tf.random.normal((256, 237, 21))
>>> m = tf.keras.layers.Conv1D(filters=1024, kernel_size=5, padding="same")
>>> y = m(x)
>>> y.shape
TensorShape([256, 237, 1024])
In PyTorch:
>>> x = torch.randn(256, 237, 21)
>>> m = nn.Conv1d(in_channels=21, out_channels=1024, kernel_size=5, padding=2)
>>> y = m(x.permute(0, 2, 1))
>>> y.permute(0, 2, 1).shape
torch.Size([256, 237, 1024])
So in the latter, you would simply work with x = torch.randn(256, 21, 237)...
PyTorch now supports same convolution out of the box; take a look at this link: [Same convolution][1]
# ConvBlock is not defined in this answer; presumably a wrapper around nn.Conv2d
class InceptionNet(nn.Module):
    def __init__(self, in_channels, in_1x1, in_3x3reduce, in_3x3, in_5x5reduce, in_5x5, in_1x1pool):
        super(InceptionNet, self).__init__()
        self.incep_1 = ConvBlock(in_channels, in_1x1, kernel_size=1, padding='same')
Note that a same convolution only supports the default stride value of 1; any other stride is not supported.
[1]: https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html
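For strides other than 1, one workaround is to compute the padding explicitly. The sketch below follows the rule TensorFlow documents for padding='same' (dilation 1 assumed); tf_same_padding is a hypothetical helper name. The two amounts can differ, in which case they would have to be applied with an explicit padding step (for example torch.nn.functional.pad) before a convolution with padding=0.

```python
import math

def tf_same_padding(length, kernel_size, stride=1):
    """Return (pad_before, pad_after) for TensorFlow-style 'same' padding."""
    out_length = math.ceil(length / stride)
    total = max((out_length - 1) * stride + kernel_size - length, 0)
    pad_before = total // 2
    return pad_before, total - pad_before

stride_one = tf_same_padding(237, 5, stride=1)   # symmetric
stride_two = tf_same_padding(48, 3, stride=2)    # asymmetric

print(stride_one, stride_two)  # (2, 2) (0, 1)
```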
I was trying to train my model for prediction of EMNIST by using Pytorch.
Edit:- Here's the link of colab notebook for the problem.
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(28, 64, (5, 5), padding=2)
        self.conv1_bn = nn.BatchNorm2d(64)
        self.conv2 = nn.Conv2d(64, 128, 2, padding=2)
        self.fc1 = nn.Linear(2048, 1024)
        self.dropout = nn.Dropout(0.3)
        self.fc2 = nn.Linear(1024, 512)
        self.bn = nn.BatchNorm1d(1)
        self.fc3 = nn.Linear(512, 128)
        self.fc4 = nn.Linear(128, 47)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = self.conv1_bn(x)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 2048)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        x = x.view(-1, 1, 512)
        x = self.bn(x)
        x = x.view(-1, 512)
        x = self.fc3(x)
        x = self.fc4(x)
        return F.log_softmax(x, dim=1)
        return x
I am getting this type of error as shown below, whenever I am training my model.
<ipython-input-11-07c68cf1cac2> in forward(self, x)
24 def forward(self, x):
25 x = F.relu(self.conv1(x))
---> 26 x = F.max_pool2d(x, 2, 2)
27 x = self.conv1_bn(x)
RuntimeError: Given input size: (64x28x1). Calculated output size: (64x14x0). Output size is too small
I searched for solutions and found that I should transform the data first. So I tried transforming it with the most common suggestion:
transform_valid = transforms.Compose(
[
transforms.ToTensor(),
])
But then again I am getting the error mentioned below. Maybe the problem lies here in the transformation part.
/opt/conda/lib/python3.7/site-packages/torchvision/datasets/mnist.py:469: UserWarning: The given NumPy array is not writeable, and PyTorch does not support non-writeable tensors. This means you can write to the underlying (supposedly non-writeable) NumPy array using the tensor. You may want to copy the array to protect its data or make it writeable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at /opt/conda/conda-bld/pytorch_1595629403081/work/torch/csrc/utils/tensor_numpy.cpp:141.)
return torch.from_numpy(parsed.astype(m[2], copy=False)).view(*s)
I wanted to make that particular numpy array writable by using "ndarray.setflags(write=None, align=None, uic=None)" but I'm not able to figure out from where and what type of array should I make writable, as I'm directly loading the dataset using ->
"datasets.EMNIST(root, split="balanced", train=False, download=True, transform=transform_valid)"
Welcome to Stack Overflow!
Your problem is not related to the ToTensor transform. The error is raised because of the dimensions of the tensor you feed into your max pool: it clearly states that you are trying to max-pool a tensor in which one of the dimensions is 1 (64, 28, 1), which would produce a tensor with a dimension of 0 (64, 14, 0), and that makes no sense.
You need to check the dimensions of the tensors you input in your model. They are definitely too small. Maybe you made a mistake with a view somewhere (hard to tell without a minimal reproducible example).
If I can try to guess, you have at the beginning a tensor of size 28x28x1 (typical MNIST), and you put it into a convolution that expects a tensor of dims BxCxHxW (batch_size, channels, height, width), i.e. something like (B, 1, 28, 28), but you confused the width (28) with the input channels (nn.Conv2d(->28<-, 64, (5, 5), padding=2))
I believe you want your first layer to be nn.Conv2d(1, 64, (5, 5), padding=2), and you need to resize your tensors to give them the shape (B, 1, 28, 28) (the value of B is up to you) before giving them to the network.
Side note: the warning about writable NumPy arrays is completely unrelated. It just means that PyTorch could possibly overwrite the "non-writable" data of your NumPy array. If you don't care about this NumPy array being modified, you can ignore the warning.
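The guess above can be checked by following the shapes by hand. This is a rough sketch using the standard floor-mode size formula for convolution and pooling; conv_out and trace are hypothetical helpers, and the layer parameters are copied from the Net class in the question.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1

def trace(height, width):
    """Follow one sample through conv1 -> pool -> conv2 -> pool."""
    shapes = []
    height, width = conv_out(height, 5, padding=2), conv_out(width, 5, padding=2)
    shapes.append((64, height, width))     # after conv1 (5x5, padding 2)
    height, width = conv_out(height, 2, 2), conv_out(width, 2, 2)
    shapes.append((64, height, width))     # after max_pool2d(2, 2)
    height, width = conv_out(height, 2, padding=2), conv_out(width, 2, padding=2)
    shapes.append((128, height, width))    # after conv2 (2x2, padding 2)
    height, width = conv_out(height, 2, 2), conv_out(width, 2, 2)
    shapes.append((128, height, width))    # after max_pool2d(2, 2)
    return shapes

# Input shaped (B, 28, 28, 1): conv1 sees height 28 and width 1.
bad = trace(28, 1)[:2]
# Input shaped (B, 1, 28, 28) with nn.Conv2d(1, 64, ...).
good = trace(28, 28)
flat_features = good[-1][0] * good[-1][1] * good[-1][2]

print(bad)
print(good)
print(flat_features)
```

The first trace reproduces the (64x28x1) and (64x14x0) sizes from the error message. Under these assumptions the flattened size for a correctly shaped input comes out as 8192 rather than the 2048 that fc1 and the view call expect, so that part of the model may need adjusting as well once the input shape is fixed.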
I have an Input layer that looks like this:
>>>inp = tf.keras.Input(shape=(107, 3))
>>>print(inp)
Tensor("input_25:0", shape=(None, 107, 3), dtype=float32)
Since the shape is (None, 107, 3), I want to take each (None, 107, 1) to use it for separate layers. How do I do that?
According to a related GitHub issue, you can use tf.keras.layers.Lambda to split the input tensor by channel.
import tensorflow as tf
tfkl = tf.keras.layers
inp = tf.keras.Input(shape=(107, 3))
x0 = tfkl.Lambda(lambda x : x[..., 0])(inp)
x1 = tfkl.Lambda(lambda x : x[..., 1])(inp)
x2 = tfkl.Lambda(lambda x : x[..., 2])(inp)
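The ... is an Ellipsis: it stands for all leading axes, so the index applies to the last axis only. Note that an integer index drops that axis, giving shape (None, 107); use a slice such as x[..., 0:1] to keep it as (None, 107, 1).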
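The difference between an integer index and a length-1 slice is easy to check in NumPy, which follows the same basic indexing rules as TensorFlow tensors for this case. The batch size of 8 is an arbitrary stand-in for None.

```python
import numpy as np

batch = np.zeros((8, 107, 3), dtype=np.float32)

dropped = batch[..., 0]     # integer index removes the last axis
kept = batch[..., 0:1]      # length-1 slice keeps the last axis

print(dropped.shape, kept.shape)  # (8, 107) (8, 107, 1)
```

Since the question asks for (None, 107, 1) per branch, the slice form inside the Lambda is probably the one wanted.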
As the title says I'm looking at determining the proper dimensions for my CNN architecture. First, I obtain the next element of my dataset:
train_ds = iter(model.train_dataset)
feature, label = next(train_ds)
Where feature has dimensions (32, 64, 64, 4) corresponding to a batch size of 32, height of 64, length 64, and extended batch size of 4 (not a channel dimension). I initialize my 4-d kernel to pass over my 3-matrix, as I do not want the extended batch size to be convolved. What I mean by this is that in practice I want a 2-d kernel of size (1, 1) to pass over each 64 x 64 image, and do the same for the extended batch size without convolving the extended batch sizes together. So I am in fact doing a (1, 1) convolution for each image in parallel with each other. So far I was able to initialize the kernel and feed the conv2d like so:
kernel = tf.constant(np.ones((1, 1, 4, 4)), dtype=tf.float32)
output = tf.nn.conv2d(feature, kernel, strides=[1, 1, 1, 1], padding='SAME')
Doing this produces my expected output, (32, 64, 64, 4). But I have absolutely no idea how to initialize the weights so that they work with this architecture. I have something like this:
w_init = tf.random_normal_initializer()
input_dim = (4, 1, 1, 4)
w = tf.Variable(
initial_value=w_init(shape=(input_dim), dtype="float32"),
trainable=True)
tf.matmul(output, w)
But I'm receiving incompatible batch dimensions as I don't know what the input_dim should be. I know it should be something like (num_filters * filter_size * filter_size * num_channels) + num_filters according to this answer, but I'm pretty sure that doesn't work for my scenario.
After tinkering around I was able to come up with a solution when the dimension weights are of size (1, 1, 4, 4) or (num_filters * num_channels * filter_size * filter_size). If anyone wants to provide a mathematical or similar explanation, it would be much appreciated!
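One way to read the (1, 1, 4, 4) shape: tf.nn.conv2d lays out its filters as (filter_height, filter_width, in_channels, out_channels), and a 1x1 convolution is just a (4, 4) matrix applied to the last axis at every pixel. A batched matmul with (1, 1, 4, 4) weights does the same thing, because the leading (1, 1) broadcasts over the batch and height axes. The NumPy sketch below illustrates this under those assumptions, with smaller sizes than in the question.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 6, 6, 4))    # (batch, height, width, last axis)
w = rng.normal(size=(1, 1, 4, 4))    # same shape as the working weights

# matmul broadcasts the leading (1, 1) of w against (batch, height) of x
# and multiplies each (width, 4) slab by the (4, 4) matrix.
via_matmul = np.matmul(x, w)

# A 1x1 convolution is the same per-pixel product over the last axis.
via_einsum = np.einsum("bhwi,io->bhwo", x, w[0, 0])

# A diagonal matrix scales each of the 4 slices without mixing them.
scales = np.array([1.0, 2.0, 3.0, 4.0])
independent = np.matmul(x, np.diag(scales)[None, None])

print(np.allclose(via_matmul, via_einsum))
print(np.allclose(independent, x * scales))
```

A dense (4, 4) matrix, such as the all-ones kernel above, mixes the four slices of the last axis; if they really must stay independent, a diagonal matrix (or per-slice scalars) is one option.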
In my computational pipeline, I use a custom function that creates custom Keras blocks, and I use these blocks multiple times with Conv2D. At the end, I get two feature maps with different shapes: TensorShape([None, 21, 21, 64]) and TensorShape([None, 10, 10, 192]). In this case, tf.keras.layers.concatenate does not work for me. Can anyone point out how to concatenate these two tensors into one?
If I am able to concatenate the tensors with shapes TensorShape([None, 21, 21, 64]) and TensorShape([None, 10, 10, 192]), I want to do the following after the concatenation.
x = Conv2D(32, (2, 2), strides=(1,1), padding='same')(merged_tensors)
x = BatchNormalization(axis=-1)(x)
x = Activation('relu')(x)
x = MaxPooling2D(pool_size=(2,2))(x)
x = Flatten()(x)
x = Dense(256)(x)
x = BatchNormalization()(x)
x = Activation('relu')(x)
x = Dropout(0.25)(x)
x = Dropout(0.25)(x)
x = Dense(10)(x)
x = Activation('softmax')(x)
outputs = x
model = tf.keras.Model(inputs=inputs, outputs=outputs)
I tried to reshape the tensors with shapes TensorShape([None, 21, 21, 64]) and TensorShape([None, 10, 10, 192]) for 1D convolution, merge them, and then reshape the output back for 2D convolution. This is not working. Can anyone suggest a possible way of doing this?
update
I am still not sure whether the output shape of the concatenation should be TensorShape([None, 21+10, 21+10, 192+64]), because I am not sure it makes sense mathematically. How can I do this concatenation easily and correctly? What would be the right shape of the concatenated tensor?
To concatenate, you must provide layers with the same shape except along the concat axis. In the case of images, if you want to concatenate them along the feature dimension (axis -1), the layers must have the same batch_dim, width, and height.
If you want to force the operation, you need to do something that equalizes the dimensions. One possibility is padding. Below is an example where I concatenate two layers along the last dimension:
import numpy as np
from tensorflow.keras.layers import Concatenate, MaxPool2D, ZeroPadding2D
batch_dim = 32
x1 = np.random.uniform(0,1, (batch_dim, 10,10,192)).astype('float32')
x2 = np.random.uniform(0,1, (batch_dim, 21,21,64)).astype('float32')
merged_tensors = Concatenate()([ZeroPadding2D(((6,5),(6,5)))(x1), x2]) # (batch_dim, 21, 21, 192+64)
With pooling instead of padding:
batch_dim = 32
x1 = np.random.uniform(0,1, (batch_dim, 10,10,192)).astype('float32')
x2 = np.random.uniform(0,1, (batch_dim, 21,21,64)).astype('float32')
merged_tensors = Concatenate()([MaxPool2D(2)(x2), x1]) # (batch_dim, 10, 10, 192+64)
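Regarding the shape asked about in the update: concatenation only adds sizes along the concat axis, so the result is (21, 21, 256) or (10, 10, 256), never (31, 31, 256). The NumPy sketch below mirrors the two options above to check the shape bookkeeping; it is an illustration of the arithmetic, not a replacement for the Keras layers.

```python
import numpy as np

x1 = np.zeros((32, 10, 10, 192), dtype=np.float32)
x2 = np.zeros((32, 21, 21, 64), dtype=np.float32)

# Option 1: zero-pad the 10x10 maps to 21x21 (6 before, 5 after on each axis).
x1_padded = np.pad(x1, ((0, 0), (6, 5), (6, 5), (0, 0)))
merged_by_padding = np.concatenate([x1_padded, x2], axis=-1)

# Option 2: 2x2 max pooling with stride 2 on the 21x21 maps
# (the last row and column are dropped, as with 'valid' pooling).
x2_windows = x2[:, :20, :20, :].reshape(32, 10, 2, 10, 2, 64)
x2_pooled = x2_windows.max(axis=(2, 4))
merged_by_pooling = np.concatenate([x2_pooled, x1], axis=-1)

print(merged_by_padding.shape, merged_by_pooling.shape)
```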