I'm trying to build a neural net, but I can't figure out where I'm going wrong with the max pooling layer.
self.embed1 = nn.Embedding(256, 8)
self.conv_1 = nn.Conv2d(1, 64, (7, 8), padding=(0, 0))
self.fc1 = nn.Linear(64, 2)
def forward(self, x):
import pdb; pdb.set_trace()
x = self.embed1(x)  # input is a tensor of size [1, 217]; output size: [1, 217, 8]
x = x.unsqueeze(0)  # the conv layer needs a tensor of size (B x C x H x W), so unsqueeze here to make [1, 1, 217, 8]
x = self.conv_1(x)  # applies 64 filters of size (7, 8); outputs [1, 64, 211, 1], as 6 values are lost due to no padding
x = torch.max(x, 0)  # taking the max over the 64 columns; this returns a tuple of length 2, the max values and the indices
x = x[0]  # I only need the max values; this gives a tensor of size [64, 211, 1]
x = x.squeeze(2)  # the linear layer only wants the number of inputs and number of outputs, so I squeeze the tensor to [64, 211]
x = self.fc1(x)  # Error: size mismatch (m1: [64 x 211], m2: [64 x 2])
I understand why the linear layer isn't accepting 211, but I don't understand why my tensor after maxing over the columns isn't 64 x 2.
Your use of torch.max returns two outputs: the max values along dim=0 and the argmax along that dimension. Thus, you need to pick only the first output. (You might want to consider using adaptive max pooling for this task.)
Your linear layer expects its input to have dim 64 (that is, a batch_size-by-64 shaped tensor). However, it seems like your x[0] is of shape 13504x1 - definitely not 64.
See this thread for example.
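For instance, a minimal sketch of that adaptive-pooling route (the shapes below are taken from the question; this is an illustration, not the asker's exact model):
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(1, 64, 211, 1)        # shape after conv_1 in the question
x = F.adaptive_max_pool2d(x, (1, 1))  # -> [1, 64, 1, 1]
x = x.view(x.size(0), -1)             # -> [1, 64]
fc1 = nn.Linear(64, 2)
print(fc1(x).shape)                   # torch.Size([1, 2])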
If I'm guessing your intentions correctly, your mistake is that you're using torch.max for 2d maxpooling, instead of torch.nn.functional.max_pool2d. The former reduces across a tensor dimension (for instance across all feature maps or all horizontal lines), whereas the latter reduces in each square spatial neighborhood in the [h, w] plane of a [batch, features, h, w] tensor.
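To see the difference concretely on the shapes from the question (a small illustration, not part of the original code):
import torch
import torch.nn.functional as F

t = torch.randn(1, 64, 211, 1)                    # shape after conv_1
print(torch.max(t, dim=0)[0].shape)               # reduces dim 0 -> torch.Size([64, 211, 1])
print(F.max_pool2d(t, kernel_size=(2, 1)).shape)  # pools 2x1 windows -> torch.Size([1, 64, 105, 1])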
Instead of this:
x = x.squeeze(2)
You can do this instead:
x = x.view(-1, 64) # view reshapes it to [211, 64]; fc1 then maps it to [211, 2]
You can think of view as numpy reshape. We use -1 to signify that we don't know how many rows we want but we know how many columns we have, 64.
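As a quick shape check of this reshaping route (a sketch using the sizes from the question):
import torch
import torch.nn as nn

x = torch.randn(64, 211)  # the tensor after squeeze(2)
x = x.view(-1, 64)        # -> [211, 64]
fc1 = nn.Linear(64, 2)
print(fc1(x).shape)       # torch.Size([211, 2])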
I am trying to take an inner product of two vectors in tensorflow, for which I use the dot product:
import tensorflow as tf

x = tf.constant([1, 2, 3], dtype=tf.int32)
y = tf.constant([4, 5, 6], dtype=tf.int32)
# desired result
tf.tensordot(x, y, axes=1)
# Output: 32
Now I'm dealing with batch tensors which both have shape (32, 3). I still want the same operation, yielding an output vector of shape (32,). My only successful attempt so far is:
tf.linalg.diag_part(tf.tensordot(x, y, axes=[[1], [1]]))
# Output: <tf.Tensor: shape=(32,)>
# where each entry is the inner product of the vectors of length 3
However, this computes 32 times as many inner products as required.
How do I solve my problem more efficiently?
Think about what this operation is at the end of the day: element-wise multiplication and a sum over axis 1. So you can just do this:
tf.reduce_sum(x * y, axis=1)
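For example, a quick check with random batches of the shapes from the question (TF2 eager mode assumed; tf.einsum is an equivalent alternative):
import tensorflow as tf

x = tf.random.normal((32, 3))
y = tf.random.normal((32, 3))
print(tf.reduce_sum(x * y, axis=1).shape)  # (32,)
print(tf.einsum('ij,ij->i', x, y).shape)   # (32,), the same row-wise dot product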
Currently I have this code, which pads a BxNxNxC tensor to a BxNxNx(C+P) tensor, where B is the batch size, C is the number of channels, and P is the number of padding channels I want to add:
A = <some BxNxNxC tensor>
P = <some calculation>
padding_tensor = keras.layers.UpSampling3D(size=[1, 1, P])(tf.zeros_like(A)[:, :, :, 0:1])
# This is the BxNxNx(C+P) tensor
concat = keras.layers.Concatenate(axis=3)([A, padding_tensor])
The reason I do this in a roundabout way is that I cannot directly create a padding_tensor of the correct size: it seems impossible to get the batch size to specify the shape.
I want a clean way to do this because I am looking at the computation graphs of my models and this adds a lot of bloat. If it were possible to hide all of these operations in a single computation node I would be happy enough with that, but I would rather not use three operations for something as simple as padding.
I also suspect this will be kind of slow, but I don't know enough about TensorFlow to really know.
This is my suggestion: I initialize a fake Conv2D layer with zeros and make it not trainable, so it will produce all-zero output.
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, Concatenate
from tensorflow.keras.models import Model

batch, H, W, F, C, P = 32, 28, 28, 3, 5, 6
X = np.random.uniform(0, 1, (batch, H, W, F))
inp = Input((H, W, F))
x_c = Conv2D(C, 3, padding='same')(inp)  # BxNxNxC
x_p = Conv2D(P, 3, padding='same', kernel_initializer='zeros', name='zeros')(inp)  # BxNxNxP
concat = Concatenate()([x_c, x_p])  # BxNxNx(C+P)
model = Model(inp, concat)
model.get_layer('zeros').trainable = False # important
model.summary()
# check if zeros
model.predict(X)[:,:,:,-P:].sum() # 0
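As a side note, not part of the original answer: a minimal alternative sketch using tf.pad, which can append the P zero channels in a single op and copes with the unknown batch dimension (the shapes below reuse the ones from this example):
import tensorflow as tf
from tensorflow.keras.layers import Input, Lambda
from tensorflow.keras.models import Model

P = 6
inp = Input((28, 28, 3))                                                     # BxNxNxC
padded = Lambda(lambda t: tf.pad(t, [[0, 0], [0, 0], [0, 0], [0, P]]))(inp)  # BxNxNx(C+P)
Model(inp, padded).summary()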
I'm currently working on building a convolutional neural network (CNN) that will work on financial time series data. The input shape is (100, 40) - 100 time stamps by 40 features.
The CNN that I'm using uses asymmetric kernel sizes (i.e. 1 x 2 and 4 x 1) and also asymmetric strides (i.e. 1 x 2 for the 1 x 2 layers and 1 x 1 for the 4 x 1 layers).
In order to keep the height dimension at 100, I needed to pad the data. In my research, I noticed that people who use TensorFlow or Keras simply use padding='same', but this option is apparently unavailable in PyTorch.
According to some answers in What is the difference between 'SAME' and 'VALID' padding in tf.nn.max_pool of tensorflow?, and also this answer on the PyTorch discussion forum, I can manually calculate how I need to pad my data, and use torch.nn.ZeroPad2d to solve the problem - since apparently normal torch.nn.Conv2d layers don't support asymmetric padding (I believe that the total padding I need is 3 in height and 0 in width).
I tried this code:
import torch
import torch.nn as nn
conv = nn.Conv2d(1, 1, kernel_size=(4, 1))
pad = nn.ZeroPad2d((0, 0, 2, 1)) # Add 2 to top and 1 to bottom.
x = torch.randint(low=0, high=9, size=(100, 40))
x = x.unsqueeze(0).unsqueeze(0)
y = pad(x)
x.shape # (1, 1, 100, 40)
y.shape # (1, 1, 103, 40)
print(conv(x.float()).shape)
print(conv(y.float()).shape)
# Output
# x -> (1, 1, 97, 40)
# y -> (1, 1, 100, 40)
It does work, in the sense that the data shape remains the same. However, is there really no padding='same' option available? Also, how can we decide which side to pad?
I had the same issue some time ago, so I implemented it myself using a ZeroPad2d layer as you are trying to do. Here is the right formula:
from functools import reduce
from operator import __add__
kernel_sizes = (4, 1)
# Internal parameters used to reproduce TensorFlow "same" padding.
# For some reason, the padding dimensions are reversed with respect to the kernel
# sizes: width comes first, then height in the 2D case.
conv_padding = reduce(__add__,
    [(k // 2 + (k - 2 * (k // 2)) - 1, k // 2) for k in kernel_sizes[::-1]])
pad = nn.ZeroPad2d(conv_padding)
conv = nn.Conv2d(1, 1, kernel_size=kernel_sizes)
y = pad(x)
print(x.shape)                # (1, 1, 100, 40)
print(conv(y.float()).shape)  # (1, 1, 100, 40)
Also, as mentioned by @akshayk07 and @Separius, I can confirm that it is the dynamic nature of PyTorch that makes this hard. Here is a post about this point from a PyTorch developer.
It looks like there is now, in PyTorch 1.9.1, according to the docs:
padding='valid' is the same as no padding. padding='same' pads the input so the output has the same shape as the input. However, this mode doesn't support any stride values other than 1.
padding='same' and padding='valid' are possible in PyTorch 1.10.0+. However, padding='same' does not support stride values other than 1.
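For example, with a recent PyTorch this reproduces the shapes from the question directly (a minimal sketch):
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 1, kernel_size=(4, 1), padding='same')
x = torch.randn(1, 1, 100, 40)
print(conv(x).shape)  # torch.Size([1, 1, 100, 40])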
I have a tensor of shape [64, 270] (64 batches * 270 items) and want to add the same 200 additional items to each batch (a tensor of shape [200]). The result should be that each of the 64 batches contains their original 270 items plus the 200 new items that are the same for each batch.
basically concat([64, 270], [200]) --> [64, 470]
How can I do that? I tried using tf.concat, tf.stack, increasing the rank of the second tensor using tf.expand_dims but nothing works. It always either complains about the unequal rank or unequal zeroth (batch) dimension.
You can try:
tf.concat([x, tf.tile(y[None, ...], [tf.shape(x)[0], 1])], axis=1)
Code:
import numpy as np
import tensorflow as tf  # this answer uses TF1-style placeholders and sessions

x = tf.placeholder(tf.float32, [None, 270])
y = tf.placeholder(tf.float32, (200,))
z = tf.concat([x, tf.tile(y[None, ...], [tf.shape(x)[0], 1])], axis=1)
with tf.Session() as sess:
print(sess.run(z, {x:np.random.normal(size=(64,270)), y:np.random.normal(size=(200))}).shape)
# (64, 470)
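The same tile-and-concat also works in TF2 eager mode (a sketch, not part of the original answer):
import tensorflow as tf

x = tf.random.normal((64, 270))
y = tf.random.normal((200,))
z = tf.concat([x, tf.tile(y[None, :], [tf.shape(x)[0], 1])], axis=1)
print(z.shape)  # (64, 470)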
I have two tensors of shape N x D1 and M x D2 where D1 > D2, called X and Y respectively. For my task, X acts as the input and Y acts as the filter.
I want to calculate a matrix P of shape N x M x (D1-D2+1) such that:
P[0,0,0] = dot(X[0,0:D2], Y[0,:])
P[0,0,1] = dot(X[0,1:D2+1], Y[0,:])
...
P[N-1,M-1,D1-D2] = dot(X[N-1,D1-D2:D1], Y[M-1,:])
I can create a for loop and manually slide Y to calculate the dot products.
However, I would prefer to use the correlation operator.
As far as I know, TensorFlow has a correlation operator implemented (https://www.tensorflow.org/versions/master/api_docs/python/nn/convolution), but I don't know how I can use my tensors as the inputs and filters.
tf.nn.conv2d(input, filter, strides, padding, use_cudnn_on_gpu=None, data_format=None, name=None)
In your case, I'd set strides to 1, and padding to SAME.
tf.nn.conv2d(X, Y, strides=1, padding='SAME')
Yes, you can indeed use tf.nn.conv2d(), but you should add both batch and channel dimensions:
X = tf.expand_dims(tf.expand_dims(X, 0), -1)
# X.shape = [batch=1, in_height, in_width, in_channels=1]
Y = tf.expand_dims(tf.expand_dims(Y, -1), -1)
# Y.shape = [filter_height, filter_width, in_channels=1, out_channels=1]
# Convolution (actually correlation, see doc of conv2d)
xcorr = tf.nn.conv2d(X, Y, padding="VALID", strides=[1, 1, 1, 1])
# Padding should be VALID, since you've already padded your input
CAVEAT: However, you cannot extrapolate this approach to batches of signals, since tf.nn.conv2d always uses the same filter over the batch dimension, and from my understanding you do want it to change.
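For a single filter row (M = 1), a quick shape check of this approach could look like this (the sizes here are made up purely for illustration):
import tensorflow as tf

N, D1, D2 = 5, 10, 4
X = tf.random.normal((N, D1))
Y = tf.random.normal((1, D2))                   # a single filter row, i.e. M = 1

X = tf.expand_dims(tf.expand_dims(X, 0), -1)    # [1, N, D1, 1]
Y = tf.expand_dims(tf.expand_dims(Y, -1), -1)   # [1, D2, 1, 1]
xcorr = tf.nn.conv2d(X, Y, padding="VALID", strides=[1, 1, 1, 1])
print(xcorr.shape)                              # (1, 5, 7, 1), i.e. (1, N, D1 - D2 + 1, 1)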