keras max pooling layer after embedding layer - python

I have embeddings for 7 words, each of dimension 10, so overall I have a matrix of 7 rows and 10 columns.
After applying this embedding (using an Embedding layer), I get an output of shape (None, 7, 1, 10).
I want a vector whose i'th element is the maximum of row i, for each i in {0, ..., 6}. Hence, it should be a vector of length 7 (with shape (None, 7)).
For this purpose, I first use a Reshape layer to change the shape to (7, 10) (otherwise, it does not work).
Then I tried to use GlobalMaxPooling1D (and also MaxPooling1D), but the output does not have the shape I wanted: it is (None, 10) instead of (None, 7).
I've checked GlobalMaxPooling1D on a separate example, and it works well. It must be something to do with the additional dimension that the embedding layer pushes in the middle.
p.s. A similar question has been asked here:
Max over time pooling in Keras
But I don't think the answer suggested there answers the original question (it seems to return the maximum of the entire matrix).
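A minimal sketch of one way to get the (None, 7) output described above (the vocabulary size of 100 is an assumption for illustration): take the maximum over the embedding axis (the last axis) rather than over the word axis, which is the axis GlobalMaxPooling1D pools over.

from tensorflow.keras import layers, models, backend as K

# Sketch only: 7 words, 10-dim embeddings, assumed vocabulary size of 100.
model = models.Sequential([
    layers.Input(shape=(7,)),
    layers.Embedding(input_dim=100, output_dim=10),   # -> (None, 7, 10)
    layers.Lambda(lambda t: K.max(t, axis=-1)),        # max over the embedding axis -> (None, 7)
])
print(model.output_shape)  # (None, 7)

An equivalent route after the Reshape to (7, 10) would be Permute((2, 1)) followed by GlobalMaxPooling1D, since that swaps which axis the pooling runs over.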

Related

Confusion regarding dimensions in CNN

The above picture, generated using Matlab's deep learning toolbox, shows the architecture of a CNN created for a toy example. The input image is of size 25*20*7, there are 15 filters, each of size 5*5, and the padding is same. The output of the first convolution, conv1, is 25*20*15, which goes into the maxpooling1 operation with a pool size of 2*2, stride 1 and padding same.
Based on my understanding, the role of maxpooling is to perform dimension reduction. However, since the padding in my code is set to same, I understand that the output of maxpooling preserves the spatial dimensions of its input, which is 25*20*15. That is why the output of maxpooling1, and of every later maxpooling layer, has the same dimensions as its input, and there is no change in the dimensions in the remaining layers. As an example, with a 2*2 pool, stride 1 and a total padding of 1, the first output dimension of maxpooling should be (25 - 2 + 1)/1 + 1 = 25. Similarly, the second dimension should be (20 - 2 + 1)/1 + 1 = 20. Thus, the output of maxpooling should be 25*20*15.
This implies that maxpooling is not doing dimension reduction. Therefore, should I remove maxpooling if the padding option is set to same?
Please let me know how the dimensions can stay the same after maxpooling and, if they do stay the same, whether I should remove this operation. Or did I make a mistake somewhere?
The role of padding is different for convolutional and maxpooling layers. If padding=same in a convolutional layer, it means that the output size (primarily height and width) remains the same as the input.
On the other hand, padding in pooling layers has a different purpose. The point of pooling layers is to reduce the spatial dimensions (height and width). In pooling layers, padding=same does not mean that the spatial dimensions stay unchanged; the padding is only there to make up for the cases where the input size and pool size do not fit perfectly.
tldr: If you want to reduce the image size in every step, use padding=valid; it is the default option.
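As a quick illustrative check of the question's setup (a 25x20x15 feature map, a 2x2 pool and stride 1; the snippet below is a sketch, not part of the original post):

from tensorflow.keras import layers, models

# padding='same' with stride 1 preserves the spatial size,
# padding='valid' shrinks each spatial dimension by pool_size - 1.
for pad in ('same', 'valid'):
    m = models.Sequential([
        layers.Input(shape=(25, 20, 15)),
        layers.MaxPooling2D(pool_size=2, strides=1, padding=pad),
    ])
    print(pad, m.output_shape)  # (None, 25, 20, 15) then (None, 24, 19, 15)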
Max pooling is generally used for downsampling and for reducing overfitting.
If you use padding='same', the input is padded to fit the pool size, so no values at the edges are dropped and the output is not shrunk by the edge effect.
In the example below, the input size is 4*4, the pool is 2*2 and the (default) stride is 2*2, so the output is 2*2.
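The snippet this answer refers to is not included in the post; a minimal reconstruction of that 4x4 example might look like this:

import numpy as np
import tensorflow as tf

# 4x4 input, 2x2 pool, default 2x2 stride -> 2x2 output.
x = np.arange(16, dtype='float32').reshape(1, 4, 4, 1)
pool = tf.keras.layers.MaxPooling2D(pool_size=(2, 2), padding='same')
print(pool(x).shape)  # (1, 2, 2, 1)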
Find more examples on keras' official site

What is a linear pooling layer?

What is a linear pooling layer?
What can be the maximum size of a linear pooling kernel?
Do you use dense layers after linear layers?
Same as a normal pooling layer, but along one dimension. I.e., instead of selecting the max response from an n x n window, you select it from a 1 x n window (see the sketch after this list). It probably makes the most sense when the previous output is one-dimensional.
The size of the previous output along the pooled dimension.
Nothing prevents you from doing so. Just do whatever makes sense.
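A small sketch of that idea (the 10-step, 4-channel input shape is just an assumption): MaxPooling1D slides a 1 x n window along a single dimension, whereas MaxPooling2D uses an n x n window.

from tensorflow.keras import layers, models

m = models.Sequential([
    layers.Input(shape=(10, 4)),        # 10 steps, 4 channels (assumed)
    layers.MaxPooling1D(pool_size=2),   # 1 x 2 window along the step axis
])
print(m.output_shape)  # (None, 5, 4)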

Is it possible to implement a max pooling layer in Keras which provides the maximum n values from a given pool size?

I have a multidimensional time series dataset which has the following shape (n_samples, 512, 9) where 512 is the timesteps and 9 are the channels.
After the first 1D CNN layer with 64 kernels my output shape is (n_samples, 512, 64). Now I would like to have my input to next layer which is an LSTM to be of the shape (n_samples, 384, 64).
This could be achieved with a max-pooling layer that returns the maximum 3 values from a pool size of 4, but is it possible to implement this in Keras?
You can probably solve this with a keras.layers.Lambda layer and the backend function tf.nn.in_top_k. Note that its behaviour is somewhat different from tf.nn.top_k, in that it does not pool if all the values are equal!
Now you can define yourself a function that returns the top k values (and does so somewhat efficiently), and then pass it as a function to the Lambda layer.
I sadly haven't worked with Keras enough to type out the specific code, but maybe this is enough to point you in the right direction.
Also, there exists a similar thread for TensorFlow specifically.
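A rough sketch of how such a Lambda might look, using tf.nn.top_k rather than tf.nn.in_top_k and hard-coding the shapes from the question (512 steps, pools of 4, top 3 kept, 64 channels); note that top_k sorts the values, so the original temporal order inside each pool is lost:

import tensorflow as tf
from tensorflow.keras import layers

def top3_of_4(x):
    # x: (batch, 512, 64) -> group timesteps into pools of 4
    x = tf.reshape(x, (-1, 128, 4, 64))       # (batch, 128, 4, 64)
    x = tf.transpose(x, (0, 1, 3, 2))         # (batch, 128, 64, 4)
    x = tf.nn.top_k(x, k=3).values            # keep the 3 largest per pool
    x = tf.transpose(x, (0, 1, 3, 2))         # (batch, 128, 3, 64)
    return tf.reshape(x, (-1, 384, 64))       # (batch, 384, 64)

top3_pool = layers.Lambda(top3_of_4, output_shape=(384, 64))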

How does tensorflow scale RNNCell weight tensors when changing their dimensions?

I'm trying to understand how the weights are scaled in a RNNCell when going from training to inference in tensorflow.
Consider the following placeholders defined as:
data = tf.placeholder(tf.int32,[None,max_seq_len])
targets = tf.placeholder(tf.int32,[None,max_seq_len])
During training the batch_size is set to 10, i.e. both tensors have shape [10, max_seq_len]. However, during inference only one example is used, not a batch of ten, so the tensors have shape [1, max_seq_len].
Tensorflow handles this dimension change seamlessly; however, I'm uncertain how it does this.
My hypothesis is that the weight tensors in the RNNCell are actually of shape [1, hidden_dim], and that scaling to larger batch sizes is achieved by broadcasting, but I'm unable to find anything that reflects this in the source. I've read through the rnn source and the rnn cell source. Any help with understanding this would be much appreciated.
You have defined your data tensor as data = tf.placeholder(tf.int32,[None,max_seq_len]), which means that the first dimension will change according to the input but the second dimension will always remain max_seq_len.
So if max_seq_len = 5, then your feed shape can be [1, 5], [2, 5], [3, 5], etc., which means you can change the first dimension but not the second one.
If you change the second dimension to a number other than 5, it will throw a shape-mismatch error or a similar error.
Your input's first dimension, which is the batch_size, does not affect the weight matrix of any of the neurons in your network.
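A small NumPy sketch of why the batch size never touches the weights (the shapes are assumed, modelled on a basic RNN cell whose kernel has shape [input_dim + hidden_dim, hidden_dim]):

import numpy as np

input_dim, hidden_dim = 8, 16
kernel = np.random.randn(input_dim + hidden_dim, hidden_dim)  # independent of batch size

for batch_size in (1, 10):
    x = np.random.randn(batch_size, input_dim)    # current input
    h = np.zeros((batch_size, hidden_dim))        # previous state
    new_h = np.tanh(np.concatenate([x, h], axis=1) @ kernel)
    print(new_h.shape)  # (1, 16) then (10, 16) -- same kernel, no rescaling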

Input shape and Conv1d in Keras

The first layer of my neural network is like this:
model.add(Conv1D(filters=40,
                 kernel_size=25,
                 input_shape=x_train.shape[1:],
                 activation='relu',
                 kernel_regularizer=regularizers.l2(5e-6),
                 strides=1))
If my input shape is (600, 10), I get (None, 576, 40) as the output shape.
If my input shape is (6000, 1), I get (None, 5976, 40) as the output shape.
So my question is: what exactly is happening here? Is the first example simply ignoring 90% of the input?
It is not "ignoring" a 90% of the input, the problem is simply that if you perform a 1-dimensional convolution with a kernel of size K over an input of size X the result of the convolution will have size X - K + 1. If you want the output to have the same size as the input, then you need to extend or "pad" your data. There are several strategies for that, such as add zeros, replicate the value at the ends or wrap around. Keras' Convolution1D has a padding parameter that you can set to "valid" (the default, no padding), "same" (add zeros at both sides of the input to obtain the same output size as the input) and "causal" (padding with zeros at one end only, idea taken from WaveNet).
Update
About the questions in your comments. So you say your input is (600, 10). That, I assume, is the size of one example, and you have a batch of examples with size (N, 600, 10). From the point of view of the convolution operation, this means you have N examples, each with a length of 600 (this "length" may be time or whatever else; it's just the dimension along which the convolution works) and, at each of these 600 points, you have vectors of size 10. Each of these vectors is considered an atomic sample with 10 features (e.g. price, height, size, whatever), or, as they are sometimes called in the context of convolution, "channels" (from the RGB channels used in 2D image convolution).
The point is, the convolution has a kernel size and a number of output channels, which is the filters parameter in Keras. In your example, what the convolution does is take every possible slice of 25 contiguous 10-vectors and produce a single 40-vector for each (that is, for every example in the batch, of course). So you go from having 10 features or channels in your input to having 40 after the convolution. It's not that it's using only one of the 10 elements in the last dimension; it's using all of them to produce the output.
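One way to see that all 10 input channels are used (a sketch, not from the original answer): the convolution kernel has shape (kernel_size, input_channels, filters).

from tensorflow.keras import layers, models

conv = layers.Conv1D(filters=40, kernel_size=25)
m = models.Sequential([layers.Input(shape=(600, 10)), conv])
print(conv.get_weights()[0].shape)  # (25, 10, 40): every filter spans all 10 channels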
If the meaning of the dimensions in your input is not what the convolution is interpreting, or if the operation it is performing is not what you were expecting, you may need to either reshape your input or use a different kind of layer.
