How to add Conv2D layers with LSTM in Keras? - python

I'm trying to model a sequence of images: given 2 images, I need to predict the 3rd one. All are color images.
I'm getting the error below:
ValueError: Error when checking input: expected time_distributed_1_input to have 5 dimensions, but got array with shape (32, 128, 128, 6)
This is my model:
batch_size = 32
height = 128
width = 128
model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3), activation = 'relu'), input_shape=(batch_size, height, width, 2 * 3)))
model.add(TimeDistributed(MaxPooling2D(2, 2)))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same')))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(LSTM(256, return_sequences=True, dropout=0.5))
model.add(Conv2D(3, (3, 3), activation='relu', padding='same'))
model.compile(optimizer='adam')
model.summary()
My input image shape is:
(128, 128, 2*3) [as I'm concatenating 2 input images]
My output image shape is:
(128, 128, 3)

You have applied a conv layer after Flatten(). This causes an error because, after flattening, the data flowing through the network is no longer a 2D object.
I suggest keeping the convolutional and recurrent phases separated. First, apply convolutions to the images, training the model to extract their relevant features. Then push these features into LSTM layers, so that you can also capture the information hidden in their sequence.
Hope this helps, otherwise let me know.
--
EDIT:
According to the error you get, it seems that you are also not feeding the exact input shape. Keras is saying: "I need 5 dimensions, but you gave me 4". A TimeDistributed() layer needs a shape such as (samples, time, height, width, channels). Your input apparently lacks the time dimension.
I suggest printing model.summary() before running and checking the layer called time_distributed_1_input. That's the one Keras is upset with.
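A hedged sketch of that idea (the layer sizes and the Dense/Reshape decoder at the end are placeholders, not the only way to do this): instead of concatenating the two frames along the channel axis, treat them as two time steps of shape (128, 128, 3), so the network receives the 5D input that TimeDistributed expects.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (TimeDistributed, Conv2D, MaxPooling2D,
                                     BatchNormalization, Flatten, LSTM,
                                     Dense, Reshape)

height, width, channels = 128, 128, 3
time_steps = 2  # the two input frames

model = Sequential()
# Convolutional phase: applied to each frame independently.
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same'),
                          input_shape=(time_steps, height, width, channels)))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same')))
model.add(TimeDistributed(MaxPooling2D((2, 2))))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Flatten()))
# Recurrent phase: capture the ordering of the two frames.
model.add(LSTM(256))
# Decode back to an image; Dense + Reshape is one simple (memory-hungry) option.
# The sigmoid assumes pixel values scaled to [0, 1].
model.add(Dense(height * width * channels, activation='sigmoid'))
model.add(Reshape((height, width, channels)))

model.compile(optimizer='adam', loss='mse')
model.summary()

# Dummy batch with the 5D shape TimeDistributed expects:
# (samples, time, height, width, channels).
x = np.random.rand(4, time_steps, height, width, channels).astype('float32')
print(model.predict(x).shape)  # (4, 128, 128, 3)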

Related

How to design a CNN in Keras for data of dimensions (2505,10)?

I am designing a neural network for the classification of resting-state EEG signals. I have preprocessed my data such that each subject is characterized by a table consisting of 111 channels and their readings over 2505 timesteps. As a measure of dimensionality reduction, I clustered the 111 channels into the 10 lobes of the brain, effectively reducing the dimension to (2505,10) per subject. Since this data is 2D, I assume it would be analogous to CNNs for grayscale images.
I have compiled the EEG data for each subject into a dataframe of size (253, 2505, 10), where 253 is the number of subjects. The corresponding ground truth values are stored in a list of size (253,1) with the indices matching those from the dataframe. I want to build a classifier which tells if the subject is ADHD positive or negative. I am stuck on designing the neural network, particularly facing a dimensionality issue when passing a subject to the 1st layer.
#where X=[df0, df1, df2,......, df252] & y=[0,1,0,........,1]
# Model configuration
batch_size = 100
no_epochs = 30
learning_rate = 0.001
no_classes = 2
validation_split = 0.2
verbosity = 1
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu', padding='same'))
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
# Fit data to model
i=0 #validation_data=(X_test, y_test),
X_train = np.array(X_train)
y_train = np.array(y_train)
print("X_train:\t")
print(X_train.shape)
print("y_train:\t")
print(y_train.shape)
history = model.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=no_epochs,
                    verbose=verbosity)
ValueError: Input 0 of layer sequential_12 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (None, 2505, 10).
Any help shall be appreciated.
For a Conv2D model, your training data, from an image-processing perspective, needs to be 4-dimensional: (n_observations, n_rows, n_columns, n_channels). Therefore, you have to reshape your features accordingly, using your domain knowledge to decide which arrangement is meaningful:
X_train = np.array(X_train).reshape(253, 2505, 10, 1)
# or: X_train = np.array(X_train).reshape(253, 2505, 1, 10)
# Then the model can be defined as follows:
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape = X_train.shape[1:], padding='same'))
I don't have any experience working with signal data, but what I would like to share is that if your channel columns don't have spatial significance the way image pixels do, then a 2D conv network is not meaningful. For example, if among the 111 channels it makes no meaningful difference whether channel X's data or channel Y's data sits in column 1 (unlike swapping neighbouring pixels in an image), then the sliding windows of a Conv2D are not picking up any significant information. Instead, you can consider a Conv1D or LSTM network. For a Conv1D network you don't need a 4-dimensional X; your current 3-dimensional X is fine. You can try:
model = models.Sequential()
model.add(layers.Conv1D(32, 3, activation='relu', input_shape = X_train.shape[1:], padding='same'))
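A slightly fuller sketch of that Conv1D idea (the layer sizes, pooling choices, and dense head are placeholders; the data here is random, only matching the (253, 2505, 10) shape and binary labels described in the question):
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed shapes from the question: 253 subjects, 2505 timesteps, 10 lobes.
X_train = np.random.rand(253, 2505, 10).astype('float32')
y_train = np.random.randint(0, 2, size=(253,))  # 0 = ADHD negative, 1 = positive

model = models.Sequential([
    layers.Conv1D(32, 3, activation='relu', padding='same', input_shape=(2505, 10)),
    layers.MaxPooling1D(2),
    layers.Conv1D(64, 3, activation='relu', padding='same'),
    layers.GlobalAveragePooling1D(),
    layers.Dense(64, activation='relu'),
    layers.Dense(2),  # two classes, raw logits
])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.summary()
model.fit(X_train, y_train, batch_size=32, epochs=5, validation_split=0.2)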

What does the filter parameter mean in Conv2d layer?

I am getting confused by the filters parameter, which is the first argument of the Conv2D() layer in Keras. As I understand it, the filters are supposed to do things like edge detection, sharpening, or blurring the image, but when I define the model as
input_shape = (32, 32, 3)
model = Sequential()
model.add( Conv2D(64, kernel_size=(5, 5), activation='relu', input_shape=input_shape, strides=(1,1), padding='same') )
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(64, kernel_size=(5, 5), activation='relu', input_shape=input_shape, strides=(1,1), padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2,2)))
model.add(Conv2D(128, kernel_size=(5, 5), activation='relu', input_shape=input_shape, strides=(1,1), padding='same'))
model.add(Flatten())
model.add(Dense(3072, activation='relu'))
model.add(Dense(2048, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
I am not mentioning edge detection, blurring, or sharpening anywhere in the Conv2D call. The input images are 32 by 32 RGB images.
So my question is: when I define the convolution layer as Conv2D(64, ...), does this 64 mean 64 different types of filters, such as vertical edge, horizontal edge, etc., chosen by Keras at random? If so, is the output of the convolution layer (with 64 filters, a 5x5 kernel, and a 1x1 stride) on a 32x32 single-channel image 64 images of size 28x28 each? How are these 64 images combined to form a single image for further layers?
The filters argument sets the number of convolutional filters in that layer. These filters are initialized to small, random values, using the method specified by the kernel_initializer argument. During network training, the filters are updated in a way that minimizes the loss. So over the course of training, the filters will learn to detect certain features, like edges and textures, and they might become something like the image below (from here).
It is very important to realize that one does not hand-craft filters. These are learned automatically during training -- that's the beauty of deep learning.
I would highly recommend going through some deep learning resources, particularly https://cs231n.github.io/convolutional-networks/ and https://www.youtube.com/watch?v=r5nXYc2wYvI&list=PLypiXJdtIca5sxV7aE3-PS9fYX3vUdIOX&index=3&t=3122s.
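As a small check (a sketch, not tied to any particular dataset), you can inspect a freshly created Conv2D layer's weights and see that the 64 filters are just a learnable weight tensor, one 5x5x3 kernel per filter, rather than anything hand-picked:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D

model = Sequential([Conv2D(64, (5, 5), input_shape=(32, 32, 3), padding='same')])

# kernel_initializer (glorot_uniform by default) filled these with small random values.
kernels, biases = model.layers[0].get_weights()
print(kernels.shape)  # (5, 5, 3, 64): one 5x5x3 kernel per output filter
print(biases.shape)   # (64,)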
Just wanted to clarify what the output shape was.
Although jakub's answer was good, I don't think it addressed the "single image for further layers" part of the question.
I did a model.summary() to find out more.
I found that the shape returned from a Conv2D layer is (None, img_height, img_width, num_filters).
So when you pass the output of the Conv2D to MaxPooling, you are passing that whole shape, which means it is basically passing every convolved feature map.
The other layers handle this gracefully. MaxPooling2D(2, 2) returns a tensor of the same rank but with the spatial dimensions halved: (None, img_height / 2, img_width / 2, num_filters).
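For instance, a minimal sketch (with sizes matching the question) whose summary shows exactly those shapes:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D

model = Sequential([
    Conv2D(64, (5, 5), activation='relu', padding='same', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2)),
])
model.summary()
# Conv2D output:       (None, 32, 32, 64)
# MaxPooling2D output: (None, 16, 16, 64)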
Side note: I wish filters were named num_filters, because filters seems to imply you're passing in a list of filters with which to convolve the image.

Negative dimension size caused by subtracting 3 from 1 for 'conv2d_1/convolution' (op: 'Conv2D') with input shapes: [?,1,10000,80], [3,3,80,16]

I am using Keras version 2.3.1 and TensorFlow 2.0.0.
I induce the titular error on my instantiation of the first convolutional layer in my network:
model = Sequential([
Conv2D(16, 3, input_shape=(1, 10000, 80)),
LeakyReLU(alpha=0.01),
MaxPooling2D(pool_size=3),
Conv2D(16, 3),
LeakyReLU(alpha=0.01),
MaxPooling2D(pool_size=3),
Conv2D(16, 3),
LeakyReLU(alpha=0.01),
MaxPooling2D(pool_size=3),
Conv2D(16, 3),
LeakyReLU(alpha=0.01),
MaxPooling2D(pool_size=3),
Dense(256),
LeakyReLU(alpha=0.01),
Dense(32),
LeakyReLU(alpha=0.01),
Dense(1, activation='sigmoid')])
As far as I am aware, the TF dimension ordering should be (samples, rows, columns). My input is an array of shape (1000, 80).
I have tried all of the fixes I have found online, including:
K.common.set_image_dim_ordering('tf')
K.set_image_data_format('channels_last')
K.tensorflow_backend.set_image_dim_ordering('tf')
K.set_image_dim_ordering('tf')
However, all of these either do not change anything (as in the case of the first two) or fail at those lines (the latter two).
None of these fixes will work if the input_shape is wrong. The input_shape for a Conv2D layer should be (height, width, channels); the samples dimension is not included, as it is implicitly inserted by Keras.
The input_shape you gave would be interpreted as an image that is only one pixel tall, which is why subtracting the 3x3 kernel produces a negative dimension. You need to format your input_shape correctly and also add the channels dimension.
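A hedged sketch of one way to set this up, assuming each sample is a (10000, 80) array with a single channel (shapes taken from the error message). Only two conv/pool blocks are kept, and a Flatten() is added before the Dense layers, because the narrow 80-column axis shrinks quickly; treat the exact architecture as a placeholder.
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, LeakyReLU, MaxPooling2D, Flatten, Dense

# Assumed data layout: (samples, 10000, 80); add a trailing channels axis.
X = np.random.rand(8, 10000, 80).astype('float32')
X = X[..., np.newaxis]  # -> (8, 10000, 80, 1)

model = Sequential([
    Conv2D(16, 3, input_shape=(10000, 80, 1)),
    LeakyReLU(alpha=0.01),
    MaxPooling2D(pool_size=3),
    Conv2D(16, 3),
    LeakyReLU(alpha=0.01),
    MaxPooling2D(pool_size=3),
    Flatten(),
    Dense(32),
    LeakyReLU(alpha=0.01),
    Dense(1, activation='sigmoid'),
])
model.summary()
print(model.predict(X).shape)  # (8, 1)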

How does the second convolutional layer in tensorflow keras work?

I have the following model in Keras.
model = Sequential()
model.add(Conv2D(4, (3, 3), input_shape=input_shape, name='Conv2D_0', padding = 'same', use_bias=False, activation=None))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(8, (3, 3), name='Conv2D_1', padding='same', use_bias=False, activation=None))
model.add(MaxPooling2D(pool_size=(2, 2)))
input_shape is (32, 32). So, for the first layer, if I have an image of size (32, 32), I get 4 images of size (32, 32): the input image is convolved with 4 different kernels. After the pooling layer, I get 4 images of size (16, 16).
The second convolutional layer gives me 8 images of size (16, 16). This layer has 4*8 kernels; the kernel tensor has shape (3, 3, 4, 8). But I don't get how the 8 output images are computed.
I thought, for example, that for the first output image I could do something like:
H_i : i-th output image of the first Pooling layer
ker_i : i-th kernel, i.e. the slice (:, :, i, 0)
So the first output image of the second convolutional layer could be:
conv(H_0, ker_0) + conv(H_1, ker_1) + conv(H_2, ker_2) + conv(H_3, ker_3)
But this seems to be wrong.
Can anyone explain to me how the second conv layer computes the output images?
Thank you for your help.

Error in Convolutional Neural network for input shape

I have 1000 images of 28*28 resolution. I converted those 1000 images into a numpy array, forming a new array of size (1000, 28, 28). So, while creating the convolution layer using Keras, the input shape (X value) is specified as (1000, 28, 28) and the output shape (Y value) as (1000, 10), because I have 1000 examples as inputs and 10 categories of output.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal',input_shape=(1000,28,28)))
.
.
.
model.fit(train_x,train_y,batch_size=32,epochs=10,verbose=1)
So, while using the fit function, it shows ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (1000, 28, 28). Please help me work out the proper input and output dimensions for the CNN.
Code:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal',input_shape=(4132,28,28)))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(10, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adam(),metrics=['accuracy'])
model.summary()
train_x = numpy.array([train_x])
model.fit(train_x,train_y,batch_size=32,epochs=10,verbose=1)
You need to change the inputs to 4 dimensions, with the channel set to 1: (1000, 28, 28, 1), and you need to change the input_shape of the convolutional layer to (28, 28, 1):
model.add(Conv2D(32, kernel_size=(3, 3),...,input_shape=(28,28,1)))
Your numpy arrays need a fourth dimension; the common standard is to index the samples with the first dimension, so change (1000, 28, 28) to (1, 1000, 28, 28).
You can read more about this here.
From your input it looks like you are using TensorFlow as the backend.
In Keras the input_shape should always be 3-dimensional.
With TensorFlow as the backend, the input_shape for your model will be
input_shape = [img_height, img_width, channels(depth)]
In your case, with the TensorFlow backend, that should be
input_shape = [28, 28, 1]
and the shape of train_x should be
train_x = [batch_size, img_height, img_width, channels(depth)]
In your case
train_x = [1000, 28, 28, 1]
As you are using grayscale images, the dimension of each image will be (image_height, image_width), and hence you have to add an extra dimension to the image, resulting in (image_height, image_width, 1). The '1' denotes the depth of the image: for grayscale that is 1, and for RGB it is 3.
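A minimal sketch of that reshape, assuming train_x starts as a (1000, 28, 28) array and train_y holds one-hot labels of shape (1000, 10), as described in the question:
import numpy as np

# Assumed starting shapes from the question.
train_x = np.random.rand(1000, 28, 28).astype('float32')
train_y = np.eye(10)[np.random.randint(0, 10, size=1000)]  # one-hot, shape (1000, 10)

# Add the channels axis expected by Conv2D with a TensorFlow backend.
train_x = train_x[..., np.newaxis]  # or np.expand_dims(train_x, -1)
print(train_x.shape)  # (1000, 28, 28, 1)

# The model's first layer should then use input_shape=(28, 28, 1), e.g.:
# model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
#                  kernel_initializer='he_normal', input_shape=(28, 28, 1)))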
