I have 1000, 28*28 resolution images. I converted those 1000 images into numpy array and formed a new array with size of (1000,28,28). So, while
creating convolution layer using keras, input shape(X value) is specified as (1000,28,28) and output shape(Y value) as (1000,10). Because I ha
ve 1000 examples are inputs and 10 categories of output.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal',input_shape=(1000,28,28)))
.
.
.
model.fit(train_x,train_y,batch_size=32,epochs=10,verbose=1)
So, while using fit function, it shows ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (1000, 28, 28) as error. Pls help me guys to provide proper input and output dimension for CNN.
Code:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',kernel_initializer='he_normal',input_shape=(4132,28,28)))
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(Dropout(0.4))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(10, activation='softmax'))
model.compile(loss=keras.losses.categorical_crossentropy,optimizer=keras.optimizers.Adam(),metrics=['accuracy'])
model.summary()
train_x = numpy.array([train_x])
model.fit(train_x,train_y,batch_size=32,epochs=10,verbose=1)
You need to change the inputs to 4 dimensions with channel set to 1 : (1000, 28, 28, 1) and you need to change the input_shape of the convolutional layer to (28, 28, 1):
model.add(Conv2D(32, kernel_size=(3, 3),...,input_shape=(28,28,1)))
Your numpy arrays need a fourth dimension, the common standard is to number the samples with the first dimension, so changing (1000, 28, 28) to (1, 1000, 28, 28).
You can read more about this here.
from your input it looks like you are using tensorflow as back end.
In keras the input_shape should always be 3 dimension .
For tensorflow as a backend the input_shape to your model will be
input_shape = [img_height,img_width,channels(depth)]
in your case for tensor flow backend that should be
input_shape = [28,28,1]
and the shape of train_x should be
train_x = [batch_size,img_height,img_width,channels(depth)]
in your case
train_x = [1000,28,28,1]
As you are using a gray scale image,the dimension of the image will be (image_height, image_width) and hence you have to add an extra dimension to the image which will result to (image_height, image_width, 1) the '1' suggests the depth of the image,for gray scale that is '1' and for rgb that is '3'.
Related
This is the model I've created:
model.add(Conv2D(64, (5,5), input_shape = (28, 28, 3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, (5, 5)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
# added layers
model.add(Dense(10))
model.add(Activation('softmax'))
model.compile(loss='sparse_categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.fit(X, y, batch_size=256, epochs=25, validation_split=0.3)
But loading an image for prediction as such:
test_image = np.array(img)
test_image = test_image.astype('float32')
test_image /= 255
# image.shape is 28, 28, 3
print((model.predict(test_image)))
results in the following error:
ValueError: Input 0 of layer sequential_11 is incompatible with the layer: : expected min_ndim=4, found ndim=3. Full shape received: (None, 28, 3)
X.shape is (2163, 28, 28, 3) where 2163 is the number of pictures of 28x28 pixels.
The model expects an input with 4 dimensions. This means that you have to reshape your image with .reshape(n_images, 28, 28,3 ). Now you have added an extra dimension without changing the data . Basically, you need to reshape your data to (n_images, x_shape, y_shape, channels). Try it, Make sure the shape of your input layer is 28,28,3
You need a batch dimension, because Keras is used to that input shape. I suggest you use np.expand_dims:
test_image = np.array(img).astype('float32')
test_image = np.expand_dims(test_image, axis=0)/255
test_image = tf.image.resize_with_pad(test_image, 28, 28)
print((model.predict(test_image)))
I am trying to combine CNN and LSTM for image classification.
I tried the following code and I am getting an error. I have 4 classes on which I want to train and test.
Following is the code:
from keras.models import Sequential
from keras.layers import LSTM,Conv2D,MaxPooling2D,Dense,Dropout,Input,Bidirectional,Softmax,TimeDistributed
input_shape = (200,300,3)
Model = Sequential()
Model.add(TimeDistributed(Conv2D(
filters=16, kernel_size=(12, 16), activation='relu', input_shape=input_shape)))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(TimeDistributed(Conv2D(
filters=24, kernel_size=(8, 12), activation='relu')))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(TimeDistributed(Conv2D(
filters=32, kernel_size=(5, 7), activation='relu')))
Model.add(TimeDistributed(MaxPooling2D(pool_size=(2, 2),strides=2)))
Model.add(Bidirectional(LSTM((10),return_sequences=True)))
Model.add(Dense(64,activation='relu'))
Model.add(Dropout(0.5))
Model.add(Softmax(4))
Model.compile(loss='sparse_categorical_crossentropy',optimizer='adam')
Model.build(input_shape)
I am getting the following error:
"Input tensor must be of rank 3, 4 or 5 but was {}.".format(n + 2))
ValueError: Input tensor must be of rank 3, 4 or 5 but was 2.
I found a lot of problems in the code:
your data are in 4D so simple Conv2D are ok, TimeDistributed is not needed
your output is 2D so set return_sequences=False in the last LSTM cell
your last layers are very messy: no need to put a dropout between a layer output and an activation
you need categorical_crossentropy and not sparse_categorical_crossentropy because your target is one-hot encoded
LSTM expects 3D data. So you need to pass from 4D (the output of convolutions) to 3D. There are two possibilities you can adopt: 1) make a reshape (batch_size, H, W * channel); 2) (batch_size, W, H * channel). In this way, u have 3D data to use inside your LSTM
here a full model example:
def ReshapeLayer(x):
shape = x.shape
# 1 possibility: H,W*channel
reshape = Reshape((shape[1],shape[2]*shape[3]))(x)
# 2 possibility: W,H*channel
# transpose = Permute((2,1,3))(x)
# reshape = Reshape((shape[1],shape[2]*shape[3]))(transpose)
return reshape
model = Sequential()
model.add(Conv2D(filters=16, kernel_size=(12, 16), activation='relu', input_shape=input_shape))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Conv2D(filters=24, kernel_size=(8, 12), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Conv2D(filters=32, kernel_size=(5, 7), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2),strides=2))
model.add(Lambda(ReshapeLayer)) # <========== pass from 4D to 3D
model.add(Bidirectional(LSTM(10, activation='relu', return_sequences=False)))
model.add(Dense(nclasses,activation='softmax'))
model.compile(loss='categorical_crossentropy',optimizer='adam')
model.summary()
here the running notebook
I have train & test image data for which Shapes are given below.
X_test.shape , y_test.shape , X_train.shape , y_train.shape
((277, 128, 128, 3), (277, 1), (1157, 128, 128, 3), (1157, 1))
I am training a model
def baseline_model():
filters = 100
model = Sequential()
model.add(Conv2D(filters, (3, 3), input_shape=(128, 128, 3), padding='same', activation='relu'))
#model.add(Dropout(0.2))
model.add(BatchNormalization())
model.add(Conv2D(filters, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Flatten())
model.add(Conv2D(filters, (3, 3), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(Conv2D(filters, (3, 3), activation='relu', padding='same'))
model.add(Activation('linear'))
model.add(BatchNormalization())
model.add(Dense(512, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))
# Compile model
lrate = 0.01
epochs = 10
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
model.compile(loss='sparse_categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
print(model.summary())
return model
But I am getting an error Given below
Error when checking target: expected dense_35 to have 4 dimensions,
but got array with shape (1157, 1)
Please tell me what mistake I am making and how to fix this. I have attached snapshot of model summary
One thing you have probably forgotten to do is adding a Flatten layer right before the first Dense layer:
model.add(BatchNormalization())
model.add(Flatten()) # flatten the output of previous layer before feeding it to Dense layer
model.add(Dense(512, activation='relu'))
You need it because Dense layer does not flatten its input; rather, it is applied on the last dimension.
Although dense_35 needs to feed with 4 dimension data, according to the error, the network feed with 2 dimension data which is the label vector.
DL beginner here. I'm trying to implement LeNet using Keras and apply it on good ol' MNIST.
class LeNet:
#staticmethod
def build(height, width, depth, classes):
model= Sequential()
inputshape= (height,width, depth)
if K.image_data_format()== 'channels_first':
inputshape= (depth, height, width)
#build model
model.add(Conv2d(20, (5,5), padding= "same", input_shape= inputshape))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size= (2,2), strides=(2,2)))
#replicate what's above again
model.add(Conv2d(50, (5,5), padding= "same"))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size= (2,2), strides=(2,2)))
#Fully Connected Layers
model.add(Flatten())
model.add(Dense(500))
model.add(Activation('relu'))
#softmax
model.add(Dense(classes))
model.add(Activation('softmax'))
return model
Then I load the dataset
print('[INFO] Accessing MNIST...')
dataset= loadmat('mnist-original.mat')
data=dataset['data']
Now I'm trying to reshape the data whose current shape is (784, 70000) by
data= data.reshape(data.shape[0], 28, 28, 1)
But I get an error that says ValueError: cannot reshape array of size 54880000 into shape (784,28,28,1)
Where am I going wrong here? Please help. Thanks.
I'm trying to identify the sequence of images. I've 2 images and I need to identify the 3rd one. All are color images.
I'm getting below error:
ValueError: Error when checking input: expected
time_distributed_1_input to have 5 dimensions, but got array with
shape (32, 128, 128, 6)
This is my layer:
batch_size = 32
height = 128
width = 128
model = Sequential()
model.add(TimeDistributed(Conv2D(32, (3, 3), activation = 'relu'), input_shape=(batch_size, height, width, 2 * 3)))
model.add(TimeDistributed(MaxPooling2D(2, 2)))
model.add(TimeDistributed(BatchNormalization()))
model.add(TimeDistributed(Conv2D(32, (3, 3), activation='relu', padding='same')))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(LSTM(256, return_sequences=True, dropout=0.5))
model.add(Conv2D(3, (3, 3), activation='relu', padding='same'))
model.compile(optimizer='adam')
model.summary()
My input images shapes are:
(128, 128, 2*3) [as I'm concatenating 2 input images]
My output image shape is:
(128, 128, 3)
You have applied the conv layer after Flatten(). This causes error because after flattening the data flowing through the Network is no more a 2D object.
I suggest you to keep the convolutional and the recurrent phases separated. First, you apply convolution to images, training the model to extract their relevant features. Later, you push these features into LSTM layers, so that you can capture also the information hidden in their sequence.
Hope this helps, otherwise let me know.
--
EDIT:
According to the error that you get, it seems that you are also not feeding the exact input shape. Keras is saying: "I need 5 dimensions, but you gave me 4". A TimeDistributed() layers needs a shape such as: (sample, time, width, length, channel). Your input lacks time, apparently.
I suggest you to print your model.summary() before running, and check the layer called time_distributed_1_input. That's the one your compiler is upset with.