This question already has answers here:
Multiple images input to the same CNN using Conv3d in keras
(2 answers)
Closed 3 years ago.
I have a dataset of 15 classes with 460 images in total. I want to feed every sequence of 8 images into the same CNN structure at the same time. I use Conv3D to do that, but I'm confused about the input shape and it returns an error.
This is my model:
IMAGE_DIMS = (8, 460, 60, 60, 3)
data = []
labels = []
# loading images...
imagePaths = "dataset\\path"
listing = os.listdir(imagePaths)
for imagePath in listing:
    image_fold = os.listdir(imagePaths + "\\" + imagePath)
    for file in image_fold:
        im = (imagePaths + "\\" + imagePath + "\\" + file)
        image = cv2.imread(im)
        image = cv2.resize(image, (IMAGE_DIMS[2], IMAGE_DIMS[3]))
        image = img_to_array(image)
        data.append(image)
        label = imagePath.split(os.path.sep)[-1]
        labels.append(label)
# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
# binarize the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.2, random_state=42)
model = Sequential()
sample= IMAGE_DIMS[0]
frame=IMAGE_DIMS[1]
height = IMAGE_DIMS[2]
width=IMAGE_DIMS[3]
channels=IMAGE_DIMS[4]
classes=len(lb.classes_)
inputShape = (sample, frame, height, width, channels)
chanDim = -1
if K.image_data_format() == "channels_first":
    inputShape = (sample, frame, channels, height, width)
    chanDim = 1
model.add(Conv3D(32, (3, 3, 3), padding="same", batch_input_shape=inputShape))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding="same", data_format="channels_last"))
model.add(Dropout(0.25))
model.add(Conv3D(64, (3, 3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling3D(pool_size=(2, 2, 2), padding="same", data_format="channels_last"))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.5))
# softmax classifier
model.add(Dense(classes))
model.add(Activation("softmax"))
model.summary()
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer= opt, metrics=["accuracy"])
H = model.fit(trainX, trainY, batch_size=BS, epochs=EPOCHS, verbose=1, validation_data=(testX, testY))
This is my model summary: [output of model.summary() omitted]
But I get the following error:
ValueError: Error when checking input: expected conv3d_1_input to have 5 dimensions, but got array with shape (368, 60, 60, 3)
How can I fix the error? Can anyone please help me? I will be thankful for any help. I know the problem is with the input shape, and the traceback points to the model.fit step. I think trainX, testX, trainY, testY must be 5-dimensional, but I am not able to do that.
If I understand correctly, you would like to fit your model with 8 images at a time, which is actually called a batch. So when you call model.fit(), set batch_size = 8.

Another point that, I think, you are confused about is the input shape. If you would like to fit images to the network, your input shape is the height x width of the image plus the number of channels, which in your case is RGB. So set input_shape = (3, 60, 60). Be aware that the network structure does not include the total number of images in it, because the network does not need to know the size of the training set. When you fit the training images to the network, it will just take a batch of them and do the training job.

Lastly, instead of using a 3D convolution layer, you need to use 2D. Think of it as a 2D frame that moves over the training image and repeats the movement for each channel. Therefore, the frame needs a 2D shape; set it to (x, x). This frame is called a kernel in the documentation.
The following code is just a sample and has not been tested. I hope it helps you understand the structure:
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(3, 60, 60)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(number_of_classes))
model.add(Activation('softmax'))
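If it helps, here is a minimal, untested sketch of compiling and fitting with batches of 8, using the trainX/trainY split from the question (the optimizer and epoch count are arbitrary choices for illustration):

# hedged sketch: feed 8 images per training step via batch_size
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(trainX, trainY,
          batch_size=8,    # a batch of 8 images at a time
          epochs=10,       # arbitrary epoch count for illustration
          validation_data=(testX, testY))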
This is my first ML project done without any tutorials so apologies if this is a silly question.
Anyway, I'm making a CNN classifier that simply puts images in 1 of 2 categories (slouched or straight back).
I can't seem to get the input images used for prediction to match the dimensions that the model accepts. It is supposed to input 100x100 grayscale images. It keeps on returning this error:
WARNING:tensorflow:Model was constructed with shape (None, 100, 100, 1) for input KerasTensor(type_spec=TensorSpec(shape=(None, 100, 100, 1), dtype=tf.float32, name='reshape_input'), name='reshape_input', description="created by layer 'reshape_input'"), but it was called on an input with incompatible shape (None, 100).
I have tried various methods of resizing, reshaping, and grayscaling (even though it is not in the code snippet), but I can't find something that works.
Here are the relevant bits of code:
print(slouchClassifier.predict(cv2.cvtColor(cv2.resize(frame, (100,100)), cv2.COLOR_BGR2GRAY)))
def predict(self, image):
    return self.model.predict(image)
def trainModel(self, training, epochs=1):
    x = keras.utils.image_dataset_from_directory(
        training,
        labels='inferred',
        label_mode='categorical',
        class_names=None,
        color_mode='grayscale',
        batch_size=32,
        image_size=(100, 100),
    )
    self.model.fit(x, batch_size=self.batchSize, verbose=2, epochs=epochs)  # verbose = status updates on training
def createModel(self):
    # Model
    model = Sequential()
    model.add(Reshape((100, 100, 1)))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.1))
    model.add(Flatten())
    model.add(Dense(1024))
    model.add(Activation('relu'))
    model.add(Dense(512))
    model.add(Activation('relu'))
    model.add(Dense(64))
    model.add(Activation('relu'))
    model.add(Dense(2))
    model.add(Activation('softmax'))
    # initiate RMSprop optimizer
    opt = keras.optimizers.RMSprop(learning_rate=0.0001)
    # Let's train the model using RMSprop
    model.compile(loss='categorical_crossentropy',
                  optimizer=opt,
                  metrics=['accuracy'])
    return model
I hope this is enough information for you! Let me know if you need more information about the code.
Thanks in advance :)
tensorflow:Model was constructed with shape (None, 100, 100, 1)
Your model expects an input shape of (100, 100, 1). However, you used this to prepare the input:
slouchClassifier.predict(cv2.cvtColor(cv2.resize(frame, (100,100)), cv2.COLOR_BGR2GRAY))
cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) gets rid of the channel dimension entirely, resulting in an image of shape (100, 100). You need to add the channel dimension back. One way is np.expand_dims(img, -1).
For example,
img = cv2.cvtColor(cv2.resize(frame, (100, 100)), cv2.COLOR_BGR2GRAY)
img = np.expand_dims(img, -1)
pred = slouchClassifier.predict(img)
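Note that model.predict usually also expects a leading batch dimension (the None in the warning's (None, 100, 100, 1)). If the call still complains about the shape, a further sketch would be:

img = np.expand_dims(img, 0)  # add a batch dimension: shape becomes (1, 100, 100, 1)
pred = slouchClassifier.predict(img)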
I'm following this tutorial from Nabeel Ahmed to create your own emotion detector using Keras (I'm a noob) and I've found a strange behaviour that I'd like to understand. The input data is a bunch of 48x48 images, each one with an integer value between 0 and 6 (each number stands for an emotion label), which represents the emotion present in the image.
train_X.shape -> (28709, 2304) // training-data, 28709 images of 48x48
train_Y.shape -> (28709,) //The emotion present in each image as an integer, 1 = happiness, 2 = sadness, etc.
val_X.shape -> (3589, 2304)
val_Y.shape -> (3589, )
In order to feed the data into the model, train_X and val_X are reshaped (as the tutorial explains)
train_X.shape -> (28709, 48, 48, 1)
val_X.shape -> (3589, 48, 48, 1)
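For reference, the reshape step is roughly this (a sketch, assuming plain NumPy arrays):

train_X = train_X.reshape(-1, 48, 48, 1)  # (28709, 2304) -> (28709, 48, 48, 1)
val_X = val_X.reshape(-1, 48, 48, 1)      # (3589, 2304)  -> (3589, 48, 48, 1)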
The model, as it is in the tutorial, is this one:
model = Sequential()
input_shape = (48,48,1)
#1st convolution layer
model.add(Conv2D(64, (5, 5), input_shape=input_shape,activation='relu', padding='same'))
model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
#2nd convolution layer
model.add(Conv2D(128, (5, 5),activation='relu',padding='same'))
model.add(Conv2D(128, (5, 5),activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
#3rd convolution layer
model.add(Conv2D(256, (3, 3),activation='relu',padding='same'))
model.add(Conv2D(256, (3, 3),activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))
################################################################
model.add(Dense(7)) # <- problematic line
################################################################
model.add(Activation('softmax'))
my_optimiser = tf.keras.optimizers.Adam(
    learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False,
    name='Adam')
model.compile(loss='categorical_crossentropy', metrics=['accuracy'],optimizer=my_optimiser)
However, when I try to use it with the tutorial snippet, I get an error at the validation_data line, like this:
history = model.fit(train_X,
                    train_Y,
                    batch_size=64,
                    epochs=80,
                    verbose=1,
                    validation_data=(val_X, val_Y),
                    shuffle=True)
ValueError: Shapes (None, 1) and (None, 7) are incompatible
After reviewing the code and the documentation for the fit method, my only idea was to change the 7 in the last Dense layer of the model to 1, which mysteriously works. I'd like to know what is happening here, if anyone could give me a hint.
You seem to be working with sparse integer labels, where each sample belongs to one of seven classes {0, 1, 2, 3, 4, 5, 6}, so I would recommend using SparseCategoricalCrossentropy instead of CategoricalCrossentropy as your loss function. Just change this parameter and your model should work fine. If you want to use CategoricalCrossentropy, you will have to one-hot encode your labels, for example with:
train_Y = tf.keras.utils.to_categorical(train_Y, num_classes=7)
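With the sparse loss, the compile call from the snippet above would become something like:

model.compile(loss='sparse_categorical_crossentropy',
              metrics=['accuracy'],
              optimizer=my_optimiser)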
I want to feed 8 images at the same time into the same CNN structure using Conv3D. My CNN model is as follows:
def build(sample, frame, height, width, channels, classes):
    model = Sequential()
    inputShape = (sample, frame, height, width, channels)
    chanDim = -1
    if K.image_data_format() == "channels_first":
        inputShape = (sample, frame, channels, height, width)
        chanDim = 1
    model.add(Conv3D(32, (3, 3, 3), padding="same", input_shape=inputShape))
    model.add(Activation("relu"))
    model.add(BatchNormalization(axis=chanDim))
    model.add(MaxPooling3D(pool_size=(2, 2, 2), padding="same", data_format="channels_last"))
    model.add(Dropout(0.25))
    model.add(Conv3D(64, (3, 3, 3), padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(axis=chanDim))
    model.add(MaxPooling3D(pool_size=(2, 2, 2), padding="same", data_format="channels_last"))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(128))  # (Dense(1024))
    model.add(Activation("relu"))
    model.add(BatchNormalization())
    model.add(Dropout(0.5))
    # softmax classifier
    model.add(Dense(classes))
    model.add(Activation("softmax"))
The training of the model is as follows:
IMAGE_DIMS = (57, 8, 60, 60, 3)  # since I have 460 images: 57 samples with 8 images each
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
# binarize the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
# note: data is a list of all dataset images
(trainX, testX, trainY, testY) = train_test_split(data, labels, test_size=0.2, random_state=42)
aug = ImageDataGenerator(rotation_range=25, width_shift_range=0.1, height_shift_range=0.1, shear_range=0.2, zoom_range=0.2, horizontal_flip=True, fill_mode="nearest")
# initialize the model
model = CNN_Network.build(sample=IMAGE_DIMS[0], frame=IMAGE_DIMS[1],
                          height=IMAGE_DIMS[2], width=IMAGE_DIMS[3],
                          channels=IMAGE_DIMS[4], classes=len(lb.classes_))
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer= opt, metrics=["accuracy"])
# train the network
model.fit_generator(
    aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY),
    steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS, verbose=1)
I am confused about the input_shape. I know Conv3D requires 5D input; the input_shape itself is 4D, with the batch dimension added by Keras. But I have the following error:
ValueError: Error when checking input: expected conv3d_1_input to have 5 dimensions, but got array with shape (92, 60, 60, 3)
Can anyone please help me? What can I do, and where does the 92 come from? I set input_shape to (57, 8, 60, 60, 3). And what should my input_shape be so that 8 colored images are fed into the model at the same time?
In Keras (Python 3), the input shape can be as follows:
input_shape = (8, 64, 64, 1)
Where:
Value 1 (8) is the number of frames
Value 2 (64) is the width
Value 3 (64) is the height
Value 4 (1) is the number of channels
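Applied to the question, the per-sample shape excludes both the dataset size (57) and the batch dimension that Keras adds. A rough, untested sketch (assuming the first 456 of the 460 images are used, so they divide evenly into 57 sequences of 8):

# group the flat (456, 60, 60, 3) image array into 57 sequences of 8 frames
data = data[:456].reshape(57, 8, 60, 60, 3)
# per-sample input shape: (frames, height, width, channels); Keras adds the batch dim
model.add(Conv3D(32, (3, 3, 3), padding="same", input_shape=(8, 60, 60, 3)))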
I'm building a Keras convolutional neural network model for predicting the correct class and classifying the tested objects. The model has Conv2D, Activation, MaxPooling, Dropout, Flatten, and Dense layers. I then train the network on a large dataset, but training takes a very long time, possibly 3 or 4 days. What I need is to reduce the time required to train the network. Is there any way to do that in Python?
I have tried to optimize the learning rate by using the LR_Finder class as follows:
from LR_Finder import LRFinder
lr_finder = LRFinder(min_lr=1e-5,max_lr=1e-2, steps_per_epoch=np.ceil(len(trainX) // BS), epochs=100)
But this also did not give me any reduction in the time required.
This is the code of my model:
class SmallerVGGNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model along with the input shape to be
        # "channels last" and the channels dimension itself
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1
        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1
        # CONV => RELU => POOL
        model.add(Conv2D(32, (3, 3), padding="same",
                         input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(3, 3)))
        model.add(Dropout(0.25))
        # (CONV => RELU) * 2 => POOL
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(64, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))
        # (CONV => RELU) * 2 => POOL
        model.add(Conv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(128, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))
        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(1024))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.5))
        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))
        # return the constructed network architecture
        return model
and after that I trained the model with the following code:
EPOCHS = 100
INIT_LR = 1e-3
BS = 32
IMAGE_DIMS = (96, 96, 3)
data = []
labels = []
# grab the image paths and randomly shuffle them
imagePaths = sorted(list(paths.list_images("Dataset")))
random.seed(42)
random.shuffle(imagePaths)
# loop over the input images
for imagePath in imagePaths:
    # load the image, pre-process it, and store it in the data list
    image = cv2.imread(imagePath)
    image = cv2.resize(image, (IMAGE_DIMS[1], IMAGE_DIMS[0]))
    image = img_to_array(image)
    data.append(image)
    label = imagePath.split(os.path.sep)[-2]
    labels.append(label)
# scale the raw pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
labels = np.array(labels)
print("[INFO] data matrix: {:.2f}MB".format(data.nbytes / (1024 * 1000.0)))
# binarize the labels
lb = LabelBinarizer()
labels = lb.fit_transform(labels)
# partition the data into training and testing splits using 80% of
# the data for training and the remaining 20% for testing
(trainX, testX, trainY, testY) = train_test_split(data,
    labels, test_size=0.2, random_state=42)
# construct the image generator for data augmentation
aug = ImageDataGenerator(rotation_range=25, width_shift_range=0.1,
    height_shift_range=0.1, shear_range=0.2, zoom_range=0.2,
    horizontal_flip=True, fill_mode="nearest")
# initialize the model
model = SmallerVGGNet.build(width=IMAGE_DIMS[1], height=IMAGE_DIMS[0],
    depth=IMAGE_DIMS[2], classes=len(lb.classes_))
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt,
    metrics=["accuracy"])
print("model compiled in few minutes successfully ^_^")
# train the network
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS, verbose=1)
According to this code, I expected the training to require some minutes, or maybe a few hours, but when it reaches the model.fit_generator step, every epoch actually takes many hours, so training the whole network requires days, or it may crash and stop working. Is there any way to reduce the training time?
Set use_multiprocessing=True and workers > 1 when you call fit_generator, because the default is to execute the generator on the main thread only.
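For example, the fit_generator call from the question might become the following (a sketch; workers=4 is an arbitrary choice, and the speed-up depends on the generator being safe to run in parallel):

H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS, verbose=1,
    workers=4, use_multiprocessing=True)  # prepare batches off the main thread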
I am doing multi-label image classification on a dataset of around 3000 images. Because this is the limit of my working memory and the dataset will grow, I tried to implement my own generator, since I am also parsing the images from an online source. The network reached an accuracy of 25%, where the three labels with the highest accuracy gave a pretty good representation of the images.
A normal batch would be of shape (32, 64, 64, 3) with labels of shape (32, 57).
My model looks like:
def createModel(shape, classes):
    x = shape[0]
    y = shape[1]
    z = shape[2]
    model = Sequential()
    model.add(Conv2D(32, (2, 2), padding='same', input_shape=(x, y, z)))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Conv2D(32, (2, 1), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Conv2D(32, (1, 2), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Conv2D(48, (2, 2), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Conv2D(80, (2, 2), padding='same'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2), strides=2))
    model.add(Flatten())
    model.add(Dense(classes, activation='sigmoid'))
    return model
where classes would be 57 and x, y, z would be 64, 64, 3.
My generator looks like:
def generator(data, urls, labels, batch_size):
    counter = 0
    X_train = []
    Y_train = []
    while 1:
        for i in range(len(data)):
            if counter == batch_size:
                yield (np.array(X_train), np.array(Y_train))
                X_train = []
                Y_train = []
                counter = 0
            try:
                ID = data[i][0]
                if random.uniform(0, 1) > 0.5:
                    X_train.append(getImage(64, urls[ID]))
                else:
                    X_train.append(np.flip(getImage(64, urls[ID]), 1))
                Y_train.append(labels[i])
                counter += 1
            except:
                continue
where data is a list with image IDs and labels, urls is a list with image IDs and the URL where each image can be found, labels are the labels converted by MultiLabelBinarizer() (.fit_transform), and batch_size is the batch size. The getImage() function returns an np.array(), where 64 gives the shape.
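For context, the label preparation is roughly the following (a sketch; rawLabels is a placeholder name for the list of label lists, one per image):

from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
mlb_labels = mlb.fit_transform(rawLabels)  # shape (n_images, 57): one binary column per label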
The main calls:
epochs = 60
lr = 1e-6
mlc = model.createModel((64, 64, 3), 57)
opt = Adam(lr=lr, decay=lr / epochs)
trainGenerator = data.generator(structuredData, urls, mlb_labels, 32)
validationGenerator = data.generator(structuredData, urls, mlb_labels, 32)
mlc.compile(loss="categorical_crossentropy", optimizer=opt,
    metrics=["accuracy"])
mlc.fit_generator(trainGenerator, steps_per_epoch=10, epochs=epochs,
    validation_steps=1, validation_data=validationGenerator)
mlc.save("datagenerator_test.h5")
Furthermore, the network already works and trains if I do not use the generator; with the generator it seems to get a random accuracy between 1 and 3%. I hope this provides enough information.
EDIT: It takes about 90 seconds to prepare one batch of 32 images. Does the training wait for a batch to be ready?