I am trying to create an image classifier using Keras and TensorFlow 2.0.0 backend.
I'm training this model on my local machine on a custom dataset containing a total of 17~ thousand images. The images vary in size and are located in three different folders (training, validation, and test), each containing two subfolders (one for each class).
I tried an architecture similar to VGG16, which yielded more than decent results on this dataset in the past. Note, there is a minor class imbalance in the data (52:48)
When I call fit_generator(), the model doesn't train well; although the training loss lowers slightly throughout the first epoch, it does not change much afterward. Using this architecture with higher regulation, I achieved 85% accuracy after 55~ epochs in the past.
Imports and hyperparameters
import tensorflow as tf
from tensorflow import keras
from keras import backend as k
from keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten, Input, UpSampling2D
from keras.models import Sequential, Model, load_model
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint
TRAIN_PATH = 'data/train/'
VALID_PATH = 'data/validation/'
TEST_PATH = 'data/test/'
TARGET_SIZE = (256, 256)
RESCALE = 1.0 / 255
COLOR_MODE = 'grayscale'
EPOCHS = 2
BATCH_SIZE = 16
CLASSES = ['Damselflies', 'Dragonflies']
CLASS_MODE = 'categorical'
CHECKPOINT = "checkpoints/weights.hdf5"
Model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
In the past, I created a custom pipeline to reshape, grayscale, flip, and normalize the images; then, I trained the model using my CPU on batches of processed images.
I tried repeating the process using ImageDataGenerator, flow_from_directory, and GPU support.
# randomly flip images, and scale pixel values
trainGenerator = ImageDataGenerator(rescale=RESCALE,
horizontal_flip=True,
vertical_flip=True)
# only scale the pixel values validation images
validatioinGenerator = ImageDataGenerator(rescale=RESCALE)
# only scale the pixel values test images
testGenerator = ImageDataGenerator(rescale=RESCALE)
# instanciate train flow
trainFlow = trainGenerator.flow_from_directory(
TRAIN_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode = CLASS_MODE,
shuffle=True
)
# instanciate validation flow
validationFlow = validatioinGenerator.flow_from_directory(
VALID_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode= CLASS_MODE,
shuffle=True
)
Then, fitting the model using fit_generator.
checkpoints = ModelCheckpoint(CHECKPOINT, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
with tf.device('/GPU:0'):
model.fit_generator(
trainFlow,
validation_data=validationFlow,
callbacks=[checkpoints],
epochs=EPOCHS
)
I tried training it for 40 epochs.
The classifier achieves 52% after the first epoch and does not improve as time goes by.
Testing the classifier
testFlow = testGenerator.flow_from_directory(
TEST_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode= CLASS_MODE,
)
ans = model.predict_generator(testFlow)
When I look at the predictions, the model predicts all the test images as the majority class with the same confidence [0.48498476, 0.51501524].
Have I made sure the data is correct?
Yes. I tested whether the generators yield processed images and their corresponding labels correctly.
Have I tried changing the loss function, activation function, and optimizer?
Yes. I tried changing the class mode to binary, the loss to binary_crossentropy, and changing the last layer to produce a single output with sigmoid activation. No, I did not change the optimizer. However, I did try to increase the learning rate.
Have I tried changing the model's architecture?
Yes. I tried increasing and decreasing model complexity.
Both more layers with less regularization and fewer layers with more regularization produced similar results.
Are the layers trainable?
Yes.
Is the GPU support implemented correctly?
I hope so.
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Num GPUs Available: 1
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
config = tf.compat.v1.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
print(sess)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: NVIDIA GeForce GTX 1050 with Max-Q Design, pci bus id: 0000:03:00.0, compute capability: 6.1
<tensorflow.python.client.session.Session object at 0x000001F9443E2CC0>
Have I tried transfer learning?
Not yet.
I found a similar unanswered question from 2017 keras-doesnt-train-using-fit-generator.
Thoughts?
The problem is with your model. I copied your code and ran it on a data set I have used before (which gets high accuracy) and got results similar to yours. I then substituted the simple model below
model = tf.keras.Sequential([
Conv2D(16, 3, padding='same', activation='relu', input_shape=(256 , 256,1)),
MaxPooling2D(),
Conv2D(32, 3, padding='same', activation='relu' ),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(128, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(256, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(128, activation='relu'),
Dropout(.3),
Dense(64, activation='relu'),
Dropout(.3),
Dense(2, activation='softmax')
])
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
The model trained properly. By the way model.fit_generator is depreciated. You can now just use model.fit which can now handle generators. I then took your model and removed all the dropout layers except for the last one and your model trained properly. Code is:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
#model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
#Gerry P,
By accident, I found what's causing the error.
Removing from Keras import backend as k resolved the model's inability to learn.
That's not all. I also identified that the model you defined, not calling ModelCheckpoint, and not customizing class names affected the fitting process.
model = Sequential([
Conv2D(16, 3, padding='same', activation='relu', input_shape=(256 , 256, 1)),
MaxPooling2D(),
Conv2D(32, 3, padding='same', activation='relu' ),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(128, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(256, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(128, activation='relu'),
Dropout(.3),
Dense(64, activation='relu'),
Dropout(.3),
Dense(2, activation='softmax')
])
I commented that import to try and resolve an error that occurred when I copy-pasted your sequential model. Then, I forgot to uncomment it when I tested it beautiful or average dataset. I achieved over 80% accuracy after the third epoch. Then, I reverted the changes and tried it on my dataset, and it failed again.
As a bonus, not importing Keras's backend decreased the time it takes to train the model!
Lately, I had to re-install Keras and TensorFlow because they couldn't detect my GPU anymore. I probably made a mistake and installed an incompatible version of Keras.
CUDA==10.0
tensorflow-gpu==2.0.0
keras==2.3.1
Note, it's still not a 100% solution, and the problems arise every so often.
EDIT:
Whenever it doesn't work, simplify the model.
Changed batch size and stopped learning? Simplify the model.
Augmented the images further and stopped learning? Simplify the model.
Related
I'm following this tutorial from Nabeel Ahmed to create your own emotion detector using Keras (I'm a noob) and I've found a strange behaviour that I'd like to understand. The input data is a bunch of 48x48 images, each one with an integer value between 0 and 6 (each number stands for an emotion label), which represents the emotion present in the image.
train_X.shape -> (28709, 2304) // training-data, 28709 images of 48x48
train_Y.shape -> (28709,) //The emotion present in each image as an integer, 1 = happiness, 2 = sadness, etc.
val_X.shape -> (3589, 2304)
val_Y.shape -> (3589, )
In order to feed the data into the model, train_X and val_X are reshaped (as the tutorial explains)
train_X.shape -> (28709, 48, 48, 1)
val_X.shape -> (3589, 48, 48, 1)
The model, as it is in the tutorial, is this one:
model = Sequential()
input_shape = (48,48,1)
#1st convolution layer
model.add(Conv2D(64, (5, 5), input_shape=input_shape,activation='relu', padding='same'))
model.add(Conv2D(64, (5, 5), activation='relu', padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
#2nd convolution layer
model.add(Conv2D(128, (5, 5),activation='relu',padding='same'))
model.add(Conv2D(128, (5, 5),activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
#3rd convolution layer
model.add(Conv2D(256, (3, 3),activation='relu',padding='same'))
model.add(Conv2D(256, (3, 3),activation='relu',padding='same'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(BatchNormalization())
model.add(Activation('relu'))
model.add(Dropout(0.2))
################################################################
model.add(Dense(7)) # <- problematic line
################################################################
model.add(Activation('softmax'))
my_optimiser = tf.keras.optimizers.Adam(
learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=False,
name='Adam')
model.compile(loss='categorical_crossentropy', metrics=['accuracy'],optimizer=my_optimiser)
However, when I try to use it, using the tutorial snippet, I get an error in the line of the validation_data like this
history = model.fit(train_X,
train_Y,
batch_size=64,
epochs=80,
verbose=1,
validation_data=(val_X, val_Y),
shuffle=True)
ValueError: Shapes (None, 1) and (None, 7) are incompatible
After reviewing the code and the documentation about the fit method, my only idea was to change the 7 in the last Dense layer of the model to 1, which mysteriously works. I'd like to know what is happening here if anyone could give me a hint.
You seem to be working with sparse integer labels, where each sample belongs to one of seven classes {0, 1, 2, 3, 4, 5, 6}, so I would recommend using SparseCategoricalCrossentropy instead of CategoricalCrossentropy as your loss function. Just change this parameter and your model should work fine. If you want to use CategoricalCrossentropy, you will have to one-hot encode your labels, for example with:
train_Y = tf.keras.utils.to_categorical(train_Y, num_classes=7)
I have built a simple CNN using tensorflow-gpu 2.4.0 for cifar10
# Set constent
num_classes = len(label_names) # The number of classes
input_shape = (32, 32, 3) # The input shape 32 * 32 pixels, 3 RGB brightness for each pixel
# Construct a sequential model
model = Sequential()
# Starting add layers to the model
# First VGG blockes
model.add(Conv2D(16, (3, 3), padding='same', activation='relu', input_shape=input_shape))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(Conv2D(16, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(MaxPooling2D((2, 2))) # MaxPooling2D to keep important features as well as shirnk info size
model.add(Dropout(0.1)) # Drop some portion of nodes to increase generality of model
# Second VGG blockes
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
# Third VGG blockes
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.3))
# Fourth VGG blockes
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.4))
# Flat and dense layer
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
# Final softmax layer
model.add(Dense(num_classes, activation='softmax'))
# The optimizer gives the learning rate lr and some other parameters.
opt = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
# Model complie
model.compile(loss='categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
And while I am training this model:
#Train model
epochs = 100
batch_size = 128
# create data generator
datagen = ImageDataGenerator(horizontal_flip=True,
width_shift_range=5,
height_shift_range=5,
rotation_range=20)
history = model.fit(datagen.flow(x_training, y_training, batch_size=batch_size),
epochs=epochs,
validation_data=(x_validate, y_validate),
verbose=1,
shuffle=True)
The RAM of my computer keeps grow until the training process crashes. Adding python garbage collector for at end of each epoch did not work. Disabling eager detection did not work.
I using
tensorflow-gpu 2.4.0
CUDA 11.0
cuDNN 8.0
I am trying to use CNN for feature extraction and XGboost for classification of a image data. I researched and found that it could be done by extracting the data after the convolution layers. I found some source code for similar problem and tried on my own.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3,3), strides=(1,1), padding='same', activation="relu", input_shape = data.shape[1:]))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2)))
model.add(Conv2D(64, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu") )
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(128, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
model.add(MaxPool2D(pool_size=(2,2), strides=(2,2))) #max pool window 2x2
model.add(Flatten())
model.add(Dense(128, activation="relu", name='firstDenseLayer'))
model.add(Dense(1, activation="sigmoid"))
# model.summary()
# print(model)
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
Below i accessed the dense layer named "firstDenseLayer".
import xgboost as xgb
from keras.models import Model
layerName = 'firstDenseLayer'
intermediate_layer_model = Model(inputs=model.input,
outputs=model.get_layer(layerName).output)
intermediate_output = intermediate_layer_model.predict(data)
from xgboost import XGBClassifier
xgbmodel = XGBClassifier(objective='multi:softmax', num_class= 2)
xgbmodel.fit(intermediate_output, label)
xgbmodel.score(intermediate_output, label)
As i am new in this, i have several confusions.
How the data is being flowed. After i extract the features of the pictures via convolution layers, how do i actually access the data from there?
What is this line of code doing? What data is it extracting?
intermediate_output = intermediate_layer_model.predict(data)
When i omit(keep commented out) the below line,
model.fit(data, label, batch_size=16, epochs=10, validation_data=(val_data, val_label))
from the first snippet and run the XGboost model directly the XGboost gives low accuracy and when i don't it gives higher accuracy. Why is it being like that?
Kindly help me out. I am stuck with this for quite a while. I am just trying to access the extracted features data from the last convolution layer and use that data to do classification using XGboost. As i tried to follow the method that i found from online, i am not sure if it is the the only way of doing it. If there is another way kindly let me know.
The model.fit(...) line does what you would expect, it trains the convnet defined by model on some data and labels. Your classifier yielding lower accuracy when you're using randomly initialized weights (i.e. without running fit) is not surprising.
intermediate_layer_model is constructed as a keras model whose output is the dense layer just before the output of model. Note the name parameter given to the dense layer in the construction of model.
You could just as easily give a name to one of the Conv2D layers and access it the same way. Alternatively, you could store the layer in a python variable, i.e. instead of
model.add(Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu"))
in the model construction it could say
last_conv_layer = Conv2D(256, kernel_size=(3,3), padding='same', strides=(1,1), activation="relu")
model.add(last_conv_layer)
Then for the intermediate_layer_model you put
intermediate_layer_model = Model(inputs=model.input, outputs=last_conv_layer.output)
I am trying to configure a network for character recognition of sequential data like license plates.
Now I would like to use the architecture which is noted in Table 3 in Deep Automatic Licence Plate Recognition system (link: http://www.ee.iisc.ac.in/people/faculty/soma.biswas/Papers/jain_icgvip2016_alpr.pdf).
The architecture the authors presented is this one:
The first layers are very common, but where I was stumbling was the top (the part in the red frame) of the architecture. They mention 11 parallel layers and I am really unsure how to get this in Python. I coded this architecture but it does not seem to be right to me.
model = Sequential()
model.add(Conv2D(64, kernel_size=(5, 5), input_shape = (32, 96, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(128, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(256, kernel_size=(3, 3), activation = "relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(1024, activation = "relu"))
model.add(Dense(11*37, activation="Softmax"))
model.add(keras.layers.Reshape((11, 37)))
Could someone help? How do I have to code the top to get an equal architecture like the authors?
The code below can build the architecture described in the image.
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Conv2D, Flatten, MaxPooling2D, Dense, Input, Reshape, Concatenate, Dropout
def create_model(input_shape = (32, 96, 1)):
input_img = Input(shape=input_shape)
'''
Add the ST Layer here.
'''
model = Conv2D(64, kernel_size=(5, 5), input_shape = input_shape, activation = "relu")(input_img)
model = MaxPooling2D(pool_size=(2, 2))(model)
model = Dropout(0.25)(model)
model = Conv2D(128, kernel_size=(3, 3), input_shape = input_shape, activation = "relu")(model)
model = MaxPooling2D(pool_size=(2, 2))(model)
model = Dropout(0.25)(model)
model = Conv2D(256, kernel_size=(3, 3), input_shape = input_shape, activation = "relu")(model)
model = MaxPooling2D(pool_size=(2, 2))(model)
model = Dropout(0.25)(model)
model = Flatten()(model)
backbone = Dense(1024, activation="relu")(model)
branches = []
for i in range(11):
branches.append(backbone)
branches[i] = Dense(37, activation = "softmax", name="branch_"+str(i))(branches[i])
output = Concatenate(axis=1)(branches)
output = Reshape((11, 37))(output)
model = Model(input_img, output)
return model
From my understanding, your implementation is almost correct. The authors train 11 individual classifiers taking as input the output from the Fully Connected Layer. Here, you can think of "parallel" as "independent".
However, you cannot apply the Softmax activation right after the Fully Connected Layer. Since all the classifiers are independent, we want each of them to output a probability for each possible character. Putting things differently, we want the sum of the outputs of each classifier to be 1. Hence, the correct implementation would be:
...
model.add(Dense(1024, activation = "relu"))
# Feeding every neuron with the previous layer's output
model.add(Dense(11*37))
model.add(keras.layers.Reshape((11, 37)))
model.add(keras.activations.softmax(x, axis=1))
I'm working on a binary classification problem. I was getting 69% accuracy at first, but kept running out of memory so I shrunk certain parameters, now it's coming up 0. Any idea whats going on?
model = Sequential()
from keras.layers import Dropout
model.add(Conv2D(96, kernel_size=11, padding="same", input_shape=(300, 300, 1), activation = 'relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
from keras.layers.core import Activation
model.add(Flatten())
# model.add(Dense(units=1000, activation='relu' ))
model.add(Dense(units= 300, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation("softmax"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
featurewise_center=True,
rotation_range=90,
fill_mode='nearest',
validation_split = 0.2
)
datagen.fit(train)
train_generator = datagen.flow(train, train_labels, batch_size=8)
# # fits the model on batches with real-time data augmentation:
history = model.fit_generator(generator=train_generator,
use_multiprocessing=True,
steps_per_epoch = len(train_generator) / 8,
epochs = 5,
workers=20)
Softmax should only be used if you have a multiclass classification problem. You have a single output from your Dense layer, so you should use sigmoid.