Multiclass Classification model not training properly. Why is the training loss constant?

I am trying to train a Keras model for multiclass image classification. There are five classes: bedroom, bathroom, living room, dining room, and kitchen. The problem is that the model does not seem to learn: accuracy is stuck at 20% and the loss has not changed since epoch 1. I am using the convolutional base of the Xception model with my own classifier on top. The train, validation, and test datasets are set up using the tf.data API.
Can someone please point out what I am doing wrong?
This is the dataset generation:
train_dir = "House_Dataset/Train"
valid_dir = "House_Dataset/Valid"
test_dir = "House_Dataset/Test"
train_ds = trainAug.flow_from_directory(
train_dir,
target_size=(224,224),
shuffle= False,
class_mode= "sparse"
)
valid_ds = image_dataset_from_directory(
valid_dir,
image_size=(224,224),
shuffle=False,
)
test_ds = image_dataset_from_directory(
test_dir,
image_size=(224,224),
shuffle=False,
)
This is the import of the Xception convolutional base:
conv_base = keras.applications.Xception(include_top=False, weights="imagenet", input_shape=(224,224,3))
conv_base.trainable = False
This is the model building function.
def pre_trained():
    inputs = keras.Input(shape=(224,224,3))
    #x = data_augmentation(inputs)
    x = keras.applications.xception.preprocess_input(inputs)
    x = conv_base(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(5, activation = "softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics = ["accuracy"])
    return model
Training function call:
pre_trained_model = pre_trained()
history = pre_trained_model.fit(train_ds, epochs=25)
This is the picture of the epochs.

While the exact cause for this remains unclear to me, I have found where the problem occurred, and the solution for it.
I added some parameters in the dataset generator function.
train_dir = "House_Dataset/Train"
valid_dir = "House_Dataset/Valid"
test_dir = "House_Dataset/Test"
train_ds = image_dataset_from_directory(
train_dir,
image_size=(224,224),
shuffle= True,
seed=1,
labels="inferred",
label_mode = "categorical"
)
valid_ds = image_dataset_from_directory(
valid_dir,
image_size=(224,224),
shuffle=True,
seed=1,
labels="inferred",
label_mode = "categorical"
)
test_ds = image_dataset_from_directory(
test_dir,
image_size=(224,224),
shuffle=True,
seed=1,
labels="inferred",
label_mode = "categorical"
)
I added the option to shuffle with a fixed seed, and changed the label mode to categorical, which produces one-hot encoded labels. Accordingly, I also changed the loss from sparse_categorical_crossentropy to categorical_crossentropy. These changes have allowed the model to train, and there have been significant improvements in both training and validation loss as well as accuracy.
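A likely explanation, offered as a hypothesis rather than a confirmed diagnosis: switching between the sparse and the one-hot loss should not matter by itself, because both compute the same value, whereas reading a class-ordered directory without shuffling makes almost every batch contain a single class. The plain-Python sketch below illustrates both points without calling Keras. The class sizes (100 images each), the batch size (32), and the probabilities are assumptions chosen for illustration, not values from the question.

```python
import math
import random

def sparse_ce(probs, label):
    """Cross-entropy for an integer class label."""
    return -math.log(probs[label])

def categorical_ce(probs, one_hot):
    """Cross-entropy for a one-hot encoded label."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

def classes_per_batch(labels, batch_size):
    """Number of distinct classes in each consecutive batch."""
    return [len(set(labels[i:i + batch_size]))
            for i in range(0, len(labels), batch_size)]

# Softmax output for one image over the five room classes (made-up numbers).
probs = [0.1, 0.2, 0.4, 0.2, 0.1]
label = 2
one_hot = [0, 0, 1, 0, 0]
loss_sparse = sparse_ce(probs, label)
loss_one_hot = categorical_ce(probs, one_hot)

# Assumed dataset: 5 classes x 100 images, listed class by class,
# which is the order a directory reader yields when shuffle=False.
ordered = [c for c in range(5) for _ in range(100)]
shuffled = ordered[:]
random.Random(1).shuffle(shuffled)

unshuffled_counts = classes_per_batch(ordered, 32)
shuffled_counts = classes_per_batch(shuffled, 32)
print(loss_sparse, loss_one_hot)
print(unshuffled_counts)
print(shuffled_counts)
```

If this hypothesis holds, shuffle=True is the change that mattered, and sparse labels with sparse_categorical_crossentropy would have worked equally well once the data was shuffled.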

Try my CNN and see whether you get 87% accuracy. Each convolutional layer extracts features with its filters, and the resulting features are fed to the final softmax layer, which outputs the class probabilities.
model=Sequential()
model.add(Conv2D(32, (3,3),activation='relu',input_shape=(IMG_SIZE,IMG_SIZE,3)))
#model.add(Dropout(0.25))
model.add(MaxPooling2D(2))
#model.add(BatchNormalization())
model.add(Conv2D(64, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(128, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(128, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
#model.add(BatchNormalization())
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(512,activation='relu'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics = ['accuracy'])
model.summary()

Related

CNN accuracy: 0.0000e+00 for multi-classification on images

I have the following code that produces my horrible accuracy dilemma. Has anyone else encountered this issue for a multiclass task (49 different classes to classify)?
I am running ResNet50 on top of my CNN model, with softmax as the last activation function; my loss is categorical_crossentropy and my optimizer is Adam.
What might I be doing wrong?
## Build CNN architecture
model1 = Sequential()
model1.add(Conv2D(32, (3,3), strides=1, input_shape = (720, 720, 3)))
model1.add(Activation('relu'))
model1.add(Conv2D(32, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))
model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))
model1.add(Flatten())
model1.add(Dense(200))
model1.add(Activation('relu'))
model1.add(Dense(200))
model1.add(Dropout(0.24))
model1.add(Activation('relu'))
model1.add(Dense(49, activation='softmax'))
model1.summary()
# Image data generator for on the fly image augmentation
directory = '/home/carlini-TF2/data/train/'
batch_size = 64
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
rotation_range=90.,
shear_range=0.2,
zoom_range=[0.8,1.2],
horizontal_flip=True,
validation_split=0.2,
preprocessing_function=tf.keras.applications.resnet50.preprocess_input)
train_generator = train_datagen.flow_from_directory(directory=directory,
subset='training',
target_size=(720, 720),
shuffle=True,
seed=42,
color_mode='rgb',
class_mode='categorical',
batch_size=batch_size)
valid_directory = '/home/carlini-TF2/data/test/'
valid_generator = train_datagen.flow_from_directory(directory=valid_directory,
target_size=(720, 720),
color_mode="rgb",
batch_size=batch_size,
class_mode="categorical",
subset='validation',
shuffle=True,
seed=42)
## Compile and train Neural Network
METRICS = [
tf.keras.metrics.Accuracy(name='accuracy'),
tf.keras.metrics.Precision(name='precision'),
tf.keras.metrics.Recall(name='recall')]
# optimal optimizer FN | loss FN to work with accuracy metric
model1.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
metrics=METRICS)
# stop training when loss gets worse after consecutive epochs
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
# fit model with augmented training set and validation set | shuffle batch
history = model1.fit(train_generator,
validation_data = valid_generator,
steps_per_epoch = train_generator.n//batch_size,
validation_steps = valid_generator.n//batch_size,
shuffle=True, callbacks = [callback],
epochs=50)

The issue is that ResNet50 was only being used in the data generator (through its preprocess_input function) and was never part of the CNN architecture. To reach a somewhat robust model, the following code is needed.
We can throw out the previous architecture and use a very simple model on top of ResNet50, since this gives conclusive results.
We must use the Functional API, since ResNet50 was built with it.
data_bias = np.log(1802./4657)
initializer = tf.keras.initializers.Constant(data_bias)
resnet50_imagenet_model = tf.keras.applications.ResNet50(weights='imagenet', include_top=False, input_shape=(720,720,3) )
resnet50_imagenet_model.trainable = False
#Flatten output layer of Resnet
flattened = tf.keras.layers.Flatten()(resnet50_imagenet_model.output)
#Fully connected layer, output layer with 49 diff labels
fc2 = tf.keras.layers.Dense(49, activation='softmax', bias_initializer=initializer, name="AddedDense2")(flattened)
model1 = tf.keras.models.Model(inputs=resnet50_imagenet_model.input, outputs=fc2)
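Separately from the architecture, the metric in the question may explain the literal 0.0000e+00 reading. tf.keras.metrics.Accuracy counts a prediction as correct only when it equals the label exactly, so softmax probabilities compared element-wise with one-hot labels almost never match. The string 'accuracy' or tf.keras.metrics.CategoricalAccuracy compares the argmax instead. The plain-Python sketch below mimics the two rules on made-up values; the helper names are hypothetical and nothing here calls Keras.

```python
def exact_match_accuracy(y_true, y_pred):
    """Fraction of elements where the prediction equals the label exactly."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred)
             for t, p in zip(row_t, row_p)]
    return sum(1 for t, p in pairs if t == p) / len(pairs)

def argmax_accuracy(y_true, y_pred):
    """Fraction of rows where the largest prediction is on the true class."""
    def argmax(row):
        return max(range(len(row)), key=row.__getitem__)
    hits = sum(1 for t, p in zip(y_true, y_pred) if argmax(t) == argmax(p))
    return hits / len(y_true)

# Made-up one-hot labels and softmax outputs: three samples, three classes.
y_true = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
y_pred = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.4, 0.3]]

exact = exact_match_accuracy(y_true, y_pred)
by_argmax = argmax_accuracy(y_true, y_pred)
print(exact, by_argmax)
```

Under this reading, a model can classify most samples correctly and still report zero under the exact-match rule, so it is worth swapping the metric before judging the architecture.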

K-fold cross validation for Keras Neural Network

Hi, I have already tuned my hyperparameters and would like to perform k-fold cross validation for my model. I have been looking at different methods, but none of them seems to work for me. The code is below:
tf.get_logger().setLevel(logging.ERROR)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Set random seeds for repeatable results
RANDOM_SEED = 3
random.seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)
classes_values = [ "nearmiss", "normal" ]
classes = len(classes_values)
Y = tf.keras.utils.to_categorical(Y - 1, classes)
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2, random_state=1)
input_length = X_train[0].shape[0]
train_dataset = tf.data.Dataset.from_tensor_slices((X_train, Y_train))
validation_dataset = tf.data.Dataset.from_tensor_slices((X_test, Y_test))
def get_reshape_function(reshape_to):
    def reshape(image, label):
        return tf.reshape(image, reshape_to), label
    return reshape
callbacks = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,mode="auto")
model = Sequential()
model.add(Dense(200, activation='tanh',
activity_regularizer=tf.keras.regularizers.l1(0.0001)))
model.add(Dropout(0.3))
model.add(Dense(44, activation='tanh',
activity_regularizer=tf.keras.regularizers.l1(0.0001)))
model.add(Dropout(0.3))
model.add(Dense(68, activation='tanh',
activity_regularizer=tf.keras.regularizers.l1(0.0001)))
model.add(Dropout(0.3))
model.add(Dense(44, activation='tanh',
activity_regularizer=tf.keras.regularizers.l1(0.0001)))
model.add(Dropout(0.3))
model.add(Dense(classes, activation='softmax', name='y_pred'))
# this controls the learning rate
opt = Adam(learning_rate=0.0002, beta_1=0.9, beta_2=0.999)
# this controls the batch size, or you can manipulate the tf.data.Dataset objects yourself
BATCH_SIZE = 32
train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=False)
validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=False)
# train the neural network
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
history=model.fit(train_dataset, epochs=50, validation_data=validation_dataset, verbose=2, callbacks=callbacks)
model.test_on_batch(X_test, Y_test)
model.metrics_names
# Use this flag to disable per-channel quantization for a model.
# This can reduce RAM usage for convolutional models, but may have
# an impact on accuracy.
disable_per_channel_quantization = False
I would appreciate it if someone could guide me on this, as I am very new to TensorFlow and neural networks.

I haven't tested it, but this should roughly be what you want. You use the sklearn KFold method to split the dataset into different folds, and then you simply fit the model on the current fold.
tf.get_logger().setLevel(logging.ERROR)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
# Set random seeds for repeatable results
RANDOM_SEED = 3
random.seed(RANDOM_SEED)
np.random.seed(RANDOM_SEED)
tf.random.set_seed(RANDOM_SEED)
classes_values = [ "nearmiss", "normal" ]
classes = len(classes_values)
Y = tf.keras.utils.to_categorical(Y - 1, classes)
def get_reshape_function(reshape_to):
    def reshape(image, label):
        return tf.reshape(image, reshape_to), label
    return reshape
callbacks = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=3,mode="auto")
def create_model():
    model = Sequential()
    model.add(Dense(200, activation='tanh',
        activity_regularizer=tf.keras.regularizers.l1(0.0001)))
    model.add(Dropout(0.3))
    model.add(Dense(44, activation='tanh',
        activity_regularizer=tf.keras.regularizers.l1(0.0001)))
    model.add(Dropout(0.3))
    model.add(Dense(68, activation='tanh',
        activity_regularizer=tf.keras.regularizers.l1(0.0001)))
    model.add(Dropout(0.3))
    model.add(Dense(44, activation='tanh',
        activity_regularizer=tf.keras.regularizers.l1(0.0001)))
    model.add(Dropout(0.3))
    model.add(Dense(classes, activation='softmax', name='y_pred'))
    return model
# this controls the batch size, or you can manipulate the tf.data.Dataset objects yourself
BATCH_SIZE = 32
kf = KFold(n_splits=5)
# Loop over the dataset to create separate folds
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = Y[train_index], Y[test_index]
    input_length = X_train[0].shape[0]
    train_dataset = tf.data.Dataset.from_tensor_slices((X_train, y_train))
    validation_dataset = tf.data.Dataset.from_tensor_slices((X_test, y_test))
    train_dataset = train_dataset.batch(BATCH_SIZE, drop_remainder=False)
    validation_dataset = validation_dataset.batch(BATCH_SIZE, drop_remainder=False)
    # Create a new model instance and a new optimizer for each fold,
    # so that no weights or optimizer state carry over between folds
    model = create_model()
    # this controls the learning rate
    opt = Adam(learning_rate=0.0002, beta_1=0.9, beta_2=0.999)
    model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
    # train the model on the current fold
    history = model.fit(train_dataset, epochs=50, validation_data=validation_dataset, verbose=2, callbacks=callbacks)
    model.test_on_batch(X_test, y_test)
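To make the fold logic concrete, here is a dependency-free sketch of what a 5-fold split does and how per-fold scores are usually summarized. It is not sklearn's implementation: kfold_indices is a hypothetical helper, and the per-fold accuracies are placeholders chosen for illustration, not results from this model.

```python
import statistics

def kfold_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) lists for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)  # spread the remainder
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size

folds = list(kfold_indices(n_samples=10, n_splits=5))
for train_idx, test_idx in folds:
    print("train:", train_idx, "test:", test_idx)

# Placeholder per-fold accuracies standing in for real evaluation results.
fold_scores = [0.80, 0.82, 0.78, 0.81, 0.79]
mean_score = statistics.mean(fold_scores)
std_score = statistics.stdev(fold_scores)
print(f"accuracy: {mean_score:.3f} +/- {std_score:.3f}")
```

Every sample lands in exactly one test fold, which is why the mean over folds is a fairer estimate than a single train/test split. In the loop above, collecting each fold's evaluation result in a list would allow the same summary.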

Problem with logits and labels size. Tensorflow

I am trying to train the top layers separately from the base model. Everything works when I generate the features with model.predict_generator, like this:
bottleneck_features_train = model.predict_generator(
train_generator, predict_size_train)
np.save(save_dir + 'bottleneck_features_train.npy', bottleneck_features_train)
train_data = np.load(mtx.save_dir + 'bottleneck_features_train.npy')
model.fit(train_data, ....
)
But now I have a huge dataset and can't load all the data into memory, so I use a generator created with flow_from_directory:
def create_generator(root_path, batch_size):
    datagen = ImageDataGenerator(rescale=1. / 255)
    generator = datagen.flow_from_directory(
        root_path,
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=True)
    return generator
train_generator = create_generator(mtx.train_data_dir, mtx.batch_size)
and then
model.fit(train_generator...
class_mode in flow_from_directory is "categorical", and so is the loss function (categorical_crossentropy).
The layers are:
model = Sequential()
model.add(Flatten(input_shape=(7, 7, 512)))
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.7))
model.add(Dense(num_classes, activation='softmax'))
but when I run training I get
logits and labels must be broadcastable: logits_size=[24,32] labels_size=[4,32]
As I understand it, something is wrong with the shapes in the layers or with how the features/labels are encoded.
Update 1:
It also works when batch_size in flow_from_directory is set to 1, but then the accuracy is very low.

Try declaring the input shape of the Flatten layer as the shape of the images the generator yields:
model.add(Flatten(input_shape=(224,224,3)))
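The numbers in the error message are consistent with raw 224x224x3 images reaching a Flatten layer that was declared for 7x7x512 bottleneck features. The arithmetic can be checked with the short sketch below. The batch size of 4 and the 32 classes are read off the error message (labels_size=[4,32]) and are assumptions, as is the helper name rows_after_flatten.

```python
import math

def rows_after_flatten(batch_size, image_shape, declared_shape):
    """Rows produced when a batch is reshaped to the declared feature size."""
    values_per_image = math.prod(image_shape)
    declared_size = math.prod(declared_shape)
    total = batch_size * values_per_image
    assert total % declared_size == 0, "batch does not divide evenly"
    return total // declared_size

# Each image holds 224*224*3 values; the layer expects 7*7*512 per sample.
rows = rows_after_flatten(4, (224, 224, 3), (7, 7, 512))
rows_batch_1 = rows_after_flatten(1, (224, 224, 3), (7, 7, 512))
print(rows, rows_batch_1)
```

Each image is six times larger than one bottleneck feature map, so four images become 24 rows of logits against 4 rows of labels. With a batch size of 1, the single label row can broadcast against 6 logit rows, which may be why that run did not crash but trained poorly.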

Keras CNN accuracy is either static or way too high for image classification

I'm trying to implement a Convolutional Neural Network that can detect whether a person is wearing glasses or not. Unfortunately, I keep getting very strange results no matter which exact settings I use for learning rate, the specific optimizer, etc. With most settings, I notice that the accuracy of my model doesn't change after the second epoch and gets stuck at around 0.56 (which is close to the ratio of one label, 2700 images, compared to the other label, 2200 images). In other runs, with slightly different settings, the accuracy suddenly rockets to about 0.9 and keeps increasing. In both cases, however, the model predicts the exact same classification ('with glasses') each time (even on images that were in the training/validation set), always with a confidence level of 100% (the label is exactly 1 each time).
I'm not all that experienced with Neural Networks for image classification so I wasn't quite sure how to figure out the issue. I tried printing some values from my dataset and their respective labels, and the labels do contain both labels (0s and 1s). Therefore, I assume it's probably an issue with my model but I can't really figure out much myself. I've tried different optimizers (Adam, SGD mostly), smaller and bigger learning rates, different momentum values, less/more convolutional layers and different parameters for the padding and kernel_initializer, different batch sizes... It's still stuck with either the very quickly improving accuracy or the static one.
My code looks as follows:
#parameters
batch_size = 16
img_height = 180
img_width = 180
num_classes = 2
epochs = 10
#training data
train_db = tf.keras.preprocessing.image_dataset_from_directory(
r"D:\archive\faces",
validation_split=0.2,
subset="training",
seed=123,
image_size=(img_height, img_width), color_mode = "grayscale",
batch_size=batch_size)
#validation data
val_db = tf.keras.preprocessing.image_dataset_from_directory(
r"D:\archive\faces",
validation_split=0.2,
subset="validation",
seed=123,
image_size=(img_height, img_width), color_mode = "grayscale",
batch_size=batch_size)
#speeds up the model training
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_db = train_db.cache().shuffle(1000).prefetch(buffer_size=AUTOTUNE)
val_db = val_db.cache().prefetch(buffer_size=AUTOTUNE)
#establishing the model
model = Sequential([
layers.experimental.preprocessing.Rescaling(1./255),
layers.Conv2D(16, (3,3), activation='relu', input_shape=(img_width, img_height, 1), kernel_initializer='he_uniform', padding='same'),
layers.MaxPooling2D(2, 2),
layers.Conv2D(32, (3,3), activation='relu', kernel_initializer='he_uniform', padding='same'),
layers.MaxPooling2D(2,2),
layers.Conv2D(64, (3,3), activation='relu', kernel_initializer='he_uniform', padding='same'),
layers.MaxPooling2D(2,2),
layers.Conv2D(64, (3,3), activation='relu', kernel_initializer='he_uniform', padding='same'),
layers.MaxPooling2D(2,2),
layers.Conv2D(64, (3,3), activation='relu', kernel_initializer='he_uniform', padding='same'),
layers.MaxPooling2D(2,2),
layers.Flatten(),
layers.Dense(512, activation='relu', kernel_initializer='he_uniform'),
layers.Dense(1, activation='sigmoid')
])
#different optimizer options
opt = tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.9)
opt2 = tf.keras.optimizers.Adam(learning_rate=0.001)
#compiling the model
model.compile(
optimizer=opt,
loss=tf.losses.BinaryCrossentropy(),
metrics='accuracy')
#training the model
model.fit(train_db,validation_data=val_db,epochs=epochs)

Since you are using loss=tf.losses.BinaryCrossentropy(), you need to add label_mode='binary' to image_dataset_from_directory for both train_db and val_db. For val_db, also add shuffle=False.
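The question observes that the stuck accuracy of about 0.56 is close to the ratio between the two labels. A quick way to confirm that a plateau is just the class prior is to compute the accuracy of always predicting the majority class; this is a minimal sketch using the approximate counts stated in the question, with majority_baseline as a hypothetical helper name.

```python
def majority_baseline(class_counts):
    """Accuracy obtained by always predicting the most frequent class."""
    return max(class_counts) / sum(class_counts)

# Approximate image counts per label as stated in the question.
baseline = majority_baseline([2700, 2200])
print(round(baseline, 3))
```

A model whose accuracy sits at this value has probably learned nothing beyond the class frequencies, which fits the report that it predicts the same label every time.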

Validation Accuracy Stuck, Accuracy low

I want to create a machine learning model with TensorFlow that recognizes flowers. I went out into nature and took pictures of 4 different species (~600 per class; one class has 700).
I load these images with the TensorFlow training generator:
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range=0.2,
zoom_range=0.15,
brightness_range=[0.7, 1.4],
fill_mode='nearest',
vertical_flip=True,
horizontal_flip=True,
rotation_range=15,
width_shift_range=0.1,
height_shift_range=0.1,
validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
pfad,
target_size=(imageShape[0],imageShape[1]),
batch_size=batchSize,
class_mode='categorical',
subset='training',
seed=1,
shuffle=False,
#save_to_dir=r'G:\test'
)
validation_generator = train_datagen.flow_from_directory(
pfad,
target_size=(imageShape[0],imageShape[1]),
batch_size=batchSize,
shuffle=False,
seed=1,
class_mode='categorical',
subset='validation')
Then I am creating a simple model looking like this:
model = tf.keras.Sequential([
keras.layers.Conv2D(128, (3,3), activation='relu', input_shape=(imageShape[0], imageShape[1],3)),
keras.layers.MaxPooling2D(2,2),
keras.layers.Dropout(0.5),
keras.layers.Conv2D(256, (3,3), activation='relu'),
keras.layers.MaxPooling2D(2,2),
keras.layers.Conv2D(512, (3,3), activation='relu'),
keras.layers.MaxPooling2D(2,2),
keras.layers.Flatten(),
keras.layers.Dense(280, activation='relu'),
keras.layers.Dense(4, activation='softmax')
])
opt = tf.keras.optimizers.SGD(learning_rate=0.001,decay=1e-5)
model.compile(loss='categorical_crossentropy',
optimizer= opt,
metrics=['accuracy'])
And want to start the training process (CPU):
history=model.fit(
train_generator,
steps_per_epoch = train_generator.samples // batchSize,
validation_data = validation_generator,
validation_steps = validation_generator.samples // batchSize,
epochs = 200,callbacks=[checkpoint,early,tensorboard],workers=-1)
I expected the validation accuracy to improve, but it starts at 0.3375 and stays at that level for the whole training process. The validation loss (1.3737) decreases by only 0.001. The training accuracy starts at 0.15 but does increase.
Why is my validation accuracy stuck?
Am I using the right loss? Or did I build my model wrong? Is my TensorFlow training generator one-hot encoding the labels?
Thanks

I solved the problem by using RMSprop() without any parameters.
So I changed from:
opt = tf.keras.optimizers.SGD(learning_rate=0.001,decay=1e-5)
model.compile(loss='categorical_crossentropy',optimizer= opt, metrics=['accuracy'])
to:
opt = tf.keras.optimizers.RMSprop()
model.compile(loss='categorical_crossentropy',
optimizer= opt,
metrics=['accuracy'])

Here is a similar example, except that it is binary rather than 4-class. To adapt it, change the loss to categorical cross-entropy, change class_mode from binary to categorical in the train and test generators, and change the final dense layer to 4 units with softmax activation. I am still able to use model.fit_generator().
image_dataGen = ImageDataGenerator(rotation_range=20,
width_shift_range=0.2,height_shift_range=0.2,shear_range=0.1,
zoom_range=0.1,fill_mode='nearest',horizontal_flip=True,
vertical_flip=True,rescale=1/255)
train_images = image_dataGen.flow_from_directory(train_path,target_size = image_shape[:2],
color_mode = 'rgb',class_mode = 'binary')
test_images = image_dataGen.flow_from_directory(test_path,target_size = image_shape[:2],
color_mode = 'rgb',class_mode = 'binary',
shuffle = False)
model = Sequential()
model.add(Conv2D(filters = 32, kernel_size = (3,3),input_shape = image_shape,activation = 'relu'))
model.add(MaxPool2D(pool_size = (2,2)))
model.add(Conv2D(filters = 48, kernel_size = (3,3),input_shape = image_shape,activation = 'relu'))
model.add(MaxPool2D(pool_size = (2,2)))
model.add(Flatten())
model.add(Dense(units = 128,activation = 'relu'))
model.add(Dropout(0.5))
model.add(Dense(units = 1, activation = 'sigmoid'))
model.compile(loss = 'binary_crossentropy',metrics = ['accuracy'], optimizer = 'adam')
results = model.fit_generator(train_images, epochs = 10, callbacks = [early_stop],
validation_data = test_images)

Maybe your learning rate is too high.
Use learning rate = 0.000001 and if that does not work then try another optimizer like Adam.

In older Keras versions, use model.fit_generator() instead of model.fit(); in TensorFlow 2.1 and later, model.fit() accepts generators directly and fit_generator() is deprecated. The points below may also help.
To use .flow_from_directory, you must organize the images in sub-directories. This is an absolute requirement; otherwise the method won't work. Each directory should contain images of only one class, so one folder per class of images. Also, could you check whether the paths for the training data and test data are correct? They cannot point to the same location. I have used the ImageDataGenerator class for a classification problem. You can also try changing the optimizer to 'Adam'.
Structure Needed:
Image Data Folder
    Class 1
        0.jpg
        1.jpg
        ...
    Class 2
        0.jpg
        1.jpg
        ...
    ...
    Class n
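The layout above can be checked before training with a few lines of standard-library Python. This is a minimal sketch: count_images_per_class and the IMAGE_SUFFIXES set are assumptions made for illustration, and the demo builds a throwaway directory rather than touching real data.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to the files actually in use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def count_images_per_class(root):
    """Map each immediate subdirectory of root to its number of image files."""
    counts = {}
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.rglob("*")
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES)
    return counts

# Build a tiny throwaway tree to demonstrate the check.
with tempfile.TemporaryDirectory() as tmp:
    for class_name, n_files in [("Class 1", 2), ("Class 2", 3)]:
        class_dir = Path(tmp) / class_name
        class_dir.mkdir()
        for i in range(n_files):
            (class_dir / f"{i}.jpg").touch()
    counts = count_images_per_class(tmp)

print(counts)
```

Running the same function on the real training and validation folders shows at a glance whether every class has its own folder and whether any folder is empty.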
