Validation accuracy stuck, accuracy low - Python

I want to create a machine learning model with TensorFlow which detects flowers. I went out into nature and took pictures of 4 different species (~600 images per class; one class got 700).
I load these images with TensorFlow's ImageDataGenerator:
train_datagen = ImageDataGenerator(rescale=1./255,
                                   shear_range=0.2,
                                   zoom_range=0.15,
                                   brightness_range=[0.7, 1.4],
                                   fill_mode='nearest',
                                   vertical_flip=True,
                                   horizontal_flip=True,
                                   rotation_range=15,
                                   width_shift_range=0.1,
                                   height_shift_range=0.1,
                                   validation_split=0.2)

train_generator = train_datagen.flow_from_directory(
    pfad,
    target_size=(imageShape[0], imageShape[1]),
    batch_size=batchSize,
    class_mode='categorical',
    subset='training',
    seed=1,
    shuffle=False,
    #save_to_dir=r'G:\test'
)

validation_generator = train_datagen.flow_from_directory(
    pfad,
    target_size=(imageShape[0], imageShape[1]),
    batch_size=batchSize,
    shuffle=False,
    seed=1,
    class_mode='categorical',
    subset='validation')
Then I am creating a simple model looking like this:
model = tf.keras.Sequential([
    keras.layers.Conv2D(128, (3,3), activation='relu', input_shape=(imageShape[0], imageShape[1], 3)),
    keras.layers.MaxPooling2D(2,2),
    keras.layers.Dropout(0.5),
    keras.layers.Conv2D(256, (3,3), activation='relu'),
    keras.layers.MaxPooling2D(2,2),
    keras.layers.Conv2D(512, (3,3), activation='relu'),
    keras.layers.MaxPooling2D(2,2),
    keras.layers.Flatten(),
    keras.layers.Dense(280, activation='relu'),
    keras.layers.Dense(4, activation='softmax')
])

opt = tf.keras.optimizers.SGD(learning_rate=0.001, decay=1e-5)
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])
And want to start the training process (CPU):
history = model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // batchSize,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples // batchSize,
    epochs=200, callbacks=[checkpoint, early, tensorboard], workers=-1)
The result should be that my validation accuracy improves, but it starts at 0.3375 and stays at that level for the whole training process. The validation loss (1.3737) decreases by only 0.001. Training accuracy starts at 0.15 but does increase.
Why is my validation accuracy stuck?
Am I using the right loss? Did I build my model wrong? Is my TensorFlow train generator one-hot encoding the labels?
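(With class_mode='categorical', flow_from_directory does one-hot encode the labels; one quick way to check, using the generator above, is to pull one batch and inspect it:)

x_batch, y_batch = next(train_generator)
print(y_batch.shape)                   # (batchSize, 4)
print(y_batch[0])                      # e.g. [0. 1. 0. 0.] -> one-hot row
print(train_generator.class_indices)   # folder-name -> class-index mapping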
Thanks

I solved the problem by using RMSprop() without any parameters.
So I changed from:
opt = tf.keras.optimizers.SGD(learning_rate=0.001, decay=1e-5)
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])
to:
opt = tf.keras.optimizers.RMSprop()
model.compile(loss='categorical_crossentropy',
              optimizer=opt,
              metrics=['accuracy'])

This is a similar example, except that where yours has 4 categorical classes, the example below is binary. You may want to change the loss to categorical cross-entropy, the class_mode from binary to categorical in the train and test generators, and the final dense layer activation to softmax (a sketch of those changes follows the code below). I am still able to use model.fit_generator().
image_dataGen = ImageDataGenerator(rotation_range=20,
                                   width_shift_range=0.2, height_shift_range=0.2, shear_range=0.1,
                                   zoom_range=0.1, fill_mode='nearest', horizontal_flip=True,
                                   vertical_flip=True, rescale=1/255)

train_images = image_dataGen.flow_from_directory(train_path, target_size=image_shape[:2],
                                                 color_mode='rgb', class_mode='binary')

test_images = image_dataGen.flow_from_directory(test_path, target_size=image_shape[:2],
                                                color_mode='rgb', class_mode='binary',
                                                shuffle=False)

model = Sequential()
model.add(Conv2D(filters=32, kernel_size=(3,3), input_shape=image_shape, activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Conv2D(filters=48, kernel_size=(3,3), input_shape=image_shape, activation='relu'))
model.add(MaxPool2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(units=128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')

results = model.fit_generator(train_images, epochs=10, callbacks=[early_stop],
                              validation_data=test_images)
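Concretely, the changed lines for the 4-class case would look roughly like this (an untested sketch of the replacements, reusing the names above):

# generators: class_mode changes from 'binary' to 'categorical'
train_images = image_dataGen.flow_from_directory(train_path, target_size=image_shape[:2],
                                                 color_mode='rgb', class_mode='categorical')
test_images = image_dataGen.flow_from_directory(test_path, target_size=image_shape[:2],
                                                color_mode='rgb', class_mode='categorical',
                                                shuffle=False)
# final layer: 4 softmax units instead of 1 sigmoid unit
model.add(Dense(units=4, activation='softmax'))
# loss: categorical instead of binary cross-entropy
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')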

Maybe your learning rate is too high.
Use learning rate = 0.000001, and if that does not work, then try another optimizer like Adam.
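In code, that amounts to either of these (values as suggested above):

opt = tf.keras.optimizers.SGD(learning_rate=1e-6)   # much lower learning rate
# or try an optimizer with adaptive step sizes:
opt = tf.keras.optimizers.Adam()                    # defaults to learning_rate=0.001
model.compile(loss='categorical_crossentropy', optimizer=opt, metrics=['accuracy'])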

Use model.fit_generator() instead of model.fit(). The points below could also be helpful.
In order to use .flow_from_directory, you must organize the images into sub-directories. This is an absolute requirement; otherwise the method won't work. Each directory should contain images of only one class, so one folder per class of images. Also, could you check whether the paths for the training data and the test data are correct? They cannot point to the same location. I have used the ImageDataGenerator class for classification problems. You could also try changing the optimizer to 'Adam'.
Structure needed:

Image Data Folder/
    Class 1/
        0.jpg
        1.jpg
        ...
    Class 2/
        0.jpg
        1.jpg
        ...
    ...
    Class n/
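Given that layout, pointing flow_from_directory at the root folder is enough; the class sub-folders are discovered automatically. A minimal sketch (the path and target size here are placeholders):

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'Image Data Folder',       # root folder containing one sub-folder per class
    target_size=(150, 150),    # placeholder: whatever size your model expects
    class_mode='categorical')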

Related

CNN accuracy: 0.0000e+00 for multi-classification on images

I have the following code that produces my horrible accuracy dilemma. Has anyone else encountered this issue with a multi-classification task (49 different classes of images)?
I am running ResNet50 on top of my CNN model with softmax as the last activation function; my loss is categorical_crossentropy and my optimizer is Adam.
What might I be doing wrong?
## Build CNN architecture
model1 = Sequential()
model1.add(Conv2D(32, (3,3), strides=1, input_shape = (720, 720, 3)))
model1.add(Activation('relu'))
model1.add(Conv2D(32, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))
model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(Conv2D(64, (3,3), strides=1, padding="same"))
model1.add(Activation('relu'))
model1.add(MaxPooling2D(pool_size=(2,2)))
model1.add(Flatten())
model1.add(Dense(200))
model1.add(Activation('relu'))
model1.add(Dense(200))
model1.add(Dropout(0.24))
model1.add(Activation('relu'))
model1.add(Dense(49, activation='softmax'))
model1.summary()
# Image data generator for on-the-fly image augmentation
directory = '/home/carlini-TF2/data/train/'
batch_size = 64
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=90.,
    shear_range=0.2,
    zoom_range=[0.8, 1.2],
    horizontal_flip=True,
    validation_split=0.2,
    preprocessing_function=tf.keras.applications.resnet50.preprocess_input)
train_generator = train_datagen.flow_from_directory(directory=directory,
                                                    subset='training',
                                                    target_size=(720, 720),
                                                    shuffle=True,
                                                    seed=42,
                                                    color_mode='rgb',
                                                    class_mode='categorical',
                                                    batch_size=batch_size)

valid_directory = '/home/carlini-TF2/data/test/'
valid_generator = train_datagen.flow_from_directory(directory=valid_directory,
                                                    target_size=(720, 720),
                                                    color_mode="rgb",
                                                    batch_size=batch_size,
                                                    class_mode="categorical",
                                                    subset='validation',
                                                    shuffle=True,
                                                    seed=42)
## Compile and train Neural Network
METRICS = [
    tf.keras.metrics.Accuracy(name='accuracy'),
    tf.keras.metrics.Precision(name='precision'),
    tf.keras.metrics.Recall(name='recall')]

# optimal optimizer FN | loss FN to work with accuracy metric
model1.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
               optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
               metrics=METRICS)

# stop training when loss gets worse after consecutive epochs
callback = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)

# fit model with augmented training set and validation set | shuffle batch
history = model1.fit(train_generator,
                     validation_data=valid_generator,
                     steps_per_epoch=train_generator.n // batch_size,
                     validation_steps=valid_generator.n // batch_size,
                     shuffle=True, callbacks=[callback],
                     epochs=50)
The issue is that ResNet50 was being used only for data augmentation (its preprocess_input function) and not in the CNN architecture itself. In order to reach a somewhat robust model, the following code is needed.
We can throw out the previous architecture and use a very simple model together with ResNet50, since this gives conclusive results.
We must use the Functional API, since ResNet50 was built with it.
data_bias = np.log(1802./4657)
initializer = tf.keras.initializers.Constant(data_bias)

resnet50_imagenet_model = tf.keras.applications.ResNet50(weights='imagenet',
                                                         include_top=False,
                                                         input_shape=(720, 720, 3))
resnet50_imagenet_model.trainable = False

# Flatten output layer of ResNet
flattened = tf.keras.layers.Flatten()(resnet50_imagenet_model.output)

# Fully connected output layer with 49 different labels
fc2 = tf.keras.layers.Dense(49, activation='softmax', bias_initializer=initializer,
                            name="AddedDense2")(flattened)

model1 = tf.keras.models.Model(inputs=resnet50_imagenet_model.input, outputs=fc2)
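The compile/fit step can stay essentially as in the question; a minimal sketch reusing the generators and the EarlyStopping callback defined above:

model1.compile(loss=tf.keras.losses.CategoricalCrossentropy(from_logits=False),
               optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
               # note: the string 'accuracy' resolves to CategoricalAccuracy for one-hot
               # labels, whereas tf.keras.metrics.Accuracy() compares values element-wise
               # and can report 0.0000e+00 against softmax probability outputs
               metrics=['accuracy'])
history = model1.fit(train_generator,
                     validation_data=valid_generator,
                     callbacks=[callback],
                     epochs=50)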

Model val_accuracy higher than test accuracy without dropout regularization

I recently created a machine learning model with 810 training and 810 test images (27 classes) in order to identify ASL hand signs. I trained this model using an SGD optimizer with a 0.001 learning rate, 5 epochs, and categorical cross-entropy loss. However, my validation accuracy is around 20% higher than my model's test accuracy, and I'm not sure why. I've tried adjusting my model structure, optimizers, learning rate, and epochs; this never changes anything.
Anyone have any ideas? Here's my model code:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

model = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(32, (3,3), activation='relu', input_shape=(150, 150, 3)),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(128, (3,3), activation='relu'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(27, activation='softmax')
])
model.summary()

from tensorflow.keras.optimizers import SGD
sgd = SGD(learning_rate=0.001, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])

class myCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') > 0.95:
            print("\nReached >95% accuracy so cancelling training!")
            self.model.stop_training = True

callbacks = myCallback()

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=40,
    width_shift_range=0.2,    # Shifting image width by 20%
    height_shift_range=0.2,   # Shifting image height by 20%
    shear_range=0.2,          # Shearing across X-axis by 20%
    zoom_range=0.2,           # Image zooming by 20%
    horizontal_flip=True,
    fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(
    "/content/drive/MyDrive/train_asl",
    target_size=(150, 150),
    class_mode='categorical',
    batch_size=5)

validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(
    "/content/drive/MyDrive/test_asl",
    target_size=(150, 150),
    class_mode='categorical',
    batch_size=5)

import numpy as np
history = model.fit_generator(
    train_generator,
    steps_per_epoch=np.ceil(810/5),    # 810 images = batch_size (5) * 162 steps
    epochs=100,
    validation_data=validation_generator,
    validation_steps=np.ceil(810/5),   # 810 images = batch_size (5) * 162 steps
    callbacks=[callbacks],
    verbose=2)
Validation and test accuracies can differ if the underlying data distributions of the validation and test data are different or unbalanced.
You could ensure that the class distribution of the 27 classes is approximately the same in both validation and test sets, using stratified sampling techniques. You could also check whether the input data distributions are the same or different during validation and testing, because 810 images is not a lot, especially if you are not using transfer learning.
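For example, a quick check of the class balance on the generator side (a sketch using the generators defined above):

from collections import Counter
print(Counter(train_generator.classes))        # images per class in the training set
print(Counter(validation_generator.classes))   # images per class in the validation set
# For a stratified split from one labelled file list, scikit-learn is an option:
# from sklearn.model_selection import train_test_split
# train_files, test_files = train_test_split(files, stratify=labels, test_size=0.5)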

Multiclass Classification model not training properly. Why is the training loss constant?

I am trying to train a model using Keras for multiclass classification. This is an image classification problem with five classes of images to predict: bedroom, bathroom, living room, dining room, and kitchen. The problem is that the model doesn't seem to learn; it's always stuck at 20% accuracy and the loss never changes from epoch 1. I'm using the convolutional base from the Xception model with my classifier on top. The train, test, and validation datasets are set up using the tf.data API.
Can someone please point out what I am doing wrong?
This is the dataset generation
train_dir = "House_Dataset/Train"
valid_dir = "House_Dataset/Valid"
test_dir = "House_Dataset/Test"
train_ds = trainAug.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    shuffle=False,
    class_mode="sparse"
)
valid_ds = image_dataset_from_directory(
    valid_dir,
    image_size=(224, 224),
    shuffle=False,
)
test_ds = image_dataset_from_directory(
    test_dir,
    image_size=(224, 224),
    shuffle=False,
)
This is the import of the Xception convolutional base.
conv_base = keras.applications.Xception(include_top=False, weights="imagenet", input_shape=(224,224,3))
conv_base.trainable = False
This is the model building function.
def pre_trained():
    inputs = keras.Input(shape=(224, 224, 3))
    #x = data_augmentation(inputs)
    x = keras.applications.xception.preprocess_input(inputs)
    x = conv_base(x)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.BatchNormalization()(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(5, activation="softmax")(x)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="rmsprop", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    return model
Training function call
history = pre_trained_model.fit(train_ds, epochs=25)
(Screenshot of the epoch logs omitted; it shows the loss and accuracy staying constant from epoch 1.)
While the exact cause of this remains unclear to me, I have found where the problem occurred, and the solution for it.
I added some parameters to the dataset generation functions.
train_dir = "House_Dataset/Train"
valid_dir = "House_Dataset/Valid"
test_dir = "House_Dataset/Test"
train_ds = image_dataset_from_directory(
    train_dir,
    image_size=(224, 224),
    shuffle=True,
    seed=1,
    labels="inferred",
    label_mode="categorical"
)
valid_ds = image_dataset_from_directory(
    valid_dir,
    image_size=(224, 224),
    shuffle=True,
    seed=1,
    labels="inferred",
    label_mode="categorical"
)
test_ds = image_dataset_from_directory(
    test_dir,
    image_size=(224, 224),
    shuffle=True,
    seed=1,
    labels="inferred",
    label_mode="categorical"
)
I added the option to shuffle with a seed, and changed the label mode to categorical, which produces a one-hot encoding of the labels. Likewise, I changed the loss from sparse_categorical_crossentropy to categorical_crossentropy. These changes have allowed the model to train, and there have been significant improvements in both training and validation loss as well as accuracy.
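One way to confirm the new label encoding (a quick sketch):

for images, labels in train_ds.take(1):
    print(images.shape)   # (batch_size, 224, 224, 3)
    print(labels.shape)   # (batch_size, 5): one-hot rows, matching categorical_crossentropy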
Try my CNN network and see if you get 87% accuracy. The CNN extracts features in each layer as filters; the filters then feed into the softmax function over the categories.
model = Sequential()
model.add(Conv2D(32, (3,3), activation='relu', input_shape=(IMG_SIZE, IMG_SIZE, 3)))
#model.add(Dropout(0.25))
model.add(MaxPooling2D(2))
#model.add(BatchNormalization())
model.add(Conv2D(64, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(128, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
model.add(Conv2D(128, (3,3), activation="relu"))
model.add(MaxPooling2D(2,2))
#model.add(BatchNormalization())
model.add(Flatten())
model.add(Dropout(0.2))
model.add(Dense(512, activation='relu'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()

Problem with logits and labels size - TensorFlow

I am trying to train a top layer separately from the base model. Everything works when generating features with model.predict_generator, like this:
bottleneck_features_train = model.predict_generator(
    train_generator, predict_size_train)
np.save(save_dir + 'bottleneck_features_train.npy', bottleneck_features_train)

train_data = np.load(mtx.save_dir + 'bottleneck_features_train.npy')
model.fit(train_data, ....
)
But now I have a huge dataset and can't load all the data into memory, so I use the flow_from_directory generator:
def create_generator(root_path, batch_size):
    datagen = ImageDataGenerator(rescale=1. / 255)
    generator = datagen.flow_from_directory(
        root_path,
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=True)
    return generator

train_generator = create_generator(mtx.train_data_dir, mtx.batch_size)
and then:
model.fit(train_generator...
The class_mode in flow_from_directory is "categorical", and so is the loss function (categorical_crossentropy).
The layers are:
model = Sequential()
model.add(Flatten(input_shape=(7, 7, 512)))
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.7))
model.add(Dense(num_classes, activation='softmax'))
but when I run training I get:
logits and labels must be broadcastable: logits_size=[24,32] labels_size=[4,32]
As I understand it, something is wrong with the shapes in the layers or with how the features/labels are encoded.
Update 1:
It also works when batch_size in flow_from_directory is set to 1, but then the accuracy is very low.
Try:
model.add(Flatten(input_shape=(224,224,3)))
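For what it's worth, the error message is consistent with the generator feeding raw 224x224x3 images into a top model that expects 7x7x512 bottleneck features: a batch of 4 images holds 4 x 224 x 224 x 3 = 602112 values, which reshape into 24 rows of 25088 (= 7 x 7 x 512), hence logits_size=[24,32] against labels_size=[4,32]. (With batch_size=1 the single label row can broadcast, so training runs but learns little, matching the update above.) An alternative fix is to chain the frozen base and the top into one model so flow_from_directory can feed images directly. A sketch, assuming the base is VGG16 (an assumption that matches the (7, 7, 512) feature shape) and reusing num_classes from above:

from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout

base = VGG16(include_top=False, weights='imagenet', input_shape=(224, 224, 3))
base.trainable = False       # keep the pretrained weights fixed

full_model = Sequential()
full_model.add(base)         # maps (224, 224, 3) images to (7, 7, 512) features
full_model.add(Flatten())
full_model.add(Dense(512, activation="relu"))
full_model.add(Dropout(0.7))
full_model.add(Dense(num_classes, activation='softmax'))
full_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

full_model.fit(train_generator, epochs=10)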

WiFi gesture recognition (DL, ML, Python, CNN)

I have a dataset that looks like this:
(7500, 200, 30, 3)
That is 7500 samples, each a tensor of shape (200, 30, 3), of CSI data (a kind of WiFi data used for gesture recognition). There are 150 different labels (gestures); the aim is to classify them.
I used a CNN in Keras to classify, and I faced huge overfitting:
def create_DL_model():
    # input layer
    csi = Input(shape=(200, 30, 3))
    # first feature extractor
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-01')(csi)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), name='layer1-02')(x)
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-03')(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), name='layer1-04')(x)
    x = BatchNormalization()(x)
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-05', padding='same')(x)
    x = Conv2D(32, kernel_size=3, activation='relu', name='layer1-06', padding='same')(x)
    x = Conv2D(64, (3, 3), padding='same', activation='relu', name='layer-01')(x)
    x = BatchNormalization()(x)
    x = MaxPool2D(pool_size=(2, 2), name='layer-02')(x)
    x = Conv2D(32, (3, 3), padding="same", activation='relu', name='layer-03')(x)
    x = BatchNormalization()(x)
    x = MaxPool2D(pool_size=(2, 2), name='layer-04')(x)
    x = Flatten()(x)
    x = Dense(16, activation='relu')(x)
    keras.layers.Dropout(.50, seed=1)  # note: this line is never connected to the graph, so no dropout is applied
    probability = Dense(150, activation='softmax')(x)
    model = Model(inputs=csi, outputs=probability)
    model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
    return model
As you can see, I used dropout for the dense layer, plus early stopping and batch normalization, to fight the overfitting; still, the problem is there.
After cross-validation I get accuracy around 70% (some papers got 90% accuracy; however, we have 150 labels, so 90% seems like a really great result; they used meta-learning, which I could not use). Is there any approach you can recommend?
Many thanks
The accuracy-vs-epoch graph signals the overfitting issue present in your model. This is due to the small number of training samples (7500/150 = 50 per class). One possible solution could be applying data augmentation, which allows you to build a powerful image classifier using only very few training examples.
Data structure
Store your data according to the following structure:

data/
    train/
        class1/
            class1_img001.jpg
            class1_img002.jpg
        class2/
            class2_img001.jpg
            class2_img002.jpg
        ...
        class150/
            class150_img001.jpg
            class150_img002.jpg
            ...
    validation/
        class1/
            class1_img001.jpg
            class1_img002.jpg
        class2/
            class2_img001.jpg
            class2_img002.jpg
        ...
        class150/
            class150_img001.jpg
            class150_img002.jpg
            ...
You can do:
def create_DL_model(img_height, img_width, channel):
    # input layer
    csi = Input(shape=(img_height, img_width, channel))
    # first feature extractor
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-01')(csi)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), name='layer1-02')(x)
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-03')(x)
    x = BatchNormalization()(x)
    x = MaxPooling2D(pool_size=(2, 2), name='layer1-04')(x)
    x = BatchNormalization()(x)
    x = Conv2D(64, kernel_size=3, activation='relu', name='layer1-05', padding='same')(x)
    x = Conv2D(32, kernel_size=3, activation='relu', name='layer1-06', padding='same')(x)
    x = Conv2D(64, (3, 3), padding='same', activation='relu', name='layer-01')(x)
    x = BatchNormalization()(x)
    x = MaxPool2D(pool_size=(2, 2), name='layer-02')(x)
    x = Conv2D(32, (3, 3), padding="same", activation='relu', name='layer-03')(x)
    x = BatchNormalization()(x)
    x = MaxPool2D(pool_size=(2, 2), name='layer-04')(x)
    x = Flatten()(x)
    x = Dense(16, activation='relu')(x)
    x = Dropout(0.5, seed=1)(x)  # connected into the graph so the dropout is actually applied
    probability = Dense(150, activation='softmax')(x)
    model = Model(inputs=csi, outputs=probability)
    return model
from keras.preprocessing.image import ImageDataGenerator

batch_size = 32
img_height = 200
img_width = 30
channel = 3

model = create_DL_model(img_height, img_width, channel)

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
    'data/train',                         # this is the target directory
    target_size=(img_height, img_width),  # all images will be resized
    batch_size=batch_size,
    class_mode='categorical')             # note: class_mode takes 'categorical', not the loss name

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
    'data/validation',
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])

model.fit_generator(
    train_generator,
    steps_per_epoch=7500 // batch_size,
    epochs=50,
    validation_data=validation_generator,
    validation_steps=YOUR_VALIDATION_SIZE // batch_size)  # use YOUR_VALIDATION_SIZE as per your validation data

model.save('model-e50-b32.h5')  # always save your weights after training or during training
The number of CNN layers can be decreased, and the accuracy/loss monitored, since we are only feeding 7500 training images.
The above code is not tested; please share your errors for further advice.
More about data augmentation and how to apply it is here.
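Regarding saving during training: a ModelCheckpoint callback is one option (a minimal sketch; the output filename is a placeholder):

from keras.callbacks import ModelCheckpoint

checkpoint = ModelCheckpoint('model-best.h5',   # hypothetical output file
                             monitor='val_loss',
                             save_best_only=True,
                             verbose=1)
# then pass it to training, e.g. model.fit_generator(..., callbacks=[checkpoint])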
