I am implementing Bayesian Optimization to find the best hyperparameters for my convolutional neural network (CNN).
After 10 trials, the algorithm found the best parameters with an accuracy of 80%. When I train the model with the best parameters found previously, the model stays stuck at the same accuracy (0.5) and val_accuracy (0.5). I don't understand why.
My model builder is defined as below:
def model_builder(hp):
    model = Sequential()
    #model.add(Input(shape=(50,50,3)))
    for i in range(hp.Int('num_blocks', 1,5)):
        hp_padding=hp.Choice('padding_'+ str(i), values=['valid', 'same'])
        hp_filters=hp.Choice('filters_'+ str(i), values=[32, 64])
        model.add(Conv2D(hp_filters, (3, 3), padding=hp_padding, activation='relu', kernel_initializer='he_uniform', input_shape=(50, 50, 3)))
        model.add(MaxPooling2D((2, 2)))
        model.add(Dropout(hp.Choice('dropout_'+ str(i), values=[0.0, 0.1, 0.2])))
    model.add(Flatten())
    hp_units = hp.Int('units', min_value=25, max_value=150, step=25)
    model.add(Dense(hp_units, activation='relu', kernel_initializer='he_uniform'))
    model.add(Dense(2,activation="sigmoid"))
    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])
    hp_optimizer=hp.Choice('Optimizer', values=['Adam', 'SGD'])
    #hp_optimizer=hp.Choice('Optimizer', values=['Adam'])
    if hp_optimizer == 'Adam':
        hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-5])
    elif hp_optimizer == 'SGD':
        hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3])
        nesterov=True
        momentum=0.9
    #model.compile(loss=keras.losses.binary_crossentropy, optimizer=tf.keras.optimizers.Adam(learning_rate=hp_learning_rate), metrics=['accuracy'])
    model.compile(optimizer=hp_optimizer,loss=keras.losses.binary_crossentropy,metrics=['accuracy'])
    return model
To search for the best hyperparameters:
tuner_cnn = kt.tuners.BayesianOptimization(
    model_builder,
    objective='accuracy',
    max_trials=10,
    directory='.',
    project_name='tuning-cnn')
I launch the search as follows:
tuner_cnn.search(datagen.flow(X_trainRusReshaped, Y_trainRusHot,batch_size=256), steps_per_epoch=len(X_trainRusReshaped) / batch_size, epochs=80, validation_data=(X_testRusReshaped, Y_testRusHot), callbacks=[stop_early])
and after 10 trials, the algorithm shows me this:
best_mlp_hyperparameters = tuner_cnn.get_best_hyperparameters(1)[0]
print("Best Hyper-parameters")
best_mlp_hyperparameters.values
output:
Best Hyper-parameters
{'Optimizer': 'Adam',
'dropout_0': 0.2,
'filters_0': 64,
'learning_rate': 0.001,
'num_blocks': 3,
'padding_0': 'valid',
'units': 100}
Finally, I train the CNN model using the best hyperparameters:
model_cnn = Sequential()
for i in range(best_mlp_hyperparameters['num_blocks']):
    hp_padding=best_mlp_hyperparameters['padding_'+ str(i)]
    hp_filters=best_mlp_hyperparameters['filters_'+ str(i)]
    model_cnn.add(Conv2D(hp_filters, (3, 3), padding=hp_padding, activation='relu', kernel_initializer='he_uniform', input_shape=(50, 50, 3)))
    model_cnn.add(MaxPooling2D((2, 2)))
    model_cnn.add(Dropout(best_mlp_hyperparameters['dropout_'+ str(i)]))
model_cnn.add(Flatten())
model_cnn.add(Dense(best_mlp_hyperparameters['units'], activation='relu', kernel_initializer='he_uniform'))
model_cnn.add(Dense(2,activation="sigmoid"))
model_cnn.compile(optimizer=best_mlp_hyperparameters['Optimizer'],
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
print(model_cnn.summary())
#history_cnn= model_cnn.fit(train_x, train_y, epochs=50, batch_size=32, validation_data=(dev_x, dev_y), callbacks=callback)
training=model_cnn.fit_generator(datagen.flow(X_trainRusReshaped,Y_trainRusHot,batch_size=batch_size),steps_per_epoch=len(X_trainRusReshaped) / batch_size, epochs=epochs,validation_data=(X_testRusReshaped, Y_testRusHot), verbose=1, callbacks=[stop_early])
During training, the results are far from the values obtained previously:
119/119 [==============================] - 25s 201ms/step - loss: 14.8185 - accuracy: 0.4982 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 2/80
119/119 [==============================] - 21s 174ms/step - loss: 0.7104 - accuracy: 0.4972 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/80
119/119 [==============================] - 21s 173ms/step - loss: 0.7013 - accuracy: 0.4982 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/80
119/119 [==============================] - 21s 173ms/step - loss: 0.6977 - accuracy: 0.4990 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 5/80
119/119 [==============================] - 21s 174ms/step - loss: 0.6960 - accuracy: 0.4996 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 6/80
119/119 [==============================] - 21s 174ms/step - loss: 0.6949 - accuracy: 0.5000 - val_loss: 0.6932 - val_accuracy: 0.5000
Any idea?
I'm working on a multiclass text classification problem.
After splitting the data into train and validation data frames, I performed text augmentation to balance the data (only on the train data, of course).
I ended up with balanced training data of 44325 samples.
Later on, I applied a "clean text" step (i.e. stemming and so on) to the training data.
train['text'] = train['text'].apply(clean_text)
X_train = train.iloc[:, :-1]
y_train = train.iloc[:, -1:]
X_test = valid.iloc[:, :-1]
y_test = valid.iloc[:, -1:]
y_test = pd.DataFrame(y_test).reset_index(drop=True)
tokenizer = Tokenizer(num_words=vocab_size, oov_token='<OOV>')
tokenizer.fit_on_texts(X_train['text'])
train_seq = tokenizer.texts_to_sequences(X_train['text'])
train_padded = pad_sequences(train_seq, maxlen=max_length, padding=padding_type, truncating=trunc_type)
validation_seq = tokenizer.texts_to_sequences(X_test['text'])
validation_padded = pad_sequences(validation_seq, maxlen=max_length, padding=padding_type, truncating=trunc_type)
print('Shape of train data tensor:', train_padded.shape)
print('Shape of validation data tensor:', validation_padded.shape)
Output:
Shape of train data tensor: (44325, 200)
Shape of validation data tensor: (5466, 200)
Here's the encoding section:
encode = OneHotEncoder()
training_labels = encode.fit_transform(y_train)
validation_labels = encode.transform(y_test)
training_labels = training_labels.toarray()
validation_labels = validation_labels.toarray()
Model:
model = Sequential()
model.add(Embedding(vocab_size, embedding_dim, input_length=train_padded.shape[1]))
model.add(Conv1D(48, len(GROUPS), activation='relu', padding='valid'))
model.add(GlobalMaxPooling1D())
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dropout(0.5))
model.add(Dense(len(GROUPS), activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
epochs = 100
batch_size = 32
history = model.fit(train_padded, training_labels, shuffle=True ,
                    epochs=epochs, batch_size=batch_size,
                    validation_split=0.2,
                    validation_data=(validation_padded, validation_labels),
                    callbacks=[ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=5, min_lr=0.0001),
                               EarlyStopping(monitor='val_loss', mode='min', patience=2, verbose=1),
                               EarlyStopping(monitor='val_accuracy', mode='max', patience=5, verbose=1)])
Model output:
Epoch 1/100
1109/1109 [==============================] - 88s 79ms/step - loss: 1.2021 - accuracy: 0.5235 - val_loss: 0.8374 - val_accuracy: 0.7232
Epoch 2/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.9505 - accuracy: 0.6645 - val_loss: 0.7488 - val_accuracy: 0.7461
Epoch 3/100
1109/1109 [==============================] - 86s 77ms/step - loss: 0.8378 - accuracy: 0.7058 - val_loss: 0.6686 - val_accuracy: 0.7663
Epoch 4/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.7391 - accuracy: 0.7382 - val_loss: 0.6134 - val_accuracy: 0.7891
Epoch 5/100
1109/1109 [==============================] - 89s 80ms/step - loss: 0.6763 - accuracy: 0.7546 - val_loss: 0.5832 - val_accuracy: 0.7997
Epoch 6/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.6185 - accuracy: 0.7760 - val_loss: 0.5529 - val_accuracy: 0.8050
Epoch 7/100
1109/1109 [==============================] - 87s 79ms/step - loss: 0.5737 - accuracy: 0.7912 - val_loss: 0.5311 - val_accuracy: 0.8153
Epoch 8/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.5226 - accuracy: 0.8080 - val_loss: 0.5268 - val_accuracy: 0.8226
Epoch 9/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.4955 - accuracy: 0.8171 - val_loss: 0.5142 - val_accuracy: 0.8285
Epoch 10/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.4665 - accuracy: 0.8265 - val_loss: 0.5035 - val_accuracy: 0.8338
Epoch 11/100
1109/1109 [==============================] - 88s 79ms/step - loss: 0.4410 - accuracy: 0.8348 - val_loss: 0.5082 - val_accuracy: 0.8399
Epoch 12/100
1109/1109 [==============================] - 88s 80ms/step - loss: 0.4190 - accuracy: 0.8407 - val_loss: 0.5160 - val_accuracy: 0.8414
Epoch 00012: early stopping
... and here is the last part, which I'm unsure of:
def evaluate_preds(y_true, y_preds):
    """
    Performs evaluation comparison on y_true labels vs. y_preds labels
    on a classification.
    """
    accuracy = accuracy_score(y_true, y_preds)
    precision = precision_score(y_true, y_preds, average='micro')
    recall = recall_score(y_true, y_preds, average='micro')
    f1 = f1_score(y_true, y_preds, average='micro')
    metric_dict = {"accuracy": round(accuracy, 2),
                   "precision": round(precision, 2),
                   "recall": round(recall, 2),
                   "f1": round(f1, 2)}
    print(f"Acc: {accuracy * 100:.2f}%")
    print(f"Precision: {precision:.2f}")
    print(f"Recall: {recall:.2f}")
    print(f"F1 score: {f1:.2f}")
    return metric_dict
predicted = model.predict(validation_padded)
evaluate_preds(np.argmax(validation_labels, axis=1), np.argmax(predicted, axis=1))
Output:
Acc: 40.16%
Precision: 0.40
Recall: 0.40
F1 score: 0.40
I can't understand what I am doing wrong.
How come the accuracy from the last method is so low compared to val_accuracy?
Here is how I loaded the data, which consists of 2 folders of images:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    seed=123,
    image_size=(dim, dim))
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="validation",
    seed=123,
    image_size=(dim, dim))
Loading the data from the folder prints:
Found 6457 files belonging to 2 classes.
Using 5812 files for training.
Found 6457 files belonging to 2 classes.
Using 645 files for validation.
Here is how I trained my model:
model = tf.keras.models.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss=tf.losses.BinaryCrossentropy(from_logits=True), optimizer="adam", metrics=["accuracy"])
es = EarlyStopping(monitor='val_accuracy', min_delta=0.1, patience=5)
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epc,
    callbacks=[es])
Here is how I got the results:
y_pred = model.predict(val_ds)
predicted_categories = tf.argmax(y_pred, axis=1)
true_categories = tf.concat([y for x, y in val_ds], axis=0)
print(classification_report(true_categories, predicted_categories ))
The contradicting outputs are:
Epoch 1/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6617 - accuracy: 0.5139 - val_loss: 0.6466 - val_accuracy: 0.3442
Epoch 2/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6613 - accuracy: 0.5712 - val_loss: 0.6460 - val_accuracy: 0.6558
Epoch 3/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6611 - accuracy: 0.5594 - val_loss: 0.6474 - val_accuracy: 0.3442
Epoch 4/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6315 - accuracy: 0.6504 - val_loss: 0.4623 - val_accuracy: 0.9690
Epoch 5/100
182/182 [==============================] - 8s 46ms/step - loss: 0.4780 - accuracy: 0.9554 - val_loss: 0.4597 - val_accuracy: 0.9690
Epoch 6/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4831 - accuracy: 0.9434 - val_loss: 0.4517 - val_accuracy: 0.9845
Epoch 7/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4720 - accuracy: 0.9658 - val_loss: 0.4546 - val_accuracy: 0.9736
Epoch 8/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4719 - accuracy: 0.9652 - val_loss: 0.4507 - val_accuracy: 0.9860
Epoch 9/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4747 - accuracy: 0.9597 - val_loss: 0.4528 - val_accuracy: 0.9814
              precision    recall  f1-score   support
           0       0.34      1.00      0.51       222
           1       0.00      0.00      0.00       423
    accuracy                           0.34       645
   macro avg       0.17      0.50      0.26       645
weighted avg       0.12      0.34      0.18       645
Also, I get a different answer every time I execute it.
Can someone please explain why the classification report has an accuracy of 34% while the model's val_accuracy is 0.94?
The tf.keras.preprocessing.image_dataset_from_directory method has a parameter called label_mode whose default value is 'int', which is suitable for sparse_categorical_crossentropy and similar losses. For binary classification it should be changed to label_mode='binary'.
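To make the difference in label layout concrete, here is a small NumPy sketch. It assumes, as I understand the Keras documentation, that label_mode='int' yields integer labels of shape (batch,) while label_mode='binary' yields float32 labels of shape (batch, 1). The label values are made up for illustration and no dataset is loaded.

```python
import numpy as np

# Hypothetical integer labels, laid out like label_mode='int': shape (batch,)
int_labels = np.array([0, 1, 1, 0], dtype=np.int32)

# The same labels in the layout assumed for label_mode='binary':
# float32 values of shape (batch, 1), matching a Dense(1) sigmoid output
binary_labels = int_labels.astype(np.float32).reshape(-1, 1)

print(int_labels.shape)     # (4,)
print(binary_labels.shape)  # (4, 1)
```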
The contradiction is here:
tf.keras.layers.Dense(1, activation='sigmoid')
predicted_categories = tf.argmax(y_pred, axis=1)
With a single sigmoid unit, each prediction has shape (1,). Taking argmax over that axis always returns index 0 because there is only one element, so every sample ends up assigned to class 0. With sigmoid you need to apply a threshold instead. Sigmoid squeezes outputs into the range [0, 1], so you can do:
predicted_categories = [1 * (x[0]>=0.5) for x in y_pred]
or using numpy:
predicted_categories = np.where(y_pred > 0.5, 1, 0)
where 0.5 is the threshold.
This means that if the predicted value is greater than 0.5, the sample is assigned to the second class. You can adjust the threshold depending on your needs.
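A minimal sketch of the difference, using made-up sigmoid outputs rather than the real model's predictions (the y_pred values below are hypothetical):

```python
import numpy as np

# Hypothetical sigmoid outputs for 4 samples, shape (4, 1)
y_pred = np.array([[0.91], [0.08], [0.55], [0.30]])

# argmax over an axis of length 1 can only ever return index 0
argmax_classes = np.argmax(y_pred, axis=1)

# Thresholding at 0.5 recovers the intended class labels
threshold_classes = (y_pred > 0.5).astype(int).ravel()

print(argmax_classes)     # [0 0 0 0]
print(threshold_classes)  # [1 0 1 0]
```

The all-zero argmax result matches the classification report in the question, where class 0 has a recall of 1.00 and class 1 a recall of 0.00.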
The CNN model classifies between two classes, with 5974 training samples and 1987 validation samples.
I am using datagen.flow_from_directory, and my model will predict on a separate test set. I am running the code for 200 epochs in Google Colab, but after 5 epochs the training and validation accuracy stop improving.
Accuracy
Epoch 45/200
186/186 [==============================] - 138s 744ms/step - loss: 0.6931 - acc: 0.4983 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 46/200
186/186 [==============================] - 137s 737ms/step - loss: 0.6931 - acc: 0.4990 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 47/200
186/186 [==============================] - 142s 761ms/step - loss: 0.6931 - acc: 0.4987 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 48/200
186/186 [==============================] - 140s 752ms/step - loss: 0.6931 - acc: 0.4993 - val_loss: 0.6931 - val_acc: 0.5005
Epoch 49/200
186/186 [==============================] - 139s 745ms/step - loss: 0.6931 - acc: 0.4976 - val_loss: 0.6931 - val_acc: 0.5010
Epoch 50/200
186/186 [==============================] - 143s 768ms/step - loss: 0.6931 - acc: 0.4992 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 51/200
186/186 [==============================] - 140s 755ms/step - loss: 0.6931 - acc: 0.4980 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 52/200
186/186 [==============================] - 141s 758ms/step - loss: 0.6931 - acc: 0.4990 - val_loss: 0.6931 - val_acc: 0.4995
Epoch 53/200
186/186 [==============================] - 141s 759ms/step - loss: 0.6931 - acc: 0.4985 - val_loss: 0.6931 - val_acc: 0.5000
Epoch 54/200
186/186 [==============================] - 143s 771ms/step - loss: 0.6931 - acc: 0.4987 - val_loss: 0.6931 - val_acc: 0.4995
Epoch 55/200
186/186 [==============================] - 143s 771ms/step - loss: 0.6931 - acc: 0.4992 - val_loss: 0.6931 - val_acc: 0.5005
train_data_path = "/content/drive/My Drive/snk_tod/train"
valid_data_path = "/content/drive/My Drive/snk_tod/valid"
test_data_path = "/content/drive/My Drive/snk_tod/test"
img_rows = 100
img_cols = 100
epochs = 200
print(epochs)
batch_size = 32
num_of_train_samples = 5974
num_of_valid_samples = 1987
#Image Generator
train_datagen = ImageDataGenerator(rescale=1. / 255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
valid_datagen = ImageDataGenerator(rescale=1. / 255)
test_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(train_data_path,
                                                    target_size=(img_rows, img_cols),
                                                    batch_size=batch_size,
                                                    shuffle=True,
                                                    class_mode='categorical')
validation_generator = valid_datagen.flow_from_directory(valid_data_path,
                                                         target_size=(img_rows, img_cols),
                                                         batch_size=batch_size,
                                                         shuffle=True,
                                                         class_mode='categorical')
test_generator = test_datagen.flow_from_directory(test_data_path,
                                                  target_size=(img_rows, img_cols),
                                                  batch_size=batch_size,
                                                  shuffle=False,
                                                  class_mode='categorical')
model = Sequential()
model.add(Conv2D((32), (3, 3), input_shape=(img_rows, img_cols, 3), kernel_initializer="glorot_uniform", bias_initializer="zeros"))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D((32), (3, 3),kernel_initializer="glorot_uniform", bias_initializer="zeros"))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D((64), (3, 3),kernel_initializer="glorot_uniform", bias_initializer="zeros"))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Conv2D((64), (3, 3),kernel_initializer="glorot_uniform", bias_initializer="zeros"))
model.add(Activation('relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.5))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dropout(0.5))
model.add(Dense(512))
model.add(Dense(2))
model.add(Activation('sigmoid'))
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])
#Train
history=model.fit_generator(train_generator,
                            steps_per_epoch=num_of_train_samples // batch_size,
                            epochs=epochs,
                            validation_data=validation_generator,
                            validation_steps=num_of_valid_samples // batch_size)
I have built a TensorFlow model, and my validation accuracy does not change across epochs, which makes me believe there is something wrong in my setup. Below is my code.
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import regularizers
import tensorflow as tf
model = Sequential()
model.add(Conv2D(16, (3, 3), input_shape=(299, 299,3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(32, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(64, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
model.add(Conv2D(64, (3, 3),padding='same'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2),padding='same'))
# this converts our 3D feature maps to 1D feature vectors
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
batch_size=32
# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    # shear_range=0.2,
    # zoom_range=0.2,
    horizontal_flip=True)
# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)
# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
    'Documents/Training',  # this is the target directory
    target_size=(299, 299),  # all images will be resized to 299x299
    batch_size=batch_size,
    class_mode='binary')  # since we use binary_crossentropy loss, we need binary labels
# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
    'Documents/Dev',
    target_size=(299, 299),
    batch_size=batch_size,
    class_mode='binary')
#w1 = tf.Variable(tf.truncated_normal([784, 30], stddev=0.1))
model.fit_generator(
    train_generator,
    steps_per_epoch=50 // batch_size,
    verbose = 1,
    epochs=10,
    validation_data=validation_generator,
    validation_steps=8 // batch_size)
Running this produces the following output. Am I missing anything in my architecture or data generation steps? I have consulted "Tensorflow model accuracy not increasing" and "accuracy not increasing in tensorflow model", to no avail so far.
Epoch 1/10
3/3 [==============================] - 2s 593ms/step - loss: 0.6719 - accuracy: 0.6250 - val_loss: 0.8198 - val_accuracy: 0.5000
Epoch 2/10
3/3 [==============================] - 2s 607ms/step - loss: 0.6521 - accuracy: 0.6667 - val_loss: 0.8518 - val_accuracy: 0.5000
Epoch 3/10
3/3 [==============================] - 2s 609ms/step - loss: 0.6752 - accuracy: 0.6250 - val_loss: 0.7129 - val_accuracy: 0.5000
Epoch 4/10
3/3 [==============================] - 2s 611ms/step - loss: 0.6841 - accuracy: 0.6250 - val_loss: 0.7010 - val_accuracy: 0.5000
Epoch 5/10
3/3 [==============================] - 2s 608ms/step - loss: 0.6977 - accuracy: 0.5417 - val_loss: 0.6551 - val_accuracy: 0.5000
Epoch 6/10
3/3 [==============================] - 2s 607ms/step - loss: 0.6508 - accuracy: 0.7083 - val_loss: 0.5752 - val_accuracy: 0.5000
Epoch 7/10
3/3 [==============================] - 2s 615ms/step - loss: 0.6596 - accuracy: 0.6875 - val_loss: 0.9326 - val_accuracy: 0.5000
Epoch 8/10
3/3 [==============================] - 2s 604ms/step - loss: 0.7022 - accuracy: 0.6458 - val_loss: 0.6976 - val_accuracy: 0.5000
Epoch 9/10
3/3 [==============================] - 2s 591ms/step - loss: 0.6331 - accuracy: 0.7292 - val_loss: 0.9571 - val_accuracy: 0.5000
Epoch 10/10
3/3 [==============================] - 2s 595ms/step - loss: 0.6085 - accuracy: 0.7292 - val_loss: 0.6029 - val_accuracy: 0.5000
Out[24]: <keras.callbacks.callbacks.History at 0x1ee4e3a8f08>
You are setting the training steps per epoch to 50 // 32 = 1. So do you only have 50 training images? Similarly, for validation you have steps = 8 // 32 = 0. Do you have only 8 validation images? When you run the program, how many images do the training and validation generators report finding? You will need more images than that. Try setting your batch size to 1.
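The arithmetic can be checked with a short sketch. The sample counts 50 and 8 are the values hard-coded in the question, not necessarily the real dataset sizes. The helper names steps_floor and steps_ceil are hypothetical, and rounding up is shown only as one common alternative, not something this answer prescribes.

```python
import math

def steps_floor(num_samples, batch_size):
    # Integer division, as in the question: drops the final partial batch
    return num_samples // batch_size

def steps_ceil(num_samples, batch_size):
    # Rounds up so a final partial batch still counts as one step
    return math.ceil(num_samples / batch_size)

batch_size = 32
print(steps_floor(50, batch_size), steps_floor(8, batch_size))  # 1 0
print(steps_ceil(50, batch_size), steps_ceil(8, batch_size))    # 2 1
```

With a batch size of 1, as suggested above, both versions give 50 training steps and 8 validation steps.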
I've played around a lot with the setup of my architecture (the number of layers, pooling size, dropout, etc.), but I always end up in the same ballpark: ~96-98% accuracy and a loss between 2-6%.
I'm training on a dataset of 78000 images with 26 classes (letters of the alphabet) and 3000 images per class. I am satisfied with the accuracy, but are there any suggestions to reduce the loss? Also, how much loss do you think is too much? Is there anything else I should add to the model to make it more robust? The code for the model is shown below:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(200, 200, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Conv2D(64, (3, 3), activation='relu', input_shape=(200, 200, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Conv2D(64, (3, 3), activation='relu', input_shape=(200, 200, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(26, activation='softmax'))
Compile and Fit:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=64, verbose=1, validation_data=(X_test, y_test))
Output:
Train on 54600 samples, validate on 23400 samples
Epoch 1/10
54600/54600 [==============================] - 68s 1ms/step - loss: 1.1217 - accuracy: 0.6622 - val_loss: 0.9795 - val_accuracy: 0.7394
Epoch 2/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.2377 - accuracy: 0.9219 - val_loss: 0.2300 - val_accuracy: 0.9277
Epoch 3/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.1184 - accuracy: 0.9627 - val_loss: 0.2746 - val_accuracy: 0.9286
Epoch 4/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.0755 - accuracy: 0.9761 - val_loss: 0.1850 - val_accuracy: 0.9517
Epoch 5/10
54600/54600 [==============================] - 69s 1ms/step - loss: 0.0669 - accuracy: 0.9801 - val_loss: 0.2044 - val_accuracy: 0.9450
Epoch 6/10
54600/54600 [==============================] - 69s 1ms/step - loss: 0.0520 - accuracy: 0.9848 - val_loss: 0.2265 - val_accuracy: 0.9485
Epoch 7/10
54600/54600 [==============================] - 72s 1ms/step - loss: 0.0481 - accuracy: 0.9865 - val_loss: 0.1709 - val_accuracy: 0.9559
Epoch 8/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0370 - accuracy: 0.9905 - val_loss: 0.1534 - val_accuracy: 0.9659
Epoch 9/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0335 - accuracy: 0.9912 - val_loss: 0.1181 - val_accuracy: 0.9703
Epoch 10/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0277 - accuracy: 0.9921 - val_loss: 0.1204 - val_accuracy: 0.9704