I am building a CNN model to classify images into 11 classes: 'dew', 'fogsmog', 'frost', 'glaze', 'hail', 'lightning', 'rain', 'rainbow', 'rime', 'sandstorm', 'snow'.
During training I get good accuracy and good validation accuracy:
Epoch 1/20
131/131 [==============================] - 1012s 8s/step - loss: 1.8284 - accuracy: 0.3724 - val_loss: 1.4365 - val_accuracy: 0.5719
Epoch 2/20
131/131 [==============================] - 67s 511ms/step - loss: 1.3041 - accuracy: 0.5516 - val_loss: 1.1048 - val_accuracy: 0.6515
Epoch 3/20
131/131 [==============================] - 67s 510ms/step - loss: 1.1547 - accuracy: 0.6161 - val_loss: 1.0509 - val_accuracy: 0.6732
Epoch 4/20
131/131 [==============================] - 67s 510ms/step - loss: 1.0681 - accuracy: 0.6394 - val_loss: 1.0644 - val_accuracy: 0.6616
Epoch 5/20
131/131 [==============================] - 66s 505ms/step - loss: 1.0269 - accuracy: 0.6509 - val_loss: 1.0929 - val_accuracy: 0.6363
Epoch 6/20
131/131 [==============================] - 66s 506ms/step - loss: 1.0018 - accuracy: 0.6576 - val_loss: 0.9666 - val_accuracy: 0.6869
Epoch 7/20
131/131 [==============================] - 67s 507ms/step - loss: 0.9384 - accuracy: 0.6790 - val_loss: 0.8623 - val_accuracy: 0.7144
Epoch 8/20
131/131 [==============================] - 66s 505ms/step - loss: 0.9160 - accuracy: 0.6903 - val_loss: 0.8834 - val_accuracy: 0.7180
Epoch 9/20
131/131 [==============================] - 66s 502ms/step - loss: 0.8909 - accuracy: 0.6915 - val_loss: 0.8667 - val_accuracy: 0.7050
Epoch 10/20
131/131 [==============================] - 66s 503ms/step - loss: 0.8476 - accuracy: 0.7075 - val_loss: 0.8100 - val_accuracy: 0.7339
Epoch 11/20
131/131 [==============================] - 67s 509ms/step - loss: 0.8108 - accuracy: 0.7262 - val_loss: 0.8352 - val_accuracy: 0.7137
Epoch 12/20
131/131 [==============================] - 66s 506ms/step - loss: 0.7922 - accuracy: 0.7212 - val_loss: 0.8368 - val_accuracy: 0.7195
Epoch 13/20
131/131 [==============================] - 66s 505ms/step - loss: 0.7424 - accuracy: 0.7442 - val_loss: 0.8813 - val_accuracy: 0.7166
Epoch 14/20
131/131 [==============================] - 66s 503ms/step - loss: 0.7060 - accuracy: 0.7579 - val_loss: 0.8453 - val_accuracy: 0.7231
Epoch 15/20
131/131 [==============================] - 66s 503ms/step - loss: 0.6767 - accuracy: 0.7584 - val_loss: 0.8347 - val_accuracy: 0.7151
Epoch 16/20
131/131 [==============================] - 66s 506ms/step - loss: 0.6692 - accuracy: 0.7632 - val_loss: 0.8038 - val_accuracy: 0.7346
Epoch 17/20
131/131 [==============================] - 67s 507ms/step - loss: 0.6308 - accuracy: 0.7718 - val_loss: 0.7956 - val_accuracy: 0.7455
Epoch 18/20
131/131 [==============================] - 67s 508ms/step - loss: 0.6043 - accuracy: 0.7901 - val_loss: 0.8295 - val_accuracy: 0.7477
Epoch 19/20
131/131 [==============================] - 66s 506ms/step - loss: 0.5632 - accuracy: 0.8018 - val_loss: 0.7918 - val_accuracy: 0.7455
Epoch 20/20
131/131 [==============================] - 67s 510ms/step - loss: 0.5368 - accuracy: 0.8138 - val_loss: 0.7798 - val_accuracy: 0.7549
But when I predict on the test images and submit my results, I get very low accuracy.
Here is my model:
from keras.preprocessing.image import ImageDataGenerator
IMG_SIZE = 50
datagen = ImageDataGenerator(
    rescale=1./255,
    validation_split=0.25)
train_dataset = datagen.flow_from_directory(
    directory=Train_folder,
    shuffle=True,
    target_size=(50,50),
    subset="training",
    classes=['dew','fogsmog','frost','glaze','hail','lightning','rain','rainbow','rime','sandstorm','snow'],
    class_mode='categorical')
validation_dataset = datagen.flow_from_directory(
    directory=Train_folder,
    shuffle=True,
    target_size=(50,50),
    subset="validation",
    classes=['dew','fogsmog','frost','glaze','hail','lightning','rain','rainbow','rime','sandstorm','snow'],
    class_mode='categorical')
Found 4168 images belonging to 11 classes.
Found 1383 images belonging to 11 classes.
model = Sequential([
    layers.Conv2D(32, kernel_size=(3, 3), activation="relu", padding='same', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    layers.MaxPooling2D((2, 2), padding='same'),
    layers.Dropout(0.25),
    layers.Conv2D(64, (3, 3), activation="relu", padding='same'),
    layers.MaxPooling2D(pool_size=(2, 2), padding='same'),
    layers.Dropout(0.25),
    layers.Conv2D(128, (3, 3), activation="relu", padding='same'),
    layers.MaxPooling2D(pool_size=(2, 2), padding='same'),
    layers.Dropout(0.4),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.3),
    layers.Dense(11, activation='softmax')
])
model.build()
model.summary()
model.compile(optimizer='adam',
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=['accuracy'])
history = model.fit(
    train_dataset,
    epochs=20,
    validation_data=validation_dataset,
)
model.save('model.tfl')
import os
import cv2
import numpy as np
from tqdm import tqdm

Test_folder="/content/drive/MyDrive/[NN'22] Project Dataset/Test"
test_data = []
labels = []
for img in tqdm(os.listdir(Test_folder)):
    path = os.path.join(Test_folder, img)
    img_data2 = cv2.imread(path)
    try:
        img_data2 = cv2.resize(img_data2, (IMG_SIZE,IMG_SIZE))
    except Exception:
        # skip files that cannot be read or resized as images
        continue
    test_data.append([np.array(img_data2)])
    labels.append(img)
X_data=np.array([test_data]).reshape(-1, IMG_SIZE, IMG_SIZE, 3)
prediction = model.predict([X_data])
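One difference between the two pipelines is visible in the code above: the training generators rescale pixels by 1/255 and load images as RGB by default, while the test loop passes raw 0-255 arrays from cv2.imread, which returns channels in BGR order. Below is a minimal NumPy-only sketch of making the test inputs match. The helper names preprocess_like_training and decode_predictions are hypothetical, the arrays are stand-ins for real images and real model output, and it is an assumption (not confirmed in the post) that this mismatch explains the low submitted accuracy.

```python
import numpy as np

# Same order as the `classes` list passed to flow_from_directory, which is
# what should determine the index of each class in the softmax output.
CLASS_NAMES = ['dew', 'fogsmog', 'frost', 'glaze', 'hail', 'lightning',
               'rain', 'rainbow', 'rime', 'sandstorm', 'snow']


def preprocess_like_training(bgr_images):
    """Convert uint8 BGR images (N, H, W, 3) to float32 RGB in [0, 1]."""
    batch = np.asarray(bgr_images)
    rgb = batch[..., ::-1]                  # reverse channel axis: BGR -> RGB
    return rgb.astype("float32") / 255.0    # same scaling as rescale=1./255


def decode_predictions(probabilities, class_names=CLASS_NAMES):
    """Map each row of softmax outputs to the most probable class name."""
    indices = np.argmax(probabilities, axis=1)
    return [class_names[i] for i in indices]


# Stand-in for two images read with cv2.imread and resized to 50x50.
fake_bgr = np.zeros((2, 50, 50, 3), dtype=np.uint8)
fake_bgr[..., 0] = 255                      # pure blue in BGR order
X_data = preprocess_like_training(fake_bgr)

# Stand-in for model.predict(X_data): two rows of 11 scores.
fake_probs = np.full((2, 11), 0.05, dtype="float32")
fake_probs[0, 10] = 0.5
fake_probs[1, 4] = 0.5
predicted = decode_predictions(fake_probs)
```

With the real data, the array built in the test loop would be passed through preprocess_like_training before model.predict, and the resulting probabilities through decode_predictions.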
I am training a CNN model using Keras on Google Colab for binary image classification. The problem is that when I evaluate the model on a test dataset divided into 2 classes, the accuracy is fixed at 0.5000. When I use a test dataset that is not divided into 2 classes, this does not happen and I get an accuracy of 0.9167.
My code:
modelo.add(Conv2D(32, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same', input_shape=(551, 1117, 3)))
modelo.add(MaxPooling2D((2, 2)))
modelo.add(Conv2D(64, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
modelo.add(MaxPooling2D((2, 2)))
modelo.add(Conv2D(128, (3, 3), activation='relu', kernel_initializer='he_uniform', padding='same'))
modelo.add(MaxPooling2D((2, 2)))
modelo.add(Flatten())
modelo.add(Dense(128, activation='relu', kernel_initializer='he_uniform'))
modelo.add(Dense(1, activation='sigmoid'))
modelo.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
hist = modelo.fit(traning_generator, validation_data=validation_generator, epochs=10, callbacks=callbacks)
Training result:
Epoch 1/10
13/13 [==============================] - 33s 3s/step - loss: 136.5746 - accuracy: 0.5300 - val_loss: 23.4487 - val_accuracy: 0.4068
Epoch 2/10
13/13 [==============================] - 27s 2s/step - loss: 5.3578 - accuracy: 0.4675 - val_loss: 0.7363 - val_accuracy: 0.4068
Epoch 3/10
13/13 [==============================] - 28s 2s/step - loss: 0.6870 - accuracy: 0.5925 - val_loss: 0.7120 - val_accuracy: 0.5932
Epoch 4/10
13/13 [==============================] - 18s 1s/step - loss: 0.5529 - accuracy: 0.7225 - val_loss: 0.8240 - val_accuracy: 0.3898
Epoch 5/10
13/13 [==============================] - 18s 1s/step - loss: 0.4633 - accuracy: 0.7750 - val_loss: 1.1202 - val_accuracy: 0.4322
Epoch 6/10
13/13 [==============================] - 19s 1s/step - loss: 0.5213 - accuracy: 0.7675 - val_loss: 1.4779 - val_accuracy: 0.4407
Epoch 7/10
13/13 [==============================] - 17s 1s/step - loss: 0.1730 - accuracy: 0.9550 - val_loss: 1.8047 - val_accuracy: 0.4492
Epoch 8/10
13/13 [==============================] - 17s 1s/step - loss: 0.0887 - accuracy: 0.9925 - val_loss: 2.4989 - val_accuracy: 0.4831
Epoch 9/10
13/13 [==============================] - 17s 1s/step - loss: 0.0318 - accuracy: 1.0000 - val_loss: 3.7380 - val_accuracy: 0.4407
Epoch 10/10
13/13 [==============================] - 17s 1s/step - loss: 0.0070 - accuracy: 1.0000 - val_loss: 4.7144 - val_accuracy: 0.4492
When testing on the dataset without class division I get:
test_loss, test_acc = modelo.evaluate(test_dataset)
1/1 [==============================] - 1s 577ms/step - loss: 0.6590 - accuracy: 0.9167
And when testing on the test dataset with 2 classes:
2/2 [==============================] - 3s 653ms/step - loss: 1.2334 - accuracy: 0.5000
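One possible reading of these two numbers, offered as an assumption rather than a diagnosis: if the undivided test folder is loaded with a single label for every image, accuracy only measures how often the model outputs that one class, so a model that predicts the same class for almost everything can score high there and exactly 0.5 on a balanced two-class set. The sketch below reproduces that arithmetic with made-up labels and predictions; none of the arrays come from the post.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the labels."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))


# Hypothetical case 1: 12 images in one folder, so every label is 0,
# and the model predicts class 0 for 11 of them.
single_folder_labels = np.zeros(12, dtype=int)
single_folder_preds = np.array([0] * 11 + [1])
acc_single = accuracy(single_folder_labels, single_folder_preds)

# Hypothetical case 2: a balanced two-class test set, and the model
# predicts class 0 for every image.
balanced_labels = np.array([0] * 6 + [1] * 6)
balanced_preds = np.zeros(12, dtype=int)
acc_balanced = accuracy(balanced_labels, balanced_preds)

print(f"single-label folder: {acc_single:.4f}")
print(f"balanced two-class set: {acc_balanced:.4f}")
```

If this reading applies, the two-class result is the meaningful one, and the training log (training accuracy reaching 1.0 while validation accuracy stays below 0.5) points in the same direction.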
I am fairly new to deep learning and right now am trying to predict consumer choices based on EEG data. The total dataset consists of 1045 EEG recordings each with a corresponding label, indicating Like or Dislike for a product. Classes are distributed as follows (44% Likes and 56% Dislikes). I read that Convolutional Neural Networks are suitable to work with raw EEG data so I tried to implement a network based on keras with the following structure:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(full_data, target, test_size=0.20, random_state=42)
y_train = np.asarray(y_train).astype('float32').reshape((-1,1))
y_test = np.asarray(y_test).astype('float32').reshape((-1,1))
# X_train.shape = ((836, 512, 14))
# y_train.shape = ((836, 1))
from keras.optimizers import Adam
from keras.optimizers import SGD
from keras.layers import MaxPooling1D
model = Sequential()
model.add(Conv1D(16, kernel_size=3, activation="relu", input_shape=(512,14)))
model.add(MaxPooling1D())
model.add(Conv1D(8, kernel_size=3, activation="relu"))
model.add(MaxPooling1D())
model.add(Flatten())
model.add(Dense(1, activation="sigmoid"))
model.compile(optimizer=Adam(lr = 0.001), loss='binary_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=20, batch_size = 64)
When I fit the model, however, the validation accuracy does not change at all, as the following output shows:
Epoch 1/20
14/14 [==============================] - 0s 32ms/step - loss: 292.6353 - accuracy: 0.5383 - val_loss: 0.7884 - val_accuracy: 0.5407
Epoch 2/20
14/14 [==============================] - 0s 7ms/step - loss: 1.3748 - accuracy: 0.5598 - val_loss: 0.8860 - val_accuracy: 0.5502
Epoch 3/20
14/14 [==============================] - 0s 6ms/step - loss: 1.0537 - accuracy: 0.5598 - val_loss: 0.7629 - val_accuracy: 0.5455
Epoch 4/20
14/14 [==============================] - 0s 6ms/step - loss: 0.8827 - accuracy: 0.5598 - val_loss: 0.7010 - val_accuracy: 0.5455
Epoch 5/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7988 - accuracy: 0.5598 - val_loss: 0.8689 - val_accuracy: 0.5407
Epoch 6/20
14/14 [==============================] - 0s 6ms/step - loss: 1.0221 - accuracy: 0.5610 - val_loss: 0.6961 - val_accuracy: 0.5455
Epoch 7/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7415 - accuracy: 0.5598 - val_loss: 0.6945 - val_accuracy: 0.5455
Epoch 8/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7381 - accuracy: 0.5574 - val_loss: 0.7761 - val_accuracy: 0.5455
Epoch 9/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7326 - accuracy: 0.5598 - val_loss: 0.6926 - val_accuracy: 0.5455
Epoch 10/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7338 - accuracy: 0.5598 - val_loss: 0.6917 - val_accuracy: 0.5455
Epoch 11/20
14/14 [==============================] - 0s 7ms/step - loss: 0.7203 - accuracy: 0.5610 - val_loss: 0.6916 - val_accuracy: 0.5455
Epoch 12/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7192 - accuracy: 0.5610 - val_loss: 0.6914 - val_accuracy: 0.5455
Epoch 13/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7174 - accuracy: 0.5610 - val_loss: 0.6912 - val_accuracy: 0.5455
Epoch 14/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7155 - accuracy: 0.5610 - val_loss: 0.6911 - val_accuracy: 0.5455
Epoch 15/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7143 - accuracy: 0.5610 - val_loss: 0.6910 - val_accuracy: 0.5455
Epoch 16/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7129 - accuracy: 0.5610 - val_loss: 0.6909 - val_accuracy: 0.5455
Epoch 17/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7114 - accuracy: 0.5610 - val_loss: 0.6907 - val_accuracy: 0.5455
Epoch 18/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7103 - accuracy: 0.5610 - val_loss: 0.6906 - val_accuracy: 0.5455
Epoch 19/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7088 - accuracy: 0.5610 - val_loss: 0.6906 - val_accuracy: 0.5455
Epoch 20/20
14/14 [==============================] - 0s 6ms/step - loss: 0.7075 - accuracy: 0.5610 - val_loss: 0.6905 - val_accuracy: 0.5455
Thanks in advance for any insights!
The phenomenon you are running into is called underfitting. It happens when the amount or quality of your training data is insufficient, or when your network architecture is too small to learn the problem.
Try normalizing your input data and experiment with different network architectures, learning rates and activation functions.
As @Muhammad Shahzad stated in his comment, adding some Dense layers after flattening would be a concrete architecture change you should try.
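As an illustration of the normalization suggested above, here is a minimal NumPy sketch that standardizes each of the 14 EEG channels using statistics computed on the training split only. The function names are hypothetical, the random arrays merely stand in for the real recordings (their shape mimics the (samples, 512, 14) layout from the question, and the value range is invented), and whether per-channel standardization is the best choice for this data is an assumption.

```python
import numpy as np


def fit_channel_scaler(x_train):
    """Per-channel mean and std over samples and time steps (axes 0 and 1)."""
    mean = x_train.mean(axis=(0, 1), keepdims=True)
    std = x_train.std(axis=(0, 1), keepdims=True)
    std = np.where(std == 0, 1.0, std)      # avoid division by zero
    return mean, std


def apply_channel_scaler(x, mean, std):
    """Standardize x with statistics that were fitted on the training set."""
    return ((x - mean) / std).astype("float32")


# Stand-in data with large raw values (hypothetical magnitudes).
rng = np.random.default_rng(42)
X_train = rng.normal(4000.0, 300.0, size=(80, 512, 14))
X_test = rng.normal(4000.0, 300.0, size=(20, 512, 14))

mean, std = fit_channel_scaler(X_train)
X_train_scaled = apply_channel_scaler(X_train, mean, std)
X_test_scaled = apply_channel_scaler(X_test, mean, std)   # reuse train stats
```

Fitting the statistics on the training split and reusing them for the test split keeps information from the held-out data out of the preprocessing step.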
You can also increase the number of epochs, and you should increase the size of the data set. You can also use:
train_datagen= ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    vertical_flip = True,
    channel_shift_range=0.2,
    fill_mode='nearest'
)
to feed the model more (augmented) data, which I hope will increase the validation accuracy.
This is the model accuracy in train and validation
This is the model loss
Batch generator:
train_batches = ImageDataGenerator(rotation_range=8).flow_from_directory(train_path,target_size=(224,224), classes=['Covid','Normal','Pneumonia'],batch_size=64)
valid_batches = ImageDataGenerator().flow_from_directory(valid_path,target_size=(224,224), classes=['Covid','Normal','Pneumonia'],batch_size=32)
I'm using a pretrained model:
model=keras.applications.resnet.ResNet50(include_top=False, weights='imagenet', input_tensor=None, input_shape=(224, 224, 3), pooling=None, classes=1000)
Performed regularisation:
regularizer = tf.keras.regularizers.l1(0.0001)
for layer in model.layers:
    for attr in ['kernel_regularizer']:
        if hasattr(layer, attr):
            setattr(layer, attr, regularizer)
Chopped off the last fully connected layers of the pretrained model and added the layers below:
x = AveragePooling2D(pool_size=(4, 4))(last_layer)
x = Flatten(name="flatten")(x)
x = Dense(64, activation="relu",kernel_regularizer=regularizers.l2(0.001))(x)
x = Dropout(0.6)(x)
# x = Dropout(0.6)(x)
out = Dense(3, activation="softmax",name='output_layer')(x)
Froze the upper layers:
for layer in custom_resnet_model.layers[:-7]:
    layer.trainable = False
Using Adam optimizer:
custom_resnet_model.compile(Adam(lr=.0001),loss='binary_crossentropy',metrics=['accuracy'])
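The compile call above pairs a 3-way softmax output with binary_crossentropy. For one-hot labels over three mutually exclusive classes, categorical_crossentropy is the usual match. In some Keras versions, metrics=['accuracy'] combined with binary_crossentropy is also resolved to an element-wise binary accuracy, which can read higher than the per-image accuracy; whether that applies here depends on the installed version, which the post does not state. The NumPy sketch below only illustrates how the two losses and the two accuracy definitions differ on made-up predictions.

```python
import numpy as np


def categorical_accuracy(y_true, y_prob):
    """One decision per sample: is the arg-max class the true class?"""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """One decision per output unit: is each thresholded entry correct?"""
    return float(np.mean((y_prob > threshold) == (y_true > 0.5)))


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    y_prob = np.clip(y_prob, eps, 1 - eps)
    per_unit = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(per_unit))


# Made-up one-hot labels and softmax outputs for four images, three classes.
y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 0, 0]], dtype=float)
y_prob = np.array([[0.60, 0.30, 0.10],    # correct
                   [0.40, 0.35, 0.25],    # wrong: arg-max is class 0
                   [0.20, 0.20, 0.60],    # correct
                   [0.30, 0.45, 0.25]])   # wrong: arg-max is class 1

cat_acc = categorical_accuracy(y_true, y_prob)
bin_acc = binary_accuracy(y_true, y_prob)
cce = categorical_crossentropy(y_true, y_prob)
bce = binary_crossentropy(y_true, y_prob)
```

Here two of four images are classified correctly, yet ten of the twelve individual output entries fall on the right side of 0.5, so the element-wise figure is the more flattering one.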
And the model fit:
history = custom_resnet_model.fit_generator(train_batches, steps_per_epoch=36,
                                            validation_data=valid_batches, validation_steps=18, epochs=25, verbose=2)
As you can see below, towards the end the validation loss is all over the place:
Epoch 1/25
- 67s - loss: 0.7458 - accuracy: 0.7076 - val_loss: 0.7266 - val_accuracy: 0.7584
Epoch 2/25
- 64s - loss: 0.5467 - accuracy: 0.8139 - val_loss: 0.5276 - val_accuracy: 0.8022
Epoch 3/25
- 62s - loss: 0.4723 - accuracy: 0.8543 - val_loss: 0.4393 - val_accuracy: 0.8336
Epoch 4/25
- 62s - loss: 0.4274 - accuracy: 0.8800 - val_loss: 0.6082 - val_accuracy: 0.8384
Epoch 5/25
- 62s - loss: 0.4017 - accuracy: 0.8862 - val_loss: 0.5227 - val_accuracy: 0.8490
Epoch 6/25
- 62s - loss: 0.3698 - accuracy: 0.9004 - val_loss: 0.5691 - val_accuracy: 0.8532
Epoch 7/25
- 63s - loss: 0.3524 - accuracy: 0.9093 - val_loss: 0.4616 - val_accuracy: 0.8425
Epoch 8/25
- 63s - loss: 0.3379 - accuracy: 0.9183 - val_loss: 0.4604 - val_accuracy: 0.8467
Epoch 9/25
- 62s - loss: 0.3206 - accuracy: 0.9248 - val_loss: 0.5499 - val_accuracy: 0.8526
Epoch 10/25
- 61s - loss: 0.3240 - accuracy: 0.9244 - val_loss: 0.4745 - val_accuracy: 0.8526
Epoch 11/25
- 63s - loss: 0.3134 - accuracy: 0.9297 - val_loss: 0.4533 - val_accuracy: 0.8567
Epoch 12/25
- 62s - loss: 0.2995 - accuracy: 0.9337 - val_loss: 0.5668 - val_accuracy: 0.8555
Epoch 13/25
- 63s - loss: 0.2898 - accuracy: 0.9404 - val_loss: 0.6349 - val_accuracy: 0.8603
Epoch 14/25
- 62s - loss: 0.2845 - accuracy: 0.9386 - val_loss: 0.5612 - val_accuracy: 0.8650
Epoch 15/25
- 63s - loss: 0.2961 - accuracy: 0.9330 - val_loss: 0.7284 - val_accuracy: 0.8579
Epoch 16/25
- 64s - loss: 0.2759 - accuracy: 0.9429 - val_loss: 0.4720 - val_accuracy: 0.8650
Epoch 17/25
- 62s - loss: 0.2707 - accuracy: 0.9482 - val_loss: 0.9979 - val_accuracy: 0.8650
Epoch 18/25
- 63s - loss: 0.2744 - accuracy: 0.9416 - val_loss: 0.8098 - val_accuracy: 0.8733
Epoch 19/25
- 63s - loss: 0.2771 - accuracy: 0.9428 - val_loss: 0.1989 - val_accuracy: 0.8662
Epoch 20/25
- 62s - loss: 0.2647 - accuracy: 0.9440 - val_loss: 0.8921 - val_accuracy: 0.8686
Epoch 21/25
- 63s - loss: 0.2566 - accuracy: 0.9478 - val_loss: 0.3362 - val_accuracy: 0.8745
Epoch 22/25
- 62s - loss: 0.2645 - accuracy: 0.9402 - val_loss: 1.2044 - val_accuracy: 0.8662
Epoch 23/25
- 63s - loss: 0.2550 - accuracy: 0.9472 - val_loss: 0.6615 - val_accuracy: 0.8745
Epoch 24/25
- 62s - loss: 0.2486 - accuracy: 0.9519 - val_loss: 0.4722 - val_accuracy: 0.8674
Epoch 25/25
- 62s - loss: 0.2542 - accuracy: 0.9507 - val_loss: 0.8232 - val_accuracy: 0.8721
I have posted the code so that someone can point out if I'm doing something wrong.
Try increasing the size of the validation dataset.
The fluctuation may be caused by an "Unrepresentative Validation Dataset".
Please let me know if it solves your problem.
Unrepresentative Validation Dataset
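One rough way to check whether a validation set is representative is to compare its class proportions with those of the training set. The sketch below does that in plain Python. The helper names and the label counts are hypothetical; with Keras directory generators, the per-image integer labels should be available through the generator's classes attribute, but the numbers here are invented for illustration.

```python
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {name: counts[name] / total for name in sorted(counts)}


def max_proportion_gap(train_labels, valid_labels):
    """Largest absolute difference in class share between the two sets."""
    train_p = class_proportions(train_labels)
    valid_p = class_proportions(valid_labels)
    names = set(train_p) | set(valid_p)
    return max(abs(train_p.get(n, 0.0) - valid_p.get(n, 0.0)) for n in names)


# Hypothetical label lists (not taken from the post).
train_labels = ['Covid'] * 400 + ['Normal'] * 1000 + ['Pneumonia'] * 900
valid_labels = ['Covid'] * 20 + ['Normal'] * 300 + ['Pneumonia'] * 256

train_share = class_proportions(train_labels)
valid_share = class_proportions(valid_labels)
gap = max_proportion_gap(train_labels, valid_labels)

print(train_share)
print(valid_share)
print(f"largest gap in class share: {gap:.3f}")
```

A large gap for any class suggests re-splitting the data, for example with a stratified split, before drawing conclusions from the validation loss.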
I am using Keras with TensorFlow backend to train an LSTM network for some time-sequential data sets. The performance seems pretty good when I represent my training data (as well as the validation data) in the Numpy array format:
train_x.shape: (128346, 10, 34)
val_x.shape: (7941, 10, 34)
test_x.shape: (24181, 10, 34)
train_y.shape: (128346, 2)
val_y.shape: (7941, 2)
test_y.shape: (24181, 2)
P.S. 10 is the number of time steps and 34 is the number of features; the labels were one-hot encoded.
model = tf.keras.Sequential()
model.add(layers.LSTM(_HIDDEN_SIZE, return_sequences=True,
                      input_shape=(_TIME_STEPS, _FEATURE_DIMENTIONS)))
model.add(layers.Dropout(0.4))
model.add(layers.LSTM(_HIDDEN_SIZE, return_sequences=True))
model.add(layers.Dropout(0.3))
model.add(layers.TimeDistributed(layers.Dense(_NUM_CLASSES)))
model.add(layers.Flatten())
model.add(layers.Dense(_NUM_CLASSES, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr = _LR)
model.compile(optimizer = opt, loss = 'categorical_crossentropy',
              metrics = ['accuracy'])
model.fit(train_x,
          train_y,
          epochs=_EPOCH,
          batch_size = _BATCH_SIZE,
          verbose = 1,
          validation_data = (val_x, val_y)
          )
And the training results are:
Train on 128346 samples, validate on 7941 samples
Epoch 1/10
128346/128346 [==============================] - 50s 390us/step - loss: 0.5883 - acc: 0.6975 - val_loss: 0.5242 - val_acc: 0.7416
Epoch 2/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.4804 - acc: 0.7687 - val_loss: 0.4265 - val_acc: 0.8014
Epoch 3/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.4232 - acc: 0.8076 - val_loss: 0.4095 - val_acc: 0.8096
Epoch 4/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3894 - acc: 0.8276 - val_loss: 0.3529 - val_acc: 0.8469
Epoch 5/10
128346/128346 [==============================] - 49s 382us/step - loss: 0.3610 - acc: 0.8430 - val_loss: 0.3283 - val_acc: 0.8593
Epoch 6/10
128346/128346 [==============================] - 49s 382us/step - loss: 0.3402 - acc: 0.8525 - val_loss: 0.3334 - val_acc: 0.8558
Epoch 7/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3233 - acc: 0.8604 - val_loss: 0.2944 - val_acc: 0.8741
Epoch 8/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3087 - acc: 0.8663 - val_loss: 0.2786 - val_acc: 0.8805
Epoch 9/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.2969 - acc: 0.8709 - val_loss: 0.2785 - val_acc: 0.8777
Epoch 10/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.2867 - acc: 0.8757 - val_loss: 0.2590 - val_acc: 0.8877
This log seems pretty normal, but when I tried to use the TensorFlow Dataset API to represent my data sets, the training process behaved very strangely (it seems that the model starts to overfit or underfit?):
def tfdata_generator(features, labels, is_training = False, batch_size = _BATCH_SIZE, epoch = _EPOCH):
    dataset = tf.data.Dataset.from_tensor_slices((features, tf.cast(labels, dtype = tf.uint8)))
    if is_training:
        dataset = dataset.shuffle(10000) # depends on sample size
    dataset = dataset.batch(batch_size, drop_remainder = True).repeat(epoch).prefetch(batch_size)
    return dataset
training_set = tfdata_generator(train_x, train_y, is_training=True)
validation_set = tfdata_generator(val_x, val_y, is_training=False)
testing_set = tfdata_generator(test_x, test_y, is_training=False)
Training on the same model and hyperparameters:
model.fit(
    training_set.make_one_shot_iterator(),
    epochs = _EPOCH,
    steps_per_epoch = len(train_x) // _BATCH_SIZE,
    verbose = 1,
    validation_data = validation_set.make_one_shot_iterator(),
    validation_steps = len(val_x) // _BATCH_SIZE
)
And the log seems much different from the previous one:
Epoch 1/10
2005/2005 [==============================] - 54s 27ms/step - loss: 0.1451 - acc: 0.9419 - val_loss: 3.2980 - val_acc: 0.4975
Epoch 2/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1675 - acc: 0.9371 - val_loss: 3.0838 - val_acc: 0.4975
Epoch 3/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1821 - acc: 0.9316 - val_loss: 3.1212 - val_acc: 0.4975
Epoch 4/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1902 - acc: 0.9287 - val_loss: 3.0032 - val_acc: 0.4975
Epoch 5/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1905 - acc: 0.9283 - val_loss: 2.9671 - val_acc: 0.4975
Epoch 6/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1867 - acc: 0.9299 - val_loss: 2.8734 - val_acc: 0.4975
Epoch 7/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1802 - acc: 0.9316 - val_loss: 2.8651 - val_acc: 0.4975
Epoch 8/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1740 - acc: 0.9350 - val_loss: 2.8793 - val_acc: 0.4975
Epoch 9/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1660 - acc: 0.9388 - val_loss: 2.7894 - val_acc: 0.4975
Epoch 10/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1613 - acc: 0.9405 - val_loss: 2.7997 - val_acc: 0.4975
The validation loss does not decrease and val_acc always stays at the same value when I use the TensorFlow Dataset API to represent my data.
My questions are:
Based on the same model and parameters, why does model.fit() produce such different training results when I merely adopt the tf.data.Dataset API?
What is the difference between these two mechanisms?
model.fit(train_x,
          train_y,
          epochs=_EPOCH,
          batch_size = _BATCH_SIZE,
          verbose = 1,
          validation_data = (val_x, val_y)
          )
vs
model.fit(
    training_set.make_one_shot_iterator(),
    epochs = _EPOCH,
    steps_per_epoch = len(train_x) // _BATCH_SIZE,
    verbose = 1,
    validation_data = validation_set.make_one_shot_iterator(),
    validation_steps = len(val_x) // _BATCH_SIZE
)
How can I solve this strange problem if I have to use the tf.data.Dataset API?
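One difference between the two calls that can be stated with confidence: model.fit on NumPy arrays shuffles the whole training set every epoch by default (shuffle=True), whereas dataset.shuffle(10000) only mixes elements inside a 10000-element buffer, which is far smaller than the 128346 training samples. If the samples happen to be stored grouped by class (an assumption, the post does not say), early batches would then be dominated by a single class. The pure-Python sketch below simulates a buffer-limited shuffle to show the effect; buffered_shuffle is a simplified stand-in written for this illustration, not TensorFlow code.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Simulate a shuffle that only mixes elements within a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)           # fill the buffer first
            continue
        slot = rng.randrange(buffer_size)
        yield buffer[slot]                # emit a random buffered element
        buffer[slot] = item               # and put the new element in its place
    rng.shuffle(buffer)                   # drain what is left at the end
    yield from buffer


def first_batch(labels, buffer_size, batch_size=64):
    """Collect the first batch produced by the simulated shuffle."""
    batch = []
    for label in buffered_shuffle(labels, buffer_size):
        batch.append(label)
        if len(batch) == batch_size:
            break
    return batch


# Hypothetical worst case: labels stored sorted by class.
sorted_labels = [0] * 5000 + [1] * 5000

small_buffer_batch = first_batch(sorted_labels, buffer_size=100)
full_buffer_batch = first_batch(sorted_labels, buffer_size=len(sorted_labels))

print("class-1 share, small buffer:", sum(small_buffer_batch) / 64)
print("class-1 share, full buffer: ", sum(full_buffer_batch) / 64)
```

If this is the cause, a shuffle buffer at least as large as the training set (or shuffling the arrays once before building the dataset) would be the thing to try first.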
I am trying to train my model by fine-tuning a pretrained model (VGGFace). My model has 12 classes with 1774 training images and 313 validation images, each class having around 150 images.
My model was overfitting, so I added dropout and FC layers with batch normalization to see how it goes. But the model still overfits:
train_data_path = 'dataset_cfps/train'
validation_data_path = 'dataset_cfps/validation'
# Parameters
img_width, img_height = 224, 224
vggface = VGGFace(model='resnet50', include_top=False, input_shape=(img_width, img_height, 3))
last_layer = vggface.get_layer('avg_pool').output
x = Flatten(name='flatten')(last_layer)
xx = Dense(1024, activation = 'softmax')(x)
x2 = Dropout(0.5)(xx)
y = Dense(1024, activation = 'softmax')(x2)
yy = BatchNormalization()(y)
y1 = Dropout(0.5)(yy)
x3 = Dense(12, activation='softmax', name='classifier')(y1)
custom_vgg_model = Model(vggface.input, x3)
# Create the model
model = models.Sequential()
# Add the convolutional base model
model.add(custom_vgg_model)
model.summary()
model = load_model('facenet_resnet_lr3_SGD_relu_1024.h5')
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)
# Change the batchsize according to your system RAM
train_batchsize = 32
val_batchsize = 32
train_generator = train_datagen.flow_from_directory(
    train_data_path,
    target_size=(img_width, img_height),
    batch_size=train_batchsize,
    class_mode='categorical')
validation_generator = validation_datagen.flow_from_directory(
    validation_data_path,
    target_size=(img_width, img_height),
    batch_size=val_batchsize,
    class_mode='categorical',
    shuffle=True)
# Compile the model
model.compile(loss='categorical_crossentropy',
              optimizer=optimizers.SGD(lr=1e-3),
              metrics=['acc'])
# Train the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples/train_generator.batch_size ,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=validation_generator.samples/validation_generator.batch_size,
    verbose=1)
# Save the model
model.save('facenet_resnet_lr3_SGD_relu_1024_1.h5')
Here are the epochs:
Layer (type)                 Output Shape              Param #
=================================================================
model_5 (Model)              (None, 12)                26725324
=================================================================
Total params: 26,725,324
Trainable params: 26,670,156
Non-trainable params: 55,168
_________________________________________________________________
Found 1774 images belonging to 12 classes.
Found 313 images belonging to 12 classes.
.
.
.
Epoch 70/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5433 - acc: 0.8987 - val_loss: 0.8271 - val_acc: 0.7796
Epoch 71/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5353 - acc: 0.9145 - val_loss: 0.7954 - val_acc: 0.7508
Epoch 72/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5353 - acc: 0.8955 - val_loss: 0.8690 - val_acc: 0.7348
Epoch 73/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5310 - acc: 0.9037 - val_loss: 0.8673 - val_acc: 0.7476
Epoch 74/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5189 - acc: 0.8943 - val_loss: 0.8701 - val_acc: 0.7380
Epoch 75/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5333 - acc: 0.8952 - val_loss: 0.9399 - val_acc: 0.7188
Epoch 76/100
56/55 [==============================] - 49s 879ms/step - loss: 0.5106 - acc: 0.9043 - val_loss: 0.8107 - val_acc: 0.7700
Epoch 77/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5108 - acc: 0.9064 - val_loss: 0.9624 - val_acc: 0.6869
Epoch 78/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5214 - acc: 0.8994 - val_loss: 0.9602 - val_acc: 0.6933
Epoch 79/100
56/55 [==============================] - 49s 880ms/step - loss: 0.5246 - acc: 0.9009 - val_loss: 0.8379 - val_acc: 0.7572
Epoch 80/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4859 - acc: 0.9082 - val_loss: 0.7856 - val_acc: 0.7796
Epoch 81/100
56/55 [==============================] - 49s 881ms/step - loss: 0.5005 - acc: 0.9175 - val_loss: 0.7609 - val_acc: 0.7827
Epoch 82/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4690 - acc: 0.9294 - val_loss: 0.7671 - val_acc: 0.7636
Epoch 83/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4897 - acc: 0.9146 - val_loss: 0.7902 - val_acc: 0.7636
Epoch 84/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4604 - acc: 0.9291 - val_loss: 0.7603 - val_acc: 0.7636
Epoch 85/100
56/55 [==============================] - 49s 881ms/step - loss: 0.4750 - acc: 0.9220 - val_loss: 0.7325 - val_acc: 0.7668
Epoch 86/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4524 - acc: 0.9266 - val_loss: 0.7782 - val_acc: 0.7636
Epoch 87/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4643 - acc: 0.9172 - val_loss: 0.9892 - val_acc: 0.6901
Epoch 88/100
56/55 [==============================] - 49s 881ms/step - loss: 0.4718 - acc: 0.9177 - val_loss: 0.8269 - val_acc: 0.7380
Epoch 89/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4646 - acc: 0.9290 - val_loss: 0.7846 - val_acc: 0.7604
Epoch 90/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4433 - acc: 0.9341 - val_loss: 0.7693 - val_acc: 0.7764
Epoch 91/100
56/55 [==============================] - 49s 877ms/step - loss: 0.4706 - acc: 0.9196 - val_loss: 0.8200 - val_acc: 0.7604
Epoch 92/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4572 - acc: 0.9184 - val_loss: 0.9220 - val_acc: 0.7220
Epoch 93/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4479 - acc: 0.9175 - val_loss: 0.8781 - val_acc: 0.7348
Epoch 94/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4793 - acc: 0.9100 - val_loss: 0.8035 - val_acc: 0.7572
Epoch 95/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4329 - acc: 0.9279 - val_loss: 0.7750 - val_acc: 0.7796
Epoch 96/100
56/55 [==============================] - 49s 879ms/step - loss: 0.4361 - acc: 0.9212 - val_loss: 0.8124 - val_acc: 0.7508
Epoch 97/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4371 - acc: 0.9202 - val_loss: 0.9806 - val_acc: 0.7029
Epoch 98/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4298 - acc: 0.9149 - val_loss: 0.8637 - val_acc: 0.7380
Epoch 99/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4370 - acc: 0.9255 - val_loss: 0.8349 - val_acc: 0.7604
Epoch 100/100
56/55 [==============================] - 49s 880ms/step - loss: 0.4407 - acc: 0.9205 - val_loss: 0.8477 - val_acc: 0.7508
Deep CNNs need a large amount of data for training. You have a small dataset, and the model is unable to generalize from it. You have two options:
reduce the network size
increase the size of the dataset
EDIT after comments on answer:
The model has some issues. You should not use softmax for hidden layers.
If you want to overcome the overfitting, freeze the pretrained layers and train only the newly added layers. If the model still overfits, you can remove some of the layers you have added or lower their number of units.
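To illustrate the point about softmax in hidden layers: softmax forces the outputs of a layer to sum to 1, so across 1024 units each one carries on average only 1/1024 of the total, whereas ReLU leaves positive pre-activations at their original scale. The NumPy sketch below compares the two on random stand-in values; it is a numerical illustration under those assumptions, not a reproduction of the model in the question.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def relu(z):
    return np.maximum(z, 0.0)


# Stand-in pre-activations for a batch of 4 samples and 1024 hidden units.
rng = np.random.default_rng(0)
pre_activations = rng.normal(size=(4, 1024))

soft = softmax(pre_activations)
rect = relu(pre_activations)

print("softmax: row sums", soft.sum(axis=1), "mean activation", soft.mean())
print("relu:    mean activation", rect.mean())
```

With activations this small and coupled across units, the next layer receives very little usable signal, which is one reason ReLU (as the saved file name in the question hints at) is the more common choice for hidden layers.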