I have an issue with my current model. I am not using any pretrained model; I just want to see how far I can improve the model without pretrained weights. I am working on an object detection project with around 8,000 images. I created the model as follows:
import tensorflow as tf
import tensorflow_ranking as tfr
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, Dropout

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(300, 300, 3)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Flatten())
initializer = tf.keras.initializers.HeUniform()
model.add(Dense(256, activation='relu', kernel_initializer=initializer))
model.add(Dropout(0.5))
model.add(Dense(4))  # 4 regression outputs
model.compile(loss='mse', optimizer='adam',
              metrics=[tfr.keras.metrics.MeanAveragePrecisionMetric()])
model.summary()
The issue I am facing is that the mAP metric never gets any higher; it seems to get stuck at around 0.80 or less, and I cannot understand why it does not improve. The output of the last few epochs:
Epoch 103/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0891 - mean_average_precision_metric_1: 0.8126 - val_loss: 0.0792 - val_mean_average_precision_metric_1: 0.7933
Epoch 104/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0901 - mean_average_precision_metric_1: 0.8029 - val_loss: 0.0830 - val_mean_average_precision_metric_1: 0.7912
Epoch 105/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0909 - mean_average_precision_metric_1: 0.8045 - val_loss: 0.0808 - val_mean_average_precision_metric_1: 0.7931
Epoch 106/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0890 - mean_average_precision_metric_1: 0.8077 - val_loss: 0.0789 - val_mean_average_precision_metric_1: 0.7920
Epoch 107/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0887 - mean_average_precision_metric_1: 0.8042 - val_loss: 0.0885 - val_mean_average_precision_metric_1: 0.7913
Epoch 108/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0887 - mean_average_precision_metric_1: 0.8012 - val_loss: 0.0798 - val_mean_average_precision_metric_1: 0.7922
Epoch 109/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0884 - mean_average_precision_metric_1: 0.8048 - val_loss: 0.0796 - val_mean_average_precision_metric_1: 0.7923
Epoch 110/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0896 - mean_average_precision_metric_1: 0.8009 - val_loss: 0.0838 - val_mean_average_precision_metric_1: 0.7948
Epoch 111/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0883 - mean_average_precision_metric_1: 0.7957 - val_loss: 0.0806 - val_mean_average_precision_metric_1: 0.7930
Epoch 112/150
211/211 [==============================] - 7s 31ms/step - loss: 0.0887 - mean_average_precision_metric_1: 0.8123 - val_loss: 0.0785 - val_mean_average_precision_metric_1: 0.7930
Epoch 00112: early stopping
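The early stopping message at epoch 112 comes from an EarlyStopping callback passed to model.fit; a rough sketch of that part of the setup (the monitor, patience, and data variable names here are placeholders, not my exact settings):
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                              patience=10,
                                              verbose=1,
                                              restore_best_weights=True)
history = model.fit(train_images, train_targets,   # placeholder arrays of images and box targets
                    validation_split=0.1,
                    epochs=150,
                    callbacks=[early_stop])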
I am trying to train a CNN model with 2030 preprocessed eye images; the shape of my input data is (2030, 200, 200, 1). Originally I had 1527 images, and I then used imblearn.over_sampling.RandomOverSampler to increase the dataset size. I built the model with Keras, and here is the summary of my model:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same', activation='relu',
                 input_shape=(img_cols, img_rows, 1)))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

optimizer = tf.keras.optimizers.SGD(learning_rate=0.000001)
model.compile(optimizer=optimizer, loss='binary_crossentropy',
              metrics=[tf.keras.metrics.SpecificityAtSensitivity(0.5),
                       tf.keras.metrics.SensitivityAtSpecificity(0.5),
                       'accuracy'])
# Augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    horizontal_flip=True,
    vertical_flip=True,
    width_shift_range=0.3,
    height_shift_range=0.5,
    rotation_range=10,
    zoom_range=0.2
)
test_datagen = ImageDataGenerator(rescale=1./255)

train_data = train_datagen.flow(x_train, y_train)
test_data = test_datagen.flow(x_test, y_test)

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss',
                                                 factor=0.9,
                                                 patience=2,
                                                 min_lr=0.0000000000000000001)
history = model.fit(train_data, epochs=10, batch_size=32,
                    validation_data=test_data, callbacks=[reduce_lr])
I trained the model with different parameters (batch sizes of 32, 64, 128, 256, 512, and 1024; adding convolution layers with 128 and 256 filters; decreasing and increasing the learning rate; using callbacks; varying the dense layer size from 32 up to 1024), but I always get the following learning process:
Epoch 1/10
51/51 [==============================] - 14s 238ms/step - loss: 0.6962 - specificity_at_sensitivity_15: 0.4548 - sensitivity_at_specificity_15: 0.4777 - accuracy: 0.4969 - val_loss: 0.6957 - val_specificity_at_sensitivity_15: 0.4112 - val_sensitivity_at_specificity_15: 0.3636 - val_accuracy: 0.4852 - lr: 1.0000e-04
Epoch 2/10
51/51 [==============================] - 12s 226ms/step - loss: 0.6945 - specificity_at_sensitivity_15: 0.4829 - sensitivity_at_specificity_15: 0.4615 - accuracy: 0.5018 - val_loss: 0.6949 - val_specificity_at_sensitivity_15: 0.4467 - val_sensitivity_at_specificity_15: 0.3206 - val_accuracy: 0.4877 - lr: 1.0000e-04
Epoch 3/10
51/51 [==============================] - 12s 227ms/step - loss: 0.6955 - specificity_at_sensitivity_15: 0.4328 - sensitivity_at_specificity_15: 0.4082 - accuracy: 0.5043 - val_loss: 0.6945 - val_specificity_at_sensitivity_15: 0.5584 - val_sensitivity_at_specificity_15: 0.5167 - val_accuracy: 0.4852 - lr: 1.0000e-04
Epoch 4/10
51/51 [==============================] - 12s 226ms/step - loss: 0.6971 - specificity_at_sensitivity_15: 0.4034 - sensitivity_at_specificity_15: 0.4256 - accuracy: 0.5049 - val_loss: 0.6941 - val_specificity_at_sensitivity_15: 0.4010 - val_sensitivity_at_specificity_15: 0.3923 - val_accuracy: 0.4852 - lr: 1.0000e-04
Epoch 5/10
51/51 [==============================] - 12s 226ms/step - loss: 0.6954 - specificity_at_sensitivity_15: 0.4670 - sensitivity_at_specificity_15: 0.4640 - accuracy: 0.4969 - val_loss: 0.6938 - val_specificity_at_sensitivity_15: 0.5584 - val_sensitivity_at_specificity_15: 0.5407 - val_accuracy: 0.4729 - lr: 1.0000e-04
Epoch 6/10
51/51 [==============================] - 12s 227ms/step - loss: 0.6972 - specificity_at_sensitivity_15: 0.4352 - sensitivity_at_specificity_15: 0.3883 - accuracy: 0.4791 - val_loss: 0.6935 - val_specificity_at_sensitivity_15: 0.4772 - val_sensitivity_at_specificity_15: 0.3206 - val_accuracy: 0.4729 - lr: 1.0000e-04
Epoch 7/10
51/51 [==============================] - 12s 227ms/step - loss: 0.6943 - specificity_at_sensitivity_15: 0.4474 - sensitivity_at_specificity_15: 0.4814 - accuracy: 0.5031 - val_loss: 0.6933 - val_specificity_at_sensitivity_15: 0.3604 - val_sensitivity_at_specificity_15: 0.4880 - val_accuracy: 0.4729 - lr: 1.0000e-04
Epoch 8/10
51/51 [==============================] - 12s 225ms/step - loss: 0.6974 - specificity_at_sensitivity_15: 0.4609 - sensitivity_at_specificity_15: 0.4355 - accuracy: 0.4926 - val_loss: 0.6930 - val_specificity_at_sensitivity_15: 0.5279 - val_sensitivity_at_specificity_15: 0.5885 - val_accuracy: 0.4655 - lr: 1.0000e-04
Epoch 9/10
51/51 [==============================] - 12s 226ms/step - loss: 0.6945 - specificity_at_sensitivity_15: 0.4425 - sensitivity_at_specificity_15: 0.4777 - accuracy: 0.5031 - val_loss: 0.6929 - val_specificity_at_sensitivity_15: 0.4619 - val_sensitivity_at_specificity_15: 0.3876 - val_accuracy: 0.4655 - lr: 1.0000e-04
Epoch 10/10
51/51 [==============================] - 12s 226ms/step - loss: 0.6977 - specificity_at_sensitivity_15: 0.4389 - sensitivity_at_specificity_15: 0.4367 - accuracy: 0.4766 - val_loss: 0.6927 - val_specificity_at_sensitivity_15: 0.6091 - val_sensitivity_at_specificity_15: 0.5024 - val_accuracy: 0.4951 - lr: 1.0000e-04
Evaluation with test data generated from x_test (2 percent of the 2030 images) resulted in:
13/13 [==============================] - 1s 69ms/step - loss: 0.6927 - specificity_at_sensitivity_15: 0.6091 - sensitivity_at_specificity_15: 0.5024 - accuracy: 0.4951
Accuracy score is : 0.4950738847255707
How can I improve my accuracy? I have tried everything I could think of, but the highest I could reach was 53%, while similar code I have seen on the internet reaches 76%. This is a medical imaging project, so I believe higher accuracy really matters.
If you change the optimizer from SGD to Adam you should get better accuracy, along with some other changes such as adding more convolution layers, increasing the learning rate, and removing some dropout layers, as follows:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(16, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(128, (5, 5), padding='same', activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
model.compile(optimizer=optimizer, loss='binary_crossentropy',
              metrics=[tf.keras.metrics.SpecificityAtSensitivity(0.5),
                       tf.keras.metrics.SensitivityAtSpecificity(0.5),
                       'accuracy'])

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.2,
                                                 patience=2, min_lr=0.00001)
history = model.fit(train_dataset, epochs=10, batch_size=32,
                    validation_data=validation_dataset, callbacks=[reduce_lr])
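Here train_dataset and validation_dataset are assumed to be tf.data pipelines built from the same arrays as in the question; one possible construction (a sketch only, not necessarily the exact pipeline used):
import tensorflow as tf

# Sketch: build tf.data pipelines from the in-memory arrays.
# Assumes x_train/x_test are already rescaled to [0, 1].
train_dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
                 .shuffle(2048)
                 .batch(32))
validation_dataset = (tf.data.Dataset.from_tensor_slices((x_test, y_test))
                      .batch(32))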
Output:
Epoch 1/10
63/63 [==============================] - 9s 104ms/step - loss: 0.7631 - specificity_at_sensitivity_2: 0.5760 - sensitivity_at_specificity_2: 0.5700 - accuracy: 0.5445 - val_loss: 0.6751 - val_specificity_at_sensitivity_2: 0.7760 - val_sensitivity_at_specificity_2: 0.7460 - val_accuracy: 0.5050 - lr: 0.0010
Epoch 2/10
63/63 [==============================] - 5s 77ms/step - loss: 0.6570 - specificity_at_sensitivity_2: 0.7260 - sensitivity_at_specificity_2: 0.7030 - accuracy: 0.6030 - val_loss: 0.6652 - val_specificity_at_sensitivity_2: 0.7480 - val_sensitivity_at_specificity_2: 0.6920 - val_accuracy: 0.5990 - lr: 0.0010
Epoch 3/10
63/63 [==============================] - 4s 57ms/step - loss: 0.6277 - specificity_at_sensitivity_2: 0.7920 - sensitivity_at_specificity_2: 0.7650 - accuracy: 0.6565 - val_loss: 0.6696 - val_specificity_at_sensitivity_2: 0.6960 - val_sensitivity_at_specificity_2: 0.6820 - val_accuracy: 0.5930 - lr: 0.0010
Epoch 4/10
63/63 [==============================] - 4s 56ms/step - loss: 0.6163 - specificity_at_sensitivity_2: 0.8080 - sensitivity_at_specificity_2: 0.7830 - accuracy: 0.6570 - val_loss: 0.6330 - val_specificity_at_sensitivity_2: 0.8320 - val_sensitivity_at_specificity_2: 0.7840 - val_accuracy: 0.6520 - lr: 0.0010
Epoch 5/10
63/63 [==============================] - 4s 58ms/step - loss: 0.5710 - specificity_at_sensitivity_2: 0.8710 - sensitivity_at_specificity_2: 0.8420 - accuracy: 0.6995 - val_loss: 0.5940 - val_specificity_at_sensitivity_2: 0.8600 - val_sensitivity_at_specificity_2: 0.8420 - val_accuracy: 0.7030 - lr: 0.0010
Epoch 6/10
63/63 [==============================] - 4s 58ms/step - loss: 0.5426 - specificity_at_sensitivity_2: 0.8930 - sensitivity_at_specificity_2: 0.8790 - accuracy: 0.7250 - val_loss: 0.6158 - val_specificity_at_sensitivity_2: 0.8740 - val_sensitivity_at_specificity_2: 0.8360 - val_accuracy: 0.7060 - lr: 0.0010
Epoch 7/10
63/63 [==============================] - 4s 60ms/step - loss: 0.4991 - specificity_at_sensitivity_2: 0.9260 - sensitivity_at_specificity_2: 0.9100 - accuracy: 0.7550 - val_loss: 0.5927 - val_specificity_at_sensitivity_2: 0.8760 - val_sensitivity_at_specificity_2: 0.8460 - val_accuracy: 0.7280 - lr: 0.0010
Epoch 8/10
63/63 [==============================] - 4s 58ms/step - loss: 0.4597 - specificity_at_sensitivity_2: 0.9480 - sensitivity_at_specificity_2: 0.9300 - accuracy: 0.7885 - val_loss: 0.6473 - val_specificity_at_sensitivity_2: 0.8900 - val_sensitivity_at_specificity_2: 0.8260 - val_accuracy: 0.7320 - lr: 0.0010
Epoch 9/10
63/63 [==============================] - 4s 58ms/step - loss: 0.4682 - specificity_at_sensitivity_2: 0.9500 - sensitivity_at_specificity_2: 0.9310 - accuracy: 0.7900 - val_loss: 0.5569 - val_specificity_at_sensitivity_2: 0.9080 - val_sensitivity_at_specificity_2: 0.8880 - val_accuracy: 0.7330 - lr: 0.0010
Epoch 10/10
63/63 [==============================] - 4s 60ms/step - loss: 0.3974 - specificity_at_sensitivity_2: 0.9740 - sensitivity_at_specificity_2: 0.9600 - accuracy: 0.8155 - val_loss: 0.6180 - val_specificity_at_sensitivity_2: 0.9180 - val_sensitivity_at_specificity_2: 0.8940 - val_accuracy: 0.7540 - lr: 0.0010
This is my first time posting a question, so please pardon me if it isn't written or structured well.
The dataset consists of images in TIF format, which means I am running a 3D CNN.
These images are simulated X-ray images, and the dataset has two classes: Normal and Anomaly.
Their labels are '0' for Normal and '1' for Anomaly.
The tree of my folders looks like this:
Train
    Normal
    Anomaly
Validation
    Normal
    Anomaly
What I did was initialise two arrays, train and y_train.
I ran a for loop that imports the Normal images and appends them into train, appending a '0' into y_train for each image appended. So if I have 10 Normal images in train, I will have ten '0's in y_train as well.
This is repeated for the Anomaly images: they are appended into train and a '1' is appended into y_train. This means train consists of Normal images followed by Anomaly images, and y_train consists of '0's followed by '1's.
Another for loop does the same for the Validation folder, where my arrays are test and y_test.
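A minimal sketch of those loading loops (load_volume is a placeholder for whatever TIF reader is actually used; it is assumed to return one (128, 128, 128, 1) array per file):
import os
import numpy as np

def build_split(root_dir, load_volume):
    # Append Normal volumes with label 0, then Anomaly volumes with label 1.
    data, labels = [], []
    for label, cls in enumerate(["Normal", "Anomaly"]):
        cls_dir = os.path.join(root_dir, cls)
        for fname in sorted(os.listdir(cls_dir)):
            data.append(load_volume(os.path.join(cls_dir, fname)))
            labels.append(label)  # 0 for Normal, 1 for Anomaly
    return np.array(data), np.array(labels)

# train, y_train = build_split("Train", load_volume)
# test, y_test = build_split("Validation", load_volume)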
This is my code for my Neural Network:
from keras.models import Sequential
from keras.layers import Conv3D, MaxPooling3D, Dropout, Flatten, Dense

def vgg1():
    model = Sequential()
    model.add(Conv3D(16, (3, 3, 3), activation="relu", padding="same", name="block1_conv1", input_shape=(128, 128, 128, 1), data_format="channels_last"))  # 64
    model.add(Conv3D(16, (3, 3, 3), activation="relu", padding="same", name="block1_conv2", data_format="channels_last"))  # 64
    model.add(MaxPooling3D((2, 2, 2), strides=(2, 2, 2), padding='same', name='block1_pool'))
    model.add(Dropout(0.5))
    model.add(Conv3D(32, (3, 3, 3), activation="relu", padding="same", name="block2_conv1", data_format="channels_last"))  # 128
    model.add(Conv3D(32, (3, 3, 3), activation="relu", padding="same", name="block2_conv2", data_format="channels_last"))  # 128
    model.add(MaxPooling3D((2, 2, 2), strides=(2, 2, 2), padding='same', name='block2_pool'))
    model.add(Dropout(0.5))
    model.add(Conv3D(64, (3, 3, 3), activation="relu", padding="same", name="block3_conv1", data_format="channels_last"))  # 256
    model.add(Conv3D(64, (3, 3, 3), activation="relu", padding="same", name="block3_conv2", data_format="channels_last"))  # 256
    model.add(Conv3D(64, (3, 3, 3), activation="relu", padding="same", name="block3_conv3", data_format="channels_last"))  # 256
    model.add(MaxPooling3D((2, 2, 2), strides=(2, 2, 2), padding='same', name='block3_pool'))
    model.add(Dropout(0.5))
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block4_conv1", data_format="channels_last"))  # 512
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block4_conv2", data_format="channels_last"))  # 512
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block4_conv3", data_format="channels_last"))  # 512
    model.add(MaxPooling3D((2, 2, 2), strides=(2, 2, 2), padding='same', name='block4_pool'))
    model.add(Dropout(0.5))
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block5_conv1", data_format="channels_last"))  # 512
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block5_conv2", data_format="channels_last"))  # 512
    model.add(Conv3D(128, (3, 3, 3), activation="relu", padding="same", name="block5_conv3", data_format="channels_last"))  # 512
    model.add(MaxPooling3D((2, 2, 2), strides=(2, 2, 2), padding='same', name='block5_pool'))
    model.add(Dropout(0.5))
    model.add(Flatten(name='flatten'))
    model.add(Dense(4096, activation='relu', name='fc1'))
    model.add(Dense(4096, activation='relu', name='fc2'))
    model.add(Dense(2, activation='softmax', name='predictions'))
    print(model.summary())
    return model
The following code is the initialisation of x_train, y_train, x_test, y_test and model.compile.
I have also converted my labels into one-hot encoding.
import numpy as np
from keras.utils import to_categorical

model = vgg1()
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)

x_train = np.load('/content/drive/My Drive/3D Dataset v2/x_train.npy')
y_train = np.load('/content/drive/My Drive/3D Dataset v2/y_train.npy')
y_train = to_categorical(y_train)
x_test = np.load('/content/drive/My Drive/3D Dataset v2/x_test.npy')
y_test = np.load('/content/drive/My Drive/3D Dataset v2/y_test.npy')
y_test = to_categorical(y_test)
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.
This is the problem I want to highlight: the training accuracy and validation accuracy stay constant.
Train on 127 samples, validate on 31 samples
Epoch 1/25
127/127 [==============================] - 1700s 13s/step - loss: 1.0030 - accuracy: 0.7480 - val_loss: 0.5842 - val_accuracy: 0.7419
Epoch 2/25
127/127 [==============================] - 1708s 13s/step - loss: 0.5813 - accuracy: 0.7480 - val_loss: 0.5728 - val_accuracy: 0.7419
Epoch 3/25
127/127 [==============================] - 1693s 13s/step - loss: 0.5758 - accuracy: 0.7480 - val_loss: 0.5720 - val_accuracy: 0.7419
Epoch 4/25
127/127 [==============================] - 1675s 13s/step - loss: 0.5697 - accuracy: 0.7480 - val_loss: 0.5711 - val_accuracy: 0.7419
Epoch 5/25
127/127 [==============================] - 1664s 13s/step - loss: 0.5691 - accuracy: 0.7480 - val_loss: 0.5785 - val_accuracy: 0.7419
Epoch 6/25
127/127 [==============================] - 1666s 13s/step - loss: 0.5716 - accuracy: 0.7480 - val_loss: 0.5710 - val_accuracy: 0.7419
Epoch 7/25
127/127 [==============================] - 1676s 13s/step - loss: 0.5702 - accuracy: 0.7480 - val_loss: 0.5718 - val_accuracy: 0.7419
Epoch 8/25
127/127 [==============================] - 1664s 13s/step - loss: 0.5775 - accuracy: 0.7480 - val_loss: 0.5718 - val_accuracy: 0.7419
Epoch 9/25
127/127 [==============================] - 1660s 13s/step - loss: 0.5753 - accuracy: 0.7480 - val_loss: 0.5711 - val_accuracy: 0.7419
Epoch 10/25
127/127 [==============================] - 1681s 13s/step - loss: 0.5756 - accuracy: 0.7480 - val_loss: 0.5714 - val_accuracy: 0.7419
Epoch 11/25
127/127 [==============================] - 1679s 13s/step - loss: 0.5675 - accuracy: 0.7480 - val_loss: 0.5710 - val_accuracy: 0.7419
Epoch 12/25
127/127 [==============================] - 1681s 13s/step - loss: 0.5779 - accuracy: 0.7480 - val_loss: 0.5741 - val_accuracy: 0.7419
Epoch 13/25
127/127 [==============================] - 1682s 13s/step - loss: 0.5763 - accuracy: 0.7480 - val_loss: 0.5723 - val_accuracy: 0.7419
Epoch 14/25
127/127 [==============================] - 1685s 13s/step - loss: 0.5732 - accuracy: 0.7480 - val_loss: 0.5714 - val_accuracy: 0.7419
Epoch 15/25
127/127 [==============================] - 1685s 13s/step - loss: 0.5701 - accuracy: 0.7480 - val_loss: 0.5710 - val_accuracy: 0.7419
Epoch 16/25
127/127 [==============================] - 1678s 13s/step - loss: 0.5704 - accuracy: 0.7480 - val_loss: 0.5733 - val_accuracy: 0.7419
Epoch 17/25
127/127 [==============================] - 1663s 13s/step - loss: 0.5692 - accuracy: 0.7480 - val_loss: 0.5710 - val_accuracy: 0.7419
Epoch 18/25
127/127 [==============================] - 1657s 13s/step - loss: 0.5731 - accuracy: 0.7480 - val_loss: 0.5717 - val_accuracy: 0.7419
Epoch 19/25
127/127 [==============================] - 1674s 13s/step - loss: 0.5708 - accuracy: 0.7480 - val_loss: 0.5712 - val_accuracy: 0.7419
Epoch 20/25
127/127 [==============================] - 1666s 13s/step - loss: 0.5795 - accuracy: 0.7480 - val_loss: 0.5730 - val_accuracy: 0.7419
Epoch 21/25
127/127 [==============================] - 1671s 13s/step - loss: 0.5635 - accuracy: 0.7480 - val_loss: 0.5753 - val_accuracy: 0.7419
Epoch 22/25
127/127 [==============================] - 1672s 13s/step - loss: 0.5713 - accuracy: 0.7480 - val_loss: 0.5718 - val_accuracy: 0.7419
Epoch 23/25
127/127 [==============================] - 1672s 13s/step - loss: 0.5666 - accuracy: 0.7480 - val_loss: 0.5711 - val_accuracy: 0.7419
Epoch 24/25
127/127 [==============================] - 1669s 13s/step - loss: 0.5695 - accuracy: 0.7480 - val_loss: 0.5724 - val_accuracy: 0.7419
Epoch 25/25
127/127 [==============================] - 1663s 13s/step - loss: 0.5675 - accuracy: 0.7480 - val_loss: 0.5721 - val_accuracy: 0.7419
What is wrong and what can I do to rectify this? I presume that having a constant accuracy is undesirable.
I've played around with the setup of my architecture a lot (the number of layers, pooling size, dropouts, etc.), but I always end up in the same ballpark: ~96-98% accuracy and a loss between 2-6%.
I'm training on a dataset of 78,000 images, 26 classes (letters of the alphabet), with 3,000 images per class. I am satisfied with the accuracy, but are there any suggestions to reduce the loss? Also, how much loss do you think is too much? Is there anything else I should add to the model to make it more robust? The code for the model is shown below:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, BatchNormalization, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(200, 200, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Conv2D(64, (3, 3), activation='relu'))  # input_shape is only needed on the first layer
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(BatchNormalization())
#model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(26, activation='softmax'))
Compile and Fit:
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=10, batch_size=64, verbose=1, validation_data=(X_test, y_test))
Output:
Train on 54600 samples, validate on 23400 samples
Epoch 1/10
54600/54600 [==============================] - 68s 1ms/step - loss: 1.1217 - accuracy: 0.6622 - val_loss: 0.9795 - val_accuracy: 0.7394
Epoch 2/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.2377 - accuracy: 0.9219 - val_loss: 0.2300 - val_accuracy: 0.9277
Epoch 3/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.1184 - accuracy: 0.9627 - val_loss: 0.2746 - val_accuracy: 0.9286
Epoch 4/10
54600/54600 [==============================] - 67s 1ms/step - loss: 0.0755 - accuracy: 0.9761 - val_loss: 0.1850 - val_accuracy: 0.9517
Epoch 5/10
54600/54600 [==============================] - 69s 1ms/step - loss: 0.0669 - accuracy: 0.9801 - val_loss: 0.2044 - val_accuracy: 0.9450
Epoch 6/10
54600/54600 [==============================] - 69s 1ms/step - loss: 0.0520 - accuracy: 0.9848 - val_loss: 0.2265 - val_accuracy: 0.9485
Epoch 7/10
54600/54600 [==============================] - 72s 1ms/step - loss: 0.0481 - accuracy: 0.9865 - val_loss: 0.1709 - val_accuracy: 0.9559
Epoch 8/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0370 - accuracy: 0.9905 - val_loss: 0.1534 - val_accuracy: 0.9659
Epoch 9/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0335 - accuracy: 0.9912 - val_loss: 0.1181 - val_accuracy: 0.9703
Epoch 10/10
54600/54600 [==============================] - 66s 1ms/step - loss: 0.0277 - accuracy: 0.9921 - val_loss: 0.1204 - val_accuracy: 0.9704
I'm doing this for my final project and I'm new to ConvNets. I want to classify whether an image is genuine or spoofed. I have roughly 8,000 images (both classes combined), and I want to show you some of my training log.
Epoch 7/100
311/311 [==============================] - 20s 63ms/step - loss: 0.3274 - accuracy: 0.8675 - val_loss: 0.2481 - val_accuracy: 0.9002
Epoch 8/100
311/311 [==============================] - 20s 63ms/step - loss: 0.3189 - accuracy: 0.8691 - val_loss: 0.3015 - val_accuracy: 0.8684
Epoch 9/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3201 - accuracy: 0.8667 - val_loss: 0.2460 - val_accuracy: 0.9036
Epoch 10/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3063 - accuracy: 0.8723 - val_loss: 0.2752 - val_accuracy: 0.8901
Epoch 11/100
311/311 [==============================] - 19s 62ms/step - loss: 0.3086 - accuracy: 0.8749 - val_loss: 0.2717 - val_accuracy: 0.8988
[INFO] evaluating network...
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Activation, BatchNormalization, Dropout, Flatten, Dense
from tensorflow.keras import backend as K

model = Sequential()
inputShape = (height, width, depth)
chanDim = -1
if K.image_data_format() == "channels_first":
    inputShape = (depth, height, width)
    chanDim = 1

model.add(Conv2D(16, (3, 3), padding="same", input_shape=inputShape))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(16, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Conv2D(32, (5, 5), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.6))
model.add(Dense(classes))
model.add(Activation("softmax"))
The input is 32x32 and there are two classes. I used EarlyStopping in Keras to prevent overfitting, and I keep changing the learning rate and the number of neurons per layer, but training always stops before 20 epochs. Any advice on how to prevent overfitting? I'm a beginner with convolutional neural networks. Thanks in advance!
PS LR: 0.001 BS: 20 EPOCHS: 100
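For reference, the compile/fit step looks roughly like this (only the learning rate of 0.001, batch size of 20, and 100 epochs come from my actual run; the optimizer choice, EarlyStopping settings, and the trainX/trainY/testX/testY names are placeholders):
import tensorflow as tf

opt = tf.keras.optimizers.Adam(learning_rate=0.001)  # optimizer choice is an assumption
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=5,
                                              restore_best_weights=True)
history = model.fit(trainX, trainY,                  # placeholder training arrays
                    validation_data=(testX, testY),  # placeholder validation arrays
                    batch_size=20, epochs=100,
                    callbacks=[early_stop])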
I have the following neural network, written in Keras using Tensorflow as the backend, which I'm running on Python 3.5 (Anaconda) on Windows 10:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(100, input_dim=283, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(150, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(4, init='normal', activation='sigmoid'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
I'm training on my GPU. During training (10000 epochs), the accuracy of the naive network steadily increases from 0.25 to somewhere between 0.7 and 0.9, before suddenly dropping and sticking at 0.25:
Epoch 1/10000
6120/6120 [==============================] - 1s - loss: 1.5329 - acc: 0.2665
Epoch 2/10000
6120/6120 [==============================] - 1s - loss: 1.2985 - acc: 0.3784
Epoch 3/10000
6120/6120 [==============================] - 1s - loss: 1.2259 - acc: 0.4891
Epoch 4/10000
6120/6120 [==============================] - 1s - loss: 1.1867 - acc: 0.5208
Epoch 5/10000
6120/6120 [==============================] - 1s - loss: 1.1494 - acc: 0.5199
Epoch 6/10000
6120/6120 [==============================] - 1s - loss: 1.1042 - acc: 0.4953
Epoch 7/10000
6120/6120 [==============================] - 1s - loss: 1.0491 - acc: 0.4982
Epoch 8/10000
6120/6120 [==============================] - 1s - loss: 1.0066 - acc: 0.5065
Epoch 9/10000
6120/6120 [==============================] - 1s - loss: 0.9749 - acc: 0.5338
Epoch 10/10000
6120/6120 [==============================] - 1s - loss: 0.9456 - acc: 0.5696
Epoch 11/10000
6120/6120 [==============================] - 1s - loss: 0.9252 - acc: 0.5995
Epoch 12/10000
6120/6120 [==============================] - 1s - loss: 0.9111 - acc: 0.6106
Epoch 13/10000
6120/6120 [==============================] - 1s - loss: 0.8772 - acc: 0.6160
Epoch 14/10000
6120/6120 [==============================] - 1s - loss: 0.8517 - acc: 0.6245
Epoch 15/10000
6120/6120 [==============================] - 1s - loss: 0.8170 - acc: 0.6345
Epoch 16/10000
6120/6120 [==============================] - 1s - loss: 0.7850 - acc: 0.6428
Epoch 17/10000
6120/6120 [==============================] - 1s - loss: 0.7633 - acc: 0.6580
Epoch 18/10000
6120/6120 [==============================] - 4s - loss: 0.7375 - acc: 0.6717
Epoch 19/10000
6120/6120 [==============================] - 1s - loss: 0.7058 - acc: 0.6850
Epoch 20/10000
6120/6120 [==============================] - 1s - loss: 0.6787 - acc: 0.7018
Epoch 21/10000
6120/6120 [==============================] - 1s - loss: 0.6557 - acc: 0.7093
Epoch 22/10000
6120/6120 [==============================] - 1s - loss: 0.6304 - acc: 0.7208
Epoch 23/10000
6120/6120 [==============================] - 1s - loss: 0.6052 - acc: 0.7270
Epoch 24/10000
6120/6120 [==============================] - 1s - loss: 0.5848 - acc: 0.7371
Epoch 25/10000
6120/6120 [==============================] - 1s - loss: 0.5564 - acc: 0.7536
Epoch 26/10000
6120/6120 [==============================] - 1s - loss: 0.1787 - acc: 0.4163
Epoch 27/10000
6120/6120 [==============================] - 1s - loss: 1.1921e-07 - acc: 0.2500
Epoch 28/10000
6120/6120 [==============================] - 1s - loss: 1.1921e-07 - acc: 0.2500
Epoch 29/10000
6120/6120 [==============================] - 1s - loss: 1.1921e-07 - acc: 0.2500
Epoch 30/10000
6120/6120 [==============================] - 2s - loss: 1.1921e-07 - acc: 0.2500
Epoch 31/10000
6120/6120 [==============================] - 1s - loss: 1.1921e-07 - acc: 0.2500
Epoch 32/10000
6120/6120 [==============================] - 1s - loss: 1.1921e-07 - acc: 0.2500 ...
I'm guessing that this is due to the optimiser falling into a local minimum where it assigns all data to one category. How can I inhibit it from doing this?
Things I've tried (but didn't seem to stop this from happening):
Using a different optimiser (adam)
Ensuring that the training data included an equal number of examples from each category
Increasing the volume of training data (currently at 6000)
Varying the number of categories between 2 to 5
Increasing the number of hidden layers in the network from 1 to 5
Changing the width of the layers (from 50 to 500)
None of these helped. Any other ideas why this is happening and/or how to inhibit it? Could it be a bug in Keras? Many thanks in advance for any suggestions.
Edit:
The problem appears to have been solved by changing the final activation to softmax (from sigmoid) and adding maxnorm(3) regularization to the final two hidden layers:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD
from keras.constraints import maxnorm

model = Sequential()
model.add(Dense(100, input_dim=npoints, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(150, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu', W_constraint=maxnorm(3)))
model.add(Dropout(0.2))
model.add(Dense(200, init='normal', activation='relu', W_constraint=maxnorm(3)))
model.add(Dropout(0.2))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.add(Dense(ncat, init='normal', activation='softmax'))
model.compile(loss='mean_squared_error', optimizer=sgd, metrics=['accuracy'])
Many thanks for the suggestions.
The problem lay in using a sigmoid function as the activation of the last layer. In that case the output of your final layer cannot be interpreted as a probability distribution over the classes an example belongs to; the outputs from this layer usually don't even sum to 1, and the optimization may then lead to unexpected behaviour. In my opinion adding the maxnorm constraint is not necessary, but I strongly advise you to use categorical_crossentropy instead of an mse loss, as that loss is known to work better for this kind of optimization problem.
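For completeness, a minimal sketch of the corrected head (ncat and the layer sizes are only illustrative; the key points are the softmax output and the categorical_crossentropy loss):
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD

ncat = 4  # number of categories, illustrative

model = Sequential()
model.add(Dense(100, input_dim=283, init='normal', activation='relu'))
model.add(Dropout(0.2))
# ... further hidden layers as in the edit above ...
model.add(Dense(ncat, init='normal', activation='softmax'))  # outputs now sum to 1

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])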