Accuracy doesn't improve in training model character recognition - python

I am building a training model for my character recognition system. During every epochs, I am getting the same accuracy and it doesn't improve. I have currently 4000 training images and 77 validation images.
My model is as follows:
inputs = Input(shape=(32,32,3))
x = Conv2D(filters = 64, kernel_size = 5, activation = 'relu')(inputs)
x = MaxPooling2D()(x)
x = Conv2D(filters = 32,
kernel_size = 3,
activation = 'relu')(x)
x = MaxPooling2D()(x)
x = Flatten()(x)
x=Dense(256,
activation='relu')(x)
outputs = Dense(1, activation = 'softmax')(x)
model = Model(inputs = inputs, outputs = outputs)
model.compile(
optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
data_gen_train = ImageDataGenerator(rescale=1/255)
data_gen_test=ImageDataGenerator(rescale=1/255)
data_gen_valid = ImageDataGenerator(rescale=1/255)
train_generator = data_gen_train.flow_from_directory(directory=r"./drive/My Drive/train_dataset",
target_size=(32,32), batch_size=10, class_mode="binary")
valid_generator = data_gen_valid.flow_from_directory(directory=r"./drive/My
Drive/validation_dataset", target_size=(32,32), batch_size=2, class_mode="binary")
test_generator = data_gen_test.flow_from_directory(
directory=r"./drive/My Drive/test_dataset",
target_size=(32, 32),
batch_size=6,
class_mode="binary"
)
model.fit(
train_generator,
epochs =10,
steps_per_epoch=400,
validation_steps=37,
validation_data=valid_generator)
The result is as follows:
Found 4000 images belonging to 2 classes.
Found 77 images belonging to 2 classes.
Found 6 images belonging to 2 classes.
Epoch 1/10
400/400 [==============================] - 14s 35ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 2/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 3/10
400/400 [==============================] - 13s 34ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 4/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 5/10
400/400 [==============================] - 18s 46ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5541
Epoch 6/10
400/400 [==============================] - 13s 34ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 7/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5676
Epoch 8/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5946
Epoch 9/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
Epoch 10/10
400/400 [==============================] - 13s 33ms/step - loss: 0.0000e+00 - accuracy: 0.5000 - val_loss: 0.0000e+00 - val_accuracy: 0.5811
<tensorflow.python.keras.callbacks.History at 0x7fa3a5f4a8d0>

If you are trying to recognize charaters of 2 classes, you should:
use class_mode="binary" in the flow_from_directory function
use binary_crossentropy as loss
your last layer must have 1 neuron with sigmoid activation function
In case there are more than 2 classes:
do not use class_mode="binary" in the flow_from_directory function
use categorical_crossentropy as loss
your last layer must have n neurons with softmax activation, where n stands for the number of classes

Related

Very low validation accuracy but high training accuracy

I copied some sample code straight from Keras official website and edited it to make a machine learning model.
I am using Google Colab for my code.
Link: https://keras.io/examples/vision/image_classification_from_scratch/
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from google.colab import drive
drive.mount("/content/gdrive")
image_size = (50, 50)
batch_size = 400
import random
num = random.randint(1, 400)
#random seed
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
"/content/gdrive/My Drive/pest/train",
seed=num,
image_size=image_size,
batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
"/content/gdrive/My Drive/pest/test",
seed=num,
image_size=image_size,
batch_size=batch_size,
)
#tried data augmentation
data_augmentation = keras.Sequential(
[
layers.RandomFlip("horizontal"),
layers.RandomRotation(0.1),
]
)
def make_model(input_shape, num_classes):
inputs = keras.Input(shape=input_shape)
# Image augmentation block
x = data_augmentation(inputs)
# Entry block
x = layers.Rescaling(1.0 / 255)(x)
x = layers.Conv2D(32, 3, strides=2, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.Conv2D(64, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
previous_block_activation = x # Set aside residual
for size in [128, 256, 512, 728]:
x = layers.Activation("relu")(x)
x = layers.SeparableConv2D(size, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.SeparableConv2D(size, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
# Project residual
residual = layers.Conv2D(size, 1, strides=2, padding="same")(
previous_block_activation
)
x = layers.add([x, residual]) # Add back residual
previous_block_activation = x # Set aside next residual
x = layers.SeparableConv2D(1024, 3, padding="same")(x)
x = layers.BatchNormalization()(x)
x = layers.Activation("relu")(x)
x = layers.GlobalAveragePooling2D()(x)
if num_classes == 2:
activation = "sigmoid"
units = 1
else:
activation = "softmax"
units = num_classes
x = layers.Dropout(0.5)(x)
outputs = layers.Dense(units, activation=activation)(x)
return keras.Model(inputs, outputs)
model = make_model(input_shape=image_size + (3,), num_classes=2)
keras.utils.plot_model(model, show_shapes=True)
epochs = 50
callbacks = [
keras.callbacks.ModelCheckpoint("save_at_{epoch}.h5"),
]
model.compile(
optimizer=keras.optimizers.Adam(1e-3),
loss="binary_crossentropy",
metrics=["accuracy"],
)
model.fit(
train_ds, epochs=epochs, callbacks=callbacks, validation_data=val_ds,
#should have automatic shuffling I think
)
However, when I run it, the result is
Epoch 1/50
2/2 [==============================] - 71s 14s/step - loss: 0.6260 - accuracy: 0.6050 - val_loss: 0.6931 - val_accuracy: 0.5000
Epoch 2/50
2/2 [==============================] - 2s 507ms/step - loss: 0.2689 - accuracy: 0.8867 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 3/50
2/2 [==============================] - 2s 536ms/step - loss: 0.1241 - accuracy: 0.9483 - val_loss: 0.6932 - val_accuracy: 0.5000
Epoch 4/50
2/2 [==============================] - 2s 506ms/step - loss: 0.0697 - accuracy: 0.9750 - val_loss: 0.6934 - val_accuracy: 0.5000
Epoch 5/50
2/2 [==============================] - 2s 525ms/step - loss: 0.0479 - accuracy: 0.9867 - val_loss: 0.6936 - val_accuracy: 0.5000
Epoch 6/50
2/2 [==============================] - 2s 534ms/step - loss: 0.0359 - accuracy: 0.9867 - val_loss: 0.6940 - val_accuracy: 0.5000
Epoch 7/50
2/2 [==============================] - 2s 509ms/step - loss: 0.0145 - accuracy: 0.9983 - val_loss: 0.6946 - val_accuracy: 0.5000
Epoch 8/50
2/2 [==============================] - 2s 545ms/step - loss: 0.0124 - accuracy: 0.9967 - val_loss: 0.6954 - val_accuracy: 0.5000
Epoch 9/50
2/2 [==============================] - 2s 544ms/step - loss: 0.0092 - accuracy: 0.9967 - val_loss: 0.6964 - val_accuracy: 0.5000
Epoch 10/50
2/2 [==============================] - 2s 512ms/step - loss: 0.0060 - accuracy: 0.9967 - val_loss: 0.6980 - val_accuracy: 0.5000
Epoch 11/50
2/2 [==============================] - 2s 535ms/step - loss: 0.0036 - accuracy: 0.9983 - val_loss: 0.6998 - val_accuracy: 0.5000
Epoch 12/50
2/2 [==============================] - 2s 503ms/step - loss: 0.0085 - accuracy: 0.9983 - val_loss: 0.7020 - val_accuracy: 0.5000
Epoch 13/50
2/2 [==============================] - 2s 665ms/step - loss: 0.0040 - accuracy: 1.0000 - val_loss: 0.7046 - val_accuracy: 0.5000
Epoch 14/50
2/2 [==============================] - 2s 516ms/step - loss: 0.0017 - accuracy: 1.0000 - val_loss: 0.7078 - val_accuracy: 0.5000
Epoch 15/50
2/2 [==============================] - 2s 520ms/step - loss: 0.0023 - accuracy: 0.9983 - val_loss: 0.7115 - val_accuracy: 0.5000
Epoch 16/50
2/2 [==============================] - 2s 500ms/step - loss: 8.5606e-04 - accuracy: 1.0000 - val_loss: 0.7157 - val_accuracy: 0.5000
Epoch 17/50
2/2 [==============================] - 2s 524ms/step - loss: 0.0018 - accuracy: 1.0000 - val_loss: 0.7205 - val_accuracy: 0.5000
Epoch 18/50
2/2 [==============================] - 2s 499ms/step - loss: 9.0626e-04 - accuracy: 1.0000 - val_loss: 0.7258 - val_accuracy: 0.5000
Epoch 19/50
2/2 [==============================] - 2s 510ms/step - loss: 0.0014 - accuracy: 1.0000 - val_loss: 0.7313 - val_accuracy: 0.5000
Epoch 20/50
2/2 [==============================] - 2s 711ms/step - loss: 0.0013 - accuracy: 1.0000 - val_loss: 0.7371 - val_accuracy: 0.5000
Epoch 21/50
2/2 [==============================] - 2s 511ms/step - loss: 9.9904e-04 - accuracy: 1.0000 - val_loss: 0.7431 - val_accuracy: 0.5000
Epoch 22/50
2/2 [==============================] - 2s 540ms/step - loss: 0.0019 - accuracy: 1.0000 - val_loss: 0.7489 - val_accuracy: 0.5000
Epoch 23/50
2/2 [==============================] - 2s 513ms/step - loss: 4.9861e-04 - accuracy: 1.0000 - val_loss: 0.7553 - val_accuracy: 0.5000
Epoch 24/50
2/2 [==============================] - 2s 542ms/step - loss: 6.6248e-04 - accuracy: 1.0000 - val_loss: 0.7622 - val_accuracy: 0.5000
Epoch 25/50
2/2 [==============================] - 2s 510ms/step - loss: 7.7911e-04 - accuracy: 1.0000 - val_loss: 0.7699 - val_accuracy: 0.5000
Epoch 26/50
2/2 [==============================] - 2s 502ms/step - loss: 3.3703e-04 - accuracy: 1.0000 - val_loss: 0.7781 - val_accuracy: 0.5000
Epoch 27/50
2/2 [==============================] - 2s 539ms/step - loss: 3.7860e-04 - accuracy: 1.0000 - val_loss: 0.7870 - val_accuracy: 0.5000
Epoch 28/50
2/2 [==============================] - 2s 507ms/step - loss: 2.4852e-04 - accuracy: 1.0000 - val_loss: 0.7962 - val_accuracy: 0.5000
Epoch 29/50
2/2 [==============================] - 2s 512ms/step - loss: 1.7709e-04 - accuracy: 1.0000 - val_loss: 0.8058 - val_accuracy: 0.5000
Epoch 30/50
2/2 [==============================] - 2s 538ms/step - loss: 1.6884e-04 - accuracy: 1.0000 - val_loss: 0.8161 - val_accuracy: 0.5000
Epoch 31/50
2/2 [==============================] - 2s 521ms/step - loss: 2.0884e-04 - accuracy: 1.0000 - val_loss: 0.8266 - val_accuracy: 0.5000
Epoch 32/50
2/2 [==============================] - 2s 543ms/step - loss: 1.8691e-04 - accuracy: 1.0000 - val_loss: 0.8375 - val_accuracy: 0.5000
Epoch 33/50
2/2 [==============================] - 2s 520ms/step - loss: 1.7296e-04 - accuracy: 1.0000 - val_loss: 0.8487 - val_accuracy: 0.5000
Epoch 34/50
2/2 [==============================] - 2s 516ms/step - loss: 4.5739e-04 - accuracy: 1.0000 - val_loss: 0.8601 - val_accuracy: 0.5000
Epoch 35/50
2/2 [==============================] - 2s 530ms/step - loss: 9.6831e-05 - accuracy: 1.0000 - val_loss: 0.8720 - val_accuracy: 0.5000
Epoch 36/50
2/2 [==============================] - 2s 553ms/step - loss: 1.2694e-04 - accuracy: 1.0000 - val_loss: 0.8847 - val_accuracy: 0.5000
Epoch 37/50
2/2 [==============================] - 2s 514ms/step - loss: 8.6252e-05 - accuracy: 1.0000 - val_loss: 0.8977 - val_accuracy: 0.5000
Epoch 38/50
2/2 [==============================] - 2s 520ms/step - loss: 2.6762e-04 - accuracy: 1.0000 - val_loss: 0.9115 - val_accuracy: 0.5000
Epoch 39/50
2/2 [==============================] - 2s 542ms/step - loss: 8.1350e-05 - accuracy: 1.0000 - val_loss: 0.9258 - val_accuracy: 0.5000
Epoch 40/50
2/2 [==============================] - 2s 506ms/step - loss: 8.0961e-05 - accuracy: 1.0000 - val_loss: 0.9405 - val_accuracy: 0.5000
Epoch 41/50
2/2 [==============================] - 2s 526ms/step - loss: 6.6102e-05 - accuracy: 1.0000 - val_loss: 0.9555 - val_accuracy: 0.5000
Epoch 42/50
2/2 [==============================] - 2s 549ms/step - loss: 1.1529e-04 - accuracy: 1.0000 - val_loss: 0.9707 - val_accuracy: 0.5000
Epoch 43/50
2/2 [==============================] - 2s 528ms/step - loss: 6.1373e-05 - accuracy: 1.0000 - val_loss: 0.9864 - val_accuracy: 0.5000
Epoch 44/50
2/2 [==============================] - 2s 516ms/step - loss: 7.2809e-05 - accuracy: 1.0000 - val_loss: 1.0025 - val_accuracy: 0.5000
Epoch 45/50
2/2 [==============================] - 2s 513ms/step - loss: 5.9504e-05 - accuracy: 1.0000 - val_loss: 1.0191 - val_accuracy: 0.5000
Epoch 46/50
2/2 [==============================] - 2s 515ms/step - loss: 6.1622e-05 - accuracy: 1.0000 - val_loss: 1.0361 - val_accuracy: 0.5000
Epoch 47/50
2/2 [==============================] - 2s 525ms/step - loss: 7.7296e-05 - accuracy: 1.0000 - val_loss: 1.0534 - val_accuracy: 0.5000
Epoch 48/50
2/2 [==============================] - 2s 512ms/step - loss: 4.5088e-05 - accuracy: 1.0000 - val_loss: 1.0711 - val_accuracy: 0.5000
Epoch 49/50
2/2 [==============================] - 2s 532ms/step - loss: 1.1449e-04 - accuracy: 1.0000 - val_loss: 1.0887 - val_accuracy: 0.5000
Epoch 50/50
2/2 [==============================] - 2s 516ms/step - loss: 6.0932e-05 - accuracy: 1.0000 - val_loss: 1.1071 - val_accuracy: 0.5000
<keras.callbacks.History at 0x7fb4205a20d0>
Since I have 2 classes, my teacher said that a validation accuracy of 0.5 means that it is completely random.
My images are in the format of 50x50 .jpg images in Google Drive. Could that be the problem as my current image size is 50x50? But when I run
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 10))
for images, labels in train_ds.take(1):
for i in range(9):
ax = plt.subplot(3, 3, i + 1)
plt.imshow(images[i].numpy().astype("uint8"))
plt.title(int(labels[i]))
plt.axis("off")
The images are correct, as in the entire image is shown and is clear.
I tried changing the seed to a random number. The code comes with data augmentation and the model.fit() should automatically shuffle the images (if I understood the online sites correctly).
My teacher does not know what is wrong either. Any solutions?
Edit: this is the dataset
https://www.kaggle.com/datasets/simranvolunesia/pest-dataset
Edit2: Sorry for the confusion but I only used two datasets, aphids and bollworm.
Edit: You are also using binary_crossentropy for a multi-class classification problem, yet you're forcing it to only have two classes when your passed dataset contains nine.
model = make_model(input_shape=image_size + (3,), num_classes=2)
According to your dataset, the classes are:
Pests: aphids, armyworm, beetle, bollworm, grasshopper, mites, mosquito, sawfly, stem borer
I don't see where you're only working with two classes, unless there's some code missing somewhere that removes the other seven. This site (https://keras.io/examples/vision/image_classification_from_scratch/) is classifying into two classes: cat or dog. That's probably where you got two classes from.
So that line needs to be changed to:
model = make_model(input_shape=image_size + (3,), num_classes=9)
Change this:
model.compile(
optimizer=keras.optimizers.Adam(1e-3),
loss="binary_crossentropy",
metrics=["accuracy"],
)
To:
model.compile(
optimizer=keras.optimizers.Adam(1e-3),
loss="sparse_categorical_crossentropy",
metrics=["accuracy"],
)
You might also need to change that metric from accuracy to binary_accuracy. Try with just accuracy first, then with binary_accuracy.
model.compile(
optimizer=keras.optimizers.Adam(1e-3),
loss="sparse_categorical_crossentropy",
metrics=["binary_accuracy"],
According to the documentation, you are not splitting your validation data correctly and probably dealing with the default shuffling too.
Define your datasets like this (assuming a 20% validation split):
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
"/content/gdrive/My Drive/pest/train",
validation_split=0.2,
subset="training"
seed=num,
image_size=image_size,
batch_size=batch_size,
)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
"/content/gdrive/My Drive/pest/train",
validation_split=0.2,
subset="validation"
seed=num,
image_size=image_size,
batch_size=batch_size,
)
# with test folder for test set
test_ds = tf.keras.preprocessing.image_dataset_from_directory(
"/content/gdrive/My Drive/pest/test",
image_size=image_size,
batch_size=batch_size,
shuffle=False
)
the answer provided by Djinn is correct. Also your are including augmentation WITHIN your model. So your model will augment not only the training images but the validation and test images as well. Test and Validation images should not be augmented. If you want augmentation then use the ImageDataGenerator.flow_from_directory. Documentation for that is here
I’ve found the problem. It is because I have quite few images (only 300 in each class), my batch size is too big. val_accuracy is around 0.8 to 0.9 after changing the batch size to 8. Thanks everyone for the answers!

Transfer Learning Neural Network is not learning

First of all, I know that there is a similar thread here:
https://stats.stackexchange.com/questions/352036/what-should-i-do-when-my-neural-network-doesnt-learn
But unfortunately, it does not help. I probably have a bug inside my code which I cannot find. What I am trying to do is to classify some WAV files. But the model does not learn.
At first, I am collecting the files and saving them in an array.
Second, create new directories, one for train data and one for val data.
Next, I am reading the WAV files, creating spectrograms, and saving them all to the train directory.
Afterward, I am moving 20% of the data from the train directory to the val directory.
Note: While creating the spectrograms I am checking the length of the WAV. If it is too short (less than 2 sec), I am doubling it. Out of this spectrogram, I am cutting a random chunk and saving only this. As a result, all images do have the same height and width.
Then as the next step, I am loading the train and val images. And here I am also doing the normalization.
IMG_WIDTH=300
IMG_HEIGHT=300
IMG_DIM = (IMG_WIDTH, IMG_HEIGHT, 3)
train_files = glob.glob(DBMEL_PATH + "*",recursive=True)
train_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in train_files]
train_imgs = np.array(train_imgs) / 255 # normalizing Data
train_labels = [fn.split('\\')[-1].split('.')[1].strip() for fn in train_files]
validation_files = glob.glob(DBMEL_VAL_PATH + "*",recursive=True)
validation_imgs = [img_to_array(load_img(img, target_size=IMG_DIM)) for img in validation_files]
validation_imgs = np.array(validation_imgs) / 255 # normalizing Data
validation_labels = [fn.split('\\')[-1].split('.')[1].strip() for fn in validation_files]
I have checked the variables and printing them. I guess this is working quite well. The arrays contain 80% and respectively 20% of the total data.
#Train dataset shape: (3756, 300, 300, 3)
#Validation dataset shape: (939, 300, 300, 3)
Next, I have also implemented a One-Hot-Encoder.
So far so good. In the next step I create empty DataGenerators, so without any data augmentation. When calling the DataGenerators, one time for train-data and one time for val-data, I'll pass the arrays for images (train_imgs, validation_imgs) and the one-hot-encoded-labels (train_labels_enc, validation_labels_enc).
Okay. Here now comes the tricky part.
First, create/load a pre-trained network
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import Model
import tensorflow.keras
input_shape=(IMG_HEIGHT,IMG_WIDTH,3)
restnet = ResNet50(include_top=False, weights='imagenet', input_shape=(IMG_HEIGHT,IMG_WIDTH,3))
output = restnet.layers[-1].output
output = tensorflow.keras.layers.Flatten()(output)
restnet = Model(restnet.input, output)
for layer in restnet.layers:
layer.trainable = False
And now finally creating the model itself. While creating the model I am using the pre-trained network for transfer learning. I guess somewhere there must be a problem.
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, InputLayer
from tensorflow.keras.models import Sequential
from tensorflow.keras import optimizers
model = Sequential()
model.add(restnet) # <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
And the models run with this
history = model.fit_generator(train_generator,
steps_per_epoch=100,
epochs=100,
validation_data=val_generator,
validation_steps=10,
verbose=1
)
But even after 50 epochs the accuracy stalls at around 0.15
Epoch 1/100
100/100 [==============================] - 711s 7s/step - loss: 10.6419 - accuracy: 0.1530 - val_loss: 1.9416 - val_accuracy: 0.1467
Epoch 2/100
100/100 [==============================] - 733s 7s/step - loss: 1.9595 - accuracy: 0.1550 - val_loss: 1.9372 - val_accuracy: 0.1267
Epoch 3/100
100/100 [==============================] - 731s 7s/step - loss: 1.9940 - accuracy: 0.1444 - val_loss: 1.9388 - val_accuracy: 0.1400
Epoch 4/100
100/100 [==============================] - 735s 7s/step - loss: 1.9416 - accuracy: 0.1535 - val_loss: 1.9380 - val_accuracy: 0.1733
Epoch 5/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1656 - val_loss: 1.9345 - val_accuracy: 0.1533
Epoch 6/100
100/100 [==============================] - 741s 7s/step - loss: 1.9364 - accuracy: 0.1667 - val_loss: 1.9286 - val_accuracy: 0.1767
Epoch 7/100
100/100 [==============================] - 740s 7s/step - loss: 1.9389 - accuracy: 0.1523 - val_loss: 1.9305 - val_accuracy: 0.1400
Epoch 8/100
100/100 [==============================] - 737s 7s/step - loss: 1.9394 - accuracy: 0.1623 - val_loss: 1.9441 - val_accuracy: 0.1667
Epoch 9/100
100/100 [==============================] - 735s 7s/step - loss: 1.9391 - accuracy: 0.1582 - val_loss: 1.9458 - val_accuracy: 0.1333
Epoch 10/100
100/100 [==============================] - 734s 7s/step - loss: 1.9381 - accuracy: 0.1602 - val_loss: 1.9372 - val_accuracy: 0.1700
Epoch 11/100
100/100 [==============================] - 739s 7s/step - loss: 1.9392 - accuracy: 0.1623 - val_loss: 1.9302 - val_accuracy: 0.2167
Epoch 12/100
100/100 [==============================] - 741s 7s/step - loss: 1.9368 - accuracy: 0.1627 - val_loss: 1.9326 - val_accuracy: 0.1467
Epoch 13/100
100/100 [==============================] - 740s 7s/step - loss: 1.9381 - accuracy: 0.1513 - val_loss: 1.9312 - val_accuracy: 0.1733
Epoch 14/100
100/100 [==============================] - 736s 7s/step - loss: 1.9396 - accuracy: 0.1542 - val_loss: 1.9407 - val_accuracy: 0.1367
Epoch 15/100
100/100 [==============================] - 741s 7s/step - loss: 1.9393 - accuracy: 0.1597 - val_loss: 1.9336 - val_accuracy: 0.1333
Epoch 16/100
100/100 [==============================] - 737s 7s/step - loss: 1.9375 - accuracy: 0.1659 - val_loss: 1.9354 - val_accuracy: 0.1267
Epoch 17/100
100/100 [==============================] - 741s 7s/step - loss: 1.9422 - accuracy: 0.1487 - val_loss: 1.9307 - val_accuracy: 0.1567
Epoch 18/100
100/100 [==============================] - 738s 7s/step - loss: 1.9399 - accuracy: 0.1680 - val_loss: 1.9408 - val_accuracy: 0.1567
Epoch 19/100
100/100 [==============================] - 743s 7s/step - loss: 1.9405 - accuracy: 0.1610 - val_loss: 1.9335 - val_accuracy: 0.1533
Epoch 20/100
100/100 [==============================] - 738s 7s/step - loss: 1.9410 - accuracy: 0.1575 - val_loss: 1.9331 - val_accuracy: 0.1533
Epoch 21/100
100/100 [==============================] - 746s 7s/step - loss: 1.9395 - accuracy: 0.1639 - val_loss: 1.9344 - val_accuracy: 0.1733
Epoch 22/100
100/100 [==============================] - 746s 7s/step - loss: 1.9393 - accuracy: 0.1585 - val_loss: 1.9354 - val_accuracy: 0.1667
Epoch 23/100
100/100 [==============================] - 746s 7s/step - loss: 1.9398 - accuracy: 0.1599 - val_loss: 1.9352 - val_accuracy: 0.1500
Epoch 24/100
100/100 [==============================] - 746s 7s/step - loss: 1.9392 - accuracy: 0.1585 - val_loss: 1.9449 - val_accuracy: 0.1667
Epoch 25/100
100/100 [==============================] - 746s 7s/step - loss: 1.9399 - accuracy: 0.1495 - val_loss: 1.9352 - val_accuracy: 0.1600
Can anyone please help to find the problem?
I solved the problem on my own.
I exchanged this
model = Sequential()
model.add(restnet) # <-- transfer learning
model.add(Dense(512, activation='relu', input_dim=input_shape))# 512 (num_classes)
model.add(Dropout(0.3))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(7, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
model.summary()
with this:
base_model = tf.keras.applications.MobileNetV2(input_shape = (224, 224, 3), include_top = False, weights = "imagenet")
model = Sequential()
model.add(base_model)
model.add(tf.keras.layers.GlobalAveragePooling2D())
model.add(Dropout(0.2))
model.add(Dense(number_classes, activation="softmax"))
model.compile(optimizer=tf.keras.optimizers.Adam(lr=0.00001),
loss="categorical_crossentropy",
metrics=['accuracy'])
model.summary()
And I found out one more thing. In contrary to some tutorials, using data augmentation is not useful when working with spectrograms.
Without data augmentation I got 0.99 on train-accuracy and 0.72 on val-accuracy. But with data augmentation I got only 0.75 on train-accuracy and 0.16 on val-accuracy.

Twitter Sentiment Analysis Constant Zero (0.0000e+00) Loss value

I'm trying to figure out why model's Loss value is always 0.0, so the accuracy seems to be constant as well (which is incorrect in my case, afaik).
Code snippet:
model = Sequential()
model.add(Embedding(vocab_size, glove_vectors.vector_size, weights=[embedding_matrix], input_length=X.shape[1]))
model.add(Dropout(0.5))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss='categorical_crossentropy', optimizer="adam", metrics=["accuracy"])
model.summary()
EPOCHS = 20
train_data, test_data, train_labels, test_labels = train_test_split(X, Y, test_size=0.20, random_state = 42)
print(train_data.shape, train_labels.shape)
print(test_data.shape, test_labels.shape)
val_data = (test_data, test_labels)
history = model.fit(train_data, train_labels, validation_data=val_data, epochs=EPOCHS)
score = model.evaluate(test_data, test_labels)
Output:
Epoch 1/20
25/25 [==============================] - 4s 69ms/step - loss: 0.0000e+00 - accuracy: 0.5241 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 2/20
25/25 [==============================] - 1s 55ms/step - loss: 0.0000e+00 - accuracy: 0.4927 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 3/20
25/25 [==============================] - 1s 55ms/step - loss: 0.0000e+00 - accuracy: 0.5110 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 4/20
25/25 [==============================] - 1s 56ms/step - loss: 0.0000e+00 - accuracy: 0.5074 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 5/20
25/25 [==============================] - 1s 55ms/step - loss: 0.0000e+00 - accuracy: 0.5363 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 6/20
25/25 [==============================] - 1s 53ms/step - loss: 0.0000e+00 - accuracy: 0.5042 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
Epoch 7/20
25/25 [==============================] - 1s 54ms/step - loss: 0.0000e+00 - accuracy: 0.4904 - val_loss: 0.0000e+00 - val_accuracy: 0.4650
In binary classification there will be 1 node in the output layer even though we will be predicting between two classes. In order to get the output in a probability format between 0 and 1 we will use the sigmoid function.
Hence binary_crossentropy is the correct loss function in your case
model.compile(loss='binary_crossentropy', optimizer="adam", metrics=["accuracy"])

How can I prevent my model from being overfitted?

I'm a newbie with deep learning and I try to create a model and I don't really understand the model. add(layers). I m sure that the input shape (it's for recognition). I think the problem is in the Dropout, but I don't understand the value.
Can someone explains to me the
model = models.Sequential()
model.add(layers.Conv2D(32, (3,3), activation = 'relu', input_shape = (128,128,3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, (3,3), activation = 'relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dropout(0.5))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(6, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer=optimizers.Adam(lr=1e-4), metrics=['acc'])
-------------------------------------------------------
history = model.fit(
train_data,
train_labels,
epochs=30,
validation_data=(test_data, test_labels),
)
and here is the result :
Epoch 15/30
5/5 [==============================] - 0s 34ms/step - loss: 0.3987 - acc: 0.8536 - val_loss: 0.7021 - val_acc: 0.7143
Epoch 16/30
5/5 [==============================] - 0s 31ms/step - loss: 0.3223 - acc: 0.8891 - val_loss: 0.6393 - val_acc: 0.7778
Epoch 17/30
5/5 [==============================] - 0s 32ms/step - loss: 0.3321 - acc: 0.9082 - val_loss: 0.6229 - val_acc: 0.7460
Epoch 18/30
5/5 [==============================] - 0s 31ms/step - loss: 0.2615 - acc: 0.9409 - val_loss: 0.6591 - val_acc: 0.8095
Epoch 19/30
5/5 [==============================] - 0s 32ms/step - loss: 0.2161 - acc: 0.9857 - val_loss: 0.6368 - val_acc: 0.7143
Epoch 20/30
5/5 [==============================] - 0s 33ms/step - loss: 0.1773 - acc: 0.9857 - val_loss: 0.5644 - val_acc: 0.7778
Epoch 21/30
5/5 [==============================] - 0s 32ms/step - loss: 0.1650 - acc: 0.9782 - val_loss: 0.5459 - val_acc: 0.8413
Epoch 22/30
5/5 [==============================] - 0s 31ms/step - loss: 0.1534 - acc: 0.9789 - val_loss: 0.5738 - val_acc: 0.7460
Epoch 23/30
5/5 [==============================] - 0s 32ms/step - loss: 0.1205 - acc: 0.9921 - val_loss: 0.5351 - val_acc: 0.8095
Epoch 24/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0967 - acc: 1.0000 - val_loss: 0.5256 - val_acc: 0.8413
Epoch 25/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0736 - acc: 1.0000 - val_loss: 0.5493 - val_acc: 0.7937
Epoch 26/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0826 - acc: 1.0000 - val_loss: 0.5342 - val_acc: 0.8254
Epoch 27/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0687 - acc: 1.0000 - val_loss: 0.5452 - val_acc: 0.8254
Epoch 28/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0571 - acc: 1.0000 - val_loss: 0.5176 - val_acc: 0.7937
Epoch 29/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0549 - acc: 1.0000 - val_loss: 0.5142 - val_acc: 0.8095
Epoch 30/30
5/5 [==============================] - 0s 32ms/step - loss: 0.0479 - acc: 1.0000 - val_loss: 0.5243 - val_acc: 0.8095
I never depassed the 70% average but on this i have 80% but i think i'm on overfitting.. I evidemently searched on differents docs but i'm lost
Have you try following into your training:
Data Augmentation
Pre-trained Model
Looking at the execution time per epoch, it looks like your data set is pretty small. Also, it's not clear whether there is any class imbalance in your dataset. You probably should try stratified CV training and analysis on the folds results. It won't prevent overfit but it will eventually give you more insight into your model, which generally can help to reduce overfitting. However, preventing overfitting is a general topic, search online to get resources. You can also try this
model.compile(loss='categorical_crossentropy',
optimizer='adam, metrics=['acc'])
-------------------------------------------------------
# src: https://keras.io/api/callbacks/reduce_lr_on_plateau/
# reduce learning rate by a factor of 0.2 if val_loss -
# won't improve within 5 epoch.
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2,
patience=5, min_lr=0.00001)
# src: https://keras.io/api/callbacks/early_stopping/
# stop training if val_loss don't improve within 15 epoch.
early_stop = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15)
history = model.fit(
train_data,
train_labels,
epochs=30,
validation_data=(test_data, test_labels),
callbacks=[reduce_lr, early_stop]
)
You may also find it useful of using ModelCheckpoint or LearningRateScheduler. This doesn't guarantee of no overfit but some approach for that to adopt.

Different converging for different training data representations (Numpy array and TensorFlow Dataset API)

I am using Keras with TensorFlow backend to train an LSTM network for some time-sequential data sets. The performance seems pretty good when I represent my training data (as well as the validation data) in the Numpy array format:
train_x.shape: (128346, 10, 34)
val_x.shape: (7941, 10, 34)
test_x.shape: (24181, 10, 34)
train_y.shape: (128346, 2)
val_y.shape: (7941, 2)
test_y.shape: (24181, 2)
P.s., 10 is the time steps and 34 is the number of features; The labels were one-hot encoded.
model = tf.keras.Sequential()
model.add(layers.LSTM(_HIDDEN_SIZE, return_sequences=True,
input_shape=(_TIME_STEPS, _FEATURE_DIMENTIONS)))
model.add(layers.Dropout(0.4))
model.add(layers.LSTM(_HIDDEN_SIZE, return_sequences=True))
model.add(layers.Dropout(0.3))
model.add(layers.TimeDistributed(layers.Dense(_NUM_CLASSES)))
model.add(layers.Flatten())
model.add(layers.Dense(_NUM_CLASSES, activation='softmax'))
opt = tf.keras.optimizers.Adam(lr = _LR)
model.compile(optimizer = opt, loss = 'categorical_crossentropy',
metrics = ['accuracy'])
model.fit(train_x,
train_y,
epochs=_EPOCH,
batch_size = _BATCH_SIZE,
verbose = 1,
validation_data = (val_x, val_y)
)
And the training results are:
Train on 128346 samples, validate on 7941 samples
Epoch 1/10
128346/128346 [==============================] - 50s 390us/step - loss: 0.5883 - acc: 0.6975 - val_loss: 0.5242 - val_acc: 0.7416
Epoch 2/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.4804 - acc: 0.7687 - val_loss: 0.4265 - val_acc: 0.8014
Epoch 3/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.4232 - acc: 0.8076 - val_loss: 0.4095 - val_acc: 0.8096
Epoch 4/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3894 - acc: 0.8276 - val_loss: 0.3529 - val_acc: 0.8469
Epoch 5/10
128346/128346 [==============================] - 49s 382us/step - loss: 0.3610 - acc: 0.8430 - val_loss: 0.3283 - val_acc: 0.8593
Epoch 6/10
128346/128346 [==============================] - 49s 382us/step - loss: 0.3402 - acc: 0.8525 - val_loss: 0.3334 - val_acc: 0.8558
Epoch 7/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3233 - acc: 0.8604 - val_loss: 0.2944 - val_acc: 0.8741
Epoch 8/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.3087 - acc: 0.8663 - val_loss: 0.2786 - val_acc: 0.8805
Epoch 9/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.2969 - acc: 0.8709 - val_loss: 0.2785 - val_acc: 0.8777
Epoch 10/10
128346/128346 [==============================] - 49s 383us/step - loss: 0.2867 - acc: 0.8757 - val_loss: 0.2590 - val_acc: 0.8877
This log seems pretty normal, but when I tried to use TensorFlow Dataset API to represent my data sets, the training process performed very strange (it seems that the model turns to overfit/underfit?):
def tfdata_generator(features, labels, is_training = False, batch_size = _BATCH_SIZE, epoch = _EPOCH):
dataset = tf.data.Dataset.from_tensor_slices((features, tf.cast(labels, dtype = tf.uint8)))
if is_training:
dataset = dataset.shuffle(10000) # depends on sample size
dataset = dataset.batch(batch_size, drop_remainder = True).repeat(epoch).prefetch(batch_size)
return dataset
training_set = tfdata_generator(train_x, train_y, is_training=True)
validation_set = tfdata_generator(val_x, val_y, is_training=False)
testing_set = tfdata_generator(test_x, test_y, is_training=False)
Training on the same model and hyperparameters:
model.fit(
training_set.make_one_shot_iterator(),
epochs = _EPOCH,
steps_per_epoch = len(train_x) // _BATCH_SIZE,
verbose = 1,
validation_data = validation_set.make_one_shot_iterator(),
validation_steps = len(val_x) // _BATCH_SIZE
)
And the log seems much different from the previous one:
Epoch 1/10
2005/2005 [==============================] - 54s 27ms/step - loss: 0.1451 - acc: 0.9419 - val_loss: 3.2980 - val_acc: 0.4975
Epoch 2/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1675 - acc: 0.9371 - val_loss: 3.0838 - val_acc: 0.4975
Epoch 3/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1821 - acc: 0.9316 - val_loss: 3.1212 - val_acc: 0.4975
Epoch 4/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1902 - acc: 0.9287 - val_loss: 3.0032 - val_acc: 0.4975
Epoch 5/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1905 - acc: 0.9283 - val_loss: 2.9671 - val_acc: 0.4975
Epoch 6/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1867 - acc: 0.9299 - val_loss: 2.8734 - val_acc: 0.4975
Epoch 7/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1802 - acc: 0.9316 - val_loss: 2.8651 - val_acc: 0.4975
Epoch 8/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1740 - acc: 0.9350 - val_loss: 2.8793 - val_acc: 0.4975
Epoch 9/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1660 - acc: 0.9388 - val_loss: 2.7894 - val_acc: 0.4975
Epoch 10/10
2005/2005 [==============================] - 49s 24ms/step - loss: 0.1613 - acc: 0.9405 - val_loss: 2.7997 - val_acc: 0.4975
The validation loss could not be reduced and the val_acc always the same value when I use the TensorFlow Dataset API to represent my data.
My questions are:
Based on the same model and parameters, why the model.fit() provides such different training results when I merely adopted tf.data.Dataset API?
What the difference between these two mechanisms?
model.fit(train_x,
train_y,
epochs=_EPOCH,
batch_size = _BATCH_SIZE,
verbose = 1,
validation_data = (val_x, val_y)
)
vs
model.fit(
training_set.make_one_shot_iterator(),
epochs = _EPOCH,
steps_per_epoch = len(train_x) // _BATCH_SIZE,
verbose = 1,
validation_data = validation_set.make_one_shot_iterator(),
validation_steps = len(val_x) // _BATCH_SIZE
)
How to solve this strange problem if I have to use tf.data.Dataset API?

Categories

Resources