I have set up a model to classify whether an image comes from a certain video game or not. I pre-scaled my images to 250x250 pixels and separated them into two folders (the two binary classes) labelled 0 and 1. The two class counts are within ~100 of each other, and I have around 3500 images in total.
Here are photos of the training process, the model setup and some predictions: https://imgur.com/a/CN1b6LV
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0,
    zoom_range=0,
    horizontal_flip=True,
    width_shift_range=0.1,
    height_shift_range=0.1,
    validation_split=0.2)

train_generator = train_datagen.flow_from_directory(
    'data\\',
    batch_size=batchsize,
    shuffle=True,
    target_size=(250, 250),
    subset="training",
    class_mode="binary")

val_generator = train_datagen.flow_from_directory(
    'data\\',
    batch_size=batchsize,
    shuffle=True,
    target_size=(250, 250),
    subset="validation",
    class_mode="binary")

pred_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0,
    zoom_range=0,
    horizontal_flip=False,
    width_shift_range=0.1,
    height_shift_range=0.1)

pred_generator = pred_datagen.flow_from_directory(
    'batch_pred\\',
    batch_size=30,
    shuffle=False,
    target_size=(250, 250))
model = Sequential()
model.add(Conv2D(input_shape=(250, 250, 3), filters=25, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
dense = False
if dense:
    model.add(Flatten())
    model.add(Dense(250, activation="relu"))
    model.add(BatchNormalization())
    model.add(Dense(50, activation="relu"))
else:
    model.add(GlobalAveragePooling2D())
model.add(Dense(1, activation="softmax"))
model.compile(loss='binary_crossentropy',
              optimizer=Adam(0.0005), metrics=["acc"])

callbacks = [EarlyStopping(monitor='val_acc', patience=200, verbose=1),
             ModelCheckpoint(filepath="model_checkpoint.h5py",
                             monitor='val_acc', save_best_only=True, verbose=1)]

model.fit_generator(
    train_generator,
    steps_per_epoch=train_generator.samples // batchsize,
    validation_data=val_generator,
    validation_steps=val_generator.samples // batchsize,
    epochs=500,
    callbacks=callbacks)
Everything appears to run correctly: the model iterates over the data epoch by epoch, it finds the correct number of images, and so on. However, my predictions always come out at 50%, despite good validation accuracy, low loss, high accuracy, etc.
I'm not sure what I'm doing wrong, and any help would be appreciated.
I think your problem is that you're using softmax for binary classification; with a single output unit, your final-layer activation should be sigmoid.
The problem is that you are using softmax on a Dense layer with one unit. The softmax function normalizes its input so that the sum of its elements equals one, so if the layer has only one unit, the output is always 1. For binary classification, you instead need the sigmoid function as the activation of the last layer.
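To see why the output is stuck, here is a minimal sketch (plain NumPy for the demonstration; the last two snippets reuse the model, pred_generator, and Adam from the code above):

import numpy as np

# Softmax over a single logit z is e^z / e^z = 1, regardless of z,
# so a Dense(1, activation="softmax") head can only ever output 1.
def softmax(z):
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([3.7])))    # [1.]
print(softmax(np.array([-50.0])))  # [1.]

# Corrected head: replace the final layer with a single sigmoid unit.
model.add(Dense(1, activation="sigmoid"))
model.compile(loss='binary_crossentropy', optimizer=Adam(0.0005), metrics=["acc"])

# At prediction time, threshold the sigmoid probability at 0.5.
probs = model.predict_generator(pred_generator, steps=len(pred_generator))
labels = (probs > 0.5).astype(int)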
Related
When fitting the CNN model, I get:
training accuracy = 68%
validation accuracy = 63%
But when I evaluate the same CNN model on the same training and validation data, the results are:
Training Loss = 1.86870, Training Accuracy = 36.16%
Val Loss = 1.89060, Val Accuracy = 36.54%
Testing Loss = 1.86273, Testing Accuracy = 36.36%
There is a big gap between these accuracies. What can be the reason behind this?
The CNN model:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation, Dropout, Normalization, AveragePooling2D, BatchNormalization
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.keras import regularizers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(height, width, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Conv2D(32, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Conv2D(32, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Conv2D(32, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=2))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(256, activation="relu"))
model.add(BatchNormalization())
model.add(Dropout(.5))
model.add(Dense(126, activation="relu"))
model.add(BatchNormalization())
model.add(Dropout(.5))
model.add(Dense(4,activation='softmax'))
opt = Adam(learning_rate=0.001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
epochs = 50
import time
from tensorflow.keras import callbacks
from tensorflow.keras.callbacks import EarlyStopping

es_callback = EarlyStopping(monitor='val_accuracy', patience=20)
datagen = ImageDataGenerator(
    horizontal_flip=True,
    vertical_flip=True,
    featurewise_std_normalization=True,
    samplewise_std_normalization=True
)

checkpoint = callbacks.ModelCheckpoint(
    filepath='/content/drive/MyDrive/model.{epoch:02d}-{accuracy:.2f}-{val_accuracy:.2f}.h5',
    monitor='val_accuracy',
    verbose=1,
    save_best_only=True,
    mode='auto'
)

datagen.fit(X_train)
history5 = model.fit(datagen.flow(X_train, y_train, batch_size=batch_size),
                     epochs=epochs,
                     validation_data=datagen.flow(X_val, y_val, batch_size=batch_size),
                     callbacks=[checkpoint])
stop = time.time()
Here, in the fit call, I applied augmentation to the validation data as well as the training data, which is not correct.
Data augmentation is used to expand the training set and generate more diverse images, so it should be applied only to the training data. Test and validation data must be left untouched. A minimal sketch of the usual split follows.
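This sketch reuses X_train, y_train, X_val, y_val, model, batch_size, epochs, and checkpoint from the code above:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation applies to the training data only.
train_datagen = ImageDataGenerator(horizontal_flip=True,
                                   vertical_flip=True)

# Validation data passes through unmodified.
val_datagen = ImageDataGenerator()

history = model.fit(train_datagen.flow(X_train, y_train, batch_size=batch_size),
                    epochs=epochs,
                    validation_data=val_datagen.flow(X_val, y_val, batch_size=batch_size, shuffle=False),
                    callbacks=[checkpoint])

Note that normalization (featurewise_std_normalization, etc.) is different from augmentation: if you keep it, fit the statistics on the training set (datagen.fit(X_train)) but apply the same transform to both generators and at evaluation time; otherwise the model sees differently scaled inputs during training and evaluation, which is one plausible cause of the gap above.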
I'm working on an image colorization model using auto-encoders. Here are the parameters I'm using:
Training set size: 26,000 images
Validation set size: 2,600 images
Epochs: 40 (it does well over the first 20 epochs, but stops improving over the other 20, as shown in the picture)
Mini-batch size: 32
Optimizer: Adam (lr = 2e-4)
The input of the auto-encoder is the first channel of the image's Lab color space (the grayscale L channel), and the output is the last two channels (the ab color information). Here is part of my code:
IMG_SIZE = 256
N_EPOCHS = 20
BATCH_SIZE = 32
latent_dim = 256
train_datagen = ImageDataGenerator(rescale=1./255,
                                   horizontal_flip=True)
train_set = train_datagen.flow_from_directory('melange/tr/',
                                              target_size=(IMG_SIZE, IMG_SIZE),
                                              batch_size=BATCH_SIZE)
valid_datagen = ImageDataGenerator(rescale=1./255)
valid_set = valid_datagen.flow_from_directory('melange/vl/',
                                              target_size=(IMG_SIZE, IMG_SIZE),
                                              batch_size=BATCH_SIZE)
def gen_ab_images(train_set):
    # Convert each RGB batch to Lab; yield the L channel as input and the
    # ab channels (scaled to roughly [-1, 1] to match the tanh output) as target.
    for batch in train_set:
        lab_batch = rgb2lab(batch[0])
        X_batch = lab_batch[:, :, :, 0]
        Y_batch = lab_batch[:, :, :, 1:] / 128
        yield (X_batch.reshape(X_batch.shape + (1,)), Y_batch)
model = Sequential()
model.add(Conv2D(64, (3, 3), activation='relu', padding='same', strides=2, input_shape=(256, 256, 1)))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same', strides=2))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same', strides=2))
model.add(Conv2D(512, (3,3), activation='relu', padding='same'))
model.add(Conv2D(512, (3,3), activation='relu', padding='same'))
model.add(Conv2D(256, (3,3), activation='relu', padding='same'))
model.add(Conv2D(128, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(64, (3,3), activation='relu', padding='same'))
model.add(UpSampling2D((2, 2)))
model.add(Conv2D(32, (3,3), activation='relu', padding='same'))
model.add(Conv2D(16, (3,3), activation='relu', padding='same'))
model.add(Conv2D(2, (3, 3), activation='tanh', padding='same'))
model.add(UpSampling2D((2, 2)))
valeurs = model.fit(x=gen_ab_images(train_set),
                    epochs=N_EPOCHS,
                    validation_data=gen_ab_images(valid_set),
                    steps_per_epoch=len(train_set),
                    validation_steps=len(valid_set),
                    shuffle=True)
How can I improve the performance of this auto-encoder? And how can I speed up the training, given that it takes a really long time (4.5 hours for 20 epochs, and I run it twice)?
I have built a simple CNN for CIFAR-10 using tensorflow-gpu 2.4.0:
# Set constants
num_classes = len(label_names)  # The number of classes
input_shape = (32, 32, 3)  # The input shape: 32 x 32 pixels, 3 RGB channels per pixel
# Construct a sequential model
model = Sequential()
# Start adding layers to the model
# First VGG block
model.add(Conv2D(16, (3, 3), padding='same', activation='relu', input_shape=input_shape))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(Conv2D(16, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(MaxPooling2D((2, 2))) # MaxPooling2D to keep important features while shrinking the spatial size
model.add(Dropout(0.1)) # Drop some portion of nodes to increase generality of model
# Second VGG block
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
# Third VGG block
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.3))
# Fourth VGG block
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.4))
# Flatten and dense layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
# Final softmax layer
model.add(Dense(num_classes, activation='softmax'))
# The optimizer sets the learning rate and a few other parameters.
opt = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, amsgrad=False)
# Compile the model
model.compile(loss='categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
And while I am training this model:
# Train model
epochs = 100
batch_size = 128

# Create the data generator
datagen = ImageDataGenerator(horizontal_flip=True,
                             width_shift_range=5,
                             height_shift_range=5,
                             rotation_range=20)

history = model.fit(datagen.flow(x_training, y_training, batch_size=batch_size),
                    epochs=epochs,
                    validation_data=(x_validate, y_validate),
                    verbose=1,
                    shuffle=True)
The RAM of my computer keeps growing until the training process crashes. Adding a Python garbage-collection pass at the end of each epoch (via a callback; a sketch is shown below) did not help, and neither did disabling eager execution.
I am using:
tensorflow-gpu 2.4.0
CUDA 11.0
cuDNN 8.0
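For reference, the per-epoch garbage-collection attempt mentioned above presumably looked something like this (a sketch, since the original callback code isn't shown; the class name is illustrative):

import gc
from tensorflow import keras

class GarbageCollectCallback(keras.callbacks.Callback):
    # Force a full garbage-collection pass at the end of every epoch.
    def on_epoch_end(self, epoch, logs=None):
        gc.collect()

# Passed to training alongside the existing arguments:
# history = model.fit(..., callbacks=[GarbageCollectCallback()])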
I'm working on a binary classification problem. I was getting 69% accuracy at first, but I kept running out of memory, so I shrank certain parameters; now the accuracy comes out as 0. Any idea what's going on?
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.layers.core import Activation
from keras.preprocessing.image import ImageDataGenerator

model = Sequential()
model.add(Conv2D(96, kernel_size=11, padding="same", input_shape=(300, 300, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Flatten())
# model.add(Dense(units=1000, activation='relu'))
model.add(Dense(units=300, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation("softmax"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

datagen = ImageDataGenerator(
    featurewise_center=True,
    rotation_range=90,
    fill_mode='nearest',
    validation_split=0.2
)
datagen.fit(train)
train_generator = datagen.flow(train, train_labels, batch_size=8)

# Fit the model on batches with real-time data augmentation:
history = model.fit_generator(generator=train_generator,
                              use_multiprocessing=True,
                              steps_per_epoch=len(train_generator) / 8,
                              epochs=5,
                              workers=20)
Softmax should only be used for multiclass classification problems. Your final Dense layer has a single output unit, so you should use sigmoid.
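A minimal sketch of the corrected head for the model above (everything before the final Dense stays the same):

model.add(Dense(1))
model.add(Activation("sigmoid"))  # sigmoid, not softmax, for a single-unit binary output
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

Equivalently, you could keep a softmax by using Dense(2, activation='softmax') with one-hot labels and categorical_crossentropy; the two formulations are interchangeable for binary problems.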
I cannot tell whether this is due to a technical mistake or to the hyper-parameters, but my DC-GAN's discriminator loss starts low and gradually climbs, slowing down around 8, while my generator loss drops steadily. I stopped training at about 60,000 epochs. Funnily enough, the discriminator accuracy seems to float around 20-50%. Does anybody have suggestions for fixing the problem? Any help is appreciated.
Important Info
Data format: 472 color PNG files, 320x224.
Optimizer: Adam(0.0002, 0.5)
Loss: Binary cross-entropy
A generated image after 50,000+ epochs (supposed to be a sneaker on a white background):
Discriminator Model:
def build_discriminator(self):
    img_shape = (self.img_size[0], self.img_size[1], self.channels)

    model = Sequential()
    model.add(Conv2D(32, kernel_size=self.kernel_size, strides=2, input_shape=img_shape, padding="same"))  # 192x256 -> 96x128
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, kernel_size=self.kernel_size, strides=2, padding="same"))  # 96x128 -> 48x64
    model.add(ZeroPadding2D(padding=((0, 1), (0, 1))))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Conv2D(128, kernel_size=self.kernel_size, strides=2, padding="same"))  # 48x64 -> 24x32
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Conv2D(256, kernel_size=self.kernel_size, strides=1, padding="same"))  # 24x32 (stride 1, no downsampling)
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(512, kernel_size=self.kernel_size, strides=1, padding="same"))  # still 24x32 (stride 1)
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))
    model.summary()

    img = Input(shape=img_shape)
    validity = model(img)
    return Model(img, validity)
Generator Model:
def build_generator(self):
    noise_shape = (100,)

    model = Sequential()
    model.add(Dense(self.starting_filters
                    * (self.img_size[0] // (2 ** self.upsample_layers))
                    * (self.img_size[1] // (2 ** self.upsample_layers)),
                    activation="relu", input_shape=noise_shape))
    model.add(Reshape(((self.img_size[0] // (2 ** self.upsample_layers)),
                       (self.img_size[1] // (2 ** self.upsample_layers)),
                       self.starting_filters)))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())  # 6x8 -> 12x16
    model.add(Conv2D(1024, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())  # 12x16 -> 24x32
    model.add(Conv2D(512, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())  # 24x32 -> 48x64
    model.add(Conv2D(256, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())  # 48x64 -> 96x128
    model.add(Conv2D(128, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(UpSampling2D())  # 96x128 -> 192x256
    model.add(Conv2D(64, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Conv2D(32, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("relu"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Conv2D(self.channels, kernel_size=self.kernel_size, padding="same"))
    model.add(Activation("tanh"))
    model.summary()

    noise = Input(shape=noise_shape)
    img = model(noise)
    return Model(noise, img)
It sounds totally understandable to me that you are having this problem: your networks are unbalanced, and the generator is much more powerful than the discriminator in terms of the number of neurons. I would try to make the generator and discriminator symmetric to each other in the number of layers, their configuration, and their size, so that neither is de facto stronger than the other.
I guess you are using your own dataset. If not, consider using the same hyperparameters as in the DCGAN paper and reproducing their results first; then use that network on your dataset. DCGANs, especially with cross-entropy loss, are known to be really tricky and extremely hyperparameter-sensitive to get running (see https://arxiv.org/abs/1711.10337), especially if you did not constrain your discriminator with some kind of gradient penalty (a gradient-penalty term or spectral normalization; a sketch of the latter follows below).
In GANs, using your own ideas for the hyperparameters is mostly a bad idea, unless you either really know what you are doing or have loads of GPU power to burn on a large-scale search.
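As a concrete illustration of the spectral-norm suggestion: each discriminator layer can be wrapped so that its kernel is rescaled by its largest singular value, which bounds the discriminator's Lipschitz constant. A minimal sketch, assuming TensorFlow Addons (tfa.layers.SpectralNormalization) and the 320x224 RGB images mentioned above (height 224, width 320 is an assumption about the orientation):

import tensorflow_addons as tfa
from tensorflow.keras.layers import Conv2D, Dense, Flatten, LeakyReLU
from tensorflow.keras.models import Sequential

disc = Sequential()
# The wrapper divides each kernel by a power-iteration estimate of its
# largest singular value, constraining the discriminator's gradients.
disc.add(tfa.layers.SpectralNormalization(
    Conv2D(32, kernel_size=3, strides=2, padding="same"),
    input_shape=(224, 320, 3)))
disc.add(LeakyReLU(alpha=0.2))
disc.add(tfa.layers.SpectralNormalization(
    Conv2D(64, kernel_size=3, strides=2, padding="same")))
disc.add(LeakyReLU(alpha=0.2))
disc.add(Flatten())
disc.add(tfa.layers.SpectralNormalization(Dense(1, activation="sigmoid")))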