Accuracy Equals 0 CNN Python Keras - python

I'm working on a binary classification problem. I was getting 69% accuracy at first, but I kept running out of memory, so I shrank some parameters, and now the accuracy comes out as 0. Any idea what's going on?
model = Sequential()
from keras.layers import Dropout
model.add(Conv2D(96, kernel_size=11, padding="same", input_shape=(300, 300, 1), activation = 'relu'))
model.add(MaxPooling2D(pool_size=(3, 3), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
model.add(Conv2D(128, kernel_size=3, padding="same", activation = 'relu'))
model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
from keras.layers.core import Activation
model.add(Flatten())
# model.add(Dense(units=1000, activation='relu' ))
model.add(Dense(units= 300, activation='relu'))
model.add(Dropout(0.2))
model.add(Dense(1))
model.add(Activation("softmax"))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
from keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(
featurewise_center=True,
rotation_range=90,
fill_mode='nearest',
validation_split = 0.2
)
datagen.fit(train)
train_generator = datagen.flow(train, train_labels, batch_size=8)
# # fits the model on batches with real-time data augmentation:
history = model.fit_generator(generator=train_generator,
use_multiprocessing=True,
steps_per_epoch = len(train_generator) / 8,
epochs = 5,
workers=20)

Softmax normalizes across the output units, so it only makes sense when the final layer has more than one unit (one per class). Your final Dense layer has a single output, for which softmax always returns 1, so you should use sigmoid instead.
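To see why, here is a minimal sketch in plain Python (no Keras needed; the helper names softmax and sigmoid are my own, not Keras functions). Softmax rescales a vector so that it sums to 1, so over a single logit it returns 1.0 whatever the input, while sigmoid maps the same logit to a value between 0 and 1.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Logistic function, the usual activation for a single binary output."""
    return 1.0 / (1.0 + math.exp(-z))

for z in (-5.0, 0.0, 5.0):
    # Softmax over one unit is exp(0) / exp(0), i.e. 1.0 for every input.
    print(z, softmax([z])[0], round(sigmoid(z), 4))
```

In the model above, this would presumably mean replacing Activation("softmax") with Activation("sigmoid") after Dense(1). Keeping softmax is also possible, but then the last layer needs two units and a categorical loss.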

Related

CNN model gives accuracy of 63% on Validation Images but when predicted on the same validation set, accuracy is 36%

When fitting the CNN model, I get:
training accuracy = 68%
validation accuracy = 63%
but when I try to evaluate the same CNN model with the same validation data and training data the results are,
Training Loss 1.86870 Training Accuracy = 36.16%
Val Loss 1.89060 val Accuracy = 36.54%
Testing Loss 1.86273 Testing Accuracy = 36.36%
There is a big gap between these accuracies.
What can be the reason behind this?
The CNN model:
import tensorflow as tf
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Activation ,Dropout, Normalization, AveragePooling2D, BatchNormalization
from tensorflow.keras.optimizers import Adam, Adamax
from tensorflow.python.keras import regularizers
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(height, width, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Conv2D(32, (3, 3), input_shape=(height, width, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Conv2D(32, (3, 3), input_shape=(height, width, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Conv2D(32, (3, 3), input_shape=(height, width, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Dropout(.2))
model.add(Conv2D(64, (3, 3), padding='same', activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2), padding='same', strides=(2)))
model.add(Flatten()) # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(256, activation="relu"))
model.add(BatchNormalization())
model.add(Dropout(.5))
model.add(Dense(126, activation="relu"))
model.add(BatchNormalization())
model.add(Dropout(.5))
model.add(Dense(4,activation='softmax'))
Adam(learning_rate=0.001, name='Adam')
model.compile(optimizer = 'Adam',loss = 'categorical_crossentropy',metrics = ['accuracy'])
epochs = 50
from tensorflow.keras import callbacks
import time
import keras
from keras.callbacks import EarlyStopping
es_callback = keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=20)
datagen = ImageDataGenerator(
horizontal_flip=True,
vertical_flip=True,
featurewise_std_normalization=True,
samplewise_std_normalization=True
)
checkpoint = callbacks.ModelCheckpoint(
filepath='/content/drive/MyDrive/model.{epoch:02d}-{accuracy:.2f}-{val_accuracy:.2f}.h5',
monitor='val_accuracy',
verbose=1,
save_best_only=True,
mode='auto'
)
datagen.fit(X_train)
history5 = model.fit(datagen.flow(X_train,y_train, batch_size=batch_size),
epochs = epochs, validation_data = datagen.flow(X_val,y_val, batch_size=batch_size),
callbacks=[ checkpoint]
)
stop = time.time()
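The problem is in the fit call: I passed the validation data through the same augmenting generator as the training data, which is not correct.
Data augmentation is used to expand the training set and generate more diverse images, so it should be applied only to the training data. Validation and test data must not be augmented; they should only receive the same deterministic preprocessing (such as normalization) that the model saw during training.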
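As a rough illustration of that split, here is a sketch in plain Python rather than the Keras API (all function names and the toy data are hypothetical). Normalization statistics are computed from the training set and reused everywhere, while the random flip is applied to training samples only. In Keras terms this would probably correspond to a second ImageDataGenerator without the flips for validation, but that mapping is my assumption, not something shown in the question.

```python
import random
import statistics

def fit_stats(train_rows):
    """Compute mean and std from the training data only."""
    values = [v for row in train_rows for v in row]
    return statistics.mean(values), statistics.pstdev(values)

def normalize(row, mean, std):
    """Deterministic preprocessing: applied to train, validation and test alike."""
    return [(v - mean) / (std + 1e-6) for v in row]

def augment(row, rng):
    """Random horizontal flip: applied to training samples only."""
    return row[::-1] if rng.random() < 0.5 else row

def train_pipeline(row, mean, std, rng):
    return normalize(augment(row, rng), mean, std)

def eval_pipeline(row, mean, std):
    # No randomness here, so repeated evaluation sees identical inputs.
    return normalize(row, mean, std)

train_rows = [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]  # toy "images", one row each
val_rows = [[1.0, 2.0, 6.0]]
mean, std = fit_stats(train_rows)
rng = random.Random(0)

augmented = [train_pipeline(r, mean, std, rng) for r in train_rows]
validated = [eval_pipeline(r, mean, std) for r in val_rows]
print(augmented)
print(validated)
```

The point of the design is that eval_pipeline is a pure function of its input, so the metrics reported during fitting and by a later evaluation are computed on the same data.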

Keras fit_generator() doesn't train properly

I am trying to create an image classifier using Keras and TensorFlow 2.0.0 backend.
I'm training this model on my local machine on a custom dataset containing about 17 thousand images in total. The images vary in size and are located in three different folders (training, validation, and test), each containing two subfolders (one for each class).
I tried an architecture similar to VGG16, which yielded more than decent results on this dataset in the past. Note that there is a minor class imbalance in the data (52:48).
When I call fit_generator(), the model doesn't train well: although the training loss decreases slightly during the first epoch, it does not change much afterward. Using this architecture with stronger regularization, I achieved 85% accuracy after about 55 epochs in the past.
Imports and hyperparameters
import tensorflow as tf
from tensorflow import keras
from keras import backend as k
from keras.layers import Dense, Dropout, Conv2D, MaxPooling2D, Flatten, Input, UpSampling2D
from keras.models import Sequential, Model, load_model
from keras.utils import to_categorical
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import ModelCheckpoint
TRAIN_PATH = 'data/train/'
VALID_PATH = 'data/validation/'
TEST_PATH = 'data/test/'
TARGET_SIZE = (256, 256)
RESCALE = 1.0 / 255
COLOR_MODE = 'grayscale'
EPOCHS = 2
BATCH_SIZE = 16
CLASSES = ['Damselflies', 'Dragonflies']
CLASS_MODE = 'categorical'
CHECKPOINT = "checkpoints/weights.hdf5"
Model
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
In the past, I created a custom pipeline to reshape, grayscale, flip, and normalize the images; then, I trained the model using my CPU on batches of processed images.
I tried repeating the process using ImageDataGenerator, flow_from_directory, and GPU support.
# randomly flip images, and scale pixel values
trainGenerator = ImageDataGenerator(rescale=RESCALE,
horizontal_flip=True,
vertical_flip=True)
# only scale the pixel values validation images
validatioinGenerator = ImageDataGenerator(rescale=RESCALE)
# only scale the pixel values test images
testGenerator = ImageDataGenerator(rescale=RESCALE)
# instantiate train flow
trainFlow = trainGenerator.flow_from_directory(
TRAIN_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode = CLASS_MODE,
shuffle=True
)
# instantiate validation flow
validationFlow = validatioinGenerator.flow_from_directory(
VALID_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode= CLASS_MODE,
shuffle=True
)
Then, fitting the model using fit_generator.
checkpoints = ModelCheckpoint(CHECKPOINT, monitor='val_accuracy', verbose=1, save_best_only=True, mode='max')
with tf.device('/GPU:0'):
    model.fit_generator(
        trainFlow,
        validation_data=validationFlow,
        callbacks=[checkpoints],
        epochs=EPOCHS
    )
I tried training it for 40 epochs.
The classifier achieves 52% after the first epoch and does not improve as time goes by.
Testing the classifier
testFlow = testGenerator.flow_from_directory(
TEST_PATH,
target_size = TARGET_SIZE,
batch_size = BATCH_SIZE,
classes = CLASSES,
color_mode = COLOR_MODE,
class_mode= CLASS_MODE,
)
ans = model.predict_generator(testFlow)
When I look at the predictions, the model predicts all the test images as the majority class with the same confidence [0.48498476, 0.51501524].
Have I made sure the data is correct?
Yes. I tested whether the generators yield processed images and their corresponding labels correctly.
Have I tried changing the loss function, activation function, and optimizer?
Yes. I tried changing the class mode to binary, the loss to binary_crossentropy, and changing the last layer to produce a single output with sigmoid activation. No, I did not change the optimizer. However, I did try to increase the learning rate.
Have I tried changing the model's architecture?
Yes. I tried increasing and decreasing model complexity.
Both more layers with less regularization and fewer layers with more regularization produced similar results.
Are the layers trainable?
Yes.
Is the GPU support implemented correctly?
I hope so.
print("Num GPUs Available: ", len(tf.config.experimental.list_physical_devices('GPU')))
Num GPUs Available: 1
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
config = tf.compat.v1.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True
sess = tf.compat.v1.Session(config=config)
print(sess)
Device mapping:
/job:localhost/replica:0/task:0/device:GPU:0 -> device: 0, name: NVIDIA GeForce GTX 1050 with Max-Q Design, pci bus id: 0000:03:00.0, compute capability: 6.1
<tensorflow.python.client.session.Session object at 0x000001F9443E2CC0>
Have I tried transfer learning?
Not yet.
I found a similar unanswered question from 2017 keras-doesnt-train-using-fit-generator.
Thoughts?
The problem is with your model. I copied your code and ran it on a data set I have used before (which gets high accuracy) and got results similar to yours. I then substituted the simple model below
model = tf.keras.Sequential([
Conv2D(16, 3, padding='same', activation='relu', input_shape=(256 , 256,1)),
MaxPooling2D(),
Conv2D(32, 3, padding='same', activation='relu' ),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(128, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(256, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(128, activation='relu'),
Dropout(.3),
Dense(64, activation='relu'),
Dropout(.3),
Dense(2, activation='softmax')
])
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
The model trained properly. By the way, model.fit_generator is deprecated; model.fit now accepts generators directly. I then took your model, removed all the dropout layers except the last one, and it also trained properly. The code is:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu',
input_shape=(256, 256, 1), padding='same'))
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
#model.add(Dropout(0.1))
model.add(Flatten())
model.add(Dense(516, activation='relu'))
#model.add(Dropout(0.1))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.1))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy',
optimizer='Adam', metrics=['accuracy'])
@Gerry P,
By accident, I found what's causing the error.
Removing the line from keras import backend as k resolved the model's inability to learn.
That's not all. I also identified that the model you defined, not calling ModelCheckpoint, and not customizing class names affected the fitting process.
model = Sequential([
Conv2D(16, 3, padding='same', activation='relu', input_shape=(256 , 256, 1)),
MaxPooling2D(),
Conv2D(32, 3, padding='same', activation='relu' ),
MaxPooling2D(),
Conv2D(64, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(128, 3, padding='same', activation='relu'),
MaxPooling2D(),
Conv2D(256, 3, padding='same', activation='relu'),
MaxPooling2D(),
Flatten(),
Dense(128, activation='relu'),
Dropout(.3),
Dense(64, activation='relu'),
Dropout(.3),
Dense(2, activation='softmax')
])
I commented out that import to try to resolve an error that occurred when I copy-pasted your sequential model. Then I forgot to uncomment it when I tested the model on a "beautiful or average" dataset, where I achieved over 80% accuracy after the third epoch. Then I reverted the changes and tried it on my dataset, and it failed again.
As a bonus, not importing Keras's backend decreased the time it takes to train the model!
Lately, I had to re-install Keras and TensorFlow because they couldn't detect my GPU anymore. I probably made a mistake and installed an incompatible version of Keras.
CUDA==10.0
tensorflow-gpu==2.0.0
keras==2.3.1
Note, it's still not a 100% solution, and the problems arise every so often.
EDIT:
Whenever it doesn't work, simplify the model.
Changed batch size and stopped learning? Simplify the model.
Augmented the images further and stopped learning? Simplify the model.

Graph disconnected: cannot obtain value for tensor KerasTensor() Transfer learning

I'm trying to implement transfer learning on my own model but failing. My implementation follows the guides here
https://keras.io/guides/transfer_learning/
How to do transfer-learning on our own models?
https://github.com/anujshah1003/Transfer-Learning-in-keras---custom-data/blob/master/transfer_learning_resnet50_custom_data.py
tensorflow 2.4.1
Keras 2.4.3
Old Model (Works really well):
model = Sequential()
inputShape = (256, 256, 3)
chanDim = -1
# CONV => RELU => POOL
model.add(Conv2D(32, (3, 3), padding="same", input_shape=inputShape))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(3, 3)))
model.add(Dropout(0.25))
# (CONV => RELU) * 2 => POOL
model.add(Conv2D(64, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(64, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# (CONV => RELU) * 2 => POOL
model.add(Conv2D(128, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(Conv2D(128, (3, 3), padding="same"))
model.add(Activation("relu"))
model.add(BatchNormalization(axis=chanDim))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
# first (and only) set of FC => RELU layers
model.add(Flatten())
model.add(Dense(1024))
model.add(Activation("relu"))
model.add(BatchNormalization())
model.add(Dropout(0.5))
# softmax classifier
model.add(Dense(classes))
model.add(Activation("softmax"))
Transfer Learning:
old_model = load_model('old.model')
# removes top 2 activation layers
for i in range(2):
    old_model.pop()
# mark loaded layers as not trainable
for layer in old_model.layers:
    layer.trainable = False
# initialize the new model
in_puts = Input(shape=(256, 256, 3))
count = len(old_model.layers)
ll = old_model.layers[count - 1].output
classes = len(lb.classes_)
ll = Dense(classes)(ll)
ll = Activation("softmax", name="activation3_" + NODE)(ll)
model = Model(inputs=in_puts, outputs=ll) # ERROR
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])
# train the network
H = model.fit(x_train, y_train, validation_data=(x_test, y_test), steps_per_epoch=len(x_train) // BS, epochs=EPOCHS, verbose=1)
# save the model to disk
model.save("new.model")
ERROR
ValueError: Graph disconnected: cannot obtain value for tensor
KerasTensor(type_spec=TensorSpec(shape=(None, 256, 256, 3), dtype=tf.float32,
name='conv2d_input'), name='conv2d_input', description="created by layer 'conv2d_input'") at
layer "conv2d". The following previous layers were accessed without issue: []
Here is a simple way to apply transfer learning to your model:
classes = 10
sub_old_model = Model(old_model.input, old_model.layers[-3].output)
sub_old_model.trainable = False
ll = Dense(classes)(sub_old_model.output)
ll = Activation("softmax")(ll)
model = Model(inputs=sub_old_model.input, outputs=ll)
Firstly, create a sub-model with layers from the old model that you want to freeze (trainable = False). In our example, we take all the layers excluding the last Dense and the Softmax activation.
Then pass the sub-model output into the new trainable layers.
At this point, you simply need to create a new model instance to assemble all the pieces.
With help from @Marco Cerliani, I was able to solve the problem by changing to LabelEncoder and sparse_categorical_crossentropy.
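The comment above does not show the changed code, so the following is only a guess at what the switch amounts to, written in plain Python with hypothetical helper names and made-up labels. A label encoder maps each class name to an integer index (scikit-learn's LabelEncoder uses sorted order), and the sparse loss takes that index directly instead of a one-hot vector. Both losses give the same value for the same prediction.

```python
import math

def encode_labels(labels):
    """Map each distinct label to an integer index, in sorted order."""
    classes = sorted(set(labels))
    index = {c: i for i, c in enumerate(classes)}
    return classes, [index[label] for label in labels]

def one_hot(i, n):
    """One-hot vector of length n with a 1.0 at position i."""
    return [1.0 if j == i else 0.0 for j in range(n)]

def categorical_crossentropy(target, probs):
    """Cross-entropy for a one-hot target vector."""
    return -sum(t * math.log(p) for t, p in zip(target, probs))

def sparse_categorical_crossentropy(index, probs):
    """Cross-entropy for an integer class index."""
    return -math.log(probs[index])

labels = ["dog", "cat", "bird", "cat"]  # made-up labels
classes, y_int = encode_labels(labels)
probs = [0.2, 0.7, 0.1]                 # made-up softmax output for one sample

sparse = sparse_categorical_crossentropy(y_int[1], probs)
dense = categorical_crossentropy(one_hot(y_int[1], len(classes)), probs)
print(classes, y_int, sparse, dense)
```

The practical difference is only the label format: integer targets of shape (n,) go with the sparse loss, one-hot targets of shape (n, classes) go with categorical_crossentropy.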

Tensorflow 2.4.0 RAM continues grow when CNN training

I have built a simple CNN using tensorflow-gpu 2.4.0 for cifar10
# Set constants
num_classes = len(label_names) # The number of classes
input_shape = (32, 32, 3) # The input shape 32 * 32 pixels, 3 RGB brightness for each pixel
# Construct a sequential model
model = Sequential()
# Starting add layers to the model
# First VGG block
model.add(Conv2D(16, (3, 3), padding='same', activation='relu', input_shape=input_shape))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(Conv2D(16, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization()) # BatchNormalization after convolutional layer
model.add(MaxPooling2D((2, 2))) # MaxPooling2D to keep important features as well as shrink info size
model.add(Dropout(0.1)) # Drop some portion of nodes to increase generality of model
# Second VGG block
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.2))
# Third VGG block
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.3))
# Fourth VGG block
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))
model.add(Dropout(0.4))
# Flat and dense layer
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
# Final softmax layer
model.add(Dense(num_classes, activation='softmax'))
# The optimizer gives the learning rate lr and some other parameters.
opt = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False)
# Model compile
model.compile(loss='categorical_crossentropy',
optimizer=opt,
metrics=['accuracy'])
And while I am training this model:
#Train model
epochs = 100
batch_size = 128
# create data generator
datagen = ImageDataGenerator(horizontal_flip=True,
width_shift_range=5,
height_shift_range=5,
rotation_range=20)
history = model.fit(datagen.flow(x_training, y_training, batch_size=batch_size),
epochs=epochs,
validation_data=(x_validate, y_validate),
verbose=1,
shuffle=True)
The RAM usage on my computer keeps growing until the training process crashes. Calling the Python garbage collector at the end of each epoch did not help. Disabling eager execution did not help either.
I am using
tensorflow-gpu 2.4.0
CUDA 11.0
cuDNN 8.0

Keras' fit_generator() for binary classification predictions always 50%

I have set up a model to classify whether an image is from a certain video game or not. I pre-scaled my images to 250x250 pixels and separated them into two folders (the two binary classes) labelled 0 and 1. The class sizes are within about 100 images of each other, and I have around 3500 images in total.
Here are photos of the training process, the model set up and some predictions: https://imgur.com/a/CN1b6LV
train_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0,
zoom_range=0,
horizontal_flip=True,
width_shift_range=0.1,
height_shift_range=0.1,
validation_split=0.2)
train_generator = train_datagen.flow_from_directory(
'data\\',
batch_size=batchsize,
shuffle=True,
target_size=(250, 250),
subset="training",
class_mode="binary")
val_generator = train_datagen.flow_from_directory(
'data\\',
batch_size=batchsize,
shuffle=True,
target_size=(250, 250),
subset="validation",
class_mode="binary")
pred_datagen = ImageDataGenerator(
rescale=1. / 255,
shear_range=0,
zoom_range=0,
horizontal_flip=False,
width_shift_range=0.1,
height_shift_range=0.1)
pred_generator = pred_datagen.flow_from_directory(
'batch_pred\\',
batch_size=30,
shuffle=False,
target_size=(250, 250))
model = Sequential()
model.add(Conv2D(input_shape=(250, 250, 3), filters=25, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=32, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=64, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=128, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(Conv2D(filters=256, kernel_size=3, activation="relu", padding="same"))
model.add(MaxPooling2D(pool_size=2, padding="same", strides=(2, 2)))
model.add(BatchNormalization())
dense = False
if dense:
    model.add(Flatten())
    model.add(Dense(250, activation="relu"))
    model.add(BatchNormalization())
    model.add(Dense(50, activation="relu"))
else:
    model.add(GlobalAveragePooling2D())
model.add(Dense(1, activation="softmax"))
model.compile(loss='binary_crossentropy',
optimizer=Adam(0.0005), metrics=["acc"])
callbacks = [EarlyStopping(monitor='val_acc', patience=200, verbose=1),
ModelCheckpoint(filepath="model_checkpoint.h5py",
monitor='val_acc', save_best_only=True, verbose=1)]
model.fit_generator(
train_generator,
steps_per_epoch=train_generator.samples // batchsize,
validation_data=val_generator,
validation_steps=val_generator.samples // batchsize,
epochs=500,
callbacks=callbacks)
Everything appears to run correctly: the model iterates over the data each epoch, finds the correct number of images, and so on. However, my predictions are always 50% despite a good validation accuracy, low loss, high accuracy, etc.
I'm not sure what I'm doing wrong and any help would be appreciated.
I think your problem is that you're using sigmoid for binary classification, your final layer activation function should be linear.
The problem is that you are using softmax on a Dense layer with one unit. The softmax function normalizes its input so that the elements sum to one, so with a single unit the output is always 1. For binary classification with one output unit, use sigmoid as the activation function of the last layer instead.
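A small plain-Python sketch of the consequence (the helper names, labels, and logits are made up for illustration). Because a one-unit softmax is constant, every sample is assigned class 1, and the output's derivative with respect to the logit is zero, so no gradient can flow back through that layer. Sigmoid on the same logits separates the classes and has a non-zero slope.

```python
import math

def single_unit_softmax(z):
    """Softmax over a one-element vector [z]: exp(z) / exp(z)."""
    return math.exp(z) / math.exp(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def numeric_grad(f, z, eps=1e-4):
    """Central-difference estimate of df/dz."""
    return (f(z + eps) - f(z - eps)) / (2 * eps)

def accuracy(preds, targets):
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)

labels = [0, 1, 1, 0, 1]              # made-up binary targets
logits = [-2.0, 0.5, 3.0, -0.7, 1.2]  # made-up pre-activation values

# Threshold each output at 0.5 to get a class prediction.
softmax_preds = [round(single_unit_softmax(z)) for z in logits]
sigmoid_preds = [round(sigmoid(z)) for z in logits]

print(softmax_preds, accuracy(softmax_preds, labels))
print(sigmoid_preds, accuracy(sigmoid_preds, labels))
print(numeric_grad(single_unit_softmax, 0.5), numeric_grad(sigmoid, 0.0))
```

With the constant output, the reported accuracy simply equals the share of samples labelled 1, whatever the network has learned.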
