Image classifier does not generalize well to slight image perturbations - python

I am training a CNN to classify 28x28 RGB images into 200 categories.
The classifier reaches ~95% accuracy on the train set.
The test images are obtained by taking a screenshot, then cropping and resizing the ROI to 28x28.
This image processing causes a slight difference between the train and test images (example attached).
Even though the difference is almost imperceptible to the human eye, it causes a huge drop in accuracy for my classifier.
My classifier reaches up to 95% accuracy on the train set but only ~10% on the test set.
I applied random perturbations to the training images (blur, pixelation, noise, translation, scaling) and blurred the test images, but test accuracy only barely improved.
How can I make my classifier robust so that it generalizes over slight pixel differences?
Here is my network:
network = input_data(shape=[None, img_size[0], img_size[1], 3], name='input')
conv1 = relu(batch_normalization(
    conv_2d(network, 16, 3, bias=False, activation=None, regularizer="L2"),
    trainable=is_training))
conv2 = relu(batch_normalization(
    conv_2d(conv1, 32, 3, bias=False, activation=None, regularizer="L2"),
    trainable=is_training))
conv3 = relu(batch_normalization(
    conv_2d(conv2, 64, 3, bias=False, activation=None, regularizer="L2"),
    trainable=is_training))
net = fully_connected(conv3, 128, activation='relu', regularizer="L2")
net = fully_connected(net, num_elements, activation='softmax')
return regression(net, optimizer='adam', learning_rate=learning_rate,
                  loss='categorical_crossentropy', name='target')
[Train image and test image examples were attached to the question.]

200 categories is a lot. Are you sure something is not dominating the other classes, e.g. the model is not guessing 'background' all the time and being right 95% of the time just because 95% of the images are 'background'?
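One quick way to check that, as a sketch: assuming y_true holds the integer test labels and y_pred the model's argmax predictions (both hypothetical names, not variables from the question), per-class frequencies and accuracies make a dominant class obvious:

import numpy as np

# y_true: integer test labels; y_pred: argmax of the model's predictions.
# Both are hypothetical names, not variables from the question.
label_freq = np.bincount(y_true) / len(y_true)
print("label frequencies:", label_freq)   # a dominant class stands out here

for c in range(len(label_freq)):
    mask = y_true == c
    if mask.any():
        print(f"class {c}: accuracy {(y_pred[mask] == c).mean():.2f} "
              f"over {mask.sum()} samples")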
Pooling (p. 335 onwards), for example max pooling, is one way to introduce invariance to small transformations. You should try it out.
Other ways to limit overfitting are tuning the L2 regularization you are already using, adding dropout to the fully connected layer, and keeping the minibatch size from getting too large. You could also add small rotations to your list of augmentations, if you find it appropriate, and maybe random reflections too, if you expect those in the real world. I don't think it's about the augmentation, though.
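As a rough sketch of the pooling and dropout changes on the question's TFLearn network (not the asker's exact code; the pool sizes and the 0.5 keep probability are assumptions):

from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, fully_connected, dropout

network = input_data(shape=[None, 28, 28, 3], name='input')
network = conv_2d(network, 16, 3, activation='relu', regularizer='L2')
network = max_pool_2d(network, 2)  # 2x2 max pooling adds small-shift invariance
network = conv_2d(network, 32, 3, activation='relu', regularizer='L2')
network = max_pool_2d(network, 2)
network = conv_2d(network, 64, 3, activation='relu', regularizer='L2')
network = max_pool_2d(network, 2)
network = fully_connected(network, 128, activation='relu', regularizer='L2')
network = dropout(network, 0.5)    # TFLearn's dropout takes a keep probability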
And finally, my personal favourite: human error. Usually when I see something this odd, it was just my own fault. You should go through the code and intermediate variables again, more than once.

Related

What can I do to help make my TensorFlow network overfit a large dataset?

The reason I am trying to overfit specifically is that I am following the steps for designing a network from François Chollet's "Deep Learning with Python". This is important, as this is for the final project of my degree.
At this stage, I need to make a network large enough to overfit my data in order to determine its maximal capacity, an upper bound on the size of the networks that I will then optimise.
However, as the title suggests, I am struggling to make my network overfit. Perhaps my approach is naïve, but let me explain my model:
I am using this dataset to train a model to classify stars. Each star must be classified along two axes simultaneously: its spectral class (100 classes) and its luminosity class (10 classes).
For example, our sun is a 'G2V': its spectral class is 'G2' and its luminosity class is 'V'.
To this end, I have built a double-headed network. It takes this input data:
[DataFrame containing input data]
It then splits into two parallel networks.
# Create our input layer:
input = keras.Input(shape=(3,), name='observation_data')  # shape must be a tuple; (3) is just the integer 3

# Build our spectral class branch
s_class_branch = layers.Dense(100000, activation='relu', name='s_class_branch_dense_1')(input)
s_class_branch = layers.Dense(500, activation='relu', name='s_class_branch_dense_2')(s_class_branch)

# Spectral class prediction
s_class_prediction = layers.Dense(100,
                                  activation='softmax',
                                  name='s_class_prediction')(s_class_branch)

# Build our luminosity class branch
l_class_branch = layers.Dense(100000, activation='relu', name='l_class_branch_dense_1')(input)
l_class_branch = layers.Dense(500, activation='relu', name='l_class_branch_dense_2')(l_class_branch)

# Luminosity class prediction
l_class_prediction = layers.Dense(10,
                                  activation='softmax',
                                  name='l_class_prediction')(l_class_branch)

# Now we instantiate our model using the layer setup above
scaled_model = Model(input, [s_class_prediction, l_class_prediction])

optimizer = keras.optimizers.RMSprop(learning_rate=0.004)
scaled_model.compile(optimizer=optimizer,
                     loss={'s_class_prediction': 'categorical_crossentropy',
                           'l_class_prediction': 'categorical_crossentropy'},
                     metrics=['accuracy'])

logdir = os.path.join("logs", "2raw100k")
tensorboard_callback = tf.keras.callbacks.TensorBoard(logdir, histogram_freq=1)

scaled_model.fit(
    input_data,
    {'s_class_prediction': spectral_targets,
     'l_class_prediction': luminosity_targets},
    epochs=20,
    batch_size=1000,
    validation_split=0.0,
    callbacks=[tensorboard_callback])
In the code above you can see me attempting a model with two hidden layers in both branches: one layer with 100 000 units feeding into another with 500, before going to the output layer. The training targets are one-hot encoded, so there is one node for every class.
I have tried a wide range of sizes with one to four hidden layers, ranging from 500 to 100 000 units, stopping only because I ran out of RAM. I have only used dense layers, apart from trying a normalisation layer, to no effect.
[Graph of losses]
They will all happily train and slowly lower the loss, but they never seem to overfit. I have run networks out to 100 epochs and they still will not overfit.
What can I do to make my network fit the data better? I am fairly new to machine learning, having only been doing this for a year now, so I am sure there is something that I am missing. I really appreciate any help and would be happy to provide the logs shown in the graph.
After a lot more training I think I have this answered. Basically, the network did not have adequate capacity and needed more layers. I had tried more layers earlier, but because I was not comparing against validation data, the overfitting was not apparent!
The proof is in the pudding:
So thank you to @Aryagm for their comment, because that let me work it out. As you can see, the validation curves (grey and blue) clearly show overfitting, while the training curves (green and orange) do not.
If anything, this goes to show why a separate validation set is so important, and I am a fool for not having used it in the first place! Lesson learned.
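For anyone following along, a minimal sketch of the change that made the overfitting visible (the 0.2 split is an arbitrary choice, not necessarily the value used):

history = scaled_model.fit(
    input_data,
    {'s_class_prediction': spectral_targets,
     'l_class_prediction': luminosity_targets},
    epochs=20,
    batch_size=1000,
    validation_split=0.2,   # was 0.0 in the original call
    callbacks=[tensorboard_callback])
# history.history now contains val_loss curves to compare against the
# training curves; a diverging val_loss is the overfitting signal.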

Could not increase accuracy from a fixed threshold using Keras Dense layer ANN

I'm learning simple neural networks built from Dense layers in Keras. I'm trying to implement face recognition on a relatively small dataset (~250 images in total, with 50 images per class).
I've downloaded the images from Google Images and resized them to 100 * 100 PNG files. Then I read those files into a NumPy array and also created a one-hot label array for training my model.
Here is my code for processing the training data:
X, Y = [], []
feature_map = {
    'Alia Bhatt': 0,
    'Dipika Padukon': 1,
    'Shahrukh khan': 2,
    'amitabh bachchan': 3,
    'ayushmann khurrana': 4
}
for each_dir in os.listdir('.'):
    if os.path.isdir(each_dir):
        for each_file in os.listdir(each_dir):
            X.append(cv2.imread(os.path.join(each_dir, each_file), -1).reshape(1, -1))
            Y.append(feature_map[os.path.basename(each_file).split('-')[0]])

X = np.squeeze(X)
X = X / 255.0  # normalize the training data
Y = np.array(Y)
Y = np.eye(5)[Y]
print(X.shape)
print(Y.shape)
This is printing (244, 40000) and (244, 5). Here is my model:
model = Sequential()
model.add(Dense(8000, input_dim = 40000, activation = 'relu'))
model.add(Dense(1200, activation = 'relu'))
model.add(Dense(700, activation = 'relu'))
model.add(Dense(100, activation = 'relu'))
model.add(Dense(5, activation = 'softmax'))
# Compile model
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=25, batch_size=15)
When I train the model, it gets stuck at an accuracy of 0.2172, which is almost the same as random guessing (0.20).
I've also tried training the model with grayscale images, but still did not get the expected accuracy. I also tried different network architectures, changing the number of hidden layers and the number of neurons per hidden layer.
What am I missing here? Is my dataset too small, or am I missing some other technical detail?
For more details of code, here is my notebook: https://colab.research.google.com/drive/1hSVirKYO5NFH3VWtXfr1h6y0sxHjI5Ey
Two suggestions I can make:
Your dataset is probably too small. If you are splitting training and validation 80/20, you are only training on about 200 images. Try increasing your dataset to see if the results improve.
I would recommend adding Dropout to each layer of your network. With so little training data, the network is very likely overfitting the training set, and Dropout is an easy way to help avoid that.
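As a sketch on your model (the 0.5 rate is an assumption, not a tuned value):

from keras.models import Sequential
from keras.layers import Dense, Dropout

model = Sequential()
model.add(Dense(8000, input_dim=40000, activation='relu'))
model.add(Dropout(0.5))  # randomly drop half the activations during training
model.add(Dense(1200, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(700, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(100, activation='relu'))
model.add(Dense(5, activation='softmax'))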
Let me know if these suggestions make a difference!
I agree that the dataset is too small; 50 instances per person is probably not enough. You can use data augmentation with the Keras ImageDataGenerator to increase the number of images, rewriting your NumPy reshaping code as a pre-processing step for the generator. I also noticed that you haven't shuffled the data, so the network is likely predicting the first class for everything (which may be why the accuracy is near random chance).
If increasing the dataset size doesn't help, you'll probably have to play around with the learning rate for the Adam optimizer. A sketch of the shuffle-plus-augmentation idea follows below.
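A rough sketch, assuming your 40000-dimensional vectors are 100 * 100 images with 4 channels (RGBA, as cv2.imread with -1 would give for PNGs with alpha) and a recent Keras where fit() accepts generators (older versions use fit_generator); the augmentation ranges are arbitrary choices, and the flattening wrapper is needed because the Dense model expects flat vectors:

from keras.preprocessing.image import ImageDataGenerator
from sklearn.utils import shuffle

X, Y = shuffle(X, Y, random_state=0)   # break the per-class file ordering
X_img = X.reshape(-1, 100, 100, 4)     # 40000 = 100 * 100 * 4

datagen = ImageDataGenerator(rotation_range=15,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             horizontal_flip=True)

def flat_batches(gen):
    # flatten each augmented image batch back into vectors for the Dense model
    for xb, yb in gen:
        yield xb.reshape(len(xb), -1), yb

model.fit(flat_batches(datagen.flow(X_img, Y, batch_size=15)),
          steps_per_epoch=len(X_img) // 15, epochs=25)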

Keras accuracy and actual accuracy are exactly reverse of each other

I'm learning neural networks and have implemented object classification on the CIFAR-10 dataset using the Keras library. Here is my definition of the neural network in Keras:
# Define the model and train it
model = Sequential()
model.add(Dense(units = 60, input_dim = 1024, activation = 'relu'))
model.add(Dense(units = 50, activation = 'relu'))
model.add(Dense(units = 60, activation = 'relu'))
model.add(Dense(units = 70, activation = 'relu'))
model.add(Dense(units = 30, activation = 'relu'))
model.add(Dense(units = 10, activation = 'sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=50, batch_size=10000)
So I have one input layer with input of dimension 1024, i.e. (1024,) (each 32 * 32 * 3 image is first converted to grayscale, giving 32 * 32 = 1024 values), five hidden layers, and one output layer, as defined in the code above.
When I train my model for 50 epochs, I get an accuracy of 0.9, or 90%. When I evaluate it on the test dataset, I also get an accuracy of approximately 90%. Here is the line of code which evaluates the model:
print (model.evaluate(X_test, y_test))
This prints following loss and accuracy:
[1.611809492111206, 0.8999999761581421]
But when I calculate the accuracy manually, by making predictions on each test image, I get an accuracy of around 11% (almost the same as random guessing). Here is my code to calculate it manually:
wrong = 0
for x, y in zip(X_test, y_test):
    if not (np.argmax(model.predict(x.reshape(1, -1))) == np.argmax(y)):
        wrong += 1
print(wrong)
This prints 9002 wrong predictions out of 10000. So what am I missing here? Why are the two accuracies almost exactly reversed (100 - 89 = 11%)? Any intuitive explanation will help! Thanks.
EDIT:
Here is my code which processes the dataset:
# Process the training and testing data into a form the network accepts

# convert a given colored image to grayscale
def rgb2gray(rgb):
    return np.dot(rgb, [0.2989, 0.5870, 0.1140])

X_train, y_train, X_test, y_test = [], [], [], []

def process_batch(batch_path, is_test=False):
    batch = unpickle(batch_path)
    imgs = batch[b'data']
    labels = batch[b'labels']
    for img in imgs:
        img = img.reshape(3, 32, 32).transpose([1, 2, 0])
        img = rgb2gray(img)
        img = img.reshape(1, -1)
        if not is_test:
            X_train.append(img)
        else:
            X_test.append(img)
    for label in labels:
        if not is_test:
            y_train.append(label)
        else:
            y_test.append(label)
process_batch('cifar-10-batches-py/data_batch_1')
process_batch('cifar-10-batches-py/data_batch_2')
process_batch('cifar-10-batches-py/data_batch_3')
process_batch('cifar-10-batches-py/data_batch_4')
process_batch('cifar-10-batches-py/data_batch_5')
process_batch('cifar-10-batches-py/test_batch', True)
number_of_classes = 10
number_of_batches = 5
number_of_test_batch = 1
X_train = np.array(X_train).reshape(meta_data[b'num_cases_per_batch'] * number_of_batches, -1)
print ('Shape of training data: {0}'.format(X_train.shape))
# create labels to one hot format
y_train = np.array(y_train)
y_train = np.eye(number_of_classes)[y_train]
print ('Shape of training labels: {0}'.format(y_train.shape))
# Process testing data
X_test = np.array(X_test).reshape(meta_data[b'num_cases_per_batch'] * number_of_test_batch, -1)
print ('Shape of testing data: {0}'.format(X_test.shape))
# create labels to one hot format
y_test = np.array(y_test)
y_test = np.eye(number_of_classes)[y_test]
print ('Shape of testing labels: {0}'.format(y_test.shape))
This is happening because of the loss function you are using: binary cross-entropy, where you should be using categorical cross-entropy. Binary cross-entropy is only for two-label problems, but you have 10 labels here because of CIFAR-10.
The accuracy metric is in fact misleading you, because it is reporting binary classification performance. The solution is to retrain your model with categorical_crossentropy.
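On your model, the fix is a one-line change to compile; pairing it with a softmax output layer (your final layer used sigmoid) is the usual combination:

model.add(Dense(units=10, activation='softmax'))  # softmax instead of sigmoid
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])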
This post has more details: Keras binary_crossentropy vs categorical_crossentropy performance?
Related: this post answers a different question, but the answer essentially describes your problem: Keras: model.evaluate vs model.predict accuracy difference in multi-class NLP task
Edit
You mentioned in the comments that your model's accuracy is hovering at around 10% and not improving. Upon examining your Colab notebook after the change to categorical cross-entropy, it appears that you are not normalizing your data. Because the pixel values are originally unsigned 8-bit integers, creating your training set promotes them to floating point, but because of the dynamic range of the data your neural network has a hard time learning the right weights. When it tries to update the weights, the gradients are so small that there are essentially no updates, and hence your network performs just like random chance. The solution is simply to divide your training and test datasets by 255 before you proceed:
X_train /= 255.0
X_test /= 255.0
This transforms your data so that the dynamic range scales from [0, 255] to [0, 1]. Your model will have an easier time training with the smaller dynamic range, which should help the gradients propagate rather than vanish. Because your original model has a significant number of dense layers, with the raw dynamic range the gradient updates would most likely vanish, which is why the performance was poor initially.
When I run your notebook, I get 37% accuracy. This is not unexpected with CIFAR-10 and only a fully-connected / dense network. Also, when you run your notebook now, the accuracy and the fraction of wrong examples match.
If you want to increase accuracy, I have a couple of suggestions:
Actually include the colour information. Each object in CIFAR-10 has a distinct colour profile that should help discrimination.
Add convolutional layers. I'm not sure where you are in your learning, but convolutional layers help in learning and extracting the right features from the image, so that the best features are presented to the dense layers and classification accuracy increases. Right now you're classifying raw pixels, which is not advisable given how noisy they can be and how unconstrained things can get (rotation, translation, skew, scale, etc.).
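A sketch combining both suggestions, assuming X_train is reshaped back to the original colour images of shape (n, 32, 32, 3) and scaled to [0, 1]; the layer sizes are arbitrary starting points, not tuned values:

from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])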

CNN on small dataset is overfitting

I want to classify patterns in images. My original images are 200 000 * 200 000; I reshape them to 96 * 96, and the patterns are still recognizable to the human eye. Pixel values are 0 or 1.
I'm using the following neural network:
train_X, test_X, train_Y, test_Y = train_test_split(cnn_mat, img_bin["Classification"], test_size=0.2, random_state=0)

class_weights = class_weight.compute_class_weight('balanced',
                                                  np.unique(train_Y),
                                                  train_Y)

train_Y_one_hot = to_categorical(train_Y)
test_Y_one_hot = to_categorical(test_Y)

train_X, valid_X, train_label, valid_label = train_test_split(train_X, train_Y_one_hot, test_size=0.2, random_state=13)

model = Sequential()
model.add(Conv2D(24, kernel_size=3, padding='same', activation='relu',
                 input_shape=(96, 96, 1)))
model.add(MaxPool2D())
model.add(Conv2D(48, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPool2D())
model.add(Conv2D(64, kernel_size=3, padding='same', activation='relu'))
model.add(MaxPool2D())
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(16, activation='softmax'))
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

train = model.fit(train_X, train_label, batch_size=80, epochs=20, verbose=1,
                  validation_data=(valid_X, valid_label), class_weight=class_weights)
I have already run some experiments to find a "good" number of hidden and fully connected layers. It's probably not the most optimal architecture: since my computer is slow, I just ran each model once and selected the best one from the confusion matrix; I didn't use cross-validation. I also didn't try more complex architectures, since my amount of data is small and I have read that small architectures are best. Is it worth trying a more complex architecture?
Here are the results with 5 and 12 epochs, batch size 80. This is the confusion matrix for my test set:
As you can see, it looks like I'm overfitting. When I run only 5 epochs, most of the classes are assigned to class 0; with more epochs, class 0 is less dominant, but the classification is still bad.
I added dropout of 0.8 after each convolutional layer, e.g.:
model.add(Conv2D(48,kernel_size=3,padding='same',activation='relu'))
model.add(MaxPool2D())
model.add(Dropout(0.8))
model.add(Conv2D(64,kernel_size=3,padding='same',activation='relu'))
model.add(MaxPool2D())
model.add(Dropout(0.8))
With dropout, 95% of my images are classified into class 0.
I tried image augmentation; I rotated all my training images, still used the class weights, but the results didn't improve. Should I augment only the classes with a small number of images? Most of what I have read says to augment the whole dataset...
To summarize, my questions are:
Should I try a more complex model?
Is it useful to do image augmentation only on under-represented classes? And in that case, should I still use class weights (I guess not)?
Can I hope to find a "good" model with a CNN, given the size of my dataset?
Given the imbalanced data, I think it is better to create a custom data generator for your model, so that each generated batch contains at least one sample from each class (a sketch follows after the links below). It is also better to use a Dropout layer after each dense layer rather than after the conv layers. For data augmentation, it is better to use at least a combination of rotation, horizontal flip, and vertical flip. There are other approaches to data augmentation as well, such as GANs or random pixel replacement.
For GANs, you can check this SO post.
For using a GAN as a data augmenter, you can read this article.
For a combination of pixel-level augmentation and GANs: pixel level data augmentation
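A sketch of such a balanced-batch generator (the 16 classes and the integer labels in train_Y follow the question; filling the remaining slots uniformly at random is an arbitrary choice):

import numpy as np

def balanced_batches(X, y_labels, batch_size=80, n_classes=16):
    # every batch contains at least one sample from each non-empty class;
    # the remaining slots are filled uniformly at random
    y_labels = np.asarray(y_labels)
    by_class = [np.where(y_labels == c)[0] for c in range(n_classes)]
    while True:
        idx = [np.random.choice(ids) for ids in by_class if len(ids)]
        extra = np.random.randint(len(X), size=batch_size - len(idx))
        idx = np.concatenate([idx, extra]).astype(int)
        yield X[idx], np.eye(n_classes)[y_labels[idx]]

# usage sketch: model.fit(balanced_batches(train_X, train_Y),
#                         steps_per_epoch=len(train_X) // 80, epochs=20)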
What I used - in a different setting - was to upsample my data with ADASYN. This algorithm calculates the amount of new data required to balance your classes, and then synthesizes novel examples from the available data.
There is a Python implementation. Beyond that, you also have very little data. SVMs perform well even with little data, so you might want to try them or other image classification algorithms, depending on whether the expected pattern is always at the same position or varies. You could also try the Viola–Jones object detection framework.
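A sketch with imbalanced-learn's ADASYN (a recent version with fit_resample is assumed); it works on flat feature vectors, so the images are flattened first and reshaped back afterwards:

from imblearn.over_sampling import ADASYN

X_flat = train_X.reshape(len(train_X), -1)            # (n, 96 * 96)
X_res, y_res = ADASYN(random_state=0).fit_resample(X_flat, train_Y)
X_res = X_res.reshape(-1, 96, 96, 1)                  # back to image shape
# X_res and to_categorical(y_res) can then replace the original training arrays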

Acc decreasing to zero in LSTM Keras Training

While trying to implement an LSTM network for trajectory classification, I have been struggling to get decent classification results even for simple trajectories. My training accuracy also keeps fluctuating without increasing significantly, which can be seen in TensorBoard:
Training accuracy:
This is my model:
model1 = Sequential()
model1.add(LSTM(8, dropout=0.2, return_sequences=True, input_shape=(40,2)))
model1.add(LSTM(8,return_sequences=True))
model1.add(LSTM(8,return_sequences=False))
model1.add(Dense(1, activation='sigmoid'))
and my training code:
model1.compile(optimizer='adagrad', loss='binary_crossentropy', metrics=['accuracy'])
hist1 = model1.fit(dataScatter[:, 70:110, :], outputScatter,
                   validation_split=0.25, epochs=50, batch_size=20,
                   callbacks=[tensorboard], verbose=2)
I think the problem is probably due to the input and output shapes, since the model itself seems fine. The input has shape (2000, 40, 2) and the output has shape (2000, 1).
Can anyone spot a mistake?
Try to change:
model1.add(Dense(1, activation='sigmoid'))
to:
model1.add(TimeDistributed(Dense(1, activation='sigmoid')))
TimeDistributed applies the same Dense layer (same weights) to the LSTM's outputs, one time step at a time.
I recommend this tutorial as well: https://machinelearningmastery.com/timedistributed-layer-for-long-short-term-memory-networks-in-python/
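A minimal sketch of how that fits together; note that TimeDistributed needs a sequence to distribute over, so the last LSTM keeps return_sequences=True, and the targets then have to be per-time-step as well:

from keras.models import Sequential
from keras.layers import LSTM, Dense, TimeDistributed

model1 = Sequential()
model1.add(LSTM(8, dropout=0.2, return_sequences=True, input_shape=(40, 2)))
model1.add(LSTM(8, return_sequences=True))
model1.add(TimeDistributed(Dense(1, activation='sigmoid')))
# output shape is (batch, 40, 1): one prediction per time step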
I was able to increase the accuracy to 97% with a few data-related adjustments. The main obstacle was an unbalanced dataset split between the training and validation sets. Further improvements came from normalizing the input trajectories. I also increased the number of cells in the first layer.
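A sketch of the two data fixes, assuming dataScatter and outputScatter from the question; the normalization is per input coordinate, and stratify keeps the class ratio identical across the split:

import numpy as np
from sklearn.model_selection import train_test_split

X = dataScatter[:, 70:110, :]
X = (X - X.mean(axis=(0, 1))) / X.std(axis=(0, 1))   # normalize each coordinate

X_tr, X_val, y_tr, y_val = train_test_split(
    X, outputScatter, test_size=0.25, stratify=outputScatter, random_state=0)

hist1 = model1.fit(X_tr, y_tr, validation_data=(X_val, y_val),
                   epochs=50, batch_size=20, verbose=2)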
