IndexError: index 1 is out of bounds for axis 1 with size 1 in CIFAR100 (Python)

How do I solve this problem? On line 17 I get an IndexError: index 1 is out of bounds for axis 1 with size 1.
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Dropout, Flatten, Dense
import numpy as np
import os
import sys
import matplotlib.pyplot as plt
import pandas as pd
# Select the Coarse Class of "Insects" and Fine Classes of "bee", "beetle", "butterfly", "caterpillar", "cockroach"
insect_fine_classes = ['bee', 'beetle', 'butterfly', 'caterpillar', 'cockroach']
(train_images, train_labels), (test_images, test_labels) = keras.datasets.cifar100.load_data(label_mode='fine')
train_images = train_images.astype('float32') / 255.0
test_images = test_images.astype('float32') / 255.0
train_insect_fine_class_indices = np.where((train_labels[:, 0] == 0) & np.isin(train_labels[:, 1], insect_fine_classes))
test_insect_fine_class_indices = np.where((test_labels[:, 0] == 0) & np.isin(test_labels[:, 1], insect_fine_classes))
train_insect_fine_class_images = train_images[train_insect_fine_class_indices]
train_insect_fine_class_labels = train_labels[train_insect_fine_class_indices][:, 1]
test_insect_fine_class_images = test_images[test_insect_fine_class_indices]
test_insect_fine_class_labels = test_labels[test_insect_fine_class_indices][:, 1]
# data preparation
print("Shape of train images: {}".format(train_insect_fine_class_images.shape))
print("Shape of train labels: {}".format(train_insect_fine_class_labels.shape))
# build the model
model = keras.Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    Flatten(),
    Dense(64, activation='relu'),
    Dropout(0.5),
    Dense(len(insect_fine_classes))
])
# compile the model
model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'],
              run_eagerly=True)
# train the model
history = model.fit(train_insect_fine_class_images, train_insect_fine_class_labels, epochs=10,
                    validation_data=(test_insect_fine_class_images, test_insect_fine_class_labels))
# plot the training and validation accuracy
plt.plot(history.history['accuracy'], label='accuracy')
plt.plot(history.history['val_accuracy'], label='val_accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.ylim([0.5, 1])
plt.legend(loc='lower right')
plt.show()
It's our first time with Python, and we have already been tasked with building a CIFAR-100 model with the coarse class "insects" and the fine classes bee, beetle, butterfly, caterpillar and cockroach. We expect the output to show different images of the fine-class insects.
Are there any other problems in this code aside from the error? If so, can you correct me? Thank you.

Check the shape of that array; you may be trying to slice an index that does not exist. In your code,
train_insect_fine_class_labels = train_labels[train_insect_fine_class_indices][:, 1] — index 1 along axis 1 is not available, because cifar100.load_data() returns labels with shape (N, 1), so axis 1 only has index 0. The same applies to train_labels[:, 1] on line 17. You can check by replacing 1 with 0.
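Note also that np.isin(train_labels[:, 1], insect_fine_classes) compares integer labels against strings, so even after fixing the index it would never match anything. Below is a minimal sketch of one way to filter by numeric fine-label index instead; the ids 6, 7, 14, 18 and 24 assume the standard alphabetical ordering of the CIFAR-100 fine labels (verify them against your own label list), and the remapping step is only there so the labels fit a Dense(5) output with SparseCategoricalCrossentropy.
import numpy as np
from tensorflow import keras

# Assumed fine-label ids for bee, beetle, butterfly, caterpillar, cockroach
insect_fine_ids = np.array([6, 7, 14, 18, 24])

(train_images, train_labels), (test_images, test_labels) = \
    keras.datasets.cifar100.load_data(label_mode='fine')

train_labels = train_labels[:, 0]   # labels come back with shape (N, 1); flatten to (N,)
test_labels = test_labels[:, 0]

train_mask = np.isin(train_labels, insect_fine_ids)
test_mask = np.isin(test_labels, insect_fine_ids)

x_train, y_train = train_images[train_mask] / 255.0, train_labels[train_mask]
x_test, y_test = test_images[test_mask] / 255.0, test_labels[test_mask]

# Remap the original fine labels to 0..4 so they match Dense(len(insect_fine_classes))
remap = {orig: new for new, orig in enumerate(insect_fine_ids)}
y_train = np.array([remap[label] for label in y_train])
y_test = np.array([remap[label] for label in y_test])
print(x_train.shape, y_train.shape)   # expected (2500, 32, 32, 3) (2500,) — 500 train images per fine class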

Related

ValueError being thrown 'This model has not been built'...for CNN Audio Spectrogram feature extraction

In the code below, I am getting the error:
raise ValueError('This model has not yet been built. ' ValueError: This model has not yet been built. Build the model first by calling ``build()`` or calling ``fit()`` with some data, or specify an ``input_shape`` argument in the first layer(s) for automatic build.
I am trying to use a spectrogram instead of the Mel feature extraction for this CNN. I think the error may be in the extraction process for the extract_features step.
Any help will be greatly appreciated in figuring out the source of this error.
Here are some sizes of arrays generated before the error:
x_test = {ndarray:(353,128)}
x_train = {ndarray:(1408,128)}
y = {ndarray:(1761,)}
y_test = {ndarray:(353,10)}
y_ttrain = {ndarray:(1408,10)}
CODE
# Sample rate conversion
import librosa
import librosa.display
from scipy.io import wavfile as wav
import numpy as np
import pandas as pd
import os
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
max_pad_len = 174
n_mels = 128
def extract_features(file_name):
    try:
        audio, sample_rate = librosa.core.load(file_name, res_type='kaiser_fast')
        mely = librosa.feature.melspectrogram(y=audio, sr=sample_rate, n_mels=n_mels)
        # pad_width = max_pad_len - mely.shape[1]
        # mely = np.pad(mely, pad_width=((0, 0), (0, pad_width)), mode='constant')
        melyscaled = np.mean(mely.T, axis=0)
    except Exception as e:
        print("Error encountered while parsing file: ", file_name)
        return None
    return melyscaled
features = []
# Iterate through each sound file and extract the features
from time import sleep
#from tqdm import tqdm
from alive_progress import alive_bar
items = len(metadata.index)
with alive_bar(items) as bar:
    for index, row in metadata.iterrows():
        bar()
        file_name = os.path.join(os.path.abspath(fulldatasetpath), 'fold' + str(row["fold"]) + '/',
                                 str(row["slice_file_name"]))
        class_label = row["class_name"]
        data = extract_features(file_name)
        features.append([data, class_label])
# Convert into a Panda dataframe
featuresdf = pd.DataFrame(features, columns=['feature', 'class_label'])
print('Finished feature extraction from ', len(featuresdf), ' files')
from sklearn.preprocessing import LabelEncoder
from keras.utils import to_categorical
# Convert features and corresponding classification labels into numpy arrays
X = np.array(featuresdf.feature.tolist())
y = np.array(featuresdf.class_label.tolist())
# Encode the classification labels
le = LabelEncoder()
yy = to_categorical(le.fit_transform(y))
# split the dataset
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(X, yy, test_size=0.2, random_state = 42)
### Convolutional Neural Network (CNN) model architecture
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, Conv2D, MaxPooling2D, GlobalAveragePooling2D
from keras.optimizers import Adam
from keras.utils import np_utils
from sklearn import metrics
num_rows = 40
num_columns = 174
num_channels = 1
#
#x_train = x_train.reshape(x_train.shape[0], num_rows, num_columns, num_channels)
#x_test = x_test.reshape(x_test.shape[0], num_rows, num_columns, num_channels)
num_labels = yy.shape[1]
filter_size = 2
# Construct model
model = Sequential()
#model.add(Conv2D(filters=16, kernel_size=2, input_shape=(num_rows, num_columns, num_channels), activation='relu'))
model.add(Conv2D(filters=16, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=32, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=64, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(Conv2D(filters=128, kernel_size=2, activation='relu'))
model.add(MaxPooling2D(pool_size=2))
model.add(Dropout(0.2))
model.add(GlobalAveragePooling2D())
model.add(Dense(num_labels, activation='softmax'))
model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
# Display model architecture summary
model.summary()
You have to build your model with your input shape before calling model.summary(). Because the first layer is a Conv2D, the shape you pass has to be the full 4-D batch shape (or, equivalently, you can give input_shape to the first Conv2D layer), for example:
model.build((None, num_rows, num_columns, num_channels))
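For reference, here is a minimal, self-contained sketch of a model that builds and summarizes without an explicit build() call, assuming you keep the full padded mel spectrograms (re-enable the pad_width lines and drop the np.mean) so each sample has shape (n_mels, max_pad_len); the random placeholder data and the num_labels value of 10 (taken from the y_test shape in the question) are assumptions for illustration only.
import numpy as np
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, GlobalAveragePooling2D, Dense

n_mels, max_pad_len, num_labels = 128, 174, 10   # values taken from the question

# Placeholder with the shape the padded spectrograms would have; replace with your real features.
x_train = np.random.rand(32, n_mels, max_pad_len).astype('float32')
x_train = x_train.reshape(x_train.shape[0], n_mels, max_pad_len, 1)   # add the channel axis

model = Sequential()
model.add(Conv2D(16, kernel_size=2, activation='relu',
                 input_shape=(n_mels, max_pad_len, 1)))   # input shape given on the first layer
model.add(MaxPooling2D(pool_size=2))
model.add(GlobalAveragePooling2D())
model.add(Dense(num_labels, activation='softmax'))
model.summary()   # works without calling model.build() explicitly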

How to get a confusion matrix from the model below?

The X represents features and Y represents labels for image classification. I am using a CNN for binary image classification, like cats vs. dogs.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle
import numpy as np
from sklearn.metrics import confusion_matrix
X = np.array(pickle.load(open("X.pickle","rb")))
Y = np.array(pickle.load(open("Y.pickle","rb")))
x_test = np.array(pickle.load(open("x_test.pickle","rb")))
y_test = np.array(pickle.load(open("y_test.pickle","rb")))
# X = np.array(pickle.load(open("x_train.pickle","rb")))
# Y = np.array(pickle.load(open("y_train.pickle","rb")))
#scaling our image data
X = X/255.0
model = Sequential()
#model.add(Conv2D(64 ,(3,3), input_shape = X.shape[1:]))
model.add(Conv2D(64 ,(3,3), input_shape = X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Conv2D(128 ,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Conv2D(256 ,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Conv2D(512 ,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Flatten())
model.add(Dense(2048))
model.add(Activation("relu"))
model.add(Dropout(0.5))
np.argmax(model.add(Dense(2)))
model.add(Activation('softmax'))
model.compile(loss="binary_crossentropy",
optimizer = "adam",
metrics = ['accuracy'])
predicted = model.predict(x_test)
print(predicted.shape)
print(y_test.shape)
print(confusion_matrix(y_test,predicted))
The shapes of predicted and y_test are (90, 2) and (90,), and when I use the confusion matrix it throws:
ValueError: Classification metrics can't handle a mix of binary and continuous-multioutput targets.
You can use scikit-learn:
from sklearn.metrics import confusion_matrix
predicted = model.predict(x_test)
predicted_classes = predicted.argmax(axis=1)  # convert the (90, 2) softmax scores to class indices
print(confusion_matrix(y_test, predicted_classes))
Here's the scikit-learn documentation for confusion_matrix:
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
Edit:
Advice:
Prefer a softmax activation on the output layer, whether it is binary or multi-class classification, with the number of nodes in the output layer equal to the number of classes.
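As a side note, here is a small self-contained sketch of the same idea using TensorFlow's tf.math.confusion_matrix; the probabilities and labels below are made up purely for illustration.
import numpy as np
import tensorflow as tf

# Made-up softmax outputs for 6 samples and 2 classes, plus their true integer labels.
probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4],
                  [0.3, 0.7], [0.8, 0.2], [0.1, 0.9]])
labels = np.array([0, 1, 0, 1, 1, 1])

pred_classes = np.argmax(probs, axis=1)   # softmax scores -> class indices
cm = tf.math.confusion_matrix(labels, pred_classes, num_classes=2)
print(cm.numpy())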
Here's one example of how we can get the confusion matrix using the PyCM library.
You need to install the pycm library in Anaconda or with pip3:
conda install -c sepandhaghighi pycm
from pycm import *
import itertools
import numpy as np
import matplotlib.pyplot as plt

def plot_confusion_matrix(cm, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    """
    This function is modified to plot the ConfusionMatrix object.
    Normalization can be applied by setting 'normalize=True'.
    Code Reference:
    http://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html
    """
    plt_cm = []
    for i in cm.classes:
        row = []
        for j in cm.classes:
            row.append(cm.table[i][j])
        plt_cm.append(row)
    plt_cm = np.array(plt_cm)
    if normalize:
        plt_cm = plt_cm.astype('float') / plt_cm.sum(axis=1)[:, np.newaxis]
    plt.imshow(plt_cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(cm.classes))
    plt.xticks(tick_marks, cm.classes, rotation=45)
    plt.yticks(tick_marks, cm.classes)
    fmt = '.2f' if normalize else 'd'
    thresh = plt_cm.max() / 2.
    for i, j in itertools.product(range(plt_cm.shape[0]), range(plt_cm.shape[1])):
        plt.text(j, i, format(plt_cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if plt_cm[i, j] > thresh else "black")
    plt.tight_layout()
    plt.ylabel('Actual')
    plt.xlabel('Prediction')
.....
svm.fit(x_train, y_train)
y_predicted = svm.predict(x_test)
#Get the confusion Matrix:
cm = ConfusionMatrix(actual_vector=y_test, predict_vector=y_predicted)
#Print classes
print("[INFO] Clases")
print(cm.classes)
#Print the table of cmatrix
print(cm.table)
#indicators of the confusion matrix
print(cm)
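To actually render the plot with the function defined above, the call would look something like this (assuming matplotlib is imported as plt and cm is the ConfusionMatrix computed from y_test and y_predicted):
plot_confusion_matrix(cm, normalize=True, title='Confusion matrix')
plt.show()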
greetings!

Store images into multiple arrays and use them to train a model

I get this error when training the model:
ValueError: Error when checking target:
expected dropout_5 to have shape (33,) but got array with shape (1,).
I want to store my images in 33 arrays from folders using paths. I have categorized the images into different folders named 1, 2, 3, 4, 5...
I have used this code to do it, but I don't know how to store them into different arrays. Can someone help me?
import os
import cv2
import random
import pickle
import numpy as np

datadir = 'C:/Users/user/Desktop/RESIZE' #path of the folder
categories = ['1','2','3','4','5','6','7','8','9','A','B','C','D','E','F','G','H','I','K','L','M','N','O','P','Q','R','S','T','U','V','W','X','Y']
img_rows, img_cols = 100, 100
training_data = []
for category in categories:
    path = os.path.join(datadir, category)
    class_num = categories.index(category)
    for img in os.listdir(path):
        img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
        new_array = cv2.resize(img_array, (img_rows, img_cols))
        training_data.append([new_array, class_num])
random.shuffle(training_data)
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, img_rows, img_cols, 1)
X = X.astype("float32")
pickle_out = open("X.pickle","wb")
pickle.dump(X, pickle_out)
pickle_out.close()
pickle_out = open("y.pickle","wb")
pickle.dump(y, pickle_out)
pickle_out.close()
After I save the files, I use this code to train the model. I want 33 output nodes, but it only works when my output (Dense) layer is set to 1. Otherwise I get this error:
ValueError: Error when checking target:
expected dropout_5 to have shape (33,) but got array with shape (1,)
Here is my training code.
import tensorflow as tf
from keras import optimizers
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Flatten
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers import Activation
import cv2
import os
import numpy as np
import pickle
from sklearn.utils import shuffle
X = pickle.load(open("X.pickle","rb"))
y = pickle.load(open("y.pickle","rb"))
X = X/255.0
model = Sequential()
model.add(Conv2D(32,(3,3), input_shape = X.shape[1:]))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(64,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.25))
model.add(Conv2D(128,(3,3)))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size = (2,2)))
model.add(Dropout(0.4))
model.add(Dense(128))
model.add(Activation("relu"))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(33, activation='softmax'))
model.add(Dropout(0.4))
model.compile(loss = "binary_crossentropy", optimizer = "adam", metrics = ["accuracy"])
model.fit(X, y, batch_size = 2, epochs = 1, validation_split = 0.2)
You need to change your y to one-hot encoded data to do the training.
Try this for y:
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import OneHotEncoder
# integer encode
label_encoder = LabelEncoder()
integer_encoded = label_encoder.fit_transform(y)
print(label_encoder.classes_) # These are your classes.
print(integer_encoded.shape)
# binary encode
onehot_encoder = OneHotEncoder(sparse=False)
integer_encoded = integer_encoded.reshape(len(integer_encoded), 1)
onehot_encoded = onehot_encoder.fit_transform(integer_encoded)
print(onehot_encoded.shape)
And one more thing: if you want to classify 33 classes, change your loss to categorical_crossentropy.
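A minimal sketch of the same idea using the Keras utility instead, assuming y holds the integer class indices 0-32 produced by the loading code above and that model and X are the Sequential model and image array from the question; the num_classes argument and the compile/fit lines are illustrative, not part of the original answer.
import numpy as np
from keras.utils import to_categorical

y = np.array(y)                                # integer class indices 0..32 from the loading code
y_onehot = to_categorical(y, num_classes=33)   # shape (num_samples, 33)
print(y_onehot.shape)

# With one-hot targets the final layer should be Dense(33, activation='softmax')
# (with no Dropout after it), and the loss should be categorical_crossentropy:
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])
model.fit(X, y_onehot, batch_size=2, epochs=1, validation_split=0.2)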

Code to perform an attack on a CNN with foolbox, what's wrong?

I have to perform a simple FGSM attack on a convolutional neural network. The code for the CNN works correctly, and the model is saved without a problem, but when I try to perform the attack an error is shown.
HERE'S THE CODE FOR THE CNN
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten, MaxPooling2D
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.utils import to_categorical
import json
import tensorflow as tf
#Using TensorFlow backend.
#download mnist data and split into train and test sets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
#plot the first image in the dataset
plt.imshow(X_train[0])
#check image shape
X_train[0].shape
#reshape data to fit model
X_train = X_train.reshape(60000,28,28,1)
X_test = X_test.reshape(10000,28,28,1)
#one-hot encode target column
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
y_train[0]
#create model
model = Sequential()
#add model layers
model.add(Conv2D(32, kernel_size=(5,5), activation='relu', input_shape= (28,28,1)))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Conv2D(64, kernel_size=(5,5), activation='relu'))
model.add(MaxPooling2D(pool_size=(2,2)))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
#compile model using accuracy as a measure of model performance
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics= ['accuracy'])
#train model
model.fit(X_train, y_train,validation_data=(X_test, y_test), epochs=5)
json.dump({'model':model.to_json()},open("model.json", "w"))
model.save_weights("model_weights.h5")
THEN I TRY TO PERFORM THE ATTACK WITH THE FOLLOWING CODE:
import json
import foolbox
import keras
import numpy as np
from keras import backend
from keras.models import load_model
from keras.datasets import mnist
from keras.utils import np_utils
from foolbox.attacks import FGSM
from foolbox.criteria import Misclassification
from foolbox.distances import MeanSquaredDistance
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv2D, MaxPooling2D
import numpy as np
import tensorflow as tf
from keras.models import model_from_json
import os
############## Loading the model and preprocessing #####################
backend.set_learning_phase(False)
model = tf.keras.models.model_from_json(json.load(open("model.json"))["model"],custom_objects={})
model.load_weights("model_weights.h5")
fmodel = foolbox.models.KerasModel(model, bounds=(0,1))
_,(images, labels) = mnist.load_data()
images = images.reshape(10000,28,28)
images= images.astype('float32')
images /= 255
######################### Attacking the model ##########################
attack=foolbox.attacks.FGSM(fmodel, criterion=Misclassification())
adversarial=attack(images[12],labels[12]) # for single image
adversarial_all=attack(images,labels) # for all the images
adversarial =adversarial.reshape(1,28,28,1) #reshaping it for model prediction
model_predictions = model.predict(adversarial)
print(model_predictions)
########################## Visualization ################################
images=images.reshape(10000,28,28)
adversarial =adversarial.reshape(28,28)
plt.figure()
plt.subplot(1,3,1)
plt.title('Original')
plt.imshow(images[12])
plt.axis('off')
plt.subplot(1, 3, 2)
plt.title('Adversarial')
plt.imshow(adversarial)
plt.axis('off')
plt.subplot(1, 3, 3)
plt.title('Difference')
difference = adversarial - images[124]
plt.imshow(difference / abs(difference).max() * 0.2 + 0.5)
plt.axis('off')
plt.show()
this error is shown when the adversarial examples are generated:
c_api.TF_GetCode(self.status.status))
InvalidArgumentError: Matrix size-incompatible: In[0]: [1,639232], In[1]: [1024,10]
[[{{node dense_4_5/MatMul}}]]
[[{{node dense_4_5/BiasAdd}}]]
What could it be?
Here is my solution.
First of all, modify the model code as follows:
import tensorflow as tf
import json
# download mnist data and split into train and test sets
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()
# reshape data to fit model
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1)
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)
X_train, X_test = X_train/255, X_test/255
# one-hot encode target column
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)
# create model
model = tf.keras.models.Sequential()
# add model layers
model.add(tf.keras.layers.Conv2D(32, kernel_size=(5, 5),
activation='relu', input_shape=(28, 28, 1)))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Conv2D(64, kernel_size=(5, 5), activation='relu'))
model.add(tf.keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(10, activation='softmax'))
# compile model using accuracy as a measure of model performance
model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])
# train model
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5)
json.dump({'model': model.to_json()}, open("model.json", "w"))
model.save_weights("model_weights.h5")
You just forgot to divide each pixel by the maximum pixel value (255).
As for the attack code:
import json
import foolbox
from foolbox.attacks import FGSM
from foolbox.criteria import Misclassification
import numpy as np
import tensorflow as tf
############## Loading the model and preprocessing #####################
tf.enable_eager_execution()
tf.keras.backend.set_learning_phase(False)
model = tf.keras.models.model_from_json(
json.load(open("model.json"))["model"], custom_objects={})
model.load_weights("model_weights.h5")
model.compile(optimizer='adam', loss='categorical_crossentropy',
metrics=['accuracy'])
_, (images, labels) = tf.keras.datasets.mnist.load_data()
images = images.reshape(images.shape[0], 28, 28, 1)
images = images/255
images = images.astype(np.float32)
fmodel = foolbox.models.TensorFlowEagerModel(model, bounds=(0, 1))
######################### Attacking the model ##########################
attack = foolbox.attacks.FGSM(fmodel, criterion=Misclassification())
adversarial = np.array([attack(images[0], label=labels[0])])
model_predictions = model.predict(adversarial)
print('real label: {}, label prediction; {}'.format(
labels[0], np.argmax(model_predictions)))
I used TensorFlowEagerModel instead of KerasModel for simplicity. The error you were encountering was due to the fact that model.predict expects a 4-D array while you were passing a 3-D one, so I just wrapped the attacked image example in a numpy array to make it 4-D.
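For reference, a tiny self-contained illustration of the 3-D vs. 4-D point; the zero array below stands in for a single 28x28x1 MNIST image.
import numpy as np

single_image = np.zeros((28, 28, 1), dtype=np.float32)   # one image: 3-D, no batch axis
batch = np.expand_dims(single_image, axis=0)             # add the batch axis -> (1, 28, 28, 1)
print(single_image.shape, batch.shape)                   # (28, 28, 1) (1, 28, 28, 1)
# model.predict(batch) now receives the 4-D input it expects.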
Hope it helps

Keras BatchNorm Layer giving weird results during training and inference

I am having an issue with Keras where the evaluate function gives a different training loss (way higher) and accuracy (way lower) compared to the values that I get during training. I am aware that this question has already been asked in several places (here, here), but I think my issue is different and still not answered in those forums.
Explanation of the Task
It is supposed to be a very simple task. All I am doing is overfitting my own dataset of 256 images (29x29x3) with 256 output classes (one for each image).
Dataset
Case 1
x_train = All the pixel values in the image = i where i goes from 0 to 255.
y_train = i
Case 2
x_train = Centre 5*5 patch of the pixel values in the image = i, where i goes from 0 to 255. All the other pixel values are the same for all the images.
y_train = i
This gives me 256 images in total for the training data in each case. (It would be clearer if you just have a look at the code.)
Here is my code to reproduce the issue -
from __future__ import print_function
import os
import keras
from keras.datasets import mnist
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D, Activation
from keras.layers.normalization import BatchNormalization
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, Callback
from keras import backend as K
from keras.regularizers import l2
import matplotlib.pyplot as plt
import PIL.Image
import numpy as np
from IPython.display import clear_output
# The GPU id to use, usually either "0" or "1"
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="1"
# To suppress the warnings
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
## Hyperparamters
batch_size = 256
num_classes = 256
l2_reg=0.0
epochs = 500
## input image dimensions
img_rows, img_cols = 29, 29
## Train Image (I took a random image from ImageNet)
train_img_name = 'n01871265_279.JPEG'
ret = PIL.Image.open(train_img_name) #Opening the image
ret = ret.resize((img_rows, img_cols)) #Resizing the image
img = np.asarray(ret, dtype=np.uint8).astype(np.float32) #Converting it to numpy array
print(img.shape) # (29, 29, 3)
## Creating the training data
#############################
x_train = np.zeros((256, img_rows, img_cols, 3))
y_train = np.zeros((256,), dtype=int)
for i in range(len(y_train)):
    temp_img = np.copy(img)
    ## Case1 of dataset
    # temp_img[:, :, :] = i # changing all the pixel values
    ## Case2 of dataset
    temp_img[12:16, 12:16, :] = i # changing the centre block of 5*5 pixels
    x_train[i, :, :, :] = temp_img
    y_train[i] = i
##############################
## Common stuff in Keras
if K.image_data_format() == 'channels_first':
    print('Channels First')
    x_train = x_train.reshape(x_train.shape[0], 3, img_rows, img_cols)
    input_shape = (3, img_rows, img_cols)
else:
    print('Channels Last')
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 3)
    input_shape = (img_rows, img_cols, 3)
## Normalizing the pixel values
x_train = x_train.astype('float32')
x_train /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
## convert class vectors to binary class matrices
y_train = keras.utils.to_categorical(y_train, num_classes)
## Model definition
def model_toy(mom):
    model = Sequential()
    model.add(Conv2D(filters=64, kernel_size=(7, 7), strides=(1, 1), input_shape=input_shape, kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    # Default parameters kept same as PyTorch
    # Meaning of PyTorch momentum is different from Keras momentum.
    # PyTorch mom = 0.1 is same as Keras mom = 0.9
    model.add(Conv2D(filters=128, kernel_size=(7, 7), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    model.add(Conv2D(filters=256, kernel_size=(5, 5), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    model.add(Conv2D(filters=512, kernel_size=(5, 5), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    model.add(Conv2D(filters=1024, kernel_size=(5, 5), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    model.add(Conv2D(filters=2048, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    model.add(Conv2D(filters=4096, kernel_size=(3, 3), strides=(1, 1), kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    # Passing it to a dense layer
    model.add(Flatten())
    model.add(Dense(1024, kernel_regularizer=l2(l2_reg)))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=mom, epsilon=0.00001))
    # Output Layer
    model.add(Dense(num_classes, kernel_regularizer=l2(l2_reg)))
    model.add(Activation('softmax'))
    return model
mom = 0.9 #0
model = model_toy(mom)
model.summary()
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.Adam(lr=0.001),
              #optimizer=keras.optimizers.SGD(lr=0.01, momentum=0.9, decay=0.0, nesterov=True),
              metrics=['accuracy'])
history = model.fit(x_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=1,
                    shuffle=True,
                    )
print('Training results')
print('-------------------------------------------')
score = model.evaluate(x_train, y_train, verbose=1)
print('Training loss:', score[0])
print('Training accuracy:', score[1])
print('-------------------------------------------')
A small note: I was able to successfully do this task in PyTorch. It is just that my actual task requires me to have a Keras model. That's why I have changed the default values of the BatchNorm layer (the root cause of the issue) to match the ones I used to train the PyTorch model.
Here is the image that I used in my code.
Here are the results of training.
Case1 of the dataset
Case2 of the dataset
If you look at these two files, you would be able to notice the discrepancies in the training loss during training vs inference.
(I have set my batch size to be equal to the size of my training data so as to avoid some of the reasons BatchNorm generally creates problems, as mentioned here.)
Next, I looked at the Keras source code to see if there is any way I can make the BatchNorm layer use the batch statistics instead of the running mean and variance.
Here is the update formula that Keras (backend - TF) uses to update the running mean and variance.
#running_stat -= (1 - momentum) * (running_stat - batch_stat)
So if I set the momentum value to 0, the value assigned to running_stat would always be equal to batch_stat during the training phase, and thus the value used during inference mode will also be the same as (close to) the batch/dataset statistics.
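As a quick numerical illustration of that update rule (a standalone sketch, not code taken from the Keras source):
def update_running_stat(running_stat, batch_stat, momentum):
    # moving-average update: new = old - (1 - momentum) * (old - batch)
    return running_stat - (1 - momentum) * (running_stat - batch_stat)

batch_mean = 5.0
running_mean = 0.0
for step in range(3):
    running_mean = update_running_stat(running_mean, batch_mean, momentum=0.9)
    print(step, running_mean)   # creeps slowly towards 5.0: 0.5, 0.95, 1.355

# With momentum = 0 the running statistic jumps straight to the batch statistic:
print(update_running_stat(running_mean, batch_mean, momentum=0.0))   # 5.0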
Here are the results for this little experiment with the same issue still occurring.
Case1 of the dataset
Case2 of the dataset
Programming Environment - Python-3.5.2, tensorflow-1.10.0, keras-2.2.4
I tried the same thing with tensorflow-1.12.0, keras-2.2.2 as well but it still did not solve the issue.
