Why is the model not learning with pretrained vgg16 in keras? - python

I am using the pre-trained VGG16 model available with Keras and applying it to the SVHN dataset, which is a dataset of 10 classes covering the digits 0-9. The network is not learning and has been stuck at 0.17 accuracy. There is something that I am doing incorrectly, but I am unable to spot it. The way I am running my training is as follows:
import math
import tensorflow.keras as keras

## DEFINE THE MODEL ##
vgg16 = keras.applications.vgg16.VGG16()
model = keras.Sequential()
for layer in vgg16.layers:
    model.add(layer)
model.layers.pop()
for layer in model.layers:
    layer.trainable = False
model.add(keras.layers.Dense(10, activation="softmax"))

## START THE TRAINING ##
train_optimizer_rmsProp = keras.optimizers.RMSprop(lr=0.0001)
model.compile(loss="categorical_crossentropy", optimizer=train_optimizer_rmsProp, metrics=['accuracy'])
batch_size = 128 * 1

data_generator = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255
)
train_generator = data_generator.flow_from_directory(
    'training',
    target_size=(224, 224),
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='categorical'
)
validation_generator = data_generator.flow_from_directory(
    'validate',
    target_size=(224, 224),
    batch_size=batch_size,
    color_mode='rgb',
    class_mode='categorical'
)

history = model.fit_generator(
    train_generator,
    validation_data=validation_generator,
    validation_steps=math.ceil(val_split_length / batch_size),
    epochs=15,
    steps_per_epoch=math.ceil(num_train_samples / batch_size),
    use_multiprocessing=True,
    workers=8,
    callbacks=model_callbacks,
    verbose=2
)
What am I doing wrong? Is there something that I am missing? I was expecting very high accuracy, since the model carries weights from ImageNet, but it has been stuck at 0.17 accuracy since the first epoch.

I assume you're upsampling the 32x32 MNIST-like images to fit the VGG16 input. What you should actually do in this case is remove all the dense layers; that way you can input any image size, since in convolutional layers the weights are agnostic to the image size.
You can do this like:
vgg16 = keras.applications.vgg16.VGG16(include_top=False, input_shape=(32, 32, 3))
In my opinion this should be the default behaviour of the constructor.
When you upsample the image, in the best case you're basically blurring it. Consider that a single pixel of the original image corresponds to 7 pixels of the upsampled one (224 / 32 = 7), while VGG16's filters are 3 pixels wide; in other words, you're losing the image's features.
It is not necessary to add 3 dense layers at the end like the original VGG16; you can try with the same single layer you have in your code.
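Putting that together, a minimal sketch of what this could look like; the Flatten head and the optimizer settings here are illustrative assumptions, not the only valid choice:

import tensorflow.keras as keras

# Convolutional base only, sized for the native 32x32 SVHN images
base = keras.applications.vgg16.VGG16(include_top=False, weights='imagenet',
                                      input_shape=(32, 32, 3))
for layer in base.layers:
    layer.trainable = False

model = keras.Sequential([
    base,
    keras.layers.Flatten(),
    keras.layers.Dense(10, activation='softmax')
])
model.compile(loss='categorical_crossentropy',
              optimizer=keras.optimizers.RMSprop(lr=0.0001),
              metrics=['accuracy'])

With this, target_size in flow_from_directory should be set to (32, 32) so the images are no longer upsampled.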

Related

Keras, multiclass classification of images with an added parameter for measure

I am facing a classical problem of classification of images into 12 classes.
I am using a model based on Inception v3 to which I added 3 layers:
weights_url = "https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5"
weights_file = "inception_v3.h5"
urllib.request.urlretrieve(weights_url, weights_file)

# Instantiate the model
pre_trained_model = InceptionV3(input_shape=(150, 150, 3),
                                include_top=False,
                                weights=None)
# load pre-trained weights
pre_trained_model.load_weights(weights_file)
# freeze the layers
for layer in pre_trained_model.layers:
    layer.trainable = False
# pre_trained_model.summary()

last_layer = pre_trained_model.get_layer('mixed7')
last_output = last_layer.output
x = layers.Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same')(last_output)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(12, activation="softmax", name="classification")(x)
model = Model(pre_trained_model.input, x)
model.compile(optimizer='RMSprop',
              loss='categorical_crossentropy',
              metrics=['acc'])
model.summary()
I am reading the images from the file system using an ImageDataGenerator:
train_datagen = ImageDataGenerator(rescale=1./255,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
train_generator = train_datagen.flow_from_directory(TRAINING_DIR,
                                                    batch_size=64,
                                                    class_mode='categorical',
                                                    target_size=(150, 150))
validation_datagen = ImageDataGenerator(rescale=1./255)
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              batch_size=64,
                                                              class_mode='categorical',
                                                              target_size=(150, 150))
I would like the classification to also be trained on a measure, for which I have created a dictionary that associates an integer with each file name, reading the data from a .csv:
dict_measures = {}
for index, row in scales.iterrows():
    dict_measures[row['filename']] = row['measure']
The way I would tackle this issue is to concatenate the mixed data (Conv2D output, measure) just before the final Dense with softmax, then fit the model. The only remaining issue would be the creation of a generator for the measures in sync with the images, so I would set shuffle=False.
Does this make sense? Did any of you have to tackle a similar problem? What was your approach?
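For what it's worth, a minimal sketch of the concatenation idea described above, reusing pre_trained_model and last_output from the snippet and assuming the measure arrives as one scalar per image (the input name is hypothetical):

from tensorflow.keras import layers, Input, Model

# Hypothetical second input carrying one measure value per image
measure_input = Input(shape=(1,), name="measure")

x = layers.Conv2D(128, kernel_size=(3, 3), activation='relu',
                  padding='same')(last_output)
x = layers.GlobalAveragePooling2D()(x)
# Concatenate the image features with the measure before the classifier
x = layers.Concatenate()([x, measure_input])
x = layers.Dense(12, activation="softmax", name="classification")(x)

model = Model([pre_trained_model.input, measure_input], x)

A custom generator would then have to yield ((images, measures), labels) batches in matching order, which is why shuffle=False on the image generator matters.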

Problem with logits and labels size. Tensorflow

I am trying to train the top layers separately from the base model. Everything works when generating features with model.predict_generator, like:
bottleneck_features_train = model.predict_generator(
    train_generator, predict_size_train)
np.save(save_dir + 'bottleneck_features_train.npy', bottleneck_features_train)
train_data = np.load(mtx.save_dir + 'bottleneck_features_train.npy')
model.fit(train_data, ....
)
But now I have a huge dataset and can't load all the data into memory, so I use a generator with flow_from_directory:
def create_generator(root_path, batch_size):
    datagen = ImageDataGenerator(rescale=1. / 255)
    generator = datagen.flow_from_directory(
        root_path,
        target_size=(224, 224),
        batch_size=batch_size,
        class_mode="categorical",
        shuffle=True)
    return generator

train_generator = create_generator(mtx.train_data_dir, mtx.batch_size)
and then:
model.fit(train_generator...
class_mode in flow_from_directory is "categorical", and the loss function matches (categorical_crossentropy).
The layers are:
model = Sequential()
model.add(Flatten(input_shape=(7, 7, 512)))
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.7))
model.add(Dense(num_classes, activation='softmax'))
but when I run training I get:
logits and labels must be broadcastable: logits_size=[24,32] labels_size=[4,32]
As I understand it, something is wrong with the layer shapes or with how the features/labels are encoded.
Update 1:
It also works when batch_size in flow_from_directory is set to 1, but then the accuracy is very low.
Try:
model.add(Flatten(input_shape=(224,224,3)))
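This fix lines the shapes up because flow_from_directory yields raw (224, 224, 3) images, not (7, 7, 512) bottleneck features: a batch of 4 images contains 4 * 224 * 224 * 3 = 602,112 values, which the graph reshapes into 602,112 / 25,088 = 24 rows of 7*7*512 features, hence logits_size=[24, 32] against labels_size=[4, 32]. A minimal sketch of the corrected top model, assuming num_classes is the 32 seen in the error:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout

num_classes = 32  # assumption: matches labels_size=[..., 32] in the error

# The first layer now expects the raw image shape from the generator
model = Sequential()
model.add(Flatten(input_shape=(224, 224, 3)))
model.add(Dense(512, activation="relu"))
model.add(Dropout(0.7))
model.add(Dense(num_classes, activation='softmax'))

Note that this trains dense layers on raw pixels; to keep the benefit of the VGG features, the frozen convolutional base would have to be stacked in front of this top model instead.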

What can I do to improve the 96% (f-score) in my CNN in Keras?

I'm running a project with roughly 22000 images (11000 each class) with ResNet50 fine tuning. This is my code:
base_model = ResNet50(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
head_model = base_model.get_layer("conv5_block1_1_conv").output
head_model = Dropout(0.75)(head_model)
head_model = Flatten()(head_model)
head_model = Dense(1, activation="sigmoid")(head_model)
model = Model(inputs=base_model.input, outputs=head_model)
model.summary()

for layer in base_model.layers:
    layer.trainable = False

adam = Adam(lr=0.001)
model.compile(optimizer=adam, loss='binary_crossentropy', metrics=['accuracy'])

train_datagen = ImageDataGenerator()
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(224, 224),
                                                    batch_size=50,
                                                    class_mode='binary')
model.fit_generator(train_generator, steps_per_epoch=100)
model.save("asd.h5")
With this model I reached an f-score of 96%. What methods can I apply to improve its accuracy? I have already tried including a colormap as preprocessing and adding Dense layers.
There are a lot of techniques:
You can change the structure of the model: add or remove some layers (and not only Dense layers), or use another pretrained model.
Change the optimizer. For example, besides Adam, another popular optimizer is RMSprop. You can also try to tune the optimizer's hyperparameters.
Preprocess the data: you can apply zoom, shear, and so on, as sketched below.
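On the last point, a minimal sketch of data augmentation with ImageDataGenerator; the ranges are illustrative assumptions:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Random zoom, shear and flips generate new variations of the training images
train_datagen = ImageDataGenerator(zoom_range=0.2,
                                   shear_range=0.2,
                                   horizontal_flip=True)
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(224, 224),
                                                    batch_size=50,
                                                    class_mode='binary')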

How can I predict images (outside the dataset) using VGG16 with Keras after training?

I have trained the VGG16 net using Keras on my own dataset, which has 10 classes, so I modified the final layer to output 10 classes.
Here is the code:
TRAIN_DIR = "D:\\Dataset\\training"
VALIDATION_DIR = "D:\\Dataset\\validation"
part 2
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
IMAGE_WIDTH = 300
IMAGE_HEIGHT = 300
BATCH_SIZE = 16
part 3
train_datagen = ImageDataGenerator(rescale=1./255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')
validation_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(TRAIN_DIR,
                                                    target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
                                                    batch_size=BATCH_SIZE,
                                                    shuffle=True,  # shuffling adds randomness and helps prevent overfitting
                                                    class_mode="categorical")
validation_generator = validation_datagen.flow_from_directory(VALIDATION_DIR,
                                                              target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
                                                              batch_size=BATCH_SIZE,
                                                              shuffle=True,
                                                              class_mode="categorical")
part 4
training_samples = 1097
validation_samples = 272
total_steps = training_samples // BATCH_SIZE
loading the VGG16
# VGG16 network with pretrained weights is used
from keras.applications import vgg16
model = vgg16.VGG16(weights='imagenet', include_top=False, input_shape=(IMAGE_WIDTH, IMAGE_HEIGHT, 3), pooling="max")

for layer in model.layers[:-5]:
    layer.trainable = False
for layer in model.layers:
    print(layer, layer.trainable)
part 5
from keras.layers import Dense, GlobalAveragePooling2D, Dropout
from keras.models import Model, Sequential
# Although this part can also be done with the functional API,
# I found that for simple models like this one, Sequential is more intuitive
transfer_model = Sequential()
for layer in model.layers:
    transfer_model.add(layer)
transfer_model.add(Dense(512, activation="relu"))
transfer_model.add(Dropout(0.5))
transfer_model.add(Dense(10, activation="softmax"))
part 6
# Adam optimizer and learning rate 0.0001
from keras import optimizers
adam = optimizers.Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.00001)
transfer_model.compile(loss="categorical_crossentropy",
                       optimizer=adam,
                       metrics=["accuracy"])
finally the training
model_history = transfer_model.fit_generator(train_generator,
                                             steps_per_epoch=training_samples // BATCH_SIZE,
                                             epochs=25,
                                             validation_data=validation_generator,
                                             validation_steps=validation_samples // BATCH_SIZE)
part 7, using some random images from the internet to predict
test_path = "D:\\Dataset\\predict\\"
test_datagen = ImageDataGenerator(rescale=1./255,
                                  rotation_range=40,
                                  width_shift_range=0.2,
                                  height_shift_range=0.2,
                                  shear_range=0.2,
                                  zoom_range=0.2,
                                  horizontal_flip=True,
                                  fill_mode='nearest')
test_generator = test_datagen.flow_from_directory(test_path,
                                                  target_size=(IMAGE_WIDTH, IMAGE_HEIGHT),
                                                  batch_size=50,
                                                  class_mode="categorical")
In this part I am trying to predict, but I am getting these kinds of numbers instead of the actual prediction result, which I want as images:
pred = model.predict_generator(test_generator, steps=1)
print(pred)
The result looks like this, but I want these to be real images and cannot figure out how.
You can't output images from your network, and it's not really clear how you imagine this would work: the images are the input, and the output is a list of numbers with one value per class. You can interpret these numbers as the probability that the image contains an object of that class.
You could then find the most likely class (using e.g. the argmax function) and show an image from this class if you liked, but this would have to be done separately.
Note: you are running the prediction using the original model, not the transfer_model:
pred = model.predict_generator(test_generator, steps=1)
You should use the trained transfer model to get the class predictions, which will come in the form of a vector of 10 probabilities, one value per class.
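A minimal sketch of turning those probabilities into class names, assuming the generators defined in the question (class_indices is the folder-name-to-index mapping that flow_from_directory builds):

import numpy as np

pred = transfer_model.predict_generator(test_generator, steps=1)

# Invert the {class_name: index} mapping built by flow_from_directory
index_to_class = {v: k for k, v in train_generator.class_indices.items()}

for probs in pred:
    class_idx = np.argmax(probs)  # index of the most likely class
    print(index_to_class[class_idx], probs[class_idx])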

VGG16 for gender detection (male, female)

We used VGG16, froze all but the last 4 layers, and retrained those last 4 layers on a gender dataset of 12k male and 12k female images taken from the IMDB dataset. It gives very low accuracy, especially for males: on female test data it outputs female, but on male test data it gives the same output.
vgg_conv = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the layers except the last 4 layers
for layer in vgg_conv.layers[:-4]:
    layer.trainable = False

# Create the model
model = models.Sequential()
# Add the vgg convolutional base model
model.add(vgg_conv)
# Add new layers
model.add(layers.Flatten())
model.add(layers.Dense(4096, activation='relu'))
model.add(layers.Dense(4096, activation='relu'))
model.add(layers.Dropout(0.5))
model.add(layers.Dense(2, activation='softmax'))

nTrain = 16850
nTest = 6667

train_datagen = image.ImageDataGenerator(rescale=1./255)
test_datagen = image.ImageDataGenerator(rescale=1./255)
batch_size = 12
batch_size1 = 12

train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(224, 224),
                                                    batch_size=batch_size,
                                                    class_mode='categorical',
                                                    shuffle=False)
test_generator = test_datagen.flow_from_directory(test_dir,
                                                  target_size=(224, 224),
                                                  batch_size=batch_size1,
                                                  class_mode='categorical',
                                                  shuffle=False)

model.compile(optimizer=optimizers.RMSprop(lr=1e-6), loss='categorical_crossentropy', metrics=['acc'])
history = model.fit_generator(train_generator,
                              steps_per_epoch=train_generator.samples / train_generator.batch_size,
                              epochs=3,
                              validation_data=test_generator,
                              validation_steps=test_generator.samples / test_generator.batch_size,
                              verbose=1)
model.save('gender.h5')
Testing Code:
model = load_model('age.h5')
img = load_img('9358807_1980-12-28_2010.jpg', target_size=(224, 224))
img = img_to_array(img)
img = img.reshape((1, img.shape[0], img.shape[1], img.shape[2]))
img = preprocess_input(img)
yhat = model.predict(img)
print(yhat.size)
label = decode_predictions(yhat)
label = label[0][0]
print('%s (%.2f%%)' % (label[1], label[2] * 100))
Firstly, you are saving the model as gender.h5, but during testing you are loading the model age.h5. Probably you pasted different code for the testing here.
Coming to improving the accuracy of the program:
Most importantly, you are using loss='categorical_crossentropy'; change it to loss='binary_crossentropy' in model.compile, as you have just 2 classes. Your compile call will then look like this:
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=['accuracy'])
Also change class_mode='categorical' to class_mode='binary' in flow_from_directory.
As categorical_crossentropy goes hand in hand with a softmax activation in the last layer, if you change the loss to binary_crossentropy the last activation should also be changed to sigmoid, so the last layer should be Dense(1, activation='sigmoid'). Since the sigmoid already outputs a probability, do not construct the loss with from_logits=True.
You have added 2 Dense layers of 4096 units; the connection between them alone adds 4096 * 4096 = 16,777,216 weights for the model to learn. Reduce them, maybe to 1024 and 512 respectively.
You have added a Dropout layer of 0.5, which switches off 50% of the neurons during each training pass. That is a huge fraction. Better to drop the Dropout layer and use it only if your model is overfitting.
Set batch_size = 1. As you have very little input, let every epoch have the same number of steps as input records.
Use data augmentation techniques such as horizontal_flip, vertical_flip, shear_range, and zoom_range of ImageDataGenerator to generate new batches of training and validation images during every epoch (see the sketch after this answer).
Train your model for a large number of epochs. You are training for only epochs=3, which is too few for learning the weights. Train for epochs=50 and trim the number down later.
Hope this answers your question. Happy Learning.
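A minimal sketch pulling these suggestions together, reusing the names from the question's code (the 1024/512 layer sizes follow the advice above; the augmentation ranges are illustrative):

# Binary head: one sigmoid unit, binary_crossentropy, no from_logits
model = models.Sequential()
model.add(vgg_conv)
model.add(layers.Flatten())
model.add(layers.Dense(1024, activation='relu'))
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Augmented training data; class_mode='binary' to match the loss
train_datagen = image.ImageDataGenerator(rescale=1./255,
                                         horizontal_flip=True,
                                         shear_range=0.2,
                                         zoom_range=0.2)
train_generator = train_datagen.flow_from_directory(train_dir,
                                                    target_size=(224, 224),
                                                    batch_size=batch_size,
                                                    class_mode='binary')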
