Does pre-trained VGG-16 work with LeakyReLU? - python

Does pre-trained VGG-16 with ImageNet weights work with the LeakyReLU activation function?
Our input has negative values, hence we would like to use LeakyReLU.
Sample code below.
VGG-16 using ReLU:
from keras.applications.vgg16 import VGG16

# CREATE VGG16
vgg = VGG16(include_top=False, pooling='max', weights='imagenet', input_shape=(96, 96, 3))
vgg.get_config()
# freeze weights in all layers except the last 10
for layer in vgg.layers[:-10]:
    layer.trainable = False
# Check if any layer is not frozen
for each in vgg.layers:
    print(each, each.trainable)
# Put a newline to make it neat
print('')
# Look at the summary of VGG16
vgg.summary()
VGG-16 using LeakyReLU:
from keras.layers import LeakyReLU

# CREATE VGG16
vgg = VGG16(include_top=False, pooling='max', weights='imagenet', input_shape=(96, 96, 3))
# swap ReLU for LeakyReLU in all layers except the last 10
for layer in vgg.layers[:-10]:
    if hasattr(layer, 'activation'):
        layer.activation = LeakyReLU(alpha=leaky_relu_alpha)  # leaky_relu_alpha defined elsewhere
# freeze weights in all layers except the last 10
for layer in vgg.layers[:-10]:
    layer.trainable = False
# Check if any layer is not frozen
for each in vgg.layers:
    print(each, each.trainable)
# Put a newline to make it neat
print('')
# Look at the summary of VGG16
vgg.summary()

If the VGG is pre-trained with ReLU, then LeakyReLU may not work well: the layers after each ReLU activation never saw feature maps with negative values, which may lead to low performance. However, the impact of LeakyReLU on performance depends highly on your task. I suggest you fine-tune the pre-trained VGG with the LeakyReLU activation using a small learning rate for several epochs; this should address the performance degradation.
Another way of handling your negative input is normalizing it into the range [0, 1] (or some other range).
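A minimal sketch of that fine-tuning recipe, assuming tf.keras; the head layers, loss, learning rate, and the leaky_relu_alpha value are illustrative assumptions, not from the question:
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models, optimizers

leaky_relu_alpha = 0.1  # assumed value

vgg = VGG16(include_top=False, pooling='max', weights='imagenet', input_shape=(96, 96, 3))
# swap every ReLU for LeakyReLU, then fine-tune rather than train from scratch
for layer in vgg.layers:
    if hasattr(layer, 'activation'):
        layer.activation = layers.LeakyReLU(alpha=leaky_relu_alpha)

# attach a small head and fine-tune the whole network with a low learning rate
x = layers.Dense(64, activation='relu')(vgg.output)
out = layers.Dense(1, activation='sigmoid')(x)
model = models.Model(vgg.input, out)
model.compile(optimizer=optimizers.Adam(learning_rate=1e-5), loss='binary_crossentropy')
# model.fit(train_data, epochs=5)  # a few epochs, as suggested above

# alternative: rescale inputs into [0, 1] instead of changing activations, e.g.
# x01 = (x - x.min()) / (x.max() - x.min())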

Related

Why is my transfer learning implementation of VGG19 not improving accuracy?

I want to use the pretrained VGG19 (with imagenet weights) to build a two-class classifier using a dataset of about 2.5k images that I've curated and split into 2 classes. It seems that not only is training taking a very long time, but accuracy does not increase in the slightest.
Here's my implementation:
def transferVGG19(train_dataset, val_dataset):
    # conv_model = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
    conv_model = VGG19(
        include_top=True,
        weights="imagenet",
        input_tensor=None,
        input_shape=(224, 224, 3),
        pooling=None,
        classes=1000,
        classifier_activation="softmax",
    )
    for layer in conv_model.layers:
        layer.trainable = False

    input = layers.Input(shape=(224, 224, 3))
    scale_layer = layers.Rescaling(scale=1 / 127.5, offset=-1)
    x = scale_layer(input)
    x = conv_model(x, training=False)
    x = layers.Dense(256, activation='relu')(x)
    x = layers.Dropout(0.5)(x)
    x = layers.Dense(64, activation='relu')(x)
    predictions = layers.Dense(1, activation='softmax')(x)

    full_model = models.Model(inputs=input, outputs=predictions)
    full_model.summary()
    full_model.compile(loss='binary_crossentropy',
                       optimizer='adam',
                       metrics=['acc'])
    history = full_model.fit(
        train_dataset,
        epochs=10,
        validation_data=val_dataset,
        workers=10,
    )
Model performance seems to be awful...
I imagine this behaviour comes from my rudimentary understanding of how the layers work and how best to design the new model's architecture. As VGG19 is trained on 1000 classes, I saw it best fit to add a couple of dense layers to the output to reduce the size of the feature maps, as well as a dropout layer in between to randomly discard neurons and help ease the risk of overfitting. At first I suspected I might have dropped too many neurons, but I was expecting my network to learn slower rather than not at all.
Is there something obviously wrong in my implementation that would cause such poor performance? Any explanation is welcomed. Just to mention, I would rule out the dataset as an issue because I've implemented transfer learning on Xception and managed to get 98% validation accuracy that was monotonically increasing over 20 epochs. That implementation used different layers (I can provide it if necessary) because I was experimenting with different network layouts.
TL;DR: Change include_top=True to False.
Explanation:
Model graphs are drawn in an inverted manner, i.e. the last layers are shown at the top and the initial layers at the bottom.
When include_top=False, the top dense layers, which serve classification rather than representation of the data, are removed from the pretrained VGG model; only the layers up to the last Conv2D layer are preserved.
During transfer learning, you need to keep the learned representation layers intact and only learn the classification part for your data. Hence you are adding your own stack of classification layers, i.e.
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(1, activation='softmax')(x)
If you keep the top classification layers of VGG, it will output 1000 probabilities for the 1000 ImageNet classes, because of the softmax activation at its top layer. This activation is not ReLU. We don't want softmax in an intermediate layer, since softmax "squishes" the unscaled inputs so that sum(input) = 1; effectively it produces a smooth, differentiable approximation of argmax. Hence your accuracy is suffering.
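A minimal sketch of the suggested fix, assuming tf.keras. One extra adjustment beyond include_top=False: the single-unit softmax output is swapped for a sigmoid, since a softmax over one unit always outputs 1 and can never learn:
from tensorflow.keras.applications import VGG19
from tensorflow.keras import layers, models

conv_model = VGG19(include_top=False, weights='imagenet', input_shape=(224, 224, 3), pooling='avg')
for layer in conv_model.layers:
    layer.trainable = False

inp = layers.Input(shape=(224, 224, 3))
x = layers.Rescaling(scale=1 / 127.5, offset=-1)(inp)
x = conv_model(x, training=False)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(1, activation='sigmoid')(x)
full_model = models.Model(inputs=inp, outputs=predictions)
full_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])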

Check if a layer has weights for a keras CNN

I have a CNN which is made in keras.
I can view the weights of a specific layer using:
weights,biases = model.layers[n].get_weights()
print(weights,biases)
But I want to view the weights in all relevant layers so I tried:
for layer in model.layers:
    weights, biases = layer.get_weights()
But this throws an error for layers which don't have weights (like the max pool layer).
So is it possible to check if a layer has weights, something like:
if layer.weights != None:
Edit: Added the keras model I am using
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(1, kernel_size=(3,3),activation='relu', input_shape=(6,6,1)))
model.add(keras.layers.MaxPool2D(pool_size=(2,2)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(len(sign_label_list),activation='relu'))
You could try this:
for layer in model.layers:
    if len(layer.get_weights()) > 0:
        print(layer.name, layer.get_weights())
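Since layer.weights is a (possibly empty) list rather than None, the truthiness check the question sketched also works. A minimal variant, assuming each weighted layer's get_weights() returns a [kernel, bias] pair as with standard Conv2D/Dense layers:
for layer in model.layers:
    if layer.weights:  # empty list for weightless layers such as MaxPool2D and Flatten
        weights, biases = layer.get_weights()
        print(layer.name, weights.shape, biases.shape)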

How do I use my own data in this CNN finetune code

I am trying to finetune inception-v3, so that it is able to make a decision between an image with signal present and signal absent. How do I edit the code so that it can train on my data? Here is the code to finetune inception-v3:
from keras.applications.inception_v3 import InceptionV3
from keras.preprocessing import image
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras import backend as K

# create the base pre-trained model
base_model = InceptionV3(weights='imagenet', include_top=False)

# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 200 classes
predictions = Dense(200, activation='softmax')(x)

# this is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)

# first: train only the top layers (which were randomly initialized)
# i.e. freeze all convolutional InceptionV3 layers
for layer in base_model.layers:
    layer.trainable = False

# compile the model (should be done *after* setting layers to non-trainable)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')

# train the model on the new data for a few epochs
model.fit_generator(...)

# at this point, the top layers are well trained and we can start fine-tuning
# convolutional layers from inception V3. We will freeze the bottom N layers
# and train the remaining top layers.

# let's visualize layer names and layer indices to see how many layers
# we should freeze:
for i, layer in enumerate(base_model.layers):
    print(i, layer.name)

# we chose to train the top 2 inception blocks, i.e. we will freeze
# the first 249 layers and unfreeze the rest:
for layer in model.layers[:249]:
    layer.trainable = False
for layer in model.layers[249:]:
    layer.trainable = True

# we need to recompile the model for these modifications to take effect
# we use SGD with a low learning rate
from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.0001, momentum=0.9), loss='categorical_crossentropy')

# we train our model again (this time fine-tuning the top 2 inception blocks
# alongside the top Dense layers)
model.fit_generator(...)
I would greatly appreciate any help you might give.
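A minimal sketch of feeding your own two-class data (signal present / signal absent) into the fit_generator calls above, assuming a hypothetical directory layout data/train/signal and data/train/no_signal (and the same under data/val); the paths, batch size, and epoch count are illustrative:
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.inception_v3 import preprocess_input

train_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    'data/train', target_size=(299, 299), batch_size=32, class_mode='categorical')
val_gen = ImageDataGenerator(preprocessing_function=preprocess_input).flow_from_directory(
    'data/val', target_size=(299, 299), batch_size=32, class_mode='categorical')

# with two classes, the head should be Dense(2, activation='softmax') rather than Dense(200)
model.fit_generator(train_gen, validation_data=val_gen, epochs=5)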

Pre-training Keras Xception and InceptionV3 models

I'm trying to do a simple binary classification problem using Keras and its pre-built ImageNet CNN architecture.
For VGG16, I took the following approach,
vgg16_model = keras.applications.vgg16.VGG16()
'''Rebuild the vgg16 using an empty sequential model'''
model = Sequential()
for layer in vgg16_model.layers:
    model.add(layer)
'''Since the problem is binary, I got rid of the output layer and added a more appropriate output layer.'''
model.pop()
'''Freeze other pre-trained weights'''
for layer in model.layers:
    layer.trainable = False
'''Add the modified final layer'''
model.add(Dense(2, activation='softmax'))
And this worked marvelously with higher accuracy than my custom built CNN. But it took a while to train and I wanted to take a similar approach using Xception and InceptionV3 since they were lighter models with higher accuracy.
xception_model = keras.applications.xception.Xception()
model = Sequential()
for layer in xception_model.layers:
    model.add(layer)
When I run the above code, I get the following error:
ValueError: Input 0 is incompatible with layer conv2d_193: expected axis -1 of input shape to have value 64 but got shape (None, None, None, 128)
Basically, I would like to do the same thing as I did with VGG16 model; keep the other pretrained weights as they are and simply modify the output layer to a binary classification output instead of an output layer with 1000 outcomes. I can see that unlike VGG16, which has relatively straightforward convolution layer structure, Xception and InceptionV3 have some funky nodes that I'm not 100% familiar with and I'm assuming those are causing issues.
Your code fails because InceptionV3 and Xception are not Sequential models (i.e., they contain "branches"). So you can't just add the layers into a Sequential container.
Now since the top layers of both InceptionV3 and Xception consist of a GlobalAveragePooling2D layer and the final Dense(1000) layer,
if include_top:
    x = GlobalAveragePooling2D(name='avg_pool')(x)
    x = Dense(classes, activation='softmax', name='predictions')(x)
if you want to remove the final dense layer, you can just set include_top=False plus pooling='avg' when creating these models.
base_model = InceptionV3(include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)
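The same pattern works for Xception; a minimal sketch, assuming the Dense and Model imports used above:
from keras.applications.xception import Xception

base_model = Xception(include_top=False, pooling='avg')
for layer in base_model.layers:
    layer.trainable = False
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)
model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])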

Is adding a Masking layer before the prediction layers correct?

I want to use a pre-trained model to classify images. Every image can have multiple classes (in this case up to 21). Of course, not all classes occur in every image. Therefore, I want to add a Masking layer to the model which masks all zeros (the value given when a class doesn't exist).
Now I've never used a Masking layer before, but in most online examples it is added directly after the Input layer. This is, however, not possible in the example below, because the Masking layer is not compatible with all layers (in this case a Conv2D layer).
It was possible to insert the Masking layer before the fully-connected output layer, but this did not really affect the loss of the model (the loss is around 20!).
Am I doing it right, or is adding the Masking layer at the top of the model totally useless?
My Model
from keras.layers import Input, GlobalAveragePooling2D, Masking, Dense, Dropout
from keras.applications.inception_v3 import InceptionV3
from keras.models import Model
#Define input tensor
input_tensor = Input(shape=(512, 512, 3))
# create the base pre-trained model
base_model = InceptionV3(input_tensor=input_tensor, weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# let's add a Masking layer, a fully-connected layer and a dropout
x = Masking(0.)(x)
x = Dense(1024, activation='relu')(x)
x = Dropout(0.2)(x)
# and a logistic layer with 21 classes
predictions = Dense(21, activation='sigmoid')(x)
# create final model
model = Model(inputs=base_model.input, outputs=predictions)
# compile model
model.compile(optimizer='rmsprop', loss='categorical_crossentropy')
Example labels
# input
np.array([[1, 2, 0, 0, 0, 6, 7, 8, 0, 0, 0, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
          [0, 2, 0, 4, 5, 6, 7, 8, 9, 10, 11, 0, 0, 14, 15, 16, 17, 18, 19, 20, 21]])
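For reference, these index-style labels can be turned into the multi-hot vectors a 21-unit sigmoid output expects (1 where a class is present, 0 elsewhere); a minimal sketch, not from the original question, which also assumes binary_crossentropy as the usual pairing for multi-label sigmoid outputs:
import numpy as np

labels = np.array([[1, 2, 0, 0, 0, 6, 7, 8, 0, 0, 0, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21],
                   [0, 2, 0, 4, 5, 6, 7, 8, 9, 10, 11, 0, 0, 14, 15, 16, 17, 18, 19, 20, 21]])

# position k holds class k+1 when present and 0 otherwise, so a presence mask suffices
multi_hot = (labels > 0).astype('float32')
# model.compile(optimizer='rmsprop', loss='binary_crossentropy')  # instead of categorical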
