Defining model in Keras - python

I am new to Deep learning and Keras. What does pretrained weights initialization weights='imagenet' mean when used to define a model in Keras?
ResNet50(weights='imagenet')
Thanks!

This code line creates a network architecture known by the name ResNet50 (you can find more information about it here). The weights='imagenet' makes Keras load the weights of this network, which has been trained on the imagenet data set. Without this information Keras would only be able to prepare the network architecture but would not be able to set any of the weights to "good" values, as it does not know the purpose of the model. This is determined by specifying the data set.
If you are using an other data set, then you are using the model as a pre-trained model. You can find more information about this technique here; but the general idea is: after a model has been trained on any complex (image) data set, it will have learned in its lowest layers (most of the time: convolutions) to detect very basic features, such as edges, corners, etc. This helps the model to learn to analyze your own data set much faster, as it does not have to learn to detect this basic features again.

Following #FlashTek answer we can also train this model on our dataset.
Look at the following code:
model = applications.ResNet50(weights = "imagenet", include_top=False,
input_shape = (img_width, img_height,3))
# Freeze the layers which you don't want to train. Here I am freezing the first 30 layers.
for layer in model.layers[0:30]:
layer.trainable = False
for layer in model.layers[30:]:
layer.trainable = True
#Adding custom Layers
x = Flatten()(model.output)
# x = Dense(1024, activation="relu")(x)
# x = Dropout(0.5)(x)
# x = Dense(1024, activation="relu")(x)
# x = Dropout(0.5)(x)
x = Dense(1024, activation="relu")(x)
predictions = Dense(2, activation="softmax")(x)
In the above code we can are specifying how many layer of resnet we have to train on our dataset by assigning layer.trainable either true to train it on your dataset or false for otherwise.
Apart of that we can also stick layer after the network as shown in Adding custom layers

Related

Why is my transfer learning implementation of VGG19 not improving accuracy?

I want to use the pretrained VGG19 (with imagenet weights) to build a two class classifier using a dataset of about 2.5k images that i've curated and split into 2 classes. It seems that not only is training taking a very long time, but accuracy seems to not increase in the slightest.
Here's my implementation:
def transferVGG19(train_dataset, val_dataset):
# conv_model = VGG19(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
conv_model = VGG19(
include_top=True,
weights="imagenet",
input_tensor=None,
input_shape=(224, 224, 3),
pooling=None,
classes=1000,
classifier_activation="softmax",
)
for layer in conv_model.layers:
layer.trainable = False
input = layers.Input(shape=(224, 224, 3))
scale_layer = layers.Rescaling(scale=1 / 127.5, offset=-1)
x = scale_layer(input)
x = conv_model(x, training=False)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(1, activation='softmax')(x)
full_model = models.Model(inputs=input, outputs=predictions)
full_model.summary()
full_model.compile(loss='binary_crossentropy',
optimizer='adam',
metrics=['acc'])
history = full_model.fit(
train_dataset,
epochs=10,
validation_data=val_dataset,
workers=10,
)
Model performance seems to be awful...
I imagine this behaviour comes from my rudimentary understanding of how layers work and how to best the new model's architecture. As VGG19 is trained on 1000 classes, i saw it best fit to add to the output a couple of dense layers to reduce the size of the feature maps, as well as a dropout layer in between to randomly discard neurons and help ease the risk of overfitting. At first i suspected i might have dropped too many neurons, but i was expecting my network to learn slower rather than not at all.
Is there something obviously wrong in my implementation that would cause such poor performance? Any explanation is welcomed. Just to mention, i would rule out the dataset as an issue because i've implemented transfer learning on Xception and have managed to get 98% validation accuracy that was monotonously increasing over 20 epochs. That implementation used different layers (i can provide it if necessary) because i was experimenting with different network layouts.
TLDR; Change include_top= True to False
Explaination-
Model graphs are represented in inverted manner i.e last layers are shown at the top and initial layers are shown at bottom.
When include_top=False, the top dense layers which are used for classification and not representation of data are removed from the pretrained VGG model. Only till the last conv2D layers are preserved.
During transfer-learning, you need to keep the learned representation layers intact and only learn the classification part for your data. Hence you are adding your stack of classification layers i.e.
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(1, activation='softmax')(x)
If you keep the top classification layers of VGG, it will give 1000 probabilities for 1000 classes due to softmax activation at its top layer in model graph.This activation is not relu. We dont need softmax in intermediate layer as softmax "squishes" the unscaled inputs so that sum(input) = 1. Effectively it produces a smooth software defined approximation of argmax. Hence your accuracy is suffering.

Get output of any layer from a saved Keras model [duplicate]

I am developing an autoencoder for clustering certain groups of images.
input_images->...->bottleneck->...->output_images
I have calibrated the autoencoder to my satisfaction and saved the model; everything has been developed using keras.tensorflow on python3.
The next step is to apply the autoencoder to a ton of images and cluster them according to cosine distance in the bottleneck layer. Oops, I just realized that I don't know the syntax in keras.tf for running the model on a batch up to a specific layer rather than to the output layer. Thus the question:
How do I run something like Model.predict_on_batch or Model.predict_generator up to the certain "bottleneck" layer and retrieve the values on that layer rather than the values on the output layer?
You need to define a new model (if you didn't define the encoder and decoder as separate models initially, which is usually the easiest option).
If your model was defined without reusing layers, it's just:
inputs = model.input
outputs= model.get_layer('bottleneck').output
encoder = Model(inputs, outputs)
Use the encoder model as any other model.
The full code would be like this,
# ENCODER
encoding_dim = 37310
input_layer = Input(shape=(encoding_dim,))
encoder = Dense(500, activation='tanh')(input_layer)
encoder = Dense(100, activation='tanh')(encoder)
encoder = Dense(50, activation='tanh', name='bottleneck_layer')(encoder)
decoder = Dense(100, activation='tanh')(encoder)
decoder = Dense(500, activation='tanh')(decoder)
decoder = Dense(37310, activation='sigmoid')(decoder)
# full model
model_full = models.Model(input_layer, decoder)
model_full.compile(optimizer='adam', loss='mse')
model_full.fit(x, y, epochs=20, batch_size=16)
# bottleneck model
bottleneck_output = model_full.get_layer('bottleneck_layer').output
model_bottleneck = models.Model(inputs = model_full.input, outputs = bottleneck_output)
bottleneck_predictions = model_bottleneck.predict(X_test)

Multi class image classification using CNN

I wanted to classify images which consist five classes. I wanted to use CNN. But when I try with several models, the training accuracy will not increase than 20%. Please some one help me to overcome this. Mostly model will trained within 3 epoches and when epoches increase there is no improvement in accuracy. Can anyone suggest me a solution or model or can specify what could be the problem?
Below is one of the model i have used
#defining training and test sets
x_train,x_val,y_train,y_val=train_test_split(x,y,test_size=0.2, random_state=42)
print('Training data and target sizes: \n{}, {}'.format(x_train.shape,y_train.shape))
print('Test data and target sizes: \n{}, {}'.format(x_val.shape,y_val.shape))
Training data and target sizes:
(2398, 224, 224, 3), (2398,)
Test data and target sizes:
(600, 224, 224, 3), (600,)
img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet',pooling='avg', input_shape=(img_rows, img_cols, img_channel))
print(base_model.summary())
#Adding custom Layers
add_model = Sequential()
add_model.add(Dense(1024, activation='relu',input_shape=base_model.output_shape[1:]))
add_model.add(Dropout(0.60))
add_model.add(Dense(1, activation='sigmoid'))
print(add_model.summary())
# creating the final model
model = Model(inputs=base_model.input, outputs=add_model(base_model.output))
# compile the model
opt = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
reduce_lr = ReduceLROnPlateau(monitor='val_acc',
patience=5,
verbose=1,
factor=0.1,
cooldown=10,
min_lr=0.00001)
model.compile(
loss='categorical_crossentropy',
metrics=['acc'],
optimizer='adam'
)
print(model.summary())
n_fold = 5
kf = model_selection.KFold(n_splits = n_fold, shuffle = True)
eval_fun = metrics.roc_auc_score
model.fit(x_train,y_train,epochs=50,batch_size=50,validation_data=(x_val,y_val))
is it okay could you share the part of the code where you're fitting the model. It's not available in the post.
And since the output is not reproducible due to lack of data, I suggest you go through this link https://www.kaggle.com/kenconstable/alzheimer-s-multi-class-classification
It's really well explained and it has given the best practices of multi-class-classification based on transfer learning as well as from scratch. In case you don't find this helpful, It would be helpful to share the training script including the model.fit() code.
Okay, so here's the issue,
In your code, you may be creating a base model with inception V3, however, you are not really adding that base model to your add_model variable.
Your add_model variable is essentially a dense network and not a CNN. Also, another thing, although it's not a big deal is that you're creating your own optimiser opt and not using it in model.compile
Can you please try this code out and let me know if it works:
# function to build the model
def build_transfer_model(conv_base,dropout,dense_node,learn_rate,metric):
"""
Build and compile a transfer learning model
Input: a base model, dropout rate, the number of filters in the dense node,
the learning rate and performance metrics
Output: A compiled CNN model
"""
# clear previous run
backend.clear_session()
# build the model
model = Sequential()
model.add(conv_base)
model.add(Dropout(dropout))
model.add(BatchNormalization())
model.add(Flatten())
model.add(Dense(dense_node,activation='relu'))
model.add(Dense(1,activation='sigmoid'))
# complile the model
model.compile(
optimizer = tensorflow.keras.optimizers.Adam(lr=learn_rate),
loss = 'categorical_crossentropy',
metrics = metric )
model.summary()
return model
img_rows, img_cols, img_channel = 224, 224, 3
base_model = applications.inception_v3.InceptionV3(include_top=False, weights='imagenet',pooling='avg', input_shape=(img_rows, img_cols, img_channel))
model = build_transfer_model(conv_base=base_model,dropout=0.6,dense_node =1024,learn_rate=0.001,metric=['acc'])
print(model.summary())
model.fit(x_train,y_train,epochs=50,batch_size=50,validation_data=(x_val,y_val))
If you pay attention in the function, the first thing we are adding to the instance of Sequential() is the base layer (InceptionV3 in your case). But you were adding a dense layer directly. Although it may get the weights from the output layer of the base inception V3, it will be a dense network, not a CNN. So please check this out.
I may have changed the variable names, although I have tried not to do the same. And, please change the order of the layers in the build_transfer_model function according to your requirement.
In case it doesn't work, let me know.
Thanks.
You have to use model.fit() to actually train the model after compiling. Right now, it has randomly initialized weights, and is therefore making random predictions. Since you have five classes, the accuracy is approximately 1/5 = 20%. Training your model may take time depending on model size and amount of data you have.

Transfer learning using saved bottleneck values (inference with full model)

I'm using lastest Keras with tensorflow backend.
I'm not quite sure the correct way to put together the full model for inference, if I used a smaller version of my model for training on bottleneck values.
# Save bottleneck values
from keras.applications.xception import Xception
base_model = Xception(weights='imagenet', include_top=False)
prediction = base_model.predict(x)
** SAVE bottleneck data***
Now let's say my full model looks something like this:
base_model = Xception(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(classes, activation='softmax')(x)
model = Model(input=base_model.input, output=predictions)
but to speed up training, I wanted to bypass the earlier layers by loading bottleneck values; so I create a smaller model (including only the new layers). I then train and save the model.
bottleneck_input = Input(shape = bottleneck_shape)
x = GlobalAveragePooling2D() (bottleneck_input)
x = Dense(1024, activation='relu')(x)
predictions = Dense(classes, activation='softmax')(x)
model = Model(input= bottleneck_input, output=predictions)
save_full_model() #save model
after training this smaller model, I want to run inference on the full model. So I need to put together the base model and the smaller model. Not sure what is the best way to to do this.
base_model = Xception(weights='imagenet', include_top=False)
#x = base_model.output
loaded_model = load_model() # load bottleneck model
#now to combine both models (something like this?)
Model(inputs = base_model.inputs, outputs = loaded_model.outputs)
What is the proper way to put together the model for inference?
I don't know if there is a way to use my full-model for training, and just start from the bottleneck layers for training and input layer for inference. (Please not this is not the same as freeze layers, which just freezes the weights (weights won't be updated), but still calculates each data point.)
Every model is a layer with extra properties such as loss function etc. So you can use them like a layer in the functional API. In your case it could look like:
input = Input(...)
base_model = Xception(weights='imagenet', include_top=False)
# Apply model to input like layer
base_output = base_model(input)
loaded_model = load_model()
# Now the bottleneck model
out = loaded_model(base_output)
final_model = Model(input, out) # New computation graph

Pre-training Keras Xception and InceptionV3 models

I'm trying to do a simple binary classification problem using Keras and its pre-built ImageNet CNN architecture.
For VGG16, I took the following approach,
vgg16_model = keras.application.vgg16.VGG16()
'''Rebuild the vgg16 using an empty sequential model'''
model = Sequential()
for layer in vgg16_model.layers:
model.add(layer)
'''Since the problem is binary, I got rid of the output layer and added a more appropriate output layer.'''
model.pop()
'''Freeze other pre-trained weights'''
for layer in model.layers:
layer.trainable = False
'''Add the modified final layer'''
model.add(Dense(2, activation = 'softmax'))
And this worked marvelously with higher accuracy than my custom built CNN. But it took a while to train and I wanted to take a similar approach using Xception and InceptionV3 since they were lighter models with higher accuracy.
xception_model = keras.applicaitons.xception.Xception()
model = Sequential()
for layer in xception_model.layers:
model_xception.add(layer)
When I run the above code, I get the following error:
ValueError: Input 0 is incompatible with layer conv2d_193: expected axis -1 of input shape to have value 64 but got shape (None, None, None, 128)
Basically, I would like to do the same thing as I did with VGG16 model; keep the other pretrained weights as they are and simply modify the output layer to a binary classification output instead of an output layer with 1000 outcomes. I can see that unlike VGG16, which has relatively straightforward convolution layer structure, Xception and InceptionV3 have some funky nodes that I'm not 100% familiar with and I'm assuming those are causing issues.
Your code fails because InceptionV3 and Xception are not Sequential models (i.e., they contain "branches"). So you can't just add the layers into a Sequential container.
Now since the top layers of both InceptionV3 and Xception consist of a GlobalAveragePooling2D layer and the final Dense(1000) layer,
if include_top:
x = GlobalAveragePooling2D(name='avg_pool')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
if you want to remove the final dense layer, you can just set include_top=False plus pooling='avg' when creating these models.
base_model = InceptionV3(include_top=False, pooling='avg')
for layer in base_model.layers:
layer.trainable = False
output = Dense(2, activation='softmax')(base_model.output)
model = Model(base_model.input, output)

Categories

Resources