I'm new to ConvNets and Python and want to implement the following:
I want to use the pretrained vgg16 model and add 3 fully connected layers after it with an L2-Normalization at the end.
So Data->VGG16->FC (1x4096)->FC (1x4096)->FC (1x3)->L2-Norm->Output
The first and second FC layers each output a 1x4096 array; the last FC layer outputs a 1x3 array, on which the L2-Norm is performed.
Can anyone give me a hint on how to do that?
I found that I can load the model like this:
model_vgg16 = models.vgg16(pretrained=True)
But how can I add the FC layers and the L2-Norm after that? And how can I pass test data through the model?
I'm quoting an example mentioned in Keras#3465
In the Keras framework, if you pass include_top = False while loading your pre-trained model, it will not include the final classification layers. You can then add your custom FC layers on top, as shown in the example below:
# load VGG16 without the dense layers, with Theano dim ordering (channels first)
from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense, Dropout, BatchNormalization
from keras.models import Model

base_model = VGG16(weights='imagenet', include_top=False, input_shape=(3, 224, 224))

# number of classes in your dataset, e.g. 20
num_classes = 20

x = Flatten()(base_model.output)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = BatchNormalization()(x)
predictions = Dense(num_classes, activation='softmax')(x)

# create the graph of your new model
head_model = Model(inputs=base_model.input, outputs=predictions)

# compile the model
head_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
head_model.summary()

# ... load your training data into x_train / y_train and set batch_size ...

# train your model on the data
head_model.fit(x_train, y_train, batch_size=batch_size, verbose=1)
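Since the question itself uses torchvision, here is a minimal PyTorch sketch of the architecture described above (pretrained VGG16 backbone, three new FC layers, L2 normalization of the final 1x3 output). The class name, the ReLU activations between the FC layers, and the dummy test batch are assumptions for illustration, not something from the original post:

import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models

class VGG16WithEmbedding(nn.Module):
    def __init__(self):
        super().__init__()
        vgg = models.vgg16(pretrained=True)
        self.features = vgg.features              # convolutional backbone of VGG16
        self.fc1 = nn.Linear(512 * 7 * 7, 4096)   # first FC: 1x4096
        self.fc2 = nn.Linear(4096, 4096)          # second FC: 1x4096
        self.fc3 = nn.Linear(4096, 3)             # third FC: 1x3

    def forward(self, x):
        x = self.features(x)
        x = torch.flatten(x, 1)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return F.normalize(x, p=2, dim=1)         # L2-Norm over the 3-dim output

model = VGG16WithEmbedding()
test_batch = torch.randn(4, 3, 224, 224)          # stand-in for real test data
with torch.no_grad():
    out = model(test_batch)                       # shape (4, 3), each row has unit L2 norm

Passing test data through the model is then just a forward call inside torch.no_grad(), as in the last lines of the sketch.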
I have a dataset whose schema looks like this:
X1 ... X20 C
where the first 20 columns are the input data and the last column is the target. The dataset includes 2000 records. I want to design a sequential Keras model to classify those target labels (which vary from 1 to 10, so I treat this as a multi-label classification problem). Assuming I have saved the input data and labels in X_train_1 and y_train_1, here is my model:
def build_model_1(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=X_train_1.shape):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=input_shape))
    model.add(tf.keras.layers.BatchNormalization(momentum=0.999))
    for layer in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, tf.keras.activations.selu,
                                        kernel_initializer="lecun_normal",
                                        kernel_regularizer=tf.keras.regularizers.l2(0.01)))
        model.add(tf.keras.layers.BatchNormalization(momentum=0.999))
    model.add(tf.keras.layers.Dense(10, tf.keras.activations.softmax, kernel_initializer="lecun_normal"))
    loss = tf.keras.losses.categorical_crossentropy
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.9, beta_2=0.999)
    metric = [tf.keras.metrics.Accuracy()]
    model.compile(loss=loss, optimizer=optimizer, metrics=[metric])
    return model
I thought the shape of the input should be that of my training dataset; however, when I compile and fit my model, I get the following error:
ValueError: Input 0 of layer sequential_12 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (32, 20)
What am I doing wrong here?
Your input shape is simply (20,), since you have 20 features and 2000 samples; you do not have to provide the batch size. Here is a working example:
import tensorflow as tf
import numpy as np

def build_model_1(n_hidden=1, n_neurons=30, learning_rate=3e-3, input_shape=(20,)):
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.InputLayer(input_shape=input_shape))
    model.add(tf.keras.layers.BatchNormalization(momentum=0.999))
    for layer in range(n_hidden):
        model.add(tf.keras.layers.Dense(n_neurons, tf.keras.activations.selu,
                                        kernel_initializer="lecun_normal",
                                        kernel_regularizer=tf.keras.regularizers.l2(0.01)))
        model.add(tf.keras.layers.BatchNormalization(momentum=0.999))
    model.add(tf.keras.layers.Dense(10, tf.keras.activations.softmax, kernel_initializer="lecun_normal"))
    loss = tf.keras.losses.categorical_crossentropy
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate, beta_1=0.9, beta_2=0.999)
    # CategoricalAccuracy matches the softmax output and one-hot targets
    metrics = [tf.keras.metrics.CategoricalAccuracy()]
    model.compile(loss=loss, optimizer=optimizer, metrics=metrics)
    return model

train_data = np.random.random((2000, 20))
model = build_model_1()
y = model(train_data)
Also, ask yourself if you are really dealing with a multi-label classification problem. Can a sample from your dataset belong to more than one class, or are the classes mutually exclusive? If the classes are not mutually exclusive, I would recommend changing the activation function for the output layer to sigmoid and changing the loss function to binary_crossentropy. The intuition behind this can be found here.
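For reference, a minimal sketch of what that change would look like, keeping the 20 input features and 10 classes from the question (everything else here is assumed for illustration):

import tensorflow as tf

# multi-label variant: each of the 10 classes is predicted independently
model = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(20,)),
    tf.keras.layers.Dense(30, activation="selu", kernel_initializer="lecun_normal"),
    tf.keras.layers.Dense(10, activation="sigmoid"),      # sigmoid instead of softmax
])
model.compile(loss="binary_crossentropy",                 # instead of categorical_crossentropy
              optimizer="adam",
              metrics=[tf.keras.metrics.BinaryAccuracy()])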
I'm creating an ensemble of VGG19, DenseNet, and EfficientNetB1.
The code is as follows:
IMAGE_SIZE = (224,224,3)
import tensorflow as tf
vgg19 = tf.keras.applications.vgg19.VGG19(
    input_shape=IMAGE_SIZE, weights='imagenet', include_top=False)
for layer in vgg19.layers:
    layer._name = layer._name + str('_19')
    layer.trainable = False

effnetb1 = tf.keras.applications.efficientnet.EfficientNetB1(
    include_top=False, weights='imagenet', input_shape=IMAGE_SIZE)
for layer in effnetb1.layers:
    layer._name = layer._name + str('_B1')
    layer.trainable = False

densenet = tf.keras.applications.densenet.DenseNet121(
    include_top=False, weights="imagenet", input_shape=IMAGE_SIZE)
for layer in densenet.layers:
    layer._name = layer._name + str('_Dense')
    layer.trainable = False
from keras.layers import Input, Flatten, Concatenate, Dense, Average, Dropout
inp = Input(IMAGE_SIZE)
vgg19_x = Flatten()(vgg19(inp))
vgg19_x = Dense(256, activation='relu')(vgg19_x)
effnet_x = Flatten()(effnetb1(inp))
effnet_x = Dense(256, activation='relu')(effnet_x)
densenet_x = Flatten()(densenet(inp))
densenet_x = Dense(256, activation='relu')(densenet_x)
from keras.models import Model
x = Concatenate()([vgg19_x, effnet_x, densenet_x])
x = Dense(128, activation='relu')(x)
x = Dropout(0.30)(x)
x = Dense(64, activation='relu')(x)
out = Dense(2, activation='softmax')(x)
model = Model(inputs = inp, outputs = out)
model.compile(
    loss='categorical_crossentropy',
    optimizer=tf.keras.optimizers.Adam(
        learning_rate=0.0005,
        name="Adam"),
    metrics=['accuracy']
)
model.summary()
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
checkpointer = ModelCheckpoint(filepath="/content/drive/MyDrive/ensemble/ensemble-weights.hdf5", verbose=1, save_best_only=True)
r = model.fit(
    training_set,
    validation_data=test_set,
    epochs=30,
    steps_per_epoch=len(training_set),
    validation_steps=len(test_set),
    callbacks=[checkpointer]
)
The code runs fine and training proceeds successfully when I'm not using the callback, but when I use ModelCheckpoint I get the following error after the first epoch:
ValueError: The target structure is of type `<class 'keras.engine.keras_tensor.KerasTensor'>`
KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='input_5'), name=...
However, the input structure is a sequence (<class 'list'>) of length 0.
[]
nest cannot guarantee that it is safe to map one to the other.
Can anyone tell me what's wrong here? Also, is it because I'm concatenating three models?
Your help will be appreciated. Thank you!
I also ran into this issue while trying to implement a nested model (which is what gets constructed here once you create the concatenated model).
The issue seems to be that Keras cannot handle the inputs and outputs of nested models in newer TensorFlow versions (TF 2.0 and above). Depending on the version you are on, you may need to explicitly refer to the input/output of the nested model you are using. In TF 2.6, what seems to work is to define a separate model for each part, i.e. the common layers added after concatenation should also be wrapped in a model of their own, as in the example below (taken from here):
import tensorflow as tf
from tensorflow import keras

# Make a Grad-CAM heatmap following the Keras tutorial.
# `model` is the trained nested model, `prepared_image` the preprocessed input image.
# First, a model that maps the backbone's inputs to the activations of its last conv layer
last_conv_layer = model.layers[-4].layers[-1]
last_conv_layer_model = keras.Model(model.layers[-4].inputs, last_conv_layer.output)

# Second, we create a model that maps the activations of the last conv
# layer to the final class predictions
classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
x = classifier_input
for layer in model.layers[-3:]:
    x = layer(x)
classifier_model = keras.Model(classifier_input, x)

# Preparing the image with the preprocessing layers
preprocess_layers = keras.Model(model.inputs, model.layers[-5].output)
img_array = preprocess_layers(prepared_image)

# Then, we compute the gradient of the top predicted class for our input image
# with respect to the activations of the last conv layer
with tf.GradientTape() as tape:
    # Compute activations of the last conv layer and make the tape watch them
    last_conv_layer_output = last_conv_layer_model(img_array)
    tape.watch(last_conv_layer_output)
    # Compute class predictions
    preds = classifier_model(last_conv_layer_output)
    top_pred_index = tf.argmax(preds[0])
    top_class_channel = preds[:, top_pred_index]

# This is the gradient of the top predicted class with regard to
# the output feature map of the last conv layer
grads = tape.gradient(top_class_channel, last_conv_layer_output)
You can also check the following GitHub issues (they are not directly related, but deal with a similar problem): issue1, issue2, issue3.
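Applied to the ensemble from the question, one way to follow that advice is to wrap the layers after the concatenation in their own Model and then compose it with the three backbones. The following is only a sketch of that structure (weights are omitted and the layer sizes are copied from the question); whether it resolves the ModelCheckpoint error depends on your TensorFlow version:

import tensorflow as tf
from tensorflow.keras.layers import Input, Flatten, Concatenate, Dense, Dropout
from tensorflow.keras.models import Model

IMAGE_SIZE = (224, 224, 3)

# three frozen backbones, as in the question (use weights='imagenet' in practice)
vgg19 = tf.keras.applications.vgg19.VGG19(include_top=False, weights=None, input_shape=IMAGE_SIZE)
effnetb1 = tf.keras.applications.efficientnet.EfficientNetB1(include_top=False, weights=None, input_shape=IMAGE_SIZE)
densenet = tf.keras.applications.densenet.DenseNet121(include_top=False, weights=None, input_shape=IMAGE_SIZE)
for backbone in (vgg19, effnetb1, densenet):
    backbone.trainable = False

# the common layers after concatenation, wrapped in their own Model
head_inputs = [Input(shape=(256,)) for _ in range(3)]
h = Concatenate()(head_inputs)
h = Dense(128, activation='relu')(h)
h = Dropout(0.30)(h)
h = Dense(64, activation='relu')(h)
head = Model(head_inputs, Dense(2, activation='softmax')(h), name='head')

# compose the backbones, the per-branch Dense layers, and the head into one graph
inp = Input(IMAGE_SIZE)
branches = [Dense(256, activation='relu')(Flatten()(backbone(inp)))
            for backbone in (vgg19, effnetb1, densenet)]
model = Model(inputs=inp, outputs=head(branches))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])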
I am working on an image dataset.
I first resized all the images:
im_size1 = 128
im_size2 = 128
i = 0
for f, breed in tqdm(df_train.values):
    img = cv2.imread('train/{}.jpeg'.format(f))
    if img is None:
        continue
    label = one_hot_labels[i]
    x_train.append(cv2.resize(img, (im_size1, im_size2)))
    y_train.append(label)
    i += 1
np.save('x_train2', x_train)
np.save('y_train2', y_train)
print('Done')
On the second run I passed Histogram of Oriented Gradients (HOG) features instead.
Note: the model was the same in both cases; I ran the two variants as separate projects and got different results.
i = 0
for f, breed in tqdm(df_train.values):
    img = cv2.imread('train/{}.jpeg'.format(f))
    if img is None:
        continue
    label = one_hot_labels[i]
    resizedImage = cv2.resize(img, (im_size1, im_size2))
    # compute the HOG descriptor for the resized image
    hog_vec, hog_vis = feature.hog(resizedImage, visualize=True)
    resizedImageVec = cv2.resize(hog_vec, (im_size1, im_size2))
    x_train.append(resizedImageVec)
    #np.concatenate(x_train, hog_vec[:])
    y_train.append(label)
    i += 1
np.save('x_train2hog', x_train)
np.save('y_train2hog', y_train)
print('Done')
Then I configured the model like so:
base_model = ResNet50(weights = None, include_top=False, input_shape=(im_size1, im_size2, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
x = Dropout(0.2)(x)
x = Dense(32, activation='relu')(x)
x = Dense(16, activation='relu')(x)
predictions = Dense(num_class, activation='softmax')(x)
# This is the model we will train
model = Model(inputs=base_model.input, outputs=predictions)
# First: train only the top layers (which were randomly initialized)
#for layer in base_model.layers:
# layer.trainable = False
model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', verbose=1)]
model.summary()
Now my professor has asked me this.
"Train last layer using images, then train last layer using features (different representation of images). Concatenate these two layers, and then train one more layer."
Please guide me on how I can accomplish this.
Have a look at the merge layers from the Keras API:
keras.layers.Concatenate(axis=-1)
and the functional version:
keras.layers.concatenate(inputs, axis=-1)
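For the task your professor describes, a rough sketch with the functional API could look like the following. The input sizes (a 2048-dim image-feature vector and a 16384-dim HOG vector) and the 16-unit branch layers are placeholders for illustration only; they are not taken from your code:

from keras.layers import Input, Dense, Concatenate
from keras.models import Model

num_class = 10  # replace with the number of classes in your dataset

# hypothetical feature dimensions: one image-feature vector and one HOG vector per sample
image_input = Input(shape=(2048,), name='image_features')
hog_input = Input(shape=(16384,), name='hog_features')

# one branch ("last layer") trained on each representation
image_branch = Dense(16, activation='relu')(image_input)
hog_branch = Dense(16, activation='relu')(hog_input)

# concatenate the two branches and train one more layer on top
merged = Concatenate(axis=-1)([image_branch, hog_branch])
merged = Dense(16, activation='relu')(merged)
predictions = Dense(num_class, activation='softmax')(merged)

model = Model(inputs=[image_input, hog_input], outputs=predictions)
model.compile(loss='categorical_crossentropy', optimizer='rmsprop', metrics=['accuracy'])

Training then takes the two representations as a list of inputs, e.g. model.fit([image_feats, hog_feats], labels).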
I want to do transfer learning with simple MLP models. First I train a feed-forward network with one hidden layer on a large dataset:
net = Sequential()
net.add(Dense(500, input_dim=2048, kernel_initializer='normal', activation='relu'))
net.add(Dense(1, kernel_initializer='normal'))
net.compile(loss='mean_absolute_error', optimizer='adam')
net.fit(x_transf,
        y_transf,
        epochs=1000,
        batch_size=8,
        verbose=0)
Then I want to reuse that single hidden layer as the first layer of a new network, to which I want to add a second layer. The reused layer should not be trainable.
idx = 1 # index of desired layer
input_shape = net.layers[idx].get_input_shape_at(0) # get the input shape of desired layer
input_layer = net.layers[idx]
input_layer.trainable = False
transf_model = Sequential()
transf_model.add(input_layer)
transf_model.add(Dense(input_shape[1], activation='relu'))
transf_model.compile(loss='mean_absolute_error', optimizer='adam')
transf_model.fit(x,
                 y,
                 epochs=10,
                 batch_size=8,
                 verbose=0)
EDIT:
The above code returns:
ValueError: Error when checking target: expected dense_9 to have shape (None, 500) but got array with shape (436, 1)
What's the trick to make this work?
I would simply use the Functional API to build such a model:
shared_layer = net.layers[0] # you want the first layer, so index = 0
shared_layer.trainable = False
inp = Input(the_shape_of_one_input_sample) # e.g. (2048,)
x = shared_layer(inp)
x = Dense(800, ...)(x)
out = Dense(1, ...)(x)
model = Model(inp, out)
# the rest is the same...
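A fully spelled-out version of that sketch might look like the following; the 800-unit layer comes from the snippet above, while the single-unit linear output, the loss, and the placeholder data names (x_new, y_new) are assumptions chosen to mirror the first network:

from keras.layers import Input, Dense
from keras.models import Model

shared_layer = net.layers[0]        # the 500-unit hidden layer trained above
shared_layer.trainable = False      # freeze the reused layer

inp = Input(shape=(2048,))          # one input sample has 2048 features
x = shared_layer(inp)
x = Dense(800, activation='relu')(x)
out = Dense(1)(x)                   # single regression output, as in the first network

transf_model = Model(inp, out)
transf_model.compile(loss='mean_absolute_error', optimizer='adam')
transf_model.fit(x_new, y_new, epochs=10, batch_size=8, verbose=0)  # x_new/y_new: your new data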
I'm using the latest Keras with the TensorFlow backend.
I'm not quite sure of the correct way to put the full model together for inference, given that I used a smaller version of my model for training on bottleneck values.
# Save bottleneck values
from keras.applications.xception import Xception
base_model = Xception(weights='imagenet', include_top=False)
prediction = base_model.predict(x)
# ... save the bottleneck data (prediction) to disk ...
Now let's say my full model looks something like this:
base_model = Xception(weights='imagenet', include_top=False)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(classes, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
But to speed up training, I want to bypass the earlier layers by loading the saved bottleneck values, so I create a smaller model (including only the new layers), then train and save it.
bottleneck_input = Input(shape=bottleneck_shape)
x = GlobalAveragePooling2D()(bottleneck_input)
x = Dense(1024, activation='relu')(x)
predictions = Dense(classes, activation='softmax')(x)
model = Model(inputs=bottleneck_input, outputs=predictions)
save_full_model()  # save the model
After training this smaller model, I want to run inference with the full model, so I need to put the base model and the smaller model together. I'm not sure what the best way to do this is.
base_model = Xception(weights='imagenet', include_top=False)
#x = base_model.output
loaded_model = load_model() # load bottleneck model
#now to combine both models (something like this?)
Model(inputs = base_model.inputs, outputs = loaded_model.outputs)
What is the proper way to put together the model for inference?
I don't know if there is a way to use my full model but start from the bottleneck layers during training and from the input layer during inference. (Please note this is not the same as freezing layers, which only freezes the weights (they won't be updated) but still computes the forward pass for every data point.)
Every Keras model is essentially a layer with extra properties such as its loss function, so you can use it like a layer in the Functional API. In your case it could look like this:
input = Input(...)
base_model = Xception(weights='imagenet', include_top=False)
# Apply model to input like layer
base_output = base_model(input)
loaded_model = load_model()
# Now the bottleneck model
out = loaded_model(base_output)
final_model = Model(input, out) # New computation graph
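As a concrete sketch (the 299x299 input size, the saved-model filename, and the random placeholder batch are assumptions, not taken from the question):

import numpy as np
from keras.applications.xception import Xception
from keras.layers import Input
from keras.models import Model, load_model

inp = Input(shape=(299, 299, 3))                 # assumed Xception input size
base_model = Xception(weights='imagenet', include_top=False)
bottleneck = base_model(inp)                     # bottleneck features

head_model = load_model('bottleneck_model.h5')   # hypothetical path to the saved small model
out = head_model(bottleneck)

final_model = Model(inp, out)

# inference on a batch of (already preprocessed) images
images = np.random.random((4, 299, 299, 3)).astype('float32')
preds = final_model.predict(images)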