Changing MobileNet Dropout After Loading

Changing MobileNet Dropout After Loading - python

I am working a transfer learning problem. When I create a new model from just the Mobilenet, I set a dropout.
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(200,200,3), dropout=.15)
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(10, activation='softmax')(x)
I save models as I train using model_checkpoint_callback. As I train I find where overfitting is happening and adjust the amount of frozen layers and the learning rate. Can I also adjust dropout when I save a loaded model again?
I saw this answer but there are no actual dropout layers in Mobilenet, so this
for layer in model.layers:
if hasattr(layer, 'rate'):
print(layer.name)
layer.rate = 0.5
doesn't do anything.

In the past, you had to clone the model for the new dropout to take. I haven't tried it recently.
# This code allows you to change the dropout
# Load model from .json
model.load_weights(filenameToModelWeights) # Load weights
model.layers[-2].rate = 0.04 # layer[-2] is my dropout layer, rate is dropout attribute
model = keras.models.clone(model) # If I do not clone, the new rate is never used. Weights are re-init now.
model.load_weights(filenameToModelWeights) # Load weights
model.predict(x)
credit to
http://www.gergltd.com/home/2018/03/changing-dropout-on-the-fly-during-training-time-test-time-in-keras/
If the model doesn't have dropout layers to even begin with, as with Keras's pretrained mobilenet, you'll have to add them with methods. Here's one way you could do it.
For adding in a single layer
def insert_single_layer_in_keras(model, layer_name, new_layer):
layers = [l for l in model.layers]
x = layers[0].output
for i in range(1, len(layers)):
x = layers[i](x)
# add layer afterward
if layers[i].name == layer_name:
x = new_layer(x)
new_model = Model(inputs=layers[0].input, outputs=x)
return new_model
For systematically adding a layer
def insert_layers_in_model(model, layer_common_name, new_layer):
import re
layers = [l for l in model.layers]
x = layers[0].output
layer_config = new_layer.get_config()
base_name = layer_config['name']
layer_class = type(dropout_layer)
for i in range(1, len(layers)):
x = layers[i](x)
match = re.match(".+" + layer_common_name + "+", layers[i].name)
# add layer afterward
if match:
layer_config['name'] = base_name + "_" + str(i) # no duplicate names, could be done different
layer_copy = layer_class.from_config(layer_config)
x = layer_copy(x)
new_model = Model(inputs=layers[0].input, outputs=x)
return new_model
Run like this
import tensorflow as tf
from tensorflow.keras.applications.mobilenet import MobileNet
from tensorflow.keras.layers import Dropout
from tensorflow.keras.models import Model
base_model = MobileNet(weights='imagenet', include_top=False, input_shape=(192, 192, 3), dropout=.15)
dropout_layer = Dropout(0.5)
# add single layer after last dropout
mobile_net_with_dropout = insert_single_layer_in_model(base_model, "conv_pw_13_bn", dropout_layer)
# systematically add layers after any batchnorm layer
mobile_net_with_multi_dropout = insert_layers_in_model(base_model, "bn", dropout_layer)
By the way, you should absolutely experiment, but it's unlikely you want additional regularization on top of batchnorm for a small net like mobilenet.

Related

Adding a list of layers to a keras model computation graph (functional API)

EDIT: More clarification -
I have a pre-trained model file which I can load and pull model.layers and model.weights from. This model may have a complex set of interconnected layers.
I want to be able to use the model.layers or the model() file directly to append it to a layer in another neural network.
#Dummy model - this function is not available to me; only the model file
def model1():
inp = layers.Input((3,))
x = layers.Dense(4, activation='relu')(inp)
out = layers.Dense(2, activation='softmax')(x)
model = Model(inp, out)
return model
pretrained_model = model1() #I have THIS only!
L = pretrained_model.layers
print(L)
[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7f915d6778b0>,
<tensorflow.python.keras.layers.core.Dense at 0x7f915d643790>,
<tensorflow.python.keras.layers.core.Dense at 0x7f915e124e50>]
I want to take the Dense layers L[1:] and add them to another architecture (Not the weights, just the layers). Something like below as #Anton has described in his solution.
inp = layers.Input((3,))
x = Dense(3, activation='relu')(inp)
m0 = get_layers(pretrained_model)(x) #<---
out = layers.Dense(2)(m0)
This should give me a model.summary() with 5 layers - inp, x, L[1], L[2], out
But I am unable to use the list of layers directly.
I can come up with a function that recreates a partial computation graph based on these layers but I am looking for something simpler.
I have already tried modifying the model1() function to work for me as below, which serves my purpose but assuming I only get a model file and with a massive number of layers, this will not be possible.
def model1(layer):
#inp = layers.Input((3,))
x = layers.Dense(4, activation='relu')(layer)
out = layers.Dense(2, activation='softmax')(x)
model = Model(inp, out)
return model.output

How can I use a model generator inside another model
We can use the generate model1() and replace
inp = layers.Input((3,))
x = Dense(3, activation='relu')(inp)
m0 = get_layers(pretrained_model)(x) # <---
out = layers.Dense(2)(m0)
with
inp = layers.Input((3,))
x = layers.Dense(3, activation='relu')(inp)
m0 = pretrained_model(x) # <---
out = layers.Dense(2)(m0)
and if we want a new model generator model2() that does that as a function
def model2(pretrained_model):
inp = layers.Input((3,), name='model2_input')
x = layers.Dense(3, activation='relu', name='model2_x')(inp)
m0 = pretrained_model(x)
out = layers.Dense(2, name='model2_out')(m0)
model = Model(inp, out, name='model2')
return model
second_model = model2()
If we look at the graph of second_model we can see that indeed it contains the layers of model1
We can generate the above image using
tf.keras.utils.plot_model(second_model, show_shapes=True, expand_nested=True)

Tensorflow 2 does not have a a fully_connected function How can I simulate that?

I am making a CNN model to use for lane detection. But tensorflow 2 does not have tf.contrib therefore i cannot access the fully_connected layer.
How can I make my own Fully connected layer function?
This is my model so far:
conv2d = tf.nn.conv2d
batch_norm = tf.nn.batch_normalization
dropout = tf.nn.dropout
max_pool = tf.nn.max_pool2d
softmax = tf.nn.softmax
relu = tf.nn.relu
avg_pool = tf.nn.avg_pool2d
checkpoint = tf.train.Checkpoint
def network(x):
model = conv2d(x,filters=[1,5,5,1],strides=[1,2,2,1],padding='SAME')
model = relu(model)
model = batch_norm(model)
model = max_pool(model)
model = conv2d(model,filters=[1,4,4,1],strides=[1,2,2,1],padding='SAME')
model = relu(model)
model = batch_norm(model)
model = max_pool(model)
model = conv2d(model,filters=[1,3,3,1],strides=[1,2,2,1],padding='SAME')
model = relu(model)
model = batch_norm(model)
model = avg_pool(model)
model = dropout(model,0.3)
# i want to add the fully connect layer here then a softmax layer then another fully connected

I think what you might be looking for is the Dense layer in the keras module - https://www.tensorflow.org/api_docs/python/tf/keras/layers/Dense

Extracting the dropout mask from a keras dropout layer?

I would like to extract and store the dropout mask [array of 1/0s] from a dropout layer in a Sequential Keras model at each batch while training. I was wondering if there was a straight forward way way to do this within Keras or if I would need to switch over to tensorflow (How to get the dropout mask in Tensorflow).
Would appreciate any help! I'm quite new to TensorFlow and Keras.
There are a couple of functions (dropout_layer.get_output_mask(), dropout_layer.get_input_mask()) for the dropout layer that I tried using but got None after calling on the previous layer.
model = tf.keras.Sequential()
model.add(tf.keras.layers.Flatten(name="flat", input_shape=(28, 28, 1)))
model.add(tf.keras.layers.Dense(
512,
activation='relu',
name = 'dense_1',
kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
bias_initializer='zeros'))
dropout = tf.keras.layers.Dropout(0.2, name = 'dropout') #want this layer's mask
model.add(dropout)
x = dropout.output_mask
y = dropout.input_mask
model.add(tf.keras.layers.Dense(
10,
activation='softmax',
name='dense_2',
kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
bias_initializer='zeros'))
model.compile(...)
model.fit(...)

It's not easily exposed in Keras. It goes deep until it calls the Tensorflow dropout.
So, although you're using Keras, it's will also be a tensor in the graph that can be gotten by name (finding it's name: In Tensorflow, get the names of all the Tensors in a graph).
This option, of course will lack some keras information, you should probably have to do that inside a Lambda layer so Keras adds certain information to the tensor. And you must take extra care because the tensor will exist even when not training (where the mask is skipped)
Now, you can also use a less hacky way, that may consume a little processing:
def getMask(x):
boolMask = tf.not_equal(x, 0)
floatMask = tf.cast(boolMask, tf.float32) #or tf.float64
return floatMask
Use a Lambda(getMasc)(output_of_dropout_layer)
But instead of using a Sequential model, you will need a functional API Model.
inputs = tf.keras.layers.Input((28, 28, 1))
outputs = tf.keras.layers.Flatten(name="flat")(inputs)
outputs = tf.keras.layers.Dense(
512,
# activation='relu', #relu will be a problem here
name = 'dense_1',
kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
bias_initializer='zeros')(outputs)
outputs = tf.keras.layers.Dropout(0.2, name = 'dropout')(outputs)
mask = Lambda(getMask)(outputs)
#there isn't "input_mask"
#add the missing relu:
outputs = tf.keras.layers.Activation('relu')(outputs)
outputs = tf.keras.layers.Dense(
10,
activation='softmax',
name='dense_2',
kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123),
bias_initializer='zeros')(outputs)
model = Model(inputs, outputs)
model.compile(...)
model.fit(...)
Training and predicting
Since you can't train the masks (it doesn't make any sense), it should not be an output of the model for training.
Now, we could try this:
trainingModel = Model(inputs, outputs)
predictingModel = Model(inputs, [output, mask])
But masks don't exist in prediction, because dropout is only applied in training. So this doesn't bring us anything good in the end.
The only way for training is then using a dummy loss and dummy targets:
def dummyLoss(y_true, y_pred):
return y_true #but this might evoke a "None" gradient problem since it's not trainable, there is no connection to any weights, etc.
model.compile(loss=[loss_for_main_output, dummyLoss], ....)
model.fit(x_train, [y_train, np.zeros((len(y_Train),) + mask_shape), ...)
It's not guaranteed that these will work.

I found a very hacky way to do this by trivially extending the provided dropout layer. (Almost all code from TF.)
class MyDR(tf.keras.layers.Layer):
def __init__(self,rate,**kwargs):
super(MyDR, self).__init__(**kwargs)
self.noise_shape = None
self.rate = rate
def _get_noise_shape(self,x, noise_shape=None):
# If noise_shape is none return immediately.
if noise_shape is None:
return array_ops.shape(x)
try:
# Best effort to figure out the intended shape.
# If not possible, let the op to handle it.
# In eager mode exception will show up.
noise_shape_ = tensor_shape.as_shape(noise_shape)
except (TypeError, ValueError):
return noise_shape
if x.shape.dims is not None and len(x.shape.dims) == len(noise_shape_.dims):
new_dims = []
for i, dim in enumerate(x.shape.dims):
if noise_shape_.dims[i].value is None and dim.value is not None:
new_dims.append(dim.value)
else:
new_dims.append(noise_shape_.dims[i].value)
return tensor_shape.TensorShape(new_dims)
return noise_shape
def build(self, input_shape):
self.noise_shape = input_shape
print(self.noise_shape)
super(MyDR,self).build(input_shape)
#tf.function
def call(self,input):
self.noise_shape = self._get_noise_shape(input)
random_tensor = tf.random.uniform(self.noise_shape, seed=1235, dtype=input.dtype)
keep_prob = 1 - self.rate
scale = 1 / keep_prob
# NOTE: if (1.0 + rate) - 1 is equal to rate, then we want to consider that
# float to be selected, hence we use a >= comparison.
self.keep_mask = random_tensor >= self.rate
#NOTE: here is where I save the binary masks.
#the file grows quite big!
tf.print(self.keep_mask,output_stream="file://temp/droput_mask.txt")
ret = input * scale * math_ops.cast(self.keep_mask, input.dtype)
return ret

How to define tf.layer.dense in a for loop for creating dynamics number of hidden layers and hidden unit?

I am looking for a way where we can use tensorflow API to create a neural network with the number of layer and hidden units as user defined.
Lets say I have a neural network like this
hidden1 = tf.layers.dense(inp, units=32, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden1")
bn1 = tf.layers.batch_normalization(inputs=hidden1, name="bn1")
hidden2 = tf.layers.dense(bn1, units=16, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden2")
bn2 = tf.layers.batch_normalization(inputs=hidden2, name="bn2")
hidden3 = tf.layers.dense(bn2, units=8 , kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden3")
bn3 = tf.layers.batch_normalization(inputs=hidden3, name="bn3")
out = tf.layers.dense(bn3, units=1, kernel_initializer=tf.initializers.he_uniform(), activation=None, name="out")
In the above snippet you can notice, if I want 3 layers then I need to repeat the code for 3 times.
I am looking for a way, where we can use for loop to define the above code block. For example, if number of layers is defined as 3, then the for loop should iterate and assign units and activation value for each according to user defined.
# psuedocode
for i in range(number_of_layer):
hidden_(i) = tf.layers.dense(inp, units=32, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden_(i)")
bn_(i) = tf.layers.batch_normalization(inputs=hidden_(i), name="bn_(i)")

You can do it like this:
from keras.layers import Dense, BatchNormalization, Dropout
from keras.layers.advanced_activations import ReLU
from keras.models import Model
# Define the number of units per hidden layer
layer_widths = [128, 64, 32]
# Set up input layer
input_layer = Input(...) # change according to your input
x = input_layer.output
# Iteratively add the hidden layers
for n_neurons in layer_widths:
x = Dense(n_neurons)(x)
x = ReLU()(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
# Add the output layer
output = Dense(16, activation='softmax')(x) # change according to your output
# Stack the model together
model = Model(input, output)

Using tensorflow API
inp = tf.placeholder("float", [None,2],name="inp")
units = [32, 16, 8]
for unit in range(len(units)):
inp = tf.layers.dense(inp, units=units[unit], kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu,name="hidden" + str(unit + 1))
inp = tf.layers.batch_normalization(inputs=inp, name="bn"+str(unit + 1))
out = tf.layers.dense(inp, units=1, kernel_initializer=tf.initializers.he_uniform(), activation=None, name="out")

Combine to Keras functional models

I am trying to mimic this keras blog about fine tuning image classifiers. I would like to use the Inceptionv3 found on a fchollet repo.
Inception is a Model (functional API), so I can't just do model.add(top_model) which is reserved for Sequential.
How can I add combine two functional Models? Let's say I have
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions = Dense(4, name='final1')(x)
model1 = Model(input=inputs, output=predictions)
for the first model and
inputs_2 = Input(shape=(4,))
y = Dense(5)(l_inputs)
y = Dense(2, name='final2')(y)
predictions_2 = Dense(29)(y)
model2 = Model(input=inputs2, output=predictions2)
for the second. I now want an end-to-end that goes from inputs to predicions_2 and links predictions to inputs_2.
I tried using model1.get_layer('final1').output but I had a mismatch with types and I couldn't make it work.

I haven't tried this but according to the documentation functional models are callable, so you can do something like:
y = model2(model1(x))
where x is the data that goes to inputs and y is the result of predictions_2

I ran into this problem as well while fine tuning VGG16. Here's what worked for me and I imagine a similar approach can be taken for Inception V3. Tested on Keras 2.0.5 with Tensorflow 1.2 backend.
# NOTE: define the following variables
# top_model_weights_path
# num_classes
# dense_layer_1 = 4096
# dense_layer_2 = 4096
vgg16 = applications.VGG16(
include_top=False,
weights='imagenet',
input_shape=(224, 224, 3))
# Inspect the model
vgg16.summary()
# This shape has to match the last layer in VGG16 (without top)
dense_input = Input(shape=(7, 7, 512))
dense_output = Flatten(name='flatten')(dense_input)
dense_output = Dense(dense_layer_1, activation='relu', name='fc1')(dense_output)
dense_output = Dense(dense_layer_2, activation='relu', name='fc2')(dense_output)
dense_output = Dense(num_classes, activation='softmax', name='predictions')(dense_output)
top_model = Model(inputs=dense_input, outputs=dense_output, name='top_model')
# from: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)
block5_pool = vgg16.get_layer('block5_pool').output
# Now combine the two models
full_output = top_model(block5_pool)
full_model = Model(inputs=vgg16.input, outputs=full_output)
# set the first 15 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
# WARNING: this may not be applicable for Inception V3
for layer in full_model.layers[:15]:
layer.trainable = False
# Verify things look as expected
full_model.summary()
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
full_model.compile(
loss='binary_crossentropy',
optimizer=optimizers.SGD(lr=5e-5, momentum=0.9),
metrics=['accuracy'])
# Train the model...

I think there are 2 options depending on what you need:
(a) predictions_1 and predictions_2 matter for you. In this case, you can train a network with 2 outputs. Here an example derived from your post:
input_shape = [3, 20]
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions_1 = Dense(4, name='predictions_1')(x)
# here the predictions_1 just corresponds to your next layer's input
y = Dense(5)(predictions_1)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
# you specify here that you have 2 outputs
model = Model(input=inputs, output=[predictions_1, predictions_2])
For the .fit and .predict, you can find a lot of details in https://keras.io/getting-started/functional-api-guide/, section: Multi-input and multi-output models.
(b) you are only interested in predictions_2. In this case, you can just do:
input_shape = [3, 20]
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions_1 = Dense(4, name='predictions_1')(x)
# here the predictions_1 just corresponds to your next layer's input
y = Dense(5)(predictions_1)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
# you specify here that your only output is predictions_2
model = Model(input=inputs, output=predictions_2)
Now as regards inception_v3. You can define by yourself the architecture and modify the deep layers inside according to your needs (giving to these layers specific names in order to avoid keras naming them automatically).
After that, compile your model and loads weights (as in https://keras.io/models/about-keras-models/ see function load_weights(..., by_name=True))
# you can load weights for only the part that corresponds to the true
# inception_v3 architecture. The other part will be initialized
# randomly
model.load_weights("inception_v3.hdf5", by_name=True)
This should solve your problem. By the way, you can find extra information here: https://www.gradientzoo.com. The doc. explains several saving / loading / fine-tuning routines ;)
Update: if you do not want to redefine your model from scratch you can do the following:
input_shape = [3, 20]
# define model1 and model2 as you want
inputs1 = Input(shape=input_shape)
x = Flatten()(inputs1)
predictions_1 = Dense(4, name='predictions_1')(x)
model1 = Model(input=inputs1, output=predictions_1)
inputs2 = Input(shape=(4,))
y = Dense(5)(inputs2)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
model2 = Model(input=inputs2, output=predictions_2)
# then define functions returning the image of an input through model1 or model2
def give_model1():
def f(x):
return model1(x)
return f
def give_model2():
def g(x):
return model2(x)
return g
# now you can create a global model as follows:
inputs = Input(shape=input_shape)
x = model1(inputs)
predictions = model2(x)
model = Model(input=inputs, output=predictions)

Drawing from filitchp's answer above, assuming the output dimensions of model1 match the input dimensions of model2, this worked for me:
model12 = Model(inputs=inputs, outputs=model2(model1.output))

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Changing MobileNet Dropout After Loading - python

Related

Adding a list of layers to a keras model computation graph (functional API)

Tensorflow 2 does not have a a fully_connected function How can I simulate that?

Extracting the dropout mask from a keras dropout layer?

How to define tf.layer.dense in a for loop for creating dynamics number of hidden layers and hidden unit?

Combine to Keras functional models

Categories

Resources