So I didn't write my code in tf.keras, and according to this tutorial for fine-tuning with a pretrained NN: https://keras.io/guides/transfer_learning/#freezing-layers-understanding-the-trainable-attribute,
I have to set the parameter training=False when calling the pretrained model, so that when I later unfreeze it for fine-tuning, Batch Normalization doesn't destroy my model. But how do I do that in Keras (remember: I didn't write it in tf.keras)? Is it even necessary to do that in Keras?
The code:
from keras.applications import Xception
from keras.layers import Input, GlobalAveragePooling2D

def baseline_model():
    # load the pretrained base and freeze all of its layers
    pretrained_model = Xception(include_top=False, weights="imagenet")
    for layer in pretrained_model.layers:
        layer.trainable = False
    general_input = Input(shape=(256, 256, 3))
    # this call raises the TypeError below in plain Keras
    x = pretrained_model(general_input, training=False)
    x = GlobalAveragePooling2D()(x)
    ...
This gives me the following error when calling model = baseline_model():
TypeError: call() got an unexpected keyword argument 'training'
What is the best way to do this? I tried rewriting everything in tf.keras, but errors kept popping up everywhere when I did...
EDIT: My Keras version is 2.3.1 and my TensorFlow version is 2.2.0.
EDITED my previous answer after doing some additional research:
I did some reading, and it seems there is some trickery in how the BatchNorm layer behaves when frozen. This is a good thread discussing it: github.com/keras-team/keras/issues/7085. It seems the training=False parameter is necessary to correctly freeze a BatchNorm layer, and support for it was added in Keras 2.1.3, so my advice is to make sure your Keras/TF versions are at least that recent.
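For completeness, here is a minimal sketch of the pattern the transfer-learning guide linked in the question recommends, written in tf.keras (which does accept the training argument when calling a model). The input size matches the question; the Dense head is just a placeholder:

import tensorflow as tf

base = tf.keras.applications.Xception(include_top=False, weights="imagenet")
base.trainable = False  # freeze the whole base model

inputs = tf.keras.Input(shape=(256, 256, 3))
# training=False keeps the BatchNormalization layers in inference mode,
# even after the base is later unfrozen for fine-tuning
x = base(inputs, training=False)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(1, activation="sigmoid")(x)  # placeholder head
model = tf.keras.Model(inputs, outputs)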
I'm using Keras and TF 2.0.
I'm trying to apply ResNet50 pre-trained on ImageNet to a different problem (pneumonia binary classification), and I've found that there is some discussion online about how to properly set the batch normalization layers for fine-tuning.
My question is whether I should freeze all the layers in the model, or skip the batch normalization layers to do proper fine-tuning. By this I mean, if resnet is the pre-trained model:
resnet.trainable = False
or
for layer in resnet.layers:
    if not isinstance(layer, keras.layers.BatchNormalization):
        layer.trainable = False
I'm reaching 97% test accuracy, but I think it should perform better on such a simple task. Which way of freezing should I use?
I have a TensorFlow 2.x functional model whose first layers are from another pretrained model. I want those layers to remain frozen, so I have used tf.stop_gradient on the pretrained head to stop them from learning. Below is a minimal example of my network:
head = load_my_cool_pretrained_representation_model()
x = tf.keras.layers.Dense(10000)(tf.stop_gradient(head.output))
x = tf.keras.layers.Dense(1)(x)
model = tf.keras.Model(inputs=head.inputs, outputs=x)
model.compile(loss='mse', optimizer=tf.keras.optimizers.Adam())
When I use model.fit() I get the following warning:
WARNING:tensorflow:Gradients do not exist for variables ['cool_rep_layer/embeddings:0', ...] when minimizing the loss.
I do not need the warning as I want those layers to not have gradients. How can I suppress this warning? I have looked at this answer, but I do not want to get into gradient tape for this model.
As per noober's comment, I just added
import logging
logging.getLogger('tensorflow').setLevel(logging.ERROR)
to get rid of the warnings. This worked for me.
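If you prefer not to touch the logging module directly, TF 2.x also exposes the same 'tensorflow' logger through tf.get_logger(), so the following should be equivalent:

import tensorflow as tf

# same effect as the logging-based snippet above
tf.get_logger().setLevel('ERROR')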
I trained a Pix2Pix generator from the TensorFlow 2.0 tutorial and I exported it to TFLite this way:
converter = tf.lite.TFLiteConverter.from_keras_model(generator)
tflite_model = converter.convert()
open("facades.tflite", "wb").write(tflite_model)
Unfortunately, I have problems that seem to come from tf.keras.layers.BatchNormalization when I try to run inference on it.
First, an inference only returns NaN values. This can be resolved by disabling the fused implementation; a sketch of one way to do that follows.
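This block is only illustrative (it is not the tutorial's exact code): assuming the generator is built from downsample/upsample-style helpers as in the tutorial, the BatchNormalization layers can be constructed with the fused kernel disabled:

import tensorflow as tf

def downsample(filters, size):
    block = tf.keras.Sequential()
    block.add(tf.keras.layers.Conv2D(filters, size, strides=2,
                                     padding="same", use_bias=False))
    # fused=False forces the non-fused implementation, which avoids
    # the NaN outputs seen after TFLite conversion
    block.add(tf.keras.layers.BatchNormalization(fused=False))
    block.add(tf.keras.layers.LeakyReLU())
    return block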
Secondly, the BatchNormalization layer behaves differently depending on whether we are in training or prediction mode. The tutorial explicitly states to make predictions in training=True mode. I don't know how to do this with the TFLite model.
One proposed solution is to replace the BatchNormalization layers with InstanceNormalization, which can be found in tensorflow_addons.
The conversion to TFLite then completes without any problem, but there is still a problem with the inference:
When I call invoke on the interpreter, it crashes with a SEGFAULT. According to the stack trace, it comes from the SquaredDifference operator of the InstanceNormalization layer.
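For reference, the inference is run with the standard tf.lite.Interpreter API, roughly like the sketch below (the zero-filled input is only a placeholder):

import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="facades.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# dummy input with the expected shape and dtype
dummy = np.zeros(input_details[0]["shape"], dtype=np.float32)
interpreter.set_tensor(input_details[0]["index"], dummy)
interpreter.invoke()  # this is the call that segfaults with InstanceNormalization
result = interpreter.get_tensor(output_details[0]["index"])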
Has anyone managed to convert this TensorFlow 2.0 model to TFLite and run inference on it correctly? How? Thank you.
PS: I would prefer a solution with BatchNormalization, because it is a standard layer in Keras and can therefore also work with TensorFlow.js.
I am trying to use the pre-trained InceptionV3 model. However, I want to remove the initial five layers and add my custom layers. How can I do that? I tried model.layers.pop(0), but that alone will not solve the problem.
Edit:
tf.keras does not help either as mentioned in the first answer:
model.layers.pop() doesn't work the same way in tf.keras as it does in Keras. In tf.keras, model.layers is a view of the model. You can't remove layers, but what you can do is define the layer whose output you want. For example:
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Model

base_model = InceptionV3(input_shape=shape, weights="imagenet", include_top=True)
# you don't want the last five layers:
base_model_output = base_model.layers[-6].output
# new layers
outputs = Dense(....)(base_model_output)
model = Model(base_model.input, outputs)
Since the first few layers starting from the input are being changed, the pretrained weights cannot be reused for them. So, instead of attempting complex model surgery, the architecture can be taken directly from the source below and modified accordingly:
https://github.com/keras-team/keras-applications/blob/master/keras_applications/inception_v3.py
I have been working with Keras for a week or so. I know that Keras can use either TensorFlow or Theano as a backend. In my case, I am using TensorFlow.
So I'm wondering: is there a way to write a NN in Keras, and then print out the equivalent version in TensorFlow?
MVE
For instance, suppose I write:
from keras.models import Sequential
from keras.layers import Dense

# create seq model
model = Sequential()
# add layers
model.add(Dense(100, input_shape=(10,), activation='relu'))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(optimizer='adam', loss='mse')
# fit
model.fit(Xtrain, ytrain, epochs=100, batch_size=32)
# predict
ypred = model.predict(Xtest, batch_size=32)
# evaluate
result = model.evaluate(Xtest, ytest, batch_size=32)
This code might be wrong, since I just started, but I think you get the idea.
What I want to do is write this code, run it (or maybe not even run it!), and then have a function or something that will produce the TensorFlow code Keras has written to do all these calculations.
First, let's clarify some of the language in the question. TensorFlow (and Theano) use computational graphs to perform tensor computations. So, when you ask if there is a way to "print out the equivalent version" in Tensorflow, or "produce TensorFlow code," what you're really asking is, how do you export a TensorFlow graph from a Keras model?
As the Keras author states in this thread,
When you are using the TensorFlow backend, your Keras code is actually building a TF graph. You can just grab this graph.
Keras only uses one graph and one session.
However, he links to a tutorial whose details are now outdated. But the basic concept has not changed.
We just need to:
Get the TensorFlow session
Export the computation graph from the TensorFlow session
Do it with Keras
The keras_to_tensorflow repository contains a short example (in an iPython notebook) of how to export a model from Keras for use in TensorFlow. This is basically done using TensorFlow under the hood. It isn't a clearly written example, but I'm throwing it out there as a resource.
Do it with TensorFlow
It turns out we can actually get the TensorFlow session that Keras is using from TensorFlow itself, using the tf.contrib.keras.backend.get_session() function. It's pretty simple to do - just import and call. This returns the TensorFlow session.
Once you have the TensorFlow session variable, you can use the SavedModelBuilder to save your computational graph (guide + example to using SavedModelBuilder in the TensorFlow docs). If you're wondering how the SavedModelBuilder works and what it actually gives you, the SavedModelBuilder Readme in the Github repo is a good guide.
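Putting those two steps together, a minimal sketch might look like the following (the export directory is a placeholder, and the Keras model must be built and compiled before the session is exported):

import tensorflow as tf

# build and compile your Keras model first, then:

# 1. Get the TensorFlow session that Keras is using
sess = tf.contrib.keras.backend.get_session()

# 2. Export the computation graph and variables from that session
builder = tf.saved_model.builder.SavedModelBuilder('./exported_model')  # placeholder path
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
builder.save()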
P.S. - If you are planning on heavy usage of TensorFlow + Keras in combination, have a look at the other modules available in tf.contrib.keras
So you want to use a different function for your neurons instead of WX+b. Well, in TensorFlow you explicitly calculate this product, so for example you do
y_ = tf.matmul(X, W)
so you simply have to write your formula instead and let the network learn. It should not be difficult to implement.
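A minimal sketch of that idea, in TF 1.x style to match the line above (the shapes and the custom formula are only illustrative assumptions):

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 10])
W = tf.Variable(tf.random_normal([10, 5]))
b = tf.Variable(tf.zeros([5]))

# standard neuron: y = XW + b
y_standard = tf.matmul(X, W) + b

# hypothetical custom neuron: any differentiable formula works,
# e.g. squaring the affine transform before the activation
y_custom = tf.nn.relu(tf.square(tf.matmul(X, W) + b))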
In addition, what you are trying to do (according to the paper you link) is called batch normalization and is relatively standard. The idea is that you normalize your intermediate activations (in the different layers). Check for example the original paper, https://arxiv.org/abs/1502.03167, or https://bcourses.berkeley.edu/files/66022277/download?download_frd=1&verifier=oaU8pqXDDwZ1zidoDBTgLzR8CPSkWe6MCBKUYan7
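A minimal sketch of applying batch normalization to an intermediate layer, again in TF 1.x style (the layer sizes are placeholders):

import tensorflow as tf

X = tf.placeholder(tf.float32, shape=[None, 10])
is_training = tf.placeholder(tf.bool)

hidden = tf.layers.dense(X, 64, activation=tf.nn.relu)
# normalize the intermediate activations; behaves differently in training vs. inference
hidden = tf.layers.batch_normalization(hidden, training=is_training)
output = tf.layers.dense(hidden, 1)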
Hope that helps,
Umberto