I am trying to train a ResNet network using the Keras backend in TensorFlow. The feed dictionary for each batch update is written as:
feed_dict= {x:X_train[indices[start:end]], y:Y_train[indices[start:end]], keras.backend.learning_phase():1}
I am using the Keras backend (keras.backend.set_session(sess)) because the original ResNet network is defined with Keras. As the model contains dropout and batch_norm layers, it requires a learning phase to distinguish between training and testing.
I observe that whenever I set keras.backend.learning_phase():1, the model train/test accuracy hardly increases above 10%. In contrast, if the learning phase is not set, i.e. the feed dictionary is defined as:
feed_dict= {x:X_train[indices[start:end]], y:Y_train[indices[start:end]]}
then, as expected, the model accuracy keeps increasing with epochs in the standard way.
I would appreciate it if someone could clarify whether the learning phase is unnecessary here or whether something else is wrong. The Keras 2.0 documentation seems to suggest using the learning phase with dropout and batch_norm layers.
Set the learning phase to 1 (training):
K.set_learning_phase(1)
Then you need to set training=False for all batch normalization layers:
for layer in model.layers:
    if layer.name.startswith('bn'):
        layer.call(layer.input, training=False)
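For reference, a minimal sketch of how such a manual training/evaluation step can feed the learning phase; x, y, train_op, accuracy and the data arrays are placeholders taken from or assumed around the question, not a tested setup:

import tensorflow as tf
import keras.backend as K

sess = tf.Session()
K.set_session(sess)

# Training step: feed learning_phase = 1 so dropout/BN behave as in training.
sess.run(train_op, feed_dict={x: X_train[indices[start:end]],
                              y: Y_train[indices[start:end]],
                              K.learning_phase(): 1})

# Evaluation: feed learning_phase = 0 so dropout is disabled and BN uses
# its moving averages.
acc = sess.run(accuracy, feed_dict={x: X_test, y: Y_test,
                                    K.learning_phase(): 0})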
I am new to tensorflow-hub and came across the ELMo model (https://www.tensorflow.org/hub/modules/google/elmo/2).
According to the original paper, the ELMo representation is a weighted average of hidden-state activations, and these weights are trainable and task specific. As expected, I can see the 4 trainable parameters when I use tf.trainable_variables(). How exactly do I train these variables in TensorFlow?
The paper just mentions that these weights are trainable, but who trains them: do I train them, or does the ELMo model train them itself? The paper seems to suggest that I should be training them. If so, how do I do that in TensorFlow?
You can start by importing the module into your model with trainable=True, then train the model as you would any other TF model. In the process of training the model, the weights imported as part of the module will be trained as well. You can also use this tutorial as a good starting point and just replace the nnlm embedding with ELMo.
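As a rough sketch, assuming the TF1-style tensorflow_hub API (the sentences and the final print are purely illustrative):

import tensorflow as tf
import tensorflow_hub as hub

# trainable=True exposes the task-specific mixing weights of the module
# in tf.trainable_variables(), so whatever optimizer you attach to your
# task loss will update them together with the rest of your model.
elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

sentences = ["the cat is on the mat", "dogs are in the fog"]
embeddings = elmo(sentences, signature="default", as_dict=True)["elmo"]

with tf.Session() as sess:
    sess.run([tf.global_variables_initializer(), tf.tables_initializer()])
    print([v.name for v in tf.trainable_variables()])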
I trained a model with several layers, then for each layer in model.layers set
layer.trainable = False
I added several layers to this model, called
model.compile(...)
And trained this new model for several epochs with part of the layers frozen.
Later I decided to unfreeze layers and ran
for layer in model.layers:
    layer.trainable = True
model.compile(...)
When I start training the model with the unfrozen layers, the loss value is very high, even though I just wanted to continue training from the previously learned weights. I also checked that after model.compile(...) the model still predicts well (the previously learned weights are not reset), but as soon as training starts everything gets 'erased' and I start from scratch.
Could someone clarify whether this behaviour is expected? How do I recompile the model without starting from scratch?
P.S. I also tried manually saving the weights and assigning them back to the newly compiled model using layer.get_weights() and layer.set_weights().
I used the same compile parameters (the same optimizer and the same loss).
You might need to lower your learning rate when you start fine-tuning the pre-trained layers. For example, a learning rate of 0.01 might work for your new dense layers (the top) with all other layers set to untrainable, but when setting all layers to trainable you might need to reduce the learning rate to, say, 0.001. There is no need to manually copy or set weights.
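As a hedged sketch of what that could look like (the optimizer, loss, metrics and data names here are placeholders, not taken from your code):

from keras.optimizers import Adam

# Unfreeze everything, then recompile with a smaller learning rate
# before continuing training.
for layer in model.layers:
    layer.trainable = True

model.compile(optimizer=Adam(lr=1e-3),               # e.g. 1e-3 instead of 1e-2
              loss='categorical_crossentropy',       # placeholder loss
              metrics=['accuracy'])

model.fit(X_train, Y_train, epochs=5, batch_size=32)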
I'm building a CNN model using TensorFlow, without using any front-end APIs such as Keras. I'm creating a VGG-16 model, using the pre-trained weights, and want to fine-tune the last layers to serve my purpose.
Following the tutorial here, http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
I re-created the training script and modified it as per my requirements. However, training does not progress: the training accuracy is stuck at 50.00% and the validation accuracy keeps cycling through the same repeating pattern of numbers.
Attached is the screenshot of the same.
I have been stuck on this for days now and can't seem to find the error. Any help is appreciated.
The code is pretty long, so here is the gist file for it.
Your cross-entropy is wrong: you are comparing your logits with the softmax of your logits.
This:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_pred)
Should be:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
Some things to note: I would not train on a data point and then evaluate on the same data point; your training accuracy is probably going to be biased by doing so. Another point to note is that tf.argmax(tf.nn.softmax(logits)) is the same as tf.argmax(logits).
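A small, self-contained illustration of that last point (TF1-style session code with arbitrary values):

import tensorflow as tf

logits = tf.constant([[2.0, 1.0, 0.1],
                      [0.3, 0.2, 3.5]])

pred_from_logits = tf.argmax(logits, axis=1)
pred_from_softmax = tf.argmax(tf.nn.softmax(logits), axis=1)

with tf.Session() as sess:
    # Both print [0 2]: softmax is monotonic, so it never changes the argmax.
    print(sess.run(pred_from_logits))
    print(sess.run(pred_from_softmax))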
I have trained a pre-trained ResNet18 model in PyTorch and saved it. While testing, the model gives different accuracy for different mini-batch sizes. Does anyone know why?
Yes, I think so.
ResNet contains batch normalisation layers. At evaluation time you need to fix these; otherwise the running means keep being adjusted after each batch is processed, which gives you different accuracy.
Try setting:
model.eval()
before evaluation. Note that before getting back into training you should call model.train().
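A minimal sketch, using torchvision's ResNet18 and a random input purely for illustration:

import torch
import torchvision.models as models

model = models.resnet18(pretrained=True)

model.eval()                      # fixes BatchNorm running stats, disables dropout
with torch.no_grad():             # optional: skip building the autograd graph
    outputs = model(torch.randn(8, 3, 224, 224))

model.train()                     # switch back before resuming training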
I trained a model with a batch normalization layer. The code is like this:
In the training phase:
model=Sequential()
model.add()
...
k.set_learning_phase(1)
ModelCheckpoint(weights_file)
model.fit()
At inference time:
k.set_learning_phase(0)
model.load_weights(weights_file)
model.predict_classes()
...
The Keras version is 2.0.8. Is that right, or do I need some special code to compute the BN statistics after training, like when using SegNet in Caffe?
No, you don't need to do anything special when using BatchNormalization or Dropout layers. Keras already tracks the learning/testing phases, so when using predict or predict_classes, it does the right thing.
You do not even need to set the learning phase manually; Keras already does it.
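A toy example to illustrate (the architecture and data are made up):

import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization

model = Sequential([
    Dense(32, activation='relu', input_shape=(10,)),
    BatchNormalization(),
    Dropout(0.5),
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy')

X = np.random.rand(64, 10)
y = np.random.randint(0, 2, size=(64, 1))

model.fit(X, y, epochs=1, verbose=0)   # learning phase handled internally
preds = model.predict(X)               # dropout off, BN uses moving averages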