Tensorflow: differences on how to save a trained model


I noticed that in Python there are several ways to save a trained model.
However, I am not able to see the real difference between them.
Checkpoints
saver = tf.train.Saver()
saver.save(session, output_path)
Freezing
from tensorflow.python.framework import graph_util
input_graph_def = graph.as_graph_def()
output_graph_def = graph_util.convert_variables_to_constants(
    session, input_graph_def, output_nodes_names)
with tf.gfile.GFile(output_graph, "wb") as output_graph_file:
    output_graph_file.write(output_graph_def.SerializeToString())
SavedModelBuilder
builder = tf.saved_model.builder.SavedModelBuilder(output_path)
builder.add_meta_graph_and_variables(
    session,
    [tf.saved_model.tag_constants.SERVING],
    clear_devices=True)
builder.save()
Let's consider different scenarios: evaluation/inference, fine-tuning, serving API, export to other frameworks.
What's the best way to save a model for each of these situations? Are there rules about when to use one method or the other?
Thanks

This is not an exhaustive answer, but with modern (mid 2018) TensorFlow, you probably only need Checkpoints and SavedModels.
As pointed out in
https://www.tensorflow.org/get_started/checkpoints
"Checkpoints - a format dependent on the code that created the model"
"SavedModel - a format independent of the code that created the model"
"Freezing" largely got folded into and replaced by SavedModel.
In your training code, and for as long as you want to retain the ability to continue training or fine-tuning, checkpoints are the way to go: all the code and state needed not only to train but also to monitor that training is kept between the checkpoints and your code.
When you move over to the "serving" side (i.e., consumption), you add all the metadata needed to use the model, strip out the unneeded training elements, and go to SavedModel.
I have not personally tried to export to other frameworks from TensorFlow, just into it, so I cannot offer a good opinion on what would be best for that case.

Related

Most space/memory efficient way to save Tensorflow model for prediction only?

I have a huge Tensorflow model (the checkpoint file is 4-5 GB). I was wondering if there's a different way to save Tensorflow models, besides the checkpoint way, that is more space/memory efficient.
I know that a checkpoint file also saves all the optimizer gradients, so maybe those can be cut out too.
My model is very simple, just two matrices of embeddings. Perhaps I can save only those matrices to .npy directly?

What you want to do with the checkpoint is to freeze it. Check out this page from tensorflow's official documentation.
The freezing process strips off all extraneous information from the checkpoint that isn't used for forward inference. Tensorflow provides an easy to use script for it called freeze_graph.py.
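For the specific case raised in the question (a model that is just two embedding matrices), saving the arrays directly is also an option. The following is a minimal sketch, assuming the matrices have already been fetched as NumPy arrays (for example, by running the two variables in a session). The function names, array names, and file name are hypothetical.

```python
import os
import tempfile

import numpy as np


def save_embeddings(path, user_emb, item_emb):
    """Store only the two weight matrices, with no optimizer state or graph."""
    # savez_compressed writes a single .npz archive holding both arrays.
    np.savez_compressed(path, user_emb=user_emb, item_emb=item_emb)


def load_embeddings(path):
    """Return the two matrices in the order they were saved."""
    with np.load(path) as data:
        return data["user_emb"], data["item_emb"]


# Stand-in data; in practice these would come from the trained variables.
rng = np.random.default_rng(0)
user_emb = rng.standard_normal((1000, 64)).astype(np.float32)
item_emb = rng.standard_normal((500, 64)).astype(np.float32)

path = os.path.join(tempfile.mkdtemp(), "embeddings.npz")
save_embeddings(path, user_emb, item_emb)
restored_user, restored_item = load_embeddings(path)
```

The archive holds nothing but the two arrays, so anything that was only needed for training is left out. Casting to float16 before saving would halve the size again, at the cost of precision.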

Unable to load and use multiple keras models

I'm trying to load three different models in the same process. Only the first one works as expected; the others return seemingly random results.
Basically the order is as follows:
define and compile first model
load previously trained weights
rename layers
the same process for the second model
the same process for the third model
So, something like:
model1 = Model(inputs=Input(shape=input_size_im) , outputs=layers_firstmodel)
model1.compile(optimizer='sgd', loss='mse')
model1.load_weights(weights_first, by_name=True)
# rename layers but didn't work
model2 = Model(inputs=Input(shape=input_size_im) , outputs=layers_secondmodel)
model2.compile(optimizer='sgd', loss='mse')
model2.load_weights(weights_second, by_name=True)
# rename layers but didn't work
model3 = Model(inputs=Input(shape=input_size_im) , outputs=layers_thirdmodel)
model3.compile(optimizer='sgd', loss='mse')
model3.load_weights(weights_third, by_name=True)
# rename layers but didn't work
for im in list_images:
    results_firstmodel = model1.predict(im)
    results_secondmodel = model2.predict(im)
    results_thirdmodel = model3.predict(im)
I'd like to perform inference over a bunch of images. The idea is to loop over the images, run inference with these three models, and return the results.
I have tried renaming all layers to make them unique, with no success. I also created a separate graph for each network and ran inference in a separate session for each. This works but is very inefficient (in addition, I have to set the weights every time because sess.run(tf.global_variables_initializer()) resets them). Each time a session is created, tensorflow prints "creating tensorflow device (/device:GPU:0)".
I am running Tensorflow 1.4.0-rc0, Keras 2.1.1 and Ubuntu 16.04 kernel 4.14.

The OP is correct here. There is a serious bug when you try to load multiple weight files in the same script. The above answer doesn't solve this. If you actually inspect the weights after loading them for multiple models in the same script, you will notice that they differ from the weights you get when you load a single model on its own. This is where the randomness the OP observes is coming from.
EDIT: To solve this problem, encapsulate the model.load_weights call within a function, and the randomness you are experiencing should go away. Something appears to go wrong when there are multiple load_weights calls in the same script, as in the code above. If you load each model's weights inside a function, your issues should go away.

From the Keras docs we have this explanation of how load_weights is used:
loads the weights of the model from a HDF5 file (created by save_weights). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use by_name=True to load only those layers with the same name.
Therefore, if your architecture is unchanged, you should drop by_name=True or set it to False (its default value). This could be causing the inconsistencies you are facing: your weights are probably not being loaded because your layers have different names.
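To make the effect of name-based matching concrete, here is a toy illustration. It is not Keras's implementation: plain dicts stand in for the model's layers and for the saved file, and load_by_name is a hypothetical helper. It shows how a renamed layer is skipped and keeps its initial values, which would look like random output at prediction time.

```python
import numpy as np


def load_by_name(model_layers, saved_weights):
    """Copy saved weights into layers whose names match; skip the rest.

    Toy stand-in for name-based loading: both arguments are plain dicts
    mapping layer name -> weight array. Returns the names left untouched.
    """
    skipped = []
    for name in model_layers:
        if name in saved_weights:
            model_layers[name] = saved_weights[name].copy()
        else:
            skipped.append(name)  # layer keeps its initial values
    return skipped


# Weights as they were saved, keyed by the original layer names.
saved = {"conv1": np.ones((3, 3)), "dense1": np.full((4,), 2.0)}

# Freshly built model whose second layer was renamed before loading.
rng = np.random.default_rng(0)
model = {"conv1": rng.standard_normal((3, 3)),
         "dense1_renamed": rng.standard_normal(4)}

skipped = load_by_name(model, saved)
print(skipped)  # ['dense1_renamed']
```

In a real model, one way to check for this situation is to compare a layer's weights before and after loading (layer.get_weights() in Keras) and confirm that they actually changed.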
Another important thing to consider is the nature of your HDF5 file, and the way you created it. If it indeed contains only the weights (created with save_weights as the docs point out) then there should be no problem in proceeding as explained before.
Now, if that HDF5 contains weights and architecture in the same file, then you should be loading it with keras.models.load_model instead (further reading if you like here). If this is the case then this would also explain those inconsistencies.
As a side suggestion, I prefer to save my models using Callbacks, like ModelCheckpoint, or EarlyStopping if you want to automatically determine when to stop training. This not only gives you greater flexibility when training and saving your models (as you can stop them at the optimal training epoch or whenever you like), but also makes loading those models easy, as you can simply use the load_model method to load both architecture and weights into your desired variable.
Finally, here is one useful SO post where saving (and loading) Keras models is explained.

Loading SavedModel is a lot slower than loading a tf.train.Saver checkpoint

I changed from tf.train.Saver to the SavedModel format which surprisingly means loading my model from disk is a lot slower (instead of a couple of seconds it takes minutes). Why is this and what can I do to load the model faster?
I used to do this:
# Save model
saver = tf.train.Saver()
save_path = saver.save(session, model_path)
# Load model
saver = tf.train.import_meta_graph(model_path + '.meta')
saver.restore(session, model_path)
But now I do this:
# Save model
builder = tf.saved_model.builder.SavedModelBuilder(model_path)
builder.add_meta_graph_and_variables(session, [tf.saved_model.tag_constants.TRAINING])
builder.save()
# Load model
tf.saved_model.loader.load(session, [tf.saved_model.tag_constants.TRAINING], model_path)

I am by no means an expert in Tensorflow, but if I had to take a guess as to why this is happening, I would say that:
tf.train.Saver() saves a complete meta-graph. Therefore, all the information needed to perform any operations contained in your graph is already there. All tensorflow needs to do to load the model is insert the meta-graph into the default/current graph, and you're good to go.
SavedModelBuilder(), on the other hand, creates a language-agnostic representation of your operations and variables behind the scenes. This means that the loading method has to extract all the information, then recreate all the operations and variables from your previous graph, and insert them into the default/current graph.
Depending on the size of your graph, recreating everything that it contained might take some time.
Concerning the second question, as @J H said, if there is no reason for you to use one strategy over the other, and time is of the essence, then just go with the fastest one.

what can I do to load the model faster?
Switch back to tf.train.Saver, as your question shows no motivation for using SavedModelBuilder and makes it clear that elapsed time matters to you. Alternatively, an MCVE that reproduces the timing issue would allow others to collaborate with you on profiling, diagnosing, and fixing any perceived performance issue.
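A reproducible timing comparison could start from something like the sketch below. It uses only the standard library; time_call and fake_load are hypothetical names, and fake_load is a placeholder that sleeps instead of loading a model. In a real test, it would be replaced by functions wrapping the two load paths from the question.

```python
import time


def time_call(fn, *args, repeats=3, **kwargs):
    """Run fn several times and return the best wall-clock time in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return best


def fake_load(delay):
    """Placeholder for a real load function such as saver.restore(...)."""
    time.sleep(delay)


fast = time_call(fake_load, 0.01)
slow = time_call(fake_load, 0.05)
print(f"fast: {fast:.3f}s  slow: {slow:.3f}s")
```

When timing real loaders, each repeat should probably run in a fresh graph and session so that earlier runs do not affect later ones.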

What are tensorflow summaries? How exactly are they utilized when using a tf model to make predictions?

Posting here as I couldn't find an explicit answer in tensorflow's documentation. I am curious about the actual purpose of the summary files in tensorflow. After training, I load the tensorflow model (the model file and the meta file) that were saved by:
tf.train.Saver()
There seems to be no need for me to keep the summary files, apart from logging training information; I can use my models to predict without referencing the summaries.
Are the summary files merely log files of the training runs (accuracy and loss)? Is there any other purpose that these files serve?

Saving model in tensorflow

Tensorflow allows us to save/load a model's structure using the method tf.train.write_graph, so that we can restore it in the future to continue our training session. However, I'm wondering whether this is necessary, because I can create a module, e.g. GraphDefinition.py, and use this module to re-create the model.
So, which is the better way to save the model structure, and is there any rule of thumb that suggests which way I should use when saving a model?

First of all, you have to understand that a tensorflow graph does not contain the current weights (unless you save them there manually), so if you load only the model structure from graph.pb, you will start training from the very beginning. If you want to continue training or use your trained model, you have to save a checkpoint (using tf.train.Saver) with the values of the variables in it, not only the structure.
Check out this thread: Tensorflow: How to restore a previously saved model (python)
