Is it possible to save the class/label mapping directly inside a keras model.h5 file? - python

Using model.save() I am able to save the trained model. However, when using the model for predictions, I still need to recover the respective class/label mapping (0: 'cat', 1: 'dog', ... etc.). Currently I save the .class_indices from my train_generator and reload it to prepare my test data. However, this is quite inconvenient, since it forces me to keep the mapping file somewhere safe for future use of my saved model.h5 file.
Hence, I wonder if there is a simpler way of saving the class information, i.e. saving it inside the model file. I can't find anything on this in the Keras docs, only the post Attaching class labels to a Keras model, where someone came up with a 'workaround', but I assume there must be a better way.
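One possible workaround (a sketch, not an official Keras feature): since model.h5 is an ordinary HDF5 container, the mapping can be appended as a file attribute with h5py after calling model.save(); load_model() should ignore the extra attribute. The attribute name "class_indices" is my own choice, and model/train_generator are assumed to come from your training code:
import json
import h5py
from tensorflow import keras

model.save("model.h5")

# Attach the class mapping to the same file as a JSON string.
with h5py.File("model.h5", mode="a") as f:
    f.attrs["class_indices"] = json.dumps(train_generator.class_indices)

# Later, at prediction time, read it back from the single .h5 file:
with h5py.File("model.h5", mode="r") as f:
    class_indices = json.loads(f.attrs["class_indices"])
model = keras.models.load_model("model.h5")
labels = {v: k for k, v in class_indices.items()}  # e.g. {0: 'cat', 1: 'dog', ...}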

Related

TFLite model maker custom object detector training using tfrecord

I am trying to train a custom object detector using tflite model maker (https://www.tensorflow.org/lite/tutorials/model_maker_object_detection). I want to deploy the trained tflite model to a Coral Edge TPU, and I want to use tensorflow tfrecord files (multiple) as input for training the model, like in the Object Detection API. I tried
tflite_model_maker.object_detector.DataLoader(
tfrecord_file_patten, size, label_map, annotations_json_file=None
) but I could not get it to work. I have the following questions.
Is it possible to use a tfrecord for training, as mentioned above?
Is it also possible to pass multiple CSV files for training?
For multiple CSV files, you could probably just append one file to the other. Then you'd only have to pass one CSV file; see the sketch below.
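For example, a rough sketch of merging the files (assuming both follow the header-less Model Maker object-detection CSV layout, so plain concatenation is enough; the file names are placeholders):
import glob
from tflite_model_maker import object_detector

with open("annotations_merged.csv", "w") as merged:
    for path in sorted(glob.glob("annotations_part*.csv")):
        with open(path) as part:
            content = part.read()
            # Guard against a missing trailing newline in a part file.
            merged.write(content if content.endswith("\n") else content + "\n")

# from_csv splits the data by the TRAIN/VALIDATION/TEST column in the file.
train_data, validation_data, test_data = object_detector.DataLoader.from_csv("annotations_merged.csv")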
As for passing a tfrecord instead, this should be possible. I'm also attempting to do this, so if I get it working I'll update my post. Looking at the source, from_cache seems to be the function used internally. Following that structure, you should be able to create a DataLoader object similarly:
train_data = DataLoader(tfrecord_file_patten, meta_data['size'],
meta_data['label_map'], ann_json_file)
In this case, tfrecord_file_patten should be a tfrecord of your training data. You can construct the validation and test data the same way. This will work provided you're constructing your TFRecords correctly. There appears to be some inconsistency in how it's done in different places, so make sure you follow the same structure in creating the TFRecords as found in the Model Maker source. This worked for me. One specific thing to watch out for is to use an integer for the 'image/source_id' feature in your TFExamples. If you use a string it'll throw an error.

Save entire model but load weights only

I have defined a deep learning model my_unet() in tensorflow. During training I set save_weights=False since I wanted to save the entire model (not only the weights but the whole configuration). The generated file is path_to_model.hdf5.
However, when loading back the model I used an earlier version of my script (I forgot to update it), in which I first instantiate the model and then load only the weights:
model = my_unet()
model.load_weights("path_to_model.hdf5")
Instead of simply using model = tf.keras.models.load_model("path_to_model.hdf5") to load the entire model.
Both ways of loading the model and the weights produced the same predictions when run on some dummy data, and there were no errors.
My question is: why does loading the weights from the full-model file with model.load_weights() not cause any problem? What is the structure of the hdf5 file, and how exactly do these two ways of loading work? Where can I find this information?
A full-model .hdf5 file written by model.save() contains the architecture as a JSON string in the 'model_config' attribute plus a 'model_weights' group holding the weight arrays; model.load_weights() detects such a full-model file and reads the weights from that group, which is why both approaches give the same predictions. For the general structure of HDF5 files, see the documentation here for future reference: http://davis.lbl.gov/Manuals/HDF5-1.8.7/UG/03_DataModel.html
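A quick way to inspect the layout yourself (a minimal sketch with h5py; the file name is taken from the question, and the exact attribute names may differ between Keras versions):
import h5py

with h5py.File("path_to_model.hdf5", "r") as f:
    print(list(f.attrs))                    # e.g. 'model_config', 'training_config', 'keras_version', 'backend'
    print(list(f.keys()))                   # e.g. ['model_weights', 'optimizer_weights']
    print(list(f["model_weights"].keys()))  # one entry per layer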

Saving custom variables in Keras .h5 file

I'm developing an RNN for a project, and I need to train it on one computer and be able to predict on another. The solution I found is to save the model into a .h5 file using the code below:
... # Train the data etc....
model.save("model.h5")
My problem is that I need to store some meta-data from my training dataset and pre-processing, and be able to load it together with the model (e.g. name of the dataset file, size of the dataset file, number of characters, etc.).
I don't want to store this information in a second file (e.g. a .txt file) because I would have to use two files. I don't want to use any additional library or framework for this task.
I was thinking (brainstorming) of code like this:
model.save("model.h5", metaData={'myVariableName': myVariable})
And to load would be:
myVariable = model.load("model.h5").getMetaData('myVariableName')
I know this is not possible in the current version, and I have already read the Keras docs, but I couldn't find any efficient method to do that. Notice that what I'm asking is different from custom_object, because I want to save and load my own variables.
Is there a smarter approach to solve this problem?
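One possible approach (a sketch, not a built-in Keras feature): the .h5 file is a plain HDF5 container, so the metadata can be written into it as a file attribute with h5py, which the Keras HDF5 saving path already depends on, so there is no second file and no truly new library. The attribute name "meta_data" and the example keys are placeholders:
import json
import h5py

model.save("model.h5")

# Append the metadata after saving; load_model() should ignore the unknown attribute.
meta_data = {"dataset_file": "corpus.txt", "dataset_size": 123456, "num_characters": 87}
with h5py.File("model.h5", mode="a") as f:
    f.attrs["meta_data"] = json.dumps(meta_data)

# On the prediction machine, read everything back from the single file:
with h5py.File("model.h5", mode="r") as f:
    meta_data = json.loads(f.attrs["meta_data"])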

Accessing Estimator evaluation results via SessionRunHooks

I'm trying to modify a program that uses the Estimator class in TensorFlow (v1.10) and I would like to access the evaluation metric results every time evaluation occurs so that I can copy the checkpoint files only when a new maximum has been achieved.
One idea I had was to create a class inheriting from SessionRunHook and do the work I want in the after_run method. According to the documentation, I can specify what is passed to after_run using before_run. However, I cannot find a way to access the evaluation metric results I want from the information passed in to before_run.
I looked into the Estimator code and it appears that it writes the results to a summary file, so another idea I had was to read this back in the after_run method, but the summary API doesn't seem to provide any read operations.
Are there any other ways I can achieve what I want to do? Not using the Estimator class is not an option as that would involve drastic changes to the code I'm working with.
Checkpoints are not the same as exporting. Checkpoints are about fault-recovery and involve saving the complete training state (weights, global step number, etc.).
In your case I would recommend exporting. The exported model will be written to an export directory (e.g. "exporter"), and the serving input function specifies what the end user will be expected to provide to the prediction service.
You can use the BestExporter class to export only the best-performing models:
https://www.tensorflow.org/api_docs/python/tf/estimator/BestExporter
This class exports the serving graph and checkpoints of the best models. It also performs a model export every time the new model is better than any existing model.
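A minimal sketch of how this might be wired up (estimator, train_input_fn, eval_input_fn and the feature spec are placeholders; check the exact arguments against your TF version):
import tensorflow as tf

def serving_input_receiver_fn():
    # Placeholder feature spec; replace with the features your model expects.
    features = {"x": tf.placeholder(dtype=tf.float32, shape=[None, 28, 28], name="x")}
    return tf.estimator.export.ServingInputReceiver(features, features)

# By default BestExporter ranks models by eval loss; pass compare_fn to rank
# by a different metric (e.g. the maximum you want to track).
exporter = tf.estimator.BestExporter(
    name="best_exporter",
    serving_input_receiver_fn=serving_input_receiver_fn,
    exports_to_keep=5)

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=10000)
eval_spec = tf.estimator.EvalSpec(
    input_fn=eval_input_fn,
    exporters=exporter,
    throttle_secs=60)

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)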

Saving model in tensorflow

TensorFlow allows us to save/load a model's structure using the method tf.train.write_graph, so that we can restore it in the future to continue our training session. However, I'm wondering whether this is necessary, because I can create a module, e.g. GraphDefinition.py, and use this module to re-create the model.
So, which is the better way to save the model structure, and is there any rule of thumb that suggests which way I should use when saving a model?
First of all, you have to understand that a tensorflow graph does not contain the current weights (unless you save them there manually), so if you load the model structure from graph.pb, you will start your training from the very beginning. If you want to continue training or use your trained model, you have to save a checkpoint (using tf.train.Saver) with the values of the variables in it, not only the structure.
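For example, a minimal TF 1.x sketch of saving and restoring the variable values alongside your graph-building code (the paths are placeholders):
import tensorflow as tf

# ... build the graph (e.g. via your GraphDefinition.py module) ...
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... run training steps ...
    saver.save(sess, "checkpoints/model.ckpt")  # writes variable values + meta graph

# Later, re-build the same graph and restore the variables to continue training:
with tf.Session() as sess:
    saver.restore(sess, "checkpoints/model.ckpt")
    # ... continue training or run inference ...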
Check out this thread: Tensorflow: How to restore a previously saved model (python)
