Is there a way to save the weights and load them on another file - python

So, as simple as the title of the question says: is there a way to save the weights after training, like this
model.save_weights("path")
and then load them on another project only with
model = load_weights("path")
model.predict(x)
Is it possible?

Yes, it is possible if you use the right path.
For instance, say you have this layout:
- project1/cool.py
- project2/another_cool.py
You train with cool.py and the weights are saved inside project1's folder. Then, when you want to load them in another_cool.py, just call model.load_weights with the path ../project1/weights.h5 (on a model with the same architecture).
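A minimal sketch of that layout; build_model() is a hypothetical helper standing in for whatever defines the same architecture in both projects:

# project2/another_cool.py -- a minimal sketch; build_model() must
# recreate the exact architecture that was trained in project1/cool.py
import numpy as np
from tensorflow import keras

def build_model():
    return keras.Sequential([
        keras.layers.Dense(64, activation="relu", input_shape=(10,)),
        keras.layers.Dense(1),
    ])

model = build_model()
model.load_weights("../project1/weights.h5")  # relative path into project1
predictions = model.predict(np.random.rand(1, 10))  # dummy input batch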

If you only want to save/load the weights, you can use
model.save_weights("path/to/my_model_weights.hdf5")
and then to reload (potentially in another Python project or another interpreter; just update the path accordingly):
other_model.load_weights("path/to/my_model_weights.hdf5")
However, both models must have the same architecture (instances of the same class), and the Python/TensorFlow/Keras versions should match. See the docs for more info.
You can save both weights and architecture with model.save("path/to/my_model.hdf5") for saving to disk and keras.models.load_model("path/to/my_model.hdf5") for loading from disk (once again, the documentation provides the details).
Once loaded in memory, you can retrain your model or call predict on it; predictions should be identical between projects.
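Putting both options together, here is a small self-contained sketch using a toy Dense model (the model itself is illustrative, not from the question):

import numpy as np
from tensorflow import keras

# Toy architecture used for illustration only.
def make_model():
    return keras.Sequential([
        keras.layers.Dense(4, activation="relu", input_shape=(3,)),
        keras.layers.Dense(1),
    ])

x = np.random.rand(8, 3)

# Option 1: weights only -- the receiving side must rebuild the architecture.
model = make_model()
model.save_weights("my_model_weights.hdf5")
other_model = make_model()                      # same class/architecture
other_model.load_weights("my_model_weights.hdf5")

# Option 2: weights + architecture in one file -- no rebuild needed.
model.save("my_model.hdf5")
restored = keras.models.load_model("my_model.hdf5")

# Predictions from the original and restored models should match.
assert np.allclose(model.predict(x), restored.predict(x))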

Related

Is it possible to load only weights to TF-TRT model?

I have two models with the exact same architecture, but different weights as the same network is used for two different problems. We're using TF-TRT to optimize the model in order to use it on edge devices.
We'd like to be able to switch from one model to the other as fast as possible. As of now, we load the next model using tf.saved_model.load(); however, this reloads the entire model, including the architecture. In order to speed up the process, we'd like to simply load the weights and switch them in the model architecture.
From what I've seen, it is possible in Keras by loading a .w1 file, but we don't have such file after converting to TF-TRT.
I've found out that TRT has a Refitter object but I don't think we can use it in this case.
I'd like to know if it is possible to switch the weights of a TF-TRT model; perhaps there is something I'm missing.
Thank you for your help.
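For context, a minimal sketch of the switching approach described above (paths and the signature key are placeholders for the two TF-TRT converted models):

import tensorflow as tf

# Each switch currently reloads the whole SavedModel (architecture +
# weights), which is the cost the question wants to avoid.
model_a = tf.saved_model.load("trt_saved_model_a")
model_b = tf.saved_model.load("trt_saved_model_b")
infer = model_a.signatures["serving_default"]  # switch by swapping this reference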

Save entire model but load weights only

I have defined a deep learning model, my_unet(), in TensorFlow. During training I set save_weights_only=False since I wanted to save the entire model (not only the weights but the whole configuration). The generated file is path_to_model.hdf5.
However, when loading back the model I used the earlier version (I forgot to update it) in which I first called the model and then load the model using:
model = my_unet()
model.load_weights("path_to_model.hdf5")
Instead of simply using: model = tf.keras.models.load_model("path_to_model.hdf5") to load the entire model.
Both ways of loading the model and the weights produced the same predictions when run on some dummy data, and there were no errors.
My question is: why does loading the entire-model file with model.load_weights() not cause any problem? What is the structure of the HDF5 file, how exactly do these two ways of loading work, and where can I find this information?
For the file structure, please see the HDF5 documentation here for future reference: http://davis.lbl.gov/Manuals/HDF5-1.8.7/UG/03_DataModel.html
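One way to answer the structure question yourself is to open the file with h5py (assuming the file was written by Keras's HDF5 saver). A full-model file stores the architecture in a model_config attribute next to a model_weights group, which is why load_weights can still find the weights inside it:

import h5py

# Inspect the structure of the saved file (the path is a placeholder).
with h5py.File("path_to_model.hdf5", "r") as f:
    print(list(f.keys()))   # e.g. ['model_weights', 'optimizer_weights']
    print(list(f.attrs))    # e.g. ['model_config', 'training_config', ...]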

Save model with weights using state dict Pytorch

I have a PyTorch model class and its state dict with the weights.
I'd like to save the model directly with its weights in a .pt file using torch.save(model, PATH), but that simply saves the state dict again.
How do I save the model with the loaded_weights in it?
What I'm currently doing
lin_model = ModelClass(args)
lin_model.load_state_dict(torch.load('state_dict.pt'))
torch.save(lin_model, PATH)
I want the newly saved model to be a fully loaded .pt file. Please help me here, thanks in advance.
According to the PyTorch documentation here, when you use torch.save(model, PATH) it saves the entire model, including the class. But here is the problem: it doesn't work every time. You see, the saved model is in pickle format, and pickle does not serialize the class definition itself, only a path to the file containing the model class. So this saving method can break in various ways when used in other projects.
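A minimal sketch of both options; ModelClass below is a hypothetical stand-in for the question's class, and the state-dict route is the one generally recommended precisely because of the pickling caveat above:

import torch
import torch.nn as nn

# Hypothetical stand-in for the question's ModelClass.
class ModelClass(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(10, 1)
    def forward(self, x):
        return self.linear(x)

lin_model = ModelClass()
lin_model.load_state_dict(torch.load('state_dict.pt'))

# Option 1: pickle the whole model object (what the question wants).
# Unpickling later requires ModelClass to be importable from the same
# module path -- the fragility described above.
torch.save(lin_model, 'full_model.pt')
restored = torch.load('full_model.pt')

# Option 2 (generally recommended): save only the state dict and
# rebuild the class when loading.
torch.save(lin_model.state_dict(), 'weights_only.pt')
fresh = ModelClass()
fresh.load_state_dict(torch.load('weights_only.pt'))
fresh.eval()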

Unable to load and use multiple keras models

I'm trying to load three different models in the same process. Only the first one works as expected; the rest of them return what look like random results.
Basically the order is as follows:
- define and compile the first model
- load the previously trained weights
- rename layers
- repeat the same process for the second model
- repeat the same process for the third model
So, something like:
model1 = Model(inputs=Input(shape=input_size_im), outputs=layers_firstmodel)
model1.compile(optimizer='sgd', loss='mse')
model1.load_weights(weights_first, by_name=True)
# rename layers but didn't work
model2 = Model(inputs=Input(shape=input_size_im), outputs=layers_secondmodel)
model2.compile(optimizer='sgd', loss='mse')
model2.load_weights(weights_second, by_name=True)
# rename layers but didn't work
model3 = Model(inputs=Input(shape=input_size_im), outputs=layers_thirdmodel)
model3.compile(optimizer='sgd', loss='mse')
model3.load_weights(weights_third, by_name=True)
# rename layers but didn't work
for im in list_images:
    results_firstmodel = model1.predict(im)
    results_secondmodel = model2.predict(im)
    results_thirdmodel = model3.predict(im)
I'd like to perform inference over a bunch of images. The idea is to loop over the images, run inference with these three models, and return the results.
I have tried renaming all layers to make them unique, with no success. I also created a different graph for each network and ran the inference in a separate session for each. This works, but it's very inefficient (in addition, I have to set the weights every time because sess.run(tf.global_variables_initializer()) removes them). Each time a session is created, TensorFlow prints "creating tensorflow device (/device:GPU:0)".
I am running Tensorflow 1.4.0-rc0, Keras 2.1.1 and Ubuntu 16.04 kernel 4.14.
The OP is correct here. There is a serious bug when you try to load multiple weight files in the same script, and the answer above doesn't solve it. If you actually interrogate the weights after loading weights for multiple models in the same script, you will notice that they differ from the weights you get when loading a single model on its own. This is where the randomness the OP observes comes from.
EDIT: To solve this problem, encapsulate the model.load_weights call within a function, and the randomness you are experiencing should go away. Something goes wrong when you have multiple load_weights calls at the top level of the same script, as above. If you load each model's weights inside a function, the issue disappears.
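A sketch of that suggestion, reusing the names from the question's snippet (layers_*, weights_*, input_size_im are assumed to be defined as in the question):

from keras.models import Model
from keras.layers import Input

def build_and_load(output_layers, weights_path, input_size):
    # Building, compiling and loading inside one function keeps each
    # model's weight loading in its own scope.
    model = Model(inputs=Input(shape=input_size), outputs=output_layers)
    model.compile(optimizer='sgd', loss='mse')
    model.load_weights(weights_path, by_name=True)
    return model

model1 = build_and_load(layers_firstmodel, weights_first, input_size_im)
model2 = build_and_load(layers_secondmodel, weights_second, input_size_im)
model3 = build_and_load(layers_thirdmodel, weights_third, input_size_im)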
From the Keras docs we have this explanation for the use of load_weights:
loads the weights of the model from a HDF5 file (created by save_weights). By default, the architecture is expected to be unchanged. To load weights into a different architecture (with some layers in common), use by_name=True to load only those layers with the same name.
Therefore, if your architecture is unchanged, you should drop by_name=True or set it to False (its default value). This could be causing the inconsistencies you are facing, as your weights may not be loading at all due to your layers having different names.
Another important thing to consider is the nature of your HDF5 file, and the way you created it. If it indeed contains only the weights (created with save_weights as the docs point out) then there should be no problem in proceeding as explained before.
Now, if that HDF5 file contains both weights and architecture in the same file, then you should be loading it with keras.models.load_model instead (further reading here, if you like). If this is the case, it would also explain those inconsistencies.
As a side suggestion, I prefer to save my models using callbacks, like ModelCheckpoint, or EarlyStopping if you want to automatically determine when to stop training. This not only gives you greater flexibility when training and saving your models (as you can stop at the optimal training epoch or whenever you desire), but also makes loading those models easier, as you can simply use the load_model method to load both architecture and weights into your desired variable.
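For example (a minimal sketch; the filepath, monitor metric, and patience are placeholder choices, and x_train/y_train stand for your training data):

from keras.callbacks import ModelCheckpoint, EarlyStopping
from keras.models import load_model

callbacks = [
    # Save the full model (architecture + weights) whenever val_loss improves.
    ModelCheckpoint('best_model.hdf5', monitor='val_loss', save_best_only=True),
    # Stop training once val_loss stops improving for 5 consecutive epochs.
    EarlyStopping(monitor='val_loss', patience=5),
]
model.fit(x_train, y_train, validation_split=0.2, epochs=100,
          callbacks=callbacks)

# Later: both architecture and weights come back with load_model.
best = load_model('best_model.hdf5')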
Finally, here is one useful SO post where saving (and loading) Keras models is explained.

Saving model in tensorflow

TensorFlow allows us to save/load a model's structure using tf.train.write_graph, so that we can restore it in the future to continue a training session. However, I'm wondering whether this is necessary, because I could create a module, e.g. GraphDefinition.py, and use this module to re-create the model.
So, which is the better way to save the model structure, and is there any rule of thumb suggesting which way I should use when saving a model?
First of all, you have to understand that the TensorFlow graph does not contain the current weights (unless you manually save them into it), so if you load the model structure from graph.pb, you will start your training from the very beginning. If you want to continue training or use your trained model, you have to save a checkpoint (using tf.train.Saver) with the values of the variables in it, not only the structure.
Check out this thread: Tensorflow: How to restore a previously saved model (python)
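A minimal TF1-style sketch of the checkpoint approach (the variable and paths are placeholders):

import tensorflow as tf

w = tf.Variable(tf.random_normal([10, 1]), name='w')
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training steps ...
    saver.save(sess, './checkpoints/model.ckpt')  # variable values (+ .meta graph)

# Later, in a new session: rebuild the graph (e.g. via your
# GraphDefinition.py module), then restore the saved variable values.
with tf.Session() as sess:
    saver.restore(sess, './checkpoints/model.ckpt')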
