I have a PyTorch model class and its state dict with the weights.
I'd like to save the model together with its weights in a single .pt file using torch.save(model, PATH), but that simply saves the state dict again.
How do I save the model with the loaded weights in it?
What I'm currently doing:
lin_model = ModelClass(args)
lin_model.load_state_dict(torch.load('state_dict.pt'))
torch.save(lin_model, PATH)
I want the newly saved model to be a fully loaded .pt file. Please help me here, thanks in advance.
According to the PyTorch documentation here, torch.save(model, PATH) saves the entire model, including the class. But here is the problem: it doesn't work every time. The saved model is a pickle file, and pickle does not serialize the model class itself, just a reference to the file containing the class definition. So this saving method can break in various ways when used in other projects.
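One workaround (a sketch on my side, not from the docs above) is TorchScript: torch.jit.script compiles the model into a self-contained archive that bundles the serialized model code with the weights, so it can be loaded without the class definition, assuming ModelClass is scriptable:
import torch

lin_model = ModelClass(args)
lin_model.load_state_dict(torch.load('state_dict.pt'))

scripted = torch.jit.script(lin_model)  # compile model code + weights
scripted.save('model_scripted.pt')      # one self-contained .pt file

# In another project, no ModelClass definition is needed:
model = torch.jit.load('model_scripted.pt')
model.eval()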
Related
I have fine-tuned ViTForImageClassification (from the Hugging Face transformers library), and saving the model in the default way creates a folder with the following two files:
config.json (should contain the structure of the model)
tf_model.h5 (contains the weights)
But the point is that the .h5 doesn't contain the model, just the weights.
Is there a way to get the whole model in one .h5 file? It's possible in Keras, so I think there should be a solution for it!
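For reference, the standard transformers round trip keeps the folder format rather than a single .h5. A sketch, assuming the TF variant TFViTForImageClassification (the tf_model.h5 file suggests that) and an arbitrary directory name:
from transformers import TFViTForImageClassification

model.save_pretrained("./my_vit")  # writes config.json + tf_model.h5

# reload architecture and weights together from the folder
model = TFViTForImageClassification.from_pretrained("./my_vit")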
So, as simple as the title of the question suggests: is there a way to save the weights after training like this
model.save_weights("path")
and then load them in another project with only
model = load_weights("path")
model.predict(x)
Is it possible?
Yes, it is possible if you use the right path.
For instance, say you have this layout:
- project1/cool.py
- project2/another_cool.py
You train with cool.py and the model is saved inside project1's folder. Then, to load the weights in another_cool.py, just call the load function with the relative path ../project1/weigh.h5, as in the sketch below.
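A minimal sketch of that layout (build_model is a hypothetical function that recreates the same architecture used in cool.py):
# in project2/another_cool.py
model = build_model()                       # same architecture as in cool.py
model.load_weights("../project1/weigh.h5")  # relative path into project1
model.predict(x)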
If you only want to save/load the weights, you can use
model.save_weights("path/to/my_model_weights.hdf5")
and then reload them (potentially in another Python project or another interpreter; you just have to update the path accordingly):
other_model.load_weights("path/to/my_model_weights.hdf5")
However, both models should have the same architecture (instances of the same class), and the Python/TensorFlow/Keras versions should be the same. See the docs for more info.
You can save both weights and architecture via model.save("path/to/my_model.hdf5") for saving to disk and keras.models.load_model("path/to/my_model.hdf5") for loading from disk (once again, the documentation provides details).
Once loaded in memory, you can retrain your model or call predict on it; predictions should be identical between projects.
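Putting both options together in one sketch (MyModelClass is a stand-in for whatever architecture you use; the paths are the ones from above):
from tensorflow import keras

# Option 1: weights only; the receiving side rebuilds the architecture first
model.save_weights("path/to/my_model_weights.hdf5")
other_model = MyModelClass()  # hypothetical: same class/architecture
other_model.load_weights("path/to/my_model_weights.hdf5")

# Option 2: whole model (architecture + weights + optimizer state) in one file
model.save("path/to/my_model.hdf5")
restored = keras.models.load_model("path/to/my_model.hdf5")
restored.predict(x)  # predictions should match the original model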
I have defined a deep learning model my_unet() in TensorFlow. During training I set save_weights_only=False, since I wanted to save the entire model (not only the weights but the whole configuration). The generated file is path_to_model.hdf5.
However, when loading the model back I used an earlier version of my code (I forgot to update it), in which I first build the model and then load the weights:
model = my_unet()
model.load_weights("path_to_model.hdf5")
instead of simply using model = tf.keras.models.load_model("path_to_model.hdf5") to load the entire model.
Both ways of loading provided the same predictions when run on some dummy data, and there were no errors.
My question is: why does loading the entire model file with model.load_weights() not cause any problem? What is the structure of the hdf5 file, and how exactly do these two ways of loading work? Where can I find this information?
For future reference, you can see the HDF5 documentation here: http://davis.lbl.gov/Manuals/HDF5-1.8.7/UG/03_DataModel.html
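As for why load_weights works on a full-model file: a Keras full-model HDF5 file stores the architecture as a model_config attribute and the weights in a model_weights group, so load_weights simply reads the weights group and ignores the rest. You can inspect the file yourself with h5py (a sketch; the path comes from the question):
import h5py

with h5py.File("path_to_model.hdf5", "r") as f:
    print(list(f.keys()))               # typically ['model_weights', 'optimizer_weights']
    print(f.attrs.get("model_config"))  # JSON string describing the architecture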
I have been searching for a method to do this for so long, and I cannot find an answer. Most threads I found are of people wanting to do the opposite.
Backstory:
I am experimenting with some pre-trained models provided by the tensorflow/models repository. The models are saved as .pb frozen graphs. I want to fine-tune some of these models by changing the final layers to suit my application.
Hence, I want to load the models inside a Jupyter notebook as a normal Keras h5 model.
How can I do that? Or do you have a better way to do so?
Thanks.
Seems like all you would have to do is download the model files and store them in a directory, for example c:\models, then load the model:
import os
import tensorflow as tf

model = tf.keras.models.load_model(r'c:\models')  # loads the model from the directory
model.summary()  # prints out the model layers

# modify the model as you typically do for transfer learning
# compile the changed model
# train the model

# save the trained model as a .h5 file
save_dir = r'path to the directory you want to save the model to'
model_identifier = 'abcd.h5'  # for abcd use whatever identification you want
save_path = os.path.join(save_dir, model_identifier)
model.save(save_path)
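To verify the round trip, the saved .h5 can be reloaded the same way (the same caveats about matching TensorFlow/Keras versions apply):
reloaded = tf.keras.models.load_model(save_path)
reloaded.summary()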
Currently I am loading a previously trained model using
with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint('./'))
which loads the checkpoint recorded in the "checkpoint" file that was created when saving the model. However, the "checkpoint" file always refers to the last trained model, so if I want to load another model I have to manually edit it to change the model name.
My question is: how can I restore a trained model other than the last one I saved, without manually editing the "checkpoint" file?
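For what it's worth, tf.train.Saver.restore also accepts an explicit checkpoint prefix, which bypasses the "checkpoint" file entirely (a sketch; the prefix name is hypothetical):
with tf.Session() as sess:
    # pass the checkpoint prefix directly instead of latest_checkpoint('./')
    saver.restore(sess, './my_model.ckpt-1000')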
You can use jsonpickle to save variables and models and later load them.
For example:
import jsonpickle
from sklearn.ensemble import RandomForestClassifier

sklearn_model = RandomForestClassifier()
sklearn_model.fit(x, y)  # x, y: your training data

model_object = jsonpickle.encode(sklearn_model)  # serialize to a JSON string
model = jsonpickle.decode(model_object)          # restore the model
model.predict(new_x)
All you need now is to save the model whenever you want and load it as needed. model_object is a JSON string that can be written to a file. You can read more about the package here.
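A quick sketch of that file round trip (the filename is arbitrary):
with open("model.json", "w") as f:
    f.write(model_object)  # model_object is the JSON string from above

with open("model.json") as f:
    model = jsonpickle.decode(f.read())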