I am fine-tuning XLM-RoBERTa for text classification with TensorFlow/Keras. I trained the model on a Google Colab GPU and saved it with the Keras save method, as shown below.
model.save("/content/drive/MyDrive/TrainedModel")
loaded_model = tf.keras.models.load_model("/content/drive/MyDrive/TrainedModel")
The SavedModel is written, but loading it on a CPU gives the error below. How can I load the model to run it?
/usr/local/lib/python3.7/dist-packages/tensorflow/python/training/py_checkpoint_reader.py in get_tensor(self, tensor_str)
65 try:
66 return CheckpointReader.CheckpointReader_GetTensor(
---> 67 self, compat.as_bytes(tensor_str))
68 # TODO(b/143319754): Remove the RuntimeError casting logic once we resolve the
69 # issue with throwing python exceptions from C++.
IndexError: Read less bytes than requested
I got the warning below when saving the model with the function above. The saved model size is 7.43 MB.
WARNING:absl:Found untraced functions such as encoder_layer_call_fn, encoder_layer_call_and_return_conditional_losses, embeddings_layer_call_fn, embeddings_layer_call_and_return_conditional_losses, dense_layer_call_fn while saving (showing 5 of 422). These functions will not be directly callable after loading.
INFO:tensorflow:Assets written to: /content/drive/MyDrive/TrainedModel/assets
INFO:tensorflow:Assets written to: /content/drive/MyDrive/TrainedModel/assets
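An XLM-RoBERTa checkpoint normally takes hundreds of megabytes, so a 7.43 MB SavedModel together with "Read less bytes than requested" may mean the variables were not fully written to Drive. This is only a guess from the numbers in the question. A minimal sketch for checking what actually landed on disk is below; the helper names file_sizes and saved_model_size_mb are made up for illustration.

```python
import os


def file_sizes(saved_model_dir):
    """Map each file under the directory (relative path) to its size in bytes."""
    sizes = {}
    for root, _dirs, files in os.walk(saved_model_dir):
        for name in files:
            full_path = os.path.join(root, name)
            sizes[os.path.relpath(full_path, saved_model_dir)] = os.path.getsize(full_path)
    return sizes


def saved_model_size_mb(saved_model_dir):
    """Return the total size of the directory contents in MB."""
    return sum(file_sizes(saved_model_dir).values()) / (1024 * 1024)


# Example use (path taken from the question):
# for path, size in sorted(file_sizes("/content/drive/MyDrive/TrainedModel").items()):
#     print(path, size)
```

If the files under variables/ are much smaller than the model's weights should be, saving again (or saving to local disk first and copying afterwards) would be the first thing to try.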
I am having trouble loading a large model after saving it.
I have tried all of the saving methods below:
tf.saved_model.save(model, model_save_path)
model.save(model_save_path+"_new_save")
tf.keras.models.save_model(model, model_save_path+"_v3")
Errors when loading:
method 1
m2=tf.keras.models.load_model(model_save_path+"_v3")
error:
__init__() got an unexpected keyword argument 'reduction'
method 2
m3=tf.keras.models.load_model(model_save_path)
error:
WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.
ValueError: Unable to create a Keras model from SavedModel at xxxx . This SavedModel was exported with `tf.saved_model.save`, and lacks the Keras metadata file. Please save your Keras model by calling `model.save`or `tf.keras.models.save_model`. Note that you can still load this SavedModel with `tf.saved_model.load`.
method 3
m4=tf.saved_model.load(model_save_path)
This works, but the m4 object has no predict method,
and I am not able to use
model.signatures["serving_default"](**input_data)
or
model.__call__(input_data,training=False)
to predict on data.
Any help would be appreciated.
Adding compile=False to the load call solves the 'reduction' error from method 1:
m2=tf.keras.models.load_model(model_save_path+"_v3", compile=False)
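With compile=False, Keras skips restoring the optimizer, loss and metrics, which is presumably where the unexpected 'reduction' argument comes from. The loaded model can still be used for predict. A minimal sketch is below; load_for_inference is a hypothetical helper, and the optional re-compile step assumes you know which loss the model was trained with.

```python
def load_for_inference(model_dir, loss=None, optimizer="adam"):
    """Load a Keras model without its saved training configuration.

    If `loss` is given, the model is compiled again so that it can be
    trained or evaluated; for predict() alone this is not needed.
    """
    # Imported here so the helper can be defined even where TensorFlow
    # is not installed.
    import tensorflow as tf

    model = tf.keras.models.load_model(model_dir, compile=False)
    if loss is not None:
        model.compile(optimizer=optimizer, loss=loss)
    return model


# Example use (path variable taken from the question):
# m2 = load_for_inference(model_save_path + "_v3")
# predictions = m2.predict(input_data)
```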
I am unable to load a saved PyTorch model from the outputs folder in my other scripts.
I am using the following lines of code to save the model:
os.makedirs("./outputs/model", exist_ok=True)
torch.save({
'model_state_dict': copy.deepcopy(model.state_dict()),
'optimizer_state_dict': optimizer.state_dict()
}, './outputs/model/best-model.pth')
new_run.upload_file("outputs/model/best-model.pth", "outputs/model/best-model.pth")
saved_model = new_run.register_model(model_name='pytorch-model', model_path='outputs/model/best-model.pth')
and using the following code to access it:
global model
best_model_path = 'outputs/model/best-model.pth'
model_checkpoint = torch.load(best_model_path)
model.load_state_dict(model_checkpoint['model_state_dict'], strict = False)
But when I run the code above, I get this error: No such file or directory: './outputs/model/best-model.pth'
I would also like to know whether there is a way to get the saved model from Azure Models. I have tried the following lines of code:
from azureml.core.model import Model
model = Model(ws, "Pytorch-model")
but it returns an object of type Model, which raises an error on model.eval() (error: Model has no such attribute eval()).
There is no global outputs folder shared between scripts. If you want to use a model in a new script, you need to either pass the model to the script as an input, or register the model and download it from the new script.
The Model object from azureml.core.model is a handle to the registered model in the workspace; it is not your PyTorch model.
You can use Model.register(...) to register your model and model.download(...) to download it. Then you can use PyTorch to load the downloaded file.
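The download-then-load step could look roughly like the sketch below. It assumes the model was registered under the name 'pytorch-model' as a single .pth file (as in the question), that ws is an existing Workspace object, and that the checkpoint dictionary has a 'model_state_dict' key. The helper name and the target_dir default are made up for illustration.

```python
def load_registered_state_dict(ws, model_name="pytorch-model",
                               target_dir="downloaded_model"):
    """Download a registered Azure ML model file and return its state dict."""
    # Imported here so the helper can be defined without the SDKs installed.
    from azureml.core.model import Model
    import torch

    # This is the Azure ML handle, not a torch.nn.Module.
    registered = Model(ws, name=model_name)

    # download() returns the local path of the downloaded file or folder.
    local_path = registered.download(target_dir=target_dir, exist_ok=True)

    # map_location="cpu" lets a checkpoint saved on GPU load on a CPU machine.
    checkpoint = torch.load(local_path, map_location="cpu")
    return checkpoint["model_state_dict"]


# Example use, with `model` being your own torch.nn.Module instance:
# model.load_state_dict(load_registered_state_dict(ws))
# model.eval()
```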
I have successfully trained a Keras model like:
import tensorflow as tf
from keras_segmentation.models.unet import vgg_unet
# initiate the model
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
# Train
model.train(
train_images=train_images,
train_annotations=train_annotations,
checkpoints_path="/tmp/vgg_unet_1", epochs=5
)
And saved it in hdf5 format with:
tf.keras.models.save_model(model,'my_model.hdf5')
Then I load my model with
model=tf.keras.models.load_model('my_model.hdf5')
Finally I want to make a segmentation prediction on a new image with
out = model.predict_segmentation(
inp=image_to_test,
out_fname="/tmp/out.png"
)
I am getting the following error:
AttributeError: 'Functional' object has no attribute 'predict_segmentation'
What am I doing wrong?
Is the problem in how I save my model or in how I load it?
Thanks!
predict_segmentation isn't a function available in normal Keras models. It looks like it was added after the model was created in the keras_segmentation library, which might be why Keras couldn't load it again.
I think you have 2 options for this.
You could use the line from the code I linked to manually add the function back to the model.
model.predict_segmentation = MethodType(keras_segmentation.predict.predict, model)
You could create a new vgg_unet with the same arguments when you reload the model, and transfer the weights from your hdf5 file to that model as suggested in the Keras documentation.
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
model.load_weights('my_model.hdf5')
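To see why the first option works, here is a small self-contained sketch of binding a plain function to one object with MethodType. PlainModel and this predict_segmentation are toy stand-ins, not the real Keras or keras_segmentation objects.

```python
from types import MethodType


class PlainModel:
    """Stand-in for the generic model object returned by load_model."""

    def predict(self, x):
        return [value * 2 for value in x]


def predict_segmentation(model, inp):
    """Toy helper that, like a method, takes the model as its first argument."""
    return {"input": inp, "output": model.predict(inp)}


model = PlainModel()

# Bind the function to this one instance; `model` is then passed
# automatically as the first argument on every call.
model.predict_segmentation = MethodType(predict_segmentation, model)

result = model.predict_segmentation(inp=[1, 2, 3])
```

The binding lives only on that instance, so it is lost whenever a fresh object is created, which matches what happens after saving and reloading.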
I am using Chainer to train (fine-tune) a Resnet model and then use the checkpoint for evaluation. The checkpoint is a npz file with the following structure:
When I am loading the model for evaluation with chainer.serializers.load_npz(args.load, model) (where model is the standard resnet) I get the following error: KeyError: 'rpn/loc/b is not a file in the archive'.
I think the problem is that the files in the model do not have the 'updater/optimizer/faster/extractor' prefix.
How can I change the name of the files in the resulting npz to remove the prefix or what else should I do to fix the problem?
Thank you!
When you load a snapshot generated by the Snapshot Extension, you need to load it into the trainer:
chainer.serializers.load_npz(args.load, trainer)
The trainer will automatically load the state of the updater, the optimizer and the model.
You can also load only the model manually by accessing the corresponding field in the snapshot and passing it as an argument to the model.serialize function:
npz_data = numpy.load(args.load)
snap = chainer.serializers.NpzDeserializer(npz_data)
model.serialize(snap['updater']['model:main'])
This should load only the weights of the model.
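If you would rather rewrite the file, as the question asks, the key prefix can be stripped with plain NumPy. This is a sketch under assumptions: strip_npz_prefix is a made-up helper, the prefix must be read off your own snapshot (for example 'updater/model:main/'), and the selected entries are assumed to be plain numeric arrays.

```python
import numpy as np


def strip_npz_prefix(src_path, dst_path, prefix):
    """Write a new .npz holding only entries under `prefix`, with it removed.

    Returns the sorted list of new keys.
    """
    with np.load(src_path) as archive:
        stripped = {
            key[len(prefix):]: archive[key]
            for key in archive.files
            if key.startswith(prefix)
        }
    if not stripped:
        raise KeyError("no entries in %s start with %r" % (src_path, prefix))
    np.savez(dst_path, **stripped)
    return sorted(stripped)


# Example use (file name and prefix are placeholders):
# strip_npz_prefix(args.load, "model_only.npz", "updater/model:main/")
# chainer.serializers.load_npz("model_only.npz", model)
```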
I am running a model on my own dataset with 2 classes (the project was implemented for training/testing with ImageNet). I have made all the changes (in config files etc.), but after training finishes successfully, I get the following error when testing starts:
wrote gt roidb to ./data/cache/ImageNetVID_DET_val_gt_roidb.pkl
Traceback (most recent call last):
File "experiments/dff_rfcn/dff_rfcn_end2end_train_test.py", line 20, in <module>
test.main()
File "experiments/dff_rfcn/../../dff_rfcn/test.py", line 53, in main
args.vis, args.ignore_cache, args.shuffle, config.TEST.HAS_RPN, config.dataset.proposal, args.thresh, logger=logger, output_path=final_output_path)
File "experiments/dff_rfcn/../../dff_rfcn/function/test_rcnn.py", line 68, in test_rcnn
roidbs_seg_lens[gpu_id] += x['frame_seg_len']
KeyError: 'frame_seg_len'
I cleaned the cache file before running. As I have read in previous topics, this might be caused by .pkl files from previous datasets left in the cache. What may have caused this error? I should also mention that I changed the names of the .txt files that feed the neural network (if this is important), and that training finishes without problems.
This is my first time running a deep learning project, so please show some understanding.
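The traceback shows test_rcnn.py reading x['frame_seg_len'] from each roidb entry, so one way to narrow the problem down is to check whether the cached roidb actually contains that key. The sketch below assumes the cache file is a pickled list of dicts; entries_missing_key is a hypothetical helper, and a pickle written under Python 2 may need extra arguments to load under Python 3.

```python
import pickle


def entries_missing_key(cache_path, key):
    """Return the indices of roidb entries in a cached pickle that lack `key`."""
    with open(cache_path, "rb") as f:
        roidb = pickle.load(f)
    return [index for index, entry in enumerate(roidb) if key not in entry]


# Example use (path taken from the log line in the question):
# missing = entries_missing_key(
#     "./data/cache/ImageNetVID_DET_val_gt_roidb.pkl", "frame_seg_len")
# print(len(missing), "entries have no frame_seg_len")
```

If every entry is missing the key, the roidb was probably built by a code path that never sets it, rather than the cache being stale.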
MXNet typically uses methods other than pickle directly for serialization of the model architecture and trained weights.
With the Gluon API, you can save the weights of the model (i.e. a Block) to a file with .save_params() and then load them from a file with .load_params(); newer MXNet releases name these methods .save_parameters() and .load_parameters(). You 'save' the model architecture by keeping the code used to define the model. See an example of this here.
With the Module API, you can create checkpoints at the end of each epoch, which save the symbol (i.e. the model architecture) and the parameters (i.e. the model weights). See here.
checkpoint = mx.callback.do_checkpoint(model_prefix)
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)
You can then load the model from a given checkpoint (e.g. epoch 42 in this example):
sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 42)
mod.set_params(arg_params, aux_params)