I am trying to JIT trace and save my PyTorch model from the segmentation models package, but I am getting an error: "Could not export Python function call 'SwishImplementation'. Remove calls to Python functions before export. Did you forget to add @script or @script_method annotation? If this is a nn.ModuleList, add it to __constants__". It only happens when I use the efficientnet backbone. How can I get the save() function to work? I need to be able to use the model in a C++ application.
import torch
import segmentation_models_pytorch as smp
model = smp.Unet('efficientnet-b7')
model.eval()
input = torch.randn((1,3,224,224))
torch_out = model(input)
model = torch.jit.trace(model,input)
trace_out = model(input)
model.save('model.pt')
The Unet model from the segmentation_models_pytorch package uses an EfficientNet encoder, which uses a MemoryEfficientSwish module. That module calls a Python autograd function (SwishImplementation), which the tracer cannot export. To fix the error, replace all instances of MemoryEfficientSwish with the plain Swish module before saving the model.
You can iterate over the Unet model's submodules and, if a module is an instance of EfficientNet, call .set_swish(memory_efficient=False) on it.
After that, you can load the state_dict, then trace and save the model, as in the sketch below.
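A minimal sketch of that fix (this assumes the encoder inherits from efficientnet_pytorch's EfficientNet class, which is the case for the efficientnet backbones in segmentation_models_pytorch):
import torch
import segmentation_models_pytorch as smp
from efficientnet_pytorch import EfficientNet

model = smp.Unet('efficientnet-b7')
model.eval()

# Swap MemoryEfficientSwish (a Python autograd Function) for the traceable
# Swish module on every EfficientNet submodule.
for module in model.modules():
    if isinstance(module, EfficientNet):
        module.set_swish(memory_efficient=False)

example = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(model, example)
traced.save('model.pt')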
I am unable to load a saved PyTorch model from the outputs folder in my other scripts.
I am using the following lines of code to save the model:
os.makedirs("./outputs/model", exist_ok=True)
torch.save({
    'model_state_dict': copy.deepcopy(model.state_dict()),
    'optimizer_state_dict': optimizer.state_dict()
}, './outputs/model/best-model.pth')
new_run.upload_file("outputs/model/best-model.pth", "outputs/model/best-model.pth")
saved_model = new_run.register_model(model_name='pytorch-model', model_path='outputs/model/best-model.pth')
and using the following code to access it:
global model
best_model_path = 'outputs/model/best-model.pth'
model_checkpoint = torch.load(best_model_path)
model.load_state_dict(model_checkpoint['model_state_dict'], strict = False)
but when I run the above mentioned code, I get this error: No such file or directory: './outputs/model/best-model.pth'
Also, I want to know whether there is a way to get the saved model from Azure Models. I have tried to get it by using the following lines of code:
from azureml.core.model import Model
model = Model(ws, "Pytorch-model")
but it returns a Model-type object, which raises an error on model.eval() (error: Model has no such attribute eval()).
There is no global outputs folder. If you want to use a model in a new script, you need to pass the model to the script as an input, or register the model and download it from the new script.
The Model object from azureml.core.model is not your PyTorch model.
You can use model.register(...) to register your model and model.download(...) to download it. Then you can use PyTorch to load your model.
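A minimal sketch of that register/download round trip (the name 'pytorch-model' and the checkpoint path mirror the question; ws is assumed to be your azureml.core.Workspace):
from azureml.core.model import Model
import torch

# In the training script: register the checkpoint written under ./outputs
Model.register(workspace=ws,
               model_path='outputs/model/best-model.pth',
               model_name='pytorch-model')

# In the other script: look up the registered model and download it locally
azure_model = Model(ws, 'pytorch-model')
local_path = azure_model.download(target_dir='.', exist_ok=True)

# Load the checkpoint into an already constructed PyTorch model
checkpoint = torch.load(local_path)
model.load_state_dict(checkpoint['model_state_dict'])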
I have successfully trained a Keras model like:
import tensorflow as tf
from keras_segmentation.models.unet import vgg_unet
# instantiate the model
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
# Train
model.train(
    train_images=train_images,
    train_annotations=train_annotations,
    checkpoints_path="/tmp/vgg_unet_1", epochs=5
)
And saved it in hdf5 format with:
tf.keras.models.save_model(model,'my_model.hdf5')
Then I load my model with
model=tf.keras.models.load_model('my_model.hdf5')
Finally I want to make a segmentation prediction on a new image with
out = model.predict_segmentation(
    inp=image_to_test,
    out_fname="/tmp/out.png"
)
I am getting the following error:
AttributeError: 'Functional' object has no attribute 'predict_segmentation'
What am I doing wrong?
Is it when I am saving my model or when I am loading it?
Thanks!
predict_segmentation isn't a method available on normal Keras models. It looks like it is patched onto the model after it is created in the keras_segmentation library, which is probably why Keras couldn't restore it when reloading the model.
I think you have two options for this.
You could use the line from the keras_segmentation source to manually add the function back onto the model:
from types import MethodType
import keras_segmentation

model.predict_segmentation = MethodType(keras_segmentation.predict.predict, model)
You could create a new vgg_unet with the same arguments when you reload the model, and transfer the weights from your hdf5 file to that model as suggested in the Keras documentation.
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
model.load_weights('my_model.hdf5')
Hi, I am unable to find a way to save a lightgbm.LGBMRegressor model to a file for later reuse.
Try:
my_model.booster_.save_model('mode.txt')
#load from model:
bst = lgb.Booster(model_file='mode.txt')
Note: the API docs state that
bst = lgb.train(…)
bst.save_model('model.txt', num_iteration=bst.best_iteration)
Depending on the version, one of the above works. More generically, you can also use pickle or something similar (such as joblib) to serialize your model:
import joblib
# save model
joblib.dump(my_model, 'lgb.pkl')
# load model
gbm_pickle = joblib.load('lgb.pkl')
Let me know if that helps
For Python 3.7 and lightgbm==2.3.1, I found that the previous answers were insufficient to correctly save and load a model. The following worked:
import lightgbm

lgbr = lightgbm.LGBMRegressor(n_estimators=200, max_depth=5)
lgbr.fit(train[num_columns], train["prep_time_seconds"])
preds = lgbr.predict(predict[num_columns])
lgbr.booster_.save_model('lgbr_base.txt')
Finally, we can validate that this worked via:
model = lightgbm.Booster(model_file='lgbr_base.txt')
model.predict(predict[num_columns])
Without the above, I was getting the error: AttributeError: 'LGBMRegressor' object has no attribute 'save_model'
With the latest version of LightGBM, using import lightgbm as lgb, here is how to do it:
model.save_model('lgb_classifier.txt', num_iteration=model.best_iteration)
and then you can read the model back as follows:
model = lgb.Booster(model_file='lgb_classifier.txt')
clf.save_model('lgbm_model.mdl')
clf = lgb.Booster(model_file='lgbm_model.mdl')
I would like to know if it is possible to use the function tf.graph_util.convert_variables_to_constants (in order to store the frozen version of the graph) in a train/evaluation loop, while I'm using a custom estimators. For example:
best_validation_accuracy = -1
for _ in range(steps // how_often_validation):
    # Train the model
    estimator.train(input_fn=train_input_fn, steps=how_often_validation)
    # Evaluate the model
    validation_accuracy = estimator.evaluate(input_fn=eval_input_fn)
    # Save best model
    if validation_accuracy["accuracy"] > best_validation_accuracy:
        best_validation_accuracy = validation_accuracy["accuracy"]
        # Save best model performances
        # I WANT TO USE tf.graph_util.convert_variables_to_constants HERE
To use the function tf.graph_util.convert_variables_to_constants, you need the graph and the session of your model.
After going through the TensorFlow code defining the estimators, it appears that:
This code is deprecated,
The graph is created on the fly and not easily accessible (at least, I was not able to retrieve it).
Thus, we will have to use the good old method.
When you call estimator.train, checkpoints of your model are saved in a specified directory (estimator.model_dir). You can use those files to access the graph and the session and freeze the variables as follows:
1. Load the meta graph
saver = tf.train.import_meta_graph('/path/to/meta')
2. Load the weights
sess = tf.Session()
saver.restore(sess, '/path/to/weights')
3. Freeze the variables
frozen_graph_def = tf.graph_util.convert_variables_to_constants(
    sess,
    sess.graph.as_graph_def(),
    ['output'])
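If you then want to write the frozen graph to disk, a minimal sketch (the directory and file name are placeholders):
# Serialize the frozen GraphDef to a .pb file
tf.train.write_graph(frozen_graph_def,
                     '/path/to/export_dir',
                     'frozen_model.pb',
                     as_text=False)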
I am going through this script, and there is a code block which takes 2 options into account, DataParallel and DistributedDataParallel here:
if not args.distributed:
    if args.arch.startswith('alexnet') or args.arch.startswith('vgg'):
        model.features = torch.nn.DataParallel(model.features)
        model.cuda()
    else:
        model = torch.nn.DataParallel(model).cuda()
else:
    model.cuda()
    model = torch.nn.parallel.DistributedDataParallel(model)
What if I don't want either of these options, and I want to run it without even DataParallel. How do I do it?
How do I define my model so that it runs as a plain nn and not parallelizing anything?
DataParallel is a wrapper object to parallelize the computation on multiple GPUs of the same machine, see here.
DistributedDataParallel is also a wrapper object that lets you distribute the data on multiple devices, see here.
If you don't want it, you can simply remove the wrapper and use the model as it is:
if not args.distributed:
    if args.arch.startswith('alexnet') or args.arch.startswith('vgg'):
        model.features = model.features
        model.cuda()
    else:
        model = model.cuda()
else:
    model.cuda()
    model = model
This keeps code modification to a minimum. Of course, since parallelization is of no interest to you, you could reduce this whole if statement to something along the lines of:
if args.arch.startswith('alexnet') or args.arch.startswith('vgg'):
    model.features = model.features
model = model.cuda()
Note that this code assumes you are running on the GPU.
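If you also want the code to run when no GPU is available, a minimal device-agnostic sketch (the variable names here are assumptions):
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

# Remember to move the inputs to the same device in the training loop,
# e.g. images = images.to(device)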
DataParallel is a wrapper; you can bypass it and get the original module back by doing this:
my_model = model.module.to(device)