Convert a torchscript model into Core ML - python

I am trying to convert a Detectron 2 model into Core ML. I have scripted it and now I have a model of type 'torch.jit._script.RecursiveScriptModule'. If I try and convert it into Core ML using this code:
mlmodel = ct.converters.convert(
    torchscipt_model,
    inputs=[ct.TensorType(shape=(1, 3, 64, 64))],
)
I get the following error:
RuntimeError: Unknown type bool encountered in graph lowering. This
type is not supported in ONNX export.
I scripted the model using the following code (as illustrated in Detectron's documentation)
model = build_model(cfg)
model.eval()
fields = {"pred_boxes": Boxes, "scores": torch.Tensor, "pred_classes": torch.Tensor, "pred_masks": torch.Tensor, "proposal_boxes": Boxes, "objectness_logits": torch.Tensor}
torchscipt_model = scripting_with_instances(model, fields)
Do you know any other way to convert my Detectron2 model into Core ML?
Thank you so much for your help!

Related

Yolov5s on Openvino

I have trained a model using yolov5 and it is working just fine:
My ultimate goal is to use a model that I have trained on custom data (to detect the hook and bucket) in the Openvino framework.
To achieve this, I first exported the best version of the model to the appropriate Openvino format, using the following command:
!python export.py --weights runs/train/yolov5s24/weights/best.pt --include openvino --dynamic --simplify
The export succeeded and generated three files: best.xml, best.bin, and best.mapping.
Now I would like to load it using the OpenVINO framework, and to do that I am following this pipeline:
1. Create Core object
1.1. (Optional) Load extensions
2. Read a model from a drive
2.1. (Optional) Perform model preprocessing
3. Load the model to the device
4. Create an inference request
5. Fill input tensors with data
6. Start inference
7. Process the inference results
1 Create Core
import numpy as np
import openvino.inference_engine as ie
core = ie.IECore()
2 Read a model from a drive
path_to_xml_file = 'models/best_openvino_model/best.xml'
path_to_bin_file = 'models/best_openvino_model/best.bin'
network = core.read_network(model=path_to_xml_file, weights=path_to_bin_file)
3 Load the Model to the Device
# Load network to the device and create infer requests
exec_network = core.load_network(network, "CPU", num_requests=4)
And here I am getting an error:
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
<ipython-input-21-7c9ba5f53484> in <module>
1 # Load network to the device and create infer requests
----> 2 exec_network = core.load_network(network, "CPU", num_requests=4)
ie_api.pyx in openvino.inference_engine.ie_api.IECore.load_network()
ie_api.pyx in openvino.inference_engine.ie_api.IECore.load_network()
RuntimeError: Check 'std::get<0>(valid)' failed at inference/src/ie_core.cpp:1414:
InferenceEngine::Core::LoadNetwork doesn't support inputs having dynamic shapes. Use ov::Core::compile_model API instead. Dynamic inputs are :{ input:'images,images', shape={?,3,?,?}}
I am using the OpenVINO™ Development Tools, release 2022.1.
The files to reproduce the error are here.
This error is expected because the model contains a dynamic shape. The model can be executed using the ov::Core::compile_model API in OpenVINO 2022.1.
You can refer to class ov::CompiledModel and Dynamic Shape for more information.
You are using API 1.0, which does not support dynamic shapes.
You need to use API 2.0 for dynamic shapes.
Here is an example of how to run inference with dynamic shapes using API 2.0:
from openvino.runtime import AsyncInferQueue, Core, InferRequest, Layout, Type
core = Core()
model = core.read_model("model.xml")
compiled_model = core.compile_model(model, "CPU")
output = compiled_model.infer_new_request({0: input_data})
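As a minimal sketch, the API 2.0 steps above can be wrapped in one function. The name `run_dynamic_model` and the `core` parameter are assumptions added for illustration; `core` exists only so the flow can be exercised with a stand-in object when OpenVINO is not installed. The OpenVINO calls themselves (`Core`, `read_model`, `compile_model`, `infer_new_request`) are the ones used in the answer.

```python
def run_dynamic_model(xml_path, input_data, device="CPU", core=None):
    """Read, compile, and run a model that may have dynamic input shapes."""
    if core is None:
        # Imported lazily so this module loads without OpenVINO installed.
        from openvino.runtime import Core
        core = Core()
    # read_model looks for the .bin file next to the .xml file by default.
    model = core.read_model(xml_path)
    # compile_model (API 2.0) accepts dynamic shapes such as {?,3,?,?}.
    compiled_model = core.compile_model(model, device)
    # Key 0 addresses the first model input by index.
    return compiled_model.infer_new_request({0: input_data})
```

With a real model, `input_data` would typically be a NumPy array in NCHW layout. The concrete height and width are up to the caller, since the exported model leaves them dynamic.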
yolov5 on openvino
yolov5-openvino
enjoy.....

AttributeError: 'Functional' object has no attribute 'predict_segmentation' When importing TensorFlow model Keras

I have successfully trained a Keras model like:
import tensorflow as tf
from keras_segmentation.models.unet import vgg_unet
# initiate the model
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
# Train
model.train(
    train_images=train_images,
    train_annotations=train_annotations,
    checkpoints_path="/tmp/vgg_unet_1", epochs=5
)
And saved it in hdf5 format with:
tf.keras.models.save_model(model,'my_model.hdf5')
Then I load my model with
model=tf.keras.models.load_model('my_model.hdf5')
Finally I want to make a segmentation prediction on a new image with
out = model.predict_segmentation(
    inp=image_to_test,
    out_fname="/tmp/out.png"
)
I am getting the following error:
AttributeError: 'Functional' object has no attribute 'predict_segmentation'
What am I doing wrong?
Is it when I am saving my model or when I am loading it?
Thanks!
predict_segmentation isn't a function available in normal Keras models. It looks like it was added after the model was created in the keras_segmentation library, which might be why Keras couldn't load it again.
I think you have 2 options for this.
1. You could use the line from the code I linked to manually add the function back to the model:
from types import MethodType
import keras_segmentation.predict
model.predict_segmentation = MethodType(keras_segmentation.predict.predict, model)
2. You could create a new vgg_unet with the same arguments when you reload the model, and transfer the weights from your hdf5 file to that model, as suggested in the Keras documentation:
model = vgg_unet(n_classes=50, input_height=512, input_width=608)
model.load_weights('my_model.hdf5')
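The first option relies on `types.MethodType`, which binds a plain function to a single instance. The sketch below shows only that mechanism with standard-library code: `LoadedModel` and the local `predict_segmentation` function are hypothetical stand-ins for the reloaded Keras model and for `keras_segmentation.predict.predict`, not the real objects.

```python
from types import MethodType


class LoadedModel:
    """Hypothetical stand-in for the object returned by load_model."""

    def __init__(self, name):
        self.name = name


def predict_segmentation(model, inp=None, out_fname=None):
    """Hypothetical stand-in for keras_segmentation.predict.predict."""
    return f"{model.name}: segmented {inp} -> {out_fname}"


model = LoadedModel("my_model")
# Bind the function to this one instance; `model` becomes its first argument.
model.predict_segmentation = MethodType(predict_segmentation, model)
result = model.predict_segmentation(inp="image.png", out_fname="/tmp/out.png")
print(result)
```

The binding lives on the instance, not on the class, so it is lost whenever a new object is created. That would be consistent with the method being absent after `load_model` returns a fresh model.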

tflite quantization how to change the input dtype

see possible solution at the end of the post
I am trying to fully quantize the keras-vggface model from rcmalli to run on an NPU. The model is a Keras model (not tf.keras).
When using TF 1.15 for quantization with:
print(tf.version.VERSION)
num_calibration_steps=5
converter = tf.lite.TFLiteConverter.from_keras_model_file('path_to_model.h5')
# converter.post_training_quantize = True  # This only makes the weights int8 but does not initialize model quantization
def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        pfad = 'path_to_image(s)'
        img = cv2.imread(pfad)
        # Get sample input data as a numpy array in a method of your choosing.
        yield [img]
converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
open("quantized_model", "wb").write(tflite_quant_model)
The model is converted but as I need full int8 quantization, I add:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
This error message appears:
ValueError: Cannot set tensor: Got value of type UINT8 but expected type FLOAT32 for input 0, name: input_1
Clearly, the input of the model still requires float32.
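One way to confirm what the converted model expects is to read the interpreter's input details. The sketch below is an illustration, not part of the original post: the helper names `describe_inputs` and `load_and_describe` are hypothetical, while `tf.lite.Interpreter`, `allocate_tensors`, and `get_input_details` (with its `name`, `dtype`, and `quantization` keys) are the standard TFLite interpreter API.

```python
def describe_inputs(interpreter):
    """Return (name, dtype, (scale, zero_point)) for each model input."""
    return [
        (d["name"], d["dtype"], d["quantization"])
        for d in interpreter.get_input_details()
    ]


def load_and_describe(tflite_path):
    """Load a .tflite file and report what its inputs expect."""
    # Imported lazily; this function requires TensorFlow to be installed.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    return describe_inputs(interpreter)
```

If the reported dtype is still float32, the input was not quantized, which matches the error message above.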
Questions:
1. Do I have to adapt the quantization method so that the input dtype is changed? or
2. Do I have to change the input layer of the model to dtype int8 beforehand?
3. Or is that actually reporting that the model is not actually quantized?
If 1 or 2 is the answer, would you also have a best practice tip for me?
Addition:
Using :
h5_path = 'my_model.h5'
model = keras.models.load_model(h5_path)
model.save(os.getcwd() +'/modelTF2')
to save the h5 as pb with TF 2.2, and then using converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir).
Since TF 2.x TFLite takes floats and converts them to uint8 internally, I thought that could be a solution. Unfortunately, this error message appears:
tf.lite.TFLiteConverter.from_keras_model giving 'str' object has no attribute 'call'
Apparently TF2.x cannot handle pure keras models.
Using tf.compat.v1.lite.TFLiteConverter.from_keras_model_file() to solve this error just repeats the error from above, as we are back at the "TF 1.15" level.
Addition 2
Another solution is to transfer the keras model to tf.keras manually. I will look into that if there is no other solution.
Regarding the comment of Meghna Natraj
To recreate the model (using TF 1.13.x) just:
pip install git+https://github.com/rcmalli/keras-vggface.git
and
from keras_vggface.vggface import VGGFace
pretrained_model = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3), pooling='avg') # pooling: None, avg or max
pretrained_model.summary()
pretrained_model.save("my_model.h5") #using h5 extension
The input layer is connected. Too bad, that looked like a good/easy fix.
Possible Solution
It seems to work using TF 1.15.3; I used 1.15.0 beforehand. I will check if I did something else different by accident.
A possible reason why this fails is that the model has input tensors that are not connected to the output tensor, i.e., they are probably unused.
Here is a colab notebook where I've reproduced this error. Modify the io_type at the beginning of the notebook to tf.uint8 to see an error similar to one you got.
SOLUTION
You need to manually inspect the model to see if there are any inputs that are dangling/lost/not connected to the output, and remove them.
Post a link to the model and I can try to debug it as well.

Converting saved_model to TFLite model using TF 2.0

Currently I am working on converting a custom object detection model (trained using SSD and an Inception network) to a quantized TFLite model. I am able to convert the custom object detection model from a frozen graph to a quantized TFLite model using the following code snippet (using Tensorflow 1.4):
converter = tf.lite.TFLiteConverter.from_frozen_graph(
    args["model"],
    input_shapes={'normalized_input_image_tensor': [1, 300, 300, 3]},
    input_arrays=['normalized_input_image_tensor'],
    output_arrays=['TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
                   'TFLite_Detection_PostProcess:2', 'TFLite_Detection_PostProcess:3'])
converter.allow_custom_ops=True
converter.post_training_quantize=True
tflite_model = converter.convert()
open(args["output"], "wb").write(tflite_model)
However, the tf.lite.TFLiteConverter.from_frozen_graph class method is not available in TensorFlow 2.0 (refer to this link). So I tried to convert the model using the tf.lite.TFLiteConverter.from_saved_model class method. The code snippet is shown below:
converter = tf.lite.TFLiteConverter.from_saved_model("/content/") # Path to saved_model directory
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
The above code snippet throws the following error:
ValueError: None is only supported in the 1st dimension. Tensor 'image_tensor' has invalid shape '[None, None, None, 3]'.
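The rule stated in this error message can be sketched in plain Python. This illustrates the check as the message describes it, not the converter's actual implementation, and the function name `unsupported_none_dims` is hypothetical.

```python
def unsupported_none_dims(shape):
    """Return indices of unknown (None) dimensions other than the first."""
    return [i for i, dim in enumerate(shape) if dim is None and i > 0]


# The shape from the error: height and width are unknown, not just the batch.
print(unsupported_none_dims([None, None, None, 3]))  # [1, 2]
# A fully fixed shape, as used with the frozen graph, has no such dimensions.
print(unsupported_none_dims([1, 300, 300, 3]))  # []
```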
I tried to pass input_shapes as argument
converter = tf.lite.TFLiteConverter.from_saved_model("/content/",input_shapes={"image_tensor" : [1,300,300,3]})
but it throws the following error:
TypeError: from_saved_model() got an unexpected keyword argument 'input_shapes'
Am I missing something? Please feel free to correct me!
I got the solution using tf.compat.v1.lite.TFLiteConverter.from_frozen_graph. This compat.v1 brings the functionality of TF1.x into TF2.x.
Following is the full code:
converter = tf.compat.v1.lite.TFLiteConverter.from_frozen_graph(
    "/content/tflite_graph.pb",
    input_shapes={'normalized_input_image_tensor': [1, 300, 300, 3]},
    input_arrays=['normalized_input_image_tensor'],
    output_arrays=['TFLite_Detection_PostProcess', 'TFLite_Detection_PostProcess:1',
                   'TFLite_Detection_PostProcess:2', 'TFLite_Detection_PostProcess:3'])
converter.allow_custom_ops=True
# Convert the model to quantized TFLite model.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
# Write a model using the following line
open("/content/uno_mobilenetV2.tflite", "wb").write(tflite_model)

Description of TF Lite's Toco converter args for quantization aware training

These days I am trying to track down an error concerning the deployment of a TF model with TPU support.
I can get a model without TPU support running, but as soon as I enable quantization, I get lost.
I am in the following situation:
1. Created a model and trained it
2. Created an eval graph of the model
3. Froze the model and saved the result as protocol buffer
4. Successfully converted and deployed it without TPU support
For the last point, I used the TFLiteConverter's Python API. The script that produces a functional tflite model is
import tensorflow as tf
graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']
converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.FLOAT
input_arrays = converter.get_input_arrays()
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tells me that my approach seems to be ok up to this point. Now, if I want to utilize the Coral TPU stick, I have to quantize my model (I took that into account during training). All I have to do is to modify my converter script. I figured that I have to change it to
import tensorflow as tf
graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']
converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8 ## Indicates TPU compatibility
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0]: (0., 1.)} ## mean, std_dev
converter.default_ranges_stats = (-128, 127) ## min, max values for quantization (?)
converter.allow_custom_ops = True ## not sure if this is needed
## REMOVED THE OPTIMIZATIONS ALTOGETHER TO MAKE IT WORK
tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tflite model produces results when loaded with the Python API of the interpreter, but I am not able to understand their meaning. Also, there is no (or if there is, it is hidden well) documentation on how to choose mean, std_dev and the min/max ranges. Also, after compiling this with the edgetpu_compiler and deploying it (loading it with the C++ API), I receive an error:
INFO: Initialized TensorFlow Lite runtime.
ERROR: Failed to prepare for TPU. generic::failed_precondition: Custom op already assigned to a different TPU.
ERROR: Node number 0 (edgetpu-custom-op) failed to prepare.
Segmentation fault
I suppose I missed a flag or something during the conversion process. But as the documentation is also lacking here, I can't say for sure.
In short:
What do the params mean, std_dev, min/max do and how do they interact?
What am I doing wrong during the conversion?
I am grateful for any help or guidance!
EDIT: I have opened a github issue with the full test code. Feel free to play around with this.
You should never need to manually set the quantization stats.
Have you tried the post-training-quantization tutorials?
https://www.tensorflow.org/lite/performance/post_training_integer_quant
Basically they set the quantization options:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
Then they pass a "representative dataset" to the converter, so that the converter can run the model a few batches to gather the necessary statistics:
def representative_data_gen():
    for input_value in mnist_ds.take(100):
        yield [input_value]
converter.representative_dataset = representative_data_gen
While there are options for quantized training, it's always easier to do post-training quantization.
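For readers who still use the TF 1.x path from the question, the meaning of mean and std_dev can be sketched numerically. This assumes the relation documented for the TF 1.x converter's quantized_input_stats, real_value = (quantized_value - mean) / std_dev. The helper names below are hypothetical and the code is plain Python, not a converter API.

```python
def stats_for_range(float_min, float_max, qmin=0, qmax=255):
    """Return (mean, std_dev) so that qmin..qmax maps onto float_min..float_max."""
    std_dev = (qmax - qmin) / (float_max - float_min)
    mean = qmin - float_min * std_dev
    return mean, std_dev


def dequantize(q, mean, std_dev):
    """Real value associated with the quantized input value q."""
    return (q - mean) / std_dev


# uint8 inputs that represent floats in [0, 1]
print(stats_for_range(0.0, 1.0))
# uint8 inputs that represent floats in [-1, 1]
print(stats_for_range(-1.0, 1.0))
# With (mean, std_dev) = (0., 1.) as in the question, a uint8 value maps to itself.
print(dequantize(200, 0.0, 1.0))
```

Under this relation, the question's setting of (0., 1.) tells the converter that the model was trained on inputs in the range 0 to 255. If the model was actually trained on normalized inputs, that mismatch could explain outputs that are hard to interpret.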
