I downloaded the MobileBERT model from TensorFlow - the question-and-answer model based on TensorFlow Lite for mobile devices - from here:
https://www.tensorflow.org/lite/models/bert_qa/overview
The example of how to use it is provided only for Android. Can anybody advise how to use this model in Python (for testing purposes)? I followed the recommendations on how to use the TensorFlow Lite API, but I need to figure out how to modify the code below to work with MobileBERT:
import numpy as np
import tensorflow as tf
# Load TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="mobilebert_float_20191023.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test model on random input data.
input_shape = input_details[0]['shape']
# Note: MobileBERT expects int32 token inputs, so float32 random data does not apply here.
input_data = np.array(np.random.random_sample(input_shape), dtype=np.int32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
Please take a look at this project: https://github.com/samsgates/MobileBERT
It is basically designed to run the Google model that you mentioned. I managed to run it with TF 1.14 along with the other dependencies installed (sentencepiece and bert-for-tf2, both available from pip). The only thing I had to tweak was removing the tf.dtypes.cast() wrappers around the input/output tensors. E.g. for the inputs I changed
input_ids = tf.dtypes.cast(self.get_ids(stokens),tf.int32)
to
input_ids = self.get_ids(stokens).astype('int32')
And for the outputs I changed
end = tf.argmax(end_logits,output_type=tf.dtypes.int32).numpy()[0]
to the simpler
end = np.argmax(end_logits)
That did the trick for me. Hope this helps!
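For reference, here is a rough sketch of driving the TFLite interpreter directly in Python. It assumes the model exposes three int32 inputs (token ids, input mask, segment ids) and two logit outputs; the tensor names and sequence length are read from input_details rather than hard-coded, but please verify them against your copy of the model:
import numpy as np
import tensorflow as tf
interpreter = tf.lite.Interpreter(model_path="mobilebert_float_20191023.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
seq_len = input_details[0]['shape'][-1]  # typically 384 for this model
input_ids = np.zeros((1, seq_len), dtype=np.int32)    # token ids from your tokenizer
input_mask = np.zeros((1, seq_len), dtype=np.int32)   # 1 for real tokens, 0 for padding
segment_ids = np.zeros((1, seq_len), dtype=np.int32)  # 0 for question, 1 for context
# Match each array to the right tensor by name rather than by position.
for detail in input_details:
    if 'input_ids' in detail['name']:
        interpreter.set_tensor(detail['index'], input_ids)
    elif 'input_mask' in detail['name']:
        interpreter.set_tensor(detail['index'], input_mask)
    elif 'segment_ids' in detail['name']:
        interpreter.set_tensor(detail['index'], segment_ids)
interpreter.invoke()
# The order of the two logit outputs can differ, so check output_details[...]['name'].
logits = [interpreter.get_tensor(d['index']) for d in output_details]
print([np.argmax(l) for l in logits])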
Right now I'm trying to convert a SavedModel to TFLite for use on a Raspberry Pi. The model is a MobileNet object detection model trained on a custom dataset. The SavedModel works perfectly and keeps the expected input shape of (1, 150, 150, 3). However, when I convert it to a TFLite model using this code:
import tensorflow as tf
saved_model_dir = input("Model dir: ")
# Convert the model
converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir) # path to the SavedModel directory
tflite_model = converter.convert()
# Save the model.
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
And then run this code to check the interpreter's input shape:
import numpy as np
import tensorflow as tf
from PIL import Image
from os import listdir
from os.path import isfile, join
from random import choice, random
# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
input_shape = input_details[0]['shape']
print(f"Required input shape: {input_shape}")
I get an input shape of [1 1 1 3], so I can't use a 150x150 image as input.
I'm using TensorFlow 2.4 on Python 3.7.10 with Windows 10.
How would I fix this?
You can rely on the TFLite converter V1 API to set input shapes. Please check out the input_shapes argument in https://www.tensorflow.org/api_docs/python/tf/compat/v1/lite/TFLiteConverter.
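A rough sketch of that V1 path, assuming the SavedModel's input tensor is named 'input_1' (check the real name, e.g. with the saved_model_cli tool):
import tensorflow as tf
saved_model_dir = "path/to/saved_model"  # same directory as above
converter = tf.compat.v1.lite.TFLiteConverter.from_saved_model(
    saved_model_dir,
    input_shapes={'input_1': [1, 150, 150, 3]})  # 'input_1' is an assumed tensor name
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)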
How about calling resize_tensor_input() before calling allocate_tensors()?
interpreter.resize_tensor_input(0, [1, 150, 150, 3], strict=True)
interpreter.allocate_tensors()
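For completeness, an end-to-end sketch of that approach; the test image path and the divide-by-255 preprocessing are assumptions, and if strict=True complains that the dimensions are not resizable, try strict=False:
import numpy as np
import tensorflow as tf
from PIL import Image
interpreter = tf.lite.Interpreter(model_path="model.tflite")
input_index = interpreter.get_input_details()[0]['index']
interpreter.resize_tensor_input(input_index, [1, 150, 150, 3], strict=True)
interpreter.allocate_tensors()
output_index = interpreter.get_output_details()[0]['index']
img = Image.open("test.jpg").convert("RGB").resize((150, 150))  # example image path
input_data = np.expand_dims(np.asarray(img, dtype=np.float32) / 255.0, axis=0)
interpreter.set_tensor(input_index, input_data)
interpreter.invoke()
print(interpreter.get_tensor(output_index))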
See the possible solution at the end of the post.
I am trying to fully quantize the keras-vggface model from rcmalli to run on an NPU. The model is a Keras model (not tf.keras).
When using TF 1.15 for quantization with:
import tensorflow as tf
import cv2

print(tf.version.VERSION)
num_calibration_steps = 5

converter = tf.lite.TFLiteConverter.from_keras_model_file('path_to_model.h5')
#converter.post_training_quantize = True  # This only makes the weights int8 but does not quantize the whole model

def representative_dataset_gen():
    for _ in range(num_calibration_steps):
        pfad = 'path_to_image(s)'
        img = cv2.imread(pfad)
        # Get sample input data as a numpy array in a method of your choosing.
        yield [img]

converter.representative_dataset = representative_dataset_gen
tflite_quant_model = converter.convert()
open("quantized_model", "wb").write(tflite_quant_model)
The model is converted, but as I need full int8 quantization, I add:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8 # or tf.uint8
converter.inference_output_type = tf.int8 # or tf.uint8
This error message appears:
ValueError: Cannot set tensor: Got value of type UINT8 but expected type FLOAT32 for input 0, name: input_1
Clearly, the input of the model still requires float32.
Questions:
Do I have to adapt the quantization method so that the input dtype is changed? Or
Do I have to change the input layer of the model to dtype int8 beforehand?
Or is that actually reporting that the model is not actually quantized?
If 1 or 2 is the answer, would you also have a best practice tip for me?
Addition:
Using :
h5_path = 'my_model.h5'
model = keras.models.load_model(h5_path)
model.save(os.getcwd() +'/modelTF2')
to save the h5 model as a SavedModel with TF 2.2 and then using converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir),
since TF 2.x TFLite takes floats and converts them to uint8 internally, I thought that could be a solution. Unfortunately, this error message appears:
tf.lite.TFLiteConverter.from_keras_model giving 'str' object has no attribute 'call'
Apparently TF2.x cannot handle pure keras models.
Using tf.compat.v1.lite.TFLiteConverter.from_keras_model_file() to work around this just repeats the error from above, as we are back at the "TF 1.15" level again.
Addition 2
Another solution is to transfer the keras model to tf.keras manually. I will look into that if there is no other solution.
Regarding the comment from Meghna Natraj:
To recreate the model (using TF 1.13.x), just run:
pip install git+https://github.com/rcmalli/keras-vggface.git
and
from keras_vggface.vggface import VGGFace
pretrained_model = VGGFace(model='resnet50', include_top=False, input_shape=(224, 224, 3), pooling='avg') # pooling: None, avg or max
pretrained_model.summary()
pretrained_model.save("my_model.h5") #using h5 extension
The input layer is connected. Too bad, that looked like a good/easy fix.
Possible Solution
It seems to work using TF 1.15.3; I used 1.15.0 beforehand. I will check whether I did something else differently by accident.
A possible reason why this fails is that the model has input tensors that are not connected to the output tensor, i.e. they are probably unused.
Here is a colab notebook where I've reproduced this error. Modify the io_type at the beginning of the notebook to tf.uint8 to see an error similar to the one you got.
SOLUTION
You need to manually inspect the model to see whether there are any inputs that are dangling/lost/not connected to the output, and remove them.
Post a link to the model and I can try to debug it as well.
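If the model does load in plain Keras, a rough way to check for and drop a dangling input is sketched below; the rebuild assumes only the first input is actually used, so adapt it to your model:
from keras.models import load_model, Model
model = load_model('my_model.h5')
print(model.inputs)    # more than one entry here can point to an unused input
print(model.outputs)
# Rebuild with only the input(s) the output really depends on. Keras raises a
# "Graph disconnected" error if a genuinely required input is dropped by mistake.
trimmed = Model(inputs=model.inputs[0], outputs=model.outputs)
trimmed.save('my_model_trimmed.h5')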
I am trying to deploy this Mask R-CNN model on Android. I was able to load the Keras weights, freeze the model, and convert it to a .tflite model with the TF 1.13 TOCO converter using this script.
It seems this model uses some TF ops that are not supported in TFLite, so I had to use
converter.target_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,tf.lite.OpsSet.SELECT_TF_OPS]
to convert the model. Now when I try to run inference on this model using the Python interpreter, I get a segmentation fault in interpreter.invoke() and the Python script crashes.
import numpy as np
import tensorflow as tf

def run_tf_model(model_path="mask_rcnn_coco.tflite"):
    interpreter = tf.lite.Interpreter(model_path)
    interpreter.allocate_tensors()
    # Get input and output tensors.
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]
    print("input_details", input_details)
    print("output_details", output_details)
    # Test model on random input data.
    input_shape = input_details['shape']
    print("input_shape tflite", input_shape)
    input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
    interpreter.set_tensor(input_details['index'], input_data)
    interpreter.invoke()
    # The function `get_tensor()` returns a copy of the tensor data.
    # Use `tensor()` in order to get a pointer to the tensor.
    output_data = interpreter.get_tensor(output_details['index'])
    print(output_data)
Thus I am unable to find out whether my model has been converted correctly or not.
P.S. I am planning to use this model on Android, but I have little experience with the Android (Java or Kotlin) TFLite API. If anyone can point out resources for learning about that, it would also be helpful.
Edit: I also tried to run the inference on Android with the Java API, but I get the following error: tensorflow/lite/kernels/gather.cc:80 0 <= axis && axis < NumDimensions(input).
This is detailed in this TensorFlow issue.
You can verify your custom-trained TFLite model using the TFLite Python interpreter. Reference:
import numpy as np
import tensorflow as tf
# Load the TFLite model and allocate tensors.
interpreter = tf.lite.Interpreter(model_path="converted_model.tflite")
interpreter.allocate_tensors()
# Get input and output tensors.
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
# Test the model on random input data.
input_shape = input_details[0]['shape']
input_data = np.array(np.random.random_sample(input_shape), dtype=np.float32)
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
# The function `get_tensor()` returns a copy of the tensor data.
# Use `tensor()` in order to get a pointer to the tensor.
output_data = interpreter.get_tensor(output_details[0]['index'])
print(output_data)
I want to use this TF Hub asset:
https://tfhub.dev/google/imagenet/resnet_v1_50/feature_vector/3
Versions:
Version: 1.15.0-dev20190726
Eager mode: False
Hub version: 0.5.0
GPU is available
Code
feature_extractor_url = "https://tfhub.dev/google/imagenet/resnet_v1_50/feature_vector/3"
feature_extractor_layer = hub.KerasLayer(feature_extractor_url,
                                         input_shape=(HEIGHT, WIDTH, CHANNELS))
I get:
ValueError: Importing a SavedModel with tf.saved_model.load requires a 'tags=' argument if there is more than one MetaGraph. Got 'tags=None', but there are 2 MetaGraphs in the SavedModel with tag sets [[], ['train']]. Pass a 'tags=' argument to load this SavedModel.
I tried:
module = hub.Module("https://tfhub.dev/google/imagenet/resnet_v1_50/feature_vector/3",
                    tags={"train"})
feature_extractor_layer = hub.KerasLayer(module,
                                         input_shape=(HEIGHT, WIDTH, CHANNELS))
But when I try to save the model I get:
tf.keras.experimental.export_saved_model(model, tf_model_path)
# model.save(h5_model_path) # Same error
NotImplementedError: Can only generate a valid config for `hub.KerasLayer(handle, ...)`that uses a string `handle`.
Got `type(handle)`: <class 'tensorflow_hub.module.Module'>
Tutorial here
It's been a while, but assuming you have migrated to TF2, this can easily be accomplished with the most recent model version as follows:
import tensorflow as tf
import tensorflow_hub as hub
num_classes=10 # For example
m = tf.keras.Sequential([
    hub.KerasLayer("https://tfhub.dev/google/imagenet/resnet_v1_50/feature_vector/5", trainable=True),
    tf.keras.layers.Dense(num_classes, activation='softmax')
])
m.build([None, 224, 224, 3]) # Batch input shape.
# train as needed
m.save("/some/output/path")
Please update this question if that doesn't work for you. I believe your issue arose from mixing hub.Module with hub.KerasLayer. The model version you were using was in TF1 Hub format, so within TF1 it is meant to be used exclusively with hub.Module, and not mixed with hub.KerasLayer. Within TF2, hub.KerasLayer can load TF1 Hub format models directly from their URL for composition in larger models, but they cannot be fine-tuned.
Please refer to this compatibility guide for more information
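If you want to stay with the older /3 asset (TF1 Hub format) under TF2, a minimal sketch could look like this; it cannot be fine-tuned, so trainable is left at its default of False, and the 10-class head is just an example:
import tensorflow as tf
import tensorflow_hub as hub
feature_extractor = hub.KerasLayer(
    "https://tfhub.dev/google/imagenet/resnet_v1_50/feature_vector/3",
    input_shape=(224, 224, 3))  # TF1 Hub format: loadable in TF2, but not fine-tunable
model = tf.keras.Sequential([
    feature_extractor,
    tf.keras.layers.Dense(10, activation='softmax')  # example head with 10 classes
])
model.save("/some/output/path")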
You should use tf.keras.models.save_model(model, 'NeuralNetworkModel').
You will get the saved model in a folder that can be loaded later into your sequential network.
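A minimal sketch of that save/reload round trip, where model is the Keras model built above; 'NeuralNetworkModel' is just a folder name, and passing custom_objects on load is a precaution for models that contain hub.KerasLayer:
import tensorflow as tf
import tensorflow_hub as hub
tf.keras.models.save_model(model, 'NeuralNetworkModel')  # writes a SavedModel folder
restored = tf.keras.models.load_model(
    'NeuralNetworkModel', custom_objects={'KerasLayer': hub.KerasLayer})
restored.summary()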
When bringing a Keras model to production, TensorFlow Serving is often used as a REST API. It has some drawbacks, as the image data is expected in the same input format as the network's input layer, e.g. an array with shape (300, 300, 3) in JSON. The only way to make this work seems to be wrapping the TensorFlow Serving API in another service.
How is it possible to make TensorFlow Serving deliver a Keras model that accepts base64-encoded images, without wrapping it in another API?
I found a solution for this and here is a more detailed explanation:
import tensorflow as tf
sess = tf.Session() # get the tensorflow session to reuse it in keras
from keras import backend as K
from keras.models import load_model
K.set_session(sess)
K.set_learning_phase(0) # make sure we disable dropout and other training specific layers
string_inp = tf.placeholder(tf.string, shape=(None,)) #string input for the base64 encoded image
imgs_map = tf.map_fn(
    tf.image.decode_image,
    string_inp,
    dtype=tf.uint8
)  # decode jpeg
imgs_map.set_shape((None, None, None, 3))
imgs = tf.image.resize_images(imgs_map, [300, 300]) # resize images
imgs = tf.reshape(imgs, (-1, 300, 300, 3)) # reshape them
img_float = tf.cast(imgs, dtype=tf.float32) / 255 - 0.5 # and make them to floats
model = load_model('keras.h5', compile=False) # load the model
output = model(img_float) # use the image tensor as input for keras
# ...(save to savedModel format and load in tensorflow serve)
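The export step hinted at in the final comment could look roughly like this with the TF1 SavedModel API; the export directory and signature key names are examples, and TF Serving expects a numeric version subfolder:
tf.saved_model.simple_save(
    sess,
    export_dir='export/my_model/1',        # example path; '1' is the model version for TF Serving
    inputs={'image_bytes': string_inp},    # the tf.string placeholder defined above
    outputs={'predictions': output})       # the Keras model output tensor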