I am attempting to convert a TF model to TFLite. The model was saved in .pb format and I have converted it with the following code:
import os
import tensorflow as tf
from tensorflow.core.protobuf import meta_graph_pb2
export_dir = os.path.join('export_dir', '0')
if not os.path.exists('export_dir'):
os.mkdir('export_dir')
tf.compat.v1.enable_control_flow_v2()
tf.compat.v1.enable_v2_tensorshape()
# I took this function from a tutorial on the TF website
def wrap_frozen_graph(graph_def, inputs, outputs):
def _imports_graph_def():
tf.compat.v1.import_graph_def(graph_def, name="")
wrapped_import = tf.compat.v1.wrap_function(_imports_graph_def, [])
import_graph = wrapped_import.graph
return wrapped_import.prune(
inputs, outputs)
graph_def = tf.compat.v1.GraphDef()
loaded = graph_def.ParseFromString(open(os.path.join(export_dir, 'saved_model.pb'),'rb').read())
concrete_func = wrap_frozen_graph(
graph_def, inputs=['extern_data/placeholders/data/data:0', 'extern_data/placeholders/data/data_dim0_size:0'],
outputs=['output/output_batch_major:0'])
concrete_func.inputs[0].set_shape([None, 50])
concrete_func.inputs[1].set_shape([None])
concrete_func.outputs[0].set_shape([None, 100])
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_func])
converter.experimental_new_converter = True
converter.post_training_quantize=True
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS,
tf.lite.OpsSet.SELECT_TF_OPS]
converter.allow_custom_ops=True
tflite_model = converter.convert()
# Save the model.
if not os.path.exists('tflite'):
os.mkdir('tflite')
output_model = os.path.join('tflite', 'model.tflite')
with open(output_model, 'wb') as f:
f.write(tflite_model)
However, when I try to use the intepretere with this model I get the following error:
INFO: TfLiteFlexDelegate delegate: 8 nodes delegated out of 970 nodes with 3 partitions.
INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 4 nodes with 0 partitions.
INFO: TfLiteFlexDelegate delegate: 3 nodes delegated out of 946 nodes with 1 partitions.
INFO: TfLiteFlexDelegate delegate: 0 nodes delegated out of 1 nodes with 0 partitions.
INFO: TfLiteFlexDelegate delegate: 3 nodes delegated out of 16 nodes with 2 partitions.
Traceback (most recent call last):
File "/path/to/tflite_interpreter.py", line 9, in <module>
interpreter.allocate_tensors()
File "/path/to/lib/python3.6/site-packages/tensorflow/lite/python/interpreter.py", line 243, in allocate_tensors
return self._interpreter.AllocateTensors()
RuntimeError: Encountered unresolved custom op: VarHandleOp.Node number 0 (VarHandleOp) failed to prepare.
Now, I don't find any VarHandleOp in the code and I found out that it is actually in tensorflow (https://www.tensorflow.org/api_docs/python/tf/raw_ops/VarHandleOp).
So, why isn't TFLite able to recognize it?
It's certainly hard to provide a minimal reproducible example in the case of model conversion, as the SO guidelines recommend, but the questions would benefit from better pointers. For example, instead of saying “I took this function from a tutorial on the TF website”, it is a much better idea to provide a link to the tutorial. The TF website is vastly huge.
The tutorial that you are referring to is probably from the section on migrating from TF1 to TF2, specifically the part of handling the raw graph files. The crucially important note is
if you have a "Frozen graph" (a tf.Graph where the variables have been turned into constants)
(the bold highlight is mine). Apparently, your graph contains VarHandleOp (the same applies to the Variable and VariableV2 nodes), and is not “frozen” by this definition. Your general approach makes sense, but you need a graph that contains actual trained values for the variables in the form of the Const node. You need variables at the training time, but for inference time, and should be baked into the graph. TFLite, as an inference-time framework, does not support variables.
The rest of your idea seems fine. TFLiteConverter.from_concrete_functions currently takes exactly one concrete_function, but this is what you get from wrapping the graph. With enough luck it may work.
There is a utility tensorflow/python/tools/freeze_graph.py that attempts its best to replace variables in a Graph.pb with constants taken from the latest checkpoint file. If you look at its code, either using the saved metagraph (checkpoint_name.meta) file or pointing the tool to the training directory eliminates a lot of guesswork; also, I think that providing the model directory is the only way to get a single frozen graph a sharded model.
I noticed that you use just input in place of tf.nest.map_structure(import_graph.as_graph_element, inputs) in the example. You may have other reasons for that, but if you do it because as_graph_element complains about datatype/shape, this is likely to be resolved by freezing the graph properly. The concrete_function that you obtain from the frozen graph will have a good idea about its input shapes and datatypes. Generally, it's unexpected to need to manually set them, and the fact that you do seems odd to me (but I do not claim a broad experience with this dark corner of TF).
map_structure has a keyword argument to skip the check.
Related
These days I am trying to track down an error concerning the deployment of a TF model with TPU support.
I can get a model without TPU support running, but as soon as I enable quantization, I get lost.
I am in the following situation:
Created a model and trained it
Created an eval graph of the model
Froze the model and saved the result as protocol buffer
Successfully converted and deployed it without TPU support
For the last point, I used the TFLiteConverter's Python API. The script that produces a functional tflite model is
import tensorflow as tf
graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']
converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.FLOAT
input_arrays = converter.get_input_arrays()
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tells me that my approach seems to be ok up to this point. Now, if I want to utilize the Coral TPU stick, I have to quantize my model (I took that into account during training). All I have to do is to modify my converter script. I figured that I have to change it to
import tensorflow as tf
graph_def_file = 'frozen_model.pb'
inputs = ['dense_input']
outputs = ['dense/BiasAdd']
converter = tf.lite.TFLiteConverter.from_frozen_graph(graph_def_file, inputs, outputs)
converter.inference_type = tf.lite.constants.QUANTIZED_UINT8 ## Indicates TPU compatibility
input_arrays = converter.get_input_arrays()
converter.quantized_input_stats = {input_arrays[0]: (0., 1.)} ## mean, std_dev
converter.default_ranges_stats = (-128, 127) ## min, max values for quantization (?)
converter.allow_custom_ops = True ## not sure if this is needed
## REMOVED THE OPTIMIZATIONS ALTOGETHER TO MAKE IT WORK
tflite_model = converter.convert()
open('model.tflite', 'wb').write(tflite_model)
This tflite model produces results when loaded with the Python API of the interpreter, but I am not able to understand their meaning. Also, there is no (or if there is, it is hidden well) documentation on how to choose mean, std_dev and the min/max ranges. Also, after compiling this with the edgetpu_compiler and deploying it (loading it with the C++ API), I receive an error:
INFO: Initialized TensorFlow Lite runtime.
ERROR: Failed to prepare for TPU. generic::failed_precondition: Custom op already assigned to a different TPU.
ERROR: Node number 0 (edgetpu-custom-op) failed to prepare.
Segmentation fault
I suppose I missed a flag or something during the conversion process. But as the documentation is also lacking here, I can't say for sure.
In short:
What do the params mean, std_dev, min/max do and how do they interact?
What am I doing wrong during the conversion?
I am grateful for any help or guidance!
EDIT: I have opened a github issue with the full test code. Feel free to play around with this.
You should never need to manually set the quantization stats.
Have you tried the post-training-quantization tutorials?
https://www.tensorflow.org/lite/performance/post_training_integer_quant
Basically they set the quantization options:
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
Then they pass a "representative dataset" to the converter, so that the converter can run the model a few batches to gather the necessary statistics:
def representative_data_gen():
for input_value in mnist_ds.take(100):
yield [input_value]
converter.representative_dataset = representative_data_gen
While there are options for quantized training, it's always easier to to do post-training quantization.
I'm currently using fast.ai to train an image classifier model.
data = ImageDataBunch.single_from_classes(path, classes, ds_tfms=get_transforms(), size=224).normalize(imagenet_stats)
learner = cnn_learner(data, models.resnet34)
learner.model.load_state_dict(
torch.load('stage-2.pth', map_location="cpu")
)
which results in :
torch.load('stage-2.pth', map_location="cpu") File
"/usr/local/lib/python3.6/site-packages/torch/nn/modules/module.py",
line 769, in load_state_dict
self.class.name, "\n\t".join(error_msgs))) RuntimeError: Error(s) in loading state_dict for Sequential:
...
Unexpected key(s) in state_dict: "model", "opt".
I have looked around in SO and tried to use the following solution:
# original saved file with DataParallel
state_dict = torch.load('stage-2.pth', map_location="cpu")
# create new OrderedDict that does not contain `module.`
from collections import OrderedDict
new_state_dict = OrderedDict()
for k, v in state_dict.items():
name = k[7:] # remove `module.`
new_state_dict[name] = v
# load params
learner.model.load_state_dict(new_state_dict)
which results in :
RuntimeError: Error(s) in loading state_dict for Sequential:
Unexpected key(s) in state_dict: "".
I'm using Google Colab to train my model and then port the trained model into docker and try to host in in a local server.
What could be the issue? Could it be the different version of pytorch which results in model mismatch?
In my docker config:
# Install pytorch and fastai
RUN pip install torch_nightly -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html
RUN pip install fastai
While my Colab is using the following:
!curl -s https://course.fast.ai/setup/colab | bash
My strong guess is that stage-2.pth contains two top-level items: the model itself (its weights) and the final state of the optimizer which was used to train it. To load just the model, you need only the former. Assuming things were done in the idiomatic PyTorch way, I would try
learner.model.load_state_dict(
torch.load('stage-2.pth', map_location="cpu")['model']
)
Update: after applying my first round of advice it becomes clear that you're loading a savepoint create with a different (perhaps differently configured?) model than the one you're loading it into. As you can see in the pastebin, the savepoint contains weights for some extra layers, not present in your model, such as bn3, downsample, etc.
"0.4.0.bn3.running_var", "0.4.0.bn3.num_batches_tracked", "0.4.0.downsample.0.weight"
at the same time some other key names match, but the tensors are of different shapes.
size mismatch for 0.5.0.downsample.0.weight: copying a param with shape torch.Size([512, 256, 1, 1]) from checkpoint, the shape in current model is torch.Size([128, 64, 1, 1]).
I see a pattern that you consistently try to load a parameter of shape [2^(x+1), 2^x, 1, 1] in place of [2^(x), 2^(x-1), 1, 1]. Perhaps you're trying to load a model of different depth (ex. loading vgg-16 weights for vgg-11?). Either way, you need to figure out the exact architecture used to create your savepoint and then recreate it before loading the savepoint.
PS. In case you weren't sure - savepoints contain model weights, along with their shapes and (autogenerated) names. They do not contain the full specification of the architecture itself - you need to assure yourself, that you're calling model.load_state_dict with model being of exactly the same architecture as was used to create the savepoint. Otherwise you will likely have weight names mismatching.
In some deep learning workflows, it is useful to train a model, extract it out of its graph using tf.graph_util.convert_variables_to_constants or tf.graph_util.extract_sub_graph so training-related tensors are left out, and then connect the extracted subgraph to other model(s) via tf.import_graph_def. In this way, the trained model can serve as a building block in a larger setup.
Often, we'd like to backpropagate through the new, composite model, in order to fine-tune it, optimize the inputs and so on.
However, it appears that one cannot define a gradient through a while_loop tensorflow operation in an imported graph, since it relies on 'outer context', an object added into the metagraph's collections (see TF issue #7404). Slightly adapting the example in this Github issue, here's an example of what I am trying to do:
import tensorflow as tf
g1=tf.Graph()
sess1=tf.Session(graph=g1)
with g1.as_default():
with sess1.as_default():
i=tf.constant(0, name="input")
out=tf.while_loop(lambda i: tf.less(i,5), lambda i: [tf.add(i,1)], [i], name="output")
loss=tf.square(out,name='loss')
graph_def = tf.graph_util.convert_variables_to_constants(sess1,g1.as_graph_def(),['output/Exit'])
g2 = tf.Graph()
with g2.as_default():
tf.import_graph_def(graph_def,name='')
i_imported = g2.get_tensor_by_name("input:0")
out_imported = g2.get_tensor_by_name("output/Exit:0")
tf.gradients(out_imported, i_imported)
The last line raises an AttributeError: 'NoneType' object has no attribute 'outer_context' error.
Tensorflow's solution to this issue is to use tf.train.export_meta_graph and tf.train.import_meta_graph so the outer context is copied, but this copies the entire graph, without editting. In this minimal case, the 'loss' tensor won't be removed.
I tried copying the missing context to the new graph:
g2.add_to_collection('while_context',g1.get_collection('while_context'))
But it doesn't solve the issue.
Is there a way to overcome this limitation or is it an irreparable Tensorflow design flaw?
i am running a model for my own data set(the project was implemented for training/testing with ImageNet) with 2 classes. I have made all the changes (in config files etc) but after training finishes(successfully), i get the following error when starting testing:
wrote gt roidb to ./data/cache/ImageNetVID_DET_val_gt_roidb.pkl
Traceback (most recent call last):
File "experiments/dff_rfcn/dff_rfcn_end2end_train_test.py", line 20, in <module>
test.main()
File "experiments/dff_rfcn/../../dff_rfcn/test.py", line 53, in main
args.vis, args.ignore_cache, args.shuffle, config.TEST.HAS_RPN, config.dataset.proposal, args.thresh, logger=logger, output_path=final_output_path)
File "experiments/dff_rfcn/../../dff_rfcn/function/test_rcnn.py", line 68, in test_rcnn
roidbs_seg_lens[gpu_id] += x['frame_seg_len']
KeyError: 'frame_seg_len'
I cleaned the cache file before running. As i have read in previous topics, this might be an issue of previous datasets .pkl files in cache. What may have caused this error? I also want to mention that i changed .txt filenames that feed the neural network(if this is important), and that training finishes well.
This is my first time running a project in Deep Learning so please show some understanding.
MXNet typically uses methods other than pickle directly for serialization of the model architecture and trained weights.
With Gluon API, you can save weights of the model to a file (i.e. Block) with .save_params() and then load the weights from a file with .load_params(). You 'save' the model architecture by keeping the code used to define the model. See and example of this here.
With Module API, you can create checkpoints at the end of each epoch which will save the symbol (i.e. model architecture) and the parameters (i.e. model weights). See here.
checkpoint = mx.callback.do_checkpoint(model_prefix)
mod = mx.mod.Module(symbol=net)
mod.fit(train_iter, num_epoch=5, epoch_end_callback=checkpoint)
You can then load the model of a given checkpoint (e.g. 42 in this example)
sym, arg_params, aux_params = mx.model.load_checkpoint(model_prefix, 42)
mod.set_params(arg_params, aux_params)
I wrote a script to do the classification of a single input image using a model I trained with MxNet. To classify the incoming image I feedforward them in through network.
In short here is what I am doing:
symbol, arg_params, aux_params = mx.model.load_checkpoint('model-prefix', 42)
model = mx.mod.Module(symbol=symbol, context=mx.cpu())
model.bind(data_shapes=[('data', (1, 3, 224, 244))], for_training=False)
model.set_params(arg_params, aux_params)
# ... loading the image & resizing ...
# img is the image to classify as numpy array of shape (3, 244, 244)
Batch = namedtuple('Batch', ['data'])
self._model.forward(Batch(data=[mx.nd.array(img)]))
probabilities = self._model.get_outputs()[0].asnumpy()
print(str(probabilities))
This works fine, except that I am getting the following warning
UserWarning: Data provided by label_shapes don't match names specified by label_names ([] vs. ['softmax_label'])
What should I change to avoid getting this warning? It is not clear to me what the label_shapes and label_names parameters are meant for, and what I am expect to fill them with.
Note: I found some thread about them, but none enabled me to solve the problem. Similarly the MxNet documentation doesn't provide much details on what those parameters are and on how they are supposed to be filled.
Set label_names=None and allow_missing=True. That should get rid of the warning.
model = mx.mod.Module(symbol=symbol, context=mx.cpu(), label_names=None)
...
model.set_params(arg_params, aux_params, allow_missing=True)
If you are curious why the warning is printed in the first place,
Every module has associated label. When this model was trained, softmax_label was used as the label (most likely because the output layer was a softmax layer named 'softmax'). When the model was loaded from file, the module that was created had softmax_label as the module's label.
>>>print(model.label_names)
['softmax_label']
model.bind is then called without providing label_shapes.
model.bind(data_shapes=[('data', (1, 3, 224, 244))], for_training=False)
MXNet sees that the module has a label in it which was not provided during bind and complains about it - which is the warning message you see.
I think if bind is called with for_training=False, MXNet shouldn't complain about the missing label. I've created this issue: https://github.com/dmlc/mxnet/issues/6958
However, for this particular case where we load a model from disk, we can load it with None as the label so that MXNet doesn't later complain when bind doesn't provide label - which is what the suggested fix does.