To get the prediction / output of my pre-trained model: the model predicts a symbol for each frame (column) of the convolved image, and the logits (the output of the RNN) have to be post-processed to emit the actual sequence of predicted symbols. Code for model construction can be found here.
logits = graph.get_tensor_by_name("fully_connected/BiasAdd:0")
decoded, _ = tf.nn.ctc_greedy_decoder(logits, seq_len)
prediction = sess.run(decoded,
                      feed_dict={
                          input: image,
                          seq_len: seq_lengths,
                          rnn_keep_prob: 1.0,
                      })
prediction is a SparseTensorValue containing every predicted symbol; decoded is a list holding a single SparseTensor with the non-blank labels. Ultimately, I parse the resulting SparseTensorValue for the strings I need.
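For reference, a minimal sketch of that parsing step (assuming a charset list that maps label indices back to characters, which is not part of the snippet above):

# prediction is the result of sess.run(decoded); its first element is the
# SparseTensorValue produced by ctc_greedy_decoder.
sparse_result = prediction[0]
sequences = [[] for _ in range(sparse_result.dense_shape[0])]
for (row, _col), label in zip(sparse_result.indices, sparse_result.values):
    sequences[row].append(charset[label])   # charset: assumed index-to-character map
strings = ["".join(symbols) for symbols in sequences]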
I want to use this trained model for inference, either through TensorFlow Serving or TFLite, but in order to proceed I would need to indicate the model's output nodes. Given the nature of sparse tensors, I can't point to the decoded output by a single tensor name. Is there a way for me to use this model for proper inference?
I've seen many examples of using CTC decoders such as this one in a similar way for prediction, but I found no examples of serving these models for inference without relying closely on the TensorFlow API, so I am unsure how to proceed.
You can save your model in the TensorFlow saved_model format. After that you can use the saved_model_cli tool that ships with TensorFlow to inspect all model signatures with: saved_model_cli show --dir . --all. With it you will see all information about the input and output shapes. The default signature is called serving_default.
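Because the decoded result is sparse, one workaround (a sketch only, with the export directory and output names being assumptions, not tested against your exact graph) is to convert it to a dense tensor first so the SavedModel exposes a single named output:

# Convert the sparse CTC output to a dense tensor so it can be exported as a
# named output node (use tf.sparse_tensor_to_dense on older TF 1.x versions).
dense_decoded = tf.sparse.to_dense(decoded[0], default_value=-1)

tf.saved_model.simple_save(
    sess,
    export_dir="export/1",
    inputs={"image": input, "seq_len": seq_len},
    outputs={"decoded": dense_decoded},
)
# Afterwards: saved_model_cli show --dir export/1 --all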
I'm fairly new to deep learning models, how to design them, and how to understand them. I'm trying to understand my model using the SHAP library, but I'm having some issues following this tutorial.
My model's forward function looks like this, where the input is a list of tensors.
def forward(self, batch, runValidation=False):
    [
        visual_features, token_ids, token_masks, labels, xpath_ids,
        class_ids, attr_token_masks
    ] = batch
    ...
    return loss, class_loss, pr_result, output, visual_features
But based on the tutorial and error messages I'm getting, it seems that SHAP expects the model inputs and outputs to be tensors. Namely, background and x_test_each_class seem to be tensors.
# select background for shap
background = x_train[np.random.choice(x_train.shape[0], 1000, replace=False)]
# DeepExplainer to explain predictions of the model
explainer = shap.DeepExplainer(model, background)
# compute shap values
shap_values = explainer.shap_values(x_test_each_class)
My questions are:
Is this proper model design to have the input/output be a list of tensors?
If yes, then is it still possible for me to use SHAP on this model?
I ended up concatenating my list of inputs into one long tensor, then writing a wrapper around my forward function to parse those inputs back into their original format before feeding them into the forward function.
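A minimal sketch of that kind of wrapper (assuming the inputs were concatenated along the feature dimension, that split_sizes records the original widths, and that one of the outputs is picked for SHAP; all of these are assumptions for illustration):

import torch
import torch.nn as nn

class ShapWrapper(nn.Module):
    """Accepts one concatenated tensor, rebuilds the original input list,
    and returns a single tensor output for SHAP."""
    def __init__(self, model, split_sizes):
        super().__init__()
        self.model = model
        self.split_sizes = split_sizes   # widths of the original inputs (assumed)

    def forward(self, flat_batch):
        parts = torch.split(flat_batch, self.split_sizes, dim=1)
        outputs = self.model(list(parts))
        return outputs[3]                # e.g. the prediction tensor (index assumed)

# explainer = shap.DeepExplainer(ShapWrapper(model, split_sizes), background)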
I am prototyping a deep learning segmentation model that needs six channels of input (two aligned 448x448 RGB images under different lighting conditions). I wish to compare the performance of several pretrained models to that of my current model, which I trained from scratch. Can I use the pretrained models in tf.keras.applications for input images with more than 3 channels?
I tried applying a convolution first to reduce the channel dimension to 3 and then passing that output to tf.keras.applications.DenseNet121(), but received the following error:
import tensorflow as tf
dense_input = tf.keras.layers.Input(shape=(448, 448, 6))
dense_filter = tf.keras.layers.Conv2D(3, 3, padding='same')(dense_input)
dense_stem = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet', input_tensor=dense_filter)
*** ValueError: You are trying to load a weight file containing 241 layers into a model with 242 layers.
Is there a better way to use pretrained models on data with a different number of input channels in keras? Will pretraining even help when the number of input channels is different?
Technically, it should be possible, for instance by calling the pretrained model itself (its __call__) on the reduced tensor:
orig_model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet')
dense_input = tf.keras.layers.Input(shape=(448, 448, 6))
dense_filter = tf.keras.layers.Conv2D(3, 3, padding='same')(dense_input)
output = orig_model(dense_filter)
model = tf.keras.Model(dense_input, output)
model.compile(...)
model.summary()
On a conceptual level, though, I'd be worried that the new input doesn't look much like the original input that the pretrained model was trained on.
Cross-modality pre-training may be the method you need. Proposed by Wang et al. (2016), this method averages the weights of the pre-trained model's first layer across its input channels and replicates the mean for the number of target channels. Their experiments indicate that the network performs better with this kind of pre-training even when it has 20 input channels and its input modality is not RGB.
To apply this, one can refer to another answer that uses layer.get_weights() and layer.set_weights() to manually set the weights of the first layer of the pre-trained model.
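A rough sketch of that idea in Keras (not the exact code from the linked answer; it assumes the 6-channel model is otherwise identical to DenseNet121):

import numpy as np
import tensorflow as tf

rgb_model = tf.keras.applications.DenseNet121(include_top=False, weights='imagenet')
six_ch_model = tf.keras.applications.DenseNet121(include_top=False, weights=None,
                                                 input_shape=(448, 448, 6))

for rgb_layer, new_layer in zip(rgb_model.layers, six_ch_model.layers):
    weights = rgb_layer.get_weights()
    if not weights:
        continue                      # layers without weights (pooling, activation, ...)
    kernel = weights[0]
    if kernel.ndim == 4 and kernel.shape[2] == 3:
        # Cross-modality pre-training: average over the RGB axis and
        # replicate the mean across the 6 target input channels.
        mean_kernel = kernel.mean(axis=2, keepdims=True)
        weights[0] = np.tile(mean_kernel, (1, 1, 6, 1))
    new_layer.set_weights(weights)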
As a complementary approach to adding a convolutional layer before a pre-trained architecture, e.g. any of the pre-trained models available in tf.keras.applications that were trained with RGB inputs, you could consider manipulating the existing weights so that they match your model with 6-channel inputs. For example, if your architecture stays the same apart from the added input modalities, you can repeat the green-channel weights for the three newly added input channels: see here.
"Is there a better way to use pretrained models on data with a different number of input channels in keras? Will pretraining even help when the number of input channels is different?"
Both of the aforementioned, commonly used techniques
adding convolution layer(s) before the pre-trained architecture to convert the modalities
repeating the pre-trained channels to match the newly added modalities
enable transfer learning, which is virtually always a better choice than starting the training from scratch. However, do not expect either of the options to work without some retraining. In my opinion/experience, the latter is better. The reason is that the randomly initialized Conv layers in the former approach would (at least initially) produce radically different inputs than what the rest of the architecture has "got used to seeing". This was already reasoned in the earlier answer by #Kris. The latter technique takes advantage of the fact that many of the relevant features are fairly similar across input modalities: a dog might still look like a dog even in a newly added input modality (e.g. RGB vs. thermal imaging).
I trained a CNN to perform a classification task. I used VGG16 with the categorical_crossentropy loss, the sgd optimizer, and a softmax activation.
I then saved the trained model to an h5 file.
I know that predict is the function that returns the class-probability vector for a new, unseen image.
But what I want to know is whether the final class probabilities assigned to the training images are saved with the model.
If yes, how can I extract this information from the saved model without needing to use the predict function?
HDF5 (.h5, .hdf5) is a file format suitable for storing large collections of multidimensional numeric arrays (e.g. models, data files).
Yes, you could save the "final class probabilities assigned to the training images" during training into an .h5 file yourself.
But since you only "saved the trained model to an h5 file", this file contains only the network model (its architecture, its weights, and the optimizer configuration if you defined one); more information can be found here, section "As an HDF5 file".
From here your easiest option is the opposite of what you asked for: load the saved model and recompute those probabilities for the training images with the predict function; they cannot be extracted from the file without it.
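A minimal sketch of that (model.h5 and x_train are placeholders for your own file and training images):

import h5py
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model("model.h5")   # only architecture/weights/optimizer live in the file
train_probs = model.predict(x_train)             # recompute the softmax probabilities

# Optionally store them in their own HDF5 file for later reuse.
with h5py.File("train_probabilities.h5", "w") as f:
    f.create_dataset("probabilities", data=np.asarray(train_probs))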
I have used Keras to fine-tune MobileNet v1. Now I have model.h5 and I need to convert it to TensorFlow Lite to use it in an Android app.
I use the TFLite conversion script tflite_convert. I can convert it without quantization, but I need more performance, so I need quantization.
If I run this script:
tflite_convert --output_file=model_quant.tflite \
  --keras_model_file=model.h5 \
  --inference_type=QUANTIZED_UINT8 \
  --input_arrays=input_1 \
  --output_arrays=predictions/Softmax \
  --mean_values=128 \
  --std_dev_values=127 \
  --input_shape="1,224,224,3"
It fails:
F tensorflow/contrib/lite/toco/tooling_util.cc:1634] Array
conv1_relu/Relu6, which is an input to the DepthwiseConv operator
producing the output array conv_dw_1_relu/Relu6, is lacking min/max
data, which is necessary for quantization. If accuracy matters, either
target a non-quantized output format, or run quantized training with
your model from a floating point checkpoint to change the input graph
to contain min/max information. If you don't care about accuracy, you
can pass --default_ranges_min= and --default_ranges_max= for easy
experimentation.
Aborted (core dumped)
If I use default_ranges_min and default_ranges_max (so-called "dummy quantization"), the conversion works, but as described in the error log it is only for debugging performance and disregards accuracy.
So what do I need to do to make my Keras model correctly quantizable? Do I need to find the best default_ranges_min and default_ranges_max? How? Or does it require changes in the Keras training phase?
Library versions:
Python 3.6.4
TensorFlow 1.12.0
Keras 2.2.4
Unfortunately, TensorFlow does not yet provide tooling for post-training per-layer quantization of flatbuffer (tflite) models, only of protobuf ones. The only available way right now is to introduce fakeQuant layers in your graph and re-train / fine-tune your model on the training set or a calibration set. This is called "quantization-aware training".
Once the fakeQuant layers are introduced, you can feed in the training set and TF will use them in the forward pass as simulated quantization layers (fp32 values that represent 8-bit values) while back-propagating with full-precision values. This way you can recover the accuracy loss caused by quantization.
In addition, the fakeQuant layers capture the per-layer or per-channel ranges through a moving average and store them in min/max variables.
Later, you can extract the graph definition and get rid of the training-only fakeQuant machinery with the freeze_graph tool.
Finally, the model can be fed into tflite_convert (fingers crossed it won't break) to produce a uint8 tflite with the captured ranges.
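For TF 1.x and Keras, a rough, untested sketch of that workflow could look as follows; build_model() is a placeholder for your own MobileNet construction code, the checkpoint and file paths are made up, and the in-process convert_variables_to_constants is used instead of the freeze_graph CLI:

import tensorflow as tf

# --- training graph with fakeQuant nodes ---
tf.keras.backend.set_learning_phase(1)
train_model = build_model()                       # placeholder: your Keras MobileNet definition
sess = tf.keras.backend.get_session()
tf.contrib.quantize.create_training_graph(input_graph=sess.graph, quant_delay=0)
sess.run(tf.global_variables_initializer())
train_model.load_weights("model.h5")              # restore the already fine-tuned weights
# ... fine-tune here so the fakeQuant nodes collect min/max ranges ...
tf.train.Saver().save(sess, "./qat.ckpt")

# --- inference graph with eval quant nodes, frozen for tflite_convert ---
tf.keras.backend.clear_session()
tf.keras.backend.set_learning_phase(0)
eval_model = build_model()
sess = tf.keras.backend.get_session()
tf.contrib.quantize.create_eval_graph(input_graph=sess.graph)
tf.train.Saver().restore(sess, "./qat.ckpt")
frozen = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph_def, [eval_model.output.op.name])
with open("frozen_qat.pb", "wb") as f:
    f.write(frozen.SerializeToString())
# then: tflite_convert --graph_def_file=frozen_qat.pb --inference_type=QUANTIZED_UINT8 ...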
A very good white paper from Google explaining all of this is available here: https://arxiv.org/pdf/1806.08342.pdf
Hope that helps.
I am trying to convert the Keras OCR example into a CoreML model.
I can already train my slightly modified model, and everything looks good in Python. But now I want to convert the model to CoreML to use it in my iOS app.
The problem is that the CoreML file format doesn't support Lambda layers.
I am not an expert in this field, but as far as I understand, the Lambda layer here is used to calculate the loss using ctc_batch_cost().
The layer is created around line 464.
I guess this is used for greater precision over the "built-in" loss functions.
Is there any way the model creation can be rewritten to fit the layer set CoreML supports?
I have no idea which output layer type to use for the model.
Cost functions usually aren't included in the CoreML model, since CoreML only does inference while cost functions are used for training. So strip out that layer before you export the model and you should be good to go.
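In the Keras OCR example this roughly amounts to rebuilding an inference-only model that ends at the softmax output (i.e. before the CTC Lambda layer) and converting that instead; a sketch, assuming the layer names 'the_input' and 'softmax' from the example, which may differ in a modified model:

import coremltools
from keras.models import Model

# training_model is the full model from the OCR example (with the CTC Lambda);
# the inference model stops at the softmax predictions.
inference_model = Model(inputs=training_model.get_layer('the_input').input,
                        outputs=training_model.get_layer('softmax').output)

coreml_model = coremltools.converters.keras.convert(inference_model)
coreml_model.save('ocr.mlmodel')
# CTC decoding of the softmax output then has to be implemented on the iOS side.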