I am trying to deploy a TensorFlow Keras model using Amazon SageMaker. The process finishes successfully, yet I get different prediction results when predicting directly with Keras and when calling the SageMaker endpoint.
I used these steps in deploying the model to SageMaker.
Check the following example.
import numpy as np

data = np.random.randn(1, 150, 150, 3)
# predict using amazon sagemaker
sagemaker_predict = uncompiled_predictor.predict(data)
print(sagemaker_predict)
# predict the same using keras
val = model.predict(data)
print(val)
>>{'predictions': [[0.491645753]]}
[[0.]]
Is this supposed to happen? To my knowledge the results should be the same. It seems that either the data gets corrupted or the SageMaker weights get reinitialized. Any ideas?
This is not supposed to happen.
See what you get if you deploy the model directly to TensorFlow Serving (which is what the SageMaker inference container wraps).
To experiment faster you can work with the SageMaker inference container in local mode, so you can start/stop an endpoint in seconds.
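A minimal sketch of local mode, assuming the SageMaker Python SDK v2 and a model.tar.gz you have already packaged; the paths, role and framework version below are placeholders:
from sagemaker.tensorflow import TensorFlowModel

# Placeholders: point model_data at your packaged SavedModel and use your own role ARN
model = TensorFlowModel(
    model_data="file://model.tar.gz",
    role="<your-sagemaker-execution-role-arn>",
    framework_version="2.3",
)
# instance_type="local" starts the TensorFlow Serving container on your own machine
predictor = model.deploy(initial_instance_count=1, instance_type="local")
print(predictor.predict(data))  # reuse the same `data` array as in the example above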
Finally found a solution. The problem seems to be with the .h5 (HDF5) weights file; for some reason SageMaker does not appear to extract the weights from .h5 correctly. I therefore changed the weights file to the TensorFlow SavedModel format.
As per the TensorFlow Keras save and serialize documentation:
There are two formats you can use to save an entire model to disk: the TensorFlow SavedModel format, and the older Keras H5 format. The recommended format is SavedModel. It is the default when you use model.save().
You can switch to the H5 format by:
Passing save_format='h5' to save().
Passing a filename that ends in .h5 or .keras to save()
So instead of saving the model as
model.save("my_model.h5")
save it as
model.save("my_model")
and load it back with
keras.models.load_model("my_model")
This saves the model in the TensorFlow SavedModel format, which you can then load and deploy to SageMaker by following the documentation above.
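For reference, a minimal packaging sketch; the numbered version sub-directory inside model.tar.gz is an assumption based on the TensorFlow Serving container convention, so follow whatever layout the documentation above prescribes:
import tarfile

model.save("export/1")  # TensorFlow SavedModel under a numbered version directory
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("export/1", arcname="1")  # upload this archive to S3 for SageMaker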
I am trying to train a TensorFlow object detection model on a custom dataset on Google Colab. I have a saved model trained for 5000 steps; is it possible to use the saved model to resume training? I am planning to train for another 20000 steps. Since the training will take around 36 hours on Colab, I'm planning to use checkpoints. How do I store the best model checkpoints and use them when the session runs out?
For resuming training using weights from a saved checkpoint, in your pipeline.config file, change the line containing fine_tune_checkpoint from <path_to_ckpt>/model.ckpt to <path_to_ckpt>/model.ckpt-XXXX where XXXX is your checkpoint number.
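If you prefer to make that change programmatically, the TF Object Detection API ships a config_util helper; a sketch, with the paths and checkpoint number as placeholders:
from object_detection.utils import config_util

# Read the existing pipeline.config, point fine_tune_checkpoint at the
# numbered checkpoint, and write the updated config back out
configs = config_util.get_configs_from_pipeline_file("pipeline.config")
configs["train_config"].fine_tune_checkpoint = "<path_to_ckpt>/model.ckpt-5000"
pipeline_proto = config_util.create_pipeline_proto_from_configs(configs)
config_util.save_pipeline_config(pipeline_proto, "path/to/output_dir")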
As far as saving only the best weights is concerned, you can refer to this post and/or this GitHub link.
I trained a neural network on Google Colab.
I saved the neural network using joblib.dump().
I then loaded the model on my PC using joblib.load().
I made a prediction on the exact same sample, using the same model, on both Colab and my PC. On Colab, it has an output of [[0.51]]. On my PC, it has an output of [[nan]].
The model summary reports that the architecture of the model is the same.
I checked the weights of the model I loaded on my PC, and the model on colab, and the weights are the exact same.
Any ideas as to what I can do? Thank you.
Quick update: even if I change all of my inputs to zero, the prediction is still nan.
As far as I know, Keras has its own functions for saving models, such as model.save('file.h5'), while the joblib library is meant for saving scikit-learn models.
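A minimal sketch of the Keras-native round trip, assuming a tf.keras model object named model; the file name and the sample input are placeholders:
from tensorflow import keras

# Save the full model (architecture + weights + optimizer state) on Colab
model.save("my_model.h5")

# ...and load it back on your PC
restored = keras.models.load_model("my_model.h5")
# `sample` stands for the same input array you used on Colab
print(restored.predict(sample))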
I have written a TensorFlow / Keras Super-Resolution GAN. I've converted the resulting trained .h5 model to a .tflite model, using the below code, executed in Google Colab:
import tensorflow as tf
model = tf.keras.models.load_model('/content/drive/My Drive/srgan/output/srgan.h5')
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.post_training_quantize=True
tflite_model = converter.convert()
open("/content/drive/My Drive/srgan/output/converted_model_quantized.tflite", "wb").write(tflite_model)
As you can see I use converter.post_training_quantize=True, which is supposed to produce a lighter .tflite model than my original .h5 model, which is 159MB. However, the resulting .tflite model is still 159MB.
It's so big that I can't upload it to Google Firebase Machine Learning Kit's servers in the Google Firebase Console.
How could I either:
decrease the size of the current .tflite model which is 159MB (for example using a tool),
or after having deleted the current .tflite model which is 159MB, convert the .h5 model to a lighter .tflite model (for example using a tool)?
Related questions
How to decrease size of .tflite which I converted from keras: no answer, but a comment suggesting converter.post_training_quantize=True. However, as explained above, this solution doesn't seem to work in my case.
In general, quantization means shifting from dtype float32 to uint8, so theoretically the model should shrink to about a quarter of its size. This is clearly visible for larger files.
Check whether your model has actually been quantized by using the tool https://lutzroeder.github.io/netron/. Load the model there and inspect a few layers that have weights: in a quantized graph the weight values are stored in uint8 format,
while in an unquantized graph they are stored in float32 format.
Setting converter.post_training_quantize=True alone is not enough to quantize your model. The other settings include:
converter.inference_type=tf.uint8
converter.default_ranges_stats=[min_value,max_value]
converter.quantized_input_stats={"name_of_the_input_layer_for_your_model":[mean,std]}
This assumes you are dealing with images:
min_value=0, max_value=255, mean=128 (subjective) and std=128 (subjective). name_of_the_input_layer_for_your_model is the first name shown when you load your model in the tool mentioned above, or you can get it in code: model.input prints something like tf.Tensor 'input_1:0' shape=(?, 224, 224, 3) dtype=float32, where input_1 is the name of the input layer. (NOTE: the saved model must include the graph configuration as well as the weights.)
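Putting these settings together, a minimal sketch assuming the TF1-style converter (tf.compat.v1.lite.TFLiteConverter, which exposes these attributes) and an input layer actually named input_1; adjust the values to your own model:
import tensorflow as tf

converter = tf.compat.v1.lite.TFLiteConverter.from_keras_model_file(
    "/content/drive/My Drive/srgan/output/srgan.h5")
converter.post_training_quantize = True
converter.inference_type = tf.uint8
converter.default_ranges_stats = [0, 255]                  # min/max for image data
converter.quantized_input_stats = {"input_1": [128, 128]}  # {input layer name: [mean, std]}
tflite_model = converter.convert()
with open("converted_model_quantized.tflite", "wb") as f:
    f.write(tflite_model)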
I am using the Matterport repository to train Mask R-CNN on a custom dataset, and training has been successful. Now I want to save the trained model and use it in a web application to detect objects. How do I save the Mask R-CNN model after training? Please guide me.
The link of the repository:
https://github.com/matterport/Mask_RCNN
Based on this discussion on GitHub, it appears that the trained model or weights of matterport/Mask_RCNN can be saved as a JSON file in a manner similar to models trained via standard Keras:
import keras
import json

def save_model(trained_model, out_fname="model.json"):
    # Serialize the underlying Keras model's architecture to JSON
    jsonObj = trained_model.keras_model.to_json()
    with open(out_fname, "w") as fh:
        fh.write(jsonObj)

save_model(model, "mymodel.json")
Update: If you run into the error related to thread-like object, you might find this post helpful...
In the Inspect_model.ipynb notebook, under the "Load Model" topic, you can save the model after it is loaded in inference mode.
Training also generates a sub-folder inside Mask_RCNN/logs where the weights are written.
I am not sure we really need to save the whole model again, since normally when we use the Matterport repo we just train new weights on the existing architecture and don't change the architecture itself. When we used this for a pet project, after training we defined a new model as a Mask R-CNN object (from mrcnn.model import MaskRCNN) with the mode parameter set to inference, and then loaded the newly trained weights with model.load_weights('<logpath/trainedweights.h5>', by_name=True).
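A minimal sketch of that inference-mode setup, assuming the matterport mrcnn package; the config values and paths are placeholders:
from mrcnn.config import Config
from mrcnn.model import MaskRCNN

class InferenceConfig(Config):
    NAME = "custom"
    NUM_CLASSES = 1 + 1   # background + number of your classes (placeholder)
    GPU_COUNT = 1
    IMAGES_PER_GPU = 1

model = MaskRCNN(mode="inference", config=InferenceConfig(), model_dir="logs")
model.load_weights("<logpath/trainedweights.h5>", by_name=True)
# results = model.detect([image], verbose=0)  # image: an RGB numpy array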
I'm looking to run a basic fully-connected neural network for the MNIST dataset with the C++ API v1.2 of TensorFlow. I have trained the model and exported it using tf.train.Saver() in Python. This gave me a checkpoint file, a data file, an index file and a meta file.
I know from using TensorBoard on a previous project that the data file contains the saved variables while the meta file contains the graph.
However, I am not sure what the recommended way is to load those files and run the trained model in a C++ environment in v1.2, since all the tutorials and questions I've found are for older versions which differ substantially.
I've found that tensorflow::ops::Restore should be the method to do such a thing, but I know that inference in TensorFlow isn't well supported, so I am not certain what parameters I should give it in order to get back the trained model that I can just put into a session->Run() and receive an accuracy statement when fed test data.