I tried to build a CNN from scratch based on the LeNet architecture from this article.
I implemented backprop and am now trying to train it on the MNIST dataset using SGD with a batch size of 16. I want a quick way to verify that learning is going well and that there are no bugs. For this, I visualize the loss for every 100th batch, but it takes too long on my laptop and I don't see the overall dynamic (the loss fluctuates downwards but occasionally jumps back up, so I'm not sure). Could anyone suggest a proven way to check that the CNN works correctly without waiting many hours of training?
MNIST consists of 60k training images of 28 × 28 pixels. Training a CNN with a batch size of 16 means 3,750 forward passes per epoch.
Now, taking into consideration that you are using LeNet, which is not a very deep model, I would suggest the following:
Check your PC specifications, such as RAM, processor, and GPU.
Try to train your model on a cloud service such as Google Colab, Kaggle, or others.
Try a batch size of 128 or 64
Normalize your image data set before training (a minimal sketch follows this list).
Training speed also depends on the machine learning framework you are using, such as TensorFlow or PyTorch.
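On the normalization point, a minimal sketch, assuming your images are loaded as a uint8 NumPy array of raw pixel values in [0, 255]:
import numpy as np

# x: raw images as a uint8 array of shape (N, 28, 28), values in [0, 255]
def normalize(x):
    x = x.astype(np.float32) / 255.0          # scale pixels to [0, 1]
    return (x - x.mean()) / (x.std() + 1e-8)  # standardize to zero mean, unit variance
Normalized inputs generally make gradient descent converge faster and more stably.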
I hope this will help.
I am combining a Monte-Carlo Tree Search with a convolutional neural network as the rollout policy. I've identified the Keras model.predict function as being very slow. After experimentation, I found that surprisingly model parameter size and prediction sample size don't affect the speed significantly. For reference:
0.00135549 s for 3 samples with batch_size = 3
0.00303991 s for 3 samples with batch_size = 1
0.00115528 s for 1 sample with batch_size = 1
0.00136132 s for 10 samples with batch_size = 10
As you can see, I can predict 10 samples at about the same speed as 1 sample. The change is also very minimal (though noticeable) if I decrease the parameter count by 100x, but I'd rather not shrink the model that much anyway. In addition, the predict function is very slow the first time it runs (~0.2 s), though I don't think that's the problem here, since the same model is predicting multiple times.
I wonder if there is some workaround, because clearly the 10 samples can be evaluated very quickly; all I want is to predict the samples at different times rather than all at once, since I need to update the tree search before making a new prediction. Perhaps I should work with TensorFlow directly instead?
The batch size controls parallelism when predicting, so it is expected that increasing the batch size gives better performance, as you can use more cores and use the GPU more efficiently.
There is not really anything to work around: using a batch size of one is the worst case for performance. Maybe you should look into a smaller network that is faster to predict, or predict on the CPU if your experiments run on a GPU, to minimize the overhead of data transfer.
Don't forget that model.predict does a full forward pass of the network, so its speed completely depends on the network architecture.
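To illustrate, a rough timing sketch; model is assumed to be an already-built Keras model with input shape (28, 28, 1), and the sizes are placeholders:
import time
import numpy as np

samples = np.random.rand(64, 28, 28, 1).astype("float32")

start = time.time()
for s in samples:
    model.predict(s[None, ...])            # one forward pass (plus per-call overhead) per sample
print("one at a time:", time.time() - start)

start = time.time()
model.predict(samples, batch_size=64)      # a single batched forward pass
print("batched:", time.time() - start)
On most hardware, the batched call costs far closer to a single prediction than to 64 of them.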
One way that gave me a speedup was switching from model.predict(x) to
model.predict_on_batch(x)
making sure your x shape has 1 as the first dimension.
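For a single sample, that looks roughly like this (a sketch; sample and model stand in for your own input and loaded model):
import numpy as np

x = np.expand_dims(sample, axis=0)   # add the batch dimension, e.g. (28, 28, 1) -> (1, 28, 28, 1)
out = model.predict_on_batch(x)      # avoids some of model.predict's per-call overhead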
I don't think working with pure Tensorflow would change the performance much. Keras is a high-level API for low-level Tensorflow primitives. You could use a smaller model instead, like MobileNetV3 or EfficientNet, but this would require retraining.
If you need to remain with the existing model, you could try OpenVINO. OpenVINO is optimized for Intel hardware, but it should work with any CPU. It optimizes your model by converting to Intermediate Representation (IR), performing graph pruning and fusing some operations into others while preserving accuracy. Then it uses vectorization in runtime.
It's rather straightforward to convert the Keras model to OpenVINO. The full tutorial on how to do it can be found here. Some snippets are below.
Install OpenVINO
The easiest way to do it is using PIP. Alternatively, you can use this tool to find the best way in your case.
pip install openvino-dev[tensorflow2]
Save your model as SavedModel
OpenVINO cannot convert an HDF5 model directly, so you have to save it as a SavedModel first.
import tensorflow as tf
from custom_layer import CustomLayer
model = tf.keras.models.load_model('model.h5', custom_objects={'CustomLayer': CustomLayer})
tf.saved_model.save(model, 'model')
Use Model Optimizer to convert the SavedModel
The Model Optimizer is a command-line tool that comes with the OpenVINO Development Package. It converts the TensorFlow model to IR, the default format for OpenVINO. You can also try FP16 precision, which should give you better performance without a significant accuracy drop (change data_type). Run in the command line:
mo --saved_model_dir "model" --data_type FP32 --output_dir "model_ir"
Run the inference
The converted model can be loaded by the runtime and compiled for a specific device, e.g., CPU or GPU (integrated into your CPU like Intel HD Graphics). If you don't know what the best choice for you is, use AUTO. You care about latency, so I suggest adding a performance hint (as shown below) to use the device that fulfills your requirement.
from openvino.runtime import Core

# Load the network
ie = Core()
model_ir = ie.read_model(model="model_ir/model.xml")
compiled_model_ir = ie.compile_model(model=model_ir, device_name="AUTO", config={"PERFORMANCE_HINT":"LATENCY"})
# Get output layer
output_layer_ir = compiled_model_ir.output(0)
# Run inference on the input image
result = compiled_model_ir([input_image])[output_layer_ir]
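For reference, input_image above is just a NumPy array shaped like the model's input; for example (the shape is a placeholder):
import numpy as np

# replace the shape with your model's actual input shape
input_image = np.random.rand(1, 28, 28, 1).astype(np.float32)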
Disclaimer: I work on OpenVINO.
I'm new to TensorFlow. I am currently working on a project that uses the TF Object Detection API. I'm training a model with two classes on my custom images. So far I have successfully run train.py and eval.py and executed TensorBoard at the same time to see how the training process is progressing.
How do I display a graph in which I can see the accuracy of the model being developed?
Any help is appreciated!
You can do this by switching from the deprecated train.py and eval.py to the newer model_main.py. It interleaves an evaluation schedule into the training session, so that you can track the model's training progress without running evaluation manually.
The flags of model_main.py are very similar to those of train.py, and you can see an example here:
https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/running_locally.md#running-the-training-job
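For reference, a typical invocation looks roughly like the following; the paths are placeholders and the exact flags may differ between versions, so check the linked page:
python object_detection/model_main.py \
    --pipeline_config_path=path/to/pipeline.config \
    --model_dir=path/to/model_dir \
    --num_train_steps=50000 \
    --sample_1_of_n_eval_examples=1 \
    --alsologtostderr
Evaluation results will then show up in TensorBoard alongside the training metrics.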
I'm running the Tensorflow tf.estimator Quickstart example on a Raspberry Pi 3. It works well, however the prediction is very slow.
The instruction predictions = list(classifier.predict(input_fn=predict_input_fn)) takes seconds to return.
It's a problem for me because I want to predict a simple tuple immediately after I receive it from my sensors; I cannot batch the predictions.
My model_dir is a tmpfs folder (i.e., in RAM), so I don't think it's related to I/O latency. Perhaps the network is rebuilt every time; I don't know.
I'm probably doing something wrong. Do you know how to run TensorFlow predictions on the same classifier in a short time?
I'm looking to run a basic fully-connected neural network for the MNIST dataset with the C++ API v1.2 from Tensorflow. I have trained the model and exported it using tf.train.Saver() in Python. This gave me a checkpoint file, a data file, an index file and a meta file.
I know from using TensorBoard on a previous project that the data file contains the saved variables while the meta file contains the graph.
However, I am not sure what the recommended way is to load those files and run the trained model in a C++ environment in v1.2, since all the tutorials and questions I've found are for older versions, which differ substantially.
I've found that tensorflow::ops::Restore should be the way to do this, but I know that inference in TensorFlow isn't well supported, so I am not certain what parameters I should give it in order to obtain the trained model that I can just put into a session->Run() and get an accuracy statement when fed test data.