I tried to build a CNN from scratch based on LeNet architecture from this article
I implemented backdrop and now trying to train it on the MNIST dataset using SGD with 16 batch size. I want to find a quick way to verify that the learning goes well and there are no bugs. For this, I visualize loss for every 100th batch but it takes too long on my laptop and I don't see an overall dynamic (the loss fluctuates downwards, but occasionally jumps up back so I am not sure). Could anyone suggest a proven way to find that the CNN works well without waiting many hours of training?
The MNIST consist of 60k datasets of 28 * 28 pixel.Training a CNN with batch size 16 will have 4000 forward pass per epochs.
Now taking into consideration that your are using LeNet which not a very deep model.
I would suggest you to do followings:
Check your PC specifications such as RAM,Processor,GPU etc.
Try your to train your model on cloud service such Google Colab, Kaggle and others
Try a batch size of 128 or 64
Try to normalize your image data set before training
Training speed also depends on machine learning framework you are using such as Tensorflow, Pytorch etc.
I hope this will help.
Related
I tried to train a image classifier using tensorflow. I used data api to load the dataset and i used dataset caching to speed up training process. while trying to training the model i struck with a error called Resource Exhausted. I tried to change the batch size even after trying different batch size like 32,64,128 i could not over come this problem
I have tried to remove some layers but i could not fix this error.
Check your batch_size. Decrease it. It seems it is overwhelming.
So I have the following model for sentiment analysis (using pre trained word embeddings):
And as visible, I have a pre trained embedding matrix and only about 500k trainable parameters. So why does it take a whole eternity to train this model? The batch size is 128 and number of epochs is 25. And the ETA for first epoch is about 10 minutes. I haven't even completed that.
Just to mention, I am not using CUDA or anything. I don't think I have a GPU enabled Tensorflow. And I'm willing to do anything to increase the speed. And I have Tensorflow 2.1.0.
And here's the answer I am not using CUDA or anything. Training on CPU is much slower than on GPU. If you don't have high-performance enough video card, you can use several services such as Google Colab or Kaggle
Hello everybody,
My objective is to detect people and cars (day and night) on images of the size of 1920x1080, for this I use the tensorflow API, I use a SSD mobilenet model, I annotated 1000 images (900 for training, 100 for evaluation) from 7 different cameras. I launch the training with an image size of 960x540. My model does not converge. I do not know what to do, should I make different classes for day and night objects?
On a tutorial for face detection with the tensorflow API, they use a dataset with images containing only faces, then use the model on complex scenes. Is this a good idea knowing that a model like SSD also learns negative examples?
Thank you
(sources: https://blog.usejournal.com/face-detection-for-cctv-surveillance-6b8851ca3751)
What do you mean by "not converge"? Are you referring to the train/validation loss?
In this case, the first thing that comes to my mind is to reduce the learning rate (I had a similar problem).
You can do it by modifying you configuration file, in the "train_config" section you'll find the value "initial_learning_rate".
Try to set it up to a lower value (like, an order of magnitude lower) and see if it helps.
I'm a Tensorflow newby and I'm trying to train a 1 class model for object detection. In particular I'm trying to recognize an arrow like the following:
I need a very fast recognition so I started wondering if a pre-trained model can contain such kind of shape.
Unfortunately didn't find anything similar and therefor I started with my own training of the arrow using as model the faster_rcnn_inception_v2_coco_2018_01_28.
I'm using his pipeline config, and I'm using his fine_tune_checkpoint as well, is this right considering that I have to train a completely different object?
The result is a training with a very good accuracy but very low speed. I need to increase the framerate and I didn't understand yet if the less is the "training loss" the more is the "object recognition speed", or not.
Any suggestion on how could I speedup the detection?
I'm using his pipeline config, and I'm using his fine_tune_checkpoint
as well, is this right considering that I have to train a completely
different object?
Yes! Every time you want to change the output of a deep NN, you should take a pretrained model. Training a model from scratch can take several weeks and you will never be able to generate enough data on your own. Taking a pretrained model and fine-tuning it is a way to go.
I didn't understand yet if the
less is the "training loss" the more is the "object recognition
speed", or not.
No. Training loss just tells you how good your model performs with respect to the training set.
The issue you are having is a classic speed vs. accuracy trade-off. I encourage you to take a look at this table and find a model which is fast enough for you (i.e. lowest run-time) but have decent accuracy. I would first check SSD here.
The result is a training with a very good accuracy but very low speed.
How much FPS does your algorithm perform?
Since you already have prepared dataset, I would suggest using Tiny-Yolo which performs 244 FPS on COCO dataset https://pjreddie.com/darknet/yolo/
Preparing training dataset for Tiny-Yolo is very easy if you use this repository
And
I didn't understand yet if the less is the "training loss" the more is the "object recognition speed"
Training lost has nothing to do with speed.
I'm working on a project that requires the recognition of just people in a video or a live stream from a camera. I'm currently using the tensorflow object recognition API with python, and i've tried different pre-trained models and frozen inference graphs. I want to recognize only people and maybe cars so i don't need my neural network to recognize all 90 classes that come with the frozen inference graphs, based on mobilenet or rcnn, as it seems this slows the process, and 89 of this 90 classes are not needed in my project. Do i have to train my own model or is there a way to modify the inference graphs and the existing models? This is probably a noob question for some of you, but mind that i've worked with tensorflow and machine learning for just one month.
Thanks in advance
Shrinking the last layer to output 1 or two classes is not likely to yield large speed ups. This is because most of the computation is in the intermediate layers. You could shrink the intermediate layers, but this would result in poorer accuracy.
Yes, you have to train own model. Let's see in short words some ways how to do.
OPTION 1. When you want to apply transfer knowledge as maximum as possible, you can froze the CNN layers. After, you change a quantity of detected classes with dimension of classifier (dense layers). The classifier is the latest part in CNN architecture. Now, you should retrain only classifier.
OPTION 2. Assuming, you want to apply transfer knowledge for first layers of CNN (for example, froze first 2-3 CNN layers) and retrain rest of CNN with classifier. After, you change a quantity of detected classes with dimension of classifier. Now, you should retrain rest of CNN layers and classifier.
OPTION 3. Assuming, you want to retrain whole CNN with classifier. After, you change a quantity of detected classes with dimension of classifier. Now, you should retrain whole CNN with classifier.
Generally, the Tensorflow Object Detection API is a good start for beginners! How to proceed with your problem you can see here more detail about whole process and extra explanation here.