I am trying to train an SSD-based face detector from scratch in Caffe. I have a dataset of images where the bounding boxes for faces are stored in a CSV file. In all the tutorials and code I've come across so far, the convert_annoset tool is used to generate an lmdb file for object detection. However, in the latest Caffe version on GitHub (https://github.com/BVLC/caffe), this tool has been removed.
The two options I see to deal with this issue are:
1. Rewrite the convert_annoset tool using functions in the current Caffe library
2. Use other Python packages (such as lmdb and OpenCV) to manually create lmdb files from the images and bounding box information
I've been trying to rewrite the code but I am unable to find certain classes and functions that were used in the original code such as AnnotatedDatum_AnnotationType and LabelMap in the current version.
Any suggestions for how to proceed with creating lmdb for the object detection problem?
EDIT:
I've just realized that AnnotatedData layer no longer exists in the master branch of Caffe. Does this mean that detection is not possible in this version? Do I have to use some older fork such as https://github.com/weiliu89/caffe for detection or is there any other option?
The original Caffe does not have the layers needed for object detection (e.g. mobilenet-ssd), so use the ssd branch of this Caffe fork instead: https://github.com/weiliu89/caffe
That fork has a convert_annoset tool that can generate an lmdb.
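To my understanding, the data scripts in that fork work from Pascal VOC-style XML annotations, so the CSV boxes would first need to be converted. A minimal sketch of that conversion follows, using only the standard library. The CSV column names (filename, width, height, xmin, ymin, xmax, ymax), the "face" label and the depth of 3 are assumptions for illustration, not something taken from the question.

```python
import csv
import xml.etree.ElementTree as ET


def read_boxes(lines):
    """Group bounding boxes by image.

    `lines` is an open CSV file or any iterable of CSV lines. The header
    row is assumed (hypothetically) to contain the columns
    filename, width, height, xmin, ymin, xmax, ymax.
    """
    images = {}
    for row in csv.DictReader(lines):
        entry = images.setdefault(
            row["filename"],
            {"size": (int(row["width"]), int(row["height"])), "boxes": []},
        )
        entry["boxes"].append(
            tuple(int(float(row[k])) for k in ("xmin", "ymin", "xmax", "ymax"))
        )
    return images


def to_voc_xml(filename, size, boxes, label="face"):
    """Build a Pascal VOC-style annotation string for one image."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size_el = ET.SubElement(root, "size")
    # Depth 3 assumes colour images.
    for tag, value in zip(("width", "height", "depth"), (size[0], size[1], 3)):
        ET.SubElement(size_el, tag).text = str(value)
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bnd = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bnd, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

Each returned string can be written to its own .xml file next to a list file that pairs image paths with annotation paths, which is the kind of input the fork's VOC example scripts appear to expect.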
Related
I have deep learning code for object detection. I ran the code on Google Colab and then exported the model to use it locally. To run the model locally, I have to install the whole TensorFlow package again, which is quite heavy for my system.
Is there a way to download and run only specific parts of the TensorFlow library?
I use TensorFlow in only two places in my code, yet I have to install the whole TensorFlow library for them.
This is where I load the model:
detect_fn = tf.saved_model.load(PATH_TO_SAVED_MODEL)
This is the second place where I use TensorFlow:
input_tensor = tf.convert_to_tensor(image_rgb)
These are the only two functions I need from the TensorFlow library, not the whole library. Thanks in anticipation.
Though I'm not entirely sure about the library as a whole, there is a Lite version of TensorFlow (I guess they realised 430MB is a bit much, too).
Information regarding this can be found here:
https://www.tensorflow.org/lite/
A guide here seems to detail how to pick and choose parts of the Lite library. Although I have not used it myself, I would expect some degree of compatibility between the two:
https://www.tensorflow.org/lite/guide/reduce_binary_size
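As a rough sketch of how the Lite route could look: the conversion is done once where full TensorFlow is available (for example on Colab), and the local machine only needs the much smaller tflite_runtime package. The function names and paths below are hypothetical, and whether a given object detection SavedModel converts cleanly is not guaranteed, so treat this as a starting point rather than a confirmed recipe.

```python
def convert_saved_model(saved_model_dir, tflite_path):
    """Convert a SavedModel to a .tflite file (needs full TensorFlow)."""
    import tensorflow as tf  # imported lazily: only the conversion step needs it

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    tflite_model = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)


def run_tflite(tflite_path, image):
    """Run one inference using only the tflite_runtime package.

    `image` is assumed to be a NumPy array that already matches the
    model's expected input shape and dtype.
    """
    from tflite_runtime.interpreter import Interpreter

    interpreter = Interpreter(model_path=tflite_path)
    interpreter.allocate_tensors()
    input_detail = interpreter.get_input_details()[0]
    interpreter.set_tensor(input_detail["index"], image)
    interpreter.invoke()
    # Return every output tensor in the order the model declares them.
    return [
        interpreter.get_tensor(detail["index"])
        for detail in interpreter.get_output_details()
    ]
```

With this split, the tf.saved_model.load and tf.convert_to_tensor calls from the question would no longer be needed locally, since the interpreter accepts NumPy arrays directly.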
I did transfer learning using MaskRCNN for multiple-object detection in an environment with:
python=3.6.12
tensorflow==1.15.3
keras==2.2.4
mrcnn==2.1
And the model works.
Now I would like to run mrcnn in real time with my laptop camera and OpenCV.
First, I would apply face detection with res10_300x300_ssd_iter_140000.caffemodel, because my mrcnn model works better when it is run on a face. I chose res10 because I have already used it in another project and it worked well!
Unfortunately, I noticed that MaskRCNN doesn't work with the latest version of TensorFlow. Moreover, res10_300x300_ssd_iter_140000.caffemodel doesn't work with old versions of TensorFlow, and I get this error: "ValueError: Unknown layer: Functional".
Is it possible to use res10_300x300_ssd_iter_140000.caffemodel with older versions of TensorFlow?
If not:
Is there a way to port MaskRCNN to a more recent version of TensorFlow?
Is there a different face detection model in OpenCV with good accuracy?
Is there a model other than mrcnn that is compatible with res10?
Any advice is welcome!
Thanks!
My Resources:
https://github.com/opencv/opencv/wiki/Deep-Learning-in-OpenCV
https://machinelearningmastery.com/how-to-train-an-object-detection-model-with-keras/
https://www.pyimagesearch.com/2020/05/04/covid-19-face-mask-detector-with-opencv-keras-tensorflow-and-deep-learning/
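For reference, the res10 Caffe model is usually loaded through OpenCV's dnn module, which does not depend on TensorFlow. The sketch below shows that route under some assumptions: the prototxt and caffemodel paths are placeholders, the mean values (104, 177, 123) are the ones commonly used with this model, and the helper names are made up for illustration.

```python
def boxes_from_detections(detections, width, height, min_confidence=0.5):
    """Convert SSD output rows to pixel boxes.

    Each row is assumed to be
    [image_id, class_id, confidence, x1, y1, x2, y2]
    with coordinates normalized to the range [0, 1].
    """
    boxes = []
    for _, _, confidence, x1, y1, x2, y2 in detections:
        if confidence < min_confidence:
            continue
        boxes.append(
            (int(x1 * width), int(y1 * height), int(x2 * width), int(y2 * height))
        )
    return boxes


def load_face_net(prototxt_path, caffemodel_path):
    """Load the Caffe face detector with OpenCV only (no TensorFlow)."""
    import cv2

    return cv2.dnn.readNetFromCaffe(prototxt_path, caffemodel_path)


def detect_faces(net, frame, min_confidence=0.5):
    """Return face boxes in pixel coordinates for one BGR frame."""
    import cv2

    blob = cv2.dnn.blobFromImage(
        cv2.resize(frame, (300, 300)), 1.0, (300, 300), (104.0, 177.0, 123.0)
    )
    net.setInput(blob)
    detections = net.forward()  # expected shape: (1, 1, N, 7)
    height, width = frame.shape[:2]
    return boxes_from_detections(detections[0, 0], width, height, min_confidence)
```

If this works in the environment above, the face crops from detect_faces could be passed to the existing mrcnn model without changing the TensorFlow 1.15 setup.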
Description
I had a training setup using the Object Detection API that worked really well. However, I have had to upgrade from TF1.15 to TF2, so instead of model_main.py I am now using model_main_tf2.py with the mobilenet ssd 320x320 pipeline to train a new model via transfer learning.
When training my model in TF1.15 it would display a whole heap of scalars as well as detection box image samples. It was fantastic.
In TF2 training I get no such data, just loss scalars and 3 input images. And yet the event files are huge (gigabytes), whereas they were in the hundreds of megabytes with TF1.15.
The thing is, there is nowhere to specify what data is presented. I have not changed anything other than which model_main py file I use to run the training. I added num_visualizations: to the pipeline config file, but no visualizations of detection boxes appear.
Can someone please explain to me what is going on? I need to be able to see what's happening throughout training!
Thank You
I am training on a PC in a virtual environment before performing TRT optimization on Linux, but I think that is irrelevant here.
Environment
GPU Type: P220
Operating System + Version: Win10 Pro
Python Version (if applicable): 3.6
TensorFlow Version (if applicable): 2
Relevant Files
TF1.15 vs TF2 screenshots:
TF1 (model_main.py) Tensorboard Results
TF2 (model_main_tf2.py) Tensorboard Results
Steps To Reproduce
The repo I am working with: GitHub Object Detection API
Model
Pipeline Config File
UPDATE: I have investigated further and discovered that the tensorboard settings are being set in Object Detection Trainer 1 for TF1.15 and Object Detection Trainer 2 for TF2
If someone who knows more about this than I do could work out what the difference is, and what I need to do to get the same result in TensorBoard with the second trainer as I do with the first one, that would be amazing and would save me an enormous headache. It seems that the second trainer, even though it is documented as being for TF2, does not actually follow TF2 syntax, but I could be wrong.
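One thing that may be worth checking: as far as I know, with model_main_tf2.py the training run only writes training summaries, and evaluation is started as a separate run of the same script with the --checkpoint_dir flag. The sketch below only builds the two command lines. The script path is an assumption about where the Object Detection API is checked out, and the function name is made up for illustration.

```python
import sys

# Assumed location of the script, relative to the models/research directory.
SCRIPT = "object_detection/model_main_tf2.py"


def build_commands(pipeline_config_path, model_dir):
    """Return (train_cmd, eval_cmd) argument lists for model_main_tf2.py."""
    common = [
        sys.executable,
        SCRIPT,
        f"--pipeline_config_path={pipeline_config_path}",
        f"--model_dir={model_dir}",
        "--alsologtostderr",
    ]
    train_cmd = list(common)
    # Passing --checkpoint_dir switches the script to evaluation mode: it
    # watches that directory for new checkpoints and writes eval summaries.
    eval_cmd = common + [f"--checkpoint_dir={model_dir}"]
    return train_cmd, eval_cmd
```

Both lists can be passed to subprocess.Popen in separate processes, with TensorBoard pointed at the model directory. If detection box images appear at all in TF2, I would expect them under the evaluation run rather than the training run.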
I am trying to use OpenVino python API to run MTCNN face detection, however, the performance of the converted models degraded significantly from the original model. I am wondering how I could get similar results.
I converted the mtcnn caffe models into OpenVino *.xml and *.bin files using the following commands.
python3 mo.py --input_model path/to/PNet/det1.caffemodel --model_name det1 --output_dir path/to/output_dir
python3 mo.py --input_model path/to/RNet/det2.caffemodel --model_name det2 --output_dir path/to/output_dir
python3 mo.py --input_model path/to/ONet/det3.caffemodel --model_name det3 --output_dir path/to/output_dir
I then used the step_by_step MTCNN Jupyter notebook to check the performance of the converted models.
However, the detection results with the OpenVino models degraded significantly.
To reproduce my results, do the following:
Clone https://github.com/TropComplique/mtcnn-pytorch.git
Use this Jupyter notebook, loading the OpenVino models instead of the PyTorch model.
As you will see, the first stage (P-Net) produces more detected boxes than the original model does in the step_by_step MTCNN Jupyter notebook.
Do you have any comments on this? There seems to be no problem with the model conversion itself. The only difference is that PyTorch accepts a variable tensor size (FloatTensor), whereas for OpenVino I have to reshape the input for each scale. This might be the reason for the different results, but I have not been able to solve the problem.
I went through all the mistakes I might have made and checked the conversion parameters for the MTCNN models against list_topologies.yaml. This file comes with the OpenVino installation and lists parameters such as scale and mean values.
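To make the per-scale reshaping concrete, here is a small sketch of how the image pyramid for the first stage is typically computed, loosely following the logic in the mtcnn-pytorch notebook. The default values (minimum face size 15, minimum detection size 12, factor 0.707) and the function name are assumptions for illustration. Each returned width and height is an input size the fixed-shape P-Net would have to be reshaped to.

```python
import math


def pyramid_shapes(width, height, min_face_size=15.0,
                   min_detection_size=12, factor=0.707):
    """Return a list of (scale, scaled_width, scaled_height) tuples.

    The image is repeatedly shrunk by `factor` until its shorter side,
    after scaling, would drop to `min_detection_size` or below.
    """
    scales = []
    m = min_detection_size / min_face_size
    min_length = min(width, height) * m
    count = 0
    while min_length > min_detection_size:
        scales.append(m * factor ** count)
        min_length *= factor
        count += 1
    # Round up so that no pixels at the image border are dropped.
    return [(s, math.ceil(width * s), math.ceil(height * s)) for s in scales]
```

When comparing the two pipelines, it may help to print these shapes on both sides and confirm that the OpenVino input is reshaped to exactly the same sizes, and uses the same pixel normalization, as the PyTorch version at every scale.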
Finally, I solved the problem by using MXNET pre-trained MTCNN networks.
I hope this would help other users who might encounter this problem.
I'm learning TensorFlow, running version r0.10 on Ubuntu 16.04. I am working on the CIFAR-10 Tutorial and have trained the CNN in the example.
Where is the image data stored for this tutorial?
The data path is defined on this line, in cifar10.py:
tf.app.flags.DEFINE_string('data_dir', '/tmp/cifar10_data',
"""Path to the CIFAR-10 data directory.""")
However, I am confused as to why I cannot find this directory. I have tried searching for it manually and have also looked through all the example directories.
The path /tmp/cifar10_data is absolute, so the data is saved relative to the root of your OS's filesystem (under /tmp), not relative to your working directory. Take a look at my answer here and see if that helps.
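A quick way to check this is to inspect the path directly. The helper below is a hypothetical sketch using only the standard library. Note also that on many Linux systems /tmp is cleared on reboot, so the directory may simply be gone if the machine was restarted after training.

```python
import os


def describe_data_dir(path):
    """Report whether `path` is absolute, whether it exists, and what it holds."""
    exists = os.path.isdir(path)
    return {
        "absolute": os.path.isabs(path),
        "exists": exists,
        "contents": sorted(os.listdir(path)) if exists else [],
    }
```

Calling describe_data_dir('/tmp/cifar10_data') shows whether the directory is currently present. If it is missing, running the tutorial again should download the data once more.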