I have the code to classify the images as Nude or Non nude. It is implemented on deep learning with tensor flow python. The code can be founded in Tensorflow Implementation of Yahoo's Open NSFW Model
I want to add some more images in to the dataset on order to do fine tuning. How can I do fine tuning in this implementation by using another dataset.
Just load their model and initialize its weights with the ones they provide, similar to how they do it here. Assuming that you are familiar with tensorflow, you should then proceed to train that model on your images.
Besides this blog post I'm not aware of any other publications the team has made on their work. This is a bit of an issue as they don't state their training parameters (choice of optimizer, learning rate, etc.). If you want to fine-tune this model you will have to experiment a bit in this regard.
Do they give you the original dataset that the provided model is trained off of? If so, you can easily just add your own dataset to their dataset, and train a completely new model based on the combined dataset.
I wrote more about this "combined" dataset, where you can add more or less data, here.
Good Luck!
Related
I am trying to train a custom object detector, is ther a limit on the number of target class objects that the yolov5 architecture can be trained on.
For example- coco dataset has 80 target class, suppose I have 500 object types to detect, is it advisable to use yolov5.
Can this be explain with reasons.
You can add as many classes you want to any network.
The yolo architecture is known for giving more attention to inference time rather than to performance. Although achieving good results on conventional datasets, the yolo model is built for speed.
But essentially you want a network that has a good backbone (deep and wide) that can really obtain rich features from your image.
From my experience, there is really no straight forward answer. It depends on your dataset as well, if you have large/medium/small objects to detect. I really recommend trying out different models, because every single model, will perform differently on custom datasets. From here you select the best one. State-of-the-art models don't directly relate to the best model on transfer learning and fine-tuning.
The Yolo and all other single shot detectors, for me, were the ones who worked best for fine-tuning purposes (RetinaNet was best for my use cases so far), they are good for hyper parameter tuning because you can train them fast and test what works and what doesn't. With two stage detectors (Faster-RCNN etc) I never achieved overall good results, mainly because the training process is different and much slower.
I recommend you read this article, it explains both architecture types, pros and cons.
Additionally if you want to train a model for more than 500 classes, Tensorflow Object Detection API has pre-trained models for the OpenImages dataset (600 classes), and there is the Detectron 2 on LVIS dataset (1200 classes). I recommend starting with models which were trained on a higher number of classes if you want to fine tune to a similar number of classes in your dataset.
I am trying to train a model with the following data
The image is the input and 14 features must be predicted.
Could I please know how to go about training such a model?
Thank you.
These are not really features as far as I am concerned. These are classes and if I got it correctly, your images sometimes belong to more than one classes.
This is a very broad question but I think here might be a good start to learn more about multi-label image classification.
Note that your model should not be much different than an image classification model that is used for cifar10 challenge, for example. But you need to structure your data and choose your loss function accordingly.
I tried image classification using trained model and its working well but some images could not find perfectly in that time have to get that image and label from users so my doubt is..Is it possible to add new data into already trained model?
No, during inference time you use the weights of the trained model for predictions. Which basically means that at the time your model is deployed the capabilities of your image classifier are fixed by the weights. If you wish to improve your model, you would have to retrain your model with the new - data. However, there is another paradigm of learning called "Online Learning" where the model is continuously learning and modifying the weights. In this case your weights are not fixed and your model is continuously updating its weights with each training input. However afaik this is not usually recommended for CNNs, because the backward pass of gradients is computationally intensive and your inference will be slow because of this.
No model can predict with 100% accuracy if it does it's an ideal model. And if you want to add more data to your train model you have to retrain the model with the new data. Having more data is always a good idea. It allows the “data to tell for itself,” instead of relying on assumptions and weak correlations. Presence of more data results in better and accurate models. So if you want to get better accuracy you have to train your model with more data. Without retraining, you can't add data to your trained model.
I have an image classification task to solve, but based on quite simple/good terms:
There are only two classes (either good or not good)
The images always show the same kind of piece (either with or w/o fault)
That piece is always filmed from the same angle & distance
I have at least 1000 sample images for both classes
So I thought it should be easy to come up with a good CNN solution - and it was. I created a VGG16-based model with a custom classifier (Keras/TF). Via transfer learning I was able to achieve up to 100% validation accuracy during model training, so all is fine on that end.
Out of curiosity and because the VGG-based approach seems a bit "slow", I also wanted to try it with a more modern model architecture as base, so I did with ResNet50v2 and Xception. I trained it similar to the VGG-based model, tried it with several hyperparameter modifications etc. However, I was not able to achieve a better validation accuracy then 95% - so much worse than with the "old" VGG architecture.
Hence my question is: Given these "simple" (always the same) images and only two classes, is the VGG model probably a better base than a modern network like ResNet or Xception? Or is it more likely that I messed something up with my model or simply got the training / hyperparameters not right?
I'm working on a project that requires the recognition of just people in a video or a live stream from a camera. I'm currently using the tensorflow object recognition API with python, and i've tried different pre-trained models and frozen inference graphs. I want to recognize only people and maybe cars so i don't need my neural network to recognize all 90 classes that come with the frozen inference graphs, based on mobilenet or rcnn, as it seems this slows the process, and 89 of this 90 classes are not needed in my project. Do i have to train my own model or is there a way to modify the inference graphs and the existing models? This is probably a noob question for some of you, but mind that i've worked with tensorflow and machine learning for just one month.
Thanks in advance
Shrinking the last layer to output 1 or two classes is not likely to yield large speed ups. This is because most of the computation is in the intermediate layers. You could shrink the intermediate layers, but this would result in poorer accuracy.
Yes, you have to train own model. Let's see in short words some ways how to do.
OPTION 1. When you want to apply transfer knowledge as maximum as possible, you can froze the CNN layers. After, you change a quantity of detected classes with dimension of classifier (dense layers). The classifier is the latest part in CNN architecture. Now, you should retrain only classifier.
OPTION 2. Assuming, you want to apply transfer knowledge for first layers of CNN (for example, froze first 2-3 CNN layers) and retrain rest of CNN with classifier. After, you change a quantity of detected classes with dimension of classifier. Now, you should retrain rest of CNN layers and classifier.
OPTION 3. Assuming, you want to retrain whole CNN with classifier. After, you change a quantity of detected classes with dimension of classifier. Now, you should retrain whole CNN with classifier.
Generally, the Tensorflow Object Detection API is a good start for beginners! How to proceed with your problem you can see here more detail about whole process and extra explanation here.