For a few days I have been following the instructions here: https://github.com/tensorflow/models/tree/master/inception
for fine-tuning the Inception model. The problem is that my dataset is huge, so converting it to the TFRecords format would fill my entire hard-disk space. Is there a way of fine-tuning without using this format? Thanks!
Fine-tuning is independent of the data format; you're fine there. TFRecords improves training and scoring speed; it shouldn't affect the number of iterations or epochs needed, nor the final classification accuracy.
You can train any model without converting your data to TFRecords. Here is a great gist that fine-tunes VGG by reading directly from JPEG files. You can swap the Slim architecture for the Inception one and you should be fine!
In the gist's current configuration your dataset must be divided into train and test folders, with classes as sub-folders, but you can change that to whatever you want.
I haven't experienced any huge difference in speed compared to the version that uses TFRecords.
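If it helps, the idea looks roughly like this (a minimal sketch, not the gist's code; the directory layout, image size, and batch size are assumptions, written in TF 1.x style to match that era of the repo):

import os
import tensorflow as tf

def list_images(root):
    # Expects root/<class_name>/<image>.jpg and maps each class folder to an integer label.
    classes = sorted(os.listdir(root))
    paths, labels = [], []
    for idx, cls in enumerate(classes):
        for fname in os.listdir(os.path.join(root, cls)):
            paths.append(os.path.join(root, cls, fname))
            labels.append(idx)
    return paths, labels

def decode(path, label):
    # Read and decode each JPEG on the fly instead of pre-converting everything to TFRecords.
    img = tf.image.decode_jpeg(tf.read_file(path), channels=3)
    img = tf.image.resize_images(img, [299, 299])  # Inception's input size
    return img, label

paths, labels = list_images("train")
dataset = (tf.data.Dataset.from_tensor_slices((tf.constant(paths), tf.constant(labels)))
           .shuffle(len(paths))
           .map(decode, num_parallel_calls=4)
           .batch(32))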
So this may be a silly question, but how exactly do the preprocessing layers in Keras work, especially when they are part of the model itself, compared to applying the preprocessing outside the model and then feeding the results in for training?
I'm trying to understand how data augmentation runs in Keras models. Let's say I have 1,000 images for training. Outside the model I can apply augmentation 10x and get 10,000 resulting images for training.
But I don't understand what happens when you use a preprocessing layer for augmentation. Do these layers (or this layer, if you use just one) take each image and apply the transformations before training? Does this mean the total number of images used for training (and validation, I assume) is the number of epochs times the original number of images?
Is one option better than the other? Does that depend on the number of images one originally has before augmentation?
The benefit of preprocessing layers is that the model is truly end-to-end, i.e. raw data comes in and a prediction comes out. It makes your model portable since the preprocessing procedure is included in the SavedModel.
However, it will run everything on the GPU. Usually it makes sense to load the data using CPU worker(s) in the background while the GPU optimizes the model.
Alternatively, you could use the preprocessing layers outside of the model, inside a Dataset. The benefit of that is that you can easily create an inference-only model that includes the layers, which gives you the portability at inference time while keeping the speedup during training.
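As a rough illustration (a minimal sketch assuming a recent TF 2.x; the layer choices, shapes, and dummy data are my own assumptions, not code from the guide), the two placements look like this:

import tensorflow as tf

augment = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
])

# Option 1: augmentation inside the model. It runs on the GPU, ships with the
# SavedModel, and is only active during training.
inputs = tf.keras.Input(shape=(224, 224, 3))
x = augment(inputs)
x = tf.keras.layers.Conv2D(32, 3, activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)

# Option 2: the same layers applied inside a tf.data pipeline. The work happens
# on CPU workers in parallel with the GPU, which usually keeps the GPU busier.
train_ds = tf.data.Dataset.from_tensor_slices(
    (tf.zeros([8, 224, 224, 3]), tf.zeros([8], tf.int64))).batch(4)  # dummy data
train_ds = train_ds.map(lambda img, lbl: (augment(img, training=True), lbl),
                        num_parallel_calls=tf.data.AUTOTUNE)

In both placements the transformations are drawn at random on the fly, so each epoch sees a differently-transformed copy of the same N originals rather than a fixed, pre-expanded 10N set.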
For more information, see the Keras guide.
I was comparing results from the same model generated and used independently in two frameworks (ML.NET and TensorFlow/Keras).
These are the steps I followed to get the results:
Train the model in Keras and save it in .h5 format.
Convert it to a .pb model for use in ML.NET, since ML.NET currently consumes TensorFlow graphs rather than the SavedModel format (a rough sketch of this conversion is shown below).
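This is a simplified version of that conversion (not my exact code; the path and input spec are placeholders):

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2

model = tf.keras.models.load_model("model.h5")  # placeholder path

# Wrap the Keras model in a concrete function and freeze its variables into
# constants so ML.NET can consume a plain GraphDef.
concrete_func = tf.function(lambda x: model(x)).get_concrete_function(
    tf.TensorSpec(model.inputs[0].shape, model.inputs[0].dtype))
frozen_func = convert_variables_to_constants_v2(concrete_func)

# Write the frozen graph to disk as model.pb.
tf.io.write_graph(frozen_func.graph.as_graph_def(),
                  logdir=".", name="model.pb", as_text=False)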
Now, for the same two images, the prediction values should be the same, since the model is the same; the only difference is the underlying framework.
I could find only bits and pieces about why this occurs. I am running both programs with the same underlying hardware and software configuration.
Is the underlying cause the difference in framework?
The image prediction scores for example images:
ml.net: 0.9929248 (Fail)
keras/tensorflow: 0.99635077 (Fail)
The difference is minute, but being deterministic irrespective of platform is a feature that I am looking for.
I have been studying transfer learning with models like inception_v4 and inception_resnet_v2. I found some projects that use bottlenecks and some that use TFRecords to store the training images. When retraining the inception_v4 model on the same data with those two methods, the bottleneck approach gave 95% accuracy and TFRecords gave only 75%. However, all the newer projects seem to use TFRecords for the data and the .ckpt format to store the model. Can someone explain what the difference is, and which one is better in which case?
If you are working with large datasets, using a binary file format to store your data can have a significant impact on the performance of your import pipeline and, as a result, on the training time of your model.
TFRecords also make it possible to store sequence data, for example a series of data points. Besides, it is easy to combine multiple datasets, and the format integrates seamlessly with the data import and preprocessing functionality provided by the library.
For more information about TFRecords, please refer to this link.
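For a concrete picture, writing and reading a single example looks roughly like this (a small sketch assuming a TF version where these helpers live under tf.io; the feature names and file paths are placeholders):

import tensorflow as tf

image_bytes = open("example.jpg", "rb").read()  # any JPEG file; path is a placeholder
label = 1

# Write one example: the raw image bytes plus an integer label.
with tf.io.TFRecordWriter("data.tfrecords") as writer:
    example = tf.train.Example(features=tf.train.Features(feature={
        "image": tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }))
    writer.write(example.SerializeToString())

# Read it back through the tf.data import pipeline.
def parse(serialized):
    return tf.io.parse_single_example(serialized, {
        "image": tf.io.FixedLenFeature([], tf.string),
        "label": tf.io.FixedLenFeature([], tf.int64),
    })

dataset = tf.data.TFRecordDataset("data.tfrecords").map(parse)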
I have an image classification task where I've created multiple crops of each image as well as flipped/flopped versions to extend my limited dataset. I have written the dataset to a tfrecords file where each record consists of (simplified here to two crops and only a flipped version):
{
lbl: int,
crop_0: np.ndarray,
crop_1: np.ndarray,
crop_0_flipped: np.ndarray,
crop_1_flipped: np.ndarray
}
Basically 4 images / entry. During training, I'd like to treat each image as separate, i.e. feed each record as 4 images with the same label, shuffled with the rest of the images in the dataset, so that N images becomes 4N images. During testing (using a separate but similarly structured dataset), I'd like to take each image, only use the crop_0 and crop_1 images and average the softmax outputs for classification.
My question is: what is the best and most efficient way of training on such a dataset? I'm willing to change my approach if it will make training more efficient. It seems that the simplest thing to do would have been to write separate tfrecords files for each version (crop and flip/flop) and interleave the files into one dataset, but I do not want to have a whole bunch of files to deal with if I can help it.
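For reference, the test-time averaging I have in mind is just this (a tiny numpy sketch with stand-in logits, not my actual model code):

import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Stand-in logits from running the network on crop_0 and crop_1 of a batch of 8 images.
logits_crop_0 = np.random.randn(8, 10)
logits_crop_1 = np.random.randn(8, 10)

# Average the two softmax outputs, then classify.
avg_probs = (softmax(logits_crop_0) + softmax(logits_crop_1)) / 2.0
predicted_labels = avg_probs.argmax(axis=1)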
Writing the dataset to disk with 4N images is an approach you'll come to loathe later (I did it this way originally and loathe that code now). The better way is to keep your original dataset on disk as-is and not write your preprocessing results to disk; do that kind of preprocessing on the CPU while you train. The TensorFlow Dataset preprocessing pipeline makes this easy and modular, and it gives you the parallelization you need to take advantage of multiple cores at no extra coding expense.
This is the main guide:
https://www.tensorflow.org/programmers_guide/datasets
Your approach should be to create 2 Dataset objects, one for train and one for test. The train Dataset pipeline will perform all the data augmentation you mentioned. The test Dataset pipeline will not, naturally.
One key to understanding this approach is that you will not feed the data to tensorflow using feed_dict, instead, tensorflow will just invoke the Dataset pipeline to pull the data it needs for each batch.
To get parallelization you'll use the Dataset.map function to apply your set of transformations, with the num_parallel_calls argument to distribute the operations across multiple cores. If your preprocessing can be done in TensorFlow code, great; if not, you'll need tf.py_func to call Python preprocessing code.
The guide I linked to above describes all of this very well. You will want to use a feedable iterator, described in the section called "Creating an iterator". This will allow you to get a string handle from each of the two datasets (train and test) and pass that handle to TensorFlow via feed_dict to indicate which of the two datasets it should pull samples from.
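A rough sketch of the training side (TF 1.x style to match the guide above; the file layout, crop size, batch size, and augmentation choices are assumptions, and a feedable iterator would replace the one-shot iterator once you add the test Dataset):

import tensorflow as tf

def load_and_augment(path, label):
    # Decode the original image and create crop/flip variants on the fly,
    # instead of storing 4N images in the tfrecords file.
    img = tf.image.decode_jpeg(tf.read_file(path), channels=3)
    img = tf.random_crop(img, [224, 224, 3])   # crop size is an assumption
    img = tf.image.random_flip_left_right(img)
    return img, label

train_paths = ["img_0.jpg", "img_1.jpg"]       # placeholders for your N originals
train_labels = [0, 1]

train_ds = (tf.data.Dataset.from_tensor_slices((train_paths, train_labels))
            .map(load_and_augment, num_parallel_calls=4)  # spread across CPU cores
            .shuffle(10000)
            .batch(32)
            .prefetch(1))

images, labels = train_ds.make_one_shot_iterator().get_next()
# images/labels feed the model graph directly -- no feed_dict required for the data.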
I have code to classify images as nude or non-nude, implemented with deep learning in TensorFlow (Python). The code can be found in the Tensorflow Implementation of Yahoo's Open NSFW Model.
I want to add some more images to the dataset in order to do fine-tuning. How can I do fine-tuning in this implementation using another dataset?
Just load their model and initialize its weights with the ones they provide, similar to how they do it here. Assuming that you are familiar with tensorflow, you should then proceed to train that model on your images.
Besides this blog post, I'm not aware of any other publications the team has made about their work. This is a bit of an issue, as they don't state their training parameters (choice of optimizer, learning rate, etc.). If you want to fine-tune this model, you will have to experiment a bit in this regard.
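In outline, the fine-tuning loop could look something like this (a sketch only; build_open_nsfw_model, load_pretrained_weights, and my_batches are hypothetical stand-ins for whatever the repository and your own data-loading code provide, and the optimizer and learning rate are guesses you will need to tune):

import tensorflow as tf

graph = tf.Graph()
with graph.as_default():
    images = tf.placeholder(tf.float32, [None, 224, 224, 3])
    labels = tf.placeholder(tf.int64, [None])

    logits = build_open_nsfw_model(images)  # hypothetical: the repo's model-building code
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits))

    # Small learning rate so the pretrained weights are only nudged, not overwritten.
    train_op = tf.train.AdamOptimizer(1e-5).minimize(loss)
    saver = tf.train.Saver()

with tf.Session(graph=graph) as sess:
    sess.run(tf.global_variables_initializer())
    load_pretrained_weights(sess)            # hypothetical: however the repo restores its weights
    for batch_images, batch_labels in my_batches():  # hypothetical: your new/extended dataset
        sess.run(train_op, feed_dict={images: batch_images, labels: batch_labels})
    saver.save(sess, "fine_tuned_open_nsfw.ckpt")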
Do they give you the original dataset that the provided model was trained on? If so, you can easily add your own dataset to theirs and train a completely new model on the combined dataset.
I wrote more about this "combined" dataset, where you can add more or less data, here.
Good Luck!