I am training a CNN using Keras with TensorFlow backend, using imgaug for image augmentation.
I am also using TensorBoard to visualize training progress and results.
Since imgaug applies random transformations to the input images, I would like to send some of the augmented images to TensorBoard so that I can inspect them and verify that everything is correct (e.g., to check whether I am applying translations that are too large, or blurring the images too much).
For this I created a custom Keras callback and am trying to put my logic in the on_batch_end method. I can send images to TensorBoard without problems, but I can't find where to access the augmented input images. Any tips on how to achieve this?
Thanks in advance
Better to do that outside training by simply getting images from your generator.
If it's a regular generator:
    for i in range(numberOfBatches):
        x, y = next(generator)
        # plot, print, etc. with the batches
If it's a keras.utils.Sequence:
    for i in range(len(generator)):
        x, y = generator[i]
        # plot, print, etc. with the batches
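As a minimal sketch of the two loops above, the helper below collects a few batches from either kind of source so they can be inspected or handed to whatever logging call you use. The names take_batches and toy_generator are hypothetical; toy_generator only stands in for a real imgaug-backed generator.

```python
import numpy as np

def take_batches(source, num_batches):
    """Collect up to `num_batches` (x, y) batches from `source`.

    Index-based sources (anything with len() and [], such as a
    keras.utils.Sequence) are read by index; plain generators via next().
    """
    batches = []
    if hasattr(source, "__getitem__") and hasattr(source, "__len__"):
        for i in range(min(num_batches, len(source))):
            batches.append(source[i])
    else:
        for _ in range(num_batches):
            batches.append(next(source))
    return batches

def toy_generator(batch_size=4, seed=0):
    """Hypothetical stand-in for a generator that yields augmented batches."""
    rng = np.random.default_rng(seed)
    while True:
        x = rng.random((batch_size, 32, 32, 3), dtype=np.float32)
        y = rng.integers(0, 2, size=batch_size)
        yield x, y

batches = take_batches(toy_generator(), num_batches=3)
# Stack the image batches into one array for plotting or logging.
images = np.concatenate([x for x, _ in batches], axis=0)
print(images.shape)
```

Because this runs outside training, it does not depend on what a callback can see, and the same check can be repeated whenever the augmentation settings change.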
This may be a silly question, but how exactly do the preprocessing layers in Keras work, especially when they are part of the model itself, compared with applying preprocessing outside the model and then feeding the results in for training?
I'm trying to understand running data augmentation in Keras models. Let's say I have 1000 images for training. Outside the model, I can apply augmentation 10x and get 10000 resulting images for training.
But I don't understand what happens when you use a preprocessing layer for augmentation. Does this layer (or these layers, if you use several) take each image and apply the transformations before training? Does this mean the total number of images used for training (and validation, I assume) is the number of epochs times the original number of images?
Is one option better than the other? Does that depend on the number of images one originally has before augmentation?
The benefit of preprocessing layers is that the model is truly end-to-end, i.e. raw data comes in and a prediction comes out. It makes your model portable since the preprocessing procedure is included in the SavedModel.
However, the preprocessing then runs on the same device as the rest of the model, typically the GPU. Usually it makes sense to load and preprocess the data using CPU worker(s) in the background while the GPU optimizes the model.
Alternatively, you could use a preprocessing layer outside of the model and inside a Dataset. The benefit of that is that you can easily create an inference-only model including the layers, which then gives you the portability at inference time but still the speedup during training.
For more information, see the Keras guide.
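To make the counting in the question concrete, here is a small NumPy sketch of on-the-fly augmentation. It is not the Keras implementation: random_flip is a hypothetical stand-in for a random augmentation layer, and the array sizes are made up. The point is only that a fresh random transformation is drawn every time a batch passes through during training.

```python
import numpy as np

def random_flip(batch, rng):
    """Flip each image left-right with probability 0.5 (batch is N x H x W x C)."""
    flip = rng.random(len(batch)) < 0.5
    out = batch.copy()
    out[flip] = out[flip, :, ::-1]
    return out

rng = np.random.default_rng(0)
images = rng.random((1000, 8, 8, 1))  # stand-in for 1000 training images
epochs = 10

samples_seen = 0
for epoch in range(epochs):
    augmented = random_flip(images, rng)  # new random draw in every epoch
    samples_seen += len(augmented)

print(len(images))    # 1000: the dataset size per epoch does not change
print(samples_seen)   # 10000: augmented samples seen over 10 epochs
```

So each epoch still contains the original number of images, but the model sees a differently augmented version of each one per epoch. Keras random augmentation layers are only active in training mode, so validation images pass through unchanged.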
I followed a tutorial to make my first Convolutional Neural Network using Keras and I have a small question regarding the rescaling step.
So when we are importing the training set and test set, we create an instance of the tf.keras.preprocessing.image.ImageDataGenerator class and use it as:
train_datagen = ImageDataGenerator(rescale=1/255)
Along with some other augmentation parameters. My understanding is that we use the rescale parameter to normalize the pixel values of the images imported.
But when we load up a single image to run through the CNN, we write something like (code from keras docs):
image = tf.keras.preprocessing.image.load_img(image_path)
input_arr = tf.keras.preprocessing.image.img_to_array(image)
input_arr = np.array([input_arr]) # Convert single image to a batch.
predictions = model.predict(input_arr)
My question is, I cannot see the single input image being rescaled anywhere. Is it being done implicitly, or is there no need to actually perform rescaling? If the latter, then why is it so?
The image must be normalized (divided by 255) at prediction time if that was done during training. Otherwise the network receives inputs on a different scale from the one it was trained on and will not be able to interpret them correctly.
Likewise, when we use test_datagen, we apply rescaling by 1/255 for the predict generator.
Normalization, mean subtraction, and division by the standard deviation need to be done at test time if they were applied during the training stage.
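A minimal sketch of the single-image case, assuming the model was trained with rescale=1/255. The preprocess helper and the random stand-in image are hypothetical; in practice the array would come from load_img and img_to_array as in the question.

```python
import numpy as np

def preprocess(image_array, scale=1.0 / 255):
    """Apply the same rescaling that was used for the training images."""
    return image_array.astype("float32") * scale

# Stand-in for the array returned by img_to_array (values in 0..255).
raw = np.random.default_rng(0).integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

input_arr = preprocess(raw)                    # values now in [0, 1]
input_arr = np.expand_dims(input_arr, axis=0)  # convert single image to a batch
print(input_arr.shape, input_arr.dtype)
# predictions = model.predict(input_arr)  # `model` is the trained network (not defined here)
```

Keeping the rescaling in one function that is used for both training and prediction inputs makes it harder for the two paths to drift apart.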
I trained a CNN in TensorFlow on the MNIST data set, and it worked very well on the test data. To check the model further, I built another set by randomly taking images from the train and test sets; I removed those images from both sets so the model never saw them during training. It worked very well on that set too. But with an image downloaded from Google, it doesn't classify well. So my question is: do I have to apply any filter to that image before I give it to the prediction part?
I resized the image and converted it to grayscale beforehand.
MNIST is an easy dataset. Your model (CNN) structure may do quite well on MNIST, but there is no guarantee that it does well on more complex images too. You can add some more layers and try different activation functions (like ReLU, ELU, etc.). Normalizing your image pixel values to a small range, such as between -1 and 1, may help too.
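As a rough sketch of the normalization mentioned above, the helper below maps an 8-bit 28x28 grayscale image to the range [-1, 1]. It assumes the model was trained on inputs scaled the same way. The invert flag is an extra assumption, not part of the answer: MNIST digits are light on a dark background, so a downloaded dark-on-light image may need its intensities flipped first. The function name to_model_input is hypothetical.

```python
import numpy as np

def to_model_input(gray, invert=False):
    """Scale an 8-bit 28x28 grayscale image to [-1, 1] and add batch/channel axes."""
    x = gray.astype("float32")
    if invert:
        x = 255.0 - x        # assumption: source image is dark digit on light background
    x = x / 127.5 - 1.0      # 0 -> -1.0, 255 -> 1.0
    return x.reshape(1, 28, 28, 1)

# Stand-in for a resized grayscale image: white page with a dark stroke.
page = np.full((28, 28), 255, dtype=np.uint8)
page[10:18, 12:16] = 0

x = to_model_input(page, invert=True)
print(x.shape, x.min(), x.max())
```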
I'm currently working on a program that can do binary image classification with machine learning. I have a list of labels and a list of images that I'm using as inputs, which are then fed into the Inception V3 model.
Will inputting the dataset this way work with the Inception V3 architecture? Is it necessary to organize the images into labeled folders before feeding them into the model?
Thanks for your help!
In your example, you have all the images in memory. You can simply call model.fit(trainX, trainY) to train your model. No need to organize the images in specific folder structures.
What you are referring to is the flow_from_directory() method of the ImageDataGenerator. This is an object that will yield images from the directories and automatically infer the labels from the folder structure. In this case, your images should be arranged in one folder per label. Since the ImageDataGenerator is a generator, you should use it in combination with model.fit_generator() (in TensorFlow 2.1 and later, model.fit() accepts generators directly and fit_generator() is deprecated).
As a third option, you can write your own custom generator that yields both images and labels. This is advised in case you have a more complex label structure than one label per image, for instance in multi-label classification, object detection, or semantic segmentation, where the outputs are also images. A custom generator should be passed to the same fitting method as above.
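A minimal sketch of such a custom generator, assuming the images and labels already sit in NumPy arrays. The name batch_generator, the array shapes, and the five-label multi-label setup are illustrative assumptions only.

```python
import numpy as np

def batch_generator(images, labels, batch_size, shuffle=True, seed=0):
    """Yield (image_batch, label_batch) pairs forever, reshuffling each pass."""
    rng = np.random.default_rng(seed)
    n = len(images)
    while True:
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            yield images[idx], labels[idx]

# Hypothetical data: 10 RGB images with 5 independent binary labels each.
images = np.zeros((10, 16, 16, 3), dtype="float32")
labels = np.random.default_rng(1).integers(0, 2, size=(10, 5)).astype("float32")

batch_size = 4
gen = batch_generator(images, labels, batch_size)
x, y = next(gen)
print(x.shape, y.shape)

# The generator never stops, so the number of batches per epoch must be given explicitly.
steps_per_epoch = int(np.ceil(len(images) / batch_size))
print(steps_per_epoch)
```

Because the generator is infinite, the fitting call needs to be told how many batches make up one epoch, for example through its steps_per_epoch argument.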
I am using the Keras ImageDataGenerator() functionality to generate training and validation datasets. I am trying to understand what this function does internally. What preprocessing steps does it perform?
Where can I find the source code of this function?
You can find any source code at:
Here is the ImageDataGenerator:
The keras documentation page also has links that lead you there:
Internally, the ImageDataGenerator applies a series of data augmentation procedures to the images you provide and also prepares a Python generator for you to use when fitting your models.
There are several data augmentation methods available; you can get an idea of what they are from the documentation page above.
Generators are used to create batches in a loop. In this case, one batch of images at a time.
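Instead of using model.fit() on in-memory arrays, you will use model.fit_generator() with either ImageDataGenerator.flow() or ImageDataGenerator.flow_from_directory(). (In TensorFlow 2.1 and later, fit_generator() is deprecated and model.fit() accepts these generators directly.)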