I have a training dataset which is prepared as follows:
FileName
s01_l01/1_1.png
Labels_onehot
[[0.0, 0.0, 0.0,..., 1.0],..., [0.0, 0.0, 1.0,..., 0.0]]
FileName directly leads to the input images. I would like to preprocess the images (resize them to (224, 224) and rescale them (1./255). Labels_onehot are the one hot encoded labels. These do not have to be further preprocessed.
In the end, the preprocessed images are my X_train data.
X_train.shape
(n, 224, 224, 3)
The labels are the y_train data.
y_train.shape
(n, 8, 36)
Until now, I am preprocessing the images manually and then loading the preprocessed images (one dataset) and the labels (other dataset). This is not a very efficient way and fills out my RAM completely.
But in the end I am able to fit my model relatively easy, e.g. with this code:
model.fit(X_train, y_train, epochs=50, batch_size=32)
Now, I am wondering, how to achieve the same with the ImageDataGenerator. Is there a method to do all the steps I did before manually?
ImageDataGenerator has a number of methods
flow: Takes feature data and label arrays, and returns an iterator that generates batches of augmented data.
flow_from_directory: Takes a path to a directory and generates batches of augmented data.
First, we will create an instance of ImageDataGenerator.
aug = tf.keras.preprocessing.image.ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    shear_range=0.2,
    zoom_range=0.4,
    horizontal_flip=True)
rotation_range: degree range for random rotations (0 to 180).
width_shift_range and height_shift_range: used for random horizontal and vertical shifts, respectively.
zoom_range: the amount of zoom. If a scalar z, the zoom factor is picked randomly in the range [1-z, 1+z].
horizontal_flip: whether to randomly flip images horizontally.
vertical_flip: whether to randomly flip images vertically.
rescale: rescaling factor by which each image is multiplied. This can be a useful way of normalising the data, by specifying a rescaling factor of 1.0/255.
The flow method returns a generator, a Python iterator object that is used to construct our augmented batches:
flow_train_generator = aug.flow(X_train, y_train, batch_size=32)
The flow_from_directory method returns an iterator that yields tuples of (x, y), where x is a NumPy array containing a batch of images and y is a NumPy array of the corresponding labels.
flow_dict_train_generator = aug.flow_from_directory(
    'data/train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary')
Please be aware that flow_from_directory can slow down model training, since each new batch is read directly from disk and augmented on the fly before being passed to the network.
For more details, refer to the link below:
https://www.tensorflow.org/api_docs/python/tf/keras/preprocessing/image/ImageDataGenerator
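To get closer to the original setup (X_train of shape (n, 224, 224, 3) and one-hot labels of shape (n, 8, 36)), the flow method can pair the image array with the label array directly and take over the 1./255 rescaling per batch. A minimal sketch with synthetic stand-in data (the shapes and names mirror the question; they are not taken from a real dataset):

```python
import numpy as np
import tensorflow as tf

# synthetic stand-ins for the real data: n images already resized to 224x224
n = 16
X_train = np.random.randint(0, 256, size=(n, 224, 224, 3)).astype("float32")
y_train = np.random.rand(n, 8, 36).astype("float32")

# the generator takes over the 1./255 rescaling step
aug = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)

# flow keeps each image paired with its label row, whatever the label shape
train_generator = aug.flow(X_train, y_train, batch_size=8)

x_batch, y_batch = next(train_generator)
print(x_batch.shape, y_batch.shape)  # (8, 224, 224, 3) (8, 8, 36)
```

The generator can then be passed straight to model.fit(train_generator, epochs=50).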
I have a question about testing the model. I created a test set using tf.keras.utils.image_dataset_from_directory as follows:
batch_size = 32
test_dataset = tf.keras.utils.image_dataset_from_directory(
    '/content/drive/MyDrive/test',
    image_size=(224, 224),
    batch_size=batch_size,
    shuffle=False
)
and I get the output as Found 150 files belonging to 3 classes.
After that, I want to iterate over the test dataset batches by using:
labels_batch = []
for dataset in test_dataset.unbatch():
    image_batch, label_batch = dataset
    labels = label_batch.numpy()
    labels_batch.append(labels)
I understand that each element of the dataset is a tuple of two positions, image_batch and label_batch, both of class tensorflow.python.framework.ops.EagerTensor.
Therefore, image_batch[0] should mean the first image in test_dataset. But when I print the first image with print(image_batch[0]), the array shown has shape=(224, 3), whereas I think each image should have shape=(224, 224, 3).
So what command do I have to use to access the array of each image?
I use TensorFlow version 2.9 in Google Colab. I'm not sure about test_dataset.unbatch(). Is the problem here or not?
The unbatch method actually returns each individual image. To get an iterator that returns a batch on each iteration, you should call the batch method instead, or just iterate over the dataset directly, i.e.:
for dataset in test_dataset:
So in your code, image_batch is a single image of shape (224, 224, 3), and image_batch[0] is an array of shape (224, 3), because you sliced along the first dimension.
You might want to check the tf.data.Dataset documentation for a description of each method.
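The difference can be seen on a small synthetic tf.data pipeline (the dataset and its shapes are illustrative, not the questioner's real data):

```python
import tensorflow as tf

# synthetic dataset: 6 images with integer labels, batched in pairs
images = tf.random.uniform((6, 224, 224, 3))
labels = tf.range(6)
test_dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(2)

# iterating the batched dataset yields whole batches
for image_batch, label_batch in test_dataset:
    print(image_batch.shape)  # (2, 224, 224, 3)
    break

# unbatch() yields individual examples, so indexing [0] slices into one image
for image, label in test_dataset.unbatch():
    print(image.shape)     # (224, 224, 3)
    print(image[0].shape)  # (224, 3) -- the shape seen in the question
    break
```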
I am trying to build a CNN model for a regression problem with a limited amount of input data (400 samples). The inputs are images, and the labels are extracted from a column of a CSV file. To increase the input data, I need to augment the input images and match them with the existing labels. I am using rotation and flipping augmentation methods. I am not sure how the existing labels should be linked to the augmented images, and how the final TensorFlow dataset should be created to fit the model. Can anyone help me solve this data augmentation problem?
import os
import cv2
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.model_selection import train_test_split

# load csv file
labelPath = "/content/drive/MyDrive/Notebook/tepm.csv"
cols = ["temperature"]
df = pd.read_csv(labelPath, sep=" ", header=None, names=cols)

# load in the images
inputPath = '/content/drive/MyDrive/Notebook/test_png_64'
images = []
for filepath in os.listdir(inputPath):
    images.append(cv2.imread(inputPath + '/{0}'.format(filepath),
                             flags=(cv2.IMREAD_ANYCOLOR | cv2.IMREAD_ANYDEPTH)))
images_scaled = np.array(images, dtype="float") / 255.0

(trainY, testY, trainX, testX) = train_test_split(df, images_scaled, test_size=0.20, random_state=42)
(trainY, valY, trainX, valX) = train_test_split(trainY, trainX, test_size=0.20, random_state=42)
def rotate(trainX: tf.Tensor) -> tf.Tensor:
    # rotate by a random multiple of 90 degrees
    return tf.image.rot90(trainX, tf.random.uniform(shape=[], minval=0, maxval=4, dtype=tf.int32))

def flip(trainX: tf.Tensor) -> tf.Tensor:
    trainX = tf.image.random_flip_left_right(trainX)
    trainX = tf.image.random_flip_up_down(trainX)
    return trainX
Update with ImageDataGenerator:
datagen = ImageDataGenerator(
    vertical_flip=True,
    horizontal_flip=True,
    fill_mode="nearest")
datagen.fit(trainX)

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.0001),
              loss='mean_squared_error', metrics=['mse'])
ImageDataGenerator should do the trick. It generates batches of tensor image data with real-time data augmentation.
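For scalar regression labels specifically, flow keeps each augmented image matched to its original label, so nothing extra is needed for label-preserving augmentations. A minimal sketch with random stand-in data (array names follow the question; sizes are illustrative):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# stand-ins for the real data: 8 images of 64x64x3 and 8 temperature labels
trainX = np.random.rand(8, 64, 64, 3).astype("float32")
trainY = np.random.rand(8, 1).astype("float32")

datagen = ImageDataGenerator(vertical_flip=True, horizontal_flip=True)

# each augmented image keeps the label of the image it was derived from
gen = datagen.flow(trainX, trainY, batch_size=4)
x_batch, y_batch = next(gen)
print(x_batch.shape, y_batch.shape)  # (4, 64, 64, 3) (4, 1)
```

model.fit(gen, epochs=...) then trains on freshly augmented batches each epoch.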
Initially you should consider whether augmentations that preserve the labels are useful, or augmentations that require matching label augmentation, or both.

If I am following your code correctly, you have temperature for scalar labels. Without knowing the nature of your images, I'd guess it unlikely that rotations and flips would be temperature-dependent, thus the labels are preserved and you are all set to go with ImageDataGenerator as is. Whether or not those augmentations will help the training is hard to know without trying it.

Conversely, ImageDataGenerator does have a brightness augmentation, which is the sort of thing I could imagine being temperature-dependent in an image. In that case, the labels aren't preserved and you'd have to augment them manually, because I don't think ImageDataGenerator has methods for scalar labels.

In my experience, it is the latter sort of augmentations (labels not preserved) which are more obviously useful. But to get matching label augmentation you may have to do a little more manual coding than what comes stock with ImageDataGenerator; fortunately it might not be too hard.
Some of the basic elements for matching label augmentation might go like this (this is not complete code, just snippets):
Set up the subset of parameters for ImageDataGenerator augmentation that make sense for your scalar labels in a convenience dict:
regression_aug = dict(fill_mode='nearest',
                      rotation_range=3,
                      width_shift_range=0.1,
                      height_shift_range=0.1)
Use the ImageDataGenerator method get_random_transform:
self.tparams[i] = self.generator.get_random_transform(self.img_dims)
Apply it to the training image, and further manually apply it to the scalar label(s):
batch_X[i] = self.generator.apply_transform(img[i], self.tparams[i])
batch_y[i, 0] = self.lbl[x, 0] - self.tparams[i]['tx']
batch_y[i, 1] = self.lbl[x, 1] - self.tparams[i]['ty']
batch_y[i, 2] = self.lbl[x, 2] - self.tparams[i]['theta']
where in this example case I had scalar labels that consisted of position and orientation, such that they could be sensibly be translated and rotated during augmentation.
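Put together as a runnable sketch (a synthetic image and a hypothetical (x-shift, y-shift, angle) label stand in for real data; get_random_transform and apply_transform are the actual ImageDataGenerator methods):

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# illustrative data: one 64x64 RGB image and a label of (x-shift, y-shift, angle)
img = np.random.rand(64, 64, 3).astype("float32")
label = np.array([5.0, -3.0, 10.0])

generator = ImageDataGenerator(fill_mode='nearest',
                               rotation_range=3,
                               width_shift_range=0.1,
                               height_shift_range=0.1)

# draw one random transform and apply it to the image
tparams = generator.get_random_transform(img.shape)
aug_img = generator.apply_transform(img, tparams)

# apply the matching correction to the label, mirroring the snippets above
aug_label = label.copy()
aug_label[0] -= tparams['tx']
aug_label[1] -= tparams['ty']
aug_label[2] -= tparams['theta']
print(aug_img.shape, aug_label.shape)
```

(apply_transform relies on SciPy for the affine warps, so scipy needs to be installed.)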
I created an image dataset for image classification. I used flow_from_directory for creating the dataset. Its documentation says: "A DirectoryIterator yielding tuples of (x, y) where x is a numpy array containing a batch of images with shape (batch_size, *target_size, channels) and y is a numpy array of corresponding labels."
from tensorflow.keras.preprocessing.image import ImageDataGenerator
train_path='rock_paper_scissors/rps'
val_path='rock_paper_scissors/rps-test-set'
train_data=ImageDataGenerator(rescale=1/255.0)
train = train_data.flow_from_directory(
    directory=train_path,
    target_size=(150, 150),
    color_mode='grayscale',
    class_mode='categorical',
    batch_size=840
)
print(len(train))
There are a total of 840 images in each of the three sub-directories, so 2520 images overall. When I set the batch_size to 840, len(train) = 3, but when batch_size is set to 600, len(train) = 5.
Why is that? Shouldn't it be 3 as well?
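The reported length follows from ceiling division: a DirectoryIterator's len is ceil(total_samples / batch_size), counting the final, smaller batch. With 3 x 840 = 2520 images in total, this reproduces both numbers:

```python
import math

total_images = 3 * 840  # three sub-directories with 840 images each
for batch_size in (840, 600):
    # DirectoryIterator length is ceil(samples / batch_size)
    print(batch_size, math.ceil(total_images / batch_size))
# 840 -> 3 batches; 600 -> 5 batches (the fifth holds the last 120 images)
```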
I'm trying to build an image classification model. It's a 4-class image classification problem. Here is my code for building the image generators and running the training:
train_datagen = ImageDataGenerator(rescale=1./255.,
                                   rotation_range=30,
                                   horizontal_flip=True,
                                   validation_split=0.1)
train_generator = train_datagen.flow_from_directory(train_dir, target_size=(299, 299),
                                                    class_mode='categorical', batch_size=20,
                                                    subset='training')
validation_generator = train_datagen.flow_from_directory(train_dir, target_size=(299, 299),
                                                         class_mode='categorical', batch_size=20,
                                                         subset='validation')
model.compile(Adam(learning_rate=0.001), loss='categorical_crossentropy',
metrics=['accuracy'])
model.fit_generator(train_generator, steps_per_epoch=int(440/20), epochs=20,
validation_data=validation_generator,
validation_steps=int(42/20))
I was able to get training and validation to work perfectly, because the images in the train directory are stored in a separate folder for each class. But the test directory has 100 images and no folders inside it; it doesn't have any labels and only contains image files.
How can I do prediction on the image files in test folder using Keras?
If you are interested in only performing prediction, you can load the images with a simple hack like this:
test_datagen = ImageDataGenerator(rescale=1/255.)
test_generator = test_datagen.flow_from_directory(
    'PATH_TO_DATASET_DIR/Dataset',
    # only read images from `test` directory
    classes=['test'],
    # don't generate labels
    class_mode=None,
    # don't shuffle
    shuffle=False,
    # use same size as in training
    target_size=(299, 299))
preds = model.predict_generator(test_generator)
You can access test_generator.filenames to get a list of corresponding filenames so that you can map them to their corresponding prediction.
Update (as requested in comments section): if you want to map predicted classes to filenames, first you must find the predicted classes. If your model is a classification model, then probably it has a softmax layer as the classifier. So the values in preds would be probabilities. Use np.argmax method to find the index with highest probability:
preds_cls_idx = preds.argmax(axis=-1)
So this gives you the indices of predicted classes. Now we need to map indices to their string labels (i.e. "car", "bike", etc.) which are provided by training generator in class_indices attribute:
import numpy as np
idx_to_cls = {v: k for k, v in train_generator.class_indices.items()}
preds_cls = np.vectorize(idx_to_cls.get)(preds_cls_idx)
filenames_to_cls = list(zip(test_generator.filenames, preds_cls))
Your folder structure should be like testfolder/folderofallclassfiles. Then you can use:
test_generator = test_datagen.flow_from_directory(
    directory=pred_dir,
    class_mode=None,
    shuffle=False
)
Before prediction, I would also call reset() to avoid unwanted outputs.
EDIT:
For your purpose you need to know which image is associated with which prediction. The problem is that the data generator can start at different positions in the dataset each time it is used, giving different outputs every time. So, in order to restart at the beginning of the dataset on each call to predict_generator(), you would need to exactly match the number of iterations and batches to the dataset size.
There are multiple ways to address this:
a) You can inspect the internal batch counter using the generator's batch_index attribute
b) You can create a new data generator before each call to predict_generator()
c) There is a better and simpler way: call reset() on the generator. If you have set shuffle=False in flow_from_directory, it will start over from the beginning of the dataset and give the exact same output each time, so the ordering of test_generator.filenames and test_generator.classes matches the predictions.
test_generator.reset()
Prediction
prediction = model.predict_generator(test_generator, verbose=1, steps=int(np.ceil(numberofimages / batch_size)))
To map the filename with prediction
predict_generator gives its output as probabilities, so first we need to convert them to class indices like 0, 1, ...:
predicted_classes = np.argmax(prediction, axis=1)
The next step is to convert those class indices into actual class names:
l = dict((v, k) for k, v in training_set.class_indices.items())
prednames = [l[k] for k in predicted_classes]
Getting the filenames:
filenames = test_generator.filenames
Finally, creating the DataFrame:
finaldf = pd.DataFrame({'Filename': filenames,'Prediction': prednames})
I'm using the ImageDataGenerator inside Keras to read a directory of images. I'd like to save the result inside a numpy array, so I can do further manipulations and save it to disk in one file.
flow_from_directory() returns an iterator, which is why I tried the following
itr = gen.flow_from_directory('data/train/', batch_size=1, target_size=(32,32))
imgs = np.concatenate([itr.next() for i in range(itr.nb_sample)])
but that produced
ValueError: could not broadcast input array from shape (32,32,3) into shape (1)
I think I'm misusing the concatenate() function, but I can't figure out where I went wrong.
I had the same problem and solved it the following way:
itr.next() returns the next batch of images as two numpy.ndarray objects: batch_x, batch_y. (Source: keras/preprocessing/image.py)
So what you can do is set the batch_size for flow_from_directory to the size of your whole training dataset.
For example, my whole training set consists of 1481 images:
train_datagen = ImageDataGenerator(rescale=1. / 255)
itr = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_width, img_height),
    batch_size=1481,
    class_mode='categorical')
X, y = itr.next()
When using ImageDataGenerator, the data is loaded as a DirectoryIterator.
You can extract it in batches or as a whole:
train_generator = train_datagen.flow_from_directory(
    train_parent_dir,
    target_size=(300, 300),
    batch_size=32,
    class_mode='categorical'
)
the output of which is
Found 3875 images belonging to 3 classes.
To extract the data as one NumPy array (i.e. not batch by batch), this code can be used; drawing each batch once keeps the images and labels paired:
batches = [train_generator.next() for i in range(len(train_generator))]
x = np.concatenate([b[0] for b in batches])
y = np.concatenate([b[1] for b in batches])
print(x.shape)
print(y.shape)
Note: before running this code, it is advised to call train_generator.reset().
the output of above code is
(3875, 300, 300, 3)
(3875, 3)
The output is obtained as a numpy array together, even though it was loaded as batches of 32 using ImageDataGenerator.
To get the output as batches, use the following code:
x = []
y = []
train_generator.reset()
for i in range(len(train_generator)):
    a, b = train_generator.next()
    x.append(a)
    y.append(b)
x = np.array(x)
y = np.array(y)
print(x.shape)
print(y.shape)
the output of the code is
(122,)
(122,)
Here each element of x and y is one batch (3875 images at 32 per batch gives 122 batches, the last one smaller), which is why the arrays have shape (122,) rather than a stacked image shape.
Hope this works as a solution.