How to properly set steps_per_epoch and validation_steps in Keras? - python

I've trained several models in Keras. I have 39,592 samples in my training set and 9,899 in my validation set. I used a batch size of 2.
As I was examining my code, it occurred to me that my generators may have been missing some batches of data.
This is the code for my generator:
train_datagen = ImageDataGenerator(
    rescale=1. / 255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1. / 255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical')
validation_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(224, 224),
    batch_size=batch_size,
    class_mode='categorical')
I searched around to see how my generators behave, and found this answer:
what if steps_per_epoch does not fit into numbers of samples?
I calculated my steps_per_epoch and validation_steps this way:
steps_per_epoch = int(number_of_train_samples / batch_size)
val_steps = int(number_of_val_samples / batch_size)
Using the code in this link with my own batch size and number of samples, I got these results:
"missing the last batch" for train_generator and "weird behavior" for val_generator.
I'm afraid that I have to retrain my models. What values should I choose for steps_per_epoch and validation_steps? Is there a way to use exact values for these variables (other than setting batch_size to 1 or removing some of the samples)? I have several other models with different numbers of samples, and I think they've all been missing some batches. Any help would be much appreciated.
Two related questions:
1- Regarding the models I already trained, are they reliable and properly trained?
2- What would happen if I set these variables using the following values:
steps_per_epoch = np.ceil(number_of_train_samples / batch_size)
val_steps = np.ceil(number_of_val_samples / batch_size)
Will my model see some of the images more than once in each epoch during training and validation? Or is this the solution to my question?

Since a Keras data generator is meant to loop infinitely, steps_per_epoch indicates how many times you fetch a new batch from the generator during a single epoch. Therefore, if you simply take steps_per_epoch = int(number_of_train_samples / batch_size), any final batch with fewer than batch_size items is not fetched in that epoch. In your case this is not a big deal: 39,592 is divisible by 2, so nothing is left over in training, and the 9,899 validation samples leave only 1 image out per pass. To sum up: your models are trained [almost :) ] correctly, because the number of lost elements is minor.
According to the ImageDataGenerator implementation (https://keras.io/preprocessing/image/#imagedatagenerator-class), if your number of steps is larger than expected, then after reaching the last sample you receive new batches from the beginning, because the data is looped over. In your case, with steps_per_epoch = np.ceil(number_of_train_samples / batch_size), you would fetch at most one additional batch per epoch; depending on how the iterator ends a pass, that batch is either a short one holding the leftover samples or one that wraps around and repeats images.
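The effect of rounding down versus up can be checked without Keras. The sketch below is a simplified stand-in for a looping iterator. It assumes that a pass ends with a short final batch before wrapping around, which is an assumption about the iterator rather than something stated in the answer. iterate_batches is a hypothetical helper, shuffling is ignored, and the numbers are the validation set from the question.

```python
def iterate_batches(n_samples, batch_size, steps):
    """Yield lists of sample indices from a looping, non-padding iterator."""
    batch_index = 0
    for _ in range(steps):
        start = (batch_index * batch_size) % n_samples
        end = min(start + batch_size, n_samples)
        yield list(range(start, end))
        # Wrap back to the first sample once the data is exhausted.
        batch_index = batch_index + 1 if end < n_samples else 0


n_val, batch_size = 9899, 2
floor_steps = n_val // batch_size       # int(n / batch_size)
ceil_steps = -(-n_val // batch_size)    # ceiling division

seen_floor = sum(len(b) for b in iterate_batches(n_val, batch_size, floor_steps))
seen_ceil = sum(len(b) for b in iterate_batches(n_val, batch_size, ceil_steps))
last_batch = list(iterate_batches(n_val, batch_size, ceil_steps))[-1]

print(seen_floor, seen_ceil, last_batch)
```

Under this model, rounding down leaves one validation image unseen in the pass, while rounding up covers all of them with a final batch of a single image and no repeats.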

In addition to Greeser's answer,
To avoid losing some training samples, you could calculate your steps with this function:
def cal_steps(num_images, batch_size):
    # calculates steps for generator
    steps = num_images // batch_size
    # adds 1 to the generator steps if the steps multiplied by
    # the batch size is less than the total training samples
    return steps + 1 if (steps * batch_size) < num_images else steps
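The branch in cal_steps amounts to a ceiling division, so the same value can be written as a single expression. A minimal sketch, with ceil_steps as a hypothetical name, using the sample counts from the question:

```python
def cal_steps(num_images, batch_size):
    # Version from the answer above.
    steps = num_images // batch_size
    return steps + 1 if (steps * batch_size) < num_images else steps


def ceil_steps(num_images, batch_size):
    # Integer ceiling division; no floats involved, so the result is an int.
    return -(-num_images // batch_size)


train_steps = ceil_steps(39592, 2)  # divides evenly
val_steps = ceil_steps(9899, 2)     # one extra step for the leftover image

print(train_steps, val_steps)
```

Unlike np.ceil, which returns a float, both functions return a plain integer.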

Related

WARNING:tensorflow:Your input ran out of data; interrupting training

Dataset problem in last train step
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 2000 batches). You may need to use the repeat() function when building your dataset.
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale=1./255,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True)
test_datagen = ImageDataGenerator(rescale=1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
test_set = test_datagen.flow_from_directory('dataset/test_set',
target_size=(64, 64),
batch_size=32,
class_mode='binary')
classifer.fit_generator(training_set,
steps_per_epoch=(8000),
epochs=25,`enter code here`
validation_data=test_set,
validation_steps=2000)
You have this code:
classifer.fit_generator(training_set,
steps_per_epoch=(8000),
epochs=25,`enter code here`
validation_data=test_set,
validation_steps=2000)
The entry 'enter code here' doesn't belong in the fit_generator call; remove it. Also, .fit_generator is deprecated; just use .fit. You do not need to specify steps_per_epoch or validation_steps in .fit when you pass these directory iterators, because their length is known and the steps are calculated internally. However, if you wish to specify them, use
steps_per_epoch = training_set.samples // batch_size
For the validation steps you can use similar code. However, if you want to go through the validation set exactly once per epoch, use this code:
length = test_set.samples  # total number of images in the test set
valid_batch_size = sorted([int(length/n) for n in range(1, length+1) if length % n == 0 and length/n <= 80], reverse=True)[0]
validation_steps = int(length / valid_batch_size)
Use valid_batch_size as the batch size in the flow_from_directory call on test_datagen. The code picks the batch size and the number of steps such that
valid_batch_size * validation_steps = total number of images in the test set.
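The one-liner above can be unpacked into a small helper. This is a sketch only: exact_batch_size is a hypothetical name, and the 2000-image test set is an assumption made for illustration, since the question does not state the dataset size.

```python
def exact_batch_size(num_samples, max_batch_size=80):
    """Return (batch_size, steps) with batch_size * steps == num_samples.

    batch_size is the largest divisor of num_samples that does not exceed
    max_batch_size; it falls back to 1 when no larger divisor fits.
    """
    for candidate in range(min(max_batch_size, num_samples), 0, -1):
        if num_samples % candidate == 0:
            return candidate, num_samples // candidate


# 2000 is an assumed test-set size, not a value given in the question.
valid_batch_size, validation_steps = exact_batch_size(2000)

print(valid_batch_size, validation_steps)
```

When the sample count has no divisor below the cap other than 1, the helper returns a batch size of 1; in that situation adding or dropping a sample may be the more practical choice.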

Validation generator accuracy drop near to chance when I change its properties-- keras ImageDataGenerator, model.evaluate

I am reading images from a directory hierarchy (flow_from_directory using generators from the ImageDataGenerator class). The model is a fixed parameter mobilenetv2 + a trainable softmax layer. When I fit the model to training data, accuracy levels are comparable for training and validation. If I play with the validation parameters or reset the generator, accuracy for the validation generator drops significantly using model.evaluate or if I restart fitting the model with model.fit. The database is a 3D view database.
Relevant code:
'''
batch_size=16
rescaled3D_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, zoom_range=0.2,
shear_range=0.2,
horizontal_flip=True)
train_gen =rescaled3D_gen.flow_from_directory(data_directory + '/train/', seed=3,
target_size = (pixels, pixels), shuffle=True,
batch_size = batch_size, class_mode='binary')
val_gen =rescaled3D_gen.flow_from_directory(data_directory + '/test/', seed=3,
target_size = (pixels, pixels), shuffle=True,
batch_size = batch_size, class_mode='binary')
#MODEL
inputs = tf.keras.Input(shape=(None, None, 3), batch_size=batch_size)
x = tf.keras.layers.Lambda(lambda img: tf.image.resize(img, (pixels,pixels)))(inputs)
x = tf.keras.layers.Lambda(tf.keras.applications.mobilenet_v2.preprocess_input)(x)
mobilev2 = tf.keras.applications.mobilenet_v2.MobileNetV2(weights = 'imagenet', input_tensor = x,
input_shape=(pixels,pixels,3),
include_top=True, pooling = 'avg')
#add a dense layer for task-specific categorization.
full_model = tf.keras.Sequential([mobilev2,
tf.keras.layers.Dense(train_gen.num_classes, activation='softmax')])
for idx, layers in enumerate(mobilev2.layers):
layers.trainable = False
mobilev2.layers[-1].trainable=True
full_model.compile(optimizer = tf.keras.optimizers.RMSprop(learning_rate=0.0001),
loss = 'sparse_categorical_crossentropy',
metrics=['accuracy'])
#start fitting
val_gen.reset()
train_gen.reset()
full_model.fit(train_gen,
steps_per_epoch = samples_per_epoch,
epochs=30,
validation_data=val_gen,
validation_steps = int(np.floor(val_gen.samples/val_gen.batch_size)))
good_acc_score = full_model.evaluate(val_gen, steps=val_gen.n//val_gen.batch_size)
'''
reproduce strangeness by doing something like this:
'''
val_gen.batch_size=4
val_gen.reset()
val_gen.batch_size=batch_size
'''
Then validation accuracy is automatically lower (perhaps to chance) during fit or evaluation
'''
bad_acc_score = full_model.evaluate(val_gen, steps=val_gen.n//val_gen.batch_size)
#or
full_model.fit(train_gen,
steps_per_epoch = samples_per_epoch,
epochs=1,
validation_data=val_gen,
validation_steps = int(np.floor(val_gen.samples/val_gen.batch_size)))
'''
Here are a few things you might try. You can eliminate the Lambda layers by changing the generator as follows:
rescaled3D_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, zoom_range=0.2,shear_range=0.2, horizontal_flip=True,
preprocessing_function=tf.keras.applications.mobilenet_v2.preprocess_input)
You do not need the Lambda resize layer since you specify the target size in flow_from_directory.
In val_gen you have shuffle=True. This shuffles the validation image order for each epoch. Better to set it to False for consistency.
In the code for MobileNet you have include_top=True and pooling='avg'. When include_top is True the pooling parameter is ignored.
Setting include_top=True leaves the top layer of your model with a dense layer of 1000 nodes and a softmax activation function.
I would set include_top=False. That way the MobileNet output is a global pooling layer that can directly feed your dense categorization layer.
In the generators you set class_mode='binary', but in model.compile you set the loss to sparse_categorical_crossentropy. This will work, but it is better to compile with loss=BinaryCrossentropy, in which case the final Dense layer needs a single unit with a sigmoid activation.
It is preferable to go through the validation samples exactly one time per epoch for consistency. To do that, the batch size should be selected such that validation samples/batch_size is an integer, and that integer should be used as the number of validation steps.
The code below will do that for you.
b_max=80 # set this to the maximum batch size you will allow based on memory capacity
length=val_gen.samples
batch_size=sorted([int(length/n) for n in range(1,length+1) if length % n ==0 and length/n<=b_max],reverse=True)[0]
val_steps=int(length/batch_size)
Changing validation batch size can change the results of validation loss and accuracy. Generally a larger batch size will result in less fluctuation of the loss but can lead to a higher probability of getting stuck in a local minimum. Try these changes and see if there is less variance in the results.
I think the lead answer from this board solves the problem I observed
Keras: Accuracy Drops While Finetuning Inception

How is flow_from_directory implemented?

My main question is: does it iterate over every sample in the directory in every epoch? I have a directory with 6 classes and almost the same number of samples in each class. When I trained the model with batch_size=16 it didn't work at all and predicted only 1 class correctly. With batch_size=128 it can predict 3 classes with high accuracy, while the other 3 never appear in the test predictions. Why is that? Is each of the steps_per_epoch batches generated independently, so that the generator only remembers the samples of the current batch? That would mean it does not remember the samples of the previous batch and creates a new random batch that may reuse already used samples and miss others. If so, it could miss a whole class's samples, and the only way to overcome this would be to increase batch_size so that everything is covered in one batch. I can't increase batch_size beyond 128 because there is not enough memory on my GPU.
So what should I do?
Here is my code for ImageDataGenerator
train_d = ImageDataGenerator(rescale=1. / 255, shear_range=0.2, zoom_range=0.1, validation_split=0.2,
rotation_range=10.,
width_shift_range=0.1,
height_shift_range=0.1)
train_s = train_d.flow_from_directory('./images/', target_size=(width, height),
class_mode='categorical',
batch_size=32, subset='training')
validation_s = train_d.flow_from_directory('./images/', target_size=(width, height), class_mode='categorical',
subset='validation')
And here is code for fit_generator
classifier.fit_generator(train_s, epochs=20, steps_per_epoch=100, validation_data=validation_s,
validation_steps=20, class_weight=class_weights)
Yes, it iterates over every sample in each folder every epoch, provided steps_per_epoch is large enough to cover the dataset. This is the definition of an epoch: a complete pass over the whole dataset.
steps_per_epoch should be set to len(dataset) / batch_size. The only issue is when the batch size does not exactly divide the number of samples; in that case you round steps_per_epoch up, and the last batch is smaller than batch_size.
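How much of the data one epoch touches follows from steps_per_epoch * batch_size. A rough sketch using the settings from the question (batch_size=32, steps_per_epoch=100); the figure of 6000 training images is made up for illustration, since the question does not give the dataset size.

```python
import math

batch_size = 32
steps_per_epoch = 100
num_train_samples = 6000  # hypothetical; not stated in the question

# Samples drawn in one epoch with the settings from the question.
drawn_per_epoch = steps_per_epoch * batch_size

# Steps needed for one complete pass, rounded up so that the final,
# smaller batch is included.
full_pass_steps = math.ceil(num_train_samples / batch_size)
last_batch_size = num_train_samples - (full_pass_steps - 1) * batch_size

print(drawn_per_epoch, full_pass_steps, last_batch_size)
```

With these assumed numbers an epoch of 100 steps draws 3200 images, so a full pass would need 188 steps, the last of which holds 16 images.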

Keras DataGenerator with a validation set smaller than the batch size performs no validation

I wrote a DataGenerator and initialized a validation_generator. If the batch size specified for training is larger than the size of the validation set, no validation loss/acc is calculated.
If the validation set is larger, everything works fine. Specifying validation_steps does not help.
# Create data generators
training_generator = DataGenerator(partition['train'], embedding_model, **params)
validation_generator = DataGenerator(partition['validation'], embedding_model, **params)
# create LSTM
model = get_LSTM_v1(seq_length, input_dim, hot_enc_dim)
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# train LSTM
history = model.fit_generator(
generator=training_generator,
validation_data=validation_generator,
epochs=n_epochs,
use_multiprocessing=True,
workers=cpu_cores
)
DataGenerator may need to be modified so that it returns a partial batch when the batch size is larger than the validation set.
Most of the time, the number of batches a generator reports is the floor of the number of samples divided by the batch size. This is zero if the batch size is bigger than the set.
You could also work around it by repeating the data so that there is enough for a full batch when needed.
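One way to return a partial batch is to round the batch count up in __len__ and let slicing shorten the last batch. The class below is a plain-Python sketch, not the asker's DataGenerator (whose code is not shown). In practice it would subclass keras.utils.Sequence; the name PartialBatchGenerator and its fields are hypothetical.

```python
import math


class PartialBatchGenerator:
    """Minimal batch container following the __len__/__getitem__ protocol."""

    def __init__(self, samples, batch_size):
        self.samples = list(samples)
        self.batch_size = batch_size

    def __len__(self):
        # Ceiling instead of floor: 10 samples with batch_size=32 gives
        # 1 batch, where floor division would give 0.
        return math.ceil(len(self.samples) / self.batch_size)

    def __getitem__(self, index):
        start = index * self.batch_size
        # Slicing past the end is safe in Python, so the last batch is
        # simply shorter than batch_size.
        return self.samples[start:start + self.batch_size]


validation_generator = PartialBatchGenerator(range(10), batch_size=32)
num_batches = len(validation_generator)
first_batch = validation_generator[0]

print(num_batches, len(first_batch))
```

The model then has to accept a batch dimension that varies, which is the usual case unless a fixed batch size is baked into the input layer.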

Prevent predict_generator from shuffling batches

I am trying to check the performance of my model on the validation-dataset. As such, I am using predict_generator to return predictions from my validation_generator. However, I am not able to match the predictions with true labels returned from validation_generator.classes since the order of my predictions is mixed up.
This is how I initialize my generator:
BATCH_SIZE = 64
data_generator = ImageDataGenerator(rescale=1./255,
validation_split=0.20)
train_generator = data_generator.flow_from_directory(main_path, target_size=(IMAGE_HEIGHT, IMAGE_SIZE), shuffle=False, seed=13,
class_mode='categorical', batch_size=BATCH_SIZE, subset="training")
validation_generator = data_generator.flow_from_directory(main_path, target_size=(IMAGE_HEIGHT, IMAGE_SIZE), shuffle=False, seed=13,
class_mode='categorical', batch_size=BATCH_SIZE, subset="validation")
#Found 4473 images belonging to 3 classes.
#Found 1116 images belonging to 3 classes.
Now I am using the predict_generator like so:
validation_steps_per_epoch = np.math.ceil(validation_generator.samples / validation_generator.batch_size)
predictions = model.predict_generator(validation_generator, steps=validation_steps_per_epoch)
I realize that there is a mismatch between my validation data size (1116) and the number of samples implied by validation_steps_per_epoch (18 steps of 64 = 1152). Since these two don't match, I find that the output predictions are different each time I run model.predict_generator(...).
Is there any way to fix this besides changing batch_size to 1 in order to make sure that generator steps through all samples?
I found someone with a similar issue here keras predict_generator is shuffling its output when using a keras.utils.Sequence, however his solution does not fix my problem since I am not writing any custom functions.
There is no randomization or shuffling going on. What happens is that the batch size of the validation generator does not exactly divide the number of samples, so the leftover samples spill into the next call to the generator, which throws off the alignment with the labels.
What you could do is set a batch size for the validation generator that exactly divides the number of validation samples, or set the batch size to one.
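For the numbers in this question, the batch sizes that divide the 1116 validation images exactly can be listed directly. A small sketch; the cap of 64 is taken from BATCH_SIZE in the question, and exact_batch_sizes is a hypothetical helper name.

```python
def exact_batch_sizes(num_samples, max_batch_size):
    # Every batch size up to the cap that leaves no partial batch.
    return [b for b in range(1, max_batch_size + 1) if num_samples % b == 0]


candidates = exact_batch_sizes(1116, 64)
batch_size = candidates[-1]   # largest exact divisor not above 64
steps = 1116 // batch_size    # steps for exactly one pass

print(candidates, batch_size, steps)
```

Here the closest choice to 64 is a batch size of 62 with 18 steps, which covers all 1116 images with no leftover.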
