Tensorflow CNN image augmentation pipeline

Tensorflow CNN image augmentation pipeline - python

I'm trying to learn the new Tensorflow APIs and I am a bit lost on where to get a handle on my input batch tensors so I can manipulate and augment them with for example tf.image.
This is the my current network & pipeline:
trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
#...
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))
#...
iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
train_dataset.output_shapes)
features, labels = iterator.get_next()
train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)
#...defining cnn architecture...
# In the train loop
TrainLoop {
sess.run(train_init_op) # switching to train data
sess.run(train_step, ...) # running a train step
#...
sess.run(test_init_op) # switching to test data
test_loss = sess.run(loss, ...) # printing test loss after epoch
}
I'm using the Dataset API creating 2 datasets so that in the trainloop I can calculate the train and test loss and log them.
Where in this pipeline would I manipulate and distort my input batch of images?
I'm not creating any tf.placeholders for my trainX input batches so I can't manipulate them with tf.image because for example tf.image.flip_up_down requires a 3-D or 4-D tensor.
What is the natural way to implement this pipeline with the new API?
Is there a module or easy way to augment an input batch of images for training that would fit in this pipeline?

There's a really good article and talk released recently that go over the API in a lot more detail than my response here. Here's a brief example:
import tensorflow as tf
import numpy as np
def read_data():
n_train = 100
n_test = 50
height = 20
width = 30
channels = 3
trainX = (np.random.random(
size=(n_train, height, width, channels)) * 255).astype(np.uint8)
testX = (np.random.random(
size=(n_test, height, width, channels))*255).astype(np.uint8)
trainY = (np.random.random(size=(n_train,))*10).astype(np.int32)
testY = (np.random.random(size=(n_test,))*10).astype(np.int32)
return trainX, testX, trainY, testY
trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))
def map_single(x, y):
print('Map single:')
print('x shape: %s' % str(x.shape))
print('y shape: %s' % str(y.shape))
x = tf.image.per_image_standardization(x)
# Consider: x = tf.image.random_flip_left_right(x)
return x, y
def map_batch(x, y):
print('Map batch:')
print('x shape: %s' % str(x.shape))
print('y shape: %s' % str(y.shape))
# Note: this flips ALL images left to right. Not sure this is what you want
# UPDATE: looks like tf documentation is wrong and you need a 3D tensor?
# return tf.image.flip_left_right(x), y
return x, y
batch_size = 32
train_dataset = train_dataset.repeat().shuffle(100)
train_dataset = train_dataset.map(map_single, num_parallel_calls=8)
train_dataset = train_dataset.batch(batch_size)
train_dataset = train_dataset.map(map_batch)
train_dataset = train_dataset.prefetch(2)
test_dataset = test_dataset.map(
map_single, num_parallel_calls=8).batch(batch_size).map(map_batch)
test_dataset = test_dataset.prefetch(2)
iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
train_dataset.output_shapes)
features, labels = iterator.get_next()
train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)
with tf.Session() as sess:
sess.run(train_init_op)
feat, lab = sess.run((features, labels))
print(feat.shape)
print(lab.shape)
sess.run(test_init_op)
feat, lab = sess.run((features, labels))
print(feat.shape)
print(lab.shape)
A few notes:
This approach relies on being able to load your entire dataset into memory. If you cannot, consider using tf.data.Dataset.from_generator. This can lead to slow shuffle times if your shuffle buffer is large. My preferred method is to load some keys tensor entirely into memory - it might just be the indices of each example - then map that key value to data values using tf.py_func. This is slightly less efficient than converting to tfrecords, but with prefetching it likely won't affect performance. Since the shuffling is done before the mapping, you only have to load shuffle_buffer keys into memory, rather than shuffle_buffer examples.
To augment your dataset, use tf.data.Dataset.map either before or after the batch operation, depending on whether or not you want to apply a batch-wise operation (something working on a 4D image tensor) or element-wise operation (3D image tensor). Note it looks like the documentation for tf.image.flip_left_right is out of date, since I get an error when I try and use a 4D tensor. If you want to augment you data randomly, use tf.image.random_flip_left_right rather than tf.image.flip_left_right.
If you're using a tf.estimator.Estimator (or wouldn't mind converting your code to using it), then check out tf.estimator.train_and_evaluate for an in-built way of switching between datasets.
Consider shuffling/repeating your dataset with the shuffle/repeat methods. See the article for notes on efficiencies. In particular, repeat -> shuffle -> map -> batch -> batch-wise map -> prefetch seems to be the best ordering of operations for most applications.

Related

How to check preprocessing time/speed in Colab?

I am training a neural network on Google Colab GPU. Therefore, I synchronized the input images (180k in total, 105k for training, 76k for validation) with my Google Drive. Then I mount the Google Drive and go from there.
I load a csv-file with image paths and labels in Google Colab and store it as pandas dataframe.
After that I use a list of image paths and labels.
I take this function to get my labels onehot-encoded because I need a special output shape (7, 35) per label, which cannot be done by the existing default functions:
#One Hot Encoding der Labels, Zielarray hat eine Shape von (7,35)
from numpy import argmax
# define input string
def my_onehot_encoded(label):
# define universe of possible input values
characters = '0123456789ABCDEFGHIJKLMNPQRSTUVWXYZ'
# define a mapping of chars to integers
char_to_int = dict((c, i) for i, c in enumerate(characters))
int_to_char = dict((i, c) for i, c in enumerate(characters))
# integer encode input data
integer_encoded = [char_to_int[char] for char in label]
# one hot encode
onehot_encoded = list()
for value in integer_encoded:
character = [0 for _ in range(len(characters))]
character[value] = 1
onehot_encoded.append(character)
return onehot_encoded
After that I use a customized DataGenerator to get the data in batches into my model. x_set is a list of image paths to my images and y_set are the onehot-encoded labels:
class DataGenerator(Sequence):
def __init__(self, x_set, y_set, batch_size):
self.x, self.y = x_set, y_set
self.batch_size = batch_size
def __len__(self):
return math.ceil(len(self.x) / self.batch_size)
def __getitem__(self, idx):
batch_x = self.x[idx*self.batch_size : (idx + 1)*self.batch_size]
batch_x = np.array([resize(imread(file_name), (224, 224)) for file_name in batch_x])
batch_x = batch_x * 1./255
batch_y = self.y[idx*self.batch_size : (idx + 1)*self.batch_size]
batch_y = np.array(batch_y)
return batch_x, batch_y
And with this code I apply the DataGenerator to my data:
training_generator = DataGenerator(X_train, y_train, batch_size=32)
validation_generator = DataGenerator(X_val, y_val, batch_size=32)
When I now train my model one epoch lasts 25-40 minutes which is very long.
model.fit_generator(generator=training_generator,
validation_data=validation_generator,
steps_per_epoch = num_train_samples // 16,
validation_steps = num_val_samples // 16,
epochs = 10, workers=6, use_multiprocessing=True)
I now was wondering how to measure preprocessing time because I don't think it is due to the model size, because I already experimented with models with fewer parameters but the time for training did not reduce significantly... So, I am suspicious regarding the preprocessing...

To measure time in Colab, you can use this autotime package:
!pip install ipython-autotime
%load_ext autotime
Additionally for profiling, you can use %time as mentioned here.
In general to ensure generator runs faster, suggest you to copy the data from gdrive to local host of that colab, otherwise it can get slower.
If you are using Tensorflow 2.0, cause could be this bug.
Work arounds are:
Call tf.compat.v1.disable_eager_execution() at the start of the code
Use model.fit rather than model.fit_generator. The former supports generators anyway.
Downgrade to TF 1.14
Regardless of Tensorflow version, limit how much disk access you are doing, this that is often a bottleneck.
Note that there does seem to be an issue with generators being slow in TF
1.13.2 and 2.0.1 (at least).

Keras Multi Input Network, using Images and structured data : How do I build the correct input data?

I am building a multi input Network using the Keras functionnal API, but I struggle to find and understand the right format for my input data throw the network.
I have two main input :
One is an image, that goes throw a fine-tuned ResNet50 CNN
The second is a simple numpy array (X_train) containing metadata about the image (position and size of the image). This one goes throw a simple dense network.
I load the images from a dataframe, containing the metadata, and the filepath to the corresponding image.
I use ImageDataGenerator and the flow_from_dataframe method to load my images :
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_flow = datagen.flow_from_dataframe(
dataframe=df_train,
x_col="cropped_img_filepath",
y_col="category",
batch_size=batch_size,
shuffle=False,
class_mode="categorical",
target_size=(224,224)
)
I can train the two networks separately using their own data, no problems until here.
The two output of the two distinct networks are then combined to a dense network to output a 10 digits probability vector :
# Create the input for the final dense network using the output of both the dense MLP and CNN
combinedInput = concatenate([cnn.output, mlp.output])
x = Dense(512, activation="relu")(combinedInput)
x = Dense(256, activation="relu")(x)
x = Dense(128, activation="relu")(x)
x = Dense(32, activation="relu")(x)
x = Dense(10, activation="softmax")(x)
model = Model(inputs=[cnn.input, mlp.input], outputs=x)
# Compile the model
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="categorical_crossentropy",
metrics=['accuracy'],
optimizer=opt)
# Train the model
model_history = model.fit(x=(train_flow, X_train),
y=y_train,
epochs=1,
batch_size=batch_size)
However, when I cannot train the overall network, I get the following error :
ValueError: Failed to find data adapter that can handle input: (<class 'tuple'> containing values of types {"<class 'keras_preprocessing.image.dataframe_iterator.DataFrameIterator'>", "<class 'numpy.ndarray'>"}), <class 'pandas.core.series.Series'>
I understand I am not using the correct input format for my input data.
I can train my CNN with the train_flow, and my dense network with X_train, so I was hoping this would work.
Do you have any idea of how to combine image data and nump array into a multi input array ?
Thank you for all the information you can give me!

I finally found how to do it, inspiring me from the post # Nima Aghli proposed.
Here is how I did that :
First instanciate the preprocessing function (for me the one used for ResNest50) :
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
def preprocess_function(x):
if x.ndim == 3:
x = x[np.newaxis, :, :, :]
return preprocess_input(x)
# Initializing the datagen, using the above function :
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
And then Define the Custom Data Generator that will yield randomly sampled array coupling image & metadata, whiule making sure not to be ever out of data (so that you can run on which ever number of epochs) :
def createGenerator(dff, verif=False, batch_size=BATCH_SIZE):
# Shuffles the dataframe, and so the batches as well
dff = dff.sample(frac=1)
# Shuffle=False is EXTREMELY important to keep order of image and coord
flow = datagen.flow_from_dataframe(
dataframe=dff,
directory=None,
x_col="cropped_img_filepath",
y_col="category",
batch_size=batch_size,
shuffle=False,
class_mode="categorical",
target_size=(224,224),
seed=42
)
idx = 0
n = len(dff) - batch_size
batch = 0
while True :
# Get next batch of images
X1 = flow.next()
# idx to reach
end = idx + X1[0].shape[0]
# get next batch of lines from df
X2 = dff[["x", "y", "w", "h"]][idx:end].to_numpy()
dff_verif = dff[idx:end]
# Updates the idx for the next batch
idx = end
# print("batch nb : ", batch, ", batch_size : ", X1[0].shape[0])
batch+=1
# Checks if we are at the end of the dataframe
if idx==len(dff):
# print("END OF THE DATAFRAME\n")
idx = 0
# Yields the image, metadata & target batches
if verif==True :
yield [X1[0], X2], X1[1], dff_verif
else :
yield [X1[0], X2], X1[1] #Yield both images, metadata and their mutual label
I voluntarily kept the commentaries as it helps grasps all the operations that are computed.
The main point/problem is to get images from all the dataframe, without ever getting short on images, and having batches of the same size.
Also, we have to be careful to the order of the images/metadata, so tht the right info is connected to the right image in the returned array.

Keras model.fit() with tf.dataset fails while using tf.train works fine

Summary: according to the documentation, Keras model.fit() should accept tf.dataset as input (I am using TF version 1.12.0). I can train my model if I manually do the training steps but using model.fit() on the same model, I get an error I cannot resolve.
Here is a sketch of what I did: my dataset, which is too big to fit in the memory, consists of many files each with different number of rows of (100 features, label). I'd like to use tf.data to build my data pipeline:
def data_loader(filename):
'''load a single data file with many rows'''
features, labels = load_hdf5(filename)
...
return features, labels
def make_dataset(filenames, batch_size):
'''read files one by one, pick individual rows, batch them and repeat'''
dataset = tf.data.Dataset.from_tensor_slices(filenames)
dataset = dataset.map( # Problem here! See edit for solution
lambda filename: tuple(tf.py_func(data_loader, [filename], [float32, tf.float32])))
dataset = dataset.flat_map(
lambda features, labels: tf.data.Dataset.from_tensor_slices((features, labels)))
dataset = dataset.batch(batch_size)
dataset = dataset.repeat()
dataset = dataset.prefetch(1000)
return dataset
_BATCH_SIZE = 128
training_set = make_dataset(training_files, batch_size=_BATCH_SIZE)
I'd like to try a very basic logistic regression model:
inputs = tf.keras.layers.Input(shape=(100,))
outputs = tf.keras.layers.Dense(1, activation='softmax')(inputs)
model = tf.keras.Model(inputs, outputs)
If I train it manually everything works fine, e.g.:
labels = tf.placeholder(tf.float32)
loss = tf.reduce_mean(tf.keras.backend.categorical_crossentropy(labels, outputs))
train_step = tf.train.GradientDescentOptimizer(.05).minimize(loss)
iterator = training_set.make_one_shot_iterator()
next_element = iterator.get_next()
init_op = tf.global_variables_initializer()
with tf.Session() as sess:
sess.run(init_op)
for i in range(training_size // _BATCH_SIZE):
x, y = sess.run(next_element)
train_step.run(feed_dict={inputs: x, labels: y})
However, if I instead try to use model.fit like this:
model.compile('adam', 'categorical_crossentropy', metrics=['acc'])
model.fit(training_set.make_one_shot_iterator(),
steps_per_epoch=training_size // _BATCH_SIZE,
epochs=1,
verbose=1)
I get an error message ValueError: Cannot take the length of Shape with unknown rank. inside the keras'es _standardize_user_data function.
I have tried quite a few things but could not resolve the issue. Any ideas?
Edit: based on #kvish's answer, the solution was to change the map from a lambda to a function that would specify the correct tensor dimensions, e.g.:
def data_loader(filename):
def loader_impl(filename):
features, labels, _ = load_hdf5(filename)
...
return features, labels
features, labels = tf.py_func(loader_impl, [filename], [tf.float32, tf.float32])
features.set_shape((None, 100))
labels.set_shape((None, 1))
return features, labels
and now, all needed to do is to call this function from map:
dataset = dataset.map(data_loader)

Probably tf.py_func produces an unknown shape which Keras cannot infer. We can set the shape of the tensor returned by it using set_shape(your_shape) method and that would help Keras infer the shape of the result.

Fully connect network with softmax for CIFAR100

I'm trying to build a fully connected layer using the CIFAR 100 dataset with softmax and print the accuracy,the learning curve and some of end results – the pictures and their true label and predicted label.
I have this following code for the mnist dataset,the problem I'm facing is to how apply the same thing for my data set,I'll try to explain my problem down below:
#initialization
X=tf.placeholder(tf.float32, [None, 28, 28, 1])
w=tf.Variable(tf.zeros([784, 10]))
b=tf.Variable(tf.zeros([10]))
init=tf.global_variables_initializer()
#model
Y=tf.nn.softmax(tf.matmul(tf.reshape(X,[-1, 784]), w)+b)
#place holder for correct answer
Y_=tf.placeholder(tf.float32, [None, 10])
#loss function
cross_entropy= -tf.reduce_sum(Y_ * tf.log(Y))
# % of correct answers found in batch
is_correct=tf.equal(tf.argmax(Y,1), tf.argmax(Y_,1))
accurancy= tf.reduce_mean(tf.cast(is_correct,tf.float32))
#training step
optimizer=tf.train.GradientDescentOptimizer(0.003)
train_step=optimizer.minimize(cross_entropy)
sess=tf.Session()
sess.run(init)
for i in range(10000):
#load batch of images and correct answer
batch_x, batch_Y=mnist.train.next_batch(100)
train_data={X: batch_x, Y_:batch_y}
#train
sess.run(train_step, feed_dict=train_data)
a,c=sess.run([accurancy, cross_entropy], feed=train_data)
test_data={X:mnist.test.images, Y_:mnist.test.lables}
a,c=sess.run([accurancy, cross_entropy], feed=test_data)
I have downloaded CIFAR-100 dataset. The CIFAR-100 dataset consists of 60000 32x32 color images. It has 100 classes containing 600 images each. There are 500 training images and 100 testing images per class. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs).
I only used 2 super classes “aquatic mammals” and “flowers” each with 5 subcategories
here is some of the code:
def unpickle(file):
import pickle
with open(file, 'rb') as fo:
dict = pickle.load(fo, encoding='bytes')
return dict
# loading train data
data = unpickle('train')
train_data, label_train_data = filter_train(data, 5000)
label_train_data = relabel(label_train_data)
# loading test data
data2 = unpickle('test')
test_data, label_test_data = filter_train(data2, 1000)
label_test_data = relabel(label_test_data)
filter_train is just a function I used to fillter the 2 super classes “aquatic mammals” and “flowers”
I know that mnist.train.next_batch(batch_size=100) means it randomly pick 100 data from MNIST dataset
So my Question is how can I exchange the
batch_x, batch_Y=mnist.train.next_batch(100)
and:
test_data={X:mnist.test.images, Y_:mnist.test.lables}
So that I can access the train data and test data of my CIFAR dataset,
I been tring to replace those lines with the
train_data, label_train_data and test_data, label_test_data but it won't seem to work and I can't find any other way to get to those sets.
Any hely would be appreciated

predict_generator and class labels

I am using ImageDataGenerator to generate new augmented images and extract bottleneck features from pretrained model but most of the tutorial I see on keras
samples same no of training samples as number of images in directory.
train_generator = train_datagen.flow_from_directory(
train_path,
target_size=image_size,
shuffle = "false",
class_mode='categorical',
batch_size=1)
bottleneck_features_train = model.predict_generator(
train_generator, 2* nb_train_samples // batch_size)
Suppose I want 2 times more images from the above code, how I can get the desired class labels for the features extracted from bottleneck layer which are stored in tuple train_generator.
shouldnt the code in training_generator.py at line 422
x, _ = generator_output
do something like this
=> x, y = generator_output
and return tuple [np.concatenate(out) for out in all_outs],y from predict_generator
i.e return the corresponding class labels along with the predicted features all_outs since there is no way to get the corresponding labels without running generator twice.

If you're using predict, normally you simply don't want Y, because Y will be the result of the prediction. (You're not training, so you don't need the true labels)
But you can do it yourself:
bottleneck = []
labels = []
for i in range(2 * nb_train_samples // batch_size):
x, y = next(train_generator)
bottleneck.append(model.predict(x))
labels.append(y)
bottleneck = np.concatenate(bottleneck)
labels = np.concatenate(labels)
If you want it with indexing (if your generator supports that):
#...
for epoch in range(2):
for i in range(nb_train_samples // batch_size):
x,y = train_generator[i]
#...

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Tensorflow CNN image augmentation pipeline - python

Related

How to check preprocessing time/speed in Colab?

Keras Multi Input Network, using Images and structured data : How do I build the correct input data?

Keras model.fit() with tf.dataset fails while using tf.train works fine

Fully connect network with softmax for CIFAR100

predict_generator and class labels

Categories

Resources