In order to use Google Colab's TPUs I need a tf.data.Dataset. How can I then use data augmentation on such a dataset?
More specifically, my code so far is:
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = mnist_train.map(scale).shuffle(10000).batch(batch_size)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
Which is fed into this:
# TPU Strategy ...
with strategy.scope():
    model = create_model()
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["acc"])
    train_dataset, test_dataset = get_dataset()
    model.fit(train_dataset,
              epochs=20,
              verbose=1,
              validation_data=test_dataset)
So, how can I use data augmentation here?
As far as I know, I can't use the tf.keras ImageDataGenerator, right?
I've tried the following and it didn't work.
data_generator = ...
model.fit_generator(data_generator.flow(train_dataset, batch_size=32),
                    steps_per_epoch=len(train_dataset) / 32, epochs=20)
Not surprising, since train_x and train_y are usually fed as two separate arguments to the flow function, not "packed" into one tf.data.Dataset.
You can use the tf.image module, which contains various functions for image processing.
For example:
You can add the following functionality to your get_dataset function:
convert each image to tf.float32 in the 0-1 range using convert_image_dtype.
cache() the results, as they can be reused after each repeat.
randomly flip each image left-to-right using random_flip_left_right.
randomly change the contrast of each image using random_contrast.
double the number of images with repeat, which repeats all the above steps.
Code -
mnist_train = mnist_train.map(
    lambda image, label: (tf.image.convert_image_dtype(image, tf.float32), label)
).cache(
).map(
    lambda image, label: (tf.image.random_flip_left_right(image), label)
).map(
    lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label)
).shuffle(
    1000
).batch(
    batch_size
).repeat(2)
Similarly, you can use other functions like random_flip_up_down and random_crop to randomly flip an image vertically (upside down) and randomly crop a tensor to a given size, respectively; a quick sketch follows.
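For illustration, a minimal sketch of those two ops chained into the same kind of pipeline (the 24x24 crop size is just an illustrative choice for the 28x28x1 MNIST images):
mnist_train = mnist_train.map(
    lambda image, label: (tf.image.random_flip_up_down(image), label)
).map(
    # random_crop needs a target size no larger than the input image
    lambda image, label: (tf.image.random_crop(image, size=[24, 24, 1]), label)
)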
Your get_dataset function will then look like this -
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    # convert_image_dtype replaces the manual cast/divide; one_hot is kept so
    # the labels still match the categorical_crossentropy loss
    def scale(image, label):
        image = tf.image.convert_image_dtype(image, tf.float32)
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = mnist_train.map(scale).cache(
    ).map(
        lambda image, label: (tf.image.random_flip_left_right(image), label)
    ).map(
        lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label)
    ).shuffle(
        1000
    ).batch(
        batch_size
    ).repeat(2)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
Adding the link suggested by @Andrew H, which gives an end-to-end example of data augmentation that also uses the MNIST dataset.
Hope this answers your question. Happy Learning.
Having some trouble with the tensorflow-datasets module. Using the stanford_dogs dataset, I resize images to [180, 180], yet when the model is trained, the error message suggests TensorFlow is trying to load the images at their original sizes.
What am I doing wrong?
Code to replicate the error (and the error itself) is below. The dataset is around 750 MB. You can copy-paste it into Google Colab and run it to replicate.
import io
import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
def _normalize_img(img, label):
    img = tf.cast(img, tf.float32) / 255.
    img = tf.image.resize(img, [180, 180])
    return (img, label)
train_dataset, test_dataset = tfds.load(name="stanford_dogs", split=['train', 'test'], as_supervised=True)
train_dataset = train_dataset.shuffle(1024).batch(32)
train_dataset = train_dataset.map(_normalize_img)
test_dataset = test_dataset.batch(32)
test_dataset = test_dataset.map(_normalize_img)
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, 2, padding='same', activation='relu', input_shape=(180, 180, 3)),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Conv2D(32, 2, padding='same', activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation='softmax')
])

model.compile(
    optimizer=tf.keras.optimizers.Adam(0.001),
    loss='sparse_categorical_crossentropy')

history = model.fit(
    train_dataset,
    epochs=5)
Fails with the error:
InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [278,300,3] and element 1 had shape [375,500,3].
[[node IteratorGetNext (defined at <ipython-input-29-15023f95f627>:39) ]] [Op:__inference_train_function_4908]
You encounter this error because the tf.data.Dataset API cannot create a batch of tensors with different shapes. As the batch function will return Tensors of shape (batch, height, width, channels), the height, width and channels values must be constant throughout the dataset. You can read more about why in the Introduction to Tensors guide.
Batching after resizing will solve your issue:
train_dataset = train_dataset.shuffle(1024)
train_dataset = train_dataset.map(_normalize_img)
# we batch once every image is the same size
train_dataset = train_dataset.batch(32)
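The test pipeline in the question has the same problem, so the same reordering applies there (a sketch of the symmetric fix):
test_dataset = test_dataset.map(_normalize_img)
# batch only once every image is the same size
test_dataset = test_dataset.batch(32)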
I am building a multi-input network using the Keras functional API, but I struggle to find and understand the right format for my input data through the network.
I have two main inputs:
One is an image, which goes through a fine-tuned ResNet50 CNN
The second is a simple numpy array (X_train) containing metadata about the image (position and size of the image). This one goes through a simple dense network.
I load the images from a dataframe containing the metadata and the filepath to the corresponding image.
I use ImageDataGenerator and the flow_from_dataframe method to load my images:
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)

train_flow = datagen.flow_from_dataframe(
    dataframe=df_train,
    x_col="cropped_img_filepath",
    y_col="category",
    batch_size=batch_size,
    shuffle=False,
    class_mode="categorical",
    target_size=(224, 224)
)
I can train the two networks separately using their own data, no problems so far.
The outputs of the two distinct networks are then combined through a dense network to output a 10-digit probability vector:
# Create the input for the final dense network using the output of both the dense MLP and CNN
combinedInput = concatenate([cnn.output, mlp.output])

x = Dense(512, activation="relu")(combinedInput)
x = Dense(256, activation="relu")(x)
x = Dense(128, activation="relu")(x)
x = Dense(32, activation="relu")(x)
x = Dense(10, activation="softmax")(x)

model = Model(inputs=[cnn.input, mlp.input], outputs=x)

# Compile the model
opt = Adam(lr=1e-3, decay=1e-3 / 200)
model.compile(loss="categorical_crossentropy",
              metrics=['accuracy'],
              optimizer=opt)

# Train the model
model_history = model.fit(x=(train_flow, X_train),
                          y=y_train,
                          epochs=1,
                          batch_size=batch_size)
However, when I train the overall network, I get the following error:
ValueError: Failed to find data adapter that can handle input: (<class 'tuple'> containing values of types {"<class 'keras_preprocessing.image.dataframe_iterator.DataFrameIterator'>", "<class 'numpy.ndarray'>"}), <class 'pandas.core.series.Series'>
I understand I am not using the correct input format for my input data.
I can train my CNN with the train_flow, and my dense network with X_train, so I was hoping this would work.
Do you have any idea how to combine image data and a numpy array into a multi-input array?
Thank you for all the information you can give me!
I finally found out how to do it, taking inspiration from the post @Nima Aghli proposed.
Here is how I did it:
First, instantiate the preprocessing function (for me, the one used for ResNet50):
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input

def preprocess_function(x):
    if x.ndim == 3:
        x = x[np.newaxis, :, :, :]
    return preprocess_input(x)

# Initializing the datagen; preprocess_input is passed directly here
# (preprocess_function above wraps the same preprocess_input for standalone 3D arrays)
datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
Then define the custom data generator that will yield randomly sampled arrays coupling image & metadata, while making sure never to run out of data (so that you can run it for whichever number of epochs):
def createGenerator(dff, verif=False, batch_size=BATCH_SIZE):
    # Shuffles the dataframe, and so the batches as well
    dff = dff.sample(frac=1)

    # Shuffle=False is EXTREMELY important to keep order of image and coord
    flow = datagen.flow_from_dataframe(
        dataframe=dff,
        directory=None,
        x_col="cropped_img_filepath",
        y_col="category",
        batch_size=batch_size,
        shuffle=False,
        class_mode="categorical",
        target_size=(224, 224),
        seed=42
    )
    idx = 0
    n = len(dff) - batch_size
    batch = 0
    while True:
        # Get next batch of images
        X1 = flow.next()
        # idx to reach
        end = idx + X1[0].shape[0]
        # get next batch of lines from df
        X2 = dff[["x", "y", "w", "h"]][idx:end].to_numpy()
        dff_verif = dff[idx:end]
        # Updates the idx for the next batch
        idx = end
        # print("batch nb : ", batch, ", batch_size : ", X1[0].shape[0])
        batch += 1
        # Checks if we are at the end of the dataframe
        if idx == len(dff):
            # print("END OF THE DATAFRAME\n")
            idx = 0
        # Yields the image, metadata & target batches
        if verif:
            yield [X1[0], X2], X1[1], dff_verif
        else:
            yield [X1[0], X2], X1[1]  # Yield both images, metadata and their mutual label
I deliberately kept the comments, as they help grasp all the operations that are performed.
The main point/problem is to get images from the whole dataframe without ever running short on images, while keeping batches of the same size.
Also, we have to be careful about the order of the images/metadata, so that the right info is connected to the right image in the returned array; a usage sketch follows.
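For completeness, a minimal usage sketch (not from the original post; df_train, BATCH_SIZE and the compiled multi-input model are assumed from the surrounding code, and 10 epochs is an arbitrary choice):
train_gen = createGenerator(df_train)
# steps_per_epoch tells fit how many generator batches make up one epoch
steps = len(df_train) // BATCH_SIZE
model_history = model.fit(train_gen, steps_per_epoch=steps, epochs=10)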
I'm trying to setup an image recognition CNN with TensorFlow 2.0. To be able to analyze my image augmentation I'd like to see the images I feed into the network in tensorboard.
Unfortunately, I cannot figure out how to do this with TensorFlow 2.0 and Keras. I also didn't really find documentation on this.
For simplicity, I'm showing the code of an MNIST example. How would I add the image summary here?
import tensorflow as tf

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

def scale(image, label):
    return tf.cast(image, tf.float32) / 255.0, label

def augment(image, label):
    return image, label  # do nothing atm

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.map(scale).map(augment).batch(32)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(dataset, epochs=5, callbacks=[tf.keras.callbacks.TensorBoard(log_dir='D:\\tmp\\test')])
Besides answering your question,
I will make the code more TF2.0-like. If you have any questions or need clarification, please post a comment down below.
1. Loading data
I would advise using the Tensorflow Datasets library. There is absolutely no need to load data into numpy and transform it to tf.data.Dataset if one can do it in a single line:
import tensorflow_datasets as tfds
dataset = tfds.load("mnist", as_supervised=True, split=tfds.Split.TRAIN)
The line above will only return the TRAIN split (read more about those here).
2. Define Augmentations and Summaries
In order to save images, one has to keep a tf.summary.SummaryWriter object throughout each pass.
I have created a convenient wrapping class with a __call__ method for easy usage with tf.data.Dataset's map capabilities:
import tensorflow as tf

class ExampleAugmentation:
    def __init__(self, logdir: str, max_images: int, name: str):
        self.file_writer = tf.summary.create_file_writer(logdir)
        self.max_images: int = max_images
        self.name: str = name
        self._counter: int = 0

    def __call__(self, image, label):
        augmented_image = tf.image.random_flip_left_right(
            tf.image.random_flip_up_down(image)
        )
        with self.file_writer.as_default():
            tf.summary.image(
                self.name,
                augmented_image,
                step=self._counter,
                max_outputs=self.max_images,
            )

        self._counter += 1
        return augmented_image, label
name will be the name under which each part of the images will be saved. Which part, you may ask - the part defined by max_outputs.
Say image in __call__ has shape (32, 28, 28, 1), where the first dimension is the batch and the remaining ones are height, width and channels (for MNIST only one channel, but this dimension is needed in tf.image augmentations). Furthermore, let's say max_outputs is specified as 4. In this case, only the first 4 images from the batch will be saved. The default value is 3, so you may set it to BATCH_SIZE to save every image.
In Tensorboard, each image will be a separate sample over which you can iterate at the end.
_counter is needed so the images will not be overwritten (I think, not really sure, clarification from someone else would be nice).
Important: You may want to rename this class to something like ImageSaver when doing more serious business and move augmentation to separate functors/lambda functions; a sketch of that separation is below. It suffices for presentation purposes, I guess.
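A hedged sketch of that separation (the augment function and the image_saver rename are illustrative, not part of the original answer):
def augment(image, label):
    # pure augmentation, no logging side effects
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_flip_left_right(image)
    return image, label

# image_saver would be an ExampleAugmentation-like object that only logs:
# dataset = dataset.map(augment).map(image_saver)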
3. Setup global variables
Please do not mix function declarations, global variables, data loading and other things (like loading data and creating a function afterwards). I know TF1.0 encouraged this type of programming, but they are trying to get away from it and you might want to follow the trend.
Below I have defined some global variables which will be used throughout the next parts, pretty self-explanatory I guess:
BATCH_SIZE = 32
DATASET_SIZE = 60000
EPOCHS = 5
LOG_DIR = "/logs/images"
AUGMENTATION = ExampleAugmentation(LOG_DIR, max_images=4, name="Images")
4. Dataset augmentation
Similar to yours but with a little twist:
dataset = (
    dataset.map(
        lambda image, label: (
            tf.image.convert_image_dtype(image, dtype=tf.float32),
            label,
        )
    )
    .batch(BATCH_SIZE)
    .map(AUGMENTATION)
    .repeat(EPOCHS)
)
repeat is needed as the loaded dataset is a generator
tf.image.convert_image_dtype - better and more readable option than explicit tf.cast mixed with division by 255 (and ensures proper image format)
batching done before augmentation just for the sake of presentation
5. Define model, compile, train
Almost as you did in your example, but I have provided an additional steps_per_epoch, so fit knows how many batches constitute an epoch:
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10, activation="softmax"),
    ]
)

model.compile(
    optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
)
model.fit(
    dataset,
    epochs=EPOCHS,
    steps_per_epoch=DATASET_SIZE // BATCH_SIZE,
    callbacks=[tf.keras.callbacks.TensorBoard(log_dir=LOG_DIR)],
)
Not much to explain other than that I think.
6. Run Tensorboard
Since TF2.0, one can run it inside Colab using %tensorboard --logdir /logs/images; just wanted to add this for others who may visit this issue. Do it however you like; anyway, you know how to do it for sure.
Images should be inside the IMAGES tab, with each sample named according to the name provided to the AUGMENTATION object.
7. Whole code (to make everyone's life easier)
import tensorflow as tf
import tensorflow_datasets as tfds


class ExampleAugmentation:
    def __init__(self, logdir: str, max_images: int, name: str):
        self.file_writer = tf.summary.create_file_writer(logdir)
        self.max_images: int = max_images
        self.name: str = name
        self._counter: int = 0

    def __call__(self, image, label):
        augmented_image = tf.image.random_flip_left_right(
            tf.image.random_flip_up_down(image)
        )
        with self.file_writer.as_default():
            tf.summary.image(
                self.name,
                augmented_image,
                step=self._counter,
                max_outputs=self.max_images,
            )

        self._counter += 1
        return augmented_image, label


if __name__ == "__main__":

    # Global settings
    BATCH_SIZE = 32
    DATASET_SIZE = 60000
    EPOCHS = 5

    LOG_DIR = "/logs/images"
    AUGMENTATION = ExampleAugmentation(LOG_DIR, max_images=4, name="Images")

    # Dataset
    dataset = tfds.load("mnist", as_supervised=True, split=tfds.Split.TRAIN)

    dataset = (
        dataset.map(
            lambda image, label: (
                tf.image.convert_image_dtype(image, dtype=tf.float32),
                label,
            )
        )
        .batch(BATCH_SIZE)
        .map(AUGMENTATION)
        .repeat(EPOCHS)
    )

    # Model and training
    model = tf.keras.models.Sequential(
        [
            tf.keras.layers.Flatten(input_shape=(28, 28, 1)),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dropout(0.2),
            tf.keras.layers.Dense(10, activation="softmax"),
        ]
    )

    model.compile(
        optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"]
    )
    model.fit(
        dataset,
        epochs=EPOCHS,
        steps_per_epoch=DATASET_SIZE // BATCH_SIZE,
        callbacks=[tf.keras.callbacks.TensorBoard(log_dir=LOG_DIR)],
    )
You could do something like this to add input images to TensorBoard:
import io

import matplotlib.pyplot as plt
import tensorflow as tf

logdir = "logs"  # assumed log directory; adjust to your setup
# (x_train, y_train) as in the question
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()

def scale(image, label):
    return tf.cast(image, tf.float32) / 255.0, label

def augment(image, label):
    return image, label  # do nothing atm

file_writer = tf.summary.create_file_writer(logdir + "/images")

def plot_to_image(figure):
    # Converts a matplotlib figure to a PNG image tensor for tf.summary.image
    buf = io.BytesIO()
    plt.savefig(buf, format='png')
    plt.close(figure)
    buf.seek(0)
    image = tf.image.decode_png(buf.getvalue(), channels=4)
    image = tf.expand_dims(image, 0)
    return image

def image_grid():
    """Return a 5x5 grid of the MNIST images as a matplotlib figure."""
    # Create a figure to contain the plot.
    figure = plt.figure(figsize=(10, 10))
    for i in range(25):
        # Start next subplot.
        plt.subplot(5, 5, i + 1, title=str(y_train[i]))
        plt.xticks([])
        plt.yticks([])
        plt.grid(False)
        image, _ = scale(x_train[i], y_train[i])
        plt.imshow(image, cmap=plt.cm.binary)
    return figure

# Prepare the plot
figure = image_grid()
# Convert to image and log
with file_writer.as_default():
    tf.summary.image("Training data", plot_to_image(figure), step=0)

dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
dataset = dataset.map(scale).map(augment).batch(32)

model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(dataset, epochs=5, callbacks=[tf.keras.callbacks.TensorBoard(log_dir=logdir)])
I'm trying to learn the new Tensorflow APIs and I am a bit lost on where to get a handle on my input batch tensors so I can manipulate and augment them with, for example, tf.image.
This is my current network & pipeline:
trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
#...
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))
#...
iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
                                           train_dataset.output_shapes)
features, labels = iterator.get_next()
train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)
#...defining cnn architecture...

# In the train loop
TrainLoop {
    sess.run(train_init_op)  # switching to train data
    sess.run(train_step, ...)  # running a train step
    #...
    sess.run(test_init_op)  # switching to test data
    test_loss = sess.run(loss, ...)  # printing test loss after epoch
}
I'm using the Dataset API to create 2 datasets so that in the train loop I can calculate the train and test loss and log them.
Where in this pipeline would I manipulate and distort my input batch of images?
I'm not creating any tf.placeholders for my trainX input batches, so I can't manipulate them with tf.image, since, for example, tf.image.flip_up_down requires a 3-D or 4-D tensor.
What is the natural way to implement this pipeline with the new API?
Is there a module or easy way to augment an input batch of images for training that would fit in this pipeline?
There's a really good article and talk released recently that go over the API in a lot more detail than my response here. Here's a brief example:
import tensorflow as tf
import numpy as np

def read_data():
    n_train = 100
    n_test = 50
    height = 20
    width = 30
    channels = 3
    trainX = (np.random.random(
        size=(n_train, height, width, channels)) * 255).astype(np.uint8)
    testX = (np.random.random(
        size=(n_test, height, width, channels)) * 255).astype(np.uint8)
    trainY = (np.random.random(size=(n_train,)) * 10).astype(np.int32)
    testY = (np.random.random(size=(n_test,)) * 10).astype(np.int32)
    return trainX, testX, trainY, testY

trainX, testX, trainY, testY = read_data()
# trainX [num_image, height, width, channels], these are numpy arrays
train_dataset = tf.data.Dataset.from_tensor_slices((trainX, trainY))
test_dataset = tf.data.Dataset.from_tensor_slices((testX, testY))

def map_single(x, y):
    print('Map single:')
    print('x shape: %s' % str(x.shape))
    print('y shape: %s' % str(y.shape))
    x = tf.image.per_image_standardization(x)
    # Consider: x = tf.image.random_flip_left_right(x)
    return x, y

def map_batch(x, y):
    print('Map batch:')
    print('x shape: %s' % str(x.shape))
    print('y shape: %s' % str(y.shape))
    # Note: this flips ALL images left to right. Not sure this is what you want
    # UPDATE: looks like tf documentation is wrong and you need a 3D tensor?
    # return tf.image.flip_left_right(x), y
    return x, y

batch_size = 32
train_dataset = train_dataset.repeat().shuffle(100)
train_dataset = train_dataset.map(map_single, num_parallel_calls=8)
train_dataset = train_dataset.batch(batch_size)
train_dataset = train_dataset.map(map_batch)
train_dataset = train_dataset.prefetch(2)

test_dataset = test_dataset.map(
    map_single, num_parallel_calls=8).batch(batch_size).map(map_batch)
test_dataset = test_dataset.prefetch(2)

iterator = tf.data.Iterator.from_structure(train_dataset.output_types,
                                           train_dataset.output_shapes)
features, labels = iterator.get_next()

train_init_op = iterator.make_initializer(train_dataset)
test_init_op = iterator.make_initializer(test_dataset)

with tf.Session() as sess:
    sess.run(train_init_op)
    feat, lab = sess.run((features, labels))
    print(feat.shape)
    print(lab.shape)
    sess.run(test_init_op)
    feat, lab = sess.run((features, labels))
    print(feat.shape)
    print(lab.shape)
A few notes:
This approach relies on being able to load your entire dataset into memory. If you cannot, consider using tf.data.Dataset.from_generator. This can lead to slow shuffle times if your shuffle buffer is large. My preferred method is to load some keys tensor entirely into memory - it might just be the indices of each example - then map that key value to data values using tf.py_func. This is slightly less efficient than converting to tfrecords, but with prefetching it likely won't affect performance. Since the shuffling is done before the mapping, you only have to load shuffle_buffer keys into memory, rather than shuffle_buffer examples.
To augment your dataset, use tf.data.Dataset.map either before or after the batch operation, depending on whether you want to apply a batch-wise operation (something working on a 4D image tensor) or an element-wise operation (3D image tensor). Note it looks like the documentation for tf.image.flip_left_right is out of date, since I get an error when I try and use a 4D tensor. If you want to augment your data randomly, use tf.image.random_flip_left_right rather than tf.image.flip_left_right.
If you're using a tf.estimator.Estimator (or wouldn't mind converting your code to using it), then check out tf.estimator.train_and_evaluate for an in-built way of switching between datasets.
Consider shuffling/repeating your dataset with the shuffle/repeat methods. See the article for notes on efficiencies. In particular, repeat -> shuffle -> map -> batch -> batch-wise map -> prefetch seems to be the best ordering of operations for most applications; a sketch of that ordering follows.
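To make that ordering concrete, a minimal sketch reusing map_single/map_batch and the buffer sizes from the example above:
train_dataset = (train_dataset
                 .repeat()
                 .shuffle(100)
                 .map(map_single, num_parallel_calls=8)
                 .batch(batch_size)
                 .map(map_batch)  # batch-wise (4D) operations go here
                 .prefetch(2))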
I want to use tf.data.Dataset to do image augmentation while training. The code is similar to the official guideline, as follows:
train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_label))
train_dataset = train_dataset.map(
    lambda file, label: tuple(tf.py_func(
        _parse_train_image, [file, label],
        [tf.float32, tf.int32]
    ))
).batch(batch_size).repeat(epoch)

def _parse_train_image(file, label):
    image = cv2.imread(file.decode(), cv2.IMREAD_COLOR)
    image = process_image(image)
    return image, label
The train_data and train_label here refer to two lists which contain image file paths and labels separately. However, this transformation can only return the original image; I also need to augment the images (like flip and rotate).
How can I continue to augment the images and return them?
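One hedged sketch of a possible continuation (not from the original post): chain another map after the tf.py_func call, before batching. Shapes coming out of tf.py_func are unknown, so set_shape is needed before the tf.image ops; the 224x224x3 shape is only an illustrative assumption:
def _augment(image, label):
    image.set_shape([224, 224, 3])  # tf.py_func loses static shape info
    image = tf.image.random_flip_left_right(image)
    # rotate by a random multiple of 90 degrees
    image = tf.image.rot90(image, k=tf.random_uniform([], 0, 4, dtype=tf.int32))
    return image, label

train_dataset = train_dataset.map(
    lambda file, label: tuple(tf.py_func(
        _parse_train_image, [file, label],
        [tf.float32, tf.int32]
    ))
).map(_augment).batch(batch_size).repeat(epoch)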