How to load multi-image input in PyTorch?

I have a dataset with 2 RGB images per data sample (6 channels). How to read such a dataset in PyTorch?
For one RGB image I used:
data_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.ToTensor()
])
mel_dataset_train = datasets.ImageFolder(root='./ftrain',
                                         transform=data_transform)
train_sampler = torch.utils.data.distributed.DistributedSampler(mel_dataset_train)
# shuffling is handled by the sampler; passing both sampler and shuffle=True raises a ValueError
dataset_loader_train = torch.utils.data.DataLoader(mel_dataset_train,
                                                   batch_size=64,
                                                   sampler=train_sampler,
                                                   num_workers=config.workers)
Yet I do not see how to modify it to read two images per data sample instead of one.
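One common approach, sketched below, is a custom Dataset that loads both images of a sample and concatenates them along the channel dimension. The paired folder layout (./ftrain_a, ./ftrain_b) and matching file names are assumptions for illustration, not part of your setup:
import os
from PIL import Image
import torch
from torch.utils.data import Dataset

class PairedImageDataset(Dataset):
    """Yields one 6-channel tensor per sample from two parallel image folders."""
    def __init__(self, root_a, root_b, transform=None):
        self.root_a, self.root_b = root_a, root_b
        self.names = sorted(os.listdir(root_a))  # assumes matching names in root_b
        self.transform = transform

    def __len__(self):
        return len(self.names)

    def __getitem__(self, idx):
        img_a = Image.open(os.path.join(self.root_a, self.names[idx])).convert('RGB')
        img_b = Image.open(os.path.join(self.root_b, self.names[idx])).convert('RGB')
        if self.transform is not None:
            img_a = self.transform(img_a)
            img_b = self.transform(img_b)
        # Two (3, H, W) tensors become one (6, H, W) tensor.
        return torch.cat([img_a, img_b], dim=0)
Labels are omitted here for brevity; you would derive them from the folder layout much as ImageFolder does. Also note that random transforms applied this way are drawn independently for the two images; if both must be flipped identically, draw the random decision once and apply torchvision.transforms.functional ops to both.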

Related

How to resize image tensors

The following is my code, where I open every image with PIL and then convert it to a PyTorch tensor:
import os
from PIL import Image
from torchvision import transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([transforms.PILToTensor()])
# choose the training and test datasets
train_data = os.listdir('data/training/')
testing_data = os.listdir('data/testing/')
train_tensors = []
test_tensors = []
for train_image in train_data:
    img = Image.open('data/training/' + train_image)
    train_tensors.append(transform(img))
for test_image in testing_data:
    img = Image.open('data/testing/' + test_image)
    test_tensors.append(transform(img))
# Print out some stats about the training and test data
print('Train data, number of images: ', len(train_data))
print('Test data, number of images: ', len(testing_data))
batch_size = 20
train_loader = DataLoader(train_tensors, batch_size=batch_size, shuffle=True)
test_loader = DataLoader(test_tensors, batch_size=batch_size, shuffle=True)
# specify the image classes
classes = ['checked', 'unchecked', 'other']
# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)  # dataiter.next() was removed in newer PyTorch
images = images.numpy()
However, I am getting this error:
RuntimeError: stack expects each tensor to be equal size, but got [4, 66, 268] at entry 0 and [4, 88, 160] at entry 1
This is because my images are not resized prior to PIL -> Tensor. What is the correct way of resizing data images?
Try using ImageFolder from torchvision. Since your images have different sizes, you can use CenterCrop or RandomResizedCrop depending on your task; check the full list of transforms.
Here is an example:
train_dir = "data/training/"
train_dataset = datasets.ImageFolder(
    train_dir,
    transforms.Compose([
        transforms.RandomResizedCrop(img_size),  # image size: int or tuple
        # Add more transforms here
        transforms.ToTensor(),  # convert to tensor at the end
    ]))
train_loader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
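Alternatively, if you want to keep your original list-based loop, a minimal sketch is to resize before converting, so every tensor has the same shape and DataLoader can stack them (the 224x224 target size is just an example):
transform = transforms.Compose([
    transforms.Resize((224, 224)),  # example size; any fixed (height, width) works
    transforms.PILToTensor(),
])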

How to make a tensorflow dataset input pipeline for multi-class bounding box regression

I am doing multi-class bounding box regression on DICOM images. But the dataset is very large (140 GB), so I can't load all of the training images into memory. I want to build a TensorFlow dataset generator from the DICOM files and this dataframe:
        FilePath                       CaseNumber     SliceId      Class                     BBox             ImageHeight  ImageWidth
147353  1188164/Seri5/30364335695.dcm  1188164/Seri5  30364335695  Healthy                   nan              512          512
147354  16567/Seri1/36459582300.dcm    16567/Seri1    36459582300  Healthy                   nan              512          512
147355  19242/Seri1/37897787504.dcm    19242/Seri1    37897787504  Healthy                   nan              512          512
147356  16416/Seri2/29526748793.dcm    16416/Seri2    29526748793  Healthy                   nan              512          512
147357  26321/Seri2/29284668905.dcm    26321/Seri2    29284668905  Abdominal aort anevrizma  248,177-334,291  512          512
How could I build a data input pipeline (a build_dataset function) with TensorFlow so that I can train my model with just these lines of code:
train_df, test_df = train_test_split(df, test_size=0.2)
train_images, train_labels, train_bboxes = build_dataset(train_df)
test_images, test_labels, test_bboxes = build_dataset(test_df)
train_targets = {
    "class_label": train_labels,
    "bounding_box": train_bboxes
}
test_targets = {
    "class_label": test_labels,
    "bounding_box": test_bboxes
}
history = model.fit(
    train_images, train_targets,
    validation_data=(test_images, test_targets),
    batch_size=32,
    epochs=20)
I have tried this:
def build_dataset(dataframe):
    images = tf.data.Dataset.from_tensor_slices(dataframe["FilePath"])
    images = images.map(lambda x: tf.py_function(read_dcm_as_array, [x], [tf.string]))
    lb = LabelBinarizer()
    labels = lb.fit_transform(dataframe["Class"])
    labels = tf.data.Dataset.from_tensor_slices(labels)
    bboxes = dataframe["BBox"]
    bboxes = bboxes.str.replace(",", "-")
    bboxes = bboxes.str.split("-", expand=True).astype(int)
    bboxes = tf.data.Dataset.from_tensor_slices(bboxes)
    # targets = tf.data.Dataset.zip(labels, boxes)  # this row gives another error :(
    # targets = targets.map(lambda x, y: tf.concat([x, y], axis=-2))
    return images, labels, bboxes
But it gives me this error:
ValueError: `y` argument is not supported when using dataset as input.
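A sketch of one way around that error (hedged: it assumes your read_dcm_as_array helper returns a float32 image, and that the 512x512 single-channel shape matches your data): build a single tf.data.Dataset whose elements are (image, targets) pairs, with targets packed into a dict keyed by your output names, and pass it to model.fit with no separate y:
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

def build_dataset(dataframe):
    # One-hot encode the class labels.
    lb = LabelBinarizer()
    labels = lb.fit_transform(dataframe["Class"]).astype("float32")

    # Parse "x1,y1-x2,y2" strings; healthy rows (nan) get a dummy box.
    boxes = (dataframe["BBox"].fillna("0,0-0,0")
             .str.replace(",", "-")
             .str.split("-", expand=True)
             .astype("float32").values)

    ds = tf.data.Dataset.from_tensor_slices(
        (dataframe["FilePath"].tolist(), labels, boxes))

    def load(path, label, box):
        image = tf.py_function(read_dcm_as_array, [path], tf.float32)
        image.set_shape((512, 512, 1))  # assumed from the ImageHeight/Width columns
        # Packing the targets into a dict avoids the separate-`y` problem.
        return image, {"class_label": label, "bounding_box": box}

    return ds.map(load, num_parallel_calls=tf.data.AUTOTUNE)

train_df, test_df = train_test_split(df, test_size=0.2)
train_ds = build_dataset(train_df).shuffle(1024).batch(32)
test_ds = build_dataset(test_df).batch(32)
history = model.fit(train_ds, validation_data=test_ds, epochs=20)
Since the dataset is already batched, model.fit takes no batch_size argument; images are read lazily per batch, so the 140 GB never has to fit in memory.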

How to save prediction result of the CNN model in the image format?

I have been working with a CNN in Keras and I want to save the predictions as png images. This is my code to create the model and run the prediction:
history = model.fit(
    train_generator,
    steps_per_epoch=train_steps,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=valid_steps)
pred = model.predict(X_test)
I used the following code, based on the linked question (How to save output (prediction) of the CNN model in the form of an image?), but the pred folder remains empty:
images = []
for i in range(len(images)*2):
    pred.append(id_to_name[np.argmax(predictions[i // 2])])
    plt.imsave(('pred/ {}.png'.format(str(i // 2) + id_to_name[np.argmax(predictions[i // 2])])),
               images[i // 2])
Could you please tell me how to save the model.predict predictions as png images?
If your images have the shape (batch_size, height, width, channels), you can use tf.keras.preprocessing.image.save_img:
import tensorflow as tf
images = tf.random.uniform((4, 224, 224, 3))
for i, image in enumerate(images, 1):
    tf.keras.preprocessing.image.save_img(f'my_picture_{i}.png', image)
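Applied to your own predictions, assuming pred = model.predict(X_test) comes back shaped (num_images, height, width, channels):
import os

os.makedirs('pred', exist_ok=True)  # save_img won't create the folder itself
for i, image in enumerate(pred, 1):
    tf.keras.preprocessing.image.save_img(f'pred/{i}.png', image)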

Apply different data augmentation to part of the train set based on the category

I'm working on a machine learning process to classify images. My problem is that my dataset is imbalanced: across my 5 categories of images, I have about 400 images of one class and about 20 images of each of the other classes.
I would like to balance my train set by applying data augmentation only to certain classes of my train set.
Here's the code I'm using for creating the train and validation sets:
# Import data
data_dir = pathlib.Path(r"C:\Train set")
# Define train and validation sets (80% - 20%)
batch_size = 32
img_height = 240
img_width = 240
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="training",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    seed=123,
    image_size=(img_height, img_width),
    batch_size=batch_size)
And here's how I apply data augmentation, although this would be for the entire train set:
# Apply data augmentation
data_augmentation = keras.Sequential([
    layers.experimental.preprocessing.RandomFlip("horizontal",
                                                 input_shape=(img_height,
                                                              img_width,
                                                              3)),
    layers.experimental.preprocessing.RandomRotation(0.1),
    layers.experimental.preprocessing.RandomZoom(0.1),
])
Is there any way to go into my train set, extract those categories that have fewer images, and apply data augmentation only to them?
Thanks in advance!
I suggest not using ImageDataGenerator but a customized tf.data.Dataset. In a mapping operation, you can treat categories differently, e.g.:
def preprocess(filepath):
    category = tf.strings.split(filepath, os.sep)[0]
    read_file = tf.io.read_file(filepath)
    decode = tf.image.decode_jpeg(read_file, channels=3)
    resize = tf.image.resize(decode, (200, 200))
    image = tf.expand_dims(resize, 0)
    if tf.equal(category, 'tf_astronauts'):
        image = tf.image.flip_up_down(image)
        image = tf.image.flip_left_right(image)
    # image = tf.image.convert_image_dtype(image, tf.float32)
    # category = tf.cast(tf.equal(category, 'tf_astronauts'), tf.int32)
    return image, category
Let me demonstrate. First, let's make folders with training images:
import tensorflow as tf
import matplotlib.pyplot as plt
import cv2
from skimage import data
from glob2 import glob
import os

cat = data.chelsea()
astronaut = data.astronaut()
for category, picture in zip(['tf_cats', 'tf_astronauts'], [cat, astronaut]):
    os.makedirs(category, exist_ok=True)
    for i in range(5):
        cv2.imwrite(os.path.join(category, category + f'_{i}.jpg'),
                    cv2.cvtColor(picture, cv2.COLOR_RGB2BGR))
files = glob('tf_*\\*.jpg')
Now you have these files:
['tf_astronauts\\tf_astronauts_0.jpg',
'tf_astronauts\\tf_astronauts_1.jpg',
'tf_astronauts\\tf_astronauts_2.jpg',
'tf_astronauts\\tf_astronauts_3.jpg',
'tf_astronauts\\tf_astronauts_4.jpg',
'tf_cats\\tf_cats_0.jpg',
'tf_cats\\tf_cats_1.jpg',
'tf_cats\\tf_cats_2.jpg',
'tf_cats\\tf_cats_3.jpg',
'tf_cats\\tf_cats_4.jpg']
Now let's apply transformations only to the astronaut category, using the preprocess function with the tf.image transformations shown above.
Then, we make the tf.data.Dataset:
train = tf.data.Dataset.from_tensor_slices(files).\
    shuffle(10).take(4).map(preprocess).batch(4)
And when you iterate the dataset, you'll see that only the astronaut is flipped:
fig = plt.figure()
plt.subplots_adjust(wspace=.1, hspace=.2)
images, labels = next(iter(train))
for index, (image, label) in enumerate(zip(images, labels)):
    ax = plt.subplot(2, 2, index + 1)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.set_title(label.numpy().decode())
    ax.imshow(image[0].numpy().astype(int))
plt.show()
Please note, for training you will need to uncomment the two lines in preprocess so it returns an array of floats and an integer.
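With those two lines uncommented, the end of preprocess returns a float image and an integer label (1 for astronauts, 0 for cats):
    image = tf.image.convert_image_dtype(image, tf.float32)
    category = tf.cast(tf.equal(category, 'tf_astronauts'), tf.int32)
    return image, category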

Data Augmentation on tf.data.Dataset

In order to use Google Colab's TPUs I need a tf.data.Dataset. How can I then apply data augmentation to such a dataset?
More specifically, my code so far is:
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = mnist_train.map(scale).shuffle(10000).batch(batch_size)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
Which is fed into this:
# TPU Strategy ...
with strategy.scope():
    model = create_model()
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["acc"])
train_dataset, test_dataset = get_dataset()
model.fit(train_dataset,
          epochs=20,
          verbose=1,
          validation_data=test_dataset)
So, how can I use data augmentation here?
As far as I know, I can't use the tf.keras ImageDataGenerator, right?
I've tried the following and it didn't work:
data_generator = ...
model.fit_generator(data_generator.flow(train_dataset, batch_size=32),
                    steps_per_epoch=len(train_dataset) / 32, epochs=20)
Not surprising, since train_x and train_y are usually fed as two separate arguments to the flow function, not "packed" into one tf.data.Dataset.
You can use the tf.image functions; the tf.image module contains various functions for image processing.
For example, you can add the following functionality to your get_dataset function:
- convert each image to tf.float32 in the 0-1 range
- cache() the results, since they can be reused after each repeat
- randomly flip each image left to right using random_flip_left_right
- randomly change the contrast of each image using random_contrast
- double the number of images with repeat, which repeats all the steps above
Code:
mnist_train = mnist_train.map(
    lambda image, label: (tf.image.convert_image_dtype(image, tf.float32), label)
).cache(
).map(
    lambda image, label: (tf.image.random_flip_left_right(image), label)
).map(
    lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label)
).shuffle(
    1000
).batch(
    batch_size
).repeat(2)
Similarly, you can use other functions like random_flip_up_down and random_crop to randomly flip an image vertically (upside down) and randomly crop a tensor to a given size, respectively.
Your get_dataset function will then look like the following; note that scale must stay defined inside it, both so the test set can keep using it and so the training labels remain one-hot encoded for categorical_crossentropy:
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = mnist_train.map(scale).cache(
    ).map(
        lambda image, label: (tf.image.random_flip_left_right(image), label)
    ).map(
        lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label)
    ).shuffle(
        1000
    ).batch(
        batch_size
    ).repeat(2)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
See also the end-to-end data augmentation example suggested by @Andrew H, which also uses the MNIST dataset.
Hope this answers your question. Happy Learning.
