I want to use tf.data.Dataset to do image augmentation while training. My code is similar to the official guideline, as follows:
train_dataset = tf.data.Dataset.from_tensor_slices((train_data, train_label))
train_dataset = train_dataset.map(
    lambda file, label: tuple(tf.py_func(
        _parse_train_image, [file, label],
        [tf.float32, tf.int32]
    ))
).batch(batch_size).repeat(epoch)
def _parse_train_image(file, label):
    image = cv2.imread(file.decode(), cv2.IMREAD_COLOR)
    image = process_image(image)
    return image, label
The train_data and train_label here refer to two lists that contain the image file paths and the labels, respectively. However, this transformation only returns the original images; I also need to augment them (e.g. flip and rotate).
How can I continue to augment the images and return them?
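One way to continue (a minimal sketch, assuming TF 1.x-style APIs to match tf.py_func, and 3-channel images): keep the py_func map for decoding, then chain a second map that applies tf.image ops such as random flips and 90-degree rotations. These ops draw fresh random values every time an element is produced, so each epoch sees differently transformed copies.

import tensorflow as tf

def _augment(image, label):
    # The shape is unknown after tf.py_func, so assert the rank/channels first
    # (3 channels assumed here; set your real static shape if you know it).
    image.set_shape([None, None, 3])
    # Random horizontal flip, decided independently per image.
    image = tf.image.random_flip_left_right(image)
    # Random rotation by 0, 90, 180 or 270 degrees.
    k = tf.random_uniform([], minval=0, maxval=4, dtype=tf.int32)
    image = tf.image.rot90(image, k=k)
    return image, label

train_dataset = (
    tf.data.Dataset.from_tensor_slices((train_data, train_label))
    .map(lambda file, label: tuple(tf.py_func(
        _parse_train_image, [file, label], [tf.float32, tf.int32])))
    .map(_augment)  # runs on decoded tensors, so tf.image ops apply
    .batch(batch_size)
    .repeat(epoch)
)

Because the maps come before repeat, the whole pipeline (including the random ops) reruns each epoch. Alternatively, the same flips/rotations could be done with cv2/numpy inside _parse_train_image itself.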
I am working on a CNN multi-class classification of different concentrations (10uM, 30uM, etc.). I create my dataset with the images as the features and the concentrations as the labels. Note that the concentrations are left as strings. When running the code, I am getting the following error:
TypeError: cross_entropy_loss(): argument 'target' (position 2) must be Tensor, not tuple
The following is my dataset class:
class CustomDataset(Dataset):
    def __init__(self, path, method):
        """
        Args:
            path (string): path to the Excel file
            method (string): name of the sheet to read
        """
        # Transforms
        self.to_tensor = transforms.ToTensor()
        # Read the Excel file
        self.data_path = pd.read_excel(path, sheet_name=method)
        # First column contains the image paths
        self.img_arr = np.asarray(self.data_path.iloc[:, 0])
        # Second column is the labels
        self.label_arr = np.asarray(self.data_path.iloc[:, 1])

    def __getitem__(self, index):
        # Get image name from the pandas df
        img_path = self.img_arr[index]
        # Open image
        img = cv2.imread(img_path)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # Convert the image from BGR to RGB
        # Transform image to tensor
        img_tensor = self.to_tensor(img)
        # Get the label (class) of the image based on the cropped pandas column
        img_label = self.label_arr[index]
        return (img_tensor, img_label)

    def __len__(self):
        return len(self.data_path)
I am aware that the reason is most probably that the labels are left as strings (collated into tuples), so the loss function is unable to compare the CNN output with the label. However, I am unable to find any resources that explain how tuple-type labels are dealt with in multi-class classification. The solution seems simple, but I am a bit confused about how to solve it. Can anyone point me in the right direction?
EDIT: This is the implemented training loop:
def train_epoch(model, dataloader, loss_fn, optimizer):
    train_loss, train_correct = 0.0, 0
    model.train()  # Sets the mode to train (helpful when using layers such as Dropout and BatchNorm)
    for features, labels in dataloader:
        # Zero grad
        optimizer.zero_grad()
        # Forward pass
        output = model(features)
        print(output)
        print(labels)
        loss = loss_fn(output, labels)
        # Backward pass
        loss.backward()
        optimizer.step()
        train_loss += loss.item() * features.size(0)  # features.size(0) is useful when using batches
        scores, predictions = torch.max(output.data, 1)  # dim=1 keeps the max value of each row
        train_correct += (predictions == labels).sum().item()
    return train_loss, train_correct
This is the output of "output" and "labels", respectively:
tensor([[-0.0528, -0.0150, -0.0153, -0.0939, -0.0887, -0.0863]],
grad_fn=<AddmmBackward0>)
('70uM',)
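One way to resolve this (a sketch; the class list below is hypothetical and only mirrors the six logits in the printed output): map each concentration string to an integer class index inside the Dataset, so the default collate function stacks the labels into a LongTensor that cross_entropy accepts.

import torch

# Hypothetical class list; replace with the actual concentration strings in the sheet.
CLASSES = ["10uM", "30uM", "50uM", "70uM", "100uM", "300uM"]
CLASS_TO_IDX = {c: i for i, c in enumerate(CLASSES)}

# In CustomDataset.__getitem__, return an index instead of the raw string:
#     img_label = torch.tensor(CLASS_TO_IDX[self.label_arr[index]], dtype=torch.long)
#     return img_tensor, img_label

With integer labels, the DataLoader yields a labels tensor of shape (batch_size,), loss_fn(output, labels) type-checks, and the predictions == labels comparison in train_epoch becomes meaningful.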
I'm trying to (pre)process and augment my data and target variables when reading in the data each epoch/batch using the tf.data API. My unprocessed data is a CSV/pandas DataFrame with the format index, img_id, c1, ..., c5, where img_id contains the path to an image while c1, ..., c5 are run-length encodings of different defects in the image; both are strings. To increase the amount of data, I want to augment (e.g. flip) the images (and therefore the masks of the defects as well) with a certain probability per image when reading it each batch/epoch. I want to read each image from my drive to save memory, and because this still seems to yield good performance within the API (due to prefetching etc.).
I'm familiar with doing this using PyTorch's DataLoader API (version 1.8.1+cu111), but as this is for a course where I have to use TensorFlow (version 2.4.1), I read up on the tf.data API and came to the conclusion that I should do this augmentation and image reading using the map function. However, even reading the images throws different errors. The following is a mix of the code I've tried to use; most lines for reading the images are commented out, with an extra comment in the line above giving the error message they produce.
import cv2
import numpy as np
import tensorflow as tf

test = tf.data.experimental.make_csv_dataset("data/mini_formatted.csv", batch_size=4)

def map_fn(df_):
    img_path = df_["img_id"]
    masks = restore_masks(df_)  # get masks from RLE with the same shape as the images
    imgs = []
    # has to be declared before the loop with the correct shape, used for reading imgs later
    img = np.empty(shape=(256, 1600, 1), dtype=np.float32)
    # produces TypeError: Can't convert object of type 'Tensor' to 'str' for 'filename'
    img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
    for i in img_path:
        # produces TypeError: Can't convert object of type 'Tensor' to 'str' for 'filename'
        #img = cv2.imread(i, cv2.IMREAD_GRAYSCALE)
        # produces AttributeError: 'NoneType' object has no attribute 'shape'
        #img = cv2.imread(str(i), cv2.IMREAD_GRAYSCALE)
        # produces ValueError: 'img' has shape (256, 1600, 1) before the loop, but shape <unknown> after one iteration. Use tf.autograph.experimental.set_loop_options to set shape invariants.
        #img_file = tf.io.read_file(i)
        #img = tf.io.decode_image(img_file, dtype=tf.float32, channels=1)
        #imgs.append(img)
        pass
    # since img_path is a list, this doesn't work either
    # ValueError: Shape must be rank 0 but is rank 1 for '{{node ReadFile}} = ReadFile[](args_6)' with input shapes: [4].
    img_file = tf.io.read_file(img_path)
    img = tf.io.decode_image(img_file, dtype=tf.float32)
    ##########################################
    #
    # DO AUGMENTING PER BATCH HERE
    #
    ##########################################
    # return augmented images and masks
    return imgs, class_masks

proc_ds = test.map(map_fn)
As you can see, reading the image throws different errors I do not quite understand, especially because reading the image as follows (i.e. with the exact same commands after getting the first batch from the dataset without applying the map function) works without problems.
it = test.as_numpy_iterator()
x_proc = it.next()
img_files = [tf.io.read_file(i) for i in x_proc["img_id"]]
imgs = [tf.io.decode_image(img_file, dtype=tf.float32, channels=1) for img_file in img_files]
From my understanding, using the map function on a dataset should execute the code for each example once per epoch, but from the example given it seems the function is executed once per batch (presumably because make_csv_dataset returns an already-batched dataset), which I tried to work around. This doesn't explain why the same code doesn't work inside the map function while working fine outside of it; a quick check of the batching is shown below.
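A minimal check of that batching behaviour (a sketch; it only demonstrates that make_csv_dataset yields batched columns, so map_fn receives four paths at once rather than a single scalar path):

import tensorflow as tf

test = tf.data.experimental.make_csv_dataset("data/mini_formatted.csv", batch_size=4)
batch = next(iter(test))
print(batch["img_id"].shape)  # (4,) -> map_fn receives a rank-1 tensor of paths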
To help understand what I want to do, I've written a short Dataset/DataLoader in torch as an example of what my desired outputs are.
import cv2
import numpy as np
import pandas as pd
import torch

class MyDataset(torch.utils.data.Dataset):
    def __init__(self, df, mode="train", shuffle=True, augment=False, union=False,
                 greyscale=False, normalize=True):
        self.df = df
        self.length = len(df)
        self.mode = mode
        self.shuffle = shuffle
        self.augment = augment
        self.union = union
        self.greyscale = greyscale
        self.normalize = normalize

    def __len__(self):
        return self.length

    def __getitem__(self, idx_):
        # gets called for a single item when added to a batch -> one line of the dataframe
        # in the tf example, these are grouped in an OrderedDict with arrays of length BATCH_SIZE as values
        df_ = self.df.loc[idx_]
        img = self._load_img(df_["img_id"])
        if self.union:
            masks = build_masks(df_["c1":"c_all"], union_only=True)
        else:
            masks = build_masks(df_["c1":"c_all"])
        # could also add augmentation here instead of in collate_ds
        if self.mode == "train":
            return {"img": img, "masks": masks}
        return {"img": img, "masks": None}

    def _load_img(self, img_path):
        if self.greyscale:
            img = cv2.imread(img_path, cv2.IMREAD_GRAYSCALE)
        else:
            img = cv2.imread(img_path)
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        if self.normalize:
            img = img.astype(np.float32) / 255.
        else:
            img = img.astype(np.float32)
        return img

    def collate_ds(self, batch):
        # gets called with BATCH_SIZE examples that were processed using __getitem__
        imgs = [d["img"] for d in batch]
        masks = [d["masks"] for d in batch]
        if self.augment:
            # augmentation steps for each image
            pass
        imgs = torch.tensor(imgs, dtype=torch.float32)
        masks = torch.tensor(masks, dtype=torch.float32)
        return imgs, masks

mini_df = pd.read_csv("data/mini_formatted.csv", index_col=0)
torch_ds = MyDataset(mini_df, mode="train", shuffle=True, augment=False, union=False,
                     greyscale=False, normalize=True)
dataloader = torch.utils.data.DataLoader(torch_ds, batch_size=8, shuffle=True,
                                         collate_fn=torch_ds.collate_ds)
batch = next(iter(dataloader))
print(batch[0].shape, batch[1].shape)
# output: (torch.Size([8, 256, 1600, 3]), torch.Size([8, 256, 1600, 5]))
I still don't understand why even reading the images inside the map function doesn't work: with cv2, neither imread(img_path) (TypeError: Can't convert object of type 'Tensor' to 'str' for 'filename') nor imread(str(i)) (AttributeError: 'NoneType' object has no attribute 'shape', i.e. the image wasn't found) works, and the tf.io.* functions work outside the map function but throw errors when the exact same code is executed inside it.
I would be very thankful for any help on what I'm misunderstanding/doing wrong when using the map function with the tf.data API, and on how I could achieve the same results as the provided torch DataLoader using the tf.data API.
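For reference, a sketch of the per-example pattern that avoids both problems (cv2 fails inside map because the function is traced into a graph and receives symbolic tensors rather than strings; tf.io.read_file fails because make_csv_dataset hands it a rank-1 batch of paths). Build an unbatched dataset from the DataFrame, decode inside map where the path is a scalar string tensor, and batch afterwards. restore_masks_from_rle below is a hypothetical stand-in for the asker's RLE decoding.

import pandas as pd
import tensorflow as tf

df = pd.read_csv("data/mini_formatted.csv", index_col=0)

def load_example(img_path, rle_strings):
    # img_path is a scalar string tensor here, so tf.io.read_file accepts it.
    img = tf.io.decode_image(tf.io.read_file(img_path), dtype=tf.float32, channels=1)
    img.set_shape([256, 1600, 1])
    # Hypothetical numpy-based RLE decoding, wrapped so it runs eagerly per element.
    masks = tf.numpy_function(restore_masks_from_rle, [rle_strings], tf.float32)
    masks.set_shape([256, 1600, 5])
    return img, masks

ds = (
    tf.data.Dataset.from_tensor_slices(
        (df["img_id"].values, df.loc[:, "c1":"c5"].values.astype(str))
    )
    .map(load_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(8)
    .prefetch(tf.data.AUTOTUNE)
)

Random flips can then be added as a further map before batch, flipping image and masks together so they stay aligned.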
In order to use Google Colab's TPUs I need a tf.data.Dataset. How can I then use data augmentation on such a dataset?
More specifically, my code so far is:
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = mnist_train.map(scale).shuffle(10000).batch(batch_size)
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
Which is fed into this:
# TPU strategy ...
with strategy.scope():
    model = create_model()
    model.compile(loss="categorical_crossentropy",
                  optimizer="adam",
                  metrics=["acc"])

train_dataset, test_dataset = get_dataset()
model.fit(train_dataset,
          epochs=20,
          verbose=1,
          validation_data=test_dataset)
So, how can I use data augmentation here?
As far as I know, I can't use the tf.keras ImageDataGenerator, right?
I've tried the following and it didn't work.
data_generator = ...
model.fit_generator(data_generator.flow(train_dataset, batch_size=32),
                    steps_per_epoch=len(train_dataset) / 32, epochs=20)
Not surprising, since usually train_x and train_y are fed as two separate arguments to the flow function, not "packed" into one tf.data.Dataset.
You can use tf.image functions. The tf.image module contains various functions for image processing.
For example, you can add the following functionality in your get_dataset function:
- convert each image to tf.float32 in the 0-1 range.
- cache() the results, as those can be reused after each repeat.
- randomly flip each image left to right using random_flip_left_right.
- randomly change the contrast of each image using random_contrast.
- double the number of images with repeat(2), which reruns all the preceding steps.
Code -
mnist_train = (
    mnist_train
    .map(lambda image, label: (tf.image.convert_image_dtype(image, tf.float32), label))
    .cache()
    .map(lambda image, label: (tf.image.random_flip_left_right(image), label))
    .map(lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label))
    .shuffle(1000)
    .batch(batch_size)
    .repeat(2)
)
Similarly, you can use other functions like random_flip_up_down and random_crop to randomly flip an image vertically (upside down) and randomly crop a tensor to a given size, respectively.
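For instance, a sketch of those two on MNIST-sized images (the 24x24 crop size is an arbitrary choice):

mnist_train = mnist_train.map(
    lambda image, label: (tf.image.random_flip_up_down(image), label)
).map(
    lambda image, label: (tf.image.random_crop(image, size=[24, 24, 1]), label)
)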
Your get_dataset function will look like below -
def get_dataset(batch_size=200):
    datasets, info = tfds.load(name='mnist', with_info=True, as_supervised=True,
                               try_gcs=True)
    mnist_train, mnist_test = datasets['train'], datasets['test']

    def scale(image, label):
        image = tf.cast(image, tf.float32)
        image /= 255.0
        label = tf.one_hot(label, 10)
        return image, label

    train_dataset = (
        mnist_train
        # convert_image_dtype also rescales integer images to the 0-1 range;
        # the labels still need one-hot encoding for categorical_crossentropy
        .map(lambda image, label: (tf.image.convert_image_dtype(image, tf.float32),
                                   tf.one_hot(label, 10)))
        .cache()
        .map(lambda image, label: (tf.image.random_flip_left_right(image), label))
        .map(lambda image, label: (tf.image.random_contrast(image, lower=0.0, upper=1.0), label))
        .shuffle(1000)
        .batch(batch_size)
        .repeat(2)
    )
    test_dataset = mnist_test.map(scale).batch(batch_size)
    return train_dataset, test_dataset
Adding the link suggested by @Andrew H, which gives an end-to-end example of data augmentation that also uses the MNIST dataset.
Hope this answers your question. Happy Learning.
I have the following code:
imagepaths = tf.convert_to_tensor(imagepaths, dtype=tf.string)
labels = tf.convert_to_tensor(labels, dtype=tf.int32)
# Build a TF Queue, shuffle data
image, label = tf.data.Dataset.from_tensor_slices((imagepaths, labels))
and am getting the following error:
image, label = tf.data.Dataset.from_tensor_slices((imagepaths, labels))
ValueError: too many values to unpack (expected 2)
Shouldn't Dataset.from_tensor_slices see this as the length of the tensor, not the number of inputs? How can I fix this issue or combine the data tensors into the same variable more effectively?
Just for reference:
There are 1800 imagepaths and 1800 labels corresponding to each other. To be clear, the imagepaths are paths to the files where the jpg images are located. My goal after this is to shuffle the dataset and build the neural network model.
That code is right here:
# Read images from disk
image = tf.read_file(image)
image = tf.image.decode_jpeg(image, channels=CHANNELS)
# Resize images to a common size
image = tf.image.resize_images(image, [IMG_HEIGHT, IMG_WIDTH])
# Normalize
image = image * 1.0 / 127.5 - 1.0
# Create batches
X, Y = tf.train.batch([image, label], batch_size=batch_size,
                      capacity=batch_size * 8,
                      num_threads=4)
Try this:
def transform(entry):
    img = entry[0]
    lbl = entry[1]
    return img, lbl

raw_data = list(zip(imagepaths, labels))
dataset = tf.data.Dataset.from_tensor_slices(raw_data)
dataset = dataset.map(transform)
and if you want to have a look at your dataset you can do it like this:
for e in dataset.take(1):
    print(e)
You can add multiple map functions, and after that use shuffle and batch on your dataset to prepare it for training ;)
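Alternatively, a sketch that keeps the decoding code from the question (using the TF2 spellings tf.io.read_file and tf.image.resize; IMG_HEIGHT, IMG_WIDTH, CHANNELS and batch_size as defined elsewhere in your script): passing the (imagepaths, labels) tuple straight to from_tensor_slices keeps the string paths and integer labels as two separate tensors, and the file reading happens inside map.

def parse(image_path, label):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=CHANNELS)
    image = tf.image.resize(image, [IMG_HEIGHT, IMG_WIDTH])
    image = image * 1.0 / 127.5 - 1.0  # normalize to [-1, 1]
    return image, label

dataset = (
    tf.data.Dataset.from_tensor_slices((imagepaths, labels))
    .shuffle(1800)  # 1800 = number of examples, for a full shuffle
    .map(parse)
    .batch(batch_size)
)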
I'm trying to use the code from a public repository to train a kNN model with a set of images. It originally works by computing the similarity between all the images in the cluster, but I'd like to query with a new image (not included in the model) and get the most similar images from the original cluster.
This is the code to train the original kNN:
for f in os.listdir(path):
    # Process filename
    filename = os.path.splitext(f)  # filename in directory
    filename_full = os.path.join(path, f)  # full path filename
    head, ext = filename[0], filename[1]
    if ext.lower() not in [".jpg", ".jpeg"]:
        continue
    # Read image file
    img = image.load_img(filename_full, target_size=(224, 224))  # load
    imgs.append(np.array(img))  # image
    filename_heads.append(head)  # filename head
    # Pre-process for model input
    img = process_image(img)
    features = model.predict(img).flatten()  # features
    eX.append(features)  # append feature vector

X = np.array(eX)  # feature vectors
imgs = np.array(imgs)  # images
n_neighbours = 5 + 1
knn = kNN()  # kNN model
knn.compile(n_neighbors=n_neighbours, algorithm="brute", metric="cosine")
knn.fit(X)
This is my code to query a new image and find similar ones in the original cluster
# previously I read the image from a URL and put it in the img variable
img = image.load_img('db/temp.jpg', target_size=(224, 224))  # load
img = image.img_to_array(img)  # convert to array
img = np.expand_dims(img, axis=0)
img = preprocess_input(img)
img_features = model.predict(img).flatten()  # features
distances, indices = knn.predict(img_features)
The problem is that I get an "IndexError: tuple index out of range" error when I run knn.predict(img_features). I've already looked at the shape and type of img_features and they match the training features, so I don't really know why this error appears. Maybe the error is because the kNN used here is not a classifier, but I don't know how to adapt it in order to make it work.
Full code link just in case you want to check it out.
The problem was that I had to pass the matrix this way:
distances, indices = knn.predict(np.array([img_features]))
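This works because the feature extractor returns a single 1-D vector, while the kNN (assuming it wraps scikit-learn's NearestNeighbors.kneighbors) expects a 2-D batch of query vectors of shape (n_queries, n_features). A short sketch of the shapes involved:

img_features = model.predict(img).flatten()  # shape: (n_features,)
query = np.array([img_features])             # shape: (1, n_features)
distances, indices = knn.predict(query)      # one row of neighbours per query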