How to load images with multiple JSON annotations in PyTorch - python

I would like to know how I can use the data loader in PyTorch for my custom file structure. I have gone through the PyTorch documentation, but all the examples there use separate folders per class.
My folder structure consists of 2 folders (called training and validation), each with 2 subfolders (called images and json_annotations). Each image in the "images" folder contains multiple objects (like cars, cycles, people, etc.), and each image is annotated in a separate JSON file following the standard COCO annotation format. My intention is to make a neural network which can do real-time classification from videos.
Edit 1:
I have done the coding as suggested by Fábio Perez.
class lDataSet(data.Dataset):
    def __init__(self, path_to_imgs, path_to_json):
        self.path_to_imgs = path_to_imgs
        self.path_to_json = path_to_json
        self.img_ids = os.listdir(path_to_imgs)

    def __getitem__(self, idx):
        img_id = self.img_ids[idx]
        img_id = os.path.splitext(img_id)[0]
        img = cv2.imread(os.path.join(self.path_to_imgs, img_id + ".jpg"))
        load_json = json.load(open(os.path.join(self.path_to_json, img_id + ".json")))
        #n = len(load_json)
        #bboxes = load_json['annotation'][n]['segmentation']
        return img, load_json

    def __len__(self):
        return len(self.img_ids)
When I try this
l_data = lDataSet(path_to_imgs = '/home/training/images', path_to_json = '/home/training/json_annotations')
I'm getting l_data back, where l_data[idx][0] is the image and l_data[idx][1] is the JSON. Now I'm confused: how will I use it with the finetuning example available in PyTorch? In that example, the dataset and dataloader are created as shown below.
https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
# Create training and validation datasets
image_datasets = {x: datasets.ImageFolder(os.path.join(data_dir, x), data_transforms[x]) for x in ['train', 'val']}
# Create training and validation dataloaders
dataloaders_dict = {x: torch.utils.data.DataLoader(image_datasets[x], batch_size=batch_size, shuffle=True, num_workers=4) for x in ['train', 'val']}

You should be able to implement your own dataset with data.Dataset. You just need to implement the __len__ and __getitem__ methods.
In your case, you can iterate through all images in the image folder (then you can store the image ids in a list in your Dataset). Then, you use the index passed to __getitem__ to get the corresponding image id. With this image id, you can read the corresponding JSON file and return the target data that you need.
Something like this:
class YourDataLoader(data.Dataset):
    def __init__(self, path_to_imgs, path_to_json):
        self.path_to_imgs = path_to_imgs
        self.path_to_json = path_to_json
        self.image_ids = iterate_through_images(path_to_imgs)

    def __getitem__(self, idx):
        img_id = self.image_ids[idx]
        img = load_image(os.path.join(self.path_to_imgs, img_id))
        bboxes = load_bboxes(os.path.join(self.path_to_json, img_id))
        return img, bboxes

    def __len__(self):
        return len(self.image_ids)
In iterate_through_images you get all the ids (e.g. filenames) of images in a directory.
In load_bboxes you read the JSON and get the information you need.
I have a JSON loader implementation here if you want a reference.
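For completeness, here is a minimal sketch of what those helpers could look like, assuming .jpg images and one COCO-style JSON file per image id (the filenames and JSON keys are assumptions, adjust them to your annotation layout):
import os
import json
import cv2

def iterate_through_images(path_to_imgs):
    # collect the base filenames (ids) of every .jpg in the folder
    return [os.path.splitext(f)[0] for f in os.listdir(path_to_imgs)
            if f.lower().endswith('.jpg')]

def load_image(img_path):
    # the ids carry no extension, so append it here
    return cv2.imread(img_path + '.jpg')

def load_bboxes(json_path):
    # read the per-image JSON and pull out the bounding boxes;
    # COCO stores each box as [x, y, width, height]
    with open(json_path + '.json') as fh:
        annotation = json.load(fh)
    return [ann['bbox'] for ann in annotation['annotations']]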

Related

tensorflow image_dataset_from_directory: get certain pictures by index list

I have a folder called train_ds (don't get confused by the name, it's just a folder with pics) in which I have 5 subfolders with pictures. Each one is a different class.
I'm running 5 different trained models over this train_ds folder to get the inferences. What I want is to explicitly find the pictures that all models fail to classify correctly. For that:
Use the tf method image_dataset_from_directory to load pics.
Use the function inferences_target_list to get a list of inferred elements and the real labels. Both lists have the same length.
Use the function get_misclassified to get a list of the indexes where the inference and the real value differ. Voila, I have the mismatched ones for one model.
Run the same for the 5 trained models.
Get the common indexes for the 5 different processes.
So I could say I have indexed all images in the train_ds folder, and from all of them I know which indexes correspond to images classified wrong by all models.
The question now is... How do I get the pictures associated with those indexes from the image_dataset_from_directory method?
Functions:
def inferences_target_list(model, data):
    '''
    returns 2 lists: inferences list, real labels
    '''
    # over train set fold1
    y_pred_float = model.predict(data)
    y_pred = np.argmax(y_pred_float, axis=1)
    # get real labels
    y_target = tf.concat([y for x, y in data], axis=0)
    print("length inferences and real labels: ", len(y_pred), len(y_target))
    return y_pred, y_target
def get_misclassified(y_pred, y_target):
    '''
    returns a list with the indexes of real labels that were misclassified
    '''
    misclassified = []
    for i, (pred, target) in enumerate(zip(y_pred, y_target.numpy().tolist())):
        if pred != target:
            #print(i, pred, target)
            misclassified.append(i)
    print("total misclassified: ", len(misclassified))
    return misclassified
Method:
misclassified_train_folders = []
for f in folders:  # at the moment just 1 folder
    print(f)
    for nn in models_dict:  # dictionary of trained models
        print(nn)
        # -- train dataset for each folder
        train_path = reg_input + f + "/" + 'train_ds/'
        # print("\n train dataset:", "\n", train_path)
        train_ds = image_dataset_from_directory(
            train_path,
            class_names=["Bedroom", "Bathroom", "Dinning", "Livingroom", "Kitchen"],
            seed=None,
            validation_split=None,
            subset=None,
            image_size=image_size,
            batch_size=batch_size,
            color_mode='rgb',
            shuffle=False
        )
        # inferences and real values
        y_pred, y_target = inferences_target_list(models_dict[nn], train_ds)
        # misclassified ones
        misclassified = get_misclassified(y_pred, y_target)
        print("elements misclassified in {} for model {}: ".format(f, nn), len(misclassified))
        misclassified_train_folders.append(misclassified)
I got the list of indexes, but I don't know how to apply it.
Thanks in advance!
image_dataset_from_directory uses the index_directory function behind the scenes to index the directories. Basically, it sorts the subdirectories using Python's sorted and loops through them with a ThreadPool.
You can directly import index_directory and use it to get the file paths, labels, and class names.
Check it out at:
https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/preprocessing/dataset_utils.py#L26
You can use something like this to get the indexed format of the dataset
from keras.preprocessing.dataset_utils import index_directory
ALLOWLIST_FORMATS = ('.bmp', '.gif', '.jpeg', '.jpg', '.png')
file_paths, labels, class_names = index_directory(directory="/path/to/train_ds", labels="inferred", formats=ALLOWLIST_FORMATS)
Also, keep shuffle set to False.
Another solution is to read the file paths directly from the train_ds object via train_ds.file_paths, since image_dataset_from_directory sets a file_paths attribute on the dataset object. Please see here https://github.com/keras-team/keras/blob/d8fcb9d4d4dad45080ecfdd575483653028f8eda/keras/preprocessing/image_dataset.py#L234
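For example, once you have a list of misclassified indexes (like common_misclassified computed below), a small sketch, assuming shuffle=False so the dataset order matches file_paths:
# the dataset order matches file_paths only because shuffle=False
misclassified_paths = [train_ds.file_paths[i] for i in common_misclassified]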
The answer given by @ma7555 was the simple solution I was looking for; nevertheless, the labels list output by the @ma7555 method is different from the one obtained using tf.concat([y for x, y in train_ds], axis=0).
train_ds is created using the image_dataset_from_directory method and has 5 subfolders inside (my classes). The clumsy solution I have at the moment is:
get the list of inferred labels and the real ones with inferences_target_list
compare the 2 lists, check which labels are different and store their indexes with get_misclassified
get the list of elements in the folders with get_list_of_files; this should be the same as the paths from @ma7555's method, but I didn't check whether the order is the same yet
(inferences_target_list and get_misclassified are the same functions shown in the question above.)
def get_list_of_files(dirName):
    '''
    create a list of file and subdirectory names in the given directory
    found here => https://thispointer.com/python-how-to-get-list-of-files-in-directory-and-sub-directories/
    '''
    listOfFile = os.listdir(dirName)
    allFiles = list()
    # Iterate over all the entries
    for entry in listOfFile:
        # Create full path
        fullPath = os.path.join(dirName, entry)
        # If entry is a directory then get the list of files in this directory
        if os.path.isdir(fullPath):
            allFiles = allFiles + get_list_of_files(fullPath)
        else:
            allFiles.append(fullPath)
    return allFiles
Start:
misclassified_train_folders = []
for f in folders:
    print(f)
    for nn in models_dict:
        #print(nn)
        # -- train dataset for each folder
        train_path = reg_input + f + "/" + 'train_ds/'
        # print("\n train dataset:", "\n", train_path)
        train_ds = image_dataset_from_directory(
            train_path,
            class_names=["Bedroom", "Bathroom", "Dinning", "Livingroom", "Kitchen"],
            seed=None,
            validation_split=None,
            subset=None,
            image_size=image_size,
            batch_size=batch_size,
            color_mode='rgb',
            shuffle=False
        )
        # list of paths for analysed images
        pic_list = get_list_of_files(train_path)
        # inferences and real values
        y_pred, y_target = inferences_target_list(models_dict[nn], train_ds)
        # misclassified ones
        misclassified = get_misclassified(y_pred, y_target)
        print("elements misclassified in {} for model {}: ".format(f, nn), len(misclassified))
        misclassified_train_folders.append(misclassified)
Now I have a list with 5 lists inside: those lists hold the elements misclassified by each of the 5 models in my first folder. Getting the pictures that are always misclassified:
common_misclassified = list(set.intersection(*map(set, misclassified_train_folders)))
# these are the indexes of those images
print(len(common_misclassified), "\n", common_misclassified)
To get the paths of those pics:
pic_list_misclassified = [pic_list[i] for i in common_misclassified]
# paths of the elements misclassified by all models
print(len(pic_list_misclassified))

What is an alternative to tf.data.Dataset.from_tensor_slices for a multi-input model?

I am trying to make a multi-input Keras model that takes two inputs. One input is an image, and the second input is 1D text. I am storing the paths to the images in a dataframe, and then appending the images to a list like this:
from tqdm import tqdm
train_images = []
for image_path in tqdm(train_df['paths']):
    byte_file = tf.io.read_file(image_path)
    img = tf.image.decode_png(byte_file)
    train_images.append(img)
The 1D text inputs are stored in lists. This process is repeated for the validation and test sets. I then make a dataset, like this:
train_protein = tf.expand_dims(padded_train_protein_encode,axis=2)
training_dataset = tf.data.Dataset.from_tensor_slices(((train_protein, train_images), train_Labels))
training_dataset = training_dataset.batch(20)
val_protein = tf.expand_dims(padded_val_protein_encode, axis=2)
validation_dataset = tf.data.Dataset.from_tensor_slices(((val_protein, val_images), validation_Labels))
validation_dataset = validation_dataset.batch(20)
test_protein = tf.expand_dims(padded_test_protein_encode, axis=2)
test_dataset = tf.data.Dataset.from_tensor_slices(((test_protein, test_images), test_Labels))
test_dataset = test_dataset.batch(20)
I am running this in Google Colab, and even using the high-RAM option, the program crashes due to running out of RAM. What is the best way to solve this problem?
I have researched tf.data.Dataset.from_generator as an option, but I can't work out how to make it work when there are two inputs. Can anyone help?
This is a fairly common pain. There really isn't a better way than a data generator if your dataset is too large to load into memory. Coming from PyTorch, there are Pythonic classes to do this, rather than having to use tf.data.Dataset.from_generator. Subclassing tf.keras.utils.Sequence could be an elegant alternative. Without access to your dataset I cannot verify it, but something like this should work.
__getitem__ is called every batch.
class TfDataGenerator(tf.keras.utils.Sequence):
    def __init__(self, filepaths, proteins, labels, batch_size, shuffle=True):
        self.filepaths = np.array(filepaths)
        self.proteins = np.array(proteins)
        self.labels = np.array(labels)
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.on_epoch_end()  # build the initial index order

    def __len__(self):
        return len(self.filepaths) // self.batch_size

    def __getitem__(self, index):
        indexes = self.indexes[index * self.batch_size:(index + 1) * self.batch_size]
        return self.__generate_x(indexes), self.labels[indexes]

    def __generate_x(self, indexes):
        x_1 = []  # images
        x_2 = []  # 1D protein inputs
        for idx in indexes:
            image = cv2.imread(self.filepaths[idx])
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR
            x_1.append(image.astype(np.float32) / 255.)
            x_2.append(self.proteins[idx])
        return [np.array(x_1), np.array(x_2)]

    def on_epoch_end(self):
        self.indexes = np.arange(len(self.filepaths))
        if self.shuffle:
            np.random.shuffle(self.indexes)
Again, a very rough example, but hopefully it shows what can be done. Tensorflow documentation here
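Hypothetical usage, reusing the variable names from the question (the batch size of 20 matches the original batching calls):
train_gen = TfDataGenerator(train_df['paths'],
                            padded_train_protein_encode,
                            train_Labels,
                            batch_size=20)
model.fit(train_gen, epochs=10)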
This has been a big headache for me in the past, so hopefully this answer helps.

Converting two NumPy data sets into a particular PyTorch data set

I want to play around with a neural network that recognizes handwritten numbers. I found some examples on the web that use PyTorch; however, they seem to download the data from the MNIST website in a particular format. My data is, however, available as follows:
with np.load('prediction-challenge-01-data.npz') as fh:
    data_x = fh['data_x']
    data_y = fh['data_y']
Where data_x is the training data and data_y are the labels of the pictures. I want these data sets to be in the same format as trainloader as shown below:
trainset = datasets.MNIST('/data/mnist', download=True, train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
Where trainloader already has the training set data_x and labels data_y together in one set.
Is there any way to do this?
Edit: Shapes of data_x and data_y:
In [1]: data_x.shape
Out[2]: (20000, 1, 28, 28)
In [5]: data_y.shape
Out[7]: (20000,)
You can easily create your own dataset. Just inherit from torch.utils.data.Dataset and implement __getitem__ at the very least. Here is a quick and dirty example to get you going:
class YourOwnDataset(torch.utils.data.Dataset):
    def __init__(self, input_file_path, transformations):
        super().__init__()
        self.path = input_file_path
        self.transforms = transformations
        with np.load(self.path) as fh:
            # data_x has shape (20000, 1, 28, 28), data_y has shape (20000,)
            self.data = fh['data_x']
            self.labels = fh['data_y']

    # in __getitem__, we retrieve one item based on the input index
    def __getitem__(self, index):
        # based on the loss you chose and what you have in mind,
        # you can transform your label; here I assume the labels are
        # integer numbers (like 1, 3, etc., used for classification)
        label = self.labels[index]
        # reshape the (1, 28, 28) array into a 28x28 image so the
        # transforms (e.g. ToTensor) can convert it back to a tensor
        img = self.data[index].reshape(28, 28)
        img = self.transforms(img)
        return img, label

    def __len__(self):
        return len(self.data)
And you can create your dataset like this:
from torchvision import transforms
# add any number of transformations you like, I just added ToTensor()
transformations = transforms.Compose([transforms.ToTensor()])
trainset = YourOwnDataset('prediction-challenge-01-data.npz', transformations )
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)
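As a quick sanity check (a sketch, assuming the shapes from the question), you can pull one batch from the loader:
images, labels = next(iter(trainloader))
print(images.shape)  # expected: torch.Size([64, 1, 28, 28])
print(labels.shape)  # expected: torch.Size([64])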

Proper dataloader setup to train fasterrcnn-resnet50 for object detection with pytorch

I am trying to train PyTorch's torchvision.models.detection.fasterrcnn_resnet50_fpn to detect objects in my own images.
According to the documentation, this model expects a list of images and a list of dictionaries with 'boxes' and 'labels' as keys. So my dataset's __getitem__() looks like this:
def __getitem__(self, idx):
    # load images
    _, img = self.images[idx].getImage()
    img = Image.fromarray(img, mode='RGB')
    objects = self.images[idx].objects
    boxes = []
    labels = []
    for o in objects:
        # append bbox to boxes
        boxes.append([o.x, o.y, o.x + o.width, o.y + o.height])
        # append the 4th char of class_id, the number of lights (1-4)
        labels.append(int(str(o.class_id)[3]))
    # convert everything into a torch.Tensor
    boxes = torch.as_tensor(boxes, dtype=torch.float32)
    labels = torch.as_tensor(labels, dtype=torch.int64)
    target = {}
    target["boxes"] = boxes
    target["labels"] = labels
    # transforms consists only of transforms.Compose([transforms.ToTensor()]) for the time being
    if self.transforms is not None:
        img = self.transforms(img)
    return img, target
To the best of my knowledge, it returns exactly what's asked. My dataloader looks like this:
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=4, shuffle=False, num_workers=2)
However, when it gets to this stage:
for images, targets in dataloaders[phase]:
it raises
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 12 and 7 in dimension 1 at C:\w\1\s\windows\pytorch\aten\src\TH/generic/THTensor.cpp:689
Can someone point me in the right direction?
@jodag was right: I had to write a separate collate function in order for the net to receive the data as it was supposed to. In my case, I only needed to bypass the default function, as shown in the sketch below.
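A minimal sketch of such a bypassing collate function: instead of stacking tensors (which fails here because the images have different numbers of boxes), it returns the batch as tuples of images and targets.
def collate_fn(batch):
    # keep images and targets as tuples instead of stacking into one tensor
    return tuple(zip(*batch))

data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=4, shuffle=False, num_workers=2,
    collate_fn=collate_fn)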

How to write a generator for Keras fit_generator with a state?

I am trying to feed a large dataset to a Keras model.
The dataset does not fit into memory.
It is currently stored as a series of HDF5 files.
I want to train my model using
model.fit_generator(my_gen, steps_per_epoch=30, epochs=10, verbose=1)
However, in all the examples I could find online, my_gen was used only to perform data augmentation on an already loaded dataset. For example:
def generator(features, labels, batch_size):
    # create empty arrays to contain the batch of features and labels
    batch_features = np.zeros((batch_size, 64, 64, 3))
    batch_labels = np.zeros((batch_size, 1))
    while True:
        for i in range(batch_size):
            # choose random index in features
            index = random.choice(len(features), 1)
            batch_features[i] = some_processing(features[index])
            batch_labels[i] = labels[index]
        yield batch_features, batch_labels
In my case, it needs to be something like
def generator(features, labels, batch_size):
    while True:
        for i in range(batch_size):
            # choose the next file
            index = # SELECT THE NEXT FILE
            batch_features[i] = some_processing(features[files[index]])
            batch_labels[i] = labels[file[index]]
        yield batch_features, batch_labels
How do I keep track of the files which were already read in the previous batches?
From the Keras doc:
generator: A generator or an instance of Sequence (keras.utils.Sequence) object in order to avoid duplicate data when using multiprocessing. [...]
This means you can write a class inheriting from keras.utils.Sequence:
class ProductSequence(keras.utils.Sequence):
    def __init__(self):
        pass

    def __len__(self):
        pass

    def __getitem__(self, idx):
        pass
__init__ is there to init the class.
__len__ should return the number of batches per epoch. Keras will use this to know which indexes can be passed to __getitem__. __getitem__ will then return the batch data depending on the index.
A simple example can be found here.
With this approach you can simply keep internal state in the class, in which you save which files are already read; a rough sketch follows.
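A rough sketch under stated assumptions: each HDF5 file holds 'features' and 'labels' datasets, and one file corresponds to one batch (the file layout and dataset keys are assumptions):
import h5py
import keras

class ProductSequence(keras.utils.Sequence):
    def __init__(self, file_paths):
        self.file_paths = file_paths

    def __len__(self):
        # one batch per file in this simple layout
        return len(self.file_paths)

    def __getitem__(self, idx):
        # Keras passes the batch index, so no manual bookkeeping of
        # already-read files is needed; each index maps to one file
        with h5py.File(self.file_paths[idx], 'r') as fh:
            batch_features = fh['features'][:]
            batch_labels = fh['labels'][:]
        return batch_features, batch_labels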
Let us suppose that your data are images. If you have many images, you probably won't be able to load all of them into memory, and you would like to read from disk in batches.
Keras' flow_from_directory is very fast at doing that, since it also works in a multi-threaded way, but it needs all the images to be in different folders according to their class. If we instead have all the images in the same folder and their classes in a separate file, we can use the generator below to load our x, y data.
import pandas as pd
import numpy as np
import cv2
from tensorflow.keras.utils import to_categorical

# df_train: data frame with the class of every image
# dpath: path of images
classes = list(np.unique(df_train.label))

def batch_generator(ids):
    while True:
        for start in range(0, len(ids), batch_size):
            x_batch = []
            y_batch = []
            end = min(start + batch_size, len(ids))
            ids_batch = ids[start:end]
            for id in ids_batch:
                img = cv2.imread(dpath + 'train/{}.png'.format(id))  # OpenCV reads as BGR
                #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # BGR to RGB
                #img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)
                #img = pre_process(img)
                labelname = df_train.label.loc[df_train.id == id].values[0]
                labelnum = classes.index(labelname)
                x_batch.append(img)
                y_batch.append(labelnum)
            x_batch = np.array(x_batch)
            y_batch = to_categorical(y_batch, 10)
            yield x_batch, y_batch
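Hypothetical usage, tying it back to the fit_generator call from the question (train_ids and batch_size are assumptions based on the df_train data frame above):
train_ids = list(df_train.id)
model.fit_generator(batch_generator(train_ids),
                    steps_per_epoch=len(train_ids) // batch_size,
                    epochs=10, verbose=1)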
