Working with image data from memory in PyTorch - python

Question:
If I have an array in memory with dims (n, height, width, channels) and I want to get a Pytorch classifier to feed them forward and give me an array with class predictions for each of the n images in the array, how do I do that?
Background:
I am working on a computer vision problem where I modify some images using pre-existing code and want to send the modified images into a PyTorch classifier CNN (not developed or controlled by me). I am more accustomed to TensorFlow/Keras than to PyTorch.
With TensorFlow/Keras models you can hand them a bunch of images in a numpy array and they will feed them forward through the model.
PS:
A colleague suggested saving all the images to disk first, then reading them in with DataLoader but that is so unnecessary when I already have the images in memory.
Sorry if it's a dumb question, I tried to find a solution elsewhere but obviously haven't had much success.

You can write a simple conversion function that takes the images already in memory and returns tensors which can be fed directly to the model, without having to save them to disk first. A very simple implementation can be:
import numpy as np
import torch

def images_to_tensor(images):
    # images is a numpy array of shape (N, H, W, C)
    # normalizes images to roughly [-1, 1]; comment this out if you want [0, 1] instead
    images = (images.astype(np.float32) - 127.5) / 128.0
    # to normalize images to [0, 1], uncomment the line below
    # images = images.astype(np.float32) / 255.0
    # convert the numpy array to a tensor and move channels first: (N, C, H, W)
    images = torch.from_numpy(images).permute(0, 3, 1, 2)
    # to move the tensor to the GPU, uncomment the line below
    # images = images.to("cuda")
    return images
You can then use the function to convert your images to tensors and pass them to the classification model to get the output predictions.
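For example, assuming model is the already-loaded classifier (the name is hypothetical here) and images is your (n, height, width, channels) numpy array, the forward pass could look roughly like this:

import numpy as np
import torch

# `model` is assumed to be the pre-trained classifier, already loaded elsewhere
inputs = images_to_tensor(images)      # (n, C, H, W) float tensor

model.eval()                           # inference mode (disables dropout/batchnorm updates)
with torch.no_grad():                  # no gradients needed for prediction
    logits = model(inputs)             # shape (n, num_classes)
    preds = logits.argmax(dim=1)       # predicted class index per image

predictions = preds.cpu().numpy()      # back to a numpy array of length n

One caveat: make sure the normalization in images_to_tensor matches whatever preprocessing the classifier was trained with.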

Related

Resizing images in data preprocessing for training convolution network

I am trying to load data from jpeg files to train a convolution network. However, the images are large, at about 24 million pixels each, so loading and using them at full resolution is not practical.
To get the images to a more useful format I am trying to load each image, rescale it and then append it to a list. Once this is done, I can then convert the list into a numpy array and feed into the network for training as usual.
My problem is that my data set is very large and it takes about a second to rescale every image, which means it is not feasible to resize every image the way I have currently implemented this:
import numpy as np
import matplotlib.pyplot as plt
from skimage.measure import block_reduce

length_training_DF = 30000
for i in range(length_training_DF):
    im = plt.imread(TRAIN_IM_DIR + trainDF.iloc[i]['image_name'] + '.jpg')
    # downsample by taking the max over 10x10 pixel blocks per channel
    image = block_reduce(im, block_size=(10, 10, 1), func=np.max)
    trainX.append(image)
I have also used the following:
from keras.preprocessing import image

length_training_DF = 30000
for i in range(50):
    img = image.load_img(TRAIN_IM_DIR + trainDF.iloc[i]['image_name'] + '.jpg', target_size=(224, 224))
    trainX.append(img)
Is there any way to load these images more quickly into a format for training a network? I have thought about using a keras dataset, perhaps by using tf.keras.preprocessing.image_dataset_from_directory(), but the directory in which the image data is stored is not formatted correctly into folders containing the same targets as is required by this method.
The images are for a binary classification problem.
The usual way is to write a preprocessing script that loads the large images, rescales them, applies other operations if needed, and then saves each class to a separate directory, as required by ImageDataGenerator (a sketch is given after the list below).
There are at least three good reasons to do that:
Typically, you will run your training process dozens of times. You don't want to redo the rescaling or, say, auto-white balance every time.
ImageDataGenerator provides vital methods for augmenting your training data set.
It's a good generator out of the box, and you likely don't want to load the entire data set into memory.
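A minimal sketch of such a preprocessing script, assuming the trainDF from the question and a hypothetical binary target column (the paths and target size are assumptions too), could look like this:

import os
from PIL import Image

SRC_DIR = 'path/to/raw_images/'     # hypothetical paths, adjust to your setup
DST_DIR = 'path/to/preprocessed/'
TARGET_SIZE = (224, 224)

for _, row in trainDF.iterrows():                       # trainDF as in the question
    label = str(row['target'])                          # assumed binary label column
    os.makedirs(os.path.join(DST_DIR, label), exist_ok=True)
    img = Image.open(SRC_DIR + row['image_name'] + '.jpg')
    img = img.resize(TARGET_SIZE)                       # rescale once, up front
    img.save(os.path.join(DST_DIR, label, row['image_name'] + '.jpg'))

ImageDataGenerator.flow_from_directory can then read straight from DST_DIR, one subdirectory per class.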

TensorFlow tf.data.Dataset API for medical imaging

I'm a student in medical imaging. I have to construct a neural network for image segmentation. I have a data set of 285 subjects, each with 4 modalities (T1, T2, T1ce, FLAIR) + their respective segmentation ground truth. Everything is in 3D with resolution of 240x240x155 voxels (this is BraTS data set).
As we know, I cannot feed the whole image to a GPU for memory reasons. I have to preprocess the images and decompose them into 3D overlapping patches (sub-volumes of 40x40x40), which I do with scikit-image view_as_windows, and then serialize the windows to a TFRecords file. Since each patch overlaps by 10 voxels in each direction, this sums to 5,292 patches per volume. The problem is that, with only 1 modality, the TFRecords file already comes to about 800 GB. On top of that, I have to compute the respective segmentation weight map for each patch and store it as patches too; the segmentation is also stored as patches in the same file.
And I eventually have to include all the other modalities, which would take nothing less than terabytes of storage. I also have to remember that I must sample an equivalent number of patches from background and foreground (class balancing).
So, I guess I have to do all the preprocessing steps on the fly, just before every training step (while hoping not to slow training down too much). I cannot use tf.data.Dataset.from_tensors() since I cannot load everything into RAM. I cannot use tf.data.Dataset.from_tfrecords() since preprocessing the whole thing beforehand takes a lot of storage that I will eventually run out of.
The question is: what's left for me to do this cleanly, with the possibility of reloading the model after training for image inference?
Thank you very much and feel free to ask for any other details.
Pierre-Luc
Finally, I found a method to solve my problem.
I first "crop" a subject's image without actually applying the crop: I only compute the slices needed to reduce the volume to just the brain. I then serialize the whole data set into one TFRecord file, each training example consisting of an image modality, the original image's shape and the slices (saved as an Int64 feature).
I decode the TFRecords afterwards. Each training sample is reshaped to the shape stored in its features. I stack all the image modalities into one stack using the tf.stack() method, crop the stack using the previously extracted slices (the crop then applies to all images in the stack), and finally extract random patches using the tf.random_crop() method, which allows me to randomly crop a 4-D array (height, width, depth, channel).
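As a rough sketch of that stacking and cropping step (TF 1.x style; the placeholder shapes and variable names are assumptions, not the original code):

import tensorflow as tf

# Pretend the four decoded modalities and the pre-computed crop slices have already
# been parsed from the TFRecord example (names and shapes here are hypothetical).
modalities = [tf.placeholder(tf.float32, shape=[240, 240, 155]) for _ in range(4)]
s = tf.placeholder(tf.int32, shape=[6])         # start/stop slice per axis, stored at write time

stack = tf.stack(modalities, axis=-1)           # (H, W, D, n_modalities)
begin = tf.stack([s[0], s[2], s[4], 0])
size = tf.stack([s[1] - s[0], s[3] - s[2], s[5] - s[4], 4])
brain = tf.slice(stack, begin, size)            # crop the whole stack down to the brain region
patch = tf.random_crop(brain, [40, 40, 40, 4])  # random 40x40x40 training patch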
The only thing I still haven't figured out is data augmentation. Since all of this happens on Tensors, I cannot use plain Python and NumPy to rotate, shear or flip a 4-D array. I would need to do it inside the tf.Session(), but I would rather avoid that and directly feed the training handle.
For evaluation, I serialize only one test subject per TFRecords file. The test subject contains all modalities too, but since there is no TensorFlow method to extract patches from a 4-D array, the image is preprocessed into small patches with the Scikit-Learn extract_patches() method, and these patches are serialized to the TFRecords file.
This way, the training TFRecords file is a lot smaller and I can evaluate the test data using batch prediction.
Thanks for reading and feel free to comment !

Making a custom generator in keras for prediction

I am working on doing prediction for my large database of ~1 million images. For each image, I have code that can chop the image up into ~200 smaller images and pass them into keras as a numpy array for prediction.
I want to avoid unnecessary reading and writing to the hard drive, so I don't want to save all these smaller images and use flow_from_directory. Instead, I am looking to read in an image, chop it up with my existing code, and pass the smaller images into my network as a batch all in memory, and then repeat this process for many images.
Is this something Keras can handle? If so, I suspect I will need to make my own custom generator, but I'm not sure how to do this, and I couldn't find any good examples. Does anyone have an example of how to implement a custom generator?
Try something like this:
import os
import cv2
import numpy as np

dpath = 'path to test folder'
ids = os.listdir(dpath + "test/")
first = 1
for id in ids:
    x_batch = []
    img = cv2.imread(dpath + 'test/{}.jpg'.format(id))  # jpg if the images are in jpg format
    img = cv2.resize(img, (224, 224), interpolation=cv2.INTER_CUBIC)  # if resizing is needed
    chopped_image = your_chopping_code(img)  # your code that chops the image into smaller images
    x_batch.append(chopped_image)  # use x_batch.extend(...) if your code returns a list of sub-images
    x_batch = np.array(x_batch, np.float32)
    preds = model.predict_on_batch(x_batch)
    if first == 1:
        predsA = preds.copy()
        first = 0
    else:
        predsA = np.append(predsA, preds, axis=0)
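If you would rather have an actual Keras generator, as asked in the question, a minimal sketch built on keras.utils.Sequence (with a hypothetical chop_image function standing in for your existing chopping code) could look like this:

import cv2
import numpy as np
from keras.utils import Sequence

class ChoppedImageSequence(Sequence):
    """Yields one batch of ~200 chopped sub-images per source image."""

    def __init__(self, image_paths, chop_fn):
        self.image_paths = image_paths   # list of paths to the full-size images
        self.chop_fn = chop_fn           # your existing chopping code

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.image_paths[idx])
        patches = self.chop_fn(img)                   # list/array of ~200 smaller images
        return np.array(patches, dtype=np.float32)    # one batch per source image

# usage (chop_image is hypothetical):
# preds = model.predict_generator(ChoppedImageSequence(paths, chop_image), verbose=1)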

Is it possible to train on multiple images sizes in keras?

Keras takes numpy arrays as input for training data, however it is possible to create models that can take variable input sizes. I'm wondering if there is a way to incorporate images of various dimensions in the training data for a model.
You cannot give variable size images to train a model in Keras. According to the Keras API, the Input layer looks as follows:
Input(shape=(3,None,None))
Here, 3 represents the number of channels for RGB images. But you still have to tell Keras clearly what the width and height of the training images are, so there is no way for Keras to handle variable-size images.
You therefore have to transform the images to one specific size first and then train the model using Keras.
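For example, a minimal sketch that brings differently sized images to a single fixed size before stacking them into a training array (the target size and the use of OpenCV are assumptions here):

import cv2
import numpy as np

TARGET_SIZE = (224, 224)   # whatever fixed size you choose for the model

def to_fixed_size(images):
    """Resize a list of variably sized images and stack them into one array."""
    resized = [cv2.resize(img, TARGET_SIZE) for img in images]
    return np.stack(resized).astype(np.float32)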

Tensorflow variable images size within single batch

I am currently implementing an FCN in TensorFlow that supports variable input image sizes.
My images come in many different sizes, but unfortunately I am not able to start training with a batch size other than 1.
I am using the feed dict in a following way:
feed_dict = {fcn.images: image_batch,
             fcn.labels: labels_batch,
             fcn.dropout_keep: dropout}
result = sess.run(list(tf_ops), feed_dict=feed_dict)
I have already tried:
Creating image_batch and labels_batch as numpy arrays; this does not work, however, since numpy arrays do not support variable sizes in certain dimensions.
Creating image_batch and labels_batch as lists of numpy arrays; here it seems that TensorFlow tries to call numpy.array(image_batch).
Going with tf.pack(); unfortunately this does not support different image sizes either.
My question is:
Is there a way how to solve this problem?
Thank you in advance for any suggestions and advices.
So we can close this - quoting Olivier Moindrot above:
You have to pad or resize all your images to the same size before batching them.
Note that after Olivier's answer, there was a new tf.image.decode_and_crop_jpeg op added that can make it a bit easier to do this.
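A small sketch of the padding approach (TF 1.x; the target size is an assumption): pad or crop every image to a common size before stacking the batch, for example with tf.image.resize_image_with_crop_or_pad:

import tensorflow as tf

MAX_H, MAX_W = 512, 512   # assumed upper bound on image sizes in the data set

image = tf.placeholder(tf.float32, shape=[None, None, 3])
# pad (or centrally crop) every image to the same spatial size so images can be batched
padded = tf.image.resize_image_with_crop_or_pad(image, MAX_H, MAX_W)

For segmentation, the same padding has to be applied to the label maps so that images and labels stay aligned.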
