Different methods of normalizing values in image processing - Python

I was trying to create a neural network to distinguish forest from other land in satellite images.
I started analysing the images but I'm not sure how to normalize the pixel values.
I thought of dividing each pixel value by 255, but in an example made by bnsreenu I found this part:
from sklearn.preprocessing import MinMaxScaler, StandardScaler
scaler = MinMaxScaler()

root_directory = 'Semantic segmentation dataset/'
patch_size = 256

# Read images from respective 'images' subdirectory.
# As all images are of different sizes we have 2 options: either resize or crop.
# But some images are too large and some too small, and resizing would change the size of real objects.
# Therefore, we will crop them to the nearest size divisible by 256 and then
# divide all images into patches of 256x256x3.
image_dataset = []
for path, subdirs, files in os.walk(root_directory):
    # print(path)
    dirname = path.split(os.path.sep)[-1]
    if dirname == 'images':  # Find all 'images' directories
        images = os.listdir(path)  # List of all image names in this subdirectory
        for i, image_name in enumerate(images):
            if image_name.endswith(".jpg"):  # Only read jpg images...
                image = cv2.imread(path + "/" + image_name, 1)  # Read each image as BGR
                SIZE_X = (image.shape[1] // patch_size) * patch_size  # Nearest size divisible by our patch size
                SIZE_Y = (image.shape[0] // patch_size) * patch_size  # Nearest size divisible by our patch size
                image = Image.fromarray(image)
                image = image.crop((0, 0, SIZE_X, SIZE_Y))  # Crop from top left corner
                # image = image.resize((SIZE_X, SIZE_Y))  # Try not to resize for semantic segmentation
                image = np.array(image)

                # Extract patches from each image
                print("Now patchifying image:", path + "/" + image_name)
                patches_img = patchify(image, (patch_size, patch_size, 3), step=patch_size)  # step=256 means no overlap between patches

                for i in range(patches_img.shape[0]):
                    for j in range(patches_img.shape[1]):
                        single_patch_img = patches_img[i, j, :, :]

                        # Use MinMaxScaler instead of just dividing by 255.
                        single_patch_img = scaler.fit_transform(single_patch_img.reshape(-1, single_patch_img.shape[-1])).reshape(single_patch_img.shape)
                        # single_patch_img = (single_patch_img.astype('float32')) / 255.

                        single_patch_img = single_patch_img[0]  # Drop the extra unnecessary dimension that patchify adds.
                        image_dataset.append(single_patch_img)
In this example he uses a MinMaxScaler, which gives different values than simply dividing by 255.
Which method is better or more suited to this situation?
I'll leave the link below:
github repo with full code

MinMaxScaler may indeed produce different values than simple division by 255 (in case there are no pixels with intensities 0 or 255). As the official scikit-learn documentation says, it performs the following transformation:
X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min
where max and min are the desired feature range (0 and 1 by default).
Therefore, normalizing data is a rather data-specific (and probably model-specific) operation. Division by 255 is the most common way to do it, and in many cases it is enough. Since you are using a neural network, you can check the answers to this question to learn more about why you should normalize/center your data.
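As a quick illustration of the difference, here is a minimal sketch (the random patch is just a placeholder, not data from the repo):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical 4x4 RGB patch whose intensities do not span the full 0-255 range
patch = np.random.randint(50, 200, size=(4, 4, 3), dtype=np.uint8)

# Option 1: divide by 255 -- values keep their absolute scale in [0, 1],
# but do not necessarily reach 0 or 1
by_255 = patch.astype('float32') / 255.

# Option 2: MinMaxScaler -- each channel is stretched so that its own
# minimum maps to 0 and its own maximum maps to 1
scaler = MinMaxScaler()
scaled = scaler.fit_transform(patch.reshape(-1, patch.shape[-1])).reshape(patch.shape)

print(by_255.min(), by_255.max())  # roughly 0.2 and 0.78
print(scaled.min(), scaled.max())  # exactly 0.0 and 1.0

Also note that fitting the scaler per patch, as in the quoted code, makes the scaling depend on each patch's own extremes, so the same absolute intensity can map to different values in different patches.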

Related

How to treat different size images in ML pipeline

I'm here asking a general question about image processing applied to a machine learning pipeline. In this post, I will refer to ML as every algorithm that is not deep learning (therefore it doesn't use a neural network).
I'm developing a classifier to catalog different clothes .png images. I have labels (for each image I know the category) so it's a supervised learning problem.
My objective is to use PCA to reduce the problem's dimensionality and then use bag of visual words to perform the classification. I'm using python for this project.
The problem is that each photo has a different size and a different width-to-height ratio (therefore I can't simply resize them, because a fixed width would not give a unique height across images).
My (inelegant) solution is to fix the width at 200 px and then pad each image with rows of zeros (each image is a NumPy array of maximum_h rows, with each row being width long).
Here the script:
# help function to convert images to arrays
def get_image(image_path: str, resize=True, w=300):
    """
    :param image_path: string, path of the image
    :param resize: boolean, if True the image is resized. Default: True
    :param w: integer, specify the width of the resized image
    :return: numpy array of the greyscale version of the image
    """
    try:
        image = Image.open(image_path).convert("L")
        if resize:
            wpercent = (w / float(image.size[0]))
            hsize = int((float(image.size[1]) * float(wpercent)))
            image = image.resize((w, hsize), Image.ANTIALIAS)
        # pixel_values = np.array(image.getdata())
        return image
    except:
        # AI19/04442.png corrupted
        # AI18/02971.png corrupted
        # print(image_path)
        return None


def extract_images(paths: list, categories: list, w: int, maximum_h: int):
    A = np.zeros([len(paths), w * maximum_h])
    y = []
    counter = 0
    for image_path, label in tqdm(zip(paths, categories)):
        im = get_image(image_path, w=w)
        if im:
            # adapt images to fit
            h, w = np.array(im).shape
            delta_h = maximum_h - h
            zeros_ = np.zeros((delta_h, w), dtype=int)
            im = np.concatenate((im, zeros_), axis=0)
            A[counter, :] = im.reshape(1, -1)
            y.append(label)
            counter += 1
        else:
            continue
    return (A, y)
The problem here is that the classifier performs badly (around 20%), because I add a significant number of zeros to each image, which increases the dimensionality but doesn't add information.
Looking at the largest eigenvectors from the PCA I see that a lot of weight is concentrated in these "padding" areas (which confirms my impression).
Is there a better way to handle images of different sizes in Python?

OpenCV can't resize() a numpy array created from a pygame.PixelArray, error: src data type = 8 is not supported [duplicate]

I would like to take an image and change the scale of the image, while it is a numpy array.
For example I have this image of a coca-cola bottle:
bottle-1
Which translates to a numpy array of shape (528, 203, 3) and I want to resize that to say the size of this second image:
bottle-2
Which has a shape of (140, 54, 3).
How do I change the size of the image to a certain shape while still maintaining the original image? Other answers suggest stripping every other or third row out, but what I want to do is basically shrink the image the way you would in an image editor, but in Python code. Are there any libraries to do this in numpy/SciPy?
Yes, you can install OpenCV (a library used for image processing and computer vision) and use the cv2.resize function. For instance:
import cv2
import numpy as np
img = cv2.imread('your_image.jpg')
res = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_CUBIC)
Here img is a numpy array containing the original image, whereas res is a numpy array containing the resized image. An important aspect is the interpolation parameter: there are several ways to resize an image, which matters especially because you are scaling the image down and the size of the original image is not a multiple of the size of the resized image. Possible interpolation schemes are:
INTER_NEAREST - a nearest-neighbor interpolation
INTER_LINEAR - a bilinear interpolation (used by default)
INTER_AREA - resampling using pixel area relation. It may be a preferred method for image decimation, as it gives moiré-free results. But when the image is zoomed, it is similar to the INTER_NEAREST method.
INTER_CUBIC - a bicubic interpolation over a 4x4 pixel neighborhood
INTER_LANCZOS4 - a Lanczos interpolation over an 8x8 pixel neighborhood
As with most options, there is no "best" choice in the sense that for every resize scheme there are scenarios where one strategy is preferable to another.
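For instance, a minimal sketch comparing two of the schemes above when shrinking (the file name is just a placeholder):

import cv2

img = cv2.imread('your_image.jpg')  # e.g. shape (528, 203, 3)

# INTER_AREA is usually recommended for shrinking (decimation)
small_area = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_AREA)

# INTER_CUBIC is slower and typically shines when enlarging
small_cubic = cv2.resize(img, dsize=(54, 140), interpolation=cv2.INTER_CUBIC)

print(small_area.shape, small_cubic.shape)  # both (140, 54, 3)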
While it might be possible to use numpy alone to do this, the operation is not built-in. That said, you can use scikit-image (which is built on numpy) to do this kind of image manipulation.
Scikit-Image rescaling documentation is here.
For example, you could do the following with your image:
from skimage.transform import resize
bottle_resized = resize(bottle, (140, 54))
This will take care of things like interpolation, anti-aliasing, etc. for you.
One-line numpy solution for downsampling (by 2):
smaller_img = bigger_img[::2, ::2]
And upsampling (by 2):
bigger_img = smaller_img.repeat(2, axis=0).repeat(2, axis=1)
(This assumes an HxWxC-shaped image. Note that this method only allows whole-integer resizing, e.g. 2x but not 1.5x.)
For people coming here from Google looking for a fast way to downsample images in numpy arrays for use in machine learning applications, here's a super fast method (adapted from here). Note that it only works when the input dimensions are a multiple of the output dimensions.
The following examples downsample from 128x128 to 64x64 (this can be easily changed).
Channels last ordering
# large image is shape (128, 128, 3)
# small image is shape (64, 64, 3)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((output_size, bin_size,
                                   output_size, bin_size, 3)).max(3).max(1)
Channels first ordering
# large image is shape (3, 128, 128)
# small image is shape (3, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((3, output_size, bin_size,
                                   output_size, bin_size)).max(4).max(2)
For grayscale images just change the 3 to a 1 like this:
Channels first ordering
# large image is shape (1, 128, 128)
# small image is shape (1, 64, 64)
input_size = 128
output_size = 64
bin_size = input_size // output_size
small_image = large_image.reshape((1, output_size, bin_size,
                                   output_size, bin_size)).max(4).max(2)
This method uses the equivalent of max pooling. It's the fastest way to do this that I've found.
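As a small variation (not part of the original answer), the same reshape trick also gives average pooling if you prefer smoother downsampling; just swap .max() for .mean():

import numpy as np

large_image = np.random.rand(128, 128, 3)  # placeholder channels-last image
input_size, output_size = 128, 64
bin_size = input_size // output_size

# Average pooling over bin_size x bin_size blocks (channels last)
small_image = large_image.reshape((output_size, bin_size,
                                   output_size, bin_size, 3)).mean(3).mean(1)
print(small_image.shape)  # (64, 64, 3)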
If anyone came here looking for a simple method to scale/resize an image in Python, without using additional libraries, here's a very simple image resize function:
# simple image scaling to (nR x nC) size
def scale(im, nR, nC):
    nR0 = len(im)     # source number of rows
    nC0 = len(im[0])  # source number of columns
    return [[im[int(nR0 * r / nR)][int(nC0 * c / nC)]
             for c in range(nC)] for r in range(nR)]
Example usage: resizing a (30 x 30) image to (100 x 200):
import matplotlib.pyplot as plt

def sqr(x):
    return x * x

def f(r, c, nR, nC):
    return 1.0 if sqr(c - nC / 2) + sqr(r - nR / 2) < sqr(nC / 4) else 0.0

# a red circle on a canvas of size (nR x nC)
def circ(nR, nC):
    return [[[f(r, c, nR, nC), 0, 0]
             for c in range(nC)] for r in range(nR)]

plt.imshow(scale(circ(30, 30), 100, 200))
Output (image omitted).
This works to shrink/scale images, and works fine with numpy arrays.
For people who want to resize (interpolate) a batch of numpy arrays, PyTorch provides a faster function called torch.nn.functional.interpolate; just remember to use np.transpose first to change the channel layout from batch x W x H x 3 to batch x 3 x W x H.
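A minimal sketch of that approach (the array shapes are placeholders):

import numpy as np
import torch
import torch.nn.functional as F

batch = np.random.rand(8, 128, 128, 3).astype(np.float32)  # batch x H x W x C

# Move channels to the front: batch x C x H x W
batch_chw = np.transpose(batch, (0, 3, 1, 2))

tensor = torch.from_numpy(batch_chw)
resized = F.interpolate(tensor, size=(64, 64), mode='bilinear', align_corners=False)

# Back to a batch x H x W x C numpy array
resized_np = np.transpose(resized.numpy(), (0, 2, 3, 1))
print(resized_np.shape)  # (8, 64, 64, 3)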
SciPy's imresize() was another resize method, but it is removed as of SciPy v1.3.0. SciPy refers you to the PIL image resize method: Image.resize(size, resample=0)
size – The requested size in pixels, as a 2-tuple: (width, height).
resample – An optional resampling filter. This can be one of PIL.Image.NEAREST (use nearest neighbour), PIL.Image.BILINEAR (linear interpolation), PIL.Image.BICUBIC (cubic spline interpolation), or PIL.Image.LANCZOS (a high-quality downsampling filter). If omitted, or if the image has mode "1" or "P", it is set to PIL.Image.NEAREST.
Link here:
https://pillow.readthedocs.io/en/3.1.x/reference/Image.html#PIL.Image.Image.resize
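A short example of calling Image.resize directly through Pillow (the file name is a placeholder; on newer Pillow versions the filter constants live under Image.Resampling):

import numpy as np
from PIL import Image

img = Image.open('your_image.jpg')

# Image.resize takes (width, height); LANCZOS is a high-quality downsampling filter
resized = img.resize((54, 140), resample=Image.LANCZOS)

resized_arr = np.array(resized)
print(resized_arr.shape)  # (140, 54, 3) for an RGB image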
Stumbled back upon this after a few years. It looks like the answers so far fall into one of a few categories:
Use an external library. (OpenCV, SciPy, etc)
Use Power-of-Two Scaling
Use Nearest Neighbor
These solutions are all respectable, so I offer this only for completeness. It has three advantages over the above: (1) it will accept arbitrary resolutions, even non-power-of-two scaling factors; (2) it uses pure Python+Numpy with no external libraries; and (3) it interpolates all the pixels for an arguably 'nicer-looking' result.
It does not make good use of Numpy and, thus, is not fast, especially for large images. If you're only rescaling smaller images, it should be fine. I offer this under Apache or MIT license at the discretion of the user.
import math
import numpy

def resize_linear(image_matrix, new_height: int, new_width: int):
    """Perform a pure-numpy linear-resampled resize of an image."""
    output_image = numpy.zeros((new_height, new_width), dtype=image_matrix.dtype)
    original_height, original_width = image_matrix.shape
    inv_scale_factor_y = original_height / new_height
    inv_scale_factor_x = original_width / new_width

    # This is an ugly serial operation.
    for new_y in range(new_height):
        for new_x in range(new_width):
            # If you had a color image, you could repeat this with all channels here.
            # Find sub-pixel data:
            old_x = new_x * inv_scale_factor_x
            old_y = new_y * inv_scale_factor_y
            x_fraction = old_x - math.floor(old_x)
            y_fraction = old_y - math.floor(old_y)

            # Sample four neighboring pixels:
            left_upper = image_matrix[math.floor(old_y), math.floor(old_x)]
            right_upper = image_matrix[math.floor(old_y), min(image_matrix.shape[1] - 1, math.ceil(old_x))]
            left_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), math.floor(old_x)]
            right_lower = image_matrix[min(image_matrix.shape[0] - 1, math.ceil(old_y)), min(image_matrix.shape[1] - 1, math.ceil(old_x))]

            # Interpolate horizontally (each sample weighted by its proximity):
            blend_top = (right_upper * x_fraction) + (left_upper * (1.0 - x_fraction))
            blend_bottom = (right_lower * x_fraction) + (left_lower * (1.0 - x_fraction))
            # Interpolate vertically (the top row gets weight 1 - y_fraction):
            final_blend = (blend_top * (1.0 - y_fraction)) + (blend_bottom * y_fraction)
            output_image[new_y, new_x] = final_blend

    return output_image
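A quick usage sketch for the function above (the input array is a placeholder):

import numpy as np

gray = np.random.rand(100, 150)         # placeholder grayscale image
half = resize_linear(gray, 50, 75)      # downscale by half
larger = resize_linear(gray, 125, 188)  # upscale by one and one quarter
print(half.shape, larger.shape)         # (50, 75) (125, 188)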
Sample rescaling (images omitted): the original, a version downscaled by half, and a version upscaled by one and one quarter.
Are there any libraries to do this in numpy/SciPy?
Sure. You can do this without OpenCV, scikit-image or PIL.
Image resizing is basically mapping the coordinates of each pixel from the original image to its resized position.
Since the coordinates of an image must be integers (think of it as a matrix), if the mapped coordinate has decimal values, you should interpolate the pixel value to approximate it to the integer position (e.g. getting the nearest pixel to that position is known as Nearest neighbor interpolation).
All you need is a function that does this interpolation for you. SciPy has interpolate.interp2d.
You can use it to resize an image in a numpy array, say arr, as follows:
import numpy as np
from scipy.interpolate import interp2d

W, H = arr.shape[:2]
new_W, new_H = (600, 300)
xrange = lambda x: np.linspace(0, 1, x)

f = interp2d(xrange(W), xrange(H), arr, kind="linear")
new_arr = f(xrange(new_W), xrange(new_H))
Of course, if your image is RGB, you have to perform the interpolation for each channel.
If you would like to understand more, I suggest watching Resizing Images - Computerphile.
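Following up on the per-channel remark, a minimal sketch (the input array is a placeholder; note that interp2d is deprecated in recent SciPy releases):

import numpy as np
from scipy.interpolate import interp2d

arr = np.random.rand(300, 200, 3)  # placeholder RGB image, H x W x C
new_W, new_H = (600, 300)

H, W = arr.shape[:2]
xrange = lambda x: np.linspace(0, 1, x)

channels = []
for c in range(arr.shape[2]):
    f = interp2d(xrange(W), xrange(H), arr[:, :, c], kind="linear")
    channels.append(f(xrange(new_W), xrange(new_H)))

new_arr = np.stack(channels, axis=-1)
print(new_arr.shape)  # (new_H, new_W, 3)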
import cv2
import numpy as np

image_read = cv2.imread('filename.jpg', 0)
original_image = np.asarray(image_read)
width, height = 452, 452

# uint8 so the 0-255 intensities are displayed correctly by cv2.imshow
resize_image = np.zeros(shape=(width, height), dtype=np.uint8)

# nearest-neighbour style resize: map each output pixel back to a source pixel
for W in range(width):
    for H in range(height):
        new_width = int(W * original_image.shape[0] / width)
        new_height = int(H * original_image.shape[1] / height)
        resize_image[W][H] = original_image[new_width][new_height]

print("Resized image size : ", resize_image.shape)

cv2.imshow('resized image', resize_image)  # cv2.imshow needs a window name
cv2.waitKey(0)

How can I resize a mask and RGB image to match by cropping out unwanted regions in both images

I am working on a cell counting project with a histology dataset of RGB images and their corresponding masks. However, I have been stuck for over a week on resizing the RGB and mask images to just the FOV, by cropping out the regions of zero pixels (which are clearly visible on the masks) without affecting the annotations within. Any suggestions would be beneficial. A screenshot of the images I obtained is shown below:
** My Code **
# Data Path
IMAGE_PATH = '/content/drive/MyDrive/dissertation/QCed single-rater dataset/rgb/'
MASKS_PATH = '/content/drive/MyDrive/dissertation/QCed single-rater dataset/mask/'
TRUE_LABEL_PATH = '/content/drive/MyDrive/dissertation/QCed single-rater dataset/visualization/'
dataset_path = '/content/drive/MyDrive/dissertation/NuCLS_dataset/'

# Get train and test IDs
image_ids = sorted(os.listdir(IMAGE_PATH))      # next(os.walk(IMAGE_PATH))[2]
mask_ids = sorted(os.listdir(MASKS_PATH))       # next(os.walk(MASKS_PATH))[2]
true_ids = sorted(os.listdir(TRUE_LABEL_PATH))  # next(os.walk(MASKS_PATH))[2]

# training data
train_data = train_imgs[:int(train_imgs.shape[0] * 0.85)]                # training data = 85% of train_imgs
train_mask = np.squeeze(train_masks[:int(train_masks.shape[0] * 0.85)])  # train mask
train_label = true_labels[:int(true_labels.shape[0] * 0.85)]             # training labels = 85% of true_labels

# validation data
val_data = train_imgs[int(train_imgs.shape[0] * 0.85):int(train_imgs.shape[0] * 0.95)]                # validation data = 10% of train_imgs
val_mask = np.squeeze(train_masks[int(train_masks.shape[0] * 0.85):int(train_imgs.shape[0] * 0.95)])  # val mask
val_label = true_labels[int(true_labels.shape[0] * 0.85):int(true_labels.shape[0] * 0.95)]

# test data
test_data = train_imgs[int(train_imgs.shape[0] * 0.95):]                # test data = 5% of train_imgs
test_mask = np.squeeze(train_masks[int(train_masks.shape[0] * 0.95):])  # test mask
test_label = true_labels[int(true_labels.shape[0] * 0.95):]

print(val_mask.shape)
(174, 256, 256, 3)

ix = 0
for ix in range(0, 5):
    print('Training example No.', ix)
    fig = plt.figure(figsize=(16, 16))
    plt.subplot(131).set_title('Original Image')
    plt.imshow(test_data[ix])
    plt.subplot(132).set_title('Mask (Target)')
    plt.imshow(test_mask[ix])
    plt.subplot(133).set_title('True Label')
    plt.imshow(test_label[ix])
    # plt.savefig(base_path + 'fig- Sanity check on training dataset no {}.png'.format(ix))
    plt.show()
    ix += 1
Additional Information
I also have a CSV file containing the dimensions of the purple region, which I want both RGB and mask to be resized to. I am just stuck with implementing this on the RGB and mask images.
** My Answer **
Here is how I resolved this issue for anyone who might face similar challenges.
Firstly, I ensured my file names match those in the CSV by simply adding the suffix '.png' to the fovname column of the CSV.
df['fovname'] = df['fovname'].astype(str)+ '.png'
print (list(df['fovname']))
Then, cropping the images with the appropriate FOV coordinates solved the issue.
# RGB Images
for x in image_ids:
    im = Image.open(IMAGE_PATH + x)
    DF = df.loc[df['fovname'] == x]
    DF = DF.drop_duplicates()
    xmin = DF['xmin']
    ymin = DF['ymin']
    xmax = DF['xmax']
    ymax = DF['ymax']
    print(x)
    im = im.crop((xmin, ymin, xmax, ymax))
    data_path = '/content/drive/MyDrive/dissertation/NuCLS_dataset/NEW/RGB/'
    im.save(data_path + '{}'.format(x))
    print('Saved')
    # plt.imshow(im)
I think if you crop to the parts of the images you are interested in, then imshow() will zoom to show them in as much space as is available.
Cropping is discussed in a previous question Cropping image by the center.
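If the crop box is not known in advance, one way to derive it from the mask itself (a sketch, not from the answers above; it assumes the unwanted region is exactly zero in the mask) is to take the bounding box of the non-zero mask pixels and apply the same box to both images:

import numpy as np
from PIL import Image

mask = np.array(Image.open('mask.png'))  # placeholder paths
rgb = np.array(Image.open('rgb.png'))

# Bounding box of all non-zero mask pixels
mask_2d = mask.sum(axis=-1) if mask.ndim == 3 else mask
nonzero = np.argwhere(mask_2d > 0)
(ymin, xmin), (ymax, xmax) = nonzero.min(axis=0), nonzero.max(axis=0) + 1

# Crop both the mask and the RGB image with the same box
mask_cropped = mask[ymin:ymax, xmin:xmax]
rgb_cropped = rgb[ymin:ymax, xmin:xmax]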

Pytorch transforms.Compose usage for pair of images in segmentation tasks

I'm trying to use the transforms.Compose() in my segmentation task. But I'm not sure how to use the same (almost) random transforms for both the image and the mask.
So in my segmentation task, I have the raw picture and the corresponding mask, and I'd like to generate more randomly transformed image pairs for training purposes. Meaning, if I apply some transform to my raw pictures, the same transformation should also happen to my mask pictures, and then this pair can go into my CNN. My transformer is something like:
train_transform = transforms.Compose([
    transforms.Resize(512),  # resize, the smaller edge will be matched.
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(90),
    transforms.RandomResizedCrop(320, scale=(0.3, 1.0)),
    AddGaussianNoise(0., 1.),
    transforms.ToTensor(),  # convert a PIL image or ndarray to tensor.
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))  # normalize to ImageNet mean and std
])

mask_transform = transforms.Compose([
    transforms.Resize(512),  # resize, the smaller edge will be matched.
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomVerticalFlip(p=0.5),
    transforms.RandomRotation(90),
    transforms.RandomResizedCrop(320, scale=(0.3, 1.0)),
    ## ---------------------!------------------
    transforms.ToTensor(),  # convert a PIL image or ndarray to tensor.
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))  # normalize to ImageNet mean and std
])
Notice that in the code block I added a class (AddGaussianNoise) that adds random noise to the raw image transformation; it is not in mask_transform, because I want my mask images to follow the raw image transformation but ignore the random noise. So how can these two transformations happen in pairs (with the same random behaviour)?
This seems to have an answer here: How to apply same transform on a pair of picture.
Basically, you can use the torchvision functional API to get a handle to the randomly generated parameters of a random transform such as RandomCrop. Then call torchvision.transforms.functional.crop() on both images with the same parameter values. It seems a bit lengthy but gets the job done. You can skip some transforms on some images, as per your need.
Another option that I've seen elsewhere is to re-seed the random generator with the same seed, to force generation of the same random transformations twice. I would think that such implementations are hacky and keep changing with pytorch versions (e.g. whether to re-seed np.random, random, or torch.manual_seed() ?)
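A minimal sketch of the functional-API idea for a random crop specifically (the output size is a placeholder, not from the linked answer):

import torchvision.transforms as T
import torchvision.transforms.functional as TF

def paired_random_crop(image, mask, output_size=(320, 320)):
    # Draw the crop parameters once...
    i, j, h, w = T.RandomCrop.get_params(image, output_size=output_size)
    # ...and apply exactly the same crop to both the image and the mask
    image = TF.crop(image, i, j, h, w)
    mask = TF.crop(mask, i, j, h, w)
    return image, mask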
Sabyasachi's answer is really helpful for me, and I was able to use the transformer in PyTorch to transform my images. This usage of torchvision.transforms is not the most straightforward way of transforming images, though, so I'm adding my solution, which has an example of using torchvision.transforms.functional together with skimage.filters; lots of transform functions are available here: https://scikit-image.org/docs/dev/api/skimage.filters.html#skimage.filters.unsharp_mask.
import random

import numpy as np
import matplotlib.pyplot as plt
import torchvision.transforms as transforms
import torchvision.transforms.functional as TF
from skimage.filters import gaussian
from skimage.filters import unsharp_mask

def transformer(image, mask):
    # image and mask are PIL image objects.
    img_w, img_h = image.size

    # Random horizontal flipping
    if random.random() > 0.5:
        image = TF.hflip(image)
        mask = TF.hflip(mask)

    # Random vertical flipping
    if random.random() > 0.5:
        image = TF.vflip(image)
        mask = TF.vflip(mask)

    # Random affine, applied with the same parameters to both image and mask
    affine_param = transforms.RandomAffine.get_params(
        degrees=[-180, 180], translate=[0.3, 0.3],
        img_size=[img_w, img_h], scale_ranges=[1, 1.3],
        shears=[2, 2])
    image = TF.affine(image,
                      affine_param[0], affine_param[1],
                      affine_param[2], affine_param[3])
    mask = TF.affine(mask,
                     affine_param[0], affine_param[1],
                     affine_param[2], affine_param[3])

    image = np.array(image)
    mask = np.array(mask)

    # Random Gaussian blur -- only for images
    if random.random() < 0.25:
        sigma_param = random.uniform(0.01, 1)
        image = gaussian(image, sigma=sigma_param)

    # Random Gaussian noise -- only for images
    if random.random() < 0.25:
        factor_param = random.uniform(0.01, 0.5)
        image = image + factor_param * image.std() * np.random.randn(image.shape[0], image.shape[1])

    # Unsharp filter -- only for images
    if random.random() < 0.25:
        radius_param = random.uniform(0, 5)
        amount_param = random.uniform(0.5, 2)
        image = unsharp_mask(image, radius=radius_param, amount=amount_param)

    f, ax = plt.subplots(1, 2, figsize=(8, 8))
    ax[0].imshow(image)
    ax[1].imshow(mask)

    return image, mask
I think I have a simple solution:
If the images are concatenated, the transformations are applied to all of them identically:
import torch
import torchvision.transforms as T
# Create two fake images (identical for test purposes):
image = torch.randn((3, 128, 128))
target = image.clone()
# This is the trick (concatenate the images):
both_images = torch.cat((image.unsqueeze(0), target.unsqueeze(0)),0)
# Apply the transformations to both images simultaneously:
transformed_images = T.RandomRotation(180)(both_images)
# Get the transformed images:
image_trans = transformed_images[0]
target_trans = transformed_images[1]
# Compare the transformed images:
torch.all(image_trans == target_trans).item()
>> True

Resample DICOM Images to Size and Spacing and align to same Origin

I have a set of 4 DICOM CT Volumes which I am reading with SimpleITK ImageSeriesReader. Two of the images represent the CT of patient before and after the surgery. The other two images are binary segmentation masks segmented on the former 2 CT images. The segmentations are a ROI of their source CT.
All the 4 CT images, have different Size, Spacing, Origin and Direction. I have tried applying this GitHub gist https://gist.github.com/zivy/79d7ee0490faee1156c1277a78e4a4c4 to resize my images to 512x512x512 and Spacing 1x1x1. However, it doesn't place the images at the correct location. The segmented structure is always placed in the center of the CT image, instead of the correct location, as you can see from the pictures.
This is my "raw" DICOM image with its tumor segmentation (orange blob).
This is after the "resizing" algorithm and writing to disk (same image as before; the tumor just appears as a green blob instead because of a coloring inconsistency):
Code used for resampling all 4 DICOM Volumes to the same dimensions:
def resize_resample_images(images):
    """ Resize all the images to the same dimensions, spacing and origin.
        Usage: newImage = resize_image(source_img_plan, source_img_validation, ROI(ablation/tumor)_mask)
        1. translate to same origin
        2. largest number of slices and interpolate the others.
        3. same resolution 1x1x1 mm3 - resample
        4. (physical space)
        Slice Thickness (0018,0050)
        ImagePositionPatient (0020,0032)
        ImageOrientationPatient (0020,0037)
        PixelSpacing (0028,0030)
        Frame Of Reference UID (0020,0052)
    """
    # %% Define tuple to store the images
    tuple_resized_imgs = collections.namedtuple('tuple_resized_imgs',
                                                ['img_plan',
                                                 'img_validation',
                                                 'ablation_mask',
                                                 'tumor_mask'])
    # %% Create reference image with zero origin, identity direction cosine matrix and isotropic dimension
    dimension = images.img_plan.GetDimension()
    reference_direction = np.identity(dimension).flatten()
    reference_size = [512] * dimension
    reference_origin = np.zeros(dimension)
    data = [images.img_plan, images.img_validation, images.ablation_mask, images.tumor_mask]

    reference_spacing = np.ones(dimension)  # resize to isotropic size
    reference_image = sitk.Image(reference_size, images.img_plan.GetPixelIDValue())
    reference_image.SetOrigin(reference_origin)
    reference_image.SetSpacing(reference_spacing)
    reference_image.SetDirection(reference_direction)
    reference_center = np.array(
        reference_image.TransformContinuousIndexToPhysicalPoint(np.array(reference_image.GetSize()) / 2.0))

    # %% Paste the GT segmentation masks before transformation
    tumor_mask_paste = (paste_roi_image(images.img_plan, images.tumor_mask))
    ablation_mask_paste = (paste_roi_image(images.img_validation, images.ablation_mask))
    images.tumor_mask = tumor_mask_paste
    images.ablation_mask = ablation_mask_paste

    # %% Apply transforms
    data_resized = []
    for idx, img in enumerate(data):
        transform = sitk.AffineTransform(dimension)  # use affine transform with 3 dimensions
        transform.SetMatrix(img.GetDirection())  # set the cosine direction matrix
        # TODO: check translation when computing the segmentations
        transform.SetTranslation(np.array(img.GetOrigin()) - reference_origin)  # set the translation
        # Modify the transformation to align the centers of the original and reference image instead of their origins.
        centering_transform = sitk.TranslationTransform(dimension)
        img_center = np.array(img.TransformContinuousIndexToPhysicalPoint(np.array(img.GetSize()) / 2.0))
        centering_transform.SetOffset(np.array(transform.GetInverse().TransformPoint(img_center) - reference_center))
        centered_transform = sitk.Transform(transform)
        centered_transform.AddTransform(centering_transform)
        # Using the linear interpolator as these are intensity images; if there is a need to resample a ground truth
        # segmentation then the segmentation image should be resampled using the NearestNeighbor interpolator so that
        # no new labels are introduced.
        if (idx == 1 or idx == 2):  # temporary solution to resample the GT image with NearestNeighbour
            resampled_img = sitk.Resample(img, reference_image, centered_transform, sitk.sitkNearestNeighbor, 0.0)
        else:
            resampled_img = sitk.Resample(img, reference_image, centered_transform, sitk.sitkLinear, 0.0)
        # append to list
        data_resized.append(resampled_img)

    # assuming the order stays the same, reassign back to tuple
    resized_imgs = tuple_resized_imgs(img_plan=data_resized[0],
                                      img_validation=data_resized[1],
                                      ablation_mask=data_resized[2],
                                      tumor_mask=data_resized[3])
Code for "pasting" the ROI segmentations images into a correct size. Might be redundant.:
def paste_roi_image(image_source, image_roi):
    """ Resize ROI binary mask to size, dimension, origin of its source/original img.
        Usage: newImage = paste_roi_image(source_img_plan, roi_mask)
    """
    newSize = image_source.GetSize()
    newOrigin = image_source.GetOrigin()
    newSpacing = image_roi.GetSpacing()
    newDirection = image_roi.GetDirection()
    if image_source.GetSpacing() != image_roi.GetSpacing():
        print('the spacing of the source and derived mask differ')
    # re-cast the pixel type of the roi mask
    pixelID = image_source.GetPixelID()
    caster = sitk.CastImageFilter()
    caster.SetOutputPixelType(pixelID)
    image_roi = caster.Execute(image_roi)
    # black 3D image
    outputImage = sitk.Image(newSize, image_source.GetPixelIDValue())
    outputImage.SetOrigin(newOrigin)
    outputImage.SetSpacing(newSpacing)
    outputImage.SetDirection(newDirection)
    # transform from physical point to index the origin of the ROI image
    # img_center = np.array(img.TransformContinuousIndexToPhysicalPoint(np.array(img.GetSize()) / 2.0))
    destinationIndex = outputImage.TransformPhysicalPointToIndex(image_roi.GetOrigin())
    # paste the roi mask into the re-sized image
    pasted_img = sitk.Paste(outputImage, image_roi, image_roi.GetSize(), destinationIndex=destinationIndex)
    return pasted_img
