I have preprocessed some DICOM images to feed into a neural network, but at the image augmentation step the image data generator expects a 4D input while my data is 3D, e.g. (200, 420, 420).
I tried reshaping the array and expanding its dimensions, but in both cases I can no longer plot the individual images in the array (imshow expects an image of shape (420, 420), while my new images have shape (420, 420, 1)).
Here is my code.
I have three functions to convert the DICOM images into images with good contrast.
This one converts to Hounsfield units:
def transform_to_hu(medical_image, image):
    intercept = medical_image.RescaleIntercept
    slope = medical_image.RescaleSlope
    hu_image = image * slope + intercept
    return hu_image
This one applies windowing to the HU values:
def window_image(image, window_center, window_width):
    img_min = window_center - window_width // 2
    img_max = window_center + window_width // 2
    window_image = image.copy()
    window_image[window_image < img_min] = img_min
    window_image[window_image > img_max] = img_max
    return window_image
And this function loads the image:
def load_image(file_path):
    medical_image = dicom.read_file(file_path)
    image = medical_image.pixel_array
    hu_image = transform_to_hu(medical_image, image)
    brain_image = window_image(hu_image, 40, 80)
    return brain_image
Then I load my images:
files = sorted(glob.glob('F:\CT_Data_Classifier\*.dcm'))
images = np.array([load_image(path) for path in files])
images.shape returns (200, 512, 512)
Everything is fine with the data; for example, I can plot the 100th image with
plt.imshow(images[100])
and it shows the image.
I then feed the data into an ImageDataGenerator:
train_image_data = ImageDataGenerator(
    rescale=1./255,
    shear_range=0.,
    zoom_range=0.05,
    rotation_range=180,
    width_shift_range=0.05,
    height_shift_range=0.05,
    horizontal_flip=True,
    vertical_flip=True,
    fill_mode='constant',
    cval=0)
But then, when I try to plot with this code:
plt.figure(figsize=(12, 12))
for X_batch, y_batch in train_image_data.flow(trainX, trainY, batch_size=9):
    for i in range(0, 9):
        plt.subplot(330 + 1 + i)
        plt.imshow(X_batch[i])
    plt.show()
    break
it returns
(ValueError: ('Input data in "NumpyArrayIterator" should have rank 4. You passed an array with shape', (162, 420, 420)))
I tried expand_dims and reshape to add an extra dimension at the end of the array to represent channels,
but then it returns
TypeError: Invalid shape (420, 420, 1) for image data
at the plt.imshow stage.
I'm a doctor, not an experienced programmer, so I would really appreciate your help. Cheers.
You are correct in adding an extra dimension to represent channels. That part seems fine. The problem is with plotting. For that, you can use:
plt.matshow(x[..., 0]).
where x is the 3D array. The syntax x[..., 0] means take index 0 of the last dimension of array x. The ellipsis (...) is shorthand to fill in the dimensions. For a 3D array, the equivalent call would be x[:, :, 0].
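Putting the two together, a minimal sketch (assuming ImageDataGenerator comes from keras.preprocessing.image, and that trainX/trainY are the arrays from the question with one label per slice):
import numpy as np
import matplotlib.pyplot as plt
from keras.preprocessing.image import ImageDataGenerator

# add a channel axis: (N, 420, 420) -> (N, 420, 420, 1)
trainX = np.expand_dims(images, axis=-1)

train_image_data = ImageDataGenerator(rescale=1./255, rotation_range=180)

plt.figure(figsize=(12, 12))
for X_batch, y_batch in train_image_data.flow(trainX, trainY, batch_size=9):
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        # drop the channel axis again (equivalent to np.squeeze) only for plotting
        plt.imshow(X_batch[i][..., 0], cmap='gray')
    plt.show()
    break
The extra dimension is only a problem for imshow; the generator needs it, so keep the data 4D and slice off the channel just when plotting.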
I'm training a YOLO model using cv2.dnn and blobFromImage. I have a DataFrame with all the image paths, which I iterate over to obtain the features through blobFromImage. So far, I have this:
for i in df.iloc:
    img = cv2.imread(str(i[8]))
    height, width, shape = img.shape
    blob = cv2.dnn.blobFromImage(img, 1/255, (416, 416), (0, 0, 0), True, crop=False)  # extract features. Normalize and resize. Swap RGB colours
    print(blob.shape)
    net = cv2.dnn.readNet(path_cfg, path_weights)
    layer_names = net.getLayerNames()
    outputlayers = [layer_names[i[0] - 1] for i in net.getUnconnectedOutLayers()]
    net.setInput(blob)
    outs = net.forward(outputlayers)
All my images are of shape (1024, 1024, 3). When I pass the df into the code, blob.shape is (1, 3, 416, 416) in the majority of cases. However, for some images it comes out with another size, such as (1, 3, 814, 450). The interesting thing is that if I create a df1 containing only that specific image path and pass it into the loop, the shape of the blob turns out correctly as (1, 3, 416, 416). Therefore, I'm assuming that it takes some values from the previously passed images.
I would highly appreciate any help explaining why this is happening and how to solve it, so that all blobs are of shape (1, 3, 416, 416).
Many thanks in advance.
I expect all blobs to have shape (1, 3, 416, 416). Some turn out different, although all the original images have the same shape.
In a Coursera guided project that I was doing, the instructor used
from skimage.transform import rescale
image_rescaled = rescale(rescale(image,0.5),2.0)
to distort the image.
The error that occurs on my own device (and that didn't arise in the project's Jupyter notebook, probably due to differences in module and Python versions) is that image_rescaled's number of channels increases by 1.
e.g. images_normal.shape = (256, 256, 256, 3) and images_with_twice_reshape.shape = (256, 256, 256, 4)
This issue doesn't come up if I use rescale(rescale(image, 2.0), 0.5).
Is this intended in a newer version of Python/skimage, or am I doing something wrong?
For additional reference (I didn't delete anything from the source code but highlighted important parts with #s):
import os
import re
from scipy import ndimage, misc
from skimage.transform import resize, rescale
from matplotlib import pyplot
import numpy as np
def train_batches(just_load_dataset=False):
    batches = 256  # Number of images to have at the same time in a batch
    batch = 0  # Number of images in the current batch (grows over time and then resets for each batch)
    batch_nb = 0  # Batch current index
    ep = 4  # Number of epochs
    images = []
    x_train_n = []
    x_train_down = []
    x_train_n2 = []  # Resulting high res dataset
    x_train_down2 = []  # Resulting low res dataset
    for root, dirnames, filenames in os.walk("data/cars_train.nosync"):
        for filename in filenames:
            if re.search("\.(jpg|jpeg|JPEG|png|bmp|tiff)$", filename):
                filepath = os.path.join(root, filename)
                image = pyplot.imread(filepath)
                if len(image.shape) > 2:
                    image_resized = resize(image, (256, 256))  # Resize the image so that every image is the same size
                    #########################
                    x_train_n.append(image_resized)  # Add this image to the high res dataset
                    x_train_down.append(rescale(rescale(image_resized, 0.5), 2.0))  # Rescale it 0.5x and 2x so that it is a low res image but still has 256x256 resolution
                    ########################
                    # >>>> x_train_down.append(rescale(rescale(image_resized, 2.0), 0.5)), this one works and gives the same shape for x_train_down and x_train_n.
                    ########################
                    batch += 1
                    if batch == batches:
                        batch_nb += 1
                        x_train_n2 = np.array(x_train_n)
                        x_train_down2 = np.array(x_train_down)
                        if just_load_dataset:
                            return x_train_n2, x_train_down2
                        print('Training batch', batch_nb, '(', batches, ')')
                        autoencoder.fit(x_train_down2, x_train_n2,
                                        epochs=ep,
                                        batch_size=10,
                                        shuffle=True,
                                        validation_split=0.15)
                        x_train_n = []
                        x_train_down = []
                        batch = 0
    return x_train_n2, x_train_down2
And with the above code, I get x_train_n2.shape = (256,256,256,3) and x_train_down2.shape=(256,256,256,4).
I was able to reproduce your issue as follows:
import numpy as np
from skimage.transform import resize, rescale
image = np.random.random((512, 512, 3))
resized = resize(image, (256, 256))
rescaled2x = rescale(
    rescale(resized, 0.5),
    2,
)
print(rescaled2x.shape)
# prints (256, 256, 4)
The problem is that resize can infer that your final dimension is channels/RGB, because you give it a 2D shape. rescale, on the other hand, treats your array as a 3D image of shape (256, 256, 3), which goes down to (128, 128, 2), interpolating along the colors as well, as if they were another spatial dimension, and then upsampling to (256, 256, 4).
If you look at the rescale documentation, you'll find the "multichannel" parameter, described as:
Whether the last axis of the image is to be interpreted as multiple channels or another spatial dimension.
So, updating my code:
rescaled2x = rescale(
    rescale(resized, 0.5, multichannel=True),
    2,
    multichannel=True,
)
print(rescaled2x.shape)
# prints (256, 256, 3)
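If your scikit-image version no longer accepts multichannel (newer releases replaced it with a channel_axis parameter), the same fix should look roughly like this:
rescaled2x = rescale(
    rescale(resized, 0.5, channel_axis=-1),
    2,
    channel_axis=-1,
)
print(rescaled2x.shape)
# should likewise print (256, 256, 3)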
I have the following code that reads an image with opencv and displays it:
import cv2, matplotlib.pyplot as plt
img = cv2.imread('imgs_soccer/soccer_10.jpg',cv2.IMREAD_COLOR)
img = cv2.resize(img, (128, 128))
plt.imshow(img)
plt.show()
I want to generate some augmented images using Keras, so I define this generator:
image_gen = ImageDataGenerator(rotation_range=15,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               shear_range=0.01,
                               zoom_range=[0.9, 1.25],
                               horizontal_flip=True,
                               vertical_flip=False,
                               fill_mode='reflect',
                               data_format='channels_last',
                               brightness_range=[0.5, 1.5])
But when I use it in this way:
image_gen.flow(img)
I get this error:
'Input data in `NumpyArrayIterator` should have rank 4. You passed an array with shape', (128, 128, 3))
And it seems obvious to me: RGB, an image, of course it has 3 dimensions!
What am I missing here?
The documentation says that it wants a 4-dim array, but does not specify what I should put in the 4th dimension.
And how should this 4-dim array be built? I currently have (width, height, channels); does the 4th dimension go at the start or at the end?
I am also not very familiar with numpy: how can I alter the existing img array to add a 4th dimension?
Use np.expand_dims():
import numpy as np
img = np.expand_dims(img, 0)
print(img.shape) # (1, 128, 128, 3)
The first dimension specifies the number of images (in your case 1 image).
Alternatively, you can use numpy.newaxis or None to promote your 3D array to 4D, as in:
img = img[np.newaxis, ...]
# or use None
img = img[None, ...]
The first dimension is usually the batch_size. This gives you a lot of flexibility when you want to fully utilize modern hardware such as GPUs, as long as your tensor fits in GPU memory. For example, you can pass 64 images by stacking them along the first dimension. In that case, your 4D array would have shape (64, width, height, channels).
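As a small sketch of what building such a batch looks like (imgs here is just a hypothetical list of equally sized images):
import numpy as np

imgs = [np.random.randint(0, 256, (128, 128, 3), dtype=np.uint8) for _ in range(64)]
batch = np.stack(imgs, axis=0)  # stack along a new first (batch) dimension
print(batch.shape)  # (64, 128, 128, 3)

# image_gen.flow(batch, batch_size=32) would then iterate over this 4D array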
I'm using a set of 32x32x32 grayscale images and I want to apply random rotations on the images as a part of data augmentation while training a CNN by tflearn + tensorflow. I was using the following code to do so:
# Real-time data preprocessing
img_prep = ImagePreprocessing()
img_prep.add_featurewise_zero_center()
img_prep.add_featurewise_stdnorm()
# Real-time data augmentation
img_aug = ImageAugmentation()
img_aug.add_random_rotation(max_angle=360.)
# Input data
with tf.name_scope('Input'):
    X = tf.placeholder(tf.float32, shape=(None, image_size,
                                          image_size, image_size, num_channels), name='x-input')
    Y = tf.placeholder(tf.float32, shape=(None, label_cnt), name='y-input')

# Convolutional network building
network = input_data(shape=[None, 32, 32, 32, 1],
                     placeholder=X,
                     data_preprocessing=img_prep,
                     data_augmentation=img_aug)
(I'm using a combination of TensorFlow and tflearn to be able to use features from both, so please bear with me. Let me know if something is wrong with the way I'm using placeholders, etc.)
I found that add_random_rotation (which itself uses scipy.ndimage.interpolation.rotate) treats the third dimension of my grayscale images as channels (like RGB channels) and rotates all 32 slices of the third dimension by the same random angle around the z-axis (it treats my 3D image as a 2D image with 32 channels). But I want the image to be rotated in space (around all three axes). Do you have any idea how I can do that? Is there a function or package for easily rotating 3D images in space?
import random

import numpy as np
import scipy.ndimage


def random_rotation_3d(batch, max_angle):
    """Randomly rotate each 3D image in a batch by random angles in (-max_angle, max_angle).

    Arguments:
        max_angle: `float`. The maximum rotation angle.

    Returns:
        batch of rotated 3D images
    """
    size = batch.shape
    batch = np.squeeze(batch)
    batch_rot = np.zeros(batch.shape)
    for i in range(batch.shape[0]):
        if bool(random.getrandbits(1)):
            image1 = np.squeeze(batch[i])
            # rotate along z-axis
            angle = random.uniform(-max_angle, max_angle)
            image2 = scipy.ndimage.interpolation.rotate(image1, angle, mode='nearest', axes=(0, 1), reshape=False)
            # rotate along y-axis
            angle = random.uniform(-max_angle, max_angle)
            image3 = scipy.ndimage.interpolation.rotate(image2, angle, mode='nearest', axes=(0, 2), reshape=False)
            # rotate along x-axis
            angle = random.uniform(-max_angle, max_angle)
            batch_rot[i] = scipy.ndimage.interpolation.rotate(image3, angle, mode='nearest', axes=(1, 2), reshape=False)
        else:
            batch_rot[i] = batch[i]
    return batch_rot.reshape(size)
It is more difficult to incorporate into ImageAugmentation(), but scipy.ndimage.rotate by default rotates 3D images correctly and takes an axes argument which specifies the plane of rotation (https://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.interpolation.rotate.html). Rotating around the first axis (x) means you pass axes=(1, 2); to rotate around the second axis (y) use axes=(0, 2).
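For instance, on a single (32, 32, 32) volume the three planes of rotation would look roughly like this:
import numpy as np
from scipy import ndimage

volume = np.random.random((32, 32, 32))

# rotate 30 degrees in the (y, z) plane, i.e. around the x-axis
rot_x = ndimage.rotate(volume, 30, axes=(1, 2), reshape=False, mode='nearest')
# rotate 30 degrees in the (x, z) plane, i.e. around the y-axis
rot_y = ndimage.rotate(volume, 30, axes=(0, 2), reshape=False, mode='nearest')
# rotate 30 degrees in the (x, y) plane, i.e. around the z-axis
rot_z = ndimage.rotate(volume, 30, axes=(0, 1), reshape=False, mode='nearest')

print(rot_x.shape, rot_y.shape, rot_z.shape)  # all stay (32, 32, 32) because reshape=False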
If you want to rotate any 3D image around the center and keep it centered, use scipy affine_transform with an offset, as follows:
import numpy as np
from scipy.ndimage import affine_transform

# create a 3D image
image = np.random.random((20, 20, 20))

# output shape
output_shape = np.array(image.shape)

# rotation matrix around z axis
theta = 0.01
cosine = np.cos(theta)
sinus = np.sin(theta)
M = np.array([[cosine, -sinus, 0],
              [sinus, cosine, 0],
              [0, 0, 1]])

# offset so the volume stays centred
offset = (np.array(image.shape) - M.dot(np.array(output_shape)))
offset = offset / 2.0  # it is important

# affine transformation
f_data = affine_transform(np.asarray(image), np.asarray(M),
                          output_shape=output_shape, offset=offset)
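The same idea can be wrapped into a small helper that takes the angle in degrees (rotate_z is just an illustrative name, not a library function):
import numpy as np
from scipy.ndimage import affine_transform

def rotate_z(image, angle_deg):
    """Rotate a 3D volume around the z-axis about its center, reusing the offset trick above."""
    theta = np.deg2rad(angle_deg)
    M = np.array([[np.cos(theta), -np.sin(theta), 0],
                  [np.sin(theta), np.cos(theta), 0],
                  [0, 0, 1]])
    output_shape = np.array(image.shape)
    offset = (np.array(image.shape) - M.dot(output_shape)) / 2.0
    return affine_transform(image, M, output_shape=output_shape, offset=offset)

rotated = rotate_z(np.random.random((20, 20, 20)), 15)
print(rotated.shape)  # (20, 20, 20)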
I want to adjust the brightness of an input image that is fed into a Keras model. The data is supplied by a simulator and fed into the model in real time, so I need a way to adjust the image data in the model itself. I am currently using my own layer with OpenCV to perform the task, but I am getting the following error.
File "/usr/lib/python3/dist-packages/numpy/core/_methods.py", line 70, in _mean
ret = ret.dtype.type(ret / rcount)
AttributeError: 'DType' object has no attribute 'type'
The issue appears to be with 'gamma = np.median(img) / 25' and the code trying to do numpy maths on a 'tensorflow.python.framework.ops.Tensor'.
My class code is:
class ImageLayer(Layer):
    def __init__(self, **kwargs):
        super(ImageLayer, self).__init__(**kwargs)

    def call(self, img, mask=None):
        print(type(img))
        # adjust the image brightness to help normalise dark and light images
        gamma = np.median(img) / 25
        if gamma > 5.:
            gamma = 5
        elif gamma < 0.5:
            gamma = 0.5
        # build a lookup table mapping the pixel values [0, 255] to
        # their adjusted gamma values
        # http://www.pyimagesearch.com/2015/10/05/opencv-gamma-correction/
        invGamma = 1.0 / gamma
        table = np.array([((i / 255.0) ** invGamma) * 255
                          for i in np.arange(0, 256)]).astype("uint8")
        # apply gamma correction using the lookup table
        return cv2.LUT(img, table)
The model calls the class as follows:
inputs = Input(shape=(160, 320, 3), dtype='int8')
x = Cropping2D(cropping=((50,0), (0,0)), input_shape=(160, 320, 3), dim_ordering='tf')(inputs)
x = ImageLayer()(x)
x = BatchNormalization(epsilon=0.001, mode=0, axis=2, momentum=0.99)(x)
Is it possible to do what I want to do?
Is it possible to perform numpy arithmetic in Keras? I know that you can in TensorFlow with .eval().
As far as I can see, you take an image and then 1. crop it, 2. change the brightness, and then feed it into your model. So instead of defining the Input layer with shape (160, 320, 3), why don't you define one with the shape you will get after cropping and changing brightness? Then define the rest of your model as usual. If you do this, then instead of writing a layer you will only have to write your own generator, in which you can change brightness/crop etc. using normal OpenCV/Python/numpy. For example, see my post on how to define a multi-threaded generator capable of working with multiple workers.
Do not do this if you want to treat the change in brightness as a learnable parameter or include it in backpropagation. In other words, use the above technique only if the brightness change is a pre-processing operation and has nothing to do with how you learn.
A simple generator (works with only 1 worker) on MNIST data is given below which fetches 32 images at a time. You may include your brightness change operation immediately after you read the image. Treat this code only as a skeleton. I have not defined all the variables and it will not work out of the box.
def myGenerator():  # write the definition of your data generator
    while True:
        count = 0
        for i in range(len(allImgFilenames)):
            if count == 0:
                imgBatch = np.empty((batchSize, 3, 32, 32), dtype=float)
                labelsBatch = np.empty((batchSize,), dtype=int)
            img = cv2.imread(allImgFilenames[i])
            img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # change the brightness here
            img = np.float32(img) / 255.
            imgBatch[count, :, :, :] = np.transpose(img, (2, 0, 1))
            labelsBatch[count] = np.random.randint(0, 10, (1, 1))
            count += 1
            if count == batchSize:
                count = 0
                yield (imgBatch, labelsBatch)
Call the generator in the fit function as follows:
my_generator = myGenerator()
print("Built the generator")
model.fit_generator(my_generator, samples_per_epoch=60000, nb_epoch=10)
Testing:
You want to get the data from the simulator in real time. For this, you can replace cv2.imread() with a function that gets the data from the simulator. You may also change the batch size to 1 if you want to classify each image as soon as it is simulated. Fetch the image from the generator as follows:
img, label = my_generator.next() # this will give you `batchSize` number of samples.
model.predict(img) # `img` should have 4 dimensions if RGB, img.shape = (1,3,nRows,nCols)
I hope this helps.
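As a concrete illustration of moving the questioner's gamma correction out of the model and into such a generator, one possibility would be a plain helper like this (a sketch; adjust_gamma is a made-up name, not part of any library):
import cv2
import numpy as np

def adjust_gamma(img):
    """Gamma-correct a uint8 image based on its median brightness, mirroring the question's ImageLayer."""
    gamma = np.clip(np.median(img) / 25.0, 0.5, 5.0)
    invGamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** invGamma) * 255
                      for i in np.arange(0, 256)]).astype("uint8")
    return cv2.LUT(img, table)

# inside the generator, right after reading/receiving the image:
# img = adjust_gamma(img)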