I have code that looks like this
import skimage.transform
from skimage import io as sio
test_image = sio.imread('/home/username/pat/file.png')
test_image = skimage.transform.resize(test_image, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
print test_image.shape # prints (128,128)
print test_image.max(), test_image.min() # prints 65535.0 0.0
sio.imshow(test_image)
More importantly, I need to make this image be in 3 channels, so I can feed it into a neural network that expects such input, any idea how to do that?
I want to transform a 1-channel image into a 3-channel image that looks reasonable when I plot it, makes sense, etc. How?
I tried padding with 0s, I tried copying the same values 3 times for the 3 channels, but then when I try to display the image, it looks like gibberish. So how can I transform the image into 3 channels, even if it becomes something like, bluescale instead of greyscale, but still be able to visualize it in a meaningful way?
Edit:
if I try
test_image = skimage.color.gray2rgb(test_image)
I get all white image, with some black dots.
I get the same all white, rare small black dots if I try
convert Test1_PC_1.tif -colorspace sRGB -type truecolor Test1_PC_1_new.tif
Before the attempted transform with gray2rgb
print type(test_image[0,0])
<type 'numpy.uint16'>
After
print type(test_image[0,0,0])
<type 'numpy.float64'>
You need to convert the array from 2D to 3D, where the third dimension is the color.
You can use the gray2rgb function provided by skimage:
test_image = skimage.color.gray2rgb(test_image)
Alternatively, you can write your own conversion -- which gives you some flexibility to tweak the pixel values:
# basic conversion from gray to RGB encoding
test_image = np.array([[[s,s,s] for s in r] for r in test_image],dtype="u1")
# conversion from gray to RGB encoding -- putting the image in the green channel
test_image = np.array([[[0,s,0] for s in r] for r in test_image],dtype="u1")
I notice from your max() value that you're using 16-bit sample values (which is uncommon). You'll want a different dtype, such as "u2" (uint16) or "int32", because "u1" can only hold 0-255. Also, you may need to play some games to make the image display with the correct polarity (it may appear with black/white reversed).
One way to get there is to just invert all of the pixel values:
test_image = 65535-test_image ## invert 16-bit pixels
Or you could look into the norm parameter to imshow, which appears to have an inverse function.
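The list-comprehension conversions above can also be written with array operations, which keeps the 16-bit dtype and avoids Python-level loops. This is only a minimal sketch; the small uint16 array is made up and stands in for the real image:

```python
import numpy as np

# Made-up 16-bit grayscale data standing in for the real image (assumption).
gray = np.array([[0, 1000], [40000, 65535]], dtype=np.uint16)

# Same value in R, G and B: shape (2, 2) -> (2, 2, 3), dtype stays uint16.
rgb = np.repeat(gray[:, :, np.newaxis], 3, axis=2)

# Image in the green channel only, the other channels left at zero.
green = np.zeros(gray.shape + (3,), dtype=gray.dtype)
green[:, :, 1] = gray

# Invert the polarity using the maximum value of the dtype (65535 for uint16).
inverted = np.iinfo(gray.dtype).max - gray
```

Reading the maximum from np.iinfo means the same inversion also works if the data turns out to be 8-bit.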
Your conversion from gray-value to RGB by replicating the gray-value three times such that R==G==B is correct.
The strange displayed result is likely caused by assumptions made during display. You will need to scale your data before display to fix it.
Usually, a uint8 image has values 0-255, which are mapped to min-max scale of display. Uint16 has values 0-65535, with 65535 mapped to max. Floating-point images are very often assumed to be in the range 0-1, with 1 mapped to max. Any larger value will then also be mapped to max. This is why you see so much white in your output image.
If you divide each output sample by the maximum value in your image you’ll be able to display it properly.
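As a rough sketch of that scaling step (using a synthetic array in place of the resized image, which is an assumption), dividing by the maximum brings the float data into the 0-1 range before the channel is replicated:

```python
import numpy as np

# Synthetic float image with values 0..65535, standing in for the resized PNG (assumption).
gray16 = np.linspace(0, 65535, 128 * 128).reshape(128, 128)

# Scale into [0, 1] so that float display routines do not clip most pixels to white.
scaled = gray16 / gray16.max()

# Replicate the single channel three times: (128, 128) -> (128, 128, 3).
rgb = np.stack([scaled, scaled, scaled], axis=-1)
```

The resulting array has R == G == B everywhere, so it should display as the same grayscale picture while having the 3-channel shape a network expects.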
Well, imshow by default uses a kind of heatmap (a colormap) to display the image intensities. To display a grayscale image, just specify the colormap:
plt.imshow(image, cmap="gray")
Now, I think you can get a single channel of an image by doing:
image[:,:,i] where i is in {0,1,2}
To get an image that keeps only one channel (here red) and zeroes out the others:
red_image = image.copy()
red_image[:,:,1] = 0
red_image[:,:,2] = 0
Edit:
Do you definitely have to use skimage? What about python-opencv module?
Have you tried the following example?
import cv2
# gray_img is a single-channel image; the result has three identical channels
color_img = cv2.cvtColor(gray_img, cv2.COLOR_GRAY2RGB)
I have bunch of images, randomly I figured out that best preprocessing for my images is using matplotlib imshow with cmap=gray. This is my RGB image (I can't publish the original images, this is a sample that I created to make my point. So the original images are not noiseless and perfect like this):
When I use plt.imshow(img, cmap='gray') the image will be:
I wanted to implement this process in Opencv. I tried to use OpenCV colormaps but there wasn't any gray one there. I used these solutions but the result is like the first image not the second one. (result here)
So I was wondering besides changing colormaps, what preprocessing does matplotlib apply on images when we call imshow?
P.S: You might suggest binarization, I've tested both techniques but on my data binarization will ruin some of the samples which this method (matplotlib) won't.
cv::normalize with NORM_MINMAX should help you. It maps intensity values so that the darkest becomes black and the lightest becomes white, regardless of what the absolute values were.
This section of the OpenCV docs contains example code (it's a permalink). The docs say:
or so that min_I dst(I) = alpha, max_I dst(I) = beta when normType=NORM_MINMAX (for dense arrays only)
That means, for NORM_MINMAX, you want alpha=0 and beta=255. These two parameters have different meanings for different normTypes. For NORM_MINMAX it seems that the code automatically swaps them, so the lower of the two values is used as the lower bound.
Further, the range for uint8 data is 0 .. 255. Giving 1 as the upper bound only makes sense for float data.
example:
import numpy as np
import cv2 as cv
im = cv.imread("m78xj.jpg")
normalized = cv.normalize(im, dst=None, alpha=0, beta=255, norm_type=cv.NORM_MINMAX)
cv.imshow("normalized", normalized)
cv.waitKey(-1)
cv.destroyAllWindows()
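For reference, the arithmetic behind a min-max normalization can be sketched in plain NumPy. This is not OpenCV's implementation, just the same mapping written out under the assumption of an 8-bit output; minmax_normalize is a hypothetical helper name:

```python
import numpy as np

def minmax_normalize(img, alpha=0.0, beta=255.0):
    """Map the darkest value to alpha and the brightest to beta (hypothetical helper)."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:
        # Flat image: nothing to stretch, return the lower bound everywhere.
        return np.full(img.shape, alpha, dtype=np.uint8)
    out = (img - lo) / (hi - lo) * (beta - alpha) + alpha
    return np.round(out).astype(np.uint8)

# Small made-up image whose values only span 50..200.
im = np.array([[50, 100], [150, 200]], dtype=np.uint8)
normalized = minmax_normalize(im)
```

After the call the values span the full 0 to 255 range, which is what makes a dull image look as contrasty as it does in matplotlib.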
You can also apply a median blur first, to remove noisy pixels (which go beyond the average gray of the text):
blurred = cv.medianBlur(im, ksize=5)
# ...normalize...
Or do the scaling manually: apply the median blur, find the maximum value in the blurred version, then use that maximum to scale the original image.
output = im.astype(np.uint16) * 255 / blurred.max()
output = np.clip(output, 0, 255).astype(np.uint8)
# ...
I'm reading DICOM gray image file as
gray = dicom.dcmread(file).pixel_array
There I've got (x,y) shape but I need RGB (x,y,3) shape
I'm trying to convert using CV
img = cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)
And for testing I'm writing it to file cv2.imwrite('dcm.png', img)
I've got extremely dark image on output which is wrong, what is correct way to convert pydicom image to RGB?
To answer your question, you need to provide a bit more info, and be a bit clearer.
First what are you trying to do? Are you trying to only get an (x,y,3) array in memory? or are you trying to convert the dicom file to a .png file? ...they are very different things.
Secondly, what modality is your dicom image?
It's likely (unless it's ultrasound or perhaps nuc med) a 16-bit greyscale image, meaning the data is 16 bit, meaning your gray array above is 16-bit data.
So the first thing to understand is window levelling and how to display a 16-bit image in 8 bits. have a look here: http://www.upstate.edu/radiology/education/rsna/intro/display.php.
If it's a 16-bit image, if you want to view it as a greyscale image in rgb format, then you need to know what window level you're using or need, and adjust appropriately before saving.
Thirdly, as lenik mentions, you need to apply the dicom slope/intercept values to your pixel data prior to using it.
If your problem is just making a new array with extra dimension for rgb (so sizes (r,c) to (r,c,3)), then it's easy
# orig is your read in dcmread 2D array:
r, c = orig.shape
new = np.empty((r, c, 3), dtype=orig.dtype)
new[:,:,2] = new[:,:,1] = new[:,:,0] = orig
# or with broadcasting
new[:,:,:] = orig[:,:, np.newaxis]
That will give you the 3rd dimension. BUT the values will still all be 16-bit, not 8 bit as needed if you want it to be RGB. (Assuming your image you read with dcmread is CT, MR or equivalent 16-bit dicom - then the dtype is likely uint16).
If you want it to be RGB, then you need to convert the values to 8-bit from 16-bit. For that you'll need to decide on a window/level and apply it to select the 8-bit values from the full 16-bit data range.
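A minimal sketch of what applying a window/level means in practice, assuming a made-up array of values and a hypothetical helper window_to_uint8 (neither comes from pydicom or OpenCV):

```python
import numpy as np

def window_to_uint8(data, level, width):
    """Clip data to [level - width/2, level + width/2] and map that range to 0..255."""
    lo = level - width / 2.0
    hi = level + width / 2.0
    clipped = np.clip(data.astype(np.float64), lo, hi)
    return np.round((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

# Made-up values standing in for rescaled pixel data (assumption).
values = np.array([[-1000.0, -100.0], [100.0, 300.0]])

# Level 100 with width 400 gives a window from -100 to 300.
gray8 = window_to_uint8(values, level=100, width=400)

# Replicate the 8-bit result into three channels: (2, 2) -> (2, 2, 3).
rgb = np.stack([gray8] * 3, axis=-1)
```

Everything at or below the lower edge of the window becomes black and everything at or above the upper edge becomes white, so the choice of window decides which part of the 16-bit range stays visible.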
Likely your problem above - I've got extremely dark image on output which is wrong - is actually correct, but it's dark because the window/level cv is using by default makes it 'look' dark, or it's correct but you didn't apply the slope/intercept.
If what you want to do is convert the dicom to png (or jpg), then you should probably use PIL or matplotlib rather than cv. Both of those offer easy ways to save a 16-bit 2D array (which is what your 'gray' is in your code above), and both allow you to specify window and level when saving to png or jpg. CV is complete overkill (meaning much bigger/slower to load, and a much steeper learning curve).
Some pseudocode using matplotlib. You need to adjust the vmin/vmax values; the ones here would be approximately OK for a CT image.
import matplotlib.pyplot as plt
from pydicom import dcmread
df = dcmread(file)
slope = float(df.RescaleSlope)
intercept = float(df.RescaleIntercept)
df_data = intercept + df.pixel_array * slope
# tell matplotlib to 'plot' the image, with 'gray' colormap and set the
# min/max values (ie 'black' and 'white') to correspond to
# values of -100 and 300 in your array
plt.imshow(df_data, cmap='gray', vmin=-100, vmax=300)
# save as a png file
plt.savefig('png-copy.png')
That will save a png version, but with the axes drawn as well. To save just the image, without axes or whitespace, use this:
inches = (3,3)
dpi = 150
fig, ax = plt.subplots(figsize=inches, dpi=dpi)
fig.subplots_adjust(left=0, right=1, top=1, bottom=0, wspace=0, hspace=0)
ax.imshow(df_data, cmap='gray', vmin=-100, vmax=300)
fig.savefig('copy-without-whitespace.png')
The full tutorial on reading DICOM files is here: https://www.kaggle.com/gzuidhof/full-preprocessing-tutorial
Basically, you have to extract the slope and intercept parameters from the DICOM file and do the math for every pixel: hu = pixel_value * slope + intercept. All of this is explained in the tutorial, with code samples and pictures.
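With NumPy the per-pixel formula is a single vectorized expression. A small sketch with made-up numbers (the raw values, slope and intercept below are assumptions, not read from any real file):

```python
import numpy as np

# Made-up stored pixel values standing in for pixel_array (assumption).
raw = np.array([[0, 1024], [2048, 4095]], dtype=np.uint16)

# Made-up rescale parameters; real ones come from the DICOM header.
slope = 1.0
intercept = -1024.0

# Convert to float first so that negative results do not wrap around in uint16.
hu = raw.astype(np.float64) * slope + intercept
```

Converting to float before the arithmetic matters: a negative intercept applied directly to unsigned integers would otherwise produce wrapped-around values.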
I've used openCV2 to load a grayscale image, which I then converted to a numpy.array. Now I want to pad that array with a 'frame' around the image. However, I'm having some trouble dissecting what the numpy manual wants me to do exactly. I tried googling and searching for padding examples, none came up that were relevant for my case.
My current code looks like this:
import numpy as np
import cv2
img = cv2.imread('Lena.png')
imgArray = np.array((img))
imgArray = np.pad(imgArray, pad_width=1,mode='constant' ,constant_values=0)
cv2.imshow('Padded', imgArray)
Check out the openCV2 documentation here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_core/py_basic_ops/py_basic_ops.html
My best guess is to use constant = cv2.copyMakeBorder(img,10,10,10,10,cv2.BORDER_CONSTANT,value=BLUE), where BLUE = [255,0,0] as in that tutorial (OpenCV stores colours in BGR order).
You can do as follows:
import numpy as np
import cv2
img = cv2.imread('Lena.png', 0)
img = np.pad(img, pad_width=4, mode='constant', constant_values=0)
cv2.imshow('Padded', img)
cv2.waitKey(0)
From the documentation of cv2.imread:
cv2.imread(filename[, flags]) → retval
Parameters:
filename – Name of file to be loaded.
flags:
Flags specifying the color type of a loaded image:
CV_LOAD_IMAGE_ANYDEPTH - If set, return 16-bit/32-bit image when the input has the corresponding depth, otherwise convert it to 8-bit.
CV_LOAD_IMAGE_COLOR - If set, always convert image to the color one
CV_LOAD_IMAGE_GRAYSCALE - If set, always convert image to the grayscale one
>0 Return a 3-channel color image.
Note In the current implementation the alpha channel, if any, is stripped from the output image. Use negative value if you need the alpha channel.
=0 Return a grayscale image.
<0 Return the loaded image as is (with alpha channel).
With the above code we got the following result:
And another option using np.pad:
As you can see here, you need to tell np.pad how much to pad each axis. Simply using:
imgArray = np.pad(imgArray, pad_width=1, mode='constant', constant_values=0)
pads every axis, including the third one (i.e. the colour channels), so the result has five channels and you cannot plot the image any more.
As described in the referenced question, you would need to use the following arguments in your code:
imgArray = np.pad(imgArray, pad_width=((1,1), (1,1), (0,0)), mode='constant', constant_values=0)
Also see the np.pad documentation:
Number of values padded to the edges of each axis. ((before_1, after_1), … (before_N, after_N)) unique pad widths for each axis. ((before, after),) yields same before and after pad for each axis. (pad,) or int is a shortcut for before = after = pad width for all axes.
This means the first tuple pads the first axis (for an image, the top and bottom borders) and the second tuple pads the second axis (the left and right borders) with one "0".
You do not want to pad the last dimension, as this is the dimension storing the RGB information.
And as you stated in your question that you want a white border: constant_values should be set to 255 or 1, depending on the range of your image. Using 0 results in a black border.
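The difference between the two pad_width forms is easiest to see from the resulting shapes. A small sketch on a made-up 4x5 colour array (the array is an assumption standing in for the loaded image):

```python
import numpy as np

# Made-up 4x5 image with 3 colour channels (assumption).
img = np.zeros((4, 5, 3), dtype=np.uint8)

# An int pad_width pads every axis, including the channel axis.
all_axes = np.pad(img, pad_width=1, mode='constant', constant_values=0)

# Per-axis pad widths: one pixel top/bottom and left/right, nothing on the channels.
spatial_only = np.pad(img, pad_width=((1, 1), (1, 1), (0, 0)),
                      mode='constant', constant_values=255)

print(all_axes.shape)      # (6, 7, 5)
print(spatial_only.shape)  # (6, 7, 3)
```

Only the second result still has three channels, and with constant_values=255 its border is white for 8-bit data.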
Whilst I see you already have an answer, I wanted to show the general case where you want to pad with something other than black or white, i.e. you want to add a coloured border. I couldn't get any of the methods suggested in the other answers to do that, so...
Say you have lena.png as follows:
Then you can do:
from PIL import Image, ImageOps
import numpy as np
# Load the image - you could just as well use OpenCV `imread()`
img = Image.open('lena.png')
# Pad 20px to all sides with magenta
padded = ImageOps.expand(img, border=20, fill=(255,0,255))
# Save to disk
padded.save('result.png')
Before anyone decides to downvote because the OP asked how to add white borders, please note you can just as easily add white with this method if you use:
padded = ImageOps.expand(img, border=20, fill=(255,255,255))
If you are using numpy arrays to manipulate your images, you can convert from numpy array to PIL Image with:
pil_image = Image.fromarray(numpy_array)
and the other way with:
numpy_array = np.array(pil_image)
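If you would rather stay in NumPy, a coloured border can also be built by filling a larger array with the border colour and copying the image into the middle. This is just a sketch; add_colour_border is a hypothetical helper and the small array stands in for the real image:

```python
import numpy as np

def add_colour_border(img, border, colour):
    """Return img (H x W x C) surrounded by a border of the given colour (hypothetical helper)."""
    h, w, c = img.shape
    out = np.empty((h + 2 * border, w + 2 * border, c), dtype=img.dtype)
    out[:] = colour                                   # fill everything with the border colour
    out[border:border + h, border:border + w] = img   # paste the original in the middle
    return out

# Made-up 4x5 black RGB image (assumption).
img = np.zeros((4, 5, 3), dtype=np.uint8)
padded = add_colour_border(img, border=2, colour=(255, 0, 255))
```

The colour tuple is broadcast over all pixels, so the same function works for any number of channels as long as the tuple has one entry per channel.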
I am trying to convert an image into an array of pixels.
Here is my current code.
im = Image.open("beeleg.png")
pixels = im.load()
im.getdata() # doesn't work
print(pixels # doesn't work
Ideally, my end goal is to convert the image into a vector of just pixels, so for instance if I have an image of dimensions 100x100, then I want a vector of dimensions 1x10000, where each value is between [0, 255]. Then, divide each of the values in the array by 256 and add a bias of 1 in the front of the vector. However, I am not able to proceed with all this without being able to obtain an array. How to proceed?
Scipy's ndimage library is generally the go-to library for working with pixels as data (arrays). You can load an image from file (most common formats are supported) into a numpy array using scipy.ndimage.imread; the array can then be easily reshaped and operated on mathematically. The mode keyword can be used to specify a colorspace transformation upon load (for example, to convert an RGB image to black and white). In your case you asked for single-colour pixels from 0-255 (8-bit grayscale), so you would use mode='L'. See the documentation for usage and more useful functions. Note that scipy.ndimage.imread was deprecated in SciPy 1.0 and removed in SciPy 1.2; on newer versions, imageio.imread or PIL's Image.open(...).convert('L') followed by np.array gives you the same kind of array.
If you use OpenCV, gray = cv2.imread(image, 0) returns a grayscale image as a single-channel numpy array with n rows x m cols. rows, cols = gray.shape then gives the height and width of the image.
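Once the image is a 2D array, the remaining steps from the question (flatten, divide by 256, prepend a bias of 1) are a few lines of NumPy. A sketch using random data in place of the loaded image (an assumption):

```python
import numpy as np

# Random 100x100 8-bit grayscale data standing in for the loaded image (assumption).
rng = np.random.default_rng(0)
gray = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)

# Flatten to a 1 x 10000 row vector and scale into [0, 1).
vector = gray.reshape(1, -1) / 256.0

# Prepend the bias term, giving a 1 x 10001 vector.
with_bias = np.hstack([np.ones((1, 1)), vector])
```

Dividing by 256.0 converts the data to float, so the uint8 values do not get truncated to zero.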
I have a problem with FFT implementation in Python. I have completely strange results.
Ok so, I want to open image, get value of every pixel in RGB, then I need to use fft on it, and convert to image again.
My steps:
1) I'm opening image with PIL library in Python like this
from PIL import Image
im = Image.open("test.png")
2) I'm getting pixels
pixels = list(im.getdata())
3) I separate every pixel into r,g,b values
for x in range(width):
    for y in range(height):
        r,g,b = pixels[x*width+y]
        red[x][y] = r
        green[x][y] = g
        blue[x][y] = b
4). Let's assume that I have one pixel (111,111,111). And use fft on all red values like this
red = np.fft.fft(red)
And then:
print (red[0][0], green[0][0], blue[0][0])
My output is:
(53866+0j) 111 111
I think it's completely wrong. My image is 64x64, and the FFT from GIMP is completely different. Actually, my FFT gives me only arrays with huge values; that's why my output image is black.
Do you have any idea where the problem is?
[EDIT]
I've changed as suggested to
red= np.fft.fft2(red)
And after that I scale it
scale = 1/(width*height)
red= abs(red* scale)
And still, I'm getting only black image.
[EDIT2]
Ok, so lets take one image.
Assume that I dont want to open it and save as greyscale image. So I'm doing like this.
def getGray(pixel):
    r,g,b = pixel
    return (r+g+b)/3
im = Image.open("test.png")
im.load()
pixels = list(im.getdata())
width, height = im.size
for x in range(width):
    for y in range(height):
        greyscale[x][y] = getGray(pixels[x*width+y])
data = []
for x in range(width):
    for y in range(height):
        pix = greyscale[x][y]
        data.append(pix)
img = Image.new("L", (width,height), "white")
img.putdata(data)
img.save('out.png')
After this, I'm getting this image , which is ok. So now, I want to make fft on my image before I'll save it to new one, so I'm doing like this
scale = 1/(width*height)
greyscale = np.fft.fft2(greyscale)
greyscale = abs(greyscale * scale)
after loading it. After saving it to file, I have . So lets try now open test.png with gimp and use FFT filter plugin. I'm getting this image, which is correct
How I can handle it?
Great question. I’ve never heard of it but the Gimp Fourier plugin seems really neat:
A simple plug-in to do fourier transform on you image. The major advantage of this plugin is to be able to work with the transformed image inside GIMP. You can so draw or apply filters in fourier space, and get the modified image with an inverse FFT.
This idea—of doing Gimp-style manipulation on frequency-domain data and transforming back to an image—is very cool! Despite years of working with FFTs, I’ve never thought about doing this. Instead of messing with Gimp plugins and C executables and ugliness, let’s do this in Python!
Caveat. I experimented with a number of ways to do this, attempting to get something close to the output Gimp Fourier image (gray with moiré pattern) from the original input image, but I simply couldn’t. The Gimp image appears to be somewhat symmetric around the middle of the image, but it’s not flipped vertically or horizontally, nor is it transpose-symmetric. I’d expect the plugin to be using a real 2D FFT to transform an H×W image into a H×W array of real-valued data in the frequency domain, in which case there would be no symmetry (it’s just the to-complex FFT that’s conjugate-symmetric for real-valued inputs like images). So I gave up trying to reverse-engineer what the Gimp plugin is doing and looked at how I’d do this from scratch.
The code. Very simple: read an image, apply scipy.fftpack.rfft in the leading two dimensions to get the “frequency-image”, rescale to 0–255, and save.
Note how this is different from the other answers! No grayscaling—the 2D real-to-real FFT happens independently on all three channels. No abs needed: the frequency-domain image can legitimately have negative values, and if you make them positive, you can’t recover your original image. (Also a nice feature: no compromises on image size. The size of the array remains the same before and after the FFT, whether the width/height is even or odd.)
from PIL import Image
import numpy as np
import scipy.fftpack as fp
## Functions to go from image to frequency-image and back
im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0),
                               axis=1)
freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1),
                             axis=0)
## Read in data file and transform
data = np.array(Image.open('test.png'))
freq = im2freq(data)
back = freq2im(freq)
# Make sure the forward and backward transforms work!
assert(np.allclose(data, back))
## Helper functions to rescale a frequency-image to [0, 255] and save
remmax = lambda x: x/x.max()
remmin = lambda x: x - np.amin(x, axis=(0,1), keepdims=True)
touint8 = lambda x: (remmax(remmin(x))*(256-1e-4)).astype(int)
def arr2im(data, fname):
    out = Image.new('RGB', data.shape[1::-1])
    # putdata needs a sequence, so materialise the map (required on Python 3)
    out.putdata(list(map(tuple, data.reshape(-1, 3))))
    out.save(fname)
arr2im(touint8(freq), 'freq.png')
(Aside: FFT-lover geek note. Look at the documentation for rfft for details, but I used Scipy’s FFTPACK module because its rfft interleaves real and imaginary components of a single pixel as two adjacent real values, guaranteeing that the output for any 2D image (even vs odd, width vs height) has the same size as the input. This is in contrast to Numpy’s numpy.fft.rfft2 which, because it returns complex data with the last axis shortened to width/2+1 entries, forces you to deal with a differently shaped array and with deinterleaving complex-to-real yourself. Who needs that hassle for this application?)
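To make the shape difference concrete, here is a small NumPy-only sketch (random data stands in for one image channel, which is an assumption). With numpy.fft.rfft2 only the last axis is shortened, and the original shape has to be passed back when inverting:

```python
import numpy as np

# Random 5x8 array standing in for one image channel (assumption).
rng = np.random.default_rng(0)
img = rng.random((5, 8))

# Complex output: the first axis keeps its length, the last becomes 8 // 2 + 1 = 5.
spec = np.fft.rfft2(img)
print(spec.shape, spec.dtype)   # (5, 5) complex128

# The inverse needs the original shape to undo the shortening unambiguously.
back = np.fft.irfft2(spec, s=img.shape)
```

The round trip recovers the input to within floating-point error, but the intermediate array is complex and has a different shape, which is the bookkeeping the FFTPACK layout avoids.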
Results. Given input named test.png:
this snippet produces the following output (global min/max have been rescaled and quantized to 0-255):
And upscaled:
In this frequency-image, the DC (0 Hz frequency) component is in the top-left, and frequencies move higher as you go right and down.
Now, let’s see what happens when you manipulate this image in a couple of ways. Instead of this test image, let’s use a cat photo.
I made a few mask images in Gimp that I then load into Python and multiply the frequency-image with to see what effect the mask has on the image.
Here’s the code:
# Make frequency-image of cat photo
freq = im2freq(np.array(Image.open('cat.jpg')))
# Load three frequency-domain masks (DSP "filters")
bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255
# Apply each filter and save the output
arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')
Here’s a low-pass filter mask on the left, and on the right, the result—click to see the full-res image:
In the mask, black = 0.0, white = 1.0. So the lowest frequencies are kept here (white), while the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used all over the place, including when decimating (“downsampling”) an image (though they will be shaped much more carefully than me drawing in Gimp 😜).
Here’s a band-pass filter, where the lowest frequencies (see that bit of white in the top-left corner?) and high frequencies are kept, but the middling-frequencies are blocked. Quite bizarre!
Here’s a high-pass filter, where the top-left corner that was left white in the above mask is blacked out:
Removing the low frequencies leaves mostly edges and fine detail, which is the basic idea behind edge detection.
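The same high-pass idea can be sketched without hand-drawn masks. Note that this swaps in NumPy's complex fft2 and a computed radial mask instead of the FFTPACK rfft and Gimp masks used above; highpass and the cutoff value are hypothetical, and the test image is synthetic:

```python
import numpy as np

def highpass(img, cutoff):
    """Zero all frequencies with radius <= cutoff (in cycles per pixel) and transform back."""
    f = np.fft.fft2(img)
    fy = np.fft.fftfreq(img.shape[0])[:, None]   # vertical frequencies
    fx = np.fft.fftfreq(img.shape[1])[None, :]   # horizontal frequencies
    mask = np.sqrt(fx ** 2 + fy ** 2) > cutoff   # True where the frequency is kept
    return np.real(np.fft.ifft2(f * mask))

# Synthetic image: left half black, right half white, so there is one vertical edge.
img = np.zeros((32, 32))
img[:, 16:] = 1.0

edges = highpass(img, cutoff=0.1)
```

Because the DC component is removed, the output averages to zero, and its largest magnitudes sit next to the step between the two halves.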
Postscript. Someone, make a webapp using this technique that lets you draw masks and apply them to an image real-time!!!
There are several issues here.
1) Manual conversion to grayscale isn't good. Use Image.open("test.png").convert('L')
2) Most likely there is an issue with types. You shouldn't pass an np.ndarray from fft2 to a PIL image without being sure their types are compatible. abs(np.fft.fft2(something)) returns an array of type np.float64, whereas a PIL image expects something like an array of type np.uint8.
3) Scaling suggested in the comments looks wrong. You actually need your values to fit into 0..255 range.
Here's my code that addresses these 3 points:
import numpy as np
from PIL import Image
def fft(channel):
    spectrum = np.absolute(np.fft.fft2(channel))  # magnitude of the complex spectrum
    spectrum *= 255.0 / spectrum.max()  # proper scaling into 0..255 range
    return spectrum
input_image = Image.open("test.png")
channels = input_image.split() # splits an image into R, G, B channels
result_array = np.zeros_like(input_image)  # make sure data types,
# sizes and numbers of channels of input and output numpy arrays are the same
if len(channels) > 1:  # grayscale images have only one channel
    for i, channel in enumerate(channels):
        result_array[..., i] = fft(channel)
else:
    result_array[...] = fft(channels[0])
result_image = Image.fromarray(result_array)
result_image.save('out.png')
I must admit I haven't managed to get results identical to the GIMP FFT plugin. As far as I can see, it does some post-processing. My results are all kinda a very low-contrast mess, and GIMP seems to overcome this by tuning contrast and scaling down non-informative channels (in your case all channels except Red are just empty). Refer to the image:
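One common way to get more contrast out of an FFT magnitude image is to centre the spectrum and compress it logarithmically before scaling to 0..255. Whether the GIMP plugin does anything like this is not known; the sketch below is only an assumption-laden illustration, with spectrum_to_uint8 as a hypothetical helper and a flat synthetic channel as input:

```python
import numpy as np

def spectrum_to_uint8(channel):
    """Centred log-magnitude spectrum of a 2D array, scaled to 0..255 (hypothetical helper)."""
    mag = np.abs(np.fft.fftshift(np.fft.fft2(channel)))
    log_mag = np.log1p(mag)              # compress the huge dynamic range
    peak = log_mag.max()
    if peak == 0:
        return np.zeros(mag.shape, dtype=np.uint8)
    return np.round(log_mag / peak * 255.0).astype(np.uint8)

# Flat 64x64 channel with value 111, like the single-colour pixel in the question.
channel = np.full((64, 64), 111.0)
out = spectrum_to_uint8(channel)
```

For a flat image all the energy is in the DC term, which fftshift moves to the centre, so the output is a single bright pixel at position (32, 32). The unscaled DC value is 111 * 64 * 64, which shows why raw FFT output contains huge numbers.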