matplotlib imshow gray colormap additional preprocess - python

I have bunch of images, randomly I figured out that best preprocessing for my images is using matplotlib imshow with cmap=gray. This is my RGB image (I can't publish the original images, this is a sample that I created to make my point. So the original images are not noiseless and perfect like this):
When I use plt.imshow(img, cmap='gray') the image will be:
I wanted to implement this process in Opencv. I tried to use OpenCV colormaps but there wasn't any gray one there. I used these solutions but the result is like the first image not the second one. (result here)
So I was wondering besides changing colormaps, what preprocessing does matplotlib apply on images when we call imshow?
P.S: You might suggest binarization, I've tested both techniques but on my data binarization will ruin some of the samples which this method (matplotlib) won't.

cv::normalize with NORM_MINMAX should help you. it can map intensity values so the darkest becomes black and the lightest becomes white, regardless of what the absolute values were.
this section of OpenCV docs contains example code. it's a permalink.
or so that minIdst(I)=alpha, maxIdst(I)=beta when normType=NORM_MINMAX (for dense arrays only)
that means, for NORM_MINMAX, alpha=0, beta=255. these two params have different meanings for different normTypes. for NORM_MINMAX it seems that the code automatically swaps them so the lower value of either is used as the lower bound etc.
further, the range for uint8 type data is 0 .. 255. giving 1 only makes sense for float data.
example:
import numpy as np
import cv2 as cv
im = cv.imread("m78xj.jpg")
normalized = cv.normalize(im, dst=None, alpha=0, beta=255, norm_type=cv.NORM_MINMAX)
cv.imshow("normalized", normalized)
cv.waitKey(-1)
cv.destroyAllWindows()
apply a median blur to remove noisy pixels (which go beyond the average gray of the text):
blurred = cv.medianBlur(im, ksize=5)
# ...normalize...
or do the scaling manually. apply the median blur, find the maximum value in that version, then apply it to the original image.
output = im.astype(np.uint16) * 255 / blurred.max()
output = np.clip(output, 0, 255).astype(np.uint8)
# ...

Related

Get pixel location of binary image with intensity 255 in python opencv

I want to get the pixel coordinates of the blue dots in an image.
To get it, I first converted it to gray scale and use threshold function.
import numpy as np
import cv2
img = cv2.imread("dot.jpg")
img_g = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret1,th1 = cv2.threshold(img_g,127,255,cv2.THRESH_BINARY_INV)
What to do next if I want to get the pixel location with intensity 255? Please tell if there is some simpler method to do the same.
I don't think this is going to work as you would expect.
Usually, in order to get a stable tracking over a shape with a specific color, you do that in RGB/HSV/HSL plane, you could start with HSV which is more robust in terms of lighting.
1-Convert to HSV using cv2.cvtColor()
2-Use cv2.inRagne(blue_lower, blue_upper) to "filter" all un-wanted colors.
Now you have a good-looking binary image with only blue color in it (assuming you have a static background or more filters should be added).
3-Now if you want to detect dots (which is usually more than one pixel) you could try cv2.findContours
4- You can get x,y pixel of contours using many methods(depends on the shape of what you want to detect) like this cv2.boundingRect()

Pytorch: Normalize Image data set

I want to normalize custom dataset of images. For that i need to compute mean and standard deviation by iterating over the dataset. How can I normalize my entire dataset before creating the data set?
Well, let's take this image as an example:
The first thing you need to do is decide which library you want to use: Pillow or OpenCV. In this example I'll use Pillow:
from PIL import Image
import numpy as np
img = Image.open("test.jpg")
pix = np.asarray(img.convert("RGB")) # Open the image as RGB
Rchan = pix[:,:,0] # Red color channel
Gchan = pix[:,:,1] # Green color channel
Bchan = pix[:,:,2] # Blue color channel
Rchan_mean = Rchan.mean()
Gchan_mean = Gchan.mean()
Bchan_mean = Bchan.mean()
Rchan_var = Rchan.var()
Gchan_var = Gchan.var()
Bchan_var = Bchan.var()
And the results are:
Red Channel Mean: 134.80585625
Red Channel Variance: 3211.35843945
Green Channel Mean: 81.0884125
Green Channel Variance: 1672.63200823
Blue Channel Mean: 68.1831375
Blue Channel Variance: 1166.20433566
Hope it helps for your needs.
What normalization tries to do is mantain the overall information on your dataset, even when there exists differences in the values, in the case of images it tries to set apart some issues like brightness and contrast that in certain case does not contribute to the general information that the image has. There are several ways to do this, each one with pros and cons, depending on the image set you have and the processing effort you want to do on them, just to name a few:
Linear Histogram stetching: where you do a linear map on the current
range of values in your image and stetch it to match the 0 and 255
values in RGB
Nonlinear Histogram stetching: Where you use a
nonlinear function to map the input pixels to a new image. Commonly
used functions are logarithms and exponentials. My favorite function
is the cumulative probability function of the original histogram, it
works pretty well.
Adaptive Histogram equalization: Where you do a linear
histogram stretching in certain places of your image to avoid doing
an identity mapping where you have the max range of values in your original
image.

greyscale image to 3 channels

I have code that looks like this
from skimage import io as sio
test_image = imread('/home/username/pat/file.png')
test_image = skimage.transform.resize(test_image, (IMG_HEIGHT, IMG_WIDTH), mode='constant', preserve_range=True)
print test_image.shape # prints (128,128)
print test_image.max(), test_image.min() # prints 65535.0 0.0
sio.imshow(test_image)
More importantly, I need to make this image be in 3 channels, so I can feed it into a neural network that expects such input, any idea how to do that?
I want to transform a 1-channel image into a 3-channel image that looks reasonable when I plot it, makes sense, etc. How?
I tried padding with 0s, I tried copying the same values 3 times for the 3 channels, but then when I try to display the image, it looks like gibberish. So how can I transform the image into 3 channels, even if it becomes something like, bluescale instead of greyscale, but still be able to visualize it in a meaningful way?
Edit:
if I try
test_image = skimage.color.gray2rgb(test_image)
I get all white image, with some black dots.
I get the same all white, rare small black dots if I try
convert Test1_PC_1.tif -colorspace sRGB -type truecolor Test1_PC_1_new.tif
Before the attempted transform with gray2rgb
print type(test_image[0,0])
<type 'numpy.uint16'>
After
print type(test_image[0,0,0])
<type 'numpy.float64'>
You need to convert the array from 2D to 3D, where the third dimension is the color.
You can use the gray2rgb function function provided by skimage:
test_image = skimage.color.gray2rgb(test_image)
Alternatively, you can write your own conversion -- which gives you some flexibility to tweak the pixel values:
# basic conversion from gray to RGB encoding
test_image = np.array([[[s,s,s] for s in r] for r in test_image],dtype="u1")
# conversion from gray to RGB encoding -- putting the image in the green channel
test_image = np.array([[[0,s,0] for s in r] for r in test_image],dtype="u1")
I notice from your max() value, that you're using 16-bit sample values (which is uncommon). You'll want a different dtype, maybe "u16" or "int32". Also, you may need to play some games to make the image display with the correct polarity (it may appear with black/white reversed).
One way to get there is to just invert all of the pixel values:
test_image = 65535-test_image ## invert 16-bit pixels
Or you could look into the norm parameter to imshow, which appears to have an inverse function.
Your conversion from gray-value to RGB by replicating the gray-value three times such that R==G==B is correct.
The strange displayed result is likely caused by assumptions made during display. You will need to scale your data before display to fix it.
Usually, a uint8 image has values 0-255, which are mapped to min-max scale of display. Uint16 has values 0-65535, with 65535 mapped to max. Floating-point images are very often assumed to be in the range 0-1, with 1 mapped to max. Any larger value will then also be mapped to max. This is why you see so much white in your output image.
If you divide each output sample by the maximum value in your image you’ll be able to display it properly.
Well, imshow is using by default, a kind of heatmap to display the image intensities. To display a grayscale image just specify the colormap as above:
plt.imshow(image, cmap="gray")
Now, i think you can get the channel of an image by doing:
image[:,:,i] where i is in {0,1,2}
To extract an image for a specific channel:
red_image = image.copy()
red_image[:,:,1] = 0
red_image[:,:,2] = 0
Edit:
Do you definitely have to use skimage? What about python-opencv module?
Have you tried the following example?
import cv2
import cv
color_img = cv2.cvtColor(gray_img, cv.CV_GRAY2RGB)

FFT on image with Python

I have a problem with FFT implementation in Python. I have completely strange results.
Ok so, I want to open image, get value of every pixel in RGB, then I need to use fft on it, and convert to image again.
My steps:
1) I'm opening image with PIL library in Python like this
from PIL import Image
im = Image.open("test.png")
2) I'm getting pixels
pixels = list(im.getdata())
3) I'm seperate every pixel to r,g,b values
for x in range(width):
for y in range(height):
r,g,b = pixels[x*width+y]
red[x][y] = r
green[x][y] = g
blue[x][y] = b
4). Let's assume that I have one pixel (111,111,111). And use fft on all red values like this
red = np.fft.fft(red)
And then:
print (red[0][0], green[0][0], blue[0][0])
My output is:
(53866+0j) 111 111
It's completely wrong I think. My image is 64x64, and FFT from gimp is completely different. Actually, my FFT give me only arrays with huge values, thats why my output image is black.
Do you have any idea where is problem?
[EDIT]
I've changed as suggested to
red= np.fft.fft2(red)
And after that I scale it
scale = 1/(width*height)
red= abs(red* scale)
And still, I'm getting only black image.
[EDIT2]
Ok, so lets take one image.
Assume that I dont want to open it and save as greyscale image. So I'm doing like this.
def getGray(pixel):
r,g,b = pixel
return (r+g+b)/3
im = Image.open("test.png")
im.load()
pixels = list(im.getdata())
width, height = im.size
for x in range(width):
for y in range(height):
greyscale[x][y] = getGray(pixels[x*width+y])
data = []
for x in range(width):
for y in range(height):
pix = greyscale[x][y]
data.append(pix)
img = Image.new("L", (width,height), "white")
img.putdata(data)
img.save('out.png')
After this, I'm getting this image , which is ok. So now, I want to make fft on my image before I'll save it to new one, so I'm doing like this
scale = 1/(width*height)
greyscale = np.fft.fft2(greyscale)
greyscale = abs(greyscale * scale)
after loading it. After saving it to file, I have . So lets try now open test.png with gimp and use FFT filter plugin. I'm getting this image, which is correct
How I can handle it?
Great question. I’ve never heard of it but the Gimp Fourier plugin seems really neat:
A simple plug-in to do fourier transform on you image. The major advantage of this plugin is to be able to work with the transformed image inside GIMP. You can so draw or apply filters in fourier space, and get the modified image with an inverse FFT.
This idea—of doing Gimp-style manipulation on frequency-domain data and transforming back to an image—is very cool! Despite years of working with FFTs, I’ve never thought about doing this. Instead of messing with Gimp plugins and C executables and ugliness, let’s do this in Python!
Caveat. I experimented with a number of ways to do this, attempting to get something close to the output Gimp Fourier image (gray with moiré pattern) from the original input image, but I simply couldn’t. The Gimp image appears to be somewhat symmetric around the middle of the image, but it’s not flipped vertically or horizontally, nor is it transpose-symmetric. I’d expect the plugin to be using a real 2D FFT to transform an H×W image into a H×W array of real-valued data in the frequency domain, in which case there would be no symmetry (it’s just the to-complex FFT that’s conjugate-symmetric for real-valued inputs like images). So I gave up trying to reverse-engineer what the Gimp plugin is doing and looked at how I’d do this from scratch.
The code. Very simple: read an image, apply scipy.fftpack.rfft in the leading two dimensions to get the “frequency-image”, rescale to 0–255, and save.
Note how this is different from the other answers! No grayscaling—the 2D real-to-real FFT happens independently on all three channels. No abs needed: the frequency-domain image can legitimately have negative values, and if you make them positive, you can’t recover your original image. (Also a nice feature: no compromises on image size. The size of the array remains the same before and after the FFT, whether the width/height is even or odd.)
from PIL import Image
import numpy as np
import scipy.fftpack as fp
## Functions to go from image to frequency-image and back
im2freq = lambda data: fp.rfft(fp.rfft(data, axis=0),
axis=1)
freq2im = lambda f: fp.irfft(fp.irfft(f, axis=1),
axis=0)
## Read in data file and transform
data = np.array(Image.open('test.png'))
freq = im2freq(data)
back = freq2im(freq)
# Make sure the forward and backward transforms work!
assert(np.allclose(data, back))
## Helper functions to rescale a frequency-image to [0, 255] and save
remmax = lambda x: x/x.max()
remmin = lambda x: x - np.amin(x, axis=(0,1), keepdims=True)
touint8 = lambda x: (remmax(remmin(x))*(256-1e-4)).astype(int)
def arr2im(data, fname):
out = Image.new('RGB', data.shape[1::-1])
out.putdata(map(tuple, data.reshape(-1, 3)))
out.save(fname)
arr2im(touint8(freq), 'freq.png')
(Aside: FFT-lover geek note. Look at the documentation for rfft for details, but I used Scipy’s FFTPACK module because its rfft interleaves real and imaginary components of a single pixel as two adjacent real values, guaranteeing that the output for any-sized 2D image (even vs odd, width vs height) will be preserved. This is in contrast to Numpy’s numpy.fft.rfft2 which, because it returns complex data of size width/2+1 by height/2+1, forces you to deal with one extra row/column and deal with deinterleaving complex-to-real yourself. Who needs that hassle for this application.)
Results. Given input named test.png:
this snippet produces the following output (global min/max have been rescaled and quantized to 0-255):
And upscaled:
In this frequency-image, the DC (0 Hz frequency) component is in the top-left, and frequencies move higher as you go right and down.
Now, let’s see what happens when you manipulate this image in a couple of ways. Instead of this test image, let’s use a cat photo.
I made a few mask images in Gimp that I then load into Python and multiply the frequency-image with to see what effect the mask has on the image.
Here’s the code:
# Make frequency-image of cat photo
freq = im2freq(np.array(Image.open('cat.jpg')))
# Load three frequency-domain masks (DSP "filters")
bpfMask = np.array(Image.open('cat-mask-bpfcorner.png')).astype(float) / 255
hpfMask = np.array(Image.open('cat-mask-hpfcorner.png')).astype(float) / 255
lpfMask = np.array(Image.open('cat-mask-corner.png')).astype(float) / 255
# Apply each filter and save the output
arr2im(touint8(freq2im(freq * bpfMask)), 'cat-bpf.png')
arr2im(touint8(freq2im(freq * hpfMask)), 'cat-hpf.png')
arr2im(touint8(freq2im(freq * lpfMask)), 'cat-lpf.png')
Here’s a low-pass filter mask on the left, and on the right, the result—click to see the full-res image:
In the mask, black = 0.0, white = 1.0. So the lowest frequencies are kept here (white), while the high ones are blocked (black). This blurs the image by attenuating high frequencies. Low-pass filters are used all over the place, including when decimating (“downsampling”) an image (though they will be shaped much more carefully than me drawing in Gimp 😜).
Here’s a band-pass filter, where the lowest frequencies (see that bit of white in the top-left corner?) and high frequencies are kept, but the middling-frequencies are blocked. Quite bizarre!
Here’s a high-pass filter, where the top-left corner that was left white in the above mask is blacked out:
This is how edge-detection works.
Postscript. Someone, make a webapp using this technique that lets you draw masks and apply them to an image real-time!!!
There are several issues here.
1) Manual conversion to grayscale isn't good. Use Image.open("test.png").convert('L')
2) Most likely there is an issue with types. You shouldn't pass np.ndarray from fft2 to a PIL image without being sure their types are compatible. abs(np.fft.fft2(something)) will return you an array of type np.float32 or something like this, whereas PIL image is going to receive something like an array of type np.uint8.
3) Scaling suggested in the comments looks wrong. You actually need your values to fit into 0..255 range.
Here's my code that addresses these 3 points:
import numpy as np
from PIL import Image
def fft(channel):
fft = np.fft.fft2(channel)
fft *= 255.0 / fft.max() # proper scaling into 0..255 range
return np.absolute(fft)
input_image = Image.open("test.png")
channels = input_image.split() # splits an image into R, G, B channels
result_array = np.zeros_like(input_image) # make sure data types,
# sizes and numbers of channels of input and output numpy arrays are the save
if len(channels) > 1: # grayscale images have only one channel
for i, channel in enumerate(channels):
result_array[..., i] = fft(channel)
else:
result_array[...] = fft(channels[0])
result_image = Image.fromarray(result_array)
result_image.save('out.png')
I must admit I haven't managed to get results identical to the GIMP FFT plugin. As far as I see it does some post-processing. My results are all kinda very low contrast mess, and GIMP seems to overcome this by tuning contrast and scaling down non-informative channels (in your case all chanels except Red are just empty). Refer to the image:

What's the fastest way to increase color image contrast with OpenCV in python (cv2)?

I'm using OpenCV to process some images, and one of the first steps I need to perform is increasing the image contrast on a color image. The fastest method I've found so far uses this code (where np is the numpy import) to multiply and add as suggested in the original C-based cv1 docs:
if (self.array_alpha is None):
self.array_alpha = np.array([1.25])
self.array_beta = np.array([-100.0])
# add a beta value to every pixel
cv2.add(new_img, self.array_beta, new_img)
# multiply every pixel value by alpha
cv2.multiply(new_img, self.array_alpha, new_img)
Is there a faster way to do this in Python? I've tried using numpy's scalar multiply instead, but the performance is actually worse. I also tried using cv2.convertScaleAbs (the OpenCV docs suggested using convertTo, but cv2 seems to lack an interface to this function) but again the performance was worse in testing.
Simple arithmetic in numpy arrays is the fastest, as Abid Rahaman K commented.
Use this image for example: http://i.imgur.com/Yjo276D.png
Here is a bit of image processing that resembles brightness/contrast manipulation:
'''
Simple and fast image transforms to mimic:
- brightness
- contrast
- erosion
- dilation
'''
import cv2
from pylab import array, plot, show, axis, arange, figure, uint8
# Image data
image = cv2.imread('imgur.png',0) # load as 1-channel 8bit grayscale
cv2.imshow('image',image)
maxIntensity = 255.0 # depends on dtype of image data
x = arange(maxIntensity)
# Parameters for manipulating image data
phi = 1
theta = 1
# Increase intensity such that
# dark pixels become much brighter,
# bright pixels become slightly bright
newImage0 = (maxIntensity/phi)*(image/(maxIntensity/theta))**0.5
newImage0 = array(newImage0,dtype=uint8)
cv2.imshow('newImage0',newImage0)
cv2.imwrite('newImage0.jpg',newImage0)
y = (maxIntensity/phi)*(x/(maxIntensity/theta))**0.5
# Decrease intensity such that
# dark pixels become much darker,
# bright pixels become slightly dark
newImage1 = (maxIntensity/phi)*(image/(maxIntensity/theta))**2
newImage1 = array(newImage1,dtype=uint8)
cv2.imshow('newImage1',newImage1)
z = (maxIntensity/phi)*(x/(maxIntensity/theta))**2
# Plot the figures
figure()
plot(x,y,'r-') # Increased brightness
plot(x,x,'k:') # Original image
plot(x,z, 'b-') # Decreased brightness
#axis('off')
axis('tight')
show()
# Close figure window and click on other window
# Then press any keyboard key to close all windows
closeWindow = -1
while closeWindow<0:
closeWindow = cv2.waitKey(1)
cv2.destroyAllWindows()
Original image in grayscale:
Brightened image that appears to be dilated:
Darkened image that appears to be eroded, sharpened, with better contrast:
How the pixel intensities are being transformed:
If you play with the values of phi and theta you can get really interesting outcomes. You can also implement this trick for multichannel image data.
--- EDIT ---
have a look at the concepts of 'levels' and 'curves' on this youtube video showing image editing in photoshop. The equation for linear transform creates the same amount i.e. 'level' of change on every pixel. If you write an equation which can discriminate between types of pixel (e.g. those which are already of a certain value) then you can change the pixels based on the 'curve' described by that equation.
Try this code:
import cv2
img = cv2.imread('sunset.jpg', 1)
cv2.imshow("Original image",img)
# CLAHE (Contrast Limited Adaptive Histogram Equalization)
clahe = cv2.createCLAHE(clipLimit=3., tileGridSize=(8,8))
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB) # convert from BGR to LAB color space
l, a, b = cv2.split(lab) # split on 3 different channels
l2 = clahe.apply(l) # apply CLAHE to the L-channel
lab = cv2.merge((l2,a,b)) # merge channels
img2 = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR) # convert from LAB to BGR
cv2.imshow('Increased contrast', img2)
#cv2.imwrite('sunset_modified.jpg', img2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Sunset before:
Sunset after increased contrast:
Use the cv2::addWeighted function. It will be faster than any of the other methods presented thus far. It's designed to work on two images:
dst = cv.addWeighted( src1, alpha, src2, beta, gamma[, dst[, dtype]] )
But if you use the same image twice AND you set beta to zero, you can get the effect you want
dst = cv.addWeighted( src1, alpha, src1, 0, gamma)
The big advantage to using this function is that you will not have to worry about what happens when values go below 0 or above 255. In numpy, you have to figure out how to do all of the clipping yourself. Using the OpenCV function, it does all of the clipping for you and it's fast.

Categories

Resources