Converting an OpenCV Image to Black and White - python

How do you convert a grayscale OpenCV image to black and white? I see a similar question has already been asked, but I'm using OpenCV 2.3, and the proposed solution no longer seems to work.
I'm trying to convert a greyscale image to black and white, so that anything not absolutely black is white, and use this as a mask for surf.detect(), in order to ignore keypoints found on the edge of the black mask area.
The following Python gets me almost there, but the threshold value sent to Threshold() doesn't appear to have any effect. If I set it to 0 or 16 or 128 or 255, the result is the same, with all pixels with a value > 128 becoming white, and everything else becoming black.
What am I doing wrong?
import cv, cv2
import numpy as np
fn = 'myfile.jpg'
im_gray = cv2.imread(fn, cv.CV_LOAD_IMAGE_GRAYSCALE)
im_gray_mat = cv.fromarray(im_gray)
im_bw = cv.CreateImage(cv.GetSize(im_gray_mat), cv.IPL_DEPTH_8U, 1);
im_bw_mat = cv.GetMat(im_bw)
threshold = 0 # 128#255# HAS NO EFFECT!?!?
cv.Threshold(im_gray_mat, im_bw_mat, threshold, 255, cv.CV_THRESH_BINARY | cv.CV_THRESH_OTSU);
cv2.imshow('', np.asarray(im_bw_mat))
cv2.waitKey()

Step-by-step answer similar to the one you refer to, using the new cv2 Python bindings:
1. Read a grayscale image
import cv2
im_gray = cv2.imread('grayscale_image.png', cv2.IMREAD_GRAYSCALE)
2. Convert grayscale image to binary
(thresh, im_bw) = cv2.threshold(im_gray, 128, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
which determines the threshold automatically from the image using Otsu's method. Or, if you already know the threshold, you can use:
thresh = 127
im_bw = cv2.threshold(im_gray, thresh, 255, cv2.THRESH_BINARY)[1]
3. Save to disk
cv2.imwrite('bw_image.png', im_bw)

Specifying CV_THRESH_OTSU causes the threshold value to be ignored. From the documentation:
Also, the special value THRESH_OTSU may be combined with one of the above values. In this case, the function determines the optimal threshold value using the Otsu’s algorithm and uses it instead of the specified thresh. The function returns the computed threshold value. Currently, the Otsu’s method is implemented only for 8-bit images.
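To see the effect, here is a minimal sketch (assuming an 8-bit grayscale file named myfile.jpg, as in the question); whichever value you pass as thresh, the returned value is the Otsu-computed threshold:
import cv2

# Load as 8-bit grayscale; OpenCV's Otsu is implemented for 8-bit images only.
im_gray = cv2.imread('myfile.jpg', cv2.IMREAD_GRAYSCALE)

# The thresh argument (0 here) is ignored when THRESH_OTSU is set;
# the computed optimal threshold is returned instead.
(otsu_thresh, im_bw) = cv2.threshold(im_gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
print(otsu_thresh)  # same value no matter what thresh was passed above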
This code reads frames from the camera and performs the binary threshold at the value 20.
#include "opencv2/core/core.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/highgui/highgui.hpp"
#include <string>

using namespace cv;

int main(int argc, const char * argv[]) {
    VideoCapture cap;
    if (argc > 1)
        cap.open(std::string(argv[1]));
    else
        cap.open(0);
    Mat frame;
    namedWindow("video", 1);
    for (;;) {
        cap >> frame;
        if (!frame.data)
            break;
        // Convert to grayscale, then binarize at a fixed threshold of 20.
        cvtColor(frame, frame, CV_BGR2GRAY);
        threshold(frame, frame, 20, 255, THRESH_BINARY);
        imshow("video", frame);
        if (waitKey(30) >= 0)
            break;
    }
    return 0;
}

For those doing video, I cobbled together the following based on @tsh's answer:
import cv2 as cv

def nothing(x):
    pass

cap = cv.VideoCapture(0)
cv.namedWindow('videoUI', cv.WINDOW_NORMAL)
cv.createTrackbar('T', 'videoUI', 0, 255, nothing)

while True:
    ret, frame = cap.read()
    vid_gray = cv.cvtColor(frame, cv.COLOR_BGR2GRAY)
    # Read the current trackbar position and use it as the threshold.
    thresh = cv.getTrackbarPos('T', 'videoUI')
    vid_bw = cv.threshold(vid_gray, thresh, 255, cv.THRESH_BINARY)[1]
    cv.imshow('videoUI', cv.flip(vid_bw, 1))
    if cv.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv.destroyAllWindows()
Results in:

Approach 1
While converting a gray scale image to a binary image, we usually use cv2.threshold() and set a threshold value manually. Sometimes to get a decent result we opt for Otsu's binarization.
I have a small hack I came across while reading some blog posts.
Convert your color (RGB) image to gray scale.
Obtain the median of the gray scale image.
Choose a threshold value 33% above the median (a sketch follows below).
Why 33%?
This is because 33% works for most images/data sets.
You can also work out the same approach by replacing median with the mean.
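A minimal sketch of Approach 1 (the file name image.jpg is illustrative, and the 33% offset is the heuristic described above):
import cv2
import numpy as np

# 1. Convert the color image to grayscale.
img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# 2. Obtain the median of the grayscale image.
median = np.median(gray)

# 3. Threshold 33% above the median, clamped to the valid 8-bit range.
thresh_val = min(255, int(1.33 * median))
im_bw = cv2.threshold(gray, thresh_val, 255, cv2.THRESH_BINARY)[1]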
Approach 2
Another approach would be to take x standard deviations (std) from the mean, on either the positive or the negative side, and set the threshold there. So it could be one of the following:
th1 = mean - (x * std)
th2 = mean + (x * std)
Note: Before applying threshold it is advisable to enhance the contrast of the gray scale image locally (See CLAHE).
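A minimal sketch of Approach 2 along the same lines (x and the CLAHE parameters here are illustrative choices, not prescribed values):
import cv2

gray = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)

# Optional: enhance local contrast first, as the note above suggests (CLAHE).
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
gray = clahe.apply(gray)

# Threshold x standard deviations above (or below) the mean.
x = 1.0
th = min(255, max(0, gray.mean() + x * gray.std()))  # or mean - x * std
im_bw = cv2.threshold(gray, th, 255, cv2.THRESH_BINARY)[1]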

You can simply write the following code snippet to convert an OpenCV image into a grayscale image:
import cv2
image = cv2.imread('image.jpg', 0)
cv2.imshow('grey scale image', image)
cv2.waitKey(0)
Note that image.jpg and the code must be saved in the same folder.
Note that:
cv2.imread('image.jpg') gives a BGR color image (OpenCV loads channels in BGR order).
cv2.imread('image.jpg', 0) gives a grayscale image.
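For readability, the numeric flags also have named equivalents in the cv2 bindings:
import cv2

img_color = cv2.imread('image.jpg', cv2.IMREAD_COLOR)      # flag 1, the default
img_gray = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)   # flag 0
img_asis = cv2.imread('image.jpg', cv2.IMREAD_UNCHANGED)   # flag -1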

Pay attention: if you use cv.CV_THRESH_BINARY, every pixel greater than the threshold becomes maxValue (in your case 255); otherwise the value is 0. So if your threshold is 0, everything except pure black becomes white (maxValue = 255), and if it is 255, everything becomes black (i.e. 0).
If you don't want to work out a threshold, you can use Otsu's method. But in OpenCV's implementation this algorithm only works with 8-bit images. If your image is 8-bit, use the algorithm like this:
cv.Threshold(im_gray_mat, im_bw_mat, threshold, 255, cv.CV_THRESH_BINARY | cv.CV_THRESH_OTSU);
The value of threshold then does not matter, as long as you have an 8-bit image.
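If your image is not 8-bit, one possible workaround (a sketch, assuming a 16-bit grayscale file named image16.png) is to rescale it to 8 bits before applying Otsu:
import cv2
import numpy as np

im16 = cv2.imread('image16.png', cv2.IMREAD_UNCHANGED)  # e.g. uint16 data

# Rescale the full range to [0, 255] and cast to uint8 so Otsu applies.
im8 = cv2.normalize(im16, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
otsu_thresh, im_bw = cv2.threshold(im8, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)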

Here's a two-line snippet I found online that might be helpful for a beginner; it converts a 32/64-bit float image to 8-bit:
import numpy as np
# Take the absolute value of the 32/64-bit float image...
abs_image_in32_64 = np.absolute(image_in32_64)
# ...then cast to uint8 (note that values above 255 wrap around rather than clip).
image_8U = np.uint8(abs_image_in32_64)
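For comparison, OpenCV's cv2.convertScaleAbs does the same absolute-value-plus-8-bit conversion in a single call, with one difference worth noting: it saturates at 255, whereas np.uint8() wraps around modulo 256:
import cv2
import numpy as np

image_in32_64 = np.array([[100.5, -300.2], [20.0, 510.9]])  # example float data

# |x| followed by a saturating cast to uint8 (values above 255 clip to 255).
image_8U = cv2.convertScaleAbs(image_in32_64)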

Related

Removing pixels below a certain threshold

I have a grayscale image with something written in the front and something at the back. I'd like to filter out the back part of the letters and only have the front. It's only grayscale and not RGB, and I'd rather not have to calculate pixels manually.
Is there any library function I can use to do this? I'm new to python and at the moment, using PIL library, so that's my preference. But if there are other libraries, I'm open to that as well.
Here's the image:
Are you looking for a function that automatically strips the background for any given image, or just one that can filter out pixels that meet a certain criterion for this particular image?
The eval function applies the same transformation to every pixel in an image. This works for your image:
from PIL import Image

with Image.open("jFmbt.jpg") as im:
    im = im.convert("L")

# Push the background band (176-249) to white; leave other values unchanged.
out_image = Image.eval(im, lambda x: 255 if 175 < x < 250 else x)
A very commonly used library for that would be OpenCV.
import cv2 as cv
# 0 flag -> read image as greyscale
img = cv.imread("img.jpg", 0)
# threshold
ret, thresh = cv.threshold(img, 150, 255, cv.THRESH_BINARY)
# result
cv.imwrite("output.jpg", thresh)
The resulting image would be:

Remove background (ghost photo) from an image with characters?

I am trying to do text extraction from some images; however, these come with a bit of background. I have tried to "play" with contrast and brightness, as well as applying thresholding techniques like Otsu's.
Do you have any suggestions on how to improve the extraction? I leave below some parts of the processing, as well as the input and output, any recommendation will be welcome.
Input:
Output:
Processing:
from PIL import ImageEnhance
import numpy as np
import cv2

enhancer = ImageEnhance.Brightness(img)
img = enhancer.enhance(1.62) # 1.8
enhancer2 = ImageEnhance.Contrast(img)
img = enhancer2.enhance(1.8) # 2
img = np.array(img)
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
You should perform adaptive thresholding. The algorithm divides the image into blocks of a pre-defined size, and every block is given a different threshold value based on the pixel intensities within that block. In the following example, each block's threshold is derived from a Gaussian-weighted sum of the pixel values in the block (pixels nearer the block center carry more weight), and binarization is carried out against that value. Check this page for more.
For the given image, I tried the following:
import cv2

im = cv2.imread('text_block.jpg')
# Work on the green channel only.
green_channel = im[:,:,1]
th = cv2.adaptiveThreshold(green_channel, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 27, 6)
You will have to tweak the parameters to get a better result. Also try cv2.ADAPTIVE_THRESH_MEAN_C, shown below.
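The mean variant only changes the adaptive-method argument (keeping the same illustrative blockSize of 27 and constant of 6 as above):
import cv2

im = cv2.imread('text_block.jpg')
green_channel = im[:,:,1]
# Each block's threshold is the plain neighbourhood mean minus the constant 6.
th_mean = cv2.adaptiveThreshold(green_channel, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 27, 6)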

Local Contrast Enhancement for Digit Recognition with cv2 / pytesseract

I want to use pytesseract to read digits from images. The images look as follows:
The digits are dotted and in order to be able to use pytesseract, I need black connected digits on a white background. To do so, I thought about using erode and dilate as preprocessing techniques. As you can see, the images are similar, yet quite different in certain aspects. For example, the dots in the first image are darker than the background, while the dots in the second are whiter. That means, in the first image I can use erode to get black connected lines and in the second image I can use dilate to get white connected lines and then inverse the colors. This leads to the following results:
Using an appropriate threshold, the first image can easily be read with pytesseract. The second image, however, is trickier. The problem is that, for example, parts of the "4" are darker than the background around the "3". So a simple threshold is not going to work. I need something like a local threshold or local contrast enhancement. Does anybody have an idea?
Edit:
OTSU, mean threshold and gaussian threshold lead to the following results:
Your images are pretty low-res, but you can try a method called gain division. The idea is to build a model of the background and then weight each input pixel by that model. The output should be relatively constant across most of the image.
After gain division is performed, you can try to improve the image by applying an area filter and morphology. I only tried your first image, because it is the "least worst".
These are the steps to get the gain-divided image:
Apply a soft median blur filter to get rid of high frequency noise.
Get the model of the background via local maximum. Apply a very strong close operation, with a big structuring element (I’m using a rectangular kernel of size 15).
Perform the gain adjustment by dividing 255 by each local-maximum pixel and weighting each input pixel by that value.
You should get a nice image where the background illumination is pretty much normalized; threshold this image to get a binary mask of the characters.
Now, you can improve the quality of the image with the following, additional steps:
Threshold via Otsu, but add a little bit of bias. (This, unfortunately, is a manual step depending on the input).
Apply an area filter to filter out the smaller blobs of noise.
Let's see the code:
import numpy as np
import cv2
# image path
path = "C:/opencvImages/"
fileName = "iA904.png"
# Reading an image in default mode:
inputImage = cv2.imread(path+fileName)
# Remove small noise via median:
filterSize = 5
imageMedian = cv2.medianBlur(inputImage, filterSize)
# Get local maximum:
kernelSize = 15
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(imageMedian, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage/localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
This is what gain division gets you:
Note that the lighting is more balanced. Now, let's apply a little bit of contrast enhancement:
# Contrast Enhancement:
grayscaleImage = np.uint8(cv2.normalize(grayscaleImage, grayscaleImage, 0, 255, cv2.NORM_MINMAX))
You get this, which creates a little bit more contrast between the foreground and the background:
Now, let's try to threshold this image to get a nice, binary mask. As I suggested, try Otsu's thresholding but add (or subtract) a little bit of bias to the result. This step, as mentioned, is dependent on the quality of your input:
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
threshValue = 0.9 * threshValue
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)
You end up with this binary mask:
Invert this and filter out the small blobs. I set an area threshold value of 10 pixels:
# Invert image:
binaryImage = 255 - binaryImage
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 10
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels), 255, 0).astype("uint8")
And this is the final binary mask:
If you plan on sending this image to an OCR, you might want to apply some morphology first. Maybe a closing to try and join the dots that make up the characters. Also be sure to train your OCR classifier with a font that is close to what you are actually trying to recognize. This is the (inverted) mask after a size 3 rectangular closing operation with 3 iterations:
Edit:
To get the last image, process the filtered output as follows:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 3
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
closingImage = cv2.morphologyEx(filteredImage, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
# Invert image to obtain black numbers on white background:
closingImage = 255 - closingImage

How to group and highlight group of pixels in an image using OpenCV? [closed]

During error level analysis on an image, I want to highlight the pixel changes using OpenCV (with just a single image, not the difference). I know the pixel-level values for the output image, but I am not sure how to group them together and assign a shape to them (example below, where the pixel change is marked with a shape). I want to know whether I can detect the circle of lighter pixels, group them, and add a grouped shape around those pixels.
Input Image:
Result Image:
If I understand correctly, you want to highlight the differences between the input and output images in a new image. To do this, you can take a quantitative approach to determine the exact discrepancies between images using the Structural Similarity Index (SSIM) which was introduced in Image Quality Assessment: From Error Visibility to Structural Similarity. This method is already implemented in the scikit-image library for image processing. You can install scikit-image with pip install scikit-image.
The skimage.measure.compare_ssim() function returns a score and a diff image. The score represents the structural similarity index between the two input images and falls in the range [-1,1], with values closer to one representing higher similarity. But since you're only interested in where the two images differ, the diff image is what we'll focus on. Specifically, the diff image contains the actual image differences, with darker regions having more disparity. Larger areas of disparity are highlighted in black while smaller differences are in gray. Here's the diff image:
If you look closely, there are gray noisy areas probably due to .jpg lossy compression. So to obtain a cleaner result, we perform morphological operations to smooth the image. We would obtain a cleaner result if the images used a lossless image compression format such as .png. After cleaning up the image, we highlight the differences in green
from skimage.measure import compare_ssim
import numpy as np
import cv2
# Load images and convert to grayscale
image1 = cv2.imread('1.jpg')
image2 = cv2.imread('2.jpg')
image1_gray = cv2.cvtColor(image1, cv2.COLOR_BGR2GRAY)
image2_gray = cv2.cvtColor(image2, cv2.COLOR_BGR2GRAY)
# Compute SSIM between two images
(score, diff) = compare_ssim(image1_gray, image2_gray, full=True)
# The diff image contains the actual image differences between the two images
# and is represented as a floating point data type in the range [0,1]
# so we must convert the array to 8-bit unsigned integers in the range
# [0,255] before we can use it with OpenCV
diff = 255 - (diff * 255).astype("uint8")
cv2.imwrite('original_diff.png',diff)
# Perform morphological operations
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
opening = cv2.morphologyEx(diff, cv2.MORPH_OPEN, kernel, iterations=1)
close = cv2.morphologyEx(opening, cv2.MORPH_CLOSE, kernel, iterations=1)
diff = cv2.merge([close,close,close])
# Color difference pixels
diff[np.where((diff > [10,10,50]).all(axis=2))] = [36,255,12]
cv2.imwrite('diff.png',diff)
I think the best way is to simply threshold your image and apply morphological transformations.
I have got the following results.
Threshold + morphological:
Select the largest component:
using this code:
#include <opencv2/opencv.hpp>
#include <vector>
using std::vector;

cv::Mat result;
cv::Mat img = cv::imread("fOTmh.jpg");
//-- gray & smooth image
cv::cvtColor(img, result, cv::COLOR_BGR2GRAY);
cv::blur(result, result, cv::Size(5,5));
//-- threshold at 30% of the image maximum and smooth again
double min, max;
cv::minMaxLoc(result, &min, &max);
cv::threshold(result, result, 0.3*max, 255, cv::THRESH_BINARY);
cv::medianBlur(result, result, 7);
//-- apply morphological transformations
cv::Mat se = getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(11, 11));
cv::morphologyEx(result, result, cv::MORPH_DILATE, se);
cv::morphologyEx(result, result, cv::MORPH_CLOSE, se);
//-- find the largest component
vector<vector<cv::Point> > contours;
vector<cv::Vec4i> hierarchy;
cv::findContours(result, contours, hierarchy, cv::RETR_LIST, cv::CHAIN_APPROX_NONE);
vector<cv::Point> *l = nullptr;
for (auto &&c : contours) {
    if (l == nullptr || l->size() < c.size())
        l = &c;
}
//-- expand and draw a Rect around the largest component
cv::Rect r = boundingRect(*l);
r.x -= 10;
r.y -= 10;
r.width += 20;
r.height += 20;
cv::rectangle(img, r, cv::Scalar::all(255), 3);
//-- result
cv::resize(img, img, cv::Size(), 0.25, 0.25);
cv::imshow("result", img);
cv::waitKey(0);
Python code:
import cv2 as cv

img = cv.imread("ELA_Final.jpg")
result = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
result = cv.blur(result, (5, 5))
# Threshold at 30% of the maximum value, then smooth again.
minVal, maxVal, minLoc, maxLoc = cv.minMaxLoc(result)
ret, result = cv.threshold(result, 0.3 * maxVal, 255, cv.THRESH_BINARY)
result = cv.medianBlur(result, 7)
# Apply morphological transformations.
se = cv.getStructuringElement(cv.MORPH_ELLIPSE, (11, 11))
result = cv.morphologyEx(result, cv.MORPH_DILATE, se)
result = cv.morphologyEx(result, cv.MORPH_CLOSE, se)
# Find the largest contour (OpenCV 3.x findContours returns three values).
_, contours, hierarchy = cv.findContours(result, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)
x = []
for eachContour in contours:
    x.append(len(eachContour))
m = max(x)
p = [i for i, j in enumerate(x) if j == m]
# Expand and draw a rectangle around the largest contour.
color = (255, 0, 0)
x, y, w, h = cv.boundingRect(contours[p[0]])
x -= 10
y -= 10
w += 20
h += 20
cv.rectangle(img, (x, y), (x + w, y + h), color, 3)
img = cv.resize(img, (1500, 700), interpolation=cv.INTER_AREA)
cv.imshow("result", img)
cv.waitKey(0)

Python: How to implement Binary Filter on RGB image? (algorithm)

I'm trying to implement a binary image filter (to get a monochrome binary image) using Python & PyQt5, and to retrieve the new pixel colors I use the following method:
def _new_pixel_colors(self, x, y):
    color = QColor(self.pixmap.pixel(x, y))
    result = qRgb(0, 0, 0) if all(c < 127 for c in color.getRgb()[:3]) else qRgb(255, 255, 255)
    return result
Could this be a correct sample of a binary filter for an RGB image? I mean, is that a sufficient condition to check whether the pixel is brighter or darker than the gray color (127,127,127)?
And please, do not provide any solutions with opencv, pillow, etc. I'm only asking about the algorithm itself.
I would at least compare against intensity i=R+G+B ...
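That comparison could look like this (a sketch; 384 is 3 * 128, matching the dithering threshold below):
def binarize_pixel(r, g, b):
    # Compare the summed intensity against half of the 765 maximum,
    # rather than requiring every channel to clear 127 on its own.
    return 1 if (r + g + b) >= 384 else 0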
For ROI-like masks you can use any thresholding technique (adaptive thresholding is the best), but if your resulting image is not a ROI mask and should resemble the visual features of the original image, then the best conversion I know of is dithering.
The idea behind BW dithering is to convert gray scales into BW patterns while preserving the shading. The result is often noisy, but it preserves much, much more visual detail. Here is simple, naive C++ dithering (sorry, not a Python coder):
picture pic0, pic1;
// pic0 - source img
// pic1 - output img
int x, y, i;
color c;
// resize output to source image size, clear with black
pic1 = pic0; pic1.clear(0);
// dithering
i = 0;
for (y = 0; y < pic0.ys; y++)
    for (x = 0; x < pic0.xs; x++)
    {
        // get source pixel color (AARRGGBB)
        c = pic0.p[y][x];
        // add to leftovers
        i += WORD(c.db[picture::_r]); // _r,_g,_b are just constants 0,1,2
        i += WORD(c.db[picture::_g]);
        i += WORD(c.db[picture::_b]);
        // threshold: white intensity is 255+255+255 = 765
        if (i >= 384) { i -= 765; c.dd = 0x00FFFFFF; } else c.dd = 0;
        // copy to destination image
        pic1.p[y][x] = c;
    }
So it's the same as in the link above but using just black and white. i is the accumulated intensity to be placed on the image, xs,ys is the resolution, and c.db[] is the color channel access.
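Since the question asks about the algorithm itself, here is a rough Python transliteration of the same naive loop (a sketch; pic0/pic1 become plain numpy arrays and the thresholds mirror the C++ above):
import numpy as np

def dither_bw(src):
    # src: (ys, xs, 3) uint8 color array; returns a (ys, xs) array of 0/255.
    ys, xs = src.shape[:2]
    out = np.zeros((ys, xs), dtype=np.uint8)
    i = 0  # accumulated leftover intensity
    for y in range(ys):
        for x in range(xs):
            i += int(src[y, x, 0]) + int(src[y, x, 1]) + int(src[y, x, 2])
            # white intensity is 255+255+255 = 765; threshold at 384
            if i >= 384:
                i -= 765
                out[y, x] = 255
    return out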
If I apply this on colored image like this:
The result looks like this:
As you can see, all the details were preserved, but noisy patterns emerge... For printing purposes, the resolution of the image was sometimes multiplied to enhance the quality. If you replace the naive two nested for loops with a better pattern (like 16x16 squares etc.), the noise will be kept near its source, limiting artifacts. There are also approaches that use pseudo-random patterns (put the leftover i in a random location near its source pixel), which are even better...
But for BW dithering even the naive approach is enough, as the artifacts are just one pixel in size. For colored dithering the artifacts could create unwanted horizontal line patterns several pixels in size (it depends on the palette mismatch; the worse the palette, the bigger the artifacts...)
PS: just for comparison with the other answer's threshold outputs, this is the same image dithered:
Image thresholding is the class of algorithms you're looking for - a binary threshold would set pixels to 0 or 1, yes.
Depending on the desired output, consider converting your image first to another color space, in particular HSL, and thresholding on the luminance channel. Using (127, 127, 127) as a threshold does not take brightness into account uniformly, because each RGB channel encodes only the intensity of red, green, or blue; consider this image:
from PIL import Image
import colorsys

def threshold_pixel(r, g, b):
    h, l, s = colorsys.rgb_to_hls(r / 255., g / 255., b / 255.)
    return 1 if l > .36 else 0
    # return 1 if r > 127 and g > 127 and b > 127 else 0

def hlsify(img):
    pixels = img.load()
    width, height = img.size
    # Create a new blank monochrome image.
    output_img = Image.new('1', (width, height), 0)
    output_pixels = output_img.load()
    for i in range(width):
        for j in range(height):
            output_pixels[i, j] = threshold_pixel(*pixels[i, j])
    return output_img

binarified_img = hlsify(Image.open('./sample_img.jpg'))
binarified_img.show()
binarified_img.save('./out.jpg')
There is plenty of discussion on this topic on other StackExchange sites, e.g.
Binarize image data
How do you binarize a colored image?
how can I get good binary image using Otsu method for this image?
