Remove background (ghost photo) from an image with characters? - python

I am trying to extract text from some images, but they come with a faint background. I have tried adjusting contrast and brightness, and I have also tried thresholding techniques such as Otsu's method.
Do you have any suggestions on how to improve the extraction? Parts of my processing are below, along with the input and output. Any recommendation is welcome.
Input:
Output:
Processing:
enhancer = ImageEnhance.Brightness(img)
img = enhancer.enhance(1.62) # 1.8
enhancer2 = ImageEnhance.Contrast(img)
img = enhancer2.enhance(1.8) # 2
img = np.array(img)
thresh = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]

You should use adaptive thresholding. Instead of one global threshold, the algorithm computes a separate threshold for each pixel from the block of pre-defined size around it, so the threshold follows local changes in the background. In the following example, the threshold is a Gaussian-weighted sum of the pixel values in each block (pixels closer to the block center get more weight), minus a constant. Binarization is then carried out against this local value.
For the given image, I tried the following:
im = cv2.imread('text_block.jpg')
green_channel = im[:,:,1]
th = cv2.adaptiveThreshold(green_channel, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 27, 6)
You will have to tweak the parameters (here a block size of 27 and a constant of 6) to get a better result. Also try cv2.ADAPTIVE_THRESH_MEAN_C.

Related

How to detect parts of an image that are of certain brightness?

I need to find parts of an image that are of certain brightness (say between 120 and 135) and then set their brightness to 255 and the rest of the image to 0.
I have an image that I converted to grayscale and then blurred so as to reduce noise, and so far I only found this function:
threshold = cv2.threshold(imageGrayscaleBlurred, 120, 255, cv2.THRESH_BINARY)[1]
As far as I know, this function can only take a certain threshold and in my case it will set all parts of the image that are brighter than 120 to 255, and make the rest of the image black. This doesn't fit my goal because the output image looks pretty messy.
Is there a "fancier" way to achieve my goal without using the cv2.threshold() function twice with a different threshold?
Create a 1-channel mask of the same shape as imageGrayscaleBlurred, filled with zeros (black):
mask = np.zeros((imageGrayscaleBlurred.shape[0], imageGrayscaleBlurred.shape[1]), np.uint8)
Place the following condition that suits your requirement:
mask[(imageGrayscaleBlurred >= 120) & (imageGrayscaleBlurred <= 135)] = 255

Python filter to remove outliers in image

I am writing a CNN to classify pictures, and I have run into a problem with garbage pixels (see image). The resulting network reaches ~90% quality, and it seems this could be improved by averaging out these pixels.
Is there a ready-made algorithm in numpy, OpenCV, etc. that does this? Not general smoothing, but something aimed specifically at these pixels. Or do I have to do it manually?
I agree that if you are using a CNN for some kind of classification, you should train the network to handle this kind of noisy image. Maybe augment your dataset with some salt-and-pepper noise. Anyway, here's a possible solution for filtering out the outliers. It builds on the idea proposed by fmw42. These are the steps:
1. Apply a median blur with a large kernel.
2. Convert the original (unprocessed) image to grayscale.
3. Inverse-threshold the grayscale image with a low threshold value (e.g., 5) to create a mask for the outliers close to 0.
4. Threshold the grayscale image with a high threshold value (e.g., 250) to create a mask for the outliers close to 255.
5. Combine both masks to create the outlier mask.
6. Use the outlier mask to adaptive-filter the original input image, substituting the median values where necessary.
Let's see the code:
# Imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//noisyNumbers//"
fileName = "noisy01.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Apply median filter:
filteredImage = cv2.medianBlur(inputImage, ksize=11)
# Convert input image to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
The median filtering with a kernel size of 11 looks like this:
The outliers are practically gone. Now, let's put this aside for the moment and compute a pair of binary masks for both outliers:
# Get low mask:
_, lowMask = cv2.threshold(grayscaleImage, 5, 255, cv2.THRESH_BINARY_INV)
# Get high mask:
_, highMask = cv2.threshold(grayscaleImage, 250, 255, cv2.THRESH_BINARY)
# Create outliers mask:
outliersMask = cv2.add(lowMask, highMask)
The outliers mask is this:
Now, you don't actually provide your original data; you provide an image most likely plotted using matplotlib. That's a problem, because the image you posted has been processed and compressed, which leaves some sharp edges around the outliers in the original image. One straightforward solution is to dilate the outliers mask a little to cover these compression artifacts:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 1
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Apply dilation:
outliersMask = cv2.dilate(outliersMask, maxKernel, iterations=opIterations)
The outliers mask now looks like this:
Ok, let's adaptive-filter the original input using the median blurred image and the outliers mask. Just make sure to reshape all the numpy arrays to their proper size for broadcasting:
# Re-shape the binary mask to a 3-channeled image:
augmentedBinary = cv2.merge([outliersMask, outliersMask, outliersMask])
# Apply the adaptive filter:
cleanedImage = np.where(augmentedBinary == (255, 255, 255), filteredImage, inputImage)
# Show the result
cv2.imshow("Adaptive Filtering", cleanedImage)
cv2.waitKey(0)
For the first image, this is the result:
More results:
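The answer also suggests augmenting the training set with salt-and-pepper noise so the CNN learns to tolerate such pixels. A minimal numpy sketch of that idea follows; the function name add_salt_and_pepper and the amount parameter are hypothetical, not taken from the original post.

```python
import numpy as np

def add_salt_and_pepper(image, amount=0.02, rng=None):
    """Return a copy of `image` with roughly `amount` of its pixels set to 0 or 255."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    # One random draw per pixel location, shared by all channels.
    draw = rng.random(image.shape[:2])
    noisy[draw < amount / 2] = 0          # pepper
    noisy[draw > 1 - amount / 2] = 255    # salt
    return noisy

# Usage on a flat gray stand-in image, with a fixed seed for repeatability.
clean = np.full((64, 64, 3), 128, dtype=np.uint8)
noisy = add_salt_and_pepper(clean, amount=0.05, rng=np.random.default_rng(0))
print("changed pixels:", int(np.any(noisy != clean, axis=2).sum()))
```

Applying this on the fly during training, with a small random amount per sample, is one way to use it; whether it helps depends on how closely the synthetic noise matches the real garbage pixels.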

Local Contrast Enhancement for Digit Recognition with cv2 / pytesseract

I want to use pytesseract to read digits from images. The images look as follows:
The digits are dotted, and to use pytesseract I need black, connected digits on a white background. To get there, I thought about using erode and dilate as preprocessing steps. As you can see, the images are similar, yet quite different in certain aspects. For example, the dots in the first image are darker than the background, while the dots in the second are lighter. That means that in the first image I can use erode to get black connected lines, and in the second image I can use dilate to get white connected lines and then invert the colors. This leads to the following results:
Using an appropriate threshold, the first image can easily be read with pytesseract. The second image, however, is trickier. The problem is that, for example, parts of the "4" are darker than the background around the "3", so a simple threshold is not going to work. I need something like a local threshold or local contrast enhancement. Does anybody have an idea here?
Edit:
OTSU, mean threshold and gaussian threshold lead to the following results:
Your images are pretty low-res, but you can try a method called gain division. The idea is to build a model of the background and then weight each input pixel by that model. The output gain should be relatively constant across most of the image.
After gain division is performed, you can try to improve the image by applying an area filter and morphology. I only tried your first image, because it is the "least worst".
These are the steps to get the gain-divided image:
1. Apply a soft median blur filter to get rid of high-frequency noise.
2. Get the model of the background via local maximum: apply a very strong close operation with a big structuring element (I'm using a rectangular kernel of size 15).
3. Perform gain adjustment by dividing 255 by each local-maximum pixel, then multiply each input image pixel by this value.
4. You should get a nice image where the background illumination is pretty much normalized. Threshold this image to get a binary mask of the characters.
Now, you can improve the quality of the image with the following additional steps:
1. Threshold via Otsu, but add a little bit of bias. (This, unfortunately, is a manual step that depends on the input.)
2. Apply an area filter to filter out the smaller blobs of noise.
Let's see the code:
import numpy as np
import cv2
# image path
path = "C:/opencvImages/"
fileName = "iA904.png"
# Reading an image in default mode:
inputImage = cv2.imread(path+fileName)
# Remove small noise via median:
filterSize = 5
imageMedian = cv2.medianBlur(inputImage, filterSize)
# Get local maximum:
kernelSize = 15
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
localMax = cv2.morphologyEx(imageMedian, cv2.MORPH_CLOSE, maxKernel, None, None, 1, cv2.BORDER_REFLECT101)
# Perform gain division
gainDivision = np.where(localMax == 0, 0, (inputImage/localMax))
# Clip the values to [0,255]
gainDivision = np.clip((255 * gainDivision), 0, 255)
# Convert the mat type from float to uint8:
gainDivision = gainDivision.astype("uint8")
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(gainDivision, cv2.COLOR_BGR2GRAY)
This is what gain division gets you:
Note that the lighting is more balanced. Now, let's apply a little bit of contrast enhancement:
# Contrast Enhancement:
grayscaleImage = np.uint8(cv2.normalize(grayscaleImage, grayscaleImage, 0, 255, cv2.NORM_MINMAX))
You get this, which creates a little bit more contrast between the foreground and the background:
Now, let's try to threshold this image to get a nice, binary mask. As I suggested, try Otsu's thresholding but add (or subtract) a little bit of bias to the result. This step, as mentioned, is dependent on the quality of your input:
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
threshValue = 0.9 * threshValue
_, binaryImage = cv2.threshold(grayscaleImage, threshValue, 255, cv2.THRESH_BINARY)
You end up with this binary mask:
Invert this and filter out the small blobs. I set an area threshold value of 10 pixels:
# Invert image:
binaryImage = 255 - binaryImage
# Perform an area filter on the binary blobs:
componentsNumber, labeledImage, componentStats, componentCentroids = \
cv2.connectedComponentsWithStats(binaryImage, connectivity=4)
# Set the minimum pixels for the area filter:
minArea = 10
# Get the indices/labels of the remaining components based on the area stat
# (skip the background component at index 0)
remainingComponentLabels = [i for i in range(1, componentsNumber) if componentStats[i][4] >= minArea]
# Filter the labeled pixels based on the remaining labels,
# assign pixel intensity to 255 (uint8) for the remaining pixels
filteredImage = np.where(np.isin(labeledImage, remainingComponentLabels) == True, 255, 0).astype("uint8")
And this is the final binary mask:
If you plan on sending this image to an OCR, you might want to apply some morphology first. Maybe a closing to try and join the dots that make up the characters. Also be sure to train your OCR classifier with a font that is close to what you are actually trying to recognize. This is the (inverted) mask after a size 3 rectangular closing operation with 3 iterations:
Edit:
To get the last image, process the filtered output as follows:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 3
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
closingImage = cv2.morphologyEx(filteredImage, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
# Invert image to obtain black numbers on white background:
closingImage = 255 - closingImage
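A small numeric illustration of why gain division balances the lighting may help. The scanline values below are made up, and the background is taken as known rather than estimated with a morphological closing as in the code above.

```python
import numpy as np

# Hypothetical 1-D scanline: the background brightens from 100 to 200,
# and two "ink" pixels are 40% darker than their local background.
background = np.linspace(100, 200, 11)
scanline = background.copy()
scanline[[2, 8]] *= 0.6

# Gain division: multiply each pixel by 255 / (background estimate).
# Assumption: the exact background is used here as the estimate.
gain = 255.0 / background
normalized = np.clip(scanline * gain, 0, 255)

print(np.round(scanline).astype(int))    # ink values differ with the lighting
print(np.round(normalized).astype(int))  # ink values are now the same
```

Before the division, the two ink pixels have different raw values because the lighting differs. After it, every background pixel maps to 255 and both ink pixels map to the same level, so a single global threshold can separate them.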

How to remove the background of a noisy image and extract transparent objects?

I have an image processing problem that I can't solve. I have a set of 375 images like the one below (1). I'm trying to remove the background, i.e. do "background subtraction" (or "foreground extraction"), and get only the waste on a plain background (black/white/...).
(1) Image example
I tried many things, including createBackgroundSubtractorMOG2 from OpenCV, or threshold. I also tried to remove the background pixel by pixel by subtracting it from the foreground because I have a set of 237 background images (2) (the carpet without the waste, but which is a little bit offset from the image with the objects). There are also variations in brightness on the background images.
(2) Example of a background image
Here is a code example that I was able to test and that gives me the results below (3) and (4). I use Python 3.8.3.
# Function to remove the sides of the images
def delete_side(img, x_left, x_right):
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if j<=x_left or j>=x_right:
                img[i,j] = (0,0,0)
    return img
# Initialize the background model
backSub = cv2.createBackgroundSubtractorMOG2(history=250, varThreshold=2, detectShadows=True)
# Read the frames and update the background model
for frame in frames:
    if frame.endswith(".png"):
        filepath = FRAMES_FOLDER + '/' + frame
        img = cv2.imread(filepath)
        img_cut = delete_side(img, x_left=190, x_right=1280)
        gray = cv2.cvtColor(img_cut, cv2.COLOR_BGR2GRAY)
        mask = backSub.apply(gray)
        newimage = cv2.bitwise_or(img, img, mask=mask)
        img_blurred = cv2.GaussianBlur(newimage, (5, 5), 0)
        gray2 = cv2.cvtColor(img_blurred, cv2.COLOR_BGR2GRAY)
        _, binary = cv2.threshold(gray2, 10, 255, cv2.THRESH_BINARY)
        final = cv2.bitwise_or(img, img, mask=binary)
        newpath = RESULT_FOLDER + '/' + frame
        cv2.imwrite(newpath, final)
I was inspired by many other cases found on Stack Overflow and elsewhere (example: removing pixels less than n size(noise) in an image - open CV python).
(3) The result obtained with the code above
(4) Result when increasing the varThreshold argument to 10
Unfortunately, there is still a lot of noise on the resulting pictures.
As a beginner in background subtraction, I don't have all the keys to find an optimal solution. If someone has an idea for doing this task in a more efficient and cleaner way, I'm interested :) (Is there a special method to handle transparent objects? Can noise on objects be eliminated more effectively? etc.)
Thanks
Thanks for your answers. For information, I simply changed methodology and now use a segmentation model (U-Net) with 2 labels (foreground, background) to identify the background. It works quite well.
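Independently of which segmentation approach is used, the per-pixel loop in delete_side from the question can be replaced by two slice assignments, which is much faster on large frames. This is a sketch that assumes 0 <= x_left < x_right and, like the original, modifies the image in place.

```python
import numpy as np

def delete_side(img, x_left, x_right):
    """Black out columns j <= x_left and j >= x_right (in place), like the loop version."""
    # Assumption: 0 <= x_left < x_right, as in the question's call (190, 1280).
    img[:, :x_left + 1] = 0
    img[:, x_right:] = 0
    return img

# Usage on a small white stand-in frame.
frame = np.full((4, 10, 3), 255, dtype=np.uint8)
trimmed = delete_side(frame.copy(), x_left=2, x_right=7)
print(trimmed[0, :, 0])
```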

OpenCV (Python): Construct Rectangle from thresholded image

The image below shows an aerial photo of a house block (re-oriented with the longest side vertical), and the same image subjected to Adaptive Thresholding and Difference of Gaussians.
Images: Base; Adaptive Thresholding; Difference of Gaussians
The roof-print of the house is obvious (to the human eye) on the AdThresh image: it's a matter of connecting some obvious dots. In the sample image, the goal is to find the blue-bounded box below:
Image with desired rectangle marked in blue
I've had a crack at implementing HoughLinesP() and findContours(), but get nothing sensible (probably because there's some nuance that I'm missing). The python script-chunk that fails to find anything remotely like the blue box, is as follows:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# read in full (RGBA) image - to get alpha layer to use as mask
img = cv2.imread('rotated_12.png', cv2.IMREAD_UNCHANGED)
grey = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Otsu's thresholding after Gaussian filtering
blur_base = cv2.GaussianBlur(grey,(9,9),0)
blur_diff = cv2.GaussianBlur(grey,(15,15),0)
_,thresh1 = cv2.threshold(grey,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
thresh = cv2.adaptiveThreshold(grey,255,cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY,11,2)
DoG_01 = blur_base - blur_diff
edges_blur = cv2.Canny(blur_base,70,210)
# Find Contours
(ed, cnts,h) = cv2.findContours(grey, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = sorted(cnts, key = cv2.contourArea, reverse = True)[:4]
for c in cnts:
    approx = cv2.approxPolyDP(c, 0.1*cv2.arcLength(c, True), True)
    cv2.drawContours(grey, [approx], -1, (0, 255, 0), 1)
# Hough Lines
minLineLength = 30
maxLineGap = 5
lines = cv2.HoughLinesP(edges_blur,1,np.pi/180,20,minLineLength,maxLineGap)
print "lines found:", len(lines)
for line in lines:
    cv2.line(grey,(line[0][0], line[0][1]),(line[0][2],line[0][3]),(255,0,0),2)
# plot all the images
images = [img, thresh, DoG_01]
titles = ['Base','AdThresh','DoG01']
for i in xrange(len(images)):
    plt.subplot(1,len(images),i+1),plt.imshow(images[i],'gray')
    plt.title(titles[i]), plt.xticks([]), plt.yticks([])
plt.savefig('a_edgedetect_12.png')
cv2.destroyAllWindows()
I am trying to set things up without excessive parameterisation. I'm wary of 'tailoring' an algorithm for just this one image since this process will be run on hundreds of thousands of images (with roofs/rooves of different colours which may be less distinguishable from background). That said, I would love to see a solution that 'hit' the blue-box target - that way I could at the very least work out what I've done wrong.
If anyone has a quick-and-dirty way to do this sort of thing, it would be awesome to get a Python code snippet to work with.
The 'base' image ->
Base Image
You should apply the following:
1. Apply Contrast Limited Adaptive Histogram Equalization (CLAHE) and convert to grayscale.
2. Apply Gaussian blur and morphological transforms (dilation, erosion, etc.) as mentioned by @bad_keypoints. This will help you get rid of the background noise. This is the trickiest step, as the results depend on the order in which you apply the operations (Gaussian blur first and then morphological transforms, or vice versa) and on the window sizes you choose.
3. Apply adaptive thresholding.
4. Apply Canny edge detection.
5. Find the contour that has four corner points.
As said earlier, you need to tweak the input parameters of these functions and validate them against other images, since settings that work for this case may not work for others. Fix the parameter values by trial and error.
