I am trying to remove rules and a background smiley face from multiple notebook pages before performing text detection and recognition on the handwritten text.
An earlier thread offers helpful hints, but my problem is different in several respects.
The text to keep is written over the background items to be removed.
The items to be removed have distinct colors from that of the text, which may be the key to their removal.
The lines to be removed are not very straight, and the smiley face even less so.
I'm thinking of using OpenCV for this task, but I'm open to using ImageMagick or command-line GIMP so long as I can process the entire batch at once. Since I have never used any of these tools before, any advice would be welcome. Thank you.
Here's a simple approach with the assumption that the text is blue
Convert image to HSV format and color threshold with cv2.inRange()
Perform morphological transformations to smooth image
Isolate characters
Recolor characters for OCR/Tesseract
We begin by converting the image to HSV format and create a mask to isolate the characters
image = cv2.imread('1.png')
result = image.copy()
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([21,0,0])
upper = np.array([179, 255, 209])
mask = cv2.inRange(image, lower, upper)
Now we perform morphological transformations to remove small noise
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2,2))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=1)
We have the desired text outlines so we can isolate characters by masking with the original image
result[close==0] = (255,255,255)
Finally to prepare the image for OCR/Tesseract, we change the characters to black
retouch_mask = (result <= [250.,250.,250.]).all(axis=2)
result[retouch_mask] = [0,0,0]
Full code
import numpy as np
import cv2
image = cv2.imread('1.png')
result = image.copy()
image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
lower = np.array([21,0,0])
upper = np.array([179, 255, 209])
mask = cv2.inRange(image, lower, upper)
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (2,2))
close = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel, iterations=1)
result[close==0] = (255,255,255)
cv2.imshow('cleaned', result)
retouch_mask = (result <= [250.,250.,250.]).all(axis=2)
result[retouch_mask] = [0,0,0]
cv2.imshow('mask', mask)
cv2.imshow('close', close)
cv2.imshow('result', result)
cv2.waitKey()
Related
The objective is to remove large irregular area and maintained only character in the image.
For example, given the following
and the expected masked output
I have the impression this can be achieved as below
import cv2
import numpy as np
from matplotlib import pyplot as plt
dpath='remove_bg1.jpg'
img = cv2.imread(dpath)
img_fh=img.copy()
cv2.bitwise_not(img_fh,img_fh)
ksize=10
kernel = np.ones((ksize,ksize),np.uint8)
erosion = cv2.erode(img_fh,kernel,iterations = 3)
invertx = cv2.bitwise_not(erosion)
masked = cv2.bitwise_not(cv2.bitwise_and(img_fh,invertx))
all_image=[img,invertx,masked]
ncol=len(all_image)
for idx, i in enumerate(all_image):
plt.subplot(int(f'1{ncol}{idx+1}')),plt.imshow(i)
plt.show()
which produce
Clearly, the code above did not produced the expected result.
May I know how to address this issue properly?
To remove the unwanted blob, we must create a mask such that it encloses it completely.
Flow:
Inversely binarize the image (such that you have a white foreground against dark background)
Dilate the image (since the blob makes contact with letter 'A', it has to be isolated )
Find contour with the largest area
Draw the contour on an another 1-channel image and thicken it (dilation)
Pixel Assignment: Pixels containing the dilated blob are made white on the original image
Code:
im = cv2.imread('stained_text.jpg')
im2 = im.copy()
gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
# inverse binaraization
th = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
Notice the blob region touching the letter 'A'. Hence to isolate it we perform erosion using an elliptical kernel
# erosion
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5,5))
erode = cv2.erode(th, kernel, iterations=2)
# find contours
contours, hierarchy = cv2.findContours(erode, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# Contour of maximum area
c = max(contours, key = cv2.contourArea)
# create 1-channel image in black
black = np.zeros((im.shape[0], im.shape[1]), np.uint8)
# draw the contour on it
black = cv2.drawContours(black, [c], 0, 255, -1)
# perform dilation to have clean border
# we are using the same kernel
dilate = cv2.dilate(black, kernel, iterations = 3)
# assign the dilated area in white over the original image
im2[dilate == 255] = (255,255,255)
This was just one of the many possible ways on how to proceed. The key thing to note is how to isolate the blob.
I have a webcam which takes pictures of a concrete slab. Now I want to check if there are objects on the slab or not. The objects could be anything and accordingly cannot be enumerated in a class. Unfortunately I cannot compare the webcam image directly with an image without objects on the concrete slab, because the image of the camera could shift minimally in x and y direction and the lighting is also not always the same. So I cannot use cv2.substract.
I would prefer a foreground and background substract, where the background is just my concrete slab and the foreground is then the objects. But since the objects don´t move but lie still on the slab, I can´t use cv2.createBackgroundSubtractorMOG2() either.
The Pictures look like this:
The Concrete slap without any objects:
The slap with Objects:
In Python/OpenCV, you could do division normalization to even out the illumination and make the background white. Then do your subtraction. Then use morphology to clean up small regions. Then find contours and discard any small regions that are due to noise left after the division normalization and morphology.
Here is how to do division normalization.
Input 1:
Input 2:
import cv2
import numpy as np
# load image
img1 = cv2.imread("img1.jpg")
img2 = cv2.imread("img2.jpg")
# convert to grayscale
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
# blur
blur1 = cv2.GaussianBlur(gray1, (0,0), sigmaX=13, sigmaY=13)
blur2 = cv2.GaussianBlur(gray2, (0,0), sigmaX=13, sigmaY=13)
# divide
divide1 = cv2.divide(gray1, blur1, scale=255)
divide2 = cv2.divide(gray2, blur2, scale=255)
# threshold
thresh1 = cv2.threshold(divide1, 200, 255, cv2.THRESH_BINARY)[1]
thresh2 = cv2.threshold(divide2, 200, 255, cv2.THRESH_BINARY)[1]
# morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
morph1 = cv2.morphologyEx(thresh1, cv2.MORPH_OPEN, kernel)
morph2 = cv2.morphologyEx(thresh2, cv2.MORPH_OPEN, kernel)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
morph1 = cv2.morphologyEx(morph1, cv2.MORPH_CLOSE, kernel)
morph2 = cv2.morphologyEx(morph2, cv2.MORPH_CLOSE, kernel)
# write result to disk
cv2.imwrite("img1_division_normalize.jpg", divide1)
cv2.imwrite("img2_division_normalize.jpg", divide2)
cv2.imwrite("img1_division_morph1.jpg", morph1)
cv2.imwrite("img1_division_morph2.jpg", morph2)
# display it
cv2.imshow("img1_norm", divide1)
cv2.imshow("img2_norm", divide2)
cv2.imshow("img1_thresh", thresh1)
cv2.imshow("img2_thresh", thresh2)
cv2.imshow("img1_morph", morph1)
cv2.imshow("img2_morph", morph2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Image 1 Normalized:
Image 2 Normalized:
Image 1 thresholded and morphology cleaned:
Image 2 thresholded and morphology cleaned:
In this case, Image 1 becomes completely white. So it (and subtraction) is not really needed. You just need to find contours for the second image result and if necessary discard tiny regions by area. The rest are your objects.
I am trying to remove text from images that has a black border with white fill. Take the image below as an example.
I have tried a few options utilizing opencv and skimage inpaint
import cv2
from skimage.restoration import inpaint
img = cv2.imread('Documents/test_image.png')
mask = cv2.threshold(img, 210, 255, cv2.THRESH_BINARY)[1][:,:,0]
dst = cv2.inpaint(img, mask, 7, cv2.INPAINT_TELEA)
image_result = inpaint.inpaint_biharmonic(img, mask,
multichannel=True)
cv2.imshow('image',img)
cv2.imshow('mask',mask)
cv2.imshow('dst',dst)
cv2.imshow('image_result',image_result)
cv2.waitKey(0)
It seems like the inpainting is just trying to fill with black as that is what it is identifying as being around the areas of interest. What I would like to do is remove the white text and black borders completely, or secondarily try to fill the white with more information from surrounding colors than just the black.
Here is the best solution I could come up with, still open to others with more experience showing me a better way if anyone has an idea.
mask = cv2.threshold(img, 245, 255, cv2.THRESH_BINARY)[1][:,:,0]
new_mask = cv2.dilate(mask, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (10,10)))
dst = cv2.inpaint(img, new_mask, 7, cv2.INPAINT_TELEA)
Here are two inpainting methods in Python/OpenCV. Note that I use the saturation channel to create the threshold, since white and black have zero saturation, in principle.
Input:
import cv2
import numpy as np
# read input
img = cv2.imread('white_black_text.png')
# convert to hsv and extract saturation
hsv = cv2.cvtColor(img,cv2.COLOR_BGR2HSV)
sat = hsv[:,:,1]
# threshold and invert
thresh = cv2.threshold(sat, 10, 255, cv2.THRESH_BINARY)[1]
thresh = 255 - thresh
# apply morphology dilate
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (15,15))
thresh = cv2.morphologyEx(thresh, cv2.MORPH_DILATE, kernel)
# do inpainting
result1 = cv2.inpaint(img,thresh,11,cv2.INPAINT_TELEA)
result2 = cv2.inpaint(img,thresh,11,cv2.INPAINT_NS)
# save results
cv2.imwrite('white_black_text_threshold.png', thresh)
cv2.imwrite('white_black_text_inpainted1.png', result1)
cv2.imwrite('white_black_text_inpainted2.png', result1)
# show results
cv2.imshow('thresh',thresh)
cv2.imshow('result1',result1)
cv2.imshow('result2',result2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Threshold and morphology cleaned result:
Result 1 (Telea):
Result 2 (Navier Stokes):
I am trying to extract the text from this image
.
I tried adjusting contrast and brightness, smoothing (e.g. GaussianBlur, medianblur) and threshold techniques(e.g. Otsu) with OpenCV, there is still lots of remaining noise.
Is there anything else I can try?
You could try a combination of Gaussian blurring, thresholding, and morphological operations to isolate the text. Here's a pipeline
Blur -> Threshold -> Opening -> Dilation -> Bitwise-and
import cv2
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,2))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=3)
# Repair text
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (7,7))
dilate = cv2.dilate(opening, kernel, iterations=2)
# Bitwise-and with input image
result = cv2.bitwise_and(image,image,mask=dilate)
result[dilate==0] = (255,255,255)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('dilate', dilate)
cv2.imshow('result', result)
cv2.waitKey()
If a more general solution is required, read on. Otherwise, you can refer to nathancy's answer, or plenty of other answers in this site.
I assume that
extract the text from this image
means you want the text from this image as a string or the ROI of the text.
This is something called OCR (Optical Character Recognition) and is a pretty complicated deep learning problem specifically for the type of image you posted (noisy, low sharpness, low dynamic range etc.). If you are looking for a vanilla OpenCV library that can do this out of the box then, as far as I know, there isn't any.
Check these links for source code and explanation -
OCR with pytesseract
Text ROI detection using EAST
I'm looking to extract the text from an image, The output I am receiving is not very accurate. I wonder if there's any additional steps I can take to process the image more to increase the accuracy of this OCR.
I've looked into some of the different ways to process the image and improve the OCR results. The image is quite small and I've been able to blow it up slightly, but to no avail.
The image will always be horizontal, no other text will be present other than the numbers. The maximum number will go up to 55000.
An example of the image in question:
After image processing, my image is scaled up by 4 on the X and Y axis. And some saturation is removed, although this does not improve the accuracy at all.
image = self._process(scale=6, iterations=2)
text = pytesseract.image_to_string(image, config="--psm 7")
My process method is doing the following:
# Resize and desaturate.
image = cv2.resize(image, None, fx=scale, fy=scale,
interpolation=cv2.INTER_CUBIC)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply dilation and erosion.
kernel = np.ones((1, 1), np.uint8)
image = cv2.dilate(image, kernel, iterations=iterations)
image = cv2.erode(image, kernel, iterations=iterations)
return image
Expected: "10411"
The actual value is varied, usually an unrecognizable string, or some numbers are parsed but the accuracy rate is too low to be usable.
I don't have experience with OCR, but I think you're on the right track: increasing the image size so the algorithm has more pixels to work with and increasing the distinction between the numbers and the background.
Tricks I added: thresholding the image, which creates a mask where only the white pixels remain. There were a few white blobs that were not numbers, so I used findContours to color those unwanted blobs black.
Result:
Code:
import numpy as np
import cv2
# load image
image = cv2.imread('number.png')
# resize image
image = cv2.resize(image,None,fx=5, fy=5, interpolation = cv2.INTER_CUBIC)
# create grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# perform threshold
retr, mask = cv2.threshold(gray_image, 230, 255, cv2.THRESH_BINARY)
# find contours
ret, contours, hier = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
# draw black over the contours smaller than 200 - remove unwanted blobs
for cnt in contours:
# print contoursize to detemine threshold
print(cv2.contourArea(cnt))
if cv2.contourArea(cnt) < 200:
cv2.drawContours(mask, [cnt], 0, (0), -1)
#show image
cv2.imshow("Result", mask)
cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()