I am trying to run OCR (using Google's Tesseract) on a document with the following format:
However, Tesseract interprets the short bars in between as letters/numbers (l, i, or 1).
As a pre-processing measure I tried to remove vertical and horizontal lines using the following code:
import cv2
from pdf2image import convert_from_path
pages = convert_from_path('..\\app\\1.pdf', 500)
for page in pages:
    page.save('..\\app\\out.jpg', 'JPEG')
image = cv2.imread('out.jpg')
result = image.copy()
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,1))
remove_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(remove_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(result, [c], -1, (255,255,255), 5)
# Remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,40))
remove_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(remove_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(result, [c], -1, (255,255,255), 5)
cv2.imshow('thresh', thresh)
cv2.imshow('result', result)
cv2.imwrite('result.png', result)
cv2.waitKey()
I run into an issue where the output removes most of the vertical and horizontal lines in the document, even the start and finish lines on the left and right sides of the image below, but not the small bars in between.
I'm wondering if I am going about this wrong by trying to pre-process and remove lines. Is there a better way to pre-process or another way to solve this problem?
With the observation that the form fields are separate from the characters, you can simply filter on contour area to isolate the text characters. The idea is to apply a Gaussian blur, then Otsu's threshold to obtain a binary image. From here we find contours and filter on contour area with a predetermined threshold value. We can effectively remove the lines by filling in the unwanted contours with cv2.drawContours.
Binary image
Removed lines
Invert ready for OCR
OCR result using Pytesseract
HELLO
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, blur, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3,3), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Find contours and filter using contour area
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area > 500:
        cv2.drawContours(thresh, [c], -1, 0, -1)
# Invert image and OCR
invert = 255 - thresh
data = pytesseract.image_to_string(invert, lang='eng',config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
Note: If you still want to go with the remove-horizontal/vertical-lines approach, you need to modify the vertical kernel size. For instance, change (1,40) to (1,10). This will help remove the smaller lines, but it may also remove some of the vertical strokes in the text, such as in the letter L.
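As a rough sketch of that tweak, dropped into the vertical-line block of your original script (the kernel height of 10 is only a starting point to tune against your image's resolution and bar length):
# Shorter vertical kernel so the small bars are detected and removed as well
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 10))
remove_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(remove_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(result, [c], -1, (255, 255, 255), 5)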
Related
I have an image such as the following:
Original Image
I need to remove the line that the characters are written on, so I use the following code:
image = cv2.imread('im.png')
result = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20,1))
remove_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(remove_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(result, [c], -1, (255,255,255), 5)
plt.imshow(result)
The resulting image looks like this:
Image With Line Removed
As you can see, the portions of the characters that were touching the line were also removed, and I need to preserve them.
I attempted to solve this by using the following code:
image = cv2.imread('connected/im.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove horizontal
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,1))
detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(image, [c], -1, (255,255,255), 2)
# Repair image
repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,6))
result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=2)
plt.imshow(result)
This appears to solve the issue (sort of), but the image appears grainy, and you can still tell where the line is:
Repaired Image
Does anyone know of a better way of repairing the image / removing the line?
For what it is worth, you can fill in the horizontal line with a purely vertical closing and copy just that band back into the original image. That looks a little better than simply erasing the line.
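A minimal sketch of that idea, assuming the line can be located with the same horizontal opening used above (the kernel heights and the two-pixel band padding are guesses to tune):
import cv2
import numpy as np

image = cv2.imread('im.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Locate the horizontal line with a wide, flat opening
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40, 1))
line_mask = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
ys, xs = np.where(line_mask > 0)
top = max(ys.min() - 2, 0)
bottom = min(ys.max() + 2, image.shape[0] - 1)

# A purely vertical closing erases thin dark horizontal features
# while leaving the taller character strokes intact
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1, 9))
closed = cv2.morphologyEx(image, cv2.MORPH_CLOSE, vertical_kernel)

# Replace only the band that contained the line; everything else stays untouched
result = image.copy()
result[top:bottom + 1, :] = closed[top:bottom + 1, :]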
I am planning to build an OCR (optical character recognition) system, and now I am cleaning the data.
I have a dataset that consists of tens of thousands of images.
In the dataset, there are some images that have straight lines near their edges, and I want to cut off the parts that are located outside these straight lines.
For example:
The above image contains a horizontal line at the top, a vertical line at the right, and some characters outside of those straight lines. I already have code for getting rid of the straight lines (both horizontal and vertical); after removing them the image looks like this:
The straight lines are now gone, but there are still some characters outside of where the lines were (to the right of the vertical line) that are polluting the data, so I want to crop the image to get rid of this extra content as well.
This is the code I am using to remove the straight lines:
import os
import cv2

def remove_lines(img_path, folder_path, img_name_with_extension):
    image = cv2.imread(img_path)
    # remove horizontal lines in the original image
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    # Remove horizontal
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
    detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
    cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        cv2.drawContours(image, [c], -1, (255,255,255), 2)
    # Repair image
    repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,6))
    result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=1)

    image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)

    # remove horizontal lines in the rotated image (i.e. the original vertical lines)
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    # Remove horizontal
    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25,1))
    detected_lines = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
    cnts = cv2.findContours(detected_lines, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        cv2.drawContours(image, [c], -1, (255,255,255), 2)
    # Repair image
    repair_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,6))
    result = 255 - cv2.morphologyEx(255 - image, cv2.MORPH_CLOSE, repair_kernel, iterations=1)

    image = cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)

    os.chdir(folder_path)
    cv2.imwrite(img_name_with_extension, image)
Is there any way I can tweak the code I have written to crop the image and get rid of the straight lines, along with the characters that are located outside of them?
Thanks.
Perhaps find the centroid of the contour that came from your horizontal line detection, along with its boundaries, and use those to crop the image? For instance, after you have your contours from the horizontal line detection portion of your code, you could (on a given contour c from your contours cnts) do:
moments = cv2.moments(c)
c_y = int(moments['m01'] / moments['m00'])
And then get the height of the contour:
_, _, _, h = cv2.boundingRect(c)
half_height = int(np.ceil(h / 2))  # assumes numpy is imported as np
And then crop:
new_image = image[c_y + half_height:, :]
This code assumes the horizontal line is on the top so that you can just default to cropping away everything above it. It seemed that assumption was built into your question. If it is possible for horizontal lines to be at the bottom of the image instead (or as well), you could add a check of whether the centroid of the contour was in the top or the bottom half of the image and then crop away above it or below it, accordingly.
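A rough sketch of that check, reusing the c_y and half_height values from above (the midpoint test is only one reasonable heuristic):
img_height = image.shape[0]
if c_y < img_height // 2:
    # line sits in the top half: keep everything below it
    new_image = image[c_y + half_height:, :]
else:
    # line sits in the bottom half: keep everything above it
    new_image = image[:c_y - half_height, :]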
And for the vertical lines, a nearly identical process could be used or even the exact same one if you stick with how your code rotates the image to get the vertical lines using the same method as the horizontal lines.
EDIT:
Actually, you don't even need to compute the moments of the contours. You can do the same process described above simply using the bounding box information:
_, y, _, h = cv2.boundingRect(c)
new_image = image[(y + h + 1):, :]
Integrating that last simple approach into code based on your original gives something like:
import os
import cv2


def detect_hlines_and_crop(image):
    """Remove horizontal lines and anything above them in an image."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray,
                           0,
                           255,
                           cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    thresh = thresh[1]

    horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 1))
    detected_lines = cv2.morphologyEx(thresh,
                                      cv2.MORPH_OPEN,
                                      horizontal_kernel,
                                      iterations=2)
    cnts = cv2.findContours(detected_lines,
                            cv2.RETR_EXTERNAL,
                            cv2.CHAIN_APPROX_SIMPLE)
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]

    for c in cnts:
        # assumes all lines detected are above the important bit of the image
        _, y, _, h = cv2.boundingRect(c)
        image = image[(y + h + 1):, :]

    return image


def remove_lines(img_path, folder_path, img_name_with_extension):
    image = cv2.imread(img_path)

    image = detect_hlines_and_crop(image)
    image = cv2.rotate(image, cv2.ROTATE_90_COUNTERCLOCKWISE)
    image = detect_hlines_and_crop(image)
    image = cv2.rotate(image, cv2.ROTATE_90_CLOCKWISE)

    os.chdir(folder_path)
    cv2.imwrite(img_name_with_extension, image)
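A hypothetical call, just to show how the arguments are meant to be used (the paths here are placeholders):
remove_lines('dataset/raw/sample_001.png', 'dataset/cleaned', 'sample_001.png')
One thing to be aware of: os.chdir changes the working directory for the whole process, so later relative paths resolve against folder_path; writing to os.path.join(folder_path, img_name_with_extension) with cv2.imwrite would avoid that side effect.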
I'm trying to separate the lines and circles; instead of straight lines, some of the circles are connected by curves.
I tried using contours to find the circles, but the lines end up included inside the contours. I also tried skeletonizing the image to see whether the connections between the circles and lines would break, but that was unsuccessful.
HoughCircles does not detect the circles in all cases, so my only option is to find the circles using contours once the lines around them are eliminated.
EDIT
Example 2
Input
Output: not the desired output
In the output image, the circles were not separated; the lines merged with the circles, and the contours took on a different shape.
Please suggest a way to split the circles and lines, preferably in Python (C++ answers are welcome too).
Thanks in advance!
Here's a simple approach using morphological operations. The idea is to fill the contours, create an elliptical structuring element, then morph open to remove the lines. From here we simply find the external contours and draw the circles. Here's the process visualized:
Filled thresholded image
Morph open
Result
Code
import cv2
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Fill contours
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(thresh, [c], -1, (255,255,255), -1)
# Morph open
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5,5))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=3)
# Draw circles
cnts = cv2.findContours(opening, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    cv2.drawContours(image, [c], -1, (36,255,12), 3)
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('image', image)
cv2.waitKey()
For the other image, we can filter using contour hierarchy:
import cv2
import numpy as np
# Load image, grayscale, Otsu's threshold
image = cv2.imread('1.png')
mask = np.zeros(image.shape, dtype=np.uint8)
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Filter using contour hierarchy
cnts, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
hierarchy = hierarchy[0]
for component in zip(cnts, hierarchy):
    currentContour = component[0]
    currentHierarchy = component[1]
    x,y,w,h = cv2.boundingRect(currentContour)
    # Only select inner contours
    if currentHierarchy[3] > 0:
        cv2.drawContours(mask, [currentContour], -1, (255,255,255), -1)
# Filter contours on mask using contour approximation
mask = cv2.cvtColor(mask, cv2.COLOR_BGR2GRAY)
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.05 * peri, True)
    if len(approx) > 5:
        cv2.drawContours(mask, [c], -1, (0,0,0), -1)
    else:
        cv2.drawContours(image, [c], -1, (36,255,12), 2)
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.imshow('mask', mask)
cv2.waitKey()
Note: For a thorough explanation of contour hierarchy, take a look at understanding contour hierarchy and retrieval modes.
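For reference, OpenCV lays out each hierarchy entry as [next, previous, first_child, parent], which is what the currentHierarchy[3] > 0 check above relies on. A quick way to inspect it:
cnts, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
for i, (nxt, prev, first_child, parent) in enumerate(hierarchy[0]):
    print('contour', i, 'parent:', parent, 'first child:', first_child)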
Having an image such as the one above, I am able to crop it into four square boxes, remove the borders using OpenCV morphological operations (basic dilation and erosion), and get a result such as:
This works great in most cases, but if someone writes over the line, the digit may get predicted as a 7 instead of a 2.
I am having trouble finding a solution that recovers the parts of the character written over the line while still removing the borders. The images I have are already converted to grayscale, so I can't distinguish the written digits based on color. What would be the best way to approach this problem?
Here's a pipeline:
Convert image to grayscale
Otsu's threshold to obtain a binary image
Remove vertical lines
Remove horizontal lines
Construct repair kernel and repair image
Invert image
After converting to grayscale, we apply Otsu's threshold
From here we remove vertical lines
Then remove horizontal lines
This leaves us with a gap in the characters. To fix this, we create a repair kernel and dilate the image
Next we bitwise-and with the thresholded image to maintain our character detail
The gap is still there, but it is a little better. We perform a morphological close to bridge the gap
It's now closed, but we lost character detail. We perform a final bitwise-and with the thresholded image to recover that detail
To get the desired result, we invert the image
import cv2
image = cv2.imread('1.png')
removed = image.copy()
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
# Remove vertical lines
vertical_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,40))
remove_vertical = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, vertical_kernel, iterations=2)
cnts = cv2.findContours(remove_vertical, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(removed, [c], -1, (255,255,255), 15)
# Remove horizontal lines
horizontal_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (40,1))
remove_horizontal = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, horizontal_kernel, iterations=2)
cnts = cv2.findContours(remove_horizontal, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    cv2.drawContours(removed, [c], -1, (255,255,255), 5)
# Repair kernel
repair_kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
removed = 255 - removed
dilate = cv2.dilate(removed, repair_kernel, iterations=5)
dilate = cv2.cvtColor(dilate, cv2.COLOR_BGR2GRAY)
pre_result = cv2.bitwise_and(dilate, thresh)
result = cv2.morphologyEx(pre_result, cv2.MORPH_CLOSE, repair_kernel, iterations=5)
final = cv2.bitwise_and(result, thresh)
invert_final = 255 - final
cv2.imshow('thresh', thresh)
cv2.imshow('removed', removed)
cv2.imshow('dilate', dilate)
cv2.imshow('pre_result', pre_result)
cv2.imshow('result', result)
cv2.imshow('final', final)
cv2.imshow('invert_final', invert_final)
cv2.waitKey()
I have an image with measurements that I need to read with Python. Right now it reads most of the text, but not all of it, because some lines are in the way. I can't use the original image, so I made an image that looks like the one I'm using.
import cv2
import numpy as np

def erode(img):
    kernel = np.ones((3,3), np.uint8)
    eroded = cv2.erode(img, kernel, iterations=1)
    gray = cv2.cvtColor(eroded, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)
    minLineLength = 10
    maxLineGap = 1
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 120,
                            minLineLength=minLineLength, maxLineGap=maxLineGap)
    for line in lines:
        for x1, y1, x2, y2 in line:
            # paint the detected line segments white to erase them
            cv2.line(eroded, (x1, y1), (x2, y2), (255, 255, 255), 7)
    return eroded
I have tried using the OpenCV function HoughLinesP and drawing a line over the detected segments, but this doesn't remove all the lines and still leaves some dots all over the place, like this:
What I want is to give something like this as input:
and get something like this as an output:
The reason I need to remove all the lines without changing the text is that I need to save the text coordinates.
The idea is to dilate the text so that it connects into blobs forming single contours. From here we can find contours and filter using a contour-area threshold; if a contour passes this filter, it is a text ROI we want to keep, and we copy that ROI onto a white mask.
import cv2
import numpy as np
image = cv2.imread('3.png')
mask = np.ones(image.shape, dtype=np.uint8) * 255
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
dilate = cv2.dilate(thresh, kernel, iterations=3)
cnts = cv2.findContours(dilate, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
for c in cnts:
    area = cv2.contourArea(c)
    if area < 5000:
        x,y,w,h = cv2.boundingRect(c)
        mask[y:y+h, x:x+w] = image[y:y+h, x:x+w]
cv2.imshow('thresh', thresh)
cv2.imshow('dilate', dilate)
cv2.imshow('mask', mask)
cv2.waitKey()
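Since you mentioned needing the text coordinates, the same loop could also collect each kept bounding box; a small sketch (the text_boxes name is just illustrative):
text_boxes = []
for c in cnts:
    area = cv2.contourArea(c)
    if area < 5000:
        x, y, w, h = cv2.boundingRect(c)
        text_boxes.append((x, y, w, h))  # (left, top, width, height) of a kept text region
        mask[y:y+h, x:x+w] = image[y:y+h, x:x+w]
print(text_boxes)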