Pytesseract gets inaccurate in reading chess notations - python

I've been experimenting with pytesseract and have tried several suggested accuracy improvements, but none of them worked for me. Here's my image:
This is the output:
Code:
import cv2
import pytesseract
img = cv2.imread("temp.png")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.adaptiveThreshold(gry, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 11, 4)
txt = pytesseract.image_to_string(thr, config='--psm 13')
print(txt)  # "#)"
I've only just started, so there is a lot I don't know yet. Can someone give me tips on how this can be done?
Edit: Ahx solved my question, but the result is still buggy: it reads 6 as é. For example, d6 is read as dé.
I added some thresholds and blurs, thinking they would improve the result, but they didn't.
Here's my updated code:
import cv2
import numpy as np
import pytesseract
img = cv2.imread('temp.png')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
lwr = np.array([0, 0, 0])
upr = np.array([179, 255, 180])
msk = cv2.inRange(hsv, lwr, upr)
msk = cv2.resize(msk, (0, 0), fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
msk = cv2.adaptiveThreshold(cv2.bilateralFilter(msk, 9, 75, 75), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
msk = cv2.adaptiveThreshold(cv2.medianBlur(msk, 3), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2)
txt = pytesseract.image_to_string(msk, lang='eng', config=r'--psm 6 --oem 3')
The e6 actually worked, but the 6 in other moves such as d6 and Nf6 is always read as é.
Here's an example if you want to try it out:
Output:
result: Nf6é

You can easily get the result by performing color segmentation. First, load the image and convert it to the HSV color space. Next, define lower and upper boundaries to get a binary mask; the mask will contain the features needed to recognize the characters. Then upsample the binary mask and pass it to Tesseract.
Upsampled Binary-mask
OCR Result: e6
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("iTO9h.png")
# Convert to HSV
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get binary-mask
lwr = np.array([0, 0, 0])
upr = np.array([179, 255, 180])
msk = cv2.inRange(hsv, lwr, upr)
# Up-sample
msk = cv2.resize(msk, (0, 0), fx=2, fy=2)
# OCR
txt = pytesseract.image_to_string(msk)
print(txt)
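For the follow-up problem in the question's edit (6 being read as é), one possible workaround is to post-process the OCR string. The sketch below is only an illustration under assumptions: `clean_move`, the `CONFUSIONS` table and the simplified move pattern are hypothetical names and choices, not part of pytesseract, and the only substitution included is the é to 6 confusion reported in the question.

```python
import re

# Hypothetical fix-up table: only the confusion reported in the question.
CONFUSIONS = str.maketrans({"é": "6"})

# Simplified pattern for a single move in algebraic notation (an assumption;
# it does not cover every annotation a chess site may display).
MOVE_RE = re.compile(
    r"^(O-O(-O)?|[KQRBN]?[a-h]?[1-8]?x?[a-h][1-8](=[QRBN])?)[+#]?$"
)

def clean_move(raw):
    """Normalise raw OCR text and return a move string, or None if implausible."""
    text = "".join(raw.split())        # drop spaces, newlines and form feeds
    text = text.translate(CONFUSIONS)  # map known misreads back to digits
    return text if MOVE_RE.match(text) else None

print(clean_move("dé\n\x0c"))
print(clean_move("Nfé"))
print(clean_move("#)"))
```

With these inputs, `clean_move` returns "d6" and "Nf6" for the misread strings and `None` for "#)", so implausible readings can be retried with different preprocessing. Restricting the character set through Tesseract's `tessedit_char_whitelist` variable is another option, though whether it is honoured depends on the Tesseract version and engine mode.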

Related

How can I remove black dots around hand in image

I am using OpenCV to capture an image in Python.
I want to convert this image:
The code for this:
# Image Processing
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (51,51), 15)
th3 = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
ret, test_image = cv2.threshold(th3, 10, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
into something like this:
If you'll consider scikit-image instead of OpenCV, here's an approach. It has been edited from my original post, thanks to guiding commentary from @Yves Daoust. This more correct approach runs significantly faster.
import numpy as np
import matplotlib.pyplot as plt
from skimage import io
from skimage import filters
from skimage import morphology
# `image` is assumed to be a 2-D grayscale array loaded beforehand (for example with io.imread)
threshold = filters.threshold_otsu(image)
bimage = np.where(image>threshold,255,0)
rsh_img = morphology.remove_small_holes(bimage,area_threshold=33)
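I think you should try a morphological operation. Opening (used in the snippet below) removes small white specks; if the dots are black specks inside a white region, use closing (cv2.MORPH_CLOSE) instead.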
gray = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (51,51), 15)
th3 = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)
ret, test_image = cv2.threshold(th3, 10, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
# Opening (erosion followed by dilation) applied to the thresholded image
opening = cv2.morphologyEx(test_image, cv2.MORPH_OPEN, kernel, iterations=1)
If the resulting image is not good enough, try increasing the kernel size.
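To make the difference between opening and closing concrete, here is a small NumPy-only sketch. It is a re-implementation for illustration with a fixed 3x3 square kernel, not OpenCV's own code; the helper names (`dilate`, `erode`, `opening`, `closing`) and the toy 9x9 images are assumptions.

```python
import numpy as np

def _neighbourhoods(img, pad_value):
    """Stack the nine 3x3-shifted copies of img (border filled with pad_value)."""
    padded = np.pad(img, 1, constant_values=pad_value)
    h, w = img.shape
    return np.stack([padded[dy:dy + h, dx:dx + w]
                     for dy in range(3) for dx in range(3)])

def dilate(img):
    return _neighbourhoods(img, 0).max(axis=0)     # any white neighbour -> white

def erode(img):
    return _neighbourhoods(img, 255).min(axis=0)   # any black neighbour -> black

def opening(img):
    return dilate(erode(img))   # removes small white specks

def closing(img):
    return erode(dilate(img))   # fills small black dots

# Toy images: a white 5x5 square on a black background.
clean = np.zeros((9, 9), dtype=np.uint8)
clean[2:7, 2:7] = 255

holed = clean.copy()
holed[4, 4] = 0        # black dot inside the white region

specked = clean.copy()
specked[0, 0] = 255    # white speck in the black background

print(np.array_equal(closing(holed), clean))
print(np.array_equal(opening(specked), clean))
```

In this toy case, closing restores the square with the black dot and opening restores the one with the white speck, so which operation helps depends on whether the unwanted dots are darker or brighter than the region around them.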

Unable to read text from Image using pytesseract.image_to_string

The problem here is that I need to remove the lines and write code to recognize the characters. So far I have only seen solutions where the characters are solid, but here the characters have a double border.
For this specific captcha, there's quite a simple solution. But, there's no guarantee for this approach to work on other, even very similar captchas – due to the "nature" of captchas as already mentioned in the comments, and in general when dealing with image-processing tasks with limited provided input data.
1. Read the image as grayscale.
2. Threshold the image at a nearly white cutoff.
3. Flood fill the "background" with black.
4. Run pytesseract with the --psm 6 option.
That'd be the whole code:
import cv2
import pytesseract
# Read image as grayscale
img = cv2.imread('FuZEJ.png', cv2.IMREAD_GRAYSCALE)
# Threshold at nearly white cutoff
thr = cv2.threshold(img, 224, 255, cv2.THRESH_BINARY)[1]
# Floodfill "background" with black
ff = cv2.floodFill(thr, None, (0, 0), 0)[1]
# OCR using pytesseract
text = pytesseract.image_to_string(ff, config='--psm 6').replace('\n', '').replace('\f', '')
print(text)
# xwphs
Caveat: I use a special version of Tesseract from the Mannheim University Library.
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
PyCharm: 2021.1.1
OpenCV: 4.5.1
pytesseract: 5.0.0-alpha.20201127
----------------------------------------
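The flood-fill step is what turns double-bordered characters into solid ones: after thresholding, the white background connected to the corner is painted black, while the white interior enclosed by each outline is left alone. A minimal pure-Python sketch of that idea follows; the `flood_fill` helper and the toy grid are hypothetical and use 4-connectivity, which is also `cv2.floodFill`'s default.

```python
from collections import deque

def flood_fill(grid, start, new_value):
    """Return a copy of grid with the 4-connected region at start repainted."""
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    old_value = grid[r0][c0]
    out = [row[:] for row in grid]
    if old_value == new_value:
        return out
    out[r0][c0] = new_value
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and out[nr][nc] == old_value:
                out[nr][nc] = new_value
                queue.append((nr, nc))
    return out

W, B = 255, 0
# Thresholded toy image: a dark outline (B) enclosing a white interior (W).
grid = [
    [W, W, W, W, W, W, W],
    [W, B, B, B, B, B, W],
    [W, B, W, W, W, B, W],
    [W, B, W, W, W, B, W],
    [W, B, B, B, B, B, W],
    [W, W, W, W, W, W, W],
]

filled = flood_fill(grid, (0, 0), B)
for row in filled:
    print("".join("#" if v == W else "." for v in row))
```

Only the six interior cells remain white after the fill, which is the solid shape that is then passed to the OCR step.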
I would try a mask:
import cv2
import numpy as np
def process(img):  # To process the image
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    _, img_gray = cv2.threshold(img_gray, 224, 255, cv2.THRESH_TOZERO_INV)
    img_blur = cv2.GaussianBlur(img_gray, (7, 7), 6)
    img_canny = cv2.Canny(img_blur, 0, 100)
    return cv2.dilate(img_canny, np.ones((1, 5)), iterations=1)
def get_mask(img):  # To generate the mask
    mask = np.zeros(img.shape[:2], 'uint8')
    contours, _ = cv2.findContours(process(img), cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    for cnt in contours:
        cv2.drawContours(mask, [cnt], -1, 255, -1)
    return mask
def crop(img, mask):  # To mask an image and use white background
    bg = np.full(img.shape, 255, 'uint8')
    fg = cv2.bitwise_or(img, img, mask=mask)
    fg_back_inv = cv2.bitwise_or(bg, bg, mask=cv2.bitwise_not(mask))
    return cv2.bitwise_or(fg, fg_back_inv)
img = cv2.imread("image.png")
img = cv2.pyrUp(cv2.pyrUp(img))  # To enlarge image by 4x
cv2.imshow("Masked Image", crop(img, get_mask(img)))
cv2.waitKey(0)
Before:
After:

How do I remove certain part of an Image in Python OpenCV?

I have been trying to isolate only the beans from the container, but it doesn't seem to work. How can I achieve this with OpenCV in Python?
I have used cv2.inRange(), but I can't get it to work.
This is the image I can give you:
Code:
import cv2
import matplotlib.pyplot as plt
## Read
img = cv2.imread("QdMz0.jpg")
rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
mask1 = cv2.inRange(rgb, (56, 0, 0), (70, 255,255))
mask2 = cv2.inRange(rgb, (5,0,0), (36, 255, 255))
## final mask and masked
mask = cv2.bitwise_or(mask1, mask2)
target = cv2.bitwise_and(img,img, mask=mask)
#cv2.imwrite("target.png", target)
plt.imshow(target)

Watershed Segmentation excluding alone object?

Problem
I am using this answer to create a segmentation program, but it counts the objects incorrectly. I noticed that either isolated objects are being ignored or the image acquisition is poor.
I counted 123 objects and the program returns 117, as can be seen below. The objects circled in red seem to be missing:
Using the following image from a 720p webcam:
Code
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import label
import urllib.request
# https://stackoverflow.com/a/14617359/7690982
def segment_on_dt(a, img):
    border = cv2.dilate(img, None, iterations=5)
    border = border - cv2.erode(border, None)
    dt = cv2.distanceTransform(img, cv2.DIST_L2, 3)
    plt.imshow(dt)
    plt.show()
    dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(np.uint8)
    _, dt = cv2.threshold(dt, 140, 255, cv2.THRESH_BINARY)
    lbl, ncc = label(dt)
    lbl = lbl * (255 / (ncc + 1))
    # Completing the markers now.
    lbl[border == 255] = 255
    lbl = lbl.astype(np.int32)
    cv2.watershed(a, lbl)
    print("[INFO] {} unique segments found".format(len(np.unique(lbl)) - 1))
    lbl[lbl == -1] = 0
    lbl = lbl.astype(np.uint8)
    return 255 - lbl
# Open Image
resp = urllib.request.urlopen("https://i.stack.imgur.com/YUgob.jpg")
img = np.asarray(bytearray(resp.read()), dtype="uint8")
img = cv2.imdecode(img, cv2.IMREAD_COLOR)
## Yellow slicer
mask = cv2.inRange(img, (0, 0, 0), (55, 255, 255))
imask = mask > 0
slicer = np.zeros_like(img, np.uint8)
slicer[imask] = img[imask]
# Image Binarization
img_gray = cv2.cvtColor(slicer, cv2.COLOR_BGR2GRAY)
_, img_bin = cv2.threshold(img_gray, 140, 255, cv2.THRESH_BINARY)
# Morphological opening
img_bin = cv2.morphologyEx(img_bin, cv2.MORPH_OPEN, np.ones((3, 3), dtype=int))
# Segmentation
result = segment_on_dt(img, img_bin)
plt.imshow(np.hstack([result, img_gray]), cmap='Set3')
plt.show()
# Final Picture
result[result != 255] = 0
result = cv2.dilate(result, None)
img[result == 255] = (0, 0, 255)
plt.imshow(result)
plt.show()
Question
How to count the missing objects?
To answer your main question: watershed does not remove single objects. Watershed was functioning fine in your algorithm; it receives the predefined labels and performs the segmentation accordingly.
The problem was that the threshold you set for the distance transform was too high, so it removed the weak signal from the single objects, preventing them from being labeled and passed to the watershed algorithm.
The distance transform signal was weak because of improper segmentation during the color segmentation stage and the difficulty of choosing a single threshold that both removes noise and extracts the signal.
To remedy this, we need to perform proper color segmentation and use an adaptive threshold instead of a single threshold when segmenting the distance transform signal.
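The effect can be reproduced on a toy 1-D signal. The sketch below is only an illustration under assumptions: the helper names are hypothetical, the distance transform is a brute-force 1-D stand-in, and a plain box mean replaces the Gaussian weighting of `cv2.adaptiveThreshold`. It uses the global threshold from the question (140) and the window and offset from the modified code (21 and -9).

```python
import numpy as np

def distance_1d(binary):
    """Distance from every sample to the nearest background (0) sample."""
    idx = np.arange(binary.size)
    background = idx[binary == 0]
    return np.abs(idx[:, None] - background[None, :]).min(axis=1)

def count_runs(mask):
    """Number of separate runs of True values, i.e. markers found."""
    padded = np.concatenate(([0], mask.astype(int)))
    return int(np.count_nonzero(np.diff(padded) == 1))

# Toy 1-D "image": one wide object and one narrow, isolated object.
signal = np.zeros(60, dtype=np.uint8)
signal[5:26] = 1     # wide object, 21 samples
signal[40:45] = 1    # narrow object, 5 samples

dt = distance_1d(signal).astype(float)
dt = (dt / dt.max() * 255).astype(np.uint8)   # normalise to 0..255 (minimum is 0)

# Single global threshold: only peaks close to the global maximum survive.
global_mask = dt > 140

# Local threshold: keep samples above (neighbourhood mean - C).
window, C = 21, -9
local_mean = np.convolve(dt.astype(float), np.ones(window) / window, mode="same")
adaptive_mask = dt > (local_mean - C)

print(count_runs(global_mask), count_runs(adaptive_mask))
```

This prints `1 2`: after normalisation the narrow object's peak is far below 140, so the global threshold yields a marker only for the wide object, while the local threshold yields one marker per object.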
Here is the code I modified. I have incorporated the color segmentation method by @user1269942. Extra explanation is in the code comments.
import cv2
import numpy as np
import matplotlib.pyplot as plt
from scipy.ndimage import label
import urllib.request
# https://stackoverflow.com/a/14617359/7690982
def segment_on_dt(a, img, img_gray):
    # Added several elliptical structuring elements for better morphology processing
    struct_big = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
    struct_small = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3))
    # Increase border size
    border = cv2.dilate(img, struct_big, iterations=5)
    border = border - cv2.erode(img, struct_small)
    dt = cv2.distanceTransform(img, cv2.DIST_L2, 3)
    dt = ((dt - dt.min()) / (dt.max() - dt.min()) * 255).astype(np.uint8)
    # Blur the signal lightly to remove noise
    dt = cv2.GaussianBlur(dt,(7,7),-1)
    # Adaptive threshold to extract local maxima of the distance transform signal
    dt = cv2.adaptiveThreshold(dt, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 21, -9)
    #_ , dt = cv2.threshold(dt, 2, 255, cv2.THRESH_BINARY)
    # Morphology operation to clean the thresholded signal
    dt = cv2.erode(dt,struct_small,iterations = 1)
    dt = cv2.dilate(dt,struct_big,iterations = 10)
    plt.imshow(dt)
    plt.show()
    # Labeling
    lbl, ncc = label(dt)
    lbl = lbl * (255 / (ncc + 1))
    # Completing the markers now.
    lbl[border == 255] = 255
    plt.imshow(lbl)
    plt.show()
    lbl = lbl.astype(np.int32)
    cv2.watershed(a, lbl)
    print("[INFO] {} unique segments found".format(len(np.unique(lbl)) - 1))
    lbl[lbl == -1] = 0
    lbl = lbl.astype(np.uint8)
    return 255 - lbl
# Open Image
resp = urllib.request.urlopen("https://i.stack.imgur.com/YUgob.jpg")
img = np.asarray(bytearray(resp.read()), dtype="uint8")
img = cv2.imdecode(img, cv2.IMREAD_COLOR)
## Yellow slicer
# blur to remove noise
img = cv2.blur(img, (9,9))
# proper color segmentation
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
mask = cv2.inRange(hsv, (0, 140, 160), (35, 255, 255))
#mask = cv2.inRange(img, (0, 0, 0), (55, 255, 255))
imask = mask > 0
slicer = np.zeros_like(img, np.uint8)
slicer[imask] = img[imask]
# Image Binarization
img_gray = cv2.cvtColor(slicer, cv2.COLOR_BGR2GRAY)
_, img_bin = cv2.threshold(img_gray, 140, 255,
cv2.THRESH_BINARY)
plt.imshow(img_bin)
plt.show()
# Morphological opening followed by erosion (added)
cv2.morphologyEx(img_bin, cv2.MORPH_OPEN,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3)),img_bin,(-1,-1),10)
cv2.morphologyEx(img_bin, cv2.MORPH_ERODE,cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(3,3)),img_bin,(-1,-1),3)
plt.imshow(img_bin)
plt.show()
# Segmentation
result = segment_on_dt(img, img_bin, img_gray)
plt.imshow(np.hstack([result, img_gray]), cmap='Set3')
plt.show()
# Final Picture
result[result != 255] = 0
result = cv2.dilate(result, None)
img[result == 255] = (0, 0, 255)
plt.imshow(result)
plt.show()
Final results:
124 unique items found.
An extra item was found because one of the objects was split in two.
With proper parameter tuning, you might get the exact number you are looking for. But I would suggest getting a better camera.
Your code looks completely reasonable, so I'm just going to make one small suggestion: do your inRange in the HSV color space.
OpenCV docs on color spaces:
https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_colorspaces/py_colorspaces.html
another SO example using inRange with HSV:
How to detect two different colors using `cv2.inRange` in Python-OpenCV?
and a few small code edits for you:
img = cv2.blur(img, (5,5)) #new addition just before "##yellow slicer"
## Yellow slicer
#mask = cv2.inRange(img, (0, 0, 0), (55, 255, 255)) #your line: comment out.
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV) #new addition...convert to hsv
mask = cv2.inRange(hsv, (0, 120, 120), (35, 255, 255)) #new addition use hsv for inRange and an adjustment to the values.
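To see why the HSV range is more selective than the BGR one, the following standard-library sketch converts single pixels to OpenCV's 8-bit HSV convention (hue stored as degrees divided by two, saturation and value scaled to 0..255) and checks them against both ranges. The helpers `bgr_to_opencv_hsv` and `in_range` and the two sample pixels are assumptions for illustration, and OpenCV's own rounding may differ slightly.

```python
import colorsys

def bgr_to_opencv_hsv(b, g, r):
    """Approximate OpenCV's 8-bit HSV: H in 0..179, S and V in 0..255."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return (round(h * 180) % 180, round(s * 255), round(v * 255))

def in_range(pixel, lower, upper):
    """Per-channel inclusive bounds check, like cv2.inRange for one pixel."""
    return all(lo <= p <= hi for p, lo, hi in zip(pixel, lower, upper))

yellow_bgr = (0, 255, 255)     # pure yellow
dark_gray_bgr = (50, 50, 50)   # background-like pixel, not yellow

bgr_bounds = ((0, 0, 0), (55, 255, 255))       # range from the question
hsv_bounds = ((0, 120, 120), (35, 255, 255))   # range suggested above

for name, px in (("yellow", yellow_bgr), ("dark gray", dark_gray_bgr)):
    hsv = bgr_to_opencv_hsv(*px)
    print(name, in_range(px, *bgr_bounds), in_range(hsv, *hsv_bounds))
```

The BGR range accepts both pixels because it only limits the blue channel, whereas the HSV range accepts the yellow pixel and rejects the gray one, whose saturation is 0.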
Improving Accuracy
Detecting missing objects
im_1, im_2, im_3
I counted 12 missing objects: 2, 7, 8, 11, 65, 77, 78, 84, 92, 95, 96, and 85 (added in an edit).
117 found, 12 missing, 6 wrong
1st attempt: decrease mask sensitivity
#mask = cv2.inRange(img, (0, 0, 0), (55, 255, 255)) # Current
mask = cv2.inRange(img, (0, 0, 0), (80, 255, 255)) # 1st attempt
inRange documentation
im_4, im_5, im_6, im_7
[INFO] 120 unique segments found
120 found, 9 missing, 6 wrong

opencv python3.6 OCR

I do not know openCV but I would like to know
Area extraction
If the white area is reduced using the HSV mask, the text breaks up and cannot be recognised. If the white area remains, THRESH_BINARY_INV converts it as shown above.
I would appreciate your help.
I set out to solve this problem.
I do not know if this is being interpreted properly.
My code is attached below:
import cv2
import numpy as np
def tracking():
    frame = cv2.imread('test4.png')
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    lower = np.array([0, 0, 0])
    upper = np.array([255, 255, 240])
    mask = cv2.inRange(hsv, lower, upper)
    res2 = cv2.bitwise_and(frame, frame, mask=mask)
    cv2.imshow('asdasd2', res2)
    _, edge2 = cv2.threshold(res2, 100, 255, cv2.THRESH_BINARY_INV)
    cv2.imshow('asdasd2', edge2)
    cv2.imshow('original', frame)
    cv2.imshow('finish', res2)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
tracking()
