Detect text in an image and trace it - python

I need some help with image processing. I'm working on a script that detects letters in an image and traces them. For example, if there is a letter A in the image, the code has to detect it and trace it 3-4 times (side by side, not over the same line) in different colors, with the spacing based on the width of the text. So far I'm able to detect the words, font, and size using the tesserocr module, but I can't work out how to do the tracing.
import tesserocr
from PIL import Image

with tesserocr.PyTessBaseAPI() as api:
    image = Image.open("1d.png")
    api.SetImage(image)
    api.Recognize()
    # Iterator over the recognized layout; starts at the first word
    iterator = api.GetIterator()
    print(iterator.WordFontAttributes())
Thanks in advance #peace
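No answer is recorded for this question. As a rough sketch of one possible reading of "trace side by side", the outline of each letter could be extracted with OpenCV contours and redrawn several times, each copy shifted to the right by a fraction of the text width and drawn in a different color. The names `trace_offsets`, `trace_letters`, `gap_ratio`, and `COLORS`, and the choice of OpenCV contours, are all assumptions for illustration and not something from the post.

```python
def trace_offsets(text_width, copies=3, gap_ratio=0.1):
    """Return the horizontal pixel offset of each traced copy.

    Assumption: the gap between copies is gap_ratio * text_width,
    with a minimum of one pixel.
    """
    gap = max(1, int(round(text_width * gap_ratio)))
    return [gap * (i + 1) for i in range(copies)]


# Hypothetical BGR colors, one per traced copy.
COLORS = [(0, 0, 255), (0, 255, 0), (255, 0, 0), (0, 255, 255)]


def trace_letters(in_path, out_path, copies=3, gap_ratio=0.1):
    """Redraw the letter outlines `copies` times, each shifted right."""
    import cv2  # imported lazily so trace_offsets works without OpenCV

    img = cv2.imread(in_path)
    if img is None:
        raise FileNotFoundError(in_path)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Assumes dark text on a light background.
    _, mask = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    # [-2] selects the contour list on both OpenCV 3 and OpenCV 4.
    contours = cv2.findContours(
        mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE
    )[-2]
    if contours:
        # Text width = overall bounding box of all outlines.
        boxes = [cv2.boundingRect(c) for c in contours]
        left = min(x for x, _, _, _ in boxes)
        right = max(x + w for x, _, w, _ in boxes)
        offsets = trace_offsets(right - left, copies, gap_ratio)
        for i, dx in enumerate(offsets):
            color = COLORS[i % len(COLORS)]
            cv2.drawContours(img, contours, -1, color, 1, offset=(dx, 0))
    cv2.imwrite(out_path, img)
```

Calling `trace_letters("1d.png", "traced.png")` would write a copy of the image with the shifted outlines drawn on it. If "tracing" is meant as outlines growing outward around each letter instead, the same contours could be drawn on progressively dilated masks rather than shifted.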

Related

(OCR) Tesseract not recognizing simple digits

I am using PyTesseract to extract information from multiple images that contain vertically separated prices (one price per line), right-aligned, like the following image:
Tesseract cannot extract reliable text from such an image as-is, so some preprocessing is needed:
Image scaling to 4x;
Binarization;
"Bolding";
Gaussian blur.
Which results in the following image:
Pytesseract then successfully extracts the information (using --psm 6), resulting in a string containing:
96,000,000
94,009,999
90,000,000
85,000,000
78,000,000
70,000,000
66,000,000
However, when Pytesseract is presented with some edge cases like an image with a single digit, recognition fails. Example:
Pre-processed:
Post-processed:
This results in an empty string. This is strange, as the digit 8 was read successfully in the earlier image. What other suggestions should I follow? I've spent endless hours trying to optimize the algorithm for such cases without success.
I tried the exact same scenario with easyocr. Note that EasyOCR does not use the Tesseract engine; it relies on its own deep-learning models for text detection and recognition. I resized the image to a custom size (600, 600) and fed it to easyocr, and it worked.
import easyocr
import cv2
image = cv2.imread('7.png')
image = cv2.resize(image,(600,600))
cv2.imwrite('image.png',image)
reader = easyocr.Reader(['en'])
result = reader.readtext('image.png')
texts = [detection[1] for detection in result if detection[2] > 0.5]
print(texts)
The output for the first image is:
['96,000,000', '94,009,999', '90,000,000', '85,000,000', '78,000,000', '70,000,000', '66,000,000']
The output for the second image is:
['8']
Maybe this alternative solution will work for your case. You can install easyocr with pip install easyocr. Happy coding :)
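If staying with Tesseract is preferred, one thing that may be worth trying for single-digit crops is retrying with other page segmentation modes, such as 7 (single text line) or 10 (single character), whenever the first attempt returns nothing. This is only a sketch: `ocr_with_fallback` and its `ocr` parameter are hypothetical names, and the OCR call is passed in so the retry logic can be exercised without Tesseract installed.

```python
def ocr_with_fallback(image, ocr, psms=(6, 7, 10)):
    """Try each page segmentation mode until one yields non-empty text.

    `ocr` is any callable taking (image, config) and returning a string.
    Returns (text, psm), or ("", None) if every mode came back empty.
    """
    for psm in psms:
        text = ocr(image, "--psm {}".format(psm)).strip()
        if text:
            return text, psm
    return "", None


# Example wiring with pytesseract (requires Tesseract to be installed):
# import pytesseract
# text, psm = ocr_with_fallback(
#     img, lambda im, cfg: pytesseract.image_to_string(im, config=cfg)
# )
```

There is no guarantee that any of these modes reads a given image; the fallback only automates the trial and error.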

Using pytesseract to get numbers from an image

I'm trying to take an image that's just a number 1-10 in a bubble font and use pytesseract to get the number.
Picture in question:
Here is an article that makes this process seem straightforward:
https://towardsdatascience.com/read-text-from-image-with-one-line-of-python-code-c22ede074cac
import cv2
import pyautogui
import pytesseract

# Locate the "lives" label on screen, then grab the region just to its right
lives = pyautogui.locateOnScreen('pics/lives.png', confidence = 0.9)
ss = pyautogui.screenshot(region=(lives[0]+lives[2],lives[1],lives[2]-6,lives[3]))
ss.save('pics/tempLives.png')
img = cv2.imread('pics/tempLives.png')
cv2.imwrite('pics/testPic.png',img)
test = pytesseract.image_to_string(img)
print(test)
I know 'img' is the same as the image provided because I've used ss.save and cv2.imwrite to check it.
I suppose my question is why it works so well in the article yet I cannot manage to get anything to print? I suppose the bubble font makes it trickier, but in the article those blue parentheses were read easily, so that makes me think this font wouldn't be too hard. Thanks for any help!
There are many cases where PyTesseract fails to recognize the text, and in some of them we have to give it some hints.
For the specific image you posted, it helps to add the config=" --psm 6" argument.
According to Tesseract documentation regarding PSM:
6 Assume a single uniform block of text.
Here is a code sample that manages to identify the text from the posted image:
import cv2
import pytesseract
#pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # May be required when using Windows
img = cv2.imread('pics/testPic.png') # Reading the input image (the PNG image from the posted question).
text = pytesseract.image_to_string(img, config=" --psm 6") # Execute PyTesseract OCR with "PSM 6" configuration (Assume a single uniform block of text)
print(text) # Prints the text (prints 10).
Note:
OCR does not always work, and there are many techniques for improving its accuracy.

Python Tesseract not recognising number in my image

I've got this picture (preprocessed image) from which I want to extract the numeric values of each line. I'm using pytesseract but it doesn't show any results for this image.
I've tried several config options from other questions, like "--psm 13 --oem 3" or whitelisting numbers, but nothing yields results.
As a result I usually get just one or two characters or ~5 dots/dashes, but nothing even remotely resembling the size of my input.
I hope someone can help. Cheers in advance for your time.
pytesseract version: 0.3.8
tesseract version: 5.0.0-alpha.20210506
Consider using --psm 4; it's more appropriate for your image. I also recommend rethinking the image preprocessing. Tesseract is not perfect and needs a good input image to work well.
import cv2 as cv
import pytesseract as tsr
img = cv.imread('41DAx.jpg')
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
config = '--psm 4 -c tessedit_char_whitelist=0123456789,'
text = tsr.image_to_string(img, config=config)
print(text)
The above code did not detect every digit in the image, but it got most of them. With a bit more image preprocessing, you may be able to reach your goal.

Can't extract text from an image with python OCR pytesseract

I'm trying to extract text from some images. It worked for hundreds of other images, but in some cases it doesn't find any text. To optimize the images for the extraction phase, all images are converted to black and white. All backgrounds are white and everything else, such as icons and text, is black.
For example, it worked for the image below and successfully found the 'Sleep Timer' text in it. I'm not sure if it's relevant, but the size of the image with the 'Sleep Timer' text is 320 × 351.
But for the below image it doesn't find any text at all. Image size for this one is 161 × 320.
Since I couldn't find the reason, I tried to resize the image but it didn't work.
Here is my code:
from pytesseract import Output
import pytesseract
import cv2
image = cv2.imread('imagePath')
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
results = pytesseract.image_to_data(rgb, output_type=Output.DICT)
for i in range(0, len(results["text"])):
    text = results["text"][i]
    conf = int(results["conf"][i])
    print("Confidence: {}".format(conf))
    print("Text: {}".format(text))
    print("")
It works for me. I tested:
import pytesseract
print(pytesseract.image_to_string('../images/grmgrm.jfif'))
results = pytesseract.image_to_data('../images/grmgrm.jfif', output_type=pytesseract.Output.DICT)
print(results)
Are you getting an error? Show us the error you are getting.
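When inspecting `image_to_data` output like the loop in the question does, it can help to drop the empty entries and low-confidence boxes first, since Tesseract reports a confidence of -1 for layout boxes that are not words. A minimal sketch, assuming the dict layout returned with `output_type=Output.DICT`; the helper name `confident_words` and the `min_conf` parameter are made up for illustration:

```python
def confident_words(results, min_conf=0):
    """Return (text, conf) pairs from an image_to_data dict.

    Skips blank strings and entries whose confidence is below min_conf.
    """
    words = []
    for text, conf in zip(results["text"], results["conf"]):
        # conf may be an int, a float, or a numeric string,
        # depending on the pytesseract version.
        conf = float(conf)
        if text.strip() and conf >= min_conf:
            words.append((text.strip(), conf))
    return words
```

If this returns an empty list for an image, Tesseract found no words at all, which points at preprocessing or page segmentation rather than at the parsing code.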

Empty string with Tesseract

I'm trying to read different cropped images from a big file and I manage to read most of them but there are some of them which return an empty string when I try to read them with tesseract.
The code is just this line:
pytesseract.image_to_string(cv2.imread("img.png"), lang="eng")
Is there anything I can try in order to read this kind of image?
Thanks in advance
Edit:
Thresholding the image before passing it to pytesseract increases the accuracy.
import cv2
import numpy as np
import pytesseract
from PIL import Image

# Grayscale image
img = Image.open('num.png').convert('L')
# Pixels above 125 become white (255), the rest black (0)
ret, img = cv2.threshold(np.array(img), 125, 255, cv2.THRESH_BINARY)
# Older versions of pytesseract need a pillow image
# Convert back if needed
img = Image.fromarray(img.astype(np.uint8))
print(pytesseract.image_to_string(img))
This printed out
5.78 / C02
Edit:
Doing just thresholding on the second image returns 11.1. Another step that can help is to set the page segmentation mode to "Treat the image as a single text line." with the config --psm 7. Doing this on the second image returns 11.1 "202 ', with the quotation marks coming from the partial text at the top. To ignore those, you can also set what characters to search for with a whitelist by the config -c tessedit_char_whitelist=0123456789.%. Everything together:
pytesseract.image_to_string(img, config='--psm 7 -c tessedit_char_whitelist=0123456789.%')
This returns 11.1 202. Clearly pytesseract is having a hard time with the percent symbol, and I'm not sure how to improve on that with image processing or config changes.
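Since the page segmentation mode and the whitelist are both passed through the single `config` string, a small helper can keep the combinations tidy while experimenting. This is just a convenience sketch; `build_tesseract_config` is a hypothetical name, and it assumes the whitelist contains no spaces.

```python
def build_tesseract_config(psm, whitelist=None):
    """Build a pytesseract `config` string from a PSM and optional whitelist."""
    parts = ["--psm {}".format(int(psm))]
    if whitelist:
        # Assumes no spaces in the whitelist; they would split the option.
        parts.append("-c tessedit_char_whitelist={}".format(whitelist))
    return " ".join(parts)
```

For example, `build_tesseract_config(7, "0123456789.%")` produces the same string used in the call above.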
