Python cannot read text from an image [Python OCR with Tesseract]

I have an issue with reading exactly two lines of numbers (each line contains at most 3 digits) from an image.
My Python code has a big problem reading data from images like the ones below:
Most of the time it just prints random numbers.
What should I do to make this work?
This is my Python code:
from PIL import ImageGrab, Image
from datetime import datetime
from pytesseract import pytesseract
import numpy as nm

pytesseract.tesseract_cmd = 'F:\\Tesseract\\tesseract'

while True:
    # grab the screen region that contains the two lines of digits
    screenshot = ImageGrab.grab(bbox=(515, 940, 560, 990))
    now = datetime.now()  # renamed so the datetime class is not shadowed
    filename = 'pic_{}.{}.png'.format(now.strftime('%H%M_%S'), now.microsecond / 500000)
    # convert to grayscale and apply a fixed threshold
    gray = screenshot.convert('L')
    bw = nm.asarray(gray).copy()
    bw[bw < 160] = 0
    bw[bw >= 160] = 255
    convertedScreenshot = Image.fromarray(bw)
    tesseract = pytesseract.image_to_string(convertedScreenshot, config='digits --psm 6')
    convertedScreenshot.save(filename)
    print(tesseract)
The image has to have white text on a black background or black text on a white background.
It is also important that the image is saved afterwards.

Tesseract works best on images with black text on a white background. Invert the image before handing it to Tesseract, e.g. by inverting the thresholded array before it is converted back to a PIL image:
bw = 255 - bw
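Alternatively, the inversion can be done on the PIL image itself with ImageOps; a minimal sketch, assuming convertedScreenshot is the 'L'-mode image created in the question's code:
from PIL import ImageOps
# invert the grayscale image so the digits end up dark on a light background
convertedScreenshot = ImageOps.invert(convertedScreenshot)
tesseract = pytesseract.image_to_string(convertedScreenshot, config='digits --psm 6')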

Hey, I was facing a similar problem (I still am), but using a few arguments in the image_to_string function helped.
I was using it for single-digit detection:
d = pytesseract.image_to_string(thr, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
This helped me detect the single digits.
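For the original question's two short lines of digits, the same whitelist idea combined with --psm 6 might look like this (a sketch, not verified against the poster's screenshots):
text = pytesseract.image_to_string(
    convertedScreenshot,
    config='--psm 6 --oem 3 -c tessedit_char_whitelist=0123456789'
)
print(text)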

Related

How to preprocess image better to identify number on same colored background?

I want to find a way to detect the red number 3, which sits on a red background. I've tried changing the contrast of the image, as well as a blur + adaptive thresholding, and neither detects anything. What's interesting is that I can't detect single numbers, but I can detect two numbers next to each other at nearly 100% accuracy using the same two methods. I think it's because the background is lighter when there is just one number, so the OCR has trouble finding it.
Here's the number 3 from the original image (it's 96 dpi): (https://i.stack.imgur.com/t0VR7.jpg)
I changed the contrast on the image by using the following code, and then cropped it to just show the number.
import cv2
import easyocr

img = cv2.imread(path_to_img, 0)
alpha = 3  # Contrast control (1.0-3.0)
beta = 0  # Brightness control (0-100)
images_contrast = cv2.convertScaleAbs(img, alpha=alpha, beta=beta)
cropped = images_contrast[885:917, 1008:1055]
cv2.imshow("contrast.jpg", cropped)
cv2.imwrite("contrast_easyOCR.jpg", cropped)
cv2.waitKey(0)
cv2.destroyAllWindows()
reader = easyocr.Reader(['en'], gpu=False, verbose=False)
result_Kripp_Hp = reader.readtext(cropped, allowlist="-0123456789")
print(result_Kripp_Hp)
This is the result after changing the contrast (image: "3hp after changing contrast").
I also tried a median blur + adaptive thresholding, which gets me this: (https://i.stack.imgur.com/ezpVD.jpg)
Code below:
import cv2
import pytesseract

img = cv2.imread(path_to_img, 0)
img = cv2.medianBlur(img, 3)
adapt_Thresholding = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 2)
images = adapt_Thresholding
cropped_adaptive_thresholding = images[885:917, 1011:1055]
cv2.imshow("adaptiveThresholding.jpg", cropped_adaptive_thresholding)
cv2.imwrite("adaptThreshold_easyOCR.jpg", cropped_adaptive_thresholding)
cv2.waitKey(0)
cv2.destroyAllWindows()
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
num = pytesseract.image_to_data(cropped_adaptive_thresholding, config='--psm 11 -c tessedit_char_whitelist=0123456789')
print(num)
Both of the above result in no detection by easyocr and pytesseract.
Lastly, easyocr is finding 37 at 99.9% confidence using the contrast code near the top of this post, which I find a bit odd (image: easyocr detects this as '37' correctly at 99.9% confidence).
Another thing I tried was messing around with the image in GIMP, and after adding some black pixels to the perimeter of my '3' and then running it through the 'contrast code' above, it detected the 3 correctly at 99.9% confidence. Here's the image: (https://i.stack.imgur.com/fEZ0i.jpg). I think thickening the black line around the 3 would work, but I couldn't figure out how to do this with opencv / python.
Any tips / suggestions (I'm coding in Python) would be greatly appreciated! Thank you.
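One way to thicken the dark strokes around the digit, as the previous paragraph suggests, is a morphological operation in OpenCV. A minimal sketch, assuming the digit is darker than its background; the file name reuses the cropped image written out by the contrast code above:
import cv2
import numpy as np

img = cv2.imread("contrast_easyOCR.jpg", 0)  # cropped, contrast-adjusted digit
# grayscale erosion takes the local minimum, so dark strokes grow thicker
kernel = np.ones((3, 3), np.uint8)
thickened = cv2.erode(img, kernel, iterations=1)
cv2.imwrite("thickened.jpg", thickened)
Increasing the kernel size or the number of iterations thickens the stroke further; whether that is enough for easyocr or pytesseract on this particular crop would need to be tested.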

Why will tesseract not detect this letter?

I am trying to detect this letter, but Tesseract doesn't seem to recognize it.
import cv2
import pytesseract as tess
img = cv2.imread("letter.jpg")
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(tess.image_to_string(imggray))
This is the image in question:
Preprocessing the image (e.g. inverting it) should help, and you could also take advantage of pytesseract's image_to_string config options.
For instance, something along these lines:
import pytesseract
import cv2 as cv
import requests
import numpy as np
import io
# I read this directly from imgur
response = requests.get('https://i.stack.imgur.com/LGFAu.jpg')
nparr = np.frombuffer(response.content, np.uint8)
img = cv.imdecode(nparr, cv.IMREAD_GRAYSCALE)
# simple inversion as preprocessing
neg_img = cv.bitwise_not(img)
# invoke tesseract with options
text = pytesseract.image_to_string(neg_img, config='--psm 7')
print(text)
should parse the letter correctly.
Have a look at related questions for some additional info about preprocessing and tesseract options:
Why does pytesseract fail to recognise digits from image with darker background?
Why does pytesseract fail to recognize digits in this simple image?
Why does tesseract fail to read text off this simple image?
@Davide Fiocco's answer is definitely correct.
I just want to show another way of doing it, with adaptive thresholding.
When you apply adaptive thresholding, the result will be:
Now when you read it:
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)
Result:
B
Code:
import cv2
import pytesseract
img = cv2.imread("LGFAu.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.adaptiveThreshold(gry, 252, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY_INV, 11, 2)
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)

Recognize numbers from an image python

I am trying to extract numbers from in-game screenshots.
Specifically, I'm trying to extract:
98
3430
5/10
from PIL import Image
import pytesseract
image="D:/img/New folder (2)/1.png"
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files/Tesseract-OCR/tesseract.exe'
text = pytesseract.image_to_string(Image.open(image),lang='eng',config='--psm 5')
print(text)
The output is gibberish:
‘t hl) keteeeees
ek pSlaerenen
JU) pgrenmnreserenny
Rates B
d dali eas. 5
cle aM (Sores
|, S| pgranmrerererecons
a cee 3
pea 3
oS :
(geo eenee
ey
=
es A
Okay, so I tried converting it to grayscale, reversing the contrast and using different thresholds, but it all seems to be fairly inaccurate.
The issue seems to be the tilted and smaller numbers. You don't happen to have a higher-resolution image?
The most accurate result I could get was with the following code.
import cv2
import pytesseract
import imutils
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img = cv2.imread('D:/img/New folder (2)/1.png') #test.png is your original image
img = imutils.resize(img, width=1400)
crop = img[340:530, 100:400]
data = pytesseract.image_to_string(crop,config=' --psm 1 --oem 3 -c tessedit_char_whitelist=0123456789/')
print(data)
cv2.imshow('crop', crop)
cv2.waitKey()
Otherwise I recommend one of these methods as described in the similar question
or in this one.
If the text is surrounded by designs, Tesseract suffers a lot.
Instead of running Tesseract directly, try using findContours in OpenCV (after a little blurring and dilating).
You will get bounding boxes, and they should cover that text as well; see the sketch below.
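A minimal sketch of that contour-based approach, assuming OpenCV 4's findContours return signature and light text on a darker background (use THRESH_BINARY_INV for dark text on light); the file name, kernel size and size filter are placeholders to adjust for the real screenshot:
import cv2
import pytesseract

img = cv2.imread("screenshot.png")  # placeholder file name
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (3, 3), 0)
# Otsu threshold, then dilate a little so nearby glyphs merge into one blob
_, thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
dilated = cv2.dilate(thresh, cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3)))
contours, _ = cv2.findContours(dilated, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
for c in contours:
    x, y, w, h = cv2.boundingRect(c)
    if w < 10 or h < 10:  # skip tiny specks
        continue
    roi = gray[y:y + h, x:x + w]
    text = pytesseract.image_to_string(roi, config='--psm 7')
    print((x, y, w, h), text.strip())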

How can I get text from this image with Tesseract?

Currently I'm using the code below to get text from images, and it works fine, but it doesn't work well with these two images; it seems like Tesseract cannot scan these types of image. Please show me how to fix it.
https://i.ibb.co/zNkbhKG/Untitled1.jpg
https://i.ibb.co/XVbjc3s/Untitled3.jpg
def read_screen():
    spinner = Halo(text='Reading screen', spinner='bouncingBar')
    spinner.start()
    screenshot_file = "Screens/to_ocr.png"
    screen_grab(screenshot_file)
    # prepare argparse
    ap = argparse.ArgumentParser(description='HQ_Bot')
    ap.add_argument("-i", "--image", required=False, default=screenshot_file, help="path to input image to be OCR'd")
    ap.add_argument("-p", "--preprocess", type=str, default="thresh", help="type of preprocessing to be done")
    args = vars(ap.parse_args())
    # load the image
    image = cv2.imread(args["image"])
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    if args["preprocess"] == "thresh":
        gray = cv2.threshold(gray, 177, 177,
                             cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
    elif args["preprocess"] == "blur":
        gray = cv2.medianBlur(gray, 3)
    # store grayscale image as a temp file to apply OCR
    filename = "Screens/{}.png".format(os.getpid())
    cv2.imwrite(filename, gray)
    # load the image as a PIL/Pillow image, apply OCR, and then delete the temporary file
    pytesseract.pytesseract.tesseract_cmd = 'C:\\Program Files\\Tesseract-OCR\\tesseract.exe'
    # ENG
    # text = pytesseract.image_to_string(Image.open(filename))
    # VIET
    text = pytesseract.image_to_string(Image.open(filename), lang='vie')
    os.remove(filename)
    os.remove(screenshot_file)
    # show the output images
    '''cv2.imshow("Image", image)
    cv2.imshow("Output", gray)
    os.remove(screenshot_file)
    if cv2.waitKey(0):
        cv2.destroyAllWindows()
    print(text)
    '''
    spinner.succeed()
    spinner.stop()
    return text
You should try different psm modes instead of the default, like so:
target = pytesseract.image_to_string(im, config='--psm 4', lang='vie')
Excerpt from the docs:
Page segmentation modes:
0 Orientation and script detection (OSD) only.
1 Automatic page segmentation with OSD.
2 Automatic page segmentation, but no OSD, or OCR.
3 Fully automatic page segmentation, but no OSD. (Default)
4 Assume a single column of text of variable sizes.
5 Assume a single uniform block of vertically aligned text.
6 Assume a single uniform block of text.
7 Treat the image as a single text line.
8 Treat the image as a single word.
9 Treat the image as a single word in a circle.
10 Treat the image as a single character.
11 Sparse text. Find as much text as possible in no particular order.
12 Sparse text with OSD.
13 Raw line. Treat the image as a single text line,
bypassing hacks that are Tesseract-specific.
So, for example, for Untitled3.jpg you could try --psm 4, and failing that you could try --psm 11 for both images.
Depending on your version of Tesseract you could also try different oem modes:
Use --oem 1 for LSTM and --oem 0 for legacy Tesseract. Please note that legacy Tesseract models are only included in traineddata files from the tessdata repo.
EDIT
Also, as seen in your images, there are two languages, so if you wish to use the lang parameter you need to manually separate the image into two parts, so as not to confuse the Tesseract engine, and use a different lang value for each; a sketch follows.
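A minimal sketch of that split, assuming (hypothetically) that the Vietnamese text is in the top half of the image and the English text in the bottom half; the split point would need adjusting for the real screenshots:
import cv2
import pytesseract

img = cv2.imread("Untitled1.jpg")
h = img.shape[0]
top, bottom = img[:h // 2], img[h // 2:]  # hypothetical 50/50 split
text_vie = pytesseract.image_to_string(top, lang='vie')
text_eng = pytesseract.image_to_string(bottom, lang='eng')
print(text_vie)
print(text_eng)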
EDIT 2
Below is a full working example with Untitled3. What I noticed was your improper use of thresholding: you should set maxval to something bigger than the value you are thresholding at. In my example I set thresh to 177 but maxval to 255, so everything above 177 becomes white (255) and everything below becomes black. I didn't even have to do any additional preprocessing.
import cv2
import pytesseract

image = cv2.imread("./Untitled3.jpg")
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, image = cv2.threshold(image, 177, 255, cv2.THRESH_BINARY)
cv2.namedWindow("TEST")
cv2.imshow("TEST", image)
cv2.waitKey()
text = pytesseract.image_to_string(image, lang='eng')
print(text)
Output:
New York, New York
Salzburg, Austria
Hollywood, California

Image with white-on-purple text not recognized by Tesseract, even converted to B&W

I am using the code below, but Tesseract can't even OCR this simple image.
from PIL import Image
import pytesseract
image_file = Image.open("question.png")
image_file = image_file.convert('1') # convert image to black and white
image_file.save('question.png')
text = pytesseract.image_to_string(image_file, lang = 'eng',config='-psm 6')
print(text)
Note:
1. Tried to turn the picture into black and white
2. Used psm
QUESTION: How to OCR the given image correctly?
Updates:
I used this code now:
from PIL import Image
import pytesseract
image_file = Image.open('purple.png')
image_file = image_file.convert('L') # convert image to grayscale ('L' mode)
image_file.save('question.png')
image_file = Image.open('question.png') #without this line also the output is same
text = pytesseract.image_to_string(image_file, lang = 'eng',config='-psm 6')
print(text)
Output:
Please help me understand why it doesn't work.
With image_file.convert('1') you get this image:
It's a picture with a lot of noise, so Tesseract will fail on it.
I suggest you convert it to a better grayscale image with image_file.convert('L') (for more information about '1' mode and 'L' mode, take a look at the documentation). With 'L' mode you will get:
With this image Tesseract is perfectly able to recognize the text:
from PIL import Image
import pytesseract
image_file = Image.open('purple.png')
image_file = image_file.convert('L') # convert image to grayscale ('L' mode)
image_file.save('question.png')
text = pytesseract.image_to_string(image_file, lang = 'eng',config='-psm 6')
print(text)
Tesseract output:
which of these elements Is a gas In Its
standard state?
