I have been trying to detect and decode QR codes from images captured with a phone. I am having trouble with picture (a): the script fails to detect that there is a QR code in it at all. However, if I crop the image down to just the QR code, as in picture (c), the script decodes it without problems. What modification do I need to make to the script so that it works on picture (a)?
import cv2
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol

image_path = "Test_Image.png"

# Load the image as grayscale, blur it and binarize it with Otsu's threshold
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(img, (5, 5), 0)
ret, bw_im = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Detect and decode only QR codes
code = decode(bw_im, symbols=[ZBarSymbol.QRCODE])
print(code)

for barcode in code:
    print(barcode.data)
    print(barcode.rect)
    myData = barcode.data.decode('utf-8')
    print(myData)
    break
This should probably be renamed to "Pyzbar not detecting QR code in the image", since you are only using OpenCV for loading and preprocessing the image.
Regardless, I have run into a similar issue to yours, and I think the limitation is that OpenCV and pyzbar perform worse the smaller the QR code is relative to the entire image.
In my case I already had bounding boxes around the objects carrying the QR codes, so I just search within those boxes; the smaller search area improves detection and decoding.
Depending on your efficiency requirements, you could try a coarser search first, finding certain shapes or objects of a certain color in the image, and then run the QR search only in those regions of interest. If you need real-time processing or low computational overhead, this may not be feasible.
You could also just apply some default cropping to your image. That won't be very robust if your QR code moves around, but it would reduce the search area and improve detection and decoding.
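For illustration, here is a minimal sketch of that idea, assuming you already know (or can guess) a rough region of interest in pixel coordinates; the crop bounds below are placeholders you would replace with your own:
import cv2
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol

img = cv2.imread("Test_Image.png", cv2.IMREAD_GRAYSCALE)

# Hypothetical region of interest (rows y1:y2, columns x1:x2) where the QR code is expected
roi = img[200:600, 300:700]

# Decode the smaller crop instead of the full frame
codes = decode(roi, symbols=[ZBarSymbol.QRCODE])
for barcode in codes:
    print(barcode.data.decode("utf-8"))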
As far as I have been able to experiment, both pyzbar and OpenCV perform pretty well at decoding, but they have problems detecting difficult QR codes. You could try QReader for your use case. On the decoding side it uses pyzbar, sweetened with some image preprocessing fallbacks, so decoding performance will be quite similar to what you have now. However, for the detection part it uses a YOLOv7-based QR detector that greatly increases the detection rate on difficult images.
from qreader import QReader
import cv2
image_path = "Test_Image.png"
# Create a QReader instance
qreader = QReader()
# Get the image (as RGB)
image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
# Use the detect_and_decode function to get the decoded QR data
decoded_texts = qreader.detect_and_decode(image=image)
# Print the results
for text in decoded_texts:
    print(text)
I have attached a very simple text image that I want to extract text from. It is white text on a black background. To the naked eye it looks perfectly legible, but apparently to Tesseract it is rubbish. I have tried changing the oem and psm parameters but nothing seems to work. Please note that this works for other images, just not for this one.
Please try running it on your machine and see if it works; otherwise I might have to change my OCR engine altogether.
Note: it was working earlier, until I tried to add black pixels around the image to help the extraction process. Also, I don't buy that Tesseract was only trained on black text on a white background; it should be able to handle this too. And if that were true, why does it work for other text images that have the same format as this one?
Edit: Miraculously, I tried running the script again and this time it was able to extract "Chand" properly, but it failed on the case mentioned below. Also please look at the parameters I have used; I have read the documentation and I feel they are the right choice. I have added the image for your reference. It is not just about this image: why is Tesseract failing for such simple use cases?
To get the desired result, you need to know about the following:
Page segmentation modes
Suggested image processing methods
The input images are written in a bold font, so we need to thin the bold strokes and then treat the input as a single uniform block of text (--psm 6).
To thin the strokes, we can use erosion.
Result will be (OCR output for each eroded image):
CHAND
BAKLIWAL
Code:
# Load the libraries
import cv2
import pytesseract

# Initialize the list of image names
img_lst = ["lKpdZ.png", "ZbDao.png"]

# For each image name in the list
for name in img_lst:
    # Load the image
    img = cv2.imread(name)

    # Convert to gray-scale
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # Erode the image to thin the bold strokes
    erd = cv2.erode(gry, None, iterations=2)

    # OCR, treating the image as a single uniform block of text
    txt = pytesseract.image_to_string(erd, config="--psm 6")
    print(txt)
I have this Python code, which I use to convert text written in a picture to a string. It works for certain images that have large characters, but not for the one I am trying right now, which contains only digits.
This is the picture:
This is my code:
import pytesseract
from PIL import Image

# Point pytesseract at the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

img = Image.open('img.png')
result = pytesseract.image_to_string(img)
print(result)
Why is it failing to recognise this specific image, and how can I solve this problem?
I have two suggestions.
First, and this is by far the most important: in OCR, preprocessing the images is key to obtaining good results. In your case I suggest binarization. Your images look extremely clean so you shouldn't have any problems, but if you do, you could try binarizing them:
import cv2
from PIL import Image
img = cv2.imread('gradient.png')
# If your image is not already grayscale :
# img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
threshold = 180 # to be determined
_, img_binarized = cv2.threshold(img, threshold, 255, cv2.THRESH_BINARY)
pil_img = Image.fromarray(img_binarized)
And then try the OCR again on the binarized image.
Check whether your image is already grayscale and uncomment the conversion line if it is not.
This is simple thresholding. Adaptive thresholding also exists, but it is noisy and does not bring anything extra in your case.
Binarized images will be much easier for Tesseract to handle. This is already done internally (https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality), but sometimes things get messed up, and very often it is useful to do your own preprocessing.
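For completeness, the binarized image from the snippet above can then be passed straight to pytesseract (a minimal sketch; pil_img comes from the code above):
import pytesseract
# OCR on the binarized PIL image
text = pytesseract.image_to_string(pil_img)
print(text)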
You can check whether the threshold value is right by looking at the images:
import matplotlib.pyplot as plt
plt.imshow(img, cmap='gray')
plt.show()
plt.imshow(img_binarized, cmap='gray')
plt.show()
Second, if what I said above still doesn't work, I know this doesn't answer "why doesn't pytesseract work here", but I suggest you try out tesserocr. It is a maintained Python wrapper for Tesseract.
You could try:
import tesserocr
text_from_ocr = tesserocr.image_to_text(pil_img)
Here is the documentation for tesserocr on PyPI: https://pypi.org/project/tesserocr/
And for opencv-python: https://pypi.org/project/opencv-python/
As a side note, black and white are treated symmetrically in Tesseract, so having white digits on a black background is not a problem.
I have a very specific scene-text detection and parsing problem. I am not even sure you can call it actual scene text.
I have extracted a name field from an identity card photo:
I could immediately start applying OCR to that image, but I believe a further text localisation step could be applied first, to achieve this image: Do you know of any such text localisation algorithms? I have already tried 'FASText by Busta' and 'EAST by argman', and they work decently. Are there any algorithms aimed at this specific task?
After localising the text, I think that is the best time to apply OCR, and here I feel lost. Which OCR engine would you recommend? I have already tried Tesseract, but it just doesn't work well. Would it be a better idea to build your own OCR for document characters using e.g. TensorFlow?
Try to increase the contrast of the image. You can use:
import matplotlib.pyplot as plt
import cv2
import numpy as np
def cvt_BGR2RGB(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

def contrast(img, show=False):
    # CLAHE (Contrast Limited Adaptive Histogram Equalization)
    clahe = cv2.createCLAHE(clipLimit=3., tileGridSize=(8, 8))
    lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)   # convert from BGR to LAB color space
    l, a, b = cv2.split(lab)                     # split into 3 different channels
    l2 = clahe.apply(l)                          # apply CLAHE to the L-channel
    lab = cv2.merge((l2, a, b))                  # merge channels
    img2 = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)  # convert from LAB back to BGR
    if show:
        # plot the original and contrast-enhanced images side by side
        f = plt.figure(figsize=(15, 15))
        ax1 = f.add_subplot(121)
        img1_cvt = cvt_BGR2RGB(img)
        plt.imshow(img1_cvt)
        ax2 = f.add_subplot(122)
        img2_cvt = cvt_BGR2RGB(img2)
        plt.imshow(img2_cvt)
        plt.show()
    return img, img2
And maybe then you can use pytesseract.
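For example, a rough usage sketch of the function above; the filename is just a placeholder for your cropped name-field image, and --psm 7 (treat the image as a single line of text) is an assumption to adjust:
import cv2
import pytesseract

# Load the cropped name-field image (placeholder filename)
img = cv2.imread("name_field.png")

# contrast() is defined above and returns (original, enhanced)
_, enhanced = contrast(img)

# Run Tesseract on the contrast-enhanced image
print(pytesseract.image_to_string(enhanced, config="--psm 7"))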
I am trying to extract data from a scanned form. The form has a standard format similar to the one shown in the image below:
I have tried using pytesseract (Tesseract OCR) to detect the image's text, and it has done a decent job of finding the text and converting the image to text.
However, it essentially just gives me all the detected text without keeping the layout of the data.
I would like to be able to do something like the below:
Find a particular piece of text and then find the associated data below or beside it, similar to this question using OpenCV: Detect text region in image using Opencv.
Is there a way that I can essentially do the following:
Either find all text boxes on the form, perform OCR on each box and see which one is the closest match to the "witnesses:" text, then find the sections immediately below it and perform separate OCR on those.
Or, if the form is standard and I know the approximate location of the "witnesses" text section, can I specify its general location in OpenCV and then just extract the text below it and perform OCR on that?
EDIT: I have tried the code below to detect specific regions of the text. However, it is not specifically identifying the text, just all regions.
import cv2

img = cv2.imread('t2.jpg')
# Upscale the image so small text regions are easier to detect
img = cv2.resize(img, (img.shape[1] * 2, img.shape[0] * 2))
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()

# Detect MSER regions and draw their convex hulls
mser = cv2.MSER_create()
regions, _ = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))

cv2.imshow('img', vis)
cv2.waitKey(0)
Here is the result:
I think you already have the answer in your own post.
I did something similar recently, and this is how I did it:
import cv2
import pytesseract
from PIL import Image

# id_image was loaded with cv2.imread
temp_image = id_image[start_y:end_y, start_x:end_x]
img = Image.fromarray(temp_image)
text = pytesseract.image_to_string(img, config="--psm 7")
So basically, if your format is predefined, you just need to know the location of the fields you want the text from (which you already know), crop them, and then apply the OCR (Tesseract) extraction.
In this case you need to import pytesseract, PIL, cv2 and numpy.
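If the field position is not fixed, here is a rough sketch of the first idea from the question, using pytesseract.image_to_data to locate the "Witnesses:" label and then cropping a band below it; the label string, crop margins and --psm value are all assumptions to be tuned for your form:
import cv2
import pytesseract

img = cv2.imread('t2.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Word-level boxes: parallel lists of text, left, top, width, height
data = pytesseract.image_to_data(gray, output_type=pytesseract.Output.DICT)

for i, word in enumerate(data['text']):
    if word.strip().lower().startswith('witness'):
        x, y, w, h = data['left'][i], data['top'][i], data['width'][i], data['height'][i]
        # Crop a region just below the label (margins are guesses; adjust to your layout)
        roi = gray[y + h : y + 5 * h, x : x + 10 * w]
        print(pytesseract.image_to_string(roi, config='--psm 6'))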
I have tried almost all of the filters in PIL, but without success.
Is there any function in NumPy or SciPy to remove the noise?
Like bwareaopen() in Matlab?
e.g.:
PS: If there is a way to fill the letters in with black, I would be grateful.
Numpy/Scipy can do morphological operations just as well as Matlab can.
See scipy.ndimage.morphology, containing, among other things, binary_opening(), the equivalent of Matlab's bwareaopen().
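A minimal sketch of that approach, assuming img is a binary NumPy array with the noise as white foreground pixels (in recent SciPy versions these functions are also reachable directly under scipy.ndimage):
import numpy as np
from scipy import ndimage

# Opening (erosion followed by dilation) removes foreground specks
# smaller than the structuring element
cleaned = ndimage.binary_opening(img.astype(bool), structure=np.ones((3, 3)))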
NumPy/SciPy solution: scipy.ndimage.morphology.binary_opening. More powerful solution: use scikit-image.
from skimage import morphology
cleaned = morphology.remove_small_objects(YOUR_IMAGE, min_size=64, connectivity=2)
See http://scikit-image.org/docs/0.9.x/api/skimage.morphology.html#remove-small-objects
I don't think this is exactly what you want, but this works (it uses OpenCV, which is built on NumPy):
import cv2

# load the image directly as grayscale
fname = 'Myimage.jpg'
im = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)

# blur the image to suppress small specks of noise
im = cv2.blur(im, (4, 4))

# apply a fixed threshold to binarize the image
_, im = cv2.threshold(im, 175, 250, cv2.THRESH_BINARY)

# show the result
cv2.imshow('', im)
cv2.waitKey(0)
Output (image in a window):
You can save the image using cv2.imwrite.
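For example (the output filename is arbitrary; im is the thresholded image from the code above):
import cv2
# Save the cleaned, thresholded image to disk
cv2.imwrite('cleaned.png', im)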