I'm processing images that have a line across the 4 letters. After some processing, I get this result:
The only thing I can't manage is removing that black line, so the text can be recognized more reliably by an OCR library. The final image should look like this (without losing text quality):
I have tried dilating the image to fill the gaps, but I lose quality:
import cv2
from google.colab.patches import cv2_imshow
import numpy as np
image = cv2.imread("image.png", 0)
cv2_imshow(image)
kernel = np.ones((2,2), np.uint8)
dilation2 = cv2.dilate(image, kernel, iterations=1)  # to remove black-line noise
cv2_imshow(dilation2)
After that, I have tried closing the image (dilation followed by erosion), but this doesn't work well either, and I lose more quality:
kernel2 = np.ones((2,2),np.uint8)
closing = cv2.morphologyEx(dilation2, cv2.MORPH_CLOSE, kernel2)
cv2_imshow(closing)
Without knowing the allowed character shapes, this is simply an impossible task. From an image processing point of view, nothing qualitatively distinguishes a character from the same character with a line across it. So any "blind" processing that removes thin lines will also destroy thin features that legitimately appear in the text, such as thin strokes or the spaces between characters, and is likely to worsen the problem.
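For illustration, here is a minimal sketch of the kind of "blind" thin-line removal discussed above, assuming the line is roughly horizontal and thinner than the character strokes; the file name, threshold choice and kernel sizes are placeholders, and as noted this can also eat into legitimate thin strokes:
import cv2
import numpy as np

# Load and binarize so that text and line are white on black (assumed)
img = cv2.imread("image.png", 0)
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# Keep only long horizontal structures (the line) with a wide, flat kernel
line_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (25, 1))
line_mask = cv2.morphologyEx(bw, cv2.MORPH_OPEN, line_kernel)

# Subtract the detected line, then close the small gaps it leaves in the letters
no_line = cv2.subtract(bw, line_mask)
repaired = cv2.morphologyEx(no_line, cv2.MORPH_CLOSE, np.ones((2, 2), np.uint8))

cv2.imwrite("repaired.png", repaired)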
I have been trying to detect and decode QR codes from images captured with a phone. I am having trouble decoding picture a, as the script fails to detect that there is a QR code in it. However, if I crop just the QR code, as in picture c, the script is able to decode it. What modification do I need to make to the script to make it work on picture a?
import cv2
from pyzbar.pyzbar import decode
from pyzbar.pyzbar import ZBarSymbol
image_path = "Test_Image.png"
img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
blur = cv2.GaussianBlur(img, (5, 5), 0)
ret, bw_im = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)
code = decode(bw_im, symbols=[ZBarSymbol.QRCODE])
print (code)
for barcode in decode(bw_im, symbols=[ZBarSymbol.QRCODE]):
    print(barcode.data)
    print(barcode.rect)
    myData = barcode.data.decode('utf-8')
    print(myData)
    break
This should probably be renamed to "Pyzbar not detecting QR code in the image" since you are just using OpenCV for loading the image/image processing.
Regardless, I have run into a similar issue to yours, and I think the limitation is that OpenCV and pyzbar perform worse the smaller the QR code is relative to the entire image.
In my case, I already had some bounding boxes around the objects that had the QR code on them, so I just search within the bounding box, and the smaller search area improves detection/decoding.
Depending on your efficiency requirements, you could try to do a coarser search, finding certain shapes in the image or objects that have a certain color and then use the QR search in those regions of interest. If you need to do real-time processing or keep computational overhead low then this may not be feasible.
You could also just apply some default cropping to your image, which won't be very robust if your QR code is going to be moving around, but would reduce the search area and improve the detection/decoding.
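A minimal sketch of that ROI/cropping idea, assuming you already have (or can estimate) a bounding box around the object carrying the QR code; the box coordinates below are placeholders:
import cv2
from pyzbar.pyzbar import decode, ZBarSymbol

img = cv2.imread("Test_Image.png", cv2.IMREAD_GRAYSCALE)

# Hypothetical bounding box (x, y, width, height) around the object with the QR code
x, y, w, h = 100, 150, 400, 400
roi = img[y:y + h, x:x + w]

# Decoding the smaller region tends to work better than decoding the full frame
for barcode in decode(roi, symbols=[ZBarSymbol.QRCODE]):
    print(barcode.data.decode("utf-8"))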
As far as I have been able to experiment, both pyzbar and OpenCV perform pretty well for decoding, but they have some problems with detecting difficult QRs. You could try QReader for your use case. It uses pyzbar, sweetened with some image preprocessing fallbacks, on the decoding side, so the decoding performance will be quite similar to the one you are using now. However, for the detection part, it uses a YoloV7 based QR detector that heavily increases the detection rate on difficult images.
from qreader import QReader
import cv2
image_path = "Test_Image.png"
# Create a QReader instance
qreader = QReader()
# Get the image (as RGB)
image = cv2.cvtColor(cv2.imread(image_path), cv2.COLOR_BGR2RGB)
# Use the detect_and_decode function to get the decoded QR data
decoded_texts = qreader.detect_and_decode(image=image)
# Print the results
for text in decoded_texts:
    print(text)
I have attached a very simple text image that I want the text from. It is white text on a black background. To the naked eye it seems absolutely legible, but apparently to Tesseract it is rubbish. I have tried changing the oem and psm parameters, but nothing seems to work. Please note that this works for other images, but not for this one.
Please try running it on your machine and see if it works. Or else I might have to change my ocr engine altogether.
Note: it was working earlier, until I tried to add black pixels around the image to help the extraction process. Also, I don't think Tesseract was trained only on black text on a white background; it should be able to handle this too. And if that were the case, why does it work for other text images that have the same format as this one?
Edit: Miraculously, I tried running the script again and this time it was able to extract "Chand" properly, but it failed in the case mentioned below. Also, please look at the parameters I have used; I have read the documentation and I feel this is the right choice. I have added the image for your reference. It is not about just this image: why is Tesseract failing for such simple use cases?
To get the desired result, you need to know the following:
Page segmentation modes
Suggested image processing methods
The input images are written in a bold font, so we need to shrink the bold strokes and then treat the result as a single uniform block of text.
To shrink the strokes, we can use erosion.
After eroding, the OCR result will be:
CHAND
BAKLIWAL
Code:
# Load the library
import cv2
import pytesseract
# Initialize the list
img_lst = ["lKpdZ.png", "ZbDao.png"]
# For each image name in the list
for name in img_lst:
    # Load the image
    img = cv2.imread(name)
    # Convert to gray-scale
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Erode the image
    erd = cv2.erode(gry, None, iterations=2)
    # OCR, treating the image as a single uniform block of text
    txt = pytesseract.image_to_string(erd, config="--psm 6")
    print(txt)
I have a bunch of image like this one:
The yellow boxes are the contents of different profiles (text), and each section is divided by the dashed lines (not the straight lines). So what I need is to split the image into multiple single images along the dashed lines. So far I have tried a lot of Python and cv2 examples with the Hough Line Transform, but none of my attempts detects the lines.
Following #efirvida's comment, here's a very basic approach on how to do it.
What it does is simply check whether each row of pixels in the given picture is equal in value to the first row containing a dashed line, and then crop the picture to split it into multiple pictures...
# import image/array manipulation packages
import cv2
import numpy as np
# read image with OpenCV 2
img = cv2.imread("path/to/file/z4Xje.jpg")
# identify one line of pixels that has dashes
dashes = img[761,:,:]
# check where to split the picture and store that information
splits = [0]
for i in range(img.shape[0]):
    # np.allclose allows small differences between dashed lines
    if np.allclose(img[i,:,:], dashes):
        splits.append(i)
# add last split (height of picture)
splits.append(img.shape[0])
# write each cropped picture to your desired directory
for j in range(len(splits)-1):
    new_img = img[splits[j]:splits[j+1],:]
    cv2.imwrite("/path/to/export/"+str(j)+".jpg", new_img)
It quite certainly isn't a perfect solution but I hope it gives you clues on how to improve your current algorithm!
It gave me these pictures for the one you provided:
first split
second split
third split
Using cv2, I am able to find the contours of text in an image. I would like to remove said text and replace it with the average pixel of the surrounding area.
However, the contours are just a bit smaller than I would like, resulting in a blurred edge where one can barely tell what the original text was:
I once chanced upon a cv2 tutorial with a stylized "j" as the sample image. It showed how to "expand" a contour in a manner similar to adding a positive sample next to every pre-existing positive sample in a mask.
If such a method does not already exist in cv2, how may I do this manually?
The function I sought was dilation, as detailed here:
https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_imgproc/py_morphological_ops/py_morphological_ops.html
import cv2
import numpy as np

# Load the sample image in grayscale
img = cv2.imread('j.png', 0)
# 5x5 structuring element: each foreground pixel grows by about 2 px in every direction
kernel = np.ones((5,5), np.uint8)
# Dilation expands the white (foreground) regions, effectively growing the mask outward
dilation = cv2.dilate(img, kernel, iterations=1)
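Building on that, here is a minimal sketch of how a dilated contour mask could then be used to fill the text region from the surrounding pixels; cv2.inpaint is used here instead of an explicit neighbourhood average, the file name is a placeholder, and the contour step assumes OpenCV 4's findContours return signature:
import cv2
import numpy as np

img = cv2.imread('page.png')  # hypothetical input image containing the text to remove
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Find the text contours on a thresholded image (assumes dark text on a light background)
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

# Draw the contours into a mask, then dilate so the mask fully covers the glyph edges
mask = np.zeros(gray.shape, np.uint8)
cv2.drawContours(mask, contours, -1, 255, -1)
mask = cv2.dilate(mask, np.ones((5, 5), np.uint8), iterations=1)

# Fill the masked region from the surrounding area
result = cv2.inpaint(img, mask, 3, cv2.INPAINT_TELEA)
cv2.imwrite('result.png', result)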
I have been trying to draw a rectangle on a black image using cv2.rectangle. Here is my code (it is just a sample; in the actual code there is a loop, i.e. the values x2, y2, w2, h2 change in a loop):
heir = np.zeros((np.shape(image1)[0],np.shape(image1)[1]),np.uint8);
cv2.rectangle(heir,(x2,y2),(x2+w2,y2+h2),(255,255,0),5)
cv2.imshow("img",heir);
cv2.waitKey()
It is giving the following output:
Why does the image look like that? Why are the boundaries not just a line of width 5?
I have tried, but I am not able to figure it out.
Can't post this in a comment, but it's a negative answer: the same operations work for me on Windows/python 2.7.8/opencv3.1
import numpy as np
import cv2
heir = np.zeros((100,200),np.uint8);
x2=10
y2=20
w2=30
h2=40
cv2.rectangle(heir,(x2,y2),(x2+w2,y2+h2),(255,255,0),5)
cv2.imshow("img",heir);
cv2.waitKey()
Because you are loading the image to be tagged (to draw rectangles on) in grayscale, the colors are converted to grayscale when you add rectangles/bounding boxes.
To fix the issue, open the image in "color" mode. Since you didn't include that part of the code, here is the proposed solution:
tag_img = cv2.imread(MYIMAGE,1)
Pay attention to the second parameter here, which is "1" and means "load the image as color". Read more about reading images here: https://docs.opencv.org/3.0-beta/doc/py_tutorials/py_gui/py_image_display/py_image_display.html
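Putting that together, a minimal sketch (the file name and box values are placeholders) showing that a rectangle drawn on a 3-channel image keeps its color:
import cv2

tag_img = cv2.imread("my_image.png", 1)  # hypothetical file; 1 means load as 3-channel color
x2, y2, w2, h2 = 10, 20, 30, 40          # placeholder box values
cv2.rectangle(tag_img, (x2, y2), (x2 + w2, y2 + h2), (255, 255, 0), 5)
cv2.imshow("img", tag_img)
cv2.waitKey()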