How to extract text from an image with a slight background present? - python

I'm looking to extract the text from an image, but the output I am receiving is not very accurate. I wonder if there are any additional steps I can take to process the image further and increase the accuracy of the OCR.
I've looked into some of the different ways to process the image and improve the OCR results. The image is quite small, and I've been able to blow it up slightly, but to no avail.
The image will always be horizontal, and no text other than the numbers will be present. The maximum number will go up to 55000.
An example of the image in question:
After image processing, my image is scaled up by a factor of 4 on the X and Y axes, and some saturation is removed, although this does not improve the accuracy at all.
image = self._process(scale=6, iterations=2)
text = pytesseract.image_to_string(image, config="--psm 7")
My _process method does the following:
# Resize and desaturate.
image = cv2.resize(image, None, fx=scale, fy=scale,
                   interpolation=cv2.INTER_CUBIC)
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Apply dilation and erosion.
kernel = np.ones((1, 1), np.uint8)
image = cv2.dilate(image, kernel, iterations=iterations)
image = cv2.erode(image, kernel, iterations=iterations)
return image
Expected: "10411"
The actual value varies: usually it is an unrecognizable string, or some numbers are parsed but the accuracy rate is too low to be usable.

I don't have experience with OCR, but I think you're on the right track: increasing the image size so the algorithm has more pixels to work with and increasing the distinction between the numbers and the background.
Tricks I added: thresholding the image, which creates a mask where only the white pixels remain. There were a few white blobs that were not numbers, so I used findContours to color those unwanted blobs black.
Result:
Code:
import numpy as np
import cv2
# load image
image = cv2.imread('number.png')
# resize image
image = cv2.resize(image,None,fx=5, fy=5, interpolation = cv2.INTER_CUBIC)
# create grayscale
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# perform threshold
retr, mask = cv2.threshold(gray_image, 230, 255, cv2.THRESH_BINARY)
# find contours (handles both OpenCV 3.x and 4.x return signatures)
contours = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
# draw black over the contours smaller than 200 - removes unwanted blobs
for cnt in contours:
    # print the contour area to determine a suitable threshold
    print(cv2.contourArea(cnt))
    if cv2.contourArea(cnt) < 200:
        cv2.drawContours(mask, [cnt], 0, (0), -1)
# show images
cv2.imshow("Result", mask)
cv2.imshow("Image", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
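Since the question states that only digits (up to 55000) will ever appear, it may also help to restrict Tesseract to a digit whitelist when reading the cleaned mask. A minimal sketch, assuming pytesseract is installed and reusing the mask from the code above:
import pytesseract
# --psm 7 treats the image as a single text line; the whitelist limits output to digits
text = pytesseract.image_to_string(mask, config="--psm 7 -c tessedit_char_whitelist=0123456789")
print(text)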


Compare two different images and find the differences

I have a webcam which takes pictures of a concrete slab. I want to check whether there are objects on the slab or not. The objects could be anything, so they cannot be enumerated in a class. Unfortunately I cannot compare the webcam image directly with an image of the slab without objects, because the camera image can shift minimally in the x and y direction and the lighting is also not always the same. So I cannot use cv2.subtract.
I would prefer a foreground/background subtraction, where the background is just my concrete slab and the foreground is the objects. But since the objects don't move and lie still on the slab, I can't use cv2.createBackgroundSubtractorMOG2() either.
The pictures look like this:
The concrete slab without any objects:
The slab with objects:
In Python/OpenCV, you could do division normalization to even out the illumination and make the background white. Then do your subtraction. Then use morphology to clean up small regions. Then find contours and discard any small regions that are due to noise left after the division normalization and morphology.
Here is how to do division normalization.
Input 1:
Input 2:
import cv2
import numpy as np
# load image
img1 = cv2.imread("img1.jpg")
img2 = cv2.imread("img2.jpg")
# convert to grayscale
gray1 = cv2.cvtColor(img1, cv2.COLOR_BGR2GRAY)
gray2 = cv2.cvtColor(img2, cv2.COLOR_BGR2GRAY)
# blur
blur1 = cv2.GaussianBlur(gray1, (0,0), sigmaX=13, sigmaY=13)
blur2 = cv2.GaussianBlur(gray2, (0,0), sigmaX=13, sigmaY=13)
# divide
divide1 = cv2.divide(gray1, blur1, scale=255)
divide2 = cv2.divide(gray2, blur2, scale=255)
# threshold
thresh1 = cv2.threshold(divide1, 200, 255, cv2.THRESH_BINARY)[1]
thresh2 = cv2.threshold(divide2, 200, 255, cv2.THRESH_BINARY)[1]
# morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3,3))
morph1 = cv2.morphologyEx(thresh1, cv2.MORPH_OPEN, kernel)
morph2 = cv2.morphologyEx(thresh2, cv2.MORPH_OPEN, kernel)
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
morph1 = cv2.morphologyEx(morph1, cv2.MORPH_CLOSE, kernel)
morph2 = cv2.morphologyEx(morph2, cv2.MORPH_CLOSE, kernel)
# write result to disk
cv2.imwrite("img1_division_normalize.jpg", divide1)
cv2.imwrite("img2_division_normalize.jpg", divide2)
cv2.imwrite("img1_division_morph1.jpg", morph1)
cv2.imwrite("img1_division_morph2.jpg", morph2)
# display it
cv2.imshow("img1_norm", divide1)
cv2.imshow("img2_norm", divide2)
cv2.imshow("img1_thresh", thresh1)
cv2.imshow("img2_thresh", thresh2)
cv2.imshow("img1_morph", morph1)
cv2.imshow("img2_morph", morph2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Image 1 Normalized:
Image 2 Normalized:
Image 1 thresholded and morphology cleaned:
Image 2 thresholded and morphology cleaned:
In this case, Image 1 becomes completely white. So it (and subtraction) is not really needed. You just need to find contours for the second image result and if necessary discard tiny regions by area. The rest are your objects.
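A minimal sketch of that last step, continuing from the code above (the area threshold of 500 is an assumption to be tuned per image): since the objects come out dark on the white normalized background, invert the cleaned mask first, then keep only the sufficiently large contours.
# invert so the objects become white blobs, then filter contours by area
inverted = 255 - morph2
cntrs = cv2.findContours(inverted, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]
objects = [c for c in cntrs if cv2.contourArea(c) > 500]
print(f"found {len(objects)} object(s)")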

Retinal blood vessel segmentation using Python

I have the following retina image and I'm trying to trace the vessels (the darker lines coming out of the circle). Here is the original image:
I have tried thresholding the image using division normalization followed by filtering on contour area (as per a different Stack Overflow solution):
import cv2
import numpy as np
# read the image
img = cv2.imread('retina_eye.jpg')
# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# apply morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT , (5,5))
morph = cv2.morphologyEx(gray, cv2.MORPH_DILATE, kernel)
# divide gray by morphology image
division = cv2.divide(gray, morph, scale=255)
# threshold
thresh = cv2.threshold(division, 0, 255, cv2.THRESH_OTSU )[1]
# invert
thresh = 255 - thresh
# find contours and discard contours with small areas
mask = np.zeros_like(thresh)
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]
area_thresh = 10000
for cntr in contours:
    area = cv2.contourArea(cntr)
    if area > area_thresh:
        cv2.drawContours(mask, [cntr], -1, 255, 2)
# apply mask to thresh
result1 = cv2.bitwise_and(thresh, mask)
mask = cv2.merge([mask,mask,mask])
result2 = cv2.bitwise_and(img, mask)
# save results
cv2.imwrite('retina_eye_division.jpg',division)
cv2.imwrite('retina_eye_thresh.jpg',thresh)
cv2.imwrite('retina_eye_mask.jpg',mask)
cv2.imwrite('retina_eye_result1.jpg',result1)
cv2.imwrite('retina_eye_result2.jpg',result2)
# show results
cv2.imshow('morph', morph)
cv2.imshow('division', division)
cv2.imshow('thresh', thresh)
cv2.imshow('mask', mask)
cv2.imshow('result1', result1)
cv2.imshow('result2', result2)
cv2.waitKey(0)
cv2.destroyAllWindows()
Here is the final output I got:
It ended up tracing the vessels, but it also had some background noise.
Ideally I am looking for this output:
Any suggestions for achieving this result?
I have been researching the topic of retinal blood-vessel segmentation using deep learning, and the question you asked is basically the same.
I would like to share my research with you.
In cases where the part of the image we want to segment has very low intensity or contrast, we have to apply CLAHE (Contrast Limited Adaptive Histogram Equalization). It is a very powerful technique for getting good results, and I would like you to try it. Let me also provide some code for this:
import cv2
# read the retinal image (filename from the question above)
bgr = cv2.imread('retina_eye.jpg')
gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equalized = clahe.apply(gray)
Moreover, you can also use Non-Local Means to denoise the image.
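A minimal sketch of that, applied to the CLAHE output from the snippet above (the filter strength h=10 is an assumed starting value, not from the original answer):
# Non-Local Means denoising; increase h for stronger smoothing
denoised = cv2.fastNlMeansDenoising(equalized, None, h=10, templateWindowSize=7, searchWindowSize=21)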
If you want to know the full process of how it is done, I would recommend you go through this accessible paper, which covers the whole process.

How to handle Credit Cards fonts with OpenCV and Tesseract in Python

I'm trying to read cards and output card numbers and expiry date with OpenCV.
import cv2
import pytesseract
filename = 'image1.png'
img = cv2.imread(filename)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
canny = cv2.Canny(gray, 50, 150, apertureSize=3)
result = pytesseract.image_to_string(canny)
print(f"OCR Results: {result}")
cv2.imshow('img', img)
cv2.imshow('canny', canny)
if cv2.waitKey(0) & 0xff == 27:
    cv2.destroyAllWindows()
Image before processing
Image after Canny
The resulting text does not look good. See the screenshot below:
Question: How can I properly handle the card's font for better results? Any idea is highly appreciated.
Thanks.
It looks like the OCR does not work well when it is given the edges of the text.
You'd better apply a threshold instead of using Canny.
I suggest the following stages:
Convert from BGR to HSV color space, and get the S (saturation) channel of HSV.
All gray pixels in S are zero, and colored pixels are above zero.
Convert to binary using an automatic threshold (use cv2.THRESH_OTSU).
Crop the contour with the maximum size, because the image you posted contains some background.
Apply OCR on the cropped area.
Here is the code:
import numpy as np
import cv2
import imutils # https://pypi.org/project/imutils/
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe' # I am using Windows
img = cv2.imread('image1.png') # Read input image
# Convert from BGR to HSV color space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Get the saturation color channel - all gray pixels are zero, and colored pixels are above zero.
s = hsv[:, :, 1]
# Convert to binary using automatic threshold (use cv2.THRESH_OTSU)
ret, thresh = cv2.threshold(s, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
# Find contours in thresh
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cnts = imutils.grab_contours(cnts)
# Find the contour with the maximum area.
c = max(cnts, key=cv2.contourArea)
# Get bounding rectangle
x, y, w, h = cv2.boundingRect(c)
# Crop the bounding rectangle out of thresh
thresh_card = thresh[y:y+h, x:x+w].copy()
# OCR
result = pytesseract.image_to_string(thresh_card)
print(f"OCR Results:\n {result}")
# Show images for debugging
cv2.imshow('s', s)
cv2.imshow('thresh', thresh)
cv2.imshow('thresh_card', thresh_card)
cv2.waitKey(0)
cv2.destroyAllWindows()
OCR Result:
Visa Classic
| By)
4000 1234 Sb18 9010
CARDHOLDER MARE
VISA
Still not perfect...
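One further tweak worth trying (my suggestion, not part of the original answer): since the goal is the card number and expiry date, restricting Tesseract to digits and a slash may cut down misreads such as "Sb18". A minimal sketch, reusing thresh_card from above:
# restrict the character set to digits and '/'; --psm 6 assumes a uniform block of text
config = "--psm 6 -c tessedit_char_whitelist=0123456789/"
digits = pytesseract.image_to_string(thresh_card, config=config)
print(digits)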
s:
thresh:
thresh_card:

Insert an image onto a substrate

Please help me, I need to insert an image onto a substrate.
substrate:
It is a PNG, and the image must be inserted from edge to edge of the frame, into the blank area with the cities.
The problem is that I can't find an example of how to insert an image at the known coordinate points of the corners of a given substrate.
Please help.
My test image:
import cv2
import numpy as np
from skimage import io
frame = cv2.cvtColor(io.imread('as.png'), cv2.COLOR_RGB2BGR)
image = cv2.cvtColor(io.imread("Vw5Rc.jpg"), cv2.COLOR_RGB2BGR)
mask = 255 * np.uint8(np.all(frame == [0, 0, 0], axis=2))
contours, _ = cv2.findContours(mask, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
cnt = min(contours, key=cv2.contourArea)
(x, y, w, h) = cv2.boundingRect(cnt)
# Copy appropriately resized image to frame
frame[y:y+h, x:x+w] = cv2.resize(image, (w, h))
cv2.imwrite('frame.png', frame)
I'm trying to find the area where to insert the image by color; I can find the red-colored area, but what if there is no color?
The static frame has a constant size.
Here is one way to do it in Python/OpenCV, if I understand what you want.
Read the substrate and trees images
Extract the alpha channel from the substrate
Extract the substrate image without the alpha channel
Use the alpha channel to color the base substrate image white where the alpha channel is black to correct a flaw in the base image
Threshold the alpha channel and invert it
Use morphology to remove the grid lines so that there is only one "outer" contour.
Extract the contour and its bounding box
Resize the trees image to the size of the bounding box.
Use numpy indexing and slicing to multiply the region of the substrate with the resized trees image.
Save the results.
Optionally, display the various images.
Substrate Image:
Trees Image:
import cv2
import numpy as np
# load substrate with alpha channel
substrate = cv2.imread("substrate.png", cv2.IMREAD_UNCHANGED)
hh, ww, cc = substrate.shape
# load colored image
trees = cv2.imread("trees.jpg")
# make img white where alpha is black, to merge the alpha channel with the image
alpha = substrate[:,:,3]
img = substrate[:,:,2]  # take a single channel; it is replicated to 3 channels below
img[alpha==0] = 255
img = cv2.merge((img,img,img))
# threshold the img
ret, thresh = cv2.threshold(alpha,0,255,0)
# invert thresh
thresh = 255 - thresh
# make grid lines white in thresh so will get only one contour
kernel = np.ones((9,9), np.uint8)
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# find one outer contour
cntrs = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cntrs = cntrs[0] if len(cntrs) == 2 else cntrs[1]
# get bounding box of contour of white rectangle in thresh
for c in cntrs:
    x,y,w,h = cv2.boundingRect(c)
    #cv2.rectangle(img, (x,y), (x+w,y+h),(0, 0, 255), 2)
# resize trees
trees = cv2.resize(trees,(w,h),0,0)
# generate result
result = img.copy()
result[y:y+h, x:x+w] = img[y:y+h, x:x+w]/255 * trees
# write result to disk
cv2.imwrite("substrate_over_trees.jpg", result)
cv2.imshow("ALPHA", alpha)
cv2.imshow("IMG", img)
cv2.imshow("THRESH", thresh)
cv2.imshow("TREES", trees)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
Note that there is distortion of the trees image, because its aspect ratio does not match the region of the substrate image corresponding to the contour bounding box. This can be changed to maintain the aspect ratio, but then the image will need to be padded to white or some other color to fill the remaining area of the bounding box.
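A minimal sketch of that aspect-preserving variant, replacing the plain cv2.resize call and the blend line in the code above (the white pad color and the centering are my assumptions):
# fit trees inside the box while keeping its aspect ratio, pad the remainder with white
scale = min(w / trees.shape[1], h / trees.shape[0])
new_w, new_h = int(trees.shape[1] * scale), int(trees.shape[0] * scale)
resized = cv2.resize(trees, (new_w, new_h))
padded = np.full((h, w, 3), 255, np.uint8)
x_off, y_off = (w - new_w) // 2, (h - new_h) // 2
padded[y_off:y_off+new_h, x_off:x_off+new_w] = resized
result[y:y+h, x:x+w] = img[y:y+h, x:x+w]/255 * padded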

find rectangle in image and extract text inside of it to save it as new image

I am new to OpenCV so I really need your help. I have a bunch of images like this one:
I need to detect the rectangle on the image, extract the text part from it and save it as a new image.
Can you please help me with this?
Thank you!
Just to add to Danyal's answer, I have added example code with the steps written in comments. For this image you don't even need to perform morphological opening, but for this kind of noise in an image it is usually recommended. Cheers!
import cv2
import numpy as np
# Read the image and create a blank mask
img = cv2.imread('napis.jpg')
h,w = img.shape[:2]
mask = np.zeros((h,w), np.uint8)
# Transform to gray colorspace and invert Otsu threshold the image
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
# ***OPTIONAL FOR THIS IMAGE
### Perform opening (erosion followed by dilation)
#kernel = np.ones((2,2),np.uint8)
#opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
# ***
# Search for contours, select the biggest and draw it on the mask
# (handles both OpenCV 3.x and 4.x; if you use opening, pass "opening" instead of "thresh")
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]
cnt = max(contours, key=cv2.contourArea)
cv2.drawContours(mask, [cnt], 0, 255, -1)
# Perform a bitwise operation
res = cv2.bitwise_and(img, img, mask=mask)
########### The result is a ROI with some noise
########### Clearing the noise
# Create a new mask
mask = np.zeros((h,w), np.uint8)
# Transform the resulting image to gray colorspace and Otsu threshold the image
gray = cv2.cvtColor(res,cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
# Search for contours and select the biggest one again
contours = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]
cnt = max(contours, key=cv2.contourArea)
# Draw it on the new mask and perform a bitwise operation again
cv2.drawContours(mask, [cnt], 0, 255, -1)
res = cv2.bitwise_and(img, img, mask=mask)
# If you are going to use pytesseract, it is wise to add an additional white border
# so that the letters aren't touching the borders
x,y,w,h = cv2.boundingRect(cnt)
cv2.rectangle(res,(x,y),(x+w,y+h),(255,255,255),1)
# Crop the result
final_image = res[y:y+h+1, x:x+w+1]
# Display the result
cv2.imshow('img', final_image)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
One way to do this (if the rectangle sizes are somewhat predictable) is:
Convert the image to black and white
Invert the image
Perform morphological opening on the image from (2) with a horizontal line / rectangle kernel (I tried 2x30).
Perform morphological opening on the image from (2) with a vertical line kernel (I tried 15x2).
Add the images from (3) and (4). You should only have a white rectangle now. You can then remove all rows and columns in the original image that are entirely zero in this image; a sketch of these steps follows below.
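A minimal sketch of these steps under stated assumptions (the filename is reused from the answer above; the kernel sizes are the ones mentioned):
import cv2
import numpy as np
# (1) + (2): grayscale, then inverted Otsu threshold so the rectangle lines become white
img = cv2.imread('napis.jpg', cv2.IMREAD_GRAYSCALE)
_, bw = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# (3) + (4): openings that keep only long horizontal and vertical strokes
horizontal = cv2.morphologyEx(bw, cv2.MORPH_OPEN, np.ones((2, 30), np.uint8))
vertical = cv2.morphologyEx(bw, cv2.MORPH_OPEN, np.ones((15, 2), np.uint8))
# (5): combine, then drop the rows and columns that contain no rectangle pixels
rect = cv2.add(horizontal, vertical)
rows = rect.any(axis=1)
cols = rect.any(axis=0)
cropped = img[np.ix_(rows, cols)]
cv2.imwrite('napis_cropped.png', cropped)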
