I'm trying to run Arabic OCR on the following ID, but I get a very noisy picture and can't extract any information from it.
Here is my attempt:
import tesserocr
from PIL import Image
import pytesseract
import matplotlib.pyplot as plt
import cv2
import imutils
import numpy as np

image = cv2.imread(r'c:\ahmed\ahmed.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.bilateralFilter(gray, 11, 18, 18)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
kernel = np.ones((2, 2), np.uint8)
gray = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                             cv2.THRESH_BINARY, 11, 2)
#img_dilation = cv2.erode(gray, kernel, iterations=1)
#cv2.imshow("dilation", img_dilation)
cv2.imshow("gray", gray)

text = pytesseract.image_to_string(gray, lang='ara')
print(text)
with open(r"c:\ahmed\file.txt", "w", encoding="utf-8") as myfile:
    myfile.write(text)
cv2.waitKey(0)
result: [image]
sample: [image]
The text on your ID is black, which makes the extraction process easy. All you need to do is threshold the dark pixels and you should be able to get the text out.
Here is a snippet of the code:
import cv2
import numpy as np

# load image in grayscale
image = cv2.imread('AVXjv.jpg', 0)

# remove noise
dst = cv2.blur(image, (3, 3))

# extract dark regions, which correspond to text
val, dst = cv2.threshold(dst, 80, 255, cv2.THRESH_BINARY_INV)

# morphological close to connect separated blobs
dst = cv2.dilate(dst, None)
dst = cv2.erode(dst, None)

cv2.imshow("dst", dst)
cv2.waitKey(0)
And here is the result:
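To actually extract the text from there, here is a minimal sketch of passing the result to Tesseract (my addition, not part of the snippet above; it assumes pytesseract and the ara language data are installed, and inverts the image back since Tesseract expects dark text on a light background):

import pytesseract

# continuing from the snippet above (dst in scope);
# Tesseract expects dark text on a light background, so undo the inversion
ocr_ready = cv2.bitwise_not(dst)
text = pytesseract.image_to_string(ocr_ready, lang='ara')
print(text)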
This is my output using the ImageMagick TextCleaner script:
Script: textcleaner -g -e stretch -f 50 -o 30 -s 1 C:/Users/PC/Desktop/id.jpg C:/Users/PC/Desktop/out.png
Take a look here if you want to install and use the TextCleaner script on Windows... It's a tutorial I made as simple as possible, after some research I did when I was in your same situation.
Now it should be very easy to detect the text and (though I'm not sure how simple) recognize it.
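From there, a minimal sketch of running the cleaned output through Tesseract (this assumes pytesseract and the Arabic language data are installed; the path is the one from the command above):

import pytesseract
from PIL import Image

# OCR the TextCleaner output (path from the textcleaner command above)
cleaned = Image.open(r'C:/Users/PC/Desktop/out.png')
text = pytesseract.image_to_string(cleaned, lang='ara')
print(text)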
I am trying to detect this letter, but Tesseract doesn't seem to recognize it.
import cv2
import pytesseract as tess
img = cv2.imread("letter.jpg")
imggray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
print(tess.image_to_string(imggray))
This is the image in question:
Preprocessing the image (e.g. inverting it) should help, and you can also take advantage of pytesseract's image_to_string config options.
For instance, something along these lines:
import pytesseract
import cv2 as cv
import requests
import numpy as np

# I read this directly from imgur
response = requests.get('https://i.stack.imgur.com/LGFAu.jpg')
nparr = np.frombuffer(response.content, np.uint8)
img = cv.imdecode(nparr, cv.IMREAD_GRAYSCALE)

# simple inversion as preprocessing
neg_img = cv.bitwise_not(img)

# invoke tesseract; --psm 7 treats the image as a single text line
text = pytesseract.image_to_string(neg_img, config='--psm 7')
print(text)
should parse the letter correctly.
Have a look at related questions for some additional info about preprocessing and tesseract options:
Why does pytesseract fail to recognise digits from image with darker background?
Why does pytesseract fail to recognize digits in this simple image?
Why does tesseract fail to read text off this simple image?
@Davide Fiocco's answer is definitely correct.
I just want to show another way of doing it, with adaptive thresholding.
When you apply adaptive thresholding, the result will be:
Now when you read it:
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)
Result:
B
Code:
import cv2
import pytesseract

img = cv2.imread("LGFAu.jpg")
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thr = cv2.adaptiveThreshold(gry, 252, cv2.ADAPTIVE_THRESH_MEAN_C,
                            cv2.THRESH_BINARY_INV, 11, 2)
txt = pytesseract.image_to_string(thr, config="--psm 7")
print(txt)
Is there a way to change an image to grayscale without using cmap in matplotlib?
My function is:
def grayscale(image):
    img = image.copy()
    r = img[:,:,0] * 0.3
    g = img[:,:,1] * 0.59
    b = img[:,:,2] * 0.11
    gray = r + g + b
    img = np.dstack((gray, gray, gray))
    return img

plt.imshow(img)
However, the image I got is just black and white, not grayscale. Then when I tried using gray=r+g+b with plt.imshow(img), I got a green-and-yellow picture. I have searched everywhere for clues, and all I found was the use of cmaps. However, the project I am doing doesn't allow us to use cmap.
Here is a working solution using your code; set your picture path in the image_path variable:
import numpy as np
from PIL import Image
import cv2

def grayscale(image):
    img = np.asarray(Image.open(image))
    r = img[:,:,0] * 0.3
    g = img[:,:,1] * 0.59
    b = img[:,:,2] * 0.11
    gray = r + g + b
    return gray

image_path = "test.jpg"
img = grayscale(image_path)
cv2.imwrite('greyscale.jpg', img)
The trick is that the image was not being loaded properly.
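As a side note, if the goal is also to display the result with plt.imshow without a cmap, here is a minimal sketch (assuming an 8-bit RGB input; the filename is hypothetical): stack the gray channel into three channels and scale to [0, 1], since matplotlib clips float RGB values outside that range, which is why the unscaled dstack result looked black and white.

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

img = np.asarray(Image.open("test.jpg"))  # assumes an 8-bit RGB image
gray = img[:,:,0] * 0.3 + img[:,:,1] * 0.59 + img[:,:,2] * 0.11

# stack the single gray channel into an RGB image and scale to [0, 1];
# matplotlib clips float RGB values outside that range
rgb_gray = np.dstack((gray, gray, gray)) / 255.0
plt.imshow(rgb_gray)
plt.show()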
Friends, I need to implement code which blurs faces in given images (I am not a dev, so this is really difficult for me). I found out that I can use OpenCV and cvlib to do that, and found sample code (from the cvlib repository) which does part of the job.
I understood that I need to get the sub-faces and apply the blur to them, and I could do that, but now I don't know how to add the blurred face back to the original image. Could someone help me with that?
import cvlib as cv
import sys
from cv2 import cv2
import os

# read input image
image = cv2.imread('path')

# apply face detection
faces, confidences = cv.detect_face(image)
print(faces)
print(confidences)

# loop through detected faces
for face, conf in zip(faces, confidences):
    (startX, startY) = face[0], face[1]
    (endX, endY) = face[2], face[3]
    subFace = image[startY:endY, startX:endX]
    subFace = cv2.GaussianBlur(subFace, (23, 23), 30)

# display output
# press any key to close window
cv2.imshow("face_detection", image)
cv2.waitKey()
cv2.imshow("face_detection", subFace)

# release resources
cv2.destroyAllWindows()
I finally figured out how to do it:
import cvlib as cv
import sys
from cv2 import cv2
import os

# read input image
image = cv2.imread('path')

# apply face detection
faces, confidences = cv.detect_face(image)

# print the arrays with the coordinates and the confidences
print(faces)
print(confidences)

# loop through detected faces
for face, conf in zip(faces, confidences):
    (startX, startY) = face[0], face[1]
    (endX, endY) = face[2], face[3]
    # get the subface
    subFace = image[startY:endY, startX:endX]
    # apply gaussian blur over the subface
    subFace = cv2.GaussianBlur(subFace, (23, 23), 30)
    # write the blurred subface back into the original image
    image[startY:startY+subFace.shape[0], startX:startX+subFace.shape[1]] = subFace

cv2.imshow("face_detection", image)
cv2.waitKey()

# save output
cv2.imwrite("face_detection.jpg", image)

# release resources
cv2.destroyAllWindows()
I want to automate the task of entering a set of images into a number-generating system. Before that, I'd like to remove a dotted watermark which is common across all these images.
I tried using Google, Tesseract & ABBYY Reader, and found that the part of the image that does not contain the watermark is recognized well, but the watermarked part is almost impossible to recognize.
I would like to remove the watermark using image processing. I have already tried a few sample codes in OpenCV, Python, MATLAB etc., but none matched my requirements.
Here is sample code in Python that I tried, which changes the brightness & darkness:
import cv2
import numpy as np

img = cv2.imread("d:\\Docs\\WFH_Work\\test.png")

# linear contrast/brightness adjustment: new = alpha * img + beta
alpha = 2.5
beta = -250
new = alpha * img + beta
new = np.clip(new, 0, 255).astype(np.uint8)

cv2.imshow("my window", new)
cv2.waitKey(0)
Unfortunately, I don't know how many pixels the watermark in this image consists of. Is there a way to get rid of this watermark, or to make the digits darker and lighten the watermark via code?
Here is the watermarked image:
I am using dilation to remove the figures, then edge detection to find the watermark. It can then be removed using the mean gray value inside the watermark region.
import cv2
import numpy as np
from matplotlib import pyplot as plt

img = cv2.imread('test.png', 0)

# morphological close with a large kernel removes the thin dark digits,
# leaving the watermark and background
kernel = np.ones((10, 10), np.uint8)
dilation = cv2.dilate(img, kernel, iterations=1)
erosion = cv2.erode(dilation, kernel, iterations=1)
plt.imshow(erosion, cmap='gray')
plt.show()

# smooth, then edge detection to locate the watermark
gray = cv2.bilateralFilter(erosion, 11, 17, 17)
edged = cv2.Canny(gray, 30, 200)
plt.imshow(edged, cmap='gray')
plt.show()
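To finish the removal step, here is a minimal sketch of one way to do it (my own continuation, using division normalization rather than the mean-gray fill described above): since the closed image approximates the background plus watermark, dividing the original by it cancels the watermark while the much darker digits survive.

# continuing from the code above (img, erosion, np, cv2, plt in scope);
# divide the original by the estimated background (digits removed)
background = erosion.astype(np.float32) + 1e-6
normalized = cv2.divide(img.astype(np.float32), background, scale=255)
normalized = np.clip(normalized, 0, 255).astype(np.uint8)

# re-threshold to get dark digits on a clean background
_, cleaned = cv2.threshold(normalized, 0, 255,
                           cv2.THRESH_BINARY + cv2.THRESH_OTSU)
plt.imshow(cleaned, cmap='gray')
plt.show()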
The input is an image (a document) from the scanner, and my task is to crop away the background and return only the document, just like this: Input / Output
I've done this through thresholding and getbbox:
import matplotlib.pyplot as plt
import matplotlib.image as pli
from skimage.filters import threshold_otsu as otsu
from PIL import Image

cnh_gray = Image.open("cnh.jpg").convert('L')
cnh_gray.save('cnhgray.jpg')

img = pli.imread('cnhgray.jpg')
imagem = Image.open('cnhgray.jpg')

# mask of pixels darker than the Otsu threshold (the document region)
thresh = otsu(img)
mask = img < thresh
msk = Image.fromarray(mask, 'L')

# bounding box of the mask, then crop
box = msk.getbbox()
crop = imagem.crop(box)
The problem is that getbbox doesn't work when the document isn't vertical. Since I don't know the angle, how can I rotate the image so that I can use getbbox? If there's another function I can use for tilted images instead of getbbox, please tell me.
Thanks for the help.
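For what it's worth, here is a minimal sketch of one common deskewing approach (an assumption on my part, not a tested answer for this exact image): threshold the image, estimate the tilt angle with OpenCV's minAreaRect over the foreground pixels, and rotate by that angle before cropping as before. Note that the angle convention of minAreaRect differs between OpenCV versions.

import cv2
import numpy as np

img = cv2.imread("cnh.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# inverted Otsu threshold so the document region is white
_, mask = cv2.threshold(gray, 0, 255,
                        cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# minimum-area rectangle around all foreground pixels gives the tilt angle
coords = np.column_stack(np.where(mask > 0)).astype(np.float32)
angle = cv2.minAreaRect(coords)[-1]
if angle < -45:
    angle += 90

# rotate around the image center to deskew
(h, w) = img.shape[:2]
M = cv2.getRotationMatrix2D((w // 2, h // 2), angle, 1.0)
rotated = cv2.warpAffine(img, M, (w, h), flags=cv2.INTER_CUBIC,
                         borderMode=cv2.BORDER_REPLICATE)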