I want to preprocess this image so that the text is more readable and I can later read the text from the image. I'm new to this, so I don't know whether I should increase contrast, reduce noise, or do something else. Basically, I want to remove the gray areas in the image and keep only the black letters (as clear as possible) on a white background.
import cv2
img = cv2.imread('slika1.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('gray', img)
cv2.waitKey(0)
thresh = 200
img = cv2.threshold(img, thresh, 255, cv2.THRESH_BINARY)[1]
cv2.imshow('filter',img)
cv2.waitKey(0)
I read the image and applied a threshold to it, but I had to try 20 different thresholds before I found one that gives good results.
Is there a better way to solve problems like this?
The problem is that I get pictures with gray areas of different sizes, so sometimes I need to apply a threshold and sometimes I don't. Because of that, I think my threshold-based solution is not very good.
For this image, my code works well:
But for this one it gives terrible results:
Try division normalization in Python/OpenCV. Divide the input by its blurred copy. Then sharpen. You may want to crop the receipt better or mask out the background first.
Input:
import cv2
import numpy as np
import skimage.filters as filters
# read the image
img = cv2.imread('receipt2.jpg')
# convert to gray
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# blur
smooth = cv2.GaussianBlur(gray, (95,95), 0)
# divide gray by blurred image
division = cv2.divide(gray, smooth, scale=255)
# sharpen using unsharp masking
# grayscale input, so no channel argument is needed (multichannel is deprecated in newer scikit-image)
sharp = filters.unsharp_mask(division, radius=1.5, amount=1.5, preserve_range=False)
sharp = (255*sharp).clip(0,255).astype(np.uint8)
# save results
cv2.imwrite('receipt2_division.png',division)
cv2.imwrite('receipt2_division_sharp.png',sharp)
# show results
cv2.imshow('smooth', smooth)
cv2.imshow('division', division)
cv2.imshow('sharp', sharp)
cv2.waitKey(0)
cv2.destroyAllWindows()
Division result:
Sharpened result:
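To see why dividing by a blurred copy flattens uneven lighting, here is a minimal NumPy-only sketch on a synthetic scanline. The lighting ramp, the ink positions, the helper name `divide_normalize`, and the moving average used in place of `cv2.GaussianBlur` are all illustrative assumptions, not part of the answer above.

```python
import numpy as np

def divide_normalize(gray, smooth):
    """Divide an image by its background estimate and rescale to 0..255."""
    ratio = gray.astype(np.float64) / np.maximum(smooth.astype(np.float64), 1.0)
    return np.clip(255.0 * ratio, 0, 255).astype(np.uint8)

# Synthetic 200-pixel scanline: lighting ramps from dark (left) to bright (right)
lighting = np.linspace(80, 240, 200)
ink = np.zeros(200, dtype=bool)
ink[[20, 21, 100, 101, 180, 181]] = True
# Ink reflects half of the local lighting, paper reflects all of it
gray = np.where(ink, 0.5 * lighting, lighting).astype(np.uint8)

# Background estimate: wide moving average with edge padding (stand-in for a Gaussian blur)
k = 31
padded = np.pad(gray.astype(np.float64), k // 2, mode="edge")
smooth = np.convolve(padded, np.ones(k) / k, mode="valid")

flat = divide_normalize(gray, smooth)

# Before division the brightest ink is brighter than the darkest paper,
# so no single global threshold can separate them; after division one can.
print("raw:  max ink", gray[ink].max(), " min paper", gray[~ink].min())
print("flat: max ink", flat[ink].max(), " min paper", flat[~ink].min())
```

In the raw scanline the ink on the bright side is lighter than the paper on the dark side. After division the paper sits near 255 everywhere and the ink near half of that, which is why a fixed threshold (or Otsu) tends to behave much better on the divided image.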
I have the following image:
Initial Image
I am using the following code to rotate the image:
from skimage.transform import rotate
image = cv2.imread('122.png')
rotated = rotate(image,34,cval=1,resize = True)
Once I execute this code, I receive the following image:
Rotated Image
To eliminate the blur on the image, I use the following code to set a threshold. Anything that is not white is turned to black (so the gray spots turn black). The code for that is as follows:
ret, thresh_hold = cv2.threshold(rotated, 0, 100, cv2.THRESH_BINARY)
plt.imshow(thresh_hold)
Instead of getting a nice clear picture, I receive the following:
Choppy Image
Does anyone know what I can do to improve the image quality, or adjust the threshold to create a clearer image?
I attempted to adjust the threshold to different values, but this changed the image to all black or all white.
One way to approach that is to simply antialias the image in Python/OpenCV.
To do that, convert the image to grayscale, blur it, and then apply a contrast stretch.
Adjust the blur sigma to change the antialiasing.
Input:
import cv2
import numpy as np
import skimage.exposure
# load image
img = cv2.imread('122.png')
# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# blur the grayscale image
blur = cv2.GaussianBlur(gray, (0,0), sigmaX=2, sigmaY=2, borderType = cv2.BORDER_DEFAULT)
# stretch so that 255 -> 255 and 127.5 -> 0
result = skimage.exposure.rescale_intensity(blur, in_range=(127.5,255), out_range=(0,255)).astype(np.uint8)
# save output
cv2.imwrite('122_antialiased.png', result)
# Display various images to see the steps
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:
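The stretch step is just a linear remapping, which can be sketched in plain NumPy as below. The helper name `stretch` and the sample values are assumptions for illustration; `skimage.exposure.rescale_intensity` may round slightly differently.

```python
import numpy as np

def stretch(values, lo=127.5, hi=255.0):
    """Clip to [lo, hi], then map lo -> 0 and hi -> 255 linearly."""
    v = np.clip(np.asarray(values, dtype=np.float64), lo, hi)
    return np.rint((v - lo) / (hi - lo) * 255.0).astype(np.uint8)

# A few blurred gray levels: everything at or below mid-gray becomes black,
# and the upper half of the range is spread over the full 0..255 range.
samples = np.array([0, 100, 127, 160, 191, 255], dtype=np.uint8)
stretched = stretch(samples)
print(stretched)
```

Because the blur turns the hard black/white edge into a smooth ramp, this remapping keeps only the upper half of that ramp, which leaves a narrower, softer transition at the edges. A larger blur sigma gives a wider ramp and therefore softer edges.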
I am trying to estimate the area of vegetation in square meters on satellite photos, from the colors. I don't have a training dataset, and therefore cannot do machine learning. So I know the results will not be very good, but I try anyway.
To do this, I apply a filter on the colors thanks to cv2.inRange.
import numpy as np
import cv2
img = cv2.imread('staticmap.png')
upperbound = np.array([70, 255,255])
lowerbound = np.array([40, 40,40])
mask = cv2.inRange(img, lowerbound, upperbound)
imask = mask>0
white = np.full_like(img, [255,255,255], np.uint8)
result = np.zeros_like(img, np.uint8)
result[imask] = white[imask]
cv2.imshow(winname = 'satellite image', mat = img)
cv2.imshow('vegetation detection', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
This gives the following results
So it seems that the detection is not too bad.
Now I would like to use the density of white pixels to detect the areas where there is vegetation and the areas where there is not. I imagine an output like this:
Are there any OpenCV functions that can do this?
You could consider using a Gaussian blur followed by Otsu thresholding like this:
import cv2
# Load image as greyscale
im = cv2.imread('veg.jpg', cv2.IMREAD_GRAYSCALE)
# Apply blur
blur = cv2.GaussianBlur(im,(19,19),0)
# Otsu threshold
_,thr = cv2.threshold(blur,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
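Once you have a binary mask like `thr`, the area estimate from the question is a pixel count times the ground area of one pixel. The sketch below uses a toy mask instead of the real image, and `meters_per_pixel` is a hypothetical value: the real one depends on the zoom level and latitude of your map tiles.

```python
import numpy as np

def vegetation_area_m2(mask, meters_per_pixel):
    """Area covered by non-zero mask pixels, in square meters."""
    return int(np.count_nonzero(mask)) * meters_per_pixel ** 2

# Toy 4x5 mask standing in for `thr`: 255 = vegetation, 0 = everything else
thr_demo = np.zeros((4, 5), dtype=np.uint8)
thr_demo[1:3, 1:4] = 255  # 2 x 3 = 6 vegetation pixels

# Hypothetical ground resolution of 0.5 m per pixel
area = vegetation_area_m2(thr_demo, meters_per_pixel=0.5)
print(area)
```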
I want to auto adjust the brightness and contrast of a color image taken from phone under different lighting conditions. Please help me I am new to OpenCV.
Source:
Input Image
Result:
result
What I am looking for is more of a localized transformation. In essence, I want the shadow to become as light as possible (completely gone, if possible), the darker pixels of the image to get darker and higher in contrast, and the light pixels to get whiter, but not to the point where the image is overexposed.
I have tried CLAHE, histogram equalization, binary thresholding, adaptive thresholding, etc., but nothing has worked.
My initial thought is that I need to neutralize the highlights and bring the darker pixels closer to the average value while keeping the text and lines as dark as possible, and then maybe apply a contrast filter. But I am unable to get this result. Please help.
Here is one way to do that in Python/OpenCV.
Read the input
Increase contrast
Convert original to grayscale
Adaptive threshold
Use the thresholded image to make the background white on the contrast increased image
Save results
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("math_diagram.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 21, 15)
# make background of input white where thresh is white
result = img.copy()
result[thresh==255] = (255,255,255)
# write results to disk
cv2.imwrite("math_diagram_threshold.jpg", thresh)
cv2.imwrite("math_diagram_processed.jpg", result)
# display it
cv2.imshow("THRESHOLD", thresh)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Threshold image:
Result:
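For intuition about what the adaptive (mean) threshold does, here is a rough NumPy re-implementation run on a synthetic page with a shadow. It is a sketch under assumptions: the function name, the toy page, and the edge padding are mine, and OpenCV's border handling and rounding may differ in detail.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=21, c=15):
    """White (255) where a pixel exceeds its local mean minus c, else black (0)."""
    pad = block_size // 2
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image: any window sum comes from four corner lookups
    integral = np.pad(padded.cumsum(axis=0).cumsum(axis=1), ((1, 0), (1, 0)))
    h, w = gray.shape
    b = block_size
    window_sum = (integral[b:b + h, b:b + w] - integral[:h, b:b + w]
                  - integral[b:b + h, :w] + integral[:h, :w])
    local_mean = window_sum / (b * b)
    return np.where(gray > local_mean - c, 255, 0).astype(np.uint8)

# Toy page: brightness falls from 220 to 90 left to right (a shadow),
# with one dark stroke that reflects half of the local brightness.
page = np.tile(np.linspace(220, 90, 60), (40, 1))
page[18:22, 10:50] *= 0.5
page = page.astype(np.uint8)

binary = adaptive_mean_threshold(page)
print("stroke pixels black:", bool((binary[18:22, 10:50] == 0).all()))
```

Each pixel is compared with the average of its own neighborhood, so the shadow shifts the threshold along with the paper brightness. That is why the stroke stays black and the shadowed paper still turns white, even though the lit end of the stroke is brighter than the darkest paper.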
You can use any local binarization method. OpenCV provides one such method, Wolf-Jolion local binarization, which can be applied to the input image. Below is a code snippet as an example:
import cv2
image = cv2.imread('input.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)[:,:,2]
T = cv2.ximgproc.niBlackThreshold(gray, maxValue=255, type=cv2.THRESH_BINARY_INV, blockSize=81, k=0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_WOLF)
grayb = (gray > T).astype("uint8") * 255
cv2.imshow("Binary", grayb)
cv2.waitKey(0)
The output of the code above is shown below. Please note that to use the ximgproc module you need to install the OpenCV contrib package (for example, opencv-contrib-python from pip).
I am doing OCR to extract information from the ID card. However, accuracy is quite low.
My assumption is that removing the background will make OCR more accurate.
I use the ID scanner machine (link) to obtain the grey image below. It seems that the machine uses IR instead of image processing.
Does anyone know how to get the same result using OpenCV or other tools (Photoshop, GIMP, etc.)?
Thanks in advance.
Here are two more methods: adaptive thresholding and division normalization.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("green_card.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh1 = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 51, 25)
# write results to disk
cv2.imwrite("green_card_thresh1.jpg", thresh1)
# apply morphology
kernel = cv2.getStructuringElement(cv2.MORPH_RECT , (11,11))
morph = cv2.morphologyEx(gray, cv2.MORPH_DILATE, kernel)
# divide gray by morphology image
division = cv2.divide(gray, morph, scale=255)
# threshold
thresh2 = cv2.threshold(division, 0, 255, cv2.THRESH_OTSU )[1]
# write results to disk
cv2.imwrite("green_card_thresh2.jpg", thresh2)
# display it
cv2.imshow("thresh1", thresh1)
cv2.imshow("thresh2", thresh2)
cv2.waitKey(0)
Adaptive Thresholding Result:
Division Normalization Result:
EDIT:
Since there are different lighting conditions, contrast adjustment has been added here.
The simplest approach I can think of is this: since the undesired background colours are green and red and the desired font colour is black, simply suppress the red and green colours as follows:
import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread, imsave
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
from skimage import exposure
def adjustContrast(img):
    p2, p98 = np.percentile(img, (2, 98))
    img_rescale = exposure.rescale_intensity(img, in_range=(p2, p98))
    return img_rescale
# Read the image
img = imread('ID_OCR.jpg')
# Contrast Adjustment for each channel
img[:,:,0] = adjustContrast(img[:,:,0]) # R
img[:,:,1] = adjustContrast(img[:,:,1]) # G
img[:,:,2] = adjustContrast(img[:,:,2]) # B
# Suppress unwanted colors
img[img[...,0] > 100] = 255 # R
img[img[...,1] > 100] = 255 # G
# Convert the image to graylevel
img = rgb2gray(img)
# Rescale into 0-255 (scale before casting; casting first would truncate the 0-1 floats)
img = (255 * img).astype(np.uint8)
# Save the results
imsave('Result.png', img)
The image will look like:
The results are not optimal, partly because your image resolution is not high.
In the end, there are many possible solutions and improvements; for example, you can use morphology to make the result look nicer. This is just a simple proposal to solve the problem.
Hi I am looking to improve my performance with pytesseract at digit recognition.
I take my raw image and split it into parts that look like this:
The size can vary.
To this I apply some pre-processing methods like so
image = cv2.imread(im, cv2.IMREAD_GRAYSCALE)
image = cv2.GaussianBlur(image, (1, 1), 0)
kernel = np.ones((5, 5), np.uint8)
result_img = cv2.blur(image, (2, 2))
result_img = cv2.dilate(result_img, kernel, iterations=1)
result_img = cv2.erode(result_img, kernel, iterations=1)
and I get this
I then pass this to pytesseract:
num = pytesseract.image_to_string(result_img, lang='eng',
config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
However this is not good enough for me and often gets numbers wrong.
I am looking for ways to improve, I have tried to keep this minimal and self contained but let me know if I've not been clear and I will elaborate.
Thank you.
You're on the right track by trying to preprocess the image before performing OCR, but you are using an incorrect approach. There is no reason to dilate or erode the image, since these operations are mainly used for removing small noise particles. In addition, your current output is not a binary image. It may look like it only contains black and white pixels, but it is actually a 3-channel BGR image, which is probably why you're getting incorrect OCR results. If you look at Tesseract improve quality, you will notice that for Pytesseract to perform optimal OCR, the image needs to be preprocessed so that the desired text is in black and the background is in white. To do this, we can apply Otsu's threshold to obtain a binary image and then invert it so the text is in the foreground. This gives the preprocessed image that we pass to image_to_string. We use the --psm 6 configuration option to assume a single uniform block of text. Take a look at the configuration options for more settings. Here are the results:
Input image -> Binary -> Invert
Result from Pytesseract OCR
8
Code
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
# Load image, grayscale, Otsu's threshold, invert
image = cv2.imread('1.png')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
invert = 255 - thresh
# OCR
data = pytesseract.image_to_string(invert, lang='eng', config='--psm 6')
print(data)
cv2.imshow('thresh', thresh)
cv2.imshow('invert', invert)
cv2.waitKey()
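If you are curious how Otsu's method picks the threshold without any manual tuning, here is a minimal NumPy sketch. The function name and the synthetic pixel values are assumptions for illustration; it is not OpenCV's implementation, although it follows the same idea of maximizing the between-class variance of the histogram.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t maximizing between-class variance; pixels <= t form class 0."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                 # weight of class 0 for each candidate t
    w1 = 1.0 - w0                        # weight of class 1
    cum_mean = np.cumsum(prob * levels)
    total_mean = cum_mean[-1]
    # Between-class variance; candidates with an empty class get variance 0
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * w0 - cum_mean) ** 2 / (w0 * w1)
    between = np.nan_to_num(between, nan=0.0, posinf=0.0, neginf=0.0)
    return int(np.argmax(between))

# Synthetic patch: 300 dark "digit" pixels and 700 light "paper" pixels
rng = np.random.default_rng(0)
digit = rng.integers(20, 60, size=300)
paper = rng.integers(180, 230, size=700)
patch = np.concatenate([digit, paper]).astype(np.uint8)

t = otsu_threshold(patch)
binary = np.where(patch > t, 255, 0).astype(np.uint8)
print("threshold:", t, " black pixels:", int((binary == 0).sum()))
```

The threshold lands in the gap between the two clusters, so all digit pixels end up black and all paper pixels white. This works well when the histogram is clearly bimodal, as with a dark digit on a light cell; with strong shadows or gradients a local method is usually the safer choice.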