I'm having trouble with an image that was taken with my phone: I can't reduce the unwanted noise in the photo.
I have tried increasing the contrast and the brightness, but it isn't effective.
import cv2
import matplotlib.pyplot as plt

img = cv2.imread(image_path, 0)  # read as grayscale
blur = cv2.GaussianBlur(img, (5, 5), 0)
# convert the image to binary
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                               cv2.THRESH_BINARY_INV, 11, 2)
plt.imshow(img)
This is my original photo
This is my photo after processing
This is my expected photo
You have used the right thresholding method to produce the binary image from the original image, but it needs some parameter tuning to get the best result:
import cv2 as cv
import matplotlib.pyplot as plt
img = cv.imread('aa.jpg', 0)
blur = cv.medianBlur(img, 11)
# blur = cv.GaussianBlur(img, (11, 11), 0)
thresh = cv.adaptiveThreshold(blur, 255, cv.ADAPTIVE_THRESH_GAUSSIAN_C,
                              cv.THRESH_BINARY, 31, 4)
plt.subplot(121),plt.imshow(img)
plt.subplot(122),plt.imshow(thresh)
plt.show()
I didn't understand whether you also need to filter out the pen marks; in that case you would need something more complex.
Anyway, the mistake in your code is just in
plt.imshow(img)
you are just plotting your original image, not the filtered one. Try this
plt.imshow(blur)
or this
plt.imshow(thresh)
and check the results
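One caveat worth noting: matplotlib applies a default colormap to single-channel images, so grayscale and binary results can display with false colours. For a faithful black-and-white display, pass cmap='gray':

plt.imshow(thresh, cmap='gray')
plt.show()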
Good day. I'm trying to identify both the printed and handwritten text from the check leaf below.
Here is the image after preprocessing, using the code below:
import cv2
import pytesseract
import numpy as np
img = cv2.imread('Images/cheque_leaf.jpg')
# Rescaling the image (recommended if you're working with images that have a DPI of less than 300)
img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
h, w = img.shape[:2]
# By default OpenCV stores images in BGR order; since pytesseract assumes RGB,
# a BGR-to-RGB conversion is normally needed. Here we convert to grayscale
# instead, which also helps reduce noise:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1] # perform OTSU threshold
thresh = cv2.rectangle(thresh, (0, 0), (w, h), (0, 0, 0), 2) # draw a black border around the whole image
# Erode the image using a specific structuring element that determines the shape
# of the pixel neighborhood over which the minimum is taken; on this binary
# image, erosion thickens the dark characters
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
erode = cv2.erode(thresh, kernel, iterations = 1)
# To extract the text
custom_config = r'--oem 3 --psm 6'
pytesseract.image_to_string(thresh, config=custom_config)  # note: 'erode' is computed above but not passed here
and now I'm using the pytesseract.image_to_string() method to convert the image to text, but I'm getting irrelevant output. In the above image I want to identify the date, branch, payee, the amount in both numbers and words, and the signature name followed by the account number.
Are there any OCR techniques to solve the above problem by extracting the exact data mentioned? Thanks in advance.
The following is just one of several approaches.
I would suggest using the Sauvola thresholding technique. A threshold is calculated for each pixel in the image using a specific formula mentioned here, which involves the mean and standard deviation of the pixel values within a certain window.
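For reference, the Sauvola threshold at pixel (x, y) is T(x, y) = m(x, y) * (1 + k * (s(x, y) / R - 1)), where m and s are the local mean and standard deviation over the window, k is a tunable parameter (typically around 0.2), and R is the dynamic range of the standard deviation (typically 128 for 8-bit images).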
This functionality is available in the skimage library (also known as scikit-image).
Following is the working example for the given image:
import cv2
import numpy as np
from skimage.filters import threshold_sauvola

img = cv2.imread('cheque.jpg', cv2.IMREAD_GRAYSCALE)
# choosing a window size of 13 (feel free to change it and visualize)
thresh_sauvola = threshold_sauvola(img, window_size=13)
binary_sauvola = img > thresh_sauvola
# convert the resulting Boolean array to an unsigned 8-bit integer array (0 - 255)
binary_sauvola_int = binary_sauvola.astype(np.uint8)
result = cv2.normalize(binary_sauvola_int, dst=None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)
Result:
Note: This result is just a launchpad to try out other image processing techniques to get your desired result.
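From here, one possibility (an untested sketch, reusing the --oem 3 --psm 6 config from the question) is to feed the binarized result straight into pytesseract:

import pytesseract

# 'result' is the uint8 image produced above
custom_config = r'--oem 3 --psm 6'
print(pytesseract.image_to_string(result, config=custom_config))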
I'm using a KNN to detect characters; however, it is sensitive to background noise. The image is basically what I'm using, and I have developed a mini script to try to get the best thresholded image. Would anyone have any suggestions/changes to get better results, i.e. make the X more visible? (Attached are the current output and the input.)
import cv2
import numpy as np
image = cv2.imread("resize.png")
img = image
img = cv2.blur(img, (5, 5))
imgGray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) # get grayscale image
imgBlurred = cv2.GaussianBlur(imgGray, (11, 11), 5) # blur
# cv2.imshow("test",imgBlurred)
imgThresh = cv2.adaptiveThreshold(imgBlurred, # input image
255, # make pixels that pass the threshold full white
cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
# use gaussian rather than mean, seems to give better results
cv2.THRESH_BINARY_INV,
# invert so foreground will be white, background will be black
13, # size of a pixel neighborhood used to calculate threshold value
2) # constant subtracted from the mean or weighted mean
imgThreshCopy = imgThresh.copy() # make a copy of the thresh image; this is necessary b/c findContours modifies the image
kernel = np.ones((3, 3), np.uint8)
kernel2 = np.ones((7, 7), np.uint8)
kernel3 = np.ones((1, 1), np.uint8)
imgThreshCopy = cv2.morphologyEx(imgThreshCopy, cv2.MORPH_OPEN, kernel)
imgThreshCopy = cv2.morphologyEx(imgThreshCopy, cv2.MORPH_CLOSE, kernel2)
imgThreshCopy = cv2.dilate(imgThreshCopy, kernel3, iterations=150)
res = imgThreshCopy
cv2.imwrite("test.jpg", res)
cv2.imshow("image", res)
cv2.waitKey(0)
output
input
I am given the below image as input.
I want to remove the background and black cable from the image and keep only the red cable.
Also, reconstruct the red cable wherever it is hidden by the black cable.
NOTE : ONLY IMAGE PROCESSING TECHNIQUES ARE ALLOWED. NO ML OR DL IS ALLOWED.
In this case it is the red wire that I want to extract, but it could be some other colour too, so I want to generalize this.
I guess the reconstruction can be done by dilation and erosion.
But please help me with how to extract the cable.
To improve the results you have to experiment with the morphological operations, and if the colour of the cable is different you will have to adapt the code; there is no general solution, except that I can say deep learning or machine learning may give better results.
Edit: Thanks to @fmw42 for mentioning cv.inRange(); I changed parts of the code to implement thresholding in HSV space.
#========================
# Import Libraries
#========================
import numpy as np
import matplotlib.pyplot as plt
import cv2 as cv
#------------------------
# Read Image
#========================
img = cv.imread("img.jpg")
imgHSV = cv.cvtColor(img, cv.COLOR_BGR2HSV)
img = cv.cvtColor(img, cv.COLOR_BGR2RGB)
#------------------------
# Threshold Image
#========================
## mask of red; red hue wraps around 0/180 in OpenCV's HSV, so two ranges are combined
mask1 = cv.inRange(imgHSV, (0, 30, 0), (10, 255, 255))
mask2 = cv.inRange(imgHSV, (170, 30, 0), (180, 255, 255))
mask = cv.bitwise_or(mask1, mask2)
# replicate the single-channel mask to 3 channels (H x W x 3)
mask = np.tile(mask, (3, 1, 1))
mask = np.swapaxes(mask, 0, 1)
mask = np.swapaxes(mask, 1, 2)
print(mask.shape)
th1 = cv.bitwise_and(img,mask)
#------------------------
# Morphology
#========================
kernel1 = np.ones((7,7), np.uint8)
# a 70x70 kernel whose active area is a centered 50x50 square
kernel2 = np.zeros((70,70), np.uint8)
kernel2[10:60, 10:60] = 1
img_opn = cv.morphologyEx(th1, cv.MORPH_OPEN, kernel1)
img_cls = cv.morphologyEx(img_opn, cv.MORPH_CLOSE, kernel2)
#------------------------
# Results Visualization
#========================
plt.figure(num = "Red Cable")
plt.subplot(221)
plt.imshow(img)
plt.title('Original')
plt.axis('off')
plt.subplot(222)
plt.imshow(th1)
plt.title('Thresholded')
plt.axis('off')
plt.subplot(223)
plt.imshow(img_opn)
plt.title('Opening')
plt.axis('off')
plt.subplot(224)
plt.imshow(img_cls)
plt.title('Result')
plt.axis('off')
plt.show()
#------------------------
#------------------------
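To generalize to another colour, only the hue bounds passed to cv.inRange() need to change. As a hedged example (the bounds are approximate and would need tuning), a blue cable sits around hue 100-130 on OpenCV's 0-180 hue scale and does not wrap around 0, so a single range is enough:

mask = cv.inRange(imgHSV, (100, 30, 0), (130, 255, 255))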
You could just look at every pixel and, if the pixel isn't close to a red threshold, remove it. You will have to understand RGB colour representations. This solution won't keep the white caps from your image.
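A minimal sketch of that per-pixel idea (assuming a BGR image read with OpenCV; the file name and channel thresholds are placeholders to tune):

import cv2
import numpy as np

img = cv2.imread('img.jpg')  # placeholder file name
b, g, r = cv2.split(img)

# keep only strongly red pixels: high red channel, low blue and green
red_enough = (r > 150) & (g < 100) & (b < 100)  # placeholder thresholds

result = np.zeros_like(img)
result[red_enough] = img[red_enough]  # everything else becomes black
cv2.imwrite('red_only.jpg', result)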
I am doing OCR to extract information from an ID card. However, the accuracy is quite low.
My assumption is that removing the background will make OCR more accurate.
I use the ID scanner machine (link) to obtain the grey image below. It seems that the machine uses IR instead of image processing.
Does anyone know how to get the same result using OpenCV or other tools (Photoshop, GIMP, etc.)?
Thanks in advance.
Here are two more methods: adaptive thresholding and division normalization.
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("green_card.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh1 = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 51, 25)
# write results to disk
cv2.imwrite("green_card_thresh1.jpg", thresh1)
# apply morphology: dilation estimates the smoothly varying background
kernel = cv2.getStructuringElement(cv2.MORPH_RECT , (11,11))
morph = cv2.morphologyEx(gray, cv2.MORPH_DILATE, kernel)
# divide gray by the morphology image to even out the illumination
division = cv2.divide(gray, morph, scale=255)
# threshold
thresh2 = cv2.threshold(division, 0, 255, cv2.THRESH_OTSU )[1]
# write results to disk
cv2.imwrite("green_card_thresh2.jpg", thresh2)
# display it
cv2.imshow("thresh1", thresh1)
cv2.imshow("thresh2", thresh2)
cv2.waitKey(0)
Adaptive Thresholding Result:
Division Normalization Result:
EDIT:
Since there are different lighting conditions, contrast adjustment has been added here.
The simple approach in my mind to solve your issue is this: since the undesired background colours are green and red, and the desired font colour is black, simply suppress the red and green colours as follows:
import numpy as np
import matplotlib.pyplot as plt
from skimage.io import imread, imsave
from skimage.color import rgb2gray
from skimage.filters import threshold_otsu
from skimage import exposure
def adjustContrast(img):
    p2, p98 = np.percentile(img, (2, 98))
    img_rescale = exposure.rescale_intensity(img, in_range=(p2, p98))
    return img_rescale
# Read the image
img = imread('ID_OCR.jpg')
# Contrast Adjustment for each channel
img[:,:,0] = adjustContrast(img[:,:,0]) # R
img[:,:,1] = adjustContrast(img[:,:,1]) # G
img[:,:,2] = adjustContrast(img[:,:,2]) # B
# Suppress unwanted colours
img[img[...,0] > 100] = 255 # R
img[img[...,1] > 100] = 255 # G
# Convert the image to graylevel
img = rgb2gray(img)
# Rescale into 0-255 (rgb2gray returns floats in [0, 1])
img = (255 * img).astype(np.uint8)
# Save the results
imsave('Result.png', img)
The image will look like this:
The results are not optimal, partly because your image resolution isn't high.
In the end there are many solutions and improvements; you could also use morphology to make it look nicer. This is just a simple proposal to solve the problem.
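As a hedged illustration of that morphology suggestion (applied to the grayscale result from above; the structuring-element size is a placeholder to tune, and newer skimage versions may prefer footprint_rectangle over square):

from skimage.morphology import closing, square

# close small gaps inside the character strokes
cleaned = closing(img, square(2))
imsave('Result_cleaned.png', cleaned)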
from PIL import Image
import pytesseract
from pdf2image import convert_from_path
import os
import pandas as pd
import cv2
import numpy as np
os.chdir("C:/Users/abhishek_kumar1/Desktop/New folder")
#print(os.getcwd())
pages = convert_from_path("d.pdf", 190, single_file=True,
                          poppler_path='C:/Users/abhishek_kumar1/Downloads/poppler-0.68.0_x86/poppler-0.68.0/bin')
image_counter = 1
for page in pages:
    filename = "page_"+str(image_counter)+".jpg"
    page.save(filename, 'JPEG')
    img = cv2.imread(filename)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cv2.imwrite('grey.png', gray)
    ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU|cv2.THRESH_BINARY_INV)
    cv2.imwrite('Thresh1.png', thresh1)
    rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 3))
    dilation = cv2.dilate(thresh1, rect_kernel, iterations = 6)
    contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    im2 = img.copy()
    ROI_number = 0
    for cnt in contours[::-1]:
        [x, y, w, h] = cv2.boundingRect(cnt)
        ROI = im2[y:y+h, x:x+w]
        #print(str(w), str(h))
        #cv2.putText(im2, str(h), (x, y - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.1, (255, 0, 0), 1)
        #cv2.putText(im2, str(w), (x, y + 10), cv2.FONT_HERSHEY_SIMPLEX, 0.1, (0, 0, 255), 1)
        cv2.imwrite('ROI_{}.jpg'.format(ROI_number), ROI)
        cv2.rectangle(im2, (x, y), (x+w, y+h), (36, 255, 12), 1)
        ROI_number += 1
    cv2.imwrite('contours1.png', im2)
How can I find only this part of the image from the above code? Are there any options for understanding the font type from an image, like bold, italic, or something else?
I'm having trouble extracting only the bold line part from all of the images.
If anybody has a suggestion regarding this, please help me out.
Alex Alex's answer did not work for me. Here is my alternative described in words.
The general idea is that we compare how many black pixels there are with the minimum number of pixels that could still form the characters. This ratio differs between skeleton-to-normal text and skeleton-to-bold text, so we can quite clearly separate normal text from bold text.
Use OCR software to extract bounding boxes of individual words. Optional: Combine individual words into lines of words, for example by word_num in Pytesseract.
Convert the image to grayscale and invert the image colors
Perform Zhang-Suen thinning on the selected area of text on the image (opencv contribution: cv2.ximgproc.thinning)
Sum the white pixels in the thinned image, i.e. where values equal 255 (the white pixels are the letters)
Sum the white pixels in the inverted image
Finally, compute the thickness: (sum_inverted_pixels - sum_skeleton_pixels) / sum_skeleton_pixels (there can be a zero division error; when the sum of the skeleton is 0, return 0 instead)
Normalize the thickness by minimum and maximum values
Apply a threshold for deciding when a word/line of text is bold, e.g. 0.6 or 0.7
See the Python code and result below (a sketch of the thinning-based thickness measure follows after the code):
import cv2
import numpy as np
img = cv2.imread('C.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 160, 255, cv2.THRESH_BINARY)[1]
kernel = np.ones((5,5), np.uint8)
kernel2 = np.ones((3,3), np.uint8)
marker = cv2.dilate(thresh, kernel, iterations = 1)
mask = cv2.erode(thresh, kernel, iterations = 1)
# morphological reconstruction by erosion: repeatedly erode the marker,
# clipping it from below with the mask, until it stops changing
while True:
    tmp = marker.copy()
    marker = cv2.erode(marker, kernel2)
    marker = cv2.max(mask, marker)
    difference = cv2.subtract(tmp, marker)
    if cv2.countNonZero(difference) == 0:
        break
marker_color = cv2.cvtColor(marker, cv2.COLOR_GRAY2BGR)
# OR-ing with the marker whitens everything the reconstruction left white;
# only strokes thick enough to survive the initial dilation (bold text)
# keep their original dark pixels
out = cv2.bitwise_or(img, marker_color)
cv2.imwrite('out.png', out)
cv2.imshow('result', out)
cv2.waitKey(0)
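The code above takes a morphology-based route (reconstruction by erosion). For the thinning-based thickness measure described in the numbered steps, a minimal sketch might look like this (assuming opencv-contrib-python provides cv2.ximgproc.thinning; the file name, crop, and threshold value are placeholders):

import cv2
import numpy as np

# hypothetical input: a crop of one word taken from an OCR bounding box
word = cv2.imread('word_crop.png', cv2.IMREAD_GRAYSCALE)

# invert so that the letters are white on black
inverted = cv2.threshold(word, 160, 255, cv2.THRESH_BINARY_INV)[1]

# Zhang-Suen thinning (requires the opencv-contrib-python package)
skeleton = cv2.ximgproc.thinning(inverted)

sum_skeleton = np.sum(skeleton == 255)  # minimum pixels needed to form the letters
sum_inverted = np.sum(inverted == 255)  # actual letter pixels

# relative stroke thickness; return 0 when the skeleton is empty
thickness = ((sum_inverted - sum_skeleton) / sum_skeleton
             if sum_skeleton > 0 else 0)
print('thickness:', thickness)  # normalize over all words, then threshold at ~0.6-0.7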