I am working in Python with OpenCV 3.0. In order to find the largest white pixel region, I first thresholded the grayscale image to a binary image.
import cv2
import numpy as np
img = cv2.imread('graimage.png')
img = cv2.resize(img,(400,500))
gray = img.copy()
(thresh, im_bw) = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY )
derp,contours,hierarchy = cv2.findContours(im_bw,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
cnts = max(contours, key=cv2.contourArea)
But it fails with the following error:
cv2.error: ..../opencv/modules/imgproc/src/contours.cpp:198: error: (-210) [Start]FindContours supports only CV_8UC1 images when mode != CV_RETR_FLOODFILL otherwise supports CV_32SC1 images only in function cvStartFindContours.
It looks like this was answered in the comments, but just to mark the question as answered:
CV_8UC1 means 8-bit pixels, unsigned, and only one channel, so grayscale. It looks like you're reading it in with 3 color channels, or CV_8UC3. You can check the image type by printing img.dtype and img.shape. The dtype should be uint8, and the shape should be (#, #), indicating two dimensions. I'm guessing you'll see that shape prints (#, #, 3) for your image as-is, indicating three color channels.
As @user3515225 said, you can fix that by reading the image in as grayscale using cv2.imread('img.png', cv2.IMREAD_GRAYSCALE). That assumes you have no use for color anywhere else, though. If you want a separate grayscale copy of the image, then replace gray = img.copy() with gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) instead (BGR, not RGB, since cv2.imread loads images in BGR order).
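Putting that together, here is a minimal sketch of the corrected pipeline (the Otsu flag is my addition, since a fixed threshold of 0 with THRESH_BINARY would turn every non-black pixel white; it wasn't in the original code):
import cv2
# Read directly as single-channel grayscale (CV_8UC1):
img = cv2.imread('graimage.png', cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (400, 500))
# Let Otsu pick the threshold automatically:
(thresh, im_bw) = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
# OpenCV 3.x returns (image, contours, hierarchy):
_, contours, hierarchy = cv2.findContours(im_bw, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)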
Good day. I'm trying to identify both the printed and handwritten text from the cheque leaf below,
and here is the image after preprocessing, using the code below:
import cv2
import pytesseract
import numpy as np
img = cv2.imread('Images/cheque_leaf.jpg')
# Rescale the image (recommended when working with images under 300 DPI):
img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
h, w = img.shape[:2]
# OpenCV stores images in BGR order by default; convert to grayscale for thresholding:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)[1] # perform Otsu threshold
thresh = cv2.rectangle(thresh, (0, 0), (w, h), (0, 0, 0), 2) # draw a black frame along the image border
# Build a small rectangular structuring element for the morphological step:
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
# The function erodes the source image using the specified structuring element that determines
# the shape of a pixel neighborhood over which the minimum is taken
erode = cv2.erode(thresh, kernel, iterations = 1)
# Extract the text from the eroded image:
custom_config = r'--oem 3 --psm 6'
text = pytesseract.image_to_string(erode, config=custom_config)
Now, using the pytesseract.image_to_string() method to convert the image to text, I'm getting irrelevant output. In the image above I want to identify the date, branch, payee, the amount in both numbers and words, and the digital signature name followed by the account number.
Are there any OCR techniques that could extract exactly that data? Thanks in advance.
The following is just one of several approaches.
I would suggest using the Sauvola thresholding technique. A threshold is calculated for each pixel using the formula mentioned here, which involves the mean and standard deviation of the pixel values within a sliding window.
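For reference, the standard Sauvola formula computes, for each pixel, T(x, y) = m(x, y) * (1 + k * (s(x, y) / R - 1)), where m and s are the local mean and standard deviation inside the window, R is the dynamic range of the standard deviation (128 for 8-bit images), and k is a tunable parameter, typically between 0.2 and 0.5.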
This functionality is available in the skimage library (also known as scikit-image).
The following is a working example for the given image:
import cv2
import numpy as np
from skimage.filters import threshold_sauvola
img = cv2.imread('cheque.jpg', cv2.IMREAD_GRAYSCALE)
# choosing a window size of 13 (feel free to change it and visualize)
thresh_sauvola = threshold_sauvola(img, window_size=13)
binary_sauvola = img > thresh_sauvola
# convert the Boolean array to uint8 (0/1), then rescale to the full 0-255 range:
binary_sauvola_int = binary_sauvola.astype(np.uint8)
result = cv2.normalize(binary_sauvola_int, dst=None, alpha=0, beta=255,norm_type=cv2.NORM_MINMAX, dtype=cv2.CV_8U)
Result:
Note: This result is just a launchpad to try out other image processing techniques to get your desired result.
I have the following function to pre-process an image for Tesseract OCR. In most of the image the text is white, but there can be green, red and purple text too. I want to be able to read all of it, but when I apply thresholding during pre-processing the red text is gone. Is there a way to avoid this? It doesn't happen with the green text unless it's dark green.
def pre_process_img(img):
    # Convert the PIL image to a NumPy array, then RGB to BGR for OpenCV:
    open_cv_image = numpy.array(img)
    open_cv_image = open_cv_image[:, :, ::-1].copy()
    img_gray = cv2.cvtColor(open_cv_image, cv2.COLOR_BGR2GRAY)
    img_gray = cv2.resize(img_gray, None, fx=3, fy=3, interpolation=cv2.INTER_CUBIC)
    img_inverted = 255 - img_gray
    ret, thresh1 = cv2.threshold(img_inverted, 127, 255, cv2.THRESH_BINARY)
    # [DEBUG] show pre-processed image
    # cv2.imshow("inverted", thresh1)
    # cv2.waitKey(0)
    return thresh1
In this function img is a PIL.Image.Image; I convert it to an OpenCV image and apply preprocessing (converting to grayscale, resizing, inverting and binary thresholding). With psm 11 on Tesseract this has given a good enough result.
By the way, if you have any suggestions to improve my pre_process_img function, I'm open to hearing them. I'm new to OpenCV and I just stuck with whatever gave me the best result of everything I've tried.
This is my image here
Convert from BGR to HSV colorspace in Python/OpenCV. Then simply threshold the value channel. Here is the value channel. You will see that all text is white (in this case).
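A minimal sketch of that idea (the file name and the use of Otsu for the final threshold are my assumptions, not part of the original answer):
import cv2
img = cv2.imread('input.png')
# The value channel of HSV carries brightness regardless of hue, so red, green and purple text all stay bright:
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
value = hsv[:, :, 2]
# Threshold the value channel (Otsu here; a fixed threshold works too):
_, binary = cv2.threshold(value, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)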
I'm using py-tesseract for OCR on images as below but I'm unable to get consistent output from the unprocessed images. How can the spotted background be reduced and the numbers highlighted using cv2 to increase accuracy? I'm also interested in keeping the separators in the output string.
The pre-processing below seems to work with some accuracy:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(gray, (7, 7), 0)
(T, threshInv) = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
Getting output using --psm 6: 6.903.722,99
Here's one solution, based on the ideas in a similar post. The main idea is to apply a Hit-or-Miss operation looking for the pattern you want to eliminate. In this case the pattern is one black (or white, if you invert the image) pixel surrounded by pixels of the complementary color. I've also included a thresholding operation with some bias, because some of the characters are easily destroyed (you could really benefit from a higher-resolution image). These are the steps:
Get grayscale image via color conversion
Threshold with bias to get a binary image
Apply the Hit-or-Miss with one central pixel target kernel
Use the result from the prior operation to suppress the noise in the original image
Let's see the code:
# Imports:
import numpy as np
import cv2
# Image path:
path = "D://opencvImages//"
fileName = "8WFNvsZ.jpg"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
thresh, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Use Otsu's threshold value and add some bias:
thresh = 1.05 * thresh
_, binaryImage = cv2.threshold(grayscaleImage, thresh, 255, cv2.THRESH_BINARY_INV )
The first bit of code gets the binary image of the input. Note that I've added some bias to the threshold obtained via Otsu to avoid degrading the characters. This is the result:
Ok, let's apply the Hit-or-Miss operation to get the dot mask:
# Detect isolated pixels with a hit-or-miss style kernel (applied via filter2D):
kernel = np.array([[-1,-1,-1], [-1,1,-1], [-1,-1,-1]])
dotMask = cv2.filter2D(binaryImage, -1, kernel)
# Bitwise-xor mask with binary image to remove dots
result = cv2.bitwise_xor(binaryImage, dotMask)
The dot mask is this:
And the result of subtracting (or XORing) this mask from the original binary image is this:
If I run the inverted (black text on white background) result image on PyOCR I get this string output:
Text is: 6.003.722,09
The other image produces this final result:
And its OCR returns this:
Text is: 4.705.640,00
I want to auto-adjust the brightness and contrast of a color image taken from a phone under different lighting conditions. Please help me, I am new to OpenCV.
Source:
Input Image
Result:
result
What I am looking for is more of a localized transformation. In essence, I want the shadow to get as light as possible (completely gone if possible), the darker pixels of the image to get darker and more contrasted, and the light pixels to get whiter, but not to the point where anything gets overexposed.
I have tried CLAHE, histogram equalization, binary thresholding, adaptive thresholding, etc., but nothing has worked.
My initial thoughts are that I need to neutralize highlights and bring darker pixels more towards the average value while keeping the text and lines as dark as possible, and then maybe apply a contrast filter. But I am unable to get the result; please help me.
Here is one way to do that in Python/OpenCV.
Read the input
Convert the input to grayscale
Apply an adaptive threshold
Use the thresholded image to make the background of the input white
Save the results
Input:
import cv2
import numpy as np
# read image
img = cv2.imread("math_diagram.jpg")
# convert img to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# do adaptive threshold on gray image
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 21, 15)
# make background of input white where thresh is white
result = img.copy()
result[thresh==255] = (255,255,255)
# write results to disk
cv2.imwrite("math_diagram_threshold.jpg", thresh)
cv2.imwrite("math_diagram_processed.jpg", result)
# display it
cv2.imshow("THRESHOLD", thresh)
cv2.imshow("RESULT", result)
cv2.waitKey(0)
Threshold image:
Result:
You can use any local binarization method. In OpenCV there is one such method called Wolf-Jolion local binarization which can be applied to the input image. Below is a code snippet as an example:
import cv2
image = cv2.imread('input.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)[:,:,2]
T = cv2.ximgproc.niBlackThreshold(gray, maxValue=255, type=cv2.THRESH_BINARY_INV, blockSize=81, k=0.1, binarizationMethod=cv2.ximgproc.BINARIZATION_WOLF)
grayb = (gray > T).astype("uint8") * 255
cv2.imshow("Binary", grayb)
cv2.waitKey(0)
The output result from the above code is below. Please note that to use the ximgproc module you need to install the OpenCV contrib package.
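If you don't already have the contrib build, it is typically installed with pip install opencv-contrib-python (in place of the plain opencv-python package).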
I have a problem with fire detection.
My code is:
ret, frame = cap.read()
lab_image = cv2.cvtColor(frame, cv2.COLOR_BGR2LAB)
L , a , b = cv2.split(lab_image)
ret,thresh_L = cv2.threshold(L,70,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
ret,thresh_a = cv2.threshold(a,70,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
ret,thresh_b = cv2.threshold(b,70,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh_image = cv2.merge((thresh_L, thresh_a, thresh_b))
dilation = cv2.dilate(thresh_image, None, iterations=2)
gray = cv2.cvtColor(thresh_image, cv2.COLOR_BGR2GRAY)
(cnts, _) = cv2.findContours(dilation.copy(), cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for c in cnts:
    if cv2.contourArea(c) < args["min_area"]:
        continue
    (x, y, w, h) = cv2.boundingRect(c)
    cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
cv2.imshow('frame1',frame)
When I run this program, I see this error:
FindContours support only 8uC1 and 32sC1 images in function cvStartFindContours
Please help me. Thanks!
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
Use this line on your image to convert it from BGR to grayscale (8UC1) format before finding contours. findContours only supports single-channel 8-bit (or 32-bit signed integer) images.
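Applied to the code above, that would look something like this (converting the dilated result to a single channel right before the contour search):
gray = cv2.cvtColor(dilation, cv2.COLOR_BGR2GRAY)
(cnts, _) = cv2.findContours(gray, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)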
In my solution I had to convert the dtype to uint8.
My image was a binary (single-channel) image, but somewhere in my code thresh_image had been changed to the float32 data type, and cv2.findContours() cannot handle float32.
So I had to explicitly convert float32 to uint8:
thresh_image = thresh_image.astype(np.uint8)
For completeness, the 8UC1 format is 8-bit, unsigned, single channel.
Besides cv2's own grayscale output, a single-channel uint8 array is also valid, in case anyone is building the image outside of cv2 functions and encounters this error.
The documentation of findContours clearly says that it only accepts single-channel images as input (i.e. 8UC1 and 32SC1), but you are passing a 3-channel image. Here is the documentation of findContours: http://docs.opencv.org/2.4/modules/imgproc/doc/structural_analysis_and_shape_descriptors.html#findcontours
I got this error and found the cause. I had created:
gray_img = np.zeros((width, height, 3), np.uint8)
With 3 channels, gray_img doesn't match what findContours expects. Then I recreated it as:
gray_img = np.zeros((width, height), np.uint8)
and it worked.