Recognizing handwritten digits off a scanned image - python

I'm trying to read all of the handwritten digits in the scanned image here
I tried looking through pixel-by-pixel using PIL, cropping the sub-images, then feeding them through a neural network, but the regions that were being cropped never quite lined up and led to a lot of inaccuracy.
I've also tried using OpenCV to find all the grey squares, then crop the images and feed them through a neural network, but I couldn't seem to get it to find all or even only miss a few; it would miss ~30% of squares. (I'm not very experienced with OpenCV, so I could be messing something up)
So I'm just looking for a potential idea/solution for this problem, so any suggestions would be appreciated, thanks in advance!

I assume the input image name is "sqaures.jpg"
First of all, import required libraries and load image in both RGB and Gray format:
import cv2
import numpy as np
image = cv2.imread("squares.jpg", 1)
image_gray = cv2.imread("squares.jpg", 0)
Then, we perform a simple operation to clean some noise from the input image using np.where() function:
image_gray = np.where(image_gray > 240, 255, image_gray)
image_gray = np.where(image_gray <= 240, 0, image_gray)
Because we want to grab the whole square regions from the image. We need to blur the image a little bit before performing the adaptive thresholding method:
image_gray = cv2.blur(image_gray, (5, 5))
im_th = cv2.adaptiveThreshold(image_gray, 255,
cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
cv2.THRESH_BINARY, 115, 1)
kernal = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))
im_th = cv2.morphologyEx(im_th, cv2.MORPH_OPEN, kernal, iterations=3)
Use contour detection in OpenCV to find all possible regions:
_, contours, _ = cv2.findContours(im_th.copy(), cv2.RETR_LIST,
cv2.CHAIN_APPROX_SIMPLE)
contours = sorted(contours, key=cv2.contourArea, reverse=True)
contours.remove(contours[0]) #remove the biggest contour
Finally, try to find the potential square regions based on the ratio of height and width:
square_rects = []
square_areas = []
for i, cnt in enumerate(contours):
(x, y, w, h) = cv2.boundingRect(cnt)
ar = w / float(h)
if 0.9 < ar < 1.1:
square_rects.append(((x,y), (x+w, y+h)))
square_areas.append(w*h) #store area information
We need to remove anything that is too small from the list by doing the follows:
import statistics
median_size_limit= statistics.median(square_areas) * 0.8
square_rects = [rect for i, rect in enumerate(square_rects)
if square_areas[i] > median_size_limit]
You can visually check the output by drawing all the rectangles on the original image:
for rect in square_rects:
cv2.rectangle(image, rect[0], rect[1], (0,255,0), 2)
cv2.imwrite("_output_image.png", image)
cv2.imshow("image", image)
cv2.waitKey()
You can use "square_rects" to locate all the squares and crop them from the original image.
The following is the preview of the final result.
Cheers.

Related

Getting a centre of an irregular shape

I have a irregular shape like below:
I need to get the centre of that white area, I just tried with contour in openCV like below
img=cv.imread('mypic',0)
ret,thresh = cv.threshold(img,127,255,cv.THRESH_BINARY_INV)
cnts = cv.findContours(thresh.copy(), cv.RETR_EXTERNAL,
cv.CHAIN_APPROX_SIMPLE)
cnt = contours[0]
x,y,w,h = cv.boundingRect(cnt)
res_img = cv.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
cv.imwrite('output.png',res_img)
But those cnts doeesn't give me very good results, as you can see the original image and its two small black point below the picture. Can someone point me to a good solution to get a centre of an irregular shape like above?
As I suggested, perform an erosion followed by dilation (an opening operation) on the binary image, then compute central moments and use this information to calculate the centroid. These are the steps:
Get a binary image from the input via Otsu's Thresholding
Compute central moments using cv2.moments
Compute the blob's centroid using the previous information
Let's see the code:
import cv2
import numpy as np
# Set image path
path = "C:/opencvImages/"
fileName = "pn43H.png"
# Read Input image
inputImage = cv2.imread(path+fileName)
# Convert BGR to grayscale:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu + bias adjustment:
threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
This is the binary image you get. Notice the small noise:
The opening operation will get rid of the smalls blobs. A rectangular structuring element will suffice, let's use 3 iterations:
# Apply an erosion + dilation to get rid of small noise:
# Set kernel (structuring element) size:
kernelSize = 3
# Set operation iterations:
opIterations = 3
# Get the structuring element:
maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
# Perform closing:
openingImage = cv2.morphologyEx(binaryImage, cv2.MORPH_OPEN, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
This is the filtered image:
Now, compute the central moments and then the blob's centroid:
# Calculate the moments
imageMoments = cv2.moments(openingImage)
# Compute centroid
cx = int(imageMoments['m10']/imageMoments['m00'])
cy = int(imageMoments['m01']/imageMoments['m00'])
# Print the point:
print("Cx: "+str(cx))
print("Cy: "+str(cy))
Additionally, let's draw this point onto the binary image to check out the results:
# Draw centroid onto BGR image:
bgrImage = cv2.cvtColor(binaryImage, cv2.COLOR_GRAY2BGR)
bgrImage = cv2.line(bgrImage, (cx,cy), (cx,cy), (0,255,0), 10)
This is the result:
One can think of the centroid calculated using the image moments as the "mass" center of the object in relation to the pixel intensity. Depending on the actual shape of the object it may not even be inside the object.
An alternative would be calculating the center of the bounding circle:
thresh = cv2.morphologyEx(thresh, cv2.MORPH_DILATE, np.ones((3, 3)))
contours, _ = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = [c for c in contours if cv2.contourArea(c) > 100]
(x, y), r = cv2.minEnclosingCircle(contours[0])
output = thresh.copy()
cv2.circle(output, (int(x), int(y)), 3, (0, 0, 0), -1)
cv2.putText(output, f"{(int(x), int(y))}", (int(x-50), int(y-10)), cv2.FONT_HERSHEY_PLAIN, 1, (0, 0, 0), 1)
cv2.circle(output, (int(x), int(y)), int(r), (255, 0, 0), 2)
The output of that code looks like this:
You may want to try ConnectedComponentsWithStats function. This returns centroids, areas and bounding box parameters. Also blur and morphology(dilate/erode) helps a lot with noice, as noted above. If you're generous enough with erode, you`re gonna get almost no stray pixels after processing.

How to crop a region of small pixels in an image using OpenCv in Python

I would like to crop out a region full of small curves in an image.
The original image is like the following:
Using the opening morphing, I can remove most of the noises. The result is like:
I tried to use dilate to connect these pixels in my desired region, but the result is not satisfactory.
Is there any function in opencv that can locate this region?
You are on the right track, here's an approach using morphological transformations
Convert image to grayscale and Gaussian blur
Otsu's threshold
Perform morphological operations
Find contours and filter using maximum area
Extract ROI
The idea is to connect the desired region into a single contour then filter using maximum area. This way, we can grab the region as one piece. Here's the detected area
Afterwards, we can extract the region with Numpy slicing
import cv2
image = cv2.imread('1.jpg')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (9,9), 0)
thresh = cv2.threshold(gray,0,255,cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2,2))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
dilate_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9,9))
dilate = cv2.dilate(opening, dilate_kernel, iterations=5)
cnts = cv2.findContours(dilate, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
for c in cnts:
x,y,w,h = cv2.boundingRect(c)
cv2.rectangle(image, (x, y), (x + w, y + h), (36,255,12), 2)
ROI = original[y:y+h, x:x+w]
break
cv2.imshow('thresh', thresh)
cv2.imshow('opening', opening)
cv2.imshow('dilate', dilate)
cv2.imshow('image', image)
cv2.imshow('ROI', ROI)
cv2.waitKey(0)
Here's my approach using NumPy's sum. Just sum the pixel values along the x and y axis individually, set up some thresholds for the minimum number of pixels describing the desired area, and obtain proper column and row indices.
Let's have a look at the following code:
import cv2
import numpy as np
from matplotlib import pyplot as plt
# Read input image; get shape
img = cv2.imread('images/UKf5Z.jpg', cv2.IMREAD_GRAYSCALE)
w, h = img.shape[0:2]
# Threshold to prevent JPG artifacts
_, img = cv2.threshold(img, 240, 255, cv2.THRESH_BINARY)
# Sum pixels along x and y axis
xSum = np.sum(img / 255, axis=0)
ySum = np.sum(img / 255, axis=1)
# Visualize curves
plt.plot(xSum)
plt.plot(ySum)
plt.show()
# Set up thresholds
xThr = 15
yThr = 15
# Find proper row indices
tmp = np.argwhere(xSum > xThr)
tmp = tmp[np.where((tmp > 20) & (tmp < w - 20))]
x1 = tmp[0]
x2 = tmp[-1]
# Find proper column indices
tmp = np.argwhere(ySum > yThr)
tmp = tmp[np.where((tmp > 20) & (tmp < h - 20))]
y1 = tmp[0]
y2 = tmp[-1]
# Visualize result
out = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
cv2.rectangle(out, (x1, y1), (x2, y2), (0, 0, 255), 4)
cv2.imshow('out', out)
cv2.waitKey(0)
The curves of the summations look like this (just for visualization purposes):
And, for visualization I just drew a red rectangle described by the found indices.
As you can see, I manually excluded some "border" area of 20 pixels, since there are some larger artifacts. Depending on the location of your desired area, this may be sufficient. Otherwise, your approach using morphological opening should be kept.
Hope that helps!
EDIT: As suggested by Mark in his answer, using mean instead of sum avoids adaptations regarding varying image dimensions. Changing the code appropriately is left to the reader. :-)
Same idea as #HansHirse's excellent answer, but I had made a diagram that I couldn't share in the comments.

Floor Plan Text Recognition & OCR

The objective is to create bounding boxes using text recognition methods (eg: OpenCV) for US floor plan images, which can then be fed into a text reader (eg: LSTM or tesseract).
Several methods which have been tried cv2.findContours and cv2.boundingRect methods have been attempted but have largely failed to generalise to different types of floor plans (there is a wide deviation in how the floor plans look).
For example, cv2.findContours using grayscale, adaptive thresholds, erosion and dilation (with various iterations) before applying the cv2.findContours function results in the bellow. Note that Bedroom 2 and Kitchen are not being picked up correctly.
Additional example which fails to find any regions:
Any thoughts on text recognition models or cleaning procedures that will improve the accuracy of the text recognition model, preferably with code examples?
This answer is based on the assumption that images are similar one to another (like their size, thickness of walls, letters...). If they are not this wouldn't be a good approach because you would have to change the thresholders for every image. That being said, I would try to transform the image to binary and search for contours. After that you can add criterion like height, weight etc. to filter out the walls. After that You can draw contours on a mask and then dilate the image. That will combine letters close to each other into one contour. Then you can create bounding box for all the contours which is your ROI. Then you can use any OCR on that region. Hope it helps a bit. Cheers!
Example:
import cv2
import numpy as np
img = cv2.imread('floor.png')
mask = np.zeros(img.shape, dtype=np.uint8)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
_, threshold = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV)
_, contours, hierarchy = cv2.findContours(threshold,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
ROI = []
for cnt in contours:
x,y,w,h = cv2.boundingRect(cnt)
if h < 20:
cv2.drawContours(mask, [cnt], 0, (255,255,255), 1)
kernel = np.ones((7,7),np.uint8)
dilation = cv2.dilate(mask,kernel,iterations = 1)
gray_d = cv2.cvtColor(dilation, cv2.COLOR_BGR2GRAY)
_, threshold_d = cv2.threshold(gray_d,150,255,cv2.THRESH_BINARY)
_, contours_d, hierarchy = cv2.findContours(threshold_d,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
for cnt in contours_d:
x,y,w,h = cv2.boundingRect(cnt)
if w > 35:
cv2.rectangle(img,(x,y),(x+w,y+h),(0,255,0),2)
roi_c = img[y:y+h, x:x+w]
ROI.append(roi_c)
cv2.imshow('img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Result:

Removing partial borders with openCV

I'm using OpenCV to find tabular data within images so that I can use an OCR on it. So far I have been able to find the table in the image, find the columns of the table, then find each cell within each column. It works pretty well, but I'm having an issue with the cell walls getting stuck in my images and I'm unable to remove them reliably.This is one example that I'm having difficulty with.This would be another example.
I have tried several approaches to get these images better. I have been having the most luck with finding the contours.
img2gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 180, 255, cv2.THRESH_BINARY)
image_final = cv2.bitwise_and(img2gray, img2gray, mask=mask)
ret, new_img = cv2.threshold(image_final, 180, 255, cv2.THRESH_BINARY_INV)
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS, (5, 3))
# dilate , more the iteration more the dilation
dilated = cv2.dilate(new_img, kernel, iterations=3)
cv2.imwrite('../test_images/find_contours_dilated.png', dilated)
I have been toying with the kernel size and dilation iterations and have found this to be the best configuration.
Another approach I used was with PIL, but it is only really good if the border is uniform around the whole image, which in my cases it is not.
copy = Image.fromarray(img)
try:
bg = Image.new(copy.mode, copy.size, copy.getpixel((0, 0)))
except:
return None
diff = ImageChops.difference(copy, bg)
diff = ImageChops.add(diff, diff, 2.0, -100)
bbox = diff.getbbox()
if bbox:
return np.array(copy.crop(bbox))
There were a few other ideas that I tried, but none got me very far. Any help would be appreciated.
You could try with finding contours and "drawing" them out. Meaning you can draw a border that will be connected with the "walls" mannualy with cv2.rectangle on the borders of your image (that will combine all contours - walls). The biggest two contours will be outer and inner line of your walls and you can draw the contour white to remove the border. Then apply threshold again to remove the rest of the noise. Cheers!
Example:
import cv2
import numpy as np
# Read the image
img = cv2.imread('borders2.png')
# Get image shape
h, w, channels = img.shape
# Draw a rectangle on the border to combine the wall to one contour
cv2.rectangle(img,(0,0),(w,h),(0,0,0),2)
# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Apply binary threshold
_, threshold = cv2.threshold(gray, 180, 255, cv2.THRESH_BINARY_INV)
# Search for contours and sort them by size
_, contours, hierarchy = cv2.findContours(threshold,cv2.RETR_TREE,cv2.CHAIN_APPROX_NONE)
area = sorted(contours, key=cv2.contourArea, reverse=True)
# Draw it out with white color from biggest to second biggest contour
cv2.drawContours(img, ((contours[0]),(contours[1])), -1, (255,255,255), -1)
# Apply binary threshold again to the new image to remove little noises
_, img = cv2.threshold(img, 180, 255, cv2.THRESH_BINARY)
# Display results
cv2.imshow('img', img)
Result:

Circular contour detection in an image python opencv

I am trying to have the circle detected in the following image.
So I did color thresholding and finally got this result.
Because of the lines in the center being removed, the circle is split into many small parts, so if I do contour detection on this, it can only give me each contour separately.
But is there a way I can somehow combine the contours so I could get a circle instead of just pieces of it?
Here is my code for color thresholding:
blurred = cv2.GaussianBlur(img, (9,9), 9)
ORANGE_MIN = np.array((12, 182, 221),np.uint8)
ORANGE_MAX = np.array((16, 227, 255),np.uint8)
hsv_disk = cv2.cvtColor(blurred,cv2.COLOR_BGR2HSV)
disk_threshed = cv2.inRange(hsv_disk, ORANGE_MIN, ORANGE_MAX)
The task is much easier when performed with the red plane only.
I guess there was problem with the thresholds for color segmentation, So the idea here was to generate a binary mask. By inspection your region of interest seems to be brighter than the other regions of input image, so thresholding can simply be done on a grayScale image to simplify the context. Note: You may change this step as per your requirement. After satisfying with the threshold output, you may use cv2.convexHull() to get the convex shape of your contour.
Also keep in mind to select the largest contour and ignore the small contours. The following code can be used to generate the required output:
import cv2
import numpy as np
# Loading the input_image
img = cv2.imread("/Users/anmoluppal/Downloads/3xGG4.jpg")
# Converting the input image to grayScale
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Thresholding the image to get binary mask.
ret, img_thresh = cv2.threshold(img_gray, 145, 255, cv2.THRESH_BINARY)
# Dilating the mask image
kernel = np.ones((3,3),np.uint8)
dilation = cv2.dilate(img_thresh,kernel,iterations = 3)
# Getting all the contours
_, contours, __ = cv2.findContours(dilation, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
# Finding the largest contour Id
largest_contour_area = 0
largest_contour_area_idx = 0
for i in xrange(len(contours)):
if (cv2.contourArea(contours[i]) > largest_contour_area):
largest_contour_area = cv2.contourArea(contours[i])
largest_contour_area_idx = i
# Get the convex Hull for the largest contour
hull = cv2.convexHull(contours[largest_contour_area_idx])
# Drawing the contours for debugging purposes.
img = cv2.drawContours(img, [hull], 0, [0, 255, 0])
cv2.imwrite("./garbage.png", img)

Categories

Resources