Writing robust (size invariant) circle detection (Watershed) - python

Edit: Quick Summary so far: I use the watershed algorithm but I have probably a problem with threshold. It didn't detect the brighter circles.
New: Fast radial symmetry transform approach which didn't quite work eiter (Edit 6).
I want to detect circles with different sizes. The use case is to detect coins on an image and to extract them solely. -> Get the single coins as single image files.
For this I used the Hough Circle Transform of open-cv:
(https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_circle/hough_circle.html)
import sys
import cv2 as cv
import numpy as np
def main(argv):
## [load]
default_file = "data/newcommon_1euro.jpg"
filename = argv[0] if len(argv) > 0 else default_file
# Loads an image
src = cv.imread(filename, cv.IMREAD_COLOR)
# Check if image is loaded fine
if src is None:
print ('Error opening image!')
print ('Usage: hough_circle.py [image_name -- default ' + default_file + '] \n')
return -1
## [load]
## [convert_to_gray]
# Convert it to gray
gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
## [convert_to_gray]
## [reduce_noise]
# Reduce the noise to avoid false circle detection
gray = cv.medianBlur(gray, 5)
## [reduce_noise]
## [houghcircles]
rows = gray.shape[0]
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 8,
param1=100, param2=30,
minRadius=0, maxRadius=120)
## [houghcircles]
## [draw]
if circles is not None:
circles = np.uint16(np.around(circles))
for i in circles[0, :]:
center = (i[0], i[1])
# circle center
cv.circle(src, center, 1, (0, 100, 100), 3)
# circle outline
radius = i[2]
cv.circle(src, center, radius, (255, 0, 255), 3)
## [draw]
## [display]
cv.imshow("detected circles", src)
cv.waitKey(0)
## [display]
return 0
if __name__ == "__main__":
main(sys.argv[1:])
I tried all parameters (rows, param1, param2, minRadius, and maxRadius) to optimize the results. This worked very well for one specific image but other images with different sized coins didn't work.
Examples:
Parameters
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 16,
param1=100, param2=30,
minRadius=0, maxRadius=120)
With the same parameters:
Changed to rows/8
I also tried two other approaches of this thread: writing robust (color and size invariant) circle detection with opencv (based on Hough transform or other features)
The approach of fireant leads to this result:
The approach of fraxel didn't work either.
For the first approach: This happens with all different sizes and also the min and max radius.
How could I change the code, so that the coin size is not important or that it finds the parameters itself?
Thank you in advance for any help!
Edit:
I tried the watershed algorithm of Open-cv, as suggested by Alexander Reynolds: https://docs.opencv.org/3.4/d3/db4/tutorial_py_watershed.html
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('data/P1190263.jpg')
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
ret, thresh = cv.threshold(gray,0,255,cv.THRESH_BINARY_INV+cv.THRESH_OTSU)
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv.morphologyEx(thresh,cv.MORPH_OPEN,kernel, iterations = 2)
# sure background area
sure_bg = cv.dilate(opening,kernel,iterations=3)
# Finding sure foreground area
dist_transform = cv.distanceTransform(opening,cv.DIST_L2,5)
ret, sure_fg = cv.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
markers = cv.watershed(img,markers)
img[markers == -1] = [255,0,0]
#Display:
cv.imshow("detected circles", img)
cv.waitKey(0)
It works very well on the test image of the open-cv website:
But it performs very bad on my own images:
I can't really think of a good reason why it's not working on my images?
Edit 2:
As suggested I looked at the intermediate images. The thresh looks not good in my opinion. Next, there is no difference between opening and dist_transform. The corresponding sure_fg shows the detected images.
thresh:
opening:
dist_transform:
sure_bg:
sure_fg:
Edit 3:
I tried all distanceTypes and maskSizes I could find, but the results were quite the same (https://www.tutorialspoint.com/opencv/opencv_distance_transformation.htm)
Edit 4:
Furthermore, I tried to change the (first) threshold function. I used different threshold values instead of the OTSU Function. The best one was with 160, but it was far from good:
In the tutorial it looks like this:
It seems like the coins are somehow too bright to be detected by this algorithm, but I don't know how to improve it?
Edit 5:
Changing the overall contrast and brightness of the image (with cv.convertScaleAbs) didn't improve the results. Increasing the contrast however should increase the "difference" between foreground and background, at least on the normal image. But it even got worse. The corresponding threshold image didn't improved (didn't get more white pixel).
Edit 6: I tried another approach, the fast radial symmetry transform (from here https://github.com/ceilab/frst_python)
import cv2
import numpy as np
def gradx(img):
img = img.astype('int')
rows, cols = img.shape
# Use hstack to add back in the columns that were dropped as zeros
return np.hstack((np.zeros((rows, 1)), (img[:, 2:] - img[:, :-2]) / 2.0, np.zeros((rows, 1))))
def grady(img):
img = img.astype('int')
rows, cols = img.shape
# Use vstack to add back the rows that were dropped as zeros
return np.vstack((np.zeros((1, cols)), (img[2:, :] - img[:-2, :]) / 2.0, np.zeros((1, cols))))
# Performs fast radial symmetry transform
# img: input image, grayscale
# radii: integer value for radius size in pixels (n in the original paper); also used to size gaussian kernel
# alpha: Strictness of symmetry transform (higher=more strict; 2 is good place to start)
# beta: gradient threshold parameter, float in [0,1]
# stdFactor: Standard deviation factor for gaussian kernel
# mode: BRIGHT, DARK, or BOTH
def frst(img, radii, alpha, beta, stdFactor, mode='BOTH'):
mode = mode.upper()
assert mode in ['BRIGHT', 'DARK', 'BOTH']
dark = (mode == 'DARK' or mode == 'BOTH')
bright = (mode == 'BRIGHT' or mode == 'BOTH')
workingDims = tuple((e + 2 * radii) for e in img.shape)
# Set up output and M and O working matrices
output = np.zeros(img.shape, np.uint8)
O_n = np.zeros(workingDims, np.int16)
M_n = np.zeros(workingDims, np.int16)
# Calculate gradients
gx = gradx(img)
gy = grady(img)
# Find gradient vector magnitude
gnorms = np.sqrt(np.add(np.multiply(gx, gx), np.multiply(gy, gy)))
# Use beta to set threshold - speeds up transform significantly
gthresh = np.amax(gnorms) * beta
# Find x/y distance to affected pixels
gpx = np.multiply(np.divide(gx, gnorms, out=np.zeros(gx.shape), where=gnorms != 0),
radii).round().astype(int);
gpy = np.multiply(np.divide(gy, gnorms, out=np.zeros(gy.shape), where=gnorms != 0),
radii).round().astype(int);
# Iterate over all pixels (w/ gradient above threshold)
for coords, gnorm in np.ndenumerate(gnorms):
if gnorm > gthresh:
i, j = coords
# Positively affected pixel
if bright:
ppve = (i + gpx[i, j], j + gpy[i, j])
O_n[ppve] += 1
M_n[ppve] += gnorm
# Negatively affected pixel
if dark:
pnve = (i - gpx[i, j], j - gpy[i, j])
O_n[pnve] -= 1
M_n[pnve] -= gnorm
# Abs and normalize O matrix
O_n = np.abs(O_n)
O_n = O_n / float(np.amax(O_n))
# Normalize M matrix
M_max = float(np.amax(np.abs(M_n)))
M_n = M_n / M_max
# Elementwise multiplication
F_n = np.multiply(np.power(O_n, alpha), M_n)
# Gaussian blur
kSize = int(np.ceil(radii / 2))
kSize = kSize + 1 if kSize % 2 == 0 else kSize
S = cv2.GaussianBlur(F_n, (kSize, kSize), int(radii * stdFactor))
return S
img = cv2.imread('data/P1190263.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
result = frst(gray, 60, 2, 0, 1, mode='BOTH')
cv2.imshow("detected circles", result)
cv2.waitKey(0)
I only get this nearly black output (it has some very dark grey shadows). I don't know what to change and would be thankful for help!

Related

How to improve a variable lens blur algorithm in Python OpenCV?

I want to emulate the blur of a cheap camera lens (like Holga).
Blur is very weak close to the photo center.
And it's getting more decisive close to corners.
I wrote the code and it works in general.
Input image:
Result image:
.
But I feel that it could be done better and faster.
I've found a similar question but it still has no answer.
How to improve an algorithm speed and avoid iteration over pixels?
UPDATE:
It's not the same as standard Gaussian or 2D filter blur with constant kernel size.
import cv2
import numpy as np
import requests
from tqdm import tqdm
import warnings
warnings.filterwarnings("ignore")
def blur(img=None, blur_radius=None, test=False):
# test image loading
if img is None:
test=True
print('test mode ON')
print('loading image...')
url = r'http://www.lenna.org/lena_std.tif'
resp = requests.get(url, stream=True).raw
img = np.asarray(bytearray(resp.read()), dtype="uint8")
img = cv2.imdecode(img, cv2.IMREAD_COLOR)
cv2.imwrite('img_input.png', img)
print('image loaded')
# channels splitting
img_lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
l, a, b = cv2.split(img_lab)
if test:
cv2.imwrite('l_channel.png', l)
print('l channel saved')
# make blur map
height, width = l.shape[:2]
center = np.array([height/2, width/2])
diag = ((height / 2) ** 2 + (width / 2) ** 2) ** 0.5
blur_map = np.linalg.norm(
np.indices(img.shape[:2]) - center[:,None,None] + 0.5,
axis = 0
)
if blur_radius is None:
blur_radius = int(max(height, width) * 0.03)
blur_map = blur_map / diag
blur_map = blur_map * blur_radius
if test:
blur_map_norm = cv2.normalize(blur_map, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_32F)
cv2.imwrite('blur_map.png', blur_map_norm)
print('blur map saved')
# very inefficient blur algorithm!!!
l_blur = np.copy(l)
for x in tqdm(range(width)):
for y in range(height):
kernel_size = int(blur_map[y, x])
if kernel_size == 0:
l_blur[y, x] = l[y, x]
continue
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (kernel_size, kernel_size))
cut = l[
max(0, y - kernel_size):min(height, y + kernel_size),
max(0, x - kernel_size):min(width, x + kernel_size)
]
if cut.shape == kernel.shape:
cut = (cut * kernel).mean()
else:
cut = cut.mean()
l_blur[y, x] = cut
if test: cv2.imwrite('l_blur.png', l_blur); print('l_blur saved')
if test: print('done')
return l_blur
blur()
The only way to implement a filter where the kernel is different for every pixel is to create the kernel for each pixel and apply it in a loop, like OP's code does. The Fourier transform does not apply to this case. Python is a very slow language, the same algorithm implemented in a compiled language would be much faster. Unless there is some predefined structure in how the kernel is created at each pixel, there is no way to reduce the complexity of the algorithm.
For example, the uniform filter with a square kernel (commonly called the "box" filter) can be computed based on the integral image, using only 4 additions per pixel. This implementation should be able to choose a different kernel size at each pixel without any additional cost.
DIPlib has an implementation of an adaptive Gaussian filter [disclaimer: I'm an author of DIPlib, but I did not implement this functionality]. Here is the documentation.
This filter applies a Gaussian filter, but the Gaussian kernel is scaled and rotated differently at every pixel.
Lens blur is not a Gaussian, but it's not easy to see the difference by eye in most cases; the difference matters only if there is a very small dot with high contrast.
OP's case would be implemented as follows:
import diplib as dip
img = dip.ImageRead('examples/trui.ics')
blur_map = dip.CreateRadiusSquareCoordinate(img.Sizes())
blur_map /= dip.Maximum(blur_map)
img_blur = dip.AdaptiveGauss(img, [0, blur_map], sigmas=[5])
(the blur_map here is defined differently, I chose a quadratic function of the distance to the center, because I think it looks really nice; use dip.CreateRadiusCoordinate() to reproduce OP's map).
I've chosen a maximum blur of 5 (this is the sigma, in pixels, of the Gaussian, not the footprint of the kernel), and blur_map here scales this sigma with a factor between 0 in the middle and 1 at the corners of the image.
Another interesting effect would be as follows, with increasing blur tangential to each circle centered in the middle of the image, with very little blur radially:
angle_map = dip.CreatePhiCoordinate(img.Sizes())
img_blur = dip.AdaptiveGauss(img, [angle_map, blur_map], sigmas=[8,1])
Here is one way to apply (uniform, non-varying) lens defocus blur in Python/OpenCV by transforming both the image and filter to the Fourier (frequency) domain.
Read the input
Take dft of input to transform to Fourier domain
Draw a white filled circle on a black background the size of the input as a mask (filter kernel). This is the defocus kernel in the spatial domain, i.e. a circular rect function.
Blur the circle slightly to anti-alias the edge
Roll the mask so that the center is at the origin (top left corner) and normalize so that the sum of values = 1
Take dft of mask to transform to Fourier domain where its amplitude profile is a jinx function.
Multiply the two dft images to apply the blur
Take the idft of the product to transform back to spatial domain
Get the magnitude of the real and imaginary components of the product, clip and convert to uint8 as the result
Save the result
Input:
import numpy as np
import cv2
# read input and convert to grayscale
img = cv2.imread('lena_512_gray.png', cv2.IMREAD_GRAYSCALE)
# do dft saving as complex output
dft_img = np.fft.fft2(img, axes=(0,1))
# create circle mask
radius = 32
mask = np.zeros_like(img)
cy = mask.shape[0] // 2
cx = mask.shape[1] // 2
cv2.circle(mask, (cx,cy), radius, 255, -1)[0]
# blur the mask slightly to antialias
mask = cv2.GaussianBlur(mask, (3,3), 0)
# roll the mask so that center is at origin and normalize to sum=1
mask_roll = np.roll(mask, (256,256), axis=(0,1))
mask_norm = mask_roll / mask_roll.sum()
# take dft of mask
dft_mask_norm = np.fft.fft2(mask_norm, axes=(0,1))
# apply dft_mask to dft_img
dft_shift_product = np.multiply(dft_img, dft_mask_norm)
# do idft saving as complex output
img_filtered = np.fft.ifft2(dft_shift_product, axes=(0,1))
# combine complex real and imaginary components to form (the magnitude for) the original image again
img_filtered = np.abs(img_filtered).clip(0,255).astype(np.uint8)
cv2.imshow("ORIGINAL", img)
cv2.imshow("MASK", mask)
cv2.imshow("FILTERED DFT/IFT ROUND TRIP", img_filtered)
cv2.waitKey(0)
cv2.destroyAllWindows()
# write result to disk
cv2.imwrite("lena_512_gray_mask.png", mask)
cv2.imwrite("lena_dft_numpy_lowpass_filtered_rad32.jpg", img_filtered)
Mask - Filter Kernel In Spatial Domain:
Result for Circle Radius=4:
Result for Circle Radius=8:
Result for Circle Radius=16:
Result for Circle Radius=32
:
ADDITION
Using OpenCV for the dft, etc rather than Numpy, the above becomes:
import numpy as np
import cv2
# read input and convert to grayscale
img = cv2.imread('lena_512_gray.png', cv2.IMREAD_GRAYSCALE)
# do dft saving as complex output
dft_img = cv2.dft(np.float32(img), flags = cv2.DFT_COMPLEX_OUTPUT)
# create circle mask
radius = 32
mask = np.zeros_like(img)
cy = mask.shape[0] // 2
cx = mask.shape[1] // 2
cv2.circle(mask, (cx,cy), radius, 255, -1)[0]
# blur the mask slightly to antialias
mask = cv2.GaussianBlur(mask, (3,3), 0)
# roll the mask so that center is at origin and normalize to sum=1
mask_roll = np.roll(mask, (256,256), axis=(0,1))
mask_norm = mask_roll / mask_roll.sum()
# take dft of mask
dft_mask_norm = cv2.dft(np.float32(mask_norm), flags = cv2.DFT_COMPLEX_OUTPUT)
# apply dft_mask to dft_img
dft_product = cv2.mulSpectrums(dft_img, dft_mask_norm, 0)
# do idft saving as complex output, then clip and convert to uint8
img_filtered = cv2.idft(dft_product, flags=cv2.DFT_SCALE+cv2.DFT_REAL_OUTPUT)
img_filtered = img_filtered.clip(0,255).astype(np.uint8)
cv2.imshow("ORIGINAL", img)
cv2.imshow("MASK", mask)
cv2.imshow("FILTERED DFT/IFT ROUND TRIP", img_filtered)
cv2.waitKey(0)
cv2.destroyAllWindows()
# write result to disk
cv2.imwrite("lena_512_gray_mask.png", mask)
cv2.imwrite("lena_dft_opencv_defocus_rad32.jpg", img_filtered)

Pixels intensity values between two lines

I have created an alghoritm that detects the edges of an extruded colagen casing and draws a centerline between these edges on an image. Casing with a centerline.
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
img = cv2.imread("C:/Users/5.jpg", cv2.IMREAD_GRAYSCALE)
img = cv2.resize(img, (1500, 1200))
#ROI
fromCenter = False
r = cv2.selectROI(img, fromCenter)
imCrop = img[int(r[1]):int(r[1]+r[3]), int(r[0]):int(r[0]+r[2])]
#Operations on an image
_,thresh = cv2.threshold(imCrop,100,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
kernel = np.ones((5,5),np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)
blur = cv2.GaussianBlur(opening,(7,7),0)
edges = cv2.Canny(blur, 0,20)
#Edges localization, packing coords into a list
indices = np.where(edges != [0])
coordinates = list(zip(indices[1], indices[0]))
num = len(coordinates)
#Separating into top and bot edge
bot_cor = coordinates[:int(num/2)]
top_cor = coordinates[-int(num/2):]
#Converting to arrays, sorting
a, b = np.array(top_cor), np.array(bot_cor)
a, b = a[a[:,0].argsort()], b[b[:,0].argsort()]
#Edges approximation by a 5th degree polynomial
min_a_x, max_a_x = np.min(a[:,0]), np.max(a[:,0])
new_a_x = np.linspace(min_a_x, max_a_x, imCrop.shape[1])
a_coefs = np.polyfit(a[:,0],a[:,1], 5)
new_a_y = np.polyval(a_coefs, new_a_x)
min_b_x, max_b_x = np.min(b[:,0]), np.max(b[:,0])
new_b_x = np.linspace(min_b_x, max_b_x, imCrop.shape[1])
b_coefs = np.polyfit(b[:,0],b[:,1], 5)
new_b_y = np.polyval(b_coefs, new_b_x)
#Defining a centerline
midx = [np.average([new_a_x[i], new_b_x[i]], axis = 0) for i in range(imCrop.shape[1])]
midy = [np.average([new_a_y[i], new_b_y[i]], axis = 0) for i in range(imCrop.shape[1])]
plt.figure(figsize=(16,8))
plt.title('Cross section')
plt.xlabel('Length of the casing', fontsize=18)
plt.ylabel('Width of the casing', fontsize=18)
plt.plot(new_a_x, new_a_y,c='black')
plt.plot(new_b_x, new_b_y,c='black')
plt.plot(midx, midy, '-', c='blue')
plt.show()
#Converting coords type to a list (plotting purposes)
coords = list(zip(midx, midy))
points = list(np.int_(coords))
mask = np.zeros((imCrop.shape[:2]), np.uint8)
mask = edges
#Plotting
for point in points:
cv2.circle(mask, tuple(point), 1, (255,255,255), -1)
for point in points:
cv2.circle(imCrop, tuple(point), 1, (255,255,255), -1)
cv2.imshow('imCrop', imCrop)
cv2.imshow('mask', mask)
cv2.waitKey(0)
cv2.destroyAllWindows()
Now I would like to sum up the intensities of each pixel in a region between top edge and a centerline (same thing for a region between centerline and a bottom edge).
Is there any way to limit the ROI to the region between the detected edges and split it into two regions based on the calculated centerline?
Or is there any way to access the pixels which are contained between the edge and a centerline based on theirs coordinates?
(It's my very first post here, sorry in advance for all the mistakes)
I wrote a somewhat naïve code to get masks for the upper and lower part. My code considers that the source image will be always like yours: with horizontal stripes.
After applying Canny I get this:
Then I run some loops through image array to fill unwanted areas of your image. This is done separately for upper and lower part, creating masks. The results are:
Then you can use this masks to sum only the elements you're interested in, using cv.sumElems.
import cv2 as cv
#open as grayscale image
src = cv.imread("colagen.png",cv.IMREAD_GRAYSCALE)
# apply canny and find contours
threshold = 100
canny_output = cv.Canny(src, threshold, threshold * 2)
# find mask for upper part
mask1 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
area = 0
for i in range(x):
if area == 0:
if mask1[i][j] > 0:
area = 1
continue
else:
mask1[i][j] = 255
elif area == 1:
if mask1[i][j] > 0:
area = 2
else:
continue
else:
mask1[i][j] = 255
mask1 = cv.bitwise_not(mask1)
# find mask for lower part
mask2 = canny_output.copy()
x, y = canny_output.shape
area = 0
for j in range(y):
area = 0
for i in range(x):
if area == 0:
if mask2[-i][j] > 0:
area = 1
continue
else:
mask2[-i][j] = 255
elif area == 1:
if mask2[-i][j] > 0:
area = 2
else:
continue
else:
mask2[-i][j] = 255
mask2 = cv.bitwise_not(mask2)
# apply masks and calculate sum of elements in upper and lower part
sums = [0,0]
(sums[0],_,_,_) = cv.sumElems(cv.bitwise_and(src,mask1))
(sums[1],_,_,_) = cv.sumElems(cv.bitwise_and(src,mask2))
cv.imshow('src',src)
cv.imshow('canny',canny_output)
cv.imshow('mask1',mask1)
cv.imshow('mask2',mask2)
cv.imshow('masked1',cv.bitwise_and(src,mask1))
cv.imshow('masked2',cv.bitwise_and(src,mask2))
cv.waitKey()
Alternatives...
Probably there exist some function that fill the areas of the Canny result. I tried cv.fillPoly and cv.floodFill, but didn't manage to make them work easily... But maybe someone else can help you with that...
Edit
Found another way to get the masks with a cleaner code. Using numpy np.add.accumulate then np.clip, and then a modulo operation:
# first divide canny_output by 255 to get 0's and 1's, then perform
# an accumulate addition for each column. Thus you'll get +1 for every
# line, "painting" areas with 1, 2, 3...
a = np.add.accumulate(canny_output/255,0)
# clip values: anything greater than 2 becomes 2
a = np.clip(a, 0, 2)
# performe a modulo, to get areas alternating with 0 or 1; then multiply by 255
a = a%2 * 255
# convert to uint8
mask1 = cv.convertScaleAbs(a)
# to get mask2 (the lower mask) flip the array then do the same as above
a = np.add.accumulate(np.flip(canny_output,0)/255,0)
a = np.clip(a, 0, 2)
a = a%2 * 255
mask2 = cv.convertScaleAbs(np.flip(a,0))
This returns almost the same result. The border of the mask is a little bit different...

Best approach to process a grid like image [duplicate]

I was doing a fun project: Solving a Sudoku from an input image using OpenCV (as in Google goggles etc). And I have completed the task, but at the end I found a little problem for which I came here.
I did the programming using Python API of OpenCV 2.3.1.
Below is what I did :
Read the image
Find the contours
Select the one with maximum area, ( and also somewhat equivalent to square).
Find the corner points.
e.g. given below:
(Notice here that the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check next image)
warp the image to a perfect square
eg image:
Perform OCR ( for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python )
And the method worked well.
Problem:
Check out this image.
Performing the step 4 on this image gives the result below:
The red line drawn is the original contour which is the true outline of sudoku boundary.
The green line drawn is approximated contour which will be the outline of warped image.
Which of course, there is difference between green line and red line at the top edge of sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My Question :
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line OR how can I remove the difference between red line and green line? Is there any method for this in OpenCV?
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel with the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component that's got the largest convex area:
components =
ComponentMeasurements[
ColorNegate#Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All,
2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use caliper length to select only the grid lines-connected components. Sorting them by position, I get 2x10 mask images for each of the vertical/horizontal grid lines in the image:
verticalGridLineMasks =
SortBy[ComponentMeasurements[
lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
SortBy[ComponentMeasurements[
lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] :=
ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
Table[centerOfGravity[
ImageData[Dilation[Image[h], DiskMatrix[2]]]*
ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h,
horizontalGridLineMasks}, {v, verticalGridLineMasks}];
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
ImageTransformation[
srcAdjusted, {fnX ## Reverse[#], fnY ## Reverse[#]} &, {9*50, 9*50},
PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing function, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it. Probably using the perspective transformation you use now on each individual cell will give good enough results.
Nikie's answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie's mathematica code. And also, I couldn't find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)
1. Image PreProcessing ( closing operation )
import cv2
import numpy as np
img = cv2.imread('dave.jpg')
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
Result :
2. Finding Sudoku Square and Creating Mask Image
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
Result :
3. Finding Vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
Result :
4. Finding Horizontal Lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
Result :
Of course, this one is not so good.
5. Finding Grid Points
res = cv2.bitwise_and(closex,closey)
Result :
6. Correcting the defects
Here, nikie does some kind of interpolation, about which I don't have much knowledge. And i couldn't find any corresponding function for this OpenCV. (may be it is there, i don't know).
Check out this SOF which explains how to do this using SciPy, which I don't want to use : Image transformation in OpenCV
So, here I took 4 corners of each sub-square and applied warp Perspective to each.
For that, first we find the centroids.
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
But resulting centroids won't be sorted. Check out below image to see their order:
So we sort them from left to right, top to bottom.
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))
Now see below their order :
Finally we apply the transformation and create a new image of size 450x450.
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = i/10
ci = i%10
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
Result :
The result is almost same as nikie's, but code length is large. May be, better methods are available out there, but until then, this works OK.
Regards
ARK.
You could try to use some kind of grid based modeling of you arbitrary warping. And since the sudoku already is a grid, that shouldn't be too hard.
So you could try to detect the boundaries of each 3x3 subregion and then warp each region individually. If the detection succeeds it would give you a better approximation.
I thought this was a great post, and a great solution by ARK; very well laid out and explained.
I was working on a similar problem, and built the entire thing. There were some changes (i.e. xrange to range, arguments in cv2.findContours), but this should work out of the box (Python 3.5, Anaconda).
This is a compilation of the elements above, with some of the missing code added (i.e., labeling of points).
'''
https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square
'''
import cv2
import numpy as np
img = cv2.imread('test.png')
winname="raw image"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,100)
img = cv2.GaussianBlur(img,(5,5),0)
winname="blurred"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,150)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
winname="gray"
cv2.namedWindow(winname)
cv2.imshow(winname, gray)
cv2.moveWindow(winname, 100,200)
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
winname="res2"
cv2.namedWindow(winname)
cv2.imshow(winname, res2)
cv2.moveWindow(winname, 100,250)
#find elements
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
img_c, contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
winname="puzzle only"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,300)
# vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
img_d, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
winname="vertical lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_d)
cv2.moveWindow(winname, 100,350)
# find horizontal lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
img_e, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
winname="horizontal lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_e)
cv2.moveWindow(winname, 100,400)
# intersection of these two gives dots
res = cv2.bitwise_and(closex,closey)
winname="intersections"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,450)
# text blue
textcolor=(0,255,0)
# points green
pointcolor=(255,0,0)
# find centroids and sort
img_f, contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
# sorting
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in range(10)])
bm = b.reshape((10,10,2))
# make copy
labeled_in_order=res2.copy()
for index, pt in enumerate(b):
cv2.putText(labeled_in_order,str(index),tuple(pt),cv2.FONT_HERSHEY_DUPLEX, 0.75, textcolor)
cv2.circle(labeled_in_order, tuple(pt), 5, pointcolor)
winname="labeled in order"
cv2.namedWindow(winname)
cv2.imshow(winname, labeled_in_order)
cv2.moveWindow(winname, 100,500)
# create final
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = int(i/10) # row index
ci = i%10 # column index
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
winname="final"
cv2.namedWindow(winname)
cv2.imshow(winname, output)
cv2.moveWindow(winname, 600,100)
cv2.waitKey(0)
cv2.destroyAllWindows()
I want to add that above method works only when sudoku board stands straight, otherwise height/width (or vice versa) ratio test will most probably fail and you will not be able to detect edges of sudoku. (I also want to add that if lines that are not perpendicular to the image borders, sobel operations (dx and dy) will still work as lines will still have edges with respect to both axes.)
To be able to detect straight lines you should work on contour or pixel-wise analysis such as contourArea/boundingRectArea, top left and bottom right points...
Edit: I managed to check whether a set of contours form a line or not by applying linear regression and checking the error. However linear regression performed poorly when slope of the line is too big (i.e. >1000) or it is very close to 0. Therefore applying the ratio test above (in most upvoted answer) before linear regression is logical and did work for me.
To remove undected corners I applied gamma correction with a gamma value of 0.8.
The red circle is drawn to show the missing corner.
The code is:
gamma = 0.8
invGamma = 1/gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
cv2.LUT(img, table, img)
This is in addition to Abid Rahman's answer if some corner points are missing.

Detect the green lines in this image and calculate their lengths

Sample Images
The image can be more noisy at times where more objects intervene from the background. Right now I am using various techniques using the RGB colour space to detect the lines but it fails when there is change in the colour due to intervening obstacles from the background. I am using opencv and python.
I have read that HSV is better for colour detection and used but haven't been successful yet.
I am not able to find a generic solution to this problem. Any hints or clues in this direction would be of great help.
STILL IN PROGRESS
First of all, an RGB image consists of 3 grayscale images. Since you need the green color you will deal only with one channel. The green one. To do so, you can split the image, you can use b,g,r = cv2.split('Your Image'). You will get an output like that if you are showing the green channel:
After that you should threshold the image using your desired way. I prefer Otsu's thresholding in this case. The output after thresholding is:
It's obvious that the thresholded image is extremley noisy. So performing erosion will reduce the noise a little bit. The noise reduced image will be similar to the following:
I tried using closing instead of dilation, but closing preserves some unwanted noise. So I separately performed erosion followed by dilation. After dilation the output is:
Note that: You can do your own way in morphological operation. You can use opening instead of what I did. The results are subjective from
one person to another.
Now you can try one these two methods:
1. Blob Detection.
2. HoughLine Transform.
TODO
Try out these two methods and choose the best.
You should use the fact that you know you are trying to detect a line by using the line hough transform.
http://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_lines/hough_lines.html
When the obstacle also look like a line use the fact that you know approximately what is the orientation of the green lines.
If you don't know the orientation of the line use hte fact that there are several green lines with the same orientation and only one line that is the obstacle
Here is a code for what i meant:
import cv2
import numpy as np
# Params
minLineCount = 300 # min number of point alogn line with the a specif orientation
minArea = 100
# Read img
img = cv2.imread('i.png')
greenChannel = img[:,:,1]
# Do noise reduction
iFilter = cv2.bilateralFilter(greenChannel,5,5,5)
# Threshold data
#ret,iThresh = cv2.threshold(iFilter,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
iThresh = (greenChannel > 4).astype(np.uint8)*255
# Remove small areas
se1 = cv2.getStructuringElement(cv2.MORPH_RECT, (5,5))
iThreshRemove = cv2.morphologyEx(iThresh, cv2.MORPH_OPEN, se1)
# Find edges
iEdge = cv2.Canny(iThreshRemove,50,100)
# Hough line transform
lines = cv2.HoughLines(iEdge, 1, 3.14/180,75)
# Find the theta with the most lines
thetaCounter = dict()
for line in lines:
theta = line[0, 1]
if theta in thetaCounter:
thetaCounter[theta] += 1
else:
thetaCounter[theta] = 1
maxThetaCount = 0
maxTheta = 0
for theta in thetaCounter:
if thetaCounter[theta] > maxThetaCount:
maxThetaCount = thetaCounter[theta]
maxTheta = theta
# Find the rhos that corresponds to max theta
rhoValues = []
for line in lines:
rho = line[0, 0]
theta = line[0, 1]
if theta == maxTheta:
rhoValues.append(rho)
# Go over all the lines with the specific orientation and count the number of pixels on that line
# if the number is bigger than minLineCount draw the pixels in finaImage
lineImage = np.zeros_like(iThresh, np.uint8)
for rho in range(min(rhoValues), max(rhoValues), 1):
a = np.cos(maxTheta)
b = np.sin(maxTheta)
x0 = round(a*rho)
y0 = round(b*rho)
lineCount = 0
pixelList = []
for jump in range(-1000, 1000, 1):
x1 = int(x0 + jump * (-b))
y1 = int(y0 + jump * (a))
if x1 < 0 or y1 < 0 or x1 >= lineImage.shape[1] or y1 >= lineImage.shape[0]:
continue
if iThreshRemove[y1, x1] == int(255):
pixelList.append((y1, x1))
lineCount += 1
if lineCount > minLineCount:
for y,x in pixelList:
lineImage[y, x] = int(255)
# Remove small areas
## Opencv 2.4
im2, contours, hierarchy = cv2.findContours(lineImage,cv2.RETR_CCOMP,cv2.CHAIN_APPROX_NONE )
finalImage = np.zeros_like(lineImage)
finalShapes = []
for contour in contours:
if contour.size > minArea:
finalShapes.append(contour)
cv2.fillPoly(finalImage, finalShapes, 255)
## Opencv 3.0
# output = cv2.connectedComponentsWithStats(lineImage, 8, cv2.CV_32S)
#
# finalImage = np.zeros_like(output[1])
# finalImage = output[1]
# stat = output[2]
# for label in range(output[0]):
# if label == 0:
# continue
# cc = stat[label,:]
# if cc[cv2.CC_STAT_AREA] < minArea:
# finalImage[finalImage == label] = 0
# else:
# finalImage[finalImage == label] = 255
# Show image
#cv2.imwrite('finalImage2.jpg',finalImage)
cv2.imshow('a', finalImage.astype(np.uint8))
cv2.waitKey(0)
and the result for the images:

How to remove convexity defects in a Sudoku square?

I was doing a fun project: Solving a Sudoku from an input image using OpenCV (as in Google goggles etc). And I have completed the task, but at the end I found a little problem for which I came here.
I did the programming using Python API of OpenCV 2.3.1.
Below is what I did :
Read the image
Find the contours
Select the one with maximum area, ( and also somewhat equivalent to square).
Find the corner points.
e.g. given below:
(Notice here that the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check next image)
warp the image to a perfect square
eg image:
Perform OCR ( for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python )
And the method worked well.
Problem:
Check out this image.
Performing the step 4 on this image gives the result below:
The red line drawn is the original contour which is the true outline of sudoku boundary.
The green line drawn is approximated contour which will be the outline of warped image.
Which of course, there is difference between green line and red line at the top edge of sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My Question :
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line OR how can I remove the difference between red line and green line? Is there any method for this in OpenCV?
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel with the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component that's got the largest convex area:
components =
ComponentMeasurements[
ColorNegate#Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All,
2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use caliper length to select only the grid lines-connected components. Sorting them by position, I get 2x10 mask images for each of the vertical/horizontal grid lines in the image:
verticalGridLineMasks =
SortBy[ComponentMeasurements[
lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
SortBy[ComponentMeasurements[
lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] :=
ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
Table[centerOfGravity[
ImageData[Dilation[Image[h], DiskMatrix[2]]]*
ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h,
horizontalGridLineMasks}, {v, verticalGridLineMasks}];
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
ImageTransformation[
srcAdjusted, {fnX ## Reverse[#], fnY ## Reverse[#]} &, {9*50, 9*50},
PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing function, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it. Probably using the perspective transformation you use now on each individual cell will give good enough results.
Nikie's answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie's mathematica code. And also, I couldn't find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)
1. Image PreProcessing ( closing operation )
import cv2
import numpy as np
img = cv2.imread('dave.jpg')
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
Result :
2. Finding Sudoku Square and Creating Mask Image
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
Result :
3. Finding Vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
Result :
4. Finding Horizontal Lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
Result :
Of course, this one is not so good.
5. Finding Grid Points
res = cv2.bitwise_and(closex,closey)
Result :
6. Correcting the defects
Here, nikie does some kind of interpolation, about which I don't have much knowledge. And i couldn't find any corresponding function for this OpenCV. (may be it is there, i don't know).
Check out this SOF which explains how to do this using SciPy, which I don't want to use : Image transformation in OpenCV
So, here I took 4 corners of each sub-square and applied warp Perspective to each.
For that, first we find the centroids.
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
But resulting centroids won't be sorted. Check out below image to see their order:
So we sort them from left to right, top to bottom.
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))
Now see below their order :
Finally we apply the transformation and create a new image of size 450x450.
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = i/10
ci = i%10
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
Result :
The result is almost same as nikie's, but code length is large. May be, better methods are available out there, but until then, this works OK.
Regards
ARK.
You could try to use some kind of grid based modeling of you arbitrary warping. And since the sudoku already is a grid, that shouldn't be too hard.
So you could try to detect the boundaries of each 3x3 subregion and then warp each region individually. If the detection succeeds it would give you a better approximation.
I thought this was a great post, and a great solution by ARK; very well laid out and explained.
I was working on a similar problem, and built the entire thing. There were some changes (i.e. xrange to range, arguments in cv2.findContours), but this should work out of the box (Python 3.5, Anaconda).
This is a compilation of the elements above, with some of the missing code added (i.e., labeling of points).
'''
https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square
'''
import cv2
import numpy as np
img = cv2.imread('test.png')
winname="raw image"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,100)
img = cv2.GaussianBlur(img,(5,5),0)
winname="blurred"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,150)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
winname="gray"
cv2.namedWindow(winname)
cv2.imshow(winname, gray)
cv2.moveWindow(winname, 100,200)
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
winname="res2"
cv2.namedWindow(winname)
cv2.imshow(winname, res2)
cv2.moveWindow(winname, 100,250)
#find elements
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
img_c, contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
winname="puzzle only"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,300)
# vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
img_d, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
winname="vertical lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_d)
cv2.moveWindow(winname, 100,350)
# find horizontal lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
img_e, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
winname="horizontal lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_e)
cv2.moveWindow(winname, 100,400)
# intersection of these two gives dots
res = cv2.bitwise_and(closex,closey)
winname="intersections"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,450)
# text blue
textcolor=(0,255,0)
# points green
pointcolor=(255,0,0)
# find centroids and sort
img_f, contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
# sorting
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in range(10)])
bm = b.reshape((10,10,2))
# make copy
labeled_in_order=res2.copy()
for index, pt in enumerate(b):
cv2.putText(labeled_in_order,str(index),tuple(pt),cv2.FONT_HERSHEY_DUPLEX, 0.75, textcolor)
cv2.circle(labeled_in_order, tuple(pt), 5, pointcolor)
winname="labeled in order"
cv2.namedWindow(winname)
cv2.imshow(winname, labeled_in_order)
cv2.moveWindow(winname, 100,500)
# create final
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = int(i/10) # row index
ci = i%10 # column index
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
winname="final"
cv2.namedWindow(winname)
cv2.imshow(winname, output)
cv2.moveWindow(winname, 600,100)
cv2.waitKey(0)
cv2.destroyAllWindows()
I want to add that above method works only when sudoku board stands straight, otherwise height/width (or vice versa) ratio test will most probably fail and you will not be able to detect edges of sudoku. (I also want to add that if lines that are not perpendicular to the image borders, sobel operations (dx and dy) will still work as lines will still have edges with respect to both axes.)
To be able to detect straight lines you should work on contour or pixel-wise analysis such as contourArea/boundingRectArea, top left and bottom right points...
Edit: I managed to check whether a set of contours form a line or not by applying linear regression and checking the error. However linear regression performed poorly when slope of the line is too big (i.e. >1000) or it is very close to 0. Therefore applying the ratio test above (in most upvoted answer) before linear regression is logical and did work for me.
To remove undected corners I applied gamma correction with a gamma value of 0.8.
The red circle is drawn to show the missing corner.
The code is:
gamma = 0.8
invGamma = 1/gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
cv2.LUT(img, table, img)
This is in addition to Abid Rahman's answer if some corner points are missing.

Categories

Resources