How to find max value between boundary in segmentation? - python

here's an issue: I want to find actual maximum width of boundary in segmented image with irregular shape (this)
below I post some example image I use for testing
So far I managed to obtain boundaries and skeleton line, but how do I measure distance between contours perpendicural to the skeleton line?
def get_skeleton(image_path):
im = cv2.imread(img_path , cv2.IMREAD_GRAYSCALE)
binary = im > filters.threshold_otsu(im)
skeleton = morphology.skeletonize(binary)
return skeleton
skeleton = get_skeleton(img_path)
plt.imshow(skeleton, cmap="gray")
def get_boundary(image_path):
reading_Img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
reading_Img = cv2.cvtColor(reading_Img,cv2.COLOR_BGR2RGB)
canny_Img = cv2.Canny(reading_Img,90,100)
contours,_ = cv2.findContours(canny_Img,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
canvas = np.zeros_like(reading_Img)
boundary = cv2.drawContours(canvas , contours, -1, (255, 0, 0), 1)
return boundary
boundary = get_boundary(img_path)
plt.imshow(boundary)
Sample input image
EDIT:
First of all thanks for your answer, I would like to add more detail on what I am trying to do.
So I made a segmentation model which detects cracks in concrete (they can be any shape, vertical, horizontal, diagonal, etc) and now I need to identify their max-width and draw a line that shows where it occurs.
I found that the medial axis returns the distance from the boundary and by filtering max value I was able to get the width (see colab below) and its coordinate on the medial axis. Now I need to draw a line connecting the width between boundaries, but I have no idea on how to find the coordinates of such a line.
I thought of an algorithm which starts at the point of max distance occurrence on medial axis and expands until it finds a boundary, but I don't know how to implement it.
This image shows what I need to have:
After I find x and y of points I will be able to calculate euclidean distance between 2 points
dist=sqrt((y2-y1)^2+(x2-x1)^2)
Please look at my code in colab notebook: https://colab.research.google.com/drive/1NvEyfrxpKGJ1kxjP48PGNB_UUSp6f6Ze?usp=sharing
sample input images:
https://imgur.com/ewTmH8M
https://imgur.com/JRAQCke
https://imgur.com/7QQFfAv

Starting with your approach using the medial axis function you can
interpolate the direction of the axis at the point that was found
derive the orthogonal from the direction
look where the orthogonal reaches the boundary.
The example below shows the principle and works with your example images. But I'm sure there will be some boundary conditions that are not yet considered. I leave it to you to get it robust against real live data.
import cv2
import numpy as np
from skimage.morphology import medial_axis
from skimage import img_as_ubyte
delta = 3 # delta index for interpolation
# get crack
im = cv2.imread("img.png", cv2.IMREAD_GRAYSCALE)
rgb = cv2.cvtColor(im, cv2.COLOR_GRAY2RGB) # rgb just for demo purpose
_, crack = cv2.threshold(im, 127, 255, cv2.THRESH_BINARY)
# get medial axis
medial, distance = medial_axis(im, return_distance=True)
med_img = img_as_ubyte(medial)
med_contours, _ = cv2.findContours(med_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
cv2.drawContours(rgb, med_contours, -1, (255, 0, 0), 1)
med_pts = [v[0] for v in med_contours[0]]
# get point with maximal distance from medial axis
max_idx = np.argmax(distance)
max_pos = np.unravel_index(max_idx, distance.shape)
max_dist = distance[max_pos]
coords = np.array([max_pos[1], max_pos[0]])
print(f"max distance from medial axis to boundary = {max_dist} at {coords}")
# interpolate orthogonal of medial axis at coords
idx = next(i for i, v in enumerate(med_pts) if (v == coords).all())
px1, py1 = med_pts[(idx-delta) % len(med_pts)]
px2, py2 = med_pts[(idx+delta) % len(med_pts)]
orth = np.array([py1 - py2, px2 - px1]) * max(im.shape)
# intersect orthogonal with crack and get contour
orth_img = np.zeros(crack.shape, dtype=np.uint8)
cv2.line(orth_img, coords + orth, coords - orth, color=255, thickness=1)
gap_img = cv2.bitwise_and(orth_img, crack)
gap_contours, _ = cv2.findContours(gap_img, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
gap_pts = [v[0] for v in gap_contours[0]]
# determine the end points of the gap contour by negative dot product
n = len(gap_pts)
gap_ends = [
p for i, p in enumerate(gap_pts)
if np.dot(p - gap_pts[(i-1) % n], gap_pts[(i+1) % n] - p) < 0
]
print(f"Maximum gap found from {gap_ends[0]} to {gap_ends[1]}")
cv2.line(rgb, gap_ends[0], gap_ends[1], color=(0, 0, 255), thickness=1)
cv2.imwrite("test_out.png", rgb)

First thing I did was keep your images greyscale, there is no need to covert to 3 channels to find contours. Second was to convert the boundary image to a binary so that it is the same as the skeleton image. Then I simply added the two to get the both image.
I then clocked through each row (as you are looking for perpendicular distances) of the combined both image & looked for elements that where True i.e that are either boundary or skeleton pixels. I made a simplifying assumption at this point - I only searched for cases where there is a boundary followed by a single skeleton pixel then by a second boundary, I appreciate that this may not always be the case but I leave that particular headache for you to sort out.
After that its just a case of keeping track of he max & min distances recorded as you go through the image row by row. (edit: there maybe a cleaner way to do this than the way I've done it but hopefully you get the idea)
import numpy as np
import matplotlib.pyplot as plt
import cv2
from skimage import filters
from skimage import morphology
def get_skeleton(image_path):
im = cv2.imread(image_path , cv2.IMREAD_GRAYSCALE)
binary = im > filters.threshold_otsu(im)
skeleton = morphology.skeletonize(binary)
return skeleton
def get_boundary(image_path):
reading_Img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
canny_Img = cv2.Canny(reading_Img, 90, 100)
contours,_ = cv2.findContours(canny_Img,cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
canvas = np.zeros_like(reading_Img)
boundary = cv2.drawContours(canvas, contours, -1, (255, 0, 0), 1)
binary = boundary > filters.threshold_otsu(boundary)
return binary
skeleton = get_skeleton("LtqlM.png")
boundary = get_boundary("LtqlM.png")
both = skeleton + boundary
max_dist = 0
min_dist = 100000
for idx in range(both.shape[0]): # counting through rows
row = both[idx, :]
lines = np.where(row==True)[0]
if len(lines) == 3:
dist_1 = lines[1] - lines[0]
dist_2 = lines[2] - lines[1]
if (dist_1 > dist_2) and dist_1 > max_dist:
max_dist = dist_1
if (dist_2 > dist_1) and dist_2 > max_dist:
max_dist = dist_2
if (dist_1 < dist_2) and dist_1 < min_dist:
min_dist = dist_1
if (dist_2 < dist_1) and dist_2 < min_dist:
min_dist = dist_2
print("Maximum distance = ", max_dist)
print("Minimum distance = ", min_dist)
plt.imshow(both)

From the pixel with the largest distance, you can explore concentric square layers, until you find a background pixel. Then find the background pixel with the shortest Euclidean distance on the last layer. The second endpoint is symmetrical.

Related

Remove and measure a line openCV

Links to all images at the bottom
I have drawn a line over an arrow which captures the angle of that arrow. I would like to then remove the arrow, keep only the line, and use cv2.minAreaRect to determine the angle. So far I've got everything to work except removing the original arrow, which results in an incorrect angle generated by the cv2.minAreaRect bounding box.
Really, I just want the bold black line running through the arrow to use to measure the angle, not the arrow itself. if anyone has an idea to make this work, or a simpler way, please let me know. Thanks
Code:
import numpy as np
import cv2
image = cv2.imread("templates/a_15.png")
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
ret, thresh = cv2.threshold(image, 127, 255, 0)
contours, hierarchy = cv2.findContours(thresh, 1, 2)
cont = contours[0]
rows,cols = image.shape[:2]
[vx,vy,x,y] = cv2.fitLine(cont, cv2.DIST_L2,0,0.01,0.01)
leftish = int((-x*vy/vx) + y)
rightish = int(((cols-x)*vy/vx)+y)
line = cv2.line(image,(cols-1,rightish),(0,leftish),(0,255,0),10)
# thresholding
thresh = cv2.threshold(line, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
# compute rotated bounding box based on all pixel values > 0 and
# use coordinates to compute a rotated bounding box of those coordinates
coordinates = np.column_stack(np.where(thresh > 0))
w = coordinates[0]
h = coordinates[1]
# Compute minimum rotated angle that contains entire image.
# Return angle values in the range [-90, 0).
# As the rectangle rotates clockwise, angle values increase towards 0.
# Once 0 is reached, angle is set back to -90 degrees.
angle = cv2.minAreaRect(coordinates)[-1]
# for angles less than -45 degrees, add 90 degrees to angle to take the inverse.
if angle < - 45:
angle = -(90 + angle)
else:
angle = -angle
# rotate image
(h, w) = image.shape[:2]
center = (w // 2, h // 2) # image center
RM = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(image, RM, (w, h),
flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)
# correction angle for validation
cv2.putText(rotated, "Angle {:.2f} degrees".format(angle),
(10, 30), cv2.FONT_HERSHEY_DUPLEX, 0.9, (0, 255, 0), 2)
# output
print("[INFO] angle: {:.3f}".format(angle))
cv2.imshow("Line", line)
cv2.imshow("Input", image)
cv2.imshow("Rotated", rotated)
cv2.waitKey(0)
Images
original
current results
goal
Here's a possible solution. The main idea is to identify de "tip" and the "tail" of the arrow approximating some key points. After you have identified both ends, you can draw a line joining both points. It is also an advantage to know which of the endpoints is the tip, because that way you can measure the angle from a constant point.
There's more than one way to achieve this. I choose something that I have applied in the past: I will use this approach to identify the endpoints of the overall shape. My assumption is that the tip will yield more points than the tail. After that, I'll cluster all the endpoints in two groups: tip and tail. I can use K-Means for that, as it will return the mean centers for both clusters. After that, we have our tip and tail points that can be joined easily with a line. These are the steps:
Convert the image to grayscale
Get the skeleton of the image, to normalize the shape to a width of 1 pixel
Apply the method described in the link to get the arrow's endpoints
Divide the endpoints in two clusters and use K-Means to get their centers
Join both endpoints with a line
Let's see the code:
# imports:
import cv2
import numpy as np
# image path
path = "D://opencvImages//"
fileName = "CoXeb.png"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Grayscale conversion:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
grayscaleImage = 255 - grayscaleImage
# Extend the borders for the skeleton:
extendedImg = cv2.copyMakeBorder(grayscaleImage, 5, 5, 5, 5, cv2.BORDER_CONSTANT)
# Store a deep copy of the crop for results:
grayscaleImageCopy = cv2.cvtColor(extendedImg, cv2.COLOR_GRAY2BGR)
# Compute the skeleton:
skeleton = cv2.ximgproc.thinning(extendedImg, None, 1)
The first step is to get the skeleton of the arrow. As I said, this step is needed prior to the convolution-based method that identifies the endpoints of a shape. Computing the skeleton normalizes the shape to a one pixel width. However, sometimes, if the shape is too close to the "canvas" borders, the skeleton could show some artifacts. This is avoided with a border extension. The skeleton of the arrow is this:
Check that image out. If we identify the endpoints, the tip will exhibit at least 3 points, while the tail at least 1. That's handy - the tip will always have more points than the tail. If only we could detect those points... Luckily, we can:
# Threshold the image so that white pixels get a value of 0 and
# black pixels a value of 10:
_, binaryImage = cv2.threshold(skeleton, 128, 10, cv2.THRESH_BINARY)
# Set the end-points kernel:
h = np.array([[1, 1, 1],
[1, 10, 1],
[1, 1, 1]])
# Convolve the image with the kernel:
imgFiltered = cv2.filter2D(binaryImage, -1, h)
# Extract only the end-points pixels, those with
# an intensity value of 110:
binaryImage = np.where(imgFiltered == 110, 255, 0)
# The above operation converted the image to 32-bit float,
# convert back to 8-bit uint
binaryImage = binaryImage.astype(np.uint8)
This endpoint detecting method convolves the skeleton with a special kernel that identifies endpoints. It returns a binary image where all the endpoints have the value 110. After thresholding this mid-result, we get this image, which represents the arrow endpoints:
Nice, as you see, we can group the points in two clusters and get their cluster centers. Sounds like a job for K-Means, because that's exactly what it does. We first need to treat our data, though, because K-Means operates on defined-shaped arrays of float data:
# Find the X, Y location of all the end-points
# pixels:
Y, X = binaryImage.nonzero()
# Reshape the arrays for K-means
Y = Y.reshape(-1,1)
X = X.reshape(-1,1)
Z = np.hstack((X, Y))
# K-means operates on 32-bit float data:
floatPoints = np.float32(Z)
# Set the convergence criteria and call K-means:
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 10, 1.0)
ret, label, center = cv2.kmeans(floatPoints, 2, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
# Set the cluster count, find the points belonging
# to cluster 0 and cluster 1:
cluster1Count = np.count_nonzero(label)
cluster0Count = np.shape(label)[0] - cluster1Count
print("Elements of Cluster 0: "+str(cluster0Count))
print("Elements of Cluster 1: " + str(cluster1Count))
The last two lines prints the endpoints that are assigned to Cluster 0 Cluster 1, respectively. That outputs this:
Elements of Cluster 0: 3
Elements of Cluster 1: 2
Just as expected - well, kinda. Seems that Cluster 0 is the tip and cluster 2 the tail! But the tail actually got 2 points. If you look the image of the skeleton closely, you'll see there's a small bifurcation at the tail. That's why we, in reality, got two points instead of just one. Alright, let's get the center points and draw them on the original input:
# Look for the cluster of max number of points
# That cluster will be the tip of the arrow:
maxCluster = 0
if cluster1Count > cluster0Count:
maxCluster = 1
# Check out the centers of each cluster:
matRows, matCols = center.shape
# Store the ordered end-points here:
orderedPoints = [None] * 2
# Let's identify and draw the two end-points
# of the arrow:
for b in range(matRows):
# Get cluster center:
pointX = int(center[b][0])
pointY = int(center[b][1])
# Get the "tip"
if b == maxCluster:
color = (0, 0, 255)
orderedPoints[0] = (pointX, pointY)
# Get the "tail"
else:
color = (255, 0, 0)
orderedPoints[1] = (pointX, pointY)
# Draw it:
cv2.circle(grayscaleImageCopy, (pointX, pointY), 3, color, -1)
cv2.imshow("End-Points", grayscaleImageCopy)
cv2.waitKey(0)
This is the resulting image:
The tip always gets drawn in red while the tail is drawn in blue. Very cool, let's store these points in the orderedPoints list and draw the final line in a new "canvas", with dimension same as the original image:
# Store the tip and tail points:
p0x = orderedPoints[1][0]
p0y = orderedPoints[1][1]
p1x = orderedPoints[0][0]
p1y = orderedPoints[0][1]
# Create a new "canvas" (image) using the input dimensions:
imageHeight, imageWidth = binaryImage.shape[:2]
newImage = np.zeros((imageHeight, imageWidth), np.uint8)
newImage = 255 - newImage
# Draw a line using the detected points:
(x1, y1) = orderedPoints[0]
(x2, y2) = orderedPoints[1]
lineColor = (0, 0, 0)
cv2.line(newImage , (x1, y1), (x2, y2), lineColor, thickness=2)
cv2.imshow("Detected Line", newImage)
cv2.waitKey(0)
The line overlaid on the original image and the new image containing only the line:
It sounds like you want to measure the angle of the line but because you are measuring a line you drew in the original image, you must now filter out the original image to get an accurate measure of the line...which you drew with coordinates you know the endpoints of?
I guess:
make a better filter?
draw the line in a blank image and detect angle there?
determine the angle from the known coordinates?
Since you were asking for just a line, I tried that...just made a blank image, drew your detected line on it and then used that downstream...
blankIm = np.ones((height, width, channels), dtype=np.uint8)
blankIm.fill(255)
line = cv2.line(blankIm,(cols-1,rightish),(0,leftish),(0,255,0),10)

Writing robust (size invariant) circle detection (Watershed)

Edit: Quick Summary so far: I use the watershed algorithm but I have probably a problem with threshold. It didn't detect the brighter circles.
New: Fast radial symmetry transform approach which didn't quite work eiter (Edit 6).
I want to detect circles with different sizes. The use case is to detect coins on an image and to extract them solely. -> Get the single coins as single image files.
For this I used the Hough Circle Transform of open-cv:
(https://docs.opencv.org/2.4/doc/tutorials/imgproc/imgtrans/hough_circle/hough_circle.html)
import sys
import cv2 as cv
import numpy as np
def main(argv):
## [load]
default_file = "data/newcommon_1euro.jpg"
filename = argv[0] if len(argv) > 0 else default_file
# Loads an image
src = cv.imread(filename, cv.IMREAD_COLOR)
# Check if image is loaded fine
if src is None:
print ('Error opening image!')
print ('Usage: hough_circle.py [image_name -- default ' + default_file + '] \n')
return -1
## [load]
## [convert_to_gray]
# Convert it to gray
gray = cv.cvtColor(src, cv.COLOR_BGR2GRAY)
## [convert_to_gray]
## [reduce_noise]
# Reduce the noise to avoid false circle detection
gray = cv.medianBlur(gray, 5)
## [reduce_noise]
## [houghcircles]
rows = gray.shape[0]
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 8,
param1=100, param2=30,
minRadius=0, maxRadius=120)
## [houghcircles]
## [draw]
if circles is not None:
circles = np.uint16(np.around(circles))
for i in circles[0, :]:
center = (i[0], i[1])
# circle center
cv.circle(src, center, 1, (0, 100, 100), 3)
# circle outline
radius = i[2]
cv.circle(src, center, radius, (255, 0, 255), 3)
## [draw]
## [display]
cv.imshow("detected circles", src)
cv.waitKey(0)
## [display]
return 0
if __name__ == "__main__":
main(sys.argv[1:])
I tried all parameters (rows, param1, param2, minRadius, and maxRadius) to optimize the results. This worked very well for one specific image but other images with different sized coins didn't work.
Examples:
Parameters
circles = cv.HoughCircles(gray, cv.HOUGH_GRADIENT, 1, rows / 16,
param1=100, param2=30,
minRadius=0, maxRadius=120)
With the same parameters:
Changed to rows/8
I also tried two other approaches of this thread: writing robust (color and size invariant) circle detection with opencv (based on Hough transform or other features)
The approach of fireant leads to this result:
The approach of fraxel didn't work either.
For the first approach: This happens with all different sizes and also the min and max radius.
How could I change the code, so that the coin size is not important or that it finds the parameters itself?
Thank you in advance for any help!
Edit:
I tried the watershed algorithm of Open-cv, as suggested by Alexander Reynolds: https://docs.opencv.org/3.4/d3/db4/tutorial_py_watershed.html
import numpy as np
import cv2 as cv
from matplotlib import pyplot as plt
img = cv.imread('data/P1190263.jpg')
gray = cv.cvtColor(img,cv.COLOR_BGR2GRAY)
ret, thresh = cv.threshold(gray,0,255,cv.THRESH_BINARY_INV+cv.THRESH_OTSU)
# noise removal
kernel = np.ones((3,3),np.uint8)
opening = cv.morphologyEx(thresh,cv.MORPH_OPEN,kernel, iterations = 2)
# sure background area
sure_bg = cv.dilate(opening,kernel,iterations=3)
# Finding sure foreground area
dist_transform = cv.distanceTransform(opening,cv.DIST_L2,5)
ret, sure_fg = cv.threshold(dist_transform,0.7*dist_transform.max(),255,0)
# Finding unknown region
sure_fg = np.uint8(sure_fg)
unknown = cv.subtract(sure_bg,sure_fg)
# Marker labelling
ret, markers = cv.connectedComponents(sure_fg)
# Add one to all labels so that sure background is not 0, but 1
markers = markers+1
# Now, mark the region of unknown with zero
markers[unknown==255] = 0
markers = cv.watershed(img,markers)
img[markers == -1] = [255,0,0]
#Display:
cv.imshow("detected circles", img)
cv.waitKey(0)
It works very well on the test image of the open-cv website:
But it performs very bad on my own images:
I can't really think of a good reason why it's not working on my images?
Edit 2:
As suggested I looked at the intermediate images. The thresh looks not good in my opinion. Next, there is no difference between opening and dist_transform. The corresponding sure_fg shows the detected images.
thresh:
opening:
dist_transform:
sure_bg:
sure_fg:
Edit 3:
I tried all distanceTypes and maskSizes I could find, but the results were quite the same (https://www.tutorialspoint.com/opencv/opencv_distance_transformation.htm)
Edit 4:
Furthermore, I tried to change the (first) threshold function. I used different threshold values instead of the OTSU Function. The best one was with 160, but it was far from good:
In the tutorial it looks like this:
It seems like the coins are somehow too bright to be detected by this algorithm, but I don't know how to improve it?
Edit 5:
Changing the overall contrast and brightness of the image (with cv.convertScaleAbs) didn't improve the results. Increasing the contrast however should increase the "difference" between foreground and background, at least on the normal image. But it even got worse. The corresponding threshold image didn't improved (didn't get more white pixel).
Edit 6: I tried another approach, the fast radial symmetry transform (from here https://github.com/ceilab/frst_python)
import cv2
import numpy as np
def gradx(img):
img = img.astype('int')
rows, cols = img.shape
# Use hstack to add back in the columns that were dropped as zeros
return np.hstack((np.zeros((rows, 1)), (img[:, 2:] - img[:, :-2]) / 2.0, np.zeros((rows, 1))))
def grady(img):
img = img.astype('int')
rows, cols = img.shape
# Use vstack to add back the rows that were dropped as zeros
return np.vstack((np.zeros((1, cols)), (img[2:, :] - img[:-2, :]) / 2.0, np.zeros((1, cols))))
# Performs fast radial symmetry transform
# img: input image, grayscale
# radii: integer value for radius size in pixels (n in the original paper); also used to size gaussian kernel
# alpha: Strictness of symmetry transform (higher=more strict; 2 is good place to start)
# beta: gradient threshold parameter, float in [0,1]
# stdFactor: Standard deviation factor for gaussian kernel
# mode: BRIGHT, DARK, or BOTH
def frst(img, radii, alpha, beta, stdFactor, mode='BOTH'):
mode = mode.upper()
assert mode in ['BRIGHT', 'DARK', 'BOTH']
dark = (mode == 'DARK' or mode == 'BOTH')
bright = (mode == 'BRIGHT' or mode == 'BOTH')
workingDims = tuple((e + 2 * radii) for e in img.shape)
# Set up output and M and O working matrices
output = np.zeros(img.shape, np.uint8)
O_n = np.zeros(workingDims, np.int16)
M_n = np.zeros(workingDims, np.int16)
# Calculate gradients
gx = gradx(img)
gy = grady(img)
# Find gradient vector magnitude
gnorms = np.sqrt(np.add(np.multiply(gx, gx), np.multiply(gy, gy)))
# Use beta to set threshold - speeds up transform significantly
gthresh = np.amax(gnorms) * beta
# Find x/y distance to affected pixels
gpx = np.multiply(np.divide(gx, gnorms, out=np.zeros(gx.shape), where=gnorms != 0),
radii).round().astype(int);
gpy = np.multiply(np.divide(gy, gnorms, out=np.zeros(gy.shape), where=gnorms != 0),
radii).round().astype(int);
# Iterate over all pixels (w/ gradient above threshold)
for coords, gnorm in np.ndenumerate(gnorms):
if gnorm > gthresh:
i, j = coords
# Positively affected pixel
if bright:
ppve = (i + gpx[i, j], j + gpy[i, j])
O_n[ppve] += 1
M_n[ppve] += gnorm
# Negatively affected pixel
if dark:
pnve = (i - gpx[i, j], j - gpy[i, j])
O_n[pnve] -= 1
M_n[pnve] -= gnorm
# Abs and normalize O matrix
O_n = np.abs(O_n)
O_n = O_n / float(np.amax(O_n))
# Normalize M matrix
M_max = float(np.amax(np.abs(M_n)))
M_n = M_n / M_max
# Elementwise multiplication
F_n = np.multiply(np.power(O_n, alpha), M_n)
# Gaussian blur
kSize = int(np.ceil(radii / 2))
kSize = kSize + 1 if kSize % 2 == 0 else kSize
S = cv2.GaussianBlur(F_n, (kSize, kSize), int(radii * stdFactor))
return S
img = cv2.imread('data/P1190263.jpg')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
result = frst(gray, 60, 2, 0, 1, mode='BOTH')
cv2.imshow("detected circles", result)
cv2.waitKey(0)
I only get this nearly black output (it has some very dark grey shadows). I don't know what to change and would be thankful for help!

Best approach to process a grid like image [duplicate]

I was doing a fun project: Solving a Sudoku from an input image using OpenCV (as in Google goggles etc). And I have completed the task, but at the end I found a little problem for which I came here.
I did the programming using Python API of OpenCV 2.3.1.
Below is what I did :
Read the image
Find the contours
Select the one with maximum area, ( and also somewhat equivalent to square).
Find the corner points.
e.g. given below:
(Notice here that the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check next image)
warp the image to a perfect square
eg image:
Perform OCR ( for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python )
And the method worked well.
Problem:
Check out this image.
Performing the step 4 on this image gives the result below:
The red line drawn is the original contour which is the true outline of sudoku boundary.
The green line drawn is approximated contour which will be the outline of warped image.
Which of course, there is difference between green line and red line at the top edge of sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My Question :
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line OR how can I remove the difference between red line and green line? Is there any method for this in OpenCV?
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel with the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component that's got the largest convex area:
components =
ComponentMeasurements[
ColorNegate#Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All,
2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use caliper length to select only the grid lines-connected components. Sorting them by position, I get 2x10 mask images for each of the vertical/horizontal grid lines in the image:
verticalGridLineMasks =
SortBy[ComponentMeasurements[
lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
SortBy[ComponentMeasurements[
lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] :=
ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
Table[centerOfGravity[
ImageData[Dilation[Image[h], DiskMatrix[2]]]*
ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h,
horizontalGridLineMasks}, {v, verticalGridLineMasks}];
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
ImageTransformation[
srcAdjusted, {fnX ## Reverse[#], fnY ## Reverse[#]} &, {9*50, 9*50},
PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing function, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it. Probably using the perspective transformation you use now on each individual cell will give good enough results.
Nikie's answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie's mathematica code. And also, I couldn't find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)
1. Image PreProcessing ( closing operation )
import cv2
import numpy as np
img = cv2.imread('dave.jpg')
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
Result :
2. Finding Sudoku Square and Creating Mask Image
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
Result :
3. Finding Vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
Result :
4. Finding Horizontal Lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
Result :
Of course, this one is not so good.
5. Finding Grid Points
res = cv2.bitwise_and(closex,closey)
Result :
6. Correcting the defects
Here, nikie does some kind of interpolation, about which I don't have much knowledge. And i couldn't find any corresponding function for this OpenCV. (may be it is there, i don't know).
Check out this SOF which explains how to do this using SciPy, which I don't want to use : Image transformation in OpenCV
So, here I took 4 corners of each sub-square and applied warp Perspective to each.
For that, first we find the centroids.
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
But resulting centroids won't be sorted. Check out below image to see their order:
So we sort them from left to right, top to bottom.
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))
Now see below their order :
Finally we apply the transformation and create a new image of size 450x450.
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = i/10
ci = i%10
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
Result :
The result is almost same as nikie's, but code length is large. May be, better methods are available out there, but until then, this works OK.
Regards
ARK.
You could try to use some kind of grid based modeling of you arbitrary warping. And since the sudoku already is a grid, that shouldn't be too hard.
So you could try to detect the boundaries of each 3x3 subregion and then warp each region individually. If the detection succeeds it would give you a better approximation.
I thought this was a great post, and a great solution by ARK; very well laid out and explained.
I was working on a similar problem, and built the entire thing. There were some changes (i.e. xrange to range, arguments in cv2.findContours), but this should work out of the box (Python 3.5, Anaconda).
This is a compilation of the elements above, with some of the missing code added (i.e., labeling of points).
'''
https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square
'''
import cv2
import numpy as np
img = cv2.imread('test.png')
winname="raw image"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,100)
img = cv2.GaussianBlur(img,(5,5),0)
winname="blurred"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,150)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
winname="gray"
cv2.namedWindow(winname)
cv2.imshow(winname, gray)
cv2.moveWindow(winname, 100,200)
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
winname="res2"
cv2.namedWindow(winname)
cv2.imshow(winname, res2)
cv2.moveWindow(winname, 100,250)
#find elements
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
img_c, contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
winname="puzzle only"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,300)
# vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
img_d, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
winname="vertical lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_d)
cv2.moveWindow(winname, 100,350)
# find horizontal lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
img_e, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
winname="horizontal lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_e)
cv2.moveWindow(winname, 100,400)
# intersection of these two gives dots
res = cv2.bitwise_and(closex,closey)
winname="intersections"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,450)
# text blue
textcolor=(0,255,0)
# points green
pointcolor=(255,0,0)
# find centroids and sort
img_f, contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
# sorting
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in range(10)])
bm = b.reshape((10,10,2))
# make copy
labeled_in_order=res2.copy()
for index, pt in enumerate(b):
cv2.putText(labeled_in_order,str(index),tuple(pt),cv2.FONT_HERSHEY_DUPLEX, 0.75, textcolor)
cv2.circle(labeled_in_order, tuple(pt), 5, pointcolor)
winname="labeled in order"
cv2.namedWindow(winname)
cv2.imshow(winname, labeled_in_order)
cv2.moveWindow(winname, 100,500)
# create final
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = int(i/10) # row index
ci = i%10 # column index
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
winname="final"
cv2.namedWindow(winname)
cv2.imshow(winname, output)
cv2.moveWindow(winname, 600,100)
cv2.waitKey(0)
cv2.destroyAllWindows()
I want to add that above method works only when sudoku board stands straight, otherwise height/width (or vice versa) ratio test will most probably fail and you will not be able to detect edges of sudoku. (I also want to add that if lines that are not perpendicular to the image borders, sobel operations (dx and dy) will still work as lines will still have edges with respect to both axes.)
To be able to detect straight lines you should work on contour or pixel-wise analysis such as contourArea/boundingRectArea, top left and bottom right points...
Edit: I managed to check whether a set of contours form a line or not by applying linear regression and checking the error. However linear regression performed poorly when slope of the line is too big (i.e. >1000) or it is very close to 0. Therefore applying the ratio test above (in most upvoted answer) before linear regression is logical and did work for me.
To remove undected corners I applied gamma correction with a gamma value of 0.8.
The red circle is drawn to show the missing corner.
The code is:
gamma = 0.8
invGamma = 1/gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
cv2.LUT(img, table, img)
This is in addition to Abid Rahman's answer if some corner points are missing.

Copy a part of an image in opencv and python

I'm trying to split an image into several sub-images with opencv by identifying templates of the original image and then copy the regions where I matched those templates. I'm a TOTAL newbie to opencv! I've identified the sub-images using:
result = cv2.matchTemplate(img, template, cv2.TM_CCORR_NORMED)
After some cleanup I get a list of tuples called points in which I iterate to show the rectangles. tw and th is the template width and height respectively.
for pt in points:
re = cv2.rectangle(img, pt, (pt[0] + tw, pt[1] + th), 0, 2)
print('%s, %s' % (str(pt[0]), str(pt[1])))
count+=1
What I would like to accomplish is to save the octagons (https://dl.dropbox.com/u/239592/region01.png) into separated files.
How can I do this? I've read something about contours but I'm not sure how to use it. Ideally I would like to contour the octagon.
Thanks a lot for your help!
If template matching is working for you, stick to it. For instance, I considered the following template:
Then, we can pre-process the input in order to make it a binary one and discard small components. After this step, the template matching is performed. Then it is a matter of filtering the matches by means of discarding close ones (I've used a dummy method for that, so if there are too many matches you could see it taking some time). After we decide which points are far apart (and thus identify different hexagons), we can do minor adjusts to them in the following manner:
Sort by y-coordinate;
If two adjacent items start at a y-coordinate that is too close, then set them both to the same y-coord.
Now you can sort this point list in an appropriate order such that the crops are done in raster order. The cropping part is easily achieved using slicing provided by numpy.
import sys
import cv2
import numpy
outbasename = 'hexagon_%02d.png'
img = cv2.imread(sys.argv[1])
template = cv2.cvtColor(cv2.imread(sys.argv[2]), cv2.COLOR_BGR2GRAY)
theight, twidth = template.shape[:2]
# Binarize the input based on the saturation and value.
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
saturation = hsv[:,:,1]
value = hsv[:,:,2]
value[saturation > 35] = 255
value = cv2.threshold(value, 0, 255, cv2.THRESH_OTSU)[1]
# Pad the image.
value = cv2.copyMakeBorder(255 - value, 3, 3, 3, 3, cv2.BORDER_CONSTANT, value=0)
# Discard small components.
img_clean = numpy.zeros(value.shape, dtype=numpy.uint8)
contours, _ = cv2.findContours(value, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours):
area = cv2.contourArea(c)
if area > 500:
cv2.drawContours(img_clean, contours, i, 255, 2)
def closest_pt(a, pt):
if not len(a):
return (float('inf'), float('inf'))
d = a - pt
return a[numpy.argmin((d * d).sum(1))]
match = cv2.matchTemplate(img_clean, template, cv2.TM_CCORR_NORMED)
# Filter matches.
threshold = 0.8
dist_threshold = twidth / 1.5
loc = numpy.where(match > threshold)
ptlist = numpy.zeros((len(loc[0]), 2), dtype=int)
count = 0
print "%d matches" % len(loc[0])
for pt in zip(*loc[::-1]):
cpt = closest_pt(ptlist[:count], pt)
dist = ((cpt[0] - pt[0]) ** 2 + (cpt[1] - pt[1]) ** 2) ** 0.5
if dist > dist_threshold:
ptlist[count] = pt
count += 1
# Adjust points (could do for the x coords too).
ptlist = ptlist[:count]
view = ptlist.ravel().view([('x', int), ('y', int)])
view.sort(order=['y', 'x'])
for i in xrange(1, ptlist.shape[0]):
prev, curr = ptlist[i - 1], ptlist[i]
if abs(curr[1] - prev[1]) < 5:
y = min(curr[1], prev[1])
curr[1], prev[1] = y, y
# Crop in raster order.
view.sort(order=['y', 'x'])
for i, pt in enumerate(ptlist, start=1):
cv2.imwrite(outbasename % i,
img[pt[1]-2:pt[1]+theight-2, pt[0]-2:pt[0]+twidth-2])
print 'Wrote %s' % (outbasename % i)
If you want only the contours of the hexagons, then crop on img_clean instead of img (but then it is pointless to sort the hexagons in raster order).
Here is a representation of the different regions that would be cut for your two examples without modifying the code above:
I am sorry, I didn't understand from your question on how do you relate matchTemplate and Contours.
Anyway, below is a small technique using contours. It is on the assumption that your other images are also like the one you provided. I am not sure if it works with your other images. But I think it would help to get a startup. Try this yourself and make necessary adjustments and modifications.
What I did :
1 - I needed the edge of octagons . So Thresholded Image using Otsu and apply dilation and erosion (or use any method you like that works well for all your images, beware of the edges in left edge of image).
2 - Then found contours (More about contours : http://goo.gl/r0ID0
3 - For each contours, find its convex hull, find its area(A) & perimeter(P)
4 - For a perfect octagon, P*P/A = 13.25 approximately. I used it here and cut it and saved it.
5 - You can see cropping it also removes some edges of octagon. If you want it, adjust the cropping dimension.
Code :
import cv2
import numpy as np
img = cv2.imread('region01.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = cv2.dilate(thresh,None,iterations = 2)
thresh = cv2.erode(thresh,None)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
number = 0
for cnt in contours:
hull = cv2.convexHull(cnt)
area = cv2.contourArea(hull)
P = cv2.arcLength(hull,True)
if ((area != 0) and (13<= P**2/area <= 14)):
#cv2.drawContours(img,[hull],0,255,3)
x,y,w,h = cv2.boundingRect(hull)
number = number + 1
roi = img[y:y+h,x:x+w]
cv2.imshow(str(number),roi)
cv2.imwrite("1"+str(number)+".jpg",roi)
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Those 6 octagons will be stored as separate files.
Hope it helps !!!

How to remove convexity defects in a Sudoku square?

I was doing a fun project: Solving a Sudoku from an input image using OpenCV (as in Google goggles etc). And I have completed the task, but at the end I found a little problem for which I came here.
I did the programming using Python API of OpenCV 2.3.1.
Below is what I did :
Read the image
Find the contours
Select the one with maximum area, ( and also somewhat equivalent to square).
Find the corner points.
e.g. given below:
(Notice here that the green line correctly coincides with the true boundary of the Sudoku, so the Sudoku can be correctly warped. Check next image)
warp the image to a perfect square
eg image:
Perform OCR ( for which I used the method I have given in Simple Digit Recognition OCR in OpenCV-Python )
And the method worked well.
Problem:
Check out this image.
Performing the step 4 on this image gives the result below:
The red line drawn is the original contour which is the true outline of sudoku boundary.
The green line drawn is approximated contour which will be the outline of warped image.
Which of course, there is difference between green line and red line at the top edge of sudoku. So while warping, I am not getting the original boundary of the Sudoku.
My Question :
How can I warp the image on the correct boundary of the Sudoku, i.e. the red line OR how can I remove the difference between red line and green line? Is there any method for this in OpenCV?
I have a solution that works, but you'll have to translate it to OpenCV yourself. It's written in Mathematica.
The first step is to adjust the brightness in the image, by dividing each pixel with the result of a closing operation:
src = ColorConvert[Import["http://davemark.com/images/sudoku.jpg"], "Grayscale"];
white = Closing[src, DiskMatrix[5]];
srcAdjusted = Image[ImageData[src]/ImageData[white]]
The next step is to find the sudoku area, so I can ignore (mask out) the background. For that, I use connected component analysis, and select the component that's got the largest convex area:
components =
ComponentMeasurements[
ColorNegate#Binarize[srcAdjusted], {"ConvexArea", "Mask"}][[All,
2]];
largestComponent = Image[SortBy[components, First][[-1, 2]]]
By filling this image, I get a mask for the sudoku grid:
mask = FillingTransform[largestComponent]
Now, I can use a 2nd order derivative filter to find the vertical and horizontal lines in two separate images:
lY = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {2, 0}], {0.02, 0.05}], mask];
lX = ImageMultiply[MorphologicalBinarize[GaussianFilter[srcAdjusted, 3, {0, 2}], {0.02, 0.05}], mask];
I use connected component analysis again to extract the grid lines from these images. The grid lines are much longer than the digits, so I can use caliper length to select only the grid lines-connected components. Sorting them by position, I get 2x10 mask images for each of the vertical/horizontal grid lines in the image:
verticalGridLineMasks =
SortBy[ComponentMeasurements[
lX, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 1]] &][[All, 3]];
horizontalGridLineMasks =
SortBy[ComponentMeasurements[
lY, {"CaliperLength", "Centroid", "Mask"}, # > 100 &][[All,
2]], #[[2, 2]] &][[All, 3]];
Next I take each pair of vertical/horizontal grid lines, dilate them, calculate the pixel-by-pixel intersection, and calculate the center of the result. These points are the grid line intersections:
centerOfGravity[l_] :=
ComponentMeasurements[Image[l], "Centroid"][[1, 2]]
gridCenters =
Table[centerOfGravity[
ImageData[Dilation[Image[h], DiskMatrix[2]]]*
ImageData[Dilation[Image[v], DiskMatrix[2]]]], {h,
horizontalGridLineMasks}, {v, verticalGridLineMasks}];
The last step is to define two interpolation functions for X/Y mapping through these points, and transform the image using these functions:
fnX = ListInterpolation[gridCenters[[All, All, 1]]];
fnY = ListInterpolation[gridCenters[[All, All, 2]]];
transformed =
ImageTransformation[
srcAdjusted, {fnX ## Reverse[#], fnY ## Reverse[#]} &, {9*50, 9*50},
PlotRange -> {{1, 10}, {1, 10}}, DataRange -> Full]
All of the operations are basic image processing function, so this should be possible in OpenCV, too. The spline-based image transformation might be harder, but I don't think you really need it. Probably using the perspective transformation you use now on each individual cell will give good enough results.
Nikie's answer solved my problem, but his answer was in Mathematica. So I thought I should give its OpenCV adaptation here. But after implementing I could see that OpenCV code is much bigger than nikie's mathematica code. And also, I couldn't find interpolation method done by nikie in OpenCV ( although it can be done using scipy, i will tell it when time comes.)
1. Image PreProcessing ( closing operation )
import cv2
import numpy as np
img = cv2.imread('dave.jpg')
img = cv2.GaussianBlur(img,(5,5),0)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
Result :
2. Finding Sudoku Square and Creating Mask Image
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
Result :
3. Finding Vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
Result :
4. Finding Horizontal Lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
Result :
Of course, this one is not so good.
5. Finding Grid Points
res = cv2.bitwise_and(closex,closey)
Result :
6. Correcting the defects
Here, nikie does some kind of interpolation, about which I don't have much knowledge. And i couldn't find any corresponding function for this OpenCV. (may be it is there, i don't know).
Check out this SOF which explains how to do this using SciPy, which I don't want to use : Image transformation in OpenCV
So, here I took 4 corners of each sub-square and applied warp Perspective to each.
For that, first we find the centroids.
contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
But resulting centroids won't be sorted. Check out below image to see their order:
So we sort them from left to right, top to bottom.
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in xrange(10)])
bm = b.reshape((10,10,2))
Now see below their order :
Finally we apply the transformation and create a new image of size 450x450.
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = i/10
ci = i%10
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
Result :
The result is almost same as nikie's, but code length is large. May be, better methods are available out there, but until then, this works OK.
Regards
ARK.
You could try to use some kind of grid based modeling of you arbitrary warping. And since the sudoku already is a grid, that shouldn't be too hard.
So you could try to detect the boundaries of each 3x3 subregion and then warp each region individually. If the detection succeeds it would give you a better approximation.
I thought this was a great post, and a great solution by ARK; very well laid out and explained.
I was working on a similar problem, and built the entire thing. There were some changes (i.e. xrange to range, arguments in cv2.findContours), but this should work out of the box (Python 3.5, Anaconda).
This is a compilation of the elements above, with some of the missing code added (i.e., labeling of points).
'''
https://stackoverflow.com/questions/10196198/how-to-remove-convexity-defects-in-a-sudoku-square
'''
import cv2
import numpy as np
img = cv2.imread('test.png')
winname="raw image"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,100)
img = cv2.GaussianBlur(img,(5,5),0)
winname="blurred"
cv2.namedWindow(winname)
cv2.imshow(winname, img)
cv2.moveWindow(winname, 100,150)
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
mask = np.zeros((gray.shape),np.uint8)
kernel1 = cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(11,11))
winname="gray"
cv2.namedWindow(winname)
cv2.imshow(winname, gray)
cv2.moveWindow(winname, 100,200)
close = cv2.morphologyEx(gray,cv2.MORPH_CLOSE,kernel1)
div = np.float32(gray)/(close)
res = np.uint8(cv2.normalize(div,div,0,255,cv2.NORM_MINMAX))
res2 = cv2.cvtColor(res,cv2.COLOR_GRAY2BGR)
winname="res2"
cv2.namedWindow(winname)
cv2.imshow(winname, res2)
cv2.moveWindow(winname, 100,250)
#find elements
thresh = cv2.adaptiveThreshold(res,255,0,1,19,2)
img_c, contour,hier = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)
max_area = 0
best_cnt = None
for cnt in contour:
area = cv2.contourArea(cnt)
if area > 1000:
if area > max_area:
max_area = area
best_cnt = cnt
cv2.drawContours(mask,[best_cnt],0,255,-1)
cv2.drawContours(mask,[best_cnt],0,0,2)
res = cv2.bitwise_and(res,mask)
winname="puzzle only"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,300)
# vertical lines
kernelx = cv2.getStructuringElement(cv2.MORPH_RECT,(2,10))
dx = cv2.Sobel(res,cv2.CV_16S,1,0)
dx = cv2.convertScaleAbs(dx)
cv2.normalize(dx,dx,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dx,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernelx,iterations = 1)
img_d, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if h/w > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_CLOSE,None,iterations = 2)
closex = close.copy()
winname="vertical lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_d)
cv2.moveWindow(winname, 100,350)
# find horizontal lines
kernely = cv2.getStructuringElement(cv2.MORPH_RECT,(10,2))
dy = cv2.Sobel(res,cv2.CV_16S,0,2)
dy = cv2.convertScaleAbs(dy)
cv2.normalize(dy,dy,0,255,cv2.NORM_MINMAX)
ret,close = cv2.threshold(dy,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,kernely)
img_e, contour, hier = cv2.findContours(close,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
for cnt in contour:
x,y,w,h = cv2.boundingRect(cnt)
if w/h > 5:
cv2.drawContours(close,[cnt],0,255,-1)
else:
cv2.drawContours(close,[cnt],0,0,-1)
close = cv2.morphologyEx(close,cv2.MORPH_DILATE,None,iterations = 2)
closey = close.copy()
winname="horizontal lines"
cv2.namedWindow(winname)
cv2.imshow(winname, img_e)
cv2.moveWindow(winname, 100,400)
# intersection of these two gives dots
res = cv2.bitwise_and(closex,closey)
winname="intersections"
cv2.namedWindow(winname)
cv2.imshow(winname, res)
cv2.moveWindow(winname, 100,450)
# text blue
textcolor=(0,255,0)
# points green
pointcolor=(255,0,0)
# find centroids and sort
img_f, contour, hier = cv2.findContours(res,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
centroids = []
for cnt in contour:
mom = cv2.moments(cnt)
(x,y) = int(mom['m10']/mom['m00']), int(mom['m01']/mom['m00'])
cv2.circle(img,(x,y),4,(0,255,0),-1)
centroids.append((x,y))
# sorting
centroids = np.array(centroids,dtype = np.float32)
c = centroids.reshape((100,2))
c2 = c[np.argsort(c[:,1])]
b = np.vstack([c2[i*10:(i+1)*10][np.argsort(c2[i*10:(i+1)*10,0])] for i in range(10)])
bm = b.reshape((10,10,2))
# make copy
labeled_in_order=res2.copy()
for index, pt in enumerate(b):
cv2.putText(labeled_in_order,str(index),tuple(pt),cv2.FONT_HERSHEY_DUPLEX, 0.75, textcolor)
cv2.circle(labeled_in_order, tuple(pt), 5, pointcolor)
winname="labeled in order"
cv2.namedWindow(winname)
cv2.imshow(winname, labeled_in_order)
cv2.moveWindow(winname, 100,500)
# create final
output = np.zeros((450,450,3),np.uint8)
for i,j in enumerate(b):
ri = int(i/10) # row index
ci = i%10 # column index
if ci != 9 and ri!=9:
src = bm[ri:ri+2, ci:ci+2 , :].reshape((4,2))
dst = np.array( [ [ci*50,ri*50],[(ci+1)*50-1,ri*50],[ci*50,(ri+1)*50-1],[(ci+1)*50-1,(ri+1)*50-1] ], np.float32)
retval = cv2.getPerspectiveTransform(src,dst)
warp = cv2.warpPerspective(res2,retval,(450,450))
output[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1] = warp[ri*50:(ri+1)*50-1 , ci*50:(ci+1)*50-1].copy()
winname="final"
cv2.namedWindow(winname)
cv2.imshow(winname, output)
cv2.moveWindow(winname, 600,100)
cv2.waitKey(0)
cv2.destroyAllWindows()
I want to add that above method works only when sudoku board stands straight, otherwise height/width (or vice versa) ratio test will most probably fail and you will not be able to detect edges of sudoku. (I also want to add that if lines that are not perpendicular to the image borders, sobel operations (dx and dy) will still work as lines will still have edges with respect to both axes.)
To be able to detect straight lines you should work on contour or pixel-wise analysis such as contourArea/boundingRectArea, top left and bottom right points...
Edit: I managed to check whether a set of contours form a line or not by applying linear regression and checking the error. However linear regression performed poorly when slope of the line is too big (i.e. >1000) or it is very close to 0. Therefore applying the ratio test above (in most upvoted answer) before linear regression is logical and did work for me.
To remove undected corners I applied gamma correction with a gamma value of 0.8.
The red circle is drawn to show the missing corner.
The code is:
gamma = 0.8
invGamma = 1/gamma
table = np.array([((i / 255.0) ** invGamma) * 255
for i in np.arange(0, 256)]).astype("uint8")
cv2.LUT(img, table, img)
This is in addition to Abid Rahman's answer if some corner points are missing.

Categories

Resources