How to segment bent rod for angle calculations? - python

I'm trying to use OpenCV to segment a bent rod from its background, then find the bends in it and calculate the angle between each bend.
The first part is luckily trivial, given enough contrast between the foreground and background.
A bit of erosion/dilation takes care of reflections/highlights when segmenting.
The second part is where I'm not sure how to approach it.
I can easily retrieve a contour (top and bottom are very similar, so either would do),
but what I can't figure out is how to split the contour into the straight parts and the bent parts to calculate the angles.
So far I've tried simplifying the contours, but either I get too many or too few points, and it feels difficult to pinpoint the right
settings to keep the straight parts straight and the bent parts simplified.
Here is my input image (bend.png).
And here's what I've tried so far:
#!/usr/bin/env python
import numpy as np
import cv2

threshold = 229
# erosion/dilation kernel
kernel = np.ones((5, 5), np.uint8)
# contour simplification
epsilon = 0

# slider callbacks
def onThreshold(x):
    global threshold
    print "threshold = ", x
    threshold = x

def onEpsilon(x):
    global epsilon
    epsilon = x * 0.01
    print "epsilon = ", epsilon

# make a window to add sliders/preview to
cv2.namedWindow('processed')
# make some sliders
cv2.createTrackbar('threshold', 'processed', 60, 255, onThreshold)
cv2.createTrackbar('epsilon', 'processed', 1, 1000, onEpsilon)
# load image
img = cv2.imread('bend.png', 0)
# continuously process for quick feedback
while 1:
    # exit on ESC key
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break
    # Threshold
    ret, processed = cv2.threshold(img, threshold, 255, 0)
    # Invert
    processed = (255 - processed)
    # Dilate, then erode (cleans up reflections/highlights)
    processed = cv2.dilate(processed, kernel)
    processed = cv2.erode(processed, kernel)
    # Canny
    processed = cv2.Canny(processed, 100, 200)
    contours, hierarchy = cv2.findContours(processed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    demo = img.copy()
    if len(contours) > 0:
        approx = cv2.approxPolyDP(contours[0], epsilon, True)
        # print len(approx)
        cv2.drawContours(processed, [approx], -1, (255, 255, 255), 3)
        cv2.drawContours(demo, [approx], -1, (192, 0, 0), 3)
    # show result
    cv2.imshow('processed', processed)
    cv2.imshow('demo', demo)
# exit
cv2.destroyAllWindows()
Here's what I've got so far, but I'm not convinced this is the best approach:
I've tried to figure this out visually and what I've aimed for is something along these lines:
Because the end goal is to calculate the angle between the bent parts, something like this feels simpler:
My assumption is that fitting lines and computing the angles between pairs of intersecting lines could work:
I did a quick test using the HoughLines OpenCV Python tutorial, but regardless of the parameters passed I didn't get great results:
#!/usr/bin/env python
import numpy as np
import cv2

threshold = 229
minLineLength = 30
maxLineGap = 10
houghThresh = 15
# erosion/dilation kernel
kernel = np.ones((5, 5), np.uint8)

# slider callbacks
def onMinLineLength(x):
    global minLineLength
    minLineLength = x
    print "minLineLength = ", x

def onMaxLineGap(x):
    global maxLineGap
    maxLineGap = x
    print "maxLineGap = ", x

def onHoughThresh(x):
    global houghThresh
    houghThresh = x
    print "houghThresh = ", x

# make a window to add sliders/preview to
cv2.namedWindow('processed')
# make some sliders
cv2.createTrackbar('minLineLength', 'processed', 1, 50, onMinLineLength)
cv2.createTrackbar('maxLineGap', 'processed', 5, 30, onMaxLineGap)
cv2.createTrackbar('houghThresh', 'processed', 15, 50, onHoughThresh)
# load image
img = cv2.imread('bend.png', 0)
# continuously process for quick feedback
while 1:
    # exit on ESC key
    k = cv2.waitKey(1) & 0xFF
    if k == 27:
        break
    # Threshold
    ret, processed = cv2.threshold(img, threshold, 255, 0)
    # Invert
    processed = (255 - processed)
    # Dilate, then erode
    processed = cv2.dilate(processed, kernel)
    processed = cv2.erode(processed, kernel)
    # Canny
    processed = cv2.Canny(processed, 100, 200)
    lineBottom = np.zeros(img.shape, np.uint8)
    contours, hierarchy = cv2.findContours(processed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if len(contours) > 0:
        cv2.drawContours(lineBottom, contours, 0, (255, 255, 255), 1)
    # HoughLinesP (minLineLength/maxLineGap need to be passed as keyword arguments)
    houghResult = img.copy()
    lines = cv2.HoughLinesP(lineBottom, 1, np.pi / 180, houghThresh, minLineLength=minLineLength, maxLineGap=maxLineGap)
    try:
        for x in range(0, len(lines)):
            for x1, y1, x2, y2 in lines[x]:
                cv2.line(houghResult, (x1, y1), (x2, y2), (0, 255, 0), 2)
    except Exception as e:
        print e
    # show result
    cv2.imshow('lineBottom', lineBottom)
    cv2.imshow('houghResult', houghResult)
# exit
cv2.destroyAllWindows()
Is this a feasible approach? If so, what's the correct way of doing line fitting in OpenCV Python?
Otherwise, what's the best way to tackle this problem?
Update: Following Miki's advice I've tried OpenCV 3's LSD (LineSegmentDetector) and got nicer results than with HoughLinesP, but it looks like some tweaking is still needed, although other than cv2.createLineSegmentDetector there don't seem to be many options to play with:
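For reference, a minimal sketch of the LSD usage mentioned above (assuming an OpenCV 3.x build where cv2.createLineSegmentDetector is available; it was removed from some later releases for licensing reasons):
#!/usr/bin/env python
import cv2

img = cv2.imread('bend.png', 0)
lsd = cv2.createLineSegmentDetector(cv2.LSD_REFINE_STD)
# detect() returns the segments plus per-segment width/precision/nfa arrays
lines, widths, precisions, nfas = lsd.detect(img)
vis = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
vis = lsd.drawSegments(vis, lines)
cv2.imshow('LSD segments', vis)
cv2.waitKey(0)
cv2.destroyAllWindows()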

It may be convenient to use curvature to find line segments. Here is an example of splitting a contour at minimal curvature points; it may be better to use maximal curvature points in your case. You can split your curve into parts, then approximate each part with a line segment using a RANSAC method.
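As a rough illustration of that idea (not the RANSAC variant, but a simpler robust fit with cv2.fitLine; the split_indices input is hypothetical and assumed to come from the curvature step):
import numpy as np
import cv2

def segment_angles(contour, split_indices):
    # contour: Nx1x2 array from cv2.findContours; split_indices: sorted indices of curvature extrema
    pts = contour.reshape(-1, 2).astype(np.float32)
    directions = []
    for start, end in zip(split_indices[:-1], split_indices[1:]):
        part = pts[start:end]
        if len(part) < 2:
            continue
        # cv2.fitLine returns (vx, vy, x0, y0); DIST_HUBER is robust to outliers
        vx, vy, _, _ = cv2.fitLine(part, cv2.DIST_HUBER, 0, 0.01, 0.01)
        directions.append((float(vx), float(vy)))
    angles = []
    for (vx1, vy1), (vx2, vy2) in zip(directions[:-1], directions[1:]):
        dot = abs(vx1 * vx2 + vy1 * vy2)  # abs() ignores which way along the line the vector points
        angles.append(np.degrees(np.arccos(np.clip(dot, -1.0, 1.0))))
    return angles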

I know this is old, but I found this after having a similar problem.
The method I used (after finding the binary image) was along the lines of:
1. Find the ends (the points with the fewest neighbours).
2. Skeletonize (optional).
3. Starting at one end, find a few of the closest points using cdist (scipy.spatial.distance).
4. Perform a linear regression with these points and find all points in the image within a few pixels of error of the line of best fit. I used query_ball_point.
5. This gives additional points within the same straight line. Order them by distance from the last fiducial point. Some of these might be projections of the line onto distant parts of the object and should be deleted.
6. Repeat steps 4 and 5 until no more points are added.
Once no more points are added to the line, you find the start of the next valid line by looking at the R-squared of the fit. The line should have a very high R-squared, e.g. > 0.95 (depending on the image; I was getting > 0.99). Keep changing the starting point until a high R-squared is achieved.
This gives a bunch of line segments from which it should be easy to find the angles between them. One potential problem occurs when a segment is vertical (or horizontal) and the slope becomes infinite. When this occurred I just flipped the axes around. You can also get around this by defining the end points of a line and finding all points within a threshold distance from that line, rather than doing the regression.
This involves a lot more coding than the other methods suggested, but execution time is fast and it gives much greater control over what is happening.
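A minimal sketch of the angle step, using direction vectors instead of slopes so that vertical segments need no special handling (the endpoints are assumed to come from the fitted segments):
import numpy as np

def angle_between_segments(seg1, seg2):
    # seg1, seg2: ((x1, y1), (x2, y2)) endpoints of two fitted line segments
    d1 = np.subtract(seg1[1], seg1[0]).astype(float)
    d2 = np.subtract(seg2[1], seg2[0]).astype(float)
    cos_a = abs(np.dot(d1, d2)) / (np.linalg.norm(d1) * np.linalg.norm(d2))
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# e.g. a vertical segment against a 45-degree segment -> 45.0
print(angle_between_segments(((0, 0), (0, 10)), ((0, 0), (10, 10))))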

Once you have the contour, you can analyze it using a method like the one proposed in this paper: https://link.springer.com/article/10.1007/s10032-011-0175-3
Basically, the contour is tracked calculating the curvature at each point.
Then you can use a curvature threshold to segment the contour into straight and curved sections.
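A hedged sketch of that idea (not the paper's exact algorithm): estimate a turning angle at each contour point over a step of k samples, then threshold it to separate straight from curved sections.
import numpy as np

def turning_angles(contour, k=5):
    # contour: Nx1x2 array from cv2.findContours (assumed closed)
    pts = contour.reshape(-1, 2).astype(float)
    prev = np.roll(pts, k, axis=0)
    nxt = np.roll(pts, -k, axis=0)
    v1 = pts - prev
    v2 = nxt - pts
    cos_a = np.sum(v1 * v2, axis=1) / (
        np.linalg.norm(v1, axis=1) * np.linalg.norm(v2, axis=1) + 1e-9)
    return np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))

# points turning by more than e.g. 20 degrees would belong to the curved sections:
# curved_mask = turning_angles(cnt) > 20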

Related

How to delete or clear contours from image?

I'm working with license plates; what I do is apply a series of filters to them, such as:
Grayscale
Blur
Threshold
Binary
The problem is that when I do this, some contours are left at the borders, as in this image. How can I clear them, or at least make them just black (masked)? I used this code, but sometimes it fails.
# invert image and detect contours
inverted = cv2.bitwise_not(image_binary_and_dilated)
contours, hierarchy = cv2.findContours(inverted,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
# get the biggest contour
biggest_index = -1
biggest_area = -1
i = 0
for c in contours:
    area = cv2.contourArea(c)
    if area > biggest_area:
        biggest_area = area
        biggest_index = i
    i = i + 1
print("biggest area: " + str(biggest_area) + " index: " + str(biggest_index))
cv2.drawContours(image_binary_and_dilated, contours, biggest_index, [0,0,255])
center, size, angle = cv2.minAreaRect(contours[biggest_index])
rot_mat = cv2.getRotationMatrix2D(center, angle, 1.)
#cv2.warpPerspective()
print(size)
dst = cv2.warpAffine(inverted, rot_mat, (int(size[0]), int(size[1])))
mask = dst * 0
x1 = max([int(center[0] - size[0] / 2)+1, 0])
y1 = max([int(center[1] - size[1] / 2)+1, 0])
x2 = int(center[0] + size[0] / 2)-1
y2 = int(center[1] + size[1] / 2)-1
point1 = (x1, y1)
point2 = (x2, y2)
print(point1)
print(point2)
cv2.rectangle(dst, point1, point2, [0,0,0])
cv2.rectangle(mask, point1, point2, [255,255,255], cv2.FILLED)
masked = cv2.bitwise_and(dst, mask)
#cv2_imshow(imgg)
cv2_imshow(dst)
cv2_imshow(masked)
#cv2_imshow(mask)
Some results:
The original plates were:
Good result 1
Good result 2
Good result 3
Good result 4
Bad result 1
Bad result 2
Binary plates are:
Image 1
Image 2
Image 3
Image 4
Image 5 - Bad result 1
Image 6 - Bad result 2
How can I fix this code? I just want to avoid those bad results, or improve them.
INTRODUCTION
What you are asking is starting to become complicated, and I believe there is no longer a single right or wrong answer, just different ways to do this. Almost all of them will yield both positive and negative results, most likely in different ratios. Having 100% positive results is quite a challenging task, and I do believe my answer does not reach it. Yet it can be the basis for a more sophisticated work towards that goal.
MY PROPOSAL
So, I want to make a different proposal here.
I am not 100% sure why you are doing all the steps, and I believe some of them could be unnecessary.
Let's start from the problem: you want to remove the white parts on the borders (which are not numbers).
So we need an idea of how to distinguish them from the letters, in order to tackle them correctly.
If we just try to contour and warp, it is likely to work on some images and not on others, because not all of them look the same. This is the hardest part of finding a general solution that works for many images.
What are the differences between the characteristics of the numbers and the characteristics of the borders (and other small points)?
After thinking about that, I would say: the shapes! Meaning that, if you imagine a bounding box around a letter/number, it would look like a rectangle whose size is related to the image size, while in the case of the borders, they are usually very large and narrow, or too small to be considered a letter/number (random points).
Therefore, my guess would be segmentation, dividing the features by their shape. So we take the binary image, we remove some parts using the projection on the axes (as you correctly asked in the previous question, and which I believe we should use), and we get an image where each letter is separated from the white borders.
Then we can segment and check the shape of each segmented object, and if we think these are letters, we keep them; otherwise we discard them.
THE CODE
I wrote the code below as an example on your data. Some of the parameters are tuned for this set of images, so they may have to be relaxed for a larger dataset.
import cv2
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
import scipy.ndimage as ndimage

# do this for all the images
num_images = 6
plt.figure(figsize=(16, 16))
for k in range(num_images):
    # read the image
    binary_image = cv2.imread("binary_image/img{}.png".format(k), cv2.IMREAD_GRAYSCALE)
    # just for visualization purposes, I create another image with the same shape, to show what I am doing
    new_intermediate_image = np.zeros((binary_image.shape), np.uint8)
    new_intermediate_image += binary_image
    # here we will copy only the cleaned parts
    new_cleaned_image = np.zeros((binary_image.shape), np.uint8)

    ### THIS CODE COMES FROM THE PREVIOUS ANSWER:
    # https://stackoverflow.com/questions/62127537/how-to-clean-binary-image-using-horizontal-projection?noredirect=1&lq=1
    (rows, cols) = binary_image.shape
    h_projection = np.array([x / rows for x in binary_image.sum(axis=0)])
    threshold_h = (np.max(h_projection) - np.min(h_projection)) / 10
    print("we will use threshold {} for horizontal".format(threshold_h))
    # select the black areas
    black_areas_horizontal = np.where(h_projection < threshold_h)
    for j in black_areas_horizontal:
        new_intermediate_image[:, j] = 0

    v_projection = np.array([x / cols for x in binary_image.sum(axis=1)])
    threshold_v = (np.max(v_projection) - np.min(v_projection)) / 10
    print("we will use threshold {} for vertical".format(threshold_v))
    black_areas_vertical = np.where(v_projection < threshold_v)
    for j in black_areas_vertical:
        new_intermediate_image[j, :] = 0
    ### UNTIL HERE

    # define the features we are looking for
    # these parameters can also be tuned
    min_width = binary_image.shape[1] / 14
    max_width = binary_image.shape[1] / 2
    min_height = binary_image.shape[0] / 5
    max_height = binary_image.shape[0]
    print("we look for features with width in [{},{}] and height in [{},{}]".format(min_width, max_width, min_height, max_height))

    # segment the image
    labeled_array, num_features = ndimage.label(new_intermediate_image)
    # loop over all features found (labels start at 1; 0 is the background)
    for i in range(1, num_features + 1):
        # get a bounding box around them
        slice_x, slice_y = ndimage.find_objects(labeled_array == i)[0]
        roi = labeled_array[slice_x, slice_y]
        # check the shape, if the bounding box is what we expect, copy it to the new image
        if roi.shape[0] > min_height and \
           roi.shape[0] < max_height and \
           roi.shape[1] > min_width and \
           roi.shape[1] < max_width:
            new_cleaned_image += (labeled_array == i)

    # print all images on a grid
    plt.subplot(num_images, 3, 1 + (k * 3))
    plt.imshow(binary_image)
    plt.subplot(num_images, 3, 2 + (k * 3))
    plt.imshow(new_intermediate_image)
    plt.subplot(num_images, 3, 3 + (k * 3))
    plt.imshow(new_cleaned_image)
That produces the output below (in the grid, the left column shows the input images, the central column shows the images after the mask based on the histogram projections, and the right column shows the cleaned images):
CONCLUSIONS:
As said above, this method does not yield 100% positive results. The last picture has lower quality and some parts are unconnected, so they are lost in the process. I personally believe this is a price to pay to get cleaner images; if you have a lot of images, it won't be a problem, and you can remove those kinds of images. Overall, I think this method returns quite clean images, where all the other parts that are not letters or numbers are correctly removed.
ADVANTAGES
the image is clean, nothing more than letters or numbers are kept
the parameters can be tuned, and should be consistent across images
in case of problems, using some prints or some debugging in the loop that chooses the features to keep should make it easier to understand where the problems are and correct them
LIMITATIONS
it may fail in some cases where letters and numbers touch the white borders, which seems quite possible. This is handled by the black_areas created using the projections, but I am not so confident it will work 100% of the time.
some small parts of the numbers can be lost during the process, as in the last picture.

Python OpenCV: Hough Transform does not detect obvious lines

Problem:
I want to detect lines in a given image using OpenCV in Python. Although there are multiple obvious vertical lines, neither standard HoughLines nor probabilistic HoughLinesP finds them. As I spent plenty of time playing around with the parameters, I guess I am doing something fundamentally wrong here. I am aware that Hough lines are usually applied to edges, e.g. after using Canny, but due to Canny's non-maximum suppression, Canny does not give good results here.
Image where detecting the vertical lines fails:
Why:
Given this (an image of a water meter):
I want to detect the rectangle around each digit. To detect the rectangles, I used Sobel filters in the x and y directions and calculated the magnitude and angle/phase of the gradient. As I assume the image to be rotated correctly in this step, I extract vertical and horizontal edges as shown in the image. My hope was to make use of HoughLines to find the bounding boxes. Finding the horizontal lines works perfectly, as seen in the
debug plot containing further insights into the problem, whereas it does not work on the vertical components (second row):
Detecting the rectangles around each digit would help me to
locate the Region of Interest
cut out the region inside the rectangle, in other words the digit. Several other approaches to detect the digits directly using contours all had the problem of the outer rectangles interfering with the digits.
Update: the code for detecting the vertical lines:
#img is initialized with the binarized, vertical component image, as shown above
minLength = 30
maxGap = 7
angle_res = np.pi / 180
rad_res = 2
threshold_val = 100
linesP = cv2.HoughLinesP(img, rad_res, angle_res, threshold_val, minLineLength=minLength, maxLineGap=maxGap)
cdst = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
cdstP = np.copy(cdst)
if linesP is None:
    print("Error when finding lines (probabilistic hough transformation). No lines detected")
else:
    # Copy edges to the images that will display the results in BGR
    for i in range(0, len(linesP)):
        l = linesP[i][0]
        cv2.line(cdstP, (l[0], l[1]), (l[2], l[3]), (255,0,0), 3, cv2.LINE_AA)
plt.imshow(cdstP); plt.show()
First apply Canny edge detection with proper threshold settings. Then apply the probabilistic Hough line transform. After applying the Hough transform, filter the lines by slope: you want the box, so you need to keep only the horizontal and vertical lines. After filtering the lines, apply morphological dilation and erosion back to back on the resulting image to get a neat box around each digit. While applying the Hough transform, select the minimum line length and maximum line gap parameters appropriately.
You can use the trackbar function while selecting appropriate parameters. The sample code below shows threshold selection for Canny edge detection; a sketch of the slope filtering follows it.
import cv2
import numpy as np
cv2.namedWindow('Result')
img = cv2.imread('qkEuE.png')
v1 = 0
v2 = 0
def doEdges():
    edges = cv2.Canny(img,v1,v2)
    edges = cv2.cvtColor(edges,cv2.COLOR_GRAY2BGR)
    res = np.concatenate((img,edges),axis = 0)
    cv2.imshow('Result',res)

def setVal1(val):
    global v1
    v1 = val
    doEdges()

def setVal2(val):
    global v2
    v2 = val
    doEdges()

cv2.createTrackbar('Val1','Result',0,500,setVal1)
cv2.createTrackbar('Val2','Result',0,500,setVal2)
cv2.imshow('Result',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
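A hedged sketch of the slope-filtering step mentioned above (linesP is assumed to be the output of cv2.HoughLinesP as in the question; the 10-degree tolerance and the function name are illustrative only):
import numpy as np

def filter_axis_aligned(linesP, tol_deg=10):
    # split HoughLinesP segments into near-horizontal and near-vertical ones
    vertical, horizontal = [], []
    if linesP is None:
        return vertical, horizontal
    for x1, y1, x2, y2 in linesP[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))  # 0..180
        if angle < tol_deg or angle > 180 - tol_deg:
            horizontal.append((x1, y1, x2, y2))
        elif abs(angle - 90) < tol_deg:
            vertical.append((x1, y1, x2, y2))
    return vertical, horizontal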
Hope it helps you.

Detect if an OCR text image is upside down

I have some hundreds of images (scanned documents), most of them are skewed. I wanted to de-skew them using Python.
Here is the code I used:
import numpy as np
import cv2
from skimage.transform import radon
filename = 'path_to_filename'
# Load file, converting to grayscale
img = cv2.imread(filename)
I = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
h, w = I.shape
# If the resolution is high, resize the image to reduce processing time.
if (w > 640):
    I = cv2.resize(I, (640, int((h / w) * 640)))
I = I - np.mean(I) # Demean; make the brightness extend above and below zero
# Do the radon transform
sinogram = radon(I)
# Find the RMS value of each row and find "busiest" rotation,
# where the transform is lined up perfectly with the alternating dark
# text and white lines
r = np.array([np.sqrt(np.mean(np.abs(line) ** 2)) for line in sinogram.transpose()])
rotation = np.argmax(r)
print('Rotation: {:.2f} degrees'.format(90 - rotation))
# Rotate and save with the original resolution
M = cv2.getRotationMatrix2D((w/2,h/2),90 - rotation,1)
dst = cv2.warpAffine(img,M,(w,h))
cv2.imwrite('rotated.jpg', dst)
This code works well with most of the documents, except with some angles: (180 and 0) and (90 and 270) are often detected as the same angle (i.e. it does not differentiate between 180 and 0, or between 90 and 270). So I get a lot of upside-down documents.
Here is an example:
The resulted image that I get is the same as the input image.
Is there any suggestion to detect if an image is upside down using OpenCV and Python?
PS: I tried to check the orientation using EXIF data, but it didn't lead to any solution.
EDIT:
It is possible to detect the orientation using Tesseract (pytesseract for Python), but it is only possible when the image contains a lot of characters.
For anyone who may need this:
import cv2
import pytesseract
print(pytesseract.image_to_osd(cv2.imread(file_name)))
If the document contains enough characters, Tesseract can detect the orientation. However, when the image has few lines, the orientation angle suggested by Tesseract is usually wrong. So this cannot be a 100% solution.
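A hedged sketch of pulling the suggested rotation out of that output (this assumes Tesseract's usual "Rotate: <degrees>" line in the OSD text):
import re
import cv2
import pytesseract

osd = pytesseract.image_to_osd(cv2.imread(file_name))
match = re.search(r'Rotate: (\d+)', osd)
rotation = int(match.group(1)) if match else 0
print('suggested rotation:', rotation)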
Python3/OpenCV4 script to align scanned documents.
Rotate the document and sum the rows. When the document has 0 and 180 degrees of rotation, there will be a lot of black pixels in the image:
Use a score-keeping method: score each image for its likeness to a zebra pattern. The image with the best score has the correct rotation. The image you linked to was off by 0.5 degrees. I omitted some functions for readability; the full code can be found here.
# Rotate the image around in a circle
angle = 0
scores = []
while angle <= 360:
    # Rotate the source image
    img = rotate(src, angle)
    # Crop the center 1/3rd of the image (roi is filled with text)
    h, w = img.shape
    buffer = min(h, w) - int(min(h, w)/1.15)
    roi = img[int(h/2-buffer):int(h/2+buffer), int(w/2-buffer):int(w/2+buffer)]
    # Create background to draw transform on
    bg = np.zeros((buffer*2, buffer*2), np.uint8)
    # Compute the sums of the rows
    row_sums = sum_rows(roi)
    # High score --> Zebra stripes
    score = np.count_nonzero(row_sums)
    scores.append(score)
    # Image has best rotation
    if score <= min(scores):
        # Save the rotated image
        print('found optimal rotation')
        best_rotation = img.copy()
    k = display_data(roi, row_sums, buffer)
    if k == 27: break
    # Increment angle and try again
    angle += .75
cv2.destroyAllWindows()
How to tell if the document is upside down? Fill in the area from the top of the document to the first non-black pixel in the image. Measure the area in yellow. The image that has the smallest area will be the one that is right-side-up:
# Find the area from the top of page to top of image
_, bg = area_to_top_of_text(best_rotation.copy())
right_side_up = sum(sum(bg))
# Flip image and try again
best_rotation_flipped = rotate(best_rotation, 180)
_, bg = area_to_top_of_text(best_rotation_flipped.copy())
upside_down = sum(sum(bg))
# Check which area is larger
if right_side_up < upside_down: aligned_image = best_rotation
else: aligned_image = best_rotation_flipped
# Save aligned image
cv2.imwrite('/home/stephen/Desktop/best_rotation.png', 255-aligned_image)
cv2.destroyAllWindows()
Assuming you have already run the angle correction on the image, you can try the following to find out if it is flipped:
1. Project the corrected image onto the y-axis, so that you get a 'peak' for each line. Important: there are actually almost always two sub-peaks!
2. Smooth this projection by convolving it with a Gaussian in order to get rid of fine structure, noise, etc.
3. For each peak, check whether the stronger sub-peak is at the top or at the bottom.
4. Calculate the fraction of peaks that have their sub-peak on the bottom side. This is your scalar value that gives you the confidence that the image is oriented correctly.
The peak finding in step 3 is done by finding sections with above average values. The sub-peaks are then found via argmax.
Here's a figure to illustrate the approach on a few lines of your example image:
Blue: Original projection
Orange: smoothed projection
Horizontal line: average of the smoothed projection for the whole image.
Here's some code that does this:
import cv2
import numpy as np
# load image, convert to grayscale, threshold it at 127 and invert.
page = cv2.imread('Page.jpg')
page = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
page = cv2.threshold(page, 127, 255, cv2.THRESH_BINARY_INV)[1]
# project the page to the side and smooth it with a gaussian
projection = np.sum(page, 1)
gaussian_filter = np.exp(-(np.arange(-3, 3, 0.1)**2))
gaussian_filter /= np.sum(gaussian_filter)
smooth = np.convolve(projection, gaussian_filter)
# find the pixel values where we expect lines to start and end
mask = smooth > np.average(smooth)
edges = np.convolve(mask, [1, -1])
line_starts = np.where(edges == 1)[0]
line_endings = np.where(edges == -1)[0]
# count lines with peaks on the lower side
lower_peaks = 0
for start, end in zip(line_starts, line_endings):
    line = smooth[start:end]
    if np.argmax(line) < len(line)/2:
        lower_peaks += 1
print(lower_peaks / len(line_starts))
this prints 0.125 for the given image, so this is not oriented correctly and must be flipped.
Note that this approach might break badly if there are images or anything not organized in lines in the image (maybe math or pictures). Another problem would be too few lines, resulting in bad statistics.
Also different fonts might result in different distributions. You can try this on a few images and see if the approach works. I don't have enough data.
You can use the Alyn module. To install it:
pip install alyn
Then to use it to deskew images (taken from the homepage):
from alyn import Deskew
d = Deskew(
    input_file='path_to_file',
    display_image='preview the image on screen',
    output_file='path_for_deskewed image',
    r_angle='offset_angle_in_degrees_to_control_orientation')
d.run()
Note that Alyn is only for deskewing text.

How to detect rectangular items in image with Python

I have found a plethora of questions regarding finding "things" in images using OpenCV, et al., in Python, but so far I have been unable to piece them together into a reliable solution to my problem.
I am attempting to use computer vision to help count tiny surface mount electronics parts. The idea is for me to dump parts onto a solid color piece of paper, snap a picture, and have the software tell me how many items are in it.
The "things" differ from one picture to the next but will always be identical in any one image. I seem to be able to manually tune the parameters for things like hue/saturation for a particular part but it tends to require tweaking every time I change to a new part.
My current, semi-functioning code is posted below:
import imutils
import numpy
import cv2
import sys


def part_area(contours, round=10):
    """Finds the mode of the contour area. The idea is that most of the parts in an image will be separated and that
    finding the most common area in the list of areas should provide a reasonable value to approximate by. The areas
    are rounded to the nearest multiple of 200 to reduce the list of options."""
    # Start with a list of all of the areas for the provided contours.
    areas = [cv2.contourArea(contour) for contour in contours]
    # Determine a threshold for the minimum amount of area as 1% of the overall range.
    threshold = (max(areas) - min(areas)) / 100
    # Trim the list of areas down to only those that exceed the threshold.
    thresholded = [area for area in areas if area > threshold]
    # Round the areas to the nearest value set by the round argument.
    rounded = [int((area + (round / 2)) / round) * round for area in thresholded]
    # Remove any areas that rounded down to zero.
    cleaned = [area for area in rounded if area != 0]
    # Count the areas with the same values.
    counts = {}
    for area in cleaned:
        if area not in counts:
            counts[area] = 0
        counts[area] += 1
    # Reduce the areas down to only those that are in groups of three or more with the same area.
    above = []
    for area, count in counts.iteritems():
        if count > 2:
            for _ in range(count):
                above.append(area)
    # Take the mean of the areas as the average part size.
    average = sum(above) / len(above)
    return average


def find_hue_mode(hsv):
    """Given an HSV image as an input, compute the mode of the list of hue values to find the most common hue in the
    image. This is used to determine the center for the background color filter."""
    pixels = {}
    for row in hsv:
        for pixel in row:
            hue = pixel[0]
            if hue not in pixels:
                pixels[hue] = 0
            pixels[hue] += 1
    counts = sorted(pixels.keys(), key=lambda key: pixels[key], reverse=True)
    return counts[0]


if __name__ == "__main__":
    # load the image and resize it to a smaller factor so that the shapes can be approximated better
    image = cv2.imread(sys.argv[1])

    # define range of blue color in HSV
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    center = find_hue_mode(hsv)
    print 'Center Hue:', center

    lower = numpy.array([center - 10, 50, 50])
    upper = numpy.array([center + 10, 255, 255])

    # Threshold the HSV image to get only blue colors
    mask = cv2.inRange(hsv, lower, upper)
    inverted = cv2.bitwise_not(mask)

    blurred = cv2.GaussianBlur(inverted, (5, 5), 0)
    edged = cv2.Canny(blurred, 50, 100)
    dilated = cv2.dilate(edged, None, iterations=1)
    eroded = cv2.erode(dilated, None, iterations=1)

    # find contours in the thresholded image and initialize the shape detector
    contours = cv2.findContours(eroded.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = contours[0] if imutils.is_cv2() else contours[1]

    # Compute the area for a single part to use when setting the threshold and calculating the number of parts within
    # a contour area.
    part_area = part_area(contours)
    # The threshold for a part's area - can't be too much smaller than the part itself.
    threshold = part_area * 0.5

    part_count = 0
    for contour in contours:
        if cv2.contourArea(contour) < threshold:
            continue

        # Sometimes parts are close enough together that they become one in the image. To battle this, the total area
        # of the contour is divided by the area of a part (derived earlier).
        part_count += int((cv2.contourArea(contour) / part_area) + 0.1)  # this 0.1 "rounds up" slightly and was determined empirically

        # Draw an approximate contour around each detected part to give the user an idea of what the tool has computed.
        epsilon = 0.1 * cv2.arcLength(contour, True)
        approx = cv2.approxPolyDP(contour, epsilon, True)
        cv2.drawContours(image, [approx], -1, (0, 255, 0), 2)

    # Print the part count and show off the processed image.
    print 'Part Count:', part_count
    cv2.imshow("Image", image)
    cv2.waitKey(0)
Here's an example of the type of input image I am using:
or this:
And I'm currently getting results like this:
The results clearly show that the script is having trouble identifying some parts, and its true Achilles' heel seems to be when parts touch one another.
So my question/challenge is, what can I do to improve the reliability of this script?
The script is to be integrated into an existing Python tool so I am searching for a solution using Python. The solution does not need to be pure Python as I am willing to install whatever 3rd party libraries might be needed.
If the objects are all of similar types, you might have more success isolating a single example in the image and then using feature matching to detect them.
A full solution would be out of scope for Stack Overflow, but my suggestion for progress would be to first somehow find one or more "correct" examples using your current rectangle retrieval method. You could probably look for all your samples that are of the expected size, or that are accurate rectangles.
Once you have isolated a few positive examples, use some feature matching techniques to find the others. There is a lot of reading up you probably need to do on it but that is a potential solution.
A general summary is that you use your positive examples to find "features" of the object you want to detect. These "features" are generally things like corners or changes in gradient. OpenCV contains many methods you can use.
Once you have the features, there are several algorithms in OpenCV you can look at that will search the image for all matching features. You’ll want one that is rotation invariant (can detect the same features arranged in different rotation), but you probably don’t need scale invariance (can detect the same features at multiple scales).
My one concern with this method is that the items you are searching for in your images are quite small. It might be difficult to find good, consistent features to match on.
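A minimal sketch of that idea, assuming ORB features (rotation invariant) and a brute-force matcher; the file names and the ratio threshold are illustrative only:
import cv2

template = cv2.imread('single_part.png', 0)   # one isolated, known-good part
scene = cv2.imread('parts_on_paper.png', 0)   # the full picture of the parts

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(template, None)
kp2, des2 = orb.detectAndCompute(scene, None)

matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
matches = matcher.knnMatch(des1, des2, k=2)
# Lowe-style ratio test to keep only distinctive matches
good = [m[0] for m in matches if len(m) == 2 and m[0].distance < 0.75 * m[1].distance]

result = cv2.drawMatches(template, kp1, scene, kp2, good, None, flags=2)
cv2.imshow('matches', result)
cv2.waitKey(0)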
You're tackling a 2D object recognition problem, for which there are many possible approaches. You've gone about it using background/foreground segmentation, which is ok as you have control on the scene (laying down the background paper sheet). However this will always have fundamental limitations when the objects touch. A simple solution to your problem can be this:
1) You assume that touching objects are rare events (which is a fine assumption in your problem). Therefore you can compute the areas for each segmented region, and compute the median of these, which will give a robust estimate for the object's area. Let's call this robust estimate A (in squared pixels). This will be fine if fewer than 50% of regions correspond to touching objects.
2) You then proceed to measure the number of objects in each segmented region. Let Ai be the area of the i-th region. You then compute the number of objects in each region as Ni = round(Ai / A), and sum the Ni to give you the total number of objects (a minimal sketch of this is given after the conditions below).
This approach will be fine as long as the following conditions are met:
A) The touching objects do not significantly overlap
B) You do not have objects lying on their sides. If you do you might be able to deal with this using two area estimates (side and flat). Better to eliminate this scenario if you can for simplicity.
C) The objects are all roughly the same distance to the camera. If this is not the case then the areas of the objects (in pixels) cannot be modelled well by a single value.
D) There are not partially visible objects at the borders of the image.
E) You ensure that only the same type of object is visible in each image.
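A minimal sketch of steps 1) and 2), assuming contours comes from the segmentation step you already have:
import numpy as np
import cv2

areas = np.array([cv2.contourArea(c) for c in contours])
areas = areas[areas > 0]                 # drop degenerate contours
A = np.median(areas)                     # robust single-object area estimate
counts = np.rint(areas / A).astype(int)  # Ni = round(Ai / A)
total_objects = int(np.sum(counts))
print('estimated object count:', total_objects)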

Lines connecting corners over hand drawn image in python with cv2

I am looking to detect lines connecting corners over a hand-drawn image like this. I am using Harris corner detection to find the corners of the image. Next I connect all of the corners with lines and iterate through the points to see if they match the pixels from the original image, setting a threshold on how much of each line must be covered to decide whether it is a correct line connecting corners. Image of connected lines. It works... but it is very slow. Is there a better way to do this, or a different method I should use? (Hough lines will not work because of the possibility of curved lines; I only want the lines connecting corners.)
for i in c_corners:  # corners thru Harris and corrected with subpix
    x1, y1 = i.ravel()
    for k in c_corners:
        x2, y2 = k.ravel()
        if x1 != x2 and y1 != y2:  # ignore vertical lines
            linePoints = line_points(x1, y1, x2, y2)  # function to get line pnts
            totalLinePoints = len(linePoints)
            coverPoints = 0
            ########## This is where I think the slow down is happening and could be optimized
            for m in originalImage:  # image is dilated to help detection
                for n in linePoints:
                    match = np.all(m == n)
                    if match == True:
                        coverPoints += 1
            print("Line Cover = ", (coverPoints/totalLinePoints))
            if (coverPoints/totalLinePoints) > .65:
                good_lines.append([x1, y1, x2, y2])
Any help at all is appreciated, thank you!
My original approach was to create a blank image and draw each line on it, and then use cv2.bitwise_and() with the binary (dilated) image to count how many pixels were in agreement; if they met a threshold, I drew those lines over the original image. However, setting a threshold on the number of pixels penalizes short lines. A better indicator would be the ratio of the number of correct matches to incorrect matches (I realize now that's what you were actually doing). Furthermore, this is a little more robust towards dilation and the line thickness you choose to draw your lines with.
However, the general method you're using is not very robust to issues in the drawing where, like this one, synthetic lines may fit easily onto lines they don't belong to, because many drawn curves may hit a line segment. You can see this issue in the output of my code:
I simply hardcoded some corner estimates and went from there. Note the use of itertools to help create all possible pairs of points to define line segments.
import cv2
import numpy as np
import itertools
img = cv2.imread('drawing.png')
bin_inv = cv2.bitwise_not(img) # flip image colors
bin_inv = cv2.cvtColor(bin_inv, cv2.COLOR_BGR2GRAY) # make one channel
bin_inv = cv2.dilate(bin_inv, np.ones((5,5)))
corners = ((517, 170),
(438, 316),
(574, 315),
(444, 436),
(586, 436))
lines = itertools.combinations(corners,2) # create all possible lines
line_img = np.ones_like(img)*255 # white image to draw line markings on
for line in lines: # loop through each line
    bin_line = np.zeros_like(bin_inv)  # create a matrix to draw the line in
    start, end = line  # grab endpoints
    cv2.line(bin_line, start, end, color=255, thickness=5)  # draw line
    conj = (bin_inv/255 + bin_line/255)  # create agreement image
    n_agree = np.sum(conj == 2)
    n_wrong = np.sum(conj == 1)
    if n_agree/n_wrong > .05:  # high agreements vs disagreements
        cv2.line(line_img, start, end, color=[0,200,0], thickness=5)  # draw onto original img
# combine the identified lines with the image
marked_img = cv2.addWeighted(img, .5, line_img, .5, 1)
cv2.imwrite('marked.png', marked_img)
I tried a lot of different settings (playing with thickness, dilation, different ratios, etc.) and couldn't keep that spurious longer line from appearing. It fits the original black pixels super well though, so I'm not sure how you would be able to get rid of it with this method. It's got the curve from the top-right line going for it, as well as the middle line it crosses, and the curve at the bottom right which trends in that direction for a bit. Regardless, this only takes two seconds to run, so at least it's faster than your current code.
