The output I get is just the reference image; no bounding box is drawn on it.
I have tried this code from this website: https://www.sicara.fr/blog-technique/object-detection-template-matching
Here's the reference image
Reference Image
Here are the templates:
Template 1:
Template 2:
Template 3:
According to the website, the output of the code should look like this:
Expected Output:
I am expecting to get this output as discussed on the website; however, when I run the code, nothing is detected. Here is the code that I copied:
import cv2
import numpy as np
DEFAULT_TEMPLATE_MATCHING_THRESHOLD = 0.9
class Template:
    """
    A class defining a template.
    """
    def __init__(self, image_path, label, color, matching_threshold=DEFAULT_TEMPLATE_MATCHING_THRESHOLD):
        """
        Args:
            image_path (str): path of the template image
            label (str): the label corresponding to the template
            color (List[int]): the color associated with the label (to plot detections)
            matching_threshold (float): the minimum similarity score to consider an object is detected by template matching
        """
        self.image_path = image_path
        self.label = label
        self.color = color
        self.template = cv2.imread(image_path)
        self.template_height, self.template_width = self.template.shape[:2]
        self.matching_threshold = matching_threshold
image = cv2.imread("reference.jpg")
templates = [
    Template(image_path="Component1.jpg", label="1", color=(0, 0, 255), matching_threshold=0.99),
    Template(image_path="Component2.jpg", label="2", color=(0, 255, 0), matching_threshold=0.91),
    Template(image_path="Component3.jpg", label="3", color=(0, 191, 255), matching_threshold=0.99),
]
detections = []
for template in templates:
    template_matching = cv2.matchTemplate(template.template, image, cv2.TM_CCORR_NORMED)
    match_locations = np.where(template_matching >= template.matching_threshold)
    for (x, y) in zip(match_locations[1], match_locations[0]):
        match = {
            "TOP_LEFT_X": x,
            "TOP_LEFT_Y": y,
            "BOTTOM_RIGHT_X": x + template.template_width,
            "BOTTOM_RIGHT_Y": y + template.template_height,
            "MATCH_VALUE": template_matching[y, x],
            "LABEL": template.label,
            "COLOR": template.color,
        }
        detections.append(match)
def compute_iou(boxA, boxB):
    xA = max(boxA["TOP_LEFT_X"], boxB["TOP_LEFT_X"])
    yA = max(boxA["TOP_LEFT_Y"], boxB["TOP_LEFT_Y"])
    xB = min(boxA["BOTTOM_RIGHT_X"], boxB["BOTTOM_RIGHT_X"])
    yB = min(boxA["BOTTOM_RIGHT_Y"], boxB["BOTTOM_RIGHT_Y"])
    interArea = max(0, xB - xA + 1) * max(0, yB - yA + 1)
    boxAArea = (boxA["BOTTOM_RIGHT_X"] - boxA["TOP_LEFT_X"] + 1) * (boxA["BOTTOM_RIGHT_Y"] - boxA["TOP_LEFT_Y"] + 1)
    boxBArea = (boxB["BOTTOM_RIGHT_X"] - boxB["TOP_LEFT_X"] + 1) * (boxB["BOTTOM_RIGHT_Y"] - boxB["TOP_LEFT_Y"] + 1)
    iou = interArea / float(boxAArea + boxBArea - interArea)
    return iou
def non_max_suppression(objects, non_max_suppression_threshold=0.5, score_key="MATCH_VALUE"):
    """
    Filter objects overlapping with IoU over threshold by keeping only the one with maximum score.
    Args:
        objects (List[dict]): a list of object dictionaries, with:
            {score_key} (float): the object score
            {top_left_x} (float): the top-left x-axis coordinate of the object bounding box
            {top_left_y} (float): the top-left y-axis coordinate of the object bounding box
            {bottom_right_x} (float): the bottom-right x-axis coordinate of the object bounding box
            {bottom_right_y} (float): the bottom-right y-axis coordinate of the object bounding box
        non_max_suppression_threshold (float): the minimum IoU value used to filter overlapping boxes when
            conducting non-max suppression.
        score_key (str): score key in object dicts
    Returns:
        List[dict]: the filtered list of dictionaries.
    """
    sorted_objects = sorted(objects, key=lambda obj: obj[score_key], reverse=True)
    filtered_objects = []
    for object_ in sorted_objects:
        overlap_found = False
        for filtered_object in filtered_objects:
            iou = compute_iou(object_, filtered_object)
            if iou > non_max_suppression_threshold:
                overlap_found = True
                break
        if not overlap_found:
            filtered_objects.append(object_)
    return filtered_objects
NMS_THRESHOLD = 0.2
detections = non_max_suppression(detections, non_max_suppression_threshold=NMS_THRESHOLD)
image_with_detections = image.copy()
for detection in detections:
    cv2.rectangle(
        image_with_detections,
        (detection["TOP_LEFT_X"], detection["TOP_LEFT_Y"]),
        (detection["BOTTOM_RIGHT_X"], detection["BOTTOM_RIGHT_Y"]),
        detection["COLOR"],
        2,
    )
    cv2.putText(
        image_with_detections,
        f"{detection['LABEL']} - {detection['MATCH_VALUE']}",
        (detection["TOP_LEFT_X"] + 2, detection["TOP_LEFT_Y"] + 20),
        cv2.FONT_HERSHEY_SIMPLEX,
        0.5,
        detection["COLOR"],
        1,
        cv2.LINE_AA,
    )
cv2.imshow("res", image_with_detections)
cv2.waitKey(0)
This is how his final output looks:
Here's my attempt at detecting the larger components; the code was able to detect them, and here is the result:
Result
Here are the resized templates and the original components that I wanted to detect but unfortunately couldn't:
1st 2nd 3rd
Here is a method for finding multiple matches in template matching in Python/OpenCV, using your reference and smallest template. I have removed all the white padding you had around your template. My method simply draws a black rectangle over the correlation image where it matches and then repeats the search for the next best match in the modified correlation image.
I have used cv2.TM_CCORR_NORMED and a match threshold of 0.90. You have 4 of these templates showing in your reference image, so I set my search number to 4 and a spacing of 10 for the non-maximum suppression by masking. You have other small items of the same shape and size, but the text on them is different, so you will need a different template for each.
Reference:
Template:
import cv2
import numpy as np
# read image
img = cv2.imread('circuit_board.jpg')
# read template
tmplt = cv2.imread('circuit_item.png')
hh, ww, cc = tmplt.shape
# set arguments
match_thresh = 0.90 # stopping threshold for match value
num_matches = 4 # stopping threshold for number of matches
match_radius = 10 # approx radius of match peaks
match_radius2 = match_radius//2
# get correlation surface from template matching
corrimg = cv2.matchTemplate(img,tmplt,cv2.TM_CCORR_NORMED)
hc, wc = corrimg.shape
# get locations of all peaks higher than match_thresh for up to num_matches
imgcopy = img.copy()
corrcopy = corrimg.copy()
for i in range(0, num_matches):
    # get max value and location of max
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(corrcopy)
    x1 = max_loc[0]
    y1 = max_loc[1]
    x2 = x1 + ww
    y2 = y1 + hh
    loc = str(x1) + "," + str(y1)
    if max_val > match_thresh:
        print("match number:", i+1, "match value:", max_val, "match x,y:", loc)
        # draw white bounding box to define match location
        cv2.rectangle(imgcopy, (x1,y1), (x2,y2), (255,255,255), 1)
        # insert black rectangle over copy of corr image so that match is not found again
        corrcopy[y1-match_radius2:y1+match_radius2, x1-match_radius2:x1+match_radius2] = 0
    else:
        break
# save results
# power of 4 exaggeration of correlation image to emphasize peaks
cv2.imwrite('circuit_board_multi_template_corr.png', (255*cv2.pow(corrimg,4)).clip(0,255).astype(np.uint8))
cv2.imwrite('circuit_board_multi_template_corr_masked.png', (255*cv2.pow(corrcopy,4)).clip(0,255).astype(np.uint8))
cv2.imwrite('circuit_board_multi_template_match.png', imgcopy)
# show results
# power of 4 exaggeration of correlation image to emphasize peaks
cv2.imshow('image', img)
cv2.imshow('template', tmplt)
cv2.imshow('corr', cv2.pow(corrimg,4))
cv2.imshow('corr masked', cv2.pow(corrcopy,4))
cv2.imshow('result', imgcopy)
cv2.waitKey(0)
cv2.destroyAllWindows()
Original Correlation Image:
Modified Correlation Image after 4 matches:
Matches Marked on Input as White Rectangles:
Match Locations:
match number: 1 match value: 0.9982172250747681 match x,y: 128,68
match number: 2 match value: 0.9762057065963745 match x,y: 128,90
match number: 3 match value: 0.9755787253379822 match x,y: 128,48
match number: 4 match value: 0.963689923286438 match x,y: 127,107
I am trying to create my own face-filtering augmented reality program in OpenCV. The idea is that it will map a beaver onto the user's face. Currently, I am unable to get proper transparency when loading this image with 'cv2.imread(...)'. The background looks black, and the image often shows partially in certain areas that are white. When I open this image in Photoshop, I am fully capable of moving it on top of a background with the expected transparency. I am wondering if the alpha is not getting rendered properly. Here is the relevant code where I load the image.
import numpy
import cv2
def augment_stream(face: numpy.array, augment: numpy.array) -> numpy.array:
    face_h, face_w, _ = face.shape
    augment_h, augment_w, _ = augment.shape
    scalar = min(face_h / augment_h, face_w / augment_w)
    delta_augment_h = int(scalar * augment_h)
    delta_augment_w = int(scalar * augment_w)
    delta_augment_shape = (delta_augment_w, delta_augment_h)
    resized_augment = cv2.resize(augment, delta_augment_shape)
    augmented_face = face.copy()
    dark_pixels = (resized_augment < 250).all(axis=2)
    offset_x = int((face_w - delta_augment_w) / 2)
    offset_y = int((face_h - delta_augment_h) / 2)
    augmented_face[offset_y: offset_y+delta_augment_h, offset_x: offset_x+delta_augment_w][dark_pixels] = resized_augment[dark_pixels]
    return augmented_face
def main():
    stream = cv2.VideoCapture(0)
    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    augment = cv2.imread('assets/normal.png')
    # tmp = cv2.cvtColor(augment, cv2.COLOR_BGR2GRAY)
    # _, alpha = cv2.threshold(tmp, 0, 255, cv2.THRESH_BINARY)
    # b, g, r = cv2.split(augment)
    # rgba = [b, g, r, alpha]
    # dst = cv2.merge(rgba, 4)
    # cv2.imwrite("assets/normal.png", dst)
    while True:
        ret, border = stream.read()
        border_h, border_w, _ = border.shape
        bw = cv2.equalizeHist(cv2.cvtColor(border, cv2.COLOR_BGR2GRAY))
        rects = cascade.detectMultiScale(bw, scaleFactor=1.3, minNeighbors=4, minSize=(30, 30), flags=cv2.CASCADE_SCALE_IMAGE)
        for x, y, w, h in rects:
            y0 = int(y - 0.25*h)
            y1 = int(y + 0.75*h)
            x0 = x
            x1 = x + w
            if x0 < 0 or x1 > border_w or y0 < 0 or y1 > border_h:
                continue
            border[y0: y1, x0: x1] = augment_stream(border[y0: y1, x0: x1], augment)
        cv2.imshow('border', border)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    stream.release()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
Using the example from the question overlay a smaller image on a larger image python OpenCv,
I reduced it to show only how to put one image on top of the other:
use cv2.IMREAD_UNCHANGED to load it as RGBA,
split it into RGB and A,
use A to create masks for the image and the border,
use a loop to blend the channels.
import cv2
stream = cv2.VideoCapture(0)
# load RGBA
augment = cv2.imread('image.png', cv2.IMREAD_UNCHANGED) # load RGBA
# make it smaller then frame - only for test
W = 320
H = 240
augment = cv2.resize(augment, (W, H))
# split image and alpha
image = augment[:,:,0:3]
alpha = augment[:,:,3]
mask_image = alpha / 255.0
mask_border = 1.0 - mask_image
# ROI - region of interest
x1 = 200
y1 = 100
x2 = x1 + W
y2 = y1 + H
while True:
    ret, border = stream.read()
    # copy only in some region (ROI) (don't assign to variable) but gives worse result
    #cv2.copyTo(image, alpha, border[y1:y2, x1:x2])
    for c in range(0, 3):  # channels RGB
        border[y1:y2, x1:x2, c] = (image[:, :, c]*mask_image + border[y1:y2, x1:x2, c]*mask_border)
    cv2.imshow('border', border)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
stream.release()
cv2.destroyAllWindows()
BTW: I tried to use cv2.copyTo(image, mask_image, border) but it gives a worse result - maybe the mask/alpha needs 3 channels.
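Building on that remark, here is a minimal sketch of using cv2.copyTo with a binarized 8-bit mask instead of the float blending mask above. This is my assumption about the fix, not part of the original answer; 'frame.png' is a hypothetical background image standing in for a captured frame.
import cv2
# load the same RGBA overlay as above and split color from alpha
augment = cv2.imread('image.png', cv2.IMREAD_UNCHANGED)
image = augment[:, :, 0:3]
alpha = augment[:, :, 3]
# cv2.copyTo expects an 8-bit mask, so binarize the alpha channel first
_, bin_mask = cv2.threshold(alpha, 0, 255, cv2.THRESH_BINARY)
frame = cv2.imread('frame.png')  # hypothetical background frame, assumed larger than the overlay
h, w = image.shape[:2]
roi = frame[100:100+h, 200:200+w]  # a numpy slice is a view, so copyTo writes into frame
cv2.copyTo(image, bin_mask, roi)
cv2.imshow('frame', frame)
cv2.waitKey(0)
cv2.destroyAllWindows()
Note that a hard copy like this drops the semi-transparent edge pixels that the per-channel float blend preserves, which may explain the worse result.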
It seems it can be done in C/C++ - how to insert a small size image on to a big image
I'm trying to create a program that knows what number is on an image with the following function:
def img_in_img(big_picture, small_picture, tamper):
    big_picture = str(big_picture)
    small_picture = str(small_picture)
    if os.path.isfile(big_picture) == False or os.path.isfile(small_picture) == False:
        return "Image does not exist"
    img = cv2.imread(big_picture, 0)
    templace_loc = small_picture
    template = cv2.imread(templace_loc, 0)
    w, h = template.shape[::-1]
    method = cv2.TM_CCOEFF
    tamper = int(tamper)
    res = cv2.matchTemplate(img, template, method)
    min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(res)
    top_left = max_loc
    bottom_right = (top_left[0] + w, top_left[1] + h)
    height, width, channels = cv2.imread(templace_loc).shape
    if int(top_left[0]) < int(height + tamper) and int(top_left[0]) > int(height - tamper) and int(top_left[1]) < int(width + tamper) and int(top_left[1]) > int(width - tamper):
        return True
    else:
        return False
But then when I check if 7.png is in img.png with the code
nur = "7"
if img_in_img("2020-01-14-17-36-08.537043/verification_image2.png", "verifynr/" + "old_orange" + "/" + nur + ".png", 25):
print(Style.BRIGHT + Fore.GREEN + "Color: " + "old_orange" + ", Num: " + nur + Fore.RESET)
else:
print(Style.BRIGHT + Fore.RED + "Color: " + "old_orange" + ", Num: " + nur + Fore.RESET)
it gives me in RED: Color: old_orange, Num: 7,
but if I check whether 6.png is in img.png by changing nur from 7 to 6, it gives me in GREEN: Color: old_orange, Num: 6 - but that's the wrong image.
I've also tried the following code:
img_rgb = cv2.imread("2020-01-14-17-36-08.537043/verification_image2.png")
img_gray = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
template = cv2.imread('verifynr/old_orange/7.png',0)
w, h = template.shape[::-1]
res = cv2.matchTemplate(img_gray,template,cv2.TM_SQDIFF)
threshold = 0.8
loc = np.where( res >= threshold)
pt = list(zip(*loc[::-1]))
if len(pt) >= 1:
    print("True")
which prints True, but it does that for every number png I saved.
How do I make my program recognize the 7.png in the img.png without recognizing every single number png?
img.png:
6.png:
7.png:
Template matching is not for object detection or pattern recognition; its function is to find the position of the most similar patch. For detection, use detectors (Haar, DNN-based ones, ...); for recognition, use classifiers (descriptor-based or NN-based).
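To illustrate the detector route, here is a minimal sketch (my addition, not part of the answer above) using one of the Haar cascades bundled with OpenCV; 'scene.png' is a hypothetical input file:
import cv2
# detect faces with a bundled Haar cascade and draw the returned boxes
img = cv2.imread('scene.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cascade = cv2.CascadeClassifier(cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')
for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=4):
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imshow('detections', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
For digits like these, the matching recognition step would be a classifier trained on the digit crops (e.g. a small CNN or a descriptor-based classifier).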
Here is an example of using a template mask to do template matching at one resolution level using Python/OpenCV. The black region of the mask tells the template matching to ignore those regions when matching, so that when the template has a background color different from that in the larger image, it does not degrade the match score. Only the color regions of the template corresponding to the white in the mask contribute to the matching score. Note that masked template matching in OpenCV only works for methods TM_SQDIFF and TM_CCORR_NORMED.
Input:
Template:
Template Mask:
import cv2
import numpy as np
# read image
img = cv2.imread('logo.png')
# read template
tmplt = cv2.imread('hat.png')
hh, ww, cc = tmplt.shape
# read template mask as grayscale
tmplt_mask = cv2.imread('hat_mask.png', cv2.IMREAD_GRAYSCALE)
# do template matching
corrimg = cv2.matchTemplate(img,tmplt,cv2.TM_CCORR_NORMED, mask=tmplt_mask)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(corrimg)
max_val_ncc = '{:.3f}'.format(max_val)
print("correlation match score: " + max_val_ncc)
xx = max_loc[0]
yy = max_loc[1]
print('xmatch =',xx,'ymatch =',yy)
# draw red bounding box to define match location
result = img.copy()
pt1 = (xx,yy)
pt2 = (xx+ww, yy+hh)
cv2.rectangle(result, pt1, pt2, (0,0,255), 1)
cv2.imshow('image', img)
cv2.imshow('template', tmplt)
cv2.imshow('template_mask', tmplt_mask)
cv2.imshow('result', result)
cv2.waitKey(0)
cv2.destroyAllWindows()
# save results
cv2.imwrite('logo_hat_match.png', result)
Result:
Textual Information Listed:
correlation match score: 1.000
xmatch = 417 ymatch = 44
(Apologies for borrowing and modifying the ImageMagick logo. I already had that template matching example)
These are some potential causes:
Scale and rotation differences: If the template image and the image you are searching in have different scales or rotations, cv2.matchTemplate() may not find a match. You can try rescaling or rotating the template image to match the image you are searching in (see the multi-scale sketch after this list).
Lighting and contrast differences: If the lighting or contrast of the template image and the image you are searching in are different, it can affect the accuracy of the match. You can try adjusting the brightness and contrast of the images to make them more similar.
Template size: If the template is too small or too big compared to the area you are searching in, it can affect the accuracy of the match. You should use a template size that is similar to the expected size of the object in the image.
Similarity threshold: The threshold you apply to the matchTemplate() result controls the level of similarity required for a match. If the threshold is set too high, you may not find any matches even when similar patches exist in the image; if it is set too low, you may get false positives.
Similarity measurement algorithm: matchTemplate() uses a specific algorithm for measuring the similarity between the template and the search image. Depending on the image and the template, one algorithm might work better than another.
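Here is a minimal multi-scale sketch of the first point; this is my own illustration rather than code from the question, and 'scene.png' and 'template.png' are hypothetical file names:
import cv2
import numpy as np
img = cv2.imread('scene.png', cv2.IMREAD_GRAYSCALE)
tmplt = cv2.imread('template.png', cv2.IMREAD_GRAYSCALE)
# rescale the template over a range of scales and keep the best normalized score
best = (-1.0, None, None)  # (score, top-left location, scale)
for scale in np.linspace(0.5, 1.5, 11):
    resized = cv2.resize(tmplt, None, fx=scale, fy=scale)
    # the template must fit inside the search image
    if resized.shape[0] > img.shape[0] or resized.shape[1] > img.shape[1]:
        continue
    res = cv2.matchTemplate(img, resized, cv2.TM_CCOEFF_NORMED)
    _, max_val, _, max_loc = cv2.minMaxLoc(res)
    if max_val > best[0]:
        best = (max_val, max_loc, scale)
score, loc, scale = best
if loc is not None:
    print(f"best score {score:.3f} at {loc} (template scale {scale:.2f})")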
I have this image of a table (seen below), and I'm trying to get the data from the table, similar to this form (first row of the table image):
rows[0] = [x,x, , , , ,x, ,x,x, ,x, ,x, , , , ,x, , , ,x,x,x, ,x, ,x, , , , ]
I need the number of x's as well as the number of spaces.
There will also be other table images that are similar to this one (all having x's and the same number of columns).
So far, I am able to detect all of the x's using an image of an x, and I can somewhat detect the lines. I'm using OpenCV for Python, with a Hough transform to detect the horizontal and vertical lines (that works really well).
I'm trying to figure out how I can go row by row and store the information in a list.
These are the training images:
used to detect x (train1.png in the code)
used to detect lines (train2.png in the code)
used to detect lines (train3.png in the code)
This is the code I have so far:
# process images
from pytesser import *
from PIL import Image
from matplotlib import pyplot as plt
import pytesseract
import numpy as np
import cv2
import math
import os
# the table images
images = ['table1.png', 'table2.png', 'table3.png', 'table4.png', 'table5.png']
# the template images used for training
templates = ['train1.png', 'train2.png', 'train3.png']
def hough_transform(im):
    img = cv2.imread('imgs/'+im)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)
    lines = cv2.HoughLines(edges, 1, np.pi/180, 200)
    i = 1
    for rho, theta in lines[0]:
        a = np.cos(theta)
        b = np.sin(theta)
        x0 = a*rho
        y0 = b*rho
        x1 = int(x0 + 1000*(-b))
        y1 = int(y0 + 1000*(a))
        x2 = int(x0 - 1000*(-b))
        y2 = int(y0 - 1000*(a))
        #print '%s - 0:(%s,%s) 1:(%s,%s), 2:(%s,%s)' % (i,x0,y0,x1,y1,x2,y2)
        cv2.line(img, (x1,y1), (x2,y2), (0,0,255), 2)
        i += 1
    fn = os.path.splitext(im)[0]+'-lines'
    cv2.imwrite('imgs/'+fn+'.png', img)
def match_exes(im, te):
    img_rgb = cv2.imread('imgs/'+im)
    img_gry = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    template = cv2.imread('imgs/'+te, 0)
    w, h = template.shape[::-1]
    res = cv2.matchTemplate(img_gry, template, cv2.TM_CCOEFF_NORMED)
    threshold = 0.71
    loc = np.where(res >= threshold)
    pts = []
    exes = []
    blanks = []
    for pt in zip(*loc[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w, pt[1]+h), (0,0,255), 1)
    fn = os.path.splitext(im)[0]+'-exes'
    cv2.imwrite('imgs/'+fn+'.png', img_rgb)
    return pts, exes, blanks
def match_horizontal_lines(im, te, te2):
    img_rgb = cv2.imread('imgs/'+im)
    img_gry = cv2.cvtColor(img_rgb, cv2.COLOR_BGR2GRAY)
    template = cv2.imread('imgs/'+te, 0)
    w1, h1 = template.shape[::-1]
    template2 = cv2.imread('imgs/'+te2, 0)
    w2, h2 = template2.shape[::-1]
    # first line template (the downward facing line)
    res1 = cv2.matchTemplate(img_gry, template, cv2.TM_CCOEFF_NORMED)
    threshold1 = 0.8
    loc1 = np.where(res1 >= threshold1)
    # second line template (the upward facing line)
    res2 = cv2.matchTemplate(img_gry, template2, cv2.TM_CCOEFF_NORMED)
    threshold2 = 0.8
    loc2 = np.where(res2 >= threshold2)
    pts = []
    exes = []
    blanks = []
    # find first line template (the downward facing line)
    for pt in zip(*loc1[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w1, pt[1]+h1), (0,0,255), 1)
    # find second line template (the upward facing line)
    for pt in zip(*loc2[::-1]):
        pts.append(pt)
        cv2.rectangle(img_rgb, pt, (pt[0]+w2, pt[1]+h2), (0,0,255), 1)
    fn = os.path.splitext(im)[0]+'-horiz'
    cv2.imwrite('imgs/'+fn+'.png', img_rgb)
    return pts, exes, blanks
# process
text = ''
for img in images:
    print 'processing %s' % img
    hough_transform(img)
    pts, exes, blanks = match_exes(img, templates[0])
    pts1, exes1, blanks1 = match_horizontal_lines(img, templates[1], templates[2])
    text += '%s: %s x\'s & %s horizontal lines\n' % (img, len(pts), len(pts1))
# statistics file
outputFile = open('counts.txt', 'w')
outputFile.write(text)
outputFile.close()
And, the output images look like this (as you can see, all x's are detected but not all lines)
x's
horizontal lines
hough transform
As I said, I'm actually just trying to get the data from the table, similar to this form (first row of table image):
row a = [x,x, , , , ,x, ,x,x, ,x, ,x, , , , ,x, , , ,x,x,x, ,x, ,x, , , , ]
I need the number of x's as well as the number of spaces.
There will also be other table images that are similar to this one (all having x's and the same number of columns and a different number of rows).
Also, I am using python 2.7
Ok, I have figured it out. I used the suggestion provided by @beaker of looking between the grid lines.
Before doing that, I had to remove the duplicate lines from the Hough transform code. Then I sorted the remaining lines into two lists, vertical and horizontal. From there, I could loop through the horizontal lines, then the vertical ones, and create a region of interest (roi) image for each cell. Each roi image represents a 'cell' in the master table image. I checked each of those cells for contours and noticed that when there was an 'x' in the cell, len(contours) >= 2, so any cell with len(contours) < 2 was a blank space (I wrote several test programs to figure this out). Here is the code I used to get it working:
import cv2
import numpy as np
import os
# the list of images (tables)
images = ['table1.png', 'table2.png', 'table3.png', 'table4.png', 'table5.png']
# the list of templates (used for template matching)
templates = ['train1.png']
def remove_duplicates(lines):
    # remove duplicate lines (lines within 10 pixels of each other)
    for x1, y1, x2, y2 in lines:
        for index, (x3, y3, x4, y4) in enumerate(lines):
            if y1 == y2 and y3 == y4:
                diff = abs(y1-y3)
            elif x1 == x2 and x3 == x4:
                diff = abs(x1-x3)
            else:
                diff = 0
            if diff < 10 and diff != 0:
                del lines[index]
    return lines
def sort_line_list(lines):
    # sort lines into horizontal and vertical
    vertical = []
    horizontal = []
    for line in lines:
        if line[0] == line[2]:
            vertical.append(line)
        elif line[1] == line[3]:
            horizontal.append(line)
    vertical.sort()
    horizontal.sort(key=lambda x: x[1])
    return horizontal, vertical
def hough_transform_p(image, template, tableCnt):
    # open and process images
    img = cv2.imread('imgs/'+image)
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 50, 150, apertureSize=3)
    # probabilistic hough transform
    lines = cv2.HoughLinesP(edges, 1, np.pi/180, 200, minLineLength=20, maxLineGap=999)[0].tolist()
    # remove duplicates
    lines = remove_duplicates(lines)
    # draw image
    for x1, y1, x2, y2 in lines:
        cv2.line(img, (x1, y1), (x2, y2), (0, 0, 255), 1)
    # sort lines into vertical & horizontal lists
    horizontal, vertical = sort_line_list(lines)
    # go through each horizontal line (aka row)
    rows = []
    for i, h in enumerate(horizontal):
        if i < len(horizontal)-1:
            row = []
            for j, v in enumerate(vertical):
                if i < len(horizontal)-1 and j < len(vertical)-1:
                    # every cell before the last cell
                    # get width & height
                    width = horizontal[i+1][1] - h[1]
                    height = vertical[j+1][0] - v[0]
                else:
                    # last cell, width = cell start to end of image
                    # get width & height
                    width = tW
                    height = tH
                tW = width
                tH = height
                # get roi (region of interest) to find an x
                roi = img[h[1]:h[1]+width, v[0]:v[0]+height]
                # save image (for testing)
                dir = 'imgs/table%s' % (tableCnt+1)
                if not os.path.exists(dir):
                    os.makedirs(dir)
                fn = '%s/roi_r%s-c%s.png' % (dir, i, j)
                cv2.imwrite(fn, roi)
                # if roi contains an x, add x to array, else add _
                roi_gry = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)
                ret, thresh = cv2.threshold(roi_gry, 127, 255, 0)
                contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
                if len(contours) > 1:
                    # there is an x for 2 or more contours
                    row.append('x')
                else:
                    # there is no x when len(contours) is <= 1
                    row.append('_')
            row.pop()
            rows.append(row)
    # save image (for testing)
    fn = os.path.splitext(image)[0] + '-hough_p.png'
    cv2.imwrite('imgs/'+fn, img)
def process():
    for i, img in enumerate(images):
        # perform probabilistic hough transform on each image
        hough_transform_p(img, templates[0], i)

if __name__ == '__main__':
    process()
So, the sample image:
And, the output (code to generate text file was deleted for brevity):
As you can see, the text file contains the same number of x's in the same position as the image. Now that the hard part is over, I can continue on with my assignment!
I'm trying to split an image into several sub-images with OpenCV by identifying templates of the original image and then copying the regions where I matched those templates. I'm a TOTAL newbie to OpenCV! I've identified the sub-images using:
result = cv2.matchTemplate(img, template, cv2.TM_CCORR_NORMED)
After some cleanup I get a list of tuples called points through which I iterate to show the rectangles. tw and th are the template width and height, respectively.
for pt in points:
    re = cv2.rectangle(img, pt, (pt[0] + tw, pt[1] + th), 0, 2)
    print('%s, %s' % (str(pt[0]), str(pt[1])))
    count += 1
What I would like to accomplish is to save the octagons (https://dl.dropbox.com/u/239592/region01.png) into separate files.
How can I do this? I've read something about contours but I'm not sure how to use it. Ideally I would like to contour the octagon.
Thanks a lot for your help!
If template matching is working for you, stick to it. For instance, I considered the following template:
Then, we can pre-process the input in order to make it binary and discard small components. After this step, the template matching is performed. Then it is a matter of filtering the matches by discarding close ones (I've used a dummy method for that, so if there are too many matches you could see it taking some time). After we decide which points are far apart (and thus identify different hexagons), we can make minor adjustments to them in the following manner:
Sort by y-coordinate;
If two adjacent items start at a y-coordinate that is too close, then set them both to the same y-coord.
Now you can sort this point list in an appropriate order such that the crops are done in raster order. The cropping part is easily achieved using slicing provided by numpy.
import sys
import cv2
import numpy
outbasename = 'hexagon_%02d.png'
img = cv2.imread(sys.argv[1])
template = cv2.cvtColor(cv2.imread(sys.argv[2]), cv2.COLOR_BGR2GRAY)
theight, twidth = template.shape[:2]
# Binarize the input based on the saturation and value.
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
saturation = hsv[:,:,1]
value = hsv[:,:,2]
value[saturation > 35] = 255
value = cv2.threshold(value, 0, 255, cv2.THRESH_OTSU)[1]
# Pad the image.
value = cv2.copyMakeBorder(255 - value, 3, 3, 3, 3, cv2.BORDER_CONSTANT, value=0)
# Discard small components.
img_clean = numpy.zeros(value.shape, dtype=numpy.uint8)
contours, _ = cv2.findContours(value, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
for i, c in enumerate(contours):
    area = cv2.contourArea(c)
    if area > 500:
        cv2.drawContours(img_clean, contours, i, 255, 2)
def closest_pt(a, pt):
    if not len(a):
        return (float('inf'), float('inf'))
    d = a - pt
    return a[numpy.argmin((d * d).sum(1))]
match = cv2.matchTemplate(img_clean, template, cv2.TM_CCORR_NORMED)
# Filter matches.
threshold = 0.8
dist_threshold = twidth / 1.5
loc = numpy.where(match > threshold)
ptlist = numpy.zeros((len(loc[0]), 2), dtype=int)
count = 0
print "%d matches" % len(loc[0])
for pt in zip(*loc[::-1]):
    cpt = closest_pt(ptlist[:count], pt)
    dist = ((cpt[0] - pt[0]) ** 2 + (cpt[1] - pt[1]) ** 2) ** 0.5
    if dist > dist_threshold:
        ptlist[count] = pt
        count += 1
# Adjust points (could do for the x coords too).
ptlist = ptlist[:count]
view = ptlist.ravel().view([('x', int), ('y', int)])
view.sort(order=['y', 'x'])
for i in xrange(1, ptlist.shape[0]):
    prev, curr = ptlist[i - 1], ptlist[i]
    if abs(curr[1] - prev[1]) < 5:
        y = min(curr[1], prev[1])
        curr[1], prev[1] = y, y
# Crop in raster order.
view.sort(order=['y', 'x'])
for i, pt in enumerate(ptlist, start=1):
    cv2.imwrite(outbasename % i,
                img[pt[1]-2:pt[1]+theight-2, pt[0]-2:pt[0]+twidth-2])
    print 'Wrote %s' % (outbasename % i)
If you want only the contours of the hexagons, then crop on img_clean instead of img (but then it is pointless to sort the hexagons in raster order).
Here is a representation of the different regions that would be cut for your two examples without modifying the code above:
I am sorry, I didn't understand from your question how you relate matchTemplate and contours.
Anyway, below is a small technique using contours. It is based on the assumption that your other images are also like the one you provided. I am not sure it works with your other images, but I think it would help to get a start. Try this yourself and make the necessary adjustments and modifications.
What I did:
1 - I needed the edges of the octagons, so I thresholded the image using Otsu and applied dilation and erosion (or use any method you like that works well for all your images; beware of the edges at the left edge of the image).
2 - Then found contours (more about contours: http://goo.gl/r0ID0).
3 - For each contour, find its convex hull, its area (A) and its perimeter (P).
4 - For a perfect octagon, P*P/A = 13.25 approximately (see the short derivation after this list). I used it here to cut out and save each octagon.
5 - You can see that cropping also removes some edges of the octagon. If you want to keep them, adjust the cropping dimensions.
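Where the 13.25 in step 4 comes from: for a regular octagon with side length s, P = 8s and A = 2(1+√2)s², so P²/A = 64s² / (2(1+√2)s²) = 32/(1+√2) ≈ 13.25.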
Code :
import cv2
import numpy as np
img = cv2.imread('region01.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
thresh = cv2.dilate(thresh,None,iterations = 2)
thresh = cv2.erode(thresh,None)
contours,hierarchy = cv2.findContours(thresh,cv2.RETR_LIST,cv2.CHAIN_APPROX_SIMPLE)
number = 0
for cnt in contours:
    hull = cv2.convexHull(cnt)
    area = cv2.contourArea(hull)
    P = cv2.arcLength(hull, True)
    if ((area != 0) and (13 <= P**2/area <= 14)):
        #cv2.drawContours(img,[hull],0,255,3)
        x, y, w, h = cv2.boundingRect(hull)
        number = number + 1
        roi = img[y:y+h, x:x+w]
        cv2.imshow(str(number), roi)
        cv2.imwrite("1"+str(number)+".jpg", roi)
cv2.imshow('img',img)
cv2.waitKey(0)
cv2.destroyAllWindows()
Those 6 octagons will be stored as separate files.
Hope it helps !!!