Orienting an Object in an image horizontally - python

I have images of food trays oriented at various angles. I would like to make all the trays horizontally oriented. For this I tried finding the longest edge of the tray using the Hough transform, calculated its orientation with respect to the image border, and rotated the image accordingly. It works for only a few cases. I would like to make it work for all the images I have. Can anyone please help me with this? I have attached some sample images in the link below and also included the code I am currently using.
Link for images
import math
import cv2
import numpy as np

def Enquiry(lis1):
    return np.array(lis1)

img = cv2.imread('path/to/image')
canny = cv2.Canny(img, 100, 200)
minLineLength = 200
maxLineGap = 10
# Pass the optional parameters by keyword; positionally they would be
# interpreted as the unused `lines` output argument.
lines = cv2.HoughLinesP(canny, 1, np.pi / 180, 100,
                        minLineLength=minLineLength, maxLineGap=maxLineGap)
if Enquiry(lines).size >= 4:
    lines1 = lines[:, 0, :]
    # Find the longest detected line segment
    max_length = 0
    index = 0
    i = 0
    for x1, y1, x2, y2 in lines1:
        length = (x1-x2)*(x1-x2) + (y1-y2)*(y1-y2)
        if length > max_length:
            max_length = length
            index = i
        i += 1
    # Angle of the longest line with respect to the image border
    [x1, y1, x2, y2] = lines1[index]
    degree = math.atan2(abs(y1-y2), abs(x1-x2))  # atan2 avoids division by zero for vertical lines
    angle = degree*180/np.pi
    # Rotate the image about its center by that angle
    H, W = img.shape[:2]
    rotation_matrix = cv2.getRotationMatrix2D((W/2, H/2), -angle, 1)
    img_rotation = cv2.warpAffine(img, rotation_matrix, (W, H))
    cv2.imwrite('rotated_image.jpg', img_rotation)

Only rotation will not help. In your test images, the tray has some shear and skew too.
What I would suggest is to find the corners from the intersection of the lines, at least 3 corners. Then find the affine transformation between those corners and the expected actual corners of the tray.
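As a minimal sketch of that idea, assuming three tray corners have already been found (the coordinates below are made up for illustration) and reusing img from the question's code, cv2.getAffineTransform maps the three detected corners onto the three expected corners of an upright tray:
import cv2
import numpy as np

# Hypothetical values: three detected corners (top-left, top-right, bottom-left)
# and the expected size of the upright tray.
tray_w, tray_h = 800, 500
src_corners = np.float32([[112, 87], [905, 143], [74, 581]])   # detected corners (made up)
dst_corners = np.float32([[0, 0], [tray_w, 0], [0, tray_h]])   # expected corners

M = cv2.getAffineTransform(src_corners, dst_corners)
straightened = cv2.warpAffine(img, M, (tray_w, tray_h))
cv2.imwrite('tray_straightened.jpg', straightened)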


How to find where a pixel maps to in cv2.resize?

Given the type of interpolation used when resizing an image with cv2.resize, how can I find out exactly where a particular pixel maps to? For example, if I'm enlarging an image using linear interpolation and I take the pixel at coordinates (785, 251), how could I find out exactly which coordinates that pixel maps to in the resized version, regardless of whether the aspect ratio changes between the source and resized image? I've looked over the internet for a solution, but all the solutions seem to be indirect methods of finding out where a pixel maps that don't actually work for different aspect ratios:
https://answers.opencv.org/question/209827/resize-and-remap/
After resizing an image with cv2, how to get the new bounding box coordinate
Is there perhaps a way through cv2 to access how pixels are mapped and, by reversing that mapping, find out the new coordinates?
The reason I would like this is that I want to be able to create bounding boxes that give me back the same information regardless of the change in aspect ratio of a given image. No method I've used so far gives me back the same information. I figure that if I can work out where the top-left and bottom-right x, y pixel coordinates map to, I can recreate an accurate bounding box regardless of aspect ratio changes.
Scaling the coordinates works when the center coordinate is (0, 0).
You may compute x_scaled and y_scaled as follows:
Subtract x_original_center and y_original_center from x_original and y_original.
After subtraction, (0, 0) is the "new center".
Scale the "zero centered" coordinates by scale_x and scale_y.
Convert the "scaled zero centered" coordinates to "top left (0, 0)" by adding x_scaled_center and y_scaled_center.
Computing the center accurately:
The Python convention is:
(0, 0) is the top left, and (cols-1, rows-1) is the bottom right coordinate.
The accurate center coordinate is:
x_original_center = (original_cols-1)/2
y_original_center = (original_rows-1)/2
Python code (assume img is the original image):
rows, cols = img.shape[0:2]
resized_img = cv2.resize(img, [int(cols*scale_x), int(rows*scale_y)])
resized_rows, resized_cols = resized_img.shape[0:2]
x_original_center = (cols-1) / 2
y_original_center = (rows-1) / 2
x_scaled_center = (resized_cols-1) / 2
y_scaled_center = (resized_rows-1) / 2
# Subtract the center, scale, and add the "scaled center".
x_scaled = (x_original - x_original_center)*scale_x + x_scaled_center
y_scaled = (y_original - y_original_center)*scale_y + y_scaled_center
Testing
The following code sample draws crosses at few original and scaled coordinates:
import cv2
def draw_cross(im, x, y, use_color=False):
    """ Draw a cross with center (x,y) - cross is two rows and two columns """
    x = int(round(x - 0.5))
    y = int(round(y - 0.5))
    if use_color:
        im[y-4:y+6, x] = [0, 0, 255]
        im[y-4:y+6, x+1] = [255, 0, 0]
        im[y, x-4:x+6] = [0, 0, 255]
        im[y+1, x-4:x+6] = [255, 0, 0]
    else:
        im[y-4:y+6, x] = 0
        im[y-4:y+6, x+1] = 255
        im[y, x-4:x+6] = 0
        im[y+1, x-4:x+6] = 255
img = cv2.imread('graf.png') # http://man.hubwiz.com/docset/OpenCV.docset/Contents/Resources/Documents/db/d70/tutorial_akaze_matching.html
rows, cols = img.shape[0:2] # cols = 320, rows = 256
# 3 points for testing:
x0_original, y0_original = cols//2-0.5, rows//2-0.5 # 159.5, 127.5
x1_original, y1_original = cols//5-0.5, rows//4-0.5 # 63.5, 63.5
x2_original, y2_original = (cols//5)*3+20-0.5, (rows//4)*3+30-0.5 # 211.5, 221.5
draw_cross(img, x0_original, y0_original) # Center of cross (159.5, 127.5)
draw_cross(img, x1_original, y1_original)
draw_cross(img, x2_original, y2_original)
scale_x = 2.5
scale_y = 2
resized_img = cv2.resize(img, [int(cols*scale_x), int(rows*scale_y)], interpolation=cv2.INTER_NEAREST)
resized_rows, resized_cols = resized_img.shape[0:2] # cols = 800, rows = 512
# Compute center column and center row
x_original_center = (cols-1) / 2 # 159.5
y_original_center = (rows-1) / 2 # 127.5
# Compute center of resized image
x_scaled_center = (resized_cols-1) / 2 # 399.5
y_scaled_center = (resized_rows-1) / 2 # 255.5
# Compute the destination coordinates after resize
x0_scaled = (x0_original - x_original_center)*scale_x + x_scaled_center # 399.5
y0_scaled = (y0_original - y_original_center)*scale_y + y_scaled_center # 255.5
x1_scaled = (x1_original - x_original_center)*scale_x + x_scaled_center # 159.5
y1_scaled = (y1_original - y_original_center)*scale_y + y_scaled_center # 127.5
x2_scaled = (x2_original - x_original_center)*scale_x + x_scaled_center # 529.5
y2_scaled = (y2_original - y_original_center)*scale_y + y_scaled_center # 443.5
# Draw crosses on resized image
draw_cross(resized_img, x0_scaled, y0_scaled, True)
draw_cross(resized_img, x1_scaled, y1_scaled, True)
draw_cross(resized_img, x2_scaled, y2_scaled, True)
cv2.imshow('img', img)
cv2.imshow('resized_img', resized_img)
cv2.waitKey()
cv2.destroyAllWindows()
Original image:
Resized image:
Making sure the crosses are aligned:
Note:
In my answer I was using the naming conventions of Miki's comment.
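For the bounding-box use case from the question, the same mapping can be applied to the two corner points. A minimal sketch, reusing the centers and scale factors computed in the code above (the box coordinates are made up):
x_min, y_min, x_max, y_max = 100, 50, 300, 200  # hypothetical box in the original image

def map_point(x, y):
    """ Map a point from original-image coordinates to resized-image coordinates """
    return ((x - x_original_center)*scale_x + x_scaled_center,
            (y - y_original_center)*scale_y + y_scaled_center)

x_min_s, y_min_s = map_point(x_min, y_min)
x_max_s, y_max_s = map_point(x_max, y_max)
print(x_min_s, y_min_s, x_max_s, y_max_s)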

Is there a way to slice an image using either numpy or opencv such that the sliced image has at least one instance of the objects of interest?

Essentially, my original image has N instances of a certain object. I have the bounding box coordinates and the class for all of them in a text file. This is basically a dataset for YOLOv3 and darknet. I want to generate additional images by slicing the original one in a way such that each slice contains at least 1 instance of one of those objects, and if it does, save the image together with the new bounding box coordinates of the objects in that image.
The following is the code for slicing the image:
x1 = random.randint(0, 1200)
width = random.randint(0, 800)
y1 = random.randint(0, 1200)
height = random.randint(30, 800)
slice_img = img[x1:x1+width, y1:y1+height]
plt.imshow(slice_img)
plt.show()
My next step is to use template matching to find if my sliced image is in the original one:
w, h = slice_img.shape[:-1]
res = cv2.matchTemplate(img, slice_img, cv2.TM_CCOEFF_NORMED)
threshold = 0.6
loc = np.where(res >= threshold)
for pt in zip(*loc[::-1]):  # Switch columns and rows
    cv2.rectangle(img, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 5)
cv2.imwrite('result.png', img)
At this stage, I am quite lost and not sure how to proceed any further.
Ultimately, I need many new images with corresponding text files containing the class and coordinates. Any advice would be appreciated. Thank you.
P.S I cannot share my images with you, unfortunately.
Template matching is way overkill for this. Template matching essentially slides a kernel image over your main image and compares pixels of each, performing many many computations. There's no need to search the image because you already know where the objects are within the image. Essentially, you are trying to determine whether one rectangle (bounding box for an object) overlaps sufficiently with the slice, and you know the exact coordinates of each rectangle. Thus, it's a geometry problem rather than a computer vision problem.
(As an aside: the correct term for what you are calling a slice would probably be crop; slicing generally means taking an N-dimensional array (say 3 x 4 x 5) and selecting a single index along one dimension to get an (N-1)-dimensional subset of the data, e.g. taking index 0 along dimension 0 to get a 4 x 5 array.)
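In NumPy terms, a minimal illustration of the difference:
import numpy as np

arr = np.zeros((3, 4, 5))
slice_of_arr = arr[0]               # "slice": index one dimension -> shape (4, 5)
img = np.zeros((480, 640, 3))
crop_of_img = img[50:200, 100:300]  # "crop": keep all dimensions -> shape (150, 200, 3)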
Here's a brief example of how you might do this. Let x1 x2 y1 y2 be the min and max x and y coordinates for the crop you generate. Let ox1 ox2 oy1 oy2 be the min and max x and y coordinates for an object:
import random
import cv2

# Assumes: `image` is the original image and `all_objects` is the list of
# (ox1, ox2, oy1, oy2) bounding boxes read from the label file.
NO_SUCCESSFUL_CROPS = True
while NO_SUCCESSFUL_CROPS:
    # Generate crop
    x1 = random.randint(0, 1200)
    width = random.randint(0, 800)
    y1 = random.randint(0, 1200)
    height = random.randint(30, 800)
    x2 = x1 + width
    y2 = y1 + height
    # For each bounding box, check if at least (nominally) 70% of the object is within the crop
    threshold = 0.7
    for bbox in all_objects:
        # assign bbox to ox1 ox2 oy1 oy2
        ox1, ox2, oy1, oy2 = bbox
        # compute percentage of bbox that is within crop
        minx = max(ox1, x1)
        miny = max(oy1, y1)
        maxx = min(ox2, x2)
        maxy = min(oy2, y2)
        # clamp to zero so non-overlapping boxes don't produce a spurious positive area
        area_in_crop = max(0, maxx - minx) * max(0, maxy - miny)
        area_of_bbox = (ox2 - ox1) * (oy2 - oy1)
        ratio = area_in_crop / area_of_bbox
        if ratio > threshold:
            # stop searching
            NO_SUCCESSFUL_CROPS = False
            # crop image as above (row index first: y is row, x is column)
            crop_image = image[y1:y2, x1:x2]
            cv2.imwrite("output_file.png", crop_image)
            # shift bbox coords since (x1,y1) is the new (0,0) pixel in crop_image
            ox1 -= x1
            ox2 -= x1
            oy1 -= y1
            oy2 -= y1
            break  # no need to continue (although you could alternately continue until you have N crops, or even make sure you get one crop with each object)
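Since the question also asks for the accompanying text files, here is a minimal sketch of writing one shifted box in darknet/YOLO format (class id plus center x, center y, width, height, normalized to the crop size); class_id and the file name are placeholders:
def write_yolo_label(path, class_id, ox1, oy1, ox2, oy2, crop_w, crop_h):
    """ Write one object line in darknet format: class x_center y_center width height (all normalized) """
    xc = (ox1 + ox2) / 2 / crop_w
    yc = (oy1 + oy2) / 2 / crop_h
    bw = (ox2 - ox1) / crop_w
    bh = (oy2 - oy1) / crop_h
    with open(path, "w") as f:
        f.write(f"{class_id} {xc:.6f} {yc:.6f} {bw:.6f} {bh:.6f}\n")

# e.g. right after the crop above:
# write_yolo_label("output_file.txt", 0, ox1, oy1, ox2, oy2, x2 - x1, y2 - y1)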

How to track a pixel location after rotating an image?

I'm trying to randomly rotate some images with annotations, and I'm trying to understand how to get the new point location after rotation. My images have different shapes.
Here is an example of what I'm trying to do: this function calculates the new position according to this answer (here):
def new_pixel(x,y,theta,X,Y):
    sin = math.sin(theta)
    cos = math.cos(theta)
    x_new = (x-X/2)*cos + (y-X/2)*sin + X/2
    y_new = -(x-X/2)*sin + (y-Y/2)*cos + Y/2
    return int(x_new),int(y_new)
The code to open the original image:
img = cv2.imread('D://ubun/1.jpg')
print(img.shape)
X, Y, c = img.shape
p1 = (124,291)
p2 = (168,291)
p3 = (169,391)
p4 = (125,391)
img1 = img.copy()
cv2.circle(img1, p1, 10, color=(255,0,0), thickness=2)
plt.imshow(img1)
the red dot is the point
the code for rotation:
rotated = ndimage.rotate(img, 45)
print(rotated.shape)
p11 = new_pixel(p1[0],p1[1],45,X,Y)
p22 = new_pixel(p2[0],p2[1],45,X,Y)
p33 = new_pixel(p3[0],p3[1],45,X,Y)
p44 = new_pixel(p4[0],p4[1],45,X,Y)
cv2.circle(rotated, p11, 10, color=(255,0,0), thickness=2)
plt.imshow(rotated)
The image after rotation, where you can see the point is not in the correct position:
I noticed that the image shape is different after rotation; does this affect the calculations?
This reply is really late, but I faced the same issue. There are basically two issues with your code:
math.sin / math.cos take radians as input, not degrees, so convert the angle to radians to get the right result.
X, Y: you are taking the height and width of the initial, unrotated image. This would be right if the resulting image were the same size as the initial image, but since ndimage.rotate resizes the output to fit the complete image after rotation, you need to use the new width and height. However, the new dimensions should only be used for the offset that is added back after the transformation, so the formula would be:
x_new = (x-old_Width/2)*cos + (y-old_height/2)*sin + new_width/2
y_new = -(x-old_Width/2)*sin + (y-old_height/2)*cos + new_height/2
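A minimal sketch of the corrected function with those two fixes, assuming the new width and height are taken from the rotated image's shape:
import math

def new_pixel(x, y, theta_deg, old_w, old_h, new_w, new_h):
    theta = math.radians(theta_deg)  # math.sin/math.cos expect radians
    sin = math.sin(theta)
    cos = math.cos(theta)
    x_new = (x - old_w/2)*cos + (y - old_h/2)*sin + new_w/2
    y_new = -(x - old_w/2)*sin + (y - old_h/2)*cos + new_h/2
    return int(x_new), int(y_new)

# e.g. with the question's variables (img.shape is (rows, cols), i.e. (height, width)):
# rotated = ndimage.rotate(img, 45)
# p11 = new_pixel(p1[0], p1[1], 45, img.shape[1], img.shape[0], rotated.shape[1], rotated.shape[0])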
I know this post is late, but I hope it helps anyone who faces the same issue.

Extracting data from tables without any grid lines and border from scanned image of document

Extracting table data from digital PDFs has been simple using camelot and tabula. However, the solution doesn't work with scanned images of document pages, specifically when the table doesn't have borders and inner grid lines. I have been trying to generate vertical and horizontal lines using OpenCV. However, since the scanned images will have slight rotation angles, it is difficult to proceed with that approach.
How can we utilize OpenCV to generate grids (horizontal and vertical lines) and borders for a scanned document page which contains table data (along with paragraphs of text)? If this is feasible, how can the rotation angle of the scanned image be nullified?
I wrote some code to estimate the horizontal lines from the printed letters on the page. The same could be done for vertical ones, I guess. The code below follows some general assumptions; here are the basic steps in pseudo-code style:
prepare picture for contour detection
do contour detection
we assume most contours are letters
calc mean width of all contours
calc mean area of contours
filter all contours with two conditions:
a) contour (letter) height < meanHeight * 2
b) contour area > 4/5 meanArea
calc center points of all remaining contours
assume we have line regions (bins)
list all center points which are inside the region
do linear regression of region points
save slope and intercept
calc mean slope and intercept
Here is the full code:
import cv2
import numpy as np
from scipy import stats
def resizeImageByPercentage(img, scalePercent=60):
    width = int(img.shape[1] * scalePercent / 100)
    height = int(img.shape[0] * scalePercent / 100)
    dim = (width, height)
    # resize image
    return cv2.resize(img, dim, interpolation=cv2.INTER_AREA)

def calcAverageContourWithAndHeigh(contourList):
    hs = list()
    ws = list()
    for cnt in contourList:
        (x, y, w, h) = cv2.boundingRect(cnt)
        ws.append(w)
        hs.append(h)
    return np.mean(ws), np.mean(hs)

def calcAverageContourArea(contourList):
    areaList = list()
    for cnt in contourList:
        rect = cv2.minAreaRect(cnt)
        # rect is ((cx, cy), (w, h), angle); the area of the min rect is w * h
        areaList.append(rect[1][0] * rect[1][1])
    return np.mean(areaList)

def calcCentroid(contour):
    houghMoments = cv2.moments(contour)
    # calculate x,y coordinate of centroid
    if houghMoments["m00"] != 0:  # case no centroid could be calculated
        cX = int(houghMoments["m10"] / houghMoments["m00"])
        cY = int(houghMoments["m01"] / houghMoments["m00"])
    else:
        # set values as what you need in the situation
        cX, cY = -1, -1
    return cX, cY

def getCentroidWhenSizeInRange(contourList, letterSizeWidth, letterSizeHigh, deltaOffset, minLetterArea=10.0):
    centroidList = list()
    for cnt in contourList:
        (x, y, w, h) = cv2.boundingRect(cnt)
        rect = cv2.minAreaRect(cnt)
        area = rect[1][0] * rect[1][1]
        # calc diff
        diffW = abs(w - letterSizeWidth)
        diffH = abs(h - letterSizeHigh)
        # threshold A: almost smaller than mean letter size +- offset
        # when almost letterSize
        if diffW < deltaOffset and diffH < deltaOffset:
            # threshold B: > min area
            if area > minLetterArea:
                cX, cY = calcCentroid(cnt)
                if cX != -1 and cY != -1:
                    centroidList.append((cX, cY))
    return centroidList
DEBUGMODE = True
# read image, do git clone https://github.com/WZBSocialScienceCenter/pdftabextract.git for the example
img = cv2.imread('pdftabextract/examples/catalogue_30s/data/ALA1934_RR-excerpt.pdf-2_1.png')
# get some basic infos
imgHeigh, imgWidth, imgChannelAmount = img.shape
if DEBUGMODE:
    cv2.imwrite("img00original.jpg", resizeImageByPercentage(img, 30))
    cv2.imshow("original", img)
# prepare img
imgGrey = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# apply Gaussian filter
imgGaussianBlur = cv2.GaussianBlur(imgGrey, (5, 5), 0)
# make binary img, black or white
_, imgBinThres = cv2.threshold(imgGaussianBlur, 130, 255, cv2.THRESH_BINARY)
## detect contours
contours, _ = cv2.findContours(imgBinThres, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# we get some letter parameters
averageLetterWidth, averageLetterHigh = calcAverageContourWithAndHeigh(contours)
threshold1AllowedLetterSizeOffset = averageLetterHigh * 2  # double size
averageContourAreaSizeOfMinRect = calcAverageContourArea(contours)
threshHold2MinArea = 4 * averageContourAreaSizeOfMinRect / 5  # 4/5 * mean
print("mean letter Width: ", averageLetterWidth)
print("mean letter High: ", averageLetterHigh)
print("threshold 1 tolerance: ", threshold1AllowedLetterSizeOffset)
print("mean letter area ", averageContourAreaSizeOfMinRect)
print("threshold 2 min letter area ", threshHold2MinArea)
# we get all centroids of letter-sized contours, the others we ignore
centroidList = getCentroidWhenSizeInRange(contours, averageLetterWidth, averageLetterHigh, threshold1AllowedLetterSizeOffset, threshHold2MinArea)
if DEBUGMODE:
    # debug print all centers:
    imgFilteredCenter = img.copy()
    for cX, cY in centroidList:
        # draw in red color as BGR
        cv2.circle(imgFilteredCenter, (cX, cY), 5, (0, 0, 255), -1)
    cv2.imwrite("img01letterCenters.jpg", resizeImageByPercentage(imgFilteredCenter, 30))
    cv2.imshow("letterCenters", imgFilteredCenter)
# we estimate a bin width
amountPixelFreeSpace = averageLetterHigh  # TODO get better estimate out of histogram
estimatedBinWidth = round(averageLetterHigh + amountPixelFreeSpace)  # TODO round better?
binCollection = dict()  # range(0, imgHeigh, estimatedBinWidth)
# we separate the center points into bins by y coordinate
for i in range(0, imgHeigh, estimatedBinWidth):
    listCenterPointsInBin = list()
    yMin = i
    yMax = i + estimatedBinWidth
    for cX, cY in centroidList:
        if yMin < cY < yMax:  # if it fits in the bin
            listCenterPointsInBin.append((cX, cY))
    binCollection[i] = listCenterPointsInBin
# we assume all points in a bin lie on one line
# model = slope (x) + intercept
# model = m (x) + n
mList = list()  # slope abs in img
nList = list()  # intercept abs in img
nListRelative = list()  # intercept relative to bin start
minAmountRegressionElements = 12  # is also alias for letter amount we expect
# we do regression for the points of every bin
for startYOfBin, values in binCollection.items():
    # we reform values
    xValues = []  # TODO use a shorter transform
    yValues = []
    for x, y in values:
        xValues.append(x)
        yValues.append(y)
    # we assume a min limit of points in the bin
    if len(xValues) >= minAmountRegressionElements:
        slope, intercept, r, p, std_err = stats.linregress(xValues, yValues)
        mList.append(slope)
        nList.append(intercept)
        # we calc the relative intercept
        nRelativeToBinStart = intercept - startYOfBin
        nListRelative.append(nRelativeToBinStart)
if DEBUGMODE:
    # we debug print all lines in one picture
    imgLines = img.copy()
    colorOfLine = (0, 255, 0)  # green
    for i in range(0, len(mList)):
        slope = mList[i]
        intercept = nList[i]
        startPoint = (0, int(intercept))  # better round?
        endPointY = int(slope * imgWidth + intercept)
        if endPointY < 0:
            endPointY = 0
        endPoint = (imgWidth, endPointY)  # line evaluated at x = imgWidth
        cv2.line(imgLines, startPoint, endPoint, colorOfLine, 2)
    cv2.imwrite("img02lines.jpg", resizeImageByPercentage(imgLines, 30))
    cv2.imshow("linesOfLetters", imgLines)
# we assume that on average we got it right
meanIntercept = np.mean(nListRelative)
meanSlope = np.mean(mList)
print("meanIntercept :", meanIntercept)
print("meanSlope ", meanSlope)
# TODO calc angle with math.atan(slope) ...
if DEBUGMODE:
    cv2.waitKey(0)
original:
center point of letters:
lines:
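Following the TODO at the end of the code, a minimal sketch of turning the mean slope into a deskew rotation (my addition, not part of the code above; the sign of the angle may need flipping depending on the skew direction):
import math

skewAngleDeg = math.degrees(math.atan(meanSlope))  # angle of the text lines vs. the horizontal border
rotationMatrix = cv2.getRotationMatrix2D((imgWidth/2, imgHeigh/2), skewAngleDeg, 1)
imgDeskewed = cv2.warpAffine(img, rotationMatrix, (imgWidth, imgHeigh))
cv2.imwrite("img03deskewed.jpg", resizeImageByPercentage(imgDeskewed, 30))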
I had the same problem some time ago, and this tutorial is the solution to that. It explains how to use pdftabextract, a Python library by Markus Konrad that leverages OpenCV's Hough transform to detect the lines, and it works even if the scanned document is a bit tilted. The tutorial walks you through parsing a 1920s German newspaper.

Find a rotated bounding box of known size around a rectangle with noisy sides

I'm trying to find a rotated bounding box around a less-than-perfect binarized image of a rectangle. The imperfections are always different: sometimes it's hollow, sometimes there's stuff inside, sometimes one of the edges is missing a chunk, sometimes there's an extra chunk somewhere on the edge, and they're always slightly rotated by a random amount, but the size and shape of the expected bounding box is always nearly the same absolute value in pixels.
Here's some samples of what I have as inputs (resized to fit better in the post):
And ideally I'd like to find a bounding box around the outside of the white rectangle (although I'm mostly just interested in the edges) like this:
(found by inverting one of the hollow ones, getting the largest connected component, and getting a rotatedrect of forced size)
So far I've tried just getting a rotatedrect and forcing a shape afterwards, which works for almost every case except when there's an extra chunk along one of the edges. I've tried getting connected components to isolate parts of it and get bounding boxes around those, which works for every case as long as they're hollow. I've tried dilating and eroding the image, getting contours and Hough lines to try to find only the four corner points, but I've had no luck with that either. I've also looked online for anything useful, to no avail.
Any help or ideas would be greatly appreciated.
My solution comprises two parts:
Find (upright) bounding box of the big white rectangle by finding the biggest connected component, fill all holes in it, find outside vertical and horizontal lines (Hough), get the bounding box by taking the min/max x/y coordinates.
Match a (filled) rectangle of given size with center at center of bounding box from step 1 at different angles, print the best match as result.
Following is a simple program demonstrating this approach. The arguments at the beginning (filename, size of the known rectangle, angle search range) would normally be passed in from the command line.
import cv2
import numpy as np
# arguments
file = '1.png'
w0, h0 = 425, 630 # size of known rectangle
ang_range = 1 # possible range (+/-) of angle in degrees
# read image
img = cv2.imread(file, cv2.IMREAD_GRAYSCALE)
h, w = img.shape
# find biggest connected component
nb_components, output, stats, _ = cv2.connectedComponentsWithStats(img, connectivity=4)
sizes = stats[:, -1]
max_label, max_size = 1, sizes[1]
for i in range(2, nb_components):
    if sizes[i] > max_size:
        max_label = i
        max_size = sizes[i]
img2 = np.zeros(img.shape, np.uint8)
img2[output == max_label] = 128
# fill holes
contours, _ = cv2.findContours(img2, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
for contour in contours:
    cv2.drawContours(img2, [contour], 0, 128, -1)
# find lines
edges = cv2.Canny(img2, 50, 150, apertureSize = 3)
lines = cv2.HoughLinesP(edges, 1, np.pi/180, 40)
# find bounding lines
xmax = ymax = 0
xmin, ymin = w-1, h-1
for i in range(lines.shape[0]):
    x1 = lines[i][0][0]
    y1 = lines[i][0][1]
    x2 = lines[i][0][2]
    y2 = lines[i][0][3]
    cv2.line(img2, (x1,y1), (x2,y2), 255, 2, cv2.LINE_AA)
    if abs(x1-x2) < abs(y1-y2):
        # vertical line
        xmin = min(xmin, x1, x2)
        xmax = max(xmax, x1, x2)
    else:
        # horizontal line
        ymin = min(ymin, y1, y2)
        ymax = max(ymax, y1, y2)
cv2.rectangle(img2, (xmin,ymin), (xmax,ymax), 255, 1, cv2.LINE_AA)
cv2.imwrite(file.replace('.png', '_intermediate.png'), img2)
# rectangle of known size centered at bounding box
xc = (xmax + xmin) / 2
yc = (ymax + ymin) / 2
box = np.zeros(img.shape, np.uint8)
box[int(yc-h0/2):int(yc+h0/2), int(xc-w0/2):int(xc+w0/2)] = 255
# find best match of this rectangle at different angles
smax = angmax = 0
for ang in np.linspace(-ang_range, ang_range, 20):
    rm = cv2.getRotationMatrix2D((xc, yc), ang, 1)
    rotbox = cv2.warpAffine(box, rm, (w, h))
    s = cv2.countNonZero(cv2.bitwise_and(rotbox, img))
    if s > smax:
        smax = s
        angmax = ang

# output and visualize result
def draw_rotated_rect(img, size, center, angle, color, thickness):
    rm = cv2.getRotationMatrix2D(center, angle, 1)
    p0 = np.dot(rm, (xc-w0/2, yc-h0/2, 1))
    p1 = np.dot(rm, (xc-w0/2, yc+h0/2, 1))
    p2 = np.dot(rm, (xc+w0/2, yc+h0/2, 1))
    p3 = np.dot(rm, (xc+w0/2, yc-h0/2, 1))
    pnts = np.int32(np.vstack([p0, p1, p2, p3]) + 0.5).reshape(-1, 4, 2)
    cv2.polylines(img, pnts, True, color, thickness, cv2.LINE_AA)
    print(f'{file}: edges {pnts[0].tolist()}, angle = {angle:.2f}°')
res = cv2.cvtColor(img, cv2.COLOR_GRAY2BGR)
draw_rotated_rect(res, (w0,h0), (xc,yc), angmax, (0,255,0), 2)
cv2.imwrite(file.replace('.png', '_result.png'), res)
Intermediate results to show how it works (gray = filled biggest connected component, thick white lines = Hough lines, thin white rectangle = upright bounding box):
Visualization of results (green = rotated rectangle of known size):
Results (should eventually be clamped to [0,image size), -1 is due to floating point rotation):
1.png: edges [[17, -1], [17, 629], [442, 629], [442, -1]], angle = 0.00°
2.png: edges [[7, 18], [9, 648], [434, 646], [432, 16]], angle = 0.26°
3.png: edges [[38, 25], [36, 655], [461, 657], [463, 27]], angle = -0.26°
4.png: edges [[36, 14], [28, 644], [453, 650], [461, 20]], angle = -0.79°
As we see in image 3, the match is not perfect. This could be because the example images were shrunk to somewhat differing sizes, and of course I didn't know the size of the known rectangle, so I just assumed an appropriate value for the demonstration.
If this occurs with real data too, you may want to not only vary the angle to find the best match, but also shift the matching box a couple of pixels up/down and right/left. See for instance section 8.1 of Dawson-Howe: A Practical Introduction to Computer Vision with OpenCV for further details.
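As a minimal sketch of that extension, the angle-only search above could be replaced by a search over angle plus a small translation of the matching box (the shift range is a placeholder value):
shift_range = 3  # pixels to try in each direction (placeholder)
smax, angmax, dxmax, dymax = 0, 0, 0, 0
for ang in np.linspace(-ang_range, ang_range, 20):
    for dx in range(-shift_range, shift_range + 1):
        for dy in range(-shift_range, shift_range + 1):
            rm = cv2.getRotationMatrix2D((xc, yc), ang, 1)
            rm[0, 2] += dx   # append the shift to the affine matrix
            rm[1, 2] += dy
            rotbox = cv2.warpAffine(box, rm, (w, h))
            s = cv2.countNonZero(cv2.bitwise_and(rotbox, img))
            if s > smax:
                smax, angmax, dxmax, dymax = s, ang, dx, dy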
