Apply a transform matrix with warpPerspective in OpenCV - Python

So, I want to transform an image but can't really find a proper way to do it using OpenCV.
First, I have an image, let's say 500x600px, containing a distorted thing I want to "straighten up" (see the image):
I'm obtaining the contour of the sudoku like this:
cropped_image, contours, _ = cv2.findContours(cropped_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
max_contour = max(contours, key=cv2.contourArea)
Then I'm taking the max_contour and the image's extreme pixels (top-left, top-right, bottom-right and bottom-left), computing the transform matrix, and transforming the image like this:
x, y = cropped_image.shape
image_extreme_pixels = np.array([[0, y], [x, y], [x, 0], [0, 0]], dtype=np.float32)

c_x, c_y = [], []
for i in contour:
    c_x.append(i[0][0])
    c_y.append(i[0][1])

contour_extreme_pixels = np.array([
    [min(c_x), max(c_y)],
    [max(c_x), max(c_y)],
    [max(c_x), min(c_y)],
    [min(c_x), min(c_y)]],
    dtype=np.float32)

t_matrix = cv2.getPerspectiveTransform(contour_extreme_pixels, image_extreme_pixels)
transformed_image = cv2.warpPerspective(cropped_image, t_matrix, (y, x))
plt.imshow(cropped_image, interpolation='nearest', cmap=plt.cm.gray)
But when I view the image it's transformed in a weird fashion. I wanted to stretch the top part of the sudoku so that its contour is straight.
Could you point out what's wrong with my code?
I'm assuming the problem is the order in which I'm creating the 4 extreme pixels that are then put into getPerspectiveTransform to get the transformation matrix, but I haven't managed to make it work yet.

Assuming that you have found the corner points of the sudoku accurately, you can perspective-transform the input image as:
# Hard coded the points here assuming that you already have 4 corners of sudoku image
sudoku_corner_points = np.float32([[235, 40], [1022, 55], [190, 875], [1090, 880]])
canvas = np.ones((500, 500), dtype=np.uint8)
dst_points = np.float32([[0, 0], [500, 0], [0, 500], [500, 500]])
t_matrix = cv2.getPerspectiveTransform(sudoku_corner_points, dst_points)
transformed_image = cv2.warpPerspective(img, t_matrix, (500, 500))

So it turns out the extreme points I had found were incorrect.
One correct way (of many) to find the 4 extreme points of a shape that we expect to be rectangular looks like this:
def get_contour_extreme_points(img, contour):
    m_point = image_center(img)
    l1, l2, l3, l4 = 0, 0, 0, 0
    p1, p2, p3, p4 = 0, 0, 0, 0
    for point in contour:
        d = distance(m_point, point[0])
        if inside_bottom_right(m_point, point[0]) and l1 < d:
            l1 = d
            p1 = point[0]
            continue
        if inside_bottom_left(m_point, point[0]) and l2 < d:
            l2 = d
            p2 = point[0]
            continue
        if inside_top_right(m_point, point[0]) and l3 < d:
            l3 = d
            p3 = point[0]
            continue
        if inside_top_left(m_point, point[0]) and l4 < d:
            l4 = d
            p4 = point[0]
            continue
    return np.float32([p1, p2, p3, p4])

def inside_bottom_right(center, point):
    return center[0] < point[0] and center[1] < point[1]

def inside_bottom_left(center, point):
    return center[0] > point[0] and center[1] < point[1]

def inside_top_right(center, point):
    return center[0] < point[0] and center[1] > point[1]

def inside_top_left(center, point):
    return center[0] > point[0] and center[1] > point[1]

def distance(p1, p2):
    return math.sqrt(((p1[0] - p2[0]) ** 2) + ((p1[1] - p2[1]) ** 2))

def image_center(img):
    x, y = img.shape
    return tuple([x / 2, y / 2])
Then I have to be careful about the order of the 4 extreme points of the image, which should look like this:
x, y = img.shape
img_extreme_points = np.float32([[x, y], [0, y], [x, 0], [0, 0]])
So the first is the bottom-right extreme point, then bottom-left, top-right and top-left. As long as the extreme point indices correspond correctly, the matrix will be computed correctly as well.
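For reference, a minimal end-to-end sketch of how these pieces fit together; the file name is hypothetical, it assumes the image is already binarized, and it uses the OpenCV 4 findContours signature (two return values) plus the helper functions defined above:
import math
import cv2
import numpy as np

# Assumes get_contour_extreme_points() and its helpers from above are defined.
cropped_image = cv2.imread("sudoku.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input
contours, _ = cv2.findContours(cropped_image, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
max_contour = max(contours, key=cv2.contourArea)

src = get_contour_extreme_points(cropped_image, max_contour)  # BR, BL, TR, TL
x, y = cropped_image.shape
dst = np.float32([[x, y], [0, y], [x, 0], [0, 0]])            # same order as src

t_matrix = cv2.getPerspectiveTransform(src, dst)
warped = cv2.warpPerspective(cropped_image, t_matrix, (y, x))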

How to filter extra points in Gray Code Structured Light?

I'm trying to calculate 3D points with gray code structured light. My setup has one camera and one projector. I project gray code patterns and capture them. However, when I try to calculate the 3D points, I get extra 3D points, as you can see in the image. Here is my code to decode the graycode images and triangulate them.
x_index = stripe.decode(x_captures, x_nega_captures)  # thresh=60
y_index = stripe.decode(y_captures, y_nega_captures)  # thresh=60

# Merge X and Y correspondences
img_correspondence = cv2.merge([0.0 * np.zeros_like(x_index),
                                x_index / CAMERA_RESOLUTION[0],
                                y_index / CAMERA_RESOLUTION[1]])

# Clip correspondences to 8b
img_correspondence = np.clip(img_correspondence * 255, 0, 255)
img_correspondence = img_correspondence.astype(np.uint8)

# Mask correspondences with projectable area to eliminate spurious points
img_correspondence[~is_projectable] = 0

# Visualize correspondences
plt.figure()
plt.imshow(img_correspondence)
plt.axis('off')
plt.title('Correspondence Map')
#plt.show()

# Construct 2D correspondences
# x_p and y_p are the 2D coordinates in projector image space
x_p = np.expand_dims(x_index[is_projectable].flatten(), -1)
y_p = np.expand_dims(y_index[is_projectable].flatten(), -1)
projector_points = np.hstack((x_p, y_p))

# x and y are the 2D coordinates in camera image space
xx, yy = np.meshgrid(np.arange(CAMERA_RESOLUTION[0]), np.arange(CAMERA_RESOLUTION[1]))
x = np.expand_dims(xx[is_projectable].flatten(), -1)
y = np.expand_dims(yy[is_projectable].flatten(), -1)
camera_points = np.hstack((x, y))

camera_points = np.expand_dims(camera_points, 1).astype(np.float32)
projector_points = np.expand_dims(projector_points, 1).astype(np.float32)

camera_norm_1 = cv2.undistortPoints(camera_points, camera_K, camera_d, P=camera_K)
proj_norm_1 = cv2.undistortPoints(projector_points, projector_K, projector_d, P=camera_K)

P0 = np.dot(camera_K, np.array([[1, 0, 0, 0],
                                [0, 1, 0, 0],
                                [0, 0, 1, 0]]))
P1 = np.concatenate((camera_K @ projector_R, camera_K @ projector_t), axis=1)

triangulated_points = cv2.triangulatePoints(P0, P1, camera_norm_1, proj_norm_1).T
points_3d = (triangulated_points[:, :3] / triangulated_points[:, -1:])

Is it possible to find the bending point using opencv Python?

The aim of my program is to find the bending angle of an LED.
I got the angle using convexity defects on the convex hull, but the midpoint moves away from the center point of the bend.
Original image:
Below is the output of the program:
The black dot is the starting point.
The red dot is the end point.
The blue dot is the mid point.
Now I want to move the blue dot to the center of the curve.
my code
import cv2
import numpy as np
from math import sqrt
from collections import OrderedDict

def findangle(x1, y1, x2, y2, x3, y3):
    ria = np.arctan2(y2 - y1, x2 - x1) - np.arctan2(y3 - y1, x3 - x1)
    if ria > 0:
        if ria < 3:
            webangle = int(np.abs(ria * 180 / np.pi))
        elif ria > 3:
            webangle = int(np.abs(ria * 90 / np.pi))
    elif ria < 0:
        if ria < -3:
            webangle = int(np.abs(ria * 90 / np.pi))
        elif ria > -3:
            webangle = int(np.abs(ria * 180 / np.pi))
    return webangle

image = cv2.imread("cam/2022-09-27 10:01:57image.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
_, thresh = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY)
contours, hie = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
selected_contour = max(contours, key=lambda x: cv2.contourArea(x))

# Draw contour
approx = cv2.approxPolyDP(selected_contour, 0.0035 * cv2.arcLength(selected_contour, True), True)
for point in approx:
    cv2.drawContours(image, [point], 0, (0, 0, 255), 3)

convexHull = cv2.convexHull(selected_contour, returnPoints=False)
cv2.drawContours(image, cv2.convexHull(selected_contour), 0, (0, 255, 0), 3)
convexHull[::-1].sort(axis=0)
convexityDefects = cv2.convexityDefects(selected_contour, convexHull)

start2, distance = [], []
for i in range(convexityDefects.shape[0]):
    s, e, f, d = convexityDefects[i, 0]
    start = tuple(selected_contour[s][0])
    end = tuple(selected_contour[e][0])
    far = tuple(selected_contour[f][0])
    start2.append(start)
    cv2.circle(image, start, 2, (255, 0, 0), 3)
    cv2.line(image, start, end, (0, 255, 0), 3)
    distance.append(d)
distance.sort(reverse=True)

for i in range(convexityDefects.shape[0]):
    s, e, f, d = convexityDefects[i, 0]
    if distance[0] == d:
        defect = {"s": s, "e": e, "f": f, "d": d}
        cv2.circle(image, selected_contour[defect.get("f")][0], 2, (255, 0, 0), 3)
        cv2.circle(image, selected_contour[defect.get("s")][0], 2, (0, 0, 0), 3)
        cv2.circle(image, selected_contour[defect.get("e")][0], 2, (0, 0, 255), 3)
        x1, y1 = selected_contour[defect.get("f")][0]
        x2, y2 = selected_contour[defect.get("e")][0]
        x3, y3 = selected_contour[defect.get("s")][0]
        cv2.line(image, (x1, y1), (x2, y2), (255, 200, 0), 2)
        cv2.line(image, (x1, y1), (x3, y3), (255, 200, 0), 2)
        cv2.putText(image, "Web Angle : " + str((findangle(x1, y1, x2, y2, x3, y3))), (50, 200), cv2.FONT_HERSHEY_SCRIPT_SIMPLEX, 1, (0, 0, 0), 2, cv2.LINE_AA)

cv2.imshow("frame", image)
cv2.waitKey(0)
cv2.destroyAllWindows()
So I want some approach to get the exact center of the bend point.
Here is one way to do that in Python/OpenCV. I make no guarantees that it is universal and would work on all such images. I also leave it for others to add trapping for empty arrays/lists and other general best practices.
Read the input
Threshold to binary on white using cv2.inRange()
Apply morphology to close up the gap near the top
Skeletonize the binary image
Get the x and y coordinates of the points of the skeleton
Zip the x and y coordinates
Sort the zipped data by x
Sort another copy of the zipped data by y
Get the first line (end points) from the top for 40% of y from the y sorted data, since that region of the skeleton is nearly straight
Get the first line (end points) from the left for 40% of x from the x sorted data, since that region of the skeleton is nearly straight
Get the intersection point of these two lines
Compute the x and y derivatives of the x coordinates and the y coordinates, respectively
Loop over each point and compute the slope from the derivatives, which will be tangent to the skeleton at the point
Then still in the loop compute the inverse slope of the line from the point to the previously computed intersection point. This will be normal (perpendicular) to this line.
Compute the difference in slopes and find the point where the difference is minimum. This will be the bend point.
Draw relevant lines and points on skeleton and input
Save results
Input:
import cv2
import numpy as np
import skimage.morphology
img = cv2.imread("wire.png")
# create a binary thresholded image
lower = (255,255,255)
upper = (255,255,255)
thresh = cv2.inRange(img, lower, upper)
thresh = (thresh/255).astype(np.float64)
# apply morphology to connect at top
kernel = np.ones((11,11), np.uint8)
thresh = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)
# apply skeletonization
skeleton = skimage.morphology.skeletonize(thresh)
skeleton = (255*skeleton).clip(0,255).astype(np.uint8)
# get skeleton points
pts = np.where(skeleton != 0)
x = pts[1]
y = pts[0]
num_pts = len(x)
print(num_pts)
# zip x and y
xy1 = zip(x,y)
xy2 = zip(x,y)
# sort on y
xy_sorty = sorted(xy1, key = lambda x: x[1])
#print(xy_sorty[0])
# sort on x
xy_sortx = sorted(xy2, key = lambda x: x[0])
#print(xy_sortx[0])
# unzip x and y for xy_sortedy
xu1, yu1 = zip(*xy_sorty)
# get first line from top
# find miny from y sort, then get point 40% down from miny
miny = np.amin(yu1)
y1 = miny
[xy1] = [(xi, yi) for (xi, yi) in xy_sorty if abs(yi - y1) <= 0.00001]
x1 = xy1[0]
y1 = xy1[1]
#print(x1,y1)
maxy = np.amax(yu1)
dely = maxy - miny
y2 = int(y1+0.4*dely)
[xy2] = [(xi, yi) for (xi, yi) in xy_sorty if abs(yi - y2) <= 0.00001]
x2 = xy2[0]
y2 = xy2[1]
#print(x2,y2)
# unzip x and y for xy_sortedx
xu2, yu2 = zip(*xy_sortx)
# get first line from left
# find minx from x sort, then get point 40% right from minx
minx = np.amin(xu2)
x3 = minx
[xy3] = [(xi, yi) for (xi, yi) in xy_sortx if abs(xi - x3) <= 0.00001]
x3 = xy3[0]
y3 = xy3[1]
#print(x3,y3)
maxx = np.amax(xu2)
delx = maxx - minx
x4 = int(x3+0.4*delx)
[xy4] = [(xi, yi) for (xi, yi) in xy_sortx if abs(xi - x4) <= 0.00001]
x4 = xy4[0]
y4 = xy4[1]
#print(x4,y4)
# draw lines on copy of skeleton
skeleton_lines = skeleton.copy()
skeleton_lines = cv2.merge([skeleton_lines,skeleton_lines,skeleton_lines])
cv2.line(skeleton_lines, (x1,y1), (x2,y2), (0,0,255), 2)
cv2.line(skeleton_lines, (x3,y3), (x4,y4), (0,0,255), 2)
# get intersection between line1 (x1,y1 to x2,y2) and line2 (x3,y3 to x4,y4) and draw circle
# https://en.wikipedia.org/wiki/Line–line_intersection
den = (x1-x2)*(y3-y4) - (y1-y2)*(x3-x4)
px = ((x1*y2-y1*x2)*(x3-x4) - (x1-x2)*(x3*y4-y3*x4))/den
py = ((x1*y2-y1*x2)*(y3-y4) - (y1-y2)*(x3*y4-y3*x4))/den
px = int(px)
py = int(py)
cv2.circle(skeleton_lines, (px,py), 3, (0,255,0), -1)
# compute first derivatives in x and also in y
dx = np.gradient(x, axis=0)
dy = np.gradient(y, axis=0)
# loop over each point
# get the slope of the tangent to the curve
# get the inverse slop of the line from the point to the intersection point (inverse slope is normal direction)
# get difference in slopes and find the point that has the minimum difference
min_diff = 1000000
eps = 0.0000000001
for i in range(num_pts):
    slope1 = abs(dy[i]/(dx[i] + eps))
    slope2 = abs((px - x[i])/(py - y[i] + eps))
    slope_diff = abs(slope1 - slope2)
    if slope_diff < min_diff:
        min_diff = slope_diff
        bend_x = x[i]
        bend_y = y[i]
        #print(x[i], y[i], min_diff)
bend_x = int(bend_x)
bend_y = int(bend_y)
#print(bend_x, bend_y)
cv2.line(skeleton_lines, (px,py), (bend_x,bend_y), (0,0,255), 2)
cv2.circle(skeleton_lines, (bend_x,bend_y), 3, (0,255,0), -1)
# get end points and bend point and draw on copy of input
result = img.copy()
end1 = (x1,y1)
end2 = (x3,y3)
bend = (bend_x,bend_y)
print("end1:", end1)
print("end2:", end2)
print("bend:", bend)
cv2.circle(result, (end1), 3, (0,0,255), -1)
cv2.circle(result, (end2), 3, (0,0,255), -1)
cv2.circle(result, (bend), 3, (0,0,255), -1)
# save result
cv2.imwrite("wire_skeleton.png", skeleton)
cv2.imwrite("wire_skeleton_lines.png", skeleton_lines)
cv2.imwrite("wire_result.png", result)
# show results
cv2.imshow("thresh", (255*thresh).astype(np.uint8))
cv2.imshow("skeleton", skeleton)
cv2.imshow("skeleton_lines", skeleton_lines)
cv2.imshow("skeleton_result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()
Skeleton:
Skeleton with lines:
Result showing end points and bend point:

Find bounding box contour with largest surface area excluding intersection areas

I have an array of bounding boxes from the object detection system.
They are in the format:
[[x,y], [x,y], [x,y], [x,y]]
I want to find the largest bounding box that does not intersect any other provided box and is not inside an excluded box.
I am using Python, but a response in any programming language is welcome :)
Visual example
How I tried and failed to solve this problem.
Approach I.
Iterate over every point and find the min and max of x and y.
Then crop to a polygon using these coordinates.
The problem is that, on the example image, this algorithm would remove the top part of the image even though there is no need to, because we 'missed' the top-left and top-right boxes.
Approach II.
Try to crop only one side at a time, because in my dataset the things to exclude are usually on one side, e.g. remove the top 100px.
So I calculated the min and max of x and y as before.
Then I calculated the area of every possible cut - left, right, top, bottom - and chose the one with the smallest area.
This approach fails quickly when there are boxes on two sides of the picture, e.g. left and right.
Consider a full rectangle (initially the whole picture) and take away one excluded box. You will get 2x2x2x2=16 possible rectangular subdivisions, for example this one.
┌────────────────────────┐
│                        │
│                        │
├───────┬───────┬────────┤
│       │ exc   │        │
│       │ lude  │        │
│       ├───────┴────────┤
│       │                │
│       │                │
└───────┴────────────────┘
For each box in the subdivision, take away the next excluded box.
Do this N times, and take the biggest box of the final step. A minimal sketch of one such split step follows below.
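Here is a hedged sketch of that split step; the (left, top, right, bottom) representation and helper name are mine, not the answer's. Splitting keeps the four maximal strips around the excluded box (they may overlap, which is fine since we only want the biggest surviving box):
def split_around(free, excluded):
    """Return the up-to-4 maximal sub-rectangles of `free` not covered by `excluded`."""
    fl, ft, fr, fb = free
    el, et, er, eb = excluded
    # Clip the excluded box to the free rectangle.
    el, et, er, eb = max(el, fl), max(et, ft), min(er, fr), min(eb, fb)
    if el >= er or et >= eb:  # no overlap: the free rectangle survives whole
        return [free]
    parts = [
        (fl, ft, fr, et),  # strip above the excluded box
        (fl, eb, fr, fb),  # strip below
        (fl, ft, el, fb),  # strip to the left
        (er, ft, fr, fb),  # strip to the right
    ]
    return [(l, t, r, b) for (l, t, r, b) in parts if l < r and t < b]

free_rects = [(0, 0, 1000, 800)]  # start with the whole picture
excluded_boxes = [(200, 100, 400, 300), (600, 500, 900, 700)]  # example boxes
for exc in excluded_boxes:
    free_rects = [piece for rect in free_rects for piece in split_around(rect, exc)]
best = max(free_rects, key=lambda r: (r[2] - r[0]) * (r[3] - r[1]))
print(best)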
Here's a potential solution to find the bounding box contour with the largest surface area. We have two requirements:
Largest bounding box is not intersecting with any other box
Largest bounding box is not inside another box
Essentially we can reword the two requirements to this:
Given C1 and C2, determine if C1 and C2 intersect
Given C1 and C2, check if there is a point from C1 in C2
To solve #1, we can create a contour_intersect function that detects intersection with np.logical_and(). The idea is to create two separate masks, one for each contour, and then apply the logical AND operation to them. Any point that has a positive value (1 or True) is a point of intersection. Essentially, if the entire array is False then there was no intersection between the contours; if there is a single True, the contours touch at some point and thus intersect.
For #2, we can create a function contour_inside and use cv2.pointPolygonTest() to determine if a point is inside, outside, or on the edge of a contour. The function returns +1, -1, or 0 to indicate if a point is inside, outside, or on the contour, respectively. We find the centroid of C1 and then check if that point is inside C2.
Here's an example to visualize the scenarios:
Input image with three contours. Nothing special here, the expected answer would be the contour with the largest area.
Answer:
Contour #0 is the largest
Next we add two additional contours. Contour #3 will represent the intersection scenario and contour #4 will represent the inside contour scenario.
Answer:
Contour #0 has failed test
Contour #1 has failed test
Contour #2 is the largest
To solve this problem, we find contours then sort using contour area from largest to smallest. Next, we compare this contour with all other contours and check the two cases. If either case fails, we dump the current contour and move onto the next largest contour. The first contour that passes both tests for all other contours is our largest bounding box contour. Normally, contour #0 would be our largest but it fails the intersection test. We then move onto contour #1 but this fails the inside test. Thus the last remaining contour that passes both tests is contour #2.
import cv2
import numpy as np

# Check if C1 and C2 intersect
def contour_intersect(original_image, contour1, contour2):
    # Two separate contours trying to check intersection on
    contours = [contour1, contour2]

    # Create image filled with zeros the same size as the original image
    blank = np.zeros(original_image.shape[0:2])

    # Copy each contour into its own image and fill it with '1'
    image1 = cv2.drawContours(blank.copy(), contours, 0, 1)
    image2 = cv2.drawContours(blank.copy(), contours, 1, 1)

    # Use the logical AND operation on the two images
    # There will be a '1' or 'True' where the contours intersect
    # and a '0' or 'False' where they don't
    intersection = np.logical_and(image1, image2)

    # Check if there was a '1' in the intersection
    return intersection.any()

# Check if C1 is in C2
def contour_inside(contour1, contour2):
    # Find centroid of C1
    M = cv2.moments(contour1)
    cx = int(M['m10'] / M['m00'])
    cy = int(M['m01'] / M['m00'])

    inside = cv2.pointPolygonTest(contour2, (cx, cy), False)
    if inside == 0 or inside == -1:
        return False
    elif inside == 1:
        return True

# Load image, convert to grayscale, Otsu's threshold
image = cv2.imread('1.png')
original = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]

# Find contours, sort by contour area from largest to smallest
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
sorted_cnts = sorted(cnts, key=lambda x: cv2.contourArea(x), reverse=True)

# "Intersection" and "inside" contours
# Add both contours to test
# --------------------------------
intersect_contour = np.array([[[230, 93]], [[230, 187]], [[326, 187]], [[326, 93]]])
sorted_cnts.append(intersect_contour)
cv2.drawContours(original, [intersect_contour], -1, (36, 255, 12), 3)

inside_contour = np.array([[[380, 32]], [[380, 229]], [[740, 229]], [[740, 32]]])
sorted_cnts.append(inside_contour)
cv2.drawContours(original, [inside_contour], -1, (36, 255, 12), 3)
# --------------------------------

# Find centroid for each contour and label contour number
for count, c in enumerate(sorted_cnts):
    M = cv2.moments(c)
    cx = int(M['m10'] / M['m00'])
    cy = int(M['m01'] / M['m00'])
    cv2.putText(original, str(count), (cx - 5, cy + 5), cv2.FONT_HERSHEY_SIMPLEX, 0.7, (246, 255, 12), 3)

# Find largest bounding box contour
largest_contour_name = ""
largest_contour = ""
contours_length = len(sorted_cnts)
for i1 in range(contours_length):
    found = True
    for i2 in range(i1 + 1, contours_length):
        c1 = sorted_cnts[i1]
        c2 = sorted_cnts[i2]
        # Test intersection and "inside" contour
        if contour_intersect(original, c1, c2) or contour_inside(c1, c2):
            print('Contour #{} has failed test'.format(i1))
            found = False
            continue
    if found:
        largest_contour_name = i1
        largest_contour = sorted_cnts[i1]
        break
print('Contour #{} is the largest'.format(largest_contour_name))
print(largest_contour)

# Display
cv2.imshow('thresh', thresh)
cv2.imshow('image', image)
cv2.imshow('original', original)
cv2.waitKey()
Note: The assumption is that you have an array of contours from cv2.findContours() with the format like this example:
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
sorted_cnts = sorted(cnts, key=lambda x: cv2.contourArea(x), reverse=True)
for c in sorted_cnts:
    print(c)
    print(type(c))
    x, y, w, h = cv2.boundingRect(c)
    print((x, y, w, h))
Output
[[[230 93]]
[[230 187]]
[[326 187]]
[[326 93]]]
<class 'numpy.ndarray'>
(230, 93, 97, 95)
Performance note: The intersection check function suffers on the performance side since it creates three copies of the input image to draw the contours and may be slower when it comes to execution time with a greater number of contours or a larger input image size. I'll leave this optimization step to you!
You can use the cv2.boundingRect() method to get the x, y, w, h of each bounding box, and with those values you can use the condition x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1 to check whether any two bounding boxes intersect or are within each other:
import cv2
import numpy as np

def intersect(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1

# Here I am generating a random array of 10 boxes in the format [[x,y], [x,y], [x,y], [x,y]]
np.random.seed(55)
boxes = np.random.randint(10, 150, (10, 4, 2)) + np.random.randint(0, 300, (10, 1, 2))

bounds = [cv2.boundingRect(box) for box in boxes]
valids = [b1 for b1 in bounds if not any(intersect(b1, b2) for b2 in bounds if b1 != b2)]

if valids:
    x, y, w, h = max(valids, key=lambda b: b[2] * b[3])
    print(f"x: {x} y: {y} w: {w} h: {h}")
else:
    print("All boxes intersect.")
Output:
x: 75 y: 251 w: 62 h: 115
For visualization:
import cv2
import numpy as np

def intersect(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x2 + w2 > x1 > x2 - w1 and y2 + h2 > y1 > y2 - h1

np.random.seed(55)
boxes = np.random.randint(10, 150, (10, 4, 2)) + np.random.randint(0, 300, (10, 1, 2))
bounds = [cv2.boundingRect(box) for box in boxes]
valids = [b1 for b1 in bounds if not any(intersect(b1, b2) for b2 in bounds if b1 != b2)]

img = np.zeros((500, 500), "uint8")
for x, y, w, h in bounds:
    cv2.rectangle(img, (x, y), (x + w, y + h), 255, 1)
if valids:
    x, y, w, h = max(valids, key=lambda b: b[2] * b[3])
    cv2.rectangle(img, (x, y), (x + w, y + h), 128, -1)

cv2.imshow("IMAGE", img)
cv2.waitKey(0)
Output:
Assumption: you want the largest box from your array that complies with your rules, not the largest NEW bounding box that complies.
This is pseudocode; you still have to fill in the blanks:
int largestBoxIndex = -1;
int largestBoxArea = -1;

for (i = 0; i < allBoxes[].length; i++)
{
    box CurrentBox = allBoxes[i];
    bool isComply = false;

    for (j = 0; j < allBoxes[].length; j++)
    {
        if (i == j) continue;   // skip comparing a box with itself
        isComply = false;
        ComparedBox = allBoxes[j];
        if (isIntersected(CurrentBox, ComparedBox)) break;
        if (isInside(CurrentBox, ComparedBox)) break;
        isComply = true;
    }

    if (isComply)
        if (Area(allBoxes[i]) > largestBoxArea)
        {
            largestBoxArea = Area(allBoxes[i]);
            largestBoxIndex = i;
        }
}

if (largestBoxIndex != -1)
    largestBoxIndex; // this is the largest box
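For concreteness, here is a minimal Python rendering of the same idea, hedged: the (x, y, w, h) box format and the helper implementations are my assumptions, not part of the original pseudocode.
# Boxes are assumed to be axis-aligned (x, y, w, h) tuples; helpers are illustrative.
def is_intersected(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    # Overlap exists unless the boxes are separated along x or along y.
    return not (x1 + w1 <= x2 or x2 + w2 <= x1 or y1 + h1 <= y2 or y2 + h2 <= y1)

def is_inside(b1, b2):
    x1, y1, w1, h1 = b1
    x2, y2, w2, h2 = b2
    return x2 <= x1 and y2 <= y1 and x1 + w1 <= x2 + w2 and y1 + h1 <= y2 + h2

def area(b):
    return b[2] * b[3]

def largest_complying_box(all_boxes):
    best = None
    for i, current in enumerate(all_boxes):
        ok = all(not is_intersected(current, other) and not is_inside(current, other)
                 for j, other in enumerate(all_boxes) if i != j)
        if ok and (best is None or area(current) > area(best)):
            best = current
    return best

print(largest_complying_box([(0, 0, 10, 10), (5, 5, 10, 10), (30, 30, 5, 5)]))
# -> (30, 30, 5, 5): the only box that touches no other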
A simple mathematical solution to the problem
Suppose you are given 5 rectangles as shown below:
rects = [[100, 100, 200, 200],
         [200, 200, 200, 200],
         [200, 500, 200, 200],
         [350, 50, 150, 200],
         [500, 400, 200, 300]]
Note that the format of these rectangles is [x, y, width, height], where (x, y) is the coordinate of the top-left corner of the rectangle, and width & height are the width and height of the rectangle, respectively. You will have to convert your coordinates to this format first.
3 out of these 5 are intersecting.
Now we iterate over these rectangles one by one, and for each rectangle we find its intersection with each of the other rectangles. If a rectangle is found to be intersecting any other rectangle, we set the flag value of both rectangles to 0. If a rectangle is found not to intersect any other rectangle, its flag value is set to 1 (the default flag value is -1). Finally, we find the rectangle of the greatest area among the rectangles with flag value 1.
Let's see the code for finding the intersection area of the two rectangles:
# Rect : [x, y, w, h]
def Intersection(Rect1, Rect2):
    x = max(Rect1[0], Rect2[0])
    y = max(Rect1[1], Rect2[1])
    w = min(Rect1[0] + Rect1[2], Rect2[0] + Rect2[2]) - x
    h = min(Rect1[1] + Rect1[3], Rect2[1] + Rect2[3]) - y
    if w < 0 or h < 0:
        return None
    return [x, y, w, h]
This function returns None if there is no intersection area between these rectangles; otherwise it returns the coordinates of the intersection rectangle (ignore this value for the current problem; it might be helpful in other problems).
Now, let's have a look at the algorithm.
n = len(rects)

# -1 : Not determined
#  0 : Intersects with some rectangle
#  1 : No intersection
flag = [-1] * n

for i in range(n):
    if flag[i] == 0:
        continue
    isIntersecting = False
    for j in range(n):
        if i == j or flag[j] == 1:
            continue
        Int_Rect = Intersection(rects[i], rects[j])
        if Int_Rect is not None:
            isIntersecting = True
            flag[j] = 0
            flag[i] = 0
            break
    if isIntersecting == False:
        flag[i] = 1

# Finding the maximum-area rectangle without any intersection.
maxRect = None
maxArea = -1
for i in range(n):
    if flag[i] == 1:
        if rects[i][2] * rects[i][3] > maxArea:
            maxRect = rects[i]
            maxArea = rects[i][2] * rects[i][3]
print(maxRect)
Note: Add the "excluded areas" rectangle coordinates to the rects list and assign them a flag value of 0, so they cannot be selected as the maximum-area rectangle.
Since this solution does not involve any images at all, it should also be the fastest approach here unless the image-based ones are heavily optimized.
Find the biggest square in numpy array
Maybe this would help? If you know the size of the whole area, you can calculate the biggest box within a numpy array: set all your given boxes to 1 and the rest of the area to 0, then find the largest rectangular region that contains no 1s.
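A hedged sketch of that idea: mark occupied cells with 1 in a grid, then find the largest all-zero axis-aligned rectangle with the classic "largest rectangle in a histogram" technique. The grid size and boxes below are invented examples.
import numpy as np

def largest_empty_rectangle(grid):
    """Return (area, (top, left, bottom, right)) of the largest all-zero rectangle."""
    rows, cols = grid.shape
    heights = np.zeros(cols, dtype=int)
    best = (0, None)
    for r in range(rows):
        # heights[c] = number of consecutive zeros ending at row r in column c
        heights = np.where(grid[r] == 0, heights + 1, 0)
        # Largest rectangle in the histogram `heights` (stack-based)
        stack = []  # indices with increasing heights
        for c in range(cols + 1):
            h = heights[c] if c < cols else 0
            while stack and heights[stack[-1]] >= h:
                top = stack.pop()
                left = stack[-1] + 1 if stack else 0
                area = heights[top] * (c - left)
                if area > best[0]:
                    best = (area, (r - heights[top] + 1, left, r, c - 1))
            stack.append(c)
    return best

canvas = np.zeros((80, 100), dtype=np.uint8)  # the whole area
for x, y, w, h in [(10, 10, 30, 20), (60, 5, 25, 40)]:  # example boxes
    canvas[y:y + h, x:x + w] = 1
print(largest_empty_rectangle(canvas))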
Here's an O(n^2) solution. find_maxbox takes an array of rectangles, converts them into Box objects, and then compares each pair of boxes to eliminate invalid rectangles. This solution assumes that the rectangles' sides are parallel to the X-Y axes.
class Box():
    def __init__(self, coordinates):
        self.coordinates = tuple(sorted(coordinates))
        self.original = coordinates
        self.height = abs(self.coordinates[0][1] - self.coordinates[3][1])
        self.width = abs(self.coordinates[0][0] - self.coordinates[3][0])
        self.excluded = False

    def __eq__(self, b2):
        return self.coordinates == b2.coordinates

    def get_area(self):
        return self.height * self.width

    def bounding_box(self, b2):
        maxX, maxY = map(max, zip(*self.coordinates, *b2.coordinates))
        minX, minY = map(min, zip(*self.coordinates, *b2.coordinates))
        return Box([(minX, minY), (maxX, minY), (minX, maxY), (maxX, maxY)])

    def intersects(self, b2):
        box = self.bounding_box(b2)
        if box.height < self.height + b2.height and box.width < self.width + b2.width:
            return True
        else:
            return False

    def encloses(self, b2):
        return self == self.bounding_box(b2)

    def exclude(self):
        self.excluded = True

    def is_excluded(self):
        return self.excluded

    def __str__(self):
        return str(self.original)

    def __repr__(self):
        return str(self.original)

# Pass an array of rectangles as the argument.
def find_maxbox(boxes):
    boxes = sorted(map(Box, boxes), key=Box.get_area, reverse=True)
    _boxes = []
    _boxes.append((boxes[0], boxes[0]))
    for b1 in boxes[1:]:
        b2, bb2 = _boxes[-1]
        bbox = b1.bounding_box(bb2)
        if not b1.intersects(bb2):
            _boxes.append((b1, bbox))
            continue
        for (b2, bb2) in reversed(_boxes):
            if not b1.intersects(bb2):
                break
            if b1.intersects(b2):
                if b2.encloses(b1):
                    b1.exclude()
                    break
                b1.exclude()
                b2.exclude()
        _boxes.append((b1, bbox))
    for box in boxes:
        if box.is_excluded():
            continue
        else:
            return box.original
    return None
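A quick usage sketch; the coordinates here are made up for illustration, and each rectangle is four (x, y) corner tuples, as in the question's format:
boxes = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],      # intersects the next box
    [(5, 5), (15, 5), (15, 15), (5, 15)],
    [(20, 20), (24, 20), (24, 24), (20, 24)],  # touches nothing
]
print(find_maxbox(boxes))  # -> [(20, 20), (24, 20), (24, 24), (20, 24)]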
In other words:
rectangles that share points are excluded
of the remaining rectangles, take the largest
No need for contours, centroids, bounding boxes, masking or redrawing pixels!
As stated before, in the provided case the rectangle coordinates contain duplicates. Here, we use a single class to store the outer limits of the rectangle. The Separating Axis Theorem, from this answer by @samgak, is used in the intersects() method.
from __future__ import annotations  # optional
from dataclasses import dataclass   # optional ?

@dataclass
class Rectangle:
    left: int
    top: int
    right: int
    bottom: int

    def __repr__(self):
        """String representation of the rectangle's coordinates."""
        return f"⟔ {self.left},{self.top} ⟓ {self.right},{self.bottom}"

    def intersects(self, other: Rectangle):
        """Whether this Rectangle shares points with another Rectangle."""
        h = self.right < other.left or self.left > other.right
        v = self.bottom < other.top or self.top > other.bottom
        return not h and not v

    def size(self):
        """An indicator of the Rectangle's size, equal to half the perimeter."""
        return self.right - self.left + self.bottom - self.top

main = Rectangle(100, 100, 325, 325)
others = {
    0: Rectangle(100, 100, 400, 400),
    1: Rectangle(200, 200, 300, 300),
    2: Rectangle(200, 300, 300, 500),
    3: Rectangle(300, 300, 500, 500),
    4: Rectangle(350, 350, 600, 600),
    5: Rectangle(500, 500, 600, 600),
}

for i, r in others.items():
    print(i, main.intersects(r), r.size())
Simply put, h is True if the other rectangle is completely to the left or to the right; v is True if it's completely above or below. The intersects() method returns True if the rectangles share any points (even so much as a corner).
Output:
0 True 600
1 True 200
2 True 300
3 True 400
4 False 500
5 False 200
It is then trivial to find the largest:
valid = {r.size():i for i, r in others.items() if not main.intersects(r)}
print('Largest:', valid[max(valid)], 'with size', max(valid))
Output:
Largest: 4 with size 500
This answer assumes left < right and top < bottom for all rectangles.
The following function turns the provided rectangle coordinates to the kind used by the Rectangle class above. This assumes that the order is [[l, t], [r, t], [r, b], [l, b]] (a path).
def trim(coordinates):
    """Remove redundant coordinates in a path describing a rectangle."""
    return coordinates[0][0], coordinates[1][1], coordinates[2][0], coordinates[3][1]
Finally, we want to do this for all rectangles, not just a "main" one. We can simply have each rectangle be the main one in turn. Use itertools.combinations() on an iterable such as a list:
itertools.combinations(rectangles, 2)
This ensures that we don't compare two rectangles more than once; a sketch follows below.
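Putting it together, a hedged sketch; the sample paths are invented, and this reuses the Rectangle class and trim() defined above:
from itertools import combinations

paths = [
    [[100, 100], [400, 100], [400, 400], [100, 400]],  # overlaps the next one
    [[200, 200], [300, 200], [300, 300], [200, 300]],
    [[500, 500], [600, 500], [600, 600], [500, 600]],  # touches nothing
]
rectangles = [Rectangle(*trim(p)) for p in paths]

# Mark every rectangle that shares points with another one.
invalid = set()
for a, b in combinations(range(len(rectangles)), 2):
    if rectangles[a].intersects(rectangles[b]):
        invalid.update((a, b))

valid = {i: r for i, r in enumerate(rectangles) if i not in invalid}
if valid:
    i, r = max(valid.items(), key=lambda kv: kv[1].size())
    print('Largest:', i, r)  # -> Largest: 2 ⟔ 500,500 ⟓ 600,600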

Rotate Image in Affine Transformation

I am having trouble correctly rotating an image with an affine transformation. Currently, the below is what I'm using:
rotation_matrix = np.array([[np.cos(rotation_angle), -np.sin(rotation_angle), 0],
                            [np.sin(rotation_angle),  np.cos(rotation_angle), 0],
                            [0, 0, 1]])
If I set the angle to anything greater than approximately 50 degrees, I get an entirely black image without anything in it (I initialize the new image as entirely black, which indicates that none of the translated pixels fall within the range of the new image). If I rotate by less than 50 degrees I get some portion of the image, but it doesn't look correctly rotated from what I can tell. Also, the origin (0,0) is in the top-left corner. I want part of the image to be obscured if it is rotated outside of the bounds of the original image.
Prior to applying the rotation, I am taking the inverse via
#get inverse of transform matrix
inverse_transform_matrix = np.linalg.inv(multiplied_matrices)
Where rotation occurs:
def Apply_Matrix_To_Image(matrix_to_apply, image_map):
    # takes an image and a matrix and applies it.
    x_min = 0
    y_min = 0
    x_max = image_map.shape[0]
    y_max = image_map.shape[1]

    new_image_map = np.zeros((x_max, y_max), dtype=int)

    for y_counter in range(0, y_max):
        for x_counter in range(0, x_max):
            curr_pixel = [x_counter, y_counter, 1]
            curr_pixel = np.dot(matrix_to_apply, curr_pixel)
            print(curr_pixel)
            if curr_pixel[0] > x_max - 1 or curr_pixel[1] > y_max - 1 or x_min > curr_pixel[0] or y_min > curr_pixel[1]:
                continue  # source pixel falls outside the image, skip it
            else:
                new_image_map[x_counter][y_counter] = image_map[int(curr_pixel[0])][int(curr_pixel[1])]

    return new_image_map
# tested with python3
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

def GetRotateMatrixWithCenter(x, y, angle):
    # https://math.stackexchange.com/questions/2093314
    move_matrix = np.array(
        [
            [1, 0, x],
            [0, 1, y],
            [0, 0, 1]
        ])
    rotation_matrix = np.array(
        [
            [np.cos(angle), -np.sin(angle), 0],
            [np.sin(angle),  np.cos(angle), 0],
            [0, 0, 1]
        ])
    back_matrix = np.array(
        [
            [1, 0, -x],
            [0, 1, -y],
            [0, 0, 1]
        ])
    r = np.dot(move_matrix, rotation_matrix)
    return np.dot(r, back_matrix)

def Apply_Matrix_To_Image(matrix_to_apply, image_map):
    # takes an image and a matrix and applies it.
    x_min = 0
    y_min = 0
    x_max = image_map.shape[0]
    y_max = image_map.shape[1]

    new_image_map = np.zeros((x_max, y_max), dtype=int)

    for y_counter in range(0, y_max):
        for x_counter in range(0, x_max):
            curr_pixel = [x_counter, y_counter, 1]
            curr_pixel = np.dot(matrix_to_apply, curr_pixel)
            # print(curr_pixel)
            if curr_pixel[0] > x_max - 1 or curr_pixel[1] > y_max - 1 or x_min > curr_pixel[0] or y_min > curr_pixel[1]:
                continue  # source pixel falls outside the image, skip it
            else:
                new_image_map[x_counter][y_counter] = image_map[int(curr_pixel[0])][int(curr_pixel[1])]

    return new_image_map

# convert image to grayscale
img = Image.open('small.png').convert("L")
img = np.asarray(img)

image_width = img.shape[0]
image_height = img.shape[1]

plt.subplot(1, 2, 1)
plt.title('Origin image')
plt.imshow(img, cmap='gray', vmin=0, vmax=255)

plt.subplot(1, 2, 2)
plt.title('Transformed image')

alpha = 0
while True:
    rotation_angle = 0 + alpha
    alpha = alpha + 1  # increase by 1 degree
    rotation_angle = np.deg2rad(rotation_angle)  # degrees to radians

    rotation_matrix = GetRotateMatrixWithCenter(image_width / 2, image_height / 2, rotation_angle)

    rotated = Apply_Matrix_To_Image(rotation_matrix, img)

    plt.imshow(rotated, cmap='gray', vmin=0, vmax=255)
    plt.pause(0.001)
plt.show()
Updated contents:
Convert degrees to radians with np.deg2rad()
Draw the rotated image in real time with matplotlib, for debugging
Following https://math.stackexchange.com/questions/2093314, rotate about the image center
Running screen:
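For comparison (not part of the answer above), OpenCV's built-ins produce the same center rotation without the hand-rolled pixel loop; a minimal sketch, assuming the same small.png input:
import cv2

img = cv2.imread('small.png', cv2.IMREAD_GRAYSCALE)
h, w = img.shape
# getRotationMatrix2D composes the same translate-rotate-translate mapping as a 2x3 matrix
M = cv2.getRotationMatrix2D((w / 2, h / 2), 45, 1.0)  # center, angle in degrees, scale
rotated = cv2.warpAffine(img, M, (w, h))  # pixels rotated out of the frame are clipped
cv2.imwrite('rotated.png', rotated)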

Finding threshold limits in image processing

As stated in the help section of Stack Overflow, one can ask about a "software algorithm," so I believe this question is on topic. I'm viewing the following algorithm and I'm having a hard time understanding why it is being used. I've explained the mechanics below. The code was pulled from the following GitHub repo.
import numpy as np
import cv2
import sys

def calc_sloop_change(histo, mode, tolerance):
    sloop = 0
    for i in range(0, len(histo)):
        if histo[i] > max(1, tolerance):
            sloop = i
            return sloop
        else:
            sloop = i
def process(inpath, outpath, tolerance):
    original_image = cv2.imread(inpath)
    tolerance = int(tolerance) * 0.01

    # Get properties
    width, height, channels = original_image.shape

    color_image = original_image.copy()

    blue_hist = cv2.calcHist([color_image], [0], None, [256], [0, 256])
    green_hist = cv2.calcHist([color_image], [1], None, [256], [0, 256])
    red_hist = cv2.calcHist([color_image], [2], None, [256], [0, 256])

    blue_mode = blue_hist.max()
    blue_tolerance = np.where(blue_hist == blue_mode)[0][0] * tolerance
    green_mode = green_hist.max()
    green_tolerance = np.where(green_hist == green_mode)[0][0] * tolerance
    red_mode = red_hist.max()
    red_tolerance = np.where(red_hist == red_mode)[0][0] * tolerance

    sloop_blue = calc_sloop_change(blue_hist, blue_mode, blue_tolerance)
    sloop_green = calc_sloop_change(green_hist, green_mode, green_tolerance)
    sloop_red = calc_sloop_change(red_hist, red_mode, red_tolerance)

    gray_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray_hist = cv2.calcHist([original_image], [0], None, [256], [0, 256])
    largest_gray = gray_hist.max()
    threshold_gray = np.where(gray_hist == largest_gray)[0][0]

    # Red cells
    gray_image = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 85, 4)
    _, contours, hierarchy = cv2.findContours(gray_image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    c2 = [i for i in contours if cv2.boundingRect(i)[3] > 15]
    cv2.drawContours(color_image, c2, -1, (0, 0, 255), 1)
    cp = [cv2.approxPolyDP(i, 0.015 * cv2.arcLength(i, True), True) for i in c2]
    countRedCells = len(c2)
    for c in cp:
        xc, yc, wc, hc = cv2.boundingRect(c)
        cv2.rectangle(color_image, (xc, yc), (xc + wc, yc + hc), (0, 255, 0), 1)

    # Malaria cells
    gray_image = cv2.inRange(original_image, np.array([sloop_blue, sloop_green, sloop_red]), np.array([255, 255, 255]))
    _, contours, hierarchy = cv2.findContours(gray_image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)
    c2 = [i for i in contours if cv2.boundingRect(i)[3] > 8]
    cv2.drawContours(color_image, c2, -1, (0, 0, 0), 1)
    cp = [cv2.approxPolyDP(i, 0.15 * cv2.arcLength(i, True), True) for i in c2]
    countMalaria = len(c2)
    for c in cp:
        xc, yc, wc, hc = cv2.boundingRect(c)
        cv2.rectangle(color_image, (xc, yc), (xc + wc, yc + hc), (0, 0, 0), 1)

    # Write image
    cv2.imwrite(outpath, color_image)

    # Write statistics
    with open(outpath + '.stats', mode='w') as f:
        f.write(str(countRedCells) + '\n')
        f.write(str(countMalaria) + '\n')
The above code looks at images of cells (irregular shapes) and identifies whether there are black spots/blobs inside them. Then it draws contours around the cells and blobs. For example:
I don't understand why the algorithm works the following way.
Let me illustrate with an example:
Let's say the tolerance passed into process() is 50, and that blue_hist returns an array [1, 2, 3, 4, 100, 0, ..., 0]. The largest value in this array is 100, at index 4. This indicates that there are 100 pixels with an intensity of 4 in the blue channel of the color image. In this situation, np.where(blue_hist == blue_mode) will return 4. This value is multiplied by 0.01 * tolerance, giving us 2.
So, if the value 4 is a pixel intensity, then multiplying it by a scalar only gives another pixel intensity (in our case, 4 * (0.01 * 50) = 2). This new pixel intensity is passed into calc_sloop_change() as the tolerance. That function compares histo[i], the number of pixels at intensity i, against this tolerance (which is the pixel intensity we calculated earlier). So in our case, the first count greater than 2 occurs at i = 2, and 2 is returned.
This is where I'm confused. Why is this being done? It seems illogical to compare a number of pixels against a pixel intensity; they are not even the same kind of quantity. So why are they using this algorithm? I must add that this code actually performs really well, so something must be right.
Lastly, the three values calculated by calc_sloop_change(), one for each color channel, act as a lower cutoff to produce a binary image: anything below those values (which are pixel intensity values) becomes black, and everything above them becomes white.
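To make the mechanics concrete, here is a minimal, hedged reproduction of that cutoff computation on a synthetic histogram (the numbers mirror the example above; mode is unused, as in the original function):
import numpy as np

def calc_sloop_change(histo, mode, tolerance):
    # Returns the first intensity whose pixel count exceeds the tolerance.
    for i in range(len(histo)):
        if histo[i] > max(1, tolerance):
            return i

histo = np.array([1, 2, 3, 4, 100] + [0] * 251)    # synthetic blue-channel histogram
tolerance = 50 * 0.01                              # process() scales the user tolerance this way
mode_index = np.where(histo == histo.max())[0][0]  # 4, the modal intensity
channel_tolerance = mode_index * tolerance         # 4 * 0.5 = 2.0: an intensity reused as a count threshold
print(calc_sloop_change(histo, histo.max(), channel_tolerance))  # -> 2, first intensity with count > 2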
