I've trained a YOLOv4 model and tested it.
My original image size was 1920×1080 (width × height).
For training I resized it to 416×416. When I tested, the detections looked good, but I cannot understand the output values:
(left_x: 506 top_y: -376 width: 2076 height: 1179)
(How can a coordinate be negative or bigger than the image?)
I'm sure there is a formula behind it, but I wasn't able to find it.
I searched inside the code (darknet.py), and the bbox2points(bbox) function returned a bad result.
What am I missing?
Can you help me find the bounding box coordinates in this example?
Code - darknet.py:
def bbox2points(bbox):
    x, y, w, h = bbox
    xmin = int(round(x - (w / 2)))
    xmax = int(round(x + (w / 2)))
    ymin = int(round(y - (h / 2)))
    ymax = int(round(y + (h / 2)))
    return xmin, xmax, ymin, ymax
Those x, y, w, h are the same values as in the output above (left_x, top_y, width, height).
Code:
https://github.com/AlexeyAB/darknet/blob/master/darknet.py
This is how I transform the bounding boxes returned by darknet_video.py for use in OpenCV.
def __transform_boxes(boxes, image):
    image_height, image_width, image_channels = image.shape
    top_coordinates_x = int((boxes[0] - (boxes[2]) / 2) * (image_width / 416))
    top_coordinates_y = int((boxes[1] - (boxes[3]) / 2) * (image_height / 416))
    bottom_coordinates_x = int((boxes[0] + (boxes[2]) / 2) * (image_width / 416))
    bottom_coordinates_y = int((boxes[1] + (boxes[3]) / 2) * (image_height / 416))
    return bottom_coordinates_x, bottom_coordinates_y, top_coordinates_x, top_coordinates_y
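Building on that, here is a hedged sketch of the same idea with clamping added (to_clamped_corners is my own helper, not part of darknet, and the assumption that (x, y, w, h) is the box center and size in 416×416 network coordinates is mine): scaling to the original resolution and then clamping the corners keeps values like top_y: -376 or width: 2076 inside the frame.

def to_clamped_corners(bbox, image_width, image_height, network_size=416):
    # bbox = (x_center, y_center, width, height) in network coordinates
    x, y, w, h = bbox
    scale_x = image_width / network_size
    scale_y = image_height / network_size
    xmin = int(round((x - w / 2) * scale_x))
    ymin = int(round((y - h / 2) * scale_y))
    xmax = int(round((x + w / 2) * scale_x))
    ymax = int(round((y + h / 2) * scale_y))
    # Clamp so the corners stay inside the image even when the
    # predicted box extends past the borders
    xmin = max(0, min(xmin, image_width - 1))
    ymin = max(0, min(ymin, image_height - 1))
    xmax = max(0, min(xmax, image_width - 1))
    ymax = max(0, min(ymax, image_height - 1))
    return xmin, ymin, xmax, ymax

# e.g. draw with OpenCV:
# xmin, ymin, xmax, ymax = to_clamped_corners(bbox, 1920, 1080)
# cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)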
Trying to convert KITTI label format to YOLO, but after converting, the bbox is misplaced.
This is the KITTI bounding box:
This is the conversion code:
def convertToYoloBBox(bbox, size):
    # Yolo uses bounding bbox coordinates and size relative to the image size.
    # This is taken from https://pjreddie.com/media/files/voc_label.py .
    dw = 1. / size[0]
    dh = 1. / size[1]
    x = (bbox[0] + bbox[1]) / 2.0
    y = (bbox[2] + bbox[3]) / 2.0
    w = bbox[1] - bbox[0]
    h = bbox[3] - bbox[2]
    x = x * dw
    w = w * dw
    y = y * dh
    h = h * dh
    return (x, y, w, h)
convert = convertToYoloBBox([kitti_bbox[0], kitti_bbox[1], kitti_bbox[2], kitti_bbox[3]], image.shape[:2])
The function does some normalization, which is essential for YOLO, and outputs the following:
(0.14763590391908976, 0.3397063758389261, 0.20452591656131477, 0.01810402684563757)
But when I try to check whether the normalization is done correctly with this code:
x = int(convert[0] * image.shape[0])
y = int(convert[1] * image.shape[1])
width = x + int(convert[2] * image.shape[0])
height = y + int(convert[3] * image.shape[1])
cv.rectangle(image, (int(x), int(y)), (int(width), int(height)), (255, 0, 0), 2)
the bounding box is misplaced:
Any suggestions? Is the conversion function correct, or is the problem in the checking code?
You got the centroid calculation wrong.
KITTI labels are given in the order left, top, right, bottom.
To get the centroid you have to do (left + right) / 2 and (top + bottom) / 2,
so your code becomes:
x = (bbox[0] + bbox[2]) / 2.0
y = (bbox[1] + bbox[3]) / 2.0
w = bbox[2] - bbox[0]
h = bbox[3] - bbox[1]
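Putting the fix together, here is a minimal sketch of the corrected conversion plus a consistent check (my own code, with a made-up file name and box; note that OpenCV's image.shape is (height, width, channels), so x and w must be normalized by the width and y and h by the height):

import cv2 as cv

def convert_to_yolo_bbox(bbox, image_width, image_height):
    # bbox is (left, top, right, bottom), as in KITTI labels
    left, top, right, bottom = bbox
    x = (left + right) / 2.0 / image_width    # normalized center x
    y = (top + bottom) / 2.0 / image_height   # normalized center y
    w = (right - left) / image_width          # normalized width
    h = (bottom - top) / image_height         # normalized height
    return x, y, w, h

image = cv.imread('example.png')              # hypothetical image
height, width = image.shape[:2]
x, y, w, h = convert_to_yolo_bbox((100, 200, 300, 250), width, height)

# Denormalize with the matching axes to verify visually
x1 = int((x - w / 2) * width)
y1 = int((y - h / 2) * height)
x2 = int((x + w / 2) * width)
y2 = int((y + h / 2) * height)
cv.rectangle(image, (x1, y1), (x2, y2), (255, 0, 0), 2)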
I have seen some questions related to this topic, but even after following them I was unable to get the results I was looking for, so I assume something about my math is off. Here's the scenario.
I have an image that must be rotated by 90 degrees. At the same time, I have the coordinates (x_min, y_min) and (x_max, y_max) of the corners of a rectangle drawn over a certain object in the image. Let's use the image below as an example; the rectangle used here is defined by (4, 196) and (145, 269).
My goal, then, is to rotate the image and do the same with the rectangle, ensuring it keeps surrounding the target object.
First, I am rotating the image 90 degrees (clockwise) by using imutils.
rotated_image_90 = imutils.rotate_bound(original_image, 90)
Then, I am calculating the new coordinates of the rectangle. For this, I need to know the point around which imutils rotates the image. I have tried a few combinations, including (0,0), but here I will assume it's the center of the image itself.
radians = math.radians(90)
h, w, c = original_image.shape
center_x = w/2
center_y = h/2
# Changes to solve the problem. Center of rotated image is now considered.
h_rotated, w_rotated, c = rotated_image_90.shape
center_x_rotated = w_rotated/2
center_y_rotated = h_rotated/2
x_min_90 = center_x_rotated + math.cos(radians) * (x_min - center_x) - math.sin(radians) * (y_min - center_y)
y_min_90 = center_y_rotated + math.sin(radians) * (x_min - center_x) + math.cos(radians) * (y_min - center_y)
x_max_90 = center_x_rotated + math.cos(radians) * (x_max - center_x) - math.sin(radians) * (y_max - center_y)
y_max_90 = center_y_rotated + math.sin(radians) * (x_max - center_x) + math.cos(radians) * (y_max - center_y)
Finally, I draw the new rotated rectangle over the rotated image.
start_point = (int(x_min_90), int(y_min_90))
end_point = (int(x_max_90), int(y_max_90))
image = cv2.rectangle(rotated_image_90, start_point, end_point, (255, 0, 0), 2)
Here's what I am getting.
The question is: what's wrong here? Why won't the rectangle rotate properly and give me the intended result, simulated below?
EDIT: To anyone stuck with the same problem: the solution to make the rotation work with rectangular images is quite simple. The center_x and center_y variables added to the result of the rotation (the multiplications involving sin and cos) have to refer to the center of the image post-rotation, not pre-rotation as I had initially written in the code. The post has been updated to reflect the solution.
First, in the general case, the center of the image in global coordinates should look like this (I don't know exactly what the rotation center is in your case):
center_x = min_x + w/2
center_y = min_y + h/2
Second: the arguments of trigonometric functions must be in radians, not degrees, so use math.cos(math.radians(degrees_angle)) and so on.
Third: if min_corner is the bottom-left before rotation, it becomes the bottom-right after rotation (or perhaps the top-left, depending on your coordinate system orientation); the same goes for max_corner. Also, when rotating about some center, you have to add the center coordinates back at the end.
So the min/max coordinates for the rotated rectangle are:
x_min_90 = center_x + (y_min - center_y)
y_min_90 = center_y - (x_max - center_x)
x_max_90 = center_x + (y_max - center_y)
y_max_90 = center_y - (x_min - center_x)
For this example ABCD -> FGHI:
x_min = 2, y_min = 1
x_max = 8, y_max = 5
center_x = 2 + 3 = 5
center_y = 1 + 2 = 3
x_min_90 = 5 + (1 - 3) = 3
y_min_90 = 3 - (8 - 5) = 0
x_max_90 = 5 + (5 - 3) = 7
y_max_90 = 3 - (2 - 5) = 6
Note:
I used a mathematical coordinate system (OX to the right, OY to the top); if your OY axis points down, change the orientation as needed.
I substituted the cos and sin of 90° with 0 and 1. For other angles, use cos and sin, but don't forget about radians.
How to use the formulas from the question to get the right result for angles of 90, 180, and 270 degrees:
x_a_90 = center_x + math.cos(radians) * (x_min - center_x) - math.sin(radians) * (y_min - center_y)
y_a_90 = center_y + math.sin(radians) * (x_min - center_x) + math.cos(radians) * (y_min - center_y)
x_b_90 = center_x + math.cos(radians) * (x_max - center_x) - math.sin(radians) * (y_max - center_y)
y_b_90 = center_y + math.sin(radians) * (x_max - center_x) + math.cos(radians) * (y_max - center_y)
x_min_90 = min(x_a_90, x_b_90)
x_max_90 = max(x_a_90, x_b_90)
y_min_90 = min(y_a_90, y_b_90)
y_max_90 = max(y_a_90, y_b_90)
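Stitching the question's corrected formulas together with the min/max re-sorting above, here is a self-contained sketch (assuming original_image is already loaded; exact for 90, 180, and 270 degrees, while for other angles the result is the axis-aligned bounding box of the rotated rectangle, not a tight fit):

import math
import cv2
import imutils

def rotate_image_and_box(image, angle, x_min, y_min, x_max, y_max):
    # Rotate the image so the whole result stays in frame
    rotated = imutils.rotate_bound(image, angle)
    h, w = image.shape[:2]
    rh, rw = rotated.shape[:2]
    cx, cy = w / 2, h / 2       # center before rotation
    rcx, rcy = rw / 2, rh / 2   # center after rotation
    r = math.radians(angle)

    def rotate_point(px, py):
        return (rcx + math.cos(r) * (px - cx) - math.sin(r) * (py - cy),
                rcy + math.sin(r) * (px - cx) + math.cos(r) * (py - cy))

    xa, ya = rotate_point(x_min, y_min)
    xb, yb = rotate_point(x_max, y_max)
    # Re-sort so (min, min) is again the top-left corner
    return (rotated, int(min(xa, xb)), int(min(ya, yb)),
            int(max(xa, xb)), int(max(ya, yb)))

rotated, x1, y1, x2, y2 = rotate_image_and_box(original_image, 90, 4, 196, 145, 269)
cv2.rectangle(rotated, (x1, y1), (x2, y2), (255, 0, 0), 2)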
I am pretty new to Python and want to do the following: divide the image below into 8 pie segments:
I want it to look something like this (I made this in PowerPoint):
The background should be black, and the edge of the figure should have a unique color, as should each pie segment.
EDIT: I have written code that divides the whole image into 8 segments:
from PIL import Image, ImageDraw
im = Image.open('C:/Users/20191881/Documents/OGO Beeldanalyse/Python/asymmetrie/rotation.png')
fill = 255
draw = ImageDraw.Draw(im)
draw.line((0, 0) + im.size, fill)
draw.line((0, im.size[1], im.size[0], 0), fill)
draw.line((0.5*im.size[0], 0, 0.5*im.size[0], im.size[1]), fill)
draw.line((0, 0.5*im.size[1], im.size[0], 0.5*im.size[1]), fill)
del draw
im.show()
The output gives:
The only thing left to do is to find a way to give each black segment inside the border a unique color, and to give all the white edge segments unique colors as well.
Your code divides the image into eight parts, that's correct, but with respect to the image center you don't get eight angularly equal pie segments like you show in your sketch.
Here would be my solution, using only Pillow and the math module:
import math
from PIL import Image, ImageDraw

def segment_color(i_color, n_colors):
    r = int((192 - 64) / (n_colors - 1) * i_color + 64)
    g = int((224 - 128) / (n_colors - 1) * i_color + 128)
    b = 255
    return (r, g, b)

# Load image; generate ImageDraw
im = Image.open('path_to/vgdrD.png').convert('RGB')
draw = ImageDraw.Draw(im)

# Number of pie segments (must be an even number)
n = 8

# Replace (all-white) edge with defined edge color
edge_color = (255, 128, 0)
pixels = im.load()
for y in range(im.height):
    for x in range(im.width):
        if pixels[x, y] == (255, 255, 255):
            pixels[x, y] = edge_color

# Draw lines with defined line color
line_color = (0, 255, 0)
d = min(im.width, im.height) - 10
center = (int(im.width/2), int(im.height/2))
for i in range(int(n/2)):
    angle = 360 / n * i
    x1 = math.cos(angle/180*math.pi) * d/2 + center[0]
    y1 = math.sin(angle/180*math.pi) * d/2 + center[1]
    x2 = math.cos((180+angle)/180*math.pi) * d/2 + center[0]
    y2 = math.sin((180+angle)/180*math.pi) * d/2 + center[1]
    draw.line([(x1, y1), (x2, y2)], line_color)

# Fill pie segments with defined segment colors
for i in range(n):
    angle = 360 / n * i + 360 / n / 2
    x = math.cos(angle/180*math.pi) * 20 + center[0]
    y = math.sin(angle/180*math.pi) * 20 + center[1]
    ImageDraw.floodfill(im, (int(x), int(y)), segment_color(i, n))

im.save(str(n) + '_pie.png')
For n = 8 pie segments, the following result is produced:
The first step is to replace all white pixels in the original image with the desired edge color. Of course, the assumption here is that there are no other (white) pixels in the image. Also, this might be done better using NumPy and vectorized code, but I wanted to keep the solution Pillow-only.
The next step is to draw the (green) lines. Here, I calculate the proper coordinates of the lines' start and end points using sin and cos.
The last step is to flood fill the pie segments' areas, cf. ImageDraw.floodfill. I calculate the seed points the same way as before, but add an angular shift to land exactly within each pie segment.
As you can see, n is variable in my solution (n must be even):
Of course, there are limitations regarding the angular resolution, mostly due to the small image.
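Regarding the NumPy remark above: the nested loop that recolors the white edge could be collapsed into one vectorized assignment; a sketch, assuming a round trip through an array:

import numpy as np
from PIL import Image

im = Image.open('path_to/vgdrD.png').convert('RGB')
arr = np.array(im)

# Boolean mask of all pure-white pixels, recolored in one assignment
white = np.all(arr == 255, axis=-1)
arr[white] = (255, 128, 0)   # same edge_color as above

im = Image.fromarray(arr)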
Hope that helps!
EDIT: Here's a modified version to also allow for individually colored edges.
import math
from PIL import Image, ImageDraw

def segment_color(i_color, n_colors):
    r = int((192 - 64) / (n_colors - 1) * i_color + 64)
    g = int((224 - 128) / (n_colors - 1) * i_color + 128)
    b = 255
    return (r, g, b)

def edge_color(i_color, n_colors):
    r = 255
    g = 255 - int((224 - 32) / (n_colors - 1) * i_color + 32)
    b = 255 - int((192 - 16) / (n_colors - 1) * i_color + 16)
    return (r, g, b)

# Load image; generate ImageDraw
im = Image.open('images/vgdrD.png').convert('RGB')
draw = ImageDraw.Draw(im)
center = (int(im.width/2), int(im.height/2))

# Number of pie segments (must be an even number)
n = 8

# Replace (all-white) edge with defined edge color
max_len = im.width + im.height
im_pix = im.load()
for i in range(n):
    mask = Image.new('L', im.size, 0)
    mask_draw = ImageDraw.Draw(mask)
    angle = 360 / n * i
    x1 = math.cos(angle/180*math.pi) * max_len + center[0]
    y1 = math.sin(angle/180*math.pi) * max_len + center[1]
    angle = 360 / n * (i+1)
    x2 = math.cos(angle/180*math.pi) * max_len + center[0]
    y2 = math.sin(angle/180*math.pi) * max_len + center[1]
    mask_draw.polygon([center, (x1, y1), (x2, y2)], 255)
    mask_pix = mask.load()
    for y in range(im.height):
        for x in range(im.width):
            if (im_pix[x, y] == (255, 255, 255)) and (mask_pix[x, y] == 255):
                im_pix[x, y] = edge_color(i, n)

# Draw lines with defined line color
line_color = (0, 255, 0)
d = min(im.width, im.height) - 10
for i in range(int(n/2)):
    angle = 360 / n * i
    x1 = math.cos(angle/180*math.pi) * d/2 + center[0]
    y1 = math.sin(angle/180*math.pi) * d/2 + center[1]
    x2 = math.cos((180+angle)/180*math.pi) * d/2 + center[0]
    y2 = math.sin((180+angle)/180*math.pi) * d/2 + center[1]
    draw.line([(x1, y1), (x2, y2)], line_color)

# Fill pie segments with defined segment colors
for i in range(n):
    angle = 360 / n * i + 360 / n / 2
    x = math.cos(angle/180*math.pi) * 20 + center[0]
    y = math.sin(angle/180*math.pi) * 20 + center[1]
    ImageDraw.floodfill(im, (int(x), int(y)), segment_color(i, n))

im.save(str(n) + '_pie.png')
Binary masks are created for each pie segment, and all white pixels within each mask are replaced with a defined edge color.
Using NumPy would still seem favorable, but I was curious to do it in Pillow only.
I have the following situation: I have a heightmap and several patches extracted from it via an affine mapping. I have applied a color mapping to a patch, and now I need to blend the patch back onto the heightmap at the correct coordinates. How can I do this? Below is the function I use for the extraction; basically, I need its inverse.
def extract_patch(image, center, theta, width, height):
    vx = (np.cos(theta), np.sin(theta))
    vy = (-np.sin(theta), np.cos(theta))
    sx = center[0] - vx[0] * (width / 2) - vy[0] * (height / 2)
    sy = center[1] - vx[1] * (width / 2) - vy[1] * (height / 2)
    mapping = np.array([[vx[0], vy[0], sx], [vx[1], vy[1], sy]])
    return cv2.warpAffine(image, mapping, (width, height),
                          flags=cv2.WARP_INVERSE_MAP, borderMode=cv2.BORDER_REPLICATE)
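Since extract_patch passes cv2.WARP_INVERSE_MAP, the mapping matrix sends patch coordinates to image coordinates, so one possible inverse (a sketch, untested against the original data, and assuming the heightmap has the same number of channels as the color-mapped patch) is to warp the patch with the same matrix but without that flag, and use a warped all-ones mask to pick the pixels to overwrite:

import numpy as np
import cv2

def blend_patch(heightmap, patch, center, theta, width, height):
    # Same matrix as in extract_patch
    vx = (np.cos(theta), np.sin(theta))
    vy = (-np.sin(theta), np.cos(theta))
    sx = center[0] - vx[0] * (width / 2) - vy[0] * (height / 2)
    sy = center[1] - vx[1] * (width / 2) - vy[1] * (height / 2)
    mapping = np.array([[vx[0], vy[0], sx], [vx[1], vy[1], sy]])
    size = (heightmap.shape[1], heightmap.shape[0])
    # Forward direction this time: patch -> heightmap coordinates
    warped = cv2.warpAffine(patch, mapping, size)
    mask = cv2.warpAffine(np.ones(patch.shape[:2], np.uint8), mapping, size)
    out = heightmap.copy()
    out[mask > 0] = warped[mask > 0]
    return out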
I am having issues getting the correct translation values after rotating my image. The code I have so far calculates the bounding box for a given rotation using basic trigonometry, then applies a translation to the rotation matrix. The issue I am having, however, is that my translation always seems to be 1 pixel off; by that I mean I get a 1-pixel black border along the top or sides of my rotated image.
Here is my code:
def rotate_image(mat, angle):
    height, width = mat.shape[:2]
    image_center = (width / 2.0, height / 2.0)
    rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)

    # Get Bounding Box
    radians = math.radians(angle)
    sin = abs(math.sin(radians))
    cos = abs(math.cos(radians))
    bound_w = (width * cos) + (height * sin)
    bound_h = (width * sin) + (height * cos)

    # Set Translation
    rotation_mat[0, 2] += (bound_w / 2.0) - image_center[0]
    rotation_mat[1, 2] += (bound_h / 2.0) - image_center[1]

    rotated_mat = cv2.warpAffine(mat, rotation_mat, (int(bound_w), int(bound_h)))
    return rotated_mat
Here is the original image for reference and some examples of the image using that code:
coffee.png – Original
coffee.png - 90° - Notice the 1px border across the top
coffee.png - 180° - Notice the 1px border across the top and left
I am not so hot on my math, but I hazard a guess that this is caused by some rounding issue, as we're dealing with floating-point numbers.
I would like to know what methods other people use. What would be the simplest and most performant way to rotate and translate an image about its center point, please?
Thank you.
EDIT
As per Falko's answer, I was not using a zero-based calculation. My corrected code is as follows:
def rotate_image(mat, angle):
    height, width = mat.shape[:2]
    image_center = ((width - 1) / 2.0, (height - 1) / 2.0)
    rotation_mat = cv2.getRotationMatrix2D(image_center, angle, 1.0)

    # Get Bounding Box
    radians = math.radians(angle)
    sin = abs(math.sin(radians))
    cos = abs(math.cos(radians))
    bound_w = (width * cos) + (height * sin)
    bound_h = (width * sin) + (height * cos)

    # Set Translation
    rotation_mat[0, 2] += ((bound_w - 1) / 2.0 - image_center[0])
    rotation_mat[1, 2] += ((bound_h - 1) / 2.0 - image_center[1])

    rotated_mat = cv2.warpAffine(mat, rotation_mat, (int(bound_w), int(bound_h)))
    return rotated_mat
I'd still appreciate seeing alternative methods people are using to perform rotation and translation! :)
I guess your image center is wrong. Take, e.g., a 4x4 image with columns 0, 1, 2 and 3. Then your center is computed as 4 / 2 = 2. But it should be 1.5, between columns 1 and 2.
So you better use (width - 1) / 2.0 and (height - 1) / 2.0.
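A quick toy check of that (my own example, not from the original post): rotating a 4x4 array by 90 degrees with both centers shows the difference; with 4 / 2 = 2.0 one border row remains zero padding, while with (4 - 1) / 2 = 1.5 the result is an exact permutation of the input.

import numpy as np
import cv2

img = np.arange(1, 17, dtype=np.uint8).reshape(4, 4)   # values 1..16, so 0 means padding
for c in (4 / 2.0, (4 - 1) / 2.0):
    m = cv2.getRotationMatrix2D((c, c), 90, 1.0)
    print(c, '\n', cv2.warpAffine(img, m, (4, 4)))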