Real distance between two points over image - python

I have two points in a 2D space:
(255.62746737327373, 257.61185343423432)
(247.86430198019812, 450.74937623762395)
Plotting them over a PNG with matplotlib I get this result:
Now I would like to calculate the real distance (in meters) between these two points. I know that the real dimension of that image is 125 meters x 86 meters.
How can I do this in some way?

Let ImageDim be the length of the image in the x and y coordinates. In this case it would be ImageDim = (700, 500), and let StadionDim be the length of the stadium: StadionDim = (125, 86).
So the function to calculate the point in the stadium that corresponds to a point in the image would be:
def calc(ImageDim, StadionDim, Point):
    return (Point[0] * StadionDim[0]/ImageDim[0], Point[1] * StadionDim[1]/ImageDim[1])
So now you would get two points in the stadium. Calculate the distance:
from math import sqrt

Point_one = calc((700,500), (125,86), (257, 255))
Point_two = calc((700,500), (125,86), (450, 247))
Distance = sqrt((Point_one[0]-Point_two[0])**2 + (Point_one[1]-Point_two[1])**2)
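Plugging the exact points from the question into the same function gives the distance directly in meters (assuming the image really is 700x500 pixels, as above):

p1 = calc((700, 500), (125, 86), (255.62746737327373, 257.61185343423432))
p2 = calc((700, 500), (125, 86), (247.86430198019812, 450.74937623762395))
print(sqrt((p1[0] - p2[0])**2 + (p1[1] - p2[1])**2))  # ~33.25 meters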

I believe your input coordinates are in world space. But when you plot the image without any scaling, your plot coordinates are in image space, from (0, 0) in the bottom-left corner to (image_width, image_height) in the top-right corner. So to plot your points correctly on the image, you need to transform them into image space, and vice versa whenever a calculation in real-world space is needed. I suppose you will not want to calculate, let's say, a soccer ball's speed in pixels per second, but rather in meters per second.
So why not draw the image in world coordinates and avoid the pain of converting between the two coordinate spaces? You can do it easily in matplotlib: use the extent parameter.
extent : scalars (left, right, bottom, top), optional, default: None
The location, in data-coordinates, of the lower-left and upper-right corners. If None, the image is positioned such that the pixel centers fall on zero-based (row, column) indices.
For example this way:
imshow(image_data, origin='upper', extent=[0, field_width, 0, field_height])
Then you may plot your points on image in world coordinates. Also the distance calculation will become clear:
import math

dx = x2 - x1
dy = y2 - y1
distance = math.sqrt(dx*dx + dy*dy)
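Putting it together, a minimal end-to-end sketch (the file name and the two world-space points here are assumptions for illustration):

import math
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

field_width, field_height = 125, 86          # real-world size in meters

img = mpimg.imread('stadium.png')            # hypothetical file name
plt.imshow(img, origin='upper', extent=[0, field_width, 0, field_height])

# Points expressed directly in world coordinates (meters)
x1, y1 = 45.6, 44.3
x2, y2 = 44.3, 77.5
plt.plot([x1, x2], [y1, y2], 'ro-')
plt.show()

dx = x2 - x1
dy = y2 - y1
print(math.sqrt(dx*dx + dy*dy))              # distance directly in meters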

Related

How to convert screen x,y (cartesian coordinates) to 3D world space crosshair movement angles (screenToWorld)?

Recently I've been playing around with computer vision and neural networks, and came across experimental object detection within a 3D application. But, surprisingly to me, I've run into the issue of converting one coordinate system to another (AFAIK cartesian to polar/spherical).
Let me explain.
For example, we have a screenshot of a 3D application window (some 3D game):
Now, using OpenCV or a neural network, I'm able to detect the round spheres (in-game targets), as well as their X, Y coordinates within the game window (x, y offsets).
And if I programmatically move the mouse cursor to the given X, Y coordinates in order to aim at one of the targets, it works only in the desktop environment (moving the cursor on the desktop).
But when I switch to the 3D game, so that my mouse cursor is now within the 3D game world environment, it does not work and does not aim at the target.
So I did some research on the topic, and what I found is that the mouse cursor is locked inside the 3D game.
Because of this, we cannot move the cursor using the MOUSEEVENTF_MOVE (0x0001) + MOUSEEVENTF_ABSOLUTE (0x8000) flags within the mouse_event win32 call; we are only able to move the mouse programmatically using relative movement.
Theoretically, to get these relative mouse movement offsets, we can calculate the offset of a detection from the middle of the 3D game window.
In that case the relative movement vector would be something like (x=-100, y=0) if the target point is 100px left of the middle of the screen.
The thing is, the crosshair inside the 3D game will not move 100px to the left as expected, and will not aim at the given target; it only moves a bit in the given direction.
After that, I did more research on the topic.
As I understand it, the crosshair inside a 3D game moves using angles in 3D space; specifically, there are only two of them: a horizontal movement angle and a vertical movement angle.
The game engine takes our mouse movement and converts it into movement angles within the given 3D world space, and that's how crosshair movement is done inside a 3D game.
But we don't have access to that; all we can do is move the mouse with win32 calls externally.
So I decided to calculate pixels per degree (the number of pixels we need to pass to a win32 relative mouse movement in order to rotate the crosshair by 1 degree inside the game).
To do this, I wrote down a simple calculation algorithm.
Here it is:
As you can see, we need to move the mouse relatively with win32 by 16400 pixels horizontally in order to rotate the crosshair inside the game by 360 degrees.
And indeed, it works: 16400/2 rotates the crosshair by 180 degrees, respectively.
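The original measurement was shown as an image; below is a minimal sketch of the same idea, assuming Windows, where the relative mouse_event moves reach the game (the step size is illustrative):

import ctypes
import time

MOUSEEVENTF_MOVE = 0x0001

def move_relative(dx, dy):
    # Relative mouse move via the win32 mouse_event call
    ctypes.windll.user32.mouse_event(MOUSEEVENTF_MOVE, dx, dy, 0, 0)

# Sweep a known number of pixels in small steps and observe that the
# in-game crosshair completes exactly one 360-degree turn.
total_pixels = 16400  # found by trial for this particular game
for _ in range(total_pixels // 100):
    move_relative(100, 0)
    time.sleep(0.01)  # give the game time to process the input

pixels_per_degree = total_pixels / 360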
What I did next was to convert our screen X, Y target offset coordinates to percentages (from the middle of the screen), and then convert them to degrees.
The overall formula looked like this (for horizontal movement only):
w = 100          # screen width
x_offset = 10    # target x offset
hor_fov = 106.26
degs = (hor_fov/2) * (x_offset/w)  # 5.313 degrees
And indeed, it worked! But not quite as expected: the overall aiming precision differed depending on how far the target was from the middle of the screen.
I'm not that great with trigonometry, but as far as I can tell, this has something to do with polar/spherical coordinates, because we can see only some part of the game world both horizontally and vertically.
It's also called the FOV (field of view): in the given 3D game we are only able to view 106.26 degrees horizontally and 73.74 degrees vertically.
My guess is that I'm trying to convert coordinates from a linear system to something non-linear, and as a result the overall accuracy is not good enough.
I've also tried to use math.atan in Python, and it works, but still, not accurately.
Here is the code:
from math import atan, degrees

def point_get_difference(source_point, dest_point):
    # source_point = (960, 540)
    # dest_point = (833, 645)
    # result = (-127, 105)
    x = dest_point[0] - source_point[0]
    y = dest_point[1] - source_point[1]
    return x, y

def get_move_angle__new(aim_target, gwr, pixels_per_degree, fov):
    game_window_rect__center = (gwr[2]/2, gwr[3]/2)
    rel_diff = list(point_get_difference(game_window_rect__center, aim_target))

    x_degs = degrees(atan(rel_diff[0]/game_window_rect__center[0])) * ((fov[0]/2)/45)
    y_degs = degrees(atan(rel_diff[1]/game_window_rect__center[0])) * ((fov[1]/2)/45)
    rel_diff[0] = pixels_per_degree * x_degs
    rel_diff[1] = pixels_per_degree * y_degs

    return rel_diff, (x_degs+y_degs)

get_move_angle__new((900, 540), (0, 0, 1920, 1080), 16364/360, (106.26, 73.74))
# Output will be: ([-191.93420990140876, 0.0], -4.222458785413539)
# But it's not accurate; overall, x_degs must be more or less than -4.22...
Is there a way to precisely convert 2D screen X, Y coordinates into 3D game crosshair movement degrees?
There must be a way, I just can't figure it out ...
The half-way point between the center and the edge of the screen is not equal to the field of view divided by four. As you noticed, the relationship is nonlinear.
The angle between a fractional position on the screen (0-1) and the middle of the screen can be calculated as follows. This is for the horizontal rotation (i.e., around the vertical axis), so we're only considering the X position on the screen.
# angle is the angle in radians that the camera needs to
# rotate to aim at the point
# x is the point's x position on the screen, normalised by
# the resolution (0.0 for the left-most pixel, 0.5 for
# the centre and 1.0 for the right-most)
# FOV is the field of view in the x dimension in radians
angle = math.atan((x-0.5)*2*math.tan(FOV/2))
For a field of view of 100 degrees and an x of zero, that gives us -50 degrees of rotation (exactly half the field of view). For an x of 0.25 (half-way between the edge and middle), we get a rotation of around -31 degrees.
Note that the 2*math.tan(FOV/2) part is constant for any given field of view, so you can calculate it in advance and store it. Then it just becomes (assuming we named it z):
angle = math.atan((x-0.5)*z)
Just do that for both x and y and it should work.
Edit / update:
Here is a complete function. I've tested it, and it seems to work.
import math

def get_angles(aim_target, window_size, fov):
    """
    Get (x, y) angles from center of image to aim_target.

    Args:
        aim_target: pair of numbers (x, y) where to aim
        window_size: size of area (x, y)
        fov: field of view in degrees, (horizontal, vertical)

    Returns:
        Pair of floating point angles (x, y) in degrees
    """
    fov = (math.radians(fov[0]), math.radians(fov[1]))

    x_pos = aim_target[0]/(window_size[0]-1)
    y_pos = aim_target[1]/(window_size[1]-1)

    x_angle = math.atan((x_pos-0.5)*2*math.tan(fov[0]/2))
    y_angle = math.atan((y_pos-0.5)*2*math.tan(fov[1]/2))

    return (math.degrees(x_angle), math.degrees(y_angle))

print(get_angles(
    (0, 0), (1920, 1080), (100, 67.67)
), "should be around -50, -33.835")
print(get_angles(
    (1919, 1079), (1920, 1080), (100, 67.67)
), "should be around 50, 33.835")
print(get_angles(
    (959.5, 539.5), (1920, 1080), (100, 67.67)
), "should be around 0, 0")
print(get_angles(
    (479.75, 269.75), (1920, 1080), (100, 67.67)
), "should be around -30.79, -18.53")

how to convert normalized coordinates to pixel coordinates?

What is the most effective way to translate from normalized coordinates (x = 0..1, y = 0..1) to pixel coordinates (x = 0..1920, y = 0..1080)?
Or is there even a way to do it inside Python?
I have no idea where to even start, because I'm trying to get coordinates from mediapipe's pose detection module and then track my mouse cursor to them, but mediapipe uses normalized coordinates and all of the mouse manipulation modules use pixel coordinates.
Thanks, best regards
If I am understanding correctly, you are given two decimal numbers between 0 and 1 and asked to scale them to fit the screen.
For simplicity, let's just focus on the x axis. You have a ratio that represents how far across the screen your mouse is. For instance, 0.75 means you are 75% of the way across the screen. In order to convert to pixel coordinates, just multiply this percentage by the screen width. The same method can be applied to the y axis, just use the screen height instead of the screen width.
test_coord = (0.5, 0.3)
SCREEN_DIMENSIONS = (1920, 1080)

def to_pixel_coords(relative_coords):
    return tuple(round(coord * dimension) for coord, dimension in zip(relative_coords, SCREEN_DIMENSIONS))

print(to_pixel_coords(test_coord))  # prints (960, 324)

Reconstructing a flying object's 3D trajectory off a single 2D-video

I am trying to reconstruct the basketball's 3D trajectory, using solely the broadcast feed.
To do this, I had to calculate the homography matrix, so in each frame I successfully tracked the ball and 6 points whose locations are known in the "real world" (4 on the court itself, and 2 on the backboard), as seen in the picture.
Using the laws of physics I've also approximated the z-coordinate of the ball in every frame.
Now I want to map the ball's location from 2D pixel coordinates to the real world. The code I have right now (attached below) takes the pixel location (u, v) and height (z) as input and outputs the (x, y, z) location. It works well for points on the court (meaning z = 0); however, when I need to track something in the air (the ball), the results don't make sense. If anyone can help tell me what I need to do to get this mapping right, I would appreciate it a lot.
import numpy as np
import cv2

# Make empty list for ball's 3D location
ball_3d_location = []

# Fixed things
size = frame_list[0].shape
focal_length = size[1]
center = (size[1]/2, size[0]/2)
camera_matrix = np.array(
    [[focal_length, 0, center[0]],
     [0, focal_length, center[1]],
     [0, 0, 1]], dtype="double"
)

def groundProjectPoint(image_point, z=0.0):
    camMat = np.asarray(camera_matrix)
    iRot = np.linalg.inv(rotMat)
    iCam = np.linalg.inv(camMat)

    uvPoint = np.ones((3, 1))
    # Image point
    uvPoint[0, 0] = image_point[0]
    uvPoint[1, 0] = image_point[1]

    tempMat = np.matmul(np.matmul(iRot, iCam), uvPoint)
    tempMat2 = np.matmul(iRot, translation_vector)

    s = (z + tempMat2[2, 0]) / tempMat[2, 0]
    wcPoint = np.matmul(iRot, (np.matmul(s * iCam, uvPoint) - translation_vector))

    # wcPoint[2] will not be exactly equal to z, but very close to it
    assert int(abs(wcPoint[2] - z) * (10 ** 8)) == 0
    wcPoint[2] = z

    return wcPoint

dist_coeffs = np.zeros((4, 1))  # Assuming no lens distortion

# The tracked points' coordinates in the "real world"
model_points = np.array([
    (0, 1524/2, 0),           # Baseline-sideline
    (0, -244, 0),             # Paint-sideline
    (579, -244, 0),           # Paint-FT
    (579, 1524/2, 0),         # Sideline-FT
    (122, -182.9/2, 396.32),  # Top left backboard
    (122, 182.9/2, 396.32)],  # Top right backboard
    dtype=np.float32
)

for i, frame in enumerate(bball_frames):
    f = frame
    # This array has the pixel coordinates of the court & backboard points
    image_points = np.array([f.baseline_sideline,
                             f.paint_sideline,
                             f.paint_ft,
                             f.sideline_ft,
                             f.top_left_backboard,
                             f.top_right_backboard], dtype=np.float32)

    (success, rotation_vector, translation_vector) = cv2.solvePnP(
        model_points, image_points, camera_matrix, dist_coeffs,
        flags=cv2.SOLVEPNP_ITERATIVE)
    rotMat, _ = cv2.Rodrigues(rotation_vector)

    # We assume we know the ball's height in each frame due to the laws of physics.
    ball_3d_location += [groundProjectPoint(image_point=ball_2d_location[i], z=ball_height[i])]
EDIT:
First, I want to clarify the planes of reference:
The video you have is a 2D projection (viewing plane) of the 3D world, as a plane perpendicular to the centerline of the camera lens.
The shot arc is embedded in a plane (shot plane) which is perpendicular to the real-world (3D) floor, defined by the point of release (shooter's hand) and point of contact (backboard).
The shot arc you see on the video is the projection of that shot plane onto the viewing plane.
I want to make sure we're clear with respect to your most recent comment: "So let's say I can estimate the shooting location on the court (x, y). Using the laws of physics I can say where the ball is in each frame, (x, y)-wise, and then from that and the pixel coordinates I can extract the height coordinate?"
You can, indeed, estimate the (x,y) coordinate. However, I would not ascribe my approach to "the laws of physics". I would use analytic geometry.
You can estimate, with good accuracy, the 3D coordinates of both the release point (from the known (x, y, 0) position of the shooter's feet) and the end point on the backboard (whose corners are known).
Drop a perpendicular from each of these points to the floor (z=0). The line between those two points on the floor is the vertical projection of the arc onto the floor; along it lie the (x, y) coordinates of the ball in flight.
For each video frame, drop a projected perpendicular from the ball's image to that line on the floor; that gives you the (x, y) coordinates of the ball, for what it's worth.
You have the definition (equation) of the view plane, the viewpoint (camera), and the arc plane. To determine the ball's position for each video frame, draw a line from the viewpoint, through the ball's image on the view plane. Determine the intersection of this line with the arc plane. That gives you the 3D coordinates of the ball in that frame.
Does that clarify a useful line of attack?
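A minimal sketch of that last step, assuming you already have iRot, iCam and translation_vector from the solvePnP step above, and a shot plane given by a point on it and its (horizontal) normal; the function and variable names here are hypothetical:

import numpy as np

def intersect_ray_plane(cam_center, ray_dir, plane_point, plane_normal):
    # Solve for s so that cam_center + s * ray_dir lies on the plane
    s = np.dot(plane_normal, plane_point - cam_center) / np.dot(plane_normal, ray_dir)
    return cam_center + s * ray_dir

def ball_world_position(uv_pixel, iRot, iCam, translation_vector,
                        plane_point, plane_normal):
    # Back-project the ball pixel into a world-space viewing ray,
    # then intersect that ray with the shot plane.
    uv = np.array([uv_pixel[0], uv_pixel[1], 1.0])
    ray_dir = (iRot @ (iCam @ uv)).ravel()             # ray direction in world space
    cam_center = (-iRot @ translation_vector).ravel()  # camera center in world space
    return intersect_ray_plane(cam_center, ray_dir, plane_point, plane_normal)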

Measuring width of contour at given angle in OpenCv

Given a contour in OpenCV, I can extract the width and height by using cv2.boundingRect(contour). This returns the width and height of a bounding rectangle, as illustrated by the left figure.
Given an angle, is it possible to extract the width/height of a rotated bounding rectangle, as illustrated in the right figure?
I am trying to measure the length of moving objects, but I need to measure the length in the movement direction, which may sometimes be up to 45 degrees from the horizontal line.
I know there is a way of getting a rotated bounding rectangle, using cv2.minAreaRect and cv2.boxPoints, but the rotation I need will not always match the minimum area rectangle, so I need to be able to specify the angle somehow.
I only need the rotated width and height values, I don't really need the rotated contour, if that makes it easier.
As my comment: Why not? Suppose you have an angle; then you can construct two orthogonal directions. Project the points onto each direction, then compute max - min of the projections; that gives you the width and height.
My mother language is Chinese, and I'm not good at English writing. So I'll just turn it into code:
#!/usr/bin/python3
# 2017.12.13 22:50:16 CST
# 2017.12.14 00:13:41 CST
import numpy as np

def calcPointsWH(pts, theta=0):
    # Measure the width/height of discrete points at a given angle
    th = theta * np.pi / 180
    # Two orthogonal unit directions at angle theta
    es = np.array([
        [np.cos(th),  np.sin(th)],
        [-np.sin(th), np.cos(th)],
    ]).T
    # Project the points onto both directions; the spread of the
    # projections is the width/height at this angle
    dists = np.dot(pts, es)
    wh = dists.max(axis=0) - dists.min(axis=0)
    print("==> theta: {}\n{}".format(theta, wh))
Take this diamond for testing:
pts = np.array([[100, 200],[200, 26],[300, 200],[200, 373]], np.int32)

for theta in range(0, 91, 30):
    calcPointsWH(pts, theta)
==> theta: 0
[ 200. 347.]
==> theta: 30
[ 173.60254038 300.51081511]
==> theta: 60
[ 300.51081511 173.60254038]
==> theta: 90
[ 347. 200.]
Now it's 2017.12.14 00:20:55 CST, goodnight.
Use
cv2.minAreaRect(cnt)
You can find a complete example and explanation here.
Edit: minAreaRect is actually what you need; what's the problem with taking the width and height from an oriented rectangle?
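For reference, a minimal sketch of reading the oriented width, height and angle (the contour here is just the diamond from the other answer, reshaped to OpenCV's contour layout):

import cv2
import numpy as np

cnt = np.array([[100, 200], [200, 26], [300, 200], [200, 373]], np.int32).reshape(-1, 1, 2)

(cx, cy), (w, h), angle = cv2.minAreaRect(cnt)
print(w, h, angle)                               # oriented width, height, angle

box = cv2.boxPoints(((cx, cy), (w, h), angle))   # the 4 corner points, if needed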

How do I rotate an isometric camera around the screen center?

I have a tile-based game project and got a nice pseudo-3D effect using multiple tile layers, but I would like to be able to rotate the camera (essentially, rotating the sprites).
But simply rotating the sprites isn't equal to rotating the world, right?
By that, I got to:
x = (x - y) * (tile_width/2);
y = (x + y) * (tile_width/2);
But, see? That only works for 45-degree rotated tiles! How can I modify the angle used in those formulas (or maybe find a better, more appropriate one)?
Rotating the sprites is only part of rotating the world/camera.
To rotate the world/camera, each tile needs to be moved along an arc, and rotated at the same time. To do that you need to use polar coordinates. Compute the distance and angle from the center-of-rotation to the center of each tile. Then add the desired angle-of-rotation to the polar angle for each tile. Compute the new x and y values for the center of the tile by converting back to cartesian coordinates.
Here's what the code might look like, assuming that each tile is represented by a struct, and that the struct has the original x and y coordinates of the center of the tile (i.e. the coordinates of the center of the tile when the world is not rotated).
// compute the distance from the center of the tile to the center of rotation
double dx = tile[t].x - centerOfRotation.x;
double dy = tile[t].y - centerOfRotation.y;
// compute the new location of tile in polar coordinates relative to the center of rotation
double r = sqrt(dx*dx + dy*dy);
double angle = atan2(dy, dx) + angleOfRotation;
// compute the display location for the tile
x = centerOfRotation.x + r * cos(angle);
y = centerOfRotation.y + r * sin(angle);
// note that the tile itself needs to be rotated as well, using angleOfRotation
// also, all angles are in radians
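Equivalently, the polar round-trip collapses into a single rotation about the center; a small Python sketch of the same computation (names are illustrative):

import math

def rotate_about(px, py, cx, cy, angle):
    # Rotate point (px, py) around center (cx, cy) by angle (radians)
    dx, dy = px - cx, py - cy
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (cx + dx * cos_a - dy * sin_a,
            cy + dx * sin_a + dy * cos_a)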
When attempting to rotate graphics, it helps to keep things simple: use a small number of tiles, make the tiles different colors with strong borders, and add an identifying mark to show orientation. Here's an example with 9 tiles, in the initial orientation and rotated 30 degrees counter-clockwise.
Here are some examples of what can go wrong:
1) moving the tiles without rotating the tiles
2) rotating the tiles without moving the tiles
3) rotating the tiles clockwise while moving the tiles counter-clockwise
Bottom line: Pretty much any mistake will result in an image with gaps.
