What is the most effective way to translate from normalized coordinates (x = 0..1, y = 0..1) to pixel coordinates (x = 0..1920, y = 0..1080)? Or is there even a way to do it inside Python?
I have no idea where to even start. I'm trying to get coordinates from MediaPipe's pose detection module and then track my mouse cursor to them, but MediaPipe uses normalized coordinates and all of the mouse manipulation modules use pixel coordinates.
Thanks, best regards
If I am understanding correctly, you are given two decimal numbers between 0 and 1 and asked to scale them to fit the screen.
For simplicity, let's just focus on the x axis. You have a ratio that represents how far across the screen your mouse is. For instance, 0.75 means you are 75% of the way across the screen. In order to convert to pixel coordinates, just multiply this percentage by the screen width. The same method can be applied to the y axis, just use the screen height instead of the screen width.
test_coord = (0.5, 0.3)
SCREEN_DIMENSIONS = (1920, 1080)

def to_pixel_coords(relative_coords):
    # scale each normalized coordinate by the matching screen dimension
    return tuple(round(coord * dimension)
                 for coord, dimension in zip(relative_coords, SCREEN_DIMENSIONS))

print(to_pixel_coords(test_coord))  # prints (960, 324)
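For the mouse side of this, a minimal sketch assuming pyautogui (the landmark value is hypothetical; any mouse module that accepts pixel coordinates works the same way):

import pyautogui

# tip: pyautogui.size() returns the real screen dimensions if you
# don't want to hard-code SCREEN_DIMENSIONS
x, y = to_pixel_coords((0.5, 0.3))  # hypothetical normalized MediaPipe landmark
pyautogui.moveTo(x, y)              # mouse APIs like this take pixel coordinates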
Recently I've been playing around with computer vision and neural networks, and I came across experimental object detection within a 3D application.
But, surprisingly to me, I ran into the issue of converting one coordinate system to another (AFAIK Cartesian to polar/spherical). Let me explain.
For example, we have a screenshot of a 3D application window (some 3D game).
Using OpenCV or a neural network, I'm able to detect the round spheres (in-game targets), as well as their X, Y coordinates within the game window (x, y offsets).
If I programmatically move the mouse cursor to the given X, Y coordinates in order to aim at one of the targets, it only works while I'm in the desktop environment (moving the cursor on the desktop).
But when I switch to the 3D game, so that my mouse cursor is now inside the 3D game world, it does not work and does not aim at the target.
So I did some research on the topic, and what I found is that the mouse cursor is locked inside the 3D game.
Because of this, we cannot move the cursor using the MOUSEEVENTF_MOVE (0x0001) + MOUSEEVENTF_ABSOLUTE (0x8000) flags with the mouse_event win32 call. We can only move the mouse programmatically using relative movement.
Theoretically, to get these relative movement offsets, we can calculate the offset of a detection from the middle of the 3D game window. In that case, the relative movement vector would be something like (x=-100, y=0) if the target point is 100px to the left of the middle of the screen.
The thing is, the crosshair inside the 3D game will not move 100px to the left as expected, and will not aim at the given target. It will only move a bit in the given direction.
After that, I did more research on the topic. As I understand it, the crosshair inside a 3D game moves using angles in 3D space; specifically, there are only two of them: a horizontal movement angle and a vertical movement angle.
So the game engine takes our mouse movement and converts it into movement angles within the given 3D world space. That's how crosshair movement is done inside a 3D game.
But we don't have access to that; all we can do is move the mouse externally with win32 calls.
Then I decided to somehow calculate pixels per degree: the number of pixels we need to pass to win32 relative mouse movement in order to move the crosshair by 1 degree inside the game. To do this, I wrote a simple calculation algorithm.
Here it is:
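(The original snippet isn't reproduced here; below is a minimal sketch of the calibration idea, assuming Windows and the raw mouse_event call via ctypes. The step size and loop count are hypothetical and will differ per game.)

import ctypes
import time

MOUSEEVENTF_MOVE = 0x0001

def move_relative(dx, dy):
    # relative mouse movement via the win32 mouse_event call
    ctypes.windll.user32.mouse_event(MOUSEEVENTF_MOVE, dx, dy, 0, 0)

# Move in small steps until the crosshair completes a full 360-degree
# turn (verified by eye against a landmark), then sum the steps.
total_pixels = 0
step = 50                 # hypothetical step size
for _ in range(328):      # 328 * 50 = 16400 pixels for a full turn in this game
    move_relative(step, 0)
    total_pixels += step
    time.sleep(0.005)     # give the game time to process the input

pixels_per_degree = total_pixels / 360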
As you can see, we need to move our mouse relatively with win32 by 16400 pixels horizontally in order to move the crosshair inside our game by 360 degrees. And indeed, it works: 16400/2 moves the crosshair by 180 degrees, as expected.
What I did next was to convert the screen X, Y target offset coordinates to percentages (from the middle of the screen), and then convert those to degrees. The overall formula looked like this (example for horizontal movement only):
w = 100           # screen width
x_offset = 10     # target x offset
hor_fov = 106.26  # horizontal field of view, in degrees
degs = (hor_fov/2) * (x_offset/w)  # 5.313 degrees
And indeed, it worked! But not quite as expected: the overall aiming precision varied depending on how far the target was from the middle of the screen.
I'm not that great with trigonometry, but as far as I can tell, this has something to do with polar/spherical coordinates, because we can see only part of the game world both horizontally and vertically. That visible part is the FOV (field of view): in this 3D game we can only see 106.26 degrees horizontally and 73.74 degrees vertically.
My guess is that I'm trying to convert coordinates from a linear system to something non-linear, and as a result the overall accuracy is not good enough.
I've also tried using math.atan in Python. It works, but it is still not accurate. Here is the code:
from math import atan, degrees

def point_get_difference(source_point, dest_point):
    # e.g. source_point = (960, 540), dest_point = (833, 645) -> (-127, 105)
    x = dest_point[0] - source_point[0]
    y = dest_point[1] - source_point[1]
    return x, y

def get_move_angle__new(aim_target, gwr, pixels_per_degree, fov):
    game_window_rect__center = (gwr[2]/2, gwr[3]/2)
    rel_diff = list(point_get_difference(game_window_rect__center, aim_target))

    x_degs = degrees(atan(rel_diff[0]/game_window_rect__center[0])) * ((fov[0]/2)/45)
    y_degs = degrees(atan(rel_diff[1]/game_window_rect__center[0])) * ((fov[1]/2)/45)

    rel_diff[0] = pixels_per_degree * x_degs
    rel_diff[1] = pixels_per_degree * y_degs
    return rel_diff, (x_degs + y_degs)

get_move_angle__new((900, 540), (0, 0, 1920, 1080), 16364/360, (106.26, 73.74))
# Output: ([-191.93420990140876, 0.0], -4.222458785413539)
# But it's not accurate; the true x_degs must be more or less than -4.22...
Is there a way to precisely convert 2D screen X, Y coordinates into 3D game crosshair movement degrees?
There must be a way, I just can't figure it out ...
The half-way point between the center and the edge of the screen is not equal to the field of view divided by four. As you noticed, the relationship is nonlinear.
The angle between a fractional position on the screen (0-1) and the middle of the screen can be calculated as follows. This is for the horizontal rotation (i.e., around the vertical axis), so we're only considering the x position on the screen.
# angle is the angle in radians that the camera needs to
# rotate to aim at the point
# x is the point's x position on the screen, normalised by
# the resolution (so 0.0 for the left-most pixel, 0.5 for
# the centre and 1.0 for the right-most)
# FOV is the field of view in the x dimension, in radians
angle = math.atan((x-0.5)*2*math.tan(FOV/2))
For a field of view of 100 degrees and an x of zero, that gives us -50 degrees of rotation (exactly half the field of view). For an x of 0.25 (half-way between the edge and middle), we get a rotation of around -31 degrees.
Note that the 2*math.tan(FOV/2) part is constant for any given field of view, so you can calculate it in advance and store it. Then it just becomes (assuming we named it z):
angle = math.atan((x-0.5)*z)
Just do that for both x and y and it should work.
Edit / update:
Here is a complete function. I've tested it, and it seems to work.
import math

def get_angles(aim_target, window_size, fov):
    """
    Get (x, y) angles from center of image to aim_target.

    Args:
        aim_target: pair of numbers (x, y) where to aim
        window_size: size of area (x, y)
        fov: field of view in degrees, (horizontal, vertical)

    Returns:
        Pair of floating point angles (x, y) in degrees
    """
    fov = (math.radians(fov[0]), math.radians(fov[1]))

    x_pos = aim_target[0]/(window_size[0]-1)
    y_pos = aim_target[1]/(window_size[1]-1)

    x_angle = math.atan((x_pos-0.5)*2*math.tan(fov[0]/2))
    y_angle = math.atan((y_pos-0.5)*2*math.tan(fov[1]/2))

    return (math.degrees(x_angle), math.degrees(y_angle))

print(get_angles(
    (0, 0), (1920, 1080), (100, 67.67)
), "should be around -50, -33.835")

print(get_angles(
    (1919, 1079), (1920, 1080), (100, 67.67)
), "should be around 50, 33.835")

print(get_angles(
    (959.5, 539.5), (1920, 1080), (100, 67.67)
), "should be around 0, 0")

print(get_angles(
    (479.75, 269.75), (1920, 1080), (100, 67.67)
), "should be around -30.79, -18.53")
I want to conceal a few points on a plot. I am using patches to draw a rectangle, so is there any way of plotting a rectangle by just specifying its corners? I only know how to draw one with the height and width parameters:
patch = ax1.add_patch(patches.Rectangle((x, y), 0.3, 0.5))
How can I modify the code to draw a rectangle by just using coordinates like (x1,y1), (x2,y2), (x3,y3), (x4,y4)?
I assume that the coordinates are ordered in the following way:
top_left = [2, 2]
bottom_left = [2, 1]
top_right = [4, 2]
bottom_right = [4, 1]
So you can easily calculate the width and height and pass them to patches:
w = top_right[0] - top_left[0]
h = top_left[1] - bottom_left[1]
NOTE
If they are not ordered, the logic is simple: find two points whose x positions are identical, and the absolute difference of their y values is the height (symmetrically, two points with identical y positions give the width).
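For completeness, a small self-contained sketch of this approach; the corner values are the ones assumed above, and it handles unordered corners by taking the extreme x and y values:

import matplotlib.pyplot as plt
import matplotlib.patches as patches

corners = [(2, 2), (2, 1), (4, 2), (4, 1)]  # any order
xs = [x for x, _ in corners]
ys = [y for _, y in corners]
w = max(xs) - min(xs)  # width from the extreme x values
h = max(ys) - min(ys)  # height from the extreme y values

fig, ax1 = plt.subplots()
ax1.add_patch(patches.Rectangle((min(xs), min(ys)), w, h))
ax1.set_xlim(0, 6)
ax1.set_ylim(0, 4)
plt.show()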
The selected answer still just calculates the length and width (and ignores any angle if one was desired). It could be made to work by calculating the angle and adding that too, but it's still hacking around your intention if you've already calculated all of the vertices.
Another option you have is to just use the patches.Polygon class.
points = [(x1, y1), (x2, y2), (x3, y3), (x4, y4)]
rect = patches.Polygon(points, linewidth=1, edgecolor='r', facecolor='none')
ax.add_patch(rect)
will end up just drawing a rectangle if that's what those points specify. Note that the order of the points matters, but that isn't a big deal. Here is an image of where I did this: the green + markers are my calculated points, and the red rectangles are my polygons.
Here is a test program. I started with two random dots and the line connecting them. Now I want to take a given image (with x, y dimensions of 79 x 1080) and blit it on top of the guide line. I understand that arctan gives me the angle between the points on a Cartesian grid, but because the screen's y axis points downward, I have to negate some values. I'm confused about the negating step.
If you run this repeatedly, you'll see the image is always parallel to the line, and sometimes on top of it, but not consistently.
import math
import pygame
import random
pygame.init()
screen = pygame.display.set_mode((600,600))
#target = (126, 270)
#start = (234, 54)
target = (random.randrange(600), random.randrange(600))
start = (random.randrange(600), random.randrange(600))
BLACK = (0,0,0)
BLUE = (0,0,128)
GREEN = (0,128,0)
pygame.draw.circle(screen, GREEN, start, 15)
pygame.draw.circle(screen, BLUE, target, 15)
pygame.draw.line(screen, BLUE, start, target, 5)
route = pygame.Surface((79,1080))
route.set_colorkey(BLACK)
BMP = pygame.image.load('art/trade_route00.png').convert()
(bx, by, bwidth, bheight) = route.get_rect()
route.blit(BMP, (0,0), area=route.get_rect())
# get distance within screen in pixels
dist = math.sqrt((start[0] - target[0])**2 + (start[1] - target[1])**2)
# scale to fit: use distance between points, and make width extra skinny.
route = pygame.transform.scale(route, (int(bwidth * dist/bwidth * 0.05), int( bheight * dist/bheight)))
# and rotate... (invert, as negative is for clockwise)
angle = math.degrees(math.atan2(-1*(target[1]-start[1]), target[0]-start[0]))
route = pygame.transform.rotate(route, angle + 90 )
position = route.get_rect()
HERE = (abs(target[0] - position[2]), target[1]) # - position[3]/2)
print(HERE)
screen.blit(route, HERE)
pygame.display.update()
print(start, target, dist, angle, position)
The main problem
The error is not due to the inverted y coordinates (0 at top, max at bottom) during rotation, as you seem to think. That part is correct. The error is here:
HERE = (abs(target[0] - position[2]), target[1]) # - position[3]/2)
HERE must be the coordinates of the top-left corner of the rectangle bounding your green and blue dots connected by the blue line. That is where you need to place the Surface route after rescaling.
You can get this vertex by doing:
HERE = (min(start[0], target[0]), min(start[1], target[1]))
This should solve the problem, and your colored dots should lie on the blue line.
A side note
Another thing you might wish to fix is the scaling parameter of route:
route = pygame.transform.scale(route, (int(bwidth * dist/bwidth * 0.05), int(bheight * dist/bheight)))
If my guess is correct and you want to preserve the original width/height ratio in the rescaled route (since your original image is not square), this should be:
route = pygame.transform.scale(route, (int(dist * bwidth/bheight), int(dist)))
assuming that you want the height (the greater dimension in the original) to be scaled to dist. So you may not need the 0.05, or you can use a different shrinking factor (0.05 will probably shrink it too much).
I have two points in a 2D space:
(255.62746737327373, 257.61185343423432)
(247.86430198019812, 450.74937623762395)
Plotting them over a PNG with matplotlib, I get this result:
Now I would like to calculate the real distance (in meters) between these two points. I know that the real dimensions of that image are 125 meters x 86 meters.
How can I do this?
Let ImageDim be the size of the image in the x and y coordinates; in this case ImageDim = (700, 500). Let StadionDim be the size of the stadium: StadionDim = (125, 86).
The function to calculate the stadium point corresponding to an image point is then:
def calc(ImageDim, StadionDim, Point):
    return (Point[0] * StadionDim[0]/ImageDim[0], Point[1] * StadionDim[1]/ImageDim[1])
So now you get the two points in the stadium, and can calculate the distance:
from math import sqrt

Point_one = calc((700, 500), (125, 86), (257, 255))
Point_two = calc((700, 500), (125, 86), (450, 247))

Distance = sqrt((Point_one[0]-Point_two[0])**2 + (Point_one[1]-Point_two[1])**2)
I believe your input coordinates are in world space. But when you plot the image without any scaling, the plot coordinates are in image space, running from (0, 0) in the bottom left corner to (image_width, image_height) in the top right corner. So to plot your points correctly on the image, they need to be transformed into image space, and vice versa whenever a real-world calculation is needed; I suppose you don't want to calculate, say, a soccer ball's speed in pixels per second rather than meters per second.
So why not draw the image in world coordinates and avoid the pain of converting between the two coordinate spaces? You can do it easily in matplotlib with the extent parameter:
extent : scalars (left, right, bottom, top), optional, default: None
The location, in data-coordinates, of the lower-left and upper-right corners. If None, the image is positioned such that the pixel centers fall on zero-based (row, column) indices.
For example, this way (note the order is left, right, bottom, top):
imshow(image_data, origin='upper', extent=[0, field_width, 0, field_height])
Then you can plot your points on the image in world coordinates, and the distance calculation also becomes straightforward:
import math

dx = x2 - x1
dy = y2 - y1
distance = math.sqrt(dx*dx + dy*dy)
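Putting it together, a minimal sketch: stadium.png is a hypothetical file name, and the 700 x 500 pixel size is taken from the first answer.

import math
import matplotlib.pyplot as plt
import matplotlib.image as mpimg

FIELD_W, FIELD_H = 125, 86  # meters, from the question
IMG_W, IMG_H = 700, 500     # pixels, assumed as in the first answer

img = mpimg.imread('stadium.png')  # hypothetical file
plt.imshow(img, origin='upper', extent=[0, FIELD_W, 0, FIELD_H])

# the two points from the question, converted from image to world space
p1 = (255.627 * FIELD_W / IMG_W, 257.612 * FIELD_H / IMG_H)
p2 = (247.864 * FIELD_W / IMG_W, 450.749 * FIELD_H / IMG_H)
plt.plot([p1[0], p2[0]], [p1[1], p2[1]], 'ro-')

dx, dy = p2[0] - p1[0], p2[1] - p1[1]
print(math.sqrt(dx*dx + dy*dy), "meters")
plt.show()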
Say we have a photo frame like the one above.
Starting from the center, how would you find a rectangle with maximum area that can be used to draw in (all pixels in the rectangle must be rgb(255, 255, 255))?
I need to find the x and y coordinates of points A and B shown in the picture.
One of my approaches is this: starting from the center, expand the boundary like in the graph above. But I am not sure how to write loop(s) like that.
You should use the flood fill algorithm: link. I suggest using a set to store the pixels to be altered; that way the number of recursions can be reduced.
Edit: I obviously didn't read the question well. Still, flood fill could be used if you apply it to a circle that is expanded:
1. Start with a single pixel, the center of your circle.
2. Make the radius larger by 1 unit.
3. Find the pixels within your circle and get their colours using flood fill.
4. If they are all the same colour, go to step 2. If not, you have the radius; finding the rectangle is next.
This algorithm may give you a possible solution, but there may be more than one, depending on your frame; you should start developing with some simple frame where the correctness of the solution can be judged easily.
Edit: based on the comment, the problem is to find the largest-area axis-parallel rectangle in a polygon, and luckily there is a paper on this: here. It doesn't look like an easy task, though.
I would use brute force here: choose a y_bottom and a y_top, determine the corresponding x_left and x_right, and loop both y_bottom and y_top over the height of the picture. In pseudocode:
for y_bottom in range(0, H):
    for y_top in range(y_bottom, H):
        # assume that x = W/2 is part of the rectangle
        x_left  = minimum X such that all pixels in box (x_left, y_top, W/2, y_bottom) are white
        x_right = maximum X such that all pixels in box (W/2, y_top, x_right, y_bottom) are white
        determine Area of box (x_left, y_top, x_right, y_bottom)
        store this box if Area is larger than max found so far
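A minimal runnable sketch of this brute force, assuming the frame is given as a boolean NumPy mask (True where the pixel is white) and that the rectangle must contain the centre column x = W/2:

import numpy as np

def largest_white_rect_through_center(white):
    # white: (H, W) boolean array, True where the pixel is rgb(255, 255, 255)
    H, W = white.shape
    cx = W // 2
    best_area, best_box = 0, None
    for y_bottom in range(H):
        for y_top in range(y_bottom, H):
            rows = white[y_bottom:y_top + 1]
            if not rows[:, cx].all():
                break  # centre column hit a non-white pixel; taller boxes fail too
            # expand left and right from the centre while every row stays white
            x_left = cx
            while x_left > 0 and rows[:, x_left - 1].all():
                x_left -= 1
            x_right = cx
            while x_right < W - 1 and rows[:, x_right + 1].all():
                x_right += 1
            area = (x_right - x_left + 1) * (y_top - y_bottom + 1)
            if area > best_area:
                best_area, best_box = area, (x_left, y_bottom, x_right, y_top)
    return best_box, best_area

# example: a 9x9 frame with a white 5x7 interior
frame = np.zeros((9, 9), dtype=bool)
frame[2:7, 1:8] = True
print(largest_white_rect_through_center(frame))  # ((1, 2, 7, 6), 35)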