I've been trying all morning to figure this problem out, and eventually had to resort to SO.
I'm trying to rotate a set of 'objects' that have a 3D position and rotation (it's actually for another program, but I'm writing a quick Python tool to parse the data, rotate it how I want, and spit it back out).
In my program there are two classes:
class Object:
    def __init__(self, mod, px, py, pz, rx, ry, rz):
        self.mod = mod
        self.pos = [px, py, pz]
        self.rot = [rx, ry, rz]

    def rotate(self, axisx, axisy, axisz, rotx, roty, rotz):
        """rotates object around axis"""
        ?
This is my 'object' class (okay, I realise now how badly that's named!). Ignore 'mod'; the object is very simple, it just exists in space with a position and a rotation (in degrees).
I have no idea what to write in the rotate part. I sort of understand matrices, but only in their mathematical form; I've never actually written code for them, and I wondered if there are any libraries out there to help.
My other class is a simple group for these objects. The only other attribute is an averaged position which is actually the axis that I want to rotate each of the objects around:
class ObjectMap:
    def __init__(self, objs):
        self.objs = objs
        tpx = 0.0
        tpy = 0.0
        tpz = 0.0
        for obj in objs:
            tpx += obj.pos[0]
            tpy += obj.pos[1]
            tpz += obj.pos[2]
        # calculate average position for this object map
        # somewhere in the middle of all the objects
        self.apx = tpx / len(objs)
        self.apy = tpy / len(objs)
        self.apz = tpz / len(objs)

    def rotate(self, rotx, roty, rotz):
        """rotate the entire object map around the averaged position in the centre"""
        for o in self.objs:
            o.rotate(self.apx, self.apy, self.apz, rotx, roty, rotz)
As you can see, there is a rotate function for this class which simply runs through all the objects contained within it and rotates them about the "average position" axis which should be somewhere in the middle since it's an average.
I made a quick animation to better explain what I am after here:
http://puu.sh/i3DxU/adfe44a99d.gif
Where the sphere shapes are my "objects" and the shape in the middle is the axis they are rotating around (the apx, apy, apz coordinates of the ObjectMap class).
I tried to get a library (transformations.py) working, but it just wasn't working, so I abandoned that idea. I'm using Python 3 and have numpy installed, as I figured it would help. I've also tried numerous bits of code from the internet, but things just aren't working (or they're for old Python versions, or simply fail to install).
I'd love it if someone could point me in the right direction for getting these rotations working. Even just a link to an example of matrices in Python or a useful library would be great!
Edit: My original answer avoided pitch, roll, and yaw entirely. Based on a clarification of the question, it seems this code may be using data structures and/or APIs that require the use of pitch, roll, and yaw, so I will now try to address this requirement.
There are several ways to specify a rotation in a three-dimensional Cartesian coordinate system:
Euler angles (3 numeric parameters)
Axis and angle (4 numeric parameters)
Rotation matrix (9 numeric parameters)
Quaternions (4 numeric parameters)
Yaw, pitch, and roll are Euler angles
(at least according to any applicable definitions I know of those three terms).
But transformations.py
says there are 24 possible ways to interpret a sequence of three Euler angles,
and every single one of those interpretations has different results from
each of the others for at least some sequences of angles.
It's not obvious how to translate "yaw, pitch, and roll" to one of the
24 possible "axis sequences" in transformations.py.
In fact, unless you know exactly how the existing data/software you are
interfacing with applies yaw, pitch, and roll to objects that are to be
rotated, I don't think you can really say what "yaw, pitch, and roll" means in this application, and you are unlikely to guess the correct "axis sequence" to use in transformations.py.
I suspect this may be the main reason why you have not been able to get
transformations.py to work for you.
On top of all that ambiguity, it's unclear to me what the parameters
axisx, axisy, and axisz represent in
rotate(self, axisx, axisy, axisz, rotx, roty, rotz).
Conventionally, yaw, pitch, and roll refer to rotations about three axes
of the body being rotated, and generally one is supposed to define
what those axes are and the order in which the rotations are applied
before doing any rotations, and never change those definitions.
So it really makes no sense to specify axes every time one has to
do another rotation; the software should already know exactly which
axes to use, even if they are body axes rather than world axes.
(I'm assuming now that each of the parameters axisx, axisy, and axisz
is an axis by itself, and that these three parameters are not somehow
being used to specify a single axis as I assumed in my initial answer.)
To add to the confusion, while pitch, roll, and yaw are typically applied
to body axes, you are supposed to be rotating an entire ensemble of objects,
which seems to imply you should be rotating around world axes rather than
individual body axes.
In practical terms, once you figure out what yaw, pitch, and roll
really mean in your application, and what the parameters of your
rotate function are supposed to mean, the first thing I would do with
any rotation is to convert it to a representation that is not
any kind of Euler angles. Rotation matrices look like a good choice.
If you know the correct "axis sequence" that represents your definition
of yaw, pitch and roll in transformations.py, euler_matrix promises
to compute that matrix for you.
You can further rotate objects by doing a matrix multiplication of the
new rotation matrix and the matrix of the existing rotation;
the result is a third matrix.
The new matrix goes on the left in the multiplication if it is a rotation in world coordinates, on the right if it is a rotation in body coordinates.
Once you have reoriented your objects using the new rotation matrix,
if you really need to store the resulting orientation of the object as
a sequence of Euler angles (roll, pitch, and yaw) somewhere,
euler_from_matrix in transformations.py promises to tell you what those
angles are (but once again, you have to know how your "roll, pitch, and yaw"
are defined and how transformations.py represents that definition as an axis sequence).
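For illustration, here is a minimal sketch of that workflow; it assumes (and this is only an assumption) the 'sxyz' axis sequence, so verify which of the 24 sequences actually matches your data's definition of yaw, pitch, and roll before trusting it:

import math
import numpy as np
from transformations import euler_matrix, euler_from_matrix

AXES = 'sxyz'  # assumed axis sequence; verify against your data!

# existing orientation and a new rotation, both as 4x4 homogeneous matrices
existing = euler_matrix(math.radians(10), math.radians(20), math.radians(30), AXES)
new = euler_matrix(0.0, 0.0, math.radians(90), AXES)

combined = np.dot(new, existing)    # new rotation in world coordinates
# combined = np.dot(existing, new)  # new rotation in body coordinates

rx, ry, rz = euler_from_matrix(combined, AXES)  # back to Euler angles if you must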
Below the line is material from my original answer (that is, thoughts about how one might do things if one were not forced to use Euler angles).
My recommendations:
For Object, the rotation function signature should be something equivalent to:
def rotate(self, axisx, axisy, axisz, angle)
For ObjectMap the signature should be equivalent to
def rotate(self, angle)
The idea is that once you choose an axis (either through input variables to the function, or implicitly, as the precomputed average in ObjectMap), the only difference between any two rotations is the angle of rotation around that axis, described by a single scalar parameter, angle.
(I recommend that the units of angle be radians.)
To relate this to your GIF, in the GIF each of the colored arcs has its
own axis perpendicular to the plane of the arc. The radius of the arc
does not really matter; the only other thing that controls how the
spheres move (when you rotate around that axis) is how far the pointer
has moved back or forth along the arc. Motion back or forth is something
that is described by a single scalar parameter, in this case the angle of rotation.
For the contents of Object, it takes one set of coordinates to
specify the location of the object's "position" (which could actually
be any reference point of your choosing on the object; but the
center is usually a good choice if there's an obvious center):
def __init__(self, mod, px, py, pz):
    self.mod = mod
    self.position = [px, py, pz]
If you also want to represent the orientation of the object
(for example, so that the spheres in your GIF appear to rotate around their
own axes as they revolve around the axis of the ObjectMap),
add sufficient points to the Object (in addition to the "position")
so that you can draw it wherever it may end up after rotation.
The minimum is two additional points;
for example, to draw a sphere like one of the ones in your GIF, it is
sufficient to know the location of the north pole and the location of one
point on the equator.
(It is actually possible to know the exact orientation of the sphere
with only three scalar coordinates, rather than the six involved in
these two points, but I would not recommend it unless you're willing
to seriously study the mathematics involved.)
This leads to something like this:
def __init__(self, mod, px, py, pz, vector1x, vector1y, vector1z, vector2x, vector2y, vector2z):
    self.mod = mod
    self.position = [px, py, pz]
    self.feature1 = [px + vector1x, py + vector1y, pz + vector1z]
    self.feature2 = [px + vector2x, py + vector2y, pz + vector2z]
The rationale for px + vector1x, etc., is that you might find it
convenient to describe the features of your objects by vectors
from the object's center to each feature; but for the drawing and
rotation of objects you may prefer for all the points
to be described by their global coordinates.
(I should note, however, it is not necessary to
describe the object entirely in global coordinates.)
The rotation of an Object then becomes something like this pseudocode:
def rotate(self, axisx, axisy, axisz, angle):
    # pseudocode:
    rotate self.position around [axisx, axisy, axisz] by angle radians
    rotate self.feature1 around [axisx, axisy, axisz] by angle radians
    rotate self.feature2 around [axisx, axisy, axisz] by angle radians
If all your coordinates are global coordinates, this will move all of them
around the global axis, resulting in movement and rotation of the objects
made of those points. If you decide that only the center of an Object
has a global coordinate and all the other parts of it are described
relative to that point, using vectors, you can still use the same
rotate function; the result will be the same as if you simply rotated
the global coordinates of each point.
The actual details of how to rotate a point in 3D space around
a given axis by a given angle are written up in various places
such as Python - Rotation of 3D vector (which includes
at least one numpy implementation).
You may find it beneficial to create a "rotation" object
(probably a matrix, but you can let the library take care of
the details of setting its values)
via something like
rotation = define_rotation(axisx, axisy, axisz, angle)
so that you can repeatedly use this same object to compute the rotated
position of every point within every Object. This tends to yield
faster execution than if you have to compute every rotation of every point
from the original axis coordinates and angle value.
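For example, here is a minimal sketch of such a helper (define_rotation is the hypothetical name used above), built from Rodrigues' rotation formula with numpy:

import numpy as np

def define_rotation(axisx, axisy, axisz, angle):
    # hypothetical helper: 3x3 matrix rotating points about the axis through
    # the origin along (axisx, axisy, axisz) by `angle` radians
    axis = np.array([axisx, axisy, axisz], dtype=float)
    axis /= np.linalg.norm(axis)
    x, y, z = axis
    c, s = np.cos(angle), np.sin(angle)
    C = 1.0 - c
    return np.array([
        [x*x*C + c,   x*y*C - z*s, x*z*C + y*s],
        [y*x*C + z*s, y*y*C + c,   y*z*C - x*s],
        [z*x*C - y*s, z*y*C + x*s, z*z*C + c],
    ])

# To rotate a point p around an axis passing through `origin` rather than
# through (0, 0, 0), shift, rotate, and shift back:
# p_rotated = origin + rotation.dot(p - origin)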
If this were my code I'd rather define a Point class
and/or a Vector class (or use classes from an existing library)
consisting of x, y, and z coordinates of a single point or vector,
so that I would not have to keep passing parameters in groups of three
and writing out vector-addition formulas throughout my code.
For example, instead of
self.feature1 = [px + vector1x, py + vector1y, pz + vector1z]
I might have
self.feature1 = p.add(vector1)
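A minimal sketch of such a class, just to illustrate the idea:

class Vector:
    def __init__(self, x, y, z):
        self.x, self.y, self.z = x, y, z

    def add(self, other):
        # vector addition, returning a new Vector
        return Vector(self.x + other.x, self.y + other.y, self.z + other.z)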
But that is a design choice that you can make independently of
what math you ultimately choose to do for the rotations.
Related
When the Rodrigues function is called with a rotation matrix as its argument, it returns two results.
I understand that the first item returned is the vector around which the rotation occurs, and that the magnitude of the vector gives the angle of rotation. It seems to produce a magnitude (in radians) corresponding to the range (0, 180) degrees for rotations covering (0, 360) degrees, so there must be a way to determine the sign of the rotation. How do you do that?
As a supplementary question, I understand that the second result is a Jacobian matrix. How do you use that?
The rotation angle is always positive, and when it "needs" to be negative (equivalently, closer to 360 than to 0 degrees), the vector is simply flipped to the other side, so the angle can stay positive.
There is the "right hand rule". Right hand grabs vector, thumb pointing along the vector. Fingers indicate positive rotation around vector.
Example: Place your (right) fist on the desk, thumb up. Going +90 degrees is a quarter turn counterclockwise (inward). Going -90 degrees is a quarter turn clockwise (outward)... or +90 degrees with your thumb pointing into the desk.
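A quick numpy/OpenCV sketch demonstrating the flip (the numbers are only illustrative):

import numpy as np
import cv2

theta = np.deg2rad(270)  # 270 degrees about +z is the same as 90 degrees about -z
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])
rvec, jacobian = cv2.Rodrigues(R)
print(rvec.ravel())          # ~ [0, 0, -1.5708]: the axis flipped to -z...
print(np.linalg.norm(rvec))  # ...so the angle stays positive at pi/2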
The Jacobian is a bunch of derivatives, a vector in output-space for each component of the input. It tells you how stable the calculation is, i.e. how easily perturbed the result is, were any of the elements of your input vector to fluctuate by a bit.
Jacobians also show up in robotics. You can use them for inverse kinematics, combined with a solver. Given the Jacobian of your "robot arm", a tool center point, and a target, some math involving a Jacobian tells you what joints to move (a little bit) in which way to get closer to the target. The Jacobian depends on the current pose (i.e. it's not a constant matrix), so you'd recalculate it all the time.
(Image: "two scenarios")
I have an x-axis size, a y-axis size, a render distance, a list of grid position numbers, and a central grid position.
I am trying to create a list of all the grid positions within the render distance of the central grid position.
The sizes of the x and y axes may differ independently. Optimally, this algorithm would not attempt to get positions where the render distance extends over the edge of the x or y axis.
Thanks.
I'm writing this to help you answer your own question, in the style that I would go about it. As with anything in coding, you need to be able to break a big problem down into multiple smaller ones.
Design two functions to convert to and from (x, y) coordinates (optional; it'll make your life easier, but won't be as efficient; personally I would avoid this for a bit of a challenge). A minimal sketch of these helpers follows this list.
Given n, size and distance, calculate up, down, left and right. If the size is different for each axis, just provide the function with the correct one.
e.g.

def right(n, size, distance):
    # assuming column-major numbering (n = x * size + y), moving right
    # jumps `distance` whole columns of height `size`
    return n + size * distance

def down(n, size, distance):
    # moving down just steps within the column
    return n - distance
Given size, make sure the above functions don't go off the edge of the grid. Converting the points to (x, y) coordinates for this part may help.
Now that you have the sides of the square, run the functions again to get the corners. For example, to get the top-right corner, you could do right(up(*args)) or up(right(*args)).
With the corners, you can now calculate what's in your square. Converting the points to (x, y) coordinates will make it easier.
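For step 1, a minimal sketch of the conversion helpers, assuming the same column-major numbering as the examples above (n = x * size + y, where size is the y-axis size):

def to_xy(n, size):
    # inverse of n = x * size + y
    return n // size, n % size

def to_n(x, y, size):
    return x * size + y

With these, the bounds check in step 3 reduces to comparing x and y against the axis sizes.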
I'm estimating the translation and rotation of a single camera using the following code.
E, mask = cv2.findEssentialMat(k1, k2,
                               focal=SCALE_FACTOR * 2868,
                               pp=(1920/2 * SCALE_FACTOR, 1080/2 * SCALE_FACTOR),
                               method=cv2.RANSAC,
                               prob=0.999,
                               threshold=1.0)
points, R, t, mask = cv2.recoverPose(E, k1, k2)
where k1 and k2 are my matching sets of key points: Nx2 matrices where the first column holds the x-coordinates and the second column the y-coordinates.
I collect all the translations over several frames and generate a path that the camera traveled like this.
def generate_path(rotations, translations):
    path = []
    current_point = np.array([0, 0, 0])
    for R, t in zip(rotations, translations):
        path.append(current_point)
        # don't care about rotation of a single point
        current_point = current_point + t.reshape((3,))
    return np.array(path)
So, I have a few issues with this.
The OpenCV camera coordinate system suggests that if I want to view the 2D "top down" view of the camera's path, I should plot the translations along the X-Z plane.
plt.plot(path[:,0], path[:,2])
This is completely wrong.
However, if I write this instead
plt.plot(path[:,0], path[:,1])
I get the following (after doing some averaging)
This path is basically perfect.
So, perhaps I am misunderstanding the coordinate system convention used by cv2.recoverPose? Why should the "birds eye view" of the camera path be along the XY plane and not the XZ plane?
Another, perhaps unrelated issue is that the reported Z-translation appears to decrease linearly, which doesn't really make sense.
I'm pretty sure there's a bug in my code since these issues appear systematic - but I wanted to make sure my understanding of the coordinate system was correct so I can restrict the search space for debugging.
First of all, your method is not producing a real path. The translation t produced by recoverPose() is always a unit vector, so in your 'path' every frame moves exactly 1 'meter' from the previous frame. The correct method would be: 1) initialize (featureMatch, findEssentialMat, recoverPose), then 2) track (triangulate, featureMatch, solvePnP). If you would like to dig deeper, tutorials on monocular visual SLAM will help.
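For illustration only, here is a minimal sketch of chaining relative poses into a trajectory. It assumes each (R, t) maps frame i to frame i+1 (conventions vary, so you may need the inverted pose instead), and each unit-norm t still needs a real per-frame scale from triangulation or some other source:

import numpy as np

def generate_path(rotations, translations, scales=None):
    path = [np.zeros(3)]
    R_global = np.eye(3)
    for i, (R, t) in enumerate(zip(rotations, translations)):
        s = 1.0 if scales is None else scales[i]  # placeholder per-frame scale
        # rotate the relative step into the global frame before accumulating
        path.append(path[-1] + s * R_global.dot(t.reshape(3)))
        R_global = R_global.dot(R)
    return np.array(path)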
Secondly, you might have mixed up the camera coordinate system and the world coordinate system. If you want to plot the trajectory, you should use the world coordinate system rather than the camera coordinate system. The results of recoverPose() are also in the world coordinate system, and that world coordinate system is: x-axis pointing right, y-axis pointing forward, z-axis pointing up. Thus, when you want to plot the 'bird's eye view', it is correct to plot along the X-Y plane.
I am asking this question as a trimmed version of my previous question. I now have a face looking at some position on screen, and also the gaze coordinates (pitch and yaw) of both eyes. Let us say
Left_Eye = [-0.06222888 -0.06577308]
Right_Eye = [-0.04176027 -0.44416167]
I want to identify the screen coordinates where the person is probably looking. Is this possible? Please help!
What you need is:
3D position and direction for each eye
you claim you have this, but pitch and yaw are just Euler angles, and you also need some reference frame and order of transforms to convert them back into a 3D vector. It's better to leave the direction in vector form (which I suspect you had in the first place). Along with the direction, you need the position in 3D in the same coordinate system too...
3D definition of your projection plane
so you need at least a start position and 2 basis vectors defining your planar rectangle. Much better is to use a 4x4 homogeneous transform matrix for this, because that allows very easy transforms to and from its local coordinate system...
So I see it like this: it's just a matter of finding the intersection between the rays and the plane
P(s) = R0 + s*R
P(t) = L0 + t*L
P(u,v) = P0 + u*U + v*V
Solving this system gives u, v, which is the 2D coordinate inside your plane that you are looking at. Of course, because of inaccuracies, this will not be solvable algebraically. So it's better to convert the rays into the plane's local coordinates, compute the point on each ray where w = 0.0 (making this a simple linear equation with a single unknown), and take the average of the left-eye and right-eye results (in case they do not align perfectly).
So if R0', R', L0', L' are the converted values in UVW local coordinates, then:
R0z' + s*Rz' = 0.0
s = -R0z'/Rz'
// so...
R1 = R0' - R'*R0z'/Rz'
L1 = L0' - L'*L0z'/Lz'
P = 0.5 * (R1 + L1)
Where P is the point you are looking at in the UVW coordinates...
The conversion is done easily: depending on your notation, you multiply either the inverse or the direct matrix representing the plane by (R0,1), (L0,1), (R,0), (L,0). The fourth coordinate (1 or 0) just tells whether you are transforming a point or a vector.
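A minimal numpy sketch of the above; it assumes M is the 4x4 homogeneous matrix of the plane (columns U, V, W, P0) and that R0, R, L0, L are world-space eye positions and gaze directions:

import numpy as np

def gaze_point(M, R0, R, L0, L):
    Minv = np.linalg.inv(M)

    def to_local(p, w):
        # w = 1.0 transforms a point, w = 0.0 transforms a vector
        return Minv.dot(np.append(p, w))[:3]

    R0l, Rl = to_local(R0, 1.0), to_local(R, 0.0)
    L0l, Ll = to_local(L0, 1.0), to_local(L, 0.0)
    # point on each ray where the local w (third) coordinate is zero
    R1 = R0l - Rl * (R0l[2] / Rl[2])
    L1 = L0l - Ll * (L0l[2] / Ll[2])
    P = 0.5 * (R1 + L1)
    return P[0], P[1]  # (u, v) inside the plane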
Without knowing more about your coordinate systems, data accuracy, and which knowns and unknowns you have, it is hard to be more specific than this.
If your plane is the camera projection plane, then U, V are the x and y axes of the image taken from the camera, and W is its normal (direction is just a matter of notation).
Since you are using camera input, which uses a perspective projection, I hope your positions and vectors are corrected for it.
I have the coordinates of 6 points in an image
(170.01954650878906, 216.98866271972656)
(201.3812255859375, 109.42137145996094)
(115.70114135742188, 210.4272918701172)
(45.42426300048828, 97.89037322998047)
(167.0367889404297, 208.9329833984375)
(70.13690185546875, 140.90538024902344)
I have a point as the center: [89.2458, 121.0896]. I am trying to re-calculate the positions of the points in Python using 4 rotation angles (0, 90, -90, 180 degrees) and 7 scaling factors (0.5, 0.75, 1, 1.10, 1.25, 1.35, 1.5).
My question is: how can I rotate and scale the above-mentioned points relative to the center point and get the new coordinates of those 6 points?
Your help is really appreciated.
Mathematics
A mathematical approach would be to represent this data as vectors from the center to the image-points, translate these vectors to the origin, apply the transformation and relocate them around the center point. Let's look at how this works in detail.
Representation as vectors
We can show these vectors in a grid; this produces the following image:
This image provides a nice way to look at these points, so we can see our actions happening in a visual way. The center point is marked with a dot at the beginning of all the arrows, and the end of each arrow is the location of one of the points supplied in the question.
A vector can be seen as a list of the values of the coordinates of the point, so
my_vector = [point[0], point[1]]
could be a representation for a vector in python, it just holds the coordinates of a point, so the format in the question could be used as is! Notice that I will use the position 0 for the x-coordinate and 1 for the y-coordinate throughout my answer.
I have only added this representation as a visual aid, we can look at any set of two points as being a vector, no calculation is needed, this is only a different way of looking at those points.
Translation to origin
The first calculations happen here. We need to translate all these vectors to the origin. We can very easily do this by subtracting the location of the center point from all the other points, for example (this can be done in a simple loop):
point_origin_x = point[0] - center_point[0] # Xvalue point - Xvalue center
point_origin_y = point[1] - center_point[1] # Yvalue point - Yvalue center
The resulting points can now be rotated around the origin and scaled with respect to the origin. The new points (as vectors) look like this:
In this image, I deliberately left the scale untouched, so that it is clear that these are exactly the same vectors (arrows), in size and orientation, only shifted to be around (0, 0).
Why the origin
So why translate these points to the origin? Well, rotations and scaling actions are easy to do (mathematically) around the origin and not as easy around other points.
Also, from now on, I will only include the 1st, 2nd and 4th point in these images to save some space.
Scaling around the origin
A scaling operation is very easy around the origin. Just multiply the coordinates of the point with the factor of the scaling:
scaled_point_x = point[0] * scaling_factor
scaled_point_y = point[1] * scaling_factor
In a visual way, that looks like this (scaling all by 1.5):
Where the blue arrows are the original vectors and the red ones are the scaled vectors.
Rotating
Now for rotating. This is a little bit harder, because a rotation is most generally described by a matrix multiplication with this vector.
The matrix to multiply with is the standard 2D rotation matrix (from Wikipedia: Rotation matrix):

R(t) = [[cos t, -sin t],
        [sin t,  cos t]]

So if V is the vector, then we need to compute V_r = R(t) * V to get the rotated vector V_r. This rotation will always be counterclockwise! In order to rotate clockwise, we simply use R(-t).
Because only multiples of 90° are needed in the question, the matrix becomes almost trivial. For a rotation of 90° counterclockwise, the matrix is:

R(90°) = [[0, -1],
          [1,  0]]

which in code is basically:
rotated_point_x = -point[1] # new x is negative of old y
rotated_point_y = point[0] # new y is old x
Again, this can be nicely shown in a visual way:
Where I have matched the colors of the vectors.
A rotation of 90° clockwise is then:

rotated_clockwise_point_x = point[1]   # new x is old y
rotated_clockwise_point_y = -point[0]  # new y is negative of old x
A rotation of 180° is just taking the negatives of both coordinates; or you could scale by a factor of -1, which is essentially the same.
As a last point about these operations, I might add that you can scale and/or rotate as much as you want in sequence to get the desired result.
Translating back to the center point
After the scaling and/or rotation operations, the only thing left is to translate the vectors back to the center point:
retranslated_point_x = new_point[0] + center_point_x
retranslated_point_y = new_point[1] + center_point_y
And all is done.
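Putting it all together, here is a minimal sketch; it uses the general rotation matrix so any angle works, not just multiples of 90°, and the sample point and center are taken from the question:

import math

def transform_point(point, center, angle_deg, scale):
    # 1) translate so the center sits at the origin
    x = point[0] - center[0]
    y = point[1] - center[1]
    # 2) scale around the origin
    x, y = x * scale, y * scale
    # 3) rotate counterclockwise by angle_deg
    t = math.radians(angle_deg)
    xr = x * math.cos(t) - y * math.sin(t)
    yr = x * math.sin(t) + y * math.cos(t)
    # 4) translate back to the center
    return (xr + center[0], yr + center[1])

center = (89.2458, 121.0896)
print(transform_point((170.01954650878906, 216.98866271972656), center, 90, 1.5))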
Just a recap
So to recap this long post:
Subtract the coordinates of the center point from the coordinates of the image-point
Scale by a factor with a simple multiplication of the coordinates
Use the idea of the matrix multiplication to think about the rotation (you can easily find these things on Google or Wikipedia).
Add the coordinates of the center point to the new coordinates of the image-point
I realize now that I could have just given this recap, but this way there is at least some visual aid and a bit of mathematical background in this post, which is also nice. I really believe such problems should be looked at from a mathematical angle; the mathematical description can help a lot.