In a nutshell: the translation component returned from decomposing a homography matrix is actually a 3x1 matrix (a vector, really), yet every description of a translation matrix I have found is a 2x3 (or 3x3 homogeneous) matrix.
Here are the two images (IR camera), the position 1 image (approx camera cartesian coords x = 0mm, y=300mm):
This is the position 2 image (approx camera cartesian coords x = 680mm, y=0mm):
I used the following call with 90+ point pairs to determine the homography matrix (M):
M, mask = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC, 5.0)
This process picked out a good number of keypoints:
Applying this homography matrix to the original image works perfectly:
im_out = cv2.warpPerspective(img1, M, (640, 480))
and the mean difference between the point sets:
np.mean(dst_pts - src_pts, axis=0)
array([[-305.16345, -129.94157]], dtype=float32)
is fairly close to the result of applying the homography matrix to a single point:
np.dot(M, [1, 1, 1])
array([-293.00352303, -132.93478376, 1.00009461])
I decomposed the homography matrix with the following command:
num, Rs, Ts, Ns = cv2.decomposeHomographyMat(M, camera_matrix)
This returns the number of solutions (num = 4), a list of rotation matrices (Rs), a list of translation vectors (Ts), and a list of plane normals (Ns).
I'm interested in the translation matrix.
Firstly, Ts lists the 4 translation solutions (is this correct?):
Ts =
[array([[-0.60978834], [-0.26268874], [ 0.01638967]]),
 array([[ 0.60978834], [ 0.26268874], [-0.01638967]]),
 array([[-0.19035409], [-0.06628793], [ 0.63284046]]),
 array([[ 0.19035409], [ 0.06628793], [-0.63284046]])]
Secondly, and most puzzling, each of the solutions has 3 values, e.g. the first solution: [-0.6097, -0.2626, 0.01638967].
My understanding is that a translation matrix has the form (per my reference):
[[1, 0, tx],
 [0, 1, ty],
 [0, 0, 1]]
How do I get from the values returned by the decomposition to a translation matrix in the form above?
**i.e., how do I convert this:
[-0.6097, -0.2626, 0.01638967]
to that format?**
Thanks for your help.
Let's take your first translation vector:
np.array([-0.60978834, -0.26268874, 0.01638967])
To me it looks like those are your estimated tx, ty and tz translation components. Those quantities also make sense when I look at the image with the green dots. So I guess that your translation matrix in homogeneous coordinates would be:
M = np.array([[1, 0, 0, -0.60978834], [0, 1, 0, -0.26268874], [0, 0, 1, 0.01638967]])
Or simply:
M = np.array([[1, 0, -0.60978834], [0, 1, -0.26268874]])
That is, if you ignore the tz component. Isn't this what you're looking for?
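For completeness, here is a minimal sketch of that conversion, assuming Ts is the list returned by cv2.decomposeHomographyMat above. One caveat: the decomposition recovers the translation only up to scale (OpenCV normalizes it by the distance to the reference plane), so these values are not in pixels:

import numpy as np

tx, ty, tz = Ts[0].ravel()  # first solution, flattened to [tx, ty, tz]

# 3x3 homogeneous translation matrix built from tx and ty
T = np.array([[1, 0, tx],
              [0, 1, ty],
              [0, 0, 1]])
print(T)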
Related
I have some 3D motion-capture data for training a deep learning model, but the loss is too high, so I need to normalize the input data.
The list below shows the steps I am planning for the normalization:
Set the skeleton's left sole to be the origin point (0, 0, 0)
Subtract the left-sole point (xls, yls, zls) from all joint points (x, y, z)
Set the skeleton's height as 1
Divide all the joint point y-axis values by the y-value of the head point
Calculate the angle between the waist-line and the x-axis, i.e. the angle between the waist-line (the line connecting the left-waist and right-waist points) and a line parallel to the x-axis extending from the midpoint of the waist-line
Multiply every joint point by the rotation matrix
Use the y-axis (height axis) as the rotation axis, in a right-handed coordinate system (x values are greater on the right than on the left)
The rotation matrix is obtained as follows:
[[cos_theta, 0, neg_sin_theta],
[0, 1, 0],
[sin_theta, 0, cos_theta]]
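For concreteness, here is a minimal sketch of the whole pipeline described above. The (N, 3) array layout and the joint indices (left_sole, head, left_waist, right_waist) are assumptions of mine, not from the original data:

import numpy as np

def normalize_skeleton(joints, left_sole, head, left_waist, right_waist):
    out = joints.astype(float)  # astype copies, so the input is untouched
    # steps 1-2: translate so the left sole becomes the origin
    out = out - out[left_sole]
    # steps 3-4: set the height to 1 (the post divides only the y values;
    # dividing all axes uniformly would preserve the body proportions)
    out[:, 1] = out[:, 1] / out[head, 1]
    # steps 5-6: angle of the waist-line in the ground (x-z) plane
    waist = out[right_waist] - out[left_waist]
    theta = np.arctan2(waist[2], waist[0])
    c, s = np.cos(theta), np.sin(theta)
    # steps 7-8: rotate about the y-axis by -theta (the matrix above with
    # theta -> -theta), so the waist-line maps onto the x-axis
    rot = np.array([[ c, 0, s],
                    [ 0, 1, 0],
                    [-s, 0, c]])
    return out @ rot.T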
From a mathematical perspective, I drew the image below to depict the normalization by multiplication with a rotation matrix, hoping the skeleton rotates at the waist so that the waist-line stays parallel to the x-axis:
Now I want to normalize the 3D skeleton coordinates by multiplying by the rotation matrix:
(I tested 3 different rotation matrices. The above-mentioned matrix, a rotation about the y-axis, produced very high distortion, so I tested with the matrix below, a rotation about the z-axis, and got a more reasonable rotation, though I don't know why...)
rotate_matrix =[[cos_theta, neg_sin_theta, 0],
[sin_theta, cos_theta, 0],
[0, 0, 1]]
The result of the normalization:
The normalization using this rotation matrix still produces a weird result, but at least it seems to rotate around the height axis as desired.
The original skeleton movement:
I hope someone can give some advice on this normalization, or provide other ideas for dealing with 3D skeleton normalization.
Thank you in advance for your input.
If I have an nxnxn grid of values, say 32x32x32, and I want to rotate this cube of values by some angle about the x, y, or z axis and interpolate the missing values, what would be the best way to do this without using existing algorithms from packages (such as SciPy)?
I'm familiar with applying a 3D rotation matrix to a 3D grid of points when it's represented as an [n, 3] matrix, but I'm not sure how to apply a rotation when the representation is given in its 3D form as nxnxn.
I found a prior Stack Overflow post about this topic, but it uses three for loops for its approach, which doesn't really scale in terms of speed. Is there a more vectorized approach that can accomplish a similar task?
Thanks in advance!
One way I can think of looks like this:
reshape the nxnxn matrix to an array of n-dimensional points
apply the rotation to this array
reshape the array back to nxnxn
Here is some code:
import numpy as np

# just a way to create some nxnxn matrix
n = 4
a = np.arange(n)
b = np.array([a] * n)
mat = np.array([b] * n)

# create an array containing n-dimensional points
flat_mat = mat.reshape(-1, n)

# just a random matrix we will use as a rotation
rot = np.eye(n) + 2

# apply the rotation to each n-dimensional point
# (equivalently, vectorized: flat_mat @ rot.T)
result = np.array([rot.dot(x) for x in flat_mat])

# return to the original shape
result = result.reshape((n, n, n))
print(result)
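For the original question (rotating the grid of values rather than a list of points), a fully vectorized alternative is to inverse-map every output voxel back into the source grid and sample it there. Below is a minimal sketch with nearest-neighbour interpolation; the function name, the rotation-about-the-grid-centre convention, and the example matrix are my own assumptions:

import numpy as np

def rotate_grid(grid, R):
    # rotate the values of an n x n x n grid about its centre,
    # sampling with nearest-neighbour interpolation
    n = grid.shape[0]
    c = (n - 1) / 2.0
    # coordinates of every output voxel, shape (n^3, 3)
    idx = np.indices(grid.shape).reshape(3, -1).T.astype(float)
    # right-multiplying by R applies R.T, the inverse rotation,
    # mapping each output voxel back to its source position
    src = np.rint((idx - c) @ R + c).astype(int)
    # keep only samples that land inside the grid
    valid = np.all((src >= 0) & (src < n), axis=1)
    out = np.zeros_like(grid)
    dst = idx.astype(int)[valid]
    s = src[valid]
    out[dst[:, 0], dst[:, 1], dst[:, 2]] = grid[s[:, 0], s[:, 1], s[:, 2]]
    return out

# example: rotate 90 degrees about the z-axis
theta = np.pi / 2
Rz = np.array([[np.cos(theta), -np.sin(theta), 0],
               [np.sin(theta),  np.cos(theta), 0],
               [0,              0,             1]])
print(rotate_grid(np.arange(27).reshape(3, 3, 3), Rz))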
I wish to project an image taken with a camera for which I know all parameters (focal length, sensor size, X, Y, Z, rotation (omega, phi, kappa)) onto a 2D plane. I know that I need to construct a camera matrix before being able to do the planar homography, but how?
I've successfully produced a matrix using 4 known pairs of points on each plane following this answer, but it's not the way I want to do it. I've looked at this video that gave me almost all my answers; however, the matrix named "extrinsic parameters" is not entirely described. How should the rotation matrix R and the translation matrix T of the camera position be constructed?
With the final camera matrix in hand, I suppose I will be able to take each parameter and feed it to PIL.Image.transform. I'm also open to using the Python OpenCV library.
Here is some example data:
Original image here (4288 x 2848 pixels)
#Camera position
X: 72003 m
Y: 1070100 m
Z: 1243 m
#Rotation of camera
Omega: 0°
Phi: 27°
Kappa: -38°
Focal length: 26 mm
Pixel size on sensor: 0.00551 mm
The camera matrix P is a 3x4 matrix of the form P = K[R t]:
K is a 3x3 matrix containing the intrinsic parameters (principal point and focal length in pixels)
[R t] is a 3x4 matrix obtained by concatenating R, a 3x3 matrix representing the rotation from the camera frame to the world frame, and t, a 3-vector which represents the position of the origin of the world in the camera frame.
This means that the parameters you have, which seem to be the position of the camera in the world frame, have to be inverted. The inverse of [R t] is [R' t'] where R' = inverse(R) = transpose(R) and t' = -inverse(R)t.
You would first have to know how to compute the 3x3 camera rotation matrix from your three angles, and there are many possible conventions for representing a rotation matrix with three angles. The most common are yaw/pitch/roll and Euler angles with all possible rotation orders.
The 3x3 intrinsics matrix K is [[f 0 cx][0 f cy][0 0 1]], where f = 26/0.00551 = 4719 and (cx,cy) is the principal point, which you can take as the center of the image (4288/2,2848/2).
Then to compute the homography (a 3x3 matrix) that maps the plane at world height Z0 to your image, apply P to points of the form (X, Y, Z0, 1): the result has the form X·v1 + Y·v2 + v3, where v1 and v2 are the first two columns of P and v3 = Z0·p3 + p4 (p3, p4 being the last two columns of P). The matrix H = [v1 v2 v3] is the homography you are looking for. The 8 coefficients for PIL.Image.transform should be the first 8 coefficients of that matrix, divided by the 9th one.
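A minimal sketch of this construction in code, with loudly flagged assumptions: the omega/phi/kappa composition below (X, then Y, then Z) is only one common photogrammetric convention and must be checked against your data source, Z0 for the target plane is a guess, and PIL.Image.transform may need the inverse of H depending on which direction you map:

import numpy as np

def rot_x(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

omega, phi, kappa = np.radians([0.0, 27.0, -38.0])
R_cam_to_world = rot_x(omega) @ rot_y(phi) @ rot_z(kappa)  # assumed order
C = np.array([72003.0, 1070100.0, 1243.0])  # camera centre, world frame

# invert to get the world-to-camera transform [R t]
R = R_cam_to_world.T
t = -R @ C

# intrinsics: f in pixels, principal point at the image centre
f = 26.0 / 0.00551
K = np.array([[f, 0, 4288 / 2],
              [0, f, 2848 / 2],
              [0, 0, 1]])

P = K @ np.hstack([R, t[:, None]])  # the 3x4 camera matrix

# homography from the plane Z = Z0 to the image
Z0 = 0.0  # assumed height of the target plane
H = np.column_stack([P[:, 0], P[:, 1], Z0 * P[:, 2] + P[:, 3]])
coeffs = (H / H[2, 2]).ravel()[:8]  # 8 coefficients for PIL.Image.transform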
This should be easy, but I've looked all over for a simple explanation that I can grasp. I have an object that I'd like to represent in OpenGL as a cone. The object has x, y, z coordinates and a velocity vector vx, vy, vz. The cone should point in the direction of the velocity vector.
So, I think my PyOpenGL code should look something like this:
glPushMatrix()
glTranslate(x, y, z)
glPushMatrix()
# do some sort of rotation here #
glutSolidCone(base, height, slices, stacks)
glPopMatrix()
glPopMatrix()
So, is that correct (so far)? What do I put in place of the "# do some sort of rotation here #" ?
In my world, the Z-axis points up (0, 0, 1) and, without any rotations, so does my cone.
Okay, Reto Koradi's answer seems to be the approach I should take, but I'm not sure about some of the implementation details, and my code is not working.
If I understand correctly, the rotation matrix should be 4x4. Reto shows how to get a 3x3, so I'm assuming the 3x3 goes in the upper-left corner of a 4x4 identity matrix. Here's my code:
import numpy as np

def normalize(v):
    norm = np.linalg.norm(v)
    if norm > 1.0e-8:  # arbitrarily small
        return v / norm
    else:
        return v

def transform(v):
    bz = normalize(v)
    if (abs(v[2]) < abs(v[0])) and (abs(v[2]) < abs(v[1])):
        by = normalize(np.array([v[1], -v[0], 0]))
    else:
        by = normalize(np.array([v[2], 0, -v[0]]))
        #~ by = normalize(np.array([0, v[2], -v[1]]))
    bx = np.cross(by, bz)
    R = np.array([[bx[0], by[0], bz[0], 0],
                  [bx[1], by[1], bz[1], 0],
                  [bx[2], by[2], bz[2], 0],
                  [0, 0, 0, 1]], dtype=np.float32)
    return R
and here is the way it gets inserted into the rendering code:
glPushMatrix()
glTranslate(x, y, z)
glPushMatrix()
v = np.array([vx, vy, vz])
glMultMatrixf(transform(v))
glutSolidCone(base, height, slices, stacks)
glPopMatrix()
glPopMatrix()
Unfortunately, this isn't working. My test-case cones just do not point correctly, and I can't identify the failure mode. Without the glMultMatrixf(transform(v)) line, the cones align along the z-axis, as expected.
It's working now. Reto Koradi correctly identified that the rotation matrix needed to be transposed to match OpenGL's column-major layout. The code should look like this (before optimization):
def transform(v):
    bz = normalize(v)
    if (abs(v[2]) < abs(v[0])) and (abs(v[2]) < abs(v[1])):
        by = normalize(np.array([v[1], -v[0], 0]))
    else:
        by = normalize(np.array([v[2], 0, -v[0]]))
        #~ by = normalize(np.array([0, v[2], -v[1]]))
    bx = np.cross(by, bz)
    R = np.array([[bx[0], by[0], bz[0], 0],
                  [bx[1], by[1], bz[1], 0],
                  [bx[2], by[2], bz[2], 0],
                  [0, 0, 0, 1]], dtype=np.float32)
    return R.T
A helpful concept to remember here is that a linear transformation can also be interpreted as a change of coordinate systems. In other words, instead of picturing points being transformed within a coordinate system, you can just as well picture the points staying in place, but their coordinates being expressed in a new coordinate system. When looking at the matrix expressing the transformation, the base vectors of this new coordinate system are the column vectors of the matrix.
In the following, the base vectors of the new coordinate system are named bx, by and bz. Since the columns of a rotation matrix need to be orthonormal, bx, by and bz need to form an orthonormal set of vectors.
In this case, the original cone is oriented along the z-axis. Since you want the cone to be oriented along (vx, vy, vz) instead, we use this vector as the z-axis of our new coordinate system. Since we want an orthonormal coordinate system, the only thing left to do to obtain bz is to normalize this vector:
               [vx]
bz = normalize([vy])
               [vz]
Since the cone is rotationally symmetrical, it does not really matter how the remaining two base vectors are chosen, just as long as they are both orthogonal to bz, and orthogonal to each other. A simple way to find an arbitrary orthogonal vector to a given vector is to keep one coordinate 0, swap the other two coordinates, and change the sign of one of those two coordinates. Again, the vector needs to be normalized. Vectors we could choose with this approach include:
               [ vy]                  [ vz]                  [  0]
by = normalize([-vx])  by = normalize([  0])  by = normalize([ vz])
               [  0]                  [-vx]                  [-vy]
The dot product of each of these vectors with (vx, vy, vz) is zero, which means that the vectors are orthogonal.
While the choice between these (or other variations) is mostly arbitrary, care must be taken to not end up with a degenerate vector. For example, if vx and vy are both zero, using the first of these vectors would be bad. To avoid choosing a (near) degenerate vector, a simple strategy is to use the first of these three vectors if |vz| is smaller than both |vx| and |vy|, and one of the other two otherwise.
With two new base vectors in place, the third is the cross product of the other two:
bx = by x bz
All that's left is to populate the rotation matrix with column vectors bx, by and bz, and the rotation matrix is complete:
    [ bx.x by.x bz.x ]
R = [ bx.y by.y bz.y ]
    [ bx.z by.z bz.z ]
If you need a 4x4 matrix, e.g. because you are using the legacy fixed function OpenGL pipeline, you can extend this to a 4x4 matrix:
    [ bx.x by.x bz.x 0 ]
R = [ bx.y by.y bz.y 0 ]
    [ bx.z by.z bz.z 0 ]
    [ 0    0    0    1 ]
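As a quick sanity check of this construction (my own addition, not part of the original answer), the matrix should be orthonormal and should map the +z axis onto the normalized target direction:

import numpy as np

v = np.array([1.0, 2.0, 3.0])           # target direction
bz = v / np.linalg.norm(v)
by = np.array([v[2], 0, -v[0]])
by = by / np.linalg.norm(by)
bx = np.cross(by, bz)
R = np.column_stack([bx, by, bz])

assert np.allclose(R @ R.T, np.eye(3))  # columns are orthonormal
assert np.allclose(R @ [0, 0, 1], bz)   # +z maps onto the direction of v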
What I want to do is to rotate a 2D numpy array over a given angle. The approach I'm taking is using a rotation matrix. The rotation matrix I defined as:
angle = 65.
theta = (angle/180.) * numpy.pi
rotMatrix = numpy.array([[numpy.cos(theta), -numpy.sin(theta)],
[numpy.sin(theta), numpy.cos(theta)]])
The matrix I want to rotate is shaped (1002, 1004). However, just for testing purposes, I created a 2D array with shape (7, 6):
c = numpy.array([[0,0,6,0,6,0], [0,0,0,8,7,0], [0,0,0,0,5,0], [0,0,0,3,4,0], [0,0,2,0,1,0], [0,8,0,0,9,0], [0,0,0,0,15,0]])
Now, when I apply the rotation matrix to my 2D array, I get the following error:

c = numpy.dot(rotMatrix, c)
print c

ValueError: matrices are not aligned
What am I doing wrong?
You seem to be looking for scipy.ndimage.rotate, or similar. If you specifically want 90, 180, or 270 degree rotations, which do not require interpolation, then numpy.rot90 is better.
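For example, with the test array from the question (reshape=False is my choice here; it keeps the output the same shape as the input):

import numpy as np
from scipy import ndimage

c = np.array([[0, 0, 6, 0, 6, 0], [0, 0, 0, 8, 7, 0], [0, 0, 0, 0, 5, 0],
              [0, 0, 0, 3, 4, 0], [0, 0, 2, 0, 1, 0], [0, 8, 0, 0, 9, 0],
              [0, 0, 0, 0, 15, 0]])

rotated = ndimage.rotate(c.astype(float), 65, reshape=False)  # interpolated
quarter = np.rot90(c)  # exact 90-degree rotation, no interpolation needed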
Matrix dimensions need to be compatible to obtain a matrix product. You are trying to multiply a 2x2 matrix by a 7x6 matrix, which is not mathematically coherent; it only really makes sense to apply a 2D rotation to a 2D vector to obtain the transformed coordinates. The result of a matrix product is defined only when the left-hand matrix's column count equals the right-hand matrix's row count.
You may want to look at skimage.transform. This module has several useful functions including rotation. No sense in rewriting something that is already done.
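A minimal usage sketch, assuming the same test array c as above (preserve_range=True keeps the raw values instead of rescaling them):

from skimage import transform

# rotates counter-clockwise by 65 degrees, interpolating the values
rotated = transform.rotate(c.astype(float), angle=65, preserve_range=True)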
You cannot rotate an n-dimensional vector using a 2D rotation matrix.
I did not find a built-in function for this in NumPy. I was hoping it was common enough functionality to be included; let me know if you find it.
Meanwhile, I have created a function of my own:
import numpy as np

def rotate(vector, theta, rotation_around=None) -> np.ndarray:
    """
    reference: https://en.wikipedia.org/wiki/Rotation_matrix#In_two_dimensions
    :param vector: list of length 2 OR
                   list of lists where each inner list has length 2 OR
                   1D numpy array of length 2 OR
                   2D numpy array of shape (number of points, 2)
    :param theta: rotation angle in degrees (positive values rotate anti-clockwise)
    :param rotation_around: "vector" will be rotated around this point,
                            otherwise [0, 0] is used as the rotation origin
    :return: "vector" rotated by "theta" degrees around "rotation_around",
             as a numpy array
    """
    vector = np.array(vector)
    if vector.ndim == 1:
        vector = vector[np.newaxis, :]
    if rotation_around is not None:
        vector = vector - rotation_around
    vector = vector.T
    theta = np.radians(theta)
    rotation_matrix = np.array([
        [np.cos(theta), -np.sin(theta)],
        [np.sin(theta), np.cos(theta)]
    ])
    # @ is matrix multiplication; rotate all points at once, then transpose back
    output: np.ndarray = (rotation_matrix @ vector).T
    if rotation_around is not None:
        output = output + rotation_around
    return output.squeeze()
if __name__ == '__main__':
    angle = 30
    print(rotate([1, 0], angle))            # passing one point
    print(rotate([[1, 0], [0, 1]], angle))  # passing multiple points