Get homography matrix from rotation or orientation matrix - python

I am getting a rotation matrix and orientation (Euler angles) from a sensor on an Android device. I want to use these in OpenCV for a perspective warp; cv2.warpPerspective takes a homography matrix to do its job. My question is: how do I convert the rotation matrix or the orientation array into a homography matrix that I can pass to warpPerspective?
Android code to get rotation matrix and orientation:
final float[] rotationMatrix = new float[9];
SensorManager.getRotationMatrix(rotationMatrix, null, accelerometerReading, magnetometerReading);
final float[] orientationAngles = new float[3];
SensorManager.getOrientation(rotationMatrix, orientationAngles);
OpenCV code for the perspective warp:
homographyMatrix = ... # to compute from the rotation matrix or orientation angles
warped = cv2.warpPerspective(img, homographyMatrix, (cols, 600))
Sample rotation matrix:
[
[-0.39098227, -0.24775778, 0.8864249],
[0.9200034, -0.07699536, 0.38427263],
[-0.026955934, 0.96575755, 0.2580418]
]
Sample euler angles:
[1.3097044 0.0269592 1.97264932]
Image to be transformed:
Desired transform (cuts on the left don't matter, I can fix them):
Then I will tile a floor in a segmented image.

A homography matrix projects a 3D point in the world (Pw) to a point on the image plane (Px). For example:
Px = H . Pw
This comes from the full projection equation:
Px = K . (R . Pw + t)
where K is the camera intrinsic matrix, built from the focal length (f) and the principal point (c), which is most of the time the center of the image:
K = [
[f, 0, cx],
[0, f, cy],
[0, 0, 1 ],
]
And R is a 3D rotation matrix. To see where the homography comes from, assume that the world point lies on the z = 0 plane. Moving it into camera coordinates requires the rotation R and the translation t (the offset of the world coordinates from the camera coordinates). Because z = 0, the third column of R drops out, and the combination of K, the one-column-removed R, and t is the homography matrix: H = K . [r1 r2 t].
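As a minimal sketch of that construction (the focal length, principal point, R and t below are placeholder values you would replace with your own camera intrinsics and sensor data):

import numpy as np

# Hypothetical intrinsics: focal length f and principal point (cx, cy)
f, cx, cy = 700.0, 320.0, 240.0
K = np.array([[f, 0, cx],
              [0, f, cy],
              [0, 0,  1]], dtype=np.float64)

R = np.eye(3)                   # rotation matrix, e.g. from the device sensor
t = np.array([0.0, 0.0, 1.0])   # translation of the plane w.r.t. the camera

# Points lie on z = 0, so drop the third column of R and append t
H = K @ np.column_stack((R[:, 0], R[:, 1], t))
H /= H[2, 2]                    # normalize so that H[2, 2] == 1

The resulting H can then be passed to cv2.warpPerspective.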
TL;DR
The mathematical details are tedious to work through. I found a virtual-camera implementation (vcam) on GitHub that makes these projections easy:
import cv2
from vcam import vcam, meshGen

img = cv2.imread("chess.png")
H, W = img.shape[:2]

# Create the virtual camera and a planar mesh covering the image
c1 = vcam(H=H, W=W)
plane = meshGen(H, W)
pts3d = plane.getPlane()

# Camera parameters
c1.focus = 390
c1.sx = 101/100
c1.sy = 101/100
c1.KpCoeff[0] = 0
c1.KpCoeff[1] = 0
c1.KpCoeff[2] = 0
c1.KpCoeff[3] = 0
c1.set_tvec(0, 0, -500)
c1.set_rvec(0, 0, -180)

# Project the plane and remap the image
pts2d = c1.project(pts3d)
map_x, map_y = c1.getMaps(pts2d)
output = cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
It also has an interesting GUI for experimenting with how the parameters affect the output.

Related

Projection of a 3D circle onto a 2D camera image

Asked this on math.stackexchange, but no responses so trying here, hopefully the computer vision people are more able to help out.
Assume that I have a 3D circle with a center at (c1, c2, c3) in the circle coordinate frame C. The radius of the circle is r, and there is a unit vector (v1, v2, v3) (also in coordinate frame C) normal to the plane of the circle at the center point.
I have a pinhole camera located at point (k1, k2, k3) in the camera coordinate frame K. I have a known camera-to-circle coordinate frame transformation matrix kTc that transforms any point in C to coordinate frame K so that point_k = np.dot(kTc, point_c) where point_k is a point in the K frame coordinates and point_c is a point in the C frame coordinates. The camera has a known intrinsic camera matrix I.
How do I project the 3D circle onto the image plane of the camera?
Ideally would like to do this in python.
I think you want to use scipy in conjunction with numpy. The problem you are solving is transforming the circle into a 2D plane and then transforming that plane (by rotating and shifting) into the camera plane.
First of all, some information is missing here: the camera plane (phi and/or theta with respect to the origin, or the equation of the plane) and the circle (its equation in radial coordinates). The transformations are hard to pin down without equations, angles, and plane information. You have given the information as variables, but it helps to have the equations at hand, so the first step would be writing them out (by hand or with Mathematica, etc.).
I will be using this documentation to attempt to answer your question - https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.transform.Rotation.html
The following is pseudocode; do not attempt to execute it as-is.
import numpy as np
from scipy.spatial.transform import Rotation
# Define the center of the circle in the circle coordinate frame
c = np.array([c1, c2, c3])
# Define the normal vector of the circle in the circle coordinate frame
n = np.array([v1, v2, v3])
# Define the camera center in the camera coordinate frame
k = np.array([k1, k2, k3])
# Define the intrinsic camera matrix
I = np.array(..., dtype=np.float32)
# Define the rotation matrix part of kTc
R = np.array(..., dtype=np.float32)
# Define the translation vector part of kTc
t = np.array(..., dtype=np.float32)
# Convert R to a scipy Rotation object (from_dcm has been renamed to from_matrix in recent SciPy)
R_obj = Rotation.from_matrix(R)
# Transform the center of the circle to the camera coordinate frame
c_k = R_obj.apply(c) + t
# Project the center of the circle onto the image plane
c_p = I @ c_k
# Normalize the center projection by dividing by its third component
c_p = c_p[:2] / c_p[2]
# Transform the normal vector to the camera coordinate frame
n_k = R_obj.apply(n)
# Project the normal vector onto the image plane
n_p = I @ n_k
# Normalize the normal projection by dividing by its third component
n_p = n_p / n_p[2]
# Calculate the equation of the circle in the image plane
r_squared = r**2 - np.sum(n_p[:2]**2) / n_p[2]**2
And here's an explanation of the equations used in the code:
c = np.array([c1, c2, c3]): The center of the circle in the circle coordinate frame is represented by the vector c = [c1, c2, c3].
n = np.array([v1, v2, v3]): The normal vector of the circle in the circle coordinate frame is represented by the vector n = [v1, v2, v3].
k = np.array([k1, k2, k3]): The camera center in the camera coordinate frame is represented by the vector k = [k1, k2, k3].
I = np.array(..., dtype=np.float32): The intrinsic camera matrix is represented by the 2D array I.
R = np.array(..., dtype=np.float32): The rotation matrix part of the kTc transformation matrix is represented by the 3x3 array R.
t = np.array(..., dtype=np.float32): The translation vector part of the kTc transformation matrix is represented by the 3-element array t.
Answer algorithm:
To project the 3D circle onto the image plane of the camera, you'll need to follow these steps:
Transform the 3D circle from the circle coordinate frame (C) to the camera coordinate frame (K) using the transformation matrix kTc. The transformed circle's center will be given by:
center_k = np.dot(kTc, [c1, c2, c3, 1])
Project the transformed circle's center onto the image plane of the camera using the intrinsic camera matrix I. The projection of the center will be given by:
center_p = np.dot(I, center_k)
Transform the normal vector from the circle coordinate frame (C) to the camera coordinate frame (K) using the transformation matrix kTc. The transformed normal vector will be given by:
normal_k = np.dot(kTc[:3, :3], [v1, v2, v3])
Project the transformed normal vector onto the image plane of the camera using the intrinsic camera matrix I. The projection of the normal vector will be given by:
normal_p = np.dot(I, normal_k)
Using the center and normal projections, you can now find the equation of the 2D circle in the image plane of the camera. The equation of the circle in the image plane will be given by:
(x - center_p[0]/center_p[2])^2 + (y - center_p[1]/center_p[2])^2 = r^2 - (normal_p[0]^2 + normal_p[1]^2)/(normal_p[2]^2)
where (x, y) are the image plane coordinates of the projected circle.
Note that you'll need to normalize the center and normal projections by dividing each by their respective third components (center_p[2] and normal_p[2]) to get their actual image plane coordinates.
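As a practical alternative (a minimal sketch, not a definitive implementation), you can also simply sample points along the 3D circle in frame C, push them through kTc, and project them with the pinhole model; note that the projected outline of a 3D circle is in general an ellipse. The kTc, I, c, n and r arguments below stand in for your own data:

import numpy as np

def project_circle(kTc, I, c, n, r, num_pts=100):
    # Sample the circle in frame C, transform to frame K, project with intrinsics I
    n = n / np.linalg.norm(n)
    # Two unit vectors u, v spanning the circle's plane (both orthogonal to n)
    a = np.array([1.0, 0.0, 0.0]) if abs(n[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    u = np.cross(n, a)
    u /= np.linalg.norm(u)
    v = np.cross(n, u)
    theta = np.linspace(0.0, 2.0 * np.pi, num_pts)
    # Circle points in frame C, shape (N, 3)
    pts_c = c + r * (np.outer(np.cos(theta), u) + np.outer(np.sin(theta), v))
    # Transform to frame K with the 4x4 matrix kTc
    pts_c_h = np.hstack([pts_c, np.ones((num_pts, 1))])
    pts_k = (kTc @ pts_c_h.T)[:3, :]
    # Pinhole projection and perspective division
    proj = I @ pts_k
    return (proj[:2, :] / proj[2, :]).T  # (N, 2) image-plane coordinates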

Camera calibration, focal length value seems too large

I tried a camera calibration with python and opencv to find the camera matrix. I used the following code from this link
https://automaticaddison.com/how-to-perform-camera-calibration-using-opencv/
import cv2 # Import the OpenCV library to enable computer vision
import numpy as np # Import the NumPy scientific computing library
import glob # Used to get retrieve files that have a specified pattern
# Path to the image that you want to undistort
distorted_img_filename = r'C:\Users\uid20832\3.jpg'
# Chessboard dimensions
number_of_squares_X = 10 # Number of chessboard squares along the x-axis
number_of_squares_Y = 7 # Number of chessboard squares along the y-axis
nX = number_of_squares_X - 1 # Number of interior corners along x-axis
nY = number_of_squares_Y - 1 # Number of interior corners along y-axis
# Store vectors of 3D points for all chessboard images (world coordinate frame)
object_points = []
# Store vectors of 2D points for all chessboard images (camera coordinate frame)
image_points = []
# Set termination criteria. We stop either when an accuracy is reached or when
# we have finished a certain number of iterations.
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# Define real world coordinates for points in the 3D coordinate frame
# Object points are (0,0,0), (1,0,0), (2,0,0) ...., (5,8,0)
object_points_3D = np.zeros((nX * nY, 3), np.float32)
# These are the x and y coordinates
object_points_3D[:,:2] = np.mgrid[0:nY, 0:nX].T.reshape(-1, 2)
def main():
    # Get the file path for images in the current directory
    images = glob.glob(r'C:\Users\Kalibrierung\*.jpg')

    # Go through each chessboard image, one by one
    for image_file in images:

        # Load the image
        image = cv2.imread(image_file)

        # Convert the image to grayscale
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

        # Find the corners on the chessboard
        success, corners = cv2.findChessboardCorners(gray, (nY, nX), None)

        # If the corners are found by the algorithm, draw them
        if success == True:

            # Append object points
            object_points.append(object_points_3D)

            # Find more exact corner pixels
            corners_2 = cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)

            # Append image points
            image_points.append(corners)

            # Draw the corners
            cv2.drawChessboardCorners(image, (nY, nX), corners_2, success)

            # Display the image. Used for testing.
            #cv2.imshow("Image", image)

            # Display the window for a short period. Used for testing.
            #cv2.waitKey(200)

    # Now take a distorted image and undistort it
    distorted_image = cv2.imread(distorted_img_filename)

    # Perform camera calibration to return the camera matrix, distortion coefficients,
    # rotation and translation vectors etc.
    ret, mtx, dist, rvecs, tvecs = cv2.calibrateCamera(object_points,
                                                       image_points,
                                                       gray.shape[::-1],
                                                       None,
                                                       None)
But I think I always get wrong parameters. My focal length is around 1750 in both the x and y direction from the calibration. I think this can't be right; it seems far too large. The camera documentation says the focal length is between 4 and 7 mm, and I am not sure why it comes out so high from the calibration. Here are some of my photos for the calibration; maybe something is wrong with them. I moved the chessboard under the camera at different positions, angles and heights.
I was also wondering why I don't need the size of the squares in the code. Can someone explain it to me, or did I forget this input somewhere?
Your misconception is about "focal length". It's an overloaded term.
"focal length" (unit mm) in the optical part: it describes the distance between the lens plane and image/sensor plane, assuming a focus to infinity
"focal length" (unit pixels) in the camera matrix: it describes a scale factor for mapping the real world to a picture of a certain resolution
1750 may very well be correct, if you have a high resolution picture (Full HD or something).
The calculation goes:
f [pixels] = (focal length [mm]) / (pixel pitch [µm / pixel])
(take care of the units and prefixes, 1 mm = 1000 µm)
Example: a Pixel 4a phone, which has 1.40 µm pixel pitch and 4.38 mm focal length, has f = ~3128.57 (= fx = fy).
Another example: A Pixel 4a has a diagonal Field of View of approximately 77.7 degrees, and a resolution of 4032 x 3024 pixels, so that's 5040 pixels diagonally. You can calculate:
f = (5040 / 2) / tan(~77.7° / 2)
f = ~3128.6 [pixels]
And that calculation you can apply to arbitrary cameras for which you know the field of view and picture size. Use horizontal FoV and horizontal resolution if the diagonal resolution is ambiguous. That can happen if the sensor isn't 16:9 but the video you take from it is cropped to 16:9... assuming the crop only crops vertically, and leaves the horizontal alone.
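As a quick sanity check (a minimal sketch using the Pixel 4a numbers from the example above), you can compute the expected f in pixels both ways:

import math

# From focal length and pixel pitch (mind the prefixes: 1 mm = 1000 µm)
focal_length_mm = 4.38
pixel_pitch_um = 1.40
print((focal_length_mm * 1000.0) / pixel_pitch_um)   # ~3128.6 pixels

# From field of view and picture size (diagonal FoV, diagonal resolution)
fov_deg = 77.7
diagonal_px = math.hypot(4032, 3024)                 # 5040
print((diagonal_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0))   # ~3128.6 pixels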
Why don't you need the size of the chessboard squares in this code? Because it only calibrates the intrinsic parameters (camera matrix and distortion coefficients). Those don't depend on the distance to the board or any other object in the scene.
If you were to calibrate extrinsic parameters, i.e. the distance of cameras in a stereo setup, then you would need to give the size of the squares.
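If you do need metric extrinsics later (e.g. tvecs in millimetres, or a stereo baseline), the usual approach is to scale the object points by the square size before calibrating; a minimal sketch, assuming hypothetical 25 mm squares:

square_size_mm = 25.0  # hypothetical edge length of one chessboard square
object_points_3D = np.zeros((nX * nY, 3), np.float32)
object_points_3D[:, :2] = np.mgrid[0:nY, 0:nX].T.reshape(-1, 2) * square_size_mm
# The intrinsics (camera matrix, distortion coefficients) are unaffected by this
# scaling, but rvecs/tvecs and any stereo baseline then come out in millimetres.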

Homogeneous transform of point clouds in camera coordinates to world coordinates using OpenCV

I have camera calibration intrinsics and extrinsics (including rotations and translations, i.e. rvecs and tvecs) for a set of N camera poses, relative to a fixed ChArUco target.
Additionally, for every camera pose, I have a set of 3D coordinates ("point cloud") defined in the standard OpenCV camera coordinate system.
After a lot of reading, I would have thought that I need to first compute each camera pose relative to the ChArUCo board, by constructing a homogeneous transformation matrix like so in python:
# initialize 4x4 transform
inverted_homogeneous_transform_matrix = np.zeros(shape=(4,4))
inverted_homogeneous_transform_matrix[3,3] = 1.0

# convert Rodrigues vector into a rotation matrix, and then invert it
rotation_matrix = np.zeros(shape=(3,3))
cv2.Rodrigues(rvecs, rotation_matrix)
inverted_rotation = rotation_matrix.transpose()

# add inverted rotation to transform
inverted_homogeneous_transform_matrix[:3,:3] = inverted_rotation

# compute inverted translation, e.g. see http://ksimek.github.io/2012/08/22/extrinsic/
inverted_translation_vector = -np.dot(inverted_rotation, tvecs)
inverted_homogeneous_transform_matrix[:3,3] = np.asarray(inverted_translation_vector).flatten()

# x_coords, y_coords, z_coords are defined in camera coordinate system
x_coords = np.asarray([1,2,3,4,5])
y_coords = np.asarray([2,4,6,8,10])
z_coords = np.asarray([3,6,9,12,15])
homogeneous_ones = np.ones(len(x_coords))
homogeneous_points = np.matrix([x_coords, y_coords, z_coords, homogeneous_ones])

# perform the transformation
transformed_points = inverted_homogeneous_transform_matrix * homogeneous_points

# clean up to extract x,y,z values from matrix and save as 1D arrays
x_coords = np.asarray(transformed_points[0,:]).flatten()
y_coords = np.asarray(transformed_points[1,:]).flatten()
z_coords = np.asarray(transformed_points[2,:]).flatten()
Basically, the above code works, but when I run it on multiple point clouds from different camera perspectives, they do not magically line up as I would expect. I can confirm that my inverse homogeneous transform is indeed the inverse of the homogeneous transform constructed directly from rvecs and tvecs; I have tried breaking it apart into translation first then rotation, and vice versa; and I have seen a non-inverse transformation make everything almost line up from one perspective (but all rotated weirdly from every other perspective)...
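For reference, the same camera-to-world transform can be built and cross-checked more compactly (a minimal sketch, assuming rvecs/tvecs here are the single rvec/tvec of one pose from the ChArUco estimate):

import numpy as np
import cv2

def camera_to_world_transform(rvec, tvec):
    # Build the 4x4 world-to-camera transform and return its inverse
    R, _ = cv2.Rodrigues(rvec)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = np.asarray(tvec).flatten()
    return np.linalg.inv(T)   # equals [[R.T, -R.T @ t], [0 0 0 1]]

# Usage: transform (4, N) homogeneous camera-frame points into the board frame
# T_inv = camera_to_world_transform(rvecs, tvecs)
# world_points = T_inv @ np.asarray(homogeneous_points)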
Any help appreciated!

Inconsistency of left-right image point reprojections after reprojectImageTo3D

Hi everyone. I'm trying to triangulate some points (dense reconstruction) lying on a plane, in a setup which involves two cameras.
[Reference image]: https://imgur.com/gOps4vP and [The other image]: https://imgur.com/VIiH9Rv
First of all, I solve the relative pose problem using the 5-point algorithm on the undistorted points for the essential matrix estimation, then I recover the pose. I'm using RANSAC.
Then, I rectify the stereo pairs the usual way.
R1, R2, Pn1, Pn2, Q, _, _ = cv2.stereoRectify(K1, dcoeffs1, K2, dcoeffs2,
                                              img1.shape[::-1], R, t,
                                              flags=cv2.CALIB_ZERO_DISPARITY,
                                              alpha=-1)

# Compute the rigid transform that OpenCV applies to world points (USEFUL LATER)
# in order for the rectified reference camera to be K_new[I|0]
tn_1 = np.zeros((3,1))  # Cameras are never translated during rectification
G1_rect = np.block([[R1, tn_1], [np.zeros((1,3)), 1.0]])

maps1 = cv2.initUndistortRectifyMap(K1, dcoeffs1, R1, Pn1, (1920,1080), cv2.CV_32FC1)
maps2 = cv2.initUndistortRectifyMap(K2, dcoeffs2, R2, Pn2, (1920,1080), cv2.CV_32FC1)

img1_remap = cv2.remap(img1, maps1[0], maps1[1], cv2.INTER_LANCZOS4)
img2_remap = cv2.remap(img2, maps2[0], maps2[1], cv2.INTER_LANCZOS4)
Result of the rectification:
[Rectified reference image] https://drive.google.com/open?id=10VfgXrXFO3_lYqtO9qJXr17Dc6F1PuXU
[The other one rectified] https://drive.google.com/open?id=13ZkeMiF5xEovGmX13LSQVaJ237hoJLX0
Now I call a function that recognizes a known object (the target) in the images.
# Find the target; returns the 4 corners of the detected polygon
target_corners, _ = dt.detectTarget(img_scene1, img_target, 0.5)
target_corners = target_corners[:,0,:]
# Compute the mask for the target cutout (same dimensions as the image)
target_mask = mp.maskPolygon(target_corners, img_scene1.shape[::-1])
Target found (please note the highlighted corners):
[Target found] https://imgur.com/QjYV8tp
Then I compute the disparity map using StereoSGBM. I'm interested in the computation of the target disparity only (I'll mask all the other points).
With the disparity map obtained, and using the 4x4 reprojection matrix Q given by stereoRectify, I perform the 3D reprojection of the disparity map.
# Compute disparity map
# https://docs.opencv.org/3.3.1/d2/d85/classcv_1_1StereoSGBM.html
window_size = 5
min_disp = 16
max_disp = 1024
num_disp = max_disp - min_disp  # Must be divisible by 16!
stereo = cv2.StereoSGBM_create(minDisparity = min_disp,
                               numDisparities = num_disp,
                               blockSize = window_size,
                               P1 = 8*3*window_size**2,
                               P2 = 32*3*window_size**2,
                               disp12MaxDiff = 1,
                               uniquenessRatio = 10,
                               speckleWindowSize = 150,
                               speckleRange = 2
                               )
print('Computing SGBM disparity...')
disp = stereo.compute(img_scene1, img_scene2).astype(np.float32) / 16.0
target_disparity = target_mask*disp
points = cv2.reprojectImageTo3D(target_disparity, Q)
# DEBUG:
cv2.namedWindow('scene1', cv2.WINDOW_NORMAL)
cv2.resizeWindow('scene1', 800,450)
cv2.imshow('scene1', img_scene1)
cv2.namedWindow('disparity', cv2.WINDOW_NORMAL)
cv2.resizeWindow('disparity', 800,450)
cv2.imshow('disparity', (disp-min_disp)/num_disp)
cv2.namedWindow('target_disparity', cv2.WINDOW_NORMAL)
cv2.resizeWindow('target_disparity', 800,450)
cv2.imshow('target_disparity', target_mask*(disp-min_disp)/num_disp)
cv2.waitKey()
cv2.destroyAllWindows()
# Obtain matrix of the target 3D points starting from disparity image obtained from reprojectImageTo3D()
mask_disp = disp > disp.min()
mask_inf = ~(np.isinf(points[:,:,0]) | np.isinf(points[:,:,1]) | np.isinf(points[:,:,2]))
mask_nan = ~(np.isnan(points[:,:,0]) | np.isnan(points[:,:,1]) | np.isnan(points[:,:,2]))
mask = mask_disp & mask_inf & mask_nan
pts3D = points[mask]
Now I have a 3D reconstruction of the region of the images corresponding to the target. I noted that during rectification OpenCV applies a rigid transform to world points, such that the original reference camera and the new (rectified) reference camera have the same extrinsics (R = eye(3) and t = [0,0,0]'). In fact, during rectification both cameras must be rotated, and I think OpenCV simply brings the new cameras back to a reference in which the rectified reference camera has the same extrinsics as the original one. But this implies that the reconstructed 3D points are expressed in a world reference that is not the world reference of the original camera!
So, by applying the inverse rigid transform to pts3D, we obtain a reconstruction in the original reference camera frame (see code).
target3Dpts_hom = cv2.convertPointsToHomogeneous(target3Dpts)[:,0,:].T
target3Dpts_hom = G.T @ target3Dpts_hom
new_target3Dpts = cv2.convertPointsFromHomogeneous(target3Dpts_hom.T[:,np.newaxis,:])[:,0,:]
Please note that if I don't perform this operation, the 3D points reprojected onto the original cameras by means of their projection matrices do not correspond to the target points!
To check the reconstruction via reprojection, I can now reproject new_target3Dpts.
Let me introduce the projection function that I call:
def proj_dist(P, dcoeffs, M):
    import numpy as np
    import cv2
    K, R, t, _, _, _, _ = cv2.decomposeProjectionMatrix(P)
    rotv, _ = cv2.Rodrigues(R)
    # Projection. Returns a (N,2) shaped array
    m, _ = cv2.projectPoints(M, rotv, t[0:-1], K, dcoeffs)
    m = m.squeeze()
    return m
Finally, the reprojections:
# P_kin = K_kin[eye(3)|0]   # original MPPs (projection matrices) of the two cameras
# P_rpi = K_rpi[R|t]
m0 = proj.proj_dist(P_kin, dcoeffs_kin, new_points).astype('int32')
for (x, y) in m0:
    x = int(x)
    y = int(y)
    cv2.circle(img_kin, (x, y), 2, (255, 255, 0), 4)
cv2.namedWindow('frame1', cv2.WINDOW_NORMAL)
cv2.resizeWindow('frame1', 800,450)
cv2.imshow('frame1', img_kin)
cv2.waitKey(0)

m1 = proj.proj_dist(P_rpi, dcoeffs_rpi, new_points).astype('int32')
img_rpi1 = img_rpi.copy()
for (x, y) in m1:
    x = int(x)
    y = int(y)
    cv2.circle(img_rpi1, (x, y), 2, (255, 255, 0), 4)
cv2.namedWindow('frame2', cv2.WINDOW_NORMAL)
cv2.resizeWindow('frame2', 800,450)
cv2.imshow('frame2', img_rpi1)
cv2.waitKey(0)
But while the reprojected points on the original reference camera are correct, this is not true for the second one: the points are simply translated, and I can't explain why.
Results: [First frame repj] https://imgur.com/S4lo9Wz
[2nd frame repj. Error] https://imgur.com/y4igaEI
Any ideas? I will include all the code now.
Thank you.
SM
I solved the problem. It is not related to reprojectImageTo3D (that works fine), but to this piece of code I wrote and used to reproject the points onto the original frames:
def proj_dist(P, dcoeffs, M):
    import numpy as np
    import cv2
    K, R, t, _, _, _, _ = cv2.decomposeProjectionMatrix(P)
    rotv, _ = cv2.Rodrigues(R)
    # Projection. Returns a (N,2) shaped array
    m, _ = cv2.projectPoints(M, rotv, t[0:-1], K, dcoeffs)
    m = m.squeeze()
    return m
I wrote my own function for point projection:
def proj(P, M, hom=0):
    # proj(): performs the perspective projection of the 3D points M, according to
    # the projection matrix P, onto the 2D image plane of a pinhole camera.
    import numpy as np
    n = M.shape[1]
    M = np.concatenate((M, np.ones((1,n))))
    # Projection
    m = P @ M
    m = m / m[2,:]
    if hom != 1:
        # Convert back to Cartesian coordinates
        m = m[0:2,:]
    return m
and the problem is solved!
My function does not take lens distortion into account. I'll further investigate the problem related to the projectPoints() OpenCV function.
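For a distortion-aware variant, one likely culprit (an assumption to verify, not a confirmed fix) is that cv2.decomposeProjectionMatrix returns the camera centre as a 4-element homogeneous vector rather than the translation vector that cv2.projectPoints expects, so it must first be dehomogenized and converted with t = -R @ C. A minimal sketch:

import numpy as np
import cv2

def proj_dist_fixed(P, dcoeffs, M):
    # Project 3D points M (N,3) with projection matrix P and distortion coefficients dcoeffs
    K, R, C_hom, _, _, _, _ = cv2.decomposeProjectionMatrix(P)
    C = (C_hom[:3] / C_hom[3]).reshape(3, 1)    # camera centre in world coordinates
    t = -R @ C                                  # translation vector expected by projectPoints
    rvec, _ = cv2.Rodrigues(R)
    m, _ = cv2.projectPoints(M.reshape(-1, 1, 3).astype(np.float64), rvec, t, K, dcoeffs)
    return m.squeeze()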

Calculate camera world position with OpenCV Python

I want to calculate my camera's position in world coordinates. This should be fairly easy, but I don't get the results I expect. I believe I've read everything on this topic, but my code isn't working. Here's what I do:
I have a camera looking at an area.
1) I drew a map of the area.
2) I calculated the homography by matching 4 image points to 4 points on my map using cv2.getPerspectiveTransform
3) The H homography transforms every world coordinate to camera coordinate; this is working properly
4) To calculate the camera matrix I followed this:
translation = np.zeros((3,1))
translation[:,0] = homography[:,2]
rotation = np.zeros((3,3))
rotation[:,0] = homography[:,0]
rotation[:,1] = homography[:,1]
rotation[:,2] = np.cross(homography[0:3,0],homography[0:3,1])
cameraMatrix = np.zeros((3,4))
cameraMatrix[:,0:3] = rotation
cameraMatrix[:,3] = homography[:,2]
cameraMatrix = cameraMatrix/cameraMatrix[2][3] #normalize the matrix
5) According to this, the camera's position should be calculated like this:
x,y,z = np.dot(-np.transpose(rotation),translation)
The coordinates I'm getting are totally wrong. The problem should be somewhere in step 4 or 5 I guess. What's wrong with my method?
I think I've got it now. The problem was with the method described in step 4. The camera position cannot be calculated from the homography matrix alone. The camera intrinsics matrix is also necessary. So, the correct procedure is the following:
1) draw a map of the area
2) calibrate the camera using chessboard images with cv2.findChessboardCorners and cv2.calibrateCamera; this yields the camera matrix and the distortion coefficients
3) run solvePnP with the world coordinates (3D) and image coordinates (2D). solvePnP returns the object's origin in the camera's coordinate system, given the 4 corresponding points and the camera matrix.
4) Now I need to calculate the camera's position in world coordinates. The rotation matrix is: rotM = cv2.Rodrigues(rvec)[0]
5) The x,y,z position of the camera is: cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)
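Put together as a minimal sketch (the point correspondences, camera matrix and distortion coefficients below are hypothetical placeholders for your own calibration data):

import numpy as np
import cv2

# Hypothetical 4 world points on the ground plane (z = 0) and their pixel locations
world_pts = np.array([[0, 0, 0], [4, 0, 0], [4, 3, 0], [0, 3, 0]], dtype=np.float64)
image_pts = np.array([[320, 400], [550, 420], [580, 250], [300, 230]], dtype=np.float64)

camera_matrix = np.array([[800, 0, 320],
                          [0, 800, 240],
                          [0,   0,   1]], dtype=np.float64)   # from cv2.calibrateCamera
dist_coeffs = np.zeros(5)                                     # from cv2.calibrateCamera

ok, rvec, tvec = cv2.solvePnP(world_pts, image_pts, camera_matrix, dist_coeffs)

rotM = cv2.Rodrigues(rvec)[0]
cameraPosition = -rotM.T @ tvec   # camera position in world coordinates
print(cameraPosition.ravel())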
