Estimating relative rotation between two uncalibrated cameras from the fundamental matrix - python

I have two images of the same object that were taken from different positions and an uncalibrated camera. I want to treat these images pair as stereo pairs. Then I can estimate a fundamental matrix, without any knowledge of the scene. I want to know the rotation angle between camera two relative to the first one.
pts1 and pts2 are the array of matching key points that are obtained from one of the known Feature Descriptor methods. RANSAC is used to exclude outliers.
F, mask = cv2.findFundamentalMat(pts1,pts2,v2.FM_RANSAC, 3,0.99 )
p1fNew and p2fNew - inliers of key points
With the stereoRectifyUncalibrated method, I can obtain two matrices to rectify images into the same plane.
retBool, H1, H2 = cv2.stereoRectifyUncalibrated(p1fNew,p2fNew, F, image_size)
By multiplying these matrices I can estimate transformation of one image plane to another:
T = np.linalg.inv(np.mat(H1)) * np.mat(H2)
How do I decompose this T matrix and extract from it rotation information and vector direction from the first image center to another?

Related

How to find the rotation matrix with known points

The corner points of a rectangle are given in 2D coordinates. I also know the real distance of the points and I have the camera matrix.
Now I want to find the rotation vector with respect to the camera, but without using the cv2.calibrateCamera() method with the chessboard corners.

Projecting a Texture Mask onto an existing 3D Mesh given the camera extrinsics

Given an image mask, I want to project the pixels onto a mesh in respect to the position and orientation of the camera and convert these pixels into a pointcloud. I have the intrinsic and extrinsic parameters of the camera in respect to the world, and the location of the mesh in world coordinates. I know the mapping from world coordinates to camera image is as follow:
imgpoint = Intrinsic * Extrinsic * worldpoint
So when I want to the opposite i do the inverse of the intrinsic and extrinsic matrices:
worldpoint= Intrinsic^(-1) * Extrinsic^(-1) * imgpoint
However, the idea that I had was to obtain two points from one pixel, with different depth values, to obtain a line and then look for the closest intersection for the mesh I want with the line, but I do not know how to properly generate a point away from the original camera plane. How can I find this extra point and/or am I complicating this problem?
The top equation below shows how to project a point (x,y,z) onto a pixel (u,v);
The extrinsic parameters are the 3x3 rotation matrix R and translation t.
The intrinsic parameters are the focal distances f_x, f_y and
principal point (c_x, c_y). The value alpha is the perspective foreshortening term that is divided out.
The bottom equation reverses the process by describing how to project
a ray from the camera position through through the pixel (u,v) out into the scene as the parameter alpha varies from 0 to infinity.
Now we have converted the problem into a ray casting problem.
Find the intersection of the ray with your mesh which is a
standard computer graphics problem.

Visual Odometry - Problem with Rotation and translation

I'm trying to reconstruct the trajectory of a camera by extracting the features from images and computing the camera pose, however the trajectory is not as I expected. For my visual odometry system I am using an Intel camera D435i which extracts color and depth frames. My code is written with python and all the important functions used are from the OpenCV library.
I am using a dataset where the camera just moves straightforward and therefore I expect the rotation to be the identity matrix and the translation to go forward. However, this isn't why I get when I compute them.
This is my approach:
Read the color frames, transform them to grayscale and apply CLAHE-method to reduce light contrasts
Extract features using SIFT from the frame t and frame t+1
Match them using FLANN and apply Lowe's ratio test to filter outliers
Transform the 2D pixel coordinates (u1p, v1p) of the features to 3D world coordinates using the camera intrinsic matrix (K) and the depth of the pixels (s) (Acquired from the depth frames):
Normalization: x_y_norm = (K^-1) * [u1p ,v1p, 1]
Scaling: x_y_norm_s *= x_y_norm
Afterwards I try two methods: PNP and essential matrix
A) PNP:
4. Use the function cv2.solvePNPRansac() to get the rotation vector and the translation vector. The inputs of the function are the normalized and scaled keypoints of the frame t (object points) and the normalized keypoints of the frame t+1 (image points). The rotation matrix is then computed with the function cv2.Rodrigues()
B) Essential matrix:
4. Use cv2.findEssentialMat() with the normalized x-y-coordinates of the keypoints as input. Check if the essential matrix is computed correctly by checking the singular values. Use cv2.recoverPose() to get the rotation and translation (Here I also used the normalized x-y-coordinates as input).
Compute the trajectory as follows (assuming the second camera_pose is [R|t] and the first pose is the identity matrix as stated in the book multiple view geometry):
camera_pose[t+1] = pose[t] * pose[t+1]^-1 (Get pose from initial position by multiplying previous and current pose as stated in Combining relative camera rotations & translations with known pose )
position = pose[t+1] * [0,0,0,1] (Position calculated from the origin)
trajectory[:,t+1] = position[0:3]
I hope someone can help me since I don't know where my error is and neither the PnP method nor essential matrix method work.
Edit: For both methods I added a picture of the trajectory (in the Z-X and the X-Y plane) that I get from the first 15 frames of my dataset and an extract of a text file containing the rotation matrix and the translation vector of some frames. The Z-X-Plane should show dots forming a line in positive direction, since my dataset just shows images where the camera moves straightforward, however a random trajectory is shown. The green dots are old camera poses and the red dot the current one. I also added an example of the features extracted in the first frame and matched with the second frame, they seem too be alright.

Homogeneous transform of point clouds in camera coordinates to world coordinates using OpenCV

I have camera calibration intrinsics and extrinsics (including rotations and translations, i.e. rvecs and tvecs, for a set of N camera poses, relative to a fixed ChArUco target.
Additionally, for every camera pose, I have a set of 3D coordinates ("point cloud") defined in the standard OpenCV camera coordinate system.
After a lot of reading, I would have thought that I need to first compute each camera pose relative to the ChArUCo board, by constructing a homogeneous transformation matrix like so in python:
# initialize 4x4 transform
inverted_homogeneous_transform_matrix = np.zeros(shape=(4,4))
inverted_homogeneous_transform_matrix[3,3] = 1.0
# convert Rodrigues vector into Rodrigues matrix, and then invert it
rotation_matrix = np.zeros(shape=(3,3))
cv2.Rodrigues(rvecs, rotation_matrix)
inverted_rotation = rotation_matrix.transpose()
# add inverted rotation to transform
inverted_homogeneous_transform_matrix[:3,:3] = inverted_rotation
# compute inverted translation, e.g. see http://ksimek.github.io/2012/08/22/extrinsic/
inverted_translation_vector = -inverted_rotation * tvecs
inverted_transform_matrix[:3,3] = np.asarray(inverted_translation_vector).flatten()
# x_coords, y_coords, z_coords are defined in camera coordinate system
x_coords=np.asarray([1,2,3,4,5])
y_coords=np.asarray([2,4,6,8,10])
z_coords=np.asarray([3,6,9,12,15])
homogeneous_ones = np.ones(len(x_coords))
homogeneous_points = np.matrix([x_coords, y_coords, z_coords, homogeneous_ones])
# perform the transformation
transformed_points = inverted_transform_matrix * homogeneous_points
# clean up to extract x,y,z values from matrix and save as 1D array
x_coords = np.asarray(transformed_points[0,:]).flatten()
y_coords = np.asarray(transformed_points[1,:]).flatten()
z_coords = np.asarray(transformed_points[1,:]).flatten()
Basically, the above code works, but when I run it on multiple point clouds from different camera perspectives, they do not magically line up as I would expect. I can confirm that my inverse homogeneous transform is indeed an inverse to the homogeneous transform constructed directly from rvecs and tvecs; have tried breaking it apart into translation first then rotation, and vice versa; and have seen a non-inverse transformation make everything almost line up from one perspective (but all rotated weirdly from every other perspective)...
Any help appreciated!

Rotate an image about a specific pixel in python

How can I rotate an image about a specific pixel in Python? I am trying to de-rotate a set of images of the night sky. Since the stars rotate around Polaris, I could define Polaris as the center of rotation and rotate each image to line up the stars.
In phadej's answer the transformation between the old and new coordinates of a point on the image is an affine transformation.
PIL (Python Imaging Library) has an image method called transform which can perform an affine transformation of an image.
The documentation for transform is near the bottom of this page.
With a little math: if each image's pixel position is vector a, and position of Polaris is p, then new position new_p is
new_p = a + R * (a-p)
where R is Rotation matrix.
There will be problem, as new_p is probably not an integer valued position-vector. You
can do it backwards. For each pixel of rotated image apply the inverse of above transform,
than you will get pixel from original image. As it could be not integer also, sample the neighbor pixels like in wu-pixels (the amount of dot spread around can be used as the sampling weight).

Categories

Resources