I'm trying to reconstruct the trajectory of a camera by extracting features from images and computing the camera pose; however, the trajectory is not what I expected. For my visual odometry system I am using an Intel RealSense D435i camera, which provides color and depth frames. My code is written in Python, and all the important functions used come from the OpenCV library.
I am using a dataset where the camera just moves straight forward, and therefore I expect the rotation to be the identity matrix and the translation to go forward. However, this isn't what I get when I compute them.
This is my approach (a short code sketch of these steps follows the list):
Read the color frames, transform them to grayscale and apply the CLAHE method to reduce light contrasts
Extract features using SIFT from the frame t and frame t+1
Match them using FLANN and apply Lowe's ratio test to filter outliers
Transform the 2D pixel coordinates (u1p, v1p) of the features to 3D world coordinates using the camera intrinsic matrix (K) and the depth of the pixels (s) (Acquired from the depth frames):
Normalization: x_y_norm = (K^-1) * [u1p ,v1p, 1]
Scaling: x_y_norm_s = s * x_y_norm
Afterwards I try two methods: PnP and the essential matrix
A) PnP:
4. Use the function cv2.solvePnPRansac() to get the rotation vector and the translation vector. The inputs of the function are the normalized and scaled keypoints of frame t (object points) and the normalized keypoints of frame t+1 (image points). The rotation matrix is then computed with the function cv2.Rodrigues()
B) Essential matrix:
4. Use cv2.findEssentialMat() with the normalized x-y-coordinates of the keypoints as input. Check if the essential matrix is computed correctly by checking the singular values. Use cv2.recoverPose() to get the rotation and translation (Here I also used the normalized x-y-coordinates as input).
Compute the trajectory as follows (assuming the second camera_pose is [R|t] and the first pose is the identity matrix, as stated in the book Multiple View Geometry):
camera_pose[t+1] = pose[t] * pose[t+1]^-1 (Get pose from initial position by multiplying previous and current pose as stated in Combining relative camera rotations & translations with known pose )
position = pose[t+1] * [0,0,0,1] (Position calculated from the origin)
trajectory[:,t+1] = position[0:3]
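To make the steps concrete, here is a stripped-down sketch of the PnP variant and the pose chaining (not my full code; depth_at(u, v) stands in for the depth lookup from the aligned depth frame):

import cv2
import numpy as np

def relative_pose_pnp(kp_t, kp_t1, depth_at, K):
    # Back-project the matched keypoints of frame t to 3D:
    # normalize with K^-1, then scale by the depth s of the pixel
    obj_pts, img_pts = [], []
    K_inv = np.linalg.inv(K)
    for (u0, v0), (u1, v1) in zip(kp_t, kp_t1):
        s = depth_at(u0, v0)
        if s <= 0:
            continue  # skip pixels with invalid depth
        obj_pts.append(s * (K_inv @ np.array([u0, v0, 1.0])))
        img_pts.append([u1, v1])
    obj_pts = np.asarray(obj_pts, dtype=np.float64)
    img_pts = np.asarray(img_pts, dtype=np.float64)

    # Note: solvePnPRansac expects the image points in pixel coordinates
    # together with K (and the distortion coefficients); if normalized
    # coordinates are passed instead, K would have to be the identity.
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj_pts, img_pts, K, None)
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec

def chain_pose(global_pose, R, t):
    # Build the 4x4 relative pose [R|t] and accumulate it as in the
    # trajectory computation above: pose[t+1] = pose[t] * rel^-1
    rel = np.eye(4)
    rel[:3, :3] = R
    rel[:3, 3] = t.ravel()
    return global_pose @ np.linalg.inv(rel)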
I hope someone can help me, since I don't know where my error is and neither the PnP method nor the essential matrix method works.
Edit: For both methods I added a picture of the trajectory (in the Z-X and the X-Y plane) that I get from the first 15 frames of my dataset, and an extract of a text file containing the rotation matrix and the translation vector of some frames. The Z-X plane should show dots forming a line in the positive direction, since my dataset just shows images where the camera moves straight forward; however, a random trajectory is shown. The green dots are old camera poses and the red dot is the current one. I also added an example of the features extracted in the first frame and matched with the second frame; they seem to be alright.
Related
I have two images of the same object that were taken from different positions with an uncalibrated camera. I want to treat this image pair as a stereo pair. Then I can estimate a fundamental matrix without any knowledge of the scene. I want to know the rotation angle of the second camera relative to the first one.
pts1 and pts2 are the array of matching key points that are obtained from one of the known Feature Descriptor methods. RANSAC is used to exclude outliers.
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3, 0.99)
p1fNew and p2fNew are the inlier key points (selected with the RANSAC mask).
With the stereoRectifyUncalibrated method, I can obtain two matrices to rectify images into the same plane.
retBool, H1, H2 = cv2.stereoRectifyUncalibrated(p1fNew,p2fNew, F, image_size)
By multiplying these matrices I can estimate the transformation of one image plane to another:
T = np.linalg.inv(np.mat(H1)) * np.mat(H2)
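To make the fragments above self-contained, this is roughly the full sequence I run (pts1, pts2 and image_size are assumed to exist already from the matching step):

import cv2
import numpy as np

# pts1, pts2: Nx2 float32 arrays of matched keypoints from a feature descriptor method
F, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC, 3, 0.99)

# keep only the RANSAC inliers
p1fNew = pts1[mask.ravel() == 1]
p2fNew = pts2[mask.ravel() == 1]

# homographies that rectify both images into a common plane
retBool, H1, H2 = cv2.stereoRectifyUncalibrated(p1fNew, p2fNew, F, image_size)

# transformation of one rectified image plane into the other
T = np.linalg.inv(H1) @ H2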
How do I decompose this T matrix and extract from it rotation information and vector direction from the first image center to another?
I have a .dxf file containing a drawing (template) which is just a piece with holes, from said drawing I successfully extract the coordinates of the holes and their diameters given in a list [[x1,y1,d1],[x2,y2,d2]...[xn,yn,dn]].
After this, I take a picture of the piece (same as template) and after some image processing, I obtain the coordinates of my detected holes and the contours. However, this piece in the picture can be rotated with respect to the template.
How do I do the right hole correspondence (between the coordinates of the holes in the template and the rotated coordinates of the holes in the image), so that I know which diameter corresponds to each hole in the image?
Is there any point-sorting method that can give me this correspondence?
I'm working with Python and OpenCV.
All answers will be highly appreciated. Thanks!!!
Image of Template: https://ibb.co/VVpWmKx
In the template image, contours are drawn at the same size as given in the .dxf file, which differs from the size (in pixels) of the contours of the piece taken from the camera.
Processed image taken from the camera, contours of the piece are shown: https://ibb.co/3rjCg5F
I've tried OpenCV's feature matching functions (ORB algorithm) to get the angle by which the piece in the picture is rotated with respect to the template, but I still cannot get this rotation angle. How can I get the rotation angle with image descriptors? Is this even the best approach for this problem, or are there better methods to address it?
Considering the image of the extracted contours, you might not need something as heavy as the feature matching algorithms of the OpenCV library. One approach would be to take the outermost contour of the piece and get its cv::minAreaRect. The resulting rotated rectangle will give you the angle. Then you just have to decide whether the symmetry matches, because the piece might be flipped. That can be done in many ways as well. One of the simplest (ignoring the fact that the scale might be off) is to take the outermost contour again, fill it, and count the percentage of points that overlay with the template. The orientation with the right symmetry should match in almost all points, given that the scale of the matched piece and the template is the same.
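A rough Python sketch of that idea; piece_bin is assumed to be a binary image of the piece and template_mask a filled binary mask of the template at the same scale:

import cv2
import numpy as np

def piece_angle(piece_bin):
    # take the outermost contour of the piece and fit a rotated rectangle to it
    contours, _ = cv2.findContours(piece_bin, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4.x
    outer = max(contours, key=cv2.contourArea)
    (cx, cy), (w, h), angle = cv2.minAreaRect(outer)
    return outer, angle

def overlap_score(contour, template_mask):
    # fill the candidate contour and count how much of it overlays the template
    filled = np.zeros_like(template_mask)
    cv2.drawContours(filled, [contour], -1, 255, thickness=cv2.FILLED)
    inter = cv2.bitwise_and(filled, template_mask)
    return cv2.countNonZero(inter) / max(cv2.countNonZero(filled), 1)

Rotating the piece by -angle and by -angle + 180 about its centre and keeping whichever orientation gives the higher overlap_score resolves the flip/symmetry ambiguity.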
You should use Hu moments, which give a translation-, scale- and rotation-invariant descriptor for matching.
The Hu moments are described here: https://en.wikipedia.org/wiki/Image_moment, and they are implemented in OpenCV.
You can dig up the theory of moment invariants on the wiki page pretty easily.
To use them in C++ you can simply call:
// Calculate Moments
Moments moments = moments(im, false);
// Calculate Hu Moments
double huMoments[7];
HuMoments(moments, huMoments);
A sample set of Hu moments will look like this:
h[0] = 0.00162663
h[1] = 3.11619e-07
h[2] = 3.61005e-10
h[3] = 1.44485e-10
h[4] = -2.55279e-20
h[5] = -7.57625e-14
h[6] = 2.09098e-20
The Hu moments usually span a huge dynamic range, so they are typically combined with a log transform to compress that range for matching, H[i] = -sign(h[i]) * log10(|h[i]|):
H[0] = 2.78871
H[1] = 6.50638
H[2] = 9.44249
H[3] = 9.84018
H[4] = -19.593
H[5] = -13.1205
H[6] = 19.6797
BTW, you might need to pad the template to extract the edge contour
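Since the rest of your pipeline is in Python, the equivalent with cv2, including the log transform, looks roughly like this:

import cv2
import numpy as np

def log_hu_moments(binary_img):
    # the seven Hu moments of the (binary) shape image
    m = cv2.moments(binary_img)
    hu = cv2.HuMoments(m).flatten()
    # log transform to compress the huge dynamic range while keeping the sign,
    # H[i] = -sign(h[i]) * log10(|h[i]|)  (the small epsilon guards against log(0))
    return -np.sign(hu) * np.log10(np.abs(hu) + 1e-30)

Holes whose log-Hu vectors are close (small L2 distance, or cv2.matchShapes on the contours) are candidates for the same hole in the template.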
I would like to emulate the optical behavior of a certain camera knowing only the following parameters:
it is a fisheye lens
focal length in mm
sensor size (horizontal and vertical) in mm
sensor resolution (horizontal and vertical) in pixels
HFOV and VFOV in degrees or radians
My goal is to transform a set of 3D points in space into 2D points on the camera's sensor. It looks like cv2.projectPoints() can do this, but it requires a set of distortion coefficients, K1, K2, P1, P2, etc.
Most people get these coefficients by calibrating using a real camera. I would like to work with a notional camera. One such example: for a GoPro, someone found these values:
[[-2.57614020e-01, 8.77086999e-02, -2.56970803e-04, -5.93390389e-04, -1.52194091e-02]]
I haven't been able to find an explanation of these parameters w.r.t. the input variables I've listed, and I don't understand enough about optics to know how to generate them.
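For what it's worth, this is roughly how I am wiring things up so far. The focal length, sensor size and resolution below are placeholder numbers, the intrinsic matrix is derived from them, and the distortion vector is exactly the part I don't know how to fill in:

import cv2
import numpy as np

# Pinhole intrinsics from the spec-sheet parameters:
#   fx = focal_mm * width_px  / sensor_width_mm
#   fy = focal_mm * height_px / sensor_height_mm
focal_mm = 2.8                                  # placeholder value
sensor_w_mm, sensor_h_mm = 6.2, 4.6             # placeholder values
width_px, height_px = 1920, 1080                # placeholder values

fx = focal_mm * width_px / sensor_w_mm
fy = focal_mm * height_px / sensor_h_mm
K = np.array([[fx, 0.0, width_px / 2.0],
              [0.0, fy, height_px / 2.0],
              [0.0, 0.0, 1.0]])

pts3d = np.array([[[0.5, 0.2, 3.0]]], dtype=np.float64)  # some 3D point in front of the camera
rvec = np.zeros(3)
tvec = np.zeros(3)

# With all-zero distortion this is just the ideal pinhole projection; the open
# question is what to put here for a notional fisheye lens
dist = np.zeros(5)
img_pts, _ = cv2.projectPoints(pts3d, rvec, tvec, K, dist)

(cv2.fisheye.projectPoints with k1..k4 would be the variant for OpenCV's equidistant fisheye model.)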
How do I generate the distortion coefficients needed to use projectPoints() for a notional fisheye lens?
I am developing an application which processes cheques for banks. However, the bank's image of a cheque can be skewed or rotated slightly, by an angle of at most 20 degrees. Before the cheque can be processed, I need to properly align this skewed image. I am stuck here.
My initial idea was that I would first try to get the straight horizontal lines using the Hough Line Transform in an "ideal cheque image". Once I get the number of straight lines, I will use the same technique to detect straight lines in a skewed image. If the number of lines is less than some threshold, I will flag the image as skewed. Following is my attempt:
import cv2
import numpy as np

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # img: the cheque image read earlier
edges = cv2.Canny(gray, 50, 50)
# minLineLength and maxLineGap have to be passed as keyword arguments,
# otherwise they land in the wrong positional parameters
lines = cv2.HoughLinesP(edges, 1, np.pi / 180, 100, minLineLength=1000, maxLineGap=100)
if lines is not None and len(lines) > 2:
    pass  # image is mostly properly aligned
else:
    pass  # rotate it by some amount to align it
However, this gets me nowhere in finding the angle by which it is skewed. If I can find the angle, I can just do the following:
# say it is off by +20 degrees
deg = 20
(h, w) = image.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, -deg, 1.0)
rotated = cv2.warpAffine(image, M, (w, h))
I then thought of getting the angle of rotation using the scalar product. But then, the scalar product of which two elements? I cannot get elements from the "bad" cheque by their coordinates in the "ideal" cheque, because its contents are skewed. So, is there any way in OpenCV by which I can, say, superimpose the "bad" image over the "ideal" one and somehow calculate the angle it is off by?
What I would do in your case is to find the check within the image using feature matching with your template check image. Then you only need to find the transformation from one to the other and deduce the angle from this.
Take a look at this OpenCV tutorial that teaches you how to do that.
EDIT:
In fact, if what you want is to have the bank check with the right orientation, the homography is the right tool for that. No need to extract an angle. Just apply it to your image (or its inverse depending on how you computed it) and you should get a beautiful check, ready for processing.
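A rough sketch of that idea, assuming a grayscale template image and a grayscale photo of the skewed cheque (ORB here is just one possible detector; cv2.estimateAffinePartial2D is used because rotation + translation + a little scale is enough for a flat cheque):

import math
import cv2
import numpy as np

def deskew_with_template(template_gray, skewed_gray):
    orb = cv2.ORB_create(2000)
    k1, d1 = orb.detectAndCompute(template_gray, None)
    k2, d2 = orb.detectAndCompute(skewed_gray, None)

    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:200]

    src = np.float32([k2[m.trainIdx].pt for m in matches])  # points in the skewed image
    dst = np.float32([k1[m.queryIdx].pt for m in matches])  # corresponding template points

    # similarity transform (rotation + scale + translation) from skewed -> template
    M, inliers = cv2.estimateAffinePartial2D(src, dst, method=cv2.RANSAC)
    angle = math.degrees(math.atan2(M[1, 0], M[0, 0]))      # the rotation angle, if you want it

    h, w = template_gray.shape[:2]
    return cv2.warpAffine(skewed_gray, M, (w, h)), angle

As said above, if all you need is an upright cheque, you can skip the angle and just warp with M (or with the full homography from cv2.findHomography).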
How can I rotate an image about a specific pixel in Python? I am trying to de-rotate a set of images of the night sky. Since the stars rotate around Polaris, I could define Polaris as the center of rotation and rotate each image to line up the stars.
In phadej's answer the transformation between the old and new coordinates of a point on the image is an affine transformation.
PIL (Python Imaging Library) has an image method called transform which can perform an affine transformation of an image.
The documentation for transform is near the bottom of this page.
With a little math: if a pixel's position in the image is the vector a, and the position of Polaris is p, then its new position new_p after rotating by R about p is
new_p = p + R * (a - p)
where R is the rotation matrix.
There will be a problem, as new_p is probably not an integer-valued position vector. You can do it backwards: for each pixel of the rotated image, apply the inverse of the above transform, and you will get the corresponding pixel in the original image. Since that position may not be an integer either, sample the neighbouring pixels as with Wu (anti-aliased) pixels; the amount of dot spread around can be used as the sampling weight.
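A small sketch of that backwards approach with PIL, assuming polaris = (px, py) is the pixel position of Polaris (in newer Pillow the constants are Image.Transform.AFFINE and Image.Resampling.BILINEAR):

import math
from PIL import Image

def rotate_about(im, polaris, angle_deg):
    px, py = polaris
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    # Image.transform with AFFINE expects the *inverse* mapping: for every output
    # pixel (x, y) it samples the input pixel (c0*x + c1*y + c2, c3*x + c4*y + c5),
    # which is exactly the "do it backwards" idea above.
    coeffs = (cos_a, sin_a, px - px * cos_a - py * sin_a,
              -sin_a, cos_a, py + px * sin_a - py * cos_a)
    return im.transform(im.size, Image.AFFINE, coeffs, resample=Image.BILINEAR)

With OpenCV you get the same effect from M = cv2.getRotationMatrix2D(polaris, angle_deg, 1.0) followed by cv2.warpAffine, since getRotationMatrix2D takes the rotation centre as its first argument.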