I am a beginner in OpenCV and I want to find the 3D coordinates of a point from its 2D projection on the image, given two of its three world coordinates.
What I have:
calibrated camera (known: matrix of intrinsic parameters, vector of distortion coefficients)
rotation vector and translation vector
(u, v) - coordinates of a point on image
two known world coordinates of a point and one unknown
Problem variant #1 Known: X, Y Unknown: Z
Problem variant #2 Known: X, Z Unknown: Y
Problem variant #3 Known: Y, Z Unknown: X
How can I find the third, unknown coordinate? Is it possible?
Yes, it is possible. Consider the simple case with no nonlinear distortion and the camera center at the world origin (i.e. no rotation and no translation). Let Ki be the inverse of the camera matrix, and let p = (u, v, 1) be the homogeneous pixel coordinate. Then the ray through the pixel is:
s * P = Ki * p
where s > 0 is an unknown scale. But s * P = [X, Y, Z], so if you know any one of X, Y or Z you can solve for s and find the missing coordinates.
For non-zero roto-translation, replace Ki with the inverse of the projection matrix. For non-zero distortion, replace the simple multiplication by Ki with the complete reprojection equation.
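Here is a minimal numpy sketch of that computation for the zero roto-translation, zero-distortion case; the function name and the axis parameter are just for illustration (axis selects which world coordinate is known, covering all three problem variants):

import numpy as np

def backproject(K, u, v, axis, known_value):
    # Ray through the pixel: s * P = Ki * p
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Solve for the unknown scale s from the one known coordinate
    s = known_value / ray[axis]
    # s * ray = [X, Y, Z]
    return s * ray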
I have a set of different point clouds, and from each of them I find the plane with the largest support and get the plane equation (ax + by + cz + d = 0).
Using the plane equation, I am trying to insert a plane into each of the point clouds, which I manage to do. The problem is the distortion of said plane.
How can I insert a plane that has a rectangular form?
This is how I am inserting the plane on each of the point clouds:
I first find the min/max values for x and y, so the plane is proportional to the point cloud's size and intersects it.
I multiply the min/max by 10 and divide x and y by 10 to increase the density of the inserted plane:
import numpy as np

# a1, b1, c1, d1 are the plane coefficients; point_cloud is an (N, 3) array
mins = np.min(point_cloud, axis=0)
maxs = np.max(point_cloud, axis=0)
plane = []
for i in range(int(mins[0] * 10), int(maxs[0] * 10)):
    for j in range(int(mins[1] * 10), int(maxs[1] * 10)):
        x = i / 10
        y = j / 10
        z = (-d1 - a1 * x - b1 * y) / c1
        plane.append([x, y, z])
And here is a gif showing different angles of the different point clouds with the plane inserted
Any guidance or literature where I could understand/learn how to fix this?
I think the problem is the way I am calculating z, because when I set z to a fixed number (e.g. 10), the plane has a rectangle form, without distortion. But of course, it doesn't intersect the point cloud.
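Following that reasoning, a hedged sketch of one way to keep the patch rectangular while still intersecting the cloud: build an orthonormal basis (e1, e2) inside the fitted plane and grid along it, instead of gridding world x/y and solving for z (a1, b1, c1, d1 are the plane coefficients from above; the ±5 extents and 0.1 step are placeholders to be derived from the cloud's bounding box):

import numpy as np

n = np.array([a1, b1, c1], dtype=float)
n_hat = n / np.linalg.norm(n)
p0 = -d1 * n / (n @ n)                    # a point on the plane
e1 = np.cross(n_hat, [0.0, 0.0, 1.0])     # any direction lying in the plane
if np.linalg.norm(e1) < 1e-8:             # plane is horizontal; pick another axis
    e1 = np.cross(n_hat, [1.0, 0.0, 0.0])
e1 /= np.linalg.norm(e1)
e2 = np.cross(n_hat, e1)                  # second in-plane direction
plane = [p0 + s * e1 + t * e2
         for s in np.arange(-5, 5, 0.1)
         for t in np.arange(-5, 5, 0.1)]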
This is an example of that fixed-z case:
Thanks
I saw a lot of similar questions, but not this exact one.
I know:
pixel coordinates (u,v) and depth (d) - 4 points
corresponding world coordinates (x,y,z) - 4 points
intrinsic matrix K of the camera (I haven't calibrated yet; I used the camera's default one)
extrinsic matrix [R|t] (I call its 4x4 version M_ext)
conversion from world coordinate points X (3D) to pixel coordinate points p (2D)
I don't know:
conversion from pixel coordinate points p (2D) + depth d to world coordinate points X (3D)
0. Get extrinsic matrix:
To get the extrinsic matrix I use OpenCV's function solvePnP:
ret, rvec, tvec = cv.solvePnP(world_points, img_points, K, np.float32([]), cv.SOLVEPNP_IPPE)
I get the extrinsic matrix [R|t] using the rotation and translation vector above ( Rt = np.hstack((cv.Rodrigues(rvec)[0], tvec))).
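For reference, a small sketch of building the 4x4 version from that (the padding row is the only addition; rvec and tvec come from the solvePnP call above):

import numpy as np
import cv2 as cv

R = cv.Rodrigues(rvec)[0]                # 3x3 rotation matrix from the rotation vector
Rt = np.hstack((R, tvec))                # 3x4 extrinsic matrix [R|t]
M_ext = np.vstack((Rt, [[0, 0, 0, 1]]))  # its 4x4 version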
I want to apply both conversions for validation that my extrinsic matrix is correct.
1. World coordinate points to pixel coordinates:
camera_coordinates = K * M_ext * world_coordinates
u = camera_coordinates[0] / camera_coordinates[2]
v = camera_coordinates[1] / camera_coordinates[2]
This conversion works, my pixel coordinates u, v are correct.
2. Pixel coordinates + depth to world coordinate points:
camera_coordinates = ( inv(K) * [u, v, 1] ) * d
world_coordinates = inv(M_ext) * [camera_coordinates, 1]
This inverse conversion does not yield my world coordinate points. Does anyone see where my error is?
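For concreteness, here is a minimal numpy sketch of both conversions as described above; it assumes d is the z-depth in the camera frame and M_ext is the 4x4 extrinsic matrix (the function names are just for illustration):

import numpy as np

def world_to_pixel(K, M_ext, X_world):
    X_cam = (M_ext @ np.append(X_world, 1.0))[:3]   # world -> camera frame
    p = K @ X_cam                                   # camera frame -> image
    return p[0] / p[2], p[1] / p[2]                 # perspective division

def pixel_to_world(K, M_ext, u, v, d):
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])  # ray whose z component is 1
    X_cam = ray * d                                 # scale so that z = d
    return (np.linalg.inv(M_ext) @ np.append(X_cam, 1.0))[:3]

A quick round trip, pixel_to_world(K, M_ext, *world_to_pixel(K, M_ext, X), d) with d taken as the z-coordinate of M_ext @ [X, 1], should return X; if it does not, the usual suspects are the meaning of d (z-depth vs. distance along the ray) and how M_ext was assembled.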
I wish to project an image taken with a camera for which I know all parameters (focal length, sensor size, X, Y, Z, rotation (omega, phi, kappa)) onto a 2D plane. I know that I need to construct a camera matrix before being able to do the planar homography, but how?
I've successfully produced a matrix using 4 known pairs of points on each plane following this answer, but it's not the way I want to do it. I've looked at this video that gives me almost all my answers; however, the matrix named "extrinsic parameters" is not entirely described. How should the rotation matrix R and the matrix T of the camera position be constructed?
With the final camera matrix in hand, I suppose I will be able to take each parameter and feed them to PIL.Image.transform. I'm also open to using the Python OpenCV library.
Here is some example data:
Original image here (4288 x 2848 pixels)
#Camera position
X: 72003 m
Y: 1070100 m
Z: 1243 m
#Rotation of camera
Omega: 0°
Phi: 27°
Kappa: -38°
Focal length: 26 mm
Pixel size on sensor: 0.00551 mm
The camera matrix P is a 3x4 matrix of the form P = K[R t]:
K is a 3x3 matrix containing the intrinsic parameters (principal point and focal length in pixels)
[R t] is a 3x4 matrix obtained by concatenating R, a 3x3 matrix representing the rotation from the world frame to the camera frame, and t, a 3-vector which represents the position of the origin of the world in the camera frame.
This means that the parameters you have, which seem to be the position of the camera in the world frame, have to be inverted. The inverse of [R t] is [R' t'] where R' = inverse(R) = transpose(R) and t' = -inverse(R)t.
You would first have to know how to compute the 3x3 camera rotation matrix from the three angles you have, and there are many possible representations of a rotation matrix from three angles. The most common are yaw/pitch/roll, and Euler angles with all possible rotation orders.
The 3x3 intrinsics matrix K is [[f 0 cx][0 f cy][0 0 1]], where f = 26/0.00551 = 4719 and (cx,cy) is the principal point, which you can take as the center of the image (4288/2,2848/2).
Then to compute the homography (3x3 matrix) that goes from the plane at world height Z0 to your image, you multiply P by (X,Y,Z0,1), which gives you an expression of the form Xv1 + Yv2 + v3 where v1, v2, and v3 are 3-vectors. The matrix H=[v1 v2 v3] is the homography you are looking for. The 8 coefficients for PIL.Image.transform should be the first 8 coefficients from that matrix, divided by the 9th one.
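To make this concrete, here is a hedged numpy sketch using the example data above. The omega-phi-kappa rotation order (Rx, then Ry, then Rz) and the assumption that the angles describe the camera's orientation in the world frame are both conventions to verify against your data:

import numpy as np

def rotation_opk(omega, phi, kappa):
    # Assumed order: R = Rx(omega) @ Ry(phi) @ Rz(kappa); verify for your data.
    o, p, k = np.radians([omega, phi, kappa])
    Rx = np.array([[1, 0, 0], [0, np.cos(o), -np.sin(o)], [0, np.sin(o), np.cos(o)]])
    Ry = np.array([[np.cos(p), 0, np.sin(p)], [0, 1, 0], [-np.sin(p), 0, np.cos(p)]])
    Rz = np.array([[np.cos(k), -np.sin(k), 0], [np.sin(k), np.cos(k), 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

f = 26 / 0.00551                            # focal length in pixels (about 4719)
K = np.array([[f, 0, 4288 / 2],
              [0, f, 2848 / 2],
              [0, 0, 1]])

C = np.array([72003.0, 1070100.0, 1243.0])  # camera position in the world frame
R = rotation_opk(0, 27, -38).T              # invert: world -> camera rotation
t = -R @ C                                  # position of the world origin in the camera frame

P = K @ np.column_stack((R, t))             # 3x4 camera matrix P = K[R t]

Z0 = 0                                      # world height of the target plane
H = np.column_stack((P[:, 0], P[:, 1], Z0 * P[:, 2] + P[:, 3]))  # homography [v1 v2 v3]
coeffs = (H.flatten() / H.flatten()[8])[:8] # the 8 coefficients for PIL.Image.transform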
I'm trying to write a function that, given two cameras' rotation matrices, translation vectors, focal points, and the coordinates of a point in each camera's image, will triangulate the point into 3D space. Basically, I'm given all the extrinsic/intrinsic values needed.
I'm familiar with the general idea: somehow create two rays and find the closest point that satisfies the least-squares problem. However, I don't know exactly how to translate the given information into a series of equations for the point's 3D coordinates.
I've arrived a couple of years late on my journey. I ran into the exact same issue and found several people asking the same question, but never found an answer that was simplified enough for me to understand, so I spent days learning this stuff just so I could simplify it to the essentials and post what I found here for future people.
I'll also give you some code samples at the end to do what you want in python, so stick around.
Some screenshots of my handwritten notes, which explain the full process:
Page 1. Page 2. Page 3.
The equation I start with can be found in https://docs.opencv.org/master/d9/d0c/group__calib3d.html. It is the projection equation:
s * [u, v, 1]' = M * [R|t] * [X, Y, Z, 1]', where M is the camera matrix.
Once you choose an origin in the real world that is the same for both cameras, you will have two of these equations with the same X, Y, Z values.
Sorry, you already have this next part, but others might not have gotten this far:
First you need to calibrate your camera which will give you the camera matrix and distortions (intrinsic properties) for each camera.
https://docs.opencv.org/master/dc/dbb/tutorial_py_calibration.html
You only need those two, and you can dump the rvecs and tvecs, because these will change when you set up the camera.
Once you choose your real world coordinate system, you can use cv2.solvePnP to get the rotation and translation vectors. To do this you need a set of real world points and their corresponding coordinates in the camera for each camera. My trick was I made some code that would show an image of the field and I would pass in points. Then I would click on locations on the image and create a mapping. The code for this bit is a bit lengthy so I won't share it here unless it is requested.
cv2.solvePnP will give you a vector for the rotation matrix, so you need to convert this to a 3x3 matrix using the following line:
`R, jac = cv2.Rodrigues(rvec)`
So now back to the original question:
You have the 3x3 camera matrix for each camera. You have the 3x3 rotation matrix for each camera. You have the 3x1 translation vector for each camera. You have some (u, v) coordinate for where the object of interest is in each camera. The math is explained more in the image of the notes.
import numpy as np

def get_xyz(camera1_coords, camera1_M, camera1_R, camera1_T, camera2_coords, camera2_M, camera2_R, camera2_T):
    # Get the two key equations from camera1
    camera1_u, camera1_v = camera1_coords
    # Put the rotation and translation side by side and then multiply with the camera matrix
    camera1_P = camera1_M.dot(np.column_stack((camera1_R, camera1_T)))
    # Get the two linearly independent equations referenced in the notes
    camera1_vect1 = camera1_v * camera1_P[2, :] - camera1_P[1, :]
    camera1_vect2 = camera1_P[0, :] - camera1_u * camera1_P[2, :]
    # Get the two key equations from camera2
    camera2_u, camera2_v = camera2_coords
    # Put the rotation and translation side by side and then multiply with the camera matrix
    camera2_P = camera2_M.dot(np.column_stack((camera2_R, camera2_T)))
    # Get the two linearly independent equations referenced in the notes
    camera2_vect1 = camera2_v * camera2_P[2, :] - camera2_P[1, :]
    camera2_vect2 = camera2_P[0, :] - camera2_u * camera2_P[2, :]
    # Stack the 4 rows to create one 4x4 matrix
    full_matrix = np.vstack((camera1_vect1, camera1_vect2, camera2_vect1, camera2_vect2))
    # The first three columns make up A and the last column is b
    A = full_matrix[:, :3]
    b = full_matrix[:, 3].reshape((4, 1))
    # Solve the overdetermined system A*x = -b. Note b in the wikipedia article is -b here.
    # https://en.wikipedia.org/wiki/Overdetermined_system
    soln = np.linalg.inv(A.T.dot(A)).dot(A.T).dot(-b)
    return soln
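A hypothetical call, just to illustrate the expected shapes (all values made up: two identical cameras, the second rotated 90° about the y-axis):

import numpy as np

M = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])     # 3x3 camera matrix for both cameras
R1, T1 = np.eye(3), np.zeros((3, 1)) # camera 1 at the world origin
R2 = np.array([[0.0, 0.0, 1.0],
               [0.0, 1.0, 0.0],
               [-1.0, 0.0, 0.0]])    # camera 2 rotated about the y-axis
T2 = np.array([[-2.0], [0.0], [2.0]])# camera 2 centered at (2, 0, 2)
xyz = get_xyz((640, 360), M, R1, T1, (640, 360), M, R2, T2)
# Both cameras see the point at their principal point, so xyz is about (0, 0, 2).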
Assume you have two cameras -- camera 1 and camera 2.
For each camera j = 1, 2 you are given:
The distance hj between its center Oj (is "focal point" the right term? Basically, the point Oj from which the camera is looking at its screen) and the camera's screen. The camera's coordinate system is centered at Oj, the Oj--->x and Oj--->y axes are parallel to the screen, while the Oj--->z axis is perpendicular to the screen.
The 3 x 3 rotation matrix Uj and the 3 x 1 translation vector Tj which transforms the Cartesian 3D coordinates with respect to the system of camera j (see point 1) to the world-coordinates, i.e. the coordinates with respect to a third coordinate system from which all points in the 3D world are described.
On the screen of camera j, which is the plane parallel to the plane Oj-x-y and at a distance hj from the origin Oj, you have the 2D coordinates (let's say the x,y coordinates only) of point pj, where the two points p1 and p2 are in fact the projected images of the same point P, somewhere in 3D, onto the screens of camera 1 and 2 respectively. The projection is obtained by drawing the 3D line between point Oj and point P and defining point pj as the unique intersection point of this line with the screen of camera j. The equation of the screen in camera j's 3D coordinate system is z = hj, so the coordinates of point pj with respect to the 3D coordinate system of camera j look like pj = (xj, yj, hj), and so the 2D screen coordinates are simply pj = (xj, yj).
Input: You are given the 2D points p1 = (x1, y1), p2 = (x2, y2), the two cameras' focal distances h1, h2, two 3 x 3 rotation matrices U1 and U2, and two translation 3 x 1 vector columns T1 and T2.
Output: The coordinates P = (x0, y0, z0) of point P in the world coordinate system.
One somewhat simple way to do this, avoiding homogeneous coordinates and projection matrices (which is fine too and more or less equivalent), is the following algorithm:
Form Q1 = [x1; y1; h1] and Q2 = [x2; y2; h2] , where they are interpreted as 3 x 1 vector columns;
Transform P1 = U1*Q1 + T1 and P2 = U2*Q2 + T2, where * is matrix multiplication; here a 3 x 3 matrix is multiplied by a 3 x 1 column, giving a 3 x 1 column;
Form the lines X = T1 + t1*(P1 - T1) and X = T2 + t2*(P2 - T2) ;
The two lines from the preceding step 3 either intersect at a common point, which is the point P, or they are skew lines, i.e. they do not intersect but are not parallel (not coplanar).
If the lines are skew lines, find the unique point X1 on the first line and the unique point X2 on the second line such that the vector X2 - X1 is perpendicular to both lines, i.e. X2 - X1 is perpendicular to both vectors P1 - T1 and P2 - T2. These two points X1 and X2 are the closest points on the two lines. Then the point P = (X1 + X2)/2 can be taken as the midpoint of the segment X1 X2.
In general, the two lines should pass very close to each other, so the two points X1 and X2 should be very close to each other.
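If it helps, here is a small numpy sketch of steps 3-5, where O1, O2 are the camera centers T1, T2 and d1, d2 are the ray directions P1 - T1 and P2 - T2 from step 2:

import numpy as np

def triangulate_midpoint(O1, d1, O2, d2):
    # Find t1, t2 such that X2 - X1 is perpendicular to both d1 and d2,
    # where X1 = O1 + t1*d1 and X2 = O2 + t2*d2.
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(O2 - O1) @ d1, (O2 - O1) @ d2])
    t1, t2 = np.linalg.solve(A, b)
    X1 = O1 + t1 * d1          # closest point on line 1
    X2 = O2 + t2 * d2          # closest point on line 2
    return (X1 + X2) / 2       # midpoint, taken as the point P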
I can get some info from an Arc.
FirstPoint [x, y, z]
LastPoint [x, y, z]
Center [x, y, z]
Axis [x, y, z] # Perpendicular to the plane
How can I get the tangent direction vectors at FirstPoint and LastPoint?
I want to get an intersection point from the two direction vectors.
I work in FreeCAD.
Circular arc from A to B with center M and normal vector N.
The tangent directions can be obtained by the cross product.
Tangent at A: N x (A-M)
Tangent at B: (B-M) x N
Both correspond to a rotation of the radius vectors by 90° or -90° around the axis N.
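A minimal numpy sketch of those two formulas, using the arc data quoted later in this thread (A, B, M, N are FirstPoint, LastPoint, Center, and the unit Axis):

import numpy as np

A = np.array([0.7393506936636021, 11.360676836326173, 6.220155663200929])
B = np.array([0.3602513339713556, 12.723079925995924, 6.232050903393676])
M = np.array([0.335157, 11.988, 5.55452])
N = np.array([-0.914329, -0.257151, 0.312851])

tangent_A = np.cross(N, A - M)   # tangent direction at the first point
tangent_B = np.cross(B - M, N)   # tangent direction at the last point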
We'll need a lot more information to give a good answer, but here is a first attempt, with questions after.
One way to approximate a tangent vector is with a secant vector: if your curve is given parametrically as a function of t and you want the tangent at t_0, then choose some small number e; evaluate the function at t_0 + e and at t_0 - e; then subtract the two results to get the secant vector. It will be a good approximation to the tangent vector if your curve isn't too curvy in that interval around t_0.
Now for the questions. How is your question related to Python, and where does FreeCAD come in? You have constructed the curve in FreeCAD, and you want to compute tangents in Python? Can you say anything about the curve, like whether it's a cubic spline curve, whether it curves in only one direction, what you mean by "center" and "axis"? (An arbitrary curve with tangent vectors isn't necessarily a cubic spline, might curve in very complicated ways, and doesn't have any notion of a center or axis.)
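A short sketch of the secant idea, assuming a FreeCAD curve object with a value(t) method that returns the 3D point at parameter t (as Part curves have):

# e is a small parameter step; the returned vector approximates the tangent at t0
def approx_tangent(curve, t0, e=1e-5):
    return curve.value(t0 + e) - curve.value(t0 - e)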
s.Curve
Circle (Radius : 1, Position : (0.335157, 11.988, 5.55452), Direction : (-0.914329, -0.257151, 0.312851))
s.Vertex1.Point #FirstPoint
Vector (0.7393506936636021, 11.360676836326173, 6.220155663200929)
s.Vertex2.Point #LastPoint
Vector (0.3602513339713556, 12.723079925995924, 6.232050903393676)
s.Curve.FirstParameter
0.0
s.Curve.LastParameter
6.283185307179586
It's a simple arc.