OpenCV provides methods to calibrate a camera. I want to know if it also has a way to simply generate a view projection matrix if and when the parameters are known.
I.e., I know the camera position, rotation, up vector, FOV, and whatever else is needed, then call MagicOpenCVCamera(parameters) and obtain a 4x4 transformation matrix.
I have searched for this, but I can only find information about calibrating the camera, not about creating one if you already know the parameters.
The projection matrix is simply a 3x4 matrix whose [0:3,0:3] left square is occupied by the product K.dot(R) of the camera intrinsic calibration matrix K and its camera-from-world rotation matrix R, and the last column is K.dot(t), where t is the camera-from-world translation. To clarify, R is the matrix that brings into camera coordinates a vector decomposed in world coordinates, and t is the vector whose tail is at the camera center, and whose tip is at the world origin.
The OpenCV calibration routines produce the camera orientations as rotation vectors, not matrices, but you can use cv.Rodrigues to convert them.
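As a minimal sketch of the construction described above, assuming NumPy only: `rodrigues` is a hand-written version of the rotation-vector-to-matrix formula (the same conversion that cv.Rodrigues performs), and `projection_matrix` and `project` are hypothetical helper names, not OpenCV functions. Since the question starts from a known camera position, the sketch derives t from the camera center C as t = -R·C. The intrinsic values are placeholders.

```python
import numpy as np

def rodrigues(rvec):
    """Rotation vector (axis * angle) -> 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float).reshape(3)
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    k_cross = np.array([[0.0, -k[2], k[1]],
                        [k[2], 0.0, -k[0]],
                        [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * k_cross + (1 - np.cos(theta)) * (k_cross @ k_cross)

def projection_matrix(K, rvec, camera_center):
    """Build the 3x4 matrix P = K [R | t] with t = -R @ C."""
    R = rodrigues(rvec)  # camera-from-world rotation
    t = -R @ np.asarray(camera_center, dtype=float).reshape(3)
    return K @ np.hstack([R, t.reshape(3, 1)])

def project(P, X_world):
    """Project a 3D world point to pixel coordinates."""
    x = P @ np.append(np.asarray(X_world, dtype=float), 1.0)
    return x[:2] / x[2]

# Placeholder intrinsics: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Camera 2 units behind the world origin, looking down +Z, no rotation.
P = projection_matrix(K, rvec=[0.0, 0.0, 0.0], camera_center=[0.0, 0.0, -2.0])
uv_origin = project(P, [0.0, 0.0, 0.0])  # lands on the principal point
```

If a 4x4 matrix is needed, one option is to append the row [0, 0, 0, 1] to [R | t] and keep K separate, but the exact layout depends on the convention of whatever consumes the matrix.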
I am trying to build a structured-light 3D scanner.
Camera calibration
Camera calibration follows the official OpenCV tutorial. As a result I have the camera intrinsic parameters (camera matrix).
Projector calibration
Projector calibration may not be correct, but the process was: the projector shows a chessboard pattern and the camera takes some photos from different angles. The images are undistorted with cv.undistort using the camera parameters, and the resulting images are then used for calibration following the official OpenCV tutorial. As a result I have the projector intrinsic parameters (projector matrix).
Rotation and Translation
From cv.calibrateCamera I get rotation and translation vectors as results, but the number of vectors equals the number of images, and I think they are not the correct ones because I moved the camera and projector during calibration.
My new idea is to project a chessboard onto the scanning background and perform the calibration; this way I will have the rotation vector and the translation vector. I don't know whether that is the correct way.
Scanning
Process of scanning is:
Generate patterns -> undistort patterns with projector matrix -> project patterns and take photos with camera -> undistort the photos with camera matrix
Camera-projector pixels map
I use a Gray code pattern with cv.graycode.getProjPixel to get the pixel mapping between camera and projector. My projector is not very high resolution, so the last patterns are not very readable. I will write a custom function that generates the mapping without the last patterns.
Problem
I don't know how to get the depth map (Z) from all this information. My confusion is because there are 3 coordinate systems: camera, projector, and world.
How do I find Z in code? Can I just get Z from the pixel mapping between image and pattern?
Information that I have:
p(x,y,1) = R*q(x,y,z) + T - where p is the image point, q is the real-world point (maybe), and R and T are the rotation and translation. How do I find R and T?
Z = B*f/(x-x') - where Z is the depth coordinate, B is the baseline (distance between camera and projector), which I can measure by hand but maybe that is not the way, f is the focal length, and (x-x') is the distance between the camera pixel and the projector pixel. I don't know how to get the baseline. Maybe it is the translation vector?
I tried picking 4 meaningful points, passing them to cv.getPerspectiveTransform, and using the result in cv.reprojectImageTo3D. But cv.getPerspectiveTransform returns a 3x3 matrix, whereas cv.reprojectImageTo3D expects Q, a 4×4 perspective transformation matrix that can be obtained with stereoRectify.
Similar Questions:
How is point cloud data acquired from the structured light 3D scanning? - The answer is that you need to define a vector that goes from the camera perspective center to the pixel in the image and then rotate this vector by the camera rotation. But I don't know how to define or find this vector, and the rotation vector is still needed.
How to compute the rotation and translation between 2 cameras? - The question is about R and T between two cameras, and almost everywhere it is written that a projector is an inverse camera. One good answer says the only thing you have to do is make sure that the calibration chessboard is seen by both cameras. But I think that if I project a chessboard pattern, it will be additionally distorted by the wall (projective transformation).
There are many other resources, and I will update the list in comments. I am missing something and can't figure out how to implement it.
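Let's assume p(x,y) is the image point and the disparity is (x-x'). You can obtain the 3D point, including its depth, as:
import cv2
import numpy as np
disparity = x - x_  # x - x'
point_and_disparity = np.array([[[x, y, disparity]]], dtype=np.float32)
# q_matrix: 4x4 Q from cv2.stereoRectify; output is (X, Y, Z), Z being the depth
depth = cv2.perspectiveTransform(point_and_disparity, q_matrix)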
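To make the link between the Z = B*f/(x-x') formula from the question and the Q matrix concrete, here is a NumPy-only sketch. It assumes an already rectified camera-projector pair with the same focal length (in pixels) and the same principal point on both sides. The Q built here is a simplified stand-in for the one cv2.stereoRectify returns, with the sign chosen so that a positive disparity gives a positive depth; `depth_from_disparity` and `reproject` are hypothetical helper names and all numbers are placeholders.

```python
import numpy as np

def depth_from_disparity(x_cam, x_proj, focal_px, baseline):
    """Z = B * f / (x - x') for a rectified pair."""
    disparity = x_cam - x_proj
    if disparity == 0:
        raise ValueError("zero disparity: point is at infinity")
    return baseline * focal_px / disparity

def reproject(x, y, disparity, focal_px, baseline, cx, cy):
    """Same computation in homogeneous form with a simplified 4x4 Q."""
    Q = np.array([[1.0, 0.0, 0.0, -cx],
                  [0.0, 1.0, 0.0, -cy],
                  [0.0, 0.0, 0.0, focal_px],
                  [0.0, 0.0, 1.0 / baseline, 0.0]])
    X, Y, Z, W = Q @ np.array([x, y, disparity, 1.0])
    return np.array([X, Y, Z]) / W

# Placeholder values: f = 800 px, baseline = 0.1 m, principal point (320, 240).
z = depth_from_disparity(x_cam=420.0, x_proj=400.0, focal_px=800.0, baseline=0.1)
point = reproject(420.0, 240.0, 20.0, focal_px=800.0, baseline=0.1, cx=320.0, cy=240.0)
```

Both routes give the same depth; the Q form additionally returns X and Y in the same units as the baseline.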
I am working on an application using an IFM 3D camera to identify parts prior to a robot pickup. Currently I am able to find the centroid of these objects using contours from a depth image and from there calculate the center point of these objects in pixel space.
My next task is then to transform the 2D centroid coordinates to a 3D point in 'real' space. I am able to train the robot such that its coordinate frame is either at the center of the image or at the traditional (0,0) point of an image (top left).
The 3D camera I am using provides both an intrinsic and an extrinsic matrix. I know I need to use some combination of these matrices to project my centroid into 3D space, but the following questions remain:
My current understanding from googling is the intrinsic matrix is used to fix lens distortion (barrel and pinhole warping, etc.) whereas the extrinsic matrix is used to project points into the real world. Is this simplified assumption correct?
How can a camera supply a single extrinsic matrix? I know traditionally these matrices are found using the checkerboard corners method but are these not dependent on the height of the camera?
Is the solution as simple as taking the 3x4 extrinsic matrix and multiplying it by a 3x1 matrix [x, y, 1], and if so, will the returned values be relative to the camera center or the traditional (0,0) point of an image?
Thanks in advance for any insight! Also, if it's any consolation, I am doing everything in Python and OpenCV.
No. I suggest you read the basics in Multiple View Geometry by Hartley and Zisserman, freely available on the web. Depending on the camera model, the intrinsics contain different parameters. For the pinhole camera model, these are the focal length and the principal point.
The only reason why you might be able to transform your 2D centroid directly to 3D is that you use a 3D camera. Read the manual of the camera; it should explain how 2D and 3D coordinates are related for your specific model.
If you have only image data, you can only compute a 3D point from at least two views.
No, of course not. Please don't be lazy and start reading the basics about camera projection instead of asking for others to explain the common basics that are written down everywhere in the web and literature.
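For illustration only, here is how a pixel plus a depth value is commonly turned into a 3D point under a plain pinhole model. This sketch makes several assumptions that the camera manual would need to confirm: no lens distortion, the depth value is the Z coordinate along the optical axis (not the radial distance to the point), and the extrinsics follow the convention X_cam = R·X_world + t. `backproject` and `camera_to_world` are hypothetical helper names and the numbers are placeholders.

```python
import numpy as np

def backproject(u, v, depth_z, K):
    """Pixel (u, v) with depth Z -> 3D point in the camera frame."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    return np.array([(u - cx) * depth_z / fx,
                     (v - cy) * depth_z / fy,
                     depth_z])

def camera_to_world(X_cam, R, t):
    """Invert X_cam = R @ X_world + t (assumed extrinsic convention)."""
    return R.T @ (np.asarray(X_cam, dtype=float) - np.asarray(t, dtype=float))

# Placeholder intrinsics and a centroid at pixel (420, 240) with Z = 4.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
X_cam = backproject(420.0, 240.0, 4.0, K)
X_world = camera_to_world(X_cam, R=np.eye(3), t=np.array([0.0, 0.0, 1.0]))
```

The result of `backproject` is relative to the camera's optical center, not to the image's top-left corner; the principal point (cx, cy) is what ties the two together.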
If you take a pinhole camera as the origin of a 3D coordinate system and connect it to a pixel on the image plane with a straight line, you get a vector, which has a direction and a length. Think of this as the path followed by the light reflected from an object into the camera lens. I want to calculate this vector. I think the camera's intrinsic parameters are needed for this.
Below is a statement that made me think about it all.
In a pinhole camera model, each pixel defines a direction vector in 3D space, specifically the vector from the projection center through the pixel's position on the image plane.
Here is a diagram better explaining this.
I want to calculate the three red lines, and known parameters would be, I guess, the camera position(origin) and the image pixel value, and the intrinsic camera parameters.
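A small sketch of what the quoted statement means in practice, assuming an ideal pinhole camera without distortion whose intrinsic matrix K is known: the ray direction for pixel (u, v) is K⁻¹·[u, v, 1], normalized to unit length. `pixel_ray` is a hypothetical helper name and the intrinsic values are placeholders.

```python
import numpy as np

def pixel_ray(u, v, K):
    """Unit direction, in camera coordinates, of the ray through pixel (u, v)."""
    d = np.linalg.solve(K, np.array([u, v, 1.0]))  # K^-1 @ [u, v, 1]
    return d / np.linalg.norm(d)

K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

center_ray = pixel_ray(320.0, 240.0, K)   # through the principal point
edge_ray = pixel_ray(1120.0, 240.0, K)    # 800 px right of it, 45 degrees off-axis
```

Every 3D point s·d with s > 0 projects to the same pixel, so the pixel alone fixes the direction but not the length; the length needs extra information such as a depth measurement or a second view.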
I have a flat quadrilateral plane that I know the dimensions of, and I can find the contours of it, along with the 4 corners. I need assistance in figuring out the method of determining its 3D position. I have managed to get a 3x3 perspective transform of it, which looks something like this:
[[ 3.91873630e-02 1.20990983e+00 -2.81213415e+02]
[ 1.21202027e+00 -1.85962357e-15 -3.52697898e+02]
[ 3.83991908e-04 2.52680041e-05 1.00000000e+00]]
(this is OpenCV 3.1.0, in python, and that matrix is just one frame, so it might not be representative of all potential orientations)
Can I determine the angle away from the camera and distance from the camera from this information, or do I need to perform more calculations?
I'm not sure what you guys need to know, so I'll be happy to give you more information if you need it?
You can use the solvePnP function to determine the quadrilateral's position and orientation with respect to the camera.
All you need to have is
3D coordinates of quadrilateral's corners in world frame
corresponding pixel coordinates
Camera's intrinsic parameters
You can directly use the solvePnP() function of OpenCV.
As output you get the pose of the world coordinate system in the camera coordinate system. If you take the quadrilateral itself as the world frame, you get the pose (rotation and translation) of the quadrilateral in the camera coordinate system.
C++: bool solvePnP(InputArray objectPoints, InputArray imagePoints,
InputArray cameraMatrix, InputArray distCoeffs,
OutputArray rvec, OutputArray tvec,
bool useExtrinsicGuess=false, int flags=ITERATIVE );
Use the CV_P3P flag (named SOLVEPNP_P3P in OpenCV 3 and later) if you have exactly 4 points.
Here is my question:
My optical system is made of a camera plus a circular plexiglass "lens" that changes its curvature depending on pressure (radial bending).
This curvature induces a deformation of the image captured by the camera.
To correct this deformation, images need to be calibrated.
Calibration can be made with a grid (chessboard, dots, lines), pressure range has to be discretized with a certain step.
For each pressure step, an image of the grid has to be taken.
Then each image has to be compared to the reference one (P=0), and a transformation matrix has to be computed and stored.
Finally, each image taken during the experiment for a specific pressure has to be corrected by the transformation matrix.
The deformation is non-linear (not only a combination of rotations and translations), but most likely Barrel distortion. (again not induced by the camera)
It looks like this:
http://en.wikipedia.org/wiki/Distortion_%28optics%29#mediaviewer/File:Barrel_distortion.svg
I found a plugin in ImageJ called BunwarpJ, http://biocomp.cnb.csic.es/~iarganda/bUnwarpJ/
and I basically want to know if there is an equivalent way to produce the same result in Opencv.
(CalibrateCamera won't do the trick)
OpenCV has an undistort function that takes an image, a camera matrix, and distortion coefficients, and produces a new image corrected for those coefficients, optionally using a new camera matrix (if you need to apply other transformations to the new image).
I have not used it before, so I can't say exactly what the camera or distortion coefficients are, but as the manual describes:
The function transforms an image to compensate radial and tangential
lens distortion. The function is simply a combination of
initUndistortRectifyMap() (with unity R ) and remap() (with bilinear
interpolation).
So checking out those two functions is a good way to find out.
I believe you may have misunderstood the manual, because you seem to think that calibrateCamera should do this for you. Instead, calibrateCamera returns the camera and distortion coefficients, which you need in order to undistort your image.
Each lens has its own constant coefficients, which in your case means that you'll have to run calibrateCamera for a range of pressures (I assume you control that experimentally?) and then call undistort with the parameters obtained for the matching pressure.
A matrix can only capture a linear transformation (or possibly a linear transformation in homogeneous space), not a general distortion.
In my experience any attempt to use a single global transformation formula wouldn't be very accurate (it's not trivial to get just 99.9% accuracy). Even just correcting camera lens distortion this way is difficult if you want high accuracy.
In the past I got good enough results using a sparse global RBF interpolation, but later I moved to an interpolating 2d spline approach; if you can choose your calibration points to be on a regular grid this is the solution I would suggest.
In the end the mapping could be a 2-valued 3d interpolating spline on a regular grid (XY for the image, Z for the pressure; values UV are the pixel coordinates).
Straightening the image once pressure is known is just texture mapping.
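A deliberately simplified NumPy sketch of that idea follows. It is not the spline approach itself: it assumes a dense per-pixel map (U, V source coordinates for every output pixel) has already been built for each calibrated pressure, interpolates linearly between the two neighbouring pressure steps instead of using a 3D spline, and samples with nearest-neighbour lookup instead of proper texture filtering. `interpolate_map` and `straighten` are hypothetical names, and the toy maps are synthetic.

```python
import numpy as np

def interpolate_map(pressures, maps_u, maps_v, pressure):
    """Linear interpolation along the pressure axis.

    pressures: sorted 1-D array of calibrated pressures
    maps_u, maps_v: arrays of shape (n_pressures, H, W) with source coordinates
    """
    p = float(np.clip(pressure, pressures[0], pressures[-1]))
    i = int(np.searchsorted(pressures, p, side="right")) - 1
    i = min(max(i, 0), len(pressures) - 2)
    w = (p - pressures[i]) / (pressures[i + 1] - pressures[i])
    u = (1 - w) * maps_u[i] + w * maps_u[i + 1]
    v = (1 - w) * maps_v[i] + w * maps_v[i + 1]
    return u, v

def straighten(image, u, v):
    """Nearest-neighbour texture lookup: out[y, x] = image[v[y, x], u[y, x]]."""
    h, w = image.shape[:2]
    ui = np.clip(np.rint(u).astype(int), 0, w - 1)
    vi = np.clip(np.rint(v).astype(int), 0, h - 1)
    return image[vi, ui]

# Toy data: the map shifts one pixel to the right per unit of pressure.
H, W = 4, 5
pressures = np.array([0.0, 1.0, 2.0])
grid_v, grid_u = np.meshgrid(np.arange(H, dtype=float),
                             np.arange(W, dtype=float), indexing="ij")
maps_u = np.stack([grid_u + p for p in pressures])
maps_v = np.stack([grid_v for _ in pressures])

image = np.arange(H * W).reshape(H, W)
u, v = interpolate_map(pressures, maps_u, maps_v, pressure=1.0)
corrected = straighten(image, u, v)
```

For real images, cv2.remap with the interpolated maps (converted to float32) would replace `straighten` and give bilinear or better sampling.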