I want to calculate the depth map of a stereo system, not the disparity alone.
Input:
I am a little bit new to this part of OpenCV. I have two cameras, intrinsic and extrinsic parameters for both cameras, and rotation and translation between two cameras.
Output:
I want to use the current rotation, translation, and camera parameters to calculate the depth map. I do not want to re-calculate those parameters. However, I do not know the proper python OpenCV functions.
My two cameras have angels with respect to each other.
Related
I try to create Structured-light 3D scanner.
Camera calibration
Camera calibration is copy of OpenCV official tutorial. As resutlt I have camera intrinsic parameters(camera matrix).
Projector calibration
Projector calibration maybe is not correct but process was: Projector show chessboard pattern and camera take some photos from different angles. Images are cv.undistored with camera parameters and then result images are used for calibration with OpenCV official tutorial. As result I have projector intrinsic parameters(projector matrix).
Rotation and Transition
From cv.calibrate I have rotarion and transition vectors as results but vectors count are equal to images count and I thing it is not corect ones because I move camera and projector in calibration.
My new idea is to project chessboard on scanning background, perform calibration and in this way I will have Rotation vector and Transition vector. I don't know is that correct way.
Scanning
Process of scanning is:
Generate patterns -> undistor patterns with projector matrix -> Project pattern and take photos with camera -> undistort taken photos with camera matrix
Camera-projector pixels map
I use GrayCode pattern and with cv.graycode.getProjPixel and have pixels mapping between camera and projector. My projector is not very high resolution and last patterns are not very readable. I will create custom function that generate mapping without the last patterns.
Problem
I don't know how to get depth map(Z) from all this information. My confution is because there are 3 coordinate systems - camera, projector and world coordinate system.
How to find 'Z' with code? Can I just get Z from pixels mapping between image and pattern?
Information that have:
p(x,y,1) = R*q(x,y,z) + T - where p is image point, q is real world point(maybe), R and T are rotation vector and transition vector. How to find R and T?
Z = B.f/(x-x') - where Z is coordinate(depth), B-baseline(distanse between camera and projector) I can measure it by hand but maybe this is not the way, (x-x') - distance between camera pixel and projector pixel. I don't know how to get baseline. Maybe this is Transition vector?
I tried to get 4 meaning point, use them in cv.getPerspectiveTransform and this result to be used in cv.reprojectImageTo3D. But cv.getPerspectiveTransform return 3x3 matrix and cv.reprojectImageTo3D use Q-4×4 perspective transformation matrix that can be obtained with stereoRectify.
Similar Questions:
How is point cloud data acquired from the structured light 3D scanning? - Answer is you need to define a vector that goes from the camera perspective center to the pixel in the image and then rotate this vector by the camera rotation. But I don't know how to define/find thid vercor and Rotation vector is needed.
How to compute the rotation and translation between 2 cameras? - Question is about R and T between two cameras but almost everywhere writes that projector is inverse camera. One good answer is The only thing you have to do is to make sure that the calibration chessboard is seen by both of the cameras. But I think if I project chessboard pattern it will be additional distored by wall(Projective transormation)
There are many other resources and I will update list with comment. I missed something and I can't figure out how to implement it.
Lets assume p(x,y) is the image point and the disparity as (x-x'). You can obtain the depth point as,
disparity = x-x_ # x-x'
point_and_disparity = np.array([[[x, y, disparity]]], dtype=np.float32)
depth = cv2.perspectiveTransform(point_and_disparity, q_matrix)
I'm currently extracting the calibration parameters of two images that were taken in a stereo vision setup via cv2.aruco.calibrateCameraCharucoExtended(). I'm using the cv2.undistortPoints() & cv2.triangulatePoints() function to convert any two 2D points to a 3D point coordinate, which works perfectly fine.
I'm now looking for a way to convert the 2D images, which can be seen under approach 1, to one 3D image. I need this 3D image because I would like to determine the order of these cups from left to right, to correctly use the triangulatePoints function. If I determine the order of the cups from left to right purely based on the x-coordinates of each of the 2D images, I get different results for each camera (the cup on the front left corner of the table for example is in a different 'order' depending on the camera angle).
Approach 1: Keypoint Feature Matching
I was first thinking about using a keypoint feature extractor like SIFT or SURF, so I therefore tried to do some keypoint extraction and matching. I tried using both the Brute-Force Matching and FLANN based Matcher, but the results are not really good:
Brute-Force
FLANN-based
I also tried to swap the images, but it still gives more or less the same results.
Approach 2: ReprojectImageTo3D()
I looked further into the issue and I think I need the cv2.reprojectImageTo3D() [docs] function. However, to use this function, I first need the Q matrix which needs to be obtained with cv2.stereoRectify [docs]. This stereoRectify function on its turn expects a couple of parameters that I'm able to provide, but there's two I'm confused about:
R – Rotation matrix between the
coordinate systems of the first and
the second cameras.
T – Translation vector between
coordinate systems of the cameras.
I do have the rotation and translation matrices for each camera separately, but not between them? Also, do I really need to do this stereoRectify all over again when I already did a full calibration in ChArUco and already have the camera matrix, distortion coefficients, rotation vectors and translations vectors?
Some extra info that might be useful
I'm using 40 calibration images per camera of the ChArUco board to calibrate. I first extract all corners and markers after which I estimate the calibration parameters with the following code:
(ret, camera_matrix, distortion_coefficients0,
rotation_vectors, translation_vectors,
stdDeviationsIntrinsics, stdDeviationsExtrinsics,
perViewErrors) = cv2.aruco.calibrateCameraCharucoExtended(
charucoCorners=allCorners,
charucoIds=allIds,
board=board,
imageSize=imsize,
cameraMatrix=cameraMatrixInit,
distCoeffs=distCoeffsInit,
flags=flags,
criteria=(cv2.TERM_CRITERIA_EPS & cv2.TERM_CRITERIA_COUNT, 10000, 1e-9))
The board paremeter is created with the following settings:
CHARUCO_BOARD = aruco.CharucoBoard_create(
squaresX=9,
squaresY=6,
squareLength=4.4,
markerLength=3.5,
dictionary=ARUCO_DICT)
Thanks a lot in advance!
I have some SIFT features in two stereo images, and I'm trying to place them in 3D space. I've found triangulatePoints, which seems to be what I want, however, I'm having trouble with the arguments.
triangulatePoints takes 4 arguments, projMatr1 and projMatr2, which is where my issues start, and projPoints1 and projPoints2, which are my feature points. The OpenCV docs suggest using stereoRectify to find the projection matrices.
stereoRectify takes the intrinsic camera matrices (which I've calculated prior with calibrateCamera) and the image size from calibration. As well as two arguments R (rotation matrix) and T (translation vector), which can be found with stereoCalibrate.
However, stereoCalibrate takes "object points", which I'm pretty sure I can't calculate for images without a reference, which is a bit of a roadblock.
Is this the best way to be calculating 3D positions from pairs of features? If so, how can I calculate projMatr1 and projMatr2 without stereoCalibrate?
As you say, you have no calibration, so let’s forget about rectification. What you want is the depth of the points, so you can project them into 3D (which then uses just the intrinsic calibration of one camera, mainly the focal length).
Since you have no rectification, you cannot expect exact results, so let’s try to get as close as possible:
Depth is focal length times baseline divided by disparity, disparity and focal length being in pixels, and depth and baseline in (recommendation) meters.
For accurate disparity you need a rectified camera and correspondences between your features in both images. Since without calibration, you have no hope of rectification, you could try to just use the original images instead. It will work fine the more parallel the cameras are. If they are not parallel, you will introduce an error here and your results will become less accurate. If this becomes bad you must find a way to calibrate your camera.
But most importantly, you need correspondences between your features in both images. Running SIFT in both images won‘t do. A better approach would be running SIFT in just one image and then finding the corresponding pixels for each of the features in the other image. There are plenty of methods for that, I believe OpenCv has some simple block matching builtin.
I need to calculate distance from camera to depth image pixel. I searched through internet but I found stereo image related info and code example where I need info for depth image.
Here, I defined depth image in gray scale(0-255) and I defined a particular value( let range defined 0 pixel value is equal to 5m and 255 pixel value is equal to 500m in gray scale).
camera's intrinsic (focal length, image sensor format) and extrinsic (rotation and transition matrix) is given. I need to calculate distance from different camera orientation and rotation.
I want to do it using opencv python. Is there any specific documentation and code example regarding this?
Or any further info is necessary to find this.
The content of my research is the same as yours, but I have a problem now. I use stereocalibrate() to calibrate the binocular camera, and found that the obtained translation matrix is very different from the actual baseline distance. In addition, the parameters used in stereocalibrate() are obtained by calibrating the two cameras with calibrate().
Here is my question:
My optical system is made of a camera plus a circular plexiglass "lens" that changes its curvature depending on pressure (radial bending).
This curvature induces a deformation of the image captured by the camera.
To correct this deformation, images need to be calibrated.
Calibration can be made with a grid (chessboard, dots, lines), pressure range has to be discretized with a certain step.
For each pressure step, an image of the grid has to be taken.
Then each image has to be compared to the reference one (P=0), and a transformation matrix has to be computed and stored.
Finally, each image taken during the experiment for a specific pressure has to be corrected by the transformation matrix.
The deformation is non-linear (not only a combination of rotations and translations), but most likely Barrel distortion. (again not induced by the camera)
Which looks like that:
http://en.wikipedia.org/wiki/Distortion_%28optics%29#mediaviewer/File:Barrel_distortion.svg
I found a plugin in ImageJ called BunwarpJ, http://biocomp.cnb.csic.es/~iarganda/bUnwarpJ/
and I basically want to know if there is an equivalent way to produce the same result in Opencv.
(CalibrateCamera won't do the trick)
OpenCv has an undistort function that can take a current image, a matrix of camera coefficients, distorsion coeffs. and produces a new image corrected for sent camera coeffs. and a new set of camera coeffs. (if you need to do other transformations on the new image).
I have not used it before, so I can't say what exactly are camera or distorsion coefficients are but as manual describes:
The function transforms an image to compensate radial and tangential
lens distortion. The function is simply a combination of
initUndistortRectifyMap() (with unity R ) and remap() (with bilinear
interpolation).
So checking those two funcs. out are a good way to find out.
I believe you misunderstood the manual perhaps because you seem to think that CalibrateCamera does this for you. Instead CalibrateCamera actually returns the camera and distorsion coeffs. which you need to undistort your image.
Each lens has its own constant coeffs. which in your case means that you'll have to calibrateCamera for a range of pressures (I assume you control that experimentally?) and then call different undistort func. with different parameters which you'll get out of your experiments.
A matrix can only capture a linear transformation (or possibly a linear transformation in homogeneous space), not a general distortion.
In my experience any attempt to use a single global transformation formula wouldn't be very accurate (it's not trivial to get just 99.9% accuracy). Even just correcting camera lens distortion this way is difficult if you want high accuracy.
In the past I got good enough results using a sparse global RBF interpolation, but later I moved to an interpolating 2d spline approach; if you can choose your calibration points to be on a regular grid this is the solution I would suggest.
In the end the mapping could be a 2-valued 3d interpolating spline on a regular grid (XY for the image, Z for the pressure; values UV are the pixel coordinates).
Straightening the image once pressure is known is just texture mapping.