Here is my question:
My optical system is made of a camera plus a circular plexiglass "lens" that changes its curvature depending on pressure (radial bending).
This curvature induces a deformation of the image captured by the camera.
To correct this deformation, images need to be calibrated.
Calibration can be done with a grid (chessboard, dots, lines), and the pressure range has to be discretized with a certain step.
For each pressure step, an image of the grid has to be taken.
Then each image has to be compared to the reference one (P=0), and a transformation matrix has to be computed and stored.
Finally, each image taken during the experiment for a specific pressure has to be corrected by the transformation matrix.
The deformation is non-linear (not just a combination of rotations and translations), most likely barrel distortion (again, not induced by the camera).
It looks like this:
http://en.wikipedia.org/wiki/Distortion_%28optics%29#mediaviewer/File:Barrel_distortion.svg
I found an ImageJ plugin called bUnwarpJ, http://biocomp.cnb.csic.es/~iarganda/bUnwarpJ/
and I basically want to know if there is an equivalent way to produce the same result in OpenCV.
(calibrateCamera won't do the trick.)
OpenCV has an undistort function that takes an image, a camera matrix, the distortion coefficients and, optionally, a new camera matrix (if you need to do other transformations on the new image), and produces a new image corrected for the coefficients you passed in.
I have not used it before, so I can't say exactly what the camera or distortion coefficients are, but as the manual describes:
The function transforms an image to compensate radial and tangential
lens distortion. The function is simply a combination of
initUndistortRectifyMap() (with unity R ) and remap() (with bilinear
interpolation).
So checking out those two functions is a good way to find out.
I believe you may have misunderstood the manual, because you seem to think that calibrateCamera does this for you. Instead, calibrateCamera returns the camera and distortion coefficients, which you need in order to undistort your image.
Each lens has its own constant coefficients, which in your case means that you'll have to run calibrateCamera for a range of pressures (I assume you control that experimentally?) and then call undistort with the parameters you obtained for the matching pressure.
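A minimal sketch of the per-pressure lookup described above. The table name, the pressure values and the placeholder entries are all hypothetical; in practice each entry would hold the camera matrix and distortion coefficients returned by calibrateCamera for grid images taken at that pressure.

```python
# Hypothetical calibration table: pressure -> (camera_matrix, dist_coeffs).
# The strings are placeholders standing in for the arrays that
# cv2.calibrateCamera would return at each pressure step.
CALIBRATIONS = {
    0.0: ("K_p0", "dist_p0"),
    0.5: ("K_p05", "dist_p05"),
    1.0: ("K_p10", "dist_p10"),
}

def nearest_calibration(pressure, table):
    """Return (calibrated_pressure, entry) for the step closest to `pressure`."""
    best = min(table, key=lambda p: abs(p - pressure))
    return best, table[best]

step, (camera_matrix, dist_coeffs) = nearest_calibration(0.6, CALIBRATIONS)
```

With real arrays in the table, the selected pair would then be passed to cv2.undistort(image, camera_matrix, dist_coeffs). Picking the nearest step is only one option; how fine the pressure step needs to be depends on how quickly the distortion changes.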
A matrix can only capture a linear transformation (or possibly a linear transformation in homogeneous space), not a general distortion.
In my experience any attempt to use a single global transformation formula wouldn't be very accurate (it's not trivial to get just 99.9% accuracy). Even just correcting camera lens distortion this way is difficult if you want high accuracy.
In the past I got good enough results using a sparse global RBF interpolation, but later I moved to an interpolating 2d spline approach; if you can choose your calibration points to be on a regular grid this is the solution I would suggest.
In the end the mapping could be a 2-valued 3d interpolating spline on a regular grid (XY for the image, Z for the pressure; values UV are the pixel coordinates).
Straightening the image once pressure is known is just texture mapping.
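To make the last two points concrete, here is a rough sketch that assumes you already have, for each calibrated pressure, a dense pair of lookup maps (source pixel coordinates U and V for every output pixel). It swaps in plain linear interpolation along the pressure axis where the answer suggests a spline, and every name in it is hypothetical.

```python
import numpy as np

def interpolate_maps(pressure, pressures, maps_u, maps_v):
    """Linearly blend the two calibrated maps that bracket `pressure`.

    pressures: sorted calibrated pressures, length P
    maps_u, maps_v: arrays of shape (P, H, W) holding, for every output
    pixel, the source coordinates to sample at each calibrated pressure.
    """
    pressures = np.asarray(pressures, dtype=float)
    # Clamp to the calibrated range instead of extrapolating.
    p = float(np.clip(pressure, pressures[0], pressures[-1]))
    hi = int(np.searchsorted(pressures, p))
    hi = min(max(hi, 1), len(pressures) - 1)
    lo = hi - 1
    t = (p - pressures[lo]) / (pressures[hi] - pressures[lo])
    u = (1.0 - t) * maps_u[lo] + t * maps_u[hi]
    v = (1.0 - t) * maps_v[lo] + t * maps_v[hi]
    return u.astype(np.float32), v.astype(np.float32)

# Toy data: the "distortion" is a horizontal shift of 2 px per pressure unit.
H, W = 4, 5
ys, xs = np.mgrid[0:H, 0:W].astype(float)
pressures = [0.0, 1.0, 2.0]
maps_u = np.stack([xs + 2.0 * p for p in pressures])
maps_v = np.stack([ys for _ in pressures])
u, v = interpolate_maps(0.5, pressures, maps_u, maps_v)
```

The float32 maps are in the form that cv2.remap(image, u, v, cv2.INTER_LINEAR) expects, which is the "texture mapping" step.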
Related
I am working on an application using an IFM 3D camera to identify parts prior to a robot pickup. Currently I am able to find the centroid of these objects using contours from a depth image and from there calculate the center point of these objects in pixel space.
My next task is to then transform the 2D centroid coordinates to a 3D point in 'real' space. I am able to train the robot such that its coordinate frame is either at the center of the image or at the traditional (0,0) point of an image (top left).
The 3D camera I am using provides both an intrinsic and extrinsic matrix. I know I need to use some combination of these matrices to project my centroid into three space but the following questions remain:
My current understanding from googling is the intrinsic matrix is used to fix lens distortion (barrel and pinhole warping, etc.) whereas the extrinsic matrix is used to project points into the real world. Is this simplified assumption correct?
How can a camera supply a single extrinsic matrix? I know traditionally these matrices are found using the checkerboard corners method but are these not dependent on the height of the camera?
Is the solution as simple as taking the 3x4 extrinsic matrix and multiplying it by a 3x1 matrix [x, y, 1], and if so, will the returned values be relative to the camera center or the traditional (0,0) point of an image?
Thanks in advance for any insight! Also, if it helps, I am doing everything in Python and OpenCV.
No. I suggest you read the basics in Multiple View Geometry by Hartley and Zisserman, freely available on the web. Depending on the camera model, the intrinsics contain different parameters. For the pinhole camera model, these are the focal length and the principal point.
The only reason why you maybe could directly transform your 2D centroid to 3D is that you use a 3D camera. Read the manual of the camera, it should be explained how the relation between 2D and 3D coordinates is given for your specific model.
If you have only image data, you can only compute a 3D point from at least two views.
No, of course not. Please don't be lazy and start reading the basics about camera projection instead of asking others to explain the common basics that are written down everywhere on the web and in the literature.
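For reference, this is roughly what the pinhole relation looks like once a depth value is available for the pixel. It is a sketch under assumptions: the intrinsics below are made up, lens distortion is ignored, and the depth is taken to be measured along the optical axis (some 3D cameras report radial distance instead, so check the manual).

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Map pixel (u, v) with depth Z to a 3D point in the camera frame.

    fx, fy: focal lengths in pixels; cx, cy: principal point in pixels.
    The result is expressed relative to the camera center, in the units of
    `depth`, not relative to the image's top-left corner.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Hypothetical intrinsics and a centroid at pixel (420, 240), 2.0 m away.
point_cam = backproject(420, 240, 2.0, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
```

The extrinsic matrix then only moves this camera-frame point into another frame (for example the robot's). The direction of that transform depends on the convention your camera documents.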
I have some SIFT features in two stereo images, and I'm trying to place them in 3D space. I've found triangulatePoints, which seems to be what I want, however, I'm having trouble with the arguments.
triangulatePoints takes 4 arguments, projMatr1 and projMatr2, which is where my issues start, and projPoints1 and projPoints2, which are my feature points. The OpenCV docs suggest using stereoRectify to find the projection matrices.
stereoRectify takes the intrinsic camera matrices (which I've calculated beforehand with calibrateCamera) and the image size from calibration, as well as two arguments R (rotation matrix) and T (translation vector), which can be found with stereoCalibrate.
However, stereoCalibrate takes "object points", which I'm pretty sure I can't calculate for images without a reference, which is a bit of a roadblock.
Is this the best way to be calculating 3D positions from pairs of features? If so, how can I calculate projMatr1 and projMatr2 without stereoCalibrate?
As you say, you have no calibration, so let’s forget about rectification. What you want is the depth of the points, so you can project them into 3D (which then uses just the intrinsic calibration of one camera, mainly the focal length).
Since you have no rectification, you cannot expect exact results, so let’s try to get as close as possible:
Depth is focal length times baseline divided by disparity, disparity and focal length being in pixels, and depth and baseline in (recommendation) meters.
For accurate disparity you need rectified cameras and correspondences between your features in both images. Since you have no hope of rectification without calibration, you could try to just use the original images instead. This works better the more parallel the cameras are. If they are not parallel, you will introduce an error here and your results will become less accurate. If this becomes too bad, you must find a way to calibrate your cameras.
But most importantly, you need correspondences between your features in both images. Running SIFT in both images won't do. A better approach would be running SIFT in just one image and then finding the corresponding pixels for each of the features in the other image. There are plenty of methods for that; I believe OpenCV has some simple block matching built in.
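The depth relation above fits in a few lines. The numbers are made up for illustration, and the function assumes roughly parallel cameras so that disparity is just the horizontal pixel offset between corresponding points.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in meters from focal length (px), baseline (m) and disparity (px)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px

# Hypothetical setup: 700 px focal length, 10 cm baseline.
# A feature at x = 412 in the left image and x = 377 in the right image
# has a disparity of 35 px.
depth = depth_from_disparity(700.0, 0.10, 412 - 377)
```

Here the depth comes out at about 2 m. Because disparity sits in the denominator, a one-pixel matching error matters much more for distant points than for close ones.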
I have a webcam looking down on a surface which rotates about a single-axis. I'd like to be able to measure the rotation angle of the surface.
The camera position and the rotation axis of the surface are both fixed. The surface is a distinct solid color right now, but I do have the option to draw features on the surface if it would help.
Here's an animation of the surface moving through its full range, showing the different apparent shapes:
My approach thus far:
Record a series of "calibration" images, where the surface is at a known angle in each image
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(). I iterate through various epsilon values until I find one that yields exactly 4 points.
Order the points consistently (top-left, top-right, bottom-right, bottom-left)
Compute the angles between each pair of adjacent points with atan2.
Use the angles to fit a sklearn linear_model.LinearRegression()
This approach is getting me predictions within about 10% of actual with only 3 training images (covering full positive, full negative, and middle position). I'm pretty new to both opencv and sklearn; is there anything I should consider doing differently to improve the accuracy of my predictions? (Probably increasing the number of training images is a big one??)
I did experiment with cv2.moments directly as my model features, and then some values derived from the moments, but these did not perform as well as the angles. I also tried using a RidgeCV model, but it seemed to perform about the same as the linear model.
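For concreteness, the ordering and angle steps of the approach above could look something like this. It is only a sketch of one common heuristic (sum and difference of coordinates), not necessarily what I have in my code, and it can mis-order corners when the quadrilateral is rotated by close to 45 degrees in the image.

```python
import numpy as np

def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left.

    Accepts shape (4, 2) or the (4, 1, 2) layout returned by cv2.approxPolyDP.
    Image coordinates are assumed (x to the right, y downwards).
    """
    pts = np.asarray(pts, dtype=float).reshape(4, 2)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 0] - pts[:, 1]    # x - y: largest at top-right, smallest at bottom-left
    return np.array([pts[s.argmin()], pts[d.argmax()], pts[s.argmax()], pts[d.argmin()]])

def edge_angles(ordered):
    """Angle in radians of each edge, from corner i to corner i + 1 (wrapping)."""
    delta = np.roll(ordered, -1, axis=0) - ordered
    return np.arctan2(delta[:, 1], delta[:, 0])

corners = order_corners([(108, 60), (10, 10), (8, 58), (110, 12)])
features = edge_angles(corners)   # one feature vector per calibration image
```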
If I understand correctly, you want to estimate the rotation of the polygon with respect to the camera. If you know the dimensions of the object in 3D, you can use solvePnP to estimate the pose of the object, from which you can get the rotation of the object.
Steps:
Calibrate your webcam and get the intrinsic matrix and distortion coefficients.
Get the 3D measurements of the object corners and find the corresponding points in 2d. Let me assume a rectangular planar object and the corners in 3d will be (0,0,0), (0, 100, 0), (100, 100, 0), (100, 0, 0).
Use solvePnP to get the rotation and translation of the object
The rotation will be the rotation of your object along the axis. Here you can find an example to estimate the pose of the head, you can modify it to suit your application
Your first step is good -- everything after that becomes way way way more complicated than necessary (if I understand correctly).
Don't think of it as 'learning,' just think of it as a reference. Every time you're in a particular position where you DON'T know the angle, take a picture, and find the reference picture that looks most like it. Guess it's THAT angle. You're done! (There may well be indeterminacies, maybe the relationship isn't bijective, but that's where I'd start.)
You can consider this a 'nearest-neighbor classifier,' if you want, but that's just to make it sound better. Measure a simple distance (Euclidean! Why not!) between the uncertain picture, and all the reference pictures -- meaning, between the raw image vectors, nothing fancy -- and choose the angle that corresponds to the minimum distance between observed, and known.
If this isn't working -- and maybe, do this anyway -- stop throwing away so much information! You're stripping things down, then trying to re-estimate them, propagating error all over the place for no obvious (to me) benefit. So when you do a nearest neighbor, reference pictures and all that, why not just use the full picture? (Maybe other elements will change in it? That's a more complicated question, but basically, throw away as little as possible -- it should all be useful in, later, accurately choosing your 'nearest neighbor.')
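In code, the whole thing is about this much (a sketch with random arrays standing in for your reference pictures; it assumes all images have the same size and that nothing but the surface changes between shots):

```python
import numpy as np

def nearest_reference_angle(image, reference_images, reference_angles):
    """Return the angle of the reference image closest in Euclidean distance."""
    query = np.asarray(image, dtype=float).ravel()
    refs = np.stack([np.asarray(r, dtype=float).ravel() for r in reference_images])
    distances = np.linalg.norm(refs - query, axis=1)
    return reference_angles[int(np.argmin(distances))]

# Toy data: three 8x8 "reference pictures" at known angles (degrees).
rng = np.random.default_rng(0)
references = [rng.random((8, 8)) for _ in range(3)]
angles = [-30.0, 0.0, 30.0]

# An unknown picture that is a slightly noisy copy of the third reference.
unknown = references[2] + 0.01 * rng.random((8, 8))
guess = nearest_reference_angle(unknown, references, angles)
```

The resolution of the answer is limited by the spacing of your reference angles, so this gets better the more reference pictures you take.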
Another option that is rather easy to implement, especially since you've done a part of the job is the following (I've used it to compute the orientation of a cylindrical part from 3 images acquired when the tube was rotating) :
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(), alternatively you could find the four sides of your part with LineSegmentDetector (available from OpenCV 3).
Compute the angle alpha, as depicted on the image hereunder
When your part is rotating, this angle alpha will follow a sine curve. That is, you will measure alpha(theta) = A sin(theta + B) + C. Given alpha you want to know theta, but first you need to determine A, B and C.
You've acquired many "calibration" or reference images, you can use all of these to fit a sine curve and determine A, B and C.
Once this is done, you can determine theta from alpha.
Notice that you have to deal with the ambiguity sin(Pi - a) = sin(a). It is not a problem if you acquire more than one image sequentially; if you have a single static image, you have to use an extra mechanism to resolve it.
Hope I'm clear enough, the implementation really shouldn't be a problem given what you have done already.
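A possible sketch of the fit and the inversion, assuming theta is in radians and alpha really follows A sin(theta + B) + C. Expanding the sine turns the fit into ordinary linear least squares in sin(theta), cos(theta) and a constant, so no iterative optimizer is needed. Function names are just for illustration.

```python
import numpy as np

def fit_sine(theta, alpha):
    """Fit alpha = A*sin(theta + B) + C and return (A, B, C).

    Uses A*sin(theta + B) = a*sin(theta) + b*cos(theta)
    with a = A*cos(B) and b = A*sin(B).
    """
    theta = np.asarray(theta, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    design = np.column_stack([np.sin(theta), np.cos(theta), np.ones_like(theta)])
    (a, b, c), *_ = np.linalg.lstsq(design, alpha, rcond=None)
    return float(np.hypot(a, b)), float(np.arctan2(b, a)), float(c)

def theta_from_alpha(alpha, A, B, C):
    """Principal solution only; Pi - (theta + B) gives the same alpha."""
    ratio = np.clip((alpha - C) / A, -1.0, 1.0)
    return float(np.arcsin(ratio) - B)

# Synthetic calibration data generated with known parameters.
theta_cal = np.radians([-40.0, -20.0, 0.0, 20.0, 40.0])
alpha_cal = 0.5 * np.sin(theta_cal + 0.2) + 0.1
A, B, C = fit_sine(theta_cal, alpha_cal)

# Recover theta for a new measurement of alpha (true theta is 10 degrees).
alpha_new = 0.5 * np.sin(np.radians(10.0) + 0.2) + 0.1
theta_new = theta_from_alpha(alpha_new, A, B, C)
```

You need at least three reference images at distinct angles for the fit to be determined; more images average out measurement noise.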
I have two questions relating to stereo calibration with opencv. I have many pairs of calibration images like these:
Across the set of calibration images the distance of the chessboard away from the camera varies, and it is also rotated in some shots.
From within this scene I would like to map pairs of image coordinates (x,y) and (x',y') onto object coordinates in a global frame: (X,Y,Z).
In order to calibrate the system I have detected pairs of image coordinates of all chessboard corners using cv2.findChessboardCorners(). From reading Hartley's Multiple View Geometry in Computer Vision I gather I should be able to calibrate this system up to a scale factor without actually specifying the object points of the chessboard corners. First question: Is this correct?
Investigating cv2's capabilities, the closest thing I've found is cv2.stereoCalibrate(objectpoints,imagepoints1,imagepoints2).
I have obtained imagepoints1 and imagepoints2 from cv2.findChessboardCorners. Apparently from the images shown I can approximately extract (X,Y,Z) relative to the frame on the calibration board (by design), which would allow me to apply cv2.stereoCalibrate(). However, I think this will introduce error, and it prevents me from using all of the rotated photos of the calibration board which I have. Second question: Can I calibrate without object points using opencv?
Thanks!
No. You must specify the object points. Note that they need not change across the image sequence, since you can interpret the change as due to camera motion relative to the target. Also, you can (should) assume that Z=0 for a planar target like yours. You may specify X,Y up to scale, and thus obtain after calibration translations up to scale.
No
Clarification: by "need not change across the image sequence" I mean that you can assume the target fixed in the world frame, and interpret the relative motion as due to the camera only. The world frame itself, absent a better prior, can be defined by the pose of the target in any one of the images (say, the first one). Obviously, I do not mean that the pose of the target relative to the camera does not change - in fact, it must change in order to obtain a calibration. If you do have a better prior, you should use it. For example, if the target moves on a turntable, you should solve directly for the parameters of the cylindrical motion, since there are fewer of them (one constant axis, one constant radius, plus one angle per image, rather than 6 parameters per image).
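As an illustration of "fixed object points with Z=0, known up to scale", the usual way to build them for a chessboard looks like this. The board size (9x6 inner corners), the square size and the number of views are placeholders, not values taken from the question.

```python
import numpy as np

def chessboard_object_points(cols, rows, square_size=1.0):
    """Inner-corner coordinates of a planar chessboard in its own frame.

    Returns a float32 array of shape (rows * cols, 3) with Z = 0, ordered
    row by row. square_size sets the unit of the recovered translations;
    leaving it at 1.0 gives a calibration up to scale.
    """
    grid = np.zeros((rows * cols, 3), dtype=np.float32)
    grid[:, :2] = np.mgrid[0:cols, 0:rows].T.reshape(-1, 2) * square_size
    return grid

board = chessboard_object_points(9, 6, square_size=25.0)

# The same array is reused for every view, rotated or not.
n_views = 12  # hypothetical number of image pairs
object_points = [board] * n_views
```

The object_points list, together with the detected corners from both cameras, is what cv2.stereoCalibrate expects as its first three arguments.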
I have two images, one from simulation and one from real data, with bright spots.
Simulation:
Reality:
I can detect the spots just fine and get the coordinates. Now I need to compute the transformation matrix (scale, rotation, translation, maybe shear) between the two coordinate systems. If needed, I can pick some (5-10) corresponding points by hand to give to the algorithm.
I tried a lot of approaches already, including:
2 implementations of ICP:
https://engineering.purdue.edu/kak/distICP/ICP-2.0.html#ICP
https://github.com/KojiKobayashi/iterative_closest_point_2d
Implementing affine transformations:
https://math.stackexchange.com/questions/222113/given-3-points-of-a-rigid-body-in-space-how-do-i-find-the-corresponding-orienta/222170#222170
Implementations of affine transformations:
Determining a homogeneous affine transformation matrix from six points in 3D using Python
how to perform coordinates affine transformation using python? part 2
Most of them simply fail somehow like this:
The red points are the spots from the simulation transformed into the reality - coordinate system.
The best approach so far is this one how to perform coordinates affine transformation using python? part 2 yielding this:
As you see, the scaling and translating mostly works, but the image still needs to be rotated / mirrored.
Any ideas on how to get a working algorithm? If necessary, I can provide my current non-working implementations, but they are basically as linked.
I found the error.
I used plt.imshow to display both the simulated and real image and from there, pick the reference points from which to calculate the transformation.
Turns out, due to the usual array-to-image-index-flipping-voodoo (or a bad misunderstanding of the transformation on my side), I need to switch the x and y indices of the reference points from the simulated image.
With this, everything works fine using this how to perform coordinates affine transformation using python? part 2
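For anyone hitting the same problem, here is a generic least-squares affine fit with the index swap made explicit. This is a sketch, not the code from the linked answer; the point values and the assumption that the simulated points were stored as (row, col) are for illustration only.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine map (M, t) such that dst ~ src @ M.T + t."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    design = np.hstack([src, np.ones((len(src), 1))])      # rows are [x, y, 1]
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)  # shape (3, 2)
    return params[:2].T, params[2]

def apply_affine(points, M, t):
    return np.asarray(points, dtype=float) @ M.T + t

# Synthetic ground truth: rotate by 90 degrees, scale by 2, then translate.
M_true = np.array([[0.0, -2.0], [2.0, 0.0]])
t_true = np.array([5.0, -3.0])
sim_xy = np.array([[0, 0], [10, 0], [0, 10], [7, 3], [4, 9]], dtype=float)
real_xy = apply_affine(sim_xy, M_true, t_true)

# Suppose the simulated points were stored as (row, col) array indices.
sim_rc = sim_xy[:, ::-1]

# Swap the columns back to (x, y) before fitting.
M, t = fit_affine(sim_rc[:, ::-1], real_xy)
```

Because the map is a general affine one, it covers scale, rotation, translation, shear and mirroring. You need at least three non-collinear point pairs, and more pairs make the fit less sensitive to picking errors.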