Method to determine polygon surface rotation from top-down camera - python

I have a webcam looking down on a surface which rotates about a single-axis. I'd like to be able to measure the rotation angle of the surface.
The camera position and the rotation axis of the surface are both fixed. The surface is a distinct solid color right now, but I do have the option to draw features on the surface if it would help.
Here's an animation of the surface moving through its full range, showing the different apparent shapes:
My approach thus far:
Record a series of "calibration" images, where the surface is at a known angle in each image
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(). I iterate through various epsilon values until I find one that yields exactly 4 points.
Order the points consistently (top-left, top-right, bottom-right, bottom-left)
Compute the angle of each side between adjacent points with atan2.
Use the angles as features to fit a sklearn linear_model.LinearRegression() (a rough sketch of steps 2-5 is below).
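For concreteness, a simplified sketch of steps 2-5 (not my exact code; the threshold, epsilon sweep, and corner ordering here are approximate):

import cv2
import numpy as np
from sklearn.linear_model import LinearRegression

def corner_angles(gray):
    # 2. Threshold to isolate the surface (Otsu here as a stand-in)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)  # OpenCV 4 signature
    cnt = max(contours, key=cv2.contourArea)

    # 3. Sweep epsilon until approxPolyDP returns exactly 4 points
    for eps in np.linspace(0.005, 0.1, 50):
        approx = cv2.approxPolyDP(cnt, eps * cv2.arcLength(cnt, True), True)
        if len(approx) == 4:
            break
    pts = approx.reshape(4, 2).astype(float)

    # 4. Order corners: top-left, top-right, bottom-right, bottom-left
    s, d = pts.sum(axis=1), np.diff(pts, axis=1).ravel()
    ordered = np.array([pts[np.argmin(s)], pts[np.argmin(d)],
                        pts[np.argmax(s)], pts[np.argmax(d)]])

    # 5. Angle of each side via atan2 -> features for the regression
    return [np.arctan2(*(ordered[(i + 1) % 4] - ordered[i])[::-1]) for i in range(4)]

# X: one row of angles per calibration image, y: the known surface angles
# model = LinearRegression().fit(X, y); prediction = model.predict([corner_angles(new_gray)])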
This approach gets me predictions within about 10% of the actual angle with only 3 training images (covering the full positive, full negative, and middle positions). I'm pretty new to both OpenCV and sklearn; is there anything I should consider doing differently to improve the accuracy of my predictions? (Increasing the number of training images is probably an obvious one.)
I did experiment with cv2.moments directly as my model features, and then some values derived from the moments, but these did not perform as well as the angles. I also tried using a RidgeCV model, but it seemed to perform about the same as the linear model.

If I understand correctly, you want to estimate the rotation of the polygon with respect to the camera. If you know the object's dimensions in 3D, you can use solvePnP to estimate the pose of the object, from which you can extract its rotation.
Steps:
Calibrate your webcam and get the intrinsic matrix and distortion matrix.
Get the 3D measurements of the object corners and find the corresponding points in 2D. Assuming a rectangular planar object, the corners in 3D could be (0, 0, 0), (0, 100, 0), (100, 100, 0), (100, 0, 0).
Use solvePnP to get the rotation and translation of the object
The rotation will be the rotation of your object about its axis. Here you can find an example of estimating the pose of a head; you can modify it to suit your application. A rough sketch of the solvePnP step is below.
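A rough sketch of steps 2-3 (the image points, camera matrix, and distortion coefficients below are placeholders; use your own detected corners and calibration results):

import cv2
import numpy as np

# 3D corners of the planar rectangular surface (arbitrary units, z = 0 plane)
object_points = np.array([[0, 0, 0], [0, 100, 0],
                          [100, 100, 0], [100, 0, 0]], dtype=np.float64)

# Matching 2D corners detected in the image (placeholder values)
image_points = np.array([[320, 240], [310, 380],
                         [455, 390], [460, 250]], dtype=np.float64)

# Intrinsics from your webcam calibration (placeholder values)
camera_matrix = np.array([[800.0, 0.0, 320.0],
                          [0.0, 800.0, 240.0],
                          [0.0, 0.0, 1.0]])
dist_coeffs = np.zeros(5)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix, dist_coeffs)

# rvec is a rotation vector; convert it to a matrix, then read off the
# rotation about your fixed axis
R, _ = cv2.Rodrigues(rvec)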

Your first step is good -- everything after that becomes way way way more complicated than necessary (if I understand correctly).
Don't think of it as 'learning,' just think of it as a reference. Every time you're in a particular position where you DON'T know the angle, take a picture, and find the reference picture that looks most like it. Guess it's THAT angle. You're done! (There may well be indeterminacies, maybe the relationship isn't bijective, but that's where I'd start.)
You can consider this a 'nearest-neighbor classifier,' if you want, but that's just to make it sound better. Measure a simple distance (Euclidean! Why not!) between the uncertain picture, and all the reference pictures -- meaning, between the raw image vectors, nothing fancy -- and choose the angle that corresponds to the minimum distance between observed, and known.
If this isn't working -- and maybe, do this anyway -- stop throwing away so much information! You're stripping things down, then trying to re-estimate them, propagating error all over the place for no obvious (to me) benefit. So when you do a nearest neighbor, reference pictures and all that, why not just use the full picture? (Maybe other elements will change in it? That's a more complicated question, but basically, throw away as little as possible -- it should all be useful in, later, accurately choosing your 'nearest neighbor.')
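A minimal sketch of that lookup, assuming the reference frames are grayscale images of identical size with known angles:

import numpy as np

def nearest_angle(query, ref_images, ref_angles):
    """Return the known angle of the reference image closest to `query`."""
    q = query.astype(float).ravel()
    # Plain Euclidean distance between the raw image vectors -- nothing fancy
    dists = [np.linalg.norm(q - ref.astype(float).ravel()) for ref in ref_images]
    return ref_angles[int(np.argmin(dists))]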

Another option that is rather easy to implement, especially since you've done a part of the job is the following (I've used it to compute the orientation of a cylindrical part from 3 images acquired when the tube was rotating) :
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(); alternatively, you could find the four sides of your part with LineSegmentDetector (available since OpenCV 3).
Compute the angle alpha, as depicted in the image below.
When your part is rotating, this angle alpha will follow a sine curve. That is, you will measure alpha(theta) = A sin(theta + B) + C. Given alpha you want to know theta, but first you need to determine A, B and C.
You've acquired many "calibration" or reference images; you can use all of these to fit a sine curve and determine A, B and C.
Once this is done, you can determine theta from alpha.
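A rough sketch of the fit and the inversion using scipy.optimize.curve_fit (the calibration arrays here are just placeholders):

import numpy as np
from scipy.optimize import curve_fit

thetas = np.radians([-30.0, -15.0, 0.0, 15.0, 30.0])   # known angles (placeholder)
alphas = np.array([0.41, 0.62, 0.80, 0.95, 1.06])      # measured alpha per image (placeholder)

def model(theta, A, B, C):
    return A * np.sin(theta + B) + C

(A, B, C), _ = curve_fit(model, thetas, alphas, p0=[1.0, 0.0, 0.0])

# Invert the fit for a new measurement; note arcsin only covers [-pi/2, pi/2],
# which is exactly the ambiguity discussed below
alpha_new = 0.7
theta_est = np.arcsin(np.clip((alpha_new - C) / A, -1.0, 1.0)) - B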
Notice that you have to deal with the ambiguity of the sine: sin(a) = sin(Pi - a), so a single alpha generally corresponds to two possible theta values. This is not a problem if you acquire more than one image sequentially; if you have a single static image, you need an extra mechanism to resolve it.
Hope I'm clear enough, the implementation really shouldn't be a problem given what you have done already.

Related

Can I compute camera pose from image with known scale?

I have a photo taken from a camera (whose focal length, principal point, and distortion coefficients I know). The photo shows an 8cm x 8cm post-it on a table, and the center of the post-it is the origin (0, 0), again in cm. I've also indicated the positive-y axis on the post-it.
From this information is it possible to compute the location of the camera and the vector in which the camera is looking in Python using OpenCV? If someone has a snippet of code that does that (assuming you know the coordinates of the post-it corners already) that would be amazing!
Use OpenCV's solvePnP, specifying SOLVEPNP_IPPE_SQUARE in the flags. With only 4 points (and a post-it) the solution will be quite sensitive to how accurately you mark their images, so ask yourself whether you really need the camera pose and location for your application, and how accurately. E.g., if you just want to make a flat CG "sticker" stay fixed on the table while the camera moves, all you need is to estimate a homography, a much simpler task.
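A rough sketch of that call (the pixel coordinates, camera matrix, and distortion coefficients are placeholders; SOLVEPNP_IPPE_SQUARE expects the square's corners in the order shown):

import cv2
import numpy as np

half = 4.0  # half of the 8 cm side, so the model is in cm about the post-it centre
object_points = np.array([[-half,  half, 0],
                          [ half,  half, 0],
                          [ half, -half, 0],
                          [-half, -half, 0]], dtype=np.float64)

# Matching image corners, in the same order (placeholder pixel coordinates)
image_points = np.array([[410, 215], [560, 230],
                         [550, 370], [400, 360]], dtype=np.float64)

camera_matrix = np.array([[1200.0, 0.0, 640.0],
                          [0.0, 1200.0, 360.0],
                          [0.0, 0.0, 1.0]])   # your known intrinsics (placeholder)
dist_coeffs = np.zeros(5)                     # your known distortion (placeholder)

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, camera_matrix,
                              dist_coeffs, flags=cv2.SOLVEPNP_IPPE_SQUARE)

R, _ = cv2.Rodrigues(rvec)
camera_position = (-R.T @ tvec).ravel()               # camera centre in post-it coordinates
viewing_direction = R.T @ np.array([0.0, 0.0, 1.0])   # optical axis in post-it coordinates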
It does look like you have all the information required. The marker you use can be easily segmented. Shape analysis will provide corners. I did something similar to get basic eyesight tracking:
Here is a complete example.
Segmentation result for the example:
Please note that accuracy really matters, so it might be useful to rely on several sets of points.
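A rough sketch of that segmentation and corner extraction (the colour range assumes a yellow post-it and is a placeholder):

import cv2
import numpy as np

img = cv2.imread('postit.jpg')
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

# Segment the marker by colour (bounds are placeholders for a yellow post-it)
mask = cv2.inRange(hsv, (20, 80, 80), (40, 255, 255))

# Largest blob is assumed to be the post-it; approximate it with 4 corners
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnt = max(contours, key=cv2.contourArea)
corners = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True).reshape(-1, 2)
# `corners` (once ordered consistently) become the image points for solvePnP above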

How to improve a trajectory of a camera, built from rotation and translation?

I am trying to recover the trajectory of a camera from a sequence of 2D images using OpenCV, but the trajectory I get is not as good as I would like it to be. It goes back and forth instead of just going forward.
I have a sequence of photos taken by a camera while it was moving (the KITTI dataset, outdoors part, namely). For each pair of sequential frames I compute the rotation matrix (R) and translation vector (t) with E = cv2.findEssentialMat() and cv2.recoverPose(E, ...), and then I estimate the trajectory, assuming that the coordinates of every translation vector are given in a local coordinate system whose orientation is set by the corresponding rotation matrix.
upd: Each recovered position looks like [X,Y,Z], and I scatter (X_i, Y_i) for every i (these points are thought to be 2D positions), so the following graphs are my estimated trajectories.
Here's what I get instead of a straight line (the camera was moving straight forward). Previous results were even worse.
The green point is where it starts and the red point is where it ends. So most of the time it even moves backwards. This, though, is probably because of a mistake in the beginning, which was the cause of everything turning around (right?)
Here's what I do:
E, mask = cv2.findEssentialMat(points1, points2, K_00, cv2.RANSAC, 0.99999, 0.1)
inliers, R, t, mask = cv2.recoverPose(E, points1, points2, K_00, R, t, mask)
It seems to me that recoverPose somehow chooses the wrong sign for R and t on some steps, so the trajectory that was supposed to go forward goes backward, and then forward again.
What I did to improve the situation was:
1) skip the frames with too many outliers (I check this both after using findEssentialMat and after using recoverPose)
2) set the threshold for RANSAC method in findEssentialMat to 0.1
3) increase the number of the feature points on each image from 8 to 24.
This didn't really help.
Here I need to note: I know that in practice the 5-point algorithm, which is used to compute the essential matrix, needs a lot more points than 8 or even 24, and maybe this is actually the problem.
So the questions are:
1) Can the number of feature points (approx. 8-24) be the cause of recoverPose mistakes?
2) If checking the number of outliers is the right thing to do, what percentage of outliers should I set as the limit?
3) I estimate positions like this (instead of simple p[i+1] = R*p[i]+t):
C = np.dot(R, C)
p[i+1] = p[i] + np.dot(np.linalg.inv(C), t)
This is because I can't help thinking of t as a vector in local coordinates, so C is the transformation matrix, which is updated on every step to summarize the rotations. Is that right or not really? (A fuller sketch of what I mean is below, after question 4.)
4) It's really possible that I am missing something, since my knowledge of the topic seems tiny. Is there anything (anything!) you could recommend?
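For reference, a sketch of the accumulation I have in mind, assuming recoverPose's R and t map points from the previous camera frame into the current one:

import numpy as np

# World pose of the first camera: identity rotation, zero translation
Rw = np.eye(3)
tw = np.zeros((3, 1))
positions = [tw.ravel().copy()]

# rotations, translations: the per-pair outputs of cv2.recoverPose (placeholders)
for R, t in zip(rotations, translations):
    # If X_curr = R @ X_prev + t, the current camera centre expressed in the
    # previous camera's frame is -R.T @ t; rotate it into world coordinates
    # and accumulate (the per-step monocular scale is arbitrary).
    tw = tw + Rw @ (-R.T @ t)
    Rw = Rw @ R.T
    positions.append(tw.ravel().copy())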
Huge thanks for your time! I would appreciate any advice.
upd: for example, here are the first six rotation matrices, translation vectors, and recovered positions I get. Signs of t seem a bit crazy.
upd: here's my code. (I'm not a really good programmer yet.) The main idea is that my feature points are the corners of bounding boxes of static objects, which I detect with Faster R-CNN (I used this implementation). So the first part of the code detects objects, and the second part uses the detected feature points to recover the trajectory.
Here's the dataset I use (this is part 2011_09_26_drive_0005 from here).

Find the most significant corner of a skeleton and segment the skeleton at that corner

I have images of ore seams which I have first skeletonised (medial axis multiplied by the distance transform), then extracted corners (see the green dots). It looks like this:
The problem is to find a turning point and then segment the seam by separating the seam at the turning point. Not all skeletons have turning points, some are quite linear, and the turning points can be in any orientation. But the above image shows a seam which does have a defined turning point. Other examples of turning points look like (using ASCII): "- /- _". "X" turning points don't really exist.
I've tried a number of methods, including downsampling the image, curve fitting, k-means clustering, and corner detection at various thresholds and window sizes, and I haven't figured it out yet. (I'm new to using scikit.)
The technique must be able to give me some value which I can use to heuristically determine whether there is a turning point or not.
What I'd like to do is some sort of two-line ("piecewise"?) regression and find an intersection, or some sort of rotated polynomial regression, then determine whether a turning point exists and, if it does, the coordinate that best represents it. Here is my work in progress: https://gist.github.com/anonymous/40eda19e50dec671126a
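Here's a rough sketch of the kind of two-line fit I mean (separate from the gist above; it uses orientation-independent total-least-squares line fits so no initial coefficients are needed, and the residual ratio is the heuristic value I'd threshold):

import numpy as np

def line_residual(pts):
    # Total-least-squares residual: sum of squared perpendicular distances
    # to the best-fit line through the centroid (works in any orientation)
    c = pts.mean(axis=0)
    s = np.linalg.svd(pts - c, compute_uv=False)
    return s[1] ** 2 if len(s) > 1 else 0.0

def find_turning_point(pts, min_seg=5):
    # pts: (N, 2) skeleton coordinates; order them along the principal axis
    pts = np.asarray(pts, dtype=float)
    c = pts.mean(axis=0)
    _, _, vt = np.linalg.svd(pts - c)
    pts = pts[np.argsort((pts - c) @ vt[0])]

    single = line_residual(pts)          # residual of a single-line fit
    best_r, best_i = np.inf, None
    for i in range(min_seg, len(pts) - min_seg):
        r = line_residual(pts[:i]) + line_residual(pts[i:])
        if r < best_r:
            best_r, best_i = r, i

    # Heuristic value: how much better two lines fit than one
    # (close to 1.0 -> essentially linear, i.e. no turning point)
    ratio = best_r / single if single > 0 else 0.0
    return pts[best_i], ratio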
From there, I learned that a watershed segmentation with appropriate label coordinates should be able to segment the skeleton.
I found this resource: Fit a curve for data made up of two distinct regimes
But I wasn't able to figure out how to apply it to my current situation. More importantly, there's no way for me to guess a priori what the initial coefficients for the fitting function are, since the skeletons can be in any orientation.

Image registration b-spline opencv

Here is my question:
My optical system is made of a camera plus a circular plexiglass "lens" that changes its curvature depending on pressure (radial bending).
This curvature induces a deformation of the image captured by the camera.
To correct this deformation, images need to be calibrated.
Calibration can be done with a grid (chessboard, dots, lines); the pressure range has to be discretized with a certain step.
For each pressure step, an image of the grid has to be taken.
Then each image has to be compared to the reference one (P=0), and a transformation matrix has to be computed and stored.
Finally, each image taken during the experiment for a specific pressure has to be corrected by the transformation matrix.
The deformation is non-linear (not just a combination of rotations and translations); it is most likely barrel distortion (again, not induced by the camera itself).
Which looks like that:
http://en.wikipedia.org/wiki/Distortion_%28optics%29#mediaviewer/File:Barrel_distortion.svg
I found a plugin in ImageJ called BunwarpJ, http://biocomp.cnb.csic.es/~iarganda/bUnwarpJ/
and I basically want to know if there is an equivalent way to produce the same result in Opencv.
(CalibrateCamera won't do the trick)
OpenCV has an undistort function that takes the current image, a camera matrix, and distortion coefficients, and produces a new image corrected for that distortion (optionally using a new camera matrix if you need to apply other transformations to the new image).
I have not used it before, so I can't say exactly what the camera or distortion coefficients are, but the manual describes it as follows:
The function transforms an image to compensate radial and tangential lens distortion. The function is simply a combination of initUndistortRectifyMap() (with unity R) and remap() (with bilinear interpolation).
So checking those two functions out is a good way to find out.
I believe you misunderstood the manual, perhaps because you seem to think that calibrateCamera does this for you. Instead, calibrateCamera returns the camera and distortion coefficients, which you then need in order to undistort your image.
Each lens has its own constant coefficients, which in your case means that you'll have to run calibrateCamera for a range of pressures (I assume you control that experimentally?) and then call undistort with the different parameters you get out of those experiments. A sketch is below.
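A minimal sketch of that per-pressure calibration and correction (the chessboard size and file names are placeholders):

import cv2
import numpy as np

pattern = (9, 6)   # inner chessboard corners (placeholder)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

# Several grid images taken at ONE pressure step (placeholder file names)
objpoints, imgpoints = [], []
for fname in ['p10_a.png', 'p10_b.png', 'p10_c.png']:
    gray = cv2.imread(fname, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        objpoints.append(objp)
        imgpoints.append(corners)

ret, K, dist, _, _ = cv2.calibrateCamera(objpoints, imgpoints,
                                         gray.shape[::-1], None, None)

# Correct an experiment image taken at the same pressure
corrected = cv2.undistort(cv2.imread('experiment_p10.png'), K, dist)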
A matrix can only capture a linear transformation (or possibly a linear transformation in homogeneous space), not a general distortion.
In my experience any attempt to use a single global transformation formula wouldn't be very accurate (it's not trivial to get just 99.9% accuracy). Even just correcting camera lens distortion this way is difficult if you want high accuracy.
In the past I got good enough results using a sparse global RBF interpolation, but later I moved to an interpolating 2d spline approach; if you can choose your calibration points to be on a regular grid this is the solution I would suggest.
In the end the mapping could be a 2-valued 3d interpolating spline on a regular grid (XY for the image, Z for the pressure; values UV are the pixel coordinates).
Straightening the image once pressure is known is just texture mapping.
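A simplified 2D sketch of that approach for a single pressure, using scipy splines to build remap tables (the full version would add pressure as a third interpolation axis; ref_x/ref_y are the regular reference-grid coordinates and gx/gy the measured distorted positions, all placeholders):

import cv2
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Calibration data for one pressure: for a regular reference grid (ref_x, ref_y),
# gx[i, j] and gy[i, j] are the measured positions of grid point (ref_y[i], ref_x[j])
# in the distorted image. The identity values below are stand-ins.
ref_x = np.linspace(0, 639, 13)
ref_y = np.linspace(0, 479, 10)
gx = np.tile(ref_x, (10, 1))
gy = np.tile(ref_y[:, None], (1, 13))

# Splines mapping reference coordinates -> distorted coordinates
spline_u = RectBivariateSpline(ref_y, ref_x, gx)
spline_v = RectBivariateSpline(ref_y, ref_x, gy)

# Evaluate on every output pixel to build the lookup tables for cv2.remap
h, w = 480, 640
map_x = spline_u(np.arange(h), np.arange(w)).astype(np.float32)
map_y = spline_v(np.arange(h), np.arange(w)).astype(np.float32)

distorted = cv2.imread('experiment.png')
corrected = cv2.remap(distorted, map_x, map_y, cv2.INTER_LINEAR)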

Given a contour outlining the edges of an 'S' shape in OpenCV/Python, what methods can be used to trace a curve along the center of the shape?

Given a contour outlining the edge of the letter S (in comic sans for example), how can I get a series of points along the spine of this letter in order to later represent this shape using lines, cubic spline or other curve-representing technique? I want to process and represent the shape using 30-40 points in Python/OpenCV.
Morphological skeletonization could help with this but the operation always seems to produce erroneous branches. Is there a better way to collapse the contour into just the 'S' shape of the letter?
In the example below you can see the erroneous 'serpent's tongue'-like branches that are produced by morphological skeletonization. I don't know if it's fair to call them erroneous if that's what the algorithm is supposed to do, but in any case I would not like them to be there.
Below is the comic sans alphabet:
Another problem with skeletonization is that it is computationally expensive, but if you know a way of making it robust to forming 'serpent's tongue' like branches then I will give it a try.
Actually, vectorizing fonts isn't a trivial problem and is quite tricky. To properly vectorize fonts using bezier curves you'll need tracing. There are many libraries you can use to trace an image, for example Potrace. I'm not knowledgeable about Python, but based on my experience, I have done a similar project in C++, described below:
A. Fit the contour using cubic bezier
This method is quite simple, although a fair amount of work is involved. I believe it also works well if you want to fit skeletons obtained from thinning.
Find the contour/edge of the object; you can use the OpenCV function findContours().
The entire shape can't be represented using a single cubic bezier, so divide it into several segments using Ramer-Douglas-Peucker (RDP). The important thing in this step: don't delete any points, use RDP only to segment them. See the colored segments in the image below.
For each segment, a set of n points S = (s0, s1, ..., sn), fit a cubic bezier using least-squares fitting (a sketch of this step follows the illustration below).
Illustration of least square fitting:
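A bare-bones sketch of that least-squares step (chord-length parameterisation, no endpoint constraints; real implementations usually pin the first and last control points to the segment ends):

import numpy as np

def fit_cubic_bezier(points):
    """Least-squares cubic bezier fit to one RDP segment (an (N, 2) array, N >= 4)."""
    pts = np.asarray(points, dtype=float)

    # Chord-length parameterisation t in [0, 1]
    d = np.r_[0.0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d / d[-1]

    # Cubic Bernstein basis matrix (N x 4)
    B = np.column_stack([(1 - t) ** 3,
                         3 * t * (1 - t) ** 2,
                         3 * t ** 2 * (1 - t),
                         t ** 3])

    # Solve for the 4 control points (4 x 2) in the least-squares sense
    ctrl, *_ = np.linalg.lstsq(B, pts, rcond=None)
    return ctrl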
B. Resolution-Independent Curve Rendering
This method, as described in this paper, is quite complex, but it is one of the best algorithms available for displaying vector fonts:
Find the contour (the same as in method A).
Use RDP; unlike in method A, here RDP is used to remove points so that the contour is simplified.
Do Delaunay triangulation.
Draw the bezier curve on the outer edges using the method described in the paper.
The following simple idea might be useful.
Calculate the medial axis of the outer contour. This ensures connectivity of the curves.
Find the branch points. Depending on their length, you can delete the branches in order to eliminate the "serpent's tongue" problem. A sketch is below.
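A rough sketch of that idea with scikit-image and scipy (the 15-pixel length threshold is arbitrary and would need tuning):

import cv2
import numpy as np
from scipy import ndimage
from skimage.morphology import medial_axis

# Placeholder input: a binary mask of the filled shape
binary = cv2.imread('letter_mask.png', cv2.IMREAD_GRAYSCALE) > 0
skel = medial_axis(binary)

# A skeleton pixel with 3+ skeleton neighbours is a branch point
neighbour_count = ndimage.convolve(skel.astype(int), np.ones((3, 3)),
                                   mode='constant') - 1
branch_points = skel & (neighbour_count >= 3)

# Removing (dilated) branch points splits the skeleton into segments;
# very short segments are the "serpent's tongue" spurs and can be dropped
segments = skel & ~ndimage.binary_dilation(branch_points)
labels, n = ndimage.label(segments, structure=np.ones((3, 3)))
lengths = ndimage.sum(segments, labels, index=range(1, n + 1))
pruned = np.isin(labels, np.flatnonzero(lengths >= 15) + 1)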
Hope it helps.
