Find 3d perspective on 2d image

I have an image like this:
I want to find a line perpendicular to the red line (that is, perpendicular to the track). How can I do this using OpenCV and Python? The problem is that the height of the camera is unknown, and an angle that looks like 90 degrees in the image is not a real 90-degree angle. I found a suggestion here to use OpenCV's projectPoints() method, but it looks like it needs the position of the resulting point and some vector passed to it. Can somebody help me achieve this? Is it even possible?

@Chiefir, you don't have enough data to get the perpendicular line you ask for.
I believe your best chance is to find some parallel lines in the image, like those marks in the grass (right where the green eleven is).
Some methods look for parallels in the image automatically, assuming a world of perpendicular straight lines (like a city of roads and buildings), and get a 3D pose. I don't think those work on your image.

There is very little information in this one image (almost none, in fact) to accomplish your goal, so every solution will necessarily be imprecise. If this is a frame of a video sequence, you can apply the method below to a sequence of frames around this one to improve its accuracy.
One way is to assume that:
The height of the rail above the ground is small (compared to its distance from the camera).
The long edges of the "11" number cut in the grass are perpendicular to the red line.
You can then estimate the vanishing point V of the "11", i.e. the image point where its long edges meet when extended. Then, any line drawn from V to a point of your red line is, by construction, the image of a line on the ground plane orthogonal to the one represented by the red line.
You can improve the accuracy a little by using, instead of your (presumably) hand-drawn red line, a line joining the bottom points of the supports of the rails, since this would really be on the ground.
If the poles supporting the railing were vertical (they aren't, as evidenced by the ones supported by the other rail higher in the image), you could compute their vanishing point P, then use it in place of V in the method above.
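Here is a minimal sketch of the vanishing-point construction, assuming you have picked two points on each long edge of the "11" by hand. All pixel coordinates below are made up for illustration, and the helper names are my own. It uses homogeneous coordinates: the line through two points is their cross product, and so is the intersection of two lines.

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def line_through(p, q):
    """Homogeneous line (a, b, c) with a*x + b*y + c = 0 through two image points."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))

def intersection(l1, l2):
    """Intersection of two homogeneous lines, or None if they are parallel in the image."""
    x, y, w = cross(l1, l2)
    if abs(w) < 1e-12:
        return None
    return (x / w, y / w)

# Hypothetical pixel coordinates: two points on each long edge of the "11".
edge_a = ((100.0, 400.0), (200.0, 300.0))
edge_b = ((500.0, 400.0), (400.0, 300.0))

# Vanishing point V of the direction assumed perpendicular to the track.
V = intersection(line_through(*edge_a), line_through(*edge_b))

# Joining V to any point on the red line gives the image of a ground line
# perpendicular to the track (under the two assumptions listed above).
point_on_red_line = (320.0, 380.0)
perpendicular = line_through(V, point_on_red_line)
```

With noisy hand-picked points the two edges will not meet at exactly the right place, so averaging V over several frames or several pairs of edges is likely to help.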

Related

How to get the top view of this image with reference to coin?

I am working on an object detection project. To measure an object's dimensions correctly I am using a coin as a reference, and to measure accurately I need a bird's-eye view of this image.
[Image Here]

Disclaimer: This approach is neither mathematically complete nor exact, I know. I still hope someone will find it useful for real-life applications or has ideas on how to improve it.
As you can see from the discussion, you can't get an accurate estimate of the vanishing point / the horizon from just one coin, because a circle can be projected to the same ellipse for different vanishing points. However, if there are two coins of the same size at the bottom center and top center of the image, it should be manageable to get acceptable accuracy:
If your business allows it, you can make assumptions that will lower the accuracy but make it easier to implement:
Assume that the plane's normal vector is parallel to the yz-plane of your image, i.e. the camera is held in a "normal" way and, relative to the plane, not tilted to the left or right.
Assume that the two coins are placed in the middle of the picture.
With this you can:
Extract the two ellipses.
Get the two tangents common to both ellipses, one on the left and one on the right.
Get the two horizontal tangents of the bigger ellipse.
Finally, get the four points where the tangents intersect.
Use the four points as input to warpPerspective as described here.
Of course, if we are talking about a mobile app, then sensor and camera data from the phone could help without bothering the user too much.

Calculating the tangent on a transition point of a black and white image

I would like to calculate the angle of the tangent on a given white to black transition point on an image that consists entirely of black and white pixels and displays simple shapes such as squares, circles or triangles.
Zooming in on an image like that would look like this:
If you were to pick any of the black pixels next to a white one, my solution would be to follow the edge for a few pixels, then define a formula based on the curvature of the pixels and calculate the exact value at the chosen point. Is there a simpler way of doing that? The resolution of the images is around 800x600 pixels, so a fairly accurate estimate of the angle at the given point should be possible.
In my current approach I follow the edge line of the shape for about ten pixels, but I'm not sure where to go from there. Is there a library that already performs this kind of calculation for you? How many pixels would you need in order to be able to make an accurate judgement of the angle at that point?

Such a measurement is highly inaccurate on binary images, if not unusable.
If you measure on two neighboring pixels, the angle will be one of 0° or ±45°, so the angular resolution is very poor!
You can compute on several pixels to improve that resolution (five pixels correspond to about 11°), but then you can no longer be sure that the direction is constant over that span, because the shape might be curved.
If in your case the repertoire of shapes is known to be simple, you'd better perform fitting of the whole shapes before querying the tangents.
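To put numbers on the resolution argument, here is a small sketch. The first function gives the smallest nonzero angle you can distinguish over a span of n pixels (atan(1/n)); the second estimates the direction of a short run of edge pixels from their principal axis. Both function names are mine, and the second one assumes the edge is locally straight over the run, which is exactly the assumption that fails on curved shapes.

```python
import math

def angular_step_deg(n_pixels):
    """Smallest nonzero angle (degrees) resolvable over a span of n_pixels."""
    return math.degrees(math.atan2(1, n_pixels))

def edge_direction_deg(points):
    """Orientation (degrees, image coordinates) of the principal axis of edge pixels.

    points is a list of (x, y) pixel coordinates followed along the edge.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    # Standard principal-axis formula from the second-order central moments.
    return math.degrees(0.5 * math.atan2(2 * sxy, sxx - syy))

# Ten pixels of a staircase approximating a line of slope 1/2 (about 26.6 degrees).
run = [(0, 0), (1, 0), (2, 1), (3, 1), (4, 2), (5, 2), (6, 3), (7, 3), (8, 4), (9, 4)]
estimate = edge_direction_deg(run)
```

Remember that y grows downward in image coordinates, so the sign of the angle is flipped relative to the usual mathematical convention.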

How do I split a shape with connected pixels into two parts in a binary image

My goal is to draw a rectangular border around the face by removing the neck area connected to the whole face area. All positive values here represent skin-color pixels. So far I have filtered out the binary image using OpenCV and Python. Code so far: skinid.py
Below is the test image.
Noise removal has also been applied to this binary image.
Up to this point, I followed this paper Face segmentation using skin-color map in videophone applications. And for the most of it, I used custom functions rather than using built-in OpenCV functions because I kind of wanted to do it from scratch. (although some erosion, opening, closing were used to tune it up)
I want to know a way to split the neck from the whole face area and remove it like this,
as I am quite new to the whole image processing area.

Perform a distance transform. It is built into OpenCV, or you could write it by hand: it's a pretty fun and easy one to write by applying the erode function iteratively and adding the result into another matrix each round (slow, but conceptually easy). On the binary image you presented above, the highest value in the distance transform will be at the center of the face (and I think this generalizes pretty well across mug shots). So that pixel is the center of your box, and the value of that pixel after the distance transform gives you a pretty solid approximation of the face size, since it is the pixel distance from the center of the face to the horizontal edges of the face. Depending on what you are after, you may be able to just multiply that distance by, say, 1.5 (work out a standard face width-to-height ratio to choose your best multiplier), set that as your circle radius (or half the side length for a box), and call it a day. Comment if you need anything clarified; I am pretty confident in this answer and would be happy to write up some quick code (in C++ OpenCV) if it would help.
Alternative idea: you could tweak your color filter a bit to reject darker areas. At least in the image presented, this will create a nice separation between your face and neck due to the shadowing of the chin (you may have to dial back your dilate/closing operations, though).
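A rough Python sketch of the distance-transform idea, written the slow "erode and accumulate" way with NumPy only so it stays self-contained. It uses a 4-connected erosion, which yields a city-block distance; OpenCV's cv2.distanceTransform would be the faster built-in route. The synthetic mask (a square "face" with a narrow "neck") and the 1.5 multiplier are illustrative assumptions, not values tuned on real images.

```python
import numpy as np

def erode4(mask):
    """One binary erosion with a 4-connected cross; outside the image is background."""
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    return (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
            & p[1:-1, :-2] & p[1:-1, 2:])

def distance_by_erosion(mask):
    """City-block distance to the nearest background pixel, by summing erosions."""
    dist = np.zeros(mask.shape, dtype=np.int32)
    current = mask.astype(bool)
    while current.any():
        dist += current          # every surviving pixel gains one unit of distance
        current = erode4(current)
    return dist

# Synthetic skin mask: a 20 x 20 "face" with a 6-pixel-wide "neck" below it.
mask = np.zeros((40, 30), dtype=bool)
mask[5:25, 5:25] = True
mask[25:35, 12:18] = True

dist = distance_by_erosion(mask)
cy, cx = np.unravel_index(np.argmax(dist), dist.shape)   # face center
radius = int(dist[cy, cx])                                # about half the face width
half_side = int(round(1.5 * radius))                      # assumed multiplier

# Bounding box around the face, clipped to the image.
top, bottom = max(cy - half_side, 0), min(cy + half_side, mask.shape[0] - 1)
left, right = max(cx - half_side, 0), min(cx + half_side, mask.shape[1] - 1)
```

The narrow neck never reaches a large distance value, which is why the maximum lands inside the face rather than somewhere along the neck.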

Method to determine polygon surface rotation from top-down camera

I have a webcam looking down on a surface which rotates about a single-axis. I'd like to be able to measure the rotation angle of the surface.
The camera position and the rotation axis of the surface are both fixed. The surface is a distinct solid color right now, but I do have the option to draw features on the surface if it would help.
Here's an animation of the surface moving through its full range, showing the different apparent shapes:
My approach thus far:
Record a series of "calibration" images, where the surface is at a known angle in each image
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(). I iterate through various epsilon values until I find one that yields exactly 4 points.
Order the points consistently (top-left, top-right, bottom-right, bottom-left)
Compute the angles between the points with atan2.
Use the angles to fit a sklearn linear_model.LinearRegression()
This approach is getting me predictions within about 10% of actual with only 3 training images (covering full positive, full negative, and middle position). I'm pretty new to both opencv and sklearn; is there anything I should consider doing differently to improve the accuracy of my predictions? (Probably increasing the number of training images is a big one??)
I did experiment with cv2.moments directly as my model features, and then some values derived from the moments, but these did not perform as well as the angles. I also tried using a RidgeCV model, but it seemed to perform about the same as the linear model.

If I understand correctly, you want to estimate the rotation of the polygon with respect to the camera. If you know the dimensions of the object in 3D, you can use solvePnP to estimate the pose of the object, from which you can get the rotation of the object.
Steps:
Calibrate your webcam and get the intrinsic matrix and distortion matrix.
Get the 3D measurements of the object corners and find the corresponding points in 2d. Let me assume a rectangular planar object and the corners in 3d will be (0,0,0), (0, 100, 0), (100, 100, 0), (100, 0, 0).
Use solvePnP to get the rotation and translation of the object
The rotation will be the rotation of your object along the axis. Here you can find an example to estimate the pose of the head, you can modify it to suit your application

Your first step is good -- everything after that becomes way way way more complicated than necessary (if I understand correctly).
Don't think of it as 'learning,' just think of it as a reference. Every time you're in a particular position where you DON'T know the angle, take a picture, and find the reference picture that looks most like it. Guess it's THAT angle. You're done! (There may well be indeterminacies, maybe the relationship isn't bijective, but that's where I'd start.)
You can consider this a 'nearest-neighbor classifier,' if you want, but that's just to make it sound better. Measure a simple distance (Euclidean! Why not!) between the uncertain picture, and all the reference pictures -- meaning, between the raw image vectors, nothing fancy -- and choose the angle that corresponds to the minimum distance between observed, and known.
If this isn't working -- and maybe, do this anyway -- stop throwing away so much information! You're stripping things down, then trying to re-estimate them, propagating error all over the place for no obvious (to me) benefit. So when you do a nearest neighbor, reference pictures and all that, why not just use the full picture? (Maybe other elements will change in it? That's a more complicated question, but basically, throw away as little as possible -- it should all be useful in, later, accurately choosing your 'nearest neighbor.')
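In code, the reference lookup is only a few lines. This is a sketch with random arrays standing in for the reference pictures; the function name and the angle values are made up, and it assumes all images have the same size and were taken from the same fixed camera.

```python
import numpy as np

def nearest_reference_angle(image, reference_images, reference_angles):
    """Return the angle of the reference image closest (Euclidean) to image."""
    query = np.asarray(image, dtype=np.float64).ravel()
    refs = np.stack([np.asarray(r, dtype=np.float64).ravel()
                     for r in reference_images])
    distances = np.linalg.norm(refs - query, axis=1)   # one distance per reference
    return reference_angles[int(np.argmin(distances))]

# Stand-in data: three "reference pictures" at known angles.
rng = np.random.default_rng(0)
reference_images = [rng.random((24, 32)) for _ in range(3)]
reference_angles = [-30.0, 0.0, 30.0]

# A new picture that is a slightly noisy copy of the third reference.
query_image = reference_images[2] + 0.01 * rng.standard_normal((24, 32))
guess = nearest_reference_angle(query_image, reference_images, reference_angles)
```

The resolution of the answer is limited by how densely the reference angles are sampled, so this benefits from many more than three reference pictures.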

Another option that is rather easy to implement, especially since you've done a part of the job is the following (I've used it to compute the orientation of a cylindrical part from 3 images acquired when the tube was rotating) :
Threshold each image to isolate the surface.
Find the four corners with cv2.approxPolyDP(), alternatively you could find the four sides of your part with LineSegmentDetector (available from OpenCV 3).
Compute the angle alpha, as depicted in the image below.
When your part is rotating, this angle alpha will follow a sine curve. That is, you will measure alpha(theta) = A sin(theta + B) + C. Given alpha you want to know theta, but first you need to determine A, B and C.
You've acquired many "calibration" or reference images, you can use all of these to fit a sine curve and determine A, B and C.
Once this is done, you can determine theta from alpha.
Notice that you have to deal with the ambiguity sin(Pi - a) = sin(a): two different values of theta can give the same alpha. It is not a problem if you acquire more than one image sequentially; if you have a single static image, you have to use an extra mechanism.
Hope I'm clear enough, the implementation really shouldn't be a problem given what you have done already.
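For what it's worth, the fit does not need a nonlinear optimizer. Since A sin(theta + B) + C = a sin(theta) + b cos(theta) + C with a = A cos(B) and b = A sin(B), it reduces to linear least squares. The sketch below uses synthetic calibration data (the values of A, B and C are invented) and inverts the fit only on the branch where theta + B lies in [-Pi/2, Pi/2], which is an assumption about your range of motion.

```python
import numpy as np

def fit_sine(theta, alpha):
    """Least-squares fit of alpha = A*sin(theta + B) + C; returns (A, B, C)."""
    theta = np.asarray(theta, dtype=float)
    M = np.column_stack([np.sin(theta), np.cos(theta), np.ones_like(theta)])
    (a, b, C), *_ = np.linalg.lstsq(M, np.asarray(alpha, dtype=float), rcond=None)
    return float(np.hypot(a, b)), float(np.arctan2(b, a)), float(C)

def theta_from_alpha(alpha, A, B, C):
    """Invert the fit, assuming theta + B stays within [-pi/2, pi/2]."""
    s = np.clip((alpha - C) / A, -1.0, 1.0)
    return float(np.arcsin(s) - B)

# Synthetic calibration set: surface angles (radians) and the measured alpha.
theta_cal = np.radians([-30.0, -15.0, 0.0, 15.0, 30.0])
alpha_cal = 0.4 * np.sin(theta_cal + 0.2) + 1.0

A, B, C = fit_sine(theta_cal, alpha_cal)

# New measurement taken with the surface at 10 degrees.
alpha_new = 0.4 * np.sin(np.radians(10.0) + 0.2) + 1.0
theta_est_deg = np.degrees(theta_from_alpha(alpha_new, A, B, C))
```

At least three calibration images at distinct angles are needed for the three unknowns; more will average out corner-detection noise.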

Using real time video feed to monitor analog gauge in Python

I'm hoping someone here can help me go about trying to solve this problem. I'm trying to figure out a good way to use a real-time video feed to run a Python script when a pressure gauge that looks like this reaches a certain variable pressure.
I'm pretty confident that this shouldn't be too difficult as the gauge will be mounted in a fixed position in the image frame, but am new to this sort of stuff and really don't know where to start.
Any suggestions? Thanks in advance!

This problem could be much tougher than you think. Typically, people use OpenCV for image processing/pattern recognition problems. There are many excellent tutorials for Python available in the documentation.
At its core, your problem is to figure out the angle of rotation of the needle. Then from there, you can calculate what the pressure is. If you're lucky, you can find the needle using Feature Detection + Homography and then calculate the angle. But I have my doubts that this kind of image lends itself to feature detection.
Template Matching will be able to find the needle, but it's very sensitive to rotation, so you'd have to generate a large number of templates with different needle angles and then go through all of your templates every time you want a pressure reading.
It may well be faster, cheaper, and easier to use an electronic pressure gauge for reading pressure. There are any number of ways you can get the output of your electronic gauge into Python. Some gauges can be connected to the network. Some can communicate as a serial port with RS232/422 or USB.
If you can find a gauge with a digital display, it should be more straightforward to do some kind of OCR/Template Matching to read the display from a camera.
Edit:
@jeremy arsenault's comment prompted me to think of another solution. If you are focused mostly on detecting when the needle crosses a threshold, you could look for the leading edge of the needle. And you could look more towards the center of the dial, where there are no indicators.
This kind of image really lends itself to thresholding to black and white. The needle is black and the background (at least towards the center) is white. After you do some Morphology and Thresholding of your image, you could look for black pixels of the needle. No template matching required. Pictures below illustrate some of the ideas. The first picture is your image thresholded to black and white. The starting image was very clean, but if you're grabbing from a camera, you'll probably have to do some Blurring and Morphology before thresholding so that you get no stray black pixels where you don't want them. The second image has a superimposed circle where you can look for black pixels. You can find the leading edge and trailing edge and average them to get a fairly precise value of the center.
Some more implementation details/pseudocode. For computer images, origin is usually top left corner and +y goes down. Also, I've defined increasing angle theta with increasing pressure. Parametric equation for circle is
x = x0 - r*cos(theta + theta0)
y = y0 - r*sin(theta + theta0)
theta0 is the offset angle that gives zero pressure (around -45 degrees for the gauge pictured). The brute-force way to look for the needle would be to start sweeping the angle from zero and increasing until you find a black pixel at x,y. Then keep increasing until you find a white pixel at x,y. The average of the two angles gives you the center of the needle and you should be able to calculate pressure.
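A rough Python version of that sweep, run on a synthetic thresholded image so it is self-contained. The image size, center, radius, sweep range and step are all made-up values, and the function assumes the sampling circle lies fully inside the image and that black is 0. It averages the first and last black samples of the first dark run, which is a slight variation on "first black, first white".

```python
import math
import numpy as np

def find_needle_angle(binary, x0, y0, r, theta0_deg, max_deg=270.0, step_deg=0.5):
    """Sweep theta upward along the circle; return the mid angle of the first
    run of black pixels in degrees, or None if no black pixel is found."""
    leading = last_black = None
    theta = 0.0
    while theta <= max_deg:
        a = math.radians(theta + theta0_deg)
        x = int(round(x0 - r * math.cos(a)))
        y = int(round(y0 - r * math.sin(a)))
        if binary[y, x] == 0:            # black: on the needle
            if leading is None:
                leading = theta          # leading edge
            last_black = theta           # keeps moving to the trailing edge
        elif leading is not None:
            break                        # first white pixel after the needle
        theta += step_deg
    if leading is None:
        return None
    return (leading + last_black) / 2.0

# Synthetic thresholded frame: white background, 3-pixel-wide black needle.
image = np.full((200, 200), 255, dtype=np.uint8)
x0, y0, theta0_deg, true_theta = 100, 100, -45.0, 60.0
a = math.radians(true_theta + theta0_deg)
for i in range(161):                     # needle length 80 px, half-pixel steps
    cx = int(round(x0 - 0.5 * i * math.cos(a)))
    cy = int(round(y0 - 0.5 * i * math.sin(a)))
    image[cy - 1:cy + 2, cx - 1:cx + 2] = 0

theta = find_needle_angle(image, x0, y0, r=50, theta0_deg=theta0_deg)
```

If the gauge scale is linear, pressure follows from theta by a simple proportion between the zero-pressure and full-scale angles, which you would read off the dial face once.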
