Let's say I have a square whose dimensions I know. I have a camera that will likely move around and encounter the square at different angles. I want to calculate the angle at which my camera is pointing at the square.
I know the coordinates of the 4 corner points, and I can calculate all side lengths as seen from the camera (unit is pixels). The point is, I have no problem detecting the square.
The following diagrams will give you an idea of what I mean:
view from camera. Note the distortion of the square causing one side length to be different
view from above, showing angle I am trying to calculate. The grey line is the trajectory of vision of the camera.
I asked on Math SE thinking there might be a mathematical way of doing this by measuring the heights of the two sides of the square. The answer gave me some insight, but it is extremely complicated.
When I asked ChatGPT this question, it appeared to give me a much simpler answer, though incomplete. It suggested that I use a combination of getPerspectiveTransform and decomposeProjectionMatrix to get the desired angles. The answer was grossly incomplete though, so it was only an inspiration.
My question is, how should this be accomplished with perspective transform? Or, if that is the wrong approach altogether, how should this problem be approached?
Related
I am working on object detection project and to measure it dimension correctly, for that I am using coin for reference, to measure accurately, I need a bird eye view of this image.
[Image Here]
Disclaimer: This approach is not mathematically complete nor exact, I know. Although I hope someone will find it useful for real life applications or has some positive ideas how to improve it.
As you can see from the discussion you can't get an accurate estimation of the vanishing point / the horizon by just one coin because a circle can be projected to the same ellipse for different vananishing points. However if there are two coins of same size at bottom center and top center of the image it should be manageble to get an acceptable accuracy:
If your business allows it you can do assumptions that will lower the accuracy but make it easier to implement:
Assume that the plane's normal vector is parallel to the yz-plane of your image, i.e the camera is held in a "normal" way and - in relation to the plane - not tilted to the left or right.
Assume that the two coins are placed in the middle of the picture.
With this you can:
Extract the two ellipses.
Get the two tangents of both ellipses left and right.
Get the two horizontal tangents of the bigger ellipse.
Finnally get the four points where the tangents intersect.
Use the four points as input to warpPerspective as descibed here.
Of course, if we are talking about a mobile app, then sensor and camera data from the phone could help without bothering the user too much.
G'day programmers and math enthusiasts.
Recently I have been exploring how CAS graphing calculators function; in particular, how they are able to draw level curves and hence contours for multivariable functions.
Just a couple of notes before I ask my question:
I am using Python's Pygame library purely for the window and graphics. Of course there are better options out there but I really wanted to keep my code as primitive as I am comfortable with, in an effort to learn the most.
Yes, yes. I know about matplotlib! God have I seen 100 different suggestions for using other supporting libraries. And while they are definitely stunning and robust tools, I am really trying to build up my knowledge from the foundations here so that one day I may even be able to grow and support libraries such as them.
My ultimate goal is to get plots looking as smooth as this:
Mathematica Contour Plot Circle E.g.
What I currently do is:
Evaluate the function over a grid of 500x500 points equal to 0, with some error tolerance (mine is 0.01). This gives me a rough approximation of the level curve at f(x,y)=0.
Then I use a dodgy distance function to find each point's closest neighbour, and draw an anti-aliased line between the two.
The results of both of these steps can be seen here:
First Evaluating Valid Grid Points
Then Drawing Lines to Closest Points
For obvious reasons I've got gaps in the graph where the next closest point is always keeping the graph discontinuous. Alas! I thought of another janky work around. How about on top of finding the closest point, it actually looks for the next closest point that hasn't already been visited? This idea came close, but still doesn't really seem to be even close to efficient. Here are my results after implementing that:
Slightly Smarter Point Connecting
My question is, how is this sort of thing typically implemented in graphing calculators? Have I been going about this all wrong? Any ideas or suggestions would be greatly appreciated :)
(I haven't included any code, mainly because it's not super clear, and also not particularly relevant to the problem).
Also if anyone has some hardcore math answers to suggest, don't be afraid to suggest them, I've got a healthy background in coding and mathematics (especially numerical and computational methods) so here's me hoping I should be able to cope with them.
so you are evaluating the equation for every x and y point on your plane. then you check if the result is < 0.01 and if so, you are drawing the point.
a better way to check if the point should be drawn is to check if one of the following is true:
(a) if the point is zero
(b) if the point is positive and has at least one negative neighbor
(c) if the point is negative and has at least one positive neighbor
there are 3 problems with this:
it doesn't support any kind of antialisasing so the result will not look as smooth as you would want
you can't make thicker lines (more then 1 pixel)
if the 0-point line is only touching (it's positive on both sides and not positive on one, negative on the other)
this second solution may fix those problems but it was made by me and not tested so it may or may not work:
you assign the value to a corner and then calculate the distance to the zero line for each point from it's corners. this is the algorithm for finding the distance:
def distance(tl, tr, bl, br): # the 4 corners
avg = abs((tl + tr + bl + br) / 4) # getting the absolute average
m = min(map(abs, (tl + tr + bl + br))) # absolute minimum of points
if min == 0: # special case
return float('inf')
return avg / m # distance to 0 point assuming the trend will continue
this returns the estimated distance to the 0 line you can now draw the pixel e.g. if you want a 5-pixel line, then if the result is <4 you draw the pixel full color, elif the pixel is <5 you draw the pixel with an opacity of distance - 4 (*255 if you are using pygames alpha option)
this solution assumes that the function is somewhat linear.
just try it, in the worst case it doesn't work...
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.30.1319&rep=rep1&type=pdf
This 21 Page doc has everything I need to draw the implicit curves accurately and smoothly. It even covers optimisation methods and supports bifurcation points for implicit functions. Would highly recommend to anyone with questions similar to my own above.
Thanks to everyone who had recommendations and clarifying questions, they all helped lead me to this resource.
I have a photo taken from a camera (whose focal length, principle point, and distortion coefficients I know). The photo has a 8cm x 8cm post-in on a table and the center of the post-it is the origin (0, 0) again in cm. I've also indicated the positive-y axis on the post-it.
From this information is it possible to compute the location of the camera and the vector in which the camera is looking in Python using OpenCV? If someone has a snippet of code that does that (assuming you know the coordinates of the post-it corners already) that would be amazing!
Use OpenCV's solvePnP specifying SOLVEPNP_IPPE_SQUARE in the flags. With only 4 points (and a postit) the solution will be quite sensitive to how accurately you mark their images, so ask yourself whether you really need the camera pose and location for your application, and how accurately. E.g., if you just want to make a flat CG "sticker" stay fixed on the table while the camera moves, all you need is estimating a homography, a much simpler task.
It does look like you have all the information required. The marker you use can be easily segmented. Shape analysis will provide corners. I did something similar to get basic eyesight tracking:
Here is a complete example.
Segmentation result for the example:
Please notice, accuracy really matters, so it might be useful to rely on several sets of points.
I need help in python coding an algorithm capable of detecting the corners of a image. I have a thresholded image so far and I was using cornerHarris from opencv to detect all the corners. My problem is filtrating all those points to output only the ones I desired. Maybe I can do a loop to achieve this?
In my case, I want the two lowest corners and the two highest corners points. My main interest is to obtain the pixel coordinates of this corners. You can see an example of a image I'm processing here:
In this image I draw the corners points I'm interested in.
There are several ways to solve this problem. In real-world applications it's rare (that is, actually never occurs) that you need to solve a problem once for a single image. If you have additional images it would be nice to see how much the object of interest varies.
One method to find corners is the convex hull. This method is more generally used to find a convex shape encompassing scattered points, but it's worth knowing about and implementing.
https://en.wikipedia.org/wiki/Convex_hull
What's handy about the convex hull is that the concept of a "corner" (a vertex on the convex hull polygon) is easy to grasp and doesn't rely on parameter settings. You don't have to consider whether a corner is sharp enough, strong enough, pointy enough, unique in its neighborhood, etc.--the convex hull will simply make sense to you.
You should be able to write a functional version of a convex hull "gift wrapping" algorithm in a reasonable period of time.
https://en.wikipedia.org/wiki/Gift_wrapping_algorithm
There are many ways to compute the convex hull, but don't get lost in all the different methods. Choose one that makes sense to you and implement it. The fastest known method may still be Seidel, but don't even think about running down that rabbit hole. Simple is good.
Before you compute the convex hull, you'll need to reduce your white shape to edge points; otherwise the hull algorithm will check far too many points. Reducing the number of points to be considered can be done using edge-finding on the connected component (the white "blob"), edge-finding without first segmenting foreground from background, or any of various simple kernels (e.g. Sobel).
Although the algorithm is called the "convex" hull, your shape doesn't have to be convex, especially if you're only interested in the top and bottom vertices/corners as shown in your sample image.
Corner finders can be a bit disappointing, frankly, especially since the name implies, "Hey, it'll just find corners all the time." There are some good ones out there, but you could spend a lot of time investigating all the alternatives. Even then you'll likely have to set thresholds, consider whether your application will yield the occasional weird result given the shape and scale of corners, and so on.
Although you mention wanting to find only the top and bottom points, if you wanted to find those two odd triangular outcroppings on the left side the corner-finding gets a little more complicated; using the convex hull keeps this very simple.
Although you want to find a robust solution to corner detection, preferably using a known algorithm for which performance can be understood easily, you also want to avoid overgeneralizing. In any case, review some list of corner detectors and see what strikes your fancy. If you see a promising algorithm that looks easy-ish to implement, why not try implementing it?
https://en.wikipedia.org/wiki/Corner_detection
I've been trying for a while to find the centre of a curved shape (for example a banana). I can do all the basics, such as creating a binary image, and locating the contour. However, the centroid function correctly finds a point outside of the contour. The point I require must be inside the contour. I've attached an image which should explain things better.
If anyone has any ideas, or has seen something similar I would really appreciate some help.
You could look at this answer, What is the fastest way to find the "visual" center of an irregularly shaped polygon?
Basically skeletonisation algorithms should help (in terms of efficiency and accuracy as compared to continuous erosion, which would fail in some cases), since they narrow down the set of possible valid points to a set of line segments, which you can then do some sort of conditional processing on.