I am wondering if this can be done:
I have a set of images that start out looking forward; as the camera moves forward, it also spins horizontally through 360 degrees.
So each image has a slightly different view as the camera spins while moving forward.
The question is: can I accurately calculate the spin of the camera?
A follow-up question: can I also calculate the direction the camera is moving along with the spin?
The idea would be to track a few points across the transformation; from those points you can find the angle of rotation between each frame.
You might want to have a look at this page, which explains the maths:
http://nghiaho.com/?page_id=671
If you don't need to stick to Python, you could use MATLAB:
http://uk.mathworks.com/help/vision/examples/find-image-rotation-and-scale-using-automated-feature-matching.html?requestedDomain=uk.mathworks.com
I am working on an object detection project and need to measure object dimensions correctly. I am using a coin for reference, and to measure accurately I need a bird's-eye view of this image.
[Image Here]
Disclaimer: This approach is neither mathematically complete nor exact, I know. Still, I hope someone will find it useful for real-life applications, or has ideas for improving it.
As you can see from the discussion, you can't get an accurate estimate of the vanishing point / horizon from just one coin, because a circle can project to the same ellipse for different vanishing points. However, if there are two coins of the same size at the bottom center and top center of the image, it should be manageable to get acceptable accuracy:
If your business allows it you can do assumptions that will lower the accuracy but make it easier to implement:
Assume that the plane's normal vector is parallel to the yz-plane of your image, i.e. the camera is held in a "normal" way and, in relation to the plane, not tilted to the left or right.
Assume that the two coins are placed in the middle of the picture.
With this you can:
Extract the two ellipses.
Get the two tangents of both ellipses left and right.
Get the two horizontal tangents of the bigger ellipse.
Finally get the four points where the tangents intersect.
Use the four points as input to warpPerspective as described here.
Of course, if we are talking about a mobile app, then sensor and camera data from the phone could help without bothering the user too much.
I'm looking for some advice on what to use for this problem.
I have a camera feed of a flat surface. I want to be able to mark a point on that surface, and then to automatically track the movement of that point in the frame. In other words, I want to track the shift and rotation of the image compared to the initial state.
I think this should be possible using opencv or something similar, but I'm unable to find the right tools.
Could someone point me in the right direction?
Preferably some way in python but other methods are welcome too.
I am new to OpenCV. I am working on a project that involves tracking and detecting a spinning roulette ball. Here is the video I want to use: https://www.youtube.com/watch?v=IzZNaVQ3FnA&list=LL_a67IPXKsmu48W4swCQpMQ&index=7&t=0s
I want to get the ball time for 1 revolution. But the ball is quite fast and hard to detect. I am not sure how to overcome this.
What would be the best algorithm for doing this?
By subtracting successive images, you will isolate the ball as a (slightly curved) line segment. Both its length and its angular position are cues for the speed.
Anyway, these parameters are a little tricky to extract for a side view, as the ellipse has to be "unprojected" to a top view, to see the original circle. You need to know the relative position of the wheel and the viewer, which you most probably don't know.
An approximate solution is obtained by stretching the ellipse in the direction of the small axis.
I'm hoping someone here can help me go about trying to solve this problem. I'm trying to figure out a good way to use a real-time video feed to run a Python script when a pressure gauge that looks like this reaches a certain pressure.
I'm pretty confident that this shouldn't be too difficult as the gauge will be mounted in a fixed position in the image frame, but am new to this sort of stuff and really don't know where to start.
Any suggestions? Thanks in advance!
This problem could be much tougher than you think. Typically, people use OpenCV for image processing / pattern recognition problems, and there are many excellent Python tutorials available in the documentation.
At its core, your problem is to figure out the angle of rotation of the needle; from there, you can calculate the pressure. If you're lucky, you can find the needle using Feature Detection + Homography and then calculate the angle, but I have my doubts that this kind of image lends itself to feature detection.
Template Matching will be able to find the needle, but it's very sensitive to rotation, so you'd have to generate a large number of templates with different needle angles and then go through all of your templates every time you want a pressure reading.
It may well be faster, cheaper, and easier to use an electronic pressure gauge for reading pressure. There are any number of ways you can get the output of your electronic gauge into Python. Some gauges can be connected to the network. Some can communicate as a serial port with RS232/422 or USB.
If you can find a gauge with a digital display, it should be more straightforward to do some kind of OCR/Template Matching to read the display from a camera.
Edit:
@jeremy arsenault's comment prompted me to think of another solution. If you are focused mostly on detecting when the needle crosses a threshold, you could look for the leading edge of the needle, and you could look closer to the center of the dial, where there are no indicators.
This kind of image lends itself to easy thresholding to black and white: the needle is black and the background (at least towards the center) is white. After some morphology and thresholding you can look for the needle's black pixels directly; no template matching required. The pictures below illustrate the idea. The first is your image thresholded to black and white. The starting image was very clean, but if you're grabbing from a camera you'll probably have to do some blurring and morphology before thresholding to remove stray black pixels where you don't want them. The second has a superimposed circle along which you can look for black pixels: find the leading and trailing edges and average them to get a fairly precise value for the needle's center.
Some more implementation details/pseudocode. For computer images, origin is usually top left corner and +y goes down. Also, I've defined increasing angle theta with increasing pressure. Parametric equation for circle is
x = x0 - r*cos(theta + theta0)
y = y0 - r*sin(theta + theta0)
theta0 is the offset angle that gives zero pressure (around -45 degrees for the gauge pictured). The brute-force way to look for the needle would be to sweep the angle from zero, increasing until you find a black pixel at (x, y), then keep increasing until you find a white pixel. The average of the two angles gives you the center of the needle, from which you should be able to calculate the pressure.
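The sweep described above could be sketched like this (the image is assumed to be already thresholded so the needle is black on white, and x0, y0, r would come from a one-time calibration of the gauge's position in the frame):

```python
import numpy as np

def needle_angle(binary_img, x0, y0, r, theta0=np.radians(-45)):
    """Sweep theta from the zero-pressure offset upward; the needle
    shows up as a run of black pixels on the sampling circle.
    Returns the needle's centerline angle in degrees past theta0,
    or None if no black pixel is found on the circle."""
    leading = trailing = None
    for deg in range(0, 360):
        theta = np.radians(deg)
        # Parametric circle from the answer above (image y axis points down)
        x = int(round(x0 - r * np.cos(theta + theta0)))
        y = int(round(y0 - r * np.sin(theta + theta0)))
        black = binary_img[y, x] == 0
        if black and leading is None:
            leading = deg                 # first black pixel: leading edge
        if leading is not None and not black:
            trailing = deg                # back to white: trailing edge
            break
    if leading is None:
        return None
    if trailing is None:
        trailing = leading
    return (leading + trailing) / 2.0     # center of the needle
```

Mapping the returned angle to a pressure value is then just linear interpolation between the gauge's known zero and full-scale angles.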
I have a Python program where people can draw simple line drawings using a touch screen. The images are documented in two ways. First, they are saved as actual image files. Second, I record 4 pieces of information at every refresh: the time point, whether contact was being made with the screen at the time (1 or 0), the x coordinate, and the y coordinate.
What I'd like to do is gain some measure of how similar a given drawing is to any other drawing. I've tried a few things, including simple Euclidean distance and per-pixel similarity, and I've looked at Fréchet distance. None of these gives what I'm looking for.
The issues are that each drawing might have a different number of points, one segment does not always immediately connect to the next, and the order of the points is irrelevant. For instance, if you and I both draw something as simple as an ice cream cone, I might draw ice cream first, and you might draw the cone first. We may get an identical end result, but many of the most intuitive metrics would be totally thrown off.
Any ideas anyone has would be greatly appreciated.
If you care about how similar one drawing is to another, there's no need to collect data at every refresh; just collect it once the drawer is done drawing.
Then you can use Fourier analysis to break the images down into the frequency domain and run cross-correlations on that,
or some kind of 2D cross-correlation on the images directly, I guess.
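As a sketch of that route (assuming the two drawings are rendered to same-sized grayscale arrays), FFT-based circular cross-correlation gives a shift-invariant score: the peak of the normalized correlation surface is 1.0 for identical drawings regardless of where they sit on the canvas, which sidesteps the stroke-order problem entirely:

```python
import numpy as np

def xcorr2d_peak(a, b):
    """Shift-invariant similarity of two same-sized grayscale drawings.
    Returns the peak of the normalized circular 2D cross-correlation
    (1.0 = identical up to translation, near 0 = unrelated)."""
    a = np.asarray(a, float)
    b = np.asarray(b, float)
    a = a - a.mean()                      # remove DC so blank areas don't match
    b = b - b.mean()
    # Circular cross-correlation via the convolution theorem
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    norm = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    return corr.max() / norm if norm > 0 else 0.0
```

Note this only handles translation; it won't match drawings that differ in scale or rotation, for which you'd need something like log-polar correlation on top.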