Understanding quiver plot scale - python

I am trying to understand how scale works in a quiver plot. I found an explanation here, but I want to double-check that I understand what the author says.
When he says
Setting the angles keyword to 'xy' means that the vector components are scaled according to the physical axis units rather than geometrical units on the page.
Does he mean that if, say, I have a movement of 1 pixel (per 1 second, to keep it simple) on a 232x232 image (the actual size I have), the 1 px/s movement will show up scaled by 232, i.e. as 1/232? Or will 1 px/s be drawn with a length of 1 px (whatever its angle to the axes), i.e. take up one pixel of the 232x232 image as it appears in the plot?
The actual scaling factor which multiplicatively converts vector component units to physical axis units is width/scale where width is the width of the plot in physical units and scale is the number specified by the scale keyword argument of quiver.
Does that mean that if, say, I choose my scale to be 10, a vector that is actually 10 px/s will appear as 1 px/s? (I know these are very basic questions, but I am confused by the "physical axis units" and the "geometrical units on the page".)
I think I will test this by simply taking two images, one sized, say, 40x40 and the other 160x160, with the same 1 px movement, and seeing how that affects the plot.
What I would like to do is have 1 px/s represented by a length of 1 px, whatever that length is in the rendered plot (using matplotlib.pyplot).
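For concreteness, here is a minimal sketch of the test I have in mind, with made-up data: the same 1 px/s movement drawn on two image sizes with the same explicit scale, so the width/scale rule quoted above can be checked directly.

import numpy as np
import matplotlib.pyplot as pl

# Made-up test: the same 1 px/s rightward movement drawn on a 40x40 and a
# 160x160 image with the same explicit scale, to see whether the arrow
# length follows the width/scale rule quoted above.
for size in (40, 160):
    pl.figure()
    pl.imshow(np.zeros((size, size)), cmap='gray')
    # a single arrow at the image centre; with scale fixed at 40, the quoted
    # rule predicts a length of ~1 px on the 40x40 image and ~4 px on the
    # 160x160 one
    pl.quiver(size / 2, size / 2, 1, 0, color='r', scale=40)
    pl.title('%dx%d image, scale=40' % (size, size))
pl.show()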

OK, so I played around with synthetic data, using either angles='xy' or just leaving it out, i.e. having something like
q = pl.quiver(a[valid,0], a[valid,1], a[valid,2], a[valid,3], color='r', **kw)
where one of the kwargs is your scale factor.
It seems to scale with the width of the plot. I used
pl.draw()
pl.xticks(np.arange(0, 120, 32))
pl.yticks(np.arange(0, 100, 32))
pl.grid(color='y')
pl.savefig(new_file, bbox_inches='tight')
to save my figure (the actual images were 120x100 in this case, 120x120 and 64x64).
What I did to test it:
I created one-pixel shifts in the horizontal and vertical directions, and then, on some other images, a diagonal shift of 5 px (3 right and 4 down).
To get a 1 px/s vector to take up a length of 1 px in your image, you have to set the scale to the width of your plot, i.e. in my case either 64 or 120. I wanted to see whether a non-square image would affect the scale, so I cropped the image to 100 px height. From the few tests I have done, it does not seem to have any effect. It would be worth testing with different pixel shifts, but since the horizontal/vertical shifts were not scaled differently on images with different side lengths, I think it is safe to assume it behaves as expected.
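A minimal sketch of that finding with synthetic data (not the exact script I used), alongside the angles='xy', scale_units='xy', scale=1 combination, which ties arrow length directly to data units and gives the same 1 px/s = 1 px result without needing to know the plot width:

import numpy as np
import matplotlib.pyplot as pl

width = 120                          # data width of the plot, as in the test above
x, y = np.mgrid[8:width:16, 8:width:16]
u = np.ones_like(x, dtype=float)     # synthetic 1 px/s horizontal motion
v = np.zeros_like(y, dtype=float)

# Option 1: scale set to the plot width, with scale_units left at its default
# (which the explanation quoted in the question describes as width-based).
pl.figure()
pl.quiver(x, y, u, v, color='r', scale=width)
pl.axis([0, width, 0, width])

# Option 2: angles='xy', scale_units='xy', scale=1 draws each arrow from
# (x, y) to (x + u, y + v) in data coordinates, so 1 px/s is always 1 px long.
pl.figure()
pl.quiver(x, y, u, v, color='r', angles='xy', scale_units='xy', scale=1)
pl.axis([0, width, 0, width])
pl.show()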
I wonder what is the reason for such scaling of vectors...

Related

How to find rectangular areas in a two-dimensional array whose distance from the center of the rectangle is less than the specified value of x

I'm looking for rectangular areas in a non-negative two-dimensional float numpy array where the difference between each value and the value at the center point of the area is less than x. The data being analysed is the output of a depth-estimation function, and I want to identify areas whose values are close to each other (which could, for example, be part of a wall or of objects that are vertical and facing the camera).
For example, the image below shows the output of the depth-estimation function, where each pixel represents a distance between 0 and 500 cm. Any area in which the differences are less than a given value indicates that the object is vertical, and those are the areas I am looking for:
https://drive.google.com/file/d/1Z2Bsi5ZNoo4pFU6N188leq56vGHFfvcd/view?usp=sharing
The code I am working on is based on MiDaS; I have added my code at the end, and it is available at the following link:
https://colab.research.google.com/drive/1vsfukFqOOZZjTajySM8hL0VNCvOxaa6X?usp=sharing
Now, for example, I'm looking for areas like the paper that is stuck behind a chair in the picture below:
https://drive.google.com/file/d/1ui99gpU2i0JFumLivLpoEyHn3QTcfD8a/view?usp=sharing
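One possible starting point (a sketch only; the window size and threshold are made up, and it has not been tested against the linked notebook) is a sliding-window check built from scipy's min/max filters: a pixel is the centre of a "flat" window if no depth value in the window differs from the centre value by more than x.

import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def flat_window_centers(depth, win=31, x=10.0):
    # depth: 2D float array of distances (e.g. 0-500 cm)
    # returns a boolean mask of pixels that are the centre of a win x win
    # window in which every value differs from the centre value by less than x
    hi = maximum_filter(depth, size=win)    # max depth in each window
    lo = minimum_filter(depth, size=win)    # min depth in each window
    return (hi - depth < x) & (depth - lo < x)

# usage sketch with random data standing in for the depth map
depth = np.random.uniform(0, 500, size=(480, 640)).astype(np.float32)
mask = flat_window_centers(depth, win=31, x=10.0)
ys, xs = np.nonzero(mask)                   # centres of candidate areas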

How do I fit rectangles to an image in python and obtain their coordinates

I'm looking for a way to split a number of images into proper rectangles. These rectangles are ideally shaped such that each of them takes on the largest possible size without containing a lot of white.
So let's say that we have the following image
I would like to get an output such as this:
Note the overlapping rectangles, the hole and the non-axis-aligned rectangle; all of these are likely scenarios I have to deal with.
I'm aiming to get the coordinates describing the corner pieces of the rectangles so something like
[[(73,13),(269,13),(269,47),(73,47)],
[(73,13),(73,210),(109,210),(109,13)],
...]
In order to do this I have already looked at cv2.findContours, but I couldn't get it to work with overlapping rectangles (though I could use the hierarchy model to deal with holes, as those cause the contours to be merged into one).
Note that, although not shown, holes can be nested.
An algorithm that works roughly as follows should be able to give you the result you seek.
1. Get all the corner points in the image.
2. Randomly select 3 points to create a rectangle.
3. Count the ratio of yellow pixels within the rectangle; accept it if the ratio satisfies a threshold.
4. Repeat steps 2 to 4 until:
a) every single combination of points has been tried, or
b) all yellow pixels are accounted for, or
c) n iterations have been run.
The difficult part of this algorithm lies in step 2, creating a rectangle from 3 points.
If all the rectangles were axis-aligned, you could simply take the minimum x and y as the top-left corner and the maximum x and y as the bottom-right corner of your new rectangle.
But since you have off-axis rectangles, you will need to check that the two vectors created from the 3 points form a 90-degree angle before generating the rectangle.
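A rough sketch of that sampling loop (the corner detector, the mask name, the thresholds and the fixed iteration budget standing in for the stopping rules are all assumptions, not a tuned solution):

import numpy as np
import cv2

def sample_rectangles(mask, n_iters=20000, fill_thresh=0.9, angle_tol=0.05):
    # mask: uint8 binary image, 255 where the 'yellow' pixels are
    # step 1: corner candidates
    corners = cv2.goodFeaturesToTrack(mask, maxCorners=200,
                                      qualityLevel=0.01, minDistance=5)
    corners = corners.reshape(-1, 2)

    rects = []
    for _ in range(n_iters):                     # stand-in for the stop rules
        # step 2: pick 3 corner points at random
        a, b, c = corners[np.random.choice(len(corners), 3, replace=False)]
        ab, cb = a - b, c - b
        denom = np.linalg.norm(ab) * np.linalg.norm(cb)
        # keep them only if the two edges meet at roughly 90 degrees at b
        if denom == 0 or abs(np.dot(ab, cb)) / denom > angle_tol:
            continue
        d = a + cb                               # fourth corner
        quad = np.array([b, a, d, c], dtype=np.int32)

        # step 3: ratio of yellow pixels inside the candidate rectangle
        poly = np.zeros_like(mask)
        cv2.fillPoly(poly, [quad], 255)
        area = cv2.countNonZero(poly)
        inside = cv2.countNonZero(cv2.bitwise_and(poly, mask))
        if area > 0 and inside / area >= fill_thresh:
            rects.append(quad.tolist())
    return rects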

find closest clusters of colors in a numpy array from an image file

Current state
I have a numpy array of shape (900, 1800, 3) that has been made from an image file.
That's one array element per pixel: 900 px high, 1800 px wide, and 3 channels (R, G, B) per pixel represented in the array.
There are only a small number (3-20) of unique RGB colors in the images being parsed, so only a few different RGB value combinations are represented in the array.
Goal
Identify the smallest circular areas in the image that contain n unique colors, where n will always be less than or equal to the number of unique colors in the image.
Return the top y (by count or percentage) smallest such areas.
A 'result' could simply be the x,y value of the center pixel of an identified circular area and its radius.
I do plan to draw a circle around each area, but this question is about the best approach for first identifying the top smallest areas.
The Catch/Caveat
The images are actually flattened projections of spheres. That means that a pixel at the right edge of the image is actually adjacent to a pixel on the left edge, and similarly for the top and bottom pixels. The solution must account for this as it parses pixels to identify the closest pixels of other colors. EDIT: this part may be answered in the comments below.
The Question
My initial approach is to simply parse pixel by pixel and brute-force the problem with hand-rolled x/y coordinate math: take a pixel, work outwards until we hit n colors, score that pixel by how many steps outward it took, then move on to the next pixel. Keep a top-y dict that gets re-evaluated after each pixel, adding any pixels that make the top y and dumping any that get pushed out. Return that dict as the output.
I know that many python libs like scipy, scikit-image, and maybe others like to work with images as numpy arrays. I'm sure there is a method/approach that is smarter and leverages a library or some kind of clustering algo instead of brute forcing it, but I'm not familiar enough with the space to know intuitively what methods and libs to consider. The question: What is the pseudocode for a good method/lib to do this the right way?
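Not a full answer, but one non-brute-force sketch (the function name and the wrap handling are assumptions): compute, per unique color, the distance from every pixel to the nearest pixel of that color; the radius of the smallest circle around a pixel containing n colors is then the n-th smallest of those per-color distances at that pixel.

import numpy as np
from scipy.ndimage import distance_transform_edt

def smallest_n_color_circles(img, n, top=10):
    # img: (H, W, 3) uint8 array with few unique colors
    h, w, _ = img.shape
    colors = np.unique(img.reshape(-1, 3), axis=0)

    dists = []
    for c in colors:
        mask = np.all(img == c, axis=-1)
        # pad left/right with wrapped copies so distances respect the fact
        # that the left and right edges of the image touch on the sphere
        # (the top/bottom wrap of the projection is ignored in this sketch)
        wrapped = np.pad(mask, ((0, 0), (w // 2, w // 2)), mode='wrap')
        d = distance_transform_edt(~wrapped)[:, w // 2: w // 2 + w]
        dists.append(d)
    dists = np.stack(dists)                      # (n_colors, H, W)

    # radius needed at each pixel to reach n distinct colors
    radius = np.sort(dists, axis=0)[n - 1]
    flat = np.argsort(radius, axis=None)[:top]   # indices of the smallest radii
    ys, xs = np.unravel_index(flat, radius.shape)
    return list(zip(xs, ys, radius[ys, xs]))     # (x, y, radius) triples

Note that the smallest radii will cluster around the same spot, so some de-duplication of nearby centers would still be needed to get distinct areas.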

How to measure object size in real world in terms of measurement like inches centimeters etc from object size in the image in pixels?

I have calculated the object's size in pixels from the image containing the object. I want to measure the object's size in the real world. Is there any way to find the multiplying factor to convert to the actual size? I'm currently using Python for the implementation.
Typically you will have obtained your image with a camera which projects the 3-dimensional scene onto a 2-dimensional sensor by means of a lens. Consider the vertical (height) projection (I am assuming a rectilinear lens).
You say the height of your object of interest in the image is h_{obj} = 150 pixels.
You say the image has a total height of h_{img} = 800 pixels, and I assume this is also the sensor resolution.
You are interested in finding the real, metric height H_{obj} of the object, which is at distance D from the camera.
For the angle \theta subtended by the object we can establish the following relationships:
tan(\theta/2) = H_{obj} / (2 D) = h_{obj}^{mm} / (2 f)
where f is the focal length of the lens and h_{obj}^{mm} is the height of the object's projection on the sensor, in millimeters. Isolating terms and substituting we reach
H_{obj} = D * h_{obj}^{mm} / f
but you have expressed h_{obj} in pixels and you want H_{obj} expressed in the metric system. So, let's first move from pixels to millimeters:
h_{obj}^{mm} = h_{obj} * S_h / h_{img}
where S_h is the sensor height in millimeters. Let's assume you don't know the sensor height, so we keep it as a variable for the moment. Rearranging, expressing the focal length in millimeters and substituting into the previous equation we have:
H_{obj} = D * (h_{obj} / h_{img}) * (S_h / f)
Notice the term
S_h / f = 2 tan(\alpha/2)
\alpha represents the vertical Field of View of the camera (since we use the sensor height); this is a parameter that is typically given for camera and lens calculations.
Normally it's given in degrees, so we just convert it to radians.
If you actually know the focal length of your lens and the size of your sensor, just calculate the Field of View directly. This leaves us with the following equation:
H_{obj} = 2 D tan(\alpha/2) * h_{obj} / h_{img}
The Field of View is the parameter you are missing to be able to complete your calculation. To complete the example, let's assume it is 90°, so tan(\alpha/2) = 1 and H_{obj} = 2 D * 150 / 800 = 0.375 D.
The system of units you use to express your distance D will define the units in which H_{obj} is expressed.
Another approach: given a Field of View, and assuming a rectilinear lens, you can calculate the pixel height and pixel width at a defined distance from the camera. The vertical resolution (the real-world height covered by one pixel) at distance D is
2 D tan(\alpha/2) / h_{img}
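As a quick numeric check of the final formula, a minimal sketch (the function name and the D = 2 m example are mine, not from the original answer):

import math

def real_height(h_obj_px, img_h_px, vfov_deg, distance):
    # H_obj = 2 * D * tan(alpha / 2) * h_obj / h_img, from the derivation above;
    # the result is in the same units as `distance`
    return 2.0 * distance * math.tan(math.radians(vfov_deg) / 2.0) * h_obj_px / img_h_px

# 150 px object in an 800 px high image, 90 degree vertical FOV, 2 m away
print(real_height(150, 800, 90.0, distance=2.0))   # -> 0.75 (metres)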
For more information on Field of view:
Understanding your camera
Understanding Focal Length and Field of View
Sensor Size and Field of View

How to measure image coincidence in an optical rangefinder

I have a couple of USB webcams (fixed focal length) setup as a simple stereoscopic rangefinder, spaced N mm apart with each rotated by M degrees towards the centerline, and I've calibrated the cameras to ensure alignment.
When adjusting the angle, how would I measure the coincidence between the images (preferably in Python/PIL/OpenCV) to know when the cameras are focused on an object? Is it as simple as choosing a section of pixels in each image (A rows by B columns) and calculating the sum of the difference between the pixels?
The problem is that you cannot assume pixel-perfect alignment of the cameras.
So let's assume the x-axis is the parallax-shifted axis and the y-axis is aligned. You need to identify the x-axis image shift to detect parallax alignment, even if the cameras are mechanically aligned as well as possible. The result of an absolute difference of individual pixels is not guaranteed to reach its minimum exactly at alignment, so instead of subtracting individual pixels, subtract the average color of the area around each pixel, with a radius/size bigger than the alignment error along the y-axis. Let's call this radius or size r; this way the resulting difference should be minimal when the images are aligned.
Approximation search
You can even speed up the process by varying r (see the sketch below):
1. Select a big r.
2. Scan the whole x-range with a step of, for example, 0.25*r.
3. Choose the x-position with the lowest difference (x0).
4. Halve r.
5. Go back to step 2, but this time scan only the range <x0-2.0*r, x0+2.0*r>.
6. Stop when r is smaller than a few pixels.
This way you can search in O(log2(n)) instead of O(n).
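A minimal sketch of that coarse-to-fine search on two grayscale frames (the parameter values, the np.roll-based shift and the use of a box blur as the "nearby area average" are assumptions):

import numpy as np
import cv2

def strip_diff(left, right, shift, r):
    # mean absolute difference between the two images after shifting the right
    # image by `shift` pixels along x, computed on blurred copies so a small
    # y misalignment (< r px) does not dominate the score
    k = 2 * r + 1
    a = cv2.blur(left, (k, k)).astype(np.float32)
    b = cv2.blur(np.roll(right, shift, axis=1), (k, k)).astype(np.float32)
    return float(np.mean(np.abs(a - b)))

def find_alignment(left, right, x_range=200, r=32):
    # coarse-to-fine search over the x shift, as in the steps above
    x0, lo, hi = 0, -x_range, x_range
    while r >= 2:                                 # stop when r is a few pixels
        step = max(1, int(0.25 * r))              # scan with step 0.25*r
        x0 = min(range(lo, hi + 1, step),
                 key=lambda s: strip_diff(left, right, s, r))
        r //= 2                                   # halve r ...
        lo, hi = x0 - 2 * r, x0 + 2 * r           # ... and narrow the x-range
    return x0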
Computer vision approach
This should be even faster:
1. Detect points of interest in both images (specific changes in gradient, etc.).
2. Cross-match the points of interest between the images.
3. Compute the average x-distance between the cross-matched points.
4. Change the parallax alignment by the distance found.
5. Go back to step 1 until the x-distance is small enough.
This way you can avoid checking the whole x-range, because the alignment distance is obtained directly... You just need to convert it to an angle, or whatever you use to align the parallax.
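A sketch of the point-matching step using ORB features (any detector/matcher would do; this specific choice is not from the original answer):

import numpy as np
import cv2

def average_x_disparity(left, right, max_matches=50):
    # detect and cross-match keypoints, then return the average x-distance
    # between the matched points (the correction to apply to the alignment)
    orb = cv2.ORB_create()
    k1, d1 = orb.detectAndCompute(left, None)
    k2, d2 = orb.detectAndCompute(right, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(d1, d2), key=lambda m: m.distance)[:max_matches]
    dx = [k1[m.queryIdx].pt[0] - k2[m.trainIdx].pt[0] for m in matches]
    return float(np.mean(dx))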
[notes]
You do not need to do this on the whole image area; just select a few horizontal lines across the images and scan their neighborhood.
There are also other ways to detect alignment. For example, at short distances the skew is a significant marker of alignment, so compare the height of an object on its left and right sides between the cameras... If they are nearly the same you are aligned; if one is bigger/smaller you are not, and you know which way to turn...
