If we find contours for an image using cv2.findContours(), the coordinates in each contour are of datatype int32.
Example: printing contours[0] gives
array([[[  0,   1]],
       [[  1,   1]],
       ...,
       [[400, 232]]])
Is there any way to find the coordinates in the contour with higher (subpixel) precision? For example:
array([[[  0.11,   1.78]],
       [[  1.56,   1.92]],
       ...,
       [[400.79, 232.35]]])
In principle, a properly sampled gray-scale image does contain the information to guess at the contour location with sub-pixel precision. But it's still a guess. I'm not aware of any existing algorithms that try to do this; I only know of algorithms that trace the contour in a binary image.
There have been a few papers published (that I’m aware of) that try to simplify the outline polygon in a way that you get a better representation of the underlying contour. These papers all make an assumption of smoothness that allows them to accomplish their task, without such an assumption there is no other information in the binary image than what is extracted by the simple pixel contour algorithm. The simplest method in this category is smoothing the polygon: move each vertex a little closer to the line formed by its two neighboring vertices.
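That simplest method can be sketched in a few lines of NumPy (a minimal illustration: the function name and the `alpha` weight are mine, and pulling each vertex toward the midpoint of its neighbours is a simplification of moving it toward the line through them):

```python
import numpy as np

def smooth_contour(contour, alpha=0.5, iterations=1):
    """Naive polygon smoothing: pull each vertex toward the midpoint
    of its two neighbours (a stand-in for moving it toward the line
    through the neighbours). Takes an OpenCV-style (N, 1, 2) contour,
    returns sub-pixel float coordinates of the same shape."""
    pts = contour.reshape(-1, 2).astype(np.float64)
    for _ in range(iterations):
        # Midpoint of each vertex's two neighbours (closed polygon).
        mid = 0.5 * (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0))
        pts = (1.0 - alpha) * pts + alpha * mid
    return pts.reshape(-1, 1, 2)

# Toy example: the corners of a square get pulled inward.
square = np.array([[[0, 0]], [[10, 0]], [[10, 10]], [[0, 10]]])
smoothed = smooth_contour(square, alpha=0.5)
```

Note this only redistributes the existing integer vertices under a smoothness assumption; it does not add any gray-level information.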
If the goal however is to get more precise measurements of the object, then there are several different approaches to use the gray-value data to significantly improve the precision of these measurements. I’m familiar with two of them:
L.J. van Vliet, “Gray-scale measurements in multi-dimensional digitized images”, PhD thesis Delft University of Technology, 1993. PDF
N. Sladoje and J. Lindblad, “High Precision Boundary Length Estimation by Utilizing Gray-Level Information”, IEEE Transactions on Pattern Analysis and Machine Intelligence 31(2):357-363, 2009. DOI
The first one does area, perimeter and local curvature in 2D (higher dimensions lead to additional measurements), based on the assumption of a properly sampled band-limited image. The latter does length, but is the start point of the “pixel coverage” model: the same authors have papers also on area and Feret diameters. This model assumes area sampling of a sharp boundary.
I have a dataset with data of varying quality: A-grade, B-grade, C-grade and D-grade, with A-grade being the best and D-grade showing the most scatter and uncertainty.
The data come from a quasi-periodic signal and, taking all the data into account, they cover just one cycle. If we only take the best data into account (A- and B-graded, in green and yellow), we don't cover the whole cycle, but we are sure we are only using the best data points.
After computing a periodogram to determine the period of the signal, both for the whole sample and for the A- and B-graded data only, I ended up with the following results: 5893 days and 4733 days respectively.
Using those values I fit the data to a sine and I plot them in the following plot:
Plot with the data
In the attached file the green points are the best ones and the red ones are the worst.
As you can see, the data only cover one cycle, and the red points (the worst ones) are crucial for covering that cycle, but their quality is lower. So I would like to know whether the curve fit is better with those points or not.
I was trying to use the R² parameter, but I have read that it only works properly for linear functions...
How can I quantify which of those fits is better?
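One standard way to compare such fits (offered here only as an illustration; the data, error bars, and function names below are hypothetical) is a weighted reduced chi-squared, which uses the per-point uncertainties that R² ignores, or an information criterion such as the AIC:

```python
import numpy as np

def reduced_chi2(y, y_model, y_err, n_params):
    """Weighted reduced chi-squared; values near 1 indicate a fit
    consistent with the quoted per-point uncertainties."""
    resid = (y - y_model) / y_err
    return np.sum(resid**2) / (len(y) - n_params)

def aic(y, y_model, y_err, n_params):
    """Gaussian-likelihood AIC (up to a constant for fixed errors);
    lower is better when comparing fits of the same data set."""
    chi2 = np.sum(((y - y_model) / y_err) ** 2)
    return chi2 + 2 * n_params

# Hypothetical data generated with a 4733-day period plus noise,
# compared against sine models using the two candidate periods.
rng = np.random.default_rng(0)
t = np.linspace(0, 5000, 40)
y_err = np.full_like(t, 0.3)
y = np.sin(2 * np.pi * t / 4733) + rng.normal(0, 0.3, t.size)
model_ab = np.sin(2 * np.pi * t / 4733)   # period from A/B data only
model_all = np.sin(2 * np.pi * t / 5893)  # period from all the data
```

Because the quoted uncertainties enter as weights, the low-grade red points automatically count for less, which is exactly the trade-off the question is about.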
Thank you
I have been trying to create custom regions for states. I want to fill the state map using the area of influence of a set of points.
The image below represents what I have been trying: the left panel shows the points, and I just want to fill all the areas as in the right panel. I have used Voronoi/Thiessen polygons, but that leaves some points outside their area, since it only uses the centroid to colour each polygon.
Is there any algorithm or process to achieve this? I am currently working in Python.
You've identified your basic problem: you used a cluster-level Voronoi algorithm, which is too simplistic for your application. You need to apply that same partitioning to the points themselves, not to each region as a single-statistic (centroid) entity.
To this end, I strongly recommend a multi-class SVM (Support Vector Machine), which will identify the largest gaps between the identified regions (classes) of points. Use a Gaussian (RBF) kernel with a fairly low complexity (small gamma) to handle non-linear boundaries. You will almost certainly get smooth curves instead of straight lines.
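A sketch of that suggestion with scikit-learn's SVC (the point coordinates, labels, and parameter values below are made up for illustration):

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical input: 2-D point coordinates, each labelled with the
# region it belongs to.
rng = np.random.default_rng(42)
pts = np.vstack([rng.normal(center, 0.5, size=(30, 2))
                 for center in ([0, 0], [3, 0], [1.5, 3])])
labels = np.repeat([0, 1, 2], 30)

# RBF (Gaussian) kernel; a small gamma gives smoother boundaries.
clf = SVC(kernel="rbf", gamma=0.5, C=10.0).fit(pts, labels)

# Classify every cell of a grid covering the map: colouring the cells
# fills the whole area, unlike assigning one colour per Voronoi polygon.
xx, yy = np.meshgrid(np.linspace(-2, 5, 200), np.linspace(-2, 5, 200))
region = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
```

Rasterising the classifier over the whole grid guarantees every point of the map ends up inside exactly one region.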
I'm working on a heatmap generation program which hopefully will fill in the colors based on value samples provided from a building layout (this is not GPS based).
If I have only a few known data points such as these in a large matrix of unknowns, how do I get the values in between interpolated in Python?:
0,0,0,0,1,0,0,0,0,0,5,0,0,0,0,9
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,0,0,2,0,0,0,0,0,0,0,0,8,0,0,0
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,8,0,0,0,0,0,0,0,6,0,0,0,0,0,0
0,0,0,0,0,3,0,0,0,0,0,0,0,0,7,0
I understand that bilinear interpolation won't do it, and a Gaussian blur will bring all the peaks down to low values due to the sheer number of surrounding zeros. This is obviously a matrix-handling proposition, and I don't need it to be Bézier-curve smooth; close enough for a graphic representation would be fine. My matrix will end up being about 1500×900 cells in size, with approximately 100 known points.
Once the values are interpolated, I have written code to convert it all to colors, no problem. It's just that right now I'm getting single colored pixels sprinkled over a black background.
Proposing a naive solution:
Step 1: interpolate and extrapolate existing data points onto surroundings.
This can be done using "wave propagation" type algorithm.
The known points "spread out" their values onto their surroundings until the whole grid is "flooded" with known values. At the end of this stage you have a number of intersecting "disks", and no zeroes left.
Step 2: smooth the result (using bilinear filtering or some other filter).
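The two steps can be sketched with scipy.ndimage (the function name is mine; a Euclidean distance transform stands in for the "wave propagation", and a Gaussian filter for the smoothing):

```python
import numpy as np
from scipy import ndimage

def flood_and_smooth(grid, sigma=3.0):
    """Step 1: every empty cell takes the value of its nearest known
    point (nearest-neighbour 'flooding' via the distance transform).
    Step 2: Gaussian smoothing blends the resulting flat 'disks'."""
    known = grid != 0
    # For each cell, indices of the nearest cell with a known value.
    _, idx = ndimage.distance_transform_edt(~known, return_indices=True)
    flooded = grid[tuple(idx)].astype(float)
    return ndimage.gaussian_filter(flooded, sigma=sigma)

# Toy grid: two known samples in a field of zeroes.
grid = np.zeros((6, 16))
grid[0, 4], grid[4, 9] = 1.0, 6.0
heat = flood_and_smooth(grid, sigma=2.0)
```

The result is piecewise-nearest-neighbour before smoothing, so the peaks keep their values instead of being averaged down by the surrounding zeros.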
If you are able to use SciPy, then interp2d does exactly what you want. A possible problem with it is that it seems not to extrapolate smoothly, according to this issue. This means that all values near the walls will be the same as their closest neighbour points. This can be solved by putting thermometers in all 4 corners :)
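For reference, a sketch with scipy.interpolate.griddata, which handles scattered samples directly (interp2d has since been deprecated and removed from SciPy). The sample coordinates below reproduce the matrix from the question:

```python
import numpy as np
from scipy.interpolate import griddata

# Known samples from the question's 6x16 matrix: (row, col) -> value.
samples = {(0, 4): 1, (0, 10): 5, (0, 15): 9, (2, 3): 2, (2, 12): 8,
           (4, 1): 8, (4, 9): 6, (5, 5): 3, (5, 14): 7}
rows, cols = map(np.array, zip(*samples))
values = np.array(list(samples.values()), dtype=float)

# Evaluate on every cell of the grid.
rr, cc = np.mgrid[0:6, 0:16]
lin = griddata((rows, cols), values, (rr, cc), method="linear")
# Linear interpolation is NaN outside the convex hull of the samples;
# fall back to nearest-neighbour values there (the "wall" behaviour
# mentioned above).
near = griddata((rows, cols), values, (rr, cc), method="nearest")
heat = np.where(np.isnan(lin), near, lin)
```

The same call scales fine to the 1500×900 grid with ~100 known points from the question.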
So, I want to detect cars in video recorded by a driving recorder (dashcam). I've read and researched quite a lot but still don't quite get it. I'm thinking of using a HOG descriptor with a linear SVM. But in what way can it be improved, to make it easier to implement and more robust? This will be a kind of research project for me.
I am thinking of combining another technique/algorithm with the HOG, but I'm still kind of lost. I am quite new to this.
Any help is greatly appreciated. I am also open to other better ideas.
HOG (histogram of oriented gradients) is merely a certain type of feature vector that can be computed from your data. You compute the gradient vector at each pixel in your image and then you divide up the possible angles into a discrete number of bins. Within a given image sub-region, you add the total magnitude of the gradient pointing in a given direction as the entry for the relevant angular bin containing that direction.
This leaves you with a vector that has a length equal to the number of bins you've chosen for dividing up the range of angles and acts as an unnormalized histogram.
If you want to compute other image features for the same sub-region, such as the sum of the pixels, some measurement of sharp angles or lines, aspects of the color distribution, or so forth, you can compute as many or as few as you would like, arrange them into a long vector as well, and simply concatenate that feature vector with the HOG vector.
You may also want to repeat the computation of the HOG vector at several different scale levels to help capture some scale variability, concatenating each scale-specific HOG vector onto the overall feature vector. There are also feature concepts, such as SIFT, that are designed to account for scale invariance automatically.
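The histogram computation described above can be sketched directly in NumPy (the function name and toy patch are mine; production HOG implementations, e.g. skimage.feature.hog, add cell/block normalization on top of this):

```python
import numpy as np

def hog_cell(patch, n_bins=9):
    """Unnormalized orientation histogram for one image sub-region:
    each pixel's gradient magnitude is added to the angular bin
    containing its gradient direction."""
    gy, gx = np.gradient(patch.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi  # unsigned orientation in [0, pi)
    hist, _ = np.histogram(ang, bins=n_bins, range=(0, np.pi), weights=mag)
    return hist

# Concatenate the HOG vector with other per-region features.
patch = np.outer(np.arange(8), np.ones(8))    # toy patch: vertical ramp
extra = np.array([patch.sum(), patch.std()])  # e.g. simple intensity features
feature_vec = np.concatenate([hog_cell(patch), extra])
```

For the ramp patch every gradient points the same way, so all of the magnitude lands in a single bin; a textured patch would spread it across bins.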
You may need to do some normalization or scaling, which you can read about in any standard SVM guide. The standard LIBSVM guide is a great place to start.
You will have to be careful to organize your feature vector correctly: since it will likely have a very large number of components, you must ensure they are always computed and placed in the same order, and that they undergo exactly the same scaling or normalization treatment.
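A minimal sketch of keeping that treatment consistent (the function names are mine): record the per-component range on the training features once, then reuse exactly those statistics for every later vector:

```python
import numpy as np

def fit_scaler(train_features):
    """Record per-component min/max on the training set, so the
    identical transform can be replayed at test time."""
    return train_features.min(axis=0), train_features.max(axis=0)

def apply_scaler(features, lo, hi):
    """Scale each component to [0, 1] (one of the ranges the LIBSVM
    guide recommends) using the stored training statistics."""
    span = np.where(hi > lo, hi - lo, 1.0)  # guard constant components
    return (features - lo) / span

# Toy example: two training vectors, one test vector.
train = np.array([[0.0, 10.0], [4.0, 30.0]])
lo, hi = fit_scaler(train)
scaled = apply_scaler(np.array([[2.0, 20.0]]), lo, hi)
```

Applying `fit_scaler` separately to the test set is the classic mistake here: the same `lo`/`hi` must be reused, or the components silently shift meaning between training and prediction.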