Main problem
In Python, I have triangulated the surface of a (unit) sphere using an icosahedral mesh. I have a list simplices of tuples containing the indices of the three vertices of each triangle, and I have two lists describing the coordinates (in radians) of each vertex: its latitude and longitude.
For about a million points, I want to determine which triangle each point is in. I am looking for an efficient algorithm that returns the list indices of each triangle (indices corresponding to list simplices).
I am willing to trade memory for speed, so I am fine with constructing a tree or using some lookup method.
Caveats
The triangles are of roughly equal size, but not exactly, so I suspect that a simple nearest-neighbor KDTree implementation is not exact.
Extra information
The icosahedral mesh has been obtained using the stripy package. It projects the vertices of the icosahedron onto the unit sphere and subsequently bisects the triangles, so that each edge is split in half, or conversely, each triangle is split in four. stripy has a built-in method for calculating the triangle a point is contained in, but for a mesh refinement of 6 (i.e. 6 bisections) and about a million points, this takes hours. I suspect that this method does not make use of a tree/lookup method and I hope there is a method that significantly improves on this.
Compute a latitude/longitude bounding box for each triangle. Remember that the largest-magnitude latitudes may be on an edge (easily found by considering the normal to the great circle including each edge) or (if the pole is enclosed) in the interior.
Divide all triangles that cross the periodic longitude boundary in two—or, to be cheap, just their bounding boxes.
Build an extended-object k-d tree over the triangles (and triangle pieces from above). This still uses only the latitude/longitude values.
Run the obvious recursive, conservative containment search to find candidate triangles. (It doesn’t matter which piece of the divided triangles you find.)
Test carefully for triangle inclusion: for each triangle side, judge which hemisphere (defined by the great circle containing the segment) contains the query point in a fashion (probably just a cross product on the three-dimensional vectors) that doesn’t depend on the order in which the vertices are presented and never produces “on the dividing line”. Then every point is guaranteed to be in exactly one triangle.
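The final inclusion test can be sketched as follows. This is a minimal illustration, assuming NumPy and unit vectors; the function names `lonlat_to_xyz` and `in_spherical_triangle` are hypothetical, not part of stripy. It compares signs of triple products so the result does not depend on vertex order, but it uses a plain `>=` comparison and so does not implement the tie-breaking needed to guarantee that a point exactly on an edge lands in exactly one triangle.

```python
import numpy as np

def lonlat_to_xyz(lon, lat):
    """Convert longitude/latitude in radians to a 3D unit vector."""
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def in_spherical_triangle(p, a, b, c):
    """Return True if unit vector p lies in the spherical triangle (a, b, c).

    Each side defines a great circle; p must be on the same side of every
    great circle as the opposite vertex. Multiplying by the orientation
    sign makes the test independent of the vertex order.
    """
    orientation = np.sign(np.dot(np.cross(a, b), c))
    sides = (np.dot(np.cross(a, b), p),
             np.dot(np.cross(b, c), p),
             np.dot(np.cross(c, a), p))
    return all(orientation * s >= 0 for s in sides)

# Triangle covering one octant of the sphere.
a = lonlat_to_xyz(0.0, 0.0)
b = lonlat_to_xyz(np.pi / 2, 0.0)
c = lonlat_to_xyz(0.0, np.pi / 2)
inside = lonlat_to_xyz(np.pi / 4, np.pi / 6)
outside = lonlat_to_xyz(np.pi, 0.0)
```

In practice this test would only be run on the few candidate triangles returned by the tree search, so its cost per query point stays small.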
Related
I have the coordinates of the outline of a (convex) quadrilateral. These are stored as a list in the form of outline = [(x1,y1), (x2,y2), ..., (xn,yn)], though this can be changed/modified in any way convenient. I want to find the vertices of the quadrilateral.
Image of plotted outline
So far, I've considered using linear programming to find the coordinates. However, because of the possible granularity, this wouldn't work. Also, it would probably require heuristics to implement, which I want to avoid for robustness.
Obviously, one can find two coordinates by doing the highest and lowest y-value, but from there, I'm rather stuck at where to go next.
How can I get the coordinates of the corners?
Note - I've tagged this python, as that's the language that I'm using for my project, though a description of an algorithmic approach would be a much-appreciated answer as well.
If the data points have exact coordinates (they lie exactly on the quadrilateral's sides), you can start from an extremal point (the topmost, for example), sort the other points by angle, choose the smallest angle, and take the farthest point at that angle as the next vertex. Then repeat from the new vertex, and so on. This is like building a convex hull with the gift-wrapping algorithm.
If the point positions are not perfect, consider approximating the sides with straight lines using the Hough transform.
Some libraries, such as OpenCV, contain Hough transform implementations as well as convex hull implementations.
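For the exact-coordinate case, the gift-wrapping idea can be sketched in plain Python. This is only an illustration under the assumption that every point lies exactly on a side of the convex outline; the function names are hypothetical. Among collinear candidates it keeps the farthest one, so points in the middle of a side are skipped and only corners are returned.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive if b is left of o->a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def gift_wrap_corners(points):
    """Return the corners of the convex outline in counter-clockwise order."""
    pts = list(set(points))
    start = min(pts, key=lambda p: (p[1], p[0]))  # lowest, then leftmost
    corners = [start]
    current = start
    while True:
        candidate = None
        for p in pts:
            if p == current:
                continue
            if candidate is None:
                candidate = p
                continue
            turn = cross(current, candidate, p)
            # Prefer points to the right of the current candidate direction;
            # among collinear points, prefer the farthest one.
            if turn < 0 or (turn == 0 and
                            dist2(current, p) > dist2(current, candidate)):
                candidate = p
        if candidate == start:
            break
        corners.append(candidate)
        current = candidate
    return corners

# Outline of a 4 x 3 rectangle sampled at integer points.
outline = ([(x, 0) for x in range(5)] + [(4, y) for y in range(4)] +
           [(x, 3) for x in range(5)] + [(0, y) for y in range(4)])
corners = gift_wrap_corners(outline)
```

With noisy data the exact `turn == 0` comparison no longer holds, which is where the line-fitting approach becomes the better fit.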
I’m trying to write an algorithm in Python that automatically creates polygons (perhaps clusters of hexagons) that receive the same number of “fall to ground” orders in a given amount of time. I have a fairly large amount of geodata for each order: DATE, latitude, longitude, product ordered.
I need it to assign proper coverage to stores, managing the average number of orders per store according to given properties.
So I need to be able to define polygons with an equal probability of being the location of an e-commerce order, and later be able to cluster them around a store and merge them into a single polygon.
I have no clue how to set up the process.
Do you have any suggestions? Thanks.
Generate a hexagonal grid, clip the grid to the research area, and aggregate the point data per hexagon using a spatial join (with a spatial index) or your own custom metric.
To achieve this you can use the h3 and geopandas Python libraries, or GIS software.
Result like this:
Advantages of using the Hexagonal grid:
Finding neighbors is easier with a hexagonal grid. In a rectangular grid, a cell has four neighbors at distance N and four diagonal neighbors at distance N√2; in a hexagonal grid, all six neighbors are at the same distance.
Hexagons reduce the sampling bias caused by boundary effects of the grid shape, which is related to the low ratio of the perimeter to the area of the hexagon (a circle would be best but cannot tile the plane).
Over a large area, a hexagonal grid will be less distorted by the curvature of the earth than a rectangular grid.
When comparing polygons with equal areas, any point inside a hexagon is closer to the hexagon's center than a point in a square or triangle of equal area can be (this is because the square and triangle have more acute angles than the hexagon).
And more benefits…
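The aggregation step described at the start of this answer can be illustrated without any GIS library. The sketch below does not use h3 or geopandas; it swaps in plain hexagonal binning on planar coordinates (pointy-top hexagons, axial coordinates with cube rounding), and the function names and the `size` parameter are hypothetical. It assumes the order locations have already been projected to a planar coordinate system, which raw latitude/longitude values are not.

```python
from collections import Counter
from math import sqrt

def _cube_round(q, r):
    """Round fractional axial coordinates to the nearest hexagon."""
    x, z = q, r
    y = -x - z
    rx, ry, rz = round(x), round(y), round(z)
    dx, dy, dz = abs(rx - x), abs(ry - y), abs(rz - z)
    # Recompute the coordinate with the largest rounding error so that
    # the cube constraint x + y + z == 0 still holds.
    if dx > dy and dx > dz:
        rx = -ry - rz
    elif dy > dz:
        ry = -rx - rz
    else:
        rz = -rx - ry
    return rx, rz

def hex_cell(x, y, size):
    """Axial (q, r) index of the pointy-top hexagon containing (x, y).

    `size` is the distance from a hexagon's center to its corners.
    """
    q = (sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return _cube_round(q, r)

def count_orders_per_hexagon(points, size):
    """Count how many points fall into each hexagon."""
    return Counter(hex_cell(x, y, size) for x, y in points)

orders = [(0.0, 0.0), (0.1, 0.1), (sqrt(3), 0.0)]
counts = count_orders_per_hexagon(orders, size=1.0)
```

Hexagons with similar counts can then be grouped around each store; a library such as h3 handles the projection and indexing on the globe, which this planar sketch ignores.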
When looking for the best place for business, it is also a good idea to do a neighborhood analysis. Result like this:
I have a set of 3D points in a txt file in the form (x,y,z), as shown in figure 1. I want to specify the boundaries of these points as in figure 2, such that any new points added outside the boundaries (the blue points) are deleted, and any inside the boundaries (the green ones) are kept. How can I achieve that in Python? I tried a convex hull, but it only gives me the boundary points!
The real data can be found here, I used figures for simplification. https://drive.google.com/file/d/1ei9NaJHN922pYItK2CRIXyLfwqm_xgrt/view?usp=sharing
Figure 1
Figure 2
For 2D points, you can apply the test as described in Wikipedia:
One simple way of finding whether the point is inside or outside a simple polygon is to test how many times a ray, starting from the point and going in any fixed direction, intersects the edges of the polygon. If the point is on the outside of the polygon the ray will intersect its edge an even number of times. If the point is on the inside of the polygon then it will intersect the edge an odd number of times. The status of a point on the edge of the polygon depends on the details of the ray intersection algorithm.
The n-dimensional case involves a convex hull test and requires linear programming techniques as described here.
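For the 2D case, the ray-casting test quoted above can be sketched in a few lines of plain Python. This is a minimal version, assuming a simple polygon given as a list of (x, y) vertices; the function name is hypothetical, and, as the quotation notes, the result for points exactly on an edge depends on the details of the implementation.

```python
def point_in_polygon(point, polygon):
    """Ray casting: count crossings of a horizontal ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y-coordinate can cross it.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside  # odd number of crossings means inside
    return inside

boundary = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]  # non-convex outline
new_points = [(1, 1), (2, 3), (5, 1), (3, 2)]
kept = [p for p in new_points if point_in_polygon(p, boundary)]
```

Unlike a convex hull test, this works for non-convex boundaries, as the notch at (2, 2) in the example shows.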
I split the world into X random polygons.
Then I am given a coordinate C1, for instance (-21.45, 7.10), and I want to attribute the right polygon to this coordinate.
The first solution is to apply my ‘point_in_polygon’ algorithm (given a set of coordinates that defines a polygon and a coordinate that defines a point, tell me if the point is inside or not) on each polygon until I find the right one.
But that is very expensive if I have a lot of points to put in a lot of polygons.
An improvement on that relies on the following idea:
To optimise the search, I create a grid (a collection) with a step n, k where I already attribute each pair of coordinates such that:
for i = -180 to 180 step n
    for j = -90 to 90 step k
        grid.add(i, j)
Then I create a dictionary, and for each pair in the collection I find the corresponding polygon
For each g in grid
    For each p in polygons
        If point_in_polygon(g, p) == True
            my_dict(g) = p
Then, when I receive C1, I look for the closest coordinate in my grid, let’s say g1.
Thanks to my_dict, I can quickly get p1 = my_dict(g1).
Then I compute point_in_polygon(C1, p1), which is likely to be true. If it’s not, I find the closest g that is assigned to a different polygon and redo the test, and so on until I have found the right polygon.
Now, the question is: what is the optimal n, k to create the grid?
So that I can find the right polygon in the minimum number of steps.
I don’t want it too low, because the search of the closest g which is assigned to a different polygon might be expensive.
I don’t want it too high as well, because then I might be missing some polygons and then the search never converges.
My intuition is that the smallest polygon is going to give the steps.
I am not sure if this is a programming problem, a maths problem, or just something I can find empirically, that's why I ask it here.
Any inputs appreciated!
Let me suggest a slight modification to your grid. Currently, you store for each cell the polygon that the cell's center belongs to. Instead, store all the polygons that overlap the cell. Then, whenever you see that a cell has only a single overlapping polygon, you don't need to do any inclusion testing. The grid can be built by methods of conservative rasterization (note that the referenced article is not focused on conservative but rather general rasterization).
The efficiency of your grid correlates with the ratio of single-polygon cells and total cells (because this is the probability of not having to perform polygon-inclusion tests). The storage itself is pretty cheap. You can use a dense array and get constant access to the cells. Hence, from a theoretical point of view, you should have as many cells as possible (because as you have more cells, the single-polygon cell ratio increases). In practice, you might find that cache and other memory effects might make large grids impractical. However, there is no good way to know other than test. So, just try with a couple of sizes on a few different machines and try to find a good fit.
If I had to guess, I would say that your cells should be square and have an area of about 1% - 5% of the average polygon area. Also, more compact polygons can be handled more efficiently than many long and thin polygons.
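The modified grid can be sketched as follows. This is an illustration under stated assumptions, not a tuned implementation: polygon bounding boxes stand in for true conservative rasterization (they may list more polygons per cell than necessary, never fewer), the polygons are assumed to tile the whole domain without gaps, and the names `build_grid`, `locate`, and `cell` are hypothetical.

```python
from collections import defaultdict
from math import floor

def point_in_polygon(point, polygon):
    """Ray-casting inclusion test for a simple polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def build_grid(polygons, cell):
    """Map each square cell (i, j) to the polygons that may overlap it."""
    grid = defaultdict(list)
    for idx, poly in enumerate(polygons):
        xs = [p[0] for p in poly]
        ys = [p[1] for p in poly]
        for i in range(floor(min(xs) / cell), floor(max(xs) / cell) + 1):
            for j in range(floor(min(ys) / cell), floor(max(ys) / cell) + 1):
                grid[(i, j)].append(idx)
    return grid

def locate(point, polygons, grid, cell):
    """Return the index of the polygon containing point, or None."""
    key = (floor(point[0] / cell), floor(point[1] / cell))
    candidates = grid.get(key, [])
    if len(candidates) == 1:
        return candidates[0]  # single-polygon cell: no inclusion test needed
    for idx in candidates:
        if point_in_polygon(point, polygons[idx]):
            return idx
    return None

polygons = [[(0, 0), (2, 0), (2, 2), (0, 2)],
            [(2, 0), (4, 0), (4, 2), (2, 2)]]
grid = build_grid(polygons, cell=1)
```

A dictionary is used here for brevity; the dense array suggested above would give the same lookups with better cache behavior.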
Pick any point and draw a line straight down from that point. The first polygon edge you hit tells you what polygon the point is in.
So, if you don't want to do polygon tests, then instead of dividing the space into a regular grid, first cut it into strips with vertical cuts that go through all polygon intersections.
Now, within each strip none of the polygon edges cross or end, so you can make an ordered list of all those edges from bottom to top.
If you want to find the polygon that contains a point, then, do a binary search using the x coordinate to find the proper strip. Then in the list of edges that span the strip, you can do a binary search using the y coordinate to find the closest one underneath the point, and that tells you what polygon the point is in.
Google 'trapezoidal decomposition' to find lots of information about similar techniques.
For my project I use 2D images from a telescope. The outer border of each image is known to be oversaturated with points due to a telescope malfunction. Therefore I want to extract the points that make up the outer border of the 2D image.
So what I want to do is somehow extract the points that make up the outer shell, with a shell width that I can choose.
What I have tried so far:
In Python I have tried finding the points that make up the edge by using scipy.spatial.ConvexHull to find the outer points and then removing these points. When doing this in a loop, it should remove an outer edge whose width depends on the number of iterations. However, this method depends on the point density, and removes a narrower band in places on the edge where the density is large. What I want is for an outer edge of roughly equal width to be removed around the whole image, see the images below:
To show what I mean, I have added the ConvexHull result; the points it gives as outer-edge points after 15 iterations are shown in red:
For clarification, this is the desired result I would like my algorithm to give me: an outer edge with equal width over the whole image, independent of point density.
Since you showed only ideas and graphics without code, I will do the same.
I see several ways to get the smaller polygon within your convex hull with a near-constant width between them. There are also variations on each. I illustrate with a convex hull that is a simplified version of the one in your graphics. Each of my solutions ignores the majority of points in the problem and uses only the vertices of the convex hull, so the "point density" is ignored.
Before choosing a polygon, you could find the "center point" of your convex hull. There are multiple ways to define this. You could use the centroid of the vertices of the hull, where the x- and y-coordinates are the averages of the coordinates of the vertices, but this biases toward parts of the hull with many small segments. You could use the center of the bounding rectangle, where the x- and y-coordinates are the average of the maximum and the minimum coordinates of the hull's vertices. This is the approach I used in my graphics. There are other possible "center points."
My first inner polygon moves each vertex a proportional distance toward the center point. In my example, I moved each point one-fourth of the distance toward the center point.
My second inner polygon moves the vertices a fixed distance toward the center point. I chose a distance one-fourth of the average distance of the vertices from the center point. Note that for this particular example there is very little difference between this polygon and my previous one. The differences would be more obvious for a hull where some points are much closer to the center point than others.
My third polygon abandons the center point. It moves each side of the hull a fixed distance toward the inside of the hull. The intersections of these new segments are used to define the new polygon. In other words, I did "inward polygon offsetting" or "polygon buffering." This is a non-trivial task in computational geometry, but some discussion on this task and similar tasks can be found at this SO question. This does look different from the other polygons, since the smaller sides of the hull tend to shrink or completely disappear from the result.
Choose whichever polygon suits your needs--the first two are easier to compute than the third, but the third comes closest to your ideal of "equal width of outer edge."
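Since the answer above gives no code, here is a hedged sketch of the first and third polygons in plain Python. The function names are hypothetical. The offsetting function assumes a convex polygon with vertices in counter-clockwise order and an offset small enough that no side disappears; it does not handle the vanishing short sides mentioned above, for which a full polygon-buffering routine is needed.

```python
from math import hypot

def shrink_toward_center(vertices, fraction):
    """Move each vertex a fraction of the way toward the bounding-box center."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    cx, cy = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    return [(x + fraction * (cx - x), y + fraction * (cy - y))
            for x, y in vertices]

def offset_convex_polygon(vertices, distance):
    """Shift every side inward by `distance` and intersect adjacent sides."""
    n = len(vertices)
    lines = []  # one (point on shifted side, direction) pair per side
    for i in range(n):
        (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
        dx, dy = x2 - x1, y2 - y1
        length = hypot(dx, dy)
        nx, ny = -dy / length, dx / length  # inward normal for CCW order
        lines.append(((x1 + distance * nx, y1 + distance * ny), (dx, dy)))
    result = []
    for i in range(n):
        (p1, d1), (p2, d2) = lines[i - 1], lines[i]
        denom = d1[0] * d2[1] - d1[1] * d2[0]
        t = ((p2[0] - p1[0]) * d2[1] - (p2[1] - p1[1]) * d2[0]) / denom
        result.append((p1[0] + t * d1[0], p1[1] + t * d1[1]))
    return result

hull = [(0, 0), (4, 0), (4, 4), (0, 4)]
proportional = shrink_toward_center(hull, 0.25)
buffered = offset_convex_polygon(hull, 1.0)
```

Points between the hull and the inner polygon can then be selected with any point-in-polygon test, which makes the removed band independent of point density.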