Geo polygon generation in Python for e-commerce coverage definition - python

I’m trying to write an algorithm in Python to automatically create polygons (clusters of hexagons, maybe) that receive the same number of orders in a given amount of time (I have a fairly large amount of geo data about orders: DATE, latitude, longitude, product ordered).
I need it to assign proper coverage to stores, managing the average number of orders per store according to given properties.
So I need to be able to define polygons with equal probability of being the location of an e-commerce order, and later be able to cluster them around a store and merge them into a single polygon.
I have no clue how to set up the process.
Do you have any suggestions? Thanks

Generate a hexagonal grid, clip the grid to the study area, and aggregate the point data over the hexagonal grid using a spatial join / spatial index (or your own custom metric).
To achieve this you can use the h3 and geopandas Python libraries, or GIS software.
Advantages of using a hexagonal grid:
- Finding neighbors is easier with a hexagonal grid. In a rectangular grid a cell has 4 edge neighbors at distance N and 4 diagonal neighbors at distance N√2, whereas in a hexagonal grid the distances to all six neighbors are equal.
- Hexagons reduce the sampling bias caused by boundary effects of the grid shape, which is related to the hexagon's low perimeter-to-area ratio (a circle would be best but cannot tile the plane).
- Over a large area, a hexagonal grid is less distorted by the curvature of the Earth than a rectangular grid.
- When comparing polygons of equal area, every point inside a hexagon is closer to the hexagon's center than a point inside a square or triangle of equal area is to its center (this is due to the more acute angles of the square and triangle compared with the hexagon).
And more benefits…
When looking for the best place for a business, it is also a good idea to do a neighborhood analysis.
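A minimal sketch of that workflow, assuming the h3 v3 Python API (geo_to_h3 / h3_to_geo_boundary; h3 v4 renames these to latlng_to_cell / cell_to_boundary) and a recent geopandas with sjoin_nearest. The file name, column names, resolution and store locations are illustrative, not taken from the question:

    import pandas as pd
    import geopandas as gpd
    import h3
    from shapely.geometry import Polygon

    orders = pd.read_csv("orders.csv")  # columns: DATE, latitude, longitude, product

    RESOLUTION = 8  # ~0.7 km^2 hexagons; tune so cells hold a workable number of orders

    # 1. Assign every order to the H3 hexagon that contains it.
    orders["hex_id"] = orders.apply(
        lambda r: h3.geo_to_h3(r["latitude"], r["longitude"], RESOLUTION), axis=1
    )

    # 2. Aggregate: number of orders per hexagon.
    counts = orders.groupby("hex_id").size().rename("n_orders").reset_index()

    # 3. Turn the hexagon ids into polygons for mapping / spatial joins.
    def hex_to_polygon(hex_id):
        return Polygon(h3.h3_to_geo_boundary(hex_id, geo_json=True))  # (lng, lat) ring

    hex_gdf = gpd.GeoDataFrame(
        counts, geometry=counts["hex_id"].map(hex_to_polygon), crs="EPSG:4326"
    )

    # 4. Cluster hexagons around stores (here simply by nearest store) and merge
    #    each cluster into one coverage polygon. Reproject to a metric CRS first
    #    if you need accurate distances.
    stores = gpd.GeoDataFrame(
        {"store_id": ["A", "B"]},
        geometry=gpd.points_from_xy([9.19, 9.23], [45.46, 45.48]),
        crs="EPSG:4326",
    )
    hex_gdf = gpd.sjoin_nearest(hex_gdf, stores).drop(columns="index_right")
    coverage = hex_gdf[["store_id", "n_orders", "geometry"]].dissolve(
        by="store_id", aggfunc="sum"
    )

From there you can rebalance which hexagons belong to which store until the summed n_orders per store matches your target load.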

Related

How can I calculate the projected area of a distorted 3D closed polygon?

I'm trying to find the maximum projected area of a distorted 3D closed polygon.
This is the original polygon, which is regular.
I distorted this polygon by randomly choosing two points and bending the structure of the polygon around the line that connects them.
This is the distorted polygon.
What I want to do is calculate the projected area of this polygon.
I initially thought this could be calculated like other self-intersecting or overlapping polygons, but it turned out that it can't.
For example, the two different shapes above have the same topology when projected.
However, the areas that have to be taken into account are different.
Therefore it is necessary to take the 3D topology into account, and this is where I got stuck.
How can I handle the complexity of this problem?
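Not a full answer, but one way to make the projected area well-defined, assuming the bent polygon is represented by a triangulation of its surface (for example the triangles produced when bending it): project every triangle onto the viewing plane and take the area of the union of the projected triangles, so that overlapping regions are counted only once. A sketch with shapely, under that assumption:

    import numpy as np
    from shapely.geometry import Polygon
    from shapely.ops import unary_union

    def projected_area(triangles_3d, drop_axis=2):
        """triangles_3d: iterable of (3, 3) arrays of xyz vertices.
        drop_axis: which coordinate to discard when projecting (2 = onto the xy plane)."""
        keep = [i for i in range(3) if i != drop_axis]
        flat = [Polygon(np.asarray(t)[:, keep]) for t in triangles_3d]
        flat = [p for p in flat if p.area > 0]      # drop triangles seen edge-on
        return unary_union(flat).area               # union counts overlaps once

The maximum over viewing directions could then be approximated by rotating the triangles (or sampling projection directions) and keeping the largest value.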

How to determine if 3d point lie in a certain volume or not?

I have a set of 3D points in a txt file in the form (x, y, z), as shown in Figure 1. I want to define the boundary of these points, as in Figure 2, so that any new points added outside the boundary are deleted (like the blue points), while points inside the boundary (like the green ones) are kept. How can I achieve that in Python? I tried a convex hull, but it only gives me the boundary points!
The real data can be found here; I used the figures for simplification. https://drive.google.com/file/d/1ei9NaJHN922pYItK2CRIXyLfwqm_xgrt/view?usp=sharing
Figure 1
Figure 2
For 2D points, you can apply the test as described in Wikipedia:
One simple way of finding whether the point is inside or outside a simple polygon is to test how many times a ray, starting from the point and going in any fixed direction, intersects the edges of the polygon. If the point is on the outside of the polygon the ray will intersect its edge an even number of times. If the point is on the inside of the polygon then it will intersect the edge an odd number of times. The status of a point on the edge of the polygon depends on the details of the ray intersection algorithm.
The n-dimensional case involves a convex hull test and requires linear programming techniques as described here.
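A sketch of that n-dimensional membership test, delegating the work to scipy rather than hand-rolled linear programming: a point lies inside (or on) the convex hull of the sample if and only if it falls within some simplex of the sample's Delaunay triangulation. The file name is illustrative, and the file is assumed to contain plain comma-separated x,y,z rows:

    import numpy as np
    from scipy.spatial import Delaunay

    cloud = np.loadtxt("points.txt", delimiter=",")   # shape (n, 3)
    tri = Delaunay(cloud)

    def inside_hull(points):
        """points: (m, 3) array; returns a boolean mask, True where inside the hull."""
        return tri.find_simplex(points) >= 0          # find_simplex returns -1 when outside

    new_pts = np.array([[0.1, 0.2, 0.3],      # inside -> kept (green)
                        [99.0, 99.0, 99.0]])  # outside -> deleted (blue)
    kept = new_pts[inside_hull(new_pts)]

Note that this handles a convex boundary only; if the region in Figure 2 is concave, you would need a concave hull / alpha shape instead.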

How to divide a polygon into tiny polygons of a particular size?

I would like to divide/cut an irregular polygon into small polygons of a particular size (1.6 m x 1 m) in such a way that most of the irregular polygon's area is utilized (an OPTIMIZATION MODEL).
The length and width can be interchanged (either 1.6 m x 1 m or 1 m x 1.6 m).
So, in the end, I need to have as many polygons of size 1.6 m x 1 m as possible.
You may consider it a packing problem: I need to pack as many rectangles of size 1.6 m x 1 m as possible inside a polygon. The rectangles can be translated and rotated but must not intersect each other.
I used the "Create Grid" feature, but it just cuts the whole polygon in one particular fashion.
What I also want is that here, a blue polygon can also be cut vertically (1 m x 1.6 m).
So I would like to know whether there is a plugin for this in QGIS/ArcGIS, or a Python script for this kind of polygon optimization.
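The snippet below is not a real optimizer, only a baseline sketch with shapely: lay a regular grid of 1.6 m x 1 m cells over the polygon in both orientations and keep whichever orientation fits more fully contained rectangles. The example polygon is illustrative and assumed to be in a metric CRS:

    import numpy as np
    from shapely.geometry import Polygon, box

    poly = Polygon([(0, 0), (10, 0), (12, 6), (3, 9)])   # irregular polygon, metres

    def grid_fill(poly, w, h):
        """Axis-aligned w x h cells starting at the polygon's lower-left bound."""
        minx, miny, maxx, maxy = poly.bounds
        cells = []
        for x in np.arange(minx, maxx, w):
            for y in np.arange(miny, maxy, h):
                cell = box(x, y, x + w, y + h)
                if poly.contains(cell):     # keep only rectangles fully inside
                    cells.append(cell)
        return cells

    landscape = grid_fill(poly, 1.6, 1.0)   # 1.6 m x 1 m
    portrait = grid_fill(poly, 1.0, 1.6)    # 1 m x 1.6 m
    best = max(landscape, portrait, key=len)
    print(len(best), "rectangles placed")

A genuine optimization that mixes orientations or allows arbitrary rotation is a rectangle-packing problem and would need an ILP solver or packing heuristics; the grid above is only a starting point.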

Find triangle containing point in spherical triangular mesh (Python, spherical coordinates)

Main problem
In Python, I have triangulated the surface of a (unit) sphere using an icosahedral mesh. I have a list simplices of tuples containing the indices of the three vertices of each triangle, and two lists describing the coordinates (in radians) of each vertex: its latitude and longitude.
For about a million points, I want to determine which triangle each point is in. I am looking for an efficient algorithm that returns the list indices of each triangle (indices corresponding to list simplices).
I am willing to sacrifice memory over efficiency, so I am fine with constructing a tree or using some lookup method.
Caveats
The triangles are of roughly equal size, but not exactly, so I suspect that a simple nearest-neighbor KDTree implementation is not exact.
Extra information
The icosahedral mesh has been obtained using the stripy package. It projects the vertices of the icosahedron onto the unit sphere and subsequently bisects the triangles, so that each edge is split in half, or conversely, each triangle is split in four. stripy has a built-in method for calculating the triangle a point is contained in, but for a mesh refinement of 6 (i.e. 6 bisections) and about a million points, this takes hours. I suspect that this method does not make use of a tree/lookup method and I hope there is a method that significantly improves on this.
Compute a latitude/longitude bounding box for each triangle. Remember that the largest-magnitude latitudes may be on an edge (easily found by considering the normal to the great circle including each edge) or (if the pole is enclosed) in the interior.
Divide all triangles that cross the periodic longitude boundary in two—or, to be cheap, just their bounding boxes.
Build an extended-object k-d tree over the triangles (and triangle pieces from above). This still uses only the latitude/longitude values.
Run the obvious recursive, conservative containment search to find candidate triangles. (It doesn’t matter which piece of the divided triangles you find.)
Test carefully for triangle inclusion: for each triangle side, judge which hemisphere (defined by the great circle containing the segment) contains the query point in a fashion (probably just a cross product on the three-dimensional vectors) that doesn’t depend on the order in which the vertices are presented and never produces “on the dividing line”. Then every point is guaranteed to be in exactly one triangle.
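A sketch of that final inclusion test, assuming all inputs are 3D unit vectors (convert each latitude/longitude vertex first). The point is inside the spherical triangle (a, b, c) exactly when it lies on the same side of each edge's great circle as the triangle interior; reversing the vertex order flips all three signs, so only agreement of the signs is required:

    import numpy as np

    def latlon_to_xyz(lat, lon):
        """Latitude/longitude in radians -> unit vector on the sphere."""
        return np.array([np.cos(lat) * np.cos(lon),
                         np.cos(lat) * np.sin(lon),
                         np.sin(lat)])

    def in_spherical_triangle(p, a, b, c):
        s1 = np.dot(np.cross(a, b), p)   # side of the great circle through a and b
        s2 = np.dot(np.cross(b, c), p)
        s3 = np.dot(np.cross(c, a), p)
        # All non-negative (CCW triangle) or all non-positive (CW triangle).
        return (s1 >= 0 and s2 >= 0 and s3 >= 0) or (s1 <= 0 and s2 <= 0 and s3 <= 0)

As the answer notes, to guarantee that every point lands in exactly one triangle you would replace the >= / <= comparisons with a consistent tie-breaking rule for points that fall exactly on a shared edge.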

Optimizing Polygon Search

I split the world into X random polygons.
Then I am given a coordinate C1, for instance (-21.45, 7.10), and I want to attribute the right polygon to this coordinate.
The first solution is to apply my ‘point_in_polygon’ algorithm (given a set of coordinates that defines a polygon and a coordinate that defines a point, tell me if the point is inside or not) on each polygon until I find the right one.
But that is very expensive if I have a lot of points to put in a lot of polygons.
An improvement on that relies on the following idea:
To optimize the search, I create a grid (a collection) with steps n and k, to which I pre-assign each pair of coordinates as follows:
for i in range(-180, 180, n):
    for j in range(-90, 90, k):
        grid.add((i, j))
Then I create a dictionary, and for each pair in the collection I find the corresponding polygon:
for g in grid:
    for p in polygons:
        if point_in_polygon(g, p):
            my_dict[g] = p
Then, when I receive C1, I look for the closest coordinate in my grid, let’s say g1.
Thanks to my_dict, I can get quickly p1 = my_dict(g1)
Then I compute point_in_polygon(C1, p1), which is likely to be true. If it's not, I find the closest g that is assigned to a different polygon and redo the test, and so on until I have found the right polygon.
Now, the question is: what is the optimal n, k to create the grid?
So that I can find the right polygon in the minimum number of steps.
I don’t want it too low, because the search for the closest g that is assigned to a different polygon might be expensive.
I don’t want it too high either, because then I might be missing some polygons and the search never converges.
My intuition is that the smallest polygon is going to dictate the step size.
I am not sure if this is a programming problem, a maths problem, or just something I can find empirically; that's why I'm asking here.
Any inputs appreciated!
Let me suggest a slight modification to your grid. Currently, you store for each cell the polygon that the cell's center belongs to. Instead, store all the polygons that overlap the cell. Then, whenever you see that a cell has only a single overlapping polygon, you don't need to do any inclusion testing. The grid can be built by methods of conservative rasterization (note that the referenced article is not focused on conservative but rather general rasterization).
The efficiency of your grid correlates with the ratio of single-polygon cells and total cells (because this is the probability of not having to perform polygon-inclusion tests). The storage itself is pretty cheap. You can use a dense array and get constant access to the cells. Hence, from a theoretical point of view, you should have as many cells as possible (because as you have more cells, the single-polygon cell ratio increases). In practice, you might find that cache and other memory effects might make large grids impractical. However, there is no good way to know other than test. So, just try with a couple of sizes on a few different machines and try to find a good fit.
If I had to guess, I would say that your cells should be square and have an area of about 1% - 5% of the average polygon area. Also, more compact polygons can be handled more efficiently than many long and thin polygons.
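A sketch of that cell-to-overlapping-polygons grid with shapely, using a plain intersects test rather than true conservative rasterization; the bounds and cell counts are illustrative:

    from shapely.geometry import box, Point

    def build_grid(polygons, nx, ny, bounds=(-180.0, -90.0, 180.0, 90.0)):
        minx, miny, maxx, maxy = bounds
        dx, dy = (maxx - minx) / nx, (maxy - miny) / ny
        grid = {}
        for i in range(nx):
            for j in range(ny):
                cell = box(minx + i * dx, miny + j * dy,
                           minx + (i + 1) * dx, miny + (j + 1) * dy)
                grid[(i, j)] = [p for p in polygons if p.intersects(cell)]
        return grid, dx, dy

    def locate(grid, dx, dy, lon, lat, bounds=(-180.0, -90.0, 180.0, 90.0)):
        minx, miny, _, _ = bounds
        candidates = grid.get((int((lon - minx) // dx), int((lat - miny) // dy)), [])
        if len(candidates) == 1:
            return candidates[0]          # single-polygon cell: no inclusion test needed
        pt = Point(lon, lat)
        return next((p for p in candidates if p.covers(pt)), None)

Tuning nx and ny then amounts to the experiment described above: increase them until the share of single-polygon cells stops paying for the extra memory.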
Pick any point and draw a line straight down from that point. The first polygon edge you hit tells you what polygon the point is in.
So, if you don't want to do polygon tests, then instead of dividing the space into a regular grid, first cut it into strips with vertical cuts that go through all polygon intersections.
Now, within each strip none of the polygon edges cross or end, so you can make an ordered list of all those edges from bottom to top.
If you want to find the polygon that contains a point, do a binary search using the x coordinate to find the proper strip. Then, in the list of edges that span the strip, do a binary search using the y coordinate to find the closest edge underneath the point; that tells you which polygon the point is in.
Google 'trapezoidal decomposition' to find lots of information about similar techniques.
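For reference, a brute-force sketch of the downward-ray idea from the first sentence, without the strips: cast a ray straight down from the point and count, per polygon, how many edges it crosses; an odd count means the point is inside that polygon. The strip decomposition described above is what replaces these linear scans with two binary searches:

    def downward_crossings(ring, px, py):
        """Count edges of `ring` (a list of (x, y) vertices) crossed by the
        downward ray from (px, py)."""
        crossings = 0
        for (ax, ay), (bx, by) in zip(ring, ring[1:] + ring[:1]):
            if (ax > px) != (bx > px):                 # edge spans the vertical line x = px
                y_hit = ay + (by - ay) * (px - ax) / (bx - ax)
                if y_hit <= py:                        # hit lies below (or at) the point
                    crossings += 1
        return crossings

    def locate_point(polygons, px, py):
        """polygons: dict mapping a name to a vertex ring. Returns the name of the
        polygon containing (px, py), or None."""
        for name, ring in polygons.items():
            if downward_crossings(ring, px, py) % 2 == 1:
                return name
        return None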
