Removing almost-parallel NetworkX shortest paths - python

I generated a path between locations A and B with the constraint that I have to pass through, or close to, certain locations, so the route looks like A -> c1 -> c2 -> B, even though it is not the shortest path.
I used: for path in nx.all_shortest_paths(UG, source=l1_node_id, target=l2_node_id, weight='wgt'):
where 'wgt' is the edge's distance divided by the driving speed on that road.
I generated a list of lists where each inner list contains the node IDs of one route, for example:
l_list = [
    [n11, n12, n13, n14, ...],
    [n21, n22, n23, n24, ...],
    ...
]
On the map it looks like this (the markers are at the beginning of each route, and I also coloured each route differently):
I want to merge them into one route, but as you can see there are some splits, like the green and the red routes, some common sequences (which I can handle), and the second problem is the beginning of the blue route / end of the black one, which is unimportant.
I can't just remove the red route, because this is supposed to be a generic algorithm and I don't even know where this will happen again along a route.
I do have timestamps for each marker, but they only tell me that I have been close to that area (these are the locations of cellular antennas).

First, you are going to need to define "almost parallel" more precisely; more formally, you need to define a similarity function.
Choosing a similarity/distance function
There are plenty of ways to define a similarity function; here is one of them.
Resample
Assuming each node n_i has x and y coordinates (n_i_x, n_i_y), you can resample the points along the x-axis so that the new points are spaced 1 km apart.
Then, for each pair of routes, you can sum the differences along the y-axis.
Use this distance to cluster the routes.
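A minimal sketch of such a resampling-based distance, assuming routes are given as (x, y) arrays already expressed in kilometres and roughly monotone in x (both assumptions of mine, not part of the original answer):
import numpy as np

def route_distance(route_a, route_b, step=1.0):
    # route_a, route_b: arrays of shape (n, 2) with (x, y) coordinates,
    # assumed to be increasing in x so that np.interp is valid
    a = np.asarray(route_a, dtype=float)
    b = np.asarray(route_b, dtype=float)
    # x range covered by both routes
    x_lo = max(a[:, 0].min(), b[:, 0].min())
    x_hi = min(a[:, 0].max(), b[:, 0].max())
    xs = np.arange(x_lo, x_hi, step)        # one sample per km
    ya = np.interp(xs, a[:, 0], a[:, 1])    # resampled y of route A
    yb = np.interp(xs, b[:, 0], b[:, 1])    # resampled y of route B
    return np.abs(ya - yb).sum()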
Other ideas
Earth mover's distance
Jaccard similarity (≈ % of common nodes)
Clustering
Once you have defined a similarity function, you can use a distance-based clustering algorithm; I recommend using sklearn's agglomerative clustering.
After the clustering is done, all you have left to do is to choose one route from each cluster.
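A sketch of that last step, reusing the route_distance helper from above; it assumes sklearn >= 1.2 (older versions take affinity instead of metric), and the distance threshold is a made-up value you would tune to your data:
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def pick_representatives(routes, distance_threshold=5.0):
    # Pairwise distance matrix between routes
    n = len(routes)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = route_distance(routes[i], routes[j])
    model = AgglomerativeClustering(n_clusters=None,
                                    distance_threshold=distance_threshold,
                                    metric="precomputed",
                                    linkage="average")
    labels = model.fit_predict(dist)
    # Keep the first route of each cluster as its representative
    return [routes[np.flatnonzero(labels == c)[0]] for c in np.unique(labels)]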

Related

Measure distance between meshes

For my project, I need to measure the distance between two STL files. I wrote a script that reads the files and positions them relative to each other in the desired position. Now, in the next step, I need to check the distance from one object to the other. Is there a function or script available in a library that allows me to carry out this process? Later I am going to want to define metrics like interpenetration area, maximum negative distance, etc., so I first need to check the distance between these objects, see whether there is a mesh intersection, and measure that distance. Here is the URL for the combination of the 2 objects whose distance I want to measure:
https://imgur.com/wgNaalh
PyVista offers a really easy way of calculating just that:
import numpy as np
import pyvista as pv

mesh_1 = pv.read("path/to/mesh_1.stl")  # replace with the path to your first mesh
mesh_2 = pv.read("path/to/mesh_2.stl")  # replace with the path to your second mesh

# For every point of mesh_1, find the closest cell of mesh_2 and the closest point on it
closest_cells, closest_points = mesh_2.find_closest_cell(mesh_1.points, return_closest_point=True)
d_exact = np.linalg.norm(mesh_1.points - closest_points, axis=1)
print(f'mean distance is: {np.mean(d_exact)}')
For more methods and examples, have a look at:
https://docs.pyvista.org/examples/01-filter/distance-between-surfaces.html#using-pyvista-filter
To calculate the distance between two meshes, first one needs to check whether these meshes intersect. If not, then the resulting distance can be computed as the distance between two closest points, one from each mesh (as on the picture below).
If the meshes do intersect, then it is necessary to find the part of each mesh that is inside the other mesh, and then find the two most distant points, one from each inner part. The distance between these points is the maximum depth of the meshes' interpenetration. It can be returned with a negative sign to distinguish it from the distance between separated meshes.
In Python, one can use MeshLib library and findSignedDistance function from it as follows:
import meshlib.mrmeshpy as mr

mesh1 = mr.loadMesh("Cube.stl")
mesh2 = mr.loadMesh("Torus.stl")
z = mr.findSignedDistance(mesh1, mesh2)
print(z.signedDist)  # 0.3624192774295807

Joining two convex nonintersecting polygons into one

I need to join two convex, non-intersecting polygons into one joined convex polygon while minimising the resulting area, as in the picture below. I'm looking for an algorithm that does this. I would also appreciate it if someone could provide a corresponding Python implementation.
If there are two non-intersecting polygons having say, m and n vertices respectively, then your problem can be thought of in this way:
Finding the convex polygon of the least area containing all of the m+n points. Having said this, check out the QuickHull Algorithm here: http://www.geeksforgeeks.org/quickhull-algorithm-convex-hull/
Additionally you can also check out these algorithms.
Jarvis's Algorithm: http://www.geeksforgeeks.org/convex-hull-set-1-jarviss-algorithm-or-wrapping/
And, Graham's Scan: http://www.geeksforgeeks.org/convex-hull-set-2-graham-scan/
Hope this helps.
P.S. I think you can find the python implementations of these algorithms anywhere on the internet. :)
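For instance, a minimal sketch using scipy.spatial.ConvexHull (my choice of library, not something the answer prescribes):
import numpy as np
from scipy.spatial import ConvexHull

def join_polygons(poly_a, poly_b):
    # poly_a, poly_b: (m, 2) and (n, 2) arrays of vertex coordinates
    pts = np.vstack([poly_a, poly_b])
    hull = ConvexHull(pts)
    return pts[hull.vertices]   # vertices of the joined polygon, in counter-clockwise order

# Example with two separated triangles
a = np.array([[0, 0], [1, 0], [0, 1]])
b = np.array([[2, 2], [3, 2], [2, 3]])
print(join_polygons(a, b))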
For an efficient solution, you can adapt the Monotone Chain method (https://en.wikibooks.org/wiki/Algorithm_Implementation/Geometry/Convex_hull/Monotone_chain) as follows:
for both polygons, find the leftmost and rightmost sites (in case of ties, use the highest/lowest respectively);
these sites split the polygons in two chains, that are ordered on X;
merge the two upper and two lower chains with comparisons on X (this is a pass of mergesort);
reject the reflex sites from the upper and lower chains, using the same procedure as in the monotone chain method (a variant of Graham's walk).
The total running time will be governed by
n + m comparisons to find the extreme sites;
n + m comparisons for the merge;
n + m + 2h LeftOf tests (signed area; h is the number of vertices of the result).
Thus the complexity is O(n + m), which is not optimal but quite probably good enough for your purposes (a more sophisticated O(log(n + m)) solution is possible when the polygons do not overlap, but not worth the fuss for small polygon sizes).
In the example, the result of the merges are just the concatenation of the chains, but more complex cases can arise.
Final remark: if you keep all polygons as the concatenation of two monotone chains, you can spare the first step of the above procedure.
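For reference, a minimal implementation of the plain monotone chain hull; it sorts all m + n vertices instead of merging the already-ordered chains, so it runs in O((n + m) log(n + m)) rather than the linear time described above, but the reflex-site rejection step is the same:
def cross(o, a, b):
    # Signed area of the triangle (o, a, b); > 0 for a left turn
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    pts = sorted(set(map(tuple, points)))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:                       # build the lower chain
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()                 # reject reflex sites
        lower.append(p)
    for p in reversed(pts):             # build the upper chain
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]      # concatenated chains, counter-clockwise order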
Finding the convex hull of both sets would work but the following approach is probably faster as it only needs to visit the polygons vertices in order:
Given polygons P and Q, pick a starting vertex from each one, p1 and q1.
Search in Q for the vertex q2 contiguous to q1 such that the rotation from p1-q1 to p1-q2 is clockwise (this can be checked easily using the vector cross product).
Repeat until you reach a point qk whose two contiguous vertices in Q generate an anticlockwise rotation.
Now invert the process, travelling from p1 across contiguous vertices in P such that the rotation is anticlockwise, until an extreme vertex pl is found again.
Repeat from step 2 until no more advance is possible. You now have two points pm and pn, which are the two vertices where one side of the red area meets the black polygons in your drawing above.
Now repeat the algorithm again but changing the directions, from clockwise to anti-clockwise and vice-versa in order to find the vertices for the other side of the red area.
The only remaining work is generating the final polygon from the two red area sides already found and the segments from the polygons.

Finding best fit boxes of a scatter plot using python?

I'm looking for the best python library to solve this problem:
I have a scatter plot with clumps of data points. This is just a series of x,y coordinate pairs.
I want a tool that will look at the data points I have, then suggest N 'boxes' that encompass the different groups.
Presumably I could go with higher or lower granularity by choosing how many boxes I wanted to use.
Are there any python libraries out there best suited to solve this type of problem?
The way I understand your question, you want to find boxes that enclose clouds of data points.
You define your granularity criterion as the number of boxes used to describe your data set.
I think what you are looking for is agglomerative hierarchical clustering. The algorithm is quite straightforward. Let n be the number of data points in your set. Basically, the algorithm starts by considering n groups, each populated by a single point. Then it is an iterative process:
Merge the two closest groups according to a distance criterion
Since the groups set has changed, update the distances between the groups
Go back to the merge step until you reach either a specific number of clusters or a specific distance threshold
You can also build the dendrogram. It is a tree-like structure that stores the history of the whole merging process, allowing you to retrieve any level of granularity between 1 cluster and n clusters.
There is a set of functions in Scipy that are dedicated to this algorithm. It is covered by the question Tutorial for scipy.cluster.hierarchy.
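A minimal sketch with SciPy; the synthetic data and the cluster count of 3 are my own placeholders:
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic (x, y) points in three clumps, for illustration only
rng = np.random.default_rng(0)
points = np.vstack([rng.normal(c, 0.5, size=(50, 2))
                    for c in [(0, 0), (5, 5), (0, 6)]])

Z = linkage(points, method="ward")               # full merge history (the dendrogram)
labels = fcluster(Z, t=3, criterion="maxclust")  # cut the tree into 3 clusters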
Getting the clusters is the first step; now you can build your boxes. Let's look at this from a mathematical point of view. Let C be a cluster and P1, ..., Pn the points of the cluster. If a rectangular box is fine, then it can be defined by the two points of coordinates (xmin, ymin) and (xmax, ymax), with:
xmin = min(P.x, P ∈ C)
ymin = min(P.y, P ∈ C)
xmax = max(P.x, P ∈ C)
ymax = max(P.y, P ∈ C)
EDIT :
This way of building the boxes is the simplest possible. If you want something that really fits, you'll have to look into building the convex hull of each cluster.
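A sketch of both options, reusing the points and labels from the clustering snippet above (the convex hull via scipy.spatial.ConvexHull is my choice, not prescribed by the answer):
from scipy.spatial import ConvexHull

boxes, hulls = {}, {}
for c in np.unique(labels):
    cluster_pts = points[labels == c]
    # Axis-aligned bounding box: (xmin, ymin, xmax, ymax)
    boxes[c] = (cluster_pts[:, 0].min(), cluster_pts[:, 1].min(),
                cluster_pts[:, 0].max(), cluster_pts[:, 1].max())
    # Tighter fit: the cluster's convex hull vertices, in order
    hulls[c] = cluster_pts[ConvexHull(cluster_pts).vertices]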

Finding all points common to two circles

In Python, how would one find all integer points common to two circles?
For example, imagine a Venn diagram-like intersection of two (equally sized) circles, with center-points (x1,y1) and (x2,y2) and radii r1=r2. Additionally, we already know the two points of intersection of the circles are (xi1,yi1) and (xi2,yi2).
How would one generate a list of all points (x,y) contained in both circles in an efficient manner? That is, it would be simple to draw a box containing the intersections and iterate through it, checking if a given point is within both circles, but is there a better way?
Keep in mind that there are four cases here.
The circles do not intersect at all, meaning the "common area" is empty.
One circle resides entirely within the other, meaning the "common area" is the smaller/interior circle. Note that, given your criterion that the radii are equal, this can only happen in the degenerate case where the two circles coincide.
The two circles touch at one intersection point.
The "general" case where there are going to be two intersection points. From there, you have two arcs that define the enclosed area. In that case, the box-drawing method could work for now, I'm not sure there's a more efficient method for determining what is contained by the intersection. Do note, however, if you're just interested in the area, there is a formula for that.
You may also want to look into the various clipping algorithms used in graphics development. I have used clipping algorithms to solve a lot of problems similar to what you are asking here.
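Regarding that area formula: the standard closed-form expression for the lens formed by two overlapping circles looks like this (a sketch for general radii; it applies only in the two-intersection-point case):
import math

def circle_intersection_area(r1, r2, d):
    # Valid when |r1 - r2| < d < r1 + r2 (the circles overlap in a lens)
    a1 = r1 * r1 * math.acos((d * d + r1 * r1 - r2 * r2) / (2 * d * r1))
    a2 = r2 * r2 * math.acos((d * d + r2 * r2 - r1 * r1) / (2 * d * r2))
    a3 = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2) *
                         (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - a3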
If the locations and radii of your circles can vary with a granularity less than your grid, then you'll be checking a bunch of points anyway.
You can minimize the number of points you check by defining the search area appropriately. It has a width equal to the distance between the points of intersection, and a height equal to
r1 + r2 - D
with D being the separation of the two centers. Note that this rectangle in general is not aligned with the X and Y axes. (This also gives you a test as to whether the two circles intersect!)
Actually, you'd only need to check half of these points. If the radii are the same, you'd only need to check a quarter of them. The symmetry of the problem helps you there.
You're almost there.
Iterating over the points in the box should be fairly good, but you can do better if for the second coordinate you iterate directly between the limits.
Say you iterate along the x axis first. Then, for the y axis, instead of iterating between the bounding-box coordinates, figure out where each circle intersects the vertical line at that x; more specifically, you are interested in the y coordinates of the intersection points, and you iterate between those (pay attention to rounding).
When you do this, because you already know you are inside the circles you can skip the checks entirely.
If you have a lot of points then you skip a lot of checks and you might get some performance improvements.
As an additional improvement you can pick the x axis or the y axis to minimize the number of times you need to compute intersection points.
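A minimal sketch of that idea, iterating over x and computing the y bounds directly from both circles (the function name and argument layout are mine):
import math

def lattice_points_in_intersection(c1, r1, c2, r2):
    # c1, c2: (x, y) centre tuples; r1, r2: radii
    (x1, y1), (x2, y2) = c1, c2
    # x range where the circles can possibly overlap
    x_lo = math.ceil(max(x1 - r1, x2 - r2))
    x_hi = math.floor(min(x1 + r1, x2 + r2))
    pts = []
    for x in range(x_lo, x_hi + 1):
        # Squared vertical half-extent of each circle at this x
        h1 = r1 * r1 - (x - x1) ** 2
        h2 = r2 * r2 - (x - x2) ** 2
        if h1 < 0 or h2 < 0:
            continue
        y_lo = math.ceil(max(y1 - math.sqrt(h1), y2 - math.sqrt(h2)))
        y_hi = math.floor(min(y1 + math.sqrt(h1), y2 + math.sqrt(h2)))
        pts.extend((x, y) for y in range(y_lo, y_hi + 1))
    return pts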
So you want to find the lattice points that are inside both circles?
The method you suggested of drawing a box and iterating through all the points in the box seems the simplest to me. It will probably be efficient, as long as the number of points in the box is comparable to the number of points in the intersection.
And even if it isn't as efficient as possible, you shouldn't try to optimize it until you have a good reason to believe it's a real bottleneck.
I assume by "all points" you mean "all pixels". Suppose your display is NX by NY pixels. Have two arrays
int x0[NY], x1[NY]; initially full of -1.
The intersection is lozenge-shaped, between two curves.
Iterate x,y values along each curve. At each y value (that is, where the curve crosses y + 0.5), store the x value in the array. If x0[y] is -1, store it in x0, else store it in x1.
Also keep track of the lowest and highest values of y.
When you are done, just iterate over the y values, and at each y, iterate over the x values between x0 and x1, that is, for (ix = x0[iy]; ix < x1[iy]; ix++) (or the reverse).
It's important to understand that pixels are not the points where x and y are integers. Rather pixels are the little squares between the grid lines. This will prevent you from having edge-case problems.

Estimating the boundary of arbitrarily distributed data

I have two dimensional discrete spatial data. I would like to make an approximation of the spatial boundaries of this data so that I can produce a plot with another dataset on top of it.
Ideally, this would be an ordered set of (x,y) points that matplotlib can plot with the plt.Polygon() patch.
My initial attempt is very inelegant: I place a fine grid over the data, and where data is found in a cell, a square matplotlib patch is created of that cell. The resolution of the boundary thus depends on the sampling frequency of the grid. Here is an example, where the grey region are the cells containing data, black where no data exists.
1st attempt http://astro.dur.ac.uk/~dmurphy/data_limits.png
OK, problem solved - why am I still here? Well.... I'd like a more "elegant" solution, or at least one that is faster (ie. I don't want to get on with "real" work, I'd like to have some fun with this!). The best way I can think of is a ray-tracing approach - eg:
from xmin to xmax, at y=ymin, check if data boundary crossed in intervals dx
y=ymin+dy, do 1
do 1-2, but now sample in y
An alternative is defining a centre, and sampling in r-theta space - ie radial spokes in dtheta increments.
Both would produce a set of (x,y) points, but then how do I order/link neighbouring points to create the boundary?
A nearest neighbour approach is not appropriate as, for example (to borrow from Geography), an isthmus (think of Panama connecting N&S America) could then close off and isolate regions. This also might not deal very well with the holes seen in the data, which I would like to represent as a different plt.Polygon.
The solution perhaps comes from solving an area maximisation problem. For a set of points defining the data limits, what is the maximum contiguous area contained within those points? To form the enclosed area, what are the neighbouring points for the nth point? How will the holes be treated in this scheme - is this erring into topology now?
Apologies, much of this is me thinking out loud. I'd be grateful for some hints, suggestions or solutions. I suspect this is an oft-studied problem with many solution techniques, but I'm looking for something simple to code and quick to run... I guess everyone is, really!
~~~~~~~~~~~~~~~~~~~~~~~~~
OK, here's attempt #2 using Mark's idea of convex hulls:
2nd attempt: http://astro.dur.ac.uk/~dmurphy/data_limitsv2.png
For this I used qconvex from the qhull package, getting it to return the extreme vertices. For those interested:
cat [data] | qconvex Fx > out
The sampling of the perimeter seems quite low, and although I haven't played much with the settings, I'm not convinced I can improve the fidelity.
I think what you are looking for is the convex hull of the data. That will give you a set of points such that, if they are connected, all your points will be on or inside the resulting polygon.
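A minimal sketch of that, using scipy.spatial.ConvexHull and the plt.Polygon patch mentioned in the question (the random data is a stand-in for your spatial points):
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial import ConvexHull

# data: (n, 2) array of the spatial points (random blob for illustration)
rng = np.random.default_rng(1)
data = rng.normal(size=(500, 2))

hull = ConvexHull(data)
boundary = data[hull.vertices]          # ordered (x, y) vertices of the hull

fig, ax = plt.subplots()
ax.add_patch(plt.Polygon(boundary, closed=True, fill=False, edgecolor="red"))
ax.plot(data[:, 0], data[:, 1], ".", ms=2, color="grey")
ax.autoscale_view()
plt.show()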
I may have missed something, but what's the motivation for not simply determining the minimum and maximum x and y values? Unless you have an enormous amount of data, you could simply iterate through your points determining the minimum and maximum values fairly quickly.
This isn't the most efficient example, but if your data set is small this won't be particularly slow:
import random

data = [(random.randint(-100, 100), random.randint(-100, 100)) for i in range(1000)]

x_min = min(point[0] for point in data)
x_max = max(point[0] for point in data)
y_min = min(point[1] for point in data)
y_max = max(point[1] for point in data)
