Alternative of ST_DWithin of PostGIS in python shapely - python

I have two linestrings, line1 and line2.
line1 = "LINESTRING(72.863221 18.782499,72.863736 18.770147,72.882275 18.756169,72.881417 18.750805,72.878842 18.736987,72.874379 18.709512,72.860989 18.679593,72.864422 18.653897)"
line2 = "LINESTRING(72.883133 18.780793,72.882103 18.760314,72.862534 18.716422,72.860474 18.683577)"
I'm trying to perform the following PostGIS query in shapely. So far I haven't been able to find an alternative to the ST_DWithin command.
road2 = "ST_GeographyFromText('SRID=4326;%s')"%line1
road4 = "ST_GeographyFromText('SRID=4326;%s')"%line2
cur.execute("""SELECT ST_AsText(road1) FROM %s AS road1, %s AS road2
WHERE ST_DWithin(road1, road2, 500)""" % (road2, road4))
res = cur.fetchall()
print(res)
Does anyone know what the alternative to ST_DWithin is in shapely?

As far as I know, shapely only supports operations in planar coordinates (no geography types). However, for LineStrings which are not too large so that the curvature of the globe can be neglected, one could partially "circumvent" this by:
working in some planar projection (for example directly in the lat/lon or lon/lat coordinates)
following the second note in the documentation of ST_DWithin and the first note in the documentation of ST_Expand, i.e.:
checking if the bounding box of the second LineString intersects with the expanded bounding box of the first LineString
if yes, checking if the minimum distance is indeed below the prescribed threshold
For example:
from shapely.wkt import dumps, loads
from shapely.geometry.geo import box
spec = [
"LINESTRING(72.863221 18.782499,72.863736 18.770147,72.882275 18.756169,72.881417 18.750805,72.878842 18.736987,72.874379 18.709512,72.860989 18.679593,72.864422 18.653897)",
"LINESTRING(72.883133 18.780793,72.882103 18.760314,72.862534 18.716422,72.860474 18.683577)"
]
lines = list(map(loads, spec))
eta = 0.005
b1 = box(*lines[0].bounds).buffer(eta)
b2 = box(*lines[1].bounds)
flag = b2.intersects(b1) and (lines[0].distance(lines[1]) < eta)
print(eta, flag)
Alternatively, if you would like to check if the entire second LineString is within prescribed threshold from the first LineString, you could also use the buffer method as:
lines[0].buffer(eta).contains(lines[1])
The threshold supplied here to the buffer method is expressed in the same coordinate system in which the LineStrings are defined. Within the lon/lat system, this would represent the "central angle" - the issue then consists in the fact that the great circle distance corresponding to a fixed eta not only depends on the particular values of latitude and longitude but also on the direction of the displacement. However, if the LineStrings are not too large and the required precision is not too high, it probably won't matter that much.
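To get a feel for how a metric threshold, such as the 500 m in the original query, maps onto a threshold in degrees, here is a rough sketch. It assumes a spherical Earth with a mean radius of 6371008.8 m, and the function name `meters_to_degrees` is made up for illustration (it is not part of shapely).

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)

def meters_to_degrees(meters, lat_deg):
    """Return (dlat, dlon): degree offsets that span `meters` at latitude `lat_deg`."""
    dlat = math.degrees(meters / EARTH_RADIUS_M)
    # Meridians converge towards the poles, so a degree of longitude
    # covers less ground than a degree of latitude away from the equator.
    dlon = dlat / math.cos(math.radians(lat_deg))
    return dlat, dlon

dlat, dlon = meters_to_degrees(500, 18.7)
print(round(dlat, 5), round(dlon, 5))
```

Near latitude 18.7, where these lines lie, 500 m is a little under 0.005 degrees in both directions, but by a different amount north-south than east-west. This is the direction dependence mentioned above.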

Related

Measure distance between meshes

For my project, I need to measure the distance between two STL files. I wrote a script that reads the files and positions them relative to each other in the desired position. In the next step I need to check the distance from one object to the other. Is there a function or script available in a library that allows me to carry out this process? Later I want to define metrics such as interpenetration area, maximum negative distance, etc., so I first need to check the distance between the objects, see whether the meshes intersect, and measure that distance. Here is the URL of an image showing the combination of the 2 objects whose distance I want to measure:
https://imgur.com/wgNaalh
Pyvista offers a really easy way of calculating just that:
import pyvista as pv
import numpy as np
mesh_1 = pv.read(path_to_mesh_1)  # placeholder: path to mesh 1
mesh_2 = pv.read(path_to_mesh_2)  # placeholder: path to mesh 2
closest_cells, closest_points = mesh_2.find_closest_cell(mesh_1.points, return_closest_point=True)
d_exact = np.linalg.norm(mesh_1.points - closest_points, axis=1)
print(f'mean distance is: {np.mean(d_exact)}')
For more methods and examples, have a look at:
https://docs.pyvista.org/examples/01-filter/distance-between-surfaces.html#using-pyvista-filter
To calculate the distance between two meshes, first one needs to check whether these meshes intersect. If not, then the resulting distance can be computed as the distance between two closest points, one from each mesh (as on the picture below).
If the meshes do intersect, then it is necessary to find the part of each mesh that lies inside the other mesh, and then find the two most distant points, one from each inner part. The distance between these points is the maximum depth of the meshes' interpenetration. It can be returned with a negative sign to distinguish it from the distance between separated meshes.
In Python, one can use MeshLib library and findSignedDistance function from it as follows:
import meshlib.mrmeshpy as mr
mesh1 = mr.loadMesh("Cube.stl")
mesh2 = mr.loadMesh("Torus.stl")
z = mr.findSignedDistance(mesh1, mesh2)
print(z.signedDist)  # 0.3624192774295807
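To make the "two closest points" idea concrete for the non-intersecting case, here is a brute-force sketch in plain Python. It only compares vertices, so it gives an upper bound on the true surface-to-surface distance. It does not detect intersections, and it is not how PyVista or MeshLib compute their results. The function name and the toy vertex lists are made up for illustration.

```python
import math

def min_vertex_distance(vertices_a, vertices_b):
    """Smallest Euclidean distance between any vertex of A and any vertex of B."""
    return min(math.dist(p, q) for p in vertices_a for q in vertices_b)

# Two toy "meshes" given only by their vertices: unit squares 2 units apart in x.
square_a = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
square_b = [(3, 0, 0), (4, 0, 0), (4, 1, 0), (3, 1, 0)]
print(min_vertex_distance(square_a, square_b))
```

The cost grows with the product of the two vertex counts, so for real STL files a library routine backed by a spatial index is the better choice.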

Is it possible to fit a coordinate to a street in OSMnx?

OSMnx provides a solution for calculating the shortest path between two nodes, but I would like to do the same with points on streets (I have GPS coordinates recorded from vehicles). I know there is also a method to get the closest node, but I have two questions about this problem.
i) When the closest node is computed, is the street on which the point lies also taken into consideration? (I assume not.)
ii) If I wanted to implement something like this, I would like to know how a street (edge) is represented as a curve (a Bézier curve, maybe?). Is it possible to get the curve (or the equation of the curve) of an edge?
I am asking this question here because the OSMnx contributing guidelines ask for it.
Streets and nodes in OSMnx are shapely.geometry.LineString and shapely.geometry.Point objects, so there is no curve, only a sequence of coordinates. The technical term for what you describe is map matching. There are different ways of map matching, the simplest being geometric map matching, in which you find the nearest geometry (node or edge) to the GPS point. Point-to-point map matching can easily be achieved using the built-in OSMnx function ox.get_nearest_node(). If you have the luxury of dense GPS tracks, this approach could work reasonably well. For point-to-line map matching you have to use shapely functions. The problem with this approach is that it is very slow. You can speed up the algorithm using a spatial index, but it will still not be fast enough for most purposes. Note that geometric map matching is the least accurate of all approaches. I wrote a function a few weeks ago that does simple point-to-line map matching using the edge GeoDataFrame and node GeoDataFrame that you can get from OSMnx. I abandoned this idea and am now working on a new algorithm (hopefully much faster), which I will publish on GitHub upon completion. Meanwhile, this may be of some help to you or someone else, so I am posting it here. This is an early version of abandoned code, not tested enough and not optimized. Give it a try and let me know if it works for you.
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

def GeoMM(traj, gdfn, gdfe):
    """
    performs map matching on a given sequence of points
    Note: the OSMnx graph G is read from the enclosing scope, as in the original code.
    Parameters
    ----------
    traj : sequence of (timestamp, xy) pairs
    gdfn : node GeoDataFrame obtained from OSMnx
    gdfe : edge GeoDataFrame obtained from OSMnx (with u and v columns)
    Returns
    -------
    list of tuples each containing timestamp, projected point to the line, the edge to which GPS point has been projected, the geometry of the edge
    """
    traj = pd.DataFrame(traj, columns=['timestamp', 'xy'])
    traj['geom'] = traj.apply(lambda row: Point(row.xy), axis=1)
    # assumption: the undefined name EPSG3740 in the original stood for this CRS code
    traj = gpd.GeoDataFrame(traj, geometry=traj['geom'], crs="EPSG:3740")
    traj.drop('geom', axis=1, inplace=True)
    n_sindex = gdfn.sindex
    res = []
    for gps in traj.itertuples():
        tm = gps[1]
        p = gps[3]
        circle = p.buffer(150)
        # coarse filter with the spatial index, then exact intersection test
        possible_matches_index = list(n_sindex.intersection(circle.bounds))
        possible_matches = gdfn.iloc[possible_matches_index]
        precise_matches = possible_matches[possible_matches.intersects(circle)]
        candidate_nodes = list(precise_matches.index)
        candidate_edges = []
        for nid in candidate_nodes:
            candidate_edges.append(G.in_edges(nid))
            candidate_edges.append(G.out_edges(nid))
        candidate_edges = [item for sublist in candidate_edges for item in sublist]
        dist = []
        for edge in candidate_edges:
            # get the geometry
            ls = gdfe[(gdfe.u == edge[0]) & (gdfe.v == edge[1])].geometry
            # .min() turns the distance Series into a scalar so the list can be sorted
            dist.append([ls.distance(p).min(), edge, ls])
        if not dist:
            continue  # no candidate edge within 150 units of this point
        dist.sort(key=lambda item: item[0])
        true_edge = dist[0][1]
        true_edge_geom = dist[0][2].iloc[0]  # first geometry if there are parallel edges
        pp = true_edge_geom.interpolate(true_edge_geom.project(p))  # projected point
        res.append((tm, pp, true_edge, true_edge_geom))
    return res
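The last step of the function, `interpolate(project(p))`, snaps the GPS point onto the chosen edge. Below is a dependency-free sketch of what that amounts to for a polyline in planar coordinates. `snap_to_polyline` is a hypothetical helper, not an OSMnx or shapely function.

```python
import math

def snap_to_polyline(p, coords):
    """Return (snapped_point, distance) for point p and a polyline given as (x, y) tuples."""
    best_point, best_dist = None, math.inf
    for (x1, y1), (x2, y2) in zip(coords, coords[1:]):
        dx, dy = x2 - x1, y2 - y1
        seg_len_sq = dx * dx + dy * dy
        # Parameter t of the orthogonal projection onto the segment, clamped to [0, 1].
        t = 0.0 if seg_len_sq == 0 else ((p[0] - x1) * dx + (p[1] - y1) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
        candidate = (x1 + t * dx, y1 + t * dy)
        d = math.hypot(p[0] - candidate[0], p[1] - candidate[1])
        if d < best_dist:
            best_point, best_dist = candidate, d
    return best_point, best_dist

edge = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
print(snap_to_polyline((4.0, 3.0), edge))
```

Running this per GPS point against only the candidate edges returned by a spatial index keeps the number of segment checks small, which is the same idea as the node pre-filter above.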
OSMnx was recently updated since there have been a couple of requests in this direction (see https://github.com/gboeing/osmnx/pull/234 and references therein). So in the last update, you'll find a function like this:
ox.get_nearest_edge(G, (lat, lon))
It will give you the ID of the nearest edge, which is much better than nearest nodes.
However, I think it is more useful to also get the actual distance of the nearest edge in order to check whether or not your data point is on the road or a few thousand meters apart...
To do this, I followed the implementation from https://github.com/gboeing/osmnx/pull/231/files
# Convert Graph to graph data frame
gdf = ox.graph_to_gdfs(G, nodes=False, fill_edge_geometry=True)
# extract roads and some properties
roads = gdf[["geometry", "u", "v","ref","name","highway","lanes"]].values.tolist()
# calculate and attach distance
roads_with_distances = [(road, ox.Point(tuple(reversed((lat,lon)))).distance(road[0])) for road in roads]
# sort by distance
roads_with_distances = sorted(roads_with_distances, key=lambda x: x[1])
# Select closest road
closest_road = roads_with_distances[0]
# Check whether you are actually "on" the road
if closest_road[1] < 0.0001: print('Hit the road, Jack!')
Because the coordinates are in degrees, this distance is also in degrees. I have the impression that a distance on the order of $10^{-5}$ (roughly 1 m) means that the coordinate is actually "on" the road.

how to optimize performances of geometry operations

I am looking for an approach to optimize the performance of geometry operations. My goal is to count how many points (205,779) fall within each of a series of polygons (21,562). Python or R is preferable, but GIS software such as ArcGIS or QGIS is fine too.
Here are the solutions I have found and written.
using ArcGIS: one example is at http://support.esri.com/cn/knowledgebase/techarticles/detail/30779 -> I did not try it, but in my previous experience a spatial join always takes a large amount of time.
using GDAL, OGR: Here is an example: http://geoexamples.blogspot.tw/2012/06/density-maps-using-gdalogr-python.html -> It takes 5 to 9 seconds for every polygon.
using Shapely prepared geometry operations with a loop: Here is my example, which takes 2.7 to 3.0 seconds for every polygon. (Note that points is a list of Point objects.)
prep_poly=[]
for i in polygons:
    mycount=[]
    for j in points:
        if prep(i).contains(j):
            mycount.append(1) #count how many points within polygons
    prep_poly.append(sum(mycount)) #sum once for every polygon
    mycount=[]
using Shapely prepared geometry operations with a filter: Here is my example, and it takes about 3.3 to 3.9 seconds for every polygon.(Note that points is a MultiPoint object)
prep_poly=[]
for i in polygons:
    prep_poly.append(len(filter(prep(i).contains, point1)))
Though prepared geometry operations did improve the performances, it is still time-consuming to process lots of polygons. Any suggestion? Thanks!
Rather than looking through every pixel on the screen for every rectangle, you can do the following (Python code):
# first_pixel: any pixel known to lie inside the polygon (placeholder)
px_list = [first_pixel]  # pixels left to check
visited = {first_pixel}  # pixels already queued, so none is counted twice
count = 0
while len(px_list) > 0:  # pixels left in pixel list
    curr_pixel = px_list.pop(0)
    count += 1
    for pixel in get_adjacent_pixels(curr_pixel):  # find adjacent pixels
        # ie (vertical, horizontal, diagonal)
        if pixel in shape and pixel not in visited:
            visited.add(pixel)
            px_list.append(pixel)  # add pixel to list
Essentially, the same way that path finding works. Check this wiki article for a visual representation of the above algorithm:
http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm#Algorithm
If you have no easy way to find starting points you could loop through all of the points once, checking for each point whether it is contained by a shape, and then storing that point together with the shape in a separate list and deleting it from the original shapes-for-which-we-have-no-point-yet list.
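A runnable version of the pixel-walking idea above on a toy raster follows. The set-of-pixels representation of the shape and both helper names are assumptions made purely for illustration.

```python
from collections import deque

def get_adjacent_pixels(pixel):
    """8-connected neighbours (vertical, horizontal, diagonal) of an (x, y) pixel."""
    x, y = pixel
    return [(x + dx, y + dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]

def count_pixels(shape, first_pixel):
    """Count the pixels of `shape` (a set of (x, y) tuples) reachable from first_pixel."""
    queue = deque([first_pixel])
    visited = {first_pixel}  # prevents re-queuing a pixel that was already seen
    count = 0
    while queue:
        curr_pixel = queue.popleft()
        count += 1
        for pixel in get_adjacent_pixels(curr_pixel):
            if pixel in shape and pixel not in visited:
                visited.add(pixel)
                queue.append(pixel)
    return count

# A 3x2 rectangle of pixels.
rectangle = {(x, y) for x in range(3) for y in range(2)}
print(count_pixels(rectangle, (0, 0)))
```

The `visited` set is what keeps the walk finite. Without it, neighbouring pixels would keep adding each other back to the queue.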

Find the intersection between two geographical data points

I have two lat/lon pairs (expressed in decimal degrees), each with a radius (expressed in meters). What I am trying to find out is whether the two resulting circles intersect (it is obvious that they don't in this case, but the plan is to apply the algorithm to many other data points). To check this I am using Shapely's intersects() function. My question is how I should deal with the different units. Should I apply some sort of transformation or projection first, so that lat/lon and radius use the same units?
48.180759,11.518950,19.0
47.180759,10.518950,10.0
EDIT:
I found this library (https://pypi.python.org/pypi/utm), which seems helpful. However, I am not 100% sure whether I am applying it correctly. Any ideas?
X = utm.from_latlon(38.636782, 21.414384)
A = geometry.Point(X[0], X[1]).buffer(30.777)
Y = utm.from_latlon(38.636800, 21.414488)
B = geometry.Point(Y[0], Y[1]).buffer(23.417)
A.intersects(B)
SOLUTION:
So, I finally managed to solve my problem. Here are two different implementations that both solve the same problem:
X = from_latlon(48.180759, 11.518950)
Y = from_latlon(47.180759, 10.518950)
print(latlonbuffer(48.180759, 11.518950, 19.0).intersects(latlonbuffer(47.180759, 10.518950, 19.0)))
print(latlonbuffer(48.180759, 11.518950, 100000.0).intersects(latlonbuffer(47.180759, 10.518950, 100000.0)))
X = from_latlon(48.180759, 11.518950)
Y = from_latlon(47.180759, 10.518950)
print(geometry.Point(X[0], X[1]).buffer(19.0).intersects(geometry.Point(Y[0], Y[1]).buffer(19.0)))
print(geometry.Point(X[0], X[1]).buffer(100000.0).intersects(geometry.Point(Y[0], Y[1]).buffer(100000.0)))
Shapely only uses the Cartesian coordinate system, so in order to make sense of metric distances, you would need to either:
project the coordinates into a local projection system that uses distance units in metres, such as a UTM zone.
buffer a point from (0,0), and use a dynamic azimuthal equidistant projection centered on the lat/lon point to project to geographic coords.
Here's how to do #2, using shapely.ops.transform and pyproj
import pyproj
from shapely.geometry import Point
from shapely.ops import transform
from functools import partial
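# Note: Proj(init=...) and pyproj.transform are deprecated in pyproj 2.x and later;
# newer code would use pyproj.Transformer instead.
WGS84 = pyproj.Proj(init='epsg:4326')
def latlonbuffer(lat, lon, radius_m):
    # azimuthal equidistant projection centered on the point, so the buffer is in metres
    proj4str = '+proj=aeqd +lat_0=%s +lon_0=%s +x_0=0 +y_0=0' % (lat, lon)
    AEQD = pyproj.Proj(proj4str)
    project = partial(pyproj.transform, AEQD, WGS84)
    return transform(project, Point(0, 0).buffer(radius_m))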
A = latlonbuffer(48.180759, 11.518950, 19.0)
B = latlonbuffer(47.180759, 10.518950, 10.0)
print(A.intersects(B)) # False
Your two buffered points don't intersect. But these do:
A = latlonbuffer(48.180759, 11.518950, 100000.0)
B = latlonbuffer(47.180759, 10.518950, 100000.0)
print(A.intersects(B)) # True
Plotting the lon/lat coords (which distorts the circles) shows the same result.
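As a cross-check that needs no projection library: two circles overlap exactly when the distance between their centres is at most the sum of their radii. The sketch below applies that test with the haversine great-circle distance, assuming a spherical Earth of mean radius 6371008.8 m. The helper names are made up and are not part of shapely or pyproj.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def circles_intersect(lat1, lon1, r1_m, lat2, lon2, r2_m):
    """Two circles overlap when their centres are no farther apart than r1 + r2."""
    return haversine_m(lat1, lon1, lat2, lon2) <= r1_m + r2_m

print(circles_intersect(48.180759, 11.518950, 19.0, 47.180759, 10.518950, 10.0))
print(circles_intersect(48.180759, 11.518950, 100000.0, 47.180759, 10.518950, 100000.0))
```

The two centres are roughly 134 km apart, so this agrees with the buffered-geometry results above: no overlap for radii of a few metres, overlap for 100 km radii.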

Conversion of miles to latitude and longitude degrees using geopy

Background
I want to add a model manager function that filters a queryset based on the proximity to coordinates. I found this blog posting with code that is doing precisely what I want.
Code
The snippet below seems to make use of geopy functions that have since been removed. It coarsely narrows down the queryset by limiting the range of latitude and longitude.
# Prune down the set of all locations to something we can quickly check precisely
rough_distance = geopy.distance.arc_degrees(arcminutes=geopy.distance.nm(miles=distance)) * 2
queryset = queryset.filter(
latitude__range=(latitude - rough_distance, latitude + rough_distance),
longitude__range=(longitude - rough_distance, longitude + rough_distance)
)
Problem
Since some of the used geopy functions have been removed/moved, I'm trying to rewrite this stanza. However, I do not understand the calculations---barely passed geometry and my research has confused me more than actually helped me.
Can anyone help? I would greatly appreciate it.
In case anybody else is looking at this now, since I tried to use geopy and just hit up against it, the modern equivalent of the rough_distance snippet above is:
import geopy
rough_distance = geopy.units.degrees(arcminutes=geopy.units.nautical(miles=1))
It looks like distance in miles is being converted to nautical miles, which are each equal to a minute of arc, which are 1/60th of an arc degree each. That value is then doubled, and then added and subtracted from a given latitude and longitude. These four values can be used to form a bounding box around the coordinates.
You can lookup any needed conversion factors on Wikipedia. There's also a relevant article there titled Horizontal position representation which discusses pros and cons of alternatives to longitude and latitude positioning which avoid some of their complexities. In other words, about the considerations involved with replacing latitude and longitude with another horizontal position representation in calculations.
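The unit chain described above (statute miles to nautical miles to arcminutes to degrees) can be written out as plain arithmetic, without geopy. This is only a sketch: `miles_to_degrees` is a made-up name, and treating one nautical mile as exactly one arcminute is the same approximation the blog snippet relies on.

```python
METERS_PER_MILE = 1609.344          # international statute mile
METERS_PER_NAUTICAL_MILE = 1852.0   # international nautical mile

def miles_to_degrees(miles):
    """Statute miles -> nautical miles -> arcminutes -> degrees of arc."""
    nautical_miles = miles * METERS_PER_MILE / METERS_PER_NAUTICAL_MILE
    arcminutes = nautical_miles      # one nautical mile is about one arcminute
    return arcminutes / 60.0

rough_distance = miles_to_degrees(5) * 2   # doubled, as in the blog snippet
print(rough_distance)
```

One mile comes out at about 0.0145 degrees, which holds along a meridian everywhere but along a parallel only at the equator.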
The Earth is not a sphere, only approximately so. If you need a more accurate calculation, use pyproj. Then you can calculate the location based a reference ellipsoid (e.g. WGS84).
martineau's answer is right on, in terms of what the snippet actually does, but it is important to note that 1 minute of arc represents very different distances depending on location. At the equator, the query covers the least axis aligned bounding box enclosing a circle of diameter distance, but off the equator, the bounding box does not completely contain that circle.
This code from the blog is sloppy:
def near(self, latitude=None, longitude=None, distance=None):
    if not (latitude and longitude and distance):
        return []
If latitude == 0 (equator) or longitude == 0 (Greenwich meridian), it returns immediately. Should be if latitude is None or longitude is None .......
@TokenMacGuy's answer is an improvement, but:
(a) The whole idea of the "bounding box" is to avoid an SQL or similar query calculating a distance to all points that otherwise satisfy the query. With appropriate indexes, the query will execute much faster. It does this at the cost of leaving the client to (1) calculate the coordinates of the bounding box (2) calculate and check the precise distance for each result returned by the query.
If step 2 is omitted, you get errors, even at the equator. For example "find all pizza shops in a 5-mile radius" means you get answers up to 7.07 miles (that's sqrt(5**2 + 5**2)) away in the corners of the box.
Note that the code that you show seems to be arbitrarily doubling the radius. This would mean you get points 14.1 miles away.
(b) As @TokenMacGuy said, away from the equator, it gets worse. The bounding box so calculated does not include all points that you are interested in -- unless of course you are overkilling by doubling the radius.
(c) If the circle of interest includes either the North or South Pole, the calculation is horribly inexact, and needs adjusting. If the circle of interest is crossed by the 180-degree meridian (i.e. the International Date Line without the zigzags), the results are a nonsense; you need to detect this case and apply a 2-part query (one part for each side of the meridian).
For solutions for problems (b) and (c), see this article.
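The linked article is not reproduced here, so the following is only a sketch of one common way to address problem (b): widen the longitude range according to latitude. It assumes a spherical Earth with a mean radius of 3958.8 miles, and it deliberately does not handle problem (c), i.e. circles that touch a pole or cross the 180-degree meridian. The function name is hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius in miles; spherical approximation (assumption)

def bounding_box(lat_deg, lon_deg, radius_miles):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees enclosing the circle.

    Valid only when the circle touches neither a pole nor the 180-degree meridian.
    """
    r = radius_miles / EARTH_RADIUS_MILES  # angular radius in radians
    lat = math.radians(lat_deg)
    dlat = math.degrees(r)
    # The longitude half-width grows with latitude because meridians converge.
    dlon = math.degrees(math.asin(math.sin(r) / math.cos(lat)))
    return lat_deg - dlat, lat_deg + dlat, lon_deg - dlon, lon_deg + dlon

print(bounding_box(0.0, 0.0, 5))
print(bounding_box(60.0, 0.0, 5))
```

At latitude 60 the longitude range is about twice as wide as at the equator for the same radius. The exact distance check on each returned row (step 2 above) is still required to discard the corners of the box.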
If the coordinates on the earth are known, you can use geopy to get a good estimate of the decimal degrees to miles (or any distance units) scale at that point:
from geopy import distance
SCALE_VAL = 0.1
lat_scale_point = (cur_lat + SCALE_VAL, cur_long)
long_scale_point = (cur_lat, cur_long + SCALE_VAL)
cur_point = (cur_lat, cur_long)
lat_point_miles = distance.distance(cur_point, lat_scale_point).miles
long_point_miles = distance.distance(cur_point, long_scale_point).miles
# Assumes that `radius_miles` is the range around the point you want to look for
lat_rough_distance = (radius_miles / lat_point_miles) * SCALE_VAL
long_rough_distance = (radius_miles / long_point_miles) * SCALE_VAL
Some caveats:
Special-case handling for the scale points is needed around the poles or the prime meridian
Depending on how large or small you want your radius to be, you could pick a more appropriate SCALE_VAL
