Find the intersection between two geographical data points - python

I have two pairs of lat/lon coordinates (expressed in decimal degrees), each with a radius (expressed in meters). What I am trying to find out is whether the circles around these two points intersect (it is obvious that they do not here, but the plan is to run this algorithm on many other data points). To check this I am using Shapely's intersects() function. My question, however, is how I should deal with the different units. Should I apply some sort of transformation / projection first, so that lat/lon and radius are in the same units?
48.180759,11.518950,19.0
47.180759,10.518950,10.0
EDIT:
I found this library (https://pypi.python.org/pypi/utm), which seems helpful. However, I am not 100% sure whether I am applying it correctly. Any ideas?
X = utm.from_latlon(38.636782, 21.414384)
A = geometry.Point(X[0], X[1]).buffer(30.777)
Y = utm.from_latlon(38.636800, 21.414488)
B = geometry.Point(Y[0], Y[1]).buffer(23.417)
A.intersects(B)
SOLUTION:
So, I finally managed to solve my problem. Here are two different implementations that both solve the same problem:
X = from_latlon(48.180759, 11.518950)
Y = from_latlon(47.180759, 10.518950)
print(latlonbuffer(48.180759, 11.518950, 19.0).intersects(latlonbuffer(47.180759, 10.518950, 19.0)))
print(latlonbuffer(48.180759, 11.518950, 100000.0).intersects(latlonbuffer(47.180759, 10.518950, 100000.0)))
X = from_latlon(48.180759, 11.518950)
Y = from_latlon(47.180759, 10.518950)
print(geometry.Point(X[0], X[1]).buffer(19.0).intersects(geometry.Point(Y[0], Y[1]).buffer(19.0)))
print(geometry.Point(X[0], X[1]).buffer(100000.0).intersects(geometry.Point(Y[0], Y[1]).buffer(100000.0)))

Shapely only uses the Cartesian coordinate system, so in order to make sense of metric distances, you would need to either:
1. project the coordinates into a local projection system that uses distance units in metres, such as a UTM zone, or
2. buffer a point from (0,0), and use a dynamic azimuthal equidistant projection centered on the lat/lon point to project to geographic coords.
Here's how to do #2, using shapely.ops.transform and pyproj:
import pyproj
from shapely.geometry import Point
from shapely.ops import transform
from functools import partial
WGS84 = pyproj.Proj(init='epsg:4326')
def latlonbuffer(lat, lon, radius_m):
    proj4str = '+proj=aeqd +lat_0=%s +lon_0=%s +x_0=0 +y_0=0' % (lat, lon)
    AEQD = pyproj.Proj(proj4str)
    project = partial(pyproj.transform, AEQD, WGS84)
    return transform(project, Point(0, 0).buffer(radius_m))
A = latlonbuffer(48.180759, 11.518950, 19.0)
B = latlonbuffer(47.180759, 10.518950, 10.0)
print(A.intersects(B)) # False
Your two buffered points don't intersect. But these do:
A = latlonbuffer(48.180759, 11.518950, 100000.0)
B = latlonbuffer(47.180759, 10.518950, 100000.0)
print(A.intersects(B)) # True
As shown by plotting the lon/lat coords (which distorts the circles):
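As a quick cross-check that avoids any projection, two circles on the ground overlap when the great-circle distance between their centres is no larger than the sum of their radii. The sketch below assumes a spherical Earth with a mean radius of 6,371,008.8 m, so it is only accurate to a fraction of a percent, and the helper names haversine_m and circles_intersect are made up for this illustration:

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius; spherical approximation (assumption)

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def circles_intersect(lat1, lon1, r1_m, lat2, lon2, r2_m):
    """True if the two circles (radii in metres) touch or overlap."""
    return haversine_m(lat1, lon1, lat2, lon2) <= r1_m + r2_m

d = haversine_m(48.180759, 11.518950, 47.180759, 10.518950)
print(round(d / 1000), "km between the centres")
print(circles_intersect(48.180759, 11.518950, 19.0, 47.180759, 10.518950, 10.0))
print(circles_intersect(48.180759, 11.518950, 100000.0, 47.180759, 10.518950, 100000.0))
```

The centres are roughly 134 km apart, which is why the 19 m and 10 m circles cannot touch while the two 100 km circles do.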

Related

What is the correct way to reproject a raster from a CRS to another using Python?

I have a raster of Land Cover data (specifically this one /eodata/auxdata/S2GLC/2017/S2GLC_T32TMS_2017 in https://finder.creodias.eu) that uses 'epsg:32632' as CRS. I want to reproject this raster on 'epsg:21781'. This is what the raster looks like when I open it with xarray.
fn = 'data/S2GLC_T32TMS_2017/S2GLC_T32TMS_2017.tif'
da = xr.open_rasterio(fn).sel(band=1, drop=True)
da
<xarray.DataArray (y: 10980, x: 10980)>
[120560400 values with dtype=uint8]
Coordinates:
* y (y) float64 5.2e+06 5.2e+06 5.2e+06 ... 5.09e+06 5.09e+06 5.09e+06
* x (x) float64 4e+05 4e+05 4e+05 ... 5.097e+05 5.097e+05 5.098e+05
Attributes:
transform: (10.0, 0.0, 399960.0, 0.0, -10.0, 5200020.0)
crs: +init=epsg:32632
res: (10.0, 10.0)
is_tiled: 0
nodatavals: (nan,)
scales: (1.0,)
offsets: (0.0,)
AREA_OR_POINT: Area
INTERLEAVE: BAND
My usual workflow was to transform all the point coordinates, create my destination grid and interpolate using nearest neighbors. Something that looks like this:
import numpy as np
import xarray as xr
import pyproj
from scipy.interpolate import griddata
y = da.y.values
x = da.x.values
xx, yy = np.meshgrid(x,y)
# (n,2) point coordinates in the original CRS
src_coords = np.column_stack([xx.flatten(), yy.flatten()])
transformer = pyproj.transformer.Transformer.from_crs('epsg:32632', 'epsg:21781')
xx, yy = transformer.transform(src_coords[:,0], src_coords[:,1])
# (n,2) point coordinates in the destination CRS, which are not on a regular grid
dst_coords = np.column_stack([xx.flatten(), yy.flatten()])
# I define my destination **regular** grid coordinates
x = np.linspace(620005,719995,10)
y = np.linspace(199995,100005,10)
xx, yy = np.meshgrid(x,y)
dst_shape = xx.shape  # (rows, cols) of the destination grid
dst_grid = np.column_stack([xx.flatten(), yy.flatten()])
# I interpolate from the reprojected (irregular) points onto the regular grid
reprojected_array = griddata(
    dst_coords, da.values.flatten(), dst_grid, method='nearest'
).reshape(dst_shape)
Although this method is fairly transparent and (apparently) error-free, it can take very long when dealing with billions of points. Recently, I discovered rasterio's reproject function, and I was blown away by how fast it is. This is how I implemented it:
from rasterio.warp import reproject, Resampling
source = da.values
destination = np.zeros(dst_shape, np.int16)
res, aff = reproject(
    source,
    destination,
    src_transform=src_transform, # affine transformation from original data
    src_crs=src_crs,
    dst_transform=dst_transform, # affine transformation that corresponds to the grid defined in the other approach
    dst_crs=dst_crs,
    resampling=Resampling.nearest) # using nearest neighbors just like with scipy's griddata
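One detail worth checking when comparing the two approaches is the pixel convention: the x/y coordinates that xarray exposes are cell centres, whereas an affine transform such as the `transform` attribute above refers to the outer corner of the upper-left cell. Whether this explains any difference here is only a guess, but a half-cell offset in dst_transform would shift the result by 5 m. The sketch below (the function name affine_from_centres is hypothetical) derives the six affine coefficients from regularly spaced cell centres, in the same (a, b, c, d, e, f) order as the attribute printed above:

```python
def affine_from_centres(x_centres, y_centres):
    """Return (a, b, c, d, e, f) for a north-up grid given its cell-centre coordinates."""
    dx = x_centres[1] - x_centres[0]   # positive: x increases to the east
    dy = y_centres[1] - y_centres[0]   # negative: y decreases going down the rows
    # Shift from the centre of the first cell to its outer (upper-left) corner.
    c = x_centres[0] - dx / 2
    f = y_centres[0] - dy / 2
    return (dx, 0.0, c, 0.0, dy, f)

# First cell centres of the source raster, assuming 10 m cells as in the `res` attribute
x_centres = [399965.0, 399975.0, 399985.0]
y_centres = [5200015.0, 5200005.0, 5199995.0]
print(affine_from_centres(x_centres, y_centres))
```

For these centres the result equals the `transform` attribute of the source raster, (10.0, 0.0, 399960.0, 0.0, -10.0, 5200020.0). Applying the same function to the destination linspace coordinates gives a dst_transform that describes the same grid as the griddata approach.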
Naturally I wanted to compare the results expecting them to be the same, but they were not, as you can see in the figure.
The resolution is 10 meters so the differences are not large, but after careful comparison with precise satellite data in the 'epsg:21781' coordinates, it looks like the old approach yields better results.
So my questions are:
Why do these results differ?
Is one approach better than the other? Are there specific conditions where one should prefer one or the other?
griddata finds the nearest points by Euclidean distance, in whatever map projection you give it. Thus the nearest neighbors from a pipeline like
  4326 data points --> reproject --> nearest-Euclidean griddata
  query points
depend on the "reproject" step. Could you try +proj=sinu +lon_0=<middle lon> for both data and query points?
What one really wants is a nearest-neighbor engine with great-circle distance, not Euclidean distance. The difference may be insignificant for small grids, or near the equator, but less so in Finland: cos 61° / cos 60° is ~ 97 %.
TL;DR
Is pyproj.transformer.Transformer.from_crs('epsg:32632', 'epsg:21781') "correct"? I don't know. I see no test suite, and a couple of issues:
warp.reproject() generates the wrong result
roundtrip test
"Nearest neighbor" is ill-defined / sensitive halfway between data points, e.g. along the lines x or y = int + 0.5 on an int grid. This is easy to test with KDTree.
xarray makes regular (Cartesian) grids easy, but afaik does not do
curvilinear (2d) grids.
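The remark about nearest neighbors halfway between data points can be illustrated with scipy's KDTree. This is only a sketch on a made-up 5 x 5 integer grid: it shows that a query exactly on a line x = int + 0.5 has two equally distant neighbors, and that a shift of one billionth of a cell decides which one is returned.

```python
import numpy as np
from scipy.spatial import KDTree

# Integer grid of data points, 0..4 in both directions (illustrative data)
gx, gy = np.meshgrid(np.arange(5), np.arange(5))
data = np.column_stack([gx.ravel(), gy.ravel()]).astype(float)
tree = KDTree(data)

# A query exactly halfway between the data points (2, 3) and (3, 3)
dist, idx = tree.query([2.5, 3.0], k=2)
print(dist)  # the two nearest neighbours are equally far away

# A tiny shift to either side decides which neighbour "wins"
eps = 1e-9
_, left = tree.query([2.5 - eps, 3.0])
_, right = tree.query([2.5 + eps, 3.0])
print(data[left], data[right])
```

Two reprojection pipelines that differ only by rounding error can therefore pick different source cells for such points, even when both are implemented correctly.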

geopandas not recognizing point in polygon

I have two data frames. One has polygons of buildings (around 70K) and the other has points that may or may not be inside the polygons (around 100K). I need to identify whether each point is inside a polygon or not.
When I plot both dataframes (example below), the plot shows that some points are inside the polygons and others are not. However, when I use .within(), the outcome says none of the points are inside polygons.
I recreated the example creating one polygon and one point "by hand" rather than importing the data and in this case .within() does recognize that the point is in the polygon. Therefore, I assume I'm making a mistake but I don't know where.
Example: (I'll just post the part that corresponds to one point and one polygon for simplicity. In this case, each data frame contains either a single point or a single polygon)
1) Using the imported data. The data frame dmR has the points and the data frame dmf has the polygon
import pandas as pd
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
from shapely import wkt
from shapely.geometry import Point, Polygon
plt.style.use("seaborn")
# I'm skipping the data manipulation stage and
# going to the point where the data are used.
print(dmR)
geometry
35 POINT (-95.75207 29.76047)
print(dmf)
geometry
41964 POLYGON ((-95.75233 29.76061, -95.75194 29.760...
# Plot
fig, ax = plt.subplots(figsize=(5,5))
minx, miny, maxx, maxy = ([-95.7525, 29.7603, -95.7515, 29.761])
ax.set_xlim(minx, maxx)
ax.set_ylim(miny, maxy)
dmR.plot(ax=ax, c='Red')
dmf.plot(ax=ax, alpha=0.5)
plt.savefig('imported_data.png')
The outcome
shows that the point is inside the polygon. However,
print(dmR.within(dmf))
35 False
41964 False
dtype: bool
2) If I try to recreate this by hand, it would be as follows (there may be a better way to do this but I couldn't figure it out):
# Get the vertices of the polygon to create it by hand
poly1 = dmf['geometry']
g = [i for i in poly1]
x,y = g[0].exterior.coords.xy
x,y
(array('d', [-95.752332508564, -95.75193554162979, -95.75193151831627, -95.75232848525047, -95.752332508564]),
array('d', [29.760606530637265, 29.760607694859385, 29.76044470363038, 29.76044237518235, 29.760606530637265]))
# Create the polygon by hand using the corresponding vertices
coords = [(-95.752332508564, 29.760606530637265),
(-95.75193554162979, 29.760607694859385),
(-95.75193151831627, 29.7604447036303),
(-95.75232848525047, 29.76044237518235),
(-95.752332508564, 29.760606530637265)]
poly = Polygon(coords)
# Create point by hand (just copy the point from 1) above
p1 = Point(-95.75207, 29.76047)
# Create the GeoPandas data frames from the point and polygon
ex = gpd.GeoDataFrame()
ex['geometry']=[poly]
ex = ex.set_geometry('geometry')
ex_p = gpd.GeoDataFrame()
ex_p['geometry'] = [p1]
ex_p = ex_p.set_geometry('geometry')
# Plot and print
fig, ax = plt.subplots(figsize=(5,5))
ax.set_xlim(minx, maxx)
ax.set_ylim(miny, maxy)
ex_p.plot(ax=ax, c='Red')
ex.plot(ax = ax, alpha=0.5)
plt.savefig('by_hand.png')
In this case, the outcome also shows the point in the polygon, and this time .within() agrees:
ex_p.within(ex)
0 True
dtype: bool
which recognizes that the point is in the polygon. All suggestions on what to do are appreciated! Thanks.
I don't know if this is the most efficient way to do it but I was able to do what I needed within Python and using Geopandas.
Instead of using the point.within(polygon) approach, I did a spatial join (geopandas.sjoin(df_1, df_2, how = 'inner', op = 'contains')). This results in a new data frame that contains the points that are within polygons and excludes the ones that are not. (In newer GeoPandas releases the op keyword is named predicate.) More information on how to do this can be found here.
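A likely reason why dmR.within(dmf) returns False everywhere is that GeoPandas predicates such as within() compare two GeoSeries or GeoDataFrames row by row after aligning them on their index, rather than testing every point against every polygon. The point has index 35 and the polygon index 41964, so no pair lines up, whereas the hand-made frames both have index 0. The sketch below rebuilds that situation with the geometries from the question (the variable names pts and polys are illustrative):

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

poly = Polygon([(-95.752332508564, 29.760606530637265),
                (-95.75193554162979, 29.760607694859385),
                (-95.75193151831627, 29.76044470363038),
                (-95.75232848525047, 29.76044237518235)])
pts = gpd.GeoSeries([Point(-95.75207, 29.76047)], index=[35])
polys = gpd.GeoSeries([poly], index=[41964])

# Row-wise comparison on mismatched indexes: nothing is paired, so all False
# (GeoPandas may also emit a warning about the indexes not matching)
print(pts.within(polys))

# Comparing against a single shapely geometry tests every point against it
print(pts.within(polys.iloc[0]))

# Alternatively, make the indexes line up before comparing row by row
print(pts.reset_index(drop=True).within(polys.reset_index(drop=True)))
```

For many points and many polygons, the spatial join described above is the more practical route, since it pairs geometries by location instead of by index.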
I assume something is fishy about your coordinate reference system (CRS). I cannot tell for dmR, as it is not provided, but ex_p is a naive geometry because you generated it from points without specifying the CRS. You can check the CRS using:
dmR.crs
Let's assume it's in EPSG:4326; then it will return:
<Geographic 2D CRS: EPSG:4326>
Name: WGS 84
Axis Info [ellipsoidal]:
- Lat[north]: Geodetic latitude (degree)
- Lon[east]: Geodetic longitude (degree)
Area of Use:
- name: World
- bounds: (-180.0, -90.0, 180.0, 90.0)
Datum: World Geodetic System 1984
- Ellipsoid: WGS 84
- Prime Meridian: Greenwich
In this case you would need to set a CRS for ex_p first using:
ex_p = ex_p.set_crs(epsg=4326)
If you want to inherit the CRS of dmR dynamically you can also use:
ex_p = ex_p.set_crs(dmR.crs)
After you set a crs, you can re-project from one crs to another using:
ex_p = ex_p.to_crs(epsg=3395)
More on that topic:
https://geopandas.org/projections.html

Intersect two shapely polygons on the Earth projection

As far as I know, Shapely uses only a Cartesian coordinate system. I have two points on the Earth with lat and lon coordinates. I need to create a buffer with a 1 km radius around each of these two points and find the polygon where these buffers intersect.
But the construction
buffer = Point(54.4353,65.87343).buffer(0.001) creates a simple circle, which becomes an ellipse when projected onto the Earth, whereas I need two real circles with a 1 km radius.
I think I need to convert my buffers into a new projection and then intersect them, but I don't know how to do it correctly.
You need to do what you say. For that, you will need to use a library that handles projections (pyproj is the choice here). There is a similar question in Geodesic buffering in python
import pyproj
from shapely.geometry import MultiPolygon, Polygon, Point
from shapely.ops import transform as sh_transform
from functools import partial
wgs84_globe = pyproj.Proj(proj='latlong', ellps='WGS84')
def point_buff_on_globe(lat, lon, radius):
    # First, build the Azimuthal Equidistant Projection centered on the
    # point given by the WGS84 lat, lon coordinates
    aeqd = pyproj.Proj(proj='aeqd', ellps='WGS84', datum='WGS84',
                       lat_0=lat, lon_0=lon)
    # Then transform the coordinates of that point into that projection
    project_coords = pyproj.transform(wgs84_globe, aeqd, lon, lat)
    # Build a shapely point with those coordinates and buffer it in the aeqd projection
    aeqd_buffer = Point(project_coords).buffer(radius)
    # Transform each coordinate of the aeqd buffer back to WGS84.
    # Notice the clever use of sh_transform with a partial functor; this is
    # something that I learned here on SO. A plain iteration over the coordinates
    # will do the job too.
    projected_pol = sh_transform(partial(pyproj.transform, aeqd, wgs84_globe),
                                 aeqd_buffer)
    return projected_pol
The function point_buff_on_globe will give you a polygon in lat/lon coordinates that is the result of buffering the given point in the Azimuthal Equidistant Projection centered on that point (the best you can do with your requirements). Two observations:
I don't remember the units of the radius argument. I think it is in meters, so if you need a 10 km buffer, you will need to pass 10e3. But please check it!
Beware of using this with a radius that is too wide or with points that are too far away from each other. Projections work well when the points are near the point on which you are centering the projection.
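The module-level pyproj.transform function used above is deprecated in newer pyproj releases in favour of the Transformer class. A minimal sketch of the same idea with that API follows, assuming a pyproj version that provides Transformer with always_xy, radii in metres, and that the question's Point(54.4353, 65.87343) means lat 54.4353, lon 65.87343. The function name geodesic_point_buffer and the second point are made up for illustration. It also computes the overlap polygon the question asks for:

```python
from pyproj import CRS, Transformer
from shapely.geometry import Point
from shapely.ops import transform

WGS84 = CRS.from_epsg(4326)

def geodesic_point_buffer(lat, lon, radius_m):
    """Circle of radius_m metres around (lat, lon), returned as a lon/lat polygon."""
    # Azimuthal equidistant projection centred on the point; its units are metres
    aeqd = CRS.from_proj4(
        f"+proj=aeqd +lat_0={lat} +lon_0={lon} +x_0=0 +y_0=0 +datum=WGS84 +units=m"
    )
    # always_xy=True keeps the (x, y) = (lon, lat) axis order on the WGS84 side
    to_wgs84 = Transformer.from_crs(aeqd, WGS84, always_xy=True).transform
    # The point itself sits at the projection origin (0, 0)
    return transform(to_wgs84, Point(0, 0).buffer(radius_m))

a = geodesic_point_buffer(54.4353, 65.87343, 1000.0)
b = geodesic_point_buffer(54.4353, 65.88343, 1000.0)  # hypothetical point about 650 m to the east
overlap = a.intersection(b)
print(overlap.is_empty)
```

Because each buffer is built in its own projection and only then converted to lon/lat, the intersection is computed on the lon/lat polygons, which is adequate when the two points are close together.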

Shortest path between many 2D points (travelling salesman within Shapely LineString?)

I was trying to create river cross-section profiles based on terrestrial point measurements. When trying to create a Shapely LineString from a Series of points with a common id, I realized that the order of the given points really matters, as the LineString just connects the points 'indexwise' (in the order in which they appear in the list). The code below illustrates the default behaviour:
from shapely.geometry import Point, LineString
import geopandas as gpd
import numpy as np
import matplotlib.pyplot as plt
# Generate random points
x=np.random.randint(0,100,10)
y=np.random.randint(0,50,10)
data = zip(x,y)
# Create Point and default LineString GeoSeries
gdf_point = gpd.GeoSeries([Point(j,k) for j,k in data])
gdf_line = gpd.GeoSeries(LineString(zip(x,y)))
# plot the points and "default" LineString
ax = gdf_line.plot(color='red')
gdf_point.plot(marker='*', color='green', markersize=5,ax=ax)
That would produce the image:
Question: Is there any built-in method within Shapely that would automatically create the most logical (a.k.a.: the shortest, the least complicated, the least criss-cross,...) line through the given list of random 2D points?
Below you can find the desired line (green) compared to the default (red).
Here is what solved my cross-section LineString simplification problem. However, my solution doesn't correctly address the computationally more complex task of finding the truly shortest path through the given points. As the commenters suggested, there are many libraries and scripts available to solve that particular problem, but in case anyone wants to keep it simple, you can use what did the trick for me. Feel free to use and comment!
def simplify_LineString(linestring):
    '''
    Function reorders LineString vertices so that each vertex is followed by the nearest remaining vertex.
    Caution: This doesn't calculate the shortest possible path (travelling salesman problem!). This function performs badly
    on very random points since it doesn't see the bigger picture.
    It is tested only with positive Cartesian coordinates. Feel free to upgrade and share a better function!
    Input must be a Shapely LineString and the function returns a Shapely LineString.
    '''
    from shapely.geometry import Point, LineString
    import math
    if not isinstance(linestring, LineString):
        raise IOError("Argument must be a LineString object!")
    # create a point list
    points_list = list(linestring.coords)
    ####
    # DECIDE WHICH POINT TO START WITH - THE WESTMOST OR SOUTHMOST? (IT DEPENDS ON GENERAL DIRECTION OF ALL POINTS)
    ####
    points_we = sorted(points_list, key=lambda x: x[0])
    points_sn = sorted(points_list, key=lambda x: x[1])
    # calculate the azimuth of the general direction
    westmost_point = points_we[0]
    eastmost_point = points_we[-1]
    deltay = eastmost_point[1] - westmost_point[1]
    deltax = eastmost_point[0] - westmost_point[0]
    alfa = math.degrees(math.atan2(deltay, deltax))
    azimut = (90 - alfa) % 360
    if (azimut > 45 and azimut < 135):
        # general direction is west-east
        points_list = points_we
    else:
        # general direction is south-north
        points_list = points_sn
    ####
    # ITERATIVELY FIND THE NEAREST REMAINING VERTEX FOR EACH VERTEX
    ####
    # Create a new, ordered points list, starting with the westmost or southmost point.
    ordered_points_list = points_list[:1]
    for iteration in range(0, len(points_list[1:])):
        current_point = ordered_points_list[-1] # current point whose nearest neighbour we are looking for
        possible_candidates = [i for i in points_list if i not in ordered_points_list] # remaining (not yet sorted) points
        distance = float('inf')
        best_candidate = None
        for candidate in possible_candidates:
            current_distance = Point(current_point).distance(Point(candidate))
            if current_distance < distance:
                best_candidate = candidate
                distance = current_distance
        ordered_points_list.append(best_candidate)
    return LineString(ordered_points_list)
There is no built in function, but shapely has a distance function.
You could easily iterate over the points and calculate the shortest distance between them and construct the 'shortest' path.
There are some examples in the official GitHub repo.
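Such an iteration can be sketched without Shapely at all, as a greedy nearest-neighbour ordering. This is a heuristic, not a travelling salesman solver, and the function name and sample points are invented for this example. It assumes Python 3.8+ for math.dist:

```python
import math

def nearest_neighbour_order(points, start_index=0):
    """Greedily order points so that each one is followed by its nearest unvisited point."""
    remaining = list(points)
    ordered = [remaining.pop(start_index)]
    while remaining:
        last = ordered[-1]
        # Index of the unvisited point closest to the last point placed
        i = min(range(len(remaining)), key=lambda k: math.dist(last, remaining[k]))
        ordered.append(remaining.pop(i))
    return ordered

pts = [(0, 0), (5, 1), (1, 1), (3, 0), (2, 2)]
print(nearest_neighbour_order(pts))
```

The returned list can be passed straight to LineString(...). As with any greedy approach, the result depends on the starting point and can be far from the shortest path for scattered points.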
Google's OR-Tools offer a nice and efficient way for solving the Travelling Salesman Problem: https://developers.google.com/optimization/routing/tsp.
Following the tutorial on their website would give you a solution from this (based on your example code):
to this:

Why are Basemap south polar stereographic map projection coordinates not agreeing with those of data sets in the same projection?

Some satellite based earth observation products provide latitude/longitude information while others provide the X/Y coordinates within a given grid projection (and there are also some having both, see example).
My approach in the second case is to set up a Basemap map which has the same parameters (projection, ellipsoid, origin of map) as given by the data provider in a way that the given X/Y values equal the Basemap coordinates. However if I do so the geolocation does not agree with other data sets including the Basemap coastline.
I have experienced this with three different data sets from different trustworthy sources. For the minimal example I use Landsat data provided by the U.S. Geological Survey which includes both, X/Y coordinates of a South Polar Stereographic grid and the corresponding lat/lon coordinates for all four corners of the image.
From a Landsat metafile we get (ID: LC82171052016079LGN00):
CORNER_UL_LAT_PRODUCT = -66.61490 CORNER_UL_LON_PRODUCT = -61.31816
CORNER_UR_LAT_PRODUCT = -68.74325 CORNER_UR_LON_PRODUCT = -58.04533
CORNER_LL_LAT_PRODUCT = -67.68721 CORNER_LL_LON_PRODUCT = -67.01109
CORNER_LR_LAT_PRODUCT = -69.94052 CORNER_LR_LON_PRODUCT = -64.18581
CORNER_UL_PROJECTION_X_PRODUCT = -2259300.000
CORNER_UL_PROJECTION_Y_PRODUCT = 1236000.000
CORNER_UR_PROJECTION_X_PRODUCT = -1981500.000
CORNER_UR_PROJECTION_Y_PRODUCT = 1236000.000
CORNER_LL_PROJECTION_X_PRODUCT = -2259300.000
CORNER_LL_PROJECTION_Y_PRODUCT = 958500.000
CORNER_LR_PROJECTION_X_PRODUCT = -1981500.000
CORNER_LR_PROJECTION_Y_PRODUCT = 958500.000
...
GROUP = PROJECTION_PARAMETERS
MAP_PROJECTION = "PS"
DATUM = "WGS84"
ELLIPSOID = "WGS84"
VERTICAL_LON_FROM_POLE = 0.00000
TRUE_SCALE_LAT = -71.00000
FALSE_EASTING = 0
FALSE_NORTHING = 0
GRID_CELL_SIZE_PANCHROMATIC = 15.00
GRID_CELL_SIZE_REFLECTIVE = 30.00
GRID_CELL_SIZE_THERMAL = 30.00
ORIENTATION = "NORTH_UP"
RESAMPLING_OPTION = "CUBIC_CONVOLUTION"
END_GROUP = PROJECTION_PARAMETERS
By using Basemap with the right map projection we should be able to derive the corner lat/lon values from the X/Y values:
import numpy as np
from mpl_toolkits.basemap import Basemap
m=Basemap(resolution='h',projection='spstere', ellps='WGS84', boundinglat=-60,lon_0=180, lat_ts=-71)
x_crn=np.array([-2259300,-1981500,-2259300,-1981500])# upper left, upper right, lower left, lower right
y_crn=np.array([1236000, 1236000, 958500, 958500])# upper left, upper right, lower left, lower right
x0, y0= m(0, -90)
#Basemap coordinates at the south pole
#note that (0,0) of the Basemap is in a corner of the map,
#while other data sets use the south pole.
#This is easy to take into account:
lon_crn, lat_crn = m(x0-x_crn, y0-y_crn, inverse=True)
print('lon_crn: '+str(lon_crn))
print('lat_crn: '+str(lat_crn))
Which returns:
lon_crn: [-61.31816102 -58.04532791 -67.01108782 -64.1858106 ]
lat_crn: [-67.23548626 -69.3099076 -68.28071626 -70.47651326]
As you can see, the longitudes agree to the given precision with those from the metafile, but the latitudes are too low.
I can approximate the latitudes by:
lat_crn=(lat_crn+90.)*1.0275-90.
But this is really not satisfying.
This is how the image is located if using the X/Y corner coordinates from the metafile (in red the Basemap drawcoastlines()):
and this is how it looks like using the corner lat/lon:
In this case I can simply use the lat/lon coordinates, but as mentioned before there are datasets (like this) which are provided with X/Y coordinates only, which makes it very important to be able to rely on the Basemap projection. I know that there are other modules to re-project the data as a potential workaround, but it should work without other modules, and a re-projection could introduce errors itself.
As this problem appears with different data sets I like to believe that it is a bug in the Basemap module, but I might also make the same mistake again and again or have wrong expectations.
I did some experimentation and it seems like changing lat_ts has no effect with projection='spstere'. In fact, it seems as if the projection latitude is implicitly assumed to be lat_ts=-90. regardless of what value you assign.
I had more success using projection='stere' instead, so that you would construct the Basemap in your example as follows:
m=Basemap(width=5400000., height=5400000., projection='stere',
ellps='WGS84', lon_0=180., lat_0=-90., lat_ts=-71.)
You may prefer to set the latitude and longitude of the corners instead of the width and height of the plot for your application.
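To cross-check which parameters reproduce the metafile, the corner coordinates can also be converted independently of Basemap. The sketch below uses pyproj, which is an extra dependency and therefore an assumption about the environment, with a polar stereographic definition built from the PROJECTION_PARAMETERS group of the metafile: true-scale latitude -71, central meridian 0, WGS84.

```python
from pyproj import Transformer

# Polar stereographic definition taken from the metafile parameters
ps = "+proj=stere +lat_0=-90 +lat_ts=-71 +lon_0=0 +x_0=0 +y_0=0 +datum=WGS84 +units=m"
to_lonlat = Transformer.from_crs(ps, "EPSG:4326", always_xy=True)

# Upper-left corner from the metafile (CORNER_UL_PROJECTION_X/Y_PRODUCT)
lon, lat = to_lonlat.transform(-2259300.0, 1236000.0)
print(lon, lat)  # to be compared with CORNER_UL_LON/LAT_PRODUCT = -61.31816 / -66.61490
```

If the latitude only matches when lat_ts=-71 is honoured, that supports the observation above that 'spstere' ignores the lat_ts argument.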
