Geopandas: difference() method between polygon and points - python

I'm testing geopandas to do something quite simple: use the difference method to delete the points of a GeoDataFrame that lie inside a circle.
Here's the beginning of my script:
%matplotlib inline
# previous line is because I used ipynb
import pandas as pd
import geopandas as gp
from shapely.geometry import Point
[...]
points_df = gp.GeoDataFrame(csv_file, crs=None, geometry=geometry)
Here's the first rows of points_df :
Name Adress geometry
0 place1 street1 POINT (6.182674 48.694416)
1 place2 street2 POINT (6.177306 48.689889)
2 place3 street3 POINT (6.18 48.69600000000001)
3 place4 street4 POINT (6.1819 48.6938)
4 place5 street5 POINT (6.175694 48.690833)
Then I add a point whose buffer (a circle) will contain several points of the first GeoDataFrame:
base = points_df.plot(marker='o', color='red', markersize=5)
center_coord = [Point(6.18, 48.689900)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
center.plot(ax=base, color = 'blue',markersize=5)
circle = center.buffer(0.015)
circle.plot(ax=base, color = 'green')
Here's the result displayed by the iPython notebook :
Now, the goal is to delete the red points inside the green circle. I thought the difference method would be enough for that. But when I write:
selection = points_df['geometry'].difference(circle)
selection.plot(color = 'green', markersize=5)
The result is that... nothing changed with points_df :
I guess that the difference() method works only with polygon GeoDataFrames and that mixing points and polygons is not possible. But maybe I missed something!
Would a function that tests whether a point is inside the circle be better than the difference method in this case?

I guess that the difference() method works only with polygons
GeoDataFrames and the mix between points and polygons is not possible.
That seems to be the issue: you can't use the overlay with points.
Also, for this kind of spatial operation a simple spatial join seems to be the easiest solution.
Starting with the last example ;):
%matplotlib inline
import pandas as pd
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point
# Create Fake Data
df = pd.DataFrame(np.random.randint(10,20,size=(35, 3)), columns=['Longitude','Latitude','data'])
# create Geometry series with lat / longitude
geometry = [Point(xy) for xy in zip(df.Longitude, df.Latitude)]
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# Create GeoDataFrame
points = gp.GeoDataFrame(df, crs=None, geometry=geometry)
# Create Matplotlib figure
fig, ax = plt.subplots()
# Set Axes to equal (otherwise plot looks weird)
ax.set_aspect('equal')
# Plot GeoDataFrame on Axis ax
points.plot(ax=ax,marker='o', color='red', markersize=5)
# Create new point
center_coord = [Point(15, 13)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
# Plot new point
center.plot(ax=ax,color = 'blue',markersize=5)
# Buffer point and plot it
circle = gp.GeoDataFrame(crs=None, geometry=center.buffer(2.5))
circle.plot(color = 'white',ax=ax)
That leaves the problem of how to determine whether a point is inside or outside the polygon. One way is to join all points that fall inside the polygon, then build a DataFrame from the difference between all points and the points within the circle:
# Calculate the points inside the circle
pointsinside = gp.sjoin(points,circle,how="inner")
# Now the points outside the circle is just the difference
# between points and points inside (see the ~)
pointsoutside = points[~points.index.isin(pointsinside.index)]
# Create a nice plot
fig, ax = plt.subplots()
ax.set_aspect('equal')
circle.plot(color = 'white',ax=ax)
center.plot(ax=ax,color = 'blue',markersize=5)
pointsinside.plot(ax=ax,marker='o', color='green', markersize=5)
pointsoutside.plot(ax=ax,marker='o', color='yellow', markersize=5)
print('Total points:' ,len(points))
print('Points inside circle:' ,len(pointsinside))
print('Points outside circle:' ,len(pointsoutside))
Total points: 35
Points inside circle: 10
Points outside circle: 25
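As an alternative to the spatial join, here is a minimal sketch using a boolean mask. As far as I understand geopandas, binary operations between two GeoSeries (difference() included) are applied row by row on matching index labels, so a one-row circle is only compared with row 0, which may be why nothing seemed to change in the question. Passing a single shapely geometry compares it with every row instead. The sample points and names below are made up for illustration.

```python
import geopandas as gp
from shapely.geometry import Point

# Hypothetical sample data: two points near the centre, one far away
points = gp.GeoDataFrame(
    {"name": ["a", "b", "c"]},
    geometry=[Point(15, 13), Point(16, 14), Point(19, 19)],
)

# A single shapely polygon (not a GeoSeries), so it is tested against every row
circle_poly = Point(15, 13).buffer(2.5)

# Boolean Series: True where the point lies inside the circle
inside_mask = points.geometry.within(circle_poly)

points_inside = points[inside_mask]
points_outside = points[~inside_mask]

print("Points inside circle:", len(points_inside))
print("Points outside circle:", len(points_outside))
```

This keeps the original index and columns, so points_outside can be plotted or saved directly.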

Related

Python Overlay Different Data into Single Map

I am trying to overlay two sets of latitude and longitude points so that the first set is plotted in one color and the second set in a different color on the same map. I have tried to share the same axis (ax), but it keeps plotting the points on 2 maps instead of 1 single map with both sets of colored points. My code looks like this:
from sys import exit
from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame as gdf
from shapely.geometry import Point, LineString
import pandas as pd
import matplotlib.pyplot as plt
dfp = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90p.csv", delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfp['Longitude'], dfp['Latitude'])]
gdf = gpd.GeoDataFrame(dfp, geometry=geometry)
#this is a simple map that goes with geopandas
fig, ax = plt.subplots()
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
#world = world[(world.name=="Spain")]
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='red', markersize=15);
dfn = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90n.csv",
delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfn['Longitude'], dfn['Latitude'])]
gdf = gpd.GeoDataFrame(dfn, geometry=geometry)
#this is a simple map that goes with geopandas
#world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='yellow',
markersize=15);
My first plot looks like the second plot below but with red points in USA and Spain:
My second plot looks like this:
Thank you in helping me overlay these two different sets of points and colors into one map.
In your case, you want to plot 3 geodataframes (world, gdf1, and gdf2) on single axes. Then, after you create fig/axes, you must reuse the same axes (say, ax1) for each plot. Here is the summary of important steps:
Create figure/axes
fig, ax1 = plt.subplots(figsize=(5, 3.5))
Plot base map
world.plot(ax=ax1)
Plot a layer
gdf1.plot(ax=ax1)
Plot more layers
gdf2.plot(ax=ax1)
Hope this helps.
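Put together, the steps above can be sketched as follows. The base polygon and the two point layers are invented stand-ins for world, gdf1 and gdf2; the key point is that a single axes object (ax1) is created once and passed to every plot call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-ins for the base map and the two point layers
base = gpd.GeoDataFrame(geometry=[box(-10, 35, 5, 45)])
gdf1 = gpd.GeoDataFrame(geometry=[Point(-3.7, 40.4), Point(2.2, 41.4)])
gdf2 = gpd.GeoDataFrame(geometry=[Point(-5.9, 37.4)])

# Create the figure and axes once
fig, ax1 = plt.subplots(figsize=(5, 3.5))

# Every layer is drawn on the same axes
base.plot(ax=ax1, color="lightgrey")
gdf1.plot(ax=ax1, marker="o", color="red", markersize=15)
gdf2.plot(ax=ax1, marker="o", color="yellow", markersize=15)

print("Number of axes in the figure:", len(fig.axes))
```

Calling world.plot(...) a second time without ax=, as in the question, creates a new figure each time, which would explain the two separate maps.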

Display geographical points using geopandas

I want to display points on the map using a shape file as a map and a csv with coordinates. The code works but I don't understand how to show the figure map.
My questions are: how do I display the points? What is "WnvPresent"? How can I display just the map and the points, not split into negative and positive but as a whole?
Website from where i downloaded the shp file: https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/countries
Website from where the idea comes from: https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972
import pandas as pd
import matplotlib.pyplot as plt
import descartes
import geopandas as gpd
from shapely.geometry import Point, Polygon
%matplotlib inline
#read map data in form of .shp
street_map = gpd.read_file(r"C:\Users\stetc\Desktop\images/portofolio\ref-countries-2016-01m.shp")
#create the map
fig,ax = plt.subplots(figsize=(15,15))
street_map.plot(ax = ax)
#read given data
df = pd.read_csv(r"C:\Users\stetc\Documents\full_dataset.csv")
#the next step is to get the data in the right format. The way we do this is by turning our regular Pandas DataFrame into a geo-DataFrame, which will require us to specify as parameters the original DataFrame, our coordinate reference system (CRS), and the geometry of our new DataFrame. In order to format our geometry appropriately, we will need to convert the longitude and latitude into Points (we imported Point from shapely above), so first let’s read in the training data-set and specify the EPSG:4326 CRS like so
crs = {"init":"epsg:4326"}
#create points using longitude and lat from the data set
geometry = [Point(xy) for xy in zip (df["Longitude"], df["Latitude"])]
#Create a GeoDataFrame
geo_df =gpd.GeoDataFrame (df, #specify out data
crs=crs, # specify the coordinates reference system
geometry = geometry #specify the geometry list created
)
fig,ax = plt.subplots(figsize = (15,15))
street_map.plot (ax = ax, alpha = 0.4 , color="grey" )
geo_df[geo_df["WnvPresent"]==0].plot(ax=ax,markersize=20, color = "blue", marker="o",label="Neg")
geo_df[geo_df["WnvPresent"]==1].plot(ax=ax,markersize=20, color = "red", marker="o",label="Pos")
plt.legend(prop={"size":15})
WnvPresent is just a column used in the example to plot two different colours (I would do it differently, but that is for another discussion), you can ignore that if your goal is to plot points only.
Try the code below. I have also added zorder to ensure that points are on top of the street_map.
fig, ax = plt.subplots(figsize=(15,15))
street_map.plot(ax=ax, alpha=0.4, color="grey", zorder=1)
geo_df.plot(ax=ax, markersize=20, color="blue", marker="o", zorder=2)
In the first step you create the figure, then you add street_map to ax, and then geo_df to the same ax. The last line answers your question "how to display the points?". Keep in mind that both layers have to be in the same CRS (assuming epsg 4326 from your code), otherwise the layers won't overlap.
A bit more on plotting is in geopandas docs - https://geopandas.readthedocs.io/en/latest/mapping.html and on CRS here https://geopandas.readthedocs.io/en/latest/projections.html.
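To illustrate the CRS remark, here is a small sketch that builds the point layer from longitude/latitude columns and reprojects it when the base layer uses a different CRS. The coordinates and the box standing in for street_map are made up; gpd.points_from_xy is available in recent geopandas versions.

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import box

# Hypothetical data with the same column names as in the question
df = pd.DataFrame({"Longitude": [2.35, 13.40], "Latitude": [48.85, 52.52]})

# Build the points and declare that the coordinates are lon/lat (EPSG:4326)
geo_df = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["Longitude"], df["Latitude"]),
    crs="EPSG:4326",
)

# Stand-in for street_map, deliberately stored in another CRS (Web Mercator)
street_map = gpd.GeoDataFrame(
    geometry=[box(-10, 35, 30, 60)], crs="EPSG:4326"
).to_crs(epsg=3857)

# Reproject the points only if the two layers disagree
if geo_df.crs != street_map.crs:
    geo_df = geo_df.to_crs(street_map.crs)

print(geo_df.crs == street_map.crs)
```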

Return length of border segment between geographic areas in geopandas

I am new to using geopandas so I have a fairly basic question. I want to identify how much border contact happens between neighboring places in a geo-dataframe.
I will provide an example. The following code reads in a pre-loaded geoframe, randomly creates countries marked as "Treated", defines a function that gives their neighboring countries, and then graphs the result with the countries that border having a slightly lighter shade.
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
path = gp.datasets.get_path('naturalearth_lowres')
earth = gp.read_file(path)
africa = earth[earth.continent=='Africa']
africa['some_places'] = np.random.randint(0,2,size=africa.shape[0])*2
# Define and apply a function that determines which countries touch which others
def touches(x):
    result = 0
    if x in africa.loc[africa.some_places==2,'geometry']:
        result = 2
    else:
        for y in africa.loc[africa.some_places==2,'geometry']:
            if y.touches(x) :
                result = 1
                break
            else:
                continue
    return result
africa['touch'] = africa.geometry.apply(touches)
# Plot the main places which are 2, the ones that touch which are 1, and the non-touching 0
fig, ax = plt.subplots()
africa.plot(column='touch', cmap='Blues', linewidth=0.5, ax=ax, edgecolor='.2')
ax.axis('off')
plt.show()
For me this gave the following map:
Now the problem is that actually I don't want to indiscriminately shade all areas light blue. I -- ideally -- want to determine the length of border along treated countries and then have a sliding scale of how affected you are based on how much border you share with one or more treated countries.
At the very least, I want to be able to throw away places that only share like 1 or 2 miles of border with another country (or maybe meet at a corner). Any advice or solutions welcome!
I think you need an example that demonstrates the proper geospatial operations to get the result. My code below shows how to intersect the North and South America continents, get the line of intersection, and compute its length. Finally, it plots the line on the map. I hope it is useful and adaptable to your project.
import geopandas
import numpy as np
import matplotlib.pyplot as plt
# make use of the provided world dataset
world = geopandas.read_file(geopandas.datasets.get_path('naturalearth_lowres'))
# drop some areas (they crash my program)
world = world[(world.name != "Antarctica") & (world.name != "Fr. S. Antarctic Lands")]
# create a geodataframe `wg`, taking a few columns of data
wg = world[['continent', 'geometry', 'pop_est']]
# create a geodataframe `continents`, as a result of dissolving countries into continents
continents = wg.dissolve(by='continent')
# epsg:3857, Spherical Mercator
# reproject to `Spherical Mercator` projection
continents3857 = continents.to_crs(epsg='3857')
# get the geometry of the places of interest (dissolve() indexes rows by continent name)
namerica = continents3857.geometry.loc['North America']
samerica = continents3857.geometry.loc['South America']
# get intersection between N and S America continents
na_intersect_sa = namerica.intersection(samerica) # got multi-line
# show the length of the result (multi-line object)
blength = na_intersect_sa.length # unit is meters on Spherical Mercator
print("Length in meters:", "%d" % blength)
# The output:
# Length in meters: 225030
ax = continents3857.plot(column="pop_est", cmap="Accent", figsize=(8,5))
for ea in na_intersect_sa.__geo_interface__['coordinates']:
    #print(ea)
    ax.plot(np.array(ea)[:,0], np.array(ea)[:,1], linewidth=3, color="red")
ax.set_xlim(-9500000,-7900000)
ax.set_ylim(700000, 1400000)
xmin,xmax = ax.get_xlim()
ymin,ymax = ax.get_ylim()
rect = plt.Rectangle((xmin,ymin), xmax-xmin, ymax-ymin, facecolor="lightblue", zorder=-10)
ax.add_artist(rect)
plt.show() # sometimes not needed
The resulting plot:
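The same idea can be adapted to the question's "throw away short borders" requirement. Below is a minimal sketch with invented rectangles instead of countries: the intersection of two polygons that touch without overlapping is a line (or a point for a corner contact), and its length is the shared border. The names and the min_length threshold are assumptions for illustration. Lengths come out in the units of the coordinates, so real data should first be projected to a suitable CRS; note that Spherical Mercator stretches distances away from the equator.

```python
from shapely.geometry import box

# Hypothetical "treated" area and three neighbours
treated = box(0, 0, 2, 2)
neighbours = {
    "long_border": box(2, 0, 4, 2),     # shares the full edge x = 2 (length 2)
    "short_border": box(0, 2, 0.5, 3),  # shares 0.5 units of the top edge
    "corner_only": box(2, 2, 3, 3),     # meets the treated area at one corner
}

def shared_border_length(a, b):
    """Length of the contact line, assuming a and b touch but do not overlap."""
    return a.intersection(b).length

lengths = {
    name: shared_border_length(treated, geom)
    for name, geom in neighbours.items()
}

# Keep only neighbours whose shared border reaches the (assumed) threshold
min_length = 1.0
affected = [name for name, length in lengths.items() if length >= min_length]

print(lengths)
print(affected)
```

The per-neighbour lengths could also be used directly as the sliding scale mentioned in the question, instead of a hard cut-off.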

How to use set clipped path for Basemap polygon

I want to use imshow (for example) to display some data inside the boundaries of a country (for the purposes of this example I chose the USA). The simple example below illustrates what I want:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import RegularPolygon
data = np.arange(100).reshape(10, 10)
fig = plt.figure()
ax = fig.add_subplot(111)
im = ax.imshow(data)
poly = RegularPolygon([ 0.5, 0.5], 6, 0.4, fc='none',
ec='k', transform=ax.transAxes)
im.set_clip_path(poly)
ax.add_patch(poly)
ax.axis('off')
plt.show()
The result is:
Now I want to do this but instead of a simple polygon, I want to use the complex shape of the USA. I have created some example data contained in the array of "Z" as can be seen in the code below. It is this data that I want to display, using a colourmap but only within the boundaries of mainland USA.
So far I have tried the following. I get a shape file from here, contained in "nationp010g.shp.tar.gz", and I use the Basemap module in Python to plot the USA. Note that this is the only method I have found which gives me the ability to get a polygon of the area I need. If there are alternative methods I would also be interested in them. I then create a polygon called "mainpoly", which is almost the polygon I want, coloured in blue:
Notice how only one body has been coloured, all other disjoint polygons remain white:
So the area coloured blue is almost what I want, note that there are unwanted borderlines near canada because the border actually goes through some lakes, but that is a minor problem. The real problem is, why doesn't my imshow data display inside the USA? Comparing my first and second example codes I can't see why I don't get a clipped imshow in my second example, the way I do in the first. Any help would be appreciated in understanding what I am missing.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap as Basemap
from matplotlib.patches import Polygon
# Lambert Conformal map of lower 48 states.
m = Basemap(llcrnrlon=-119,llcrnrlat=22,urcrnrlon=-64,urcrnrlat=49,
projection='lcc',lat_1=33,lat_2=45,lon_0=-95)
shp_info = m.readshapefile('nationp010g/nationp010g', 'borders', drawbounds=True) # draw country boundaries.
for nshape,seg in enumerate(m.borders):
    if nshape == 1873: #This nshape denotes the large continental body of the USA, which we want
        mainseg = seg
mainpoly = Polygon(mainseg,facecolor='blue',edgecolor='k')
nx, ny = 10, 10
lons, lats = m.makegrid(nx, ny) # get lat/lons of ny by nx evenly space grid.
x, y = m(lons, lats) # compute map proj coordinates.
Z = np.zeros((nx,ny))
Z[:] = np.nan  # np.NAN was removed in NumPy 2.0
for i in np.arange(len(x)):
    for j in np.arange(len(y)):
        Z[i,j] = x[0,i]
ax = plt.gca()
im = ax.imshow(Z, cmap = plt.get_cmap('coolwarm') )
im.set_clip_path(mainpoly)
ax.add_patch(mainpoly)
plt.show()
Update
I realise that the line
ax.add_patch(mainpoly)
does not even add the polygon shape to the plot. Am I not using it correctly? As far as I know, mainpoly was calculated correctly using the Polygon() method. I checked that the coordinate inputs are sensible:
plt.plot(mainseg[:,0], mainseg[:,1] ,'.')
which gives
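A possible explanation, not confirmed in this thread: imshow without an extent places the image at pixel-index coordinates (0 to 10 here), while mainpoly is in map projection coordinates, so the two never overlap on the axes. The sketch below uses plain matplotlib and a made-up outline instead of Basemap to show clipping once the image and the polygon share the same data coordinates; the extent values and the outline are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Polygon

data = np.arange(100).reshape(10, 10)

fig, ax = plt.subplots()

# extent puts the image into the same data coordinates as the polygon
im = ax.imshow(data, extent=(0, 10, 0, 10), origin="lower")

# Hypothetical irregular outline, given in data coordinates
outline = np.array([[1, 1], [9, 2], [8, 8], [5, 9], [2, 7]])
poly = Polygon(outline, closed=True, fc="none", ec="k", transform=ax.transData)

# Clip the image to the polygon and draw the outline on top
im.set_clip_path(poly)
ax.add_patch(poly)
ax.axis("off")

fig.canvas.draw()
```

With Basemap, the equivalent would presumably be to draw the data through the map object (or pass the map's extent) so that Z is positioned in projection coordinates before clipping.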
I have also been thinking about this problem for a long time.
I found that the NCL language has a function to mask data outside a border.
Here is an example:
http://i5.tietuku.com/bdb1a6c007b82645.png
The contourf plot only shows values within the border of China. Click here for the code.
I know Python has a package called PyNCL which supports all NCL code in a Python framework.
But I really want to plot this kind of figure using basemap. If you have figured it out, please post it online; I would like to learn how as soon as possible.
Thanks!
Add 2016-01-16
In a way, I have figured it out.
This is my idea and code; it was inspired by this question I asked today.
My method:
1. Convert the shapefile of the area of interest (like the U.S.) into a shapely Polygon.
2. Test whether each value point lies inside or outside the polygon.
3. If the value point is outside the study area, mask it as np.nan.
Intro
* The polygon xxx is a city in China, stored in ESRI shapefile format.
* The fiona and shapely packages are used here.
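# generate the shapely polygon
import fiona
from shapely.geometry import Point, Polygon, shape
shp = fiona.open("xxx.shp")  # renamed so it does not shadow shapely's shape()
pol = next(iter(shp))  # first feature in the shapefile
geom = shape(pol['geometry'])
poly_data = pol["geometry"]["coordinates"][0]
poly = Polygon(poly_data)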
It shows like:
http://i4.tietuku.com/2012307faec02634.png
### test the value point
### generate the grid network which represented by the grid midpoints.
lon_med = np.linspace((xi[0:2].mean()),(xi[-2:].mean()),len(x_grid))
lat_med = np.linspace((yi[0:2].mean()),(yi[-2:].mean()),len(y_grid))
value_test_mean = dsu.mean(axis = 0)
value_mask = np.zeros(len(lon_med)*len(lat_med)).reshape(len(lat_med),len(lon_med))
for i in range(0,len(lat_med),1):
    for j in range(0,len(lon_med),1):
        points = np.array([lon_med[j],lat_med[i]])
        mask = np.array([poly.contains(Point(points[0], points[1]))])
        if mask == False:
            value_mask[i,j] = np.nan
        if mask == True:
            value_mask[i,j] = value_test_mean[i,j]
# Mask the np.nan values (value_mask is the array filled in the loop above)
Z_mask = np.ma.masked_where(np.isnan(value_mask),value_mask)
# plot!
fig=plt.figure(figsize=(6,4))
ax=plt.subplot()
map = Basemap(llcrnrlon=x_map1,llcrnrlat=y_map1,urcrnrlon=x_map2,urcrnrlat=y_map2)
map.drawparallels(np.arange(y_map1+0.1035,y_map2,0.2),labels= [1,0,0,1],size=14,linewidth=0,color= '#FFFFFF')
lon_grid = np.linspace(x_map1,x_map2,len(x_grid))
lat_grid = np.linspace(y_map1,y_map2,len(y_grid))
xx,yy = np.meshgrid(lon_grid,lat_grid)
pcol =plt.pcolor(xx,yy,Z_mask,cmap = plt.cm.Spectral_r ,alpha =0.75,zorder =2)
result
http://i4.tietuku.com/c6620c5b6730a5f0.png
http://i4.tietuku.com/a22ad484fee627b9.png
original result
http://i4.tietuku.com/011584fbc36222c9.png
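The snippets above depend on variables that are not shown (xi, yi, dsu, x_grid and so on), so here is a self-contained sketch of the same masking idea on a tiny invented grid. The square polygon and the values array are assumptions for illustration only.

```python
import numpy as np
from shapely.geometry import Point, Polygon

# Hypothetical study area: a square from (0, 0) to (4, 4)
poly = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])

# Grid midpoints at -1, 0, 1, ..., 5 in both directions
lon = np.linspace(-1, 5, 7)
lat = np.linspace(-1, 5, 7)
values = np.arange(49, dtype=float).reshape(7, 7)

# Start with everything set to NaN, then copy values for points inside
masked = np.full(values.shape, np.nan)
for i, y in enumerate(lat):
    for j, x in enumerate(lon):
        # contains() is False for points exactly on the boundary
        if poly.contains(Point(x, y)):
            masked[i, j] = values[i, j]

# Masked array that plotting functions such as pcolor will skip over
Z_mask = np.ma.masked_invalid(masked)

print("Unmasked cells:", Z_mask.count())
```

Z_mask can then be passed to plt.pcolor as in the plotting code above. For large grids the double loop is slow; a vectorised point-in-polygon test would be worth looking into.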

streamplot does not work with matplotlib basemap

I am trying to use streamplot function to plot wind field with basemap, projection "ortho". My test code is mainly based on this example:
Plotting wind vectors and wind barbs
Here is my code:
import numpy as np
import matplotlib.pyplot as plt
import datetime
from mpl_toolkits.basemap import Basemap, shiftgrid
from Scientific.IO.NetCDF import NetCDFFile as Dataset
# specify date to plot.
yyyy=1993; mm=03; dd=14; hh=00
date = datetime.datetime(yyyy,mm,dd,hh)
# set OpenDAP server URL.
URLbase="http://nomads.ncdc.noaa.gov/thredds/dodsC/modeldata/cmd_pgbh/"
URL=URLbase+"%04i/%04i%02i/%04i%02i%02i/pgbh00.gdas.%04i%02i%02i%02i.grb2" %\
(yyyy,yyyy,mm,yyyy,mm,dd,yyyy,mm,dd,hh)
data = Dataset(URL)
#data = netcdf.netcdf_file(URL)
# read lats,lons
# reverse latitudes so they go from south to north.
latitudes = data.variables['lat'][:][::-1]
longitudes = data.variables['lon'][:].tolist()
# get wind data
uin = data.variables['U-component_of_wind_height_above_ground'][:].squeeze()
vin = data.variables['V-component_of_wind_height_above_ground'][:].squeeze()
# add cyclic points manually (could use addcyclic function)
u = np.zeros((uin.shape[0],uin.shape[1]+1),np.float64)
u[:,0:-1] = uin[::-1]; u[:,-1] = uin[::-1,0]
v = np.zeros((vin.shape[0],vin.shape[1]+1),np.float64)
v[:,0:-1] = vin[::-1]; v[:,-1] = vin[::-1,0]
longitudes.append(360.); longitudes = np.array(longitudes)
# make 2-d grid of lons, lats
lons, lats = np.meshgrid(longitudes,latitudes)
# make orthographic basemap.
m = Basemap(resolution='c',projection='ortho',lat_0=60.,lon_0=-60.)
# create figure, add axes
fig1 = plt.figure(figsize=(8,10))
ax = fig1.add_axes([0.1,0.1,0.8,0.8])
# define parallels and meridians to draw.
parallels = np.arange(-80.,90,20.)
meridians = np.arange(0.,360.,20.)
# first, shift grid so it goes from -180 to 180 (instead of 0 to 360
# in longitude). Otherwise, interpolation is messed up.
ugrid,newlons = shiftgrid(180.,u,longitudes,start=False)
vgrid,newlons = shiftgrid(180.,v,longitudes,start=False)
# now plot.
lonn, latt = np.meshgrid(newlons, latitudes)
x, y = m(lonn, latt)
st = plt.streamplot(x, y, ugrid, vgrid, color='r', latlon='True')
# draw coastlines, parallels, meridians.
m.drawcoastlines(linewidth=1.5)
m.drawparallels(parallels)
m.drawmeridians(meridians)
# set plot title
ax.set_title('SLP and Wind Vectors '+str(date))
plt.show()
After running the code, I got a blank map with a red smear in the lower left corner (please see the figure). After zooming in on this smear, I can see the wind streams in a flat projection (not in the "ortho" projection). So I guess this is a problem with how the data is projected onto the map. I did try the transform_vector function, but it does not solve the problem. Can anybody tell me what I did wrong, please? Thank you.
A new map after updating code:
You are plotting lat/lon coordinates on a map with an orthographic projection. Normally you can fix this by changing your plotting command to:
m.streamplot(mapx, mapy, ugrid, vgrid, color='r', latlon=True)
But your coordinate arrays don't have the same dimensions, that needs to be fixed as well.
