Starting with a shapefile I obtained from https://s3.amazonaws.com/nyc-tlc/misc/taxi_zones.zip, I'd like to plot the borough of Manhattan, and have outlines for each taxi-zone.
I'd also like to rotate the map, but the code below rotates each taxi zone about its own center instead of rotating the whole borough at once.
import geopandas as gpd
from matplotlib import pyplot as plt
fname = "path_to_shapefile.shp"
df = gpd.read_file(fname)
df = df[df['borough'] == "Manhattan"]
glist = gpd.GeoSeries([g for g in df['geometry']])
glist = glist.rotate(90)
glist.plot()
[EDIT]
I have refined this further so that I can rotate the image programmatically. However, if I add a legend, it gets rotated too, which is not desirable. I am still looking for a better solution.
Note that there is also this Stack Overflow post (How can I rotate a matplotlib plot through 90 degrees?). However, the solutions there that rotate the plot rather than the image only work for 90-degree rotations.
import geopandas as gpd
from matplotlib import pyplot as plt
import numpy as np
from scipy import ndimage
from matplotlib import transforms
fname = "path_to_shapefile.shp"
df = gpd.read_file(fname)
df = df[df['borough'] == "Manhattan"]
df.plot()
plt.axis("off")
plt.savefig("test.png")
img = plt.imread('test.png')
rotated_img = ndimage.rotate(img, -65)
plt.imshow(rotated_img, cmap=plt.cm.gray)
plt.axis('off')
plt.show()
[EDIT2]
A simple modification to the answer given below by @PMende solved it.
df = gpd.read_file(fname)
df = df[df['borough'] == "Manhattan"]
glist = gpd.GeoSeries([g for g in df['geometry']])
glist = glist.rotate(-65, origin=(0,0))
glist.plot()
The key was rotating all of the objects around a single point, instead of around their individual origins.
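To illustrate why the shared origin matters, here is a minimal sketch in plain Python (no GeoPandas). The two unit squares and the helper names rotate_point, centroid and rotate_shape are hypothetical and only stand in for the taxi zones; the vertex mean is used as a simple stand-in for a real polygon centroid.

```python
import math

def rotate_point(x, y, angle_deg, origin):
    """Rotate (x, y) counter-clockwise by angle_deg around origin."""
    ox, oy = origin
    theta = math.radians(angle_deg)
    dx, dy = x - ox, y - oy
    return (ox + dx * math.cos(theta) - dy * math.sin(theta),
            oy + dx * math.sin(theta) + dy * math.cos(theta))

def centroid(points):
    """Mean of the vertices (a simple stand-in for a true polygon centroid)."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def rotate_shape(points, angle_deg, origin):
    """Rotate every vertex of a shape around the same origin."""
    return [rotate_point(x, y, angle_deg, origin) for x, y in points]

# Two hypothetical unit squares that share the edge x == 1.
left = [(0, 0), (1, 0), (1, 1), (0, 1)]
right = [(1, 0), (2, 0), (2, 1), (1, 1)]

# Each square rotated about its own centre: the squares stay where they
# are and merely spin in place, so the shared edge breaks apart.
own_left = rotate_shape(left, 45, centroid(left))
own_right = rotate_shape(right, 45, centroid(right))

# Both squares rotated about one common origin: the layout turns as a whole
# and the shared vertices still coincide.
shared_left = rotate_shape(left, 45, (0, 0))
shared_right = rotate_shape(right, 45, (0, 0))

print("own origins:  ", own_left[1], own_right[0])
print("shared origin:", shared_left[1], shared_right[0])
```

The vertex (1, 0) belongs to both squares. With a shared origin it ends up at the same place in both rotated shapes, while with per-shape origins the two copies drift apart, which is the effect seen in the first plot.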
[EDIT 3] If anyone is trying to do this and needs to save the rotated GeoSeries back to a dataframe (for instance, to color the geometry based on an additional column), you need to create a new one. Simply writing
df['geometry'] = glist
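does not work. I'm not sure why at the moment (possibly because glist has a fresh 0..n-1 index while the filtered df keeps its original row labels, so the assignment aligns on mismatched indices). However, the following code worked for me.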
new_dataframe = gpd.GeoDataFrame(glist)
new_dataframe = new_dataframe.rename(columns={0:'geometry'}).set_geometry('geometry')
new_dataframe.plot()
If I'm understanding GeoPandas' documentation correctly, you can specify the origin of the rotation of each of your geometries (which by default is the center of each geometry). To get your desired behavior, you can rotate each shape about the same origin.
For example:
import geopandas as gpd
from matplotlib import pyplot as plt
fname = "path_to_shapefile.shp"
df = gpd.read_file(fname)
df = df[df['borough'] == "Manhattan"]
center = df["geometry"].iloc[0].centroid  # centroid is a property, not a method
glist = gpd.GeoSeries([g for g in df['geometry']])
glist = glist.rotate(90, origin=center)
glist.plot()
I can't test this myself, but it should hopefully get you started in the right direction.
(Though I also agree with @martinfeleis' point about not necessarily wanting to rotate the geometry, but rather the plot.)
Related
I have a dataframe that contains thousands of points with geolocation (longitude, latitude) for Washington D.C. The following is a snippet of it:
import pandas as pd
df = pd.DataFrame({'lat': [ 38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820], 'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
If you plot the points on a map, you can see that some of them are clearly within buildings:
import folium
wash_map = folium.Map(location=[38.8977, -77.0365], zoom_start=10)
for index, location_info in df.iterrows():
    folium.CircleMarker(
        location=[location_info["lat"], location_info["lng"]], radius=5,
        fill=True, fill_color='red',).add_to(wash_map)
wash_map.save('example_stack.html')
import webbrowser
import os
webbrowser.open('file://'+os.path.realpath('example_stack.html'), new=2)
My goal is to exclude all the points that are within buildings. For that, I first download bounding boxes for the city buildings and then try to exclude points within those polygons as follows:
import osmnx as ox
#brew install spatialindex this solves problems in mac
%matplotlib inline
ox.config(log_console=True)
ox.__version__
tags = {"building": True}
gdf = ox.geometries.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf.shape
For computational simplicity I have requested the shapes of all buildings within a 1 km radius of the White House. In my own code I have tried larger radii to make sure all the buildings are included.
In order to exclude points within the polygons I developed the following function (which also fetches the shapes):
def buildings(df,center_point,dist):
    import osmnx as ox
    #brew install spatialindex this solves problems in mac
    %matplotlib inline
    ox.config(log_console=True)
    ox.__version__
    tags = {"building": True}
    gdf = ox.geometries.geometries_from_point(center_point, tags,dist)
    from shapely.geometry import Point,Polygon
    # Next step is to put our coordinates in the correct shapely format: remember to run the map function before
    #df['within_building']=[]
    for point in range(len(df)):
        if gdf.geometry.contains(Point(df.lat[point],df.lng[point])).all()==False:
            df['within_building']=False
        else :
            df['within_building']=True
buildings(df,[38.8977, -77.0365],1000)
df['within_building'].all()==False
The function always reports that the points are outside the building shapes, although you can clearly see on the map that some of them are inside. I don't know how to plot the shapes over my map, so I am not sure whether my polygons are correct, but judging by their coordinates they appear to be. Any ideas?
The example points you provided don't seem to fall within those buildings' footprints. I don't know what your points' coordinate reference system is, so I guessed EPSG:4326. But to answer your question, here's how you would exclude them, resulting in gdf_points_not_in_bldgs:
import geopandas as gpd
import matplotlib.pyplot as plt
import osmnx as ox
import pandas as pd
# the coordinates you provided
df = pd.DataFrame({'lat': [38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820],
'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
# create GeoDataFrame of point geometries
geom = gpd.points_from_xy(df['lng'], df['lat'])
gdf_points = gpd.GeoDataFrame(geometry=geom, crs='epsg:4326')
# get building footprints
tags = {"building": True}
gdf_bldgs = ox.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf_bldgs.shape
# get all points that are not within a building footprint
mask = gdf_points.within(gdf_bldgs.unary_union)
gdf_points_not_in_bldgs = gdf_points[~mask]
print(gdf_points_not_in_bldgs.shape) # (10, 1)
# plot buildings and points
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
plt.show()
# zoom in to see better
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
ax.set_xlim(-77.04, -77.03)
ax.set_ylim(38.89, 38.90)
plt.show()
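Independently of whether these particular points fall inside a footprint, two details in the question's loop would hide any match: shapely's Point takes (x, y), which means (lng, lat) rather than (lat, lng), and .all() asks whether every building contains the point instead of whether any building does. The sketch below shows both effects in plain Python with a hand-written ray-casting test; the rectangular footprint and the point_in_polygon helper are hypothetical and are not part of shapely or OSMnx.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside polygon, a list of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray going right from (x, y) cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical rectangular footprint, vertices in (lng, lat) = (x, y) order.
footprint = [(-77.0370, 38.8970), (-77.0360, 38.8970),
             (-77.0360, 38.8980), (-77.0370, 38.8980)]
footprints = [footprint]

lat, lng = 38.8975, -77.0365

# Correct order (x = lng, y = lat), and "inside ANY footprint".
in_building = any(point_in_polygon(lng, lat, fp) for fp in footprints)

# Swapped order: the point lands nowhere near the footprint.
swapped = any(point_in_polygon(lat, lng, fp) for fp in footprints)

print(in_building, swapped)
```

The within(...unary_union) call in the answer above handles both concerns at once, because points_from_xy receives lng as x and the union behaves like an "any building" test.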
The following code draws the cdf for datetime values:
import matplotlib.pyplot as plt
import matplotlib.dates as dates
import numpy as np; np.random.seed(42)
import pandas as pd
objDate = dates.num2date(np.random.normal(735700, 300, 700))
ser = pd.Series(objDate)
ax = ser.hist(cumulative=True, density=1, bins=500, histtype='step')
plt.show()
How can I remove the vertical line at the right-most end of the graph? The approach mentioned here doesn't work, as replacing the ser.hist(...) line with:
ax = ser.hist(cumulative=True, density=1, bins=sorted(objDate)+[np.inf], histtype='step')
gives
TypeError: can't compare datetime.datetime to float
The CDF is actually drawn as a polygon, which in matplotlib is defined by a path. A path is in turn defined by vertices (where to go) and codes (how to get there). The docs say that we should not directly alter these attributes, but we can make a new polygon derived from the old one that suits our needs.
poly = ax.findobj(plt.Polygon)[0]
vertices = poly.get_path().vertices
# Keep everything above y == 0. You can define this mask however
# you need, if you want to be more careful in your selection.
keep = vertices[:, 1] > 0
# Construct new polygon from these "good" vertices
new_poly = plt.Polygon(vertices[keep], closed=False, fill=False,
edgecolor=poly.get_edgecolor(),
linewidth=poly.get_linewidth())
poly.set_visible(False)
ax.add_artist(new_poly)
plt.draw()
You should arrive at something like the figure below:
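To see why the y > 0 mask removes the final drop, here is a small sketch in plain Python. It assumes the step outline is laid out roughly the way a cumulative 'step' histogram is drawn (start on the baseline, step up bin by bin, fall back to the baseline after the last bin); step_outline is a hypothetical helper, not a matplotlib function, and the counts are made up.

```python
from itertools import accumulate

def step_outline(counts, edges):
    """Vertices of a cumulative step outline that starts and ends at y == 0."""
    total = sum(counts)
    heights = [c / total for c in accumulate(counts)]
    verts = [(edges[0], 0.0)]                # start on the baseline
    for left, right, h in zip(edges[:-1], edges[1:], heights):
        verts.append((left, h))              # rise to this bin's height
        verts.append((right, h))             # run across the bin
    verts.append((edges[-1], 0.0))           # drop back to the baseline
    return verts

counts = [1, 3, 4, 2]        # made-up bin counts
edges = [0, 1, 2, 3, 4]      # bin edges

outline = step_outline(counts, edges)

# Same idea as `keep = vertices[:, 1] > 0` above: drop baseline vertices.
trimmed = [(x, y) for x, y in outline if y > 0]

print(outline[-1])   # last vertex sits on the baseline
print(trimmed[-1])   # last vertex now sits at the top of the CDF
```

Note that the mask also removes the very first baseline vertex, so the initial rise from zero disappears along with the final drop. If that matters, the mask can be narrowed to exclude only the last vertex.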
I'm trying to plot data around Antarctica while masking the continent. I'm using basemap, which has an option to easily mask continents using map.fillcontinents(), but the continent as defined by basemap includes the ice shelves, which I do not want to mask.
I tried using geopandas with code I found on the Internet. This works, except that the coastline produces an undesired line at what I assume is the beginning/end of the polygon for Antarctica:
import numpy as np
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
from matplotlib.collections import PatchCollection
import geopandas as gpd
import shapely
from descartes import PolygonPatch
lats = np.arange(-90,-59,1)
lons = np.arange(0,361,1)
X, Y = np.meshgrid(lons, lats)
data = np.random.rand(len(lats),len(lons))
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
fig=plt.figure(dpi=150)
ax = fig.add_subplot(111)
m = Basemap(projection='spstere',boundinglat=-60,lon_0=180,resolution='i',round=True)
xi, yi = m(X,Y)
cf = m.contourf(xi,yi,data)
patches = []
selection = world[world.name == 'Antarctica']
for poly in selection.geometry:
    if poly.geom_type == 'Polygon':
        mpoly = shapely.ops.transform(m, poly)
        patches.append(PolygonPatch(mpoly))
    elif poly.geom_type == 'MultiPolygon':
        for subpoly in poly.geoms:
            # transform each part, not the whole MultiPolygon again
            mpoly = shapely.ops.transform(m, subpoly)
            patches.append(PolygonPatch(mpoly))
    else:
        print(poly, 'blah')
ax.add_collection(PatchCollection(patches, match_original=True,color='w',edgecolor='k'))
The same line appears when I try other shapefiles, such as the land shapefile that is free to download from Natural Earth Data. So I edited this shapefile in QGIS to remove the border of Antarctica. The problem now is that I don't know how to mask everything inside the shapefile (and couldn't find out how to do it either). I also tried combining the previous code with geopandas by setting linewidth=0 and adding the shapefile I created on top. The problem is that the two are not exactly the same:
Any suggestion on how to mask using a shapefile, or with geopandas but without the line?
Edit: Using Thomas Khün's previous answer with my edited shapefile produces a well masked Antarctica/continents, but the coastline goes outside the round edges of the map:
I uploaded here the edited shapefile I used, but it's the Natural Earth Data 50m land shapefile without the line.
Here is an example of how to achieve what you want. I basically followed the Basemap example of how to deal with shapefiles and added a bit of shapely magic to restrict the outlines to the map boundaries. Note that I first tried to extract the map outline from ax.patches, but that somehow didn't work, so I defined a circle with a radius of boundinglat and transformed it using the Basemap coordinate transformation functionality.
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
from matplotlib.collections import PatchCollection
from matplotlib.patches import Polygon
import shapely
from shapely.geometry import Polygon as sPolygon
boundinglat = -40
lats = np.arange(-90,boundinglat+1,1)
lons = np.arange(0,361,1)
X, Y = np.meshgrid(lons, lats)
data = np.random.rand(len(lats),len(lons))
fig, ax = plt.subplots(nrows=1, ncols=1, dpi=150)
m = Basemap(
    ax = ax,
    projection='spstere',boundinglat=boundinglat,lon_0=180,
    resolution='i',round=True
)
xi, yi = m(X,Y)
cf = m.contourf(xi,yi,data)
#adjust the path to the shapefile here:
result = m.readshapefile(
    'shapefiles/AntarcticaWGS84_contorno', 'antarctica',
    zorder = 10, color = 'k', drawbounds = False)
#defining the outline of the map as shapely Polygon:
rim = [np.linspace(0,360,100),np.ones(100)*boundinglat,]
outline = sPolygon(np.asarray(m(rim[0],rim[1])).T)
#following Basemap tutorial for shapefiles
patches = []
for info, shape in zip(m.antarctica_info, m.antarctica):
    #instead of a matplotlib Polygon, create first a shapely Polygon
    poly = sPolygon(shape)
    #check if the Polygon, or parts of it are inside the map:
    if poly.intersects(outline):
        #if yes, cut and insert
        intersect = poly.intersection(outline)
        verts = np.array(intersect.exterior.coords.xy)
        #pass closed by keyword; newer matplotlib versions reject it positionally
        patches.append(Polygon(verts.T, closed=True))
ax.add_collection(PatchCollection(
    patches, facecolor= 'w', edgecolor='k', linewidths=1., zorder=2
))
plt.show()
The result looks like this:
Hope this helps.
For anyone still trying to figure out a simple way to mask a grid from a shapefile, here is a gallery example from the python package Antarctic-Plots which makes this simple.
from antarctic_plots import maps, fetch, utils
import pyogrio
# fetch a grid and shapefile
grid = fetch.bedmachine(layer='surface')
shape = fetch.groundingline()
# subset the grounding line from the coastline
gdf = pyogrio.read_dataframe(shape)
groundingline = gdf[gdf.Id_text == "Grounded ice or land"]
# plot the grid
fig = maps.plot_grd(grid)
# plot the shapefile
fig.plot(groundingline, pen='1p,red')
fig.show()
# mask the inside region
masked_inside = utils.mask_from_shp(
shapefile=groundingline, xr_grid=grid, masked=True)
masked_inside.plot()
# mask the outside region
masked_outside = utils.mask_from_shp(
shapefile=groundingline, xr_grid=grid, masked=True, invert=False)
masked_outside.plot()
I have a dataset coming from a shapefile (.shp extension) with coordinates. They should look something like this:
-70.62 -33.43
-70.59 -33.29
And so on. I have already developed a way to plot this data with pyplot, where each green dot represents a tree and each line a street, which looks like this:
[Figure: pyplot streets & trees]
However, I need to draw a grid over it and color its blocks depending on the number of trees in each section. That way, the blocks with more trees would be colored a stronger green, whereas the ones with fewer trees would be light green/yellow/red. Of course, these colors should be partially transparent, so the map isn't covered completely.
This is my code:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import cartopy.io.shapereader as shpreader
import shapely.geometry as sg
wgs84 = ccrs.Geodetic()
utm19s = ccrs.UTM(19, southern_hemisphere=True)
p_a = [-70.637, -33.449]
p_b = [-70.58, -33.415]
LL = utm19s.transform_point(p_a[0], p_a[1], wgs84)
UR = utm19s.transform_point(p_b[0], p_b[1], wgs84)
ax = plt.axes(projection=utm19s)
ax.set_extent([LL[0], UR[0], LL[1], UR[1]], crs=utm19s)
rds = shpreader.Reader('roadsUTM.shp')
trees = shpreader.Reader('treesUTM.shp')
rect = sg.box(LL[0], LL[1], UR[0], UR[1])  # box takes (minx, miny, maxx, maxy)
rds_sel = [r for r in rds.geometries() if r.intersects(rect)]
trees_sel = [t for t in trees.geometries() if t.intersects(rect)]
ax.add_geometries(rds_sel, utm19s, linestyle='solid', facecolor='none')
ax.scatter([t.x for t in trees_sel], [t.y for t in trees_sel], color = "green", edgecolor = "black", transform=utm19s)
plt.show()
TL;DR: A way to read the position data encoded in the shapefile as plain numbers would solve part of my problem. Thanks.
EDIT: So I discovered that the data was already given in the UTM19S format. Should have researched a little bit before asking.
However, I still need to plot said grid over the map.
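As a possible starting point for the grid, here is a minimal sketch of the counting step only, in plain Python. It assumes the tree positions are available as (x, y) pairs in metres (for instance [(t.x, t.y) for t in trees_sel] in the code above); the function name count_per_cell, the 100 m cell size and the sample points are all hypothetical.

```python
from collections import Counter

def count_per_cell(points, x_min, y_min, cell_size, n_cols, n_rows):
    """Count points per square grid cell; returns rows of counts, bottom row first."""
    counts = Counter()
    for x, y in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        # Ignore points that fall outside the grid.
        if 0 <= col < n_cols and 0 <= row < n_rows:
            counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]

# Made-up tree positions in a 300 m x 300 m area whose lower-left corner is (0, 0).
sample_trees = [(10, 10), (20, 30), (120, 40), (250, 260), (999, 999)]

grid = count_per_cell(sample_trees, x_min=0, y_min=0,
                      cell_size=100, n_cols=3, n_rows=3)

for row in reversed(grid):   # print the top row first, like a map
    print(row)
```

A grid like this could then be drawn over the map with a colormap and partial transparency, for example with matplotlib's pcolormesh and an alpha value below 1, using the same cell edges that were used for counting.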
I am trying to identify the indices of the masked pixels when using maskoceans so I can then call only the land pixels in a code I have that is currently going through the whole globe, even though I do not care about the ocean pixels. I was trying different methods to do so, and noticed that my plots were looking really weird. Eventually, I realized that something was getting mixed up in my lat/lon indices, even though I am not actually touching them! Here is the code:
import numpy as np
import netCDF4
from datetime import datetime, timedelta
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
import matplotlib.dates as mpldates
import heat_transfer_coeffs
from dew_interface import get_dew
from matplotlib.dates import date2num, num2date
import numpy as np
import netCDF4
import heat_transfer_coeffs as htc
from jug.task import TaskGenerator
import matplotlib.cm as cm
import mpl_toolkits
from mpl_toolkits import basemap
from mpl_toolkits.basemap import Basemap, maskoceans
np.seterr(all='raise')
# set global vars
ifile = netCDF4.Dataset('/Users/myfile.nc', 'r')
times = ifile.variables['time'][:].astype(np.float64) # hours since beginning of dataset
lats_1d = ifile.variables['latitude'][:] # 90..-90
lons_1d = ifile.variables['longitude'][:] # 0..360
lons_1d[lons_1d>180]-=360 #putting longitude into -180..180
lons, lats = np.meshgrid(lons_1d, lats_1d)
ntimes, nlats, nlons = ifile.variables['tm'].shape
ifile.close()
map1 = basemap.Basemap(resolution='c', projection='mill',llcrnrlat=-36 , urcrnrlat=10, llcrnrlon=5 , urcrnrlon=52)
#Mask the oceans
new_lon = maskoceans(lons,lats,lons,resolution='c', grid = 10)
new_lat = maskoceans(lons,lats,lats,resolution='c', grid = 10)
fig = plt.figure()
pc = map1.pcolormesh(lons, lats, new_lat, vmin=0, vmax=34, cmap=cm.RdYlBu, latlon=True)
plt.show()
for iii in range(new_lon.shape[1]):
    index = np.where(new_lon.mask[:,iii] == False)
    index2 = np.where(new_lon.mask[:,iii] == True)
    new_lon[index[0],iii] = 34
    new_lon[index2[0],iii] = 0
fig = plt.figure()
pc = map1.pcolormesh(lons, lats, new_lat, vmin=0, vmax=34, cmap=cm.RdYlBu, latlon=True)
plt.show()
The first figure I get shows the expected map of Africa with oceans masked and the land values corresponding to the latitude (until saturation of the colorbar at 34, but that value was just taken as an example)
However, the second figure, which should plot the exact same thing as the first one, comes out all messed up, even though the loop in between the first and second figure doesn't touch any of the parameters involved in plotting it:
If I comment out the loop in between figure 1 and 2, figure 2 looks just like figure 1. Any idea about what is going on here?
Short answer: your loop is modifying the variables lons and lats indirectly.
Explanation: the function maskoceans creates a masked array from the input array. The masked array and the input array share the same data, so lons and new_lon share the same data, and likewise lats and new_lat. This means that when you modify new_lon in your loop, you are also modifying lons. That is the source of your problem. The only difference is that new_lon and new_lat carry a mask that is used to select the valid data points.
Solution: Make a copy of the initial array before you call maskoceans. You can do that with:
import copy
lons1 = copy.copy(lons)
lats1 = copy.copy(lats)
Then you use lons1 and lats1 to call maskoceans.
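The sharing described above can be reproduced with NumPy alone. This is a small sketch that uses np.ma.masked_array directly as a stand-in for maskoceans, on the assumption that maskoceans wraps its input without copying, as the explanation says; the array values and the mask are made up.

```python
import numpy as np

base = np.arange(5.0)                          # plays the role of lons
mask = [False, True, False, False, True]       # made-up "ocean" mask

# Wrapping without a copy: the masked array is a view on base's data.
masked = np.ma.masked_array(base, mask=mask)
masked[0] = 99.0                               # modifies base as well

# Wrapping a copy: the original stays untouched.
untouched = np.arange(5.0)
safe = np.ma.masked_array(untouched.copy(), mask=mask)
safe[0] = 99.0

print(base)        # first element has changed
print(untouched)   # first element is still 0.0
```

Calling lons.copy() has the same effect as copy.copy(lons) here; either one gives maskoceans its own array to wrap.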