I want to display points on a map, using a shapefile as the map and a CSV file with the coordinates. The code runs, but I don't understand how to show the map figure.
My questions are: How do I display the points? What is "WnvPresent"? How can I display just the map and the points, not split into negative and positive, but as a whole?
Website where I downloaded the .shp file: https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/countries
Website where the idea comes from: https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972
import pandas as pd
import matplotlib.pyplot as plt
import descartes
import geopandas as gpd
from shapely.geometry import Point, Polygon
%matplotlib inline
#read map data in form of .shp
street_map = gpd.read_file(r"C:\Users\stetc\Desktop\images/portofolio\ref-countries-2016-01m.shp")
#create the map
fig,ax = plt.subplots(figsize=(15,15))
street_map.plot(ax = ax)
#read given data
df = pd.read_csv(r"C:\Users\stetc\Documents\full_dataset.csv")
#The next step is to get the data into the right format: turn the regular pandas DataFrame
#into a GeoDataFrame, passing the original DataFrame, the coordinate reference system (CRS),
#and the geometry of the new DataFrame. To build the geometry, convert the longitude and
#latitude into Points (Point was imported from shapely above). First, specify the EPSG:4326 CRS:
crs = "EPSG:4326"  # the {"init": "epsg:4326"} dict form is deprecated in recent pyproj/geopandas
#create points using longitude and latitude from the data set
geometry = [Point(xy) for xy in zip(df["Longitude"], df["Latitude"])]
#create a GeoDataFrame
geo_df = gpd.GeoDataFrame(df,  # specify our data
                          crs=crs,  # specify the coordinate reference system
                          geometry=geometry)  # specify the geometry list created
fig,ax = plt.subplots(figsize = (15,15))
street_map.plot (ax = ax, alpha = 0.4 , color="grey" )
geo_df[geo_df["WnvPresent"]==0].plot(ax=ax,markersize=20, color = "blue", marker="o",label="Neg")
geo_df[geo_df["WnvPresent"]==1].plot(ax=ax,markersize=20, color = "red", marker="o",label="Pos")
plt.legend(prop={"size":15})
WnvPresent is just a column used in the example to plot two different colours (I would do it differently, but that is for another discussion). You can ignore it if your goal is to plot points only.
Try the code below. I have also added zorder to ensure that points are on top of the street_map.
fig, ax = plt.subplots(figsize=(15,15))
street_map.plot(ax=ax, alpha=0.4, color="grey", zorder=1)
geo_df.plot(ax=ax, markersize=20, color="blue", marker="o", zorder=2)
In the first step you create the figure, then you add street_map to ax, and then geo_df to the same ax. The last line answers your question "how to display the points?". Keep in mind that both layers have to be in the same CRS (assuming EPSG:4326 from your code); otherwise the layers won't overlap.
A bit more on plotting is in geopandas docs - https://geopandas.readthedocs.io/en/latest/mapping.html and on CRS here https://geopandas.readthedocs.io/en/latest/projections.html.
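As a minimal, self-contained sketch of the steps above: the small DataFrame and the rectangular base_map below are made-up stand-ins for the CSV and the shapefile (assumptions, not the original data), and matplotlib.use("Agg") is only there so the snippet also runs outside a notebook. It builds the points with gpd.points_from_xy, checks that both layers share a CRS, and draws them on one axes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import geopandas as gpd
from shapely.geometry import box

# Hypothetical stand-in for the CSV: any table with Longitude/Latitude columns.
df = pd.DataFrame({
    "Longitude": [2.35, 13.40, -3.70],
    "Latitude": [48.86, 52.52, 40.42],
})

# Hypothetical stand-in for the shapefile: one rectangle roughly covering Europe.
base_map = gpd.GeoDataFrame(geometry=[box(-10, 35, 30, 60)], crs="EPSG:4326")

# Build point geometries straight from the coordinate columns.
points = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["Longitude"], df["Latitude"]),
    crs="EPSG:4326",
)

# Reproject the points if the two layers do not share a CRS.
if points.crs != base_map.crs:
    points = points.to_crs(base_map.crs)

# One figure, one axes, two layers; zorder keeps the points on top.
fig, ax = plt.subplots(figsize=(8, 8))
base_map.plot(ax=ax, alpha=0.4, color="grey", zorder=1)
points.plot(ax=ax, markersize=20, color="blue", marker="o", zorder=2)
```

With the real data, base_map would come from gpd.read_file(...) and df from pd.read_csv(...); the plotting lines stay the same.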
So, what I am having trouble with is how I am supposed to plot the data I have on top of a global map. I have an array of data, and two arrays of coordinates in latitude and longitude, where each datapoint was taken, but I am not sure of how to plot it on top of a global map. Creating the map itself is not too difficult, I just use:
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
fig = plt.figure(figsize=(10, 8))
m = Basemap(projection='cyl', resolution='c',
llcrnrlat=-90, urcrnrlat=90,
llcrnrlon=-180, urcrnrlon=180, )
m.shadedrelief(scale=0.5)
m.drawcoastlines(color='black')
But the next step is where I am having problems. I have tried doing both a colormesh plot and scatter plot, but they haven't worked so far. How should I go about it so that the data is plotted in the correct coordinate locations for the global map?
Thanks a lot for any help!
Maybe a bit late, but here is a piece of code I used to draw multiple line plots over a map in Basemap; it worked for me.
map = Basemap(projection='cyl', resolution='c',
llcrnrlat=mins[1], urcrnrlat=maxs[1],
llcrnrlon=mins[0], urcrnrlon=50, )
plt.figure(figsize=(15, 15))
for i in range(1259):
    filepath = filename[i]
    data = pd.read_csv(filepath, index_col=0)
    map.plot(data.x,data.y,'k-', alpha=0.1) ### Calling the plot in a loop!!
map.drawcoastlines(linewidth=1)
map.drawcountries(linewidth=0.5, linestyle='solid', color='k' )
plt.show()
The loop reads data from files in different folders, and I just use the map.plot command to plot each one. Doing it this way, you can plot all the data on the same map.
I am trying to overlay two sets of latitude and longitude points so that the first set is plotted in one color and the second set in a different color, both on the same map. I have tried to share the same axis (ax), but it keeps plotting the points on 2 maps instead of 1 single map with both sets of colored points. My code looks like this:
from sys import exit
from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame as gdf
from shapely.geometry import Point, LineString
import pandas as pd
import matplotlib.pyplot as plt
dfp = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90p.csv", delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfp['Longitude'], dfp['Latitude'])]
gdf = gpd.GeoDataFrame(dfp, geometry=geometry)
#this is a simple map that goes with geopandas
fig, ax = plt.subplots()
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
#world = world[(world.name=="Spain")]
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='red', markersize=15);
dfn = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90n.csv",
delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfn['Longitude'], dfn['Latitude'])]
gdf = gpd.GeoDataFrame(dfn, geometry=geometry)
#this is a simple map that goes with geopandas
#world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='yellow',
markersize=15);
My first plot looks like the second plot below but with red points in USA and Spain:
My second plot looks like this:
Thank you for helping me overlay these two different sets of points and colors on one map.
In your case, you want to plot 3 geodataframes (world, gdf1, and gdf2) on single axes. Then, after you create fig/axes, you must reuse the same axes (say, ax1) for each plot. Here is the summary of important steps:
Create figure/axes
fig, ax1 = plt.subplots(figsize=(5, 3.5))
Plot base map
world.plot(ax=ax1)
Plot a layer
gdf1.plot(ax=ax1)
Plot more layers
gdf2.plot(ax=ax1)
Hope this helps.
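Put together, the steps above can be sketched as follows. The rectangle used as world and the four points are hypothetical stand-ins (the naturalearth_lowres dataset from the question is not bundled with recent geopandas releases, so it is not used here), and the Agg backend is only set so the snippet runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import geopandas as gpd
from shapely.geometry import Point, box

# Hypothetical stand-ins for the base map and the two CSV point sets.
world = gpd.GeoDataFrame(geometry=[box(-180, -90, 180, 90)])
gdf1 = gpd.GeoDataFrame(geometry=[Point(-100, 40), Point(-3.7, 40.4)])
gdf2 = gpd.GeoDataFrame(geometry=[Point(10, 50), Point(139.7, 35.7)])

# Create the figure/axes once ...
fig, ax1 = plt.subplots(figsize=(10, 6))

# ... and pass the SAME axes to every plot call.
world.plot(ax=ax1, color="lightgrey", edgecolor="black")
returned_1 = gdf1.plot(ax=ax1, marker="o", color="red", markersize=15)
returned_2 = gdf2.plot(ax=ax1, marker="o", color="yellow", markersize=15)
```

The key difference from the code in the question is that world.plot(...) is called only once; calling it again inside ax=world.plot(...) creates a second figure each time.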
I have a shapefile of the United states, and I have an m x n array of Cartesian data that represents temperature at each pixel. I am able to load in the shapefile and plot it:
import shapefile as shp
import matplotlib.pyplot as plt
sf = shp.Reader("/path/to/USA.shp")
plt.figure()
for shape in sf.shapeRecords():
    for i in range(len(shape.shape.parts)):
        i_start = shape.shape.parts[i]
        if i==len(shape.shape.parts)-1:
            i_end = len(shape.shape.points)
        else:
            i_end = shape.shape.parts[i+1]
        x = [i[0] for i in shape.shape.points[i_start:i_end]]
        y = [i[1] for i in shape.shape.points[i_start:i_end]]
        plt.plot(x,y, color = 'black')
plt.show()
And I am able to read in my data and plot it:
import pickle
from matplotlib import pyplot as mp
Tfile = '/path/to/file.pkl'
with open(Tfile, 'rb') as f:  # binary mode is required for pickle in Python 3
    reshapeT = pickle.load(f)
mp.matshow(reshapeT)
The problem is that reshapeT has dimensions of 536 x 592 and covers only a subdomain of the US. However, I do have the latitude/longitude of the top-left corner of the reshapeT grid, as well as the spacing between pixels (0.01).
My question is: how do I overlay the reshapeT data on top of the shapefile domain?
If I understand you correctly, you would like to overlay a 536x592 numpy array over a specific part of a plotted shapefile. I would suggest you use Matplotlib's imshow() method with the extent parameter, which allows you to place the image within the plot.
Your way of plotting the shapefile is fine, however, if you have the possibility to use geopandas, it will dramatically simplify things. Plotting the shapefile will reduce to the following lines:
import geopandas as gpd
sf = gpd.read_file("/path/to/USA.shp")
ax1 = sf.plot(edgecolor='black', facecolor='none')
As you have done previously, let's load the array data now:
import pickle
Tfile = '/path/to/file.pkl'
with open(Tfile, 'rb') as f:  # binary mode is required for pickle in Python 3
    reshapeT = pickle.load(f)
Now, in order to plot the numpy array in the correct position, we first need to calculate its extent (the area it will cover, expressed in coordinates). You mentioned that you have information about the top-left corner and the resolution (0.01); that's all we need. In the following I'm assuming that the lat/lon of the top-left corner is saved in the top_left_lat and top_left_lon variables. The extent needs to be passed as a tuple with a value for each of the edges (in the order left, right, bottom, top).
Hence, our extent can be calculated as follows:
extent_mat = (top_left_lon, top_left_lon + reshapeT.shape[1] * 0.01, top_left_lat - reshapeT.shape[0] * 0.01, top_left_lat)
Finally, we plot the matrix onto the same axes object, ax1, on which we already plotted the shape file to the calculated extent:
# Let's turn off autoscale first. This prevents
# the view of the plot to be limited to the image
# dimensions (instead of the entire shapefile). If you prefer
# that behaviour, just remove the following line
ax1.autoscale(False)
# Finally, let's plot!
ax1.imshow(reshapeT, extent=extent_mat)
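The extent arithmetic can also be wrapped in a small helper, sketched here in plain Python. The function name grid_extent and the corner coordinates (-100.0, 45.0) are hypothetical, chosen only for illustration. The sketch assumes the top-left coordinate marks the outer edge of the top-left pixel; if it marks the pixel centre instead, every edge would need shifting by half the spacing.

```python
def grid_extent(top_left_lon, top_left_lat, n_rows, n_cols, spacing):
    """Return (left, right, bottom, top) for a regular grid.

    Columns run eastwards from the top-left corner, rows run southwards,
    so longitude grows with the column count and latitude shrinks with
    the row count.
    """
    left = top_left_lon
    right = top_left_lon + n_cols * spacing
    top = top_left_lat
    bottom = top_left_lat - n_rows * spacing
    return (left, right, bottom, top)


# Hypothetical corner coordinates; the grid shape and spacing are the
# 536 x 592 array and 0.01 step described in the question.
extent_mat = grid_extent(-100.0, 45.0, n_rows=536, n_cols=592, spacing=0.01)
print(extent_mat)
```

The resulting tuple can be passed directly as ax1.imshow(reshapeT, extent=extent_mat), with n_rows and n_cols taken from reshapeT.shape[0] and reshapeT.shape[1].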
I'm testing geopandas to do something quite simple: use the difference method to delete the points of a GeoDataFrame that are inside a circle.
Here's the beginning of my script:
%matplotlib inline
# previous line is because I used ipynb
import pandas as pd
import geopandas as gp
from shapely.geometry import Point
[...]
points_df = gp.GeoDataFrame(csv_file, crs=None, geometry=geometry)
Here are the first rows of points_df:
    Name    Adress   geometry
0   place1  street1  POINT (6.182674 48.694416)
1   place2  street2  POINT (6.177306 48.689889)
2   place3  street3  POINT (6.18 48.69600000000001)
3   place4  street4  POINT (6.1819 48.6938)
4   place5  street5  POINT (6.175694 48.690833)
Then I add a point whose buffer (a circle) will contain several points of the first GeoDataFrame:
base = points_df.plot(marker='o', color='red', markersize=5)
center_coord = [Point(6.18, 48.689900)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
center.plot(ax=base, color = 'blue',markersize=5)
circle = center.buffer(0.015)
circle.plot(ax=base, color = 'green')
Here's the result displayed by the IPython notebook:
Now, the goal is to delete the red points inside the green circle. To do that, I thought the difference method would be enough. But when I write:
selection = points_df['geometry'].difference(circle)
selection.plot(color = 'green', markersize=5)
The result is that... nothing changed in points_df:
I guess that the difference() method works only with polygon GeoDataFrames and that mixing points and polygons is not possible. But maybe I missed something!
Would a function that tests whether a point is in the circle be better than the difference method in this case?
"I guess that the difference() method works only with polygons GeoDataFrames and the mix between points and polygons is not possible."
That seems to be the issue: you can't use overlay with points.
Also, for that kind of spatial operation, a simple spatial join seems to be the easiest solution.
Starting with the last example ;):
%matplotlib inline
import pandas as pd
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point
# Create Fake Data
df = pd.DataFrame(np.random.randint(10,20,size=(35, 3)), columns=['Longitude','Latitude','data'])
# create Geometry series with lat / longitude
geometry = [Point(xy) for xy in zip(df.Longitude, df.Latitude)]
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# Create GeoDataFrame
points = gp.GeoDataFrame(df, crs=None, geometry=geometry)
# Create Matplotlib figure
fig, ax = plt.subplots()
# Set Axes to equal (otherwise plot looks weird)
ax.set_aspect('equal')
# Plot GeoDataFrame on Axis ax
points.plot(ax=ax,marker='o', color='red', markersize=5)
# Create new point
center_coord = [Point(15, 13)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
# Plot new point
center.plot(ax=ax,color = 'blue',markersize=5)
# Buffer point and plot it
circle = gp.GeoDataFrame(crs=None, geometry=center.buffer(2.5))
circle.plot(color = 'white',ax=ax)
That leaves us with the problem of how to determine whether a point is inside or outside the polygon. One way to do that is to join all points that fall inside the polygon, then build a DataFrame holding the difference between all points and the points within the circle:
# Calculate the points inside the circle
pointsinside = gp.sjoin(points,circle,how="inner")
# Now the points outside the circle is just the difference
# between points and points inside (see the ~)
pointsoutside = points[~points.index.isin(pointsinside.index)]
# Create a nice plot
fig, ax = plt.subplots()
ax.set_aspect('equal')
circle.plot(color = 'white',ax=ax)
center.plot(ax=ax,color = 'blue',markersize=5)
pointsinside.plot(ax=ax,marker='o', color='green', markersize=5)
pointsoutside.plot(ax=ax,marker='o', color='yellow', markersize=5)
print('Total points:' ,len(points))
print('Points inside circle:' ,len(pointsinside))
print('Points outside circle:' ,len(pointsoutside))
Total points: 35
Points inside circle: 10
Points outside circle: 25
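As an alternative to the spatial join, the same split can be sketched with a boolean mask from the within method. This is a swapped-in technique, not the one used above. The four labelled points are made-up data, and the circle is passed as a single shapely polygon rather than a GeoSeries, so the test is applied to every row.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical points; "label" is only there to make the result readable.
points = gpd.GeoDataFrame(
    {"label": ["a", "b", "c", "d"]},
    geometry=[Point(0, 0), Point(1, 1), Point(5, 5), Point(3, 0)],
)

# A single shapely polygon approximating a circle of radius 2.5 around (0, 0).
circle = Point(0, 0).buffer(2.5)

# within() returns one boolean per row: True where the point lies in the circle.
inside_mask = points.within(circle)

points_inside = points[inside_mask]
points_outside = points[~inside_mask]

print("Points inside circle:", list(points_inside["label"]))
print("Points outside circle:", list(points_outside["label"]))
```

Keeping only points_outside gives the "delete the points inside the circle" result the question asked for, without needing difference().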
I'm testing the geopandas library with a simple exercise: displaying several points on a map, then superimposing a large circle on top to delete some of them with the difference method.
To check that the transformation works, I'm using an IPython notebook to view my different layers.
So, here's the beginning of my code:
%matplotlib inline
# this line is just for a correct plotting in an iPython nb
import pandas as pd
import geopandas as gp
from shapely.geometry import Point
df = pd.read_csv("historical_monuments.csv", sep = ",")
geometry = [Point(xy) for xy in zip(fichier.Longitude, fichier.Latitude)]
# I convert two columns of my csv for geographic information displaying
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# just delete two columns of my first df to avoid redundancy
geodf = gp.GeoDataFrame(file, crs=None, geometry=geometry)
Then, to see my points, I just wrote:
geodf.plot(marker='o', color='red', markersize=5)
Here's the result:
That works fine. Now I just want to add to this layer a point with a large radius. I tried this:
base = gdf.plot(marker='o', color='red', markersize=5)
# the first plotting becomes a variable to reuse it
center_coord = [Point(6.18, 48.696000)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
circle = center.buffer(0.001)
Then I thought that this command would be enough:
circle.plot(ax=base, color = 'white')
But instead of displaying a plot, my notebook returns:
<matplotlib.axes._subplots.AxesSubplot at 0x7f763bdde5c0>
<matplotlib.figure.Figure at 0x7f763be5ef60>
And I haven't found what could be wrong so far...
The command
%matplotlib inline
produces a static plot. Once it appears in your notebook, it cannot be changed anymore. That is why you have to put your code in a single cell, as schlump said.
An alternative would be to switch to the notebook backend, which is interactive and allows you to modify your plot over several cells. To activate it, simply use
%matplotlib notebook
instead of inline.
Well, my best guess is that you didn't execute your code within one cell. For some reason the plot does not show up if the code is executed over multiple cells. I could replicate your problem; however, when I executed the code in one cell, the plot showed up.
%matplotlib inline
import pandas as pd
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point
# Create Fake Data
df = pd.DataFrame(np.random.randint(10,20,size=(10, 3)), columns=['Longitude','Latitude','data'])
# create Geometry series with lat / longitude
geometry = [Point(xy) for xy in zip(df.Longitude, df.Latitude)]
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# Create GeoDataFrame
geodf = gp.GeoDataFrame(df, crs=None, geometry=geometry)
# Create Matplotlib figure
fig, ax = plt.subplots()
# Set Axes to equal (otherwise plot looks weird)
ax.set_aspect('equal')
# Plot GeoDataFrame on Axis ax
geodf.plot(ax=ax,marker='o', color='red', markersize=5)
# Create new point
center_coord = [Point(15, 13)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
# Plot new point
center.plot(ax=ax,color = 'blue',markersize=5)
# Buffer point and plot it
circle = center.buffer(10)
circle.plot(color = 'white',ax=ax)
PS: By the way, you've got some variable names mixed up (fichier, file and gdf are used where df and geodf were defined).