I'm testing the geopandas library with a simple exercise: displaying several points on a map, then superimposing a large circle on top so that I can remove some of the points with the difference method.
To check that the transformation works, I'm using an IPython notebook to look at my different layers.
So, here's the beginning of my script:
%matplotlib inline
# this line is just for a correct plotting in an iPython nb
import pandas as pd
import geopandas as gp
from shapely.geometry import Point
df = pd.read_csv("historical_monuments.csv", sep = ",")
geometry = [Point(xy) for xy in zip(fichier.Longitude, fichier.Latitude)]
# I convert two columns of my csv for geographic information displaying
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# just delete two columns of my first df to avoid redundancy
geodf = gp.GeoDataFrame(file, crs=None, geometry=geometry)
Then, to see my points, I just wrote :
geodf.plot(marker='o', color='red', markersize=5)
Here's the result :
That's super fine. Now I just want to add in this layer a point with a large radius. I tried this :
base = gdf.plot(marker='o', color='red', markersize=5)
# the first plotting becomes a variable to reuse it
center_coord = [Point(6.18, 48.696000)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
circle = center.buffer(0.001)
Then I thought that this command would be enough:
circle.plot(ax=base, color = 'white')
But instead of displaying a plot, my notebook returns:
<matplotlib.axes._subplots.AxesSubplot at 0x7f763bdde5c0>
<matplotlib.figure.Figure at 0x7f763be5ef60>
And I didn't find what could be wrong so far...
The command
%matplotlib inline
produces a static plot. Once it appears in your notebook it cannot be changed anymore. That is why you have to put your code in a single cell, as schlump said.
An alternative would be to switch to the notebook backend, which is interactive and allows you to modify your plot over several cells. To activate it, simply use
%matplotlib notebook
instead of inline.
My best guess is that you didn't execute your code within one cell. With the inline backend, the plot does not show up if the plotting calls are spread over multiple cells. I could replicate your problem, but when I executed the code in one cell the plot showed up.
%matplotlib inline
import pandas as pd
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point
# Create Fake Data
df = pd.DataFrame(np.random.randint(10,20,size=(10, 3)), columns=['Longitude','Latitude','data'])
# create Geometry series with lat / longitude
geometry = [Point(xy) for xy in zip(df.Longitude, df.Latitude)]
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# Create GeoDataFrame
geodf = gp.GeoDataFrame(df, crs=None, geometry=geometry)
# Create Matplotlib figure
fig, ax = plt.subplots()
# Set Axes to equal (otherwise plot looks weird)
ax.set_aspect('equal')
# Plot GeoDataFrame on Axis ax
geodf.plot(ax=ax,marker='o', color='red', markersize=5)
# Create new point
center_coord = [Point(15, 13)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
# Plot new point
center.plot(ax=ax,color = 'blue',markersize=5)
# Buffer point and plot it
circle = center.buffer(10)
circle.plot(color = 'white',ax=ax)
P.S. You've also got some variable names mixed up in your question (fichier, file and gdf where df and geodf are meant).
So, what I am having trouble with is how I am supposed to plot the data I have on top of a global map. I have an array of data, and two arrays of coordinates in latitude and longitude, where each datapoint was taken, but I am not sure of how to plot it on top of a global map. Creating the map itself is not too difficult, I just use:
import matplotlib.pyplot as plt
from mpl_toolkits.basemap import Basemap
fig = plt.figure(figsize=(10, 8))
m = Basemap(projection='cyl', resolution='c',
llcrnrlat=-90, urcrnrlat=90,
llcrnrlon=-180, urcrnrlon=180, )
m.shadedrelief(scale=0.5)
m.drawcoastlines(color='black')
But the next step is where I am having problems. I have tried doing both a colormesh plot and scatter plot, but they haven't worked so far. How should I go about it so that the data is plotted in the correct coordinate locations for the global map?
Thanks a lot for any help!
Maybe a bit late, but here is a piece of code I used to plot multiple line plots over a map in Basemap, and it worked for me.
map = Basemap(projection='cyl', resolution='c',
llcrnrlat=mins[1], urcrnrlat=maxs[1],
llcrnrlon=mins[0], urcrnrlon=50, )
plt.figure(figsize=(15, 15))
for i in range(1259):
    filepath = filename[i]
    data = pd.read_csv(filepath, index_col=0)
    map.plot(data.x, data.y, 'k-', alpha=0.1)  ### Calling the plot in a loop!!
map.drawcoastlines(linewidth=1)
map.drawcountries(linewidth=0.5, linestyle='solid', color='k' )
plt.show()
The loop calls data from different folders, and I just use the map.plot command to plot. By doing it like that, you can plot all data in the same map.
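For the scatter case asked about in the question, here is a minimal sketch that uses plain matplotlib only. It relies on the fact that in the 'cyl' (equirectangular) projection the map coordinates are simply longitude and latitude in degrees. The arrays lons, lats and values are hypothetical sample data, not the asker's. With Basemap itself, the same arrays could presumably be passed to the map object's own scatter method (with latlon=True), but that variant is untested here.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical sample data: one measurement per (longitude, latitude) pair.
rng = np.random.default_rng(0)
lons = rng.uniform(-180, 180, size=200)
lats = rng.uniform(-90, 90, size=200)
values = rng.normal(size=200)

fig, ax = plt.subplots(figsize=(10, 5))

# In an equirectangular map, x is longitude and y is latitude, both in degrees.
ax.set_xlim(-180, 180)
ax.set_ylim(-90, 90)
ax.set_aspect("equal")

# Colour each point by its data value and add a colour bar for reference.
sc = ax.scatter(lons, lats, c=values, s=15, cmap="viridis")
fig.colorbar(sc, ax=ax, label="value")
```

Any coastline or relief layer drawn on the same axes would then line up with the points, as long as it uses the same degree-based coordinates.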
I am trying to overlay two sets of latitude/longitude points on the same map, so that the first set is plotted in one color and the second set in a different color. I have tried to share the same axis (ax), but the points keep being plotted on two maps instead of one single map with both sets of points. My code looks like this:
from sys import exit
from shapely.geometry import Point
import geopandas as gpd
from geopandas import GeoDataFrame as gdf
from shapely.geometry import Point, LineString
import pandas as pd
import matplotlib.pyplot as plt
dfp = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90p.csv", delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfp['Longitude'], dfp['Latitude'])]
gdf = gpd.GeoDataFrame(dfp, geometry=geometry)
#this is a simple map that goes with geopandas
fig, ax = plt.subplots()
world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
#world = world[(world.name=="Spain")]
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='red', markersize=15);
dfn = pd.read_csv("\\\porfiler03\\gtdshare\\Long_Lats_90n.csv",
delimiter=',', skiprows=0,
low_memory=False)
geometry = [Point(xy) for xy in zip(dfn['Longitude'], dfn['Latitude'])]
gdf = gpd.GeoDataFrame(dfn, geometry=geometry)
#this is a simple map that goes with geopandas
#world = gpd.read_file(gpd.datasets.get_path('naturalearth_lowres'))
gdf.plot(ax=world.plot(figsize=(10, 6)), marker='o', color='yellow',
markersize=15);
My first plot looks like the second plot below but with red points in USA and Spain:
My second plot looks like this:
Thank you in helping me overlay these two different sets of points and colors into one map.
In your case, you want to plot 3 geodataframes (world, gdf1, and gdf2) on single axes. Then, after you create fig/axes, you must reuse the same axes (say, ax1) for each plot. Here is the summary of important steps:
Create figure/axes
fig, ax1 = plt.subplots(figsize=(5, 3.5))
Plot base map
world.plot(ax=ax1)
Plot a layer
gdf1.plot(ax=ax1)
Plot more layers
gdf2.plot(ax=ax1)
Hope this helps.
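The steps above can be sketched as one self-contained script. The three GeoDataFrames here are hypothetical stand-ins built from a few hand-written shapes (two boxes as a fake base map, two small point layers), so the snippet runs without any data files; the names world, gdf1 and gdf2 simply follow the answer.

```python
import geopandas as gpd
import matplotlib.pyplot as plt
from shapely.geometry import Point, box

# Hypothetical stand-ins for the base map and the two point layers.
world = gpd.GeoDataFrame(geometry=[box(-10, 35, 5, 45), box(-125, 25, -65, 50)])
gdf1 = gpd.GeoDataFrame(geometry=[Point(-3.7, 40.4), Point(-100.0, 40.0)])
gdf2 = gpd.GeoDataFrame(geometry=[Point(2.2, 41.4), Point(-80.0, 30.0)])

# Create the figure and axes once...
fig, ax1 = plt.subplots(figsize=(10, 6))

# ...and pass the same axes to every plot call.
world.plot(ax=ax1, color="lightgrey", edgecolor="black")
gdf1.plot(ax=ax1, marker="o", color="red", markersize=15)
returned = gdf2.plot(ax=ax1, marker="o", color="yellow", markersize=15)
```

The code in the question called world.plot(figsize=(10, 6)) inside each gdf.plot call, which creates a fresh axes every time; that is why two separate maps appeared.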
I want to display points on a map, using a shapefile for the map and a CSV file with the coordinates. The code runs, but I don't understand how to show the points on the map.
My questions are: how do I display the points? What is "WnvPresent"? How can I display just the map and the points, not split into negative and positive but as a whole?
Website from where i downloaded the shp file: https://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/countries
Website from where the idea comes from: https://towardsdatascience.com/geopandas-101-plot-any-data-with-a-latitude-and-longitude-on-a-map-98e01944b972
import pandas as pd
import matplotlib.pyplot as plt
import descartes
import geopandas as gpd
from shapely.geometry import Point, Polygon
%matplotlib inline
#read map data in form of .shp
street_map = gpd.read_file(r"C:\Users\stetc\Desktop\images/portofolio\ref-countries-2016-01m.shp")
#create the map
fig,ax = plt.subplots(figsize=(15,15))
street_map.plot(ax = ax)
#read given data
df = pd.read_csv(r"C:\Users\stetc\Documents\full_dataset.csv")
# The next step is to get the data in the right format. The way we do this is
# by turning our regular Pandas DataFrame into a geo-DataFrame, which will
# require us to specify as parameters the original DataFrame, our coordinate
# reference system (CRS), and the geometry of our new DataFrame. In order to
# format our geometry appropriately, we will need to convert the longitude and
# latitude into Points (we imported Point from shapely above), so first let’s
# read in the training data-set and specify the EPSG:4326 CRS like so
crs = {"init":"epsg:4326"}
#create points using longitude and lat from the data set
geometry = [Point(xy) for xy in zip (df["Longitude"], df["Latitude"])]
#Create a GeoDataFrame
geo_df =gpd.GeoDataFrame (df, #specify out data
crs=crs, # specify the coordinates reference system
geometry = geometry #specify the geometry list created
)
fig,ax = plt.subplots(figsize = (15,15))
street_map.plot (ax = ax, alpha = 0.4 , color="grey" )
geo_df[geo_df["WnvPresent"]==0].plot(ax=ax,markersize=20, color = "blue", marker="o",label="Neg")
geo_df[geo_df["WnvPresent"]==1].plot(ax=ax,markersize=20, color = "red", marker="o",label="Pos")
plt.legend(prop={"size":15})
WnvPresent is just a column used in the example to plot two different colours (I would do it differently, but that is for another discussion), you can ignore that if your goal is to plot points only.
Try the code below. I have also added zorder to ensure that points are on top of the street_map.
fig, ax = plt.subplots(figsize=(15,15))
street_map.plot(ax=ax, alpha=0.4, color="grey", zorder=1)
geo_df.plot(ax=ax, markersize=20, color="blue", marker="o", zorder=2)
In the first step you create the figure, then you add street_map to ax, and then geo_df to the same ax. The last line answers your question "how to display the points?". Keep in mind that both layers have to be in the same CRS (assuming EPSG:4326 from your code), otherwise the layers won't overlap.
A bit more on plotting is in geopandas docs - https://geopandas.readthedocs.io/en/latest/mapping.html and on CRS here https://geopandas.readthedocs.io/en/latest/projections.html.
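As a minimal sketch of the CRS point above: the snippet below builds a small GeoDataFrame from hypothetical coordinates, declares it as EPSG:4326, and reprojects it to match a second layer before plotting. The name base_layer and the choice of EPSG:3857 are assumptions for illustration only; in the question, the second layer would be street_map.

```python
import geopandas as gpd
import pandas as pd

# Hypothetical coordinates standing in for the CSV columns.
df = pd.DataFrame({"Longitude": [6.18, 2.35], "Latitude": [48.69, 48.86]})

# Build the point geometry and declare the CRS the coordinates are in.
geo_df = gpd.GeoDataFrame(
    df,
    geometry=gpd.points_from_xy(df["Longitude"], df["Latitude"]),
    crs="EPSG:4326",
)

# Hypothetical base layer that lives in a different CRS (Web Mercator).
base_layer = geo_df.to_crs("EPSG:3857")

# Reproject the points only if the two layers disagree.
if geo_df.crs != base_layer.crs:
    geo_df_aligned = geo_df.to_crs(base_layer.crs)
else:
    geo_df_aligned = geo_df
```

After this step, base_layer.plot(ax=ax) and geo_df_aligned.plot(ax=ax) draw in the same coordinate space and should overlap as expected.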
I'm testing geopandas to do something quite simple: use the difference method to delete the points of a GeoDataFrame that lie inside a circle.
Here's the beginning of my script:
%matplotlib inline
# previous line is because I used ipynb
import pandas as pd
import geopandas as gp
from shapely.geometry import Point
[...]
points_df = gp.GeoDataFrame(csv_file, crs=None, geometry=geometry)
Here's the first rows of points_df :
Name Adress geometry
0 place1 street1 POINT (6.182674 48.694416)
1 place2 street2 POINT (6.177306 48.689889)
2 place3 street3 POINT (6.18 48.69600000000001)
3 place4 street4 POINT (6.1819 48.6938)
4 place5 street5 POINT (6.175694 48.690833)
Then, I add a point that will contain several points of the first GeoDF :
base = points_df.plot(marker='o', color='red', markersize=5)
center_coord = [Point(6.18, 48.689900)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
center.plot(ax=base, color = 'blue',markersize=5)
circle = center.buffer(0.015)
circle.plot(ax=base, color = 'green')
Here's the result displayed by the iPython notebook :
Now, the goal is to delete the red points inside the green circle. I thought that the difference method would be enough to do that. But when I write:
selection = points_df['geometry'].difference(circle)
selection.plot(color = 'green', markersize=5)
The result is that... nothing changed with points_df :
I guess that the difference() method works only with polygon GeoDataFrames and that mixing points and polygons is not possible. But maybe I missed something!
Would a function that tests whether a point lies inside the circle be better than the difference method in this case?
"I guess that the difference() method works only with polygon
GeoDataFrames and that mixing points and polygons is not possible."
That seems to be the issue: you can't use the overlay with points.
For that kind of spatial operation, a simple spatial join seems to be the easiest solution.
Starting with the example from your previous question ;):
%matplotlib inline
import pandas as pd
import geopandas as gp
import numpy as np
import matplotlib.pyplot as plt
from shapely.geometry import Point
# Create Fake Data
df = pd.DataFrame(np.random.randint(10,20,size=(35, 3)), columns=['Longitude','Latitude','data'])
# create Geometry series with lat / longitude
geometry = [Point(xy) for xy in zip(df.Longitude, df.Latitude)]
df = df.drop(['Longitude', 'Latitude'], axis = 1)
# Create GeoDataFrame
points = gp.GeoDataFrame(df, crs=None, geometry=geometry)
# Create Matplotlib figure
fig, ax = plt.subplots()
# Set Axes to equal (otherwise plot looks weird)
ax.set_aspect('equal')
# Plot GeoDataFrame on Axis ax
points.plot(ax=ax,marker='o', color='red', markersize=5)
# Create new point
center_coord = [Point(15, 13)]
center = gp.GeoDataFrame(crs=None, geometry=center_coord)
# Plot new point
center.plot(ax=ax,color = 'blue',markersize=5)
# Buffer point and plot it
circle = gp.GeoDataFrame(crs=None, geometry=center.buffer(2.5))
circle.plot(color = 'white',ax=ax)
That leaves us with the problem of determining whether a point is inside or outside the polygon. One way of achieving that is to join all points inside the polygon, and then build a DataFrame from the difference between all points and the points within the circle:
# Calculate the points inside the circle
pointsinside = gp.sjoin(points,circle,how="inner")
# Now the points outside the circle is just the difference
# between points and points inside (see the ~)
pointsoutside = points[~points.index.isin(pointsinside.index)]
# Create a nice plot
fig, ax = plt.subplots()
ax.set_aspect('equal')
circle.plot(color = 'white',ax=ax)
center.plot(ax=ax,color = 'blue',markersize=5)
pointsinside.plot(ax=ax,marker='o', color='green', markersize=5)
pointsoutside.plot(ax=ax,marker='o', color='yellow', markersize=5)
print('Total points:' ,len(points))
print('Points inside circle:' ,len(pointsinside))
print('Points outside circle:' ,len(pointsoutside))
Total points: 35
Points inside circle: 10
Points outside circle: 25
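As an alternative to the spatial join, and closer to the "test whether a point is in the circle" idea from the question, here is a minimal sketch that uses the within() predicate to build a boolean mask. The four points and the names circle_polygon and inside_mask are hypothetical; the circle is the same centre and radius as in the example above.

```python
import geopandas as gpd
from shapely.geometry import Point

# Hypothetical points: two close to the centre, two far away.
points = gpd.GeoDataFrame(
    {"name": ["a", "b", "c", "d"]},
    geometry=[Point(15, 13), Point(16, 14), Point(19, 19), Point(10, 10)],
)

# A single shapely polygon approximating a circle of radius 2.5 around (15, 13).
circle_polygon = Point(15, 13).buffer(2.5)

# within() compares every point against the same polygon and returns booleans.
inside_mask = points.within(circle_polygon)

points_inside = points[inside_mask]
points_outside = points[~inside_mask]
```

Because the circle is passed as one shapely geometry rather than as a GeoSeries, every row is tested against it and no index alignment is involved.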
I am looping through a bunch of CSV files containing various measurements.
Each file might be from one of 4 different data sources.
In each file, I merge the data into monthly datasets, that I then plot in a 3x4 grid. After this plot has been saved, the loop moves on and does the same to the next file.
This part I have figured out. However, I would like to add a visual cue to the plots that shows which data source they come from. As far as I understand it (and have tried it),
plt.subplot(4,3,1)
plt.hist(Jan_Data,facecolor='Red')
plt.ylabel('value count')
plt.title('January')
does work. However, this way I would have to add facecolor='Red' by hand to each of the 12 subplots. Looping through the plots won't work in this situation, since I want the ylabel only on the leftmost plots and the xlabel only on the bottom row.
Setting facecolor at the beginning in
fig = plt.figure(figsize=(20,15),facecolor='Red')
does not work, since it only changes the background color of the 20 by 15 figure now, which subsequently gets ignored when I save it to a PNG, since it only gets set for screen output.
So is there just a simple setthecolorofallbars='Red' command for plt.hist(… or plt.savefig(… I am missing, or should I just copy n' paste it to all twelve months?
You can use mpl.rc("axes", prop_cycle=cycler(color=["red"])) to set the default color cycle for all your axes. (Older matplotlib versions used mpl.rc("axes", color_cycle="red"); that parameter was deprecated in matplotlib 1.5 and later removed.)
In this little toy example, I use a with mpl.rc_context() block to limit the effects of mpl.rc to just that block. This way you don't spoil the default parameters for your whole session.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler
np.random.seed(42)
# create some toy data
n, m = 2, 2
data = []
for i in range(n*m):
    data.append(np.random.rand(30))
# and do the plotting
with mpl.rc_context():
    # every axes created inside this block starts its color cycle with red
    mpl.rc("axes", prop_cycle=cycler(color=["red"]))
    fig, axes = plt.subplots(n, m, figsize=(8,8))
    for ax, d in zip(axes.flat, data):
        ax.hist(d)
The problem with the x- and y-labels (when you use loops) can be solved by using plt.subplots, as it lets you access every axis separately.
import matplotlib.pyplot as plt
import numpy.random
# creating figure with 4 plots
fig,ax = plt.subplots(2,2)
# some data
data = numpy.random.randn(4,1000)
# some titles
title = ['Jan','Feb','Mar','April']
xlabel = ['xlabel1','xlabel2']
ylabel = ['ylabel1','ylabel2']
for i in range(ax.size):
    # integer division picks the row, the remainder picks the column
    a = ax[i//2, i%2]
    a.hist(data[i], facecolor='r', bins=50)
    a.set_title(title[i])
# write the ylabels on all axes on the left-hand side
for j in range(ax.shape[0]):
    ax[j,0].set_ylabel(ylabel[j])
# write the xlabels on all axes in the bottom row
for j in range(ax.shape[1]):
    ax[-1,j].set_xlabel(xlabel[j])
fig.tight_layout()
All features (like titles) which are not constant can be put into arrays and placed at the appropriate axis.
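Scaled up to the 4x3 monthly grid from the question, the same idea might look like the sketch below. The data is random and the names bar_color and monthly_data are hypothetical; the bar color is set once in a variable, labels are placed only on the outer axes, and the figure background is passed explicitly to savefig in case the installed matplotlib version does not carry it over on its own. The figure is written to an in-memory buffer here; a file name would work the same way.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

months = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical monthly datasets: 100 random values per month.
rng = np.random.default_rng(42)
monthly_data = [rng.normal(size=100) for _ in months]

bar_color = "red"  # one place to change the color per data source
nrows, ncols = 4, 3

fig, axes = plt.subplots(nrows, ncols, figsize=(12, 9), facecolor="lightyellow")

for i, (ax, name, values) in enumerate(zip(axes.flat, months, monthly_data)):
    row, col = divmod(i, ncols)
    ax.hist(values, facecolor=bar_color)
    ax.set_title(name)
    if col == 0:            # leftmost column only
        ax.set_ylabel("value count")
    if row == nrows - 1:    # bottom row only
        ax.set_xlabel("value")

fig.tight_layout()

# Pass the figure's own background color so the saved PNG matches the screen.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", facecolor=fig.get_facecolor())
```

Inside the loop over files, only bar_color (and perhaps the figure facecolor) would need to change for each data source.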