I'd like to know how to fill in a map of U.S. counties by value (i.e., a choropleth map) using Python 3 and Cartopy, and I haven't yet found anything online to guide me in that. The filled value could be, for instance, the highest recorded tornado rating (with counties left blank for no recorded tornadoes), or even something arbitrary such as whether I've visited (=1) or lived in (=2) the county. I found a helpful MetPy example to get the county boundaries on a map:
https://unidata.github.io/MetPy/latest/examples/plots/US_Counties.html
What I envision is somehow setting a list (or dictionary?) of county names to a certain value, and then each value would be assigned to a particular fill color. This is my current script, which generates a nice blank county map of the CONUS/lower 48 (though I'd eventually also like to add Alaska/Hawaii insets).
import cartopy
import cartopy.crs as ccrs
import matplotlib as mpl
import matplotlib.pyplot as plt
from metpy.plots import USCOUNTIES
plot_type = 'png'
borders = cartopy.feature.BORDERS
states = cartopy.feature.NaturalEarthFeature(category='cultural', scale='10m', facecolor='none', name='admin_1_states_provinces_lakes')
oceans = cartopy.feature.OCEAN
lakes = cartopy.feature.LAKES
mpl.rcParams['figure.figsize'] = (12,10)
water_color = 'lightblue'
fig = plt.figure()
ax = plt.axes(projection=ccrs.LambertConformal(central_longitude=-97.5, central_latitude=38.5, standard_parallels=(38.5,38.5)))
ax.set_extent([-120, -74, 23, 50], ccrs.Geodetic())
ax.coastlines()
ax.add_feature(borders, linestyle='-')
ax.add_feature(states, linewidth=0.50, edgecolor='black')
ax.add_feature(oceans, facecolor=water_color)
ax.add_feature(lakes, facecolor=water_color, linewidth=0.50, edgecolor='black')
ax.add_feature(USCOUNTIES.with_scale('500k'), linewidth=0.10, edgecolor='black')
plt.savefig('./county_map.'+plot_type)
plt.close()
Any ideas or tips on how to assign values to counties and fill them accordingly?
Cartopy's shapereader.Reader can give you access to all of the records in the shapefile, including their attributes. Putting this together with MetPy's get_test_data to get access to the underlying shapefile, you can get what you want, assuming you have a dataset that maps e.g. FIPS code to EF rating:
import cartopy.crs as ccrs
import cartopy.io.shapereader as shpreader
import matplotlib.pyplot as plt
import numpy as np
from metpy.cbook import get_test_data

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())

cmap = plt.get_cmap('magma')
norm = plt.Normalize(0, 5)

# Fake tornado dataset with a value for each county code
tor_data = dict()

# This will only work (i.e., have access to the shapefile's database of
# attributes) after the shapefile has been downloaded by using `USCOUNTIES`
# or running get_test_data() for the .shx and .dbf files as well.
for rec in shpreader.Reader(get_test_data('us_counties_20m.shp',
                                          as_file_obj=False)).records():
    # Mimic getting data, but actually get a random number
    # GEOID seems to be the FIPS code
    max_ef = tor_data.get(rec.attributes['GEOID'], np.random.randint(0, 5))

    # Normalize the data to [0, 1] and colormap manually
    color = tuple(cmap(norm(max_ef)))

    # Add the geometry to the plot, being sure to specify the coordinate system
    ax.add_geometries([rec.geometry], crs=ccrs.PlateCarree(), facecolor=color)

ax.set_extent((-125, -65, 25, 48))
That gives me:
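If you do have a real dataset, the random-number fallback above can be replaced with a lookup that leaves counties with no recorded tornadoes unfilled, as the question asks. A minimal sketch (the FIPS codes and ratings here are made up for illustration):

```python
import matplotlib.pyplot as plt

# Hypothetical FIPS -> max EF rating mapping (made-up values)
tor_data = {'48201': 4, '40109': 5, '20173': 3}

cmap = plt.get_cmap('magma')
norm = plt.Normalize(0, 5)

def county_color(fips):
    # Counties with no recorded tornado stay unfilled ('none')
    if fips not in tor_data:
        return 'none'
    # Normalize the rating to [0, 1] and colormap manually
    return tuple(cmap(norm(tor_data[fips])))
```

The result can then be passed as facecolor in add_geometries, exactly as in the record loop.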
I'm not sure about passing in a dict, but you can pass in a list to facecolor.
ax.add_feature(USCOUNTIES.with_scale('500k'), linewidth=0.10, edgecolor='black', facecolor=["red", "blue", "green"])
If you know how many counties there are you can make a list that long by:
import matplotlib.cm as cm
import numpy as np
number_of_counties = 3000
color_scale = list(cm.rainbow(np.linspace(0, 1, number_of_counties)))
ax.add_feature(USCOUNTIES.with_scale('500k'), linewidth=.10, edgecolor="black", facecolor=color_scale)
but they didn't make it easy to extract the names from USCOUNTIES. You can see where it is defined in your source code:
from metpy import plots
print(plots.__file__)
If you go into the directory printed there, you'll find a file named cartopy_utils.py, and inside the class definition for class MetPyMapFeature(Feature) you will see USCOUNTIES. You might have better luck than I did mapping county names to the geometric shapes.
EDIT: Also, I just used cm.rainbow as an example; you can choose any colormap from https://matplotlib.org/stable/tutorials/colors/colormaps.html. I'm not sure it even goes up to 3000 distinct colors, but you get the idea.
I'm currently trying to make a nested doughnut chart with four layers, and I have come across some problems.
There is one dependency in my data: I look at the changes made with a specific method and divide them into agronomic and academic traits. I then create a fourth ring which basically shows the amount of academic and of each agronomic trait. I don't know how to automatically align both doughnut rings so they match.
I looked into the matplotlib documentation, but I don't understand the addressing of the colormaps. I took over the example code, but in the end it's not really clear how it addresses the colors.
I need to make a legend for the chart. However, due to the long names of some of the subgroups, I cannot show them in the pie chart, but they should appear in the legend. When I draw the legend via the ax.legend function, it adds only the groups I addressed in the ax.pie function with labels=; if I use fig.legend to draw the legend, the colors do not match at all. I tried to use the handles= keyword, which I stumbled across in some posts here on Stack Overflow, but it just gives me an error:
AttributeError: 'tuple' object has no attribute 'legend'
I would also like to add the percentage and number of occurrences to my legend, but I guess there is no "easy" way to do that?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel("savedrecs.xlsx", sheet_name="test")
#print(df.head())
size = 0.3
fig, ax = plt.subplots(figsize=(12,8))
#Colors----
cmap1 = plt.get_cmap("tab20c")
cmap2 = plt.get_cmap("tab10")
outer_colors = cmap1(np.arange(20))
inner_colors = cmap1(np.arange(12))
sr_colors = cmap1(np.arange(5,6))
#Data----
third_ring = df[df["Group"].str.contains("group")]
fourth_ring = df[df["Group"].str.contains("Target trait")]
second_ring = df[df["Group"].str.contains("Cultivar")]
first_ring = df[df["Group"].str.contains("Mutation")]
#----
#---Testautopct---
def make_autopct(values):
    def my_autopct(pct):
        total = sum(values)
        val = int(round(pct*total/100.0))
        return '{p:.2f}%\n({v:d})'.format(p=pct, v=val)
    return my_autopct
#-----
#Piechart----
ir = ax.pie(first_ring["Occurence"], radius=1-size, labels=first_ring["Name"],
            textprops={"fontsize": 8}, labeldistance=0,
            colors=sr_colors, wedgeprops=dict(edgecolor="w"))
sr = ax.pie(second_ring["Occurence"],
            autopct=make_autopct(second_ring["Occurence"]), pctdistance=0.83,
            textprops={"fontsize": 8}, radius=1,
            wedgeprops=dict(width=size, edgecolor="w"), startangle=90, colors=inner_colors)
tr = ax.pie(third_ring["Occurence"],
            autopct=make_autopct(third_ring["Occurence"]), labels=third_ring["Name"],
            pctdistance=0.83, textprops={"fontsize": 8}, radius=1+size,
            wedgeprops=dict(width=size, edgecolor="w"), startangle=90, colors=outer_colors)
fr = ax.pie(fourth_ring["Occurence"],
            autopct=make_autopct(fourth_ring["Occurence"]), labels=fourth_ring["Name"],
            pctdistance=0.83, textprops={"fontsize": 8}, radius=1+size*2,
            wedgeprops=dict(width=size, edgecolor="w"), startangle=90, colors=outer_colors)
#---Legend & Title----
ax.legend( bbox_to_anchor=(1.04, 0.5), loc="center left", borderaxespad=10 ,fancybox=True, shadow=False, ncol=1, title="This will be a fancy legend title")
fig.suptitle("This will be a fancy title, which I don't know yet!")
#----
plt.tight_layout()
plt.show()
The output of this code is then as follows:
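One way to get the long subgroup names (plus percentage and count) into the legend without labelling the wedges themselves is to build proxy handles with matplotlib.patches.Patch and pass them to ax.legend via handles=. A minimal, self-contained sketch with made-up group names standing in for the real subgroups:

```python
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# Made-up group names and counts, standing in for the real subgroups
groups = {"Agronomic trait with a long name": 12, "Academic trait": 7}

fig, ax = plt.subplots()
wedges, _ = ax.pie(list(groups.values()))

# Build one proxy Patch per wedge, carrying the wedge colour and a
# label with the percentage and occurrence count baked in
total = sum(groups.values())
handles = [mpatches.Patch(color=w.get_facecolor(),
                          label=f"{name} ({100 * n / total:.1f}%, n={n})")
           for w, (name, n) in zip(wedges, groups.items())]
ax.legend(handles=handles, loc="center left", bbox_to_anchor=(1.04, 0.5))
```

Because the handles are built from the wedges returned by ax.pie, the legend colours always match the wedges, which avoids the mismatch seen with fig.legend.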
I have a dataframe that contains thousands of points with geolocation (longitude, latitude) for Washington D.C. The following is a snippet of it:
import pandas as pd
df = pd.DataFrame({'lat': [ 38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820], 'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
If you plot the points in the map you can see that some of them are clearly within some buildings:
import folium
wash_map = folium.Map(location=[38.8977, -77.0365], zoom_start=10)
for index,location_info in df.iterrows():
folium.CircleMarker(
location=[location_info["lat"], location_info["lng"]], radius=5,
fill=True, fill_color='red',).add_to(wash_map)
wash_map.save('example_stack.html')
import webbrowser
import os
webbrowser.open('file://'+os.path.realpath('example_stack.html'), new=2)
My goal is to exclude all the points that are within buildings. For that, I first download bounding boxes for the city buildings and then try to exclude points within those polygons as follows:
import osmnx as ox
#brew install spatialindex this solves problems in mac
%matplotlib inline
ox.config(log_console=True)
ox.__version__
tags = {"building": True}
gdf = ox.geometries.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf.shape
For computational simplicity I have requested the shapes of all buildings within a 1 km radius of the White House. In my own code I have tried bigger radii to make sure all the buildings are included.
In order to exclude points within the polygons I wrote the following function (which includes retrieving the shapes):
def buildings(df, center_point, dist):
    import osmnx as ox
    from shapely.geometry import Point
    # brew install spatialindex -- this solves problems on Mac
    ox.config(log_console=True)

    tags = {"building": True}
    gdf = ox.geometries.geometries_from_point(center_point, tags, dist)

    # Next step is to put our coordinates in the correct shapely format
    for point in range(len(df)):
        if gdf.geometry.contains(Point(df.lat[point], df.lng[point])).all() == False:
            df['within_building'] = False
        else:
            df['within_building'] = True
buildings(df,[38.8977, -77.0365],1000)
df['within_building'].all()==False
The function always returns that points are outside building shapes although you can clearly see in the map that some of them are within. I don't know how to plot the shapes over my map so I am not sure if my polygons are correct but for the coordinates they appear to be so. Any ideas?
The example points you provided don't seem to fall within those buildings' footprints. I don't know what your points' coordinate reference system is, so I guessed EPSG4326. But to answer your question, here's how you would exclude them, resulting in gdf_points_not_in_bldgs:
import geopandas as gpd
import matplotlib.pyplot as plt
import osmnx as ox
import pandas as pd
# the coordinates you provided
df = pd.DataFrame({'lat': [38.897221,38.888100,38.915390,38.895100,38.895100,38.901005,38.960491,38.996342,38.915310,38.936820],
'lng': [-77.031048,-76.898480,-77.021380,-77.036700,-77.036700 ,-76.990784,-76.862907,-77.028131,-77.010403,-77.184930]})
# create GeoDataFrame of point geometries
geom = gpd.points_from_xy(df['lng'], df['lat'])
gdf_points = gpd.GeoDataFrame(geometry=geom, crs='epsg:4326')
# get building footprints
tags = {"building": True}
gdf_bldgs = ox.geometries_from_point([38.8977, -77.0365], tags, dist=1000)
gdf_bldgs.shape
# get all points that are not within a building footprint
mask = gdf_points.within(gdf_bldgs.unary_union)
gdf_points_not_in_bldgs = gdf_points[~mask]
print(gdf_points_not_in_bldgs.shape) # (10, 1)
# plot buildings and points
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
plt.show()
# zoom in to see better
ax = gdf_bldgs.plot()
ax = gdf_points.plot(ax=ax, c='r')
ax.set_xlim(-77.04, -77.03)
ax.set_ylim(38.89, 38.90)
plt.show()
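As an aside, a spatial join gives the same filtering and also tells you which building each point fell in. A self-contained sketch using a synthetic square "building" in place of the OSM footprints (recent geopandas versions use the predicate keyword; older ones used op):

```python
import geopandas as gpd
from shapely.geometry import Point, box

# Synthetic stand-in for the OSM footprints: one unit-square "building"
gdf_bldgs = gpd.GeoDataFrame(geometry=[box(0, 0, 1, 1)], crs='epsg:4326')
gdf_points = gpd.GeoDataFrame(
    geometry=[Point(0.5, 0.5), Point(2, 2)], crs='epsg:4326')

# sjoin tags each point with the index of the building it falls in;
# points outside every footprint get NaN in 'index_right'
joined = gpd.sjoin(gdf_points, gdf_bldgs, how='left', predicate='within')
outside = gdf_points[joined['index_right'].isna()]
```

Here only the second point survives, since the first lies inside the square.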
I am currently trying to visualize geographical data for the districts of the city of Hamburg. The creation of the choropleth by using plotly.graph_objects and an associated GeoJSON file is working perfectly fine.
However, as I am plotting the city of Hamburg, it is not possible for me to use one of the specified locationmodes and I have to zoom in manually - for each individual plot, for each execution, which is very cumbersome.
Can I somehow use longitude/latitude coordinates, something like zoom_start similar to Folium, or any other keyword I'm missing to limit the selection programmatically?
For completeness, the code so far is attached. (Subplots are created, where each subplot visualizes data from a dataframe as a graph_objects.Choropleth instance and can be manipulated individually: zooming, etc.)
import numpy as np
import plotly
import plotly.graph_objects as go

choro_overview = plotly.subplots.make_subplots(
    rows=6, cols=2, specs=[[{'type': 'choropleth'}]*2]*6,
    subplot_titles=df_main.columns[5:],
    horizontal_spacing=0.1,
)
cbar_locs_x = [0.45, 1]
cbar_locs_y = np.linspace(0.95, 0.05, 6)
for ii, column in enumerate(df_main.columns[5:]):
    placement = np.unravel_index(ii, (6, 2))
    choro_overview.add_trace(
        go.Choropleth(
            locations=df_main['District'],
            z=df_main[column],
            geojson=geojson_src,
            featureidkey='properties.name',
            colorbar=dict(len=np.round(1/9, 1), x=cbar_locs_x[placement[1]],
                          y=cbar_locs_y[placement[0]]),
            name=column,
            colorscale='orrd',
        ), row=placement[0]+1, col=placement[1]+1
    )
I have since found that the keyword is not in go.Choropleth, but in the figure itself by calling update_geos().
Credit goes to plotly automatic zooming for "Mapbox maps".
I'm getting the error:
TypeError: Image data of dtype object cannot be converted to float
when I try to run the heatmap function in the code below:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# Read the data
df = pd.read_csv("gapminder-FiveYearData.csv")
print(df.head(10))
# Create an array of n-dimensional array of life expectancy changes for countries over the years.
year = ((np.asarray(df['year'])).reshape(12,142))
country = ((np.asarray(df['country'])).reshape(12,142))
print(year)
print(country)
# Create a pivot table
result = df.pivot(index='year',columns='country',values='lifeExp')
print(result)
# Create an array to annotate the heatmap
labels = (np.asarray(["{1:.2f} \n {0}".format(year,value)
for year, value in zip(year.flatten(),
country.flatten())])
).reshape(12,142)
# Define the plot
fig, ax = plt.subplots(figsize=(15,9))
# Add title to the Heat map
title = "GapMinder Heat Map"
# Set the font size and the distance of the title from the plot
plt.title(title,fontsize=18)
ttl = ax.title
ttl.set_position([0.5,1.05])
# Hide ticks for X & Y axis
ax.set_xticks([])
ax.set_yticks([])
# Remove the axes
ax.axis('off')
# Use the heatmap function from the seaborn package
hmap = sns.heatmap(result,annot=labels,fmt="",cmap='RdYlGn',linewidths=0.30,ax=ax)
# Display the Heatmap
plt.imshow(hmap)
Here is a link to the CSV file.
The objective of the activity is:
The data file is a dataset with 6 columns, namely: country, year, pop, continent, lifeExp and gdpPercap.
Create a pivot table dataframe with year along the x-axis, country along the y-axis, and lifeExp filled within the cells.
Plot a heatmap using seaborn for the pivot table that was just created.
Thanks for providing your data with this question. I believe your TypeError is coming from the labels array your code creates for the annotation. Based on the function's built-in annotation properties, I actually don't think you need this extra work, and it modifies your data in a way that errors out when plotting.
I took a stab at re-writing your project to produce a heatmap that shows the pivot table of country and year of lifeExp. I'm also assuming that it is important for you to keep this number a float.
import numpy as np
import pandas as pd
import seaborn as sb
import matplotlib.pyplot as plt
## ** UNCHANGED FROM ABOVE **
# Read in the data
df = pd.read_csv('https://raw.githubusercontent.com/resbaz/r-novice-gapminder-files/master/data/gapminder-FiveYearData.csv')
df.head()
## ** UNCHANGED FROM ABOVE **
# Create an array of n-dimensional array of life expectancy changes for countries over the years.
year = ((np.asarray(df['year'])).reshape(12,142))
country = ((np.asarray(df['country'])).reshape(12,142))
print('show year\n', year)
print('\nshow country\n', country)
# Create a pivot table
result = df.pivot(index='country',columns='year',values='lifeExp')
# Note: This index and columns order is reversed from your code.
# This will put the year on the X axis of our heatmap
result
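To make the reversed index/columns orientation concrete, here is what pivot produces on a tiny made-up frame (not the real gapminder data):

```python
import pandas as pd

# Tiny stand-in for the gapminder frame, just to show the pivot shape
df = pd.DataFrame({'country': ['A', 'A', 'B', 'B'],
                   'year': [1952, 1957, 1952, 1957],
                   'lifeExp': [50.0, 52.0, 60.0, 61.0]})

# Countries become rows, years become columns, lifeExp fills the cells
result = df.pivot(index='country', columns='year', values='lifeExp')
```

With index='country' and columns='year', the heatmap drawn from this table puts the years on the X axis.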
I removed the labels code block.
Notes on the sb.heatmap function:
I used plt.cm.get_cmap() to restrict the number of colors in your mapping. If you want to use the entire colormap spectrum, just remove it and keep what you had originally.
fmt='f': this is for floats, your lifeExp values.
cbar_kws - you can use this to play around with the size, label and orientation of your color bar.
# Define the plot - feel free to modify however you want
plt.figure(figsize = [20, 50])
# Set the font size and the distance of the title from the plot
title = 'GapMinder Heat Map'
plt.title(title,fontsize=24)
ax = sb.heatmap(result, annot = True, fmt='f', linewidths = .5,
cmap = plt.cm.get_cmap('RdYlGn', 7), cbar_kws={
'label': 'Life Expectancy', 'shrink': 0.5})
# This sets a label, size 20 to your color bar
ax.figure.axes[-1].yaxis.label.set_size(20)
plt.show()
A limited screenshot, only because the plot is so large.
Another screenshot of the bottom of the plot, to show the year axis, slightly zoomed in in my browser.
I have a dataset coming from a shapefile (.shp extension) with coordinates. They should look something like this:
-70.62 -33.43
-70.59 -33.29
And so on. I already have developed a way to plot this data with pyplot, where each green dot represents a tree and each line a street, which looks like this:
pyplot streets & trees
However, I need to draw a grid over it and color its blocks depending on the number of trees in each section. That way, blocks with more trees would be colored a stronger green, whereas those with fewer trees would be a light green, yellow, or red. Of course, these colors should be partially transparent so the map isn't covered completely.
This is my code:
import cartopy.crs as ccrs
import matplotlib.pyplot as plt
import cartopy.io.shapereader as shpreader
import shapely.geometry as sg
wgs84 = ccrs.Geodetic()
utm19s = ccrs.UTM(19, southern_hemisphere=True)
p_a = [-70.637, -33.449]
p_b = [-70.58, -33.415]
LL = utm19s.transform_point(p_a[0], p_a[1], wgs84)
UR = utm19s.transform_point(p_b[0], p_b[1], wgs84)
ax = plt.axes(projection=utm19s)
ax.set_extent([LL[0], UR[0], LL[1], UR[1]], crs=utm19s)
rds = shpreader.Reader('roadsUTM.shp')
trees = shpreader.Reader('treesUTM.shp')
rect = sg.box(LL[0], LL[1], UR[0], UR[1])  # box takes (minx, miny, maxx, maxy)
rds_sel = [r for r in rds.geometries() if r.intersects(rect)]
trees_sel = [t for t in trees.geometries() if t.intersects(rect)]
ax.add_geometries(rds_sel, utm19s, linestyle='solid', facecolor='none')
ax.scatter([t.x for t in trees_sel], [t.y for t in trees_sel], color = "green", edgecolor = "black", transform=utm19s)
plt.show()
TL;DR: A way to use the shapefile's encoded position data as plain numbers would solve part of my problem. Thanks.
EDIT: So I discovered that the data was already given in the UTM19S format. Should have researched a little bit before asking.
However, I still need to plot said grid over the map.
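For the remaining grid overlay, one option is to bin the tree coordinates with np.histogram2d and draw the counts as a semi-transparent pcolormesh on top of the scatter. A sketch with random stand-in coordinates; in a real run, xs and ys would come from [t.x for t in trees_sel] and [t.y for t in trees_sel]:

```python
import numpy as np
import matplotlib.pyplot as plt

# Random stand-in tree coordinates in UTM metres
rng = np.random.default_rng(0)
xs = rng.uniform(0, 1000, 500)
ys = rng.uniform(0, 1000, 500)

# Count trees in a 10x10 grid over the extent
counts, xedges, yedges = np.histogram2d(xs, ys, bins=10)

fig, ax = plt.subplots()
ax.scatter(xs, ys, s=4, color='green')
# Semi-transparent overlay: RdYlGn maps low counts to red and high
# counts to green, so denser cells get a stronger green
mesh = ax.pcolormesh(xedges, yedges, counts.T, cmap='RdYlGn', alpha=0.4)
fig.colorbar(mesh, ax=ax, label='trees per cell')
```

Note the counts.T: histogram2d returns counts indexed as [x, y], while pcolormesh expects rows to be the y dimension.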