I use Bokeh to make choropleth maps and to create other geographical graphics because Python spits it out a million times faster than I could ever create it in another way.
Right now I am doing something like:
from bokeh.io import show, output_file, export_png, output_notebook, save
from bokeh.models import ColumnDataSource, HoverTool, LabelSet
from bokeh.plotting import figure
from bokeh.sampledata.us_counties import data as counties
import pandas as pd
desired_counties=['Franklin County, Ohio','Fairfield County, Ohio']
counties2 = {code: county for code, county in counties.items() if county["detailed name"] in desired_counties}
unemploymentRate = {'Location': desired_counties, 'rate': [100, 100]} #placeholders
df = pd.DataFrame(data=unemploymentRate)
county_xs = [county["lons"] for county in counties2.values()]
county_ys = [county["lats"] for county in counties2.values()]
county_names = [county['name'] for county in counties2.values()]
unemployment = [df.loc[index,'rate'] for county in counties2.values() for index, row in df.iterrows() if df.loc[index,'Location']==county['detailed name']]
source = ColumnDataSource(data=dict(x=county_xs,y=county_ys,name=county_names,rate=unemployment))
TOOLS="hover,save"
p = figure(x_axis_location=None, y_axis_location=None, plot_width=350,plot_height=250,tools=TOOLS)
p.grid.grid_line_color = None
p.outline_line_color = None
p.patches('x', 'y', source=source,
fill_color='#007a33',
fill_alpha=1, line_color="black", line_width=3.5)
hover = p.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [
("Name", "@name"),
("Rate","@rate{0.0}%")
]
labels = LabelSet(x='x', y='y', text='name', level='glyph',
x_offset=0, y_offset=0, source=source, render_mode='css')
p.add_layout(labels)
p.toolbar.logo = None
p.toolbar_location = None
show(p)
Which gives:
When I hover over the image I see the data I want, but what I would rather have is the data annotated on the image instead for a print possibility. Using the LabelSet class that Bokeh has in their documentation seems perfect, except it's geared toward xy plots and so when trying to use it here it just stacks the labels in the corner.
Questions:
Does anyone know how to label on a Bokeh choropleth without using outside graphics software?
I have ArcGIS Pro, which can obviously create this type of map, but I am looking for something that does it programmatically. I use Bokeh right now because I can basically tell it a central county I want, and I have code written that makes me a map of the state highlighting the county I want and its surrounding counties, creates an inset map of the county/surrounding counties, and gives hover data on the inset map of the unemployment rate in those counties (the picture/code above is a simplified version of that). Is there a way to go about this using ArcPy/ArcGIS instead?
Your labels are misplaced because you tell them to use the fields 'x' and 'y' from the data source, but those columns hold each county's full outline (a list of coordinates per county) rather than a single point per label. You could compute the "center" of each shape and add those values to source as separate label columns, then point the LabelSet at them, and things would display as you want.
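As a rough illustration of computing a "center" per shape, here is a minimal sketch that uses the midpoint of each outline's bounding box. This is only an approximation of a true centroid, the sample outlines are made up, and the names patch_center, label_x and label_y are hypothetical. NaN values are skipped in case an outline uses them as separators between parts.

```python
import math

def patch_center(lons, lats):
    """Return the midpoint of an outline's bounding box, skipping NaN values."""
    xs = [v for v in lons if not math.isnan(v)]
    ys = [v for v in lats if not math.isnan(v)]
    return (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2

# Made-up outlines standing in for county["lons"] / county["lats"]
county_xs = [[0.0, 2.0, 2.0, 0.0], [3.0, 5.0, 5.0, 3.0]]
county_ys = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 4.0, 4.0]]

centers = [patch_center(xs, ys) for xs, ys in zip(county_xs, county_ys)]
label_x = [c[0] for c in centers]  # one x per county
label_y = [c[1] for c in centers]  # one y per county
```

The two lists could then be added to the data source as extra columns (for example label_x and label_y) and passed to LabelSet as x='label_x', y='label_y', leaving the outline columns for p.patches.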
Related
I'm trying to get xy coordinates of points drawn by the user. I want to have them as a dictionary, a list or a pandas DataFrame.
I'm using Bokeh 2.0.2 in Jupyter. There'll be a background image (which is not the focus of this post) and on top, the user will create points that I could use further.
Below is where I've managed to get to (with some dummy data). And I've commented some lines which I believe are the direction in which I'd have to go. But I don't seem to get the grasp of it.
from bokeh.plotting import figure, show, Column, output_notebook
from bokeh.models import PointDrawTool, ColumnDataSource, TableColumn, DataTable
output_notebook()
my_tools = ["pan, wheel_zoom, box_zoom, reset"]
#create the figure object
p = figure(title= "my_title", match_aspect=True,
toolbar_location = 'above', tools = my_tools)
seeds = ColumnDataSource({'x': [2,14,8], 'y': [-1,5,7]}) #dummy data
renderer = p.scatter(x='x', y='y', source = seeds, color='red', size=10)
columns = [TableColumn(field="x", title="x"),
TableColumn(field="y", title="y")]
table = DataTable(source=seeds, columns=columns, editable=True, height=100)
#callback = CustomJS(args=dict(source=seeds), code="""
# var data = source.data;
# var x = data['x']
# var y = data['y']
# source.change.emit();
#""")
#
#seeds.x.js_on_change('change:x', callback)
draw_tool = PointDrawTool(renderers=[renderer])
p.add_tools(draw_tool)
p.toolbar.active_tap = draw_tool
show(Column(p, table))
From the documentation at https://docs.bokeh.org/en/latest/docs/user_guide/tools.html#pointdrawtool:
The tool will automatically modify the columns on the data source corresponding to the x and y values of the glyph. Any additional columns in the data source will be padded with the declared empty_value, when adding a new point. Any newly added points will be inserted on the ColumnDataSource of the first supplied renderer.
So, just check the corresponding data source, seeds in your case.
The only issue here is if you want to know exactly what point has been changed or added. In this case, the simplest solution would be to create a custom subclass of PointDrawTool that does just that. Alternatively, you can create an additional "original" data source and compare seeds to it each time it's updated.
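The "compare against an original copy" idea could look roughly like the sketch below. It works on plain dictionaries shaped like seeds.data and assumes the Python-side copy of the data is actually being updated (for example when the plot runs under a Bokeh server). The function name diff_points and the sample values are made up for illustration.

```python
def diff_points(original, current):
    """Return (added, removed) points between two {'x': [...], 'y': [...]} dicts."""
    old = list(zip(original["x"], original["y"]))
    new = list(zip(current["x"], current["y"]))
    added = [p for p in new if p not in old]      # points only in the new data
    removed = [p for p in old if p not in new]    # points only in the old data
    return added, removed

# Snapshot taken before the user edits anything
original = {"x": [2, 14, 8], "y": [-1, 5, 7]}
# Hypothetical state after one point was moved and one was added
current = {"x": [2, 14, 8, 3], "y": [-1, 5, 9, 4]}

added, removed = diff_points(original, current)
```

A moved point shows up as one removed and one added entry. Duplicate points are not distinguished, so this is only a starting point.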
The problem is that you want to read the data back in Python, but show(p) creates a static version of the plot that has no connection to the Python process. Here is a simple example that fixes it! I removed the table and such to make it a bit cleaner, but it will also work with it:
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import PointDrawTool
output_notebook()
#create the figure object
p = figure(width=400,height=400)
renderer = p.scatter(x=[0,1], y=[1,2],color='red', size=10)
draw_tool = PointDrawTool(renderers=[renderer])
p.add_tools(draw_tool)
p.toolbar.active_tap = draw_tool
# This part is important
def app(doc):
    global p
    doc.add_root(p)

show(app)  # <-- show app and not p!
I'm plotting covid-19 data for countries grouped by World Bank regions using pandas and Bokeh.
from bokeh.io import output_file, show
from bokeh.palettes import Spectral5
from bokeh.plotting import figure
from bokeh.transform import factor_cmap
group = data.groupby(["region", "CountryName"])
index_cmap = factor_cmap(
'region_CountryName',
palette=Spectral5,
factors=sorted(data.region.unique()),
end=1
)
p = figure(plot_width=800, plot_height=600, title="Confirmed cases per 100k people by country",
x_range=group, toolbar_location="left")
p.vbar(x='region_CountryName', top='ConfirmedPer100k_max', width=1, source=group,
line_color="white", fill_color=index_cmap, )
p.y_range.start = 0
p.xgrid.grid_line_color = None
p.xaxis.major_label_orientation = 3.14159/2
p.xaxis.group_label_orientation = 3.14159/2
p.outline_line_color = None
show(p)
And I get a plot (image omitted).
I would like to set some sort of initial zoom on the x-axis to get a more manageable image, like the one I got by manually zooming in (image omitted).
Any suggestions?
You should be able to accomplish this with the x_range parameter. In this example, the plot's x range would be the first 20 countries. You can adjust as needed. You might also have to mess around a bit to get the group_cn_list correct. It's hard to say without seeing your data. If you can post a df example for reproducibility, it would help.
# group is a groupby object, so take its (region, CountryName) keys as the factors
group_cn_list = sorted(group.groups.keys())
p = figure(plot_width=800, plot_height=600, title="Confirmed cases per 100k people by country",
x_range=group_cn_list[0:20], toolbar_location="left")
I am trying to plot India map using plotly, but unable to find a way to do that. Below is the code which I tried for USA.
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
df_sample = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/laucnty16.csv')
df_sample['State FIPS Code'] = df_sample['State FIPS Code'].apply(lambda x: str(x).zfill(2))
df_sample['County FIPS Code'] = df_sample['County FIPS Code'].apply(lambda x: str(x).zfill(3))
df_sample['FIPS'] = df_sample['State FIPS Code'] + df_sample['County FIPS Code']
colorscale = ["#f7fbff","#ebf3fb","#deebf7","#d2e3f3","#c6dbef","#b3d2e9","#9ecae1",
"#85bcdb","#6baed6","#57a0ce","#4292c6","#3082be","#2171b5","#1361a9",
"#08519c","#0b4083","#08306b"]
endpts = list(np.linspace(1, 12, len(colorscale) - 1))
fips = df_sample['FIPS'].tolist()
values = df_sample['Unemployment Rate (%)'].tolist()
fig = ff.create_choropleth(
fips=fips, values=values,
binning_endpoints=endpts,
colorscale=colorscale,
show_state_data=False,
show_hover=True, centroid_marker={'opacity': 0},
asp=2.9, title='USA by Unemployment %',
legend_title='% unemployed'
)
fig.layout.template = None
fig.show()
OUTPUT: (image of the USA choropleth omitted)
In the same way, I want to draw a map of India with hover values (image of the desired India map omitted).
The figure factory create_choropleth method that you're using is deprecated and deals with USA counties exclusively. For other maps, you need the GeoJSON for the features you're mapping. Plotly only comes with GeoJSON data for world countries and US states, so you'll have to provide the data for India's states yourself.
Like your example choropleth, let's plot the current number of active COVID-19 cases per state as of July 17 (this comes from indiacovid19.github.io, which is periodically archiving the data from India's Ministry of Health). As for the GeoJSON, a quick search yields a few GitHub repos but it seems the majority are too outdated for our cases data, as they don't include the merging of Dadra and Nagar Haveli and Daman and Diu. Luckily, datameet provides an up-to-date shapefile for India's states which I simplified a bit to reduce the size and converted to GeoJSON using mapshaper, then flipped the polygon winding using geojson-rewind.
Now, as detailed in the Plotly documentation, we can use plotly express to quickly make a choropleth map with our data:
import pandas as pd
import plotly.express as px
df = pd.read_csv("https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/active_cases_2020-07-17_0800.csv")
fig = px.choropleth(
df,
geojson="https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson",
featureidkey='properties.ST_NM',
locations='state',
color='active cases',
color_continuous_scale='Reds'
)
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
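One thing that can go wrong with this approach is a state name in the data that does not match any feature in the GeoJSON. The sketch below shows one way to check this ahead of time by resolving a dotted key such as 'properties.ST_NM' on each feature by hand. The helper names and the tiny sample GeoJSON are made up for illustration and are not part of Plotly.

```python
def feature_ids(geojson, featureidkey):
    """Collect the value found at a dotted key (e.g. 'properties.ST_NM') per feature."""
    ids = []
    for feature in geojson["features"]:
        value = feature
        for part in featureidkey.split("."):
            value = value[part]
        ids.append(value)
    return ids

def unmatched_locations(locations, geojson, featureidkey):
    """Return the locations that have no matching feature id."""
    known = set(feature_ids(geojson, featureidkey))
    return [loc for loc in locations if loc not in known]

# Made-up two-feature GeoJSON standing in for india_states.geojson
sample_geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ST_NM": "Kerala"}, "geometry": None},
        {"type": "Feature", "properties": {"ST_NM": "Goa"}, "geometry": None},
    ],
}

missing = unmatched_locations(["Kerala", "Goa", "Keral"], sample_geojson, "properties.ST_NM")
```

Here missing contains only the misspelled entry. With real data, the locations would come from df['state'] and the GeoJSON would be loaded from the file or URL.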
For more fine control over the plot, we can use the graph objects directly:
import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv("https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/active_cases_2020-07-17_0800.csv")
fig = go.Figure(data=go.Choropleth(
geojson="https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson",
featureidkey='properties.ST_NM',
locationmode='geojson-id',
locations=df['state'],
z=df['active cases'],
autocolorscale=False,
colorscale='Reds',
marker_line_color='peachpuff',
colorbar=dict(
title={'text': "Active Cases"},
thickness=15,
len=0.35,
bgcolor='rgba(255,255,255,0.6)',
tick0=0,
dtick=20000,
xanchor='left',
x=0.01,
yanchor='bottom',
y=0.05
)
))
fig.update_geos(
visible=False,
projection=dict(
type='conic conformal',
parallels=[12.472944444, 35.172805555556],
rotation={'lat': 24, 'lon': 80}
),
lonaxis={'range': [68, 98]},
lataxis={'range': [6, 38]}
)
fig.update_layout(
title=dict(
text="Active COVID-19 Cases in India by State as of July 17, 2020",
xanchor='center',
x=0.5,
yref='paper',
yanchor='bottom',
y=1,
pad={'b': 10}
),
margin={'r': 0, 't': 30, 'l': 0, 'b': 0},
height=550,
width=550
)
fig.show()
Note: I could not manage to do it in plotly, but I can do it easily in Bokeh. The OP asked specifically for plotly, but I am still posting this answer to show how it can be done another way.
GeoJSON of India's states is distributed by https://gadm.org/
Load it into Bokeh's GeoJSONDataSource data model
Set up the figure and feed in the data model
Custom colors can be achieved by adding the information per geometry/state inside the data model.
Working Code
import json
from urllib.request import urlopen

from bokeh.io import output_notebook, show
from bokeh.models import GeoJSONDataSource, HoverTool, LinearColorMapper
from bokeh.palettes import Viridis256
from bokeh.plotting import figure

output_notebook()
# GeoJSON of India
with urlopen("https://raw.githubusercontent.com/geohacker/india/master/state/india_state.geojson") as response:
    geojson = json.load(response)
# Round robin over 3 colors
# You can set the colors here based on the case count you have per state
for i in range(len(geojson['features'])):
    geojson['features'][i]['properties']['Color'] = ['blue', 'red', 'green'][i%3]
# Set the hover to state information and finally plot it
cmap = LinearColorMapper(palette=Viridis256)
TOOLS = "pan,wheel_zoom,box_zoom,reset,hover,save"
geo_source = GeoJSONDataSource(geojson=json.dumps(geojson))
p = figure(title='India', tools=TOOLS, x_axis_location=None, y_axis_location=None, width=800, height=800)
p.grid.grid_line_color = None
p.patches('xs', 'ys', fill_alpha=0.7, line_color='black', fill_color='Color', line_width=0.1, source=geo_source)
hover = p.select_one(HoverTool)
hover.point_policy = 'follow_mouse'
hover.tooltips = [('State:', '@NAME_1')]
show(p)
Output:
As mentioned in the code comments above, you can add the case information to the states in the data model and show it in the hover tool. This way, when you hover over a state you will see its case count. In fact, you can add whatever info you want to the states inside the data model and use the data model to render it.
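A minimal sketch of that idea, using only plain Python on the GeoJSON dictionary, might look like this. The case counts, the color thresholds, and the helper names attach_cases and color_for are all made up for illustration. The only thing taken from the code above is that each feature has a NAME_1 property and gets a Color property.

```python
import json

def color_for(count):
    """Pick a fill color from a case count (thresholds are arbitrary examples)."""
    if count >= 10000:
        return "red"
    if count >= 1000:
        return "orange"
    return "green"

def attach_cases(geojson, cases_by_state, name_key="NAME_1"):
    """Store a case count and a matching color in each feature's properties."""
    for feature in geojson["features"]:
        props = feature["properties"]
        count = cases_by_state.get(props[name_key], 0)  # 0 if the state is missing
        props["Cases"] = count
        props["Color"] = color_for(count)
    return geojson

# Made-up features and counts standing in for the real GeoJSON and data
sample = {"type": "FeatureCollection", "features": [
    {"type": "Feature", "properties": {"NAME_1": "Kerala"}, "geometry": None},
    {"type": "Feature", "properties": {"NAME_1": "Goa"}, "geometry": None},
    {"type": "Feature", "properties": {"NAME_1": "Sikkim"}, "geometry": None},
]}
updated = attach_cases(sample, {"Kerala": 12000, "Goa": 1500})
colors = [f["properties"]["Color"] for f in updated["features"]]
geojson_text = json.dumps(updated)  # string form, as passed to GeoJSONDataSource
```

Since feature properties become columns of the GeoJSONDataSource, a tooltip entry such as ('Cases', '@Cases') should then show the count on hover.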
Sorry, but you cannot do that with the built-in geometries, as the location mode has only 3 such values:
“ISO-3”, “USA-states”, “country names”
and the geo scope of the layout can only take 7 values: “world” | “usa” | “europe” | “asia” | “africa” | “north america” | “south america”.
So in order to get a plot of India you need a plot of Asia in which India is marked; there is no built-in option for a separate plot of India and its states.
data = dict(type = 'choropleth',
locations = ['india'],
locationmode = 'country names',
colorscale= 'Portland',
text= ['t1'],
z=[1.0],
colorbar = {'title' : 'Colorbar Title'})
layout = dict(geo = {'scope': 'asia'})
This could give you a map of Asia with India marked.
I'm trying to get separate hover tooltips for nodes and edges in Bokeh, but haven't been able to get it to work. Could someone point out what I'm doing wrong? I believe the code should look something like this:
from bokeh.io import show, output_notebook
from bokeh.models import Plot, Range1d, MultiLine, Circle, HoverTool
from bokeh.models.graphs import from_networkx, NodesAndLinkedEdges, EdgesAndLinkedNodes
import networkx as nx
output_notebook()
# Generate data
G = nx.karate_club_graph()
nx.set_edge_attributes(G, nx.edge_betweenness_centrality(G), "betweenness_centrality")
# Setup plot
plot = Plot(plot_width=400, plot_height=400,
x_range=Range1d(-1.1, 1.1), y_range=Range1d(-1.1, 1.1))
graph_renderer = from_networkx(G, nx.spring_layout, scale=1, center=(0, 0))
graph_renderer.node_renderer.glyph = Circle(size=15)
graph_renderer.edge_renderer.glyph = MultiLine(line_alpha=0.8, line_width=1)
plot.renderers.append(graph_renderer)
# Add hover
node_hover_tool = HoverTool(renderers=[graph_renderer.node_renderer],
tooltips=[("index", "@index"), ("club", "@club")])
edge_hover_tool = HoverTool(renderers=[graph_renderer.edge_renderer],
tooltips=[("betweenness_centrality", "@betweenness_centrality")],
line_policy="interp")
plot.add_tools(node_hover_tool, edge_hover_tool)
# Show
show(plot)
But I don't see any hover over with this. I've tried a few things to work around this:
If I remove the renderers argument, I can get some hover over, but not specific to the glyphs I want.
If I remove the renderers argument from both HoverTools, I'm able to get correct tooltips on the nodes along with a betweenness_centrality: ??
If I remove the renderers argument from both HoverTools and add graph_renderer.inspection_policy = NodesAndLinkedEdges(), I get correct tooltips on the nodes
If I remove the renderers argument from both HoverTools and add graph_renderer.inspection_policy = EdgesAndLinkedNodes(), I get correct tooltips on the edges
I believe this question was asked before on the google group here, but didn't get an answer.
Thanks for any help!
So, we construct our networks differently, but I just solved this problem with one of my Bokeh rendered networks from networkx.
The way that I did it was by generating dataframes with my desired networkx data by using the lines_source approach outlined on another question here, which gives you:
....
plot = figure(
plot_width=1100, plot_height=700,
tools=['tap','box_zoom', 'reset']
) # This is the size of the widget designed.
# This function sets the color of the nodes, but how to set based on the
# name of the node?
r_circles = plot.circle(
'x', 'y', source=nodes_source, name= "Node_list",
size="_size_", fill_color="_color_", level = 'overlay',
)
hover1 = HoverTool(
tooltips=[('Name', '@name'),('Members','@Members')],
renderers=[r_circles]
) # Works to render only the nodes tooltips
def get_edges_specs(_network, _layout):
    d = dict(xs=[], ys=[], alphas=[], from_node=[], to_node=[])
    weights = [attrs['weight'] for u, v, attrs in _network.edges(data=True)]
    max_weight = max(weights)
    calc_alpha = lambda h: 0.1 + 0.6 * (h / max_weight)
    for u, v, data in _network.edges(data=True):
        d['xs'].append([_layout[u][0], _layout[v][0]])
        d['from_node'].append(u)
        d['to_node'].append(v)
        d['ys'].append([_layout[u][1], _layout[v][1]])
        d['alphas'].append(calc_alpha(data['weight']))
    return d
lines_source = ColumnDataSource(get_edges_specs(network, layout))
r_lines = plot.multi_line(
'xs', 'ys',
line_width=1.5, alpha='alphas', color='navy',
source=lines_source
) # This function sets the color of the edges
Then I generated a hover tool to display the edge information I wanted, so in my case I wanted to know the 'from node' attribute. I also wanted to give it a lofty name, so the tooltip will render "Whered_ya_come_from"
hover2 = HoverTool(
tooltips=[('Whered_ya_come_from','@from_node')],
renderers=[r_lines]
)
And then the only difference between how we implement it is that you try to do it as a single addition to the plot, whereas I plot them one after the other.
plot.tools.append(hover1)
# done to append the tool at the end because it has a problem getting
# rendered, as it depended on the nodes being rendered first.
plot.tools.append(hover2)
From there, you can export it or render it into an HTML file (my preferred method).
I'm new to bokeh and I just jumped right into using hovertool as that's why I wanted to use bokeh in the first place.
Now I'm plotting genes and what I want to achieve is multiple lines with the same y-coordinate and when you hover over a line you get the name and position of this gene.
I have tried to mimic this example, but for some reason I can't even get it to show coordinates.
I'm sure that if someone who actually knows their way around bokeh looks at this code, the mistake will be apparent and I'd be very thankful if they showed it to me.
from bokeh.plotting import figure, HBox, output_file, show, VBox, ColumnDataSource
from bokeh.models import Range1d, HoverTool
from collections import OrderedDict
import random
ys = [10 for x in range(len(levelsdf2[(name, 'Start')]))]
xscale = zip(levelsdf2[('Log', 'Start')], levelsdf2[('Log', 'Stop')])
yscale = zip(ys,ys)
TOOLS="pan,wheel_zoom,box_zoom,reset,hover"
output_file("scatter.html")
hover_tips = levelsdf2.index.values
colors = ["#%06x" % random.randint(0,0xFFFFFF) for c in range(len(xscale))]
source = ColumnDataSource(
data=dict(
x=xscale,
y=yscale,
gene=hover_tips,
colors=colors,
)
)
p1 = figure(plot_width=1750, plot_height=950,y_range=[0, 15],tools=TOOLS)
p1.multi_line(xscale[1:10],yscale[1:10], alpha=1, source=source,line_width=10, line_color=colors[1:10])
hover = p1.select(dict(type=HoverTool))
hover.tooltips = [
("index", "$index"),
("(x,y)", "($x, $y)"),
]
show(p1)
the levelsdf2 is a pandas.DataFrame, if it matters.
I figured it out on my own. It turns out that version 0.8.2 of Bokeh doesn't allow hovertool for lines so I did the same thing using quads.
from bokeh.plotting import figure, HBox, output_file, show, VBox, ColumnDataSource
from bokeh.models import Range1d, HoverTool
from collections import OrderedDict
import random
xscale = zip(levelsdf2[('series1', 'Start')], levelsdf2[('series1', 'Stop')])
xscale2 = zip(levelsdf2[('series2', 'Start')], levelsdf2[('series2', 'Stop')])
yscale2 = zip([9.2 for x in range(len(levelsdf2[(name, 'Start')]))],[9.2 for x in range(len(levelsdf2[(name, 'Start')]))])
TOOLS="pan,wheel_zoom,box_zoom,reset,hover"
output_file("linesandquads.html")
hover_tips = levelsdf2.index.values
colors = ["#%06x" % random.randint(0,0xFFFFFF) for c in range(len(xscale))]
proc1 = 'Log'
proc2 = 'MazF2h'
expression1 = levelsdf2[(proc1, 'Level')]
expression2 = levelsdf2[(proc2, 'Level')]
source = ColumnDataSource(
data=dict(
start=[min(xscale[x]) for x in range(len(xscale))],
stop=[max(xscale[x]) for x in range(len(xscale))],
start2=[min(xscale2[x]) for x in range(len(xscale2))],
stop2=[max(xscale2[x]) for x in range(len(xscale2))],
gene=hover_tips,
colors=colors,
expression1=expression1,
expression2=expression2,
)
)
p1 = figure(plot_width=900, plot_height=500,y_range=[8,10.5],tools=TOOLS)
p1.quad(left="start", right="stop", top=[9.211 for x in range(len(xscale))],
bottom = [9.209 for x in range(len(xscale))], source=source, color="colors")
p1.multi_line(xscale2,yscale2, source=source, color="colors", line_width=20)
hover = p1.select(dict(type=HoverTool))
hover.tooltips = OrderedDict([
(proc1+" (start,stop, expression)", "(@start| @stop| @expression1)"),
("Gene","@gene"),
])
show(p1)
Works like a charm.
EDIT: Added a picture of the result, as requested and edited code to match the screenshot posted.
It's not the best solution as it turns out it's not all that easy to plot several series of quads on one plot. It's probably possible but as it didn't matter much in my use case I didn't investigate too vigorously.
As all genes are represented on all series at the same place I just added tooltips for all series to the quads and plotted the other series as multi_line plots on the same figure.
This means that if you hovered on the top line at 9.21 you'd get tooltips for the line at 9.2 as well, but if you hovered on the 9.2 line you wouldn't get a tooltip at all.
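For anyone trying the quad approach with more than one series, the column data for each thin band can be built with a small helper like the one sketched below. The name quad_columns, the half_height value, and the sample intervals are made up for illustration. The idea is simply to turn (start, stop) pairs into left/right/top/bottom lists around a chosen y value.

```python
def quad_columns(intervals, y, half_height=0.001):
    """Turn (start, stop) pairs into left/right/top/bottom lists for a thin band at y."""
    left = [min(a, b) for a, b in intervals]    # tolerate reversed pairs
    right = [max(a, b) for a, b in intervals]
    n = len(intervals)
    return dict(
        left=left,
        right=right,
        top=[y + half_height] * n,
        bottom=[y - half_height] * n,
    )

# Made-up gene positions; the second pair is deliberately reversed
genes = [(100, 250), (900, 400)]
cols = quad_columns(genes, y=9.21)
```

Calling the helper once per series with a different y gives one set of columns per band, which could then go into separate data sources so that each band carries its own tooltips.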