I am trying to plot a map of India using Plotly, but I can't find a way to do it. Below is the code I tried for the USA.
import numpy as np
import pandas as pd
import plotly.figure_factory as ff
df_sample = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/laucnty16.csv')
df_sample['State FIPS Code'] = df_sample['State FIPS Code'].apply(lambda x: str(x).zfill(2))
df_sample['County FIPS Code'] = df_sample['County FIPS Code'].apply(lambda x: str(x).zfill(3))
df_sample['FIPS'] = df_sample['State FIPS Code'] + df_sample['County FIPS Code']
colorscale = ["#f7fbff","#ebf3fb","#deebf7","#d2e3f3","#c6dbef","#b3d2e9","#9ecae1",
"#85bcdb","#6baed6","#57a0ce","#4292c6","#3082be","#2171b5","#1361a9",
"#08519c","#0b4083","#08306b"]
endpts = list(np.linspace(1, 12, len(colorscale) - 1))
fips = df_sample['FIPS'].tolist()
values = df_sample['Unemployment Rate (%)'].tolist()
fig = ff.create_choropleth(
fips=fips, values=values,
binning_endpoints=endpts,
colorscale=colorscale,
show_state_data=False,
show_hover=True, centroid_marker={'opacity': 0},
asp=2.9, title='USA by Unemployment %',
legend_title='% unemployed'
)
fig.layout.template = None
fig.show()
OUTPUT:
In a similar way, I want to draw a map of India with values shown on hover, with output like the one below.
Desired output for the India map:
The figure factory create_choropleth method that you're using is deprecated and deals with USA counties exclusively. For other maps, you need the GeoJSON for the features you're mapping. Plotly only comes with GeoJSON data for world countries and US states, so you'll have to provide the data for India's states yourself.
Like your example choropleth, let's plot the number of active COVID-19 cases per state as of July 17 (this comes from indiacovid19.github.io, which periodically archives the data from India's Ministry of Health). As for the GeoJSON, a quick search yields a few GitHub repos, but most seem too outdated for our case data, as they don't reflect the merger of Dadra and Nagar Haveli with Daman and Diu. Luckily, datameet provides an up-to-date shapefile for India's states, which I simplified a bit to reduce its size and converted to GeoJSON using mapshaper, then flipped the polygon winding using geojson-rewind.
Now, as detailed in the Plotly documentation, we can use plotly express to quickly make a choropleth map with our data:
import pandas as pd
import plotly.express as px
df = pd.read_csv("https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/active_cases_2020-07-17_0800.csv")
fig = px.choropleth(
df,
geojson="https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson",
featureidkey='properties.ST_NM',
locations='state',
color='active cases',
color_continuous_scale='Reds'
)
fig.update_geos(fitbounds="locations", visible=False)
fig.show()
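Because featureidkey='properties.ST_NM' has to match the values in the 'state' column exactly, a state whose name is spelled differently in the two files will simply stay blank on the map. A minimal sketch of a check for that, using a hypothetical two-feature GeoJSON and a hypothetical list of state names rather than the real files:

```python
# Hypothetical miniature GeoJSON; the real file has one feature per state.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"ST_NM": "Kerala"}, "geometry": None},
        {"type": "Feature", "properties": {"ST_NM": "Goa"}, "geometry": None},
    ],
}
# Hypothetical values standing in for the 'state' column of the CSV.
states = ["Kerala", "Goa", "Pondicherry"]


def unmatched_locations(geojson, locations, key="ST_NM"):
    """Return the locations that have no feature with a matching property."""
    known = {feature["properties"][key] for feature in geojson["features"]}
    return [name for name in locations if name not in known]


missing = unmatched_locations(geojson, states)
print(missing)
```

Any name reported here would need to be renamed in the data (or in the GeoJSON) before plotting.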
For finer control over the plot, we can use graph objects directly:
import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv("https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/active_cases_2020-07-17_0800.csv")
fig = go.Figure(data=go.Choropleth(
geojson="https://gist.githubusercontent.com/jbrobst/56c13bbbf9d97d187fea01ca62ea5112/raw/e388c4cae20aa53cb5090210a42ebb9b765c0a36/india_states.geojson",
featureidkey='properties.ST_NM',
locationmode='geojson-id',
locations=df['state'],
z=df['active cases'],
autocolorscale=False,
colorscale='Reds',
marker_line_color='peachpuff',
colorbar=dict(
title={'text': "Active Cases"},
thickness=15,
len=0.35,
bgcolor='rgba(255,255,255,0.6)',
tick0=0,
dtick=20000,
xanchor='left',
x=0.01,
yanchor='bottom',
y=0.05
)
))
fig.update_geos(
visible=False,
projection=dict(
type='conic conformal',
parallels=[12.472944444, 35.172805555556],
rotation={'lat': 24, 'lon': 80}
),
lonaxis={'range': [68, 98]},
lataxis={'range': [6, 38]}
)
fig.update_layout(
title=dict(
text="Active COVID-19 Cases in India by State as of July 17, 2020",
xanchor='center',
x=0.5,
yref='paper',
yanchor='bottom',
y=1,
pad={'b': 10}
),
margin={'r': 0, 't': 30, 'l': 0, 'b': 0},
height=550,
width=550
)
fig.show()
Note: I could not manage to do it in Plotly, but it is easy to do in Bokeh. The OP asked specifically for Plotly, but I am posting this answer to show how it can be done another way.
GeoJSON for India's states is distributed by https://gadm.org/
Load it into Bokeh's GeoJSONDataSource data model
Set up the figure and feed in the data model
Custom colors can be achieved by adding the information per geometry/state inside the data model.
Working Code
import json
from urllib.request import urlopen
from bokeh.io import output_notebook, show
from bokeh.models import GeoJSONDataSource, HoverTool, LinearColorMapper
from bokeh.palettes import Viridis256
from bokeh.plotting import figure
output_notebook()
# GeoJSON of India
with urlopen("https://raw.githubusercontent.com/geohacker/india/master/state/india_state.geojson") as response:
    geojson = json.load(response)
# Round robin over 3 colors
# You can set the colors here based on the case count you have per state
for i in range(len(geojson['features'])):
    geojson['features'][i]['properties']['Color'] = ['blue', 'red', 'green'][i % 3]
# Set the hover to state information and finally plot it
cmap = LinearColorMapper(palette=Viridis256)
TOOLS = "pan,wheel_zoom,box_zoom,reset,hover,save"
geo_source = GeoJSONDataSource(geojson=json.dumps(geojson))
p = figure(title='India', tools=TOOLS, x_axis_location=None, y_axis_location=None, width=800, height=800)
p.grid.grid_line_color = None
p.patches('xs', 'ys', fill_alpha=0.7, line_color='black', fill_color='Color', line_width=0.1, source=geo_source)
hover = p.select_one(HoverTool)
hover.point_policy = 'follow_mouse'
hover.tooltips = [('State:', '@NAME_1')]
show(p)
Output:
As mentioned in the code comments above, you can add the case information to the states in the data model and reference it in the hover tool. This way, when you hover over a state you will see its case count. In fact, you can add whatever information you want to the states inside the data model and use the data model to render it.
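A minimal sketch of attaching case counts to the features, assuming a hypothetical two-state GeoJSON and a hypothetical `cases` dictionary keyed by the same NAME_1 property used in the tooltip above:

```python
import json

# Hypothetical miniature GeoJSON; NAME_1 mirrors the property used for hovering.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"NAME_1": "Kerala"}, "geometry": None},
        {"type": "Feature", "properties": {"NAME_1": "Goa"}, "geometry": None},
    ],
}
# Hypothetical case counts keyed by state name.
cases = {"Kerala": 120}

for feature in geojson["features"]:
    props = feature["properties"]
    # States without an entry get 0 so every feature has the property.
    props["Cases"] = cases.get(props["NAME_1"], 0)

# Serialized form, ready to hand to GeoJSONDataSource(geojson=...).
geojson_text = json.dumps(geojson)
print([f["properties"] for f in geojson["features"]])
```

Since GeoJSONDataSource exposes feature properties as columns, a tooltip entry such as ('Cases', '@Cases') should then show the count; the property name "Cases" is only an illustration.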
Sorry, but you cannot do that, as the location mode has only 3 values:
"ISO-3", "USA-states", "country names"
and the scope of the layout's geo can only take these 7 values: "world" | "usa" | "europe" | "asia" | "africa" | "north america" | "south america".
So in order to get a plot of India, you need a plot of Asia in which India is marked; there is no option for a separate plot of India and its states.
data = dict(type = 'choropleth',
locations = ['india'],
locationmode = 'country names',
colorscale= 'Portland',
text= ['t1'],
z=[1.0],
colorbar = {'title' : 'Colorbar Title'})
layout = dict(geo = {'scope': 'asia'})
This could give you a map of Asia with India marked.
Related
I'm trying to draw a map of European countries with their results in Eurovision.
I have a button to choose between the different countries (Italy, France, Portugal, UK, etc.).
For example, if I choose to see the results of Sweden, I want to see on the map the number of points given by the other countries according to a color scale. I have managed to do that!
But I want to show Sweden, for example, in black on the map, to better see where it is and the "neighborhood effect" in the scoring.
fig3 = go.Figure(data=go.Choropleth(
locations=Euro_tr['Country_code'], # Spatial coordinates
z = Euro_tr['Italy'], # Data to be color-coded
locationmode = "ISO-3",
colorbar_title = "Points donnés",
text=Euro_tr['Country'],
))
fig3.update_layout(
title_text = 'Score Eurovision',
margin={"r":55,"t":55,"l":55,"b":55},
height=500,
geo_scope="europe" ,
)
# Make a button for each country
button = []
for country in Euro_tr.columns[1:-1]:
    dico = dict(
        label=country,
        method="update",
        args=[{'z': [Euro_tr[country]]}],)
    button.append(dico)
fig3.update_layout(
updatemenus=[
dict(
buttons=button,
y=0.9,
x=0,
xanchor='right',
yanchor='top',
active=0,
),
])
As you can see in this example showing the points given to Sweden, I want Sweden to be in a specific color, independently of the other countries, both those that gave points and those that gave none.
Thanks for your help!
I followed the answer from @vestland and succeeded in giving my country of interest its own color, independently of the others, by using fig.add_traces(go.Choropleth).
To be able to change the data and the trace according to my country of interest, I use Streamlit and buttons.
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
# Creation of graphs for each country
col_swe = 'Black'  # assumed highlight color, as in @vestland's answer
Graphes=[]
for country in Euro_tr.columns[1:-1] : #To pass each country
    Graphe=go.Figure(data=go.Choropleth(
        locations=Euro_tr['Country_code'], # Spatial coordinates
        z = Euro_tr[country], # Data to be color-coded
        locationmode = "ISO-3",
        colorbar_title = "Points donnés",
        autocolorscale= False,
        colorscale="viridis",
        text=Euro_tr['Country'],
    ))
    # customisation : title according to the country and its points
    Graphe.update_layout(
        title_text = "Total :Points donnés à {fcountry} qui a remporté {fpoints} points".format(fcountry = country, fpoints = Eurovision_tot['Result_tot'][Eurovision_tot["Country"]==country].values[0]),
        margin={"r":55,"t":55,"l":55,"b":55},
        height=500,
    )
    # block a specific zoom on the map ( scope "europe" isn't complete for eurovision countries xD!)
    Graphe.update_geos(
        center=dict(lon= 23, lat= 54),
        lataxis_range=[31.0529,-40.4296], lonaxis_range=[-24, 88.2421],
        projection_scale=3
    )
    # add trace for the specific country.
    Graphe.add_traces(go.Choropleth(locations=Country_df['Country_code'][Country_df["Country"]==country],
        z = [1],
        colorscale = [[0, col_swe],[1, col_swe]],
        colorbar=None,
        showscale = False))
    Graphes.append(Graphe)
#creation selectbox to select country
col12, col22 = st.beta_columns([0.2,0.8]) # I use columns to put the selector on a side and the graph on other side
Pays=list(Euro_tr.columns[1:-1]) # List containing country's name
Selection_Pays = col12.selectbox('',(Pays)) #create a multiple selector with the different countries as possible choice
# define action according to the selection.
for country in Pays:
    if Selection_Pays == country:  # if country is selected
        col22.plotly_chart(Graphes[Pays.index(country)])  # plot the corresponding map.
Even though you've built your choropleth map using plotly.express, you can always add a trace using plotly.graph_objects with fig.add_traces(go.Choropleth), like this:
col_swe = 'Black'
fig.add_traces(go.Choropleth(locations=df_swe['iso_alpha'],
z = [1],
colorscale = [[0, col_swe],[1, col_swe]],
colorbar=None,
showscale = False)
)
To my knowledge, it's not possible to define a single color for a single country directly. That's why I've assigned z=[1] as the value and a custom colorscale = [[0, col_swe],[1, col_swe]], which makes sure that Sweden is always shown in 'Black'.
Code:
import plotly.express as px
import plotly.graph_objects as go
df = px.data.gapminder().query("year==2007")
fig = px.choropleth(df, locations="iso_alpha",
color="lifeExp", # lifeExp is a column of gapminder
hover_name="country", # column to add to hover information
color_continuous_scale=px.colors.sequential.Plasma)
df_swe = df[df['country']=='Sweden']
col_swe = 'Black'
fig.add_traces(go.Choropleth(locations=df_swe['iso_alpha'],
z = [1],
colorscale = [[0, col_swe],[1, col_swe]],
colorbar=None,
showscale = False)
)
f = fig.full_figure_for_development(warn=False)
fig.show()
I have a dataframe that details sales of various product categories vs. time. I'd like to make a "line and marker" plot of sales vs. time, per category. To my surprise, this appears to be very difficult in Bokeh.
The scatter plot is easy. But then trying to overplot a line of sales vs. date with the same source (so I can update both scatter and line plots in one go when the source updates) and in such a way that the colors of the line match the colors of the scatter plot markers proves near impossible.
Minimal reproducible example with contrived data:
import pandas as pd
df = pd.DataFrame({'Date':['2020-01-01','2020-01-02','2020-01-01','2020-01-02'],\
'Product Category':['shoes','shoes','grocery','grocery'],\
'Sales':[100,180,21,22],'Colors':['red','red','green','green']})
df['Date'] = pd.to_datetime(df['Date'])
from bokeh.io import output_notebook
output_notebook()
from bokeh.io import output_file, show
from bokeh.plotting import figure
from bokeh.models import ColumnDataSource
source = ColumnDataSource(df)
plot = figure(x_axis_type="datetime", plot_width=800, toolbar_location=None)
plot.scatter(x="Date",y="Sales",size=15, source=source, fill_color="Colors", fill_alpha=0.5, \
line_color="Colors",legend="Product Category")
for cat in list(set(source.data['Product Category'])):
    tmp = source.to_df()
    col = tmp[tmp['Product Category']==cat]['Colors'].values[0]
    plot.line(x="Date",y="Sales",source=source, line_color=col)
show(plot)
Here's what it looks like, which is clearly wrong:
Here's what I want and don't know how to make:
Can Bokeh not make such plots, where scatter markers and lines have the same color per category, with a legend?
With Bokeh it is often helpful to first think about the visualisation you want and then structure the data source accordingly. You want two lines, one per category, with time on the x axis and sales on the y axis. A natural way to structure your data source is then the following:
import pandas as pd
from bokeh.io import output_notebook, show
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure
output_notebook()
df = pd.DataFrame({'Date':['2020-01-01','2020-01-02'],
                   'Shoe Sales':[100, 180],
                   'Grocery Sales': [21, 22]
                   })
df['Date'] = pd.to_datetime(df['Date'])  # the datetime axis needs real datetimes
source = ColumnDataSource(df)
plot = figure(x_axis_type="datetime", plot_width=800, toolbar_location=None)
categories = ["Shoe Sales", "Grocery Sales"]
colors = {"Shoe Sales": "red", "Grocery Sales": "green"}
# One scatter and one line per category, sharing the same source and color
for category in categories:
    # legend_label replaces the deprecated legend keyword (Bokeh 1.4+)
    plot.scatter(x="Date",y=category,size=15, source=source, fill_color=colors[category], legend_label=category)
    plot.line(x="Date",y=category,source=source, line_color=colors[category])
show(plot)
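If the data arrives in the long format used in the question (one row per date and category), it has to be reshaped into one column per category first. A minimal dependency-free sketch of that reshaping, with the question's values hard-coded as hypothetical rows (with pandas, a pivot on the category column would do the same job):

```python
from collections import defaultdict

# Long-format rows as in the question: (date, category, sales).
rows = [
    ("2020-01-01", "shoes", 100),
    ("2020-01-02", "shoes", 180),
    ("2020-01-01", "grocery", 21),
    ("2020-01-02", "grocery", 22),
]


def long_to_wide(rows):
    """Pivot (date, category, value) rows into one column per category."""
    dates = sorted({date for date, _, _ in rows})
    by_category = defaultdict(dict)
    for date, category, value in rows:
        by_category[category][date] = value
    wide = {"Date": dates}
    for category, values in by_category.items():
        # None marks a date with no sale recorded for this category.
        wide[category] = [values.get(date) for date in dates]
    return wide


wide = long_to_wide(rows)
print(wide)
```

The result is a dict of equal-length lists, one per category, which is the shape the loop above expects; the date strings would still need converting to datetimes before plotting.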
The solution is to group your data. Then you can plot a line for each group.
Minimal Example
import pandas as pd
from bokeh.plotting import figure, show, output_notebook
output_notebook()
df = pd.DataFrame({'Date':['2020-01-01','2020-01-02','2020-01-01','2020-01-02'],
'Product Category':['shoes','shoes','grocery','grocery'],
'Sales':[100,180,21,22],'Colors':['red','red','green','green']})
df['Date'] = pd.to_datetime(df['Date'])
plot = figure(x_axis_type="datetime",
plot_width=400,
plot_height=400,
toolbar_location=None
)
plot.scatter(x="Date",
y="Sales",
size=15,
source=df,
fill_color="Colors",
fill_alpha=0.5,
line_color="Colors",
legend_field="Product Category"
)
for color in df['Colors'].unique():
    plot.line(x="Date", y="Sales", source=df[df['Colors']==color], line_color=color)
show(plot)
Output
I'm plotting covid-19 data for countries grouped by World Bank regions using pandas and Bokeh.
from bokeh.io import output_file, show
from bokeh.palettes import Spectral5
from bokeh.plotting import figure
from bokeh.transform import factor_cmap
group = data.groupby(["region", "CountryName"])
index_cmap = factor_cmap(
'region_CountryName',
palette=Spectral5,
factors=sorted(data.region.unique()),
end=1
)
p = figure(plot_width=800, plot_height=600, title="Confirmed cases per 100k people by country",
x_range=group, toolbar_location="left")
p.vbar(x='region_CountryName', top='ConfirmedPer100k_max', width=1, source=group,
line_color="white", fill_color=index_cmap, )
p.y_range.start = 0
p.xgrid.grid_line_color = None
p.xaxis.major_label_orientation = 3.14159/2
p.xaxis.group_label_orientation = 3.14159/2
p.outline_line_color = None
show(p)
And I get a plot.
I would like to set some sort of initial zoom on the x-axis to get a more manageable image, like the one I got by manually zooming in.
Any suggestions?
You should be able to accomplish this with the x_range parameter. In this example, the plot's x range would be the first 20 countries. You can adjust as needed. You might also have to mess around a bit to get the group_cn_list correct. It's hard to say without seeing your data. If you can post a df example for reproducibility, it would help.
group_cn_list = group["CountryName"].tolist()
p = figure(plot_width=800, plot_height=600, title="Confirmed cases per 100k people by country",
x_range=group_cn_list[0:20], toolbar_location="left")
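With a nested categorical axis like the one in the question, the factors are (region, country) tuples rather than plain country names, so one way to build the initial range is to sort those tuples and slice them. A minimal sketch, using hypothetical pairs in place of the two columns of `data`:

```python
# Hypothetical (region, CountryName) pairs standing in for the columns of `data`.
pairs = [
    ("Europe & Central Asia", "France"),
    ("South Asia", "India"),
    ("Europe & Central Asia", "Italy"),
    ("South Asia", "Nepal"),
    ("South Asia", "India"),  # duplicate rows collapse into one factor
]


def first_factors(pairs, n):
    """Return the first n unique (region, country) factors in sorted order."""
    return sorted(set(pairs))[:n]


initial_factors = first_factors(pairs, 3)
print(initial_factors)
```

The resulting list could then be passed as x_range=FactorRange(*initial_factors), with FactorRange imported from bokeh.models. As far as I know, bars whose factors are not in the range are simply not drawn, so treat this as a starting point rather than a true zoom.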
I use Bokeh to make choropleth maps and to create other geographical graphics because Python spits it out a million times faster than I could ever create it in another way.
Right now I am doing something like:
from bokeh.io import show, output_file, export_png, output_notebook, output_file, save
from bokeh.models import ColumnDataSource, HoverTool, LabelSet
from bokeh.plotting import figure
from bokeh.sampledata.us_counties import data as counties
import pandas as pd
desired_counties=['Franklin County, Ohio','Fairfield County, Ohio']
counties2 = {code: county for code, county in counties.items() if county["detailed name"] in desired_counties}
unemploymentRate = {'Location': desired_counties, 'rate': [100, 100]} #placeholders
df = pd.DataFrame(data=unemploymentRate)
county_xs = [county["lons"] for county in counties2.values()]
county_ys = [county["lats"] for county in counties2.values()]
county_names = [county['name'] for county in counties2.values()]
unemployment = [df.loc[index,'rate'] for county in counties2.values() for index, row in df.iterrows() if df.loc[index,'Location']==county['detailed name']]
source = ColumnDataSource(data=dict(x=county_xs,y=county_ys,name=county_names,rate=unemployment))
TOOLS="hover,save"
p = figure(x_axis_location=None, y_axis_location=None, plot_width=350,plot_height=250,tools=TOOLS)
p.grid.grid_line_color = None
p.outline_line_color = None
p.patches('x', 'y', source=source,
fill_color='#007a33',
fill_alpha=1, line_color="black", line_width=3.5)
hover = p.select_one(HoverTool)
hover.point_policy = "follow_mouse"
hover.tooltips = [
("Name", "@name"),
("Rate","@rate{0.0}%")
]
labels = LabelSet(x='x', y='y', text='name', level='glyph',
x_offset=0, y_offset=0, source=source, render_mode='css')
p.add_layout(labels)
p.toolbar.logo = None
p.toolbar_location = None
show(p)
Which gives:
When I hover over the image I see the data I want, but what I would rather have is the data annotated on the image instead for a print possibility. Using the LabelSet class that Bokeh has in their documentation seems perfect, except it's geared toward xy plots and so when trying to use it here it just stacks the labels in the corner.
Questions:
Does anyone know how to label on a Bokeh choropleth without using outside graphics software?
I have ArcGIS Pro, which can obviously create this type of map, but I am looking for something that does it programmatically. I use Bokeh right now because I can basically tell it which central county I want, and I have code that makes a map of the state highlighting that county and its surrounding counties, creates an inset map of the county and its surrounding counties, and gives hover data on the inset map with the unemployment rate in those counties (the picture/code above is a simplified version of that). Is there a way to go about this using ArcPy/ArcGIS instead?
The reason your labels are misplaced is that you tell them to use the fields 'x' and 'y' from the data source, but those columns hold the full list of boundary coordinates for each shape, not a single label position. You could compute the "center" of each shape, add those as separate columns to source, and point the LabelSet at them; then things would display as you want.
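A minimal sketch of computing such centers, assuming each shape is given as parallel lists of longitudes and latitudes as in the question. It uses the plain mean of the boundary vertices, which is only an approximation of the true area centroid, and the names label_x and label_y are hypothetical:

```python
import math


def shape_center(lons, lats):
    """Approximate a shape's center as the mean of its boundary vertices.

    NaN entries (sometimes used to separate multiple rings) are skipped.
    """
    points = [(x, y) for x, y in zip(lons, lats)
              if not (math.isnan(x) or math.isnan(y))]
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return cx, cy


# Hypothetical square "county" with corners at (0, 0) and (2, 2).
county_xs = [[0.0, 2.0, 2.0, 0.0]]
county_ys = [[0.0, 0.0, 2.0, 2.0]]

centers = [shape_center(xs, ys) for xs, ys in zip(county_xs, county_ys)]
label_x = [c[0] for c in centers]
label_y = [c[1] for c in centers]
print(label_x, label_y)
```

These two lists could be added to the ColumnDataSource next to the patch coordinates and referenced from the labels, for example LabelSet(x='label_x', y='label_y', text='name', source=source).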
I am intending to create a scatter plot with a linear colormapper. The dataset is the popular Female Literacy and Birthrate dataset.
The plot would have "GDP per capita" on the x axis and "Life Expectancy at Birth" on the y axis. In addition to this (and this is where I am running into the issue), I want to vary the color of the points according to "Birth rate".
Current Code:
#DATA MANIPULATION
# import Pandas, Bokeh, etc
import numpy as np
import pandas as pd
from bokeh.io import show, output_file
from bokeh.models import ColumnDataSource
from bokeh.palettes import Viridis256 as palette
from bokeh.plotting import figure
from bokeh.sampledata.autompg import autompg as df
from bokeh.transform import linear_cmap
# load the data file
excel_file = '../factbook.xlsx'
#(removed url above since it is private)
factbook = pd.read_excel(excel_file)
source = ColumnDataSource(factbook)
colormapper = linear_cmap(field_name = factbook["Birth rate"], palette=palette, low=min(factbook["Birth rate"]), high=max(factbook["Birth rate"]))
p = figure(title = "UN Factbook Bubble Visualization",
x_axis_label = 'GDP per capita', y_axis_label = 'Life expectancy at birth')
p.circle(x = 'GDP per capita', y = 'Life expectancy at birth', source = source, color =colormapper)
output_file("file", title="Bubble Graph")
show(p)
The p.circle line has a problem consuming the colormapper. I would like help understanding how to resolve this.
The field_name parameter should be provided with the name of a column. You are supplying the entire data column itself. Since you have not provided a complete runnable example, it is impossible to test for sure, but presumably you want:
linear_cmap(field_name="Birth rate", ...)
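For intuition about what the mapper does with that column once it is found by name, here is a rough pure-Python sketch of linear color mapping. It is an illustration of the idea only, not Bokeh's implementation, and the three-color palette and birth-rate values are hypothetical:

```python
def linear_color(value, palette, low, high):
    """Pick the palette entry for value by linear interpolation between low and high."""
    if high == low:
        return palette[0]
    fraction = (value - low) / (high - low)
    index = int(fraction * len(palette))
    # Clamp so values at or beyond the ends map to the first or last color.
    index = max(0, min(len(palette) - 1, index))
    return palette[index]


palette = ["#440154", "#21908d", "#fde725"]  # three-color stand-in for Viridis256
birth_rates = [10.0, 25.0, 40.0]  # hypothetical column values
colors = [
    linear_color(v, palette, min(birth_rates), max(birth_rates))
    for v in birth_rates
]
print(colors)
```

In Bokeh itself this lookup happens on the column named by field_name, which is why a column name is expected there and not the data.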