I am working with the Kaggle dataset "US Accidents" (which can be downloaded here), which has 3 million records of traffic accident data. A quick exploration shows that California has the most accidents. I thought a choropleth would be a nice way to visualise this; however, the data shown on my choropleth is inaccurate, and I am wondering where I am going wrong and how to fix it.
Here is my code...
states_by_accident = df.State.value_counts()
import plotly.graph_objects as go
fig = go.Figure(data = go.Choropleth(
locations = df.State.unique(),
z = states_by_accident,
locationmode = 'USA-states',
colorscale = 'Blues'
))
fig.update_layout(
geo_scope = 'usa'
)
fig.show()
I have tried converting the colors to a log scale which helped spread out the coloring but it still displayed Ohio as having the most accidents which is inaccurate.
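This is happening because df.State.unique() doesn't return the states in the same order as the values in states_by_accident: unique() lists the states in order of first appearance, while value_counts() sorts them by count, so each count ends up paired with the wrong state.
You can fix this by instead passing the argument locations = states_by_accident.index to go.Choropleth so that the locations and values are consistent: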
fig = go.Figure(data = go.Choropleth(
locations = states_by_accident.index,
z = states_by_accident,
locationmode = 'USA-states',
colorscale = 'Blues'
))
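To see the mismatch concretely, here is a minimal sketch with made-up data (the six-row frame below is an assumption for illustration, not the Kaggle dataset):

```python
import pandas as pd

# Toy data (made up): CA occurs most often, but OH appears first in the frame
df = pd.DataFrame({"State": ["OH", "CA", "CA", "TX", "CA", "TX"]})

states_by_accident = df.State.value_counts()

# unique() keeps order of first appearance; value_counts() sorts by count
print(list(df.State.unique()))
print(list(states_by_accident.index))

# Pairing the two, as in the question, attaches CA's count to OH
mispaired = dict(zip(df.State.unique(), states_by_accident.values))

# Using the index of the counts keeps state and count together
paired = dict(zip(states_by_accident.index, states_by_accident.values))
```

In the mispaired mapping OH receives the largest count, which matches the symptom described in the question.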
I want to add units to the y-axis of my bar chart.
I'm using plotly.express for that but didn't find a working solution in the documentation.
text_auto and fig.update_layout() are not working for me right now.
(Tried that thread without success -> Changing Text Inside Plotly Express Bar Charts)
I'm not using a pandas DataFrame right now, but rather my own dictionary that I feed to plotly.
Please bear with me as I'm still new to analysing data with plotly.
import json
import requests
from operator import itemgetter
import plotly.express as px
#hyperlinks = xaxis with description and link to the game
#times = yaxis total playtime (<- where i want to use "xx.xh")
#titles = simple hover text
df = {
"x" : hyperlinks,
"y" : times,
"titles" : titles,
}
fig = px.bar(
df,
x="x",
y="y",
hover_data=["titles"],
color="y",
color_continuous_scale="Plotly3_r",
title=f"Top 30 games with most playtime",
text_auto=".h",
labels={"y" : "entire playtime of steam games"},
)
fig.update_layout(
yaxis={
"tickformat" : '.h'
}
)
fig.show()
fig.write_html("My_most_played_games.html")
I have generated some random values for the example.
Recent versions of plotly give you access to the fully resolved figure parameters through fig.full_figure_for_development(). From there you can read where plotly placed the ticks and regenerate them, appending any string you want:
import plotly.express as px
import numpy as np
#hyperlinks = xaxis with description and link to the game
#times = yaxis total playtime (<- where i want to use "xx.xh")
#titles = simple hover text
df = {
"x" : ['black desert', 'arma 3', 'borderland 2', 'Cyberpunk'],
"y" : [420, 350, 310, 180],
"titles" : ['black desert', 'arma 3', 'borderland 2', 'Cyberpunk'],
}
fig = px.bar(
df,
x="x",
y="y",
hover_data=["titles"],
color="y",
color_continuous_scale="Plotly3_r",
title=f"Top 30 games with most playtime",
text_auto=".h",
labels={"y" : "entire playtime of steam games"},
)
# Important part: recover information from the figure
full_fig = fig.full_figure_for_development() # recover data from figure
range_vl = full_fig.layout.yaxis.range # get range of y axis
distance_tick = full_fig.layout.yaxis.dtick # get distance between ticks
number_ticks = range_vl[1]//full_fig.layout.yaxis.dtick + 1 # calculate number of ticks of your figure
tick_vals = [range_vl[0]+distance_tick*num for num in range(int(number_ticks))] # generate your ticks
tick_text = [f"{val} h" for val in tick_vals] #generate text for your ticks
fig.update_layout(
# set tick mode to array, tickvals to the calculated values and ticktext to the generated text
yaxis={"tickmode":"array","tickvals":tick_vals, "ticktext": tick_text}
)
fig.show()
I'm trying to build a map of European countries with their results in Eurovision.
I have a button to choose between the different countries (Italy, France, Portugal, UK, etc.).
For example, if I choose to see the result of Sweden, I want the map to show the number of points given by the other countries according to a color scale. I have managed to do that.
But I also want to show Sweden itself in, for example, black on the map, to better see where it is and to see the "neighborhood effect" in the voting.
fig3 = go.Figure(data=go.Choropleth(
locations=Euro_tr['Country_code'], # Spatial coordinates
z = Euro_tr['Italy'], # Data to be color-coded
locationmode = "ISO-3",
colorbar_title = "Points donnés",
text=Euro_tr['Country'],
))
fig3.update_layout(
title_text = 'Score Eurovision',
margin={"r":55,"t":55,"l":55,"b":55},
height=500,
geo_scope="europe" ,
)
#Make a button for each country
button=[]
for country in Euro_tr.columns[1:-1] :
    dico=dict (
        label=country,
        method="update",
        args = [{'z': [ Euro_tr[country] ] }],)
    button.append(dico)
fig3.update_layout(
updatemenus=[
dict(
buttons=button,
y=0.9,
x=0,
xanchor='right',
yanchor='top',
active=0,
),
])
As you can see in this example showing the points given to Sweden, I want Sweden to be in a specific color, independent of the other countries, both those that gave points and those that gave none.
Thanks for your help!
I followed the answer from @vestland and managed to show my country of interest in a single color, independent of the others, by using fig.add_traces(go.Choropleth).
To be able to change the data and the trace according to my country of interest, I use Streamlit and a selector widget.
import streamlit as st
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
# Create one figure per country
col_swe = 'Black' # colour of the highlighted country (not defined in the original snippet; value taken from the answer's code)
Graphes=[]
for country in Euro_tr.columns[1:-1] : # loop over each country
    Graphe=go.Figure(data=go.Choropleth(
        locations=Euro_tr['Country_code'], # Spatial coordinates
        z = Euro_tr[country], # Data to be color-coded
        locationmode = "ISO-3",
        colorbar_title = "Points donnés",
        autocolorscale= False,
        colorscale="viridis",
        text=Euro_tr['Country'],
    ))
    # customisation: title according to the country and its points
    Graphe.update_layout(
        title_text = "Total :Points donnés à {fcountry} qui a remporté {fpoints} points".format(fcountry = country, fpoints = Eurovision_tot['Result_tot'][Eurovision_tot["Country"]==country].values[0]),
        margin={"r":55,"t":55,"l":55,"b":55},
        height=500,
    )
    # lock a specific zoom on the map (scope "europe" doesn't cover all Eurovision countries)
    Graphe.update_geos(
        center=dict(lon= 23, lat= 54),
        lataxis_range=[31.0529,-40.4296], lonaxis_range=[-24, 88.2421],
        projection_scale=3
    )
    # add a trace for the specific country
    Graphe.add_traces(go.Choropleth(locations=Country_df['Country_code'][Country_df["Country"]==country],
        z = [1],
        colorscale = [[0, col_swe],[1, col_swe]],
        colorbar=None,
        showscale = False))
    Graphes.append(Graphe)
# create a selectbox to select the country
col12, col22 = st.columns([0.2,0.8]) # columns put the selector on one side and the graph on the other (st.beta_columns in older Streamlit releases)
Pays=list(Euro_tr.columns[1:-1]) # list containing the country names
Selection_Pays = col12.selectbox('',(Pays)) # single-choice selector with the different countries as options
# define action according to the selection.
for country in Pays :
    if Selection_Pays== country : # if this country is selected
        col22.plotly_chart(Graphes[Pays.index(country)]) # plot the corresponding map.
Even though you've built your choropleth map using plotly.express, you can always add a trace using plotly.graph_objects with fig.add_traces(go.Choropleth), such as this:
col_swe = 'Black'
fig.add_traces(go.Choropleth(locations=df_swe['iso_alpha'],
z = [1],
colorscale = [[0, col_swe],[1, col_swe]],
colorbar=None,
showscale = False)
)
To my knowledge it's not possible to define a single color for a single country directly, and that's why I've assigned z=[1] as a value and a custom scale colorscale = [[0, col_swe],[1, col_swe]] to make sure that Sweden is always illustrated in 'Black'.
Code:
import plotly.express as px
import plotly.graph_objects as go
df = px.data.gapminder().query("year==2007")
fig = px.choropleth(df, locations="iso_alpha",
color="lifeExp", # lifeExp is a column of gapminder
hover_name="country", # column to add to hover information
color_continuous_scale=px.colors.sequential.Plasma)
df_swe = df[df['country']=='Sweden']
col_swe = 'Black'
fig.add_traces(go.Choropleth(locations=df_swe['iso_alpha'],
z = [1],
colorscale = [[0, col_swe],[1, col_swe]],
colorbar=None,
showscale = False)
)
f = fig.full_figure_for_development(warn=False)
fig.show()
I am trying to generate several maps with different content based on a dataframe.
So far, I have managed to display the information I needed on the interactive maps.
However, as I need to include the generated maps as figures in a report, I need to find a way to show all the markers in the figures. The problem is that some markers are only shown when I manually zoom in on the area.
Is there a way to always make the markers visible?
Here is the code:
import plotly.graph_objects as go
token = open("token.mapbox_token").read() # you need your own token
df_select = df_map.loc[df_map['Budget'] == 0.9]
fig= go.Figure(go.Scattermapbox(lat=df_select.Latitude, lon=df_select.Longitude,
mode='markers', marker=go.scattermapbox.Marker(
size=df_select.Warehouse_Size*5, color = df_select.Warehouse_Size,
colorscale = ['white','red','orange','green','blue','purple'],
showscale = False)))
fig = fig.add_trace(go.Choroplethmapbox(geojson=br_geo, locations=df_select.State,
featureidkey="properties.UF_05",
z=df_select.Top10,
colorscale=["white","pink"], showscale=False,
zmin = 0,
zmax=1,
marker_opacity=0.5, marker_line_width=1
))
df_prio = df_select.loc[df_select['Prioritisated'] == 1]
fig= fig.add_trace(go.Scattermapbox(lat=df_prio.Latitude, lon=df_prio.Longitude+1,
mode='markers',
marker=go.scattermapbox.Marker(symbol = "campsite", size = 10)))
fig.update_layout(height=850,width = 870,
mapbox_style = "mapbox://styles/rafaelaveloli/ckollp2dg21dd19pmgm3vyebu",
mapbox_zoom=3.4, mapbox_center = {"lat": -14.5 ,"lon": -52},
mapbox_accesstoken = token, showlegend= False)
fig.show()
This is the result I get:
And this is one of the hidden markers that are only visible when zooming in:
How can I make it visible in the first figure, without changing the figure zoom and dimensions?
Passing allowoverlap=True to go.scattermapbox.Marker() seems to resolve the issue (link to relevant docs).
I am currently trying to visualize geographical data for the districts of the city of Hamburg. The creation of the choropleth by using plotly.graph_objects and an associated GeoJSON file is working perfectly fine.
However, as I am plotting the city of Hamburg, it is not possible for me to use one of the specified locationmodes and I have to zoom in manually - for each individual plot, for each execution, which is very cumbersome.
Can I somehow use longitude/latitude coordinates, something like zoom_start similar to Folium, or any other keyword I'm missing to limit the selection programmatically?
For completeness, the code so far is attached. Subplots are created; each subplot shows data from a dataframe as a graph_objects.Choropleth instance and can be manipulated individually (zooming, ...).
import numpy as np
import plotly
import plotly.subplots
import plotly.graph_objects as go
choro_overview = plotly.subplots.make_subplots(
rows=6, cols=2, specs=[[{'type': 'choropleth'}]*2]*6,
subplot_titles = df_main.columns[5:],
horizontal_spacing = 0.1,
)
cbar_locs_x = [0.45, 1]
cbar_locs_y = np.linspace(0.95, 0.05, 6)
for ii, column in enumerate(df_main.columns[5:]):
    placement = np.unravel_index(ii, (6, 2))
    choro_overview.add_trace(
        go.Choropleth(
            locations = df_main['District'],
            z = df_main[column],
            geojson=geojson_src,
            featureidkey='properties.name',
            colorbar=dict(len=np.round(1/9, 1), x=cbar_locs_x[placement[1]], y=cbar_locs_y[placement[0]]),
            name=column,
            colorscale='orrd',
        ), row=placement[0]+1, col=placement[1]+1
    )
I have since found that the relevant keywords are not arguments of go.Choropleth but belong to the figure's geo layout, which can be set by calling update_geos() on the figure.
Credit goes to plotly automatic zooming for "Mapbox maps".
For context, what I'm trying to do is make an emission abatement chart that has the abated emissions being subtracted from the baseline. Mathematically, this is the same as adding the abatement to the residual emission line:
Residual = Baseline - Abated
The expected results should look something like this:
Desired structure of stacked area chart:
I've currently got the stacked area chart to look like this:
As you can see, the stacking of the area chart starts at zero; however, I'm trying to get the stacked areas either added on top of the residual (red) line or subtracted from the baseline (black).
I would do this in Excel by defining a blank area as the first stacked item, equal to the residual line, so that the stacking occurs on top of that. However, I'm not sure if there is a pythonic way to do this in plotly while maintaining the structure and interactivity of the chart.
The shaping of the pandas dataframes is pretty simple, just a randomly generated series of abatement values for each of the subcategories I've set up, that are then grouped to form the baseline and the residual forecasts:
from copy import deepcopy
import pandas as pd
import plotly.express as px
scenario = 'High'
# The baseline data as a line
baseline_line = baselines.loc[baselines['Scenario']==scenario].groupby(['Year']).sum()
# The abatement and residual data
df2 = deepcopy(abatement).drop(columns=['tCO2e'])
df2['Baseline'] = baselines['tCO2e']
df2['Abatement'] = abatement['tCO2e']
df2 = df2.fillna(0)
df2['Residual'] = df2['Baseline'] - df2['Abatement']
df2 = df2.loc[abatement['Scenario']==scenario]
display(df2)
# The residual forecast as a line
emissions_lines = df2.loc[df2['Scenario']==scenario].groupby(['Year']).sum()
The charting is pretty simple as well, using the plotly express functionality:
# Just plotting
fig = px.area(df2,
x = 'Year',
y = 'Abatement',
color = 'Site',
line_group = 'Fuel_type'
)
fig2 = px.line(emissions_lines,
x = emissions_lines.index,
y = 'Baseline',
color_discrete_sequence = ['black'])
fig3 = px.line(emissions_lines,
x = emissions_lines.index,
y = 'Residual',
color_discrete_sequence = ['red'])
fig.add_trace(
fig2.data[0],
)
fig.add_trace(
fig3.data[0],
)
fig.show()
To summarise, I wish to have the Plotly stacked area chart be 'elevated' so that it fits between the residual and baseline forecasts.
NOTE: I've used the term 'baseline' with two meanings here: one specific to my example and one generic to stacked area charts (in the title). The first usage, in the title, means the series from which the stacked area chart starts. Currently, this series is just the x-axis, or zero; I wish to customise this so that I can define a series (in this example, the red residual line) that the stacking starts from.
The second usage of the term 'baseline' refers to the 'baseline forecast', or BAU.
I think I've found a workaround. It is not ideal, but it is similar to the approach I would take in Excel: I've added the 'residual' emissions in the same structure as the categories and concatenated them at the start of the DataFrame, so that everything else is bumped up to sit between the residual and baseline forecasts.
Concatenation step:
# Me trying to make it cleanly at the residual line
df2b = deepcopy(emissions_lines)
df2b['Fuel_type'] = "Base"
df2b['Site'] = "Base"
df2b['Abatement'] = df2b['Residual']
df2c = pd.concat([df2b.reset_index(),df2],ignore_index=True)
Rejigged plotting step, with some recolouring/reformatting of the chart:
# Just plotting
fig = px.area(df2c,
x = 'Year',
y = 'Abatement',
color = 'Site',
line_group = 'Fuel_type'
)
# Making the baseline invisible and ignorable
fig.data[0].line.color='rgba(255, 255, 255,1)'
fig.data[0].showlegend = False
fig2 = px.line(emissions_lines,
x = emissions_lines.index,
y = 'Baseline',
color_discrete_sequence = ['black'])
fig3 = px.line(emissions_lines,
x = emissions_lines.index,
y = 'Residual',
color_discrete_sequence = ['red'])
fig.add_trace(
fig2.data[0],
)
fig.add_trace(
fig3.data[0],
)
fig.show()
Outcome:
I'm going to leave this unresolved, as I see this as not what I originally intended. It currently 'works', but this is not ideal and causes some issues with the interaction with the legend function in the Plotly object.