Adding geopandas boundary plot to plotly - python

I have two separate map plots, one is cultural boundaries in a country (just like the state borders) in the form of custom polygons based on latitude and longitude values, defined in geojson format. I can plot the polygons easily using geopandas:
states = gpd.read_file('myfile.geojson')
states.boundary.plot()
Here is a sample output:
The second is a series of latitudes and longitudes with corresponding values that I need to plot over a map layer, which I can do with plotly express's scatter_mapbox:
fig = px.scatter_mapbox(df_year,
lat='y', lon='x',
color='drought_index',
range_color=(-4, 4),
hover_data={'x': False, 'y': False},
zoom=5, height=800, width=1050,
center={'lat': 32.7089, 'lon': 53.6880},
color_continuous_scale=px.colors.diverging.RdYlGn,
color_continuous_midpoint=0,
)
fig.update_layout(mapbox_style="outdoors", mapbox_accesstoken=mb_token)
Which will look like this:
Is there any way to add these two plots together and have the scatter points and shape boundaries overlap on a single map? Meaning that on top of the mapbox layer, I have the scatter points and the boundaries of the polygons visible.
The problem is that the geopandas plot uses matplotlib and returns an AxesSubplot, and I couldn't find any way to add this to the plotly fig. I tried mpl_to_plotly() from plotly.tools, but it threw a 'Canvas is null' exception.
I also tried to find a way to plot the geojson shapes with plotly, but all I could find was the choropleth mapbox, which requires the shapes to be filled with a color. I tried to use it anyway by decreasing the opacity of the choropleth plot, but it either covers the scatter plot or is barely visible.
Any suggestion on how to approach this is appreciated.

You have really described the solution yourself: use mapbox layers. https://plotly.com/python/mapbox-layers/
I have used UK county boundaries as the cultural layer
I have used UK hospitals to generate a scatter mapbox
"source": json.loads(gdf.geometry.to_json()) is the key step: it adds a GeoJSON layer from a geopandas dataframe
import requests
import geopandas as gpd
import pandas as pd
import json, io
import plotly.express as px
# UK admin area boundaries
res = requests.get("https://opendata.arcgis.com/datasets/69dc11c7386943b4ad8893c45648b1e1_0.geojson")
# geopandas dataframe of "cultural layer"
gdf = gpd.GeoDataFrame.from_features(res.json()["features"], crs="CRS84")
# get some public addresses - hospitals. data that can be scattered
dfhos = pd.read_csv(
    io.StringIO(requests.get("http://media.nhschoices.nhs.uk/data/foi/Hospital.csv").text),
    sep="¬",
    engine="python",
)
fig = (
    px.scatter_mapbox(
        dfhos.head(100),
        lat="Latitude",
        lon="Longitude",
        color="Sector",
        hover_data=["OrganisationName", "Postcode"],
    )
    .update_traces(marker={"size": 10})
    .update_layout(
        mapbox={
            "style": "open-street-map",
            "zoom": 5,
            # the boundaries are drawn as a line layer underneath the scatter trace
            "layers": [
                {
                    "source": json.loads(gdf.geometry.to_json()),
                    "below": "traces",
                    "type": "line",
                    "color": "purple",
                    "line": {"width": 1.5},
                }
            ],
        },
        margin={"l": 0, "r": 0, "t": 0, "b": 0},
    )
)
fig.show()
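The layer definition can also be factored into a small helper. This is a minimal sketch rather than part of the answer above: boundary_layer is a hypothetical name, and the single square polygon is a made-up stand-in for json.loads(gdf.geometry.to_json()). It only builds the dictionary placed in the mapbox "layers" list, so it runs without plotly or geopandas installed.

```python
def boundary_layer(geojson, color="purple", width=1.5):
    """Build a mapbox line layer dict (hypothetical helper) from a GeoJSON dict."""
    return {
        "sourcetype": "geojson",
        "source": geojson,
        "type": "line",  # outlines only, no fill
        "color": color,
        "line": {"width": width},
        "below": "traces",  # same setting as in the answer above
    }

# Made-up stand-in for json.loads(gdf.geometry.to_json()): one square polygon.
square = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {},
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
            },
        }
    ],
}
layer = boundary_layer(square)
print(layer["type"], layer["color"])
```

With a real figure, the result would presumably be passed as fig.update_layout(mapbox={"layers": [boundary_layer(json.loads(gdf.geometry.to_json()))]}).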

Related

Creating an appropriate colorscale for choropleth in plotly

I am attempting to replicate the colorscale in this map: https://www.reddit.com/r/dataisbeautiful/comments/pewez6/oc_us_counties_by_population_density/
I am essentially trying to show variation in color for counties that record population densities under the thousand mark (these counties are washed out by the outliers). I have played around with the range_color argument in px.choropleth and have attempted to create my own colorscales but cannot seem to replicate the colorscale present in the map in the link. I was hoping https://react-colorscales.getforge.io/ might help me but I have not figured it out yet.
I have not been able to understand exactly how to do this, any help would be appreciated.
Here is my best attempt. I am looking for a more gradual gradient for the lower values.
One thing you can consider is clipping the highest and lowest values. The true values can still be shown in the hover information.
import pandas as pd
import requests
import tempfile
from zipfile import ZipFile
from pathlib import Path
import io
import plotly.express as px
# get country population data
with tempfile.TemporaryDirectory() as d:
    with ZipFile(io.BytesIO(requests.get("https://api.worldbank.org/v2/en/indicator/EN.POP.DNST?downloadformat=csv").content)) as zfile:
        zfile.extractall(d)
    df = pd.read_csv(list(Path(d).glob("API_*.*"))[0], skiprows=3)
df = df.loc[:,["Country Name", "Country Code","2020"]].dropna()
# clip population density to stop large and small values running away with cmap
df["Density"] = df["2020"].clip(lower=df["2020"].quantile(.1), upper=df["2020"].quantile(.80))
px.choropleth(
    df,
    locations="Country Code",
    color="Density",
    hover_name="Country Name",
    hover_data={"2020":":.2f", "Density":":.2f"},
    color_continuous_scale=px.colors.sequential.Oranges,
)
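To see what the clipping step does to a skewed column, here is a small standard-library sketch. The quantile and clip helpers and the five sample values are made up for illustration; the interpolation is meant to mirror the linear default of the pandas quantile method, but this is not the pandas implementation.

```python
def quantile(values, q):
    """Linearly interpolated quantile of a list of numbers."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

def clip(values, lower, upper):
    """Limit every value to the range [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]

density = [1, 2, 3, 4, 100]  # made-up values with one large outlier
clipped = clip(density, quantile(density, 0.1), quantile(density, 0.8))
print(clipped)
```

The outlier is pulled down to the 80th percentile, so the remaining values spread over much more of the color scale, while the unclipped column can still be shown in the hover data.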
logarithmic color scale
Another option is to use a logarithmic color scale.
I have also made this color scale discrete. This is not strictly necessary; change to color_continuous_scale=px.colors.sequential.Oranges to use a true continuous scale.
import numpy as np
colors = px.colors.sequential.Oranges
df["Density"] = np.log1p(df["2020"])
edges = pd.cut(df["Density"], bins=len(colors)-1, retbins=True)[1]
edges = edges[:-1] / edges[-1]
# color scales don't like negative edges...
edges = np.maximum(edges, np.full(len(edges), 0))
cc_scale = (
    [(0, colors[0])]
    + [(e, colors[(i + 1) // 2]) for i, e in enumerate(np.repeat(edges, 2))]
    + [(1, colors[-1])]
)
# tick positions are in log space; label them with the original density values
ticks = np.linspace(df["Density"].min(), df["Density"].max(), len(colors))[1:-1]
px.choropleth(
    df,
    locations="Country Code",
    color="Density",
    hover_name="Country Name",
    hover_data={"2020":":.2f", "Density":":.2f"},
    color_continuous_scale=cc_scale
).update_layout(
    coloraxis={
        "colorbar": {
            "tickmode": "array",
            "tickvals": ticks,
            "ticktext": np.expm1(ticks).round(0),
        }
    }
)
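The tick handling relies on log1p and expm1 being inverses of each other: the colors are assigned in log space, while the colorbar labels are converted back to densities. A minimal standard-library sketch of that round trip, using made-up density values and an arbitrary tick count:

```python
import math

densities = [0.1, 5.0, 150.0, 8000.0]  # made-up, heavily skewed values
logged = [math.log1p(v) for v in densities]

# Evenly spaced tick positions in log space...
n_ticks = 5
step = (max(logged) - min(logged)) / (n_ticks - 1)
tickvals = [min(logged) + i * step for i in range(n_ticks)]
# ...labelled with the densities they correspond to.
ticktext = [round(math.expm1(t)) for t in tickvals]
print(ticktext)
```

The labels grow roughly geometrically even though the tick positions are evenly spaced, which is what gives the low-density values more room on the scale.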

Plot bar charts on a map in plotly

I want to plot a bar chart on a map created with plotly, similar to the QGIS plot here. Ideally, the bar chart would be stacked and grouped instead of just grouped. So far, I only found examples for pie charts on plotly maps, for instance here.
With plotly mapbox you can add layers
With plotly you can generate images from figures
Using the above two facts, you can add URI-encoded images to a mapbox figure
You have not provided any sample geometry or data, so I have used a subset of the geopandas sample geometry plus randomly generated data for each country (a separate graph per country)
The real key to this solution is the layer coordinates:
get the centroid of a country
add a buffer around this and get the envelope (bounding rectangle)
arrange the coordinates of the envelope to meet the requirements stated in the link
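The "URI-encoded image" step can be sketched on its own with the standard library. png_data_uri is a hypothetical helper name, and the input here is only the 8-byte PNG signature used as a placeholder, not a real image; in practice the bytes would come from something like fig.to_image(format="png").

```python
import base64

def png_data_uri(png_bytes):
    """Encode raw PNG bytes as a data URI (hypothetical helper name)."""
    return "data:image/png;base64," + base64.b64encode(png_bytes).decode("utf-8")

# Placeholder bytes: just the PNG file signature, not a real image.
fake_png = b"\x89PNG\r\n\x1a\n"
uri = png_data_uri(fake_png)
print(uri)
```

A string of this form is what an image layer's "source" is set to.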
import geopandas as gpd
import pandas as pd
import plotly.express as px
import numpy as np
import base64, io
# create an encoded image of graph...
# change to generate graph you want
def b64image(vals=np.random.randint(1, 25, 5)):
    fig = px.bar(
        pd.DataFrame({"y": vals}).pipe(
            lambda d: d.assign(category=d.index.astype(str))
        ),
        y="y",
        color="category",
    ).update_layout(
        showlegend=False,
        xaxis_visible=False,
        yaxis_visible=False,
        bargap=0,
        margin={"l": 0, "r": 0, "t": 0, "b": 0},
        autosize=False,
        height=100,
        width=100,
        paper_bgcolor="rgba(0,0,0,0)",
        plot_bgcolor="rgba(0,0,0,0)",
    )
    b = io.BytesIO(fig.to_image(format="png"))
    b64 = base64.b64encode(b.getvalue())
    return "data:image/png;base64," + b64.decode("utf-8"), fig
# get some geometry
# note: gpd.datasets was removed in geopandas 1.0, so this line needs an older geopandas
world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))
# let's just work with a bounded version of europe
eur = world.loc[
    lambda d: d["continent"].eq("Europe")
    & ~d["iso_a3"].isin(["RUS", "NOR", "FRA", "ISL"])
]
px.choropleth_mapbox(
    eur,
    geojson=eur.__geo_interface__,
    locations="iso_a3",
    featureidkey="properties.iso_a3",
    color_discrete_sequence=["lightgrey"],
).update_layout(
    margin={"l": 0, "r": 0, "t": 0, "b": 0},
    showlegend=False,
    mapbox_style="carto-positron",
    mapbox_center={
        "lon": eur.unary_union.centroid.x,
        "lat": eur.unary_union.centroid.y,
    },
    mapbox_zoom=3,
    # add a plotly graph per country...
    mapbox_layers=[
        {
            "sourcetype": "image",
            # no data provided, use random values for each country
            "source": b64image(vals=np.random.randint(1, 25, 5))[0],
            # https://plotly.com/python/reference/layout/mapbox/#layout-mapbox-layers-items-layer-coordinates
            # a few hops to get 4 coordinate pairs to meet mapbox requirement
            "coordinates": [
                list(p) for p in r.geometry.centroid.buffer(1.1).envelope.exterior.coords
            ][0:-1][::-1],
        }
        for i, r in eur.iterrows()
    ],
)
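The coordinate juggling is easier to follow without shapely. An image layer expects four [longitude, latitude] pairs in clockwise order starting from the top left corner. The sketch below builds them directly from a center point; image_corners is a hypothetical helper and the center and size are made-up values.

```python
def image_corners(lon, lat, half_size):
    """Corners of a square around (lon, lat), clockwise from top left."""
    west, east = lon - half_size, lon + half_size
    south, north = lat - half_size, lat + half_size
    return [[west, north], [east, north], [east, south], [west, south]]

corners = image_corners(10.0, 50.0, 1.0)
print(corners)  # [[9.0, 51.0], [11.0, 51.0], [11.0, 49.0], [9.0, 49.0]]
```

The centroid, buffer and envelope chain in the answer produces the same kind of square, and the [0:-1][::-1] slicing drops the repeated closing point and reverses the ring into this order.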

Choropleth heat map missing some area

I'm trying to create a heatmap for countries. I've created a custom geojson, which works fine on its own.
Unfortunately, when I link it to the dataframe that holds the amounts to display, only part of the heat map is rendered and some areas are excluded.
Why is that?
Thanks a lot
data and code available here
You have missed one important parameter: featureidkey. This is the join between the geometry and the data frame (locations).
I have also simplified your aggregate to remove the need for reset_index()
full code
import geopandas as gpd
import pandas as pd
import plotly.express as px
gdf = gpd.read_file(
    "https://raw.githubusercontent.com/vincenzojrs/test/main/map-2.geojson"
)
sellers = pd.read_csv(
    "https://raw.githubusercontent.com/vincenzojrs/test/main/sellers.csv"
)
sellersxcity = sellers.groupby(["id_ac"], as_index=False).agg({"num_ord_sell": "sum"})
fig = px.choropleth_mapbox(
    sellersxcity,
    geojson=gdf,
    featureidkey="properties.ID_1",
    locations="id_ac",
    color="num_ord_sell",
    color_continuous_scale="Viridis",
    mapbox_style="carto-positron",
    zoom=3,
    center={"lat": 40.4999, "lon": -3.673},
    labels={"num_ord_sell": "Count for Orders"},
)
fig.update_layout(margin={"r": 0, "t": 0, "l": 0, "b": 0})
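A quick way to check this kind of join before plotting is to compare the location values with the feature identifiers directly. The sketch below is only an illustration of the matching idea, not plotly's internal code; feature_ids is a hypothetical helper and the two-feature GeoJSON is made up.

```python
def feature_ids(geojson, featureidkey="id"):
    """Collect the value at the dotted featureidkey path for each feature."""
    ids = []
    for feature in geojson["features"]:
        value = feature
        for part in featureidkey.split("."):
            value = value[part]
        ids.append(value)
    return ids

# Made-up data: two regions whose real key lives in properties.ID_1.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "id": 0, "properties": {"ID_1": 7}, "geometry": None},
        {"type": "Feature", "id": 1, "properties": {"ID_1": 9}, "geometry": None},
    ],
}
locations = [7, 9]

# Locations with no matching feature under each key.
unmatched_default = set(locations) - set(feature_ids(geojson, "id"))
unmatched_fixed = set(locations) - set(feature_ids(geojson, "properties.ID_1"))
print(unmatched_default, unmatched_fixed)
```

Any location left in the unmatched set has no geometry to color, which is one way areas can end up missing from the map.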

Combined xaxis and header of table with plotly Python

I would like to do something quite similar to the picture with plotly in Python. I tried to find a way with subplots and shared axes, but could not find one that works. Is it possible to share the x axis of a bar chart with the column titles of a table?
This can be simulated with two traces
The first trace is a standard bar chart, with the yaxis domain constrained to the top 80% of the figure
The second trace is a bar of fixed height showing the values as text, plotted against a second yaxis. yaxis2 is constrained to the bottom 10% of the figure
import plotly.express as px
import pandas as pd
import numpy as np
df = pd.DataFrame({"year": range(2011, 2022)}).assign(
    pct=lambda d: np.random.uniform(-0.08, 0.08, len(d))
)
px.bar(df, x="year", y="pct").add_traces(
    px.bar(df, x="year", y=np.full(len(df), 1), text="pct")
    .update_traces(
        yaxis="y2",
        marker={"line": {"color": "black", "width": 1.5}, "color": "#E5ECF6"},
        texttemplate="%{text:,.2%}",
    )
    .data
).update_layout(
    yaxis={"domain": [0.2, 1], "tickformat": ",.2%"},
    yaxis2={"domain": [0, 0.1], "visible": False},
    xaxis={"title": "", "dtick": 1},
)

Python: How to create a step plot with offline plotly for a pandas DataFrame?

Let's say we have the following DataFrame and the corresponding graph:
import pandas as pd
import plotly
from plotly.graph_objs import Scatter
df = pd.DataFrame({"value":[10,7,0,3,8]},
index=pd.to_datetime([
"2015-01-01 00:00",
"2015-01-01 10:00",
"2015-01-01 20:00",
"2015-01-02 22:00",
"2015-01-02 23:00"]))
plotly.offline.plot({"data": [Scatter( x=df.index, y=df["value"] )]})
Expected results
If I use the code below:
import matplotlib.pyplot as plt
plt.step(df.index, df["value"],where="post")
plt.show()
I get a step graph as below:
Question
How can I get the same result as the step function, but using offline plotly instead?
We can set the shape option of the line parameter to hv, using the code below:
trace1 = {
    "x": df.index,
    "y": df["value"],
    "line": {"shape": 'hv'},
    "mode": 'lines',
    "name": 'value',
    "type": 'scatter'
}
data = [trace1]
plotly.offline.plot({
    "data": data
})
Which generates below graph:
As your data is a pandas dataframe, alternatively to offline plotly, you could use plotly express:
import plotly.express as plx
plx.line(df,line_shape='hv')
Other line_shape options, or interpolation methods between given points:
'hv' step ends, equivalent to pyplot's post option;
'vh' step starts;
'hvh' step middles, x axis;
'vhv' step middles, y axis;
'spline' smooth curve between points;
'linear' line segments between points, default value for line_shape.
Just try them...
hv = plx.line(df,line_shape='hv')
vh = plx.line(df,line_shape='vh')
vhv = plx.line(df,line_shape='vhv')
hvh = plx.line(df,line_shape='hvh')
spline = plx.line(df,line_shape='spline')
line = plx.line(df,line_shape='linear')
Which one to choose depends on the nature of your data...
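To make the difference between 'hv' and 'vh' concrete, here is a small sketch that expands a few points into the corner vertices each shape passes through. step_vertices is a hypothetical helper written for illustration only; plotly computes the path itself when line_shape is set.

```python
def step_vertices(xs, ys, shape="hv"):
    """Expand points into the corner vertices of an 'hv' or 'vh' step line."""
    points = [(xs[0], ys[0])]
    for i in range(1, len(xs)):
        if shape == "hv":
            corner = (xs[i], ys[i - 1])  # horizontal first, then vertical
        elif shape == "vh":
            corner = (xs[i - 1], ys[i])  # vertical first, then horizontal
        else:
            raise ValueError("shape must be 'hv' or 'vh'")
        points += [corner, (xs[i], ys[i])]
    return points

print(step_vertices([0, 1, 2], [10, 7, 0], "hv"))
# [(0, 10), (1, 10), (1, 7), (2, 7), (2, 0)]
print(step_vertices([0, 1, 2], [10, 7, 0], "vh"))
# [(0, 10), (0, 7), (1, 7), (1, 0), (2, 0)]
```

With 'hv' each value is held until the next x position, which is why it matches pyplot's post option.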
