Plotly: How to show legend in single-trace scatterplot with plotly express? - python

Sorry beforehand for the long post. I'm new to python and to plotly, so please bear with me.
I'm trying to make a scatterplot with a trendline and a legend that includes the regression parameters, but I can't work out why px.scatter doesn't show a legend for my trace. Here is my code:
fig1 = px.scatter(data_frame = dataframe,
x="xdata",
y="ydata",
trendline = 'ols')
fig1.layout.showlegend = True
fig1.show()
This displays the scatterplot and the trendline, but no legend even when I tried to override it.
I used pio.write_json(fig1, "fig1.plotly") to export it to the JupyterLab Plotly chart studio and add the legend manually, but even though I enabled it there, it doesn't show up in the chart studio either.
I printed the variable with print(fig1) to see what's happening, this is (part of) the result
(Scatter({
'hovertemplate': '%co=%{x}<br>RPM=%{y}<extra></extra>',
'legendgroup': '',
'marker': {'color': '#636efa', 'symbol': 'circle'},
'mode': 'markers',
'name': '',
'showlegend': False,
'x': array([*** some x data ***]),
'xaxis': 'x',
'y': array([*** some y data ***]),
'yaxis': 'y'
}), Scatter({
'hovertemplate': ('<b>OLS trendline</b><br>RPM = ' ... ' <b>(trend)</b><extra></extra>'),
'legendgroup': '',
'marker': {'color': '#636efa', 'symbol': 'circle'},
'mode': 'lines',
'name': '',
'showlegend': False,
'x': array([*** some x data ***]),
'xaxis': 'x',
'y': array([ *** some y data ***]),
'yaxis': 'y'
}))
As we can see, px.scatter hides the legend by default when the figure has a single trace (when I experimented with adding a color argument to px.scatter, the legend did appear). Searching the px.scatter documentation, I can't find anything about overriding this legend setting.
I went back to the exported file (fig1.plotly.json) and manually changed the showlegend entries to True and then I could see the legend in the chart studio, but there has to be some way to do it directly from the command.
Here's the question:
Does anyone know a way to customize the figure objects that Plotly Express creates?
Another workaround I see is to use low level plotly graph object creation, but then I don't know how to add a trendline.
Thank you again for reading through all of this.

You must specify that you'd like to display a legend and provide a legend name like this:
fig['data'][0]['showlegend']=True
fig['data'][0]['name']='Sepal length'
Complete code:
import plotly.express as px
df = px.data.iris() # iris is a pandas DataFrame
fig = px.scatter(df, x="sepal_width", y="sepal_length",
trendline='ols',
trendline_color_override='red')
fig['data'][0]['showlegend']=True
fig['data'][0]['name']='Sepal length'
fig.show()

Related

Plotly make marker overlay add_trace

I have the following Scatterternary plot below. Whenever I add_trace, the marker remains under it (so you cannot even hover it). How can I make the marker circle above the red area? [In implementation, I will have several areas and the marker may move around]
I tried adding fig.update_ternaries(aaxis_layer="above traces",baxis_layer="above traces", caxis_layer="above traces") as shown in the documentation without success. There is also another explanation for the boxplots with the same issue but I don't know how to implement it in this case.
import plotly.graph_objects as go
fig = go.Figure(go.Scatterternary({
'mode': 'markers', 'a': [0.3],'b': [0.5], 'c': [0.6],
'marker': {'color': 'AliceBlue','size': 14,'line': {'width': 2} },}))
fig.update_layout({
'ternary': {
'sum': 100,
'aaxis': {'nticks':1, 'ticks':""},
'baxis': {'nticks':1},
'caxis': {'nticks':1} }})
fig.add_trace(go.Scatterternary(name='RedArea',a=[0.1,0.1,0.6],b=[0.7,0.4,0.5],c=[0.2,0.6,0.8],mode='lines',opacity=0.35,fill='toself',
fillcolor='red'))
fig.update_traces( hovertemplate = "<b>CatA: %{a:.0f}<br>CatB: %{b:.0f}<br>CatC: %{c:.0f}<extra></extra>")
fig.show()
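In this case, the marker can be made visible by swapping the drawing order. Traces are drawn in the order in which they are added to the figure, and I am not aware of a separate setting that controls the drawing order here, so the solution is to change the order of the code: add the filled area first and the marker last. However, it is not clear whether this technique works for all chart types.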
import plotly.graph_objects as go
fig = go.Figure()
fig.update_layout({
'ternary': {
'sum': 100,
'aaxis': {'nticks':1, 'ticks':""},
'baxis': {'nticks':1},
'caxis': {'nticks':1} }})
fig.add_trace(go.Scatterternary(
name='RedArea',
a=[0.1,0.1,0.6],
b=[0.7,0.4,0.5],
c=[0.2,0.6,0.8],
mode='lines',
opacity=0.35,
fill='toself',
fillcolor='red')
)
fig.add_trace(go.Scatterternary({
'mode': 'markers', 'a': [0.3],'b': [0.5], 'c': [0.6],
'marker': {'color': 'AliceBlue','size': 14,'line': {'width': 2} },}))
fig.update_traces( hovertemplate = "<b>CatA: %{a:.0f}<br>CatB: %{b:.0f}<br>CatC: %{c:.0f}<extra></extra>")
fig.show()

Streamlit bar chart with different color for each label

I have a DataFrame that contains two columns:
Nucleotide (ordinal, only unique values)
Similarities (quantitative, count of specific Nucleotide)
I want to plot an interactive bar chart using Streamlit, where each Nucleotide will have different color, like on the example below:
I know how to do it using matplotlib or seaborn, but these figures are not interactive.
Also my approach using vega-lite does not work, because the 'c' argument for the colormap cannot refer to the axis being already used on the plot.
st.vega_lite_chart(df, {
'mark': {'type': 'bar', 'tooltip': True},
'encoding': {
'x': {'field': 'Nucleotide', 'type': 'ordinal'},
'y': {'field': 'Similarities', 'type': 'quantitative'},
'color': {'field': 'Nucleotide', 'type': 'ordinal'},
},
})
Do you maybe have some other ideas?
Altair is a great choice here in my view. It comes out of the box with Streamlit and creates very nice-looking, interactive charts. Basically, you create a Chart object, pass in the data that you want to plot, and use the column names for things like x, y or color.
For your example, the code would read like
import altair as alt
import streamlit as st
chart = (
alt.Chart(data)
.mark_bar()
.encode(
alt.X("Nucleotide:O"),
alt.Y("Similarities"),
alt.Color("Nucleotide:O"),
alt.Tooltip(["Nucleotide", "Similarities"]),
)
.interactive()
)
st.altair_chart(chart)
assuming your dataframe is called data and the columns are called "Nucleotide" and "Similarities".
This would be a very basic bar chart that you can zoom in and hover over to see a tooltip.

Modify values and path in plotly express sunburst using updatemenus

I'm plotting datasets with plotly express as sunburst charts.
One thing I'm trying to achieve is the possibility to select the values to be plotted so that the plot gets updated if the values change meaning that a different column in the dataframe is selected.
I've created an example based on this sunburst example in the official docs
https://plotly.com/python/sunburst-charts/#sunburst-of-a-rectangular-dataframe-with-plotlyexpress
There the column 'total_bill' is selected for plotting and that works. I can recreate the plot in that example.
Now I would like to use updatemenus to switch that to the 'tip' column that also holds floats and should be usable.
The example code I've tried:
import plotly.express as px
df = px.data.tips()
updatemenus = [{'buttons': [{'method': 'update',
'label': 'total_bill',
'args': [{'values': 'total_bill'}]
},
{'method': 'update',
'label': 'tip',
'args': [{'values': 'tip'}]
}],
'direction': 'down',
'showactive': True,}]
fig = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig.update_layout(updatemenus=updatemenus)
fig.show()
Now this will successfully plot the same plot as in the example, but when I select back and forth between the two updatemenu options, it doesn't behave properly.
I've also tried to use Series everywhere, but the result is the same.
I've also looked at this example, which has a similar focus
Plotly: How to select graph source using dropdown?
but the answers there didn't solve my problem either, since the sunburst in some way seems to behave differently from the scatter plot?
Any idea how to get this working?
This is similar to the solution you arrived at: use Plotly Express to build a figure for each column and collect the figures in a dict.
The menu buttons can then be built with a comprehension over that dict.
import plotly.express as px
df = px.data.tips()
# construct figures for all columns that can provide values
figs = {
c: px.sunburst(df, path=["day", "time", "sex"], values=c)
for c in ["total_bill", "tip"]
}
# choose a column that becomes the figure
fig = figs["total_bill"]
# now build menus, that use parameters that have been pre-built using plotly express
fig.update_layout(
updatemenus=[
{
"buttons": [
{
"label": c,
"method": "restyle",
"args": [
{
"values": [figs[c].data[0].values],
"hovertemplate": figs[c].data[0].hovertemplate,
}
],
}
for c in figs.keys()
]
}
]
)
Ok, I found one solution that works, but maybe someone could point me towards a "cleaner" solution, if possible.
I stumbled across this question that is actually unrelated:
Plotly: How to create sunburst subplot using graph_objects?
There, one solution was to save the figure data of a plotly express figure and reuse it in a graph objects figure.
This gave me an idea, that I could maybe save the data of each figure and then reuse (and update/modify) it in a third figure.
And, as it turns out, this works!
So here is the working example:
import plotly.express as px
df = px.data.tips()
# create two figures based on the same data, but with different values
fig1 = px.sunburst(df, path=['day', 'time', 'sex'], values='total_bill')
fig2 = px.sunburst(df, path=['day', 'time', 'sex'], values='tip')
# save the data of each figure so we can reuse that later on
ids1 = fig1['data'][0]['ids']
labels1 = fig1['data'][0]['labels']
parents1 = fig1['data'][0]['parents']
values1 = fig1['data'][0]['values']
ids2 = fig2['data'][0]['ids']
labels2 = fig2['data'][0]['labels']
parents2 = fig2['data'][0]['parents']
values2 = fig2['data'][0]['values']
# create updatemenu dict that changes the figure contents
updatemenus = [{'buttons': [{'method': 'update',
'label': 'total_bill',
'args': [{
'labels': [labels1],
'parents': [parents1],
'ids': [ids1],
'values': [values1]
}]
},
{'method': 'update',
'label': 'tip',
'args': [{
'labels': [labels2],
'parents': [parents2],
'ids': [ids2],
'values': [values2]
}]
}],
'direction': 'down',
'showactive': True}]
# create the actual figure to be shown and modified
fig = px.sunburst(values=values1, parents=parents1, ids=ids1, names=labels1, branchvalues='total')
fig.update_layout(updatemenus=updatemenus)
fig.show()

Plotly: Change y-axis scale

I have a dataset that looks like this:
x y z
0 Jan 28446000 110489.0
1 Feb 43267700 227900.0
When I plot a line chart like this:
px.line(data,x = 'x', y = ['y','z'], line_shape = 'spline', title="My Chart")
The y axis scale comes from 0 to 90 M. The first line on the chart for y is good enough. However, the second line appears to be always at 0M. What can I do to improve my chart such that we can see clearly how the values of both column change over the x values?
Is there any way I can normalize the data? Or perhaps I could change the scaling of the chart.
Oftentimes we work with data on different scales, where rescaling the data would mask a characteristic we wish to display. One way to handle this is to add a secondary y-axis. An example is shown below.
Key points:
Create a layout dictionary object
Add a yaxis2 key to the dict, with the following: 'side': 'right', 'overlaying': 'y1'
This tells Plotly to create a secondary y-axis on the right side of the graph, and to overlay the primary y-axis.
Assign the appropriate trace to the newly created secondary y-axis as: 'yaxis': 'y2'
The other trace does not need to be assigned, as 'y1' is the default y-axis.
Comments (TL;DR):
The example code shown here uses the lower-level Plotly API, rather than a convenience wrapper such as graph_objects or express. The reason is that I (personally) feel it's helpful to users to show what is occurring 'under the hood', rather than masking the underlying code logic with a convenience wrapper.
This way, when the user needs to modify a finer detail of the graph, they will have a better understanding of the lists and dicts which Plotly is constructing for the underlying graphing engine (plotly.js).
The Docs:
Here is a link to the Plotly docs referencing multiple axes.
Example Code:
import pandas as pd
from plotly.offline import iplot
df = pd.DataFrame({'x': ['Jan', 'Feb'],
'y': [28446000, 43267700],
'z': [110489.0, 227900.0]})
layout = {'title': 'Secondary Y-Axis Demonstration',
'legend': {'orientation': 'h'}}
traces = []
traces.append({'x': df['x'], 'y': df['y'], 'name': 'Y Values'})
traces.append({'x': df['x'], 'y': df['z'], 'name': 'Z Values', 'yaxis': 'y2'})
# Add config common to all traces.
for t in traces:
    t.update({'line': {'shape': 'spline'}})
layout['yaxis1'] = {'title': 'Y Values', 'range': [0, 50000000]}
layout['yaxis2'] = {'title': 'Z Values', 'side': 'right', 'overlaying': 'y1', 'range': [0, 400000]}
iplot({'data': traces, 'layout': layout})

How to set the font size for labels in pd.DataFrame.plot()? (pandas)

I'm wondering if it is possible to override the label sizes for a plot generated with pd.DataFrame.plot() method. Following the docs I can easily do that for the xticks and yticks using the fontsize kwarg:
fontsize int, default None
Font size for xticks and yticks.
Unfortunately, I don't see a similar option that would change the size of the xlabel and ylabel.
Here's a snippet visualizing the issue:
import pandas as pd
df = pd.DataFrame(
[
{'date': '2020-09-10', 'value': 10},
{'date': '2020-09-10', 'value': 12},
{'date': '2020-09-10', 'value': 13},
]
)
df.plot(x='date', y='value', xlabel='This is the date.', ylabel='This is the value.', fontsize=10)
df.plot(x='date', y='value', xlabel="This is the date.", ylabel="This is the value.", fontsize=20)
Can I change the size of xlabel and ylabel in a similar manner?
As the documentation states, the result is a matplotlib object (by default, unless you changed the plotting backend). Therefore you can change whatever you like in the same way you would change any matplotlib object:
from matplotlib import pyplot as plt
plt.xlabel('This is the date.', fontsize=18)
plt.ylabel('This is the value.', fontsize=16)
You can keep changing the object as you wish using matplotlib options.
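An equivalent approach works on the Axes object that df.plot() returns, which avoids relying on pyplot's notion of the "current" axes. A minimal sketch, assuming the default matplotlib backend for pandas; the dates are made-up sample values, and the Agg line is only there so the snippet runs without a display (drop it in a notebook).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up sample data with the column names from the question
df = pd.DataFrame(
    {"date": ["2020-09-10", "2020-09-11", "2020-09-12"], "value": [10, 12, 13]}
)

ax = df.plot(x="date", y="value", fontsize=10)  # fontsize only affects the tick labels
ax.set_xlabel("This is the date.", fontsize=18)  # axis label sizes are set separately
ax.set_ylabel("This is the value.", fontsize=16)
```

Working with ax directly also makes it clear which plot is being modified when several figures are open.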
