Plotly: How to update / redraw a plotly express figure with new data? - python

During debugging or computationally heavy loops, I would like to see how my data processing evolves (for example in a line plot or an image).
In matplotlib the code can redraw / update the figure with plt.cla() and then plt.draw() or plt.pause(0.001), so that I can follow the progress of my computation in real time or while debugging. How do I do that in plotly express (or plotly)?

So I think I essentially figured it out. The trick is to create the figure with go.FigureWidget() instead of go.Figure(). The two look the same, but behind the scenes they are not.
Those FigureWidgets are exactly there to be updated as new data comes in. They stay dynamic, and later calls can modify them.
A FigureWidget can be made from a Figure:
figure = go.Figure(data=data, layout=layout)
f2 = go.FigureWidget(figure)
f2 #display the figure
This is practical, because it makes it possible to use the simplified plotly express interface to create a Figure and then construct a FigureWidget from it. Unfortunately plotly express does not seem to have its own simplified FigureWidget module, so one needs to use the more complicated go.FigureWidget.

I'm not sure whether identical functionality exists in plotly. But you can at least build a figure, expand your data source, and then just replace the data of the figure without touching any of the other figure elements, like this:
for i, col in enumerate(fig.data):
    fig.data[i]['y'] = df[df.columns[i]]
    fig.data[i]['x'] = df.index
It should not matter if your figure is a result of using plotly.express or go.Figure since both approaches will produce a figure structure that can be edited by the code snippet above. You can test this for yourself by setting the two following snippets up in two different cells in JupyterLab.
Code for cell 1
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
# code and plot setup
# settings
pd.options.plotting.backend = "plotly"
# sample dataframe of a wide format
np.random.seed(5); cols = list('abc')
X = np.random.randn(50,len(cols))
df=pd.DataFrame(X, columns=cols)
df.iloc[0]=0;df=df.cumsum()
# plotly figure
fig = df.plot(template = 'plotly_dark')
fig.show()
Code for cell 2
# create or retrieve new data
Y = np.random.randn(1,len(cols))
# organize new data in a df
df2 = pd.DataFrame(Y, columns = cols)
# add last row to df to new values
# this step can be skipped if your real world
# data is not a cumulative process like
# in this example
df2.iloc[-1] = df2.iloc[-1] + df.iloc[-1]
# append new data to existing df
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used here)
df = pd.concat([df, df2], ignore_index=True)
# replace old data in fig with new data
for i, col in enumerate(fig.data):
    fig.data[i]['y'] = df[df.columns[i]]
    fig.data[i]['x'] = df.index
fig.show()
Running the first cell will put together some data and build a figure.
Running the second cell will produce a new dataframe with only one row, append it to your original dataframe, replace the data in your existing figure, and show the figure again. You can run the second cell as many times as you like to redraw your figure with an expanding dataset. After 50 runs, your figure will show 100 observations instead of the original 50.

Related

Plotly Express - plot subset of dataframe columns by default and the rest as option

I am using plotly express to plot figures this way:
fig = px.line(df,
              x=df.index,
              y=df.columns
              )
It displays the graph properly and shows all the columns by default (as lines in the graph), with the option to uncheck (or check) them in the legend to hide whichever ones we want.
What I would like is to show the same graph but with some of the columns unchecked initially, while keeping the option to check or uncheck them for visualization.
This means that I cannot simply plot a subset of the columns as a new data frame, because the other columns are still relevant.
I did not find anything in the documentation, unfortunately...
Thank you in advance.
You can set the visible property of a trace to "legendonly" so that it appears only in the legend until it is clicked. The code below adds all columns to the figure, keeps the first two visible, and leaves all other columns in the legend only.
import plotly.express as px
import pandas as pd
import numpy as np
# simulate dataframe
df = pd.DataFrame(
    {c: np.random.uniform(0, 1, 100) + cn for cn, c in enumerate("ABCDEF")}
)
fig = px.line(df, x=df.index, y=df.columns)
# for example only display first two columns of data frame, all others can be displayed
# by clicking on legend item
fig.for_each_trace(
    lambda t: t.update(visible=True if t.name in df.columns[:2] else "legendonly")
)

How to animate chart with multiple y axis (python)

I am trying to make an animated plot (currently using plotly.express but open to any other solutions) with a secondary y axis. I have read different threads about how to animate a bar chart with multiple groups (Plotly: How to animate a bar chart with multiple groups using plotly express?) and how to make a second axis with plotly express (Plotly: How to plot on secondary y-Axis with plotly express), however I haven't found any answer on how to make an animated plot with a secondary y axis.
Here is my code
import pandas as pd
import plotly.express as px
df = pd.read_csv("plotly_animation_stackoverflow.csv")
px.bar(data_frame=df,x="date",y=["A","B","C"],animation_frame="lag",barmode="group")
and I cannot see the bars for column C because of the difference in scale
There is also an issue with plotly express, as my data frame expands with additional lags. I can easily do this in Tableau, however I am trying to keep this open source. Is there another way to pass a function to a plot so that it applies additional lags as I move the slider?
Here is the data
date,A,B,C,lag
8/22/2016,54987,36488,0.3389,0
8/23/2016,91957,73793,0.3389,0
8/24/2016,91957,73793,0.3357,0
8/25/2016,91957,73793,0.3291,0
8/26/2016,91957,73793,0.3295,0
8/29/2016,91957,73793,0.3281,0
8/30/2016,107657,82877,0.3273,0
8/31/2016,107657,82877,0.3247,0
9/1/2016,107657,82877,0.322,0
9/2/2016,107657,82877,0.3266,0
8/22/2016,54987,36488,NA,1
8/23/2016,91957,73793,0.3389,1
8/24/2016,91957,73793,0.3389,1
8/25/2016,91957,73793,0.3357,1
8/26/2016,91957,73793,0.3291,1
8/29/2016,91957,73793,0.3295,1
8/30/2016,107657,82877,0.3281,1
8/31/2016,107657,82877,0.3273,1
9/1/2016,107657,82877,0.3247,1
9/2/2016,107657,82877,0.322,1
9/3/2016,,,0.3266,1
8/22/2016,54987,36488,,2
8/23/2016,91957,73793,,2
8/24/2016,91957,73793,0.3389,2
8/25/2016,91957,73793,0.3389,2
8/26/2016,91957,73793,0.3357,2
8/29/2016,91957,73793,0.3291,2
8/30/2016,107657,82877,0.3295,2
8/31/2016,107657,82877,0.3281,2
9/1/2016,107657,82877,0.3273,2
9/2/2016,107657,82877,0.3247,2
9/3/2016,,,0.322,2
9/4/2016,,,0.3266,2
After building the figure, update the required traces to use the secondary y-axis. This needs to include the traces within the frames as well as the traces within the figure.
Then configure the secondary y-axis.
import pandas as pd
import plotly.express as px
import io
data = """date,A,B,C,lag
8/22/2016,54987,36488,0.3389,0
8/23/2016,91957,73793,0.3389,0
8/24/2016,91957,73793,0.3357,0
8/25/2016,91957,73793,0.3291,0
8/26/2016,91957,73793,0.3295,0
8/29/2016,91957,73793,0.3281,0
8/30/2016,107657,82877,0.3273,0
8/31/2016,107657,82877,0.3247,0
9/1/2016,107657,82877,0.322,0
9/2/2016,107657,82877,0.3266,0
8/22/2016,54987,36488,NA,1
8/23/2016,91957,73793,0.3389,1
8/24/2016,91957,73793,0.3389,1
8/25/2016,91957,73793,0.3357,1
8/26/2016,91957,73793,0.3291,1
8/29/2016,91957,73793,0.3295,1
8/30/2016,107657,82877,0.3281,1
8/31/2016,107657,82877,0.3273,1
9/1/2016,107657,82877,0.3247,1
9/2/2016,107657,82877,0.322,1
9/3/2016,,,0.3266,1
8/22/2016,54987,36488,,2
8/23/2016,91957,73793,,2
8/24/2016,91957,73793,0.3389,2
8/25/2016,91957,73793,0.3389,2
8/26/2016,91957,73793,0.3357,2
8/29/2016,91957,73793,0.3291,2
8/30/2016,107657,82877,0.3295,2
8/31/2016,107657,82877,0.3281,2
9/1/2016,107657,82877,0.3273,2
9/2/2016,107657,82877,0.3247,2
9/3/2016,,,0.322,2
9/4/2016,,,0.3266,2"""
df = pd.read_csv(io.StringIO(data))
fig = px.bar(data_frame=df,x="date",y=["A","B","C"],animation_frame="lag",barmode="group")
# update appropriate traces to use secondary yaxis
for t in fig.data:
    if t.name == "C":
        t.update(yaxis="y2")
for f in fig.frames:
    for t in f.data:
        if t.name == "C":
            t.update(yaxis="y2")
# configure yaxis2 and give it some space
fig.update_layout(yaxis2={"overlaying":"y", "side":"right"}, xaxis={"domain":[0,.98]})

Plotly graph_objects add df column to hovertemplate

I am trying to generally recreate this graph and struggling with adding a column to the hovertemplate of a plotly Scatter. Here is a working example:
import pandas as pd
import chart_studio.plotly as py
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality'] # This is my closest guess, but isn't working
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
                                y = df['Deaths'],
                                mode='markers',
                                hovertext=df['Country'],
                                hoverlabel=dict(namelength=0),
                                hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{percent}',
                                ))
fig.show()
I'd like to get the column Case-Fatality to show under {percent}
I've also tried putting in the Scatter() call a line for text = [df['Case-Fatality']], and switching {percent} to {text} as shown in this example, but this doesn't pull from the dataframe as hoped.
I've tried replotting it as a px, following this example but it throws the error dictionary changed size during iteration and I think using go may be simpler than px but I'm new to plotly.
Thanks in advance for any insight for how to add a column to the hover.
As the question asks for a solution with graph_objects, here are two that work-
Method (i)
Adding %{text} where you want the variable value to be and passing another variable called text that is a list of values needed in the go.Scatter() call. Like this-
percent = df['Case-Fatality']
hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',text = percent
Here is the complete code-
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality']  # column to show in the hover label
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
                                y = df['Deaths'],
                                mode='markers',
                                hovertext=df['Country'],
                                hoverlabel=dict(namelength=0),
                                hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',
                                text = percent))
fig.show()
Method (ii)
This solution relies on the combined hover label you get when hovermode is set to 'x unified'. All you need to do then is add an invisible trace with the same x values and the desired values on y. Passing mode='none' makes it invisible. Here is the complete code-
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality']  # column to show in the hover label
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
                                y = df['Deaths'],
                                mode='markers',
                                hovertext=df['Country'],
                                hoverlabel=dict(namelength=0)))
fig.add_scatter(x=df.Confirmed, y=percent, mode='none')
fig.update_layout(hovermode='x unified')
fig.show()
The link you shared is broken. Are you looking for something like this?
import pandas as pd
import plotly.express as px
px.scatter(df,
           x="Confirmed",
           y="Deaths",
           hover_name="Country",
           hover_data={"Case-Fatality":True})
Then if you need to use bold or change your hover_template you can follow the last step in this answer
Drawing inspiration from another SO question/answer, I find that this is working as desired and permits adding multiple cols to the hover data:
import pandas as pd
import plotly.express as px
fig = px.scatter(df,
                 x="Confirmed",
                 y="Deaths",
                 hover_name="Country",
                 hover_data=[df['Case-Fatality'], df['Deaths/100K pop.']])
fig.show()

Plotly: How to make line charts colored by a variable using plotly.graph_objects?

I'm making a line chart below. I want to make the lines colored by a variable Continent. I know it can be done easily using plotly.express
Does anyone know how I can do that with plotly.graph_objects? I tried to add color=gapminder['Continent'], but it did not work.
Thanks a lot for help in advance.
import plotly.express as px
gapminder = px.data.gapminder()
import plotly.graph_objects as go
fig = go.Figure()
fig.add_trace(go.Scatter(x=gapminder['year'], y=gapminder['lifeExp'],
                         mode='lines+markers'))
fig.show()
Using an approach like color=gapminder['Continent'] normally applies to scatterplots where you assign categories to existing points using a third variable. You're trying to make a line plot here. This means that not only will you have a color per continent, but also a line per continent. If that is in fact what you're aiming to do, here's one approach:
Plot:
Code:
import plotly.graph_objects as go
import plotly.express as px
# get data
df_gapminder = px.data.gapminder()
# manage data
df_gapminder_continent = df_gapminder.groupby(['continent', 'year']).mean(numeric_only=True).reset_index()
df = df_gapminder_continent.pivot(index='year', columns='continent', values = 'lifeExp')
df.tail()
# plotly setup and traces
fig = go.Figure()
for col in df.columns:
    fig.add_trace(go.Scatter(x=df.index, y=df[col].values,
                             name = col,
                             mode = 'lines'))
# format and show figure
fig.update_layout(height=800, width=1000)
fig.show()

How to plot time series graph in jupyter?

I have tried to plot the data in order to achieve something like this:
But I could not and I just achieved this graph with plotly:
Here is the small sample of my data
Does anyone know how to achieve that graph?
Thanks in advance
You'll find a lot of good stuff on time series at plotly.com/python. Still, I'd like to share some practical details that I find very useful:
Organize your data in a pandas dataframe
Set up a basic plotly structure using fig=go.Figure(go.Scatter())
Make your desired additions to that structure using fig.add_traces(go.Scatter())
Plot:
Code:
import plotly.graph_objects as go
import pandas as pd
import numpy as np
# random data or other data sources
np.random.seed(123)
observations = 200
timestep = np.arange(0, observations/10, 0.1)
dates = pd.date_range('1/1/2020', periods=observations)
val1 = np.sin(timestep)
val2=val1+np.random.uniform(low=-1, high=1, size=observations)#.tolist()
# organize data in a pandas dataframe
df = pd.DataFrame({'Timestep': timestep, 'Date': dates,
                   'Value_1': val1,
                   'Value_2': val2})
# Main plotly figure structure
fig = go.Figure([go.Scatter(x=df['Date'], y=df['Value_2'],
                            marker_color='black',
                            opacity=0.6,
                            name='Value 2')])
# One of many possible additions
fig.add_traces([go.Scatter(x=df['Date'], y=df['Value_1'],
                           marker_color='blue',
                           name='Value 1')])
# plot figure
fig.show()
