Putting Linear Trendline on a Plotly Subplot - python

I wanted to know if there was an easier way I could put a linear regression line on a plotly subplot. The code I made below does not appear to be efficient and it makes it difficult to add annotations to the graph for the linear trendlines, which I want placed on the graph. Furthermore, it is hard to make axes and titles with this code.
I was wondering if there was a way I could create a go.Figure and somehow put it on the subplot. I have tried that, but plotly will only allow me to put the data from the figure on the subplot rather than the actual Figure, so I lose the title, axis, and trendline information. In addition, the trendline is hidden on the graphs because the scatterplot is overlaid on top of it. I tried changing how the data was displayed with data=(data[1],data[0]), but that did not work.
Basically, I want to know whether there is a more efficient way to put a trendline on the scatter plots than the one I pursued, so that it is easier to set axes, set the graph size, create legends, etc., since what I coded is difficult to work with.
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

sheets_dict = pd.ExcelFile('10.05.22_EMS172LabReport1.xlsx')
sheets_list = np.array(sheets_dict.sheet_names[2:])
fig = make_subplots(rows=7, cols=1)
i = 0
for name in sheets_list:
    df = sheets_dict.parse(name)
    df.columns = df.columns.str.replace(' ', '')
    df = df.drop(df.index[0])
    slope, y_int = np.polyfit(df.CURR1, df.VOLT1, 1)
    LR = "Linear Fit: {:,.3e}x + {:,.3e}".format(slope, y_int)
    # RMSE: square each residual, average them, then take the root
    rmse = np.sqrt(np.mean((slope*df.CURR1 + y_int - df.VOLT1)**2))
    df['Best Fit'] = slope*df.CURR1 + y_int
    i += 1
    fig.add_trace(
        go.Scatter(name='Best Fit Line' + " ± {:,.3e}V".format(rmse), x=df['CURR1'], y=df['Best Fit'],
                   mode='lines', line_color='red', line_width=2), row=i, col=1)
    fig.add_trace(
        go.Scatter(name='Voltage', x=df['CURR1'], y=df['VOLT1'], mode='markers'),
        row=i, col=1)
# fig.data = (fig.data[1],fig.data[0])
fig.show()

You can add titles and axes labels as follows:
import pandas as pd
import plotly.subplots as ps
fig=ps.make_subplots(rows=5,cols=1,subplot_titles=['Plot 1', 'Plot 2', 'Plot 3', 'Plot 4', 'Plot 5'])
fig.add_scatter(y=[2, 1, 3], row=1, col=1)
fig.add_scatter(y=[3, 1, 5], row=2, col=1)
fig.add_scatter(y=[2, 6, 3], row=3, col=1)
fig.add_scatter(y=[4, 0, 3], row=4, col=1)
fig.add_scatter(y=[3, 2, 3], row=5, col=1)
fig['layout']['xaxis']['title']='X-axis 1'
fig['layout']['xaxis2']['title']='X-axis 2'
fig['layout']['xaxis3']['title']='X-axis 3'
fig['layout']['xaxis4']['title']='X-axis 4'
fig['layout']['xaxis5']['title']='X-axis 5'
fig['layout']['yaxis']['title']='Y-axis 1'
fig['layout']['yaxis2']['title']='Y-axis 2'
fig['layout']['yaxis3']['title']='Y-axis 3'
fig['layout']['yaxis4']['title']='Y-axis 4'
fig['layout']['yaxis5']['title']='Y-axis 5'
fig.show()
The subplot_titles parameter in the make_subplots function is used to add plot titles. The fig['layout']['(x/y)axis(number)']['title'] is used to set the axes labels. Alternatively you can use:
fig.update_yaxes(title_text="yaxis 1 title", row=1, col=1)
or
fig.update_xaxes(title_text="xaxis 1 title", row=1, col=1)
To alter the plot sizes or spacing you can play around with the column_widths/row_heights or vertical_spacing/horizontal_spacing parameters of make_subplots:
https://plotly.com/python-api-reference/plotly.subplots.html#subplots
As for the legend, there's no direct way of associating a legend with a subplot other than what you already have but the first comment in the following link shows a way of adding an annotation on the subplot that can act sort of like a legend:
https://community.plotly.com/t/associating-subplots-legends-with-each-subplot-and-formatting-subplot-titles/33786

Trendlines are implemented in plotly.express with extensive functionality (see the plotly.express trendline documentation). It is possible to build a subplot from that figure's data, but I created the subplots with graph objects so that your current code can be reused.
Since you did not provide data, I used the example data from the reference: a data frame of relative stock prices for several companies. A trend line is added to each series.
As for the graph, I set the height explicitly because stacked subplots need it. Axis labels are added per subplot by specifying its row and column; if you need axis titles on every subplot, add them the same way. As a customization of the legend, the traces are grouped into one group for the trend lines and one group for the rate of change. As an example of annotations, the RMSE value of each fit is placed near the left edge of each subplot (x=0.1), at the height of the maximum value of that series.
import plotly.express as px
import plotly.graph_objects as go
import numpy as np
df = px.data.stocks()
df.head()
         date      GOOG      AAPL      AMZN        FB      NFLX      MSFT
0  2018-01-01  1.000000  1.000000  1.000000  1.000000  1.000000  1.000000
1  2018-01-08  1.018172  1.011943  1.061881  0.959968  1.053526  1.015988
2  2018-01-15  1.032008  1.019771  1.053240  0.970243  1.049860  1.020524
3  2018-01-22  1.066783  0.980057  1.140676  1.016858  1.307681  1.066561
4  2018-01-29  1.008773  0.917143  1.163374  1.018357  1.273537  1.040708
from plotly.subplots import make_subplots
fig = make_subplots(rows=6, cols=1, subplot_titles=df.columns[1:].tolist())
for i, c in enumerate(df.columns[1:]):
    dff = df[[c]].copy()
    slope, y_int = np.polyfit(dff.index, dff[c], 1)
    LR = "Linear Fit: {:,.3e}x + {:,.3e}".format(slope, y_int)
    # RMSE: root of the mean squared residual
    rmse = np.sqrt(np.mean((slope*dff.index + y_int - dff[c])**2))
    dff['Best Fit'] = slope*dff.index + y_int
    fig.add_trace(go.Scatter(
        name='Best Fit Line' + " ± {:,.3e}V".format(rmse),
        x=dff.index,
        y=dff['Best Fit'],
        mode='lines',
        line_color='blue',
        line_width=2,
        legendgroup='group1',
        legendgrouptitle_text='Trendline'), row=i+1, col=1)
    fig.add_trace(go.Scatter(
        x=dff.index,
        y=dff[c],
        legendgroup='group2',
        legendgrouptitle_text='Rate of change',
        mode='markers+lines', name=c), row=i+1, col=1)
    # place the RMSE text near the left edge, at the series maximum
    fig.add_annotation(x=0.1,
                       y=dff[c].max(),
                       xref='x',
                       yref='y',
                       text='{:,.3e}'.format(rmse),
                       showarrow=False,
                       yshift=5, row=i+1, col=1)
fig.update_layout(autosize=True, height=800, title_text="Stock and Trendline")
fig.update_xaxes(title_text="index", row=6, col=1)
fig.update_yaxes(title_text="Rate of change", row=3, col=1)
fig.show()

Related

easiest way to put several Python Plotly Express figures on one html file

I want to keep each figure exactly as it is, for example with its own data source, its own subplot title, and its own legend next to EACH figure.
Currently the code is something like:
fig1 = px.line(df_crude_spot_long, x="Date", y="$/bbl", color='type', title='Benchmark Crude Spot Prices', color_discrete_sequence=px.colors.qualitative.Bold)
fig1.update_layout(
    xaxis_title="",
    title={
        # 'text': "Plot Title",
        # 'y':0.9,
        'x':0.5},
    legend_title="Benchmark",
    font=dict(
        family="Courier New, monospace",
        size=40,
        color="navy"),
    legend=dict(
        yanchor="top",
        y=0.99,
        xanchor="left",
        x=0.01
    )
)
fig2 = px.line(df_crude_futures, x="contract month", y='Price', color='futures', title='Latest Crude Oil Futures', color_discrete_sequence=px.colors.qualitative.Bold)
fig2.update_layout(
    title={
        # 'text': "Plot Title",
        # 'y':0.9,
        'x':0.5},
    legend_title="Futures",
    font=dict(
        family="Courier New, monospace",
        size=40,
        color="navy"),
    legend=dict(
        yanchor="top",
        y=0.99,
        xanchor="left",
        x=0.8
    )
)
fig3
fig4
....
As you can see, the data sources for the different figures are not the same dataframe.
I tried the second approach in this post (Is it possible to create a subplot with Plotly Express?), combining make_subplots and plotly express with code like this:
fig = make_subplots(
    rows=2, cols=2,
    subplot_titles=("BenchmarkPrices", "Latest Oil Futures", "Bunker Prices", "Fuel Futures"))
for d in fig1.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=1, col=1)
for d in fig2.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=2, col=1)
for d in fig3.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=2, col=1)
for d in fig4.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=2, col=2)
but in the result some of the figures are not shown properly, and all the legends are put together on the right side.
I mentioned the HTML file in the title because I generally save my figure like below:
offline.plot({'data':fig},filename='charts.html',auto_open=False)
Update 1
The comment section suggested the method in this post:
Plotly saving multiple plots into a single html
It puts several figures into one HTML file but doesn't solve my problem, because: 1. I need to produce a 2 x 4 grid of charts (2 charts per row and 4 rows in total), and this approach puts only 1 chart per row. 2. When I download a PNG from the HTML page, it only contains the first figure, even though the page shows all 4.

Use one color for multiple traces added to a Figure using Plotly

I have started from the following example:
from plotly.subplots import make_subplots
from plotly import graph_objects as go
fig = make_subplots(rows=3, cols=1, subplot_titles=["foo", "bar", "goo"])
for i in range(3):
    fig.add_trace(go.Box(x=list(range(100)), boxmean="sd", showlegend=False), row=i + 1, col=1)
fig.update_layout(height=600, width=1200, title_text="Yo Yo")
fig
It yields three box plots in three rows of a subplots Plotly container:
My objective is:
Get rid of the trace X strings on the left.
Use the same color for all three subplots.
By using:
fig.add_trace(go.Box(x=list(range(100)), boxmean="sd", showlegend=False, fillcolor="blue"), row=i + 1, col=1)
I'm getting closer to the second objective, but it is not yet there:
I'm guessing I can ask for a color cycle consisting of a single color; but I didn't manage to do that.
You have already set the fill color and got a result, so I think the remaining task is to align the line colors. The "trace X" labels on the y-axis can be removed by setting each trace's name to an empty string. There are other ways to do this, but I think this is the easiest.
from plotly.subplots import make_subplots
from plotly import graph_objects as go
fig = make_subplots(rows=3, cols=1, subplot_titles=["foo", "bar", "goo"])
for i in range(3):
    fig.add_trace(go.Box(x=list(range(100)),
                         boxmean="sd",
                         fillcolor='blue',
                         line={'color':'red'},
                         name='',
                         showlegend=False), row=i + 1, col=1)
fig.update_layout(height=600, width=1200, title_text="Yo Yo")
fig.show()

Is it possible to create a subplot with Plotly Express?

I would like to create a subplot from 2 plots generated with the function plotly.express.line. Is it possible? Given the 2 plots:
fig1 =px.line(df, x=df.index, y='average')
fig1.show()
fig2 = px.line(df, x=df.index, y='Volume')
fig2.show()
I would like to generate a single figure formed by 2 subplots (in the example, fig1 and fig2).
Yes, you can build subplots using plotly express. Either
1. directly through the arguments facet_row and facet_col (in which case we often talk about facet plots, but they're the same thing), or
2. indirectly through "stealing" elements from figures built with plotly express and using them in a standard make_subplots() setup with fig.add_traces()
Method 1: Facet and Trellis Plots in Python
Although plotly.express supports data of both wide and long format, I often prefer building facet plots from the latter. If you have a dataset such as this:
Date variable value
0 2019-11-04 average 4
1 2019-11-04 average 2
.
.
8 2019-12-30 volume 5
9 2019-12-30 volume 2
then you can build your subplots through:
fig = px.line(df, x='Date', y = 'value', facet_row = 'variable')
Plot 1:
By default, px.line() will apply the same color to both lines, but you can easily change that with, for example:
fig.update_traces(line_color='red', row=2)
This complete snippet shows you how:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
df = pd.DataFrame({'Date': ['2019-11-04', '2019-11-04', '2019-11-18', '2019-11-18', '2019-12-16', '2019-12-16', '2019-12-30', '2019-12-30'],
                   'variable':['average', 'volume', 'average', 'volume', 'average','volume','average','volume'],
                   'value': [4,2,6,5,6,7,5,2]})
fig = px.line(df, x='Date', y = 'value', facet_row = 'variable')
fig.update_traces(line_color = 'red', row = 2)
fig.show()
Method 2: make_subplots
Since plotly express can do some pretty amazing stuff with fairly complicated datasets, I see no reason why you should not stumble upon cases where you would like to use elements of a plotly express figure as a source for a subplot. And that is very possible.
Below is an example where I've built two plotly express figures using px.line on the px.data.stocks() dataset. Then I go on to extract some elements of interest using add_trace and go.Scatter in a for loop to build a subplot setup. You could certainly argue that you could just as easily do this directly on the data source. But then again, as initially stated, plotly express can be an excellent data handler in itself.
Plot 2: Subplots using plotly express figures as source:
Complete code:
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
from plotly.subplots import make_subplots
df = px.data.stocks().set_index('date')
fig1 = px.line(df[['GOOG', 'AAPL']])
fig2 = px.line(df[['AMZN', 'MSFT']])
fig = make_subplots(rows=2, cols=1)
for d in fig1.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=1, col=1)
for d in fig2.data:
    fig.add_trace((go.Scatter(x=d['x'], y=d['y'], name = d['name'])), row=2, col=1)
fig.show()
There is no need to use the graph_objects module if you have already generated px figures for the subplots. Here is the full code.
import plotly.express as px
import pandas as pd
from plotly.subplots import make_subplots
df = px.data.stocks().set_index('date')
fig1 = px.line(df[['GOOG', 'AAPL']])
fig2 = px.line(df[['AMZN', 'MSFT']])
fig = make_subplots(rows=2, cols=1)
fig.add_trace(fig1['data'][0], row=1, col=1)
fig.add_trace(fig1['data'][1], row=1, col=1)
fig.add_trace(fig2['data'][0], row=2, col=1)
fig.add_trace(fig2['data'][1], row=2, col=1)
fig.show()
If there are more than two variables in each plot, one can also use a for loop to add the traces with the fig.add_trace method.
From the documentation, Plotly express does not support arbitrary subplot capabilities. You can instead use graph objects and traces (note that go.Scatter is equivalent):
import pandas as pd
from plotly.subplots import make_subplots
import plotly.graph_objects as go
## create some example data
df = pd.DataFrame(
    data={'average':[1,2,3], 'Volume':[7,3,6]},
    index=['a','b','c']
)
fig = make_subplots(rows=1, cols=2)
fig.add_trace(
    go.Scatter(x=df.index, y=df.average, name='average'),
    row=1, col=1
)
fig.add_trace(
    go.Scatter(x=df.index, y=df.Volume, name='Volume'),
    row=1, col=2
)
fig.show()

I can't figure out how to make my Plotly charts show whole numbers only

image of plotly chart
Hello, I'm really struggling to figure out how to format the axes on this chart. I've gone through the documentation and tried all sorts of different formatting suggestions from here and elsewhere but really not getting it. As you can see, the bottom chart has a .5 number, I want that to be skipped altogether and only have whole numbers along the axis.
I've seen ,d as a tickformat option to do this in about every answer, but I can't get that to work or I'm not seeing how to apply it to the second chart.
Can anyone with some Plotly charting experience help me out?
Here's the pertinent code:
def create_chart():
    #Put data together into an interactive chart
    fig.update_layout(height=500, width=800, yaxis_tickprefix = '$', hovermode='x unified', xaxis_tickformat =',d',
                      template=symbol_template, separators=".", title_text=(df.columns[DATA_COL_1]) + " & Units 2015-2019"
                      )
I believe what is happening is that the xaxis_tickformat parameter affects only the first subplot, not the second one. To modify the formatting for each subplot, you can pass a dictionary with the tickformat parameter to yaxis, yaxis2, and so on for however many subplots you have (in your case, only 2).
import pandas as pd
from plotly.subplots import make_subplots
import plotly.graph_objects as go
## recreate the df
df = pd.DataFrame({'Year':[2015,2016,2017,2018,2019],
                   'Sales':[8.8*10**7,8.2*10**7,8.5*10**7,9.1*10**7,9.6*10**7],
                   'Units':[36200,36500,36900,37300,37700]})
def create_chart():
    #Put data together into an interactive chart
    fig = make_subplots(rows=2, cols=1)
    fig.add_trace(go.Scatter(
        x=df.Year,
        y=df.Sales,
        name='Sales',
        mode='lines+markers'
    ), row=1, col=1)
    fig.add_trace(go.Scatter(
        x=df.Year,
        y=df.Units,
        name='Units',
        mode='lines+markers'
    ), row=2, col=1)
    fig.update_layout(
        title_x=0.5,
        height=500,
        width=800,
        yaxis_tickprefix = '$',
        hovermode='x unified',
        xaxis_tickformat =',d',
        ## this will change the formatting for BOTH subplots
        yaxis=dict(tickformat ='d'),
        yaxis2=dict(tickformat ='d'),
        # template=symbol_template,
        separators=".",
        title={
            'text':"MCD Sales & Units 2015-2019",
            'x':0.5
        }
    )
    fig.show()
create_chart()

Plotly:How to create subplots with python?

I am wondering what is best practice to create subplots using Python Plotly. Is it to use plotly.express or the standard plotly.graph_objects?
I'm trying to create a figure with two subplots, which are stacked bar charts. The following code doesn't work. I didn't find anything useful in the official documentation. The classic Titanic dataset was imported as train_df here.
import plotly.express as px
from plotly.subplots import make_subplots
train_df['Survived'] = train_df['Survived'].astype('category')
fig1 = px.bar(train_df, x="Pclass", y="Age", color='Survived')
fig2 = px.bar(train_df, x="Sex", y="Age", color='Survived')
trace1 = fig1['data'][0]
trace2 = fig2['data'][0]
fig = make_subplots(rows=1, cols=2, shared_xaxes=False)
fig.add_trace(trace1, row=1, col=1)
fig.add_trace(trace2, row=1, col=2)
fig.show()
I got the following figure:
What I expect is as follows:
I'm hoping that the existing answer suits your needs, but I'd just like to note that the statement
it's not possible to subplot stacked bars (because stacked bars are faceted figures and not traces)
is not entirely correct. It's possible to build a plotly subplot figure using stacked bar charts as long as you put it together correctly using add_trace() and go.Bar(). This also answers your question:
I am wondering what is best practice to create subplots using Python Plotly. Is it to use plotly.express or the standard plotly.graph_objects?
Use plotly.express if you find a px approach that suits your needs. And if, as in your case, you do not find one, build your own subplots using plotly.graph_objects.
Below is an example that shows one such approach using the Titanic dataset. Note that the column names are not the same as yours, since there are no capital letters. The essence of this approach is that you use go.Bar() for each trace and specify where to put it using the row and col arguments of fig.add_trace(). If you assign multiple traces to the same row and col, you get stacked bar chart subplots once you set barmode='stack' in fig.update_layout(). Using px.colors.qualitative.Plotly[i] lets you assign colors from the standard plotly color cycle sequentially.
Plot:
Code:
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import plotly.express as px
import pandas as pd
url = "https://raw.github.com/mattdelhey/kaggle-titanic/master/Data/train.csv"
titanic = pd.read_csv(url)
#titanic.info()
train_df=titanic
train_df
# data for fig 1
df1=titanic.groupby(['sex', 'pclass'])['survived'].aggregate('mean').unstack()
# plotly setup for fig
fig = make_subplots(2,1)
fig.add_trace(go.Bar(x=df1.columns.astype('category'), y=df1.loc['female'],
                     name='female',
                     marker_color = px.colors.qualitative.Plotly[0]),
              row=1, col=1)
fig.add_trace(go.Bar(x=df1.columns.astype('category'), y=df1.loc['male'],
                     name='male',
                     marker_color = px.colors.qualitative.Plotly[1]),
              row=1, col=1)
# data for plot 2
age = pd.cut(titanic['age'], [0, 18, 80])
df2 = titanic.pivot_table('survived', [age], 'pclass')
groups=['(0, 18]', '(18, 80]']
fig.add_trace(go.Bar(x=df2.columns, y=df2.iloc[0],
                     name=groups[0],
                     marker_color = px.colors.qualitative.Plotly[3]),
              row=2, col=1)
fig.add_trace(go.Bar(x=df2.columns, y=df2.iloc[1],
                     name=groups[1],
                     marker_color = px.colors.qualitative.Plotly[4]),
              row=2, col=1)
fig.update_layout(title=dict(text='Titanic survivors by sex and age group'), barmode='stack', xaxis = dict(tickvals= df1.columns))
fig.show()
From what I know, it's not possible to subplot stacked bars (because stacked bars are faceted figures and not traces)...
Instead of fig.show(), you can write both figures to a single HTML file and check whether that works for you (unfortunately, the plots end up one under the other):
with open('p_graph.html', 'a') as f:
    f.write(fig1.to_html(full_html=False, include_plotlyjs='cdn',default_height=500))
    f.write(fig2.to_html(full_html=False, include_plotlyjs='cdn',default_height=500))
Try the code below to check whether the generated HTML file works for you:
import pandas as pd
import plotly.graph_objects as go
#Remove the .astype('category') to easily
#train_df['Survived'] = train_df['Survived'].astype('category')
Pclass_pivot=pd.pivot_table(train_df,values='Age',index='Pclass',
                            columns='Survived',aggfunc=lambda x: len(x))
Sex_pivot=pd.pivot_table(train_df,values='Age',index='Sex',
                         columns='Survived',aggfunc=lambda x: len(x))
fig1 = go.Figure(data=[
    go.Bar(name='Survived', x=Pclass_pivot.index.values, y=Pclass_pivot[1]),
    go.Bar(name='NotSurvived', x=Pclass_pivot.index.values, y=Pclass_pivot[0])])
# Change the bar mode
fig1.update_layout(barmode='stack')
fig2 = go.Figure(data=[
    go.Bar(name='Survived', x=Sex_pivot.index.values, y=Sex_pivot[1]),
    go.Bar(name='NotSurvived', x=Sex_pivot.index.values, y=Sex_pivot[0])])
# Change the bar mode
fig2.update_layout(barmode='stack')
with open('p_graph.html', 'a') as f:
    f.write(fig1.to_html(full_html=False, include_plotlyjs='cdn',default_height=500))
    f.write(fig2.to_html(full_html=False, include_plotlyjs='cdn',default_height=500))
I managed to generate the subplots using the add_bar function.
Code:
from plotly.subplots import make_subplots
# plotly can only support one legend per graph at the moment.
fig = make_subplots(
    rows=1, cols=2,
    subplot_titles=("Pclass vs. Survived", "Sex vs. Survived")
)
fig.add_bar(
    x=train_df[train_df.Survived == 0].Pclass.value_counts().index,
    y=train_df[train_df.Survived == 0].Pclass.value_counts().values,
    text=train_df[train_df.Survived == 0].Pclass.value_counts().values,
    textposition='auto',
    name='Survived = 0',
    row=1, col=1
)
fig.add_bar(
    x=train_df[train_df.Survived == 1].Pclass.value_counts().index,
    y=train_df[train_df.Survived == 1].Pclass.value_counts().values,
    text=train_df[train_df.Survived == 1].Pclass.value_counts().values,
    textposition='auto',
    name='Survived = 1',
    row=1, col=1
)
fig.add_bar(
    x=train_df[train_df.Survived == 0].Sex.value_counts().index,
    y=train_df[train_df.Survived == 0].Sex.value_counts().values,
    text=train_df[train_df.Survived == 0].Sex.value_counts().values,
    textposition='auto',
    marker_color='#636EFA',
    showlegend=False,
    row=1, col=2
)
fig.add_bar(
    x=train_df[train_df.Survived == 1].Sex.value_counts().index,
    y=train_df[train_df.Survived == 1].Sex.value_counts().values,
    text=train_df[train_df.Survived == 1].Sex.value_counts().values,
    textposition='auto',
    marker_color='#EF553B',
    showlegend=False,
    row=1, col=2
)
fig.update_layout(
    barmode='stack',
    height=400, width=1200,
)
fig.update_xaxes(ticks="inside")
fig.update_yaxes(ticks="inside", col=1)
fig.show()
Resulting plot:
Hope this is helpful to the newbies of plotly like me.
