I'm trying to build a time-series heatmap covering the 24 hours of each day of the week, and I want each day's colors to be scaled against that day's own values only. Here's what I've done in Plotly so far.
The problem is that the "highest" color appears only in the 2nd row, which holds the overall maximum. My desired output, made in Excel, is this one:
Each row clearly shows its own green, since each row has its own conditional formatting.
My code:
import plotly.express as px
import pandas as pd
df = pd.read_csv('test0.csv', header=None)
fig = px.imshow(df, color_continuous_scale=['red', 'green'])
fig.update_coloraxes(showscale=False)
fig.show()
The csv file:
0,0,1,2,0,5,2,3,3,5,8,4,7,9,9,0,4,5,2,0,7,6,5,7
1,3,4,9,4,3,3,2,12,15,6,9,1,4,3,1,1,2,5,3,4,2,5,8
9,6,7,1,3,4,5,6,9,8,7,8,6,6,5,4,5,3,3,6,4,8,9,10
8,7,8,6,7,5,4,6,6,7,8,5,5,6,5,7,5,6,7,5,8,6,4,4
3,4,2,1,1,2,2,1,2,1,1,1,1,3,4,4,2,2,1,1,1,2,4,3
3,5,4,4,4,6,5,5,5,4,3,7,7,8,7,6,7,6,6,3,4,3,3,3
5,4,4,5,4,3,1,1,1,1,2,2,3,2,1,1,4,3,4,5,4,4,3,4
I've solved it! I had to make the heatmaps by row and combine them.
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import pandas as pd
import calendar
df = pd.read_csv('test0.csv', header=None)
# initialize subplots with vertical_spacing as 0 so the rows are right next to each other
fig = make_subplots(rows=7, cols=1, vertical_spacing=0)
# shift sunday to first position
days = list(calendar.day_name)
days = days[-1:] + days[:-1]
for index, row in df.iterrows():
    row_list = row.tolist()
    sub_fig = go.Heatmap(
        x=list(range(0, 24)),  # hours
        y=[days[index]],  # days of the week
        z=[row_list],  # data
        colorscale=[
            [0, '#FF0000'],
            [1, '#00FF00']
        ],
        showscale=False
    )
    # insert heatmap into its own subplot row (add_trace replaces the deprecated append_trace)
    fig.add_trace(sub_fig, row=index + 1, col=1)
fig.show()
Output:
I have a dataset with 3 columns: Index (date_time), Label, and Value. Label can be one of 6 different sensors in a pharmaceutical reaction vessel. When I chart the data using Plotly I get a continuous series over time with sections that look like this (image: Plotly chart of my data).
T0 denotes the beginning of a chemical reaction. As I have multiple occurrences of this chemical reaction, I want to create discrete "batches" from T0 to T0 + 4 hours. These will then be used to analyze the variance across all of the batches. Sometimes the chemical reaction does not complete after several hours, so my task is to figure out why.
I also have an external dataset that has labels "good" or "bad", so I was also hoping to change the data format to wide and have a target column for each batch.
This is all my code so far, using peakutils to try to estimate the peaks for T0:
import glob
import pandas as pd
import peakutils
import matplotlib.pyplot as plt
from peakutils.plot import plot as pplot
from matplotlib import pyplot
%matplotlib inline
# Get CSV files list from a folder
path = [path]
csv_files = glob.glob(path + "/*.csv")
# Read each CSV file into a DataFrame
# This creates a generator of DataFrames, consumed by pd.concat below
df_list = (pd.read_csv(file) for file in csv_files)
# Concatenate all DataFrames
big_df = pd.concat(df_list, ignore_index=True)
data = big_df
data[' Date'] = pd.to_datetime(data[' Date'], dayfirst=True)
data = data.sort_values(by = ' Date')
import plotly.express as px
import plotly.io as pio
pio.renderers.default='browser'
fig = px.line(data, x = ' Date', y = ' Value', color = ' Pen Name')
fig.show()
x = wt_df[' Date']
y = wt_df[' Value']
indexes = peakutils.indexes(y, thres=0.5, min_dist=300)
print(indexes)
print(x[indexes], y[indexes])
pyplot.figure(figsize=(10,6))
pplot(x, y, indexes)
pyplot.title('First estimate')
This is the peakutils output.
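One way the batching step described above might look, assuming the T0 timestamps have already been found (for example from the detected peaks). The function slice_batches, its parameters, and the demo frame are hypothetical names for illustration; the time column is called "Date" here, whereas the real data uses ' Date' with a leading space.

```python
import pandas as pd


def slice_batches(data, t0_times, time_col="Date", hours=4):
    """Return one DataFrame per T0, covering the window [T0, T0 + hours)."""
    window = pd.Timedelta(hours=hours)
    batches = []
    for batch_id, t0 in enumerate(pd.to_datetime(list(t0_times))):
        mask = (data[time_col] >= t0) & (data[time_col] < t0 + window)
        batch = data.loc[mask].copy()
        batch["batch"] = batch_id
        # Time since T0 lets different batches be compared on a common axis.
        batch["elapsed"] = batch[time_col] - t0
        batches.append(batch)
    return batches


# Illustrative hourly readings; not real sensor data.
demo = pd.DataFrame({
    "Date": pd.date_range("2021-01-01 00:00", periods=12, freq=pd.Timedelta(hours=1)),
    "Value": range(12),
})
batches = slice_batches(demo, ["2021-01-01 01:00", "2021-01-01 07:00"])
```

Each returned frame carries a batch id and an elapsed-time column, which could then be concatenated and pivoted if a wide layout is needed.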
I have a Pandas dataframe representing portfolio weights in multiple dates, such as the following contents in CSV format:
DATE,ASSET1,ASSET2,ASSET3,ASSET4,ASSET5,ASSET6,ASSET7
2010-01-04,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-02-03,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-03-05,0.217195,0.0,0.250000,0.032805,0.25,0.000000,0.250000
2010-04-06,0.139636,0.0,0.250000,0.110364,0.25,0.000000,0.250000
2010-05-05,0.179569,0.0,0.218951,0.101480,0.25,0.000000,0.250000
2010-06-04,0.207270,0.0,0.211974,0.080756,0.25,0.000000,0.250000
2010-07-06,0.132468,0.0,0.250000,0.117532,0.25,0.000000,0.250000
2010-08-04,0.116353,0.0,0.250000,0.133647,0.25,0.000000,0.250000
2010-09-02,0.081677,0.0,0.250000,0.168323,0.25,0.000000,0.250000
2010-10-04,0.000000,0.0,0.250000,0.250000,0.25,0.009955,0.240045
For each row in the Pandas dataframe resulting from this CSV, we can generate a bar chart with the portfolio composition at that day. I would like to have multiple bar charts, with a time slider, such that we can choose one of the dates and see the portfolio composition during that day.
Can this be achieved with Plotly?
I could not find a way to do it straight in the dataframe above, but it is possible to do it by "melting" the dataframe. The following code achieves what I was looking for, together with some beautification of the chart:
import pandas as pd
from io import StringIO
import plotly.express as px
string = """
DATE,ASSET1,ASSET2,ASSET3,ASSET4,ASSET5,ASSET6,ASSET7
2010-01-04,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-02-03,0.250000,0.0,0.250000,0.000000,0.25,0.000000,0.250000
2010-03-05,0.217195,0.0,0.250000,0.032805,0.25,0.000000,0.250000
2010-04-06,0.139636,0.0,0.250000,0.110364,0.25,0.000000,0.250000
2010-05-05,0.179569,0.0,0.218951,0.101480,0.25,0.000000,0.250000
2010-06-04,0.207270,0.0,0.211974,0.080756,0.25,0.000000,0.250000
2010-07-06,0.132468,0.0,0.250000,0.117532,0.25,0.000000,0.250000
2010-08-04,0.116353,0.0,0.250000,0.133647,0.25,0.000000,0.250000
2010-09-02,0.081677,0.0,0.250000,0.168323,0.25,0.000000,0.250000
2010-10-04,0.000000,0.0,0.250000,0.250000,0.25,0.009955,0.240045
"""
df = pd.read_csv(StringIO(string))
df = df.melt(id_vars=['DATE']).sort_values(by = 'DATE')
fig = px.bar(df, x="variable", y="value", animation_frame="DATE")
fig.update_layout(legend_title_text = None)
fig.update_xaxes(title = "Asset")
fig.update_yaxes(title = "Proportion")
fig.update_layout(autosize = True, height = 600)
fig.update_layout(hovermode="x")
fig.update_layout(plot_bgcolor="#F8F8F8")
fig.update_traces(
hovertemplate=
'<i></i> %{y:.2%}'
)
fig.show()
This produces the following:
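For readers unfamiliar with melting, here is a minimal sketch of what df.melt does to a frame shaped like the one above. The two-row wide frame and the column names asset and weight are illustrative choices, not part of the original answer.

```python
import pandas as pd

# Wide format: one row per date, one column per asset.
wide = pd.DataFrame({
    "DATE": ["2010-01-04", "2010-02-03"],
    "ASSET1": [0.25, 0.25],
    "ASSET2": [0.0, 0.0],
})

# Long format: one row per (date, asset) pair, which is the shape
# px.bar needs in order to use DATE as the animation frame.
long_df = wide.melt(id_vars=["DATE"], var_name="asset", value_name="weight")
print(long_df)
```

With these names the chart call would become px.bar(long_df, x="asset", y="weight", animation_frame="DATE"). If the bars rescale oddly between frames, passing range_y=[0, 1] to px.bar is one way to pin the axis.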
How do I utilize plotly.express to plot multiple lines on two yaxis out of one Pandas dataframe?
I find this very useful to plot all columns containing a specific substring:
fig = px.line(df, y=df.filter(regex="Linear").columns, render_mode="webgl")
as I don't want to loop over all my filtered columns and use something like:
fig.add_trace(go.Scattergl(x=df["Time"], y=df["Linear-"]))
in each iteration.
It took me some time to fiddle this out, but I feel this could be useful to some people.
# import some stuff
import plotly.express as px
from plotly.subplots import make_subplots
import pandas as pd
import numpy as np
# create some data
df = pd.DataFrame()
n = 50
df["Time"] = np.arange(n)
df["Linear-"] = np.arange(n)+np.random.rand(n)
df["Linear+"] = np.arange(n)+np.random.rand(n)
df["Log-"] = np.arange(n)+np.random.rand(n)
df["Log+"] = np.arange(n)+np.random.rand(n)
df.set_index("Time", inplace=True)
subfig = make_subplots(specs=[[{"secondary_y": True}]])
# create two independent figures with px.line each containing data from multiple columns
fig = px.line(df, y=df.filter(regex="Linear").columns, render_mode="webgl",)
fig2 = px.line(df, y=df.filter(regex="Log").columns, render_mode="webgl",)
fig2.update_traces(yaxis="y2")
subfig.add_traces(fig.data + fig2.data)
subfig.layout.xaxis.title="Time"
subfig.layout.yaxis.title="Linear Y"
subfig.layout.yaxis2.type="log"
subfig.layout.yaxis2.title="Log Y"
# recoloring is necessary, otherwise lines from fig and fig2 would share each color
# e.g. Linear-, Log- = blue; Linear+, Log+ = red... we don't want this
subfig.for_each_trace(lambda t: t.update(line=dict(color=t.marker.color)))
subfig.show()
The trick with
subfig.for_each_trace(lambda t: t.update(line=dict(color=t.marker.color)))
I got from nicolaskruchten here: https://stackoverflow.com/a/60031260
Thank you derflo and vestland! I really wanted to use Plotly Express as opposed to Graph Objects with dual axis to more easily handle DataFrames with lots of columns. I dropped this into a function. Data1/2 works well as a DataFrame or Series.
import plotly.express as px
from plotly.subplots import make_subplots
import pandas as pd
def plotly_dual_axis(data1, data2, title="", y1="", y2=""):
    # Create subplot with secondary axis
    subplot_fig = make_subplots(specs=[[{"secondary_y": True}]])
    # Put DataFrames in fig1 and fig2
    fig1 = px.line(data1)
    fig2 = px.line(data2)
    # Change the axis for fig2
    fig2.update_traces(yaxis="y2")
    # Add the figs to the subplot figure
    subplot_fig.add_traces(fig1.data + fig2.data)
    # FORMAT subplot figure
    subplot_fig.update_layout(title=title, yaxis=dict(title=y1), yaxis2=dict(title=y2))
    # RECOLOR so as not to have overlapping colors
    subplot_fig.for_each_trace(lambda t: t.update(line=dict(color=t.marker.color)))
    return subplot_fig
I am trying to generally recreate this graph and struggling with adding a column to the hovertemplate of a plotly Scatter. Here is a working example:
import pandas as pd
import chart_studio.plotly as py
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality'] # This is my closest guess, but isn't working
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
y = df['Deaths'],
mode='markers',
hovertext=df['Country'],
hoverlabel=dict(namelength=0),
hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{percent}',
))
fig.show()
I'd like to get the column Case-Fatality to show where %{percent} is.
I've also tried putting in the Scatter() call a line for text = [df['Case-Fatality']], and switching {percent} to {text} as shown in this example, but this doesn't pull from the dataframe as hoped.
I've tried replotting it with px, following this example, but it throws the error "dictionary changed size during iteration". I think using go may be simpler than px, but I'm new to Plotly.
Thanks in advance for any insight for how to add a column to the hover.
As the question asks for a solution with graph_objects, here are two that work-
Method (i)
Add %{text} to the hovertemplate where you want the value to appear, and pass the values to the go.Scatter() call through its text argument. Like this-
percent = df['Case-Fatality']
hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',text = percent
Here is the complete code-
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality'] # column to show in the hover label
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
y = df['Deaths'],
mode='markers',
hovertext=df['Country'],
hoverlabel=dict(namelength=0),
hovertemplate = '%{hovertext}<br>Confirmed: %{x}<br>Fatalities: %{y}<br>%{text}',
text = percent))
fig.show()
Method (ii)
This solution relies on the unified hover label you get when you set hovermode to 'x unified'. All you need to do then is add an invisible trace with the same x values and the desired column as its y values. Passing mode='none' makes it invisible. Here is the complete code-
import pandas as pd
import plotly.graph_objects as go
dfs = pd.read_html('https://coronavirus.jhu.edu/data/mortality', header=0)
df = dfs[0]
percent = df['Case-Fatality'] # column to show in the hover label
fig = go.Figure(data=go.Scatter(x=df['Confirmed'],
y = df['Deaths'],
mode='markers',
hovertext=df['Country'],
hoverlabel=dict(namelength=0)))
fig.add_scatter(x=df.Confirmed, y=percent, mode='none')
fig.update_layout(hovermode='x unified')
fig.show()
The link you shared is broken. Are you looking for something like this?
import pandas as pd
import plotly.express as px
px.scatter(df,
x="Confirmed",
y="Deaths",
hover_name="Country",
hover_data={"Case-Fatality":True})
Then if you need to use bold or change your hover_template you can follow the last step in this answer
Drawing inspiration from another SO question/answer, I find that this is working as desired and permits adding multiple cols to the hover data:
import pandas as pd
import plotly.express as px
fig = px.scatter(df,
x="Confirmed",
y="Deaths",
hover_name="Country",
hover_data=[df['Case-Fatality'], df['Deaths/100K pop.']])
fig.show()