Separate heatmap ranges for each row in Plotly - python

I'm trying to build a time-series heatmap covering the 24 hours of each day of the week, and I want each day's colors scaled to that day's own values only. Here's what I've done in Plotly so far.
The problem is that the "highest" color only appears in the 2nd row. My desired output, made in Excel, is this one:
Each row clearly shows its own green color since each of them has separate conditional formatting.
My code:
import plotly.express as px
import pandas as pd
df = pd.read_csv('test0.csv', header=None)
fig = px.imshow(df, color_continuous_scale=['red', 'green'])
fig.update_coloraxes(showscale=False)
fig.show()
The csv file:
0,0,1,2,0,5,2,3,3,5,8,4,7,9,9,0,4,5,2,0,7,6,5,7
1,3,4,9,4,3,3,2,12,15,6,9,1,4,3,1,1,2,5,3,4,2,5,8
9,6,7,1,3,4,5,6,9,8,7,8,6,6,5,4,5,3,3,6,4,8,9,10
8,7,8,6,7,5,4,6,6,7,8,5,5,6,5,7,5,6,7,5,8,6,4,4
3,4,2,1,1,2,2,1,2,1,1,1,1,3,4,4,2,2,1,1,1,2,4,3
3,5,4,4,4,6,5,5,5,4,3,7,7,8,7,6,7,6,6,3,4,3,3,3
5,4,4,5,4,3,1,1,1,1,2,2,3,2,1,1,4,3,4,5,4,4,3,4

I've solved it! I had to make the heatmaps by row and combine them.
from plotly.subplots import make_subplots
import plotly.graph_objects as go
import pandas as pd
import calendar
df = pd.read_csv('test0.csv', header=None)
# initialize subplots with vertical_spacing as 0 so the rows are right next to each other
fig = make_subplots(rows=7, cols=1, vertical_spacing=0)
# shift sunday to first position
days = list(calendar.day_name)
days = days[-1:] + days[:-1]
for index, row in df.iterrows():
    row_list = row.tolist()
    sub_fig = go.Heatmap(
        x=list(range(0, 24)),  # hours
        y=[days[index]],  # days of the week
        z=[row_list],  # data
        colorscale=[
            [0, '#FF0000'],
            [1, '#00FF00']
        ],
        showscale=False
    )
    # insert heatmap to subplot
    fig.append_trace(sub_fig, index + 1, 1)
fig.show()
Output:
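An alternative that may be worth trying is to scale each row to the 0-1 range before plotting, so that a single px.imshow call colors every row relative to its own minimum and maximum. This is only a sketch: normalize_rows and build_heatmap are hypothetical helper names, not part of Plotly, and the sample frame uses just the first four hours of two rows from the CSV above.

```python
import pandas as pd


def normalize_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Scale every row to [0, 1] using that row's own min and max."""
    row_min = df.min(axis=1)
    row_range = df.max(axis=1) - row_min
    # A constant row has range 0; use 1 instead so the row maps to 0, not NaN.
    row_range = row_range.replace(0, 1)
    return df.sub(row_min, axis=0).div(row_range, axis=0)


def build_heatmap(df: pd.DataFrame):
    """Hypothetical helper: one heatmap whose rows are colored independently."""
    # Imported here so normalize_rows can be used without plotly installed.
    import plotly.express as px

    fig = px.imshow(normalize_rows(df),
                    color_continuous_scale=['red', 'green'],
                    aspect='auto')
    fig.update_coloraxes(showscale=False)
    return fig


# First four hours of two rows from the question's CSV
sample = pd.DataFrame([[0, 0, 1, 2], [9, 6, 7, 1]])
scaled = normalize_rows(sample)
```

Calling build_heatmap(df).show() on the full data frame should then produce one figure. The trade-off is that hover text shows the scaled values rather than the raw counts, which the one-heatmap-per-row approach preserves.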

Related

Plotly graph : show number of occurrences in bar-chart

I am trying to plot a bar chart from a given dataframe.
x-axis = dates
y-axis = number of occurrences for each month
The result should be a bar chart. Each x is an occurrence.
x
xx
x
2020-1
2020-2
2020-3
2020-4
2020-5
I tried but don't get the desired result as above.
import datetime as dt
import pandas as pd
import numpy as np
import plotly.offline as pyo
import plotly.graph_objs as go
# initialize list of lists
data = [['a', '2022-01-05'], ['a', '2022-02-14'], ['a', '2022-02-15'],['a', '2022-05-14']]
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Date'])
# print dataframe.
df['Date']=pd.to_datetime(df['Date'])
# plot dataframe
trace1=go.Bar(
    #
    x = df.Date.dt.month,
    y = df.Name.groupby(df.Date.dt.month).count()
)
data=[trace1]
fig=go.Figure(data=data)
pyo.plot(fig)
Remove the last line and write instead:
fig.show()
Edit:
It's unclear to me whether you have 1-dimensional or 2-dimensional data here. Supposing you have 1d data, that is, just a bunch of dates that you want to aggregate in a bar chart, simply do this:
# initialize list of lists
data = ['2022-01-05', '2022-02-14', '2022-02-15', '2022-05-14']
# Create the pandas DataFrame
df = pd.DataFrame(data)
# plot dataframe
fig = px.bar(df)
If, instead, you have 2d data then what you want is a scatter plot, not a bar chart.
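If the goal is one bar per month whose height is the number of dates falling in that month, the counting can be done in pandas before plotting. The following is a minimal sketch under that assumption; the column names month and occurrences and the helper build_bar are made up for illustration.

```python
import pandas as pd

dates = pd.to_datetime(
    pd.Series(['2022-01-05', '2022-02-14', '2022-02-15', '2022-05-14'])
)

# Count occurrences per calendar month.
counts = dates.dt.to_period('M').value_counts().sort_index()

# Reindex over the full month range so empty months show up with a count of 0.
all_months = pd.period_range(counts.index.min(), counts.index.max(), freq='M')
counts = counts.reindex(all_months, fill_value=0)

monthly = pd.DataFrame({
    'month': counts.index.astype(str),
    'occurrences': counts.values,
})


def build_bar(monthly: pd.DataFrame):
    """Hypothetical helper: one bar per month."""
    # Imported here so the counting above works without plotly installed.
    import plotly.express as px

    fig = px.bar(monthly, x='month', y='occurrences')
    # Treat the month labels as categories rather than as a date axis.
    fig.update_xaxes(type='category')
    return fig
```

Note that the question's own trace passes four x values (one per row) but only three y values (one per month with data), which is one reason the bars do not line up.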

Plot horizontal lines between date ranges iterating through pandas dataframe

I essentially have two different data frames, one for calculating weekly data (df) and a second one (df2) that has the plot values of the stock/crypto. On df, I have created a pandas column 'pivot' ((high+low+close)/3) using the weekly data to create a set of values containing the weekly pivot values.
Now I want to plot these weekly values (as lines) onto df2, which has the daily data. So x1 would be the start of the week and x2 the end of the week, with the y value being the pivot value from df (weekly).
Here is what I would want it to look like:
My Approach & Problem:
First of all, I am a beginner in Python, this is my second month of learning. My apologies if this was asked before.
I know the pivot values could also be calculated from a single data frame with a pandas group-by, but my question starts after that step, so either approach is fine. What I want in the end is those lines drawn over the OHLC candlesticks, using Plotly OHLC and go shapes. Where I am stuck is iterating through the weekly pivot data frame and adding the lines as traces on top of the daily OHLC data.
Here's my code so far:
import yfinance as yf
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedelta
df = yf.download(tickers = 'BTC-USD',
                 start = '2021-08-30',
                 end = datetime.today().strftime('%Y-%m-%d'),
                 interval = '1wk',
                 group_by = 'ticker',
                 auto_adjust = True).reset_index()
#daily df for plot
df2 = yf.download(tickers = 'BTC-USD',
                  start = '2021-08-30',
                  end = datetime.today().strftime('%Y-%m-%d'),
                  interval = '1d',
                  group_by = 'ticker',
                  auto_adjust = True).reset_index()
#small cap everything
df = df.rename(columns={'Date':'date',
                        'Open': 'open',
                        'High': 'high',
                        'Low' : 'low',
                        'Close' : 'close'})
df['pivot'] = (df['high']+ df['low'] + df['close'])/3
result = df.copy()
fig = go.Figure(data = [go.Candlestick(x= df['date'],
                                       open = df['open'],
                                       high = df['high'],
                                       low = df['low'],
                                       close = df['close'],
                                       name = 'Price Candle')])
This covers plotting the OHLC candlesticks; the remaining iteration is what is troubling me. It can be plotted on either a line chart or an OHLC chart.
fig = px.line(df, x='time', y='close')
result = df.copy()
for i, pivot in result.iterrows():
    fig.add_shape(type="line",
                  x0=pivot.date, y0=pivot, x1=pivot.date, y1=pivot,
                  line=dict(
                      color="green",
                      width=3)
                  )
fig
When I run this, no pivot lines appear the way I want them to. Only the original price line graph shows.
Thanks in advance for taking the time to read this so far.
There are two ways to create a line segment: add a shape or use line mode on a scatter plot. I think the line mode of scatter plots is more advantageous because it allows for more detailed settings. Loop over the data frame row by row and use the row's idx to look up the next row's date, which becomes the end of the segment; the y-axis values are the pivot values. I wanted more horizontal room for the plot, so I moved the legend position up. Also, since each loop iteration adds its own scatter trace, there would be a legend entry for every pivot line, so the legend display is set to True for the first trace only.
import yfinance as yf
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from datetime import datetime, timedelta
df = yf.download(tickers = 'BTC-USD',
                 start = '2021-08-30',
                 end = datetime.today().strftime('%Y-%m-%d'),
                 interval = '1wk',
                 group_by = 'ticker',
                 auto_adjust = True).reset_index()
#daily df for plot
df2 = yf.download(tickers = 'BTC-USD',
                  start = '2021-08-30',
                  end = datetime.today().strftime('%Y-%m-%d'),
                  interval = '1d',
                  group_by = 'ticker',
                  auto_adjust = True).reset_index()
#small cap everything
df = df.rename(columns={'Date':'date',
                        'Open': 'open',
                        'High': 'high',
                        'Low' : 'low',
                        'Close' : 'close'})
df['pivot'] = (df['high']+ df['low'] + df['close'])/3
fig = go.Figure()
fig.add_trace(go.Candlestick(x= df['date'],
                             open = df['open'],
                             high = df['high'],
                             low = df['low'],
                             close = df['close'],
                             name = 'Price Candle',
                             legendgroup='one'
                             )
              )
#fig.add_trace(go.Scatter(mode='lines', x=df['date'], y=df['pivot'], line=dict(color='green'), name='pivot'))
for idx, row in df.iterrows():
    #print(idx)
    if idx == len(df)-2:
        break
    fig.add_trace(go.Scatter(mode='lines',
                             x=[row['date'], df.loc[idx+1,'date']],
                             y=[row['pivot'], row['pivot']],
                             line=dict(color='blue', width=1),
                             name='pivot',
                             showlegend=True if idx == 0 else False,
                             )
                  )
fig.update_layout(
    autosize=False,
    height=600,
    width=1100,
    legend=dict(
        orientation="h",
        yanchor="bottom",
        y=1.02,
        xanchor="right",
        x=1)
)
fig.update_xaxes(rangeslider_visible=False)
fig.show()
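For the shape-based route, the question's loop draws nothing useful because x0 and x1 are the same date (a zero-length line) and y0/y1 receive the whole row instead of the pivot value. A rough sketch of one way to prepare the segments first is shown below; pivot_segments and add_pivot_lines are hypothetical helpers, the seven-day week length is an assumption, and the prices are made-up numbers used only to show the shape of the result.

```python
import pandas as pd


def pivot_segments(weekly: pd.DataFrame) -> pd.DataFrame:
    """One row per week: a horizontal line from the week's start to 7 days later."""
    return pd.DataFrame({
        'x0': weekly['date'],
        'x1': weekly['date'] + pd.Timedelta(days=7),  # assumes 7-day weekly bars
        'y': (weekly['high'] + weekly['low'] + weekly['close']) / 3,
    })


def add_pivot_lines(fig, segments: pd.DataFrame):
    """Add one line shape per segment to an existing plotly figure."""
    for seg in segments.itertuples(index=False):
        fig.add_shape(type='line',
                      x0=seg.x0, x1=seg.x1,   # distinct start and end dates
                      y0=seg.y, y1=seg.y,     # scalar pivot value, not the row
                      line=dict(color='green', width=2))
    return fig


# Made-up weekly prices, for illustration only
weekly = pd.DataFrame({
    'date': pd.to_datetime(['2021-08-30', '2021-09-06']),
    'high': [52000.0, 53000.0],
    'low': [46000.0, 43000.0],
    'close': [49000.0, 45000.0],
})
segments = pivot_segments(weekly)
```

With the figure from the answer above, add_pivot_lines(fig, pivot_segments(df)) would add the lines as layout shapes. Shapes do not appear in the legend, which is one reason the scatter-trace approach may be preferable.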

How do I display a grouped graph using a CSV file

import pandas as pd
import plotly
import plotly.express as px
import plotly.io as pio
df = pd.read_csv("final_spreadsheet.csv")
barchart = px.bar(
    data_frame = df,
    x = "Post-Lockdown Period (May - September)",
    y = "Post-Lockdown Period (May - September)",
    color = "Peak-Lockdown Period (March-May)",
    opacity = 0.9,
    orientation ="v",
    barmode = 'relative',
)
pio.show(barchart)
I want the x axis to be the different behavioral variables, and for each behavioral variable I want there to be two bars, one for peak pandemic and one for post pandemic. I also want the y axis to just be the values of each.
This is my current attempt but no graphs appear. Attached is also a picture of the CSV file in excel form
In plotly.express you can create a grouped bar chart by passing a list of the two variables you want to group together in the argument y. In your case, you'll want to pass the argument y = ['Peak-Lockdown Period (March-May)','Post-Lockdown Period (May-September)'] as well as the argument barmode = 'group' to px.bar. I created a sample DataFrame to illustrate:
import pandas as pd
import plotly.express as px
import plotly.io as pio
# df = pd.read_csv("final_spreadsheet.csv")
## create example DataFrame similar to yours
df = pd.DataFrame({
    'Behavioral': list('ABCD'),
    'Peak-Lockdown Period (March-May)': [76.7,26.12,0,2.94],
    'Post-Lockdown Period (May-September)': [77.32,26.38,0,3.36]
})
barchart = px.bar(
    data_frame = df,
    x = 'Behavioral',
    y = ['Peak-Lockdown Period (March-May)','Post-Lockdown Period (May-September)'],
    # color = "Peak-Lockdown Period (March-May)",
    opacity = 0.9,
    orientation ="v",
    barmode = 'group',
)
pio.show(barchart)
EDIT: you can accomplish the same thing using plotly.graph_objects:
import plotly.graph_objects as go
fig = go.Figure(data=[
    go.Bar(name='Peak-Lockdown Period (March-May)', x=df['Behavioral'].values, y=df['Peak-Lockdown Period (March-May)'].values),
    go.Bar(name='Post-Lockdown Period (May-September)', x=df['Behavioral'].values, y=df['Post-Lockdown Period (May-September)'].values),
])
# place the two bars for each behavioral variable side by side
fig.update_layout(barmode='group')
fig.show()
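A third option, sketched here with the same sample data, is to reshape the frame to long form with melt and let the color argument split the bars. The column names Period and Value and the helper build_grouped_bar are made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Behavioral': list('ABCD'),
    'Peak-Lockdown Period (March-May)': [76.7, 26.12, 0, 2.94],
    'Post-Lockdown Period (May-September)': [77.32, 26.38, 0, 3.36],
})

# Long form: one row per (behavioral variable, period) pair.
long_df = df.melt(id_vars='Behavioral', var_name='Period', value_name='Value')


def build_grouped_bar(long_df: pd.DataFrame):
    """Hypothetical helper: two side-by-side bars per behavioral variable."""
    # Imported here so the reshaping above works without plotly installed.
    import plotly.express as px

    return px.bar(long_df, x='Behavioral', y='Value',
                  color='Period', barmode='group')
```

Whichever variant is used, the column names must match the CSV header exactly. The question's code refers to "Post-Lockdown Period (May - September)" with spaces around the hyphen, while the sample frame uses "(May-September)", so the names would need adjusting when reading the real file.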

Plotly Distplot subplots

I am trying to write a for loop that builds distplot subplots.
I have a dataframe with many columns of different lengths (not counting the NaN values).
fig = make_subplots(
    rows=len(assets), cols=1,
    y_title = 'Hourly Price Distribution')
i=1
for col in df_all.columns:
    fig = ff.create_distplot([[df_all[[col]].dropna()]], col)
    fig.append()
    i+=1
fig.show()
I am trying to run a for loop for subplots for distplots and get the following error:
PlotlyError: Oops! Your data lists or ndarrays should be the same length.
UPDATE:
This is an example below:
df = pd.DataFrame({'2012': np.random.randn(20),
                   '2013': np.random.randn(20)+1})
df['2012'].iloc[0] = np.nan
fig = ff.create_distplot([df[c].dropna() for c in df.columns],
                         df.columns,show_hist=False,show_rug=False)
fig.show()
I would like to plot each distribution in a different subplot.
Thank you.
Update: Distribution plots
Calculating the correct values is probably both quicker and more elegant using numpy. But I often build parts of my graphs using one plotly approach (figure factory, plotly express) and then use them with other elements of the plotly library (plotly.graph_objects) to get what I want. The complete snippet below shows you how to do just that in order to build a go-based subplot with elements from ff.create_distplot. I'd be happy to give further explanations if the following suggestion suits your needs.
Plot
Complete code
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.figure_factory as ff
from plotly.subplots import make_subplots
import plotly.graph_objects as go
df = pd.DataFrame({'2012': np.random.randn(20),
                   '2013': np.random.randn(20)+1})
df.loc[0, '2012'] = np.nan  # .loc avoids chained assignment
df = df.reset_index()
dfm = pd.melt(df, id_vars=['index'], value_vars=df.columns[1:])
dfm = dfm.dropna()
dfm.rename(columns={'variable':'year'}, inplace = True)
cols = dfm.year.unique()
nrows = len(cols)
fig = make_subplots(rows=nrows, cols=1)
for r, col in enumerate(cols, 1):
    dfs = dfm[dfm['year']==col]
    # build a throwaway distplot and reuse only its KDE curve (second trace)
    fx1 = ff.create_distplot([dfs['value'].values], ['distplot'],curve_type='kde')
    fig.add_trace(go.Scatter(
        x= fx1.data[1]['x'],
        y =fx1.data[1]['y'],
        ), row = r, col = 1)
fig.show()
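As a rough illustration of the numpy route mentioned above, the per-column values can be computed directly and then placed in subplots. Note that this sketch uses a normalized histogram (np.histogram with density=True) rather than the KDE curve that ff.create_distplot produces; column_densities and build_subplots are hypothetical helper names, and the bin count of 10 is an arbitrary choice.

```python
import numpy as np
import pandas as pd


def column_densities(df: pd.DataFrame, bins: int = 10) -> dict:
    """Return {column: (bin_centers, density)} from each column's non-NaN values."""
    out = {}
    for col in df.columns:
        values = df[col].dropna().to_numpy()
        density, edges = np.histogram(values, bins=bins, density=True)
        centers = (edges[:-1] + edges[1:]) / 2
        out[col] = (centers, density)
    return out


def build_subplots(densities: dict):
    """Hypothetical helper: one subplot row per column."""
    # Imported here so column_densities works without plotly installed.
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    fig = make_subplots(rows=len(densities), cols=1,
                        subplot_titles=list(densities))
    for r, (name, (centers, density)) in enumerate(densities.items(), 1):
        fig.add_trace(go.Bar(x=centers, y=density, name=name), row=r, col=1)
    return fig


rng = np.random.default_rng(0)
df = pd.DataFrame({'2012': rng.standard_normal(20),
                   '2013': rng.standard_normal(20) + 1})
df.loc[0, '2012'] = np.nan
densities = column_densities(df)
```

Because each column is handled on its own after dropna(), columns of different lengths do not trigger the "should be the same length" error from the question.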
First suggestion
You should:
1. Restructure your data with pd.melt(df, id_vars=['index'], value_vars=df.columns[1:]),
2. and then use the resulting column 'variable' to build subplots for each year through the facet_row argument to get this:
In the complete snippet below you'll see that I've changed 'variable' to 'year' in order to make the plot more intuitive. There's one particularly convenient side-effect with this approach, namely that running dfm.dropna() will remove the na value for 2012 only. If you were to do the same thing on your original dataframe, the corresponding value in the same row for 2013 would also be removed.
import numpy as np
import pandas as pd
import plotly.express as px
df = pd.DataFrame({'2012': np.random.randn(20),
                   '2013': np.random.randn(20)+1})
df.loc[0, '2012'] = np.nan  # .loc avoids chained assignment
df = df.reset_index()
dfm = pd.melt(df, id_vars=['index'], value_vars=df.columns[1:])
dfm = dfm.dropna()
dfm.rename(columns={'variable':'year'}, inplace = True)
fig = px.histogram(dfm, x="value",
                   facet_row = 'year')
fig.show()

Sorting and conditional color formatting in matplotlib

To skip the context and get straight to the question, go down to "desired changes".
I wrote the helper function below to:
1. Fetch data
2. Calculate the YTD return
3. Plot the results in a bar plot
Here is the function:
def ytd_perf(symb, col_names, source = 'yahoo'):
    import datetime as datetime
    from datetime import date
    import pandas as pd
    import pandas_datareader.data as web
    import matplotlib.pyplot as plt
    import seaborn as sns
    %pylab inline
    #establish start and end dates
    start = date(date.today().year, 1, 1)
    end = datetime.date.today()
    #fetch data
    df = web.DataReader(symb, source, start = start, end = end)['Adj Close']
    #make sure column orders don't change (reindex replaces the removed reindex_axis)
    df = df.reindex(columns = symb)
    #rename the columns
    df.columns = col_names
    #calc returns from the first element (.iloc replaces the removed .ix)
    df = (df / df.iloc[0]) - 1
    #Plot the most recent line of data -- this represents the YTD return
    ax = df.iloc[-1].plot(kind = 'bar', title = ('YTD Performance as of '+ str(end)),figsize=(12,9))
    vals = ax.get_yticks()
    ax.set_yticklabels(['{:3.1f}%'.format(x*100) for x in vals])
So, when I run:
tickers = ['SPY', 'TLT']
names = ['Stocks', 'Bonds']
ytd_perf(tickers, names)
I get the following output:
2 desired changes that I can't quite get to work:
1. I would like to change the color of the bar such that if the value < 0, it is red.
2. Sort the bars from highest to lowest (which is the case in this chart because there are only two series, but doesn't work with many series).
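One possible way to get both changes, sketched under the assumption that the YTD returns are available as a pandas Series (as df.iloc[-1] would be inside the function), is to sort the Series first and then build a color list from the sorted values. The helper names, the 'tab:blue' color for non-negative bars, and the sample returns are all made up for illustration.

```python
import pandas as pd


def sorted_with_colors(returns: pd.Series, neg: str = 'red', pos: str = 'tab:blue'):
    """Sort returns from highest to lowest and pick one color per bar."""
    ordered = returns.sort_values(ascending=False)
    colors = [neg if value < 0 else pos for value in ordered]
    return ordered, colors


def plot_ytd(returns: pd.Series, title: str = 'YTD Performance'):
    """Hypothetical helper: sorted bar chart with negative bars in red."""
    # Imported here so sorted_with_colors works without matplotlib installed.
    import matplotlib.pyplot as plt
    from matplotlib.ticker import PercentFormatter

    ordered, colors = sorted_with_colors(returns)
    fig, ax = plt.subplots(figsize=(12, 9))
    ax.bar(ordered.index, ordered.values, color=colors)
    ax.set_title(title)
    # Returns are fractions, so 1.0 corresponds to 100%.
    ax.yaxis.set_major_formatter(PercentFormatter(xmax=1.0, decimals=1))
    return ax


# Made-up returns, for illustration only
returns = pd.Series({'Stocks': 0.05, 'Bonds': -0.02, 'Gold': 0.08})
ordered, colors = sorted_with_colors(returns)
```

Building the color list after sorting matters: colors computed on the unsorted Series would no longer line up with the bars.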
