With Plotly Express I've built a bar chart similar to the one shown on their website.
Since px.bar would not let me use a datetime64[ns] column as the animation frame, I converted the datetime into a string as follows.
eu_vaccine_df['date_str'] = eu_vaccine_df['date'].apply(lambda x: str(x))
eu_vaccine_df[['date_str', 'date', 'country', 'people_vaccinated_per_hundred']].head()
The dataset I then pass to px.bar looks as follows and contains 30 different countries.
The code for my bar chart, including the animation, looks as follows.
fig = px.bar(
eu_vaccine_df,
x='country', y='people_vaccinated_per_hundred',
color='country',
animation_frame='date_str',
animation_group='country',
hover_name='country',
range_y=[0,50],
range_x=[0,30]
)
fig.update_layout(
template='plotly_dark',
margin=dict(r=10, t=25, b=40, l=60)
)
fig.show()
In the end result, the dates on the animation slider are in the wrong order. It first shows all results from 2021 and then all results from 2020, as shown at the bottom of the following screenshot.
Sorting my df by the date, before converting the date to a string, solved the issue.
covid_df['date'] = pd.to_datetime(covid_df['date'])
covid_df = covid_df.sort_values('date', ascending=True)
covid_df['date'] = covid_df['date'].dt.strftime('%m-%d-%Y')
I have a df that looks like this:
image of the dataframe
My goal is to make a line chart that counts the codes for each month and then add a dropdown to filter by 'type', 'group' and 'Spec.'.
If I didn't want the dropdown filter, I could achieve this with
`df.groupby('month')['code'].count().reset_index()`
Since I need the filters, ideally the counting would happen in the Plotly graph code, so I don't lose the 'type', 'group' and 'Spec.' columns.
I tried this code:
`line_fig1 = px.line(data_frame = df,
x= 'month',
y='code',
labels={'month':'','code':''},
title='',
width=450,
height=250,
template='plotly_white',
color_discrete_sequence= ["rgb(1, 27, 105)"],
markers=True,
text='code'
)`
and this was the result:
image of the chart
I also tried something like
`line_fig1 = px.line(data_frame = df,
x= 'month',
y='code'.count()`
or even tried adding a column containing the number one, so the chart could aggregate it
`df['assign_value'] = 1
line_fig1 = px.line(data_frame = df,
x= 'month',
y='assign_value'`
But this doesn't work either.
Any help here?
I think you should group by month and code and then use the new dataframe to make the line graph. Something like the following:
df2 = df.groupby(['month', 'code'])['code'].count().reset_index(name='counts')
fig = px.line(df2,x='month',y='counts', color='code')
fig.show()
Hey, I was trying to make a line animation for the stocks data built into plotly.
So I tried the code below, following
https://plotly.com/python/animations/
but nothing shows.
import plotly.express as px
df = px.data.stocks()
fig = px.line(df, x = 'date', y=df.columns[1:6], animation_frame = 'date')
fig.show()
What I intended to do was to make a line animation of the 6 companies' stock prices
with respect to the date. I'm totally new to plotly, so this may be a dumb question, but I'd be grateful if you guys could help. Thank you!
You need some consistency across the animation frames for the x-axis and y-axis.
To achieve this, I changed the x-axis to the day of the month and made sure the y-axis range fits all frames.
I then used month/year as the animation frame, so each frame holds a collection of values (a line only makes sense if there is more than one value to plot).
import plotly.express as px
import pandas as pd
df = px.data.stocks()
df["date"] = pd.to_datetime(df["date"])
fig = px.line(df, x = df["date"].dt.day, y=df.columns[1:6], animation_frame=df["date"].dt.strftime("%b-%Y"))
fig.update_layout(yaxis={"range":[0,df.iloc[:,1:6].max().max()]})
I am trying to plot a grouped bar chart in Plotly that shows 3 years of data, but the chart does not display the data correctly: all the bars are equal. Something like this:
Here is my code for plotting:
data=[go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2015'),
go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2016'),
go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2017')
]
layout=go.Layout(title="NASDAQ Market Capitalization IPO yEAR (million USD)",barmode='group')
fig=go.Figure(data=data,layout=layout)
fig.show(renderer="colab")
Here is my code which I am using to extract the data for 3 years:
recent_ipos = nasdaq[nasdaq['IPO Year'] > 2014]
recent_ipos['IPO Year'] = recent_ipos['IPO Year'].astype(int)
I tried to extract the 2015 data here using an array, but I couldn't find an appropriate way to extract an element from the array.
ipo2015=np.array(recent_ipos['IPO Year'])
ipo2015
I am not sure whether this is the right way to extract the data for a particular year.
Things I want to know here are:
How do I extract the data for each year appropriately for the graph using Plotly?
What changes should I make to fix this inconsistency in the graph?
What should I put in y= for each of the three bar groups?
How can I extract the years dynamically rather than manually?
Hope to receive help from this amazing community on StackOverflow.
Thanks in advance.!!
I wrote the code under the assumption that the data the question is based on is in data frame format. The sample data is taken from plotly. query() can also reference a Python variable by prefixing it with @, as shown in the commented-out line in the code.
import plotly.graph_objects as go
import plotly.express as px
# yyyy = [1992,1997,2002]
df = px.data.gapminder()
continent = df['continent'].unique().tolist()
yyyy = df['year'].unique().tolist()[-3:] # update
data = []
for y in yyyy:
    # tmp_df = df.query('year == @y')
    tmp_df = df[df['year'] == y].groupby('continent')['pop'].sum()
    data.append(go.Bar(x=tmp_df.index, y=tmp_df, name=str(y)))
# Change the bar mode
fig = go.Figure(data)
fig.update_layout(barmode='group')
fig.show()
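To illustrate the two filtering styles mentioned above, here is a small sketch on made-up data (the column names and values are assumptions, not the NASDAQ data from the question). It shows that query() with an @-prefixed variable selects the same rows as a boolean mask, and how the last three years can be picked dynamically.

```python
import pandas as pd

# Made-up sample data
df = pd.DataFrame({
    "year": [2015, 2015, 2016, 2017, 2017],
    "sector": ["Tech", "Health", "Tech", "Tech", "Health"],
    "value": [10, 20, 30, 40, 50],
})

# Pick the most recent three years without hard-coding them
last_years = sorted(df["year"].unique())[-3:]

y = 2017
# '@y' refers to the Python variable y inside the query string
by_query = df.query("year == @y")
# Equivalent boolean-mask selection
by_mask = df[df["year"] == y]

print(last_years)
print(by_query.equals(by_mask))
```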
I'm using plotly to create a stacked bar chart, with each bar representing a quarter end date. The data is pulled into a dataframe via SQL and the dates are parsed in the read_sql statement.
When graphing, the dates on the x-axis are displayed as 10/01 instead of 9/30, 4/1 instead of 3/31, etc.
Any idea how I can just display the dates correctly?
Here's a sample
import plotly.express as px
fig = px.bar(df.groupby('dt_quarter').head(10), x='dt_quarter', y="amount", color="name", title="Stack Bar Test")
fig.update_layout(yaxis_title_text = 'Amount ($)',xaxis_title_text='Date', legend_title_text='Sector', legend_traceorder='reversed')
fig.show()
What I ended up doing was creating a new column in my dataframe that holds the quarter as a string. With the conversion below, the string comes out in 'YYYYQX' format (e.g. 2020Q4). I then used that as my x axis and it works fine.
For creating the new column:
alldata['quarter'] = pd.PeriodIndex(alldata.dt_quarter, freq='Q').astype('str')
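A small sketch of the conversion on made-up quarter-end dates. The plain string form of a quarterly period puts the year first (for example 2020Q4); the second column builds a quarter-first label by hand, in case that layout is preferred. The column name quarter_first is made up for this example.

```python
import pandas as pd

# Made-up quarter-end dates standing in for the dt_quarter column
alldata = pd.DataFrame({
    "dt_quarter": pd.to_datetime(["2020-09-30", "2020-12-31", "2021-03-31"]),
})

periods = pd.PeriodIndex(alldata["dt_quarter"], freq="Q")

# Default string form of a quarterly period
alldata["quarter"] = periods.astype(str)

# Hand-built alternative label with the quarter first
alldata["quarter_first"] = [f"Q{p.quarter}{p.year}" for p in periods]

print(alldata)
```

Because the labels are plain strings, plotly treats the x axis as categorical and places one bar per label rather than positioning it on a date scale.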
I have a plotly graph of the EUR/JPY exchange rate across a few months in 15-minute intervals, so there is no data from Friday evenings to Sunday evenings.
Here is a portion of the data, note the skip in the index (type: DatetimeIndex) over the weekend:
Plotting this data in plotly results in a gap over the missing dates. Using the dataframe above:
import plotly.graph_objs as go
candlesticks = go.Candlestick(x=data.index, open=data['Open'], high=data['High'],
low=data['Low'], close=data['Close'])
fig = go.Figure(layout=cf_layout)
fig.add_trace(trace=candlesticks)
fig.show()
Output:
As you can see, there are gaps where the missing dates are. One solution I've found online is to change the index to text using:
data.index = data.index.strftime("%d-%m-%Y %H:%M:%S")
and plotting it again, which admittedly does work, but has its own problem. The x-axis labels look atrocious:
I would like to produce a graph like the second plot, with no gaps, but with the x-axis displayed as it is on the first graph, or at least in a much more concise and responsive format, as close to the first graph as possible.
Thank you in advance for any help!
Even if some dates are missing in your dataset, plotly interprets your dates as date values, and shows even missing dates on your timeline. One solution is to grab the first and last dates, build a complete timeline, find out which dates are missing in your original dataset, and include those dates in:
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
This will turn this figure:
Into this:
Complete code:
import plotly.graph_objects as go
from datetime import datetime
import pandas as pd
import numpy as np
# sample data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
# remove some dates to build a similar case as in the question
df = df.drop(df.index[75:110])
df = df.drop(df.index[210:250])
df = df.drop(df.index[460:480])
# build complete timeline from start date to end date
dt_all = pd.date_range(start=df['Date'].iloc[0],end=df['Date'].iloc[-1])
# retrieve the dates that ARE in the original dataset
dt_obs = [d.strftime("%Y-%m-%d") for d in pd.to_datetime(df['Date'])]
# define dates with missing values
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d").tolist() if d not in dt_obs]
# make figure
fig = go.Figure(data=[go.Candlestick(x=df['Date'],
open=df['AAPL.Open'], high=df['AAPL.High'],
low=df['AAPL.Low'], close=df['AAPL.Close'])
])
# hide dates with no values
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
fig.update_layout(yaxis_title='AAPL Stock')
fig.show()
In case someone wants to remove the gaps for weekends and for hours outside trading hours,
rangebreaks is the way to do it, as shown below.
fig = go.Figure(data=[go.Candlestick(x=df['date'], open=df['Open'], high=df['High'], low=df['Low'], close=df['Close'])])
fig.update_xaxes(
rangeslider_visible=True,
rangebreaks=[
# NOTE: Below values are bound (not single values), ie. hide x to y
dict(bounds=["sat", "mon"]), # hide weekends, eg. hide sat to before mon
dict(bounds=[16, 9.5], pattern="hour"), # hide hours outside of 9.30am-4pm
# dict(values=["2020-12-25", "2021-01-01"]) # hide holidays (Christmas and New Year's, etc)
]
)
fig.update_layout(
title='Stock Analysis',
yaxis_title=f'{symbol} Stock'
)
fig.show()
here's Plotly's doc.
Thanks for the amazing sample! It works on daily data, but with intraday (5-minute) data the rangebreaks leave only one day on the chart.
# build complete timeline
dt_all = pd.date_range(start=df.index[0],end=df.index[-1], freq="5min")
# retrieve the dates that ARE in the original dataset
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in pd.to_datetime(df.index, format="%Y-%m-%d %H:%M:%S")]
# define dates with missing values
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]
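To fix the problem with intraday data, set the dvalue parameter of the rangebreak to the size of one interval in milliseconds. Without it, each entry in values hides a full day (the default is 86400000 ms).
For example, 1 hour = 3.6e6 ms, so hourly data needs dvalue=3.6e6; for 5-minute data it would be 5 * 60 * 1000 = 3e5 ms.
Documentation here: https://plotly.com/python/reference/layout/xaxis/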
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks, dvalue=3.6e6)])
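A sketch of deriving dvalue from the bar interval rather than hard-coding it, on made-up 5-minute timestamps (the names bar, dvalue_ms and rangebreak are made up for this example):

```python
import pandas as pd

# Made-up 5-minute bars with one bar (09:40) missing
index = pd.to_datetime([
    "2021-01-04 09:30", "2021-01-04 09:35", "2021-01-04 09:45",
])
bar = pd.Timedelta(minutes=5)

# Complete 5-minute timeline between the first and last bar
dt_all = pd.date_range(start=index[0], end=index[-1], freq=bar)

# Timestamps on the timeline that have no bar
dt_breaks = dt_all.difference(index).strftime("%Y-%m-%d %H:%M:%S").tolist()

# dvalue is expressed in milliseconds: one bar interval per hidden value
dvalue_ms = bar / pd.Timedelta(milliseconds=1)

rangebreak = dict(values=dt_breaks, dvalue=dvalue_ms)
print(rangebreak)
```

The resulting dict can then be passed as fig.update_xaxes(rangebreaks=[rangebreak]).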