Plotly x-axis dates are displayed as the date plus one day - python

I'm using plotly to create a stacked bar chart, with each bar representing a quarter end date. The data is pulled into a dataframe via SQL and the dates are parsed in the read_sql statement.
When graphing, the dates on the x-axis are displayed as 10/01 instead of 9/30, 4/1 instead of 3/31, etc.
Any idea how I can just display the dates correctly?
Here's a sample
import plotly.express as px
fig = px.bar(df.groupby('dt_quarter').head(10), x='dt_quarter', y="amount", color="name", title="Stack Bar Test")
fig.update_layout(yaxis_title_text = 'Amount ($)',xaxis_title_text='Date', legend_title_text='Sector', legend_traceorder='reversed')
fig.show()

What I ended up doing was creating a new column in my dataframe that displays the date as a quarter label in 'YYYYQX' format (e.g. 2020Q4). I then used that as my x-axis and it works fine.
For creating the new column:
alldata['quarter'] = pd.PeriodIndex(alldata.dt_quarter, freq='Q').astype('str')
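As a rough illustration of that workaround, the sketch below converts quarter-end dates to quarter labels. The sample frame and its values are invented for the example; only the `dt_quarter` column name and the `PeriodIndex` call come from the post.

```python
import pandas as pd

# Hypothetical sample data: three quarter-end dates (values invented for illustration).
alldata = pd.DataFrame({
    "dt_quarter": pd.to_datetime(["2020-09-30", "2020-12-31", "2021-03-31"]),
    "amount": [100, 150, 120],
})

# Convert each date to its calendar quarter and store the label as a string.
alldata["quarter"] = pd.PeriodIndex(alldata["dt_quarter"], freq="Q").astype(str)

print(alldata["quarter"].tolist())
```

The string column can then be passed as `x='quarter'`, which should make plotly treat the axis as categorical rather than as a date axis.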

Related

How to extract appropriate data in Plotly Grouped Bar Chart?

I am trying to plot a grouped bar chart in Plotly that shows 3 years of data. However, the chart does not show the data correctly: all the bars in the graph are equal. Something like this:
Here is my code for plotting:
data=[go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2015'),
go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2016'),
go.Bar(x=nasdaq['Sector'],y=recent_ipos['IPO Year'],textangle=-45,name='2017')
]
layout=go.Layout(title="NASDAQ Market Capitalization IPO yEAR (million USD)",barmode='group')
fig=go.Figure(data=data,layout=layout)
fig.show(renderer="colab")
Here is my code which I am using to extract the data for 3 years:
recent_ipos = nasdaq[nasdaq['IPO Year'] > 2014]
recent_ipos['IPO Year'] = recent_ipos['IPO Year'].astype(int)
I tried to extract the 2015 data here using an array, but I couldn't find an appropriate method to extract an element from the array:
ipo2015=np.array(recent_ipos['IPO Year'])
ipo2015
I am not sure if this is the right way to extract the data for a particular year.
Things I want to know:
How to extract year data appropriately in the graph using Plotly?
What changes should I make to solve this inconsistency in the graph?
What should I put in y= for each of the three bar groups?
How to extract the years dynamically rather than manually?
Hope to receive help from this amazing community on StackOverflow.
Thanks in advance.!!
I wrote the code under the assumption that the data on which the question is based is in data frame format. The sample data is taken from plotly. query() can also reference a variable by prefixing it with @, as shown in the commented-out line in the code.
import plotly.graph_objects as go
import plotly.express as px
# yyyy = [1992,1997,2002]
df = px.data.gapminder()
continent = df['continent'].unique().tolist()
yyyy = df['year'].unique().tolist()[-3:] # update
data = []
for y in yyyy:
    # tmp_df = df.query('year == @y')
    tmp_df = df[df['year'] == y].groupby('continent')['pop'].sum()
    data.append(go.Bar(x=tmp_df.index, y=tmp_df, name=str(y)))
# Change the bar mode
fig = go.Figure(data)
fig.update_layout(barmode='group')
fig.show()
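Applied to data shaped like the question's, the same idea might look like the sketch below. The `nasdaq` frame here is invented sample data that only borrows the `Sector` and `IPO Year` column names from the question, and it assumes the bar height should be the number of IPOs per sector, which the question does not state.

```python
import pandas as pd

# Hypothetical stand-in for the question's `nasdaq` frame (values invented).
nasdaq = pd.DataFrame({
    "Sector": ["Tech", "Tech", "Health", "Tech", "Health", "Finance", "Finance"],
    "IPO Year": [2015.0, 2016.0, 2016.0, 2017.0, 2017.0, 2013.0, 2017.0],
})

# Keep IPOs after 2014; .copy() avoids pandas' SettingWithCopyWarning on assignment.
recent_ipos = nasdaq[nasdaq["IPO Year"] > 2014].copy()
recent_ipos["IPO Year"] = recent_ipos["IPO Year"].astype(int)

# One row per year, one column per sector, cell = number of IPOs (assumed metric).
counts = recent_ipos.groupby(["IPO Year", "Sector"]).size().unstack(fill_value=0)

# The years are taken from the data instead of being typed in by hand.
years = counts.index.tolist()
print(years)
print(counts)
```

Each row can then become one trace, for example `go.Bar(x=counts.columns, y=counts.loc[year], name=str(year))` inside a loop over `years`, with `barmode='group'` set on the layout.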

python plotly (px) animation frame date is in wrong order

With plotly express I've built a bar chart similar to the one shown on their website.
As px.bar did not allow me to run the animation frame on datetime64[ns] I transformed the datetime into a string as follows.
eu_vaccine_df['date_str'] = eu_vaccine_df['date'].apply(lambda x: str(x))
eu_vaccine_df[['date_str', 'date', 'country', 'people_vaccinated_per_hundred']].head()
The dataset on which I then run the px.bar looks as follows and contains 30 different countries.
The code for my barchart including animation looks as follows.
fig = px.bar(
    eu_vaccine_df,
    x='country', y='people_vaccinated_per_hundred',
    color='country',
    animation_frame='date_str',
    animation_group='country',
    hover_name='country',
    range_y=[0,50],
    range_x=[0,30]
)
fig.update_layout(
    template='plotly_dark',
    margin=dict(r=10, t=25, b=40, l=60)
)
fig.show()
In the end result, the dates on the animation frame are in the wrong order. It first shows all results from 2021 and then all results from 2020, as shown at the bottom of the following screenshot.
Sorting my df by the date solved the issue.
covid_df['date'] = pd.to_datetime(covid_df['date'])
covid_df = covid_df.sort_values('date', ascending=True)
covid_df['date'] = covid_df['date'].dt.strftime('%m-%d-%Y')
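A minimal sketch of why the sort matters, using invented sample rows: the fix above is consistent with plotly express building animation frames in the order in which the frame values first appear in the data, so sorting on real datetimes before deriving the string column keeps the frames chronological. The frame-ordering behaviour is an assumption inferred from the fix, not something stated in the post.

```python
import pandas as pd

# Hypothetical sample rows, deliberately out of chronological order.
df = pd.DataFrame({
    "date": ["2021-01-02", "2020-12-30", "2021-01-01", "2020-12-31"],
    "country": ["DE", "DE", "DE", "DE"],
    "people_vaccinated_per_hundred": [0.3, 0.0, 0.2, 0.1],
})

# Sort on real datetimes first, then derive the string column for the animation.
df["date"] = pd.to_datetime(df["date"])
df = df.sort_values("date", ascending=True)
df["date_str"] = df["date"].dt.strftime("%Y-%m-%d")

# Order in which the frame labels first appear in the sorted data.
frame_order = df["date_str"].unique().tolist()
print(frame_order)
```

The sorted frame can then be passed to `px.bar(..., animation_frame='date_str')` as in the question.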

Plotly: How to style a plotly figure so that it doesn't display gaps for missing dates?

I have a plotly graph of the EUR/JPY exchange rate across a few months in 15-minute intervals, so there is no data from Friday evenings to Sunday evenings.
Here is a portion of the data; note the skip in the index (type: DatetimeIndex) over the weekend:
Plotting this data in plotly results in a gap over the missing dates. Using the dataframe above:
import plotly.graph_objs as go
candlesticks = go.Candlestick(x=data.index, open=data['Open'], high=data['High'],
low=data['Low'], close=data['Close'])
fig = go.Figure(layout=cf_layout)
fig.add_trace(trace=candlesticks)
fig.show()
Output:
As you can see, there are gaps where the missing dates are. One solution I've found online is to change the index to text using:
data.index = data.index.strftime("%d-%m-%Y %H:%M:%S")
and plotting it again, which admittedly does work, but has its own problem. The x-axis labels look atrocious:
I would like to produce a graph like the second plot, with no gaps, but with the x-axis displayed as in the first graph, or at least in a much more concise and responsive format, as close to the first graph as possible.
Thank you in advance for any help!
Even if some dates are missing in your dataset, plotly interprets your dates as date values, and shows even missing dates on your timeline. One solution is to grab the first and last dates, build a complete timeline, find out which dates are missing in your original dataset, and include those dates in:
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
This will turn this figure:
Into this:
Complete code:
import plotly.graph_objects as go
from datetime import datetime
import pandas as pd
import numpy as np
# sample data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/finance-charts-apple.csv')
# remove some dates to build a similar case as in the question
df = df.drop(df.index[75:110])
df = df.drop(df.index[210:250])
df = df.drop(df.index[460:480])
# build complete timeline from start date to end date
dt_all = pd.date_range(start=df['Date'].iloc[0],end=df['Date'].iloc[-1])
# retrieve the dates that ARE in the original dataset
dt_obs = [d.strftime("%Y-%m-%d") for d in pd.to_datetime(df['Date'])]
# define dates with missing values
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d").tolist() if d not in dt_obs]
# make figure
fig = go.Figure(data=[go.Candlestick(x=df['Date'],
                open=df['AAPL.Open'], high=df['AAPL.High'],
                low=df['AAPL.Low'], close=df['AAPL.Close'])
                ])
# hide dates with no values
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])
fig.update_layout(yaxis_title='AAPL Stock')
fig.show()
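The missing-date step can also be written with an index set difference instead of a list comprehension. This is only a sketch of the same idea on invented dates, not the answer's own code.

```python
import pandas as pd

# Hypothetical observed dates with gaps (values invented for illustration).
observed = pd.to_datetime(["2021-03-01", "2021-03-02", "2021-03-05", "2021-03-08"])

# Every calendar day between the first and last observation.
dt_all = pd.date_range(start=observed.min(), end=observed.max(), freq="D")

# Days in the full timeline that have no observation.
dt_breaks = dt_all.difference(observed).strftime("%Y-%m-%d").tolist()
print(dt_breaks)
```

The resulting list can be passed to `fig.update_xaxes(rangebreaks=[dict(values=dt_breaks)])` in the same way as above.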
In case someone here wants to remove gaps for weekends and hours outside trading time, using rangebreaks is the way to do it, as shown below.
fig = go.Figure(data=[go.Candlestick(x=df['date'], open=df['Open'], high=df['High'], low=df['Low'], close=df['Close'])])
fig.update_xaxes(
    rangeslider_visible=True,
    rangebreaks=[
        # NOTE: Below values are bounds (not single values), i.e. hide x to y
        dict(bounds=["sat", "mon"]),  # hide weekends, e.g. hide sat to before mon
        dict(bounds=[16, 9.5], pattern="hour"),  # hide hours outside of 9:30am-4pm
        # dict(values=["2020-12-25", "2021-01-01"])  # hide holidays (Christmas and New Year's, etc.)
    ]
)
fig.update_layout(
    title='Stock Analysis',
    yaxis_title=f'{symbol} Stock'
)
fig.show()
Here's Plotly's doc.
Thanks for the amazing sample! It works on daily data, but with intraday (5-minute) data the rangebreaks leave only one day on the chart:
# build complete timeline
dt_all = pd.date_range(start=df.index[0],end=df.index[-1], freq="5min")
# retrieve the dates that ARE in the original dataset
dt_obs = [d.strftime("%Y-%m-%d %H:%M:%S") for d in pd.to_datetime(df.index, format="%Y-%m-%d %H:%M:%S")]
# define dates with missing values
dt_breaks = [d for d in dt_all.strftime("%Y-%m-%d %H:%M:%S").tolist() if d not in dt_obs]
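To fix the problem with intraday data, you can use the dvalue parameter of rangebreaks with the right value in milliseconds. dvalue sets the size of each item in values and defaults to one day.
For example, 1 hour = 3.6e6 ms, so use that as dvalue when the values are spaced one hour apart.
Documentation here: https://plotly.com/python/reference/layout/xaxis/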
fig.update_xaxes(rangebreaks=[dict(values=dt_breaks, dvalue=3.6e6)])
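For the 5-minute data mentioned in the comment, the matching value would be 5 × 60 × 1000 = 300000 ms rather than the one-hour figure. The sketch below derives both the break list and `dvalue` from a single step size; the timestamps are invented for illustration.

```python
import pandas as pd

# Hypothetical 5-minute bars for two short sessions (timestamps invented).
observed = pd.date_range("2021-01-04 09:30", "2021-01-04 10:00", freq="5min").append(
    pd.date_range("2021-01-05 09:30", "2021-01-05 10:00", freq="5min")
)

step = pd.Timedelta(minutes=5)

# Full 5-minute timeline, minus the timestamps that were actually observed.
dt_all = pd.date_range(start=observed[0], end=observed[-1], freq=step)
dt_breaks = dt_all.difference(observed).strftime("%Y-%m-%d %H:%M:%S").tolist()

# Size of one break in milliseconds, matching the spacing of dt_breaks.
dvalue = step / pd.Timedelta(milliseconds=1)

rangebreaks = [dict(values=dt_breaks, dvalue=dvalue)]
print(len(dt_breaks), dvalue)
```

The list can then be passed as `fig.update_xaxes(rangebreaks=rangebreaks)`. Deriving `dvalue` from the same `step` as the timeline keeps the two from drifting apart if the bar interval changes.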

Why is my plotly graph plotting datetime on x axis as exponents?

I am trying to plot a graph with dates (pandas datetime) on the x axis. However, they are plotting in numerical format instead (showing up as exponents).
Example of dates:
0 2014-05-01
1 2014-05-02
2 2014-05-03
3 2014-05-04
4 2014-05-05
Name: date, dtype: datetime64[ns]
Code for plotly:
trace1 = go.Scatter(x = df_iso_h.date,
y=del18_f_hum,
mode = 'markers')
data = [trace1]
py.iplot(data)
My x-axis:
Not sure how to fix this??
You need to add a layout and specify the xaxis parameter in it.
So try this:
# Create trace
trace1 = go.Scatter(x=df_iso_h.date,
                    y=del18_f_hum,
                    mode='markers')
# Add trace to data
data = [trace1]
# Create layout. With layout you can customize the plotly plot
layout = dict(title='Scatter',
              # Tell plotly to treat the x-axis values as dates
              xaxis=dict(type='date')
              )
# Do not forget to add the layout to fig!
fig = dict(data=data, layout=layout)
# Plot scatter
py.iplot(fig, filename="scatterplot")
This should help you.
Update: Try converting the datetime column with strftime (the new column will have object dtype!):
df_iso_h["date"] = df_iso_h["date"].dt.strftime("%d-%m-%Y")
If that does not work, use this column for the x-axis. Maybe plotly does not support the datetime format yyyy-mm-dd... Note that your x-axis will then look like 01-05-2014.
Figured it out... Plotly does not take pandas datetime, so I had to convert my pandas datetime to python datetime.datetime or datetime.date.
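A minimal sketch of that conversion, assuming a frame with a `date` column like the one in the question (the frame itself is invented here):

```python
import datetime as dt
import pandas as pd

# Hypothetical frame mirroring the question's `date` column.
df_iso_h = pd.DataFrame({"date": pd.date_range("2014-05-01", periods=5, freq="D")})

# Convert each pandas Timestamp to a plain datetime.datetime.
x_values = [ts.to_pydatetime() for ts in df_iso_h["date"]]

print(type(x_values[0]), x_values[0])
```

`x_values` can then be passed as `x=` to `go.Scatter` in place of the pandas column.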
It seems that this was a regression introduced in plotly.py Version 3.2.0 and has been fixed in Version 3.2.1
You can now simply pass the pandas datetime column to plotly and it will handle the proper conversion for you like in the past.
See https://github.com/plotly/plotly.py/issues/1160

Plotting timestamps in matplotlib

I have a pandas dataframe which contains a column called "order.timestamp" - a list of timestamps for a set of occurrences.
I would like to plot these timestamps on the x-axis of a matplotlib plot and have the dates, hours, seconds etc display as I zoom in. Is this possible?
I have tried using datetime.strptime:
date_format = '%Y-%m-%dT%H:%M:%S.%fZ'
for i in range(0, len(small_data)):
    b = datetime.strptime(small_data["order.timestamp"].iloc[i], date_format)
    small_data = small_data.set_value(i, "order.timestamp", b)
Which re-creates the column "order.timestamp" in my pandas dataframe. The column now contains entries like:
2017-01-01 12:50:06.902000
However, if I now try to plot as normal:
fig = plt.figure()
plt.plot(small_data["order.timestamp"], small_data["y_values"])
plt.show()
I see an error
ValueError: ordinal must be >= 1
Any help greatly appreciated!
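As a side note, the per-row `strptime` loop can be written as one vectorised call. This is only a sketch on invented sample values in the format from the question, and it is not claimed to resolve the `ValueError` above.

```python
import pandas as pd

# Hypothetical sample in the string format shown in the question.
small_data = pd.DataFrame({
    "order.timestamp": ["2017-01-01T12:50:06.902000Z", "2017-01-01T12:55:10.000000Z"],
    "y_values": [1.0, 2.0],
})

date_format = "%Y-%m-%dT%H:%M:%S.%fZ"

# Parse the whole column at once instead of looping with datetime.strptime.
small_data["order.timestamp"] = pd.to_datetime(
    small_data["order.timestamp"], format=date_format
)

print(small_data.dtypes)
```

The parsed column can be passed to `plt.plot` directly as the x values.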
