Annotate Time Series plot in Matplotlib

I have an index array (x) of dates (datetime objects) and an array of actual values (y: bond prices). Doing (in iPython):
plot(x,y)
Produces a perfectly fine time series graph with the x axis labeled with the dates. No problem so far. But I want to add text on certain dates. For example, at 2009-10-31 I wish to display the text "Event 1" with an arrow pointing to the y value at that date.
I have read through the Matplotlib documentation on text() and annotate() to no avail. It only covers standard numeric x-axes, and I can't work out how to apply those examples to my problem.
Thank you

Matplotlib uses an internal floating point format for dates.
You just need to convert your date to that format (using matplotlib.dates.date2num or matplotlib.dates.datestr2num) and then use annotate as usual.
As a somewhat excessively fancy example:
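import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# Integer literals with leading zeros (such as 05) are a syntax error in Python 3
x = [dt.datetime(2009, 5, 1), dt.datetime(2010, 6, 1),
     dt.datetime(2011, 4, 1), dt.datetime(2012, 6, 1)]
y = [1, 3, 2, 5]
fig, ax = plt.subplots()
# plot_date is deprecated in recent Matplotlib releases; plot accepts datetimes directly
ax.plot(x, y, marker='o', linestyle='--')
# Convert the date to Matplotlib's float format, then annotate as usual
ax.annotate('Test', (mdates.date2num(x[1]), y[1]), xytext=(15, 15),
            textcoords='offset points', arrowprops=dict(arrowstyle='-|>'))
fig.autofmt_xdate()
plt.show()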
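Closer to the case in the question, here is a minimal sketch that looks up the y value at the event date and points an arrow at it. The dates and prices are made up, and the lookup assumes the event date is present in the x array.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Made-up sample data standing in for the bond prices
dates = [dt.datetime(2009, 10, 29), dt.datetime(2009, 10, 30),
         dt.datetime(2009, 10, 31), dt.datetime(2009, 11, 2)]
prices = [101.2, 100.8, 99.5, 100.1]

# Look up the y value at the event date (assumes the date is in `dates`)
event_date = dt.datetime(2009, 10, 31)
event_price = prices[dates.index(event_date)]
event_x = mdates.date2num(event_date)  # Matplotlib's float date format

fig, ax = plt.subplots()
ax.plot(dates, prices)
# xy is the point the arrow tip touches; xytext is the label offset in points
note = ax.annotate("Event 1", xy=(event_x, event_price),
                   xytext=(-60, 30), textcoords="offset points",
                   arrowprops=dict(arrowstyle="->"))
fig.autofmt_xdate()
```

With an interactive backend, finish with plt.show(); in a script, fig.savefig(...) writes the figure to disk instead.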

Related

Reformat seaborn axis tick labels (datetime and scientific notation)

I am trying to plot a bar chart. However, the x-axis shows 0.0, 0.5, 1.0 and the y-axis shows NumPy datetimes.
I want the x-axis to show the exact numbers and the y-axis to show the date as dd-mm-yy only. How can I solve this?
from pandas_datareader import data
import datetime
dateToday = datetime.datetime.today().strftime("%Y-%m-%d")
# Only get the volume for the last 50 rows.
tickers = data.DataReader(['NFLX'],
start='',
end=dateToday,
data_source='yahoo')['Volume'][-50:]
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
# Change the style of the figure to the "dark" theme
sns.set_style("darkgrid")
plt.figure(figsize=(12,6))
plt.title('Returns')
sns.barplot(y=tickers.index,x=tickers['NFLX'],color='b',edgecolor='w',label='7d_returns',orient = 'h')
plt.xticks(rotation = 0)

For the date formatting, you can supply a formatted index: y=tickers.index.strftime('%d-%m-%Y') (use %y instead of %Y for a two-digit year).
For the x axis, they are currently in scientific notation (note the 1e7 in the bottom right corner), so you can disable the scientific notation: plt.ticklabel_format(style='plain', axis='x')
sns.barplot(y=tickers.index.strftime('%d-%m-%Y'), x=tickers['NFLX'], color='b', edgecolor='w', label='7d_returns', orient='h')
plt.ticklabel_format(style='plain', axis='x')
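The same two steps can be tried without downloading any data. This is a minimal sketch with made-up volumes and plain Matplotlib barh in place of seaborn; the variable names are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

# Made-up volumes standing in for the downloaded data
index = pd.date_range("2021-01-04", periods=3, freq="D")
volume = pd.Series([12_500_000, 18_300_000, 15_750_000], index=index)

# Step 1: turn the datetimes into dd-mm-yy strings for the category axis
date_labels = list(index.strftime("%d-%m-%y"))

fig, ax = plt.subplots()
ax.barh(date_labels, volume.to_numpy(), color="b", edgecolor="w")

# Step 2: switch off scientific notation on the value axis only
ax.ticklabel_format(style="plain", axis="x")

fig.canvas.draw()  # tick label text is only populated after a draw
value_labels = [t.get_text() for t in ax.get_xticklabels()]
```

Restricting ticklabel_format to axis="x" matters here, because the y axis holds string categories and has no numeric formatter to configure.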

I can't change the value of x label in plot

I want to plot a speed-time graph and change the x-axis labels to 2012, 2013, 2014, 2015, 2016, like this:
With this code I got this graph:
Could you help me, please? What should I do?
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data=pd.read_excel("ts80.xlsx")
Date=data["Date"]
Speed=data["Speed"]
timestamp = pd.to_datetime(Date[0:]).dt.strftime("%Y%m%d")
fig, ax = plt.subplots(figsize=(13,6))
ax.plot(timestamp, Speed)
plt.xlabel("Time")
plt.ylabel("80Avg[m/s]")
plt.title("Mean Wind Speed at 80m")
ax.xaxis.set_minor_formatter(mdates.DateFormatter("%Y"))
plt.xticks([2012, 2013, 2014, 2015, 2016])
plt.show()
Could you please write the correct code?

The issue is that you're converting your date strings to dates and then back to strings. Don't do that. Keep them as dates and then use the correct DateFormatter on your x-axis.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data = pd.read_excel("ts80.xlsx")
t = pd.to_datetime(data["Date"])  # keep the dates as datetimes, not strings
speed = data["Speed"]
fig, ax = plt.subplots()
ax.plot(t, speed)
ax.xaxis.set_major_locator(mdates.YearLocator())  # one tick per year
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))  # label each tick with the year only
plt.show()
The reason to use a proper Formatter and Locator is that they handle the general case where this code evolves into an interactive plot with a long series of data and the user can pan and zoom around. Hard-coded ticks or tick labels completely fall apart in that scenario.
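To see the pan/zoom argument in action, here is a rough sketch that swaps in AutoDateLocator and ConciseDateFormatter (a different pair from the YearLocator/DateFormatter used above) and simulates a zoom by changing the x limits. The wind speeds are synthetic, since ts80.xlsx is not available here.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

# Synthetic daily wind speeds standing in for ts80.xlsx
t = pd.date_range("2012-01-01", "2016-12-31", freq="D")
speed = 7 + 2 * np.sin(np.arange(t.size) * 2 * np.pi / 365.25)

fig, ax = plt.subplots()
ax.plot(t, speed)

# The locator picks tick positions from the current view limits,
# and the formatter adapts its labels to the chosen tick spacing
locator = mdates.AutoDateLocator()
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

wide_ticks = ax.xaxis.get_majorticklocs()  # ticks for the full range

# Simulate the user zooming in on a single month
ax.set_xlim(dt.datetime(2014, 3, 1), dt.datetime(2014, 3, 31))
zoomed_ticks = ax.xaxis.get_majorticklocs()  # ticks recomputed for the new view
```

The tick positions are in Matplotlib date numbers (days), so comparing np.diff(wide_ticks) with np.diff(zoomed_ticks) shows the spacing shrinking after the zoom, which fixed ticks cannot do.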

Axis interval spacing when plotting with pandas timedelta

I'm trying to plot some columns in a dataframe that has pandas timedelta values as its index. When I plot it, all the points are evenly spaced along the x axis even if there's a variable time between.
time = [pd.Timestamp('9/3/2016') - pd.Timestamp('9/1/2016'),
        pd.Timestamp('9/8/2016') - pd.Timestamp('9/1/2016'),
        pd.Timestamp('9/29/2016') - pd.Timestamp('9/1/2016')]
df = pd.DataFrame(index=time, columns=['y'],data=[5,0,10])
df.plot()
plt.show()
Wrong spacing
If I use dates instead of timedeltas, I get the proper spacing on the x axis:
time = [pd.Timestamp('9/3/2016'),pd.Timestamp('9/5/2016'),pd.Timestamp('9/20/2016')]
df = pd.DataFrame(index=time, columns=['y'],data=[5,0,10])
df.plot()
plt.show()
Right spacing
Is there a way to get this to display correctly?

At the moment, this is not fully supported in pandas. Please see this issue on GitHub for more info.
For a quick workaround, you can use:
import matplotlib.pyplot as plt
plt.plot(df.index, df.values)
Here's an example of how you could set the ticks to make them readable (rather than showing very large numbers, since each Timedelta's .value is its duration in nanoseconds):
fig, ax = plt.subplots()
ax.plot(df.index, df.values)
plt.xticks([t.value for t in df.index], df.index, rotation=45)
plt.show()
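Another option, sketched here under the assumption that elapsed days are a suitable unit for the data, is to convert the timedeltas to plain floats before plotting. The x axis is then an ordinary numeric axis and the spacing is proportional to the elapsed time.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import pandas as pd

start = pd.Timestamp("2016-09-01")
time = [pd.Timestamp("2016-09-03") - start,
        pd.Timestamp("2016-09-08") - start,
        pd.Timestamp("2016-09-29") - start]
df = pd.DataFrame(index=time, columns=["y"], data=[5, 0, 10])

# Dividing a TimedeltaIndex by a Timedelta gives plain floats, here in days
days = (df.index / pd.Timedelta(days=1)).to_numpy()

fig, ax = plt.subplots()
ax.plot(days, df["y"].to_numpy(), marker="o")
ax.set_xticks(days)  # one tick per observation
ax.set_xlabel("Days since 2016-09-01")
```

Dividing by pd.Timedelta(hours=1) or pd.Timedelta(seconds=1) instead gives the same plot in other units.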

Matplotlib : Default Resolution of Plot Mouse-over Values

I'm plotting a time series with Matplotlib. The x-axis values have the resolution '%d/%m/%y %H:%M', but only month and year are shown on mouse-over.
My question is: how can one override the default and choose which datetime fields are shown on mouse-over?
My preference is to show at least day, month, and year.
For example, this is a screenshot where I did a mouse-over for one of the points:
As you can see, the x value (bottom left-hand corner) gives a date which indicates only month and year.
When zoomed in, day, month and year are shown:

The values that are shown on mouse-over are controlled by the ax.format_coord method, which is meant to be monkey-patched by a user-supplied method when customization is needed.
For example:
import matplotlib.pyplot as plt
def formatter(x, y):
    return '{:0.0f} rainbows, {:0.0f} unicorns'.format(10*x, 10*y)
fig, ax = plt.subplots()
ax.format_coord = formatter
plt.show()
There are also the ax.format_xdata and ax.format_ydata which the default ax.format_coord calls, to allow easier customization of only the x or y components.
For example:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.format_xdata = '{:0.1f}meters'.format
plt.show()
Note that I passed in the string's format method, but it could have just as easily been a lambda or any arbitrary method that expects a single numeric argument.
By default, format_xdata and format_ydata use the axis's major tick formatter, which is why the mouse-over resolution on your date axis follows the tick labels: month and year when zoomed out, day-level once you zoom in.
However, you'll also need to convert matplotlib's internal numeric date format back to a "proper" datetime object. Therefore, you can control your formatting similar to the following:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
fig, ax = plt.subplots()
ax.xaxis_date()
ax.set_xlim(dt.datetime(2015, 1, 1), dt.datetime(2015, 6, 1))
ax.format_xdata = lambda d: mdates.num2date(d).strftime('%d/%m/%y %H:%M')
plt.show()
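If you want to control the whole mouse-over string rather than just the x part, the two ideas above can be combined by replacing format_coord with a function that converts the x value back to a datetime. This is only a sketch; date_coord is a made-up name and the output layout is arbitrary.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.dates as mdates


def date_coord(x, y):
    """Format a mouse position as 'dd/mm/yy HH:MM, y=value'."""
    # x arrives as a Matplotlib date number, so convert it back to a datetime
    return "{:%d/%m/%y %H:%M}, y={:.2f}".format(mdates.num2date(x), y)


fig, ax = plt.subplots()
ax.xaxis_date()
ax.set_xlim(dt.datetime(2015, 1, 1), dt.datetime(2015, 6, 1))
ax.format_coord = date_coord

# Call the formatter by hand to see what the status bar would show
sample = date_coord(mdates.date2num(dt.datetime(2015, 3, 14, 9, 30)), 1.5)
print(sample)  # 14/03/15 09:30, y=1.50
```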

How to highlight specific x-value ranges

I'm making a visualization of historical stock data for a project, and I'd like to highlight regions of drops. For instance, when the stock is experiencing significant drawdown, I would like to highlight it with a red region.
Can I do this automatically, or will I have to draw a rectangle or something?

Have a look at axvspan (and axhspan for highlighting a region of the y-axis).
import matplotlib.pyplot as plt
plt.plot(range(10))
plt.axvspan(3, 6, color='red', alpha=0.5)
plt.show()
If you're using dates, then you'll need to convert your min and max x values to matplotlib dates. Use matplotlib.dates.date2num for datetime objects or matplotlib.dates.datestr2num for various string timestamps.
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# drange returns an array of Matplotlib date numbers (floats)
t = mdates.drange(dt.datetime(2011, 10, 15), dt.datetime(2011, 11, 27),
                  dt.timedelta(hours=2))
y = np.sin(t)
fig, ax = plt.subplots()
ax.plot(t, y, 'b-')
ax.xaxis_date()  # interpret the float x values as dates
ax.axvspan(*mdates.datestr2num(['10/27/2011', '11/2/2011']), color='red', alpha=0.5)
fig.autofmt_xdate()
plt.show()
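As for doing this automatically: one possible approach is to compute a boolean mask for the region of interest, find its contiguous runs, and call axvspan once per run. In the sketch below, the 15 % threshold, the drawdown definition (fall from the running maximum) and the helper name true_runs are all assumptions chosen for illustration, and the prices are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import numpy as np


def true_runs(mask):
    """Return (start, stop) index pairs for each run of True values (stop exclusive)."""
    mask = np.asarray(mask, dtype=bool)
    # Pad with False so runs touching either end still produce two edges
    padded = np.concatenate(([False], mask, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    return list(zip(edges[::2].tolist(), edges[1::2].tolist()))


# Made-up prices; drawdown is measured against the running maximum
price = np.array([10, 11, 12, 9, 8, 11, 13, 12, 10, 14], dtype=float)
running_max = np.maximum.accumulate(price)
drawdown = price / running_max - 1.0
in_drawdown = drawdown <= -0.15  # hypothetical threshold: 15 % below the peak

runs = true_runs(in_drawdown)

fig, ax = plt.subplots()
ax.plot(price)
for start, stop in runs:
    # Widen by half a step so single-point runs are still visible
    ax.axvspan(start - 0.5, stop - 0.5, color="red", alpha=0.3)
```

With a date axis, the same loop works after looking up the dates at start and stop - 1 and converting them with matplotlib.dates.date2num.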

Here is a solution that uses axvspan to draw multiple highlights where the limits of each highlight are set by using the indices of the stock data corresponding to the peaks and troughs.
Stock data usually contain a discontinuous time variable where weekends and holidays are not included. Plotting them in matplotlib or pandas will produce gaps along the x-axis for weekends and holidays when dealing with daily stock prices. This may not be noticeable with long date ranges and/or small figures (like in this example), but it will become apparent if you zoom in and it may be something that you want to avoid.
This is why I share here a complete example that features:
A realistic sample dataset that includes a discontinuous DatetimeIndex based on the New York Stock Exchange trading calendar imported with the pandas_market_calendars as well as fake stock data that looks like the real thing.
A pandas plot created with use_index=False which removes the gaps for weekends and holidays by using instead a range of integers for the x-axis. The returned ax object is used in a way that avoids the need to import matplotlib.pyplot (unless you need plt.show).
An automatic detection of drawdowns over the entire date range by using the scipy.signal find_peaks function which returns the indices needed to plot the highlights with axvspan. Computing drawdowns in a more correct way would require a clear definition of what would count as a drawdown and would lead to more complicated code which is a topic for another question.
Properly formatted ticks created by looping through the timestamps of the DatetimeIndex seeing as all the convenient matplotlib.dates tick locators and formatters as well as DatetimeIndex properties like .is_month_start cannot be used in this case.
Create sample dataset
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import pandas_market_calendars as mcal # v 1.6.1
from scipy.signal import find_peaks # v 1.5.2
# Create datetime index with a 'trading day end' frequency based on the New York Stock
# Exchange trading hours (end date is inclusive)
nyse = mcal.get_calendar('NYSE')
nyse_schedule = nyse.schedule(start_date='2019-10-01', end_date='2021-02-01')
nyse_dti = mcal.date_range(nyse_schedule, frequency='1D').tz_convert(nyse.tz.zone)
# Create sample of random data for daily stock closing price
rng = np.random.default_rng(seed=1234) # random number generator
price = 100 + rng.normal(size=nyse_dti.size).cumsum()
df = pd.DataFrame(data=dict(price=price), index=nyse_dti)
df.head()
# price
# 2019-10-01 16:00:00-04:00 98.396163
# 2019-10-02 16:00:00-04:00 98.460263
# 2019-10-03 16:00:00-04:00 99.201154
# 2019-10-04 16:00:00-04:00 99.353774
# 2019-10-07 16:00:00-04:00 100.217517
Plot highlights for drawdowns with properly formatted ticks
# Plot stock price
ax = df['price'].plot(figsize=(10, 5), use_index=False, ylabel='Price')
ax.set_xlim(0, df.index.size-1)
ax.grid(axis='x', alpha=0.3)
# Highlight drawdowns using the indices of stock peaks and troughs: find peaks and
# troughs based on signal analysis rather than an algorithm for drawdowns to keep
# example simple. Width and prominence have been handpicked for this example to work.
peaks, _ = find_peaks(df['price'], width=7, prominence=4)
troughs, _ = find_peaks(-df['price'], width=7, prominence=4)
for peak, trough in zip(peaks, troughs):
    ax.axvspan(peak, trough, facecolor='red', alpha=.2)
# Create and format monthly ticks
ticks = [idx for idx, timestamp in enumerate(df.index)
         if (timestamp.month != df.index[idx-1].month) | (idx == 0)]
ax.set_xticks(ticks)
labels = [tick.strftime('%b\n%Y') if df.index[ticks[idx]].year
          != df.index[ticks[idx-1]].year else tick.strftime('%b')
          for idx, tick in enumerate(df.index[ticks])]
ax.set_xticklabels(labels)
ax.figure.autofmt_xdate(rotation=0, ha='center')
ax.set_title('Drawdowns are highlighted in red', pad=15, size=14);
For the sake of completeness, it is worth noting that you can achieve exactly the same result using the fill_between plotting function, though it takes a few more lines of code:
ax.set_ylim(*ax.get_ylim()) # remove top and bottom gaps with plot frame
drawdowns = np.repeat(False, df['price'].size)
for peak, trough in zip(peaks, troughs):
    drawdowns[np.arange(peak, trough+1)] = True
ax.fill_between(np.arange(df.index.size), *ax.get_ylim(), where=drawdowns,
                facecolor='red', alpha=.2)
You are using matplotlib's interactive interface and want to have dynamic ticks when you zoom in? Then you will need to use locators and formatters from the matplotlib.ticker module. You could for example keep the major ticks fixed like in this example and add dynamic minor ticks to show days or weeks of the year when zooming in. You can find an example of how to do this at the end of this answer.
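As a rough illustration of that idea (not the linked example), the sketch below uses matplotlib.ticker on an integer x axis: MaxNLocator picks integer positions for the current view and a FuncFormatter maps each position back to its date. The business-day index, the prices and the helper name position_to_date are made up for this sketch, and holidays are ignored.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np
import pandas as pd

# Business days only, so the calendar has weekend gaps (holidays ignored here)
index = pd.bdate_range("2019-10-01", periods=60)
price = 100 + np.arange(index.size) * 0.1  # made-up prices


def position_to_date(x, pos=None):
    """Map an integer x position back to its date label; blank if out of range."""
    i = int(round(x))
    if abs(x - i) > 1e-9 or not 0 <= i < index.size:
        return ""
    return index[i].strftime("%Y-%m-%d")


fig, ax = plt.subplots()
ax.plot(np.arange(index.size), price)  # integer x axis, no weekend gaps
# The locator chooses integer positions for whatever range is currently visible
ax.xaxis.set_major_locator(mticker.MaxNLocator(nbins=6, integer=True))
ax.xaxis.set_major_formatter(mticker.FuncFormatter(position_to_date))
```

Because the locator recomputes positions from the view limits, the date labels follow along when the plot is zoomed or panned.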
