This code plots the data exactly as I want, with the dates on the x-axis and the times on the y-axis. However, I want the y-axis to show every hour on the hour (e.g., 00, 01, ... 23) and the x-axis to show the beginning of every month, at an angle so there's no overlap (the actual data spans over a year), and only once each, since this code repeats the same months. How is this accomplished?
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-02-04 11:55:09']
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time, '.')
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
plt.show()
UPDATE: This fixes the x axis.
# Monthly intervals on x axis
months = mdates.MonthLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(d_fmt)
However, this attempt to fix the y axis just makes it blank.
# Hourly intervals on y axis
hours = mdates.HourLocator()
t_fmt = mdates.DateFormatter('%H')
ax.yaxis.set_major_locator(hours)
ax.yaxis.set_major_formatter(t_fmt)
I'm reading these docs but not understanding my error: https://matplotlib.org/api/dates_api.html, https://matplotlib.org/api/ticker_api.html
Matplotlib's date tools (such as HourLocator) cannot work with times that have no corresponding date. This makes it necessary to add some arbitrary date (in the case below, the 1st of January 2018) to the times. One may use datetime.datetime.combine for that purpose.
timetodatetime = lambda x:dt.datetime.combine(dt.date(2018, 1, 1), x)
time = list(map(timetodatetime, data.time))
ax.plot(data.date, time, '.')
Then the code from the question using HourLocator() works fine. Finally, setting the limits on the axes also requires datetime objects:
ax.set_ylim([dt.datetime(2018,1,1,0), dt.datetime(2018,1,2,0)])
Complete example:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27',
        '2018-02-04 11:55:09']
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
timetodatetime = lambda x:dt.datetime.combine(dt.date(2018, 1, 1), x)
time = list(map(timetodatetime, data.time))
ax.plot(data.date, time, '.')
# Monthly intervals on x axis
months = mdates.MonthLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(d_fmt)
## Hourly intervals on y axis
hours = mdates.HourLocator()
t_fmt = mdates.DateFormatter('%H')
ax.yaxis.set_major_locator(hours)
ax.yaxis.set_major_formatter(t_fmt)
ax.set_ylim([dt.datetime(2018,1,1,0), dt.datetime(2018,1,2,0)])
plt.show()
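The question also asked for the month labels to be drawn at an angle. The following is a minimal sketch of one way to do that, not part of the original answer. It assumes the same four sample timestamps, builds the dummy-date times with pandas arithmetic instead of datetime.datetime.combine, and uses the Agg backend only so that it runs without a display (the names stamps, dummy_day and time_of_day are illustrative):

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

stamps = pd.to_datetime(['2018-01-01 09:28:52', '2018-01-03 13:02:44',
                         '2018-01-03 15:30:27', '2018-02-04 11:55:09'])

# Keep the time of day, but pin every point to the same arbitrary date.
dummy_day = pd.Timestamp('2018-01-01')
time_of_day = dummy_day + (stamps - stamps.normalize())

fig, ax = plt.subplots()
ax.plot(stamps.normalize(), time_of_day, '.')

# One tick per month on x, one tick per hour on y.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
ax.yaxis.set_major_locator(mdates.HourLocator())
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H'))
ax.set_ylim(dt.datetime(2018, 1, 1), dt.datetime(2018, 1, 2))

# Rotate only the x tick labels so the month labels do not overlap.
ax.tick_params(axis='x', labelrotation=45)
fig.canvas.draw()
```

fig.autofmt_xdate() would be another way to angle the x labels; tick_params is used here because it leaves the y labels untouched and makes the angle explicit.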
Related
I have a graph that shows the closing price of a stock throughout a day at each five minute interval. The x axis shows the time and the range of x values is from 9:30 to 4:00 (16:00).
The problem is that the automatic bounds for the x axis go from 9:37 to 16:07 and I really just want it from 9:30 to 16:00.
The code I am currently running is this:
stk = yf.Ticker(ticker)
his = stk.history(interval="5m", start=start, end=end).values.tolist() #open - high - low - close - volume
x = []
y = []
count = 0
five_minutes = datetime.timedelta(minutes = 5)
for bar in his:
    x.append(start + five_minutes * count)  # .strftime("%H:%M")
    count = count + 1
    y.append(bar[3])
plt.clf()
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
plt.gca().xaxis.set_major_locator(mdates.MinuteLocator(interval=30))
plt.plot(x, y)
plt.gcf().autofmt_xdate()
plt.show()
And it produces this plot (currently a link because I am on a new user account):
I thought I was supposed to use the axis.set_data_interval function, so I called it with datetime objects representing 9:30 and 16:00 as the min and the max. This gave me the error:
TypeError: '<' not supported between instances of 'float' and 'datetime.datetime'
Is there another way to adjust the first xtick and still have the rest filled in automatically?
This problem can be fixed by adjusting the way you use the mdates tick locator. Here is an example based on the one shared by r-beginners to make it comparable. Note that I use the pandas plotting function for convenience. The x_compat=True argument is needed for it to work with mdates:
import pandas as pd # 1.1.3
import yfinance as yf # 0.1.54
import matplotlib.dates as mdates # 3.3.2
# Import data
ticker = 'AAPL'
stk = yf.Ticker(ticker)
his = stk.history(period='1D', interval='5m')
# Create pandas plot with appropriately formatted x-axis ticks
ax = his.plot(y='Close', x_compat=True, figsize=(10,5))
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 30]))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M', tz=his.index.tz))
ax.legend(frameon=False)
ax.figure.autofmt_xdate(rotation=0, ha='center')
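If the goal is also to pin the axis bounds to exactly 9:30 and 16:00, passing datetime objects to set_xlim is one option that avoids the TypeError from the question. This is a rough sketch under assumptions: the prices are synthetic (a random walk, not real market data), the session date is arbitrary, and the Agg backend is used only so it runs without a display:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for one day of 5-minute closing prices.
session_open = dt.datetime(2021, 1, 4, 9, 30)
session_close = dt.datetime(2021, 1, 4, 16, 0)
x = [session_open + dt.timedelta(minutes=5 * i) for i in range(78)]
rng = np.random.default_rng(0)
y = 100 + rng.standard_normal(len(x)).cumsum()

fig, ax = plt.subplots()
ax.plot(x, y)

# Ticks on every full and half hour, labelled as HH:MM.
ax.xaxis.set_major_locator(mdates.MinuteLocator(byminute=[0, 30]))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

# set_xlim accepts datetime objects once the axis holds dates.
ax.set_xlim(session_open, session_close)
fig.canvas.draw()
```

Because the locator places ticks at fixed minutes of the hour rather than at multiples of an interval from the first data point, the first tick falls on 9:30 no matter where the data happens to start.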
The sample data was created by obtaining Apple's stock price from Yahoo Finance. The desired five-minute interval labels are a list of strings, built with pd.date_range from the start time to the end time at five-minute intervals.
The closing price is then plotted against the bar number (0, 1, 2, ...) rather than against the timestamps, and the ticks and labels are thinned to any desired interval by slicing both with the same step.
import yfinance as yf
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
import numpy as np
ticker = 'AAPL'
stk = yf.Ticker(ticker)
his = stk.history(period='1D',interval="5m")
his.reset_index(inplace=True)
time_rng = pd.date_range('09:30','15:55', freq='5min')
labels = ['{:02}:{:02}'.format(t.hour,t.minute) for t in time_rng]
fig, ax = plt.subplots()
x = np.arange(len(his))
y = his.Close
ax.plot(x,y)
ax.set_xticks(x[::3])
ax.set_xticklabels(labels[::3], rotation=45)
plt.show()
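To make the slicing step concrete without downloading anything, here is a small sketch of just the label logic. It assumes a full regular session with exactly one bar per five minutes from 09:30 to 15:55; real data for a partial day would have fewer bars than labels, so the two would need trimming to the same length first. The step, tick_positions and tick_labels names are illustrative:

```python
import numpy as np
import pandas as pd

# One label per 5-minute bar of an assumed full session (09:30 to 15:55).
time_rng = pd.date_range('09:30', '15:55', freq='5min')
labels = [t.strftime('%H:%M') for t in time_rng]

# Bar positions 0, 1, 2, ... stand in for the x values of the plot.
x = np.arange(len(labels))

# Slicing positions and labels with the same step keeps them aligned.
step = 6  # 6 bars * 5 minutes = one tick every 30 minutes
tick_positions = x[::step]
tick_labels = labels[::step]
```

These two sequences would then go to ax.set_xticks and ax.set_xticklabels. A step of 3, as in the code above, gives a tick every 15 minutes instead.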
I have an x-axis in terms of days (366 days, with February taken as 29 days), but I want to show it in terms of months (Jan - Dec) instead. What should I do?
def plotGraph():
    line, point = getXY()
    plt.plot(line['xlMax'], c='orangered', alpha=0.5, label='Maximum Temperature (2005-14)')
    plt.plot(line['xlMin'], c='dodgerblue', alpha=0.5, label='Minimum Temperature (2005-14)')
    plt.scatter(point['xsMax'].index, point['xsMax'], s=10, c='maroon', label='Record Break Maximum (2015)')
    plt.scatter(point['xsMin'].index, point['xsMin'], s=10, c='midnightblue', label='Record Break Minimum (2015)')
    ax1 = plt.gca()  # Primary axes
    ax1.fill_between(line['xlMax'].index, line['xlMax'], line['xlMin'], facecolor='lightgray', alpha=0.25)
    ax1.grid(True, alpha=1)
    # Hide every spine except a faint bottom one
    for spine in ax1.spines:
        ax1.spines[spine].set_visible(False)
    ax1.spines['bottom'].set_visible(True)
    ax1.spines['bottom'].set_alpha(0.3)
    # Removing Ticks
    ax1.tick_params(axis='both', which='both', length=0)
    plt.show()
I think the quickest change might be to just set new ticks and tick labels at the starts of months; I found the conversion from day-of-the-year to month here, the first table:
import numpy as np
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
x = range(1,367)
y = np.random.rand(len(range(1,367)))
ax.plot(x,y)
month_starts = [1,32,61,92,122,153,183,214,245,275,306,336]
month_names = ['Jan','Feb','Mar','Apr','May','Jun',
'Jul','Aug','Sep','Oct','Nov','Dec']
ax.set_xticks(month_starts)
ax.set_xticklabels(month_names)
Note I assumed your days were numbered 1 to 366; if they are 0 to 365 you may have to change the range.
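The hard-coded month_starts list above can also be derived from the standard library rather than copied from a table. A minimal sketch, assuming a leap year (2016 is chosen arbitrarily) and days numbered from 1:

```python
import calendar
import datetime as dt

year = 2016  # any leap year, to match the 366-day axis

# Day-of-year (1-based) on which each month begins.
month_starts = [dt.date(year, m, 1).timetuple().tm_yday for m in range(1, 13)]

# Abbreviated month names; these follow the current locale.
month_names = [calendar.month_abbr[m] for m in range(1, 13)]
```

For days numbered 0 to 365, subtracting 1 from each entry of month_starts gives the matching tick positions.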
But I think a better approach is usually to get your days into some sort of datetime; this is more flexible and usually pretty smart. If, say, your days were not confined to one year, it would be more complicated to associate day numbers with months.
This example uses datetime instead of integers. The dates are plotted on the x-axis directly, and then the DateFormatter and MonthLocator from matplotlib.dates are used to format the axis appropriately:
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
start = dt.datetime(2016,1,1) #there has to be a year given, even if it isn't plotted
new_dates = [start + dt.timedelta(days=i) for i in range(366)]
fig, ax = plt.subplots()
x = new_dates
y = np.random.rand(len(range(1,367)))
xfmt = mdates.DateFormatter('%b')
months = mdates.MonthLocator()
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(xfmt)
ax.plot(x,y)
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d'))  # X-axis = date
    y.append(t.strftime('%H:%M:%S'))  # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
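To see the Locator/Formatter split in isolation, and to check that a day with no data (Jan 2 here) still gets a tick, the following sketch may help. It plots placeholder y values against real datetimes, so it is an illustration of the tick mechanics and not a replacement for the code above; the Agg backend is used only so it runs without a display:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Dates with a gap on Jan 2; the y values are placeholders.
dates = [dt.datetime(2018, 1, 1), dt.datetime(2018, 1, 3), dt.datetime(2018, 1, 4)]

fig, ax = plt.subplots()
ax.plot(dates, [1, 2, 3], '.')
ax.xaxis.set_major_locator(mdates.DayLocator())               # where the ticks go
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))   # how they are labelled
fig.canvas.draw()

# Convert the tick positions back to dates to inspect them.
tick_days = [mdates.num2date(t).strftime('%m-%d') for t in ax.get_xticks()]
```

Because the x values are true dates, the axis is to scale and DayLocator ticks every calendar day in the range, whether or not a data point falls on it.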
I'm trying to build matplotlib charts whose x-axis is a DatetimeIndex from a pandas DataFrame. I've tried to mimic some examples from matplotlib, without success: the x-axis ticks and labels never appear.
I thought maybe matplotlib wasn't properly digesting the pandas index, so I converted it to ordinal with the matplotlib date2num helper function, but that gave the same result.
# https://matplotlib.org/api/dates_api.html
# https://matplotlib.org/examples/api/date_demo.html
import datetime as dt
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import matplotlib.cbook as cbook
import matplotlib.dates as mpd
years = mdates.YearLocator() # every year
months = mdates.MonthLocator() # every month
yearsFmt = mdates.DateFormatter('%Y')
majorLocator = years
majorFormatter = yearsFmt #FormatStrFormatter('%d')
minorLocator = months
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
"""neither of these indices works"""
x = pd.date_range(start='4/1/2012', periods=len(y1))
#x = map(mpd.date2num, pd.date_range(start='4/1/2012', periods=len(y1)))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(years)
ax.xaxis.set_major_formatter(yearsFmt)
ax.xaxis.set_minor_locator(months)
datemin = x[0]
datemax = x[-1]
ax.set_xlim(datemin, datemax)
fig.autofmt_xdate()
plt.show()
The problem is the following. pd.date_range(start='4/1/2012', periods=len(y1)) creates dates from the first of April 2012 to the 9th of July 2012.
Now you set the major locator to be a YearLocator. This means that you want a tick at the start of each year on the axis. However, all dates fall within the same year, 2012, and the plot range only starts in April, so no major tick lies within the plot range.
The suggestion would be to use a MonthLocator instead, so that the first of each month is ticked. It would also make sense to use a formatter that actually shows the months, e.g. '%b %Y'. If you want small tick marks for each day, you may use a DayLocator for the minor ticks.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
Complete example:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
y1 = np.arange(100)*0.14+1
y2 = -(np.arange(100)*0.04)+12
x = pd.date_range(start='4/1/2012', periods=len(y1))
fig, ax = plt.subplots()
ax.plot(x,y1)
ax.plot(x,y2)
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))
ax.xaxis.set_minor_locator(mdates.DayLocator())
fig.autofmt_xdate()
plt.show()
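The explanation can be checked by counting how many major ticks each locator places inside the axis limits. This is a small sketch under the same data as above; visible_ticks is a hypothetical helper written for this illustration, and the Agg backend is used only so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

y = np.arange(100) * 0.14 + 1
x = pd.date_range(start='4/1/2012', periods=len(y))


def visible_ticks(locator):
    """Return the major tick positions that fall inside the x limits."""
    fig, ax = plt.subplots()
    ax.plot(x, y)
    ax.set_xlim(x[0], x[-1])
    ax.xaxis.set_major_locator(locator)
    lo, hi = ax.get_xlim()
    ticks = [t for t in ax.get_xticks() if lo <= t <= hi]
    plt.close(fig)
    return ticks


year_ticks = visible_ticks(mdates.YearLocator())
month_ticks = visible_ticks(mdates.MonthLocator())
```

With dates running from April to July 2012, year_ticks comes back empty while month_ticks holds the month starts in range, which is why the axis in the question showed no labels.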
You could use pd.DataFrame.plot to handle most of that
df = pd.DataFrame(dict(
    y1=y1, y2=y2
), index=x)
df.plot()
My matplotlib pyplot has too many xticks - it is currently showing each year and month for a 15-year period, e.g. "2001-01", but I only want the x-axis to show the year (e.g. 2001).
The output will be a line graph where x-axis shows dates and the y-axis shows the sale and rent prices.
# Defining the variables
ts1 = prices['Month'] # eg. "2001-01" and so on
ts2 = prices['Sale']
ts3 = prices['Rent']
# Reading '2001-01' as year and month
ts1 = [dt.datetime.strptime(d,'%Y-%m').date() for d in ts1]
plt.figure(figsize=(13, 9))
# Below is where it goes wrong. I don't know how to set xticks to show each year.
plt.xticks(ts1, rotation='vertical')
plt.xlabel('Year')
plt.ylabel('Price')
plt.plot(ts1, ts2, 'r-', ts1, ts3, 'b.-')
plt.gcf().autofmt_xdate()
plt.show()
Try removing the plt.xticks call altogether. matplotlib will then use its default AutoDateLocator to choose suitable tick locations.
Alternatively, if the default includes some months that you don't want, you can use matplotlib.dates.YearLocator, which forces the ticks to fall on years only.
You can set the locator as shown below in a quick example:
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
import numpy as np
import datetime as dt
x = [dt.datetime.utcnow() + dt.timedelta(days=i) for i in range(1000)]
y = range(len(x))
plt.plot(x, y)
locator = mdate.YearLocator()
plt.gca().xaxis.set_major_locator(locator)
plt.gcf().autofmt_xdate()
plt.show()
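Closer to the data in the question, a YearLocator can be paired with a DateFormatter('%Y') so that each label shows only the year. This is a sketch under assumptions: the sale and rent series are made-up random walks standing in for prices['Sale'] and prices['Rent'], the month strings are generated, not read from a file, and the Agg backend is used only so it runs without a display:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly data: '2001-01' through '2015-12'.
months = ['{}-{:02d}'.format(y, m) for y in range(2001, 2016) for m in range(1, 13)]
ts1 = [dt.datetime.strptime(d, '%Y-%m') for d in months]
rng = np.random.default_rng(1)
sale = 200 + rng.standard_normal(len(ts1)).cumsum()
rent = 100 + rng.standard_normal(len(ts1)).cumsum()

fig, ax = plt.subplots(figsize=(13, 9))
ax.plot(ts1, sale, 'r-', ts1, rent, 'b.-')

# One tick per year, labelled with the four-digit year only.
year_fmt = mdates.DateFormatter('%Y')
ax.xaxis.set_major_locator(mdates.YearLocator())
ax.xaxis.set_major_formatter(year_fmt)
ax.set_xlabel('Year')
ax.set_ylabel('Price')
fig.autofmt_xdate()
fig.canvas.draw()

# The text each tick would show.
tick_labels = [year_fmt(t) for t in ax.get_xticks()]
```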
You can do this with plt.xticks.
As an example, here I have set the xticks frequency to display every three indices. In your case, you would probably want to do so every twelve indices.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10)
y = np.random.randn(10)
plt.plot(x,y)
plt.xticks(np.arange(min(x), max(x)+1, 3))
plt.show()
In your case, since you are using dates, you can replace the argument of the second-to-last line above with something like ts1[0::12], which selects every 12th element of ts1. If the data were instead plotted against integer positions, np.arange(0, len(ts1), 12) would select every 12th index corresponding to the ticks you want to show.
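A sketch of the slicing idea applied to dates follows. It assumes monthly data that starts in January, so that every 12th element is the start of a year; the ts1 list here is generated as a stand-in for the one in the question, the y values are placeholders, and the Agg backend is used only so it runs without a display:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical monthly dates standing in for ts1 (2001-01 through 2015-12).
ts1 = [dt.datetime(y, m, 1) for y in range(2001, 2016) for m in range(1, 13)]
values = list(range(len(ts1)))  # placeholder y values

fig, ax = plt.subplots()
ax.plot(ts1, values)

# Every 12th date is the first month of a year.
year_starts = ts1[::12]
ax.set_xticks(mdates.date2num(year_starts))
ax.set_xticklabels([d.strftime('%Y') for d in year_starts], rotation='vertical')
fig.canvas.draw()
```

If the series began in some other month, the slice would need an offset (for example ts1[k::12] with k chosen so that the first selected element is a January).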