Change the datetime xticks frequency - matplotlib [duplicate] - python

I'm basically trying to plot a graph where the x axis represents the month of the year. The data is stored in a numpy.array with dimensions k x months. Here is a minimal example (my data is not this crazy):
import numpy
import matplotlib
import matplotlib.pyplot as plt
data = numpy.random.rand(18, 12)
# One color per row of data
cmap = plt.get_cmap('Set3')
colors = [cmap(i) for i in numpy.linspace(0, 1, data.shape[0])]
y = range(data.shape[1])
plt.figure(figsize=(15, 7), dpi=200)
for i in range(data.shape[0]):
    plt.plot(y, data[i, :], color=colors[i], linewidth=5, label=str(i))
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.xticks(numpy.arange(0, 12, 1))
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Complaints')
plt.title('Number of Complaints per Hour in 2015')
I'd like to have the xticks as strings instead of numbers. I'm wondering if I have to create a list of strings, manually, or if there is another way to translate the numbers to months. I have to do the same for weekdays, for example.
I've been looking at these examples:
http://matplotlib.org/examples/pylab_examples/finance_demo.html
http://matplotlib.org/examples/pylab_examples/date_demo2.html
But I'm not using datetime.

Although this answer works well, in this case you can avoid defining your own FuncFormatter by using matplotlib's predefined date formatters, that is, matplotlib.dates rather than matplotlib.ticker:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
# Define time range with 12 different months:
# `MS` stands for month start frequency
x_data = pd.date_range('2018-01-01', periods=12, freq='MS')
# Check what these dates look like:
print(x_data)
y_data = np.random.rand(12)
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
# Make ticks on occurrences of each month:
ax.xaxis.set_major_locator(mdates.MonthLocator())
# Get only the month to show in the x-axis:
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
# '%b' means month as locale’s abbreviated name
plt.show()
Obtaining:
DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
'2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
'2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01'],
dtype='datetime64[ns]', freq='MS')
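The question also mentions weekdays. The same locator/formatter pairing appears to carry over; what follows is a minimal sketch, assuming daily data that starts on an arbitrary Monday (the dates and variable names are illustrative and not taken from the original post):

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Seven consecutive days; the start date is an arbitrary choice for illustration
days = [datetime.datetime(2018, 1, 1) + datetime.timedelta(days=i) for i in range(7)]
values = np.random.rand(len(days))

fig, ax = plt.subplots()
ax.plot(days, values)
# One tick per calendar day
ax.xaxis.set_major_locator(mdates.DayLocator())
# '%a' is the locale's abbreviated weekday name
ax.xaxis.set_major_formatter(mdates.DateFormatter('%a'))

fig.canvas.draw()  # render once so the tick labels are populated
weekday_labels = [label.get_text() for label in ax.get_xticklabels()]
```

Swapping '%a' for '%A' should give full weekday names instead of abbreviations.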

An alternative is the plot_date method, which you might prefer over the more general plot method when your independent variable is datetime-like (note that recent Matplotlib releases discourage plot_date in favor of plot):
import datetime
data = np.random.rand(24)
# a list of times: 00:00:00 to 23:00:00
times = [datetime.datetime.strptime(str(i), '%H') for i in range(24)]
# fmt is the plot format string (as in plt.plot), not a date format:
# 'H' draws hexagon markers; the tick labels are produced by the axis' date formatter
plt.plot_date(times, data, fmt='H')
plt.setp(plt.gca().xaxis.get_majorticklabels(), 'rotation', 90)
The benefit is that you can now easily control the density of the xticks. To have a tick every hour, insert these lines after plot_date:
##import it if not already imported
#import matplotlib.dates as mdates
plt.gca().xaxis.set_major_locator(mdates.HourLocator())
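The same tick control is available with the general plot method. Below is a rough sketch, assuming hourly samples on an arbitrary day; the date, the three-hour spacing and the '%H:%M' label format are choices made for illustration only:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# 24 hourly timestamps on an arbitrary day
times = [datetime.datetime(2018, 1, 1, hour) for hour in range(24)]
data = np.random.rand(24)

fig, ax = plt.subplots()
ax.plot(times, data, marker='o', linestyle='')
# Ticks only at 00:00, 03:00, 06:00, ... (byhour pins the hours explicitly)
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 3)))
# Show hours and minutes only; the date part is dropped from the labels
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
ax.tick_params(axis='x', labelrotation=90)

fig.canvas.draw()
tick_hours = [mdates.num2date(t).hour for t in ax.get_xticks()]
```

Passing byhour rather than interval keeps the ticks on fixed clock hours regardless of where the axis limits happen to start.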

You can still use formatters to format your results in the way you want. For example, to have month names printed, let us first define a function taking an integer to a month abbreviation:
import datetime
def getMonthName(month_number):
    testdate = datetime.date(2010, int(month_number), 1)
    return testdate.strftime('%b')
Here, I have created an arbitrary date with the correct month and returned that month's abbreviation. Check the datetime documentation for available format codes if needed. Whether that is always easier than just writing the list by hand is another question. Now let us plot some monthly test data:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
x_data=np.arange(1,12.5,1)
y_data=x_data**2 # Just some arbitrary data
plt.plot(x_data,y_data)
plt.gca().xaxis.set_major_locator(mtick.FixedLocator(x_data)) # Set tick locations
plt.gca().xaxis.set_major_formatter(mtick.FuncFormatter(lambda x,p:getMonthName(x)))
plt.show()
The message here is that you can use matplotlib.ticker.FuncFormatter to use any function to obtain a tick label. The function takes two arguments (value and position) and returns a string.
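If a plain list of labels is enough, the standard-library calendar module already provides month and weekday names, so nothing has to be typed by hand. A minimal sketch, assuming the months are numbered 1 to 12 on the x axis (the array shape and variable names are illustrative):

```python
import calendar

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(3, 12)   # 3 series, 12 months
months = np.arange(1, 13)      # month numbers 1..12

# calendar.month_abbr[0] is an empty string, so skip it
month_labels = list(calendar.month_abbr)[1:]
# calendar.day_abbr starts on Monday and has seven entries
weekday_labels = list(calendar.day_abbr)

fig, ax = plt.subplots()
for row in data:
    ax.plot(months, row)
# Put one tick on every month and label it with the abbreviation
ax.set_xticks(months)
ax.set_xticklabels(month_labels)
```

Both lists follow the current locale, the same way the '%b' and '%a' format codes do.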

Related

How to plot time on the y axis correctly using python matplotlib?

I have two lists containing the sunset and sunrise times and the corresponding dates.
It looks like:
sunrises = ['06:30', '06:28', '06:27', ...]
dates = ['3.21', '3.22', '3.23', ...]
I want to make a plot of the sunrise times as the Y axis and the dates as the X axis.
Simply using
ax.plot(dates, sunrises)
ax.xaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
ax.yaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
plt.show()
can plot the dates correctly, but the time is wrong:
And actually, the sunrise time isn't supposed to be a straight line.
How do I solve this problem?
You need to convert the time strings into a type that matplotlib understands, using datetime:
from matplotlib import pyplot as plt
import matplotlib as mpl
from datetime import datetime
import matplotlib.dates as mdates
sunrises = ['06:30', '06:28', '06:27',]
sunrises_dt = [datetime.strptime(item,'%H:%M') for item in sunrises]
dates = ['3.21', '3.22', '3.23',]
fig,ax = plt.subplots()
ax.plot(dates, sunrises_dt)
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M',))
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(1))
plt.show()
This is because your sunrises are not numerical. I'm assuming you'd want them in a form such that "6:30" means 6.5, which is calculated below:
import matplotlib.pyplot as plt
sunrises = ['06:30', '06:28', '06:27']
# This converts to decimals
sunrises = [float(x[0:2])+(float(x[-2:])/60) for x in sunrises]
dates = ['3.21', '3.22', '3.23']
plt.plot(sunrises, dates)
plt.xlabel('sunrises')
plt.ylabel('dates')
plt.show()
Note, your dates are being treated as decimals. Is this correct?
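If the dates should be real dates rather than strings or decimals, both axes can be parsed with datetime. This is only a sketch: it assumes '3.21' means month.day, and the year is a made-up placeholder because the question does not give one:

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

sunrises = ['06:30', '06:28', '06:27']
dates = ['3.21', '3.22', '3.23']

YEAR = 2021  # hypothetical: the question does not state a year
# Assumption: each date string is 'month.day'
x = [datetime.strptime(f'{YEAR}.{d}', '%Y.%m.%d') for d in dates]
# Parsing only a time fills in 1900-01-01, so all sunrises share one dummy date
y = [datetime.strptime(t, '%H:%M') for t in sunrises]

fig, ax = plt.subplots()
ax.plot(x, y, marker='o')
# One tick per day on the x axis, labelled like the input strings
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m.%d'))
# Show only the time of day on the y axis
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))
fig.canvas.draw()
```

Because both axes now hold real datetimes, gaps between dates are drawn to scale instead of being spaced evenly.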

Plotting a times series using matplotlib with 24 hours on the y-axis

If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
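A likely explanation for the empty plot in the question is that lists of strings are treated as categories, so string limits do not behave like time limits. One pandas-free alternative is to plot the time of day as fractional hours. The sketch below assumes that approach; the three-hour tick spacing and the label formats are arbitrary choices:

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
        '2018-01-03 15:30:27', '2018-01-04 11:55:09']
stamps = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in data]

# X: the calendar day (time stripped); Y: hours since midnight as a float
x = [t.replace(hour=0, minute=0, second=0) for t in stamps]
y = [t.hour + t.minute / 60 + t.second / 3600 for t in stamps]

fig, ax = plt.subplots()
ax.plot(x, y, '.')
# Midnight to midnight, with a tick every 3 hours labelled as HH:00
ax.set_ylim(0, 24)
ax.yaxis.set_major_locator(mticker.MultipleLocator(3))
ax.yaxis.set_major_formatter(mticker.FuncFormatter(lambda h, pos: f'{int(h):02d}:00'))
# One tick per day, so days without data (Jan 2) still appear on the axis
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
fig.canvas.draw()
```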

pandas .plot() x-axis tick frequency -- how can I show more ticks?

I am plotting time series using pandas .plot() and want to see every month shown as an x-tick.
Here is the dataset structure
Here is the result of the .plot()
I was trying to use examples from other posts and matplotlib documentation and do something like
ax.xaxis.set_major_locator(
    dates.MonthLocator(revenue_pivot.index, bymonthday=1,interval=1))
But that removed all the ticks :(
I also tried to pass xticks = df.index, but it has not changed anything.
What would be the right way to show more ticks on the x-axis?
No need to pass any args to MonthLocator. Make sure to use x_compat in the df.plot() call per @Rotkiv's answer.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
df = pd.DataFrame(np.random.rand(100,2), index=pd.date_range('1-1-2018', periods=100))
ax = df.plot(x_compat=True)
ax.xaxis.set_major_locator(mdates.MonthLocator())
plt.show()
formatted x-axis with set_major_locator
unformatted x-axis
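The labels can be controlled the same way as the tick positions. A small sketch, assuming the same kind of random daily data as above and an arbitrary '%b %Y' label format:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(100, 2),
                  index=pd.date_range('2018-01-01', periods=100))

# x_compat=True makes pandas hand plain Matplotlib dates to the axis
ax = df.plot(x_compat=True)
# One tick on the first day of every month ...
ax.xaxis.set_major_locator(mdates.MonthLocator())
# ... labelled with abbreviated month name and year
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b %Y'))

ax.figure.canvas.draw()
tick_dates = [mdates.num2date(t) for t in ax.get_xticks()]
```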
You could also format the x-axis ticks and labels of a pandas DateTimeIndex "manually" using the attributes of a pandas Timestamp object.
I found that much easier than using locators from matplotlib.dates which work on other datetime formats than pandas (if I am not mistaken) and thus sometimes show an odd behaviour if dates are not converted accordingly.
Here's a generic example that shows the first day of each month as a label based on attributes of pandas Timestamp objects:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# data
dim = 8760
idx = pd.date_range('1/1/2000 00:00:00', freq='h', periods=dim)
df = pd.DataFrame(np.random.randn(dim, 2), index=idx)
# select tick positions based on timestamp attribute logic. see:
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Timestamp.html
positions = [p for p in df.index
             if p.hour == 0
             and p.is_month_start
             and p.month in range(1, 13, 1)]
# for date formatting, see:
# https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
labels = [l.strftime('%m-%d') for l in positions]
# plot with adjusted labels
ax = df.plot(kind='line', grid=True)
ax.set_xlabel('Time (h)')
ax.set_ylabel('Foo (Bar)')
ax.set_xticks(positions)
ax.set_xticklabels(labels)
plt.show()
yields:
Hope this helps!
The right way to do that is described here:
Using the x_compat parameter, it is possible to suppress automatic tick resolution adjustment
df.A.plot(x_compat=True)
If you want to just show more ticks, you can also dive deep into the structure of pd.plotting._converter:
dai = ax.xaxis.minor.formatter.plot_obj.date_axis_info
dai['fmt'][dai['fmt'] == b''] = b'%b'
After plotting, the formatter is a TimeSeries_DateFormatter and _set_default_format has been called, so self.plot_obj.date_axis_info is not None. You can now manipulate the structured array .date_axis_info to your liking, namely so that it contains fewer b'' and more b'%b' entries.
Remove tick labels:
ax = df.plot(x='date', y=['count'])
every_nth = 10
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)
Lower every_nth to include more labels, raise to keep fewer.

Format datetime labels to include weekday name for pandas plot

I would like to add the corresponding weekday names (Mon, Tues, etc.) to the xlabels for a pandas timeseries plot.
import pandas as pd
import numpy as np
import pylab as p
import datetime
dates = pd.date_range(datetime.datetime.today().date(), periods=10, freq='D')
data = pd.DataFrame(np.arange(10),index=dates,columns=['A'])
a = data['A'].plot()
p.tight_layout()
p.show()
I have tried adjusting the formatting using:
from matplotlib.dates import DateFormatter
formatter = DateFormatter('%a %d-%m-%Y')
a.xaxis.set_major_formatter(formatter)
But this does not work, leading to incorrect day and year.
It seems there should be a very simple solution, but I cannot find it.
Here's what I thought would work but didn't:
from matplotlib.ticker import FuncFormatter
from matplotlib import pyplot as plt
ax = data.A.plot()
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d.strftime('%a')))
or
ax = plt.subplot()
ax.plot(data.index, data.A)
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d.strftime('%a')))
These both go wrong in different ways. It seems the formatter inputs turn out to be floats rather than dates in both cases. In the first the function only gets applied to the first and last ticks. You can see this by passing
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d))
Here's a solution which is pretty flexible:
ax = plt.subplot()
ax.plot(data.index, data.A)
ticks = ax.set_xticklabels([d.strftime('%a') for d in data.index])
You can swap the list comprehension in the last line for whatever you like.
EDIT:
I think I've figured out what these numbers representing the xticks mean.
In [37]:
ax = plt.subplot()
ax.plot(data.index, data.A)
print(ax.get_xticks())
[ 735824. 735825. 735826. 735827. 735828. 735829. 735830. 735831.
735832. 735833.]
These seem to represent the number of days since the start of 1 AD: According to this: http://www.epochconverter.com/epoch/seconds-days-since-year-0.php
"There are 736189 days between 0000-00-00 and today (Aug 14, 2015)."
Which is exactly 735824 (the first tick) + 365. So far so bad.
You could (I won't bother) write a function to convert this number and ones like it into dates. Another approach would be:
def get_day(tick):
    date = dates[0] + datetime.timedelta(tick - ticks[0])
    return date.strftime('%a')
ax = plt.subplot()
ax.plot(data.index, data.A)
ticks = ax.get_xticks()
ax.xaxis.set_major_formatter(FuncFormatter(lambda tick, _: get_day(tick)))
Again, you can sub the date format you want into get_day. Not sure if this will solve the panning/zooming problem but at least it gives a way of setting the tick labels using a function.
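Matplotlib also ships matplotlib.dates.num2date, which converts these tick numbers back to datetimes, so the offset arithmetic is not strictly needed. The exact epoch behind the numbers depends on the Matplotlib version, but num2date accounts for that. A minimal sketch, assuming data plotted with ax.plot rather than the pandas plot method (the start date and the helper name weekday_label are illustrative):

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

start = datetime.datetime(2015, 8, 14)  # arbitrary start date for illustration
dates = [start + datetime.timedelta(days=i) for i in range(10)]
values = list(range(10))


def weekday_label(tick, pos=None):
    """Convert a Matplotlib date number to an abbreviated weekday name."""
    return mdates.num2date(tick).strftime('%a')


fig, ax = plt.subplots()
ax.plot(dates, values)
# One tick per day, labelled through the conversion function
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(FuncFormatter(weekday_label))
fig.canvas.draw()
```

Since the label is computed from the tick value itself, it should stay correct when the view is panned or zoomed.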

Matplotlib pyplot - tick control and showing date

My matplotlib pyplot has too many xticks - it is currently showing each year and month for a 15-year period, e.g. "2001-01", but I only want the x-axis to show the year (e.g. 2001).
The output will be a line graph where x-axis shows dates and the y-axis shows the sale and rent prices.
# Defining the variables
ts1 = prices['Month'] # eg. "2001-01" and so on
ts2 = prices['Sale']
ts3 = prices['Rent']
# Reading '2001-01' as year and month
ts1 = [dt.datetime.strptime(d,'%Y-%m').date() for d in ts1]
plt.figure(figsize=(13, 9))
# Below is where it goes wrong. I don't know how to set xticks to show each year.
plt.xticks(ts1, rotation='vertical')
plt.xlabel('Year')
plt.ylabel('Price')
plt.plot(ts1, ts2, 'r-', ts1, ts3, 'b.-')
plt.gcf().autofmt_xdate()
plt.show()
Try removing the plt.xticks function call altogether. matplotlib will then use the default AutoDateLocator function to find the optimum tick locations.
Alternatively if the default includes some months which you don't want then you can use matplotlib.dates.YearLocator which will force the ticks to be years only.
You can set the locator as shown below in a quick example:
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
import numpy as np
import datetime as dt
x = [dt.datetime.now() + dt.timedelta(days=i) for i in range(1000)]
y = range(len(x))
plt.plot(x, y)
locator = mdate.YearLocator()
plt.gca().xaxis.set_major_locator(locator)
plt.gcf().autofmt_xdate()
plt.show()
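To make the labels show the year only, a formatter can be paired with the locator. The sketch below assumes monthly 'YYYY-MM' strings like those in the question; the generated month list and the placeholder prices are made up for illustration:

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Stand-in for prices['Month']: '2001-01' ... '2015-12'
month_strings = [f'{year}-{month:02d}'
                 for year in range(2001, 2016) for month in range(1, 13)]
ts1 = [dt.datetime.strptime(s, '%Y-%m') for s in month_strings]
sale = list(range(len(ts1)))  # placeholder values, not real prices

fig, ax = plt.subplots(figsize=(13, 9))
ax.plot(ts1, sale, 'r-')
# One tick per year, placed on 1 January ...
ax.xaxis.set_major_locator(mdates.YearLocator())
# ... labelled with the four-digit year only
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y'))
fig.autofmt_xdate()

fig.canvas.draw()
tick_dates = [mdates.num2date(t) for t in ax.get_xticks()]
```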
You can do this with plt.xticks.
As an example, here I have set the xticks frequency to display every three indices. In your case, you would probably want to do so every twelve indices.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10)
y = np.random.randn(10)
plt.plot(x,y)
plt.xticks(np.arange(min(x), max(x)+1, 3))
plt.show()
In your case, since you are using dates, you can replace the argument of the second to last line above with something like ts1[0::12], which will select every 12th element from ts1 or np.arange(0, len(dates), 12) which will select every 12th index corresponding to the ticks you want to show.
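The slicing idea can be sketched as follows, assuming three years of monthly dates (the data is invented for illustration, and set_xticks / set_xticklabels are used in place of plt.xticks):

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# 36 consecutive months starting January 2001
ts1 = [dt.datetime(2001 + i // 12, 1 + i % 12, 1) for i in range(36)]
prices = list(range(36))  # placeholder values

fig, ax = plt.subplots()
ax.plot(ts1, prices)

# Every 12th entry is the January of each year
yearly = ts1[::12]
ax.set_xticks(mdates.date2num(yearly))
ax.set_xticklabels([str(d.year) for d in yearly])

fig.canvas.draw()
year_labels = [label.get_text() for label in ax.get_xticklabels()]
```

Unlike a locator, these ticks are fixed, so they will not adapt if the plot is zoomed.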
