I would like to add the corresponding weekday names (Mon, Tues, etc.) to the xlabels for a pandas timeseries plot.
import pandas as pd
import numpy as np
import pylab as p
import datetime
dates = pd.date_range(datetime.datetime.today().date(), periods=10, freq='D')
data = pd.DataFrame(np.arange(10),index=dates,columns=['A'])
a = data['A'].plot()
p.tight_layout()
p.show()
I have tried adjusting the formatting using:
from matplotlib.dates import DateFormatter
formatter = DateFormatter('%a %d-%m-%Y')
a.xaxis.set_major_formatter(formatter)
But this does not work, leading to incorrect day and year.
It seems there should be a very simple solution, but I cannot find it.
Here's what I thought would work but didn't:
from matplotlib.ticker import FuncFormatter
from matplotlib import pyplot as plt
ax = data.A.plot()
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d.strftime('%a')))
or
ax = plt.subplot()
ax.plot(data.index, data.A)
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d.strftime('%a')))
These both go wrong in different ways. It seems the formatter inputs turn out to be floats rather than dates in both cases. In the first the function only gets applied to the first and last ticks. You can see this by passing
ax.xaxis.set_major_formatter(FuncFormatter(lambda d, _: d))
Here's a solution which is pretty flexible:
ax = plt.subplot()
ax.plot(data.index, data.A)
ticks = ax.set_xticklabels([d.strftime('%a') for d in data.index])
You can swap the list comprehension in the last line for whatever you like.
EDIT:
I think I've figured out what these numbers representing the xticks mean.
In [37]:
ax = plt.subplot()
ax.plot(data.index, data.A)
print(ax.get_xticks())
[ 735824. 735825. 735826. 735827. 735828. 735829. 735830. 735831.
735832. 735833.]
These seem to represent the number of days since the start of 1 AD: According to this: http://www.epochconverter.com/epoch/seconds-days-since-year-0.php
"There are 736189 days between 0000-00-00 and today (Aug 14, 2015)."
Which is exactly 735824 (the first tick) + 365. So far so bad.
You could (I won't bother) write a function to convert this number and ones like it into dates. Another approach would be:
def get_day(tick):
    # offset from the first tick, in days, applied to the first date
    date = dates[0] + datetime.timedelta(tick - ticks[0])
    return date.strftime('%a')
ax = plt.subplot()
ax.plot(data.index, data.A)
ticks = ax.get_xticks()
ax.xaxis.set_major_formatter(FuncFormatter(lambda tick, _: get_day(tick)))
Again, you can sub the date format you want into get_day. Not sure if this will solve the panning/zooming problem but at least it gives a way of setting the tick labels using a function.
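For what it's worth, here is a minimal sketch of the conversion function mentioned above. It assumes the data is plotted through matplotlib directly (ax.plot) rather than through Series.plot(), which may use its own tick units. matplotlib.dates.num2date turns the float tick value back into a datetime. The helper name weekday_label, the fixed start date, and the Agg backend are my own choices for illustration.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Ten consecutive days starting on a fixed date (chosen only for illustration)
start = datetime.datetime(2015, 8, 14)
days = [start + datetime.timedelta(days=i) for i in range(10)]
values = np.arange(10)

fig, ax = plt.subplots()
ax.plot(days, values)  # plotted via matplotlib, not Series.plot()


def weekday_label(tick, pos=None):
    # tick is a float in matplotlib's date units; num2date gives a datetime
    return mdates.num2date(tick).strftime("%a")


ax.xaxis.set_major_locator(mdates.DayLocator())  # one tick per day
ax.xaxis.set_major_formatter(FuncFormatter(weekday_label))
fig.canvas.draw()
```

Because the label is computed from the tick value itself, it should stay correct when panning or zooming, though I have only tried this on simple plots.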
Related
I encounter an issue with Matplotlib.dates.DateFormatter :
I want to display timestamps as formatted dates. That is usually simple with strftime, but with plain strings I lose the dynamic position readout on my graph, so I used md.DateFormatter('%H:%M:%S.%f') to keep the X values as dates together with the dynamic index.
The problem is that my dates have too many digits: I don't want the last three digits of the microseconds, but I don't know how to remove them. I searched StackOverflow for a solution, but applying date[:-3] won't work because I have a datetime, not a string...
Do you have a solution? It's maybe trivial but can't find any solution right now...
Thanks in advance.
NB : What I call the dynamic index is when you are on the graph and you can see the exact X and Y value of your pointer at the bottom
Here is an applicable example :
df =
timestamp val
0 2022-03-13 03:19:59.999070 X1
1 2022-03-13 03:20:00.004070 X2
2 2022-03-13 03:20:00.009070 X3
3 2022-03-13 03:20:00.014070 X4
And I try to plot this with :
ax=plt.gca()
xfmt = md.DateFormatter('%H:%M:%S.%f')
ax.xaxis.set_major_formatter(xfmt)
plt.plot(df.timestamp, df.val, linestyle="-", marker = ".")
plt.setp(ax.get_xticklabels(), rotation=40)
plt.show()
In conclusion, what I want is to remove the trailing 070 in the graph, but if I remove it beforehand, DateFormatter replaces it with 000, which is just as useless.
If you want to change both the tick labels and the format of the number shown on the interactive status bar, you could define your own function to deliver your desired format, then use a FuncFormatter to display those values on your plot.
For example:
import matplotlib.pyplot as plt
import matplotlib.dates as md
import pandas as pd
# dummy data
ts = pd.date_range("2022-03-13 03:19:59.999070",
                   "2022-03-13 03:20:00.014070", periods=4)
df = pd.DataFrame({'timestamp': ts, 'val':[0, 1, 2, 3]})
fig, ax = plt.subplots()
# define our own function to drop the last three characters
xfmt = lambda x, pos: md.DateFormatter('%H:%M:%S.%f')(x)[:-3]
# use that function as the major formatter, using FuncFormatter
ax.xaxis.set_major_formatter(plt.FuncFormatter(xfmt))
plt.setp(ax.get_xticklabels(), rotation=40)
ax.plot(df.timestamp, df.val, linestyle="-", marker = ".")
plt.tight_layout()
plt.show()
Note the matching tick format and status bar format.
If, however, you do not want to change the tick labels, but only the value on the status bar, you can do that by reassigning the ax.format_coord function, using a similar idea to the function defined above, but also adding the y value for display.
For example:
import matplotlib.pyplot as plt
import matplotlib.dates as md
import pandas as pd
# dummy data
ts = pd.date_range("2022-03-13 03:19:59.999070",
                   "2022-03-13 03:20:00.014070", periods=4)
df = pd.DataFrame({'timestamp': ts, 'val':[0, 1, 2, 3]})
fig, ax = plt.subplots()
xfmt = md.DateFormatter('%H:%M:%S.%f')
xfmt2 = lambda x, y: "x={}, y={:g}".format(xfmt(x)[:-3], y)
# use original formatter here with microseconds
ax.xaxis.set_major_formatter(plt.FuncFormatter(xfmt))
# and the millisecond function here
ax.format_coord = xfmt2
plt.setp(ax.get_xticklabels(), rotation=40)
ax.plot(df.timestamp, df.val, linestyle="-", marker = ".")
plt.tight_layout()
plt.show()
Note the difference between the status bar and the tick formats here.
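To see why slicing the formatted string works where date[:-3] on the datetime itself did not, here is a small sketch without any plotting. The variable names are mine, and the timestamp is the first row of the example data. The isoformat call is an alternative from the standard library that should give the same text for a time of day.

```python
import datetime

# First timestamp from the example data
t = datetime.datetime(2022, 3, 13, 3, 19, 59, 999070)

# %f always yields six digits (microseconds), so dropping the last three
# characters of the *string* leaves milliseconds
full = t.strftime('%H:%M:%S.%f')
millis = full[:-3]

# Standard-library alternative: truncate to milliseconds directly
millis_iso = t.time().isoformat(timespec='milliseconds')

print(full, millis, millis_iso)
```

The same slicing is what the lambda in the answer applies to the output of DateFormatter.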
I have time-series plots (over 1 year) where the months on the x-axis are of the form Jan, Feb, Mar, etc, but I would like to have just the first letter of the month instead (J,F,M, etc). I set the tick marks using
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_minor_locator(MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.ticker.NullFormatter())
ax.xaxis.set_minor_formatter(matplotlib.dates.DateFormatter('%b'))
Any help would be appreciated.
The following snippet based on the official example here works for me.
This uses a function-based index formatter in order to return only the first letter of the month, as requested.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab
import matplotlib.cbook as cbook
import matplotlib.ticker as ticker
datafile = cbook.get_sample_data('aapl.csv', asfileobj=False)
print('loading', datafile)
# note: mlab.csv2rec was removed in Matplotlib 3.1, so this line needs an older release
r = mlab.csv2rec(datafile)
r.sort()
r = r[-365:] # get the last year
# next we'll write a custom formatter
N = len(r)
ind = np.arange(N) # the evenly spaced plot indices
def format_date(x, pos=None):
    # map the tick position to the nearest valid row index
    thisind = np.clip(int(x+0.5), 0, N-1)
    return r.date[thisind].strftime('%b')[0]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(ind, r.adj_close, 'o-')
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
fig.autofmt_xdate()
plt.show()
I tried to make the solution suggested by @Appleman1234 work, but since I, myself, wanted to create a solution that I could save in an external configuration script and import in other programs, I found it inconvenient that the formatter had to have variables defined outside of the formatter function itself.
I did not solve this but I just wanted to share my slightly shorter solution here so that you and maybe others can take it or leave it.
It turned out to be a little tricky to get the labels in the first place, since you need to draw the axes, before the tick labels are set. Otherwise you just get empty strings, when you use Text.get_text().
You may want to get rid of the argument minor=True, which was specific to my case.
# ...
# Manipulate tick labels
plt.draw()
ax.set_xticklabels(
    [t.get_text()[0] for t in ax.get_xticklabels(minor=True)], minor=True
)
I hope it helps:)
The original answer uses the index of the dates. This is not necessary. One can instead get the month names from the DateFormatter('%b') and use a FuncFormatter to use only the first letter of the month.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from matplotlib.dates import MonthLocator, DateFormatter
x = np.arange("2019-01-01", "2019-12-31", dtype=np.datetime64)
y = np.random.rand(len(x))
fig, ax = plt.subplots()
ax.plot(x,y)
month_fmt = DateFormatter('%b')
def m_fmt(x, pos=None):
    # keep only the first letter of the abbreviated month name
    return month_fmt(x)[0]
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_major_formatter(FuncFormatter(m_fmt))
plt.show()
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
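If pandas is not an option, one possible sketch is to keep real dates on the x-axis and turn the time of day into a plain number of hours, so both axes are numeric and to scale. This is my own variation and not the approach above: the fractional-hours conversion, the three-hour tick spacing, and the Agg backend are assumptions made for illustration.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter

data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
        '2018-01-03 15:30:27', '2018-01-04 11:55:09']
stamps = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in data]

# x: calendar day as a real date, so a gap such as Jan 2 stays on the axis
x = [t.date() for t in stamps]
# y: time of day as fractional hours, a plain number matplotlib can scale
y = [t.hour + t.minute / 60 + t.second / 3600 for t in stamps]

fig, ax = plt.subplots()
ax.plot(x, y, '.')

# midnight to midnight on the y-axis, labelled as HH:00
ax.set_ylim(0, 24)
ax.set_yticks(range(0, 25, 3))
ax.yaxis.set_major_formatter(
    FuncFormatter(lambda h, pos: '{:02d}:00'.format(int(h))))

# one tick per day on the x-axis
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
fig.canvas.draw()
```

Since the y values are numbers rather than strings, set_ylim behaves as expected and the points do not disappear.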
I'm basically trying to plot a graph where the x axis represents the month of the year. The data is stored in a numpy.array, with dimensions k x months. Here is a minimal example (my data is not this crazy):
import numpy
import matplotlib
import matplotlib.pyplot as plt
cmap = plt.get_cmap('Set3')
data = numpy.random.rand(18,12)
# one color per row of data
colors = [cmap(i) for i in numpy.linspace(0, 1, data.shape[0])]
y = range(data.shape[1])
plt.figure(figsize=(15, 7), dpi=200)
for i in range(data.shape[0]):
    plt.plot(y, data[i,:], color=colors[i], linewidth=5)
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.xticks(numpy.arange(0, 12, 1))
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Complaints')
plt.title('Number of Complaints per Hour in 2015')
I'd like to have the xticks as strings instead of numbers. I'm wondering if I have to create a list of strings, manually, or if there is another way to translate the numbers to months. I have to do the same for weekdays, for example.
I've been looking to these examples:
http://matplotlib.org/examples/pylab_examples/finance_demo.html
http://matplotlib.org/examples/pylab_examples/date_demo2.html
But I'm not using datetime.
Although this answer works well, in this case you can avoid defining your own FuncFormatter by using the pre-defined date formatters from matplotlib, that is, matplotlib.dates rather than matplotlib.ticker:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
# Define time range with 12 different months:
# `MS` stands for month start frequency
x_data = pd.date_range('2018-01-01', periods=12, freq='MS')
# Check what these dates look like:
print(x_data)
y_data = np.random.rand(12)
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
# Make ticks on occurrences of each month:
ax.xaxis.set_major_locator(mdates.MonthLocator())
# Get only the month to show in the x-axis:
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
# '%b' means month as locale’s abbreviated name
plt.show()
Obtaining:
DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
'2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
'2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01'],
dtype='datetime64[ns]', freq='MS')
An alternative is the plot_date method, which you might want to use instead of the more general plot method if your independent variable is datetime-like:
import datetime
data = np.random.rand(24)
#a list of time: 00:00:00 to 23:00:00
times = [datetime.datetime.strptime(str(i), '%H') for i in range(24)]
#fmt is the plot format string; 'H' draws hexagon markers
#plot_date treats x as dates, so the x ticks are labelled as times automatically
plt.plot_date(times, data, fmt='H')
plt.setp(plt.gca().xaxis.get_majorticklabels(),
'rotation', 90)
The benefit is that you can now easily control the density of xticks. To have a tick every hour, insert these lines after plot_date:
##import it if not already imported
#import matplotlib.dates as mdates
plt.gca().xaxis.set_major_locator(mdates.HourLocator())
You can still use formatters to format your results in the way you want. For example, to have month names printed, let us first define a function taking an integer to a month abbreviation:
def getMonthName(month_number):
    # any date with the right month will do; only the month is formatted
    testdate=datetime.date(2010,int(month_number),1)
    return testdate.strftime('%b')
Here, I have created an arbitrary date with the correct month and returned that month's abbreviated name. Check the datetime documentation for available format codes if needed. Whether that is always easier than just writing out a list by hand is another question. Now let us plot some monthly test data:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
x_data=np.arange(1,12.5,1)
y_data=x_data**2 # Just some arbitrary data
plt.plot(x_data,y_data)
plt.gca().xaxis.set_major_locator(mtick.FixedLocator(x_data)) # Set tick locations
plt.gca().xaxis.set_major_formatter(mtick.FuncFormatter(lambda x,p:getMonthName(x)))
plt.show()
The message here is that you can use matplotlib.ticker.FuncFormatter to use any function to obtain a tick label. The function takes two arguments (value and position) and returns a string.
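If the x values are plain integers rather than dates, as in the question, a shorter route might be the standard library's calendar module, which already holds the abbreviated month and weekday names. A minimal sketch, assuming month positions 0 to 11 and random data; the names month_labels and weekday_labels are my own, and the Agg backend is only there so it runs without a display.

```python
import calendar

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(3, 12)   # 3 series, 12 months
months = np.arange(12)         # integer positions 0..11

fig, ax = plt.subplots()
for row in data:
    ax.plot(months, row)

# calendar.month_abbr[0] is an empty string, so skip it
month_labels = list(calendar.month_abbr)[1:]
ax.set_xticks(months)
ax.set_xticklabels(month_labels)

# the same idea works for weekdays (Monday first)
weekday_labels = list(calendar.day_abbr)
fig.canvas.draw()
```

The names follow the current locale, so they may not be English on every machine.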
Below I have the following script which creates a simple time series plot:
%matplotlib inline
import datetime
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
df = []
start_date = datetime.datetime(2015, 7, 1)
for i in range(10):
    for j in [1,2]:
        unit = 'Ones' if j == 1 else 'Twos'
        date = start_date + datetime.timedelta(days=i)
        df.append({
            'Date': date.strftime('%Y%m%d'),
            'Value': i * j,
            'Unit': unit
        })
df = pd.DataFrame(df)
sns.tsplot(df, time='Date', value='Value', unit='Unit', ax=ax)
fig.autofmt_xdate()
And the result of this is the following:
As you can see the x-axis has strange numbers for the datetimes, and not the usual "nice" representations that come with matplotlib and other plotting utilities. I've tried many things, re-formatting the data but it never comes out clean. Anyone know a way around?
Matplotlib represents dates as floating point numbers (in days), so unless you (or pandas or seaborn) tell it that your values represent dates, it will not format the ticks as dates. I'm not a seaborn expert, but it looks like it (or pandas) does convert the datetime objects to matplotlib dates, but then does not assign proper locators and formatters to the axes. This is why you get these strange numbers, which are in fact just the days since 0001.01.01. So you'll have to take care of the ticks manually (which, in most cases, is better anyway, as it gives you more control).
So you'll have to assign a date locator, which decides where to put ticks, and a date formatter, which will then format the strings for the tick labels.
import datetime
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# build up the data
df = []
start_date = datetime.datetime(2015, 7, 1)
for i in range(10):
    for j in [1,2]:
        unit = 'Ones' if j == 1 else 'Twos'
        date = start_date + datetime.timedelta(days=i)
        # I believe it makes more sense to directly convert the datetime to a
        # "matplotlib"-date (float), instead of creating strings and then let
        # pandas parse the string again
        df.append({
            'Date': mdates.date2num(date),
            'Value': i * j,
            'Unit': unit
        })
df = pd.DataFrame(df)
# build the figure
fig, ax = plt.subplots()
sns.tsplot(df, time='Date', value='Value', unit='Unit', ax=ax)
# assign locator and formatter for the xaxis ticks.
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y.%m.%d'))
# put the labels at 45deg since they tend to be too long
fig.autofmt_xdate()
plt.show()
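To make the "dates are floats" point concrete, here is a small sketch of the round trip through matplotlib.dates.date2num and num2date. The variable names are mine. As far as I know, the reference day depends on the Matplotlib version: releases from 3.3 on count from 1970-01-01 by default, while older ones counted from 0001-01-01, which is where values like 735824 come from. The sketch therefore avoids hard-coding the number itself.

```python
import datetime

import matplotlib.dates as mdates

d = datetime.datetime(2015, 7, 1)

# datetime -> float number of days since matplotlib's epoch
num = mdates.date2num(d)
next_day_num = mdates.date2num(d + datetime.timedelta(days=1))

# float -> timezone-aware datetime (UTC by default)
back = mdates.num2date(num)

print(num, next_day_num - num, back)
```

One day is exactly one unit on the axis, which is why a DayLocator or AutoDateLocator can place sensible ticks once it is assigned.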
For me, @hitzg's answer results in "OverflowError: signed integer is greater than maximum" in the depths of DateFormatter.
Looking at my dataframe, my indices are datetime64, not datetime. Pandas converts these nicely though. The following works great for me:
import matplotlib as mpl
def myFormatter(x, pos):
    return pd.to_datetime(x)
[ . . . ]
ax.xaxis.set_major_formatter(mpl.ticker.FuncFormatter(myFormatter))
Here is a potentially inelegant solution, but it's the only one I have ... Hope it helps!
g = sns.pointplot(x, y, data=df, ci=False);
unique_dates = sorted(list(df['Date'].drop_duplicates()))
date_ticks = range(0, len(unique_dates), 5)
g.set_xticks(date_ticks);
g.set_xticklabels([unique_dates[i].strftime('%d %b') for i in date_ticks], rotation='vertical');
g.set_xlabel('Date');
Let me know if you see any issues!
def myFormatter(x, pos):
    return pd.to_datetime(x).strftime('%Y%m%d')
ax.xaxis.set_major_formatter(mpl.ticker.FuncFormatter(myFormatter))