I'm having trouble limiting the number of dates on the x-axis to make them legible. I need to plot the word length vs the year but the number of years is too large for the plot size.
Any help is appreciated.
As mentioned in the comments, use datetime (if your dates are in string format, you can easily convert them to datetime). Once you do that it should automatically display years along the x-axis. If you need to change the frequency of ticks to every year (or anything else), you can use mdates, like so:
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import datetime
import math
start = datetime.datetime.strptime("01-01-2000", "%d-%m-%Y")
end = datetime.datetime.strptime("10-04-2019", "%d-%m-%Y")
x = [start + datetime.timedelta(days=x) for x in range(0, (end-start).days)]
y = [math.sqrt(x) for x in range(len(x))]
fig, ax = plt.subplots()
ax.plot(x, y)
ax.xaxis.set_major_locator(mdates.YearLocator())
fig.autofmt_xdate()
plt.show()
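If one tick per year is still too crowded, a coarser step is one option. The following is only a sketch under assumptions: the sample dates and values are made up, the 10-year step is arbitrary, and the Agg backend is selected just so the code runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical input: one date string per year, converted to datetime
date_strings = [f"01-01-{year}" for year in range(1950, 2020)]
dates = [dt.datetime.strptime(s, "%d-%m-%Y") for s in date_strings]
values = list(range(len(dates)))

fig, ax = plt.subplots()
ax.plot(dates, values)

# Place a major tick on every 10th year and label it with the year only
ax.xaxis.set_major_locator(mdates.YearLocator(base=10))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y"))
fig.autofmt_xdate()
fig.canvas.draw()

# Years at which the locator placed ticks
tick_years = [mdates.num2date(t).year for t in ax.get_xticks()]
```

In interactive use, replace the Agg line and the explicit draw with plt.show() or fig.savefig(...).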
I have time-series plots (over 1 year) where the months on the x-axis are of the form Jan, Feb, Mar, etc, but I would like to have just the first letter of the month instead (J,F,M, etc). I set the tick marks using
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_minor_locator(MonthLocator())
ax.xaxis.set_major_formatter(matplotlib.ticker.NullFormatter())
ax.xaxis.set_minor_formatter(matplotlib.dates.DateFormatter('%b'))
Any help would be appreciated.
The following snippet, based on the official Matplotlib example, works for me.
It uses a function-based index formatter to return only the first letter of the month, as requested.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.mlab as mlab  # note: mlab.csv2rec exists only in older Matplotlib releases
import matplotlib.cbook as cbook
import matplotlib.ticker as ticker
datafile = cbook.get_sample_data('aapl.csv', asfileobj=False)
print('loading', datafile)
r = mlab.csv2rec(datafile)
r.sort()
r = r[-365:]  # get the last year
# next we'll write a custom formatter
N = len(r)
ind = np.arange(N)  # the evenly spaced plot indices
def format_date(x, pos=None):
    # map the tick position to the nearest valid row index
    thisind = np.clip(int(x + 0.5), 0, N - 1)
    # keep only the first letter of the abbreviated month name
    return r.date[thisind].strftime('%b')[0]
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(ind, r.adj_close, 'o-')
ax.xaxis.set_major_formatter(ticker.FuncFormatter(format_date))
fig.autofmt_xdate()
plt.show()
I tried to make the solution suggested by @Appleman1234 work, but since I wanted a solution that I could save in an external configuration script and import in other programs, I found it inconvenient that the formatter depends on variables defined outside the formatter function itself.
I did not solve this, but I wanted to share my slightly shorter solution here so that you and maybe others can take it or leave it.
It turned out to be a little tricky to get the labels in the first place, since you need to draw the axes before the tick labels are set. Otherwise you just get empty strings when you call Text.get_text().
You may want to drop the argument minor=True, which was specific to my case.
# ...
# Manipulate tick labels
plt.draw()
ax.set_xticklabels(
    [t.get_text()[0] for t in ax.get_xticklabels(minor=True)], minor=True
)
I hope it helps:)
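For the import-from-a-configuration-script use case mentioned above, one possible sketch is a formatter that needs no variables outside the function. It assumes the axis holds Matplotlib date numbers (which is the case when datetime values are plotted); the names first_letter_of_month and month_initial_formatter are made up for illustration.

```python
import datetime as dt

import matplotlib.dates as mdates
from matplotlib.ticker import FuncFormatter


def first_letter_of_month(x, pos=None):
    """Return the first letter of the month for a Matplotlib date number."""
    # num2date converts the axis value back to a datetime
    return mdates.num2date(x).strftime("%b")[0]


# Ready-made formatter object that another script could import
month_initial_formatter = FuncFormatter(first_letter_of_month)

# Quick check on a single date; the letter depends on the active locale
label = first_letter_of_month(mdates.date2num(dt.datetime(2019, 3, 15)))
```

It would be attached with ax.xaxis.set_major_formatter(month_initial_formatter), or set_minor_formatter for minor ticks.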
The original answer uses the index of the dates. This is not necessary. One can instead get the month names from the DateFormatter('%b') and use a FuncFormatter to use only the first letter of the month.
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from matplotlib.dates import MonthLocator, DateFormatter
x = np.arange("2019-01-01", "2019-12-31", dtype=np.datetime64)
y = np.random.rand(len(x))
fig, ax = plt.subplots()
ax.plot(x,y)
month_fmt = DateFormatter('%b')
def m_fmt(x, pos=None):
    return month_fmt(x)[0]
ax.xaxis.set_major_locator(MonthLocator())
ax.xaxis.set_major_formatter(FuncFormatter(m_fmt))
plt.show()
This code plots the data exactly as I want with the dates on the x-axis and the times on the y-axis. However I want the y-axis to show every hour on the hour (e.g., 00, 01, ... 23) and the x-axis to show the beginning of every month at an angle so there's no overlap (the actual data being used spans over a year) and only once, since this code repeats the same months. How is this accomplished?
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-02-04 11:55:09']
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time, '.')
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
plt.show()
UPDATE: This fixes the x axis.
# Monthly intervals on x axis
months = mdates.MonthLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(d_fmt)
However, this attempt to fix the y axis just makes it blank.
# Hourly intervals on y axis
hours = mdates.HourLocator()
t_fmt = mdates.DateFormatter('%H')
ax.yaxis.set_major_locator(hours)
ax.yaxis.set_major_formatter(t_fmt)
I'm reading these docs but not understanding my error: https://matplotlib.org/api/dates_api.html, https://matplotlib.org/api/ticker_api.html
Matplotlib cannot plot times without a corresponding date. This makes it necessary to add some arbitrary date (below, 1 January 2018) to the times. One may use datetime.datetime.combine for that purpose (with datetime imported as dt).
timetodatetime = lambda x:dt.datetime.combine(dt.date(2018, 1, 1), x)
time = list(map(timetodatetime, data.time))
ax.plot(data.date, time, '.')
Then the code from the question using HourLocator() works fine. Finally, setting the limits on the axes also requires datetime objects:
ax.set_ylim([dt.datetime(2018,1,1,0), dt.datetime(2018,1,2,0)])
Complete example:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27',
        '2018-02-04 11:55:09']
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
timetodatetime = lambda x:dt.datetime.combine(dt.date(2018, 1, 1), x)
time = list(map(timetodatetime, data.time))
ax.plot(data.date, time, '.')
# Monthly intervals on x axis
months = mdates.MonthLocator()
d_fmt = mdates.DateFormatter('%Y-%m')
ax.xaxis.set_major_locator(months)
ax.xaxis.set_major_formatter(d_fmt)
## Hourly intervals on y axis
hours = mdates.HourLocator()
t_fmt = mdates.DateFormatter('%H')
ax.yaxis.set_major_locator(hours)
ax.yaxis.set_major_formatter(t_fmt)
ax.set_ylim([dt.datetime(2018,1,1,0), dt.datetime(2018,1,2,0)])
plt.show()
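A different approach from the datetime.combine one above is to plot the time of day as a plain number of hours, which avoids date handling on the y-axis altogether. This is only a sketch: the variable names are illustrative, the 45-degree rotation is an arbitrary choice, and the Agg backend is an assumption so that the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

stamps = pd.to_datetime(['2018-01-01 09:28:52', '2018-01-03 13:02:44',
                         '2018-01-03 15:30:27', '2018-02-04 11:55:09'])

# Time of day expressed as fractional hours (0 <= value < 24)
hours_of_day = (stamps.hour + stamps.minute / 60 + stamps.second / 3600).to_numpy()

fig, ax = plt.subplots()
ax.plot(stamps.date, hours_of_day, '.')

# One tick per month on the x-axis, rotated to avoid overlap
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
fig.autofmt_xdate(rotation=45)

# One tick per hour on the y-axis
ax.set_yticks(list(range(0, 25)))
ax.set_ylim(0, 24)
fig.canvas.draw()
```

The trade-off is that the y tick labels are plain integers (0 to 24) rather than formatted times.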
I'm trying to plot a graph where the x-axis represents the month of the year. The data is stored in a numpy.array with dimensions k x months. Here is a minimal example (my data is not this crazy):
import numpy
import matplotlib
import matplotlib.pyplot as plt
data = numpy.random.rand(18,12)
cmap = plt.get_cmap('Set3')
# one color per row of data
colors = [cmap(i) for i in numpy.linspace(0, 1, data.shape[0])]
y = range(data.shape[1])
plt.figure(figsize=(15, 7), dpi=200)
for i in range(data.shape[0]):
    plt.plot(y, data[i,:], color=colors[i], linewidth=5)
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.xticks(numpy.arange(0, 12, 1))
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Complaints')
plt.title('Number of Complaints per Hour in 2015')
I'd like to have the xticks as strings instead of numbers. I'm wondering if I have to create a list of strings, manually, or if there is another way to translate the numbers to months. I have to do the same for weekdays, for example.
I've been looking to these examples:
http://matplotlib.org/examples/pylab_examples/finance_demo.html
http://matplotlib.org/examples/pylab_examples/date_demo2.html
But I'm not using datetime.
Although this answer works well, in this case you can avoid defining your own FuncFormatter by using the predefined date formatters in matplotlib.dates rather than matplotlib.ticker:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
# Define time range with 12 different months:
# `MS` stands for month start frequency
x_data = pd.date_range('2018-01-01', periods=12, freq='MS')
# Check what these dates look like:
print(x_data)
y_data = np.random.rand(12)
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
# Make ticks on occurrences of each month:
ax.xaxis.set_major_locator(mdates.MonthLocator())
# Get only the month to show in the x-axis:
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
# '%b' means month as locale’s abbreviated name
plt.show()
Obtaining:
DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
'2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
'2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01'],
dtype='datetime64[ns]', freq='MS')
An alternative is the plot_date method, which you might want to use instead of the more general plot method if your independent variable is datetime-like:
import datetime
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
data = np.random.rand(24)
# a list of times: 00:00:00 to 23:00:00
times = [datetime.datetime.strptime(str(i), '%H') for i in range(24)]
# fmt is the marker/line format string; 'H' draws hexagon markers
plt.plot_date(times, data, fmt='H')
# the tick label format comes from the formatter; '%H' shows only the hour
# (day, year, week, month, etc. are not shown)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H'))
plt.setp(plt.gca().xaxis.get_majorticklabels(),
         'rotation', 90)
The benefit is that you can now easily control the density of the x ticks. To have a tick every hour, insert these lines after plot_date:
##import it if not already imported
#import matplotlib.dates as mdates
plt.gca().xaxis.set_major_locator(mdates.HourLocator())
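To thin the ticks further, HourLocator accepts a byhour argument listing the hours that should get a tick. The sketch below assumes a tick every three hours is wanted and uses the general plot method, which also accepts datetime values; the data is random and the Agg backend is only there so the code runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np

# One sample per hour over a single (arbitrary) day
times = [dt.datetime(2018, 1, 1, hour) for hour in range(24)]
data = np.random.rand(24)

fig, ax = plt.subplots()
ax.plot(times, data, 'o-')

# Ticks only at 00, 03, 06, ... 21; labels show the hour
ax.xaxis.set_major_locator(mdates.HourLocator(byhour=range(0, 24, 3)))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H'))
fig.canvas.draw()

# Hours at which the locator placed ticks
tick_hours = [mdates.num2date(t).hour for t in ax.get_xticks()]
```

Listing the hours explicitly with byhour keeps the ticks aligned to fixed clock hours regardless of where the data starts.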
You can still use formatters to format your results in the way you want. For example, to have month names printed, let us first define a function taking an integer to a month abbreviation:
def getMonthName(month_number):
    testdate=datetime.date(2010,int(month_number),1)
    return testdate.strftime('%b')
Here, I have created an arbitrary date with the correct month and returned the abbreviated name of that month. Check the datetime documentation for the available format codes if needed. Whether that is easier than just writing the list by hand is another question. Now let us plot some monthly test data:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
x_data=np.arange(1,12.5,1)
y_data=x_data**2 # Just some arbitrary data
plt.plot(x_data,y_data)
plt.gca().xaxis.set_major_locator(mtick.FixedLocator(x_data)) # Set tick locations
plt.gca().xaxis.set_major_formatter(mtick.FuncFormatter(lambda x,p:getMonthName(x)))
plt.show()
The message here is that you can use matplotlib.ticker.FuncFormatter to use any function to obtain a tick label. The function takes two arguments (value and position) and returns a string.
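When the tick positions are plain integers and no datetime values are involved, the standard library's calendar module already provides the month and weekday names, so the list of strings does not have to be typed by hand. A minimal sketch, assuming random k x 12 data, the Agg backend for headless use, and locale-dependent names:

```python
import calendar

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

data = np.random.rand(3, 12)           # hypothetical k x months array
positions = np.arange(data.shape[1])   # 0..11, one position per month

# calendar.month_abbr[0] is an empty string, so skip it
month_labels = list(calendar.month_abbr)[1:]
# The same idea for weekdays (Monday first)
weekday_labels = list(calendar.day_abbr)

fig, ax = plt.subplots()
for row in data:
    ax.plot(positions, row)

# Put the month names at the integer positions
ax.set_xticks(positions)
ax.set_xticklabels(month_labels)
fig.canvas.draw()
```

For weekday data with seven columns, weekday_labels would be passed to set_xticklabels in the same way.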
My matplotlib pyplot has too many xticks - it is currently showing each year and month for a 15-year period, e.g. "2001-01", but I only want the x-axis to show the year (e.g. 2001).
The output will be a line graph where x-axis shows dates and the y-axis shows the sale and rent prices.
# Defining the variables
ts1 = prices['Month'] # eg. "2001-01" and so on
ts2 = prices['Sale']
ts3 = prices['Rent']
# Reading '2001-01' as year and month
ts1 = [dt.datetime.strptime(d,'%Y-%m').date() for d in ts1]
plt.figure(figsize=(13, 9))
# Below is where it goes wrong. I don't know how to set xticks to show each year.
plt.xticks(ts1, rotation='vertical')
plt.xlabel('Year')
plt.ylabel('Price')
plt.plot(ts1, ts2, 'r-', ts1, ts3, 'b.-')
plt.gcf().autofmt_xdate()
plt.show()
Try removing the plt.xticks call altogether. Matplotlib will then use the default AutoDateLocator to choose suitable tick locations.
Alternatively, if the default includes some months you don't want, you can use matplotlib.dates.YearLocator, which forces the ticks to fall on years only.
You can set the locator as shown below in a quick example:
import matplotlib.pyplot as plt
import matplotlib.dates as mdate
import numpy as np
import datetime as dt
x = [dt.datetime.utcnow() + dt.timedelta(days=i) for i in range(1000)]
y = range(len(x))
plt.plot(x, y)
locator = mdate.YearLocator()
plt.gca().xaxis.set_major_locator(locator)
plt.gcf().autofmt_xdate()
plt.show()
You can do this with plt.xticks.
As an example, here I have set the xticks frequency to display every three indices. In your case, you would probably want to do so every twelve indices.
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(10)
y = np.random.randn(10)
plt.plot(x,y)
plt.xticks(np.arange(min(x), max(x)+1, 3))
plt.show()
In your case, since you are plotting dates, you can replace the argument of plt.xticks in the second-to-last line above with something like ts1[0::12], which selects every 12th element of ts1. If you instead plot against integer positions, np.arange(0, len(ts1), 12) selects every 12th index, corresponding to the ticks you want to show.
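Applied to monthly dates, the slicing idea could look like the sketch below. The month strings and the price values are invented stand-ins for the prices data in the question, and the Agg backend is only an assumption so the code runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

# Hypothetical stand-in for prices['Month']: "2001-01" .. "2015-12"
months = [f"{year}-{month:02d}" for year in range(2001, 2016)
          for month in range(1, 13)]
ts1 = [dt.datetime.strptime(m, "%Y-%m").date() for m in months]
sale = list(range(len(ts1)))  # made-up values

plt.figure(figsize=(13, 9))
plt.plot(ts1, sale, 'r-')

# Every 12th date is a January, so this gives one tick per year
year_ticks = ts1[0::12]
year_labels = [d.strftime("%Y") for d in year_ticks]
plt.xticks(year_ticks, year_labels, rotation='vertical')
plt.xlabel('Year')
plt.ylabel('Price')
```

This only lines up with year boundaries if the series starts in January and has no missing months; otherwise a YearLocator is the safer choice.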
I have written code which plots the past seven day stock value for a user-determined stock market over time.
The problem I have is that I want to format the x axis in a YYMMDD format.
I also don't understand what 2.014041e7 means at the end of the x axis.
Values for x are:
20140421.0, 20140417.0, 20140416.0, 20140415.0, 20140414.0, 20140411.0, 20140410.0
Values for y are:
531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48
My code is as follows:
mini = min(y)
maxi = max(y)
minimum = mini - 75
maximum = maxi + 75
mini2 = int(min(x))
maxi2 = int(max(x))
plt.close('all')
fig, ax = plt.subplots(1)
pylab.ylim([minimum,maximum])
pylab.xlim([mini2,maxi2])
ax.plot(x, y)
ax.plot(x, y,'ro')
ax.plot(x, m*x + c)
ax.grid()
ax.plot()
When plotting your data using your method you are simply plotting your y data against numbers (floats) in x such as 20140421.0 (which I assume you wish to mean the date 21/04/2014).
You need to convert your data from these floats into an appropriate format for matplotlib to understand. The code below takes your two lists (x, y) and converts them.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
# Original data
raw_x = [20140421.0, 20140417.0, 20140416.0, 20140415.0, 20140414.0, 20140411.0, 20140410.0]
y = [531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48]
# Convert your x-data into an appropriate format.
# date_fmt is a string giving the correct format for your data. In this case
# we are using 'YYYYMMDD.0' as your dates are actually floats.
date_fmt = '%Y%m%d.0'
# Use a list comprehension to convert your dates into datetime objects.
# In the list comp. strptime is used to convert from a string to a datetime
# object.
dt_x = [dt.datetime.strptime(str(i), date_fmt) for i in raw_x]
# Finally we convert the datetime objects into the format used by matplotlib
# in plotting using matplotlib.dates.date2num
x = [mdates.date2num(i) for i in dt_x]
# Now to actually plot your data.
fig, ax = plt.subplots()
# Use plot_date rather than plot when dealing with time data.
ax.plot_date(x, y, 'bo-')
# Create a DateFormatter object which will format your tick labels properly.
# As given in your question I have chosen "YYMMDD"
date_formatter = mdates.DateFormatter('%y%m%d')
# Set the major tick formatter to use your date formatter.
ax.xaxis.set_major_formatter(date_formatter)
# This simply rotates the x-axis tick labels slightly so they fit nicely.
fig.autofmt_xdate()
plt.show()
The code is commented throughout, so it should be self-explanatory. Details on the modules used can be found in the documentation for datetime and matplotlib.dates.
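As for the 2.014041e7 at the end of the x-axis in the question: when values are plotted as plain floats, Matplotlib's default numeric formatter can factor a common offset or multiplier out of the tick labels and print it once at the end of the axis. If the floats were kept as they are, that text can be switched off with ticklabel_format. A small sketch using the question's data, with the Agg backend assumed for headless use:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

x = [20140421.0, 20140417.0, 20140416.0, 20140415.0,
     20140414.0, 20140411.0, 20140410.0]
y = [531.17, 524.94, 519.01, 517.96, 521.68, 519.61, 523.48]

fig, ax = plt.subplots()
ax.plot(x, y, 'ro')

# Show full tick values instead of an offset or scientific notation
ax.ticklabel_format(axis='x', useOffset=False, style='plain')
fig.canvas.draw()

# Text shown at the end of the x-axis after drawing
offset_text = ax.xaxis.get_offset_text().get_text()
```

This only changes the labelling; the points are still spaced as numbers rather than as dates, so converting to datetime as shown above remains the more robust fix.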