I am drawing matplotlib plots and my x axis consists of YYYYMM formatted strings of year and month like 201901 for January of 2019.
My problem is that some of the data spans a long period of time, which makes the x-axis tick labels so dense that they pile up on each other and become unreadable.
I tried making the font smaller and rotating the labels 90 degrees, which helped a lot, but it is still not enough for some of my data.
Here is an example of one of my x axes that looks OK:
And here is an example of an x axis that is too dense because the data spans a long period of time:
So I want matplotlib to skip printing some tick labels when they start piling up on each other. For example, print the label for January, skip the labels for February, March, April and May, print the label for June, skip the labels for July, August, and so on. But I don't know how to do this.
Or are there any other kind of solutions I can use to overcome this problem?
A quick and dirty solution would be the following:
ax.set_xticks(ax.get_xticks()[::2])
This would only display every second xtick. If you wanted to only display every n-th tick you would use
ax.set_xticks(ax.get_xticks()[::n])
If you don't have a handle on ax you can get one as ax = plt.gca().
Alternatively, you could specify the number of xticks to use with:
plt.locator_params(axis='x', nbins=10)
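As a minimal sketch of the slicing approach with YYYYMM strings: matplotlib places string x values at integer positions 0, 1, 2, ..., so keeping every n-th tick position keeps every n-th label. The sample labels, the values, and n = 6 below are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical sample data: 36 monthly labels in YYYYMM form
labels = [f"{year}{month:02d}" for year in (2017, 2018, 2019) for month in range(1, 13)]
values = list(range(len(labels)))

fig, ax = plt.subplots()
ax.plot(labels, values)  # string x values become categories at positions 0, 1, 2, ...

n = 6  # keep every 6th tick (assumed value; tune it to the data)
ax.set_xticks(ax.get_xticks()[::n])
ax.tick_params(axis="x", rotation=90)

kept_positions = list(ax.get_xticks())
```

The data itself is untouched; only the tick positions are thinned, so every point is still drawn.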
An alternate solution could be as below:
x = df['Date']
y = df['Value']
# Resize the figure (optional)
plt.figure(figsize=(20,5))
# Plot the x and y values on the graph
plt.plot(x, y)
# Here you specify the ticks you want to display
# You can also specify rotation for the tick labels in degrees or with keywords.
plt.xticks(x[::5], rotation='vertical')
# Add margins (padding) so that markers don't get clipped by the axes
plt.margins(0.2)
# Display the graph
plt.show()
Related
I am trying to plot three lines on one figure. I have data for three years for three sites, and I am simply trying to plot them with the same x axis and the same y axis. The first two lines span all three years of data, while the third dataset is usually more sparse. Using matplotlib's object-oriented axes interface, when I try to plot my third set of data, I get points at the end of the graph that are outside the range of my third set of data. My third dataset is structured as tuples of dates and values, such as:
data=
[('2019-07-15', 30.6),
('2019-07-16', 20.88),
('2019-07-17', 16.94),
('2019-07-18', 11.99),
('2019-07-19', 13.76),
('2019-07-20', 16.97),
('2019-07-21', 19.9),
('2019-07-22', 25.56),
('2019-07-23', 18.59),
...
('2020-08-11', 8.33),
('2020-08-12', 10.06),
('2020-08-13', 12.21),
('2020-08-15', 6.94),
('2020-08-16', 5.51),
('2020-08-17', 6.98),
('2020-08-18', 6.17)]
where the data ends in August 2020, yet the graph includes points at the end of 2020. This is happening with all my sites; the first two datasets stay the same (knowndf['DATE'] and knowndf['Value'] below).
Here is the problematic graph.
And here is what I have for the plotting:
fig, ax=plt.subplots(1,1,figsize=(15,12))
fig.tight_layout(pad=6)
ax.plot(knowndf['DATE'], knowndf['Value1'],'b',alpha=0.7)
ax.plot(knowndf['DATE'], knowndf['Value2'],color='red',alpha=0.7)
ax.plot(*zip(*data), 'g*', markersize=8) #when i plot this set of data i get nonexistent points
ax.tick_params(axis='x', rotation=45) #rotating for aesthetic
ax.set_xticks(ax.get_xticks()[::30]) #only want every 30th tick instead of every daily tick
I've tried ax.twinx(), but that gives me two y axes, which doesn't help since I want to use the same x axis and y axis for all three sites. I've tried not using the axes approach, but there are things that come with axes that I need for plotting. Please help!
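One possible cause, assuming the dates are stored as strings: matplotlib then treats each distinct string as a category and appends any category it has not seen before to the right end of the axis, regardless of chronological order. The sketch below converts both series to real datetimes before plotting. The small knowndf and data here are hypothetical stand-ins for the originals, not the poster's actual values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the real knowndf and data, with dates stored as strings
date_strings = pd.date_range("2019-01-01", "2020-12-31", freq="D").strftime("%Y-%m-%d")
knowndf = pd.DataFrame({"DATE": date_strings})
knowndf["Value1"] = range(len(knowndf))
data = [("2019-07-15", 30.6), ("2019-07-16", 20.88), ("2020-08-18", 6.17)]

# Convert the string dates to datetimes so the x axis is chronological
known_dates = pd.to_datetime(knowndf["DATE"])
dates, values = zip(*data)
dates = pd.to_datetime(list(dates))

fig, ax = plt.subplots(figsize=(15, 12))
ax.plot(known_dates, knowndf["Value1"], "b", alpha=0.7)
ax.plot(dates, values, "g*", markersize=8)
ax.tick_params(axis="x", rotation=45)
```

With a real date axis, matplotlib's default date locator already limits the number of ticks, so thinning them with ax.get_xticks()[::30] may no longer be needed.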
I am making a plot in which the x axis represents dates and the y axis represents total COVID cases. Because the dataset is large, there are many dates on the x axis, so the xtick labels overlap and I cannot clearly see the number of cases on a particular date. How can I make the graph clearer? Suggestions for any other way to make the graph more readable are also welcome.
My code and plot are below. Thanks.
Ensure your dates are dates, not strings.
Use matplotlib date formatters.
I've used data from the UK, as you did not provide a sample.
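import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# countries: DataFrame whose "date" column holds datetimes, not strings
x = countries["date"]
y = countries["total_cases"]
fig, ax = plt.subplots(figsize=(10, 6))
# Let matplotlib pick between 3 and 7 date ticks and label them compactly
locator = mdates.AutoDateLocator(minticks=3, maxticks=7)
formatter = mdates.ConciseDateFormatter(locator)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
ax.plot(x, y)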
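The first point, turning string dates into real dates, is not shown in the snippet above. A self-contained sketch of it follows, assuming the dates arrive as 'YYYY-MM-DD' strings. The 400 days of cumulative counts are invented purely to have something to plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Hypothetical data: 400 daily dates stored as strings, plus a running total
date_strings = pd.date_range("2020-03-01", periods=400, freq="D").strftime("%Y-%m-%d")
total_cases = pd.Series(range(400)).cumsum()

x = pd.to_datetime(date_strings)  # strings -> datetimes
y = total_cases

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, y)

# Same locator and formatter as in the answer
locator = mdates.AutoDateLocator(minticks=3, maxticks=7)
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(mdates.ConciseDateFormatter(locator))

n_ticks = len(ax.get_xticks())
```

Here n_ticks ends up far smaller than the 400 data points, which is what keeps the labels from overlapping.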
In this plot, matplotlib automatically hides the first and last axis labels. However, the desired behavior is to always show both the first and last axis labels, each at a 'nice' value. By 'nice' value, I mean that a major tick should be present at both boundaries of the axis. For example, in the figure shown below, the x-axis would have started from -0.1 and ended at 1.5. Similarly, the y-axis would have started from -0.25 and ended at 2.00. How can this be achieved in Matplotlib?
Thanks in advance for your help.
I solved this problem by first letting matplotlib find the ideal tick locations and then setting the axis limits such that one additional tick is added on both the edges of the axis.
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoLocator
plt.figure()
plt.plot(xdata, data)
loc, labels = plt.xticks()  # returns the current tick locations and labels
# Extend the limits by one tick spacing on each side
min_new = loc[0] - (loc[1] - loc[0])
max_new = loc[-1] + (loc[1] - loc[0])
plt.xlim(left=min_new, right=max_new)
plt.gca().xaxis.set_major_locator(AutoLocator())
plt.gca().yaxis.set_major_locator(AutoLocator())
plt.show()
Edit:
The same could also be achieved by extending the axis limits to the hidden ticks at the edges, i.e.:
loc, labels = plt.xticks()
plt.xlim(left=loc[0], right=loc[-1])
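A self-contained sketch of this second variant, applied to both axes, might look as follows. The data is hypothetical, and the sketch relies on the default locator returning tick locations that reach at least as far as the current view limits on both sides.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data whose range does not start or end on a round number
xdata = np.linspace(0.03, 1.37, 50)
ydata = np.sin(4 * xdata)

fig, ax = plt.subplots()
ax.plot(xdata, ydata)

# Tick locations chosen by the default locator; the outermost ones
# may lie just outside the current view limits and so are not drawn
xloc = ax.get_xticks()
yloc = ax.get_yticks()

# Stretch the limits so the outermost tick locations become the axis boundaries
ax.set_xlim(xloc[0], xloc[-1])
ax.set_ylim(yloc[0], yloc[-1])
```

Because the locator re-runs after the limits change, it is worth checking the result visually; the tick spacing can shift if the new range is much wider than the old one.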
I have a dataframe like this:
data_ = list(range(106))
index_ = pd.period_range('3/1/2004', '12/1/2012', freq='M')
df2_ = pd.DataFrame(data = data_, index = index_, columns = ['data'])
I want to plot this dataframe. Currently, I am using:
df2_.plot()
Now I would like to control the labels (and possibly the ticks) on the x axis. In particular, I would like monthly ticks on the axis and possibly a label every other month, or quarterly labels. I would also like vertical grid lines.
I started looking at this example but I am already failing at constructing the timedelta.
With regard to constructing the timedelta, datetime.timedelta() doesn’t have a parameter to specify months, so it’s probably more convenient to stick to pd.date_range(). However, I found that objects of type pandas.tslib.Timestamp don’t play nicely with matplotlib ticks, so you could convert them to datetime.date objects like so:
index_ = [pd.to_datetime(date, format='%Y-%m-%d').date()
          for date in pd.date_range('2004-03-01', '2012-12-01', freq="M")]
It’s possible to add gridlines and customise axes labels by first defining a matplotlib axes object, and then passing this to DataFrame.plot()
import matplotlib.pyplot as plt
import matplotlib.dates as dates
ax = plt.axes()
df2_.plot(ax=ax)
Now you can add vertical gridlines to your plot
ax.xaxis.grid(True)
And specify quarterly xticks labels by using matplotlib.dates.MonthLocator and setting the interval to 3
ax.xaxis.set_major_locator(dates.MonthLocator(interval=3))
And finally, I found the ticks to be very crowded, so I formatted them to get a nicer fit
ax.xaxis.set_major_formatter(dates.DateFormatter('%b %y'))
labels = ax.get_xticklabels()
plt.setp(labels, rotation=85, fontsize=8)
To produce the following:
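Put together, the steps above can be sketched as one runnable script. Two things here are assumptions rather than part of the answer: the index uses month-start dates (freq='MS'), and the data is drawn with ax.plot() instead of DataFrame.plot(), since pandas may apply its own date handling to the axis when it does the plotting.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd

# Same span as in the question, as month-start timestamps (106 months)
index_ = pd.date_range("2004-03-01", "2012-12-01", freq="MS")
df2_ = pd.DataFrame({"data": range(len(index_))}, index=index_)

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(df2_.index, df2_["data"])  # plain matplotlib date axis

ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))   # one tick every 3 months
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %y"))   # e.g. month abbreviation + 2-digit year
ax.xaxis.grid(True)                                           # vertical grid lines
ax.tick_params(axis="x", labelrotation=85, labelsize=8)
```

Changing interval=3 to interval=2 would give a tick every other month instead.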
I want to plot a series of values against a date range in matplotlib. I changed the tick base parameter to 7, to get one tick at the beginning of every week (plticker.IndexLocator, base = 7). The problem is that the set_xticklabels function does not accept a base parameter. As a result, the second tick (representing day 8 on the beginning of week 2) is labelled with day 2 from my date range list, and not with day 8 as it should be (see picture).
How can I give set_xticklabels a base parameter?
Here is the code:
my_data = pd.read_csv("%r_filename_%s_%s_%d_%d.csv" % (num1, num2, num3, num4, num5), dayfirst=True)
my_data.plot(ax=ax1, color='r', lw=2.)
loc = plticker.IndexLocator(base=7, offset = 0) # this locator puts ticks at regular intervals
ax1.set_xticklabels(my_data.Date, rotation=45, rotation_mode='anchor', ha='right') # this defines the tick labels
ax1.xaxis.set_major_locator(loc)
Here is the plot:
Many thanks - your solution works perfectly. In case other people run into the same issue in the future: I have implemented the above-mentioned solution but also added some code so that the tick labels keep the desired rotation and align (with their left end) to the respective tick. It may not be Pythonic and may not be best practice, but it works:
x_fmt = mpl.ticker.IndexFormatter(x)
ax.set_xticklabels(my_data.Date, rotation=-45)
ax.tick_params(axis='x', pad=10)
ax.xaxis.set_major_formatter(x_fmt)
labels = my_data.Date
for tick in ax.xaxis.get_majorticklabels():
    tick.set_horizontalalignment("left")
The reason your ticklabels went bad is that setting manual ticklabels decouples the labels from your data. The proper approach is to use a Formatter according to your needs. Since you have a list of ticklabels, one for each data point, you can use an IndexFormatter. It seems to be undocumented online, but its built-in help describes it:
class IndexFormatter(Formatter)
| format the position x to the nearest i-th label where i=int(x+0.5)
| ...
| __init__(self, labels)
| ...
So you just have to pass your list of dates to IndexFormatter. With a minimal, pandas-independent example (with numpy only for generating dummy data):
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
# create dummy data
x = ['str{}'.format(k) for k in range(20)]
y = np.random.rand(len(x))
# create an IndexFormatter with labels x
x_fmt = mpl.ticker.IndexFormatter(x)
fig,ax = plt.subplots()
ax.plot(y)
# set our IndexFormatter to be responsible for major ticks
ax.xaxis.set_major_formatter(x_fmt)
This should keep your data and labels paired even when tick positions change:
I noticed you also set the rotation of the ticklabels in the call to set_xticklabels; you would lose this now. I suggest using fig.autofmt_xdate to do this instead. It seems to be designed exactly for this purpose, without messing with your ticklabel data.
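Note that IndexFormatter was deprecated in Matplotlib 3.3 and is, as far as I know, no longer available in recent releases. A sketch of the same idea with FuncFormatter follows. The helper label_at is a hypothetical name, and MultipleLocator(7) is swapped in for the question's IndexLocator simply to place a tick every 7 points.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter, MultipleLocator

# Dummy data, as in the example above
labels = ['str{}'.format(k) for k in range(20)]
y = np.random.rand(len(labels))

def label_at(tick_value, tick_index):
    """Return the label nearest to the tick position, or '' if out of range."""
    i = int(round(tick_value))
    return labels[i] if 0 <= i < len(labels) else ''

fig, ax = plt.subplots()
ax.plot(y)
ax.xaxis.set_major_locator(MultipleLocator(7))         # a tick at every multiple of 7
ax.xaxis.set_major_formatter(FuncFormatter(label_at))  # label looked up from the tick position
fig.autofmt_xdate(rotation=45)                         # rotate and right-align the labels
```

Because the label is looked up from the tick position each time, it stays paired with the data when the tick positions change.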