Quite often I want to make a bar chart of counts. If the counts are low I often get major and/or minor tick locations that are not integers. How can I prevent this? It makes no sense to have a tick at 1.5 when the data are counts.
This is my first attempt:
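import pylab
ax = pylab.subplot(2, 2, 1)
pylab.bar(range(1,4), range(1,4), align='center')
# Force a tick spacing of 1 whenever the automatic spacing is finer than that
major_tick_locs = ax.yaxis.get_majorticklocs()
if len(major_tick_locs) < 2 or major_tick_locs[1] - major_tick_locs[0] < 1:
    ax.yaxis.set_major_locator(pylab.MultipleLocator(1))
minor_tick_locs = ax.yaxis.get_minorticklocs()
if len(minor_tick_locs) < 2 or minor_tick_locs[1] - minor_tick_locs[0] < 1:
    ax.yaxis.set_minor_locator(pylab.MultipleLocator(1))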
which works OK when the counts are small but when they are large, I get many many minor ticks:
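import pylab
ax = pylab.subplot(2, 2, 2)
pylab.bar(range(1,4), range(100,400,100), align='center')
# Same logic as above; with large counts the minor locator produces a tick at every integer
major_tick_locs = ax.yaxis.get_majorticklocs()
if len(major_tick_locs) < 2 or major_tick_locs[1] - major_tick_locs[0] < 1:
    ax.yaxis.set_major_locator(pylab.MultipleLocator(1))
minor_tick_locs = ax.yaxis.get_minorticklocs()
if len(minor_tick_locs) < 2 or minor_tick_locs[1] - minor_tick_locs[0] < 1:
    ax.yaxis.set_minor_locator(pylab.MultipleLocator(1))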
How can I get the desired behaviour from the first example with small counts whilst avoiding what happens in the second?
You can use the MaxNLocator class, like so:
from pylab import MaxNLocator
ya = axes.get_yaxis()
ya.set_major_locator(MaxNLocator(integer=True))
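As a self-contained sketch of the MaxNLocator approach (the helper name integer_ytick_locations and the sample counts are made up for illustration), the following draws a bar chart off-screen and reads back the major tick locations for a small-count and a large-count case:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator


def integer_ytick_locations(counts):
    """Plot counts as bars and return the y-axis major tick locations."""
    fig, ax = plt.subplots()
    ax.bar(range(1, len(counts) + 1), counts, align="center")
    # integer=True asks the locator to pick only integer tick values
    ax.yaxis.set_major_locator(MaxNLocator(integer=True))
    fig.canvas.draw()
    locs = ax.yaxis.get_majorticklocs()
    plt.close(fig)
    return locs


small = integer_ytick_locations([1, 2, 3])
large = integer_ytick_locations([100, 200, 300])
print(small)
print(large)
```

Because the locator still chooses the spacing itself, the large-count case should keep a modest number of ticks instead of one per integer.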
I had a similar issue with a histogram I was plotting, which showed fractional counts. Here's how I was able to resolve it:
plt.hist(x=[Dataset being counted])
# Get your current y-ticks (loc is an array of your current y-tick elements)
loc, labels = plt.yticks()
# This sets your y-ticks to the specified range at whole number intervals
plt.yticks(np.arange(0, max(loc), step=1))
I think it turns out I can just ignore the minor ticks. I'm going to give this a go and see if it stands up in all use cases:
def ticks_restrict_to_integer(axis):
    """Restrict the ticks on the given axis to be at least integer,
    that is no half ticks at 1.5 for example.
    """
    from matplotlib.ticker import MultipleLocator
    major_tick_locs = axis.get_majorticklocs()
    if len(major_tick_locs) < 2 or major_tick_locs[1] - major_tick_locs[0] < 1:
        axis.set_major_locator(MultipleLocator(1))

def _test_restrict_to_integer():
    # Small counts: tick spacing is forced to 1
    ax = pylab.subplot(1, 2, 1)
    pylab.bar(range(1,4), range(1,4), align='center')
    ticks_restrict_to_integer(ax.yaxis)
    # Large counts: the automatic ticks are left alone
    ax = pylab.subplot(1, 2, 2)
    pylab.bar(range(1,4), range(100,400,100), align='center')
    ticks_restrict_to_integer(ax.yaxis)
pylab.bar(range(1,4), range(1,4), align='center')
has worked in my code.
Just use the align optional parameter and xticks does the magic.
I am plotting seismological data and am creating a figure featuring 16 subplots of different depth slices. Each subplot displays the lat/lon of the epicenter and the color is scaled to its magnitude. I am trying to do two things:
Adjust the scale of all plots to equal the x and y min and max for the area selected. This will allow easy comparison across the plots (so all plots would range from xmin to xmax, etc.).
Adjust the magnitude colors so they also represent the full scale (i.e. colors represent all available points, not just the points on that specific subplot).
I have seen this accomplished a number of ways but am struggling to apply them to the loop in my code. The data I am using is here: Data.
I posted my code and what the current output looks like below.
import matplotlib.pyplot as plt
import pandas as pd
eq_df = pd.read_csv(eq_csv)
eq_data = eq_df[['LON', 'LAT', 'DEPTH', 'MAG']]
nbound = max(eq_data.LAT)
sbound = min(eq_data.LAT)
ebound = max(eq_data.LON)
wbound = min(eq_data.LON)
xlimit = (wbound, ebound)
ylimit = (sbound, nbound)
magmin = min(eq_data.MAG)
magmax = max(eq_data.MAG)
for n in range(1, 17):
    km = eq_data[(eq_data.DEPTH > n - 1) & (eq_data.DEPTH <= n)]
    plt.subplot(4, 4, n)
    plt.scatter(km["LON"], km['LAT'], s = 10, c = km['MAG'], vmin = magmin, vmax = magmax) #added vmin/vmax to scale my magnitude data
    plt.ylim(sbound, nbound) # set y limits of plot
    plt.xlim(wbound, ebound) # set x limits of plot
    plt.tick_params(axis='both', which='major', labelsize= 6)
    plt.subplots_adjust(hspace = 1)
    plt.gca().set_title('Depth = ' + str(n - 1) +'km to ' + str(n) + 'km', size = 8) #set title of subplots
    plt.suptitle('Magnitude of Events at Different Depth Slices, 1950 to Today')
ETA: new code to resolve my issue
In response to this comment on the other answer, here is a demonstration of the use of sharex=True and sharey=True for this use case:
import matplotlib.pyplot as plt
import numpy as np
# Supply the limits since random data will be plotted
wbound = -0.1
ebound = 1.1
sbound = -0.1
nbound = 1.1
fig, axs = plt.subplots(nrows=4, ncols=4, figsize=(16,12), sharex=True, sharey=True)
plt.xlim(wbound, ebound)
plt.ylim(sbound, nbound)
for n, ax in enumerate(axs.flatten()):
    ax.scatter(np.random.random(20), np.random.random(20),
               c = np.random.random(20), marker = '.')
    # n runs from 0 to 15: the left column is n % 4 == 0, the bottom row is n >= 12
    ticks = [n % 4 == 0, n >= 12]
    ax.tick_params(left=ticks[0], bottom=ticks[1])
    ax.set_title('Depth = ' + str(n) +'km to ' + str(n + 1) + 'km', size = 12)
plt.subplots_adjust(wspace=0.05)
plt.suptitle('Magnitude of Events at Different Depth Slices, 1950 to Today', y = 0.95)
Explanation of a couple things:
I have reduced the horizontal spacing between subplots with subplots_adjust(wspace=0.05).
plt.suptitle does not need to be (and should not be) in the loop.
ticks = [n % 4 == 0, n >= 12] creates a pair of bools for each axis, which is then used to control which tick marks are drawn.
Left and bottom tick marks are controlled for each axis with ax.tick_params(left=ticks[0], bottom=ticks[1]).
plt.xlim() and plt.ylim() need only be called once, before the loop.
Finally got it thanks to some help above and some extended googling.
I have updated my code above with notes indicating where code was added.
To adjust the limits of my plot axes I used:
plt.ylim(sbound, nbound)
plt.xlim(wbound, ebound)
To scale my magnitude data across all plots I added vmin, vmax to the following line:
plt.scatter(km["LON"], km['LAT'], s = 10, c = km['MAG'], vmin = magmin, vmax = magmax)
And here is the resulting figure:
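The two fixes above (common axis limits and a common vmin/vmax) can be sketched on synthetic data as follows. The arrays lon, lat, depth and mag are random stand-ins for the earthquake catalogue, and the single shared colorbar is an optional extra that is not part of the original code:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Synthetic stand-in for the catalogue (assumption: not the real data)
rng = np.random.default_rng(0)
lon = rng.uniform(-120, -115, 200)
lat = rng.uniform(32, 36, 200)
depth = rng.uniform(0, 4, 200)
mag = rng.uniform(1, 6, 200)

fig, axs = plt.subplots(2, 2, sharex=True, sharey=True)
collections = []
for n, ax in enumerate(axs.flat):
    in_slice = (depth > n) & (depth <= n + 1)
    # The same vmin/vmax on every subplot maps colours onto one global scale
    sc = ax.scatter(lon[in_slice], lat[in_slice], s=10, c=mag[in_slice],
                    vmin=mag.min(), vmax=mag.max())
    ax.set_title(f"Depth = {n} km to {n + 1} km", size=8)
    collections.append(sc)

# With sharex/sharey, setting the limits on one axes applies to all of them
axs[0, 0].set_xlim(lon.min(), lon.max())
axs[0, 0].set_ylim(lat.min(), lat.max())

# Since every subplot uses the same scale, one colorbar can serve the whole figure
fig.colorbar(collections[0], ax=axs, label="Magnitude")
fig.savefig("depth_slices.png")
```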
All I want is quite straightforward: I just want the locator ticks to start at a specified timestamp:
pseudo code: locator.set_start_ticking_at( datetime_dummy )
I have no luck finding anything so far.
Here is the portion of the code for this question:
axes[0].set_xlim(datetime_dummy) # datetime_dummy = '2015-12-25 05:34:00'
import matplotlib.dates as matdates
seclocator = matdates.SecondLocator(interval=20)
minlocator = matdates.MinuteLocator(interval=1)
hourlocator = matdates.HourLocator(interval=12)
seclocator.MAXTICKS = 40000
minlocator.MAXTICKS = 40000
hourlocator.MAXTICKS = 40000
majorFmt = matdates.DateFormatter('%Y-%m-%d, %H:%M:%S')
minorFmt = matdates.DateFormatter('%H:%M:%S')
plt.setp(axes[0].xaxis.get_majorticklabels(), rotation=90 )
plt.setp(axes[0].xaxis.get_minorticklabels(), rotation=90 )
# other codes
# save fig as a picture
The x axis ticks of above code will get me:
How do I tell the minor locator to align with the major locator?
How do I tell the locators which timestamp to start ticking at?
What I have tried:
set_xlim doesn't do the trick
seclocator.tick_values(datetime_dummy, datetime_dummy1) doesn't do anything
Instead of using the interval keyword parameter, use bysecond and byminute to specify exactly which seconds and minutes you wish to mark. The bysecond and byminute parameters are used to construct a dateutil rrule. The rrule generates datetimes which match certain specified patterns (or, one might say, "rules").
For example, bysecond=[20, 40] limits the datetimes to those whose seconds equal 20 or 40. Thus, below, the minor tick marks only appear for datetimes whose seconds equal 20 or 40.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as matdates
N = 100
fig, ax = plt.subplots()
x = np.arange(N).astype('<i8').view('M8[s]').tolist()
y = (np.random.random(N)-0.5).cumsum()
ax.plot(x, y)
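seclocator = matdates.SecondLocator(bysecond=[20, 40])
minlocator = matdates.MinuteLocator(byminute=range(60)) # range(60) is the default
seclocator.MAXTICKS = 40000
minlocator.MAXTICKS = 40000
majorFmt = matdates.DateFormatter('%Y-%m-%d, %H:%M:%S')
minorFmt = matdates.DateFormatter('%H:%M:%S')
# Attach the locators and formatters: minutes as major ticks, seconds as minor ticks
ax.xaxis.set_major_locator(minlocator)
ax.xaxis.set_major_formatter(majorFmt)
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90)
ax.xaxis.set_minor_locator(seclocator)
ax.xaxis.set_minor_formatter(minorFmt)
plt.setp(ax.xaxis.get_minorticklabels(), rotation=90)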
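To see which timestamps a bysecond rule selects without drawing anything, one can ask the locator directly through its tick_values method. This is only a sketch: the three-minute window is an arbitrary choice, and timezone-aware UTC datetimes are used on the assumption that this mirrors what matplotlib passes in internally:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates

utc = datetime.timezone.utc
view_start = datetime.datetime(2015, 12, 25, 5, 34, 0, tzinfo=utc)
view_end = datetime.datetime(2015, 12, 25, 5, 37, 0, tzinfo=utc)

# Ticks only where the seconds field is 20 or 40, whatever the view start is
locator = mdates.SecondLocator(bysecond=[20, 40])
tick_times = [mdates.num2date(v) for v in locator.tick_values(view_start, view_end)]

for t in tick_times:
    print(t.strftime("%H:%M:%S"))
```

The point of the rule-based form is that tick positions depend on the clock value rather than on where the axis happens to start, which is what aligns the minor ticks with the major ones.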
@unutbu: Many thanks: I've been looking everywhere for the answer to a related problem!
@eliu: I've adapted unutbu's excellent answer to demonstrate how you can define lists (to create different 'dateutil' rules) which give you complete control over which x-ticks are displayed. Try un-commenting each example below in turn and play around with the values to see the effect. Hope this helps.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq = 'min')
df = pd.Series(np.random.randn(len(idx)), index = idx)
fig, ax = plt.subplots()
# Choose which major hour ticks are displayed by creating a 'dateutil' rule e.g.:
# Only use the hours in an explicit list:
# hourlocator = mdates.HourLocator(byhour=[6,12,8])
# Use the hours in a range defined by: Start, Stop, Step:
# hourlocator = mdates.HourLocator(byhour=range(8,15,2))
# Use every 3rd hour:
# hourlocator = mdates.HourLocator(interval = 3)
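# Set the format of the major x-ticks:
majorFmt = mdates.DateFormatter('%H:%M')
# Attach the locator and formatter (un-comment one of the hourlocator lines above first):
ax.xaxis.set_major_locator(hourlocator)
ax.xaxis.set_major_formatter(majorFmt)
# ... and ditto to set minor locators and minor formatters for minor x-ticks, if needed
ax.plot(df.index, df.values, color = 'black', linewidth = 0.4)
fig.autofmt_xdate() # optional: tilts the tick labels by 30 degrees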
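As a quick check of what a byhour rule selects, the sketch below queries an HourLocator over the same time window as the series above, without plotting. Using tick_values directly and passing UTC-aware datetimes are choices made for this illustration:

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.dates as mdates

utc = datetime.timezone.utc
view_start = datetime.datetime(2017, 1, 1, 5, 3, tzinfo=utc)
view_end = datetime.datetime(2017, 1, 1, 18, 3, tzinfo=utc)

# Start, stop, step: only the hours 8, 10, 12 and 14 are eligible for ticks
locator = mdates.HourLocator(byhour=range(8, 15, 2))
tick_times = [mdates.num2date(v) for v in locator.tick_values(view_start, view_end)]

for t in tick_times:
    print(t.strftime("%H:%M"))
```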
I want to do something with plt.hist2d and plt.colorbar and I'm having real trouble working out how to do it. To explain, I've written the following example:
import numpy as np
from matplotlib import pyplot as plt
x = np.random.random(int(1e6))  # the size must be an integer, not the float 1e6
y = np.random.random(int(1e6))
plt.hist2d(x, y)
plt.colorbar()
This code generates a plot that looks something like the image below.
If I generate a histogram, ideally I would like the colour bar to extend beyond the maximum and minimum range of the data to the next step beyond the maximum and minimum. In the example in this question, this would set the colour bar extent from 9660 to 10260 in increments of 60.
How can I force either plt.hist2d or plt.colorbar to set the colour bar such that ticks are assigned to the start and end of the plotted colour bar?
I think this is what you're looking for:
h = plt.hist2d(x, y)
mn, mx = h[-1].get_clim()
mn = 60 * np.floor(mn / 60.)
mx = 60 * np.ceil(mx / 60.)
h[-1].set_clim(mn, mx)
cbar = plt.colorbar(h[-1], ticks=np.arange(mn, mx + 1, 60))
This gives something like,
It's also often convenient to use the tickers from matplotlib.ticker and their tick_values method, but for this purpose I think the above is most convenient.
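The rounding step in the snippet above can be tried in isolation. In this sketch the helper name expand_to_step and the sample limits 9700 and 10230 are invented for illustration; the arithmetic is the same floor/ceil idea, with the end of the arange nudged by half a step so the upper limit is always included:

```python
import numpy as np


def expand_to_step(vmin, vmax, step):
    """Round (vmin, vmax) outwards to multiples of step and list the ticks."""
    lo = step * np.floor(vmin / step)
    hi = step * np.ceil(vmax / step)
    # hi + step / 2 keeps hi itself inside the half-open arange interval
    ticks = np.arange(lo, hi + step / 2, step)
    return lo, hi, ticks


lo, hi, ticks = expand_to_step(9700, 10230, 60)
print(lo, hi)
print(ticks)
```

The resulting lo and hi are what would be passed to set_clim, and ticks is what would be passed to the ticks argument of plt.colorbar.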
Good luck!
With huge thanks to farenorth, who got me thinking about this in the right way, I came up with a function, get_colour_bar_ticks:
def get_colour_bar_ticks(colourbar):
    import numpy as np
    # Get the limits and the extent of the colour bar.
    limits = colourbar.get_clim()
    extent = limits[1] - limits[0]
    # Get the yticks of the colour bar as values (ax.get_yticks() returns them as fractions).
    fractions = colourbar.ax.get_yticks()
    yticks = (fractions * extent) + limits[0]
    increment = yticks[1] - yticks[0]
    # Generate the expanded ticks.
    if (fractions[0] == 0) & (fractions[-1] == 1):
        return yticks
    start = yticks[0] - increment
    end = yticks[-1] + increment
    if fractions[0] == 0:
        # Already a tick at the bottom: only extend at the top
        newticks = np.concatenate((yticks, [end]))
    elif fractions[-1] == 1:
        # Already a tick at the top: only extend at the bottom
        newticks = np.concatenate(([start], yticks))
    else:
        newticks = np.concatenate(([start], yticks, [end]))
    return newticks
With this function I can then do this:
import numpy as np
from matplotlib import pyplot as plt
x = np.random.random(int(1e6))  # the size must be an integer, not the float 1e6
y = np.random.random(int(1e6))
h = plt.hist2d(x, y)
cbar = plt.colorbar()
ticks = get_colour_bar_ticks(cbar)
h[3].set_clim(ticks[0], ticks[-1])
cbar.set_clim(ticks[0], ticks[-1])
Which results in this, which is what I really wanted:
I have the following code:
import datetime
from matplotlib.ticker import FormatStrFormatter
from pylab import *
X = np.arange(len(hits))
base=datetime.date(2014, 8, 1)
date_list=array([base + datetime.timedelta(days=x) for x in range(0,len(hits))])
fig,ax = plt.subplots(1,1,1,figsize=(15,10))
for i in range(len(hits)):
-X[i],hits[i],facecolor='#89E07E', edgecolor='white',
ax.barh(-X[i],-misses[i],facecolor='#F03255', edgecolor='white',
for i in range(len(bar_handles)):
patch = bar_handles[i].get_children()[0]
bl = patch.get_xy()
percent_x = 0.5*patch.get_width() + bl[0]
percent_y = 0.5*patch.get_height() + bl[1]
if i%2==0:
percentage = 100*(float(hits[j])/float(hits[j]+misses[j]))
percentage = 100*(float(misses[j])/float(hits[j]+misses[j]))
ax.text(percent_x,percent_y,"%d%%" % percentage,ha='center',va='center')
for i in range(len(hits)):
plt.tick_params(which='both', width=0)
minorLocator = FixedLocator(xticks)
majorLocator = FixedLocator([0])
ax.xaxis.grid(b=True,which='minor', color='0.5', linestyle='-',linewidth=1)
ax.xaxis.grid(b=True,which='major', color='b', linestyle='-',linewidth=2.5)
# ax2 = plt.twinx()
# ax2.grid(False)
# for i in range(len(hits)):
# plt.yticks(-X,hits+misses)
This generates the following image:
I am left with one big issue and two minor problems. The big issue is that I want to add the sums of the values on the right y-axis, that is, add 113, 268, 235 and 305. Trying something along the lines of twinx or sharing subplots did not work out for me.
The minor issues are:
On the x-axis, the values to the left of 0 should be shown without the minus sign.
If you look closely, you see that the blue major vertical grid line coincides with a gray minor one. It would be nice to have only the blue one. This can be solved by first finding the index of 0 in xticks using numpy.where and then removing this element using numpy.delete.
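The numpy.where and numpy.delete step mentioned in the last item might look like the sketch below. The xticks array here is a made-up example, since the real one is not shown in the post:

```python
import numpy as np

# Hypothetical minor tick positions, symmetric around zero
xticks = np.arange(-300, 301, 100)

# Find the index of the tick at 0, then drop it so that only the
# major (blue) grid line is drawn at that position
zero_index = np.where(xticks == 0)[0]
minor_ticks = np.delete(xticks, zero_index)
print(minor_ticks)
```

The filtered array would then be passed to FixedLocator in place of the full xticks.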
With matplotlib when a log scale is specified for an axis, the default method of labeling that axis is with numbers that are 10 to a power eg. 10^6. Is there an easy way to change all of these labels to be their full numerical representation? eg. 1, 10, 100, etc.
Note that I do not know what the range of powers will be and want to support an arbitrary range (negatives included).
Sure, just change the formatter.
For example, if we have this plot:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.axis([1, 10000, 1, 100000])
ax.loglog()  # log scale on both axes
You could set the tick labels manually, but then the tick locations and labels would be fixed when you zoom/pan/etc. Therefore, it's best to change the formatter. By default, a logarithmic scale uses a LogFormatter, which will format the values in scientific notation. To change the formatter to the default for linear axes (ScalarFormatter) use e.g.
from matplotlib.ticker import ScalarFormatter
for axis in [ax.xaxis, ax.yaxis]:
    axis.set_major_formatter(ScalarFormatter())
I've found that using ScalarFormatter is great if all your tick values are greater than or equal to 1. However, if you have a tick at a number <1, the ScalarFormatter prints the tick label as 0.
We can use a FuncFormatter from the matplotlib ticker module to fix this issue. The simplest way to do this is with a lambda function and the g format specifier (thanks to @lenz in comments).
import matplotlib.ticker as ticker
ax.yaxis.set_major_formatter(ticker.FuncFormatter(lambda y, _: '{:g}'.format(y)))
Note in my original answer I didn't use the g format, instead I came up with this lambda function with FuncFormatter to set numbers >= 1 to their integer value, and numbers <1 to their decimal value, with the minimum number of decimal places required (i.e. 0.1, 0.01, 0.001, etc). It assumes that you are only setting ticks on the base10 values.
import matplotlib.ticker as ticker
import numpy as np
ax.yaxis.set_major_formatter(ticker.FuncFormatter(lambda y,pos: ('{{:.{:1d}f}}'.format(int(np.maximum(-np.log10(y),0)))).format(y)))
For clarity, here's that lambda function written out in a more verbose, but also more understandable, way:
def myLogFormat(y,pos):
    # Find the number of decimal places required
    decimalplaces = int(np.maximum(-np.log10(y),0))     # =0 for numbers >=1
    # Insert that number into a format string
    formatstring = '{{:.{:1d}f}}'.format(decimalplaces)
    # Return the formatted tick label
    return formatstring.format(y)
I found Joe's and Tom's answers very helpful, but there are a lot of useful details in the comments on those answers. Here's a summary of the two scenarios:
Ranges above 1
Here's the example code like Joe's, but with a higher range:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.axis([1, 10000, 1, 1000000])
ax.loglog()  # log scale on both axes
That shows a plot like this, using scientific notation:
As in Joe's answer, I use a ScalarFormatter, but I also call set_scientific(False). That's necessary when the scale goes up to 1000000 or above.
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
fig, ax = plt.subplots()
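ax.axis([1, 10000, 1, 1000000])
ax.loglog()  # log scale on both axes
for axis in [ax.xaxis, ax.yaxis]:
    formatter = ScalarFormatter()
    formatter.set_scientific(False)
    axis.set_major_formatter(formatter)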
Ranges below 1
As in Tom's answer, here's what happens when the range goes below 1:
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
fig, ax = plt.subplots()
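ax.axis([0.01, 10000, 1, 1000000])
ax.loglog()  # log scale on both axes
for axis in [ax.xaxis, ax.yaxis]:
    formatter = ScalarFormatter()
    formatter.set_scientific(False)
    axis.set_major_formatter(formatter)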
That displays the first two ticks on the x axis as zeroes.
Switching to a FuncFormatter handles that. Again, I had problems with numbers 1000000 or higher, but adding a precision to the format string solved it.
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
fig, ax = plt.subplots()
ax.axis([0.01, 10000, 1, 1000000])
ax.loglog()  # log scale on both axes
for axis in [ax.xaxis, ax.yaxis]:
    formatter = FuncFormatter(lambda y, _: '{:.16g}'.format(y))
    axis.set_major_formatter(formatter)
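The difference between the plain g format and the version with an explicit precision can be checked without drawing a plot, because a FuncFormatter is callable. The sample values below are arbitrary picks that cover both sides of 1 and the 1000000 case:

```python
from matplotlib.ticker import FuncFormatter

# Default g precision is 6 significant digits, so 1000000 switches to exponent form
short = FuncFormatter(lambda y, _: '{:g}'.format(y))
# With 16 significant digits the exponent form only starts far beyond this range
precise = FuncFormatter(lambda y, _: '{:.16g}'.format(y))

values = [0.01, 0.1, 1, 100, 1000000]
short_labels = [short(v, None) for v in values]
precise_labels = [precise(v, None) for v in values]
print(short_labels)    # ['0.01', '0.1', '1', '100', '1e+06']
print(precise_labels)  # ['0.01', '0.1', '1', '100', '1000000']
```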
regarding these questions
What if I wanted to change the numbers to, 1, 5, 10, 20?
– aloha Jul 10 '15 at 13:26
I would like to add ticks in between, like 50,200, etc.., How can I do
that? I tried, set_xticks[50.0,200.0] but that doesn't seem to work!
– ThePredator Aug 3 '15 at 12:54
But with ax.axis([1, 100, 1, 100]), ScalarFormatter gives 1.0, 10.0, ... which is not what I desire. I want it to give integers...
– CPBL Dec 7 '15 at 20:22
You can solve those issues like this, with a minor formatter:
ax.set_yticks([0.00000025, 0.00000015, 0.00000035])
In my application I'm using this format scheme, which I think solves most issues related to log scalar formatting; the same could be done for data > 1.0 or x axis formatting:
#force 'autoscale'
yd = [] #matrix of y values from all lines on plot
for n in range(len(plt.gca().get_lines())):
    line = plt.gca().get_lines()[n]
    yd.append(line.get_ydata().tolist())  # collect the y data of each line
yd = [item for sublist in yd for item in sublist]
ymin, ymax = np.min(yd), np.max(yd)
ax.set_ylim([0.9*ymin, 1.1*ymax])
z = []
for i in [0.0000001, 0.00000015, 0.00000025, 0.00000035,
          0.000001, 0.0000015, 0.0000025, 0.0000035,
          0.00001, 0.000015, 0.000025, 0.000035,
          0.0001, 0.00015, 0.00025, 0.00035,
          0.001, 0.0015, 0.0025, 0.0035,
          0.01, 0.015, 0.025, 0.035,
          0.1, 0.15, 0.25, 0.35]:
    if ymin<i<ymax:
        z.append(i)  # keep only the candidates inside the data range
ax.set_yticks(z)
for comments on "force autoscale" see: Python matplotlib logarithmic autoscale
which yields:
then to create a general use machine:
# user controls
sub_ticks = [10,11,12,14,16,18,22,25,35,45] # fill these midpoints
sub_range = [-8,8] # from 100000000 to 0.000000001
format = "%.8f" # standard float string formatting
# set scalar and string format floats
#force 'autoscale'
yd = [] #matrix of y values from all lines on plot
for n in range(len(plt.gca().get_lines())):
line = plt.gca().get_lines()[n]
yd = [item for sublist in yd for item in sublist]
ymin, ymax = np.min(yd), np.max(yd)
ax.set_ylim([0.9*ymin, 1.1*ymax])
# add sub minor ticks
for i in sub_ticks:
for j in range(sub_range[0],sub_range[1]):
k = []
for l in set_sub_formatter:
if ymin<l<ymax:
The machinery outlined in the accepted answer works great, but sometimes a simple manual override is easier. To get ticks at 1, 10, 100, 1000, for example, you could say:
ticks = 10**np.arange(4)
plt.xticks(ticks, ticks)
Note that it is critical to specify both the locations and the labels, otherwise matplotlib will ignore you.
This mechanism can be used to obtain arbitrary formatting. For instance:
plt.xticks(ticks, [ f"{x:.0f}" for x in ticks ])
plt.xticks(ticks, [ f"10^{int(np.log10(x))}" for x in ticks ])
plt.xticks(ticks, [ romannumerals(x) for x in ticks ])
(where romannumerals is an imagined function that converts its argument into Roman numerals).
As an aside, this technique also works if you want ticks at arbitrary intervals, e.g.,
ticks = [1, 2, 5, 10, 20, 50, 100]
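A complete version of the manual override on a log axis could look like the following sketch. It uses the object-oriented calls set_xticks and set_xticklabels rather than plt.xticks, which is a stylistic choice for this example, and draws off-screen so the labels can be read back:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

ticks = [1, 2, 5, 10, 20, 50, 100]

fig, ax = plt.subplots()
ax.set_xscale("log")
ax.set_xlim(1, 100)
# Specify both the locations and the labels, as noted above
ax.set_xticks(ticks)
ax.set_xticklabels([str(t) for t in ticks])

fig.canvas.draw()
labels = [label.get_text() for label in ax.get_xticklabels()]
print(labels)
```

As with any fixed ticks, these will not adapt if the view is zoomed or panned later.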
import matplotlib.pyplot as plt
plt.rcParams['axes.formatter.min_exponent'] = 2
plt.xlim(1e-5, 1e5)
This will become default for all plots in a session.
See also: LogFormatter tickmarks scientific format limits