Change Y axis tick scale with log bar graph python

I have a log-scaled bar graph and I am trying to change the y ticks from 10^1, 10^2, etc., to whole numbers. I have tried setting the tick values manually, setting the values from the data, and setting the formatter to scalar. One thing I notice in all of the questions I have looked at is that my construction of the graph doesn't include a subplot.
def confirmed_cases():
    x = df['Date']
    y = df['Confirmed']
    plt.figure(figsize=(20, 10))
    plt.bar(x, y)
    plt.yscale('log')
    # plt.yticks([0, 100000, 250000, 500000, 750000, 1000000, 1250000])
    # plt.get_yaxis().set_major_formatter(matplotlib.ticker.ScalarFormatter())
    plt.title('US Corona Cases By Date')
    plt.xlabel('Date')
    plt.ylabel('Confirmed Cases')
    plt.xticks(rotation=90)

There are a few issues:
The formatter needs to be set on the yaxis of the ax. Use plt.gca() to get the current ax. Note that there is no function plt.get_yaxis().
The scalar formatter starts using exponential notation for large numbers. To prevent that, set_powerlimits((m, n)) makes sure the powers are only shown for values outside the range 10**m to 10**n.
In a log scale, major ticks are used for values 10**n for integer n. The other ticks are minor ticks, at positions k*10**n for k from 2 to 9. If there are only a few major ticks visible, the minor ticks can also get a tick label. To suppress both the minor tick marks and their optional labels, a NullLocator can be set as the minor locator (a NullFormatter would only hide the labels).
Avoid using a tick at zero for a log-scale axis. Log(0) is minus infinity.
import matplotlib.pyplot as plt
import matplotlib.ticker  # import the submodule explicitly so matplotlib.ticker is available
import numpy as np
plt.figure(figsize=(20, 10))
plt.bar(np.arange(100), np.random.geometric(1/500000, 100))
plt.yscale('log')
formatter = matplotlib.ticker.ScalarFormatter()
formatter.set_powerlimits((-6, 9))  # plain numbers between 10**-6 and 10**9
plt.gca().yaxis.set_major_formatter(formatter)
plt.gca().yaxis.set_minor_locator(matplotlib.ticker.NullLocator())  # no minor ticks or labels
plt.yticks([100000, 250000, 500000, 750000, 1000000, 1250000])  # no tick at zero
plt.show()
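Since the question mentions that its code does not use subplots, here is a minimal sketch of the same idea written against the Axes object returned by plt.subplots(). The data is a random stand-in for df['Date'] and df['Confirmed'], and the StrMethodFormatter with thousands separators is an alternative swapped in for the ScalarFormatter above, not part of the original answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

rng = np.random.default_rng(0)
days = np.arange(100)                    # stand-in for df['Date']
cases = rng.geometric(1 / 500000, 100)   # stand-in for df['Confirmed']

fig, ax = plt.subplots(figsize=(20, 10))
ax.bar(days, cases)
ax.set_yscale("log")

# Set ticks and formatter after set_yscale, which resets both.
tick_values = [100000, 250000, 500000, 750000, 1000000, 1250000]
ax.set_yticks(tick_values)  # fixed major ticks, none at zero
ax.yaxis.set_major_formatter(mticker.StrMethodFormatter("{x:,.0f}"))
ax.yaxis.set_minor_locator(mticker.NullLocator())  # drop minor ticks and labels

fig.canvas.draw()  # make sure the tick labels are populated
labels = [t.get_text() for t in ax.get_yticklabels()]
```

Call plt.show() or fig.savefig(...) afterwards to look at the result.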

Related

Set log xticks in matplotlib for a linear plot

Consider
xdata=np.random.normal(5e5,2e5,int(1e4))
plt.hist(np.log10(xdata), bins=100)
plt.show()
plt.semilogy(xdata)
plt.show()
Is there any way to display the xticks of the first plot (plt.hist) like the yticks of the second plot? For good reasons I want to histogram np.log10(xdata) rather than xdata, but I'd like the minor ticks to be displayed as usual in a log scale (even considering that the exponent is linear...)
In other words, I want the x-axis of the first plot to look like the y-axis of the 2nd plot, without changing the spacing between major ticks (e.g., adding log marks between 5.5 and 6.0, without altering these values).

Proper histogram plot with logarithmic x-axis:
Explanation:
Cut off negative values. The randomly generated example data likely still contains some negative values (activate the commented code lines at the beginning to see the effect), and the logarithm isn't defined for values <= 0. While the 2nd plot only deals with y-axis log scaling (negative values are simply out of range), the 1st plot doesn't work with negative values in the range of the bins. Real-world data probably won't be <= 0, but otherwise keep that in mind.
Align the bins to the log scale as well. Otherwise the distribution of the bin widths looks off (swap the # between the two plt.hist( statements in the 1st plot section to see the effect).
Plot xdata (not np.log10(xdata)) in the histogram. That 'workaround' of plotting np.log10(xdata) was probably the root cause of the misunderstanding in the comments.
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}") # note the negative values
# cut off potential negative values (log function isn't defined for <= 0 )
xdata = np.ma.masked_less_equal(xdata, 0)
MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}")
# align the bins to fit a log scale
bins = 100
bins_log_aligned = np.logspace(np.log10(MIN_xdata), np.log10(MAX_xdata), bins)
# 1st plot
plt.hist(xdata, bins = bins_log_aligned) # note: xdata (not np.log10(xdata) )
# plt.hist(xdata, bins = 100)
plt.xscale('log')
plt.show()
# 2nd plot
plt.semilogy(xdata)
plt.show()
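To see why log-aligned bins look evenly wide on a log axis, the following sketch checks that consecutive edges returned by np.logspace have a constant ratio. It uses a boolean mask instead of the masked array above and a different random generator, both of which are choices made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, 10_000)
positive = xdata[xdata > 0]  # log10 is undefined for values <= 0

n_bins = 100
# n_bins + 1 edges are needed for n_bins bins
edges = np.logspace(np.log10(positive.min()), np.log10(positive.max()), n_bins + 1)
counts, _ = np.histogram(positive, bins=edges)

# A constant ratio between neighbouring edges means equal width on a log axis.
ratios = edges[1:] / edges[:-1]
print(np.allclose(ratios, ratios[0]))
```

Note that np.logspace(start, stop, n) returns n edges, which gives n - 1 bins; passing bins=100 edges as in the code above therefore yields 99 bins.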

Just kept for now for clarification purpose. Will be deleted when the question is revised.
Disclaimer:
As Lucas M. Uriarte already mentioned, that isn't an expected way of changing axis ticks.
The x-axis ticks and labels don't represent the plotted data.
You should at least always provide that information along with such a plot.
The plot
From seeing the result I kind of understand where that special plot idea comes from. Still, there should be a preferred way (e.g. converting the data in advance) to do such a plot instead of 'faking' the axis.
Explanation of how that special axis transfer plot is done:
The original x-axis is hidden.
A twiny axis is added (its y-axis is hidden by default, so that doesn't need handling).
The twiny x-axis is set to log and the 2nd plot's y-axis limits are transferred to it. Subplots are used to transfer the limits directly; use variables if you need to stick with your two separate plots.
The twiny x-axis is moved from the top (the twiny default position) to the bottom (where the original x-axis was).
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
fig, axs = plt.subplots(2, figsize=(7,10), facecolor=(1, 1, 1))
# 1st plot
# log10 is undefined for values <= 0, so only the positive values are histogrammed
axs[0].hist(np.log10(xdata[xdata > 0]), bins=100) # plot the data on the normal x axis
axs[0].axes.xaxis.set_visible(False) # hide the normal x axis
# 2nd plot
axs[1].semilogy(xdata)
# 1st plot - twin axis
axs0_y_twin = axs[0].twiny() # set a twiny axis, note twiny y axis is hidden by default
axs0_y_twin.set(xscale="log")
# transfer the limits from the 2nd plot y axis to the twin axis
axs0_y_twin.set_xlim(axs[1].get_ylim()[0],
                     axs[1].get_ylim()[1])
# move the twin x axis from top to bottom
axs0_y_twin.tick_params(axis="x", which="both", bottom=True, top=False,
                        labelbottom=True, labeltop=False)
# Disclaimer
disclaimer_text = "Disclaimer: x axis ticks and labels don't represent the plotted data"
axs[0].text(0.5,-0.09, disclaimer_text, size=12, ha="center", color="red",
            transform=axs[0].transAxes)
plt.tight_layout()
plt.subplots_adjust(hspace=0.2)
plt.show()
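One possible way to get log-style marks without 'faking' the axis is to keep the linear axis of np.log10(xdata) and only add minor ticks at the positions log10(k*10**n) for k from 2 to 9. The sketch below is an assumption about what the question is after, not part of the answer above; the major ticks and their labels are left exactly as matplotlib chooses them.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, 10_000)
logx = np.log10(xdata[xdata > 0])  # drop values where log10 is undefined

fig, ax = plt.subplots()
ax.hist(logx, bins=100)

# Decades covered by the data, e.g. 10**2 ... 10**6
lo = int(np.floor(logx.min()))
hi = int(np.ceil(logx.max()))

# Minor ticks where 2*10**n, 3*10**n, ..., 9*10**n fall on the log10 axis
minor = np.concatenate([d + np.log10(np.arange(2, 10)) for d in range(lo, hi)])
ax.xaxis.set_minor_locator(mticker.FixedLocator(minor))
```

The minor marks then bunch up towards each integer exponent the way they do on a real log axis, while the tick labels still show the exponent values.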

How to remove scientific notation from a log-log plot? [duplicate]

I'd like the y axis to show only the number 100, 200, and 300, and not in scientific notation. Any thoughts?
Current plot
Simplified code:
from matplotlib import pyplot as plt
import numpy as np
x = np.logspace(2, 6, 20)
y = np.logspace(np.log10(60), np.log10(300), 20)
plt.scatter(x, y[::-1])
plt.xscale('log')
plt.yscale('log')
plt.show()

The major and minor locators determine the positions of the ticks. The standard positions are set via the AutoLocator. The NullLocator removes them. A MultipleLocator(x) shows ticks at every multiple of x.
For the y axis, setting standard tick positions shows the ticks at the top closer to each other, as determined by the log scale. Doing the same for the x axis, however, would put them too close together due to the large range. So, for the x axis the positions determined by the LogLocator can stay in place.
The formatters control how the ticks should be displayed. The ScalarFormatter sets the default way. There is an option scilimits that determines for which ranges of values scientific notation should be used. As 1,000,000 usually gets displayed as 1e6, setting scilimits=(-6, 9) avoids this.
from matplotlib import pyplot as plt
from matplotlib import ticker
import numpy as np
x = np.logspace(2, 6, 20)
y = np.logspace(np.log10(60), np.log10(300), 20)
plt.scatter(x, y[::-1])
plt.xscale('log')
plt.yscale('log')
ax = plt.gca()
# ax.xaxis.set_major_locator(ticker.AutoLocator())
ax.xaxis.set_minor_locator(ticker.NullLocator()) # no minor ticks
ax.xaxis.set_major_formatter(ticker.ScalarFormatter()) # set regular formatting
# ax.yaxis.set_major_locator(ticker.AutoLocator()) # major y tick positions in a regular way
ax.yaxis.set_major_locator(ticker.MultipleLocator(100)) # major y tick positions every 100
ax.yaxis.set_minor_locator(ticker.NullLocator()) # no minor ticks
ax.yaxis.set_major_formatter(ticker.ScalarFormatter()) # set regular formatting
ax.ticklabel_format(style='sci', scilimits=(-6, 9)) # disable scientific notation
plt.show()
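As a small check of the locator/formatter split described above, this sketch repeats the y-axis part with the Axes interface and lists the major ticks that fall inside the visible range. FormatStrFormatter("%d") is used here as an alternative way to get plain integers; it is not what the answer above uses.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

x = np.logspace(2, 6, 20)
y = np.logspace(np.log10(60), np.log10(300), 20)

fig, ax = plt.subplots()
ax.scatter(x, y[::-1])
ax.set_xscale("log")
ax.set_yscale("log")

ax.yaxis.set_major_locator(mticker.MultipleLocator(100))        # positions: every 100
ax.yaxis.set_minor_locator(mticker.NullLocator())               # no minor ticks
ax.yaxis.set_major_formatter(mticker.FormatStrFormatter("%d"))  # appearance: plain integers

fig.canvas.draw()  # triggers autoscaling so the limits are final
ymin, ymax = ax.get_ylim()
visible = [t for t in ax.get_yticks() if ymin <= t <= ymax]
print(visible)
```

The locator may also return positions outside the view limits (such as 0), which is why the list is filtered before printing.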

Matplotlib - pyplot incorrectly setting axes ticks when using scatter()

I am trying to customize the xticks and yticks for my scatterplot with the simple code below:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
y_ticks = np.arange(10, 41, 10)
x_ticks = np.arange(1000, 5001, 1000)
ax.set_yticks(y_ticks)
ax.set_xticks(x_ticks)
ax.scatter(some_x, some_y)
plt.show()
If we comment out the line ax.scatter(some_x, some_y), we get an empty plot with the correct result.
However, if the code is run exactly as shown, we get a different, incorrect result.
Finally, if we run the code with ax.set_yticks(y_ticks) and ax.set_xticks(x_ticks) commented out, we also get the correct result (just with the axes not in the ranges I desire them to be).
Note that I am using Python version 2.7. Additionally, some_x and some_y are omitted.
Any input on why the axes are changing in such an odd manner only after I try plotting a scatterplot would be appreciated.
EDIT:
If I run ax.scatter(some_x, some_y) before the xticks and yticks are set, I get odd results that are slightly different from before.

Matplotlib axes will always adjust themselves to the content. This is a desirable feature, because it lets you always see the plotted data, no matter if it ranges from -10 to -9 or from 1000 to 10000.
Setting the xticks will only change the tick locations. So if you set the ticks to locations between -10 and -9, but then plot data from 1000 to 10000, you would simply not see any ticks, because they do not lie in the shown range.
If the automatically chosen limits are not what you are looking for, you need to set them manually, using ax.set_xlim() and ax.set_ylim().
Finally, it should be clear that in order to have correct numbers appear on the axes, you need to actually use numbers. If some_x and some_y in ax.scatter(some_x, some_y) are strings, they will not obey any reasonable limits, but will simply be plotted one after the other.
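A minimal sketch of both points, assuming the omitted some_x and some_y were read in as strings (the values below are made up for illustration): convert them to numbers first, then set the ticks and the limits explicitly.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical raw data that arrived as strings
raw_x = ["1500", "2500", "3200", "4100"]
raw_y = ["12", "25", "31", "38"]

# Convert to numbers so the axes treat them as values, not categories
some_x = np.asarray(raw_x, dtype=float)
some_y = np.asarray(raw_y, dtype=float)

fig, ax = plt.subplots()
ax.scatter(some_x, some_y)

ax.set_xticks(np.arange(1000, 5001, 1000))  # tick locations only
ax.set_yticks(np.arange(10, 41, 10))
ax.set_xlim(1000, 5000)                     # visible range is set separately
ax.set_ylim(10, 40)
```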

reducing number of plot ticks

I have too many ticks on my graph and they are running into each other.
How can I reduce the number of ticks?
For example, I have ticks:
1E-6, 1E-5, 1E-4, ... 1E6, 1E7
And I only want:
1E-5, 1E-3, ... 1E5, 1E7
I've tried playing with the LogLocator, but I haven't been able to figure this out.

Alternatively, if you want to simply set the number of ticks while allowing matplotlib to position them (currently only with MaxNLocator), there is pyplot.locator_params,
pyplot.locator_params(nbins=4)
You can specify specific axis in this method as mentioned below, default is both:
# To specify the number of ticks on both or any single axes
pyplot.locator_params(axis='y', nbins=6)
pyplot.locator_params(axis='x', nbins=10)

To solve the issue of customisation and appearance of the ticks, see the Tick Locators guide on the matplotlib website
ax.xaxis.set_major_locator(plt.MaxNLocator(3))
would limit the x-axis to at most 3 intervals (so at most 4 visible ticks), placed at evenly spaced "nice" locations.
There is also a nice tutorial about this
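The argument of MaxNLocator is the maximum number of intervals, so the number of visible ticks is at most one more than that. A small sketch to check this on a simple line plot (the data is only an example):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

fig, ax = plt.subplots()
ax.plot(range(100))
ax.xaxis.set_major_locator(mticker.MaxNLocator(3))  # at most 3 intervals

fig.canvas.draw()  # triggers autoscaling so the limits are final
xmin, xmax = ax.get_xlim()
# The locator can return positions just outside the view, so keep the visible ones
visible = [t for t in ax.get_xticks() if xmin <= t <= xmax]
print(len(visible))
```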

If somebody still gets this page in search results:
fig, ax = plt.subplots()
plt.plot(...)
every_nth = 4
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)

There's a set_ticks() function for axis objects.

In case somebody still needs it, and since nothing here really worked for me, I came up with a very simple way that keeps the appearance of the generated plot "as is" while fixing the number of ticks to exactly N:
import numpy as np
import matplotlib.pyplot as plt
N = 5  # desired number of ticks (example value)
f, ax = plt.subplots()
ax.plot(range(100))
ymin, ymax = ax.get_ylim()
ax.set_yticks(np.round(np.linspace(ymin, ymax, N), 2))

The solution @raphael gave is straightforward and quite helpful.
Still, the displayed tick labels will not be values sampled from the original distribution but from the indexes of the array returned by np.linspace(ymin, ymax, N).
To display N values evenly spaced from your original tick labels, use the set_yticklabels() method. Here is a snippet for the y axis, with integer labels:
import numpy as np
import matplotlib.pyplot as plt
N = 5  # desired number of ticks (example value)
ax = plt.gca()  # assumes the data has already been plotted on the current axes
ymin, ymax = ax.get_ylim()
custom_ticks = np.linspace(ymin, ymax, N, dtype=int)
ax.set_yticks(custom_ticks)
ax.set_yticklabels(custom_ticks)

If you need one tick every N=3 ticks :
N = 3 # 1 tick every 3
xticks_pos, xticks_labels = plt.xticks() # get all axis ticks
myticks = [j for i,j in enumerate(xticks_pos) if not i%N] # positions of selected ticks
newlabels = [label for i,label in enumerate(xticks_labels) if not i%N]
or with fig,ax = plt.subplots() :
N = 3 # 1 tick every 3
xticks_pos = ax.get_xticks()
xticks_labels = ax.get_xticklabels()
myticks = [j for i,j in enumerate(xticks_pos) if not i%N] # positions of selected ticks
newlabels = [label for i,label in enumerate(xticks_labels) if not i%N]
(obviously you can adjust the offset with (i+offset)%N).
Note that you can get uneven ticks if you wish, e.g. myticks = [1, 3, 8].
Then you can use
plt.gca().set_xticks(myticks) # set new X axis ticks
or if you want to replace labels as well
plt.xticks(myticks, newlabels) # set new X axis ticks and labels
Beware that axis limits must be set after the axis ticks.
Finally, you may wish to draw only an arbitrary set of ticks :
mylabels = ['03/2018', '09/2019', '10/2020']
plt.draw() # needed to populate xticks with actual labels
xticks_pos, xticks_labels = plt.xticks() # get all axis ticks
myticks = [i for i,j in enumerate(xticks_labels) if j.get_text() in mylabels]
plt.xticks(myticks, mylabels)
(assuming mylabels is ordered ; if it is not, then sort myticks and reorder it).

The xticks function accepts a range, so you can step through the tick positions:
start_number = 0  # first tick position
end_number = len(data)  # data stands for the sequence you plotted
step_number = 5  # example value: how many positions to skip from start to end
# a rotation of 90 degrees helps with long tick labels
plt.xticks(range(start_number, end_number, step_number), rotation=90)

If you want about 10 ticks:
for y axis: ax.set_yticks(ax.get_yticks()[::len(ax.get_yticks())//10])
for x axis: ax.set_xticks(ax.get_xticks()[::len(ax.get_xticks())//10])
This takes the current ticks, keeps every k-th one, where k is the number of ticks divided by 10 (rounded down), and sets the result back as the ticks. Change the 10 to get a different number of ticks. Note that with fewer than 10 ticks the step becomes 0 and the slice raises an error.
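A slightly more defensive variant of this slicing idea, as a sketch: the helper name thin_ticks and its target parameter are made up for illustration, and the step is clamped to at least 1 so that short tick lists do not produce a zero step.

```python
def thin_ticks(ticks, target=10):
    """Keep roughly `target` ticks by taking every k-th one."""
    ticks = list(ticks)
    step = max(1, len(ticks) // target)  # never let the slice step be 0
    return ticks[::step]


print(thin_ticks(range(100)))   # every 10th position
print(thin_ticks([1, 2, 3]))    # fewer than `target` ticks: unchanged
```

With an Axes object this could be used as ax.set_xticks(thin_ticks(ax.get_xticks())). The result has about `target` entries, not exactly that many, when the tick count is not a multiple of it.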

When a log scale is used, the number of major ticks can be limited with the following command:
import matplotlib.pyplot as plt
# ... (plotting code that uses a log scale)
plt.locator_params(numticks=12)
plt.show()
The value passed as numticks sets the maximum number of major ticks to be displayed.
Credits to @bgamari's post for introducing the locator_params() function, but the nbins parameter shown there throws an error when a log scale is used.
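For the specific case in the question (keeping only 1E-5, 1E-3, ... 1E7 on a log axis), one option is to place the major ticks explicitly at every second decade. This is a sketch with made-up data, and the FixedLocator is one possible choice, not something taken from the answers above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; remove to get a window
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import numpy as np

x = np.logspace(-6, 7, 200)  # example data spanning 1e-6 to 1e7
fig, ax = plt.subplots()
ax.plot(x, np.sqrt(x))
ax.set_xscale("log")

# Every second decade: 1e-5, 1e-3, 1e-1, 1e1, 1e3, 1e5, 1e7
wanted = 10.0 ** np.arange(-5, 8, 2)
ax.xaxis.set_major_locator(mticker.FixedLocator(wanted))
ax.xaxis.set_minor_locator(mticker.NullLocator())  # no minor ticks in between
```

The default log formatter is left in place, so the labels keep their power-of-ten appearance.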
