Setting xticklabels and x-axis limits in a bar plot with matplotlib - python

I want to plot a bar graph with a variable number of values along the x-axis. For the data, I have a set of labels which I want to show on the x-axis under the bars. I also want the x-axis limits to start at -1, since otherwise only half of the first bar at index 0 would be visible. I've tried multiple alternatives for achieving that, but none of them worked, because the xticklabels are always off by one or more. And if they do work for a given set of data, they stop working with another set of data (with more or fewer bars). See the minimal code example below.
from matplotlib import pyplot as plt
from matplotlib import ticker
import numpy as np
randData = np.random.rand(100)
xValues = np.linspace(0, len(randData)-1, num=len(randData))
labels = []
for i in range(len(randData)):
    labels.append('label' + str(i))
fig, ax = plt.subplots()
ax.bar(np.linspace(0, len(randData)-1, num=len(randData)), randData)
ax.xaxis.set_major_locator(ticker.MultipleLocator(1))
# Alternative 1
# Use an empty string for index -1, set labels, then set new xlim
labels.insert(0, '')
ax.set_xticklabels(labels, size='x-small', rotation=90)
plt.xlim(-1, len(randData))
# Alternative 2
# Use an empty string for index -1, set new xlim, then set labels
labels.insert(0, '')
plt.xlim(-1, len(randData))
ax.set_xticklabels(labels, size='x-small', rotation=90)
# Alternative 3
# Setting limits with ax.set_xlim
ax.set_xticklabels(labels, size='x-small', rotation=90)
ax.set_xlim([-1, len(randData)])
# Alternative 4
# Setting limits with plt.xlim
ax.set_xticklabels(labels, size='x-small', rotation=90)
plt.xlim(-1, len(randData))
plt.show()
None of the variants worked so far. One part of the problem is that pyplot automatically sets its x-limits depending on the number of bars (sometimes it starts at -1, with more values it might start at -4).
One of the faulty results is shown below:
Any help would be appreciated.
P.S.: If I may, I'd like to add a little side question: How can I remove the Warning "UserWarning: FixedFormatter should only be used together with FixedLocator" when setting the xticklabels? Nothing from this answer worked for me.
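A minimal sketch of one way this is commonly handled, assuming the bars sit at integer positions 0..n-1 (the names rand_data and positions are illustrative, not from the question). set_xticklabels assigns labels in order to whatever ticks the current locator produces, and MultipleLocator(1) may produce ticks before 0, which shifts every label. Fixing the tick positions first with set_xticks ties each label to its bar and installs a FixedLocator, so the FixedFormatter warning should not appear either:

```python
import numpy as np
import matplotlib.pyplot as plt

rand_data = np.random.rand(100)
positions = np.arange(len(rand_data))        # bar positions 0..n-1
labels = [f"label{i}" for i in positions]    # one label per bar

fig, ax = plt.subplots()
ax.bar(positions, rand_data)

# Fix the tick positions first, then label exactly those ticks.
ax.set_xticks(positions)
ax.set_xticklabels(labels, size="x-small", rotation=90)

# Set the limits last so the first and last bars are fully visible.
ax.set_xlim(-1, len(rand_data))
```

Call plt.show() or fig.savefig(...) afterwards as usual. No empty placeholder label for index -1 is needed, because there is no tick at -1.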

Related

Set log xticks in matplotlib for a linear plot

Consider
xdata=np.random.normal(5e5,2e5,int(1e4))
plt.hist(np.log10(xdata), bins=100)
plt.show()
plt.semilogy(xdata)
plt.show()
is there any way to display xticks of the first plot (plt.hist) as in the second plot's yticks? For good reasons I want to histogram the np.log10(xdata) of xdata but I'd like to set minor ticks to display as usual in a log scale (even considering that the exponent is linear...)
In other words, I want the x-axis of the first plot to be like the y-axis of the 2nd plot, without changing the spacing between major ticks (e.g., adding log marks between 5.5 and 6.0, without altering these values).
Proper histogram plot with logarithmic x-axis:
Explanation:
Cut off negative values
The randomly generated example data likely still contains some negative values
activate the commented code lines at the beginning to see the effect
logarithmic function isn't defined for values <= 0
while the 2nd plot just deals with y-axis log scaling (negative values are just out of range), the 1st plot doesn't work with negative values in the BINs range
probably real world working data won't be <= 0, otherwise keep that in mind
BINs should be aligned to log scale as well
otherwise the 'BINs widths' distribution looks off
(swap the # between the two plt.hist(...) statements in the 1st plot section to see the effect)
xdata (not np.log10(xdata)) to be plotted in the histogram
that 'workaround' with plotting np.log10(xdata) probably was the root cause for the misunderstanding in the comments
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}") # note the negative values
# cut off potential negative values (log function isn't defined for <= 0 )
xdata = np.ma.masked_less_equal(xdata, 0)
MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}")
# align the bins to fit a log scale
bins = 100
bins_log_aligned = np.logspace(np.log10(MIN_xdata), np.log10(MAX_xdata), bins)
# 1st plot
plt.hist(xdata, bins = bins_log_aligned) # note: xdata (not np.log10(xdata) )
# plt.hist(xdata, bins = 100)
plt.xscale('log')
plt.show()
# 2nd plot
plt.semilogy(xdata)
plt.show()
Just kept for now for clarification purpose. Will be deleted when the question is revised.
Disclaimer:
As Lucas M. Uriarte already mentioned, that isn't an expected way of changing axis ticks.
x axis ticks and labels don't represent the plotted data
You should at least always provide that information along with such a plot.
The plot
From seeing the result I kinda understand where that special plot idea is coming from - still there should be a preferred way (e.g. conversion of the data in advance) to do such a plot instead of 'faking' the axis.
Explanation how that special axis transfer plot is done:
original x-axis is hidden
a twiny axis is added
note that its y-axis is hidden by default, so that doesn't need handling
twiny x-axis is set to log and the 2nd plot y-axis limits are transferred
subplots used to directly transfer the 2nd plot y-axis limits
use variables if you need to stick with your two plots
twiny x-axis is moved from top (twiny default position) to bottom (where the original x-axis was)
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
fig, axs = plt.subplots(2, figsize=(7,10), facecolor=(1, 1, 1)) # plt.subplots creates the figure itself, so no separate plt.figure() call is needed
# 1st plot
axs[0].hist(np.log10(xdata), bins=100) # plot the data on the normal x axis
axs[0].axes.xaxis.set_visible(False) # hide the normal x axis
# 2nd plot
axs[1].semilogy(xdata)
# 1st plot - twin axis
axs0_y_twin = axs[0].twiny() # set a twiny axis, note twiny y axis is hidden by default
axs0_y_twin.set(xscale="log")
# transfer the limits from the 2nd plot y axis to the twin axis
axs0_y_twin.set_xlim(axs[1].get_ylim()[0],
axs[1].get_ylim()[1])
# move the twin x axis from top to bottom
axs0_y_twin.tick_params(axis="x", which="both", bottom=True, top=False,
labelbottom=True, labeltop=False)
# Disclaimer
disclaimer_text = "Disclaimer: x axis ticks and labels don't represent the plotted data"
axs[0].text(0.5,-0.09, disclaimer_text, size=12, ha="center", color="red",
transform=axs[0].transAxes)
plt.tight_layout()
plt.subplots_adjust(hspace=0.2)
plt.show()
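A different technique from the twin-axis transfer above, sketched here as an assumption about what the question is after: keep the histogram of np.log10(xdata) with its automatic major ticks (5.5, 6.0, ...) and only add minor ticks at the positions where a log axis would draw them, i.e. at log10(2·10^k), ..., log10(9·10^k). The variable names are illustrative, and non-positive values are dropped before taking the logarithm:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedLocator

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, int(1e4))
xdata = xdata[xdata > 0]              # log10 is undefined for values <= 0
log_x = np.log10(xdata)

fig, ax = plt.subplots()
ax.hist(log_x, bins=100)              # major ticks stay where matplotlib put them

# Build log-style minor tick positions for every decade inside the view.
lo, hi = ax.get_xlim()
decades = np.arange(np.floor(lo), np.ceil(hi))
minor = np.concatenate([k + np.log10(np.arange(2, 10)) for k in decades])
minor = minor[(minor >= lo) & (minor <= hi)]

ax.xaxis.set_minor_locator(FixedLocator(minor))
ax.set_xlim(lo, hi)                   # keep the original view limits
```

With this approach the axis labels still describe the plotted data (exponents of 10), so no disclaimer text is needed.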

How to use a 3rd dataframe column as x axis ticks/labels in matplotlib scatter

I'm struggling to wrap my head around matplotlib with dataframes today. I see lots of solutions but I'm struggling to relate them to my needs. I think I may need to start over. Let's see what you think.
I have a dataframe (ephem) with 4 columns - Time, Date, Altitude & Azimuth.
I produce a scatter for alt & az using:
chart = plt.scatter(ephem.Azimuth, ephem.Altitude, marker='x', color='black', s=8)
What's the most efficient way to set the values in the Time column as the labels/ticks on the x axis?
So:
the scale/gridlines etc all remain the same
the chart still plots alt and az
the y axis ticks/labels remain as is
only the x axis ticks/labels are changed to the Time column.
Thanks
This isn't by any means the cleanest piece of code but the following works for me:
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax.scatter(ephem.Azimuth, ephem.Altitude, marker='x', color='black', s=8)
labels = list(ephem.Time)
ax.set_xticklabels(labels)
plt.show()
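This explicitly sets the x tick labels from the Time column of your dataframe. Note that set_xticklabels assigns the labels, in order, to the ticks that already exist, so the labels line up with the data only if the tick positions do.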
In other words, you want to change the x-axis tick labels using a list of values.
labels = ephem.Time.tolist()
# make your plot and before calling plt.show()
# insert the following two lines
ax = plt.gca()
ax.set_xticklabels(labels = labels)
plt.show()
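A hedged sketch of a variant that pins each Time label to its own data point by setting the tick positions as well. The small ephem dataframe below is a made-up stand-in for the asker's data. Be aware that this moves the x ticks (and any gridlines) to the Azimuth values of the rows, so for dense data you would probably label only every n-th row:

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the asker's dataframe
ephem = pd.DataFrame({
    "Time": ["20:00", "21:00", "22:00", "23:00"],
    "Azimuth": [95.0, 110.0, 126.0, 143.0],
    "Altitude": [12.0, 24.0, 35.0, 44.0],
})

fig, ax = plt.subplots()
ax.scatter(ephem.Azimuth, ephem.Altitude, marker="x", color="black", s=8)

# Put one tick at each plotted azimuth and label it with the matching time.
ax.set_xticks(ephem.Azimuth.to_numpy())
ax.set_xticklabels(ephem.Time.tolist())
```

The y axis is left untouched; only the x tick positions and labels change.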

matplotlib: align y-ticks in twinx [duplicate]

I created a matplotlib plot that has 2 y-axes. The y-axes have different scales, but I want the ticks and grid to be aligned. I am pulling the data from excel files, so there is no way to know the max limits beforehand. I have tried the following code.
# creates double-y axis
ax2 = ax1.twinx()
locs = ax1.yaxis.get_ticklocs()
ax2.set_yticks(locs)
The problem now is that the ticks on ax2 do not have labels anymore. Can anyone give me a good way to align ticks with different scales?
Aligning the tick locations of two different scales would mean to give up on the nice automatic tick locator and set the ticks to the same positions on the secondary axes as on the original one.
The idea is to establish a relation between the two axes scales using a function and set the ticks of the second axes at the positions of those of the first.
import matplotlib.pyplot as plt
import matplotlib.ticker
fig, ax = plt.subplots()
# creates double-y axis
ax2 = ax.twinx()
ax.plot(range(5), [1,2,3,4,5])
ax2.plot(range(6), [13,17,14,13,16,12])
ax.grid()
l = ax.get_ylim()
l2 = ax2.get_ylim()
f = lambda x : l2[0]+(x-l[0])/(l[1]-l[0])*(l2[1]-l2[0])
ticks = f(ax.get_yticks())
ax2.yaxis.set_major_locator(matplotlib.ticker.FixedLocator(ticks))
plt.show()
Note that this is a solution for the general case and it might result in totally unreadable labels depending on the use case. If you happen to have more a priori information on the axes range, better solutions may be possible.
Also see this question for a case where automatic tick locations of the first axes is sacrificed for an easier setting of the secondary axes tick locations.
To anyone who's wondering (and for my future reference), the lambda function f in ImportanceofBeingErnest's answer maps the input left tick to a corresponding right tick through:
RHS tick = Bottom RHS tick + (% of LHS range traversed * RHS range)
Refer to this question on tick formatting to truncate decimal places:
from matplotlib.ticker import FormatStrFormatter
ax2.yaxis.set_major_formatter(FormatStrFormatter('%.2f')) # ax2 is the RHS y-axis
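To make the mapping concrete: if the left axis runs from 0 to 10 and the right axis from 100 to 200, a left tick at 5 is 50% of the way up, so the matching right tick is 100 + 0.5 * 100 = 150. The sketch below wraps the same formula in a named helper (map_ticks is a hypothetical name, not part of matplotlib) and combines it with the formatter from above:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedLocator, FormatStrFormatter

def map_ticks(ticks, src_lim, dst_lim):
    """Map tick values from the source axis range onto the destination range."""
    ticks = np.asarray(ticks, dtype=float)
    fraction = (ticks - src_lim[0]) / (src_lim[1] - src_lim[0])
    return dst_lim[0] + fraction * (dst_lim[1] - dst_lim[0])

fig, ax = plt.subplots()
ax2 = ax.twinx()
ax.plot(range(5), [1, 2, 3, 4, 5])
ax2.plot(range(6), [13, 17, 14, 13, 16, 12])
ax.grid()

# Place the right-hand ticks at the heights of the left-hand ticks.
right_ticks = map_ticks(ax.get_yticks(), ax.get_ylim(), ax2.get_ylim())
ax2.yaxis.set_major_locator(FixedLocator(right_ticks))
ax2.yaxis.set_major_formatter(FormatStrFormatter("%.2f"))
```

The limits are read after plotting, so the mapping uses the final autoscaled ranges of both axes.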

adjust matplotlib subplot spacing after tight_layout

I would like to minimize white space in my figure. I have a row of sub plots where four plots share their y-axis and the last plot has a separate axis.
There are no ylabels or ticklabels for the shared axis middle panels.
tight_layout creates a lot of white space between the middle plots, as if leaving space for tick labels and ylabels, but I would rather stretch the subplots. Is this possible?
import matplotlib.gridspec as gridspec
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
fig = plt.figure()
gs = gridspec.GridSpec(1, 5, width_ratios=[4,1,4,1,2])
ax = fig.add_subplot(gs[0])
axes = [ax] + [fig.add_subplot(gs[i], sharey=ax) for i in range(1, 4)]
axes[0].plot(np.random.randint(0,100,100))
barlist=axes[1].bar([1,2],[1,20])
axes[2].plot(np.random.randint(0,100,100))
barlist=axes[3].bar([1,2],[1,20])
axes[0].set_ylabel('data')
axes.append(fig.add_subplot(gs[4]))
axes[4].plot(np.random.randint(0,5,100))
axes[4].set_ylabel('other data')
for ax in axes[1:4]:
    plt.setp(ax.get_yticklabels(), visible=False)
sns.despine();
plt.tight_layout(pad=0, w_pad=0, h_pad=0);
Setting w_pad = 0 does not change the default settings of tight_layout. You need to set something like w_pad = -2, which produces the following figure:
You could go further, to say -3, but then you would start to get some overlap with your last plot.
Another way could be to remove plt.tight_layout() and set the boundaries yourself using
plt.subplots_adjust(left=0.065, right=0.97, top=0.96, bottom=0.065, wspace=0.14)
Though this can be a bit of a trial and error process.
Edit
A nice looking graph can be achieved by moving the ticks and the labels of the last plot to the right hand side. This answer shows you can do this by using:
ax.yaxis.tick_right()
ax.yaxis.set_label_position("right")
So for your example:
axes[4].yaxis.tick_right()
axes[4].yaxis.set_label_position("right")
In addition, you need to remove sns.despine(). Finally, there is now no need to set w_pad = -2, just use plt.tight_layout(pad=0, w_pad=0, h_pad=0)
Using this creates the following figure:
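Putting the steps from the edit together, a sketch of the complete figure might look like the following. It leaves out seaborn entirely (an assumption made here to keep the example self-contained) and uses a seeded random generator for the data:

```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

rng = np.random.default_rng(0)
fig = plt.figure()
gs = gridspec.GridSpec(1, 5, width_ratios=[4, 1, 4, 1, 2])

# Four panels sharing one y axis, plus a fifth independent panel.
first = fig.add_subplot(gs[0])
axes = [first] + [fig.add_subplot(gs[i], sharey=first) for i in range(1, 4)]
axes.append(fig.add_subplot(gs[4]))

axes[0].plot(rng.integers(0, 100, 100))
axes[1].bar([1, 2], [1, 20])
axes[2].plot(rng.integers(0, 100, 100))
axes[3].bar([1, 2], [1, 20])
axes[4].plot(rng.integers(0, 5, 100))
axes[0].set_ylabel("data")
axes[4].set_ylabel("other data")

# Hide the tick labels on the shared middle panels.
for shared_ax in axes[1:4]:
    plt.setp(shared_ax.get_yticklabels(), visible=False)

# Move the last panel's ticks and label to the right-hand side.
axes[4].yaxis.tick_right()
axes[4].yaxis.set_label_position("right")

fig.tight_layout(pad=0, w_pad=0, h_pad=0)
```

Because the last panel no longer needs room on its left for ticks and a label, tight_layout has less space to reserve between the fourth and fifth panels.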

Make x-axes of all subplots same length on the page

I am new to matplotlib and trying to create and save plots from pandas dataframes via a loop. Each plot should have an identical x-axis, but different y-axis lengths and labels. I have no problem creating and saving the plots with different y-axis lengths and labels, but when I create the plots, matplotlib rescales the x-axis depending on how much space is needed for the y-axis labels on the left side of the figure.
These figures are for a technical report. I plan to place one on each page of the report and I would like to have all of the x-axes take up the same amount of space on the page.
Here is an MSPaint version of what I'm getting and what I'd like to get.
Hopefully this is enough code to help. I'm sure there are lots of non-optimal parts of this.
import pandas as pd
import matplotlib.pyplot as plt
import pylab as pl
from matplotlib import collections as mc
from matplotlib.lines import Line2D
import seaborn as sns
# elements for x-axis
start = -1600
end = 2001
interval = 200 # x-axis tick interval
xticks = [x for x in range(start, end, interval)] # create x ticks
# items needed for legend construction
lw_bins = [0,10,25,50,75,90,100] # bins for line width
lw_labels = [3,6,9,12,15,18] # line widths
def make_proxy(zvalue, scalar_mappable, **kwargs):
    color = 'black'
    return Line2D([0, 1], [0, 1], color=color, solid_capstyle='butt', **kwargs)
# generic image ID
img_path = r'C:\\Users\\user\\chart'
img_ID = 0
for line_subset in data:
    # create line collection for this run through loop
    lc = mc.LineCollection(line_subset)
    # create plot and set properties
    sns.set(style="ticks")
    sns.set_context("notebook")
    fig, ax = pl.subplots(figsize=(16, len(line_subset)*0.5)) # I want the height of the figure to change based on number of labels on y-axis
    # Figure width should stay the same
    ax.add_collection(lc)
    ax.set_xlim(left=start, right=end)
    ax.set_xticks(xticks)
    ax.set_ylim(0, len(line_subset)+1)
    ax.margins(0.05)
    sns.despine(left=True)
    ax.xaxis.set_ticks_position('bottom')
    ax.set_yticks(line_subset['order'])
    ax.set_yticklabels(line_subset['ylabel'])
    ax.tick_params(axis='y', length=0)
    # legend
    proxies = [make_proxy(item, lc, linewidth=item) for item in lw_labels]
    ax.legend(proxies, ['0-10%', '10-25%', '25-50%', '50-75%', '75-90%', '90-100%'], bbox_to_anchor=(1.05, 1.0),
              loc=2, ncol=2, labelspacing=1.25, handlelength=4.0, handletextpad=0.5, markerfirst=False,
              columnspacing=1.0)
    # title
    ax.text(0, len(line_subset)+2, s=str(img_ID), fontsize=20)
    # save as .png images
    plt.savefig(r'C:\\Users\\user\\Desktop\\chart' + str(img_ID) + '.png', dpi=300, bbox_inches='tight')
Unless you use an axes of specifically defined aspect ratio (like in an imshow plot or by calling .set_aspect("equal")), the space taken by the axes should only depend on the figure size along that direction and the spacings set to the figure.
You are therefore pretty much asking for the default behaviour and the only thing that prevents you from obtaining that is that you use bbox_inches='tight' in the savefig command.
bbox_inches='tight' will change the figure size! So don't use it and the axes will remain constant in size.
Your figure size, defined as figsize=(16, len(line_subset)*0.5), seems to make sense according to what I understand from the question. So what remains is to make sure the axes inside the figure are the size you want them to be. You can do that by manually placing the axes using fig.add_axes:
fig.add_axes([left, bottom, width, height])
where left, bottom, width, height are in figure coordinates ranging from 0 to 1. Or, you can adjust the spacings outside the subplot using subplots_adjust
plt.subplots_adjust(left, bottom, right, top)
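As a rough sketch of the subplots_adjust route: with a fixed figure width and fixed left/right fractions, the axes come out the same width in inches no matter how tall the figure is. The margin values, the helper names make_figure and axes_width_inches, and the minimum height of 1 inch are assumptions chosen for illustration; pick margins large enough for your longest y labels and the legend:

```python
import matplotlib.pyplot as plt

FIG_WIDTH_IN = 16           # identical for every figure
LEFT, RIGHT = 0.25, 0.80    # hypothetical margins, as fractions of figure width

def make_figure(n_rows):
    """Create a figure whose height depends on n_rows but whose axes width does not."""
    fig, ax = plt.subplots(figsize=(FIG_WIDTH_IN, max(n_rows * 0.5, 1.0)))
    fig.subplots_adjust(left=LEFT, right=RIGHT)
    ax.set_xlim(-1600, 2001)
    ax.set_ylim(0, n_rows + 1)
    return fig, ax

def axes_width_inches(fig, ax):
    """Width of the axes box in inches."""
    return ax.get_position().width * fig.get_figwidth()

fig_a, ax_a = make_figure(4)
fig_b, ax_b = make_figure(30)
print(axes_width_inches(fig_a, ax_a), axes_width_inches(fig_b, ax_b))
```

When saving, call fig.savefig(path, dpi=300) without bbox_inches='tight' so the figure size, and with it the axes width on the page, stays as defined.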
To get matching x axis for the subplots (same x axis length for each subplot) , you need to share the x axis between subplots.
See the example here https://matplotlib.org/examples/pylab_examples/shared_axis_demo.html
