Change distance between boxplots in the same figure in python [duplicate] - python

I'm drawing the bloxplot shown below using python and matplotlib. Is there any way I can reduce the distance between the two boxplots on the X axis?
This is the code that I'm using to get the figure above:
import matplotlib.pyplot as plt
from matplotlib import rcParams
rcParams['ytick.direction'] = 'out'
rcParams['xtick.direction'] = 'out'
fig = plt.figure()
xlabels = ["CG", "EG"]
ax = fig.add_subplot(111)
ax.boxplot([values_cg, values_eg])
ax.set_xticks(np.arange(len(xlabels))+1)
ax.set_xticklabels(xlabels, rotation=45, ha='right')
fig.subplots_adjust(bottom=0.3)
ylabels = yticks = np.linspace(0, 20, 5)
ax.set_yticks(yticks)
ax.set_yticklabels(ylabels)
ax.tick_params(axis='x', pad=10)
ax.tick_params(axis='y', pad=10)
plt.savefig(os.path.join(output_dir, "output.pdf"))
And this is an example closer to what I'd like to get visually (although I wouldn't mind if the boxplots were even a bit closer to each other):

You can either change the aspect ratio of plot or use the widths kwarg (doc) as such:
ax.boxplot([values_cg, values_eg], widths=1)
to make the boxes wider.

Try changing the aspect ratio using
ax.set_aspect(1.5) # or some other float
The larger then number, the narrower (and taller) the plot should be:
a circle will be stretched such that the height is num times the width. aspect=1 is the same as aspect=’equal’.
http://matplotlib.org/api/axes_api.html#matplotlib.axes.Axes.set_aspect

When your code writes:
ax.set_xticks(np.arange(len(xlabels))+1)
You're putting the first box plot on 0 and the second one on 1 (event though you change the tick labels afterwards), just like in the second, "wanted" example you gave they are set on 1,2,3.
So i think an alternative solution would be to play with the xticks position and the xlim of the plot.
for example using
ax.set_xlim(-1.5,2.5)
would place them closer.

positions : array-like, optional
Sets the positions of the boxes. The ticks and limits are automatically set to match the positions. Defaults to range(1, N+1) where N is the number of boxes to be drawn.
https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.boxplot.html
This should do the job!

As #Stevie mentioned, you can use the positions kwarg (doc) to manually set the x-coordinates of the boxes:
ax.boxplot([values_cg, values_eg], positions=[1, 1.3])

Related

seaborn distplot different bar width on each figure

Sorry for giving an image however I think it is the best way to show my problem.
As you can see all of the bin width are different, from my understanding it shows range of rent_hours. I am not sure why different figure have different bin width even though I didn't set any.
My code looks is as follows:
figure, axes = plt.subplots(nrows=4, ncols=3)
figure.set_size_inches(18,14)
plt.subplots_adjust(hspace=0.5)
for ax, age_g in zip(axes.ravel(), age_cat):
group = total_usage_df.loc[(total_usage_df.age_group == age_g) & (total_usage_df.day_of_week <= 4)]
sns.distplot(group.rent_hour, ax=ax, kde=False)
ax.set(title=age_g)
ax.set_xlim([0, 24])
figure.suptitle("Weekday usage pattern", size=25);
additional question:
Seaborn : How to get the count in y axis for distplot using PairGrid for here it says that kde=False makes y-axis count however http://seaborn.pydata.org/generated/seaborn.distplot.html in the doc, it uses kde=False and still seems to show something else. How can I set y-axis to show count?
I've tried
sns.distplot(group.rent_hour, ax=ax, norm_hist=True) and it still seems to give something else rather than count.
sns.distplot(group.rent_hour, ax=ax, kde=False) gives me count however I don't know why it is giving me count.
Answer 1:
From the documentation:
norm_hist : bool, optional
If True, the histogram height shows a density rather than a count.
This is implied if a KDE or fitted density is plotted.
So you need to take into account your bin width as well, i.e. compute the area under the curve and not just the sum of the bin heights.
Answer 2:
# Plotting hist without kde
ax = sns.distplot(your_data, kde=False)
# Creating another Y axis
second_ax = ax.twinx()
#Plotting kde without hist on the second Y axis
sns.distplot(your_data, ax=second_ax, kde=True, hist=False)
#Removing Y ticks from the second axis
second_ax.set_yticks([])

matplotlib: align y-ticks in twinx [duplicate]

I created a matplotlib plot that has 2 y-axes. The y-axes have different scales, but I want the ticks and grid to be aligned. I am pulling the data from excel files, so there is no way to know the max limits beforehand. I have tried the following code.
# creates double-y axis
ax2 = ax1.twinx()
locs = ax1.yaxis.get_ticklocs()
ax2.set_yticks(locs)
The problem now is that the ticks on ax2 do not have labels anymore. Can anyone give me a good way to align ticks with different scales?
Aligning the tick locations of two different scales would mean to give up on the nice automatic tick locator and set the ticks to the same positions on the secondary axes as on the original one.
The idea is to establish a relation between the two axes scales using a function and set the ticks of the second axes at the positions of those of the first.
import matplotlib.pyplot as plt
import matplotlib.ticker
fig, ax = plt.subplots()
# creates double-y axis
ax2 = ax.twinx()
ax.plot(range(5), [1,2,3,4,5])
ax2.plot(range(6), [13,17,14,13,16,12])
ax.grid()
l = ax.get_ylim()
l2 = ax2.get_ylim()
f = lambda x : l2[0]+(x-l[0])/(l[1]-l[0])*(l2[1]-l2[0])
ticks = f(ax.get_yticks())
ax2.yaxis.set_major_locator(matplotlib.ticker.FixedLocator(ticks))
plt.show()
Note that this is a solution for the general case and it might result in totally unreadable labels depeding on the use case. If you happen to have more a priori information on the axes range, better solutions may be possible.
Also see this question for a case where automatic tick locations of the first axes is sacrificed for an easier setting of the secondary axes tick locations.
To anyone who's wondering (and for my future reference), the lambda function f in ImportanceofBeingErnest's answer maps the input left tick to a corresponding right tick through:
RHS tick = Bottom RHS tick + (% of LHS range traversed * RHS range)
Refer to this question on tick formatting to truncate decimal places:
from matplotlib.ticker import FormatStrFormatter
ax2.yaxis.set_major_formatter(FormatStrFormatter('%.2f')) # ax2 is the RHS y-axis

Make x-axes of all subplots same length on the page

I am new to matplotlib and trying to create and save plots from pandas dataframes via a loop. Each plot should have an identical x-axis, but different y-axis lengths and labels. I have no problem creating and saving the plots with different y-axis lengths and labels, but when I create the plots, matplotlib rescales the x-axis depending on how much space is needed for the y-axis labels on the left side of the figure.
These figures are for a technical report. I plan to place one on each page of the report and I would like to have all of the x-axes take up the same amount of space on the page.
Here is an MSPaint version of what I'm getting and what I'd like to get.
Hopefully this is enough code to help. I'm sure there are lots of non-optimal parts of this.
import pandas as pd
import matplotlib.pyplot as plt
import pylab as pl
from matplotlib import collections as mc
from matplotlib.lines import Line2D
import seaborn as sns
# elements for x-axis
start = -1600
end = 2001
interval = 200 # x-axis tick interval
xticks = [x for x in range(start, end, interval)] # create x ticks
# items needed for legend construction
lw_bins = [0,10,25,50,75,90,100] # bins for line width
lw_labels = [3,6,9,12,15,18] # line widths
def make_proxy(zvalue, scalar_mappable, **kwargs):
color = 'black'
return Line2D([0, 1], [0, 1], color=color, solid_capstyle='butt', **kwargs)
# generic image ID
img_path = r'C:\\Users\\user\\chart'
img_ID = 0
for line_subset in data:
# create line collection for this run through loop
lc = mc.LineCollection(line_subset)
# create plot and set properties
sns.set(style="ticks")
sns.set_context("notebook")
fig, ax = pl.subplots(figsize=(16, len(line_subset)*0.5)) # I want the height of the figure to change based on number of labels on y-axis
# Figure width should stay the same
ax.add_collection(lc)
ax.set_xlim(left=start, right=end)
ax.set_xticks(xticks)
ax.set_ylim(0, len(line_subset)+1)
ax.margins(0.05)
sns.despine(left=True)
ax.xaxis.set_ticks_position('bottom')
ax.set_yticks(line_subset['order'])
ax.set_yticklabels(line_subset['ylabel'])
ax.tick_params(axis='y', length=0)
# legend
proxies = [make_proxy(item, lc, linewidth=item) for item in lw_labels]
ax.legend(proxies, ['0-10%', '10-25%', '25-50%', '50-75%', '75-90%', '90-100%'], bbox_to_anchor=(1.05, 1.0),
loc=2, ncol=2, labelspacing=1.25, handlelength=4.0, handletextpad=0.5, markerfirst=False,
columnspacing=1.0)
# title
ax.text(0, len(line_subset)+2, s=str(img_ID), fontsize=20)
# save as .png images
plt.savefig(r'C:\\Users\\user\\Desktop\\chart' + str(img_ID) + '.png', dpi=300, bbox_inches='tight')
Unless you use an axes of specifically defined aspect ratio (like in an imshow plot or by calling .set_aspect("equal")), the space taken by the axes should only depend on the figure size along that direction and the spacings set to the figure.
You are therefore pretty much asking for the default behaviour and the only thing that prevents you from obtaining that is that you use bbox_inches='tight' in the savefig command.
bbox_inches='tight' will change the figure size! So don't use it and the axes will remain constant in size. `
Your figure size, defined like figsize=(16, len(line_subset)*0.5) seems to make sense according to what I understand from the question. So what remains is to make sure the axes inside the figure are the size you want them to be. You can do that by manually placing it using fig.add_axes
fig.add_axes([left, bottom, width, height])
where left, bottom, width, height are in figure coordinates ranging from 0 to 1. Or, you can adjust the spacings outside the subplot using subplots_adjust
plt.subplots_adjust(left, bottom, right, top)
To get matching x axis for the subplots (same x axis length for each subplot) , you need to share the x axis between subplots.
See the example here https://matplotlib.org/examples/pylab_examples/shared_axis_demo.html

Avoid overlapping ticks in matplotlib

I am generating plots like this one:
When using less ticks, the plot fits nicely and the bars are wide enough to see them correctly. Nevertheless, when there are lots of ticks, instead of making the plot larger, it just compress the y axe, resulting in thin bars and overlapping tick text.
This is happening both for plt.show() and plt.save_fig().
Is there any solution so it plots the figure in a scale which guarantees that bars have the specified width, not more (if too few ticks) and not less (too many, overlapping)?
EDIT:
Yes, I'm using barh, and yes, I'm setting height to a fixed value (8):
height = 8
ax.barh(yvalues-width/2, xvalues, height=height, color='blue', align='center')
ax.barh(yvalues+width/2, xvalues, height=height, color='red', align='center')
I don't quite understand your code, it seems you do two plots with the same (only shifted) yvalues, but the image doesn't look so. And are you sure you want to shift by width/2 if you have align=center? Anyways, to changing the image size:
No, I am not sure there is no other way, but I don't see anything in the manual at a glance. To set image size by hand:
fig = plt.figure(figsize=(5, 80))
ax = fig.add_subplot(111)
...your_code
the size is in cm. You can compute it beforehand, try for example
import numpy as np
fig_height = (max(yvalues) - min(yvalues)) / np.diff(yvalue)
this would (approximately) set the minimum distance between ticks to a centimeter, which is too much, but try to adjust it.
I think of two solutions for your case:
If you are trying to plot a histogram, use hist function [1]. This will automatically bin your data. You can even plot multiple overlapping histograms as long as you set alpha value lower than 1. See this post
import matplotlib.pyplot as plt
import numpy as np
x = mu + sigma*np.random.randn(10000)
plt.hist(x, 50, normed=1, facecolor='green',
alpha=0.75, orientation='horizontal')
You can also identify interval of your axis ticks. This will place a tick every 10 items. But I doubt this will solve your problem.
import matplotlib.ticker as ticker
...
ax.yaxis.set_major_locator(ticker.MultipleLocator(10))

Matplotlib: Adjust legend location/position

I'm creating a figure with multiple subplots. One of these subplots is giving me some trouble, as none of the axes corners or centers are free (or can be freed up) for placing the legend. What I'd like to do is to have the legend placed somewhere in between the 'upper left' and 'center left' locations, while keeping the padding between it and the y-axis equal to the legends in the other subplots (that are placed using one of the predefined legend location keywords).
I know I can specify a custom position by using loc=(x,y), but then I can't figure out how to get the padding between the legend and the y-axis to be equal to that used by the other legends. Would it be possible to somehow use the borderaxespad property of the first legend? Though I'm not succeeding at getting that to work.
Any suggestions would be most welcome!
Edit: Here is a (very simplified) illustration of the problem:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(1, 2, sharex=False, sharey=False)
ax[0].axhline(y=1, label='one')
ax[0].axhline(y=2, label='two')
ax[0].set_ylim([0.8,3.2])
ax[0].legend(loc=2)
ax[1].axhline(y=1, label='one')
ax[1].axhline(y=2, label='two')
ax[1].axhline(y=3, label='three')
ax[1].set_ylim([0.8,3.2])
ax[1].legend(loc=2)
plt.show()
What I'd like is that the legend in the right plot is moved down somewhat so it no longer overlaps with the line.
As a last resort I could change the axis limits, but I would very much like to avoid that.
I saw the answer you posted and tried it out. The problem however is that it is also depended on the figure size.
Here's a new try:
import numpy
import matplotlib.pyplot as plt
x = numpy.linspace(0, 10, 10000)
y = numpy.cos(x) + 2.
x_value = .014 #Offset by eye
y_value = .55
fig, ax = plt.subplots(1, 2, sharex = False, sharey = False)
fig.set_size_inches(50,30)
ax[0].plot(x, y, label = "cos")
ax[0].set_ylim([0.8,3.2])
ax[0].legend(loc=2)
line1 ,= ax[1].plot(x,y)
ax[1].set_ylim([0.8,3.2])
axbox = ax[1].get_position()
fig.legend([line1], ["cos"], loc = (axbox.x0 + x_value, axbox.y0 + y_value))
plt.show()
So what I am now doing is basically getting the coordinates from the subplot. I then create the legend based on the dimensions of the entire figure. Hence, the figure size does not change anything to the legend positioning anymore.
With the values for x_value and y_value the legend can be positioned in the subplot. x_value has been eyeballed for a good correspondence with the "normal" legend. This value can be changed at your desire. y_value determines the height of the legend.
Good luck!
After spending way too much time on this, I've come up with the following satisfactory solution (the Transformations Tutorial definitely helped):
bapad = plt.rcParams['legend.borderaxespad']
fontsize = plt.rcParams['font.size']
axline = plt.rcParams['axes.linewidth'] #need this, otherwise the result will be off by a few pixels
pad_points = bapad*fontsize + axline #padding is defined in relative to font size
pad_inches = pad_points/72.0 #convert from points to inches
pad_pixels = pad_inches*fig.dpi #convert from inches to pixels using the figure's dpi
Then, I found that both of the following work and give the same value for the padding:
# Define inverse transform, transforms display coordinates (pixels) to axes coordinates
inv = ax[1].transAxes.inverted()
# Inverse transform two points on the display and find the relative distance
pad_axes = inv.transform((pad_pixels, 0)) - inv.transform((0,0))
pad_xaxis = pad_axes[0]
or
# Find how may pixels there are on the x-axis
x_pixels = ax[1].transAxes.transform((1,0)) - ax[1].transAxes.transform((0,0))
# Compute the ratio between the pixel offset and the total amount of pixels
pad_xaxis = pad_pixels/x_pixels[0]
And then set the legend with:
ax[1].legend(loc=(pad_xaxis,0.6))
Plot:

Categories

Resources