Seaborn (violinplot) too many y-axis values


I have a couple of subplots and the data are probabilities, so should (and do) range between 0 and 1. When I plot them with a violinplot the y-axis of ax[0] extends above 1 (see pic). I know this is just because of the distribution kernel that the violinplot makes, but still it looks bad and I want the y-axes of these 2 plots to be the same. I have tried set_ylim on the left plot, but then I can't get the values (or look) to be the same as the plot on the right. Any ideas?

When creating your subplots, set the sharey parameter to True so that both plots share the same limits for the vertical axis.
[EDIT]
Since you have already tried setting sharey to True, I suggest getting the lower and upper limits ymin and ymax from the left hand side figure and passing them as arguments in set_ylim() for the right hand side figure.
1) Create your subplots:
fig, ax = plt.subplots(1,2, figsize = (5, 5), dpi=100)
2) Create left hand side figure here: ax[0].plot(...)
3) Get the axes limits using the get_ylim() method as detailed here: ymin, ymax = ax[0].get_ylim()
4) Create right hand side figure: ax[1].plot(...)
5) Set the axes limits of this new figure: ax[1].set_ylim(bottom=ymin, top=ymax)
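Put together, the steps above might look like the following minimal sketch. The random data and the plain plot() calls are placeholders chosen for illustration; they stand in for the asker's data and violin plots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
left_data = rng.uniform(0, 1.2, size=50)   # placeholder data
right_data = rng.uniform(0, 1.0, size=50)  # placeholder data

# 1) Create the subplots
fig, ax = plt.subplots(1, 2, figsize=(5, 5), dpi=100)
# 2) Draw the left-hand plot
ax[0].plot(left_data)
# 3) Read the limits matplotlib chose for it
ymin, ymax = ax[0].get_ylim()
# 4) Draw the right-hand plot
ax[1].plot(right_data)
# 5) Apply the left-hand limits to the right-hand plot
ax[1].set_ylim(bottom=ymin, top=ymax)
```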

I don't have subplots, but I do have probabilities, and this visual extension beyond 1.0 was frustrating to me.
If you add 'cut=0' to the sns.violinplot() call, it will truncate the kernel at the range of your data exactly.
I found the answer here:
How to better fit seaborn violinplots?
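As a rough illustration of what cut=0 changes, the sketch below draws the same made-up probabilities twice and compares how far each violin extends. The beta-distributed sample and the helper violin_y_extent are assumptions for this example only; inner=None is used just to keep the drawn shapes limited to the violin body.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
from matplotlib.collections import PolyCollection

rng = np.random.default_rng(0)
probs = rng.beta(5, 1, size=200)  # made-up probabilities in [0, 1]

fig, (ax_default, ax_cut) = plt.subplots(1, 2, sharey=True)
sns.violinplot(y=probs, ax=ax_default, inner=None)     # default: kernel extends past the data
sns.violinplot(y=probs, ax=ax_cut, cut=0, inner=None)  # kernel truncated at the data range

def violin_y_extent(ax):
    """Return the (min, max) y-values covered by the violin bodies on ax."""
    ys = np.concatenate([
        path.vertices[:, 1]
        for coll in ax.collections
        if isinstance(coll, PolyCollection)
        for path in coll.get_paths()
    ])
    return ys.min(), ys.max()

default_extent = violin_y_extent(ax_default)
cut_extent = violin_y_extent(ax_cut)
```

With cut=0 the violin body should stay within the minimum and maximum of the data, whereas the default plot reaches beyond them.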

Related

Lineplot above clustermap [duplicate]

In a previous answer it was recommended to me to use add_subplot instead of add_axes to show axes correctly, but searching the documentation I couldn't understand when and why I should use either one of these functions.
Can anyone explain the differences?

Common grounds
Both add_axes and add_subplot add an axes to a figure. They both return a (subclass of a) matplotlib.axes.Axes object.
However, the mechanism which is used to add the axes differs substantially.
add_axes
The calling signature of add_axes is add_axes(rect), where rect is a list [x0, y0, width, height] denoting the lower left point of the new axes in figure coordinates (x0,y0) and its width and height. So the axes is positioned in absolute coordinates on the canvas. E.g.
fig = plt.figure()
ax = fig.add_axes([0,0,1,1])
places an axes in the figure that is exactly as large as the canvas itself.
add_subplot
The calling signature of add_subplot does not directly provide the option to place the axes at a predefined position. Instead, it lets you specify where the axes should be situated according to a subplot grid. The usual and easiest way to specify this position is the 3 integer notation,
fig = plt.figure()
ax = fig.add_subplot(231)
In this example a new axes is created at the first position (1) on a grid of 2 rows and 3 columns. To produce only a single axes, add_subplot(111) would be used (First plot on a 1 by 1 subplot grid). (In newer matplotlib versions, add_subplot() without any arguments is possible as well.)
The advantage of this method is that matplotlib takes care of the exact positioning. By default add_subplot(111) would produce an axes positioned at [0.125,0.11,0.775,0.77] or similar, which already leaves enough space around the axes for the title and the (tick)labels. However, this position may also change depending on other elements in the plot, titles set, etc.
It can also be adjusted using pyplot.subplots_adjust(...) or pyplot.tight_layout().
In most cases, add_subplot would be the preferred method to create axes for plots on a canvas. Only in cases where exact positioning matters, add_axes might be useful.
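To see the difference in positioning directly, one can inspect where each call ends up placing its axes. This is only a small sketch; the rectangle passed to add_axes is an arbitrary value picked for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt

fig = plt.figure()

# add_axes: the position is exactly the rectangle that was passed in
ax_abs = fig.add_axes([0.1, 0.2, 0.5, 0.6])
abs_bounds = ax_abs.get_position().bounds   # (x0, y0, width, height)

# add_subplot: matplotlib picks the position from the subplot grid and margins
ax_grid = fig.add_subplot(111)
grid_bounds = ax_grid.get_position().bounds

print("add_axes   :", abs_bounds)
print("add_subplot:", grid_bounds)
```

The second tuple depends on the current subplot parameters, so it can differ between setups.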
Example
import matplotlib.pyplot as plt
plt.rcParams["figure.figsize"] = (5,3)
fig = plt.figure()
fig.add_subplot(241)
fig.add_subplot(242)
ax = fig.add_subplot(223)
ax.set_title("subplots")
fig.add_axes([0.77,.3,.2,.6])
ax2 =fig.add_axes([0.67,.5,.2,.3])
fig.add_axes([0.6,.1,.35,.3])
ax2.set_title("random axes")
plt.tight_layout()
plt.show()
Alternative
The easiest way to obtain one or more subplots together with their handles is plt.subplots(). For one axes, use
fig, ax = plt.subplots()
or, if more subplots are needed,
fig, axes = plt.subplots(nrows=3, ncols=4)
The initial question
In the initial question an axes was placed using fig.add_axes([0,0,1,1]), such that it sits tight to the figure boundaries. The disadvantage of this is of course that ticks, ticklabels, axes labels and titles are cut off. Therefore I suggested in one of the comments to the answer to use fig.add_subplot as this will automatically allow for enough space for those elements, and, if this is not enough, can be adjusted using pyplot.subplots_adjust(...) or pyplot.tight_layout().

The answer by @ImportanceOfBeingErnest is great.
Yet in that context one usually just wants to generate an axes for a plot, and add_axes() has too much overhead for that.
So one trick, as in the answer of @ImportanceOfBeingErnest, is to use add_subplot(111).
A simpler and more elegant alternative would be:
hAx = plt.figure(figsize = (10, 10)).gca()
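If you want a 3D projection, older Matplotlib versions let you pass axes properties such as the projection to gca(). Passing keyword arguments to gca() was deprecated in Matplotlib 3.4 and later removed, so on current versions use add_subplot instead:
hAx = plt.figure(figsize = (16, 10)).add_subplot(projection = '3d')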

seaborn distplot different bar width on each figure

Sorry for giving an image however I think it is the best way to show my problem.
As you can see, the bin widths differ between subplots; as I understand it, each bin covers a range of rent_hour values. I am not sure why different subplots have different bin widths even though I didn't set any.
My code is as follows:
figure, axes = plt.subplots(nrows=4, ncols=3)
figure.set_size_inches(18,14)
plt.subplots_adjust(hspace=0.5)
for ax, age_g in zip(axes.ravel(), age_cat):
    group = total_usage_df.loc[(total_usage_df.age_group == age_g) & (total_usage_df.day_of_week <= 4)]
    sns.distplot(group.rent_hour, ax=ax, kde=False)
    ax.set(title=age_g)
    ax.set_xlim([0, 24])
figure.suptitle("Weekday usage pattern", size=25);
additional question:
The question "Seaborn : How to get the count in y axis for distplot using PairGrid" says that kde=False makes the y-axis show counts. However, the documentation at http://seaborn.pydata.org/generated/seaborn.distplot.html also uses kde=False and still seems to show something else. How can I set the y-axis to show counts?
I've tried
sns.distplot(group.rent_hour, ax=ax, norm_hist=True) and it still seems to give something else rather than count.
sns.distplot(group.rent_hour, ax=ax, kde=False) gives me count however I don't know why it is giving me count.

Answer 1:
From the documentation:
norm_hist : bool, optional
If True, the histogram height shows a density rather than a count.
This is implied if a KDE or fitted density is plotted.
So you need to take into account your bin width as well, i.e. compute the area under the curve and not just the sum of the bin heights.
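The relation between density heights and counts can be checked with plain NumPy. This is a sketch on made-up data (the hours array is hypothetical) and uses np.histogram rather than seaborn itself.

```python
import numpy as np

rng = np.random.default_rng(0)
hours = rng.integers(0, 24, size=1000)  # made-up stand-in for rent_hour

# Same bins, once as raw counts and once normalized to a density
counts, edges = np.histogram(hours, bins=12)
density, _ = np.histogram(hours, bins=12, density=True)

widths = np.diff(edges)
area = np.sum(density * widths)                    # area under the histogram
recovered_counts = density * widths * hours.size   # back from density to counts

print(area)
print(np.allclose(recovered_counts, counts))
```

The bar heights of a density histogram do not sum to one on their own; it is the heights multiplied by the bin widths that do.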
Answer 2:
# Plotting hist without kde
ax = sns.distplot(your_data, kde=False)
# Creating another Y axis
second_ax = ax.twinx()
# Plotting kde without hist on the second Y axis
sns.distplot(your_data, ax=second_ax, kde=True, hist=False)
# Removing Y ticks from the second axis
second_ax.set_yticks([])

matplotlib: align y-ticks in twinx [duplicate]

I created a matplotlib plot that has 2 y-axes. The y-axes have different scales, but I want the ticks and grid to be aligned. I am pulling the data from excel files, so there is no way to know the max limits beforehand. I have tried the following code.
# creates double-y axis
ax2 = ax1.twinx()
locs = ax1.yaxis.get_ticklocs()
ax2.set_yticks(locs)
The problem now is that the ticks on ax2 do not have labels anymore. Can anyone give me a good way to align ticks with different scales?

Aligning the tick locations of two different scales means giving up the nice automatic tick locator on the secondary axes and setting its ticks to the same positions as those on the original one.
The idea is to establish a relation between the two axes scales using a function and set the ticks of the second axes at the positions of those of the first.
import matplotlib.pyplot as plt
import matplotlib.ticker
fig, ax = plt.subplots()
# creates double-y axis
ax2 = ax.twinx()
ax.plot(range(5), [1,2,3,4,5])
ax2.plot(range(6), [13,17,14,13,16,12])
ax.grid()
l = ax.get_ylim()
l2 = ax2.get_ylim()
f = lambda x : l2[0]+(x-l[0])/(l[1]-l[0])*(l2[1]-l2[0])
ticks = f(ax.get_yticks())
ax2.yaxis.set_major_locator(matplotlib.ticker.FixedLocator(ticks))
plt.show()
Note that this is a solution for the general case and it might result in totally unreadable labels depending on the use case. If you happen to have more a priori information on the axes range, better solutions may be possible.
Also see this question for a case where automatic tick locations of the first axes is sacrificed for an easier setting of the secondary axes tick locations.

To anyone who's wondering (and for my future reference), the lambda function f in ImportanceofBeingErnest's answer maps the input left tick to a corresponding right tick through:
RHS tick = Bottom RHS tick + (% of LHS range traversed * RHS range)
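As a small worked example of that formula, here is the same linear mapping written as a named function. The name map_tick and the axis limits are made up for illustration.

```python
def map_tick(x, left_lim, right_lim):
    """Map a left-axis value x to the right axis by linear interpolation."""
    # Fraction of the left-hand range traversed by x
    fraction = (x - left_lim[0]) / (left_lim[1] - left_lim[0])
    # Same fraction of the right-hand range, starting from its bottom
    return right_lim[0] + fraction * (right_lim[1] - right_lim[0])

left_lim = (0.0, 10.0)      # hypothetical limits of the left axis
right_lim = (100.0, 300.0)  # hypothetical limits of the right axis

mapped = [map_tick(t, left_lim, right_lim) for t in (0.0, 2.5, 5.0, 10.0)]
print(mapped)  # → [100.0, 150.0, 200.0, 300.0]
```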
Refer to this question on tick formatting to truncate decimal places:
from matplotlib.ticker import FormatStrFormatter
ax2.yaxis.set_major_formatter(FormatStrFormatter('%.2f')) # ax2 is the RHS y-axis

L-shaped Gridspec using matplotlib gs.update

This one is a quick and easy one for the matplotlib community. I was looking to plot an L-shaped gridspec layout, which I have done:
Ignoring a few layout issues I have for the moment, what I have is that the x-axis in the gs[0] plot (top left) shares the x-axis with the gs[2] plot (bottom left) and the gs[2] shares its y axis with the gs[3] plot. Now, what I was hoping to do was update the w-space and h-space to be tighter. So that the axes are almost touching, so perhaps wspace=0.02, hspace=0.02 or something similar.
I was also hoping that the bottom right hand plot was to be longer in the horizontal orientation, keeping the two left hand plots square in shape. Or as close to square as possible. If someone could run through all of the parameters I would be very appreciative. I can tinker then in my own time.

To change the spacings of the plot with grid spec:
gs = gridspec.GridSpec(2, 2,width_ratios=[1,1.5],height_ratios=[1,1])
This changes the relative size of plots gs[0] and gs[2] to gs[1] and gs[3], whereas something like:
gs = gridspec.GridSpec(2, 2,width_ratios=[1,1],height_ratios=[1,2])
will change the relative sizes of plots gs[0] and gs[1] to gs[2] and gs[3].
The following will tighten up the plots:
gs.update(hspace=0.01, wspace=0.01)
This gave me the following plot:
I also used the following to remove the axis labels where needed:
nullfmt = plt.NullFormatter()
ax.yaxis.set_major_formatter(nullfmt)
ax.xaxis.set_major_formatter(nullfmt)
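A complete L-shaped layout along these lines might look like the sketch below. The figure size, ratios and spacing values are arbitrary choices for illustration, and tick_params is used here as one possible way to hide the labels between the touching axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
from matplotlib import gridspec

fig = plt.figure(figsize=(8, 6))

# 2x2 grid; the right column is 1.5 times as wide as the left one
gs = gridspec.GridSpec(2, 2, figure=fig, width_ratios=[1, 1.5], height_ratios=[1, 1])
gs.update(hspace=0.02, wspace=0.02)  # axes almost touching

# L shape: top left, bottom left and bottom right; gs[1] stays empty
ax_bottom_left = fig.add_subplot(gs[2])
ax_top_left = fig.add_subplot(gs[0], sharex=ax_bottom_left)
ax_bottom_right = fig.add_subplot(gs[3], sharey=ax_bottom_left)

# Hide the tick labels on the shared inner edges
ax_top_left.tick_params(labelbottom=False)
ax_bottom_right.tick_params(labelleft=False)
```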

Preventing xticks from overlapping yticks

How can I prevent the labels of xticks from overlapping with the labels of yticks when using hist (or other plotting commands) in matplotlib?

There are several ways.
One is to use the tight_layout method of the figure you are drawing, which will automatically try to optimize the appearance of the labels.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1)
ax.plot(np.arange(10), np.random.rand(10))
fig.tight_layout()
Another way is to modify the rcParams values for the tick formatting:
rcParams['xtick.major.pad'] = 6
This will draw the tick labels a little farther from the axes. After modifying this or any other rcParams value (you can find the complete list in your matplotlibrc configuration file), remember to set it back to the default with the rcdefaults function.
A third way is to tamper with the axes locator_params telling it to not draw the label in the corner:
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1)
ax.plot(np.arange(10), np.random.rand(10))
ax.locator_params(prune='lower',axis='both')
The axis keyword tells the locator which axis it should work on, and the prune keyword tells it to remove the lowest tick.

Try increasing the padding between the ticks and the labels
import matplotlib
matplotlib.rcParams['xtick.major.pad'] = 8 # default is 3.5 in Matplotlib 2.0 and later (4 in older versions)
matplotlib.rcParams['ytick.major.pad'] = 8
same goes for [x|y]tick.minor.pad.
Also, try setting: [x|y]tick.direction to 'out'. That gives you a little more room and helps make the ticks a little more visible -- especially on histograms with dark bars.
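If you would rather not change these settings globally, one option is to apply them only while a particular figure is created, using matplotlib's rc_context. This is a sketch with made-up histogram data; the padding value 8 is the same illustrative value as above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

pad_before = plt.rcParams["xtick.major.pad"]

temporary_settings = {
    "xtick.major.pad": 8,
    "ytick.major.pad": 8,
    "xtick.direction": "out",
    "ytick.direction": "out",
}

with plt.rc_context(temporary_settings):
    pad_inside = plt.rcParams["xtick.major.pad"]
    fig, ax = plt.subplots()
    ax.hist(np.random.default_rng(0).normal(size=200))  # made-up data
    fig.canvas.draw()  # render while the temporary settings are active

# Outside the block the previous values are back in place
pad_after = plt.rcParams["xtick.major.pad"]
```

This avoids having to call rcdefaults afterwards, since the previous values are restored when the with block ends.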
