Sub Plots using Seaborn - python

I am trying to draw box plots and violin plots of three variables against a fourth variable in a 3x2 grid of subplots, but I cannot figure out how to combine seaborn with the subplots function.
#plots=plt.figure()
axis=plt.subplots(nrows=3,ncols=3)
for i,feature in enumerate(list(df.columns.values)[:-1]):
    axis[i].plot(sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature))
    i+=1
    axis[i].plot(sns.violinplot(data=df,x='survival_status_after_5yrs',y=feature))
plt.show()
I am expecting a 3x2 grid of subplots in which the x axis stays the same throughout and the y axis cycles through the three variables I have mentioned.
Thanks for your help.

I think you have two problems.
First, plt.subplots(nrows=3, ncols=2) returns a figure object and an array of axes objects, so you should replace this line with:
fig, ax = plt.subplots(nrows=3, ncols=2). The ax object is now a 3x2 numpy array of axes objects.
You could turn this into a 1-d array with ax = ax.flatten(), but given what I think you are trying to do, it is easier to keep it as 3x2.
(Btw I assume the ncols=3 is a typo)
Second, as Ewoud's answer mentions, with seaborn you pass the axes to plot on as an argument to the plot call.
I think the following will work for you:
fig, ax = plt.subplots(nrows=3, ncols=2)
for i, feature in enumerate(list(df.columns.values)[:-1]):
    # for each feature create two plots on the same row
    sns.boxplot(data=df, x='survival_status_after_5yrs', y=feature, ax=ax[i, 0])
    sns.violinplot(data=df, x='survival_status_after_5yrs', y=feature, ax=ax[i, 1])
plt.show()
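For anyone who wants to try this without the original data, here is a self-contained sketch of the same idea. The DataFrame is synthetic, and the feature column names (age, operation_year, positive_nodes) are made up for illustration since the asker's df is not shown; only survival_status_after_5yrs comes from the question.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Synthetic stand-in for the asker's data. The three feature names are
# hypothetical; only the target column name is taken from the question.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "age": rng.integers(30, 80, size=200),
    "operation_year": rng.integers(58, 70, size=200),
    "positive_nodes": rng.poisson(4, size=200),
    "survival_status_after_5yrs": rng.choice([1, 2], size=200),
})

target = "survival_status_after_5yrs"
features = [col for col in df.columns if col != target]

# One row per feature: box plot in the left column, violin plot in the right.
fig, axes = plt.subplots(nrows=len(features), ncols=2, figsize=(8, 9))
for row, feature in zip(axes, features):
    sns.boxplot(data=df, x=target, y=feature, ax=row[0])
    sns.violinplot(data=df, x=target, y=feature, ax=row[1])

fig.tight_layout()
# Call plt.show() here to display the figure interactively.
```

Iterating over the rows of the axes array with zip avoids keeping a separate index in step with the feature list.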

Most seaborn plot functions have an ax kwarg, so instead of
axis[i].plot(sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature))
try
sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature,ax=axis[i])

Related

set custom tick labels on heatmap color bar

I have a list of dataframes named merged_dfs that I am looping through to get the correlation and plot subplots of heatmap correlation matrix using seaborn.
I want to customize the colorbar tick labels, but I am having trouble figuring out how to do it with my example.
Currently, my colorbar scale values from top to bottom are
[1,0.5,0,-0.5,-1]
I want to keep these values, but change the tick labels to be
[1,0.5,0,0.5,1]
for my diverging color bar.
Here is the code and my attempt:
fig, ax = plt.subplots(nrows=6, ncols=2, figsize=(20,20))
for i, (title,merging) in enumerate (zip(new_name_data,merged_dfs)):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    a = sns.heatmap(graph.abs(), cmap=colormap, vmin=-1,vmax=1,center=0,annot = graph, ax=ax.flat[i])
    cbar = fig.colorbar(a)
    cbar.set_ticklabels(["1","0.5","0","0.5","1"])
fig.delaxes(ax[5,1])
plt.show()
plt.close()
I keep getting this error:
AttributeError: 'AxesSubplot' object has no attribute 'get_array'
Several things are going wrong:
fig.colorbar(...) would create a new colorbar, by default appended to the last subplot that was created.
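sns.heatmap returns an ax (the subplot it drew on). This is very different from matplotlib functions, e.g. plt.imshow(), which return the graphical element that was plotted. fig.colorbar(...) expects such a graphical element (a "mappable"), which is why passing it the ax raises the get_array error.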
You can suppress the heatmap's colorbar (cbar=False), and then create it newly with the parameters you want.
fig.colorbar(...) needs a parameter ax=... when the figure contains more than one subplot.
Instead of creating a new colorbar, you can add the colorbar parameters to sns.heatmap via cbar_kws=.... The colorbar itself can be found via ax.collections[0].colorbar. (ax.collections[0] is where matplotlib stores the graphical object that contains the heatmap.)
Looping over an index is usually discouraged in Python when you can iterate over the objects directly. It is more readable, easier to maintain and less error-prone to put everything, including the axes, into the zip call.
Since your vmin is now -1, taking the absolute value for the coloring seems to be a mistake.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
merged_dfs = [pd.DataFrame(data=np.random.rand(5, 7), columns=[*'ABCDEFG']) for _ in range(5)]
new_name_data = [f'Dataset {i + 1}' for i in range(len(merged_dfs))]
ticks = [-1, -0.5, 0, 0.5, 1]  # colorbar tick positions, as listed in the question
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 7))
for title, merging, ax in zip(new_name_data, merged_dfs, axes.flat):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    sns.heatmap(graph, cmap=colormap, vmin=-1, vmax=1, center=0, annot=True, ax=ax, cbar_kws={'ticks': ticks})
    # relabel the ticks with their absolute values
    ax.collections[0].colorbar.set_ticklabels([abs(t) for t in ticks])
    ax.set_title(title)
fig.delaxes(axes.flat[-1])  # remove the unused sixth subplot
fig.tight_layout()
plt.show()
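To make the point about return values concrete, here is a small single-heatmap sketch. It assumes a recent seaborn and matplotlib and uses random data; the names returned, mesh and cbar are only for illustration. It shows that sns.heatmap hands back the Axes, while the colorbar is reached through the mesh stored in ax.collections.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.collections import QuadMesh

# Random data, only there to have a correlation matrix to draw.
rng = np.random.default_rng(0)
corr = pd.DataFrame(rng.random((20, 4)), columns=list("ABCD")).corr()

ticks = [-1, -0.5, 0, 0.5, 1]
fig, ax = plt.subplots()
returned = sns.heatmap(corr, vmin=-1, vmax=1, center=0, ax=ax,
                       cbar_kws={"ticks": ticks})

# sns.heatmap returns the Axes it drew on, not the plotted element.
print(returned is ax)

# The plotted element (a QuadMesh) is the first collection on the Axes,
# and the colorbar seaborn created is attached to it.
mesh = ax.collections[0]
cbar = mesh.colorbar
print(type(mesh).__name__)

# Keep the tick positions, but label them with their absolute values.
cbar.set_ticklabels([str(abs(t)) for t in ticks])
# Call plt.show() here to display the figure interactively.
```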

Subplots not populating correctly

I am trying to use Pandas DataFrame.plot() to plot two variable bar plots side by side with the following code:
fig, (ax1, ax2) = plt.subplots(1,2)
ax1 = train_df['Condition1'].value_counts().plot(kind='bar')
ax2 = train_df['Condition2'].value_counts().plot(kind='bar')
plt.show()
The result is this: [image: two subplots side by side, the left one empty and a single bar plot in the right one]
The data is Kaggle's House Prices dataset, however I do not think it matters to answering the question. I have tried this with multiple pairs of variables just to be sure. It only ever shows one plot on the right.
Interestingly enough, the assignment of axes does not matter. If you only assign ax1, it will show in the right hand plot. If you only assign ax2, it will be on the right side.
This occurs no matter what orientation I choose for my subplots (2,) (1,2), (2,1). Always one empty plot.
What's going on here?
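You already created the axes with your first line of code. Your second and third lines do not draw on them; they only rebind the names ax1 and ax2 to whatever axes pandas plotted on. Without an ax argument, pandas draws on the current axes, which is the last subplot created, so both bar plots end up in the right-hand subplot.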
You need to pass ax1 and ax2 as arguments to pandas' plot function instead.
Try this:
fig, (ax1, ax2) = plt.subplots(1,2)
train_df['Condition1'].value_counts().plot(kind='bar', ax=ax1)
train_df['Condition2'].value_counts().plot(kind='bar', ax=ax2)
plt.show()
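Here is a runnable sketch of the same fix with a tiny made-up train_df (the category values are placeholders, not the Kaggle data). It also shows that the plot call returns the very Axes you passed in, so there is nothing to reassign.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for train_df; only the column names come from the question.
train_df = pd.DataFrame({
    "Condition1": ["Norm", "Feedr", "Norm", "Artery", "Norm", "Feedr"],
    "Condition2": ["Norm", "Norm", "Norm", "Feedr", "Norm", "Norm"],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Passing ax= tells pandas which subplot to draw on.
returned1 = train_df["Condition1"].value_counts().plot(kind="bar", ax=ax1, title="Condition1")
returned2 = train_df["Condition2"].value_counts().plot(kind="bar", ax=ax2, title="Condition2")

# The return value is the Axes that was drawn on, i.e. the one passed in.
print(returned1 is ax1, returned2 is ax2)

fig.tight_layout()
# Call plt.show() here to display the figure interactively.
```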

How to share x axes of two subplots after they have been created

I'm trying to share the axes of two subplots, but I need to share the x axis after the figure was created. E.g. I create this figure:
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(1000)/100.
x = np.sin(2*np.pi*10*t)
y = np.cos(2*np.pi*10*t)
fig = plt.figure()
ax1 = plt.subplot(211)
plt.plot(t,x)
ax2 = plt.subplot(212)
plt.plot(t,y)
# some code to share both x axes
plt.show()
Instead of the comment I want to insert some code to share both x axes.
How do I do this? There are some relevant-sounding attributes,
_shared_x_axes and _shared_y_axes, when I inspect the figure's axes (fig.get_axes()), but I don't know how to link them.
The usual way to share axes is to create the shared properties at creation. Either
fig=plt.figure()
ax1 = plt.subplot(211)
ax2 = plt.subplot(212, sharex = ax1)
or
fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
Sharing the axes after they have been created should therefore not be necessary.
However, if for any reason you need to share axes after they have been created (for example, because a different library created some of the subplots), there is still a solution:
Using
ax1.get_shared_x_axes().join(ax1, ax2)
creates a link between the two axes, ax1 and ax2. In contrast to the sharing at creation time, you will have to set the xticklabels off manually for one of the axes (in case that is wanted).
A complete example:
import numpy as np
import matplotlib.pyplot as plt
t= np.arange(1000)/100.
x = np.sin(2*np.pi*10*t)
y = np.cos(2*np.pi*10*t)
fig=plt.figure()
ax1 = plt.subplot(211)
ax2 = plt.subplot(212)
ax1.plot(t,x)
ax2.plot(t,y)
ax1.get_shared_x_axes().join(ax1, ax2)
ax1.set_xticklabels([])
# ax2.autoscale() ## call autoscale if needed
plt.show()
The other answer has code for dealing with a list of axes:
axes[0].get_shared_x_axes().join(axes[0], *axes[1:])
As of Matplotlib v3.3 there now exist Axes.sharex, Axes.sharey methods:
ax1.sharex(ax2)
ax1.sharey(ax3)
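A short sketch of the sharex route, assuming Matplotlib 3.3 or later. As far as I know, newer releases deprecate modifying the grouper returned by get_shared_x_axes() (the join call used above), so this may be the safer option there. The data mirrors the question's example, except that the second curve is deliberately shorter so the effect of sharing is visible.

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(1000) / 100.0

fig = plt.figure()
ax1 = fig.add_subplot(211)
ax2 = fig.add_subplot(212)
ax1.plot(t, np.sin(2 * np.pi * 10 * t))
# The second curve covers only half the range, so the x limits differ at first.
ax2.plot(t[:500], np.cos(2 * np.pi * 10 * t[:500]))

# Link the x axis of ax2 to that of ax1 after both have been created.
ax2.sharex(ax1)

# Sharing after creation does not hide tick labels, so do that by hand if wanted.
ax1.tick_params(labelbottom=False)

# Changing the limits on one axes is now reflected on the other.
ax1.set_xlim(0, 2)
print(ax2.get_xlim())
# Call plt.show() here to display the figure interactively.
```

For a whole list of axes, calling ax.sharex(axes[0]) on every element after the first should give the same result.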
Just to add to ImportanceOfBeingErnest's answer above:
If you have an entire list of axes objects, you can pass them all at once and have their axes shared by unpacking the list like so:
ax_list = [ax1, ax2, ... axn] #< your axes objects
ax_list[0].get_shared_x_axes().join(ax_list[0], *ax_list)
The above will link all of them together. Of course, you can get creative and sub-set your list to link only some of them.
Note:
In order to have all axes linked together, you do have to include the first element of ax_list in the call, despite the fact that you are invoking .get_shared_x_axes() on the first element to start with!
So doing this, which would certainly appear logical:
ax_list[0].get_shared_x_axes().join(ax_list[0], *ax_list[1:])
... will result in linking all axes objects together except the first one, which will remain entirely independent from the others.

Purpose of 'ax' keyword in pandas scatter_matrix function

I'm puzzled by the meaning of the 'ax' keyword in the pandas scatter_matrix function:
pd.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False, diagonal='hist', marker='.', density_kwds={}, hist_kwds={}, **kwds)
The only clue given in the docstring for the ax keyword is too cryptic for me:
ax : Matplotlib axis object
I had a look in the pandas code for the scatter_matrix function, and the ax variable is incorporated in the following matplotlib subplots call:
fig, axes = plt.subplots(nrows=n, ncols=n, figsize=figsize, ax=ax,
                         squeeze=False)
But, for the life of me, I can't find any reference to an 'ax' keyword in matplotlib subplots!
Can anyone tell me what this ax keyword is for???
This one is tricky. When looking at the source of pandas scatter_matrix you will find this line right after the docstring:
fig, axes = _subplots(nrows=n, ncols=n, figsize=figsize, ax=ax, squeeze=False)
Hence, internally, a new figure and axes combination is created using the internal _subplots method. This is closely related to matplotlib's subplots command but slightly different. Here, the ax keyword is supplied as well. If you look at the corresponding source (pandas.tools.plotting._subplots) you will find these lines:
if ax is None:
    fig = plt.figure(**fig_kw)
else:
    fig = ax.get_figure()
    fig.clear()
Hence, if you supply an axes object (e.g. created using matplotlib's subplots command), pandas scatter_matrix grabs the corresponding (matplotlib) figure object and deletes its content. Afterwards a new subplots grid is created in this figure object.
All in all, the ax keyword allows you to plot the scatter matrix into a given figure (even though IMHO in a slightly strange way).
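A quick way to see this behaviour is the following sketch. It assumes a recent pandas, where the function lives at pandas.plotting.scatter_matrix and may emit a UserWarning about clearing the figure; the frame is random data with made-up column names.

```python
import warnings

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix

# Random data with hypothetical column names, just to have something to plot.
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

# Create a figure with a single Axes and hand that Axes to scatter_matrix.
fig, ax = plt.subplots(figsize=(6, 6))

with warnings.catch_warnings():
    # Recent pandas versions may warn that the figure holding `ax` is cleared.
    warnings.simplefilter("ignore", UserWarning)
    axes = scatter_matrix(frame, alpha=0.5, ax=ax, diagonal="hist")

# A new 3x3 grid is returned, one Axes per pair of columns.
print(axes.shape)
# All of the new Axes live in the figure that `ax` belonged to ...
print(all(a.figure is fig for a in axes.ravel()))
# ... while the Axes that was passed in has been removed from that figure.
print(ax in fig.axes)
# Call plt.show() here to display the figure interactively.
```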
In short, it targets a subplot within a grid.
If you have nrows=2 and ncols=2, for example, then ax allows you to plot on a specific axis by passing ax=axes[0,0] (top left) or ax=axes[1,1] (bottom right), etc.
When you create the subplots, you receive an axes variable. You can later plot (or subplot) with an element of that axes variable as above.
Take a look at the "Targeting different subplots" section of this page: http://pandas.pydata.org/pandas-docs/dev/visualization.html#targeting-different-subplots
I hope this helps.
