I am trying to use Pandas DataFrame.plot() to plot two variable bar plots side by side with the following code:
fig, (ax1, ax2) = plt.subplots(1,2)
ax1 = train_df['Condition1'].value_counts().plot(kind='bar')
ax2 = train_df['Condition2'].value_counts().plot(kind='bar')
plt.show()
The result is this:
The data is Kaggle's House Prices dataset, however I do not think it matters to answering the question. I have tried this with multiple pairs of variables just to be sure. It only ever shows one plot on the right.
Interestingly enough, the assignment of axes does not matter. If you only assign ax1, it will show in the right hand plot. If you only assign ax2, it will be on the right side.
This occurs no matter what orientation I choose for my subplots: (2,), (1,2), (2,1). There is always one empty plot.
What's going on here?
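You already created the axes with your first line of code. Your second and third lines rebind the names ax1 and ax2 to whatever plot() returns, and because no ax argument is given, both calls draw on the current axes (the last subplot created).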
You need to pass ax1 and ax2 as arguments to pandas' plot function instead.
Try this:
fig, (ax1, ax2) = plt.subplots(1,2)
train_df['Condition1'].value_counts().plot(kind='bar', ax=ax1)
train_df['Condition2'].value_counts().plot(kind='bar', ax=ax2)
plt.show()
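To see why the original code leaves the left subplot empty, here is a minimal sketch. The data is made up (it is not the Kaggle dataset), and the Agg backend is selected only so that the script runs without a display. It assumes that Series.plot without ax= falls back to the current axes, which after plt.subplots(1, 2) is the last one created.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed only so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the two columns in the question
train_df = pd.DataFrame({
    "Condition1": ["Norm", "Norm", "Feedr", "Artery", "Norm"],
    "Condition2": ["Norm", "Norm", "Norm", "Feedr", "Norm"],
})

# Without ax=, the bars land on the current axes (the right-hand subplot)
fig_a, (left_a, right_a) = plt.subplots(1, 2)
drawn_on = train_df["Condition1"].value_counts().plot(kind="bar")

# With ax=, each plot goes exactly where it is told to go
fig_b, (left_b, right_b) = plt.subplots(1, 2)
train_df["Condition1"].value_counts().plot(kind="bar", ax=left_b)
train_df["Condition2"].value_counts().plot(kind="bar", ax=right_b)
```

In the first figure, drawn_on is the right-hand axes and the left one stays empty, which matches the behaviour described in the question.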
I'm trying to set y-axis limit for a certain subplot using plt.ylim. (in my example, the plot on ax1)
However, no matter where I put the command plt.ylim((10,20)), it only works on the last subplot (in the following example, it is the plot on ax2).
fig, (ax1,ax2) = plt.subplots(2,1)
x=range(1,100)
y=range(1,100)
plt.ylim((10,20))
ax1.plot(x,y)
ax2.plot(x,y)
Only ax2 will be limited and ax1 will still be in the original range.
fig, (ax1,ax2) = plt.subplots(2,1)
x=range(1,100)
y=range(1,100)
ax1.plot(x,y)
ax2.plot(x,y)
plt.ylim((10,20))
Running the two blocks of code will produce the same result. I know I can also use other methods like plt.setp(ax1, ylim=[10,20]). But I'd like to know how to use plt.ylim properly.
Thank you very much in advance!
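One possible approach, sketched under the assumption that the goal is to keep using plt.ylim: pyplot functions act on the current axes, so making ax1 current with plt.sca first should redirect the call. The Agg backend line is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed only so this runs anywhere
import matplotlib.pyplot as plt

fig, (ax1, ax2) = plt.subplots(2, 1)
x = range(1, 100)
y = range(1, 100)
ax1.plot(x, y)
ax2.plot(x, y)

plt.sca(ax1)        # make ax1 the current axes
plt.ylim((10, 20))  # now applies to ax1 instead of ax2

# Equivalent without relying on the current axes: ax1.set_ylim(10, 20)
```

Calling the method on the Axes object directly (ax1.set_ylim) avoids depending on which axes happens to be current.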
I am trying to plot box plots and violin plots for three variables against one variable in a 3x2 subplot grid, but I cannot figure out how to combine seaborn (sns) with the subplots function.
#plots=plt.figure()
axis=plt.subplots(nrows=3,ncols=3)
for i,feature in enumerate(list(df.columns.values)[:-1]):
    axis[i].plot(sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature))
    i+=1
    axis[i].plot(sns.violinplot(data=df,x='survival_status_after_5yrs',y=feature))
plt.show()
I am expecting a 3x2 grid of subplots: the x axis stays the same throughout, and the y axis cycles through the three variables I mentioned.
Thanks for your help.
I think you have two problems.
First, plt.subplots(nrows=3, ncols=2) returns a figure object and an array of axes objects so you should replace this line with:
fig, ax = plt.subplots(nrows=3, ncols=2). The ax object is now a 3x2 numpy array of axes objects.
You could turn this into a 1-d array with ax = ax.flatten() but given what I think you are trying to do I think it is easier to keep as 3x2.
(Btw I assume the ncols=3 is a typo)
Second, as Ewoud's answer mentions, with seaborn you pass the axes to plot on as an argument to the plot call.
I think the following will work for you:
fig, ax = plt.subplots(nrows=3, ncols=2)
for i, feature in enumerate(list(df.columns.values)[:-1]):
    # for each feature create two plots on the same row
    sns.boxplot(data=df, x='survival_status_after_5yrs', y=feature, ax=ax[i, 0])
    sns.violinplot(data=df, x='survival_status_after_5yrs', y=feature, ax=ax[i, 1])
plt.show()
Most seaborn plot functions have an ax kwarg, so instead of
axis[i].plot(sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature))
try
sns.boxplot(data=df,x='survival_status_after_5yrs',y=feature,ax=axis[i])
I'm trying to share two subplots axes, but I need to share the x axis after the figure was created. E.g. I create this figure:
import numpy as np
import matplotlib.pyplot as plt
t = np.arange(1000)/100.
x = np.sin(2*np.pi*10*t)
y = np.cos(2*np.pi*10*t)
fig = plt.figure()
ax1 = plt.subplot(211)
plt.plot(t,x)
ax2 = plt.subplot(212)
plt.plot(t,y)
# some code to share both x axes
plt.show()
Instead of the comment I want to insert some code to share both x axes.
How do I do this? There are some relevant-sounding attributes,
_shared_x_axes and _shared_y_axes, when I inspect the figure's axes (fig.get_axes()), but I don't know how to link them.
The usual way to share axes is to create the shared properties at creation. Either
fig=plt.figure()
ax1 = plt.subplot(211)
ax2 = plt.subplot(212, sharex = ax1)
or
fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
Sharing the axes after they have been created should therefore not be necessary.
However, if for any reason you need to share axes after they have been created (for example, when a different library creates some of the subplots for you), there is still a solution:
Using
ax1.get_shared_x_axes().join(ax1, ax2)
creates a link between the two axes, ax1 and ax2. In contrast to the sharing at creation time, you will have to set the xticklabels off manually for one of the axes (in case that is wanted).
A complete example:
import numpy as np
import matplotlib.pyplot as plt
t= np.arange(1000)/100.
x = np.sin(2*np.pi*10*t)
y = np.cos(2*np.pi*10*t)
fig=plt.figure()
ax1 = plt.subplot(211)
ax2 = plt.subplot(212)
ax1.plot(t,x)
ax2.plot(t,y)
ax1.get_shared_x_axes().join(ax1, ax2)
ax1.set_xticklabels([])
# ax2.autoscale() ## call autoscale if needed
plt.show()
The other answer has code for dealing with a list of axes:
axes[0].get_shared_x_axes().join(axes[0], *axes[1:])
As of Matplotlib v3.3 there are Axes.sharex and Axes.sharey methods, and calling join() on the grouper returned by get_shared_x_axes() has been deprecated since v3.6:
ax1.sharex(ax2)
ax1.sharey(ax3)
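A minimal sketch of sharing after creation with Axes.sharex, assuming Matplotlib 3.3 or newer. The data and the three-row layout are made up for illustration, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed only so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np

t = np.arange(1000) / 100.0
fig, axes = plt.subplots(3, 1)  # created without sharex
for ax, k in zip(axes, (1, 2, 3)):
    ax.plot(t * k, np.sin(2 * np.pi * t))  # each row has a different x range

# Share afterwards: every other axes follows the first one
for ax in axes[1:]:
    ax.sharex(axes[0])

axes[0].set_xlim(2, 5)  # the new limits propagate to the other two axes
```

Note that an axes can only be given a shared x axis this way once, so it is easiest to pick one reference axes and attach the others to it.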
Just to add to ImportanceOfBeingErnest's answer above:
If you have an entire list of axes objects, you can pass them all at once and have their axes shared by unpacking the list like so:
ax_list = [ax1, ax2, ... axn] #< your axes objects
ax_list[0].get_shared_x_axes().join(ax_list[0], *ax_list)
The above will link all of them together. Of course, you can get creative and sub-set your list to link only some of them.
Note:
The grouper returned by .get_shared_x_axes() is not specific to the axes you call it on, so to link all axes together you do have to include the first element of ax_list in the join() call, even though you invoke .get_shared_x_axes() on that first element.
So doing this, which might appear logical:
ax_list[0].get_shared_x_axes().join(*ax_list[1:])
... will result in linking all axes objects together except the first one, which will remain entirely independent from the others.
Pretty much what it says in the title. Most pandas examples suggest doing fig = plt.figure() before df.plot(..). But if I do that, two figures pop up after plt.show(): the first completely empty and the second with the actual pandas figure. Any ideas why?
On a DataFrame, df.plot(..) will create a new figure, unless you provide an Axes object to the ax keyword argument.
So you are correct that the plt.figure() is not needed in this case. The plt.figure() calls in the pandas documentation should be removed, as they indeed are not needed. There is an issue about this: https://github.com/pydata/pandas/issues/8776
What you can do with the ax keyword is, e.g.:
fig, ax = plt.subplots()
df.plot(..., ax=ax)
Note that when plotting a Series, this will by default plot on the current axes (plt.gca()) if you don't provide ax.
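The difference can be checked by counting open figures. This is a small sketch with made-up data, and the Agg backend is used only so it runs without a display. It assumes the DataFrame.plot behaviour described above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, assumed only so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"a": [1, 3, 2], "b": [2, 1, 4]})  # made-up data

plt.close("all")
empty_fig = plt.figure()  # the "extra" figure from the question
ax_new = df.plot()        # no ax=: pandas opens a second figure
n_figs_without_ax = len(plt.get_fignums())

plt.close("all")
fig, ax = plt.subplots()
ax_same = df.plot(ax=ax)  # drawn on the axes we created ourselves
n_figs_with_ax = len(plt.get_fignums())
```

Here n_figs_without_ax counts both the empty figure and the one pandas created, while passing ax=ax keeps everything in a single figure.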
I'm puzzled by the meaning of the 'ax' keyword in the pandas scatter_matrix function:
pd.scatter_matrix(frame, alpha=0.5, figsize=None, ax=None, grid=False, diagonal='hist', marker='.', density_kwds={}, hist_kwds={}, **kwds)
The only clue given in the docstring for the ax keyword is too cryptic for me:
ax : Matplotlib axis object
I had a look in the pandas code for the scatter_matrix function, and the ax variable is incorporated in the following matplotlib subplots call:
fig, axes = plt.subplots(nrows=n, ncols=n, figsize=figsize, ax=ax,
squeeze=False)
But, for the life of me, I can't find any reference to an 'ax' keyword in matplotlib subplots!
Can anyone tell me what this ax keyword is for???
This is tricky here. When looking at the source of pandas scatter_matrix you will find this line right after the docstring:
fig, axes = _subplots(nrows=n, ncols=n, figsize=figsize, ax=ax, squeeze=False)
Hence, internally, a new figure and axes combination is created using the internal _subplots function. It is closely related to matplotlib's subplots command but slightly different: here, the ax keyword is supplied as well. If you look at the corresponding source (pandas.tools.plotting._subplots) you will find these lines:
if ax is None:
    fig = plt.figure(**fig_kw)
else:
    fig = ax.get_figure()
    fig.clear()
Hence, if you supply an axes object (e.g. one created using matplotlib's subplots command), pandas scatter_matrix grabs the corresponding (matplotlib) figure object and deletes its content. Afterwards, a new subplots grid is created in this figure object.
All in all, the ax keyword allows you to plot the scatter matrix into a given figure (even though IMHO in a slightly strange way).
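A small sketch of that behaviour, assuming a recent pandas where the function lives at pd.plotting.scatter_matrix (the question uses the older pd.scatter_matrix name). The random data is made up, and warnings are silenced because pandas may warn that the figure is being cleared.

```python
import warnings

import matplotlib
matplotlib.use("Agg")  # headless backend, assumed only so this runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.normal(size=(50, 3)), columns=["a", "b", "c"])

fig, ax = plt.subplots(figsize=(6, 6))
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn that the figure is cleared
    axes = pd.plotting.scatter_matrix(frame, ax=ax)

# The axes we passed in has been discarded; the grid lives in the same figure
same_figure = axes[0, 0].figure is fig
original_ax_kept = ax in fig.axes
```

So the ax argument effectively selects the figure to draw into, rather than a single subplot to draw on.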
In short, it targets a subplot within a grid.
If you have nrows=2 and ncols=2, for example, then ax allows you to plot on a specific axis by passing ax=axes[0,0] (top left) or ax=axes[1,1] (bottom right), etc.
When you create the subplots, you receive an axes variable. You can later plot (or subplot) with an element of that axes variable as above.
Take a look at the "Targeting different subplots" section of this page: http://pandas.pydata.org/pandas-docs/dev/visualization.html#targeting-different-subplots
I hope this helps.