How can I show pie charts side by side? - python

I have the following code, which draws the charts one below the other. How can I show them side by side? I will also add more pies to this code, so I would like to know how to do it for, say, 6 pies.
Thanks in advance
data["Gender"].value_counts().plot.pie(autopct="%.1f%%")
plt.show()
data["Education_Level"].value_counts().plot.pie(autopct="%.1f%%")

You can create a figure with a grid of subplots and then pass each subplot's axes to the plot call through the ax parameter. Here I'll create a grid with 1 row and 2 columns:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({'mass': [0.33, 4.87, 5.97],
                   'radius': [2439.7, 6051.8, 6378.1]},
                  index=['Mercury', 'Venus', 'Earth'])
fig, axs = plt.subplots(nrows=1, ncols=2, figsize=(12, 12))
df.plot.pie(y='mass', ax=axs[0])
df.plot.pie(y='radius', ax=axs[1])
plt.show()
The code above produces the following result:
In case you wanted 6 figures next to each other, set the ncols parameter to 6, and then pass through all 6 axes. Here's a quick demo.
fig, axs = plt.subplots(nrows=1, ncols=6, figsize=(12, 12))
for ax in axs:
    df.plot.pie(y='mass', ax=ax)  # plots the same pie 6 times
Be sure to read more about matplotlib and how figures/axes work from their documentation.
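Applied to the value_counts() calls from the question, the same idea could look like the sketch below. The DataFrame contents and the Marital_Status column are made up for illustration; only Gender and Education_Level come from the question. For 6 pies, list 6 columns, or use a 2 x 3 grid and keep iterating over axs.flat.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the asker's DataFrame
data = pd.DataFrame({
    "Gender": ["F", "M", "F", "M", "F", "F"],
    "Education_Level": ["Graduate", "High School", "Graduate",
                        "College", "College", "Graduate"],
    "Marital_Status": ["Married", "Single", "Single",
                       "Married", "Married", "Single"],
})

columns = ["Gender", "Education_Level", "Marital_Status"]

# One row, one subplot per column to plot
fig, axs = plt.subplots(nrows=1, ncols=len(columns), figsize=(12, 4))

# axs.flat also works unchanged for a 2-D grid such as nrows=2, ncols=3
for ax, column in zip(axs.flat, columns):
    data[column].value_counts().plot.pie(autopct="%.1f%%", ax=ax)
    ax.set_title(column)
    ax.set_ylabel("")  # drop the default y-label pandas adds to pies

fig.tight_layout()
```

To view the result, drop the Agg line and call plt.show(), or save it with fig.savefig.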

Related

Python - Synchronizing boxplot axis for comparison

This is my first question on Stack Overflow, so please let me know if my post isn't very clear.
How can I synchronize the range of two boxplots so that the x axis grids will be in line?
In the example below, I want the upper plot to also show grids from -10 to 10 like the lower plot, but I don't want to fix it to real numbers so that the box plots would be synchronized even if the dataset changes.
[Image: two boxplots]
fig, (ax0, ax1) = plt.subplots(2, 1, figsize=(10*mult, 8*mult), gridspec_kw={'height_ratios': [1, 4]})
sns_plot = sns.boxplot(y='Overall', x='RoR', data=data_s, ax=ax0, showfliers=False)
sns_plot.set_xlabel("")
sns_plot.set_ylabel("")
sns_plot = sns.boxplot(y='AUA Bucket', x='RoR', data=data_s, order=aua_buckets, ax=ax1,showfliers=False)
sns_plot.set_xlabel("")
sns_plot.set_ylabel("")
plt.subplots_adjust(left=0.12, bottom=0.05, right=0.98, top=0.98)
plt.savefig("dist_aua.png", format="png", dpi=75)
plt.close()
To illustrate with an example from the official reference: if you set the same x-axis limits on both axes, the grids will line up.
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_theme(style="whitegrid")
tips = sns.load_dataset("tips")
fig, (ax0, ax1) = plt.subplots(2, 1, figsize=(10, 8), gridspec_kw={'height_ratios': [1, 4]})
sns.boxplot(x=tips["total_bill"], ax=ax0, showfliers=False)
sns.boxplot(x=tips["total_bill"], y=tips['day'], ax=ax1, showfliers=False)
ax0.set_xlim(0,50)
ax1.set_xlim(0,50)
plt.show()
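Since the question asks not to hard-code the limits, one possible variation is to let the two axes share their x-axis with sharex=True, so both autoscale to the same range whatever the data is. The sketch below is an assumption-laden illustration rather than the answerer's method: it uses plain matplotlib boxplots and randomly generated data instead of seaborn and the tips dataset.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: four groups and their pooled "overall" sample
groups = [rng.normal(loc=mu, scale=3.0, size=200) for mu in (-2, 0, 1, 3)]
overall = np.concatenate(groups)

# sharex=True ties the x-limits (and therefore the grid lines) of both axes
fig, (ax0, ax1) = plt.subplots(
    2, 1, figsize=(10, 8), sharex=True,
    gridspec_kw={"height_ratios": [1, 4]},
)

# Horizontal boxplots without outliers, mirroring showfliers=False above
ax0.boxplot(overall, vert=False, showfliers=False)
ax1.boxplot(groups, vert=False, showfliers=False)

ax0.grid(True, axis="x")
ax1.grid(True, axis="x")
```

With seaborn the same sharex=True argument can be passed to plt.subplots before handing ax0 and ax1 to sns.boxplot.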

How to make the y-label of subplots appear only once?

I have a simple problem I need to solve. Here is the example:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(6, 6))
for axs in ax.flat:
    axs.set(ylabel='AUC')
This is the output:
I want the y-label (AUC) to appear only once (be shared) at the first subplot, while everything else stays as it is. This is the desired output:
How can I solve this?
Since you're setting your labels in a loop, you're labeling all the axes in your subplots accordingly. What you need is to only label the first cell in your subplot row.
So this:
for axs in ax.flat:
    axs.set(ylabel='AUC')
changes to:
ax[0].set_ylabel("AUC")
I also recommend sharing the axes between your subplots, since repeating the y-ticks on every subplot makes the figure harder to read. You can change the call as below:
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(6, 6), sharex=True, sharey=True)
The resulting image will be:
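Putting both suggestions together, a minimal runnable sketch might look like this. The plotted values are placeholders invented for illustration, and the figure size is changed only to make three panels fit comfortably.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# sharey=True keeps one common y-scale and hides the repeated tick labels
fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(9, 3), sharey=True)

# Placeholder curves, one per subplot
for i, axs in enumerate(ax.flat):
    axs.plot([0, 1, 2], [0.5 + 0.1 * i, 0.7, 0.9])

# Label only the left-most subplot
ax[0].set_ylabel("AUC")

fig.tight_layout()
```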

Adjust y-axis in Seaborn multiplot

I'm plotting a CSV file from my simulation results. The plot has three graphs in the same figure fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6)).
However, for comparison purposes I want the y-axis in every graph to start at zero and end at a specific value. I tried the solution mentioned here from the Seaborn author. I don't get any errors, but the solution does not work for me either.
Here's my script:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
fname = 'results/filename.csv'
def plot_file():
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6))
    df = pd.read_csv(fname, sep='\t')
    profits = \
        df.groupby(['providerId', 'periods'], as_index=False)['profits'].sum()
    # y-axis needs to start at zero and end at 10
    g = sns.lineplot(x='periods',
                     y='profits',
                     data=profits,
                     hue='providerId',
                     legend='full',
                     ax=axes[0])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='price',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[1])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='quality',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[2])
    g.set(ylim=(0, None))
    plt.show()
    print(g)  # -> AxesSubplot(0.672059,0.11;0.227941x0.77)
The resulting figure is as follows:
How can I adjust each individual plot?
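Based on the way you've written your code, each seaborn call returns the matplotlib Axes it drew on, so you can refer to each subplot with g.axes and use g.axes.set_ylim(low, high) right after each plot call. Your single g.set(ylim=(0, None)) only affects the last subplot, because g is reassigned by every plot call. (A difference compared to the linked answer is that your graphs are not being plotted on a seaborn FacetGrid.)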
An example using dummy data and different axis ranges to illustrate:
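import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.uniform(0, 10, (100, 2)), columns=['a', 'b'])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(8, 4))
g = sns.lineplot(x='a',
                 y='b',
                 data=df.sample(10),
                 ax=axes[0])
g.axes.set_ylim(0, 25)  # limits for the first subplot only
g = sns.scatterplot(x='a',
                    y='b',
                    data=df.sample(10),
                    ax=axes[1])
g.axes.set_ylim(0, 3.5)  # limits for the second subplot only
g = sns.scatterplot(x='a',
                    y='b',
                    data=df.sample(10),
                    ax=axes[2])
g.axes.set_ylim(0, 0.3)  # limits for the third subplot only
plt.tight_layout()
plt.show()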
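Because the axes array returned by plt.subplots holds the same Axes objects, the limits can also be set on axes[i] directly. The sketch below uses plain matplotlib and random placeholder data; the ranges (0 to 10, then 0 to 1 twice) are taken from the comments in the question's script.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
periods = np.arange(20)

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 4))

# Desired (low, high) y-range per subplot, as stated in the question
y_limits = [(0, 10), (0, 1), (0, 1)]

for ax, (low, high) in zip(axes, y_limits):
    # Placeholder values inside the target range
    ax.scatter(periods, rng.uniform(low, high, size=periods.size))
    ax.set_ylim(low, high)  # applies to this subplot only

fig.tight_layout()
```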

Combine Bar and Line subplots in pandas

I have 5 time series that I want to graph in a subplot. Essentially I've been using subplotting:
fig, axes = plt.subplots(nrows=5, ncols=1, figsize=(16,10), sharex=True)
xlim = (start, end)
ax1=df.hr.plot(ax=axes[0], color='green', xlim=xlim)
ax2=df.act.plot(ax=axes[1], color='orange', xlim=xlim)
ax3=df.rr.plot(ax=axes[2], color='blue', xlim=xlim)
ax4=df2.set_index('timestamp').rmssd.plot(color='purple', ax=axes[3], xlim=xlim)
ax5=ma_df.tz_convert('US/Eastern')['any_act'].resample('10Min', how='count').plot(kind='line',ax=axes[4])
Which produces
Due to the nature of the data, I want to visualize the last subplot as bar chart. So naturally, I changed the last line to:
ax5=ma_df.tz_convert('US/Eastern')['any_act'].resample('10Min', how='count').plot(kind='bar',ax=axes[4])
Which then creates the following figure:
This produces what I expect in the last subplot, but makes the other plots useless. Needless to say, it's not what I want.
How can I combine the 4 line time series with one bar chart in the same plot, but different subplots, all sharing the same x-axis?
Meaning I would want the first 4 subplots like in the first image, and the last subplot like in the second image.
Update
I made a simple example, which unfortunately works as expected and does not replicate my problem, which is even more baffling. The code is below:
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
df = pd.read_csv('https://s3.amazonaws.com/temp-leonsas-qsaeamu0sl5v4b/df.csv')
bar_df = pd.read_csv('https://s3.amazonaws.com/temp-leonsas-qsaeamu0sl5v4b/bar_df.csv')
fig, axes = plt.subplots(nrows=4, ncols=1, figsize=(16,10), sharex=True)
ax1=df.hr.plot(ax=axes[0], color='green', kind='line')
ax2=df.act.plot(ax=axes[1], color='orange', kind='line')
ax3=df.rr.plot(ax=axes[2], color='blue', kind='line')
ax4=bar_df.occ_count.plot(ax=axes[3], kind='bar')
Whereas the code in my codebase which replicates the problem is
fig, axes = plt.subplots(nrows=4, ncols=1, figsize=(16,10), sharex=True)
ax1=df.hr.plot(ax=axes[0], color='green', kind='line')
ax2=df.act.plot(ax=axes[1], color='orange', kind='line')
ax3=df.rr.plot(ax=axes[2], color='blue', kind='line')
ax4=bar_df.occ_count.plot(ax=axes[3], kind='bar')
The main difference is that in my codebase the DataFrames are being generated and not just loaded up from s3. Is there an implicit config inside a DataFrame that can somehow make this happen? I just used df.to_csv to dump those 2 dataframes into S3.
I think you just need to explicitly pass kind='line' to the first three plots. Here's a simpler example:
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
s = pd.Series([1,2,3,2,1])
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(16,10), sharex=True)
s.plot(ax=axes[0], color='green', kind='line')
s.plot(ax=axes[1], color='red', kind='bar')
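One thing that may explain the clash in the question: pandas draws kind='bar' at integer positions 0, 1, 2, ... and only labels the ticks with the index values, so with sharex=True the bar subplot and the line subplots can end up on different x-scales. If that is what is happening, a possible workaround (not part of the answer above) is to draw the bars with matplotlib's own Axes.bar at the real x-values. The sketch below uses made-up numeric data for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: minutes elapsed on the index
minutes = [0, 10, 20, 30, 40, 50]
hr = pd.Series([60, 62, 61, 65, 63, 64], index=minutes)
counts = pd.Series([1, 0, 3, 2, 0, 1], index=minutes)

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(10, 6), sharex=True)

# Line subplot drawn at the real index values
axes[0].plot(hr.index, hr.values, color="green")

# Bars placed at the same x-values instead of positions 0..n-1
axes[1].bar(counts.index, counts.values, width=8, color="purple")
```

Both subplots then use the same data coordinates, so the shared x-axis stays meaningful for the line plots.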

Share axes in matplotlib for only part of the subplots

I have a big plot that I initialized with:
import numpy as np
import matplotlib.pyplot as plt
fig, axs = plt.subplots(5, 4)
I want columns 1 and 2 to share an x-axis, and columns 3 and 4 to share another one. However, columns 1 and 2 should not share an axis with columns 3 and 4.
Is there any way to do this without setting sharex=True and sharey=True across all subplots?
PS: This tutorial does not help much, because it only covers sharing x/y within each row or column; it cannot share axes between different rows/columns (unless they are shared across all axes).
I'm not exactly sure what you want to achieve from your question. However, you can specify per subplot which axis it should share with which subplot when adding a subplot to your figure.
This can be done via:
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(5, 4, 1)
ax2 = fig.add_subplot(5, 4, 2, sharex=ax1)
ax3 = fig.add_subplot(5, 4, 3, sharex=ax1, sharey=ax1)
A more limited but much simpler option is built into plt.subplots: sharex and sharey also accept 'row' or 'col', so sharing can be restricted to whole rows or whole columns of subplots.
For example, if one wants to have common y axis for all the subplots but common x axis only for individual columns in a 3x2 subplot, one could specify it as:
import matplotlib.pyplot as plt
fig, ax = plt.subplots(3, 2, sharey=True, sharex='col')
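One can manually manage axes sharing using a Grouper object, which can be accessed via ax._shared_x_axes and ax._shared_y_axes. Note that these are private attributes (leading underscore), so they may change or be unavailable in other Matplotlib versions. For example,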
import matplotlib.pyplot as plt
def set_share_axes(axs, target=None, sharex=False, sharey=False):
    if target is None:
        target = axs.flat[0]
    # Manage share using grouper objects
    for ax in axs.flat:
        if sharex:
            target._shared_x_axes.join(target, ax)
        if sharey:
            target._shared_y_axes.join(target, ax)
    # Turn off x tick labels and offset text for all but the bottom row
    if sharex and axs.ndim > 1:
        for ax in axs[:-1,:].flat:
            ax.xaxis.set_tick_params(which='both', labelbottom=False, labeltop=False)
            ax.xaxis.offsetText.set_visible(False)
    # Turn off y tick labels and offset text for all but the left most column
    if sharey and axs.ndim > 1:
        for ax in axs[:,1:].flat:
            ax.yaxis.set_tick_params(which='both', labelleft=False, labelright=False)
            ax.yaxis.offsetText.set_visible(False)

fig, axs = plt.subplots(5, 4)
set_share_axes(axs[:,:2], sharex=True)
set_share_axes(axs[:,2:], sharex=True)
To adjust the spacing between subplots in a grouped manner, please refer to this question.
I used Axes.sharex / Axes.sharey in a similar setting:
https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.sharex.html#matplotlib.axes.Axes.sharex
import matplotlib.pyplot as plt
fig, axd = plt.subplot_mosaic([list(range(3))] +[['A']*3, ['B']*3])
axd[0].plot([0,0.2])
axd['A'].plot([1,2,3])
axd['B'].plot([1,2,3,4,5])
axd['B'].sharex(axd['A'])
for i in [1,2]:
    axd[i].sharey(axd[0])
plt.show()
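Applied to the 5 x 4 grid from the question, the same public Axes.sharex method could be used as in the sketch below. The helper name share_x_within is made up for illustration, and the sketch assumes a Matplotlib version that provides Axes.sharex and a grid created without any sharing, since sharex can only be applied to an axes that is not already sharing its x-axis.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt


def share_x_within(block):
    """Make every axes in `block` share its x-axis with the block's first axes.

    Hypothetical helper: `block` is a NumPy array of Axes, e.g. a slice
    of the array returned by plt.subplots.
    """
    flat = block.ravel()
    target = flat[0]
    for ax in flat[1:]:
        ax.sharex(target)


fig, axs = plt.subplots(5, 4)  # no sharex/sharey here, groups are set up below

share_x_within(axs[:, :2])  # columns 1 and 2 share one x-axis
share_x_within(axs[:, 2:])  # columns 3 and 4 share another x-axis

# Changing the limits in one group leaves the other group untouched
axs[0, 0].set_xlim(0, 5)
```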
