How to show separate boxplots for all columns? - python

When I try to show boxplots for all columns, I use this command:
df_num.boxplot(rot=90)
But the boxes come out tiny, because the columns have very different ranges and should not share the same y-axis. Can I do something like the plot below, but with boxplots? Thanks!

You could do it this way (example including just 2 of the columns but you can obviously add more):
fig, ax = plt.subplots(figsize=(12,6), ncols=2)
df_num["backers_count"].plot.box(ax=ax[0])
df_num["converted_pledged_amount"].plot.box(ax=ax[1]);
...or with Seaborn:
fig, ax = plt.subplots(figsize=(12,6), ncols=2)
sns.boxplot(data=df_num, y="backers_count", ax=ax[0])
sns.boxplot(data=df_num, y="converted_pledged_amount", ax=ax[1]);
To display them in a grid of, say, 3 rows and 3 columns, change ncols=2 to nrows=3, ncols=3. The axes are then indexed by row and column, so instead of ax=ax[0], ax=ax[1], etc. you would write ax=ax[0,0], ax=ax[0,1], etc.
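For more than a handful of columns, the same idea can be written as a loop. The following is a minimal sketch, assuming df_num contains only numeric columns. The small DataFrame built here is made-up stand-in data (only the first two column names come from the question; "goal" and "duration_days" are hypothetical), and the 3-column grid width is an arbitrary choice.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up stand-in for df_num: columns with very different ranges.
rng = np.random.default_rng(0)
df_num = pd.DataFrame({
    "backers_count": rng.integers(0, 500, 200),
    "converted_pledged_amount": rng.integers(0, 100_000, 200),
    "goal": rng.integers(100, 1_000_000, 200),
    "duration_days": rng.integers(1, 60, 200),
})

ncols = 3
nrows = math.ceil(len(df_num.columns) / ncols)  # round up to fit every column
# squeeze=False keeps `axes` two-dimensional even when nrows == 1
fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 4 * nrows), squeeze=False)

# One box plot per column, each on its own axes with its own y-scale
for col, ax in zip(df_num.columns, axes.flat):
    df_num[col].plot.box(ax=ax)

# Hide the grid cells that did not receive a plot
for ax in axes.flat[len(df_num.columns):]:
    ax.set_axis_off()

fig.tight_layout()
```

pandas can also build such a grid in a single call, along the lines of df_num.plot(kind="box", subplots=True, layout=(nrows, ncols), sharey=False); the explicit loop is mainly useful when each subplot needs its own formatting.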

Related

How to combine boxplot figures into one?

I am working now on plot my dataset by boxplot as in below code
plt.figure(figsize=(8,5))
fig = plt.figure()
num_list=Final_dataset.columns.values.tolist()
for i in range(len(num_list)):
    column=num_list[i]
    sns.boxplot(x="label", y=column, data=Final_dataset, palette='Set2')
    plt.savefig('{}.png'.format(i))
plt.show()
I need to produce one image that combines the figures for all attributes, as in this figure, rather than several separate figures. How can I fix it? Thanks a lot.
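See the subplot function in matplotlib. Select the subplot before drawing into it, and save the figure once after the loop:
nrows = 3 # decide how many you want
ncols = 4 # decide how many you want
plt.figure(figsize=(8,5))
# every column except the grouping column "label" (assumed to be one of the columns)
num_list = [c for c in Final_dataset.columns if c != "label"]
for i, column in enumerate(num_list):
    plt.subplot(nrows, ncols, i + 1)  # subplot indices start at 1
    sns.boxplot(x="label", y=column, data=Final_dataset, palette='Set2')
plt.tight_layout()
plt.savefig('boxplots.png')  # one combined image; the filename is only an example
plt.show()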

Plotting with subplots in a loop python [duplicate]

Case:
I receive a dataframe with (say 50) columns.
I extract the necessary columns from that dataframe using a condition.
So we have a list of selected columns of our dataframe now. (Say this variable is sel_cols)
I need a bar chart for each of these columns value_counts().
And I need to arrange all these bar charts in 3 columns, and varying number of rows based on number of columns selected in sel_cols.
So, if say 8 columns were selected, I want the figure to have 3 columns and 3 rows, with last subplot empty or just 8 subplots in 3x3 matrix if that is possible.
I could generate each chart separately using following code:
for col in sel_cols:
    df[col].value_counts().plot(kind='bar')
    plt.show()
plt.show() is inside the loop so that each chart is shown and not just the last one.
I also tried appending these charts to a list this way:
charts = []
for col in sel_cols:
    charts.append(df[col].value_counts().plot(kind='bar'))
I could convert this list into a NumPy array through reshape(), but then its length would have to match the target shape exactly. So 8 chart objects cannot be reshaped into a 3x3 array.
Then I tried creating the subplots first in this way:
row = len(sel_cols)//3
fig, axes = plt.subplots(nrows=row,ncols=3)
This way I would get the subplots, but I get two problems:
I end up with extra subplots in the 3 columns which will go unplotted (8 columns example).
I do not know how to plot under each subplots through a loop.
I tried this:
for row in axes:
    for chart, col in zip(row,sel_cols):
        chart = data[col].value_counts().plot(kind='bar')
But this only plots the last subplot with the last column. All other subplots stay blank.
How to do this with minimal lines of code, possibly without any need for human verification of the final subplots placements?
You may use this sample dataframe:
pd.DataFrame({'A':['Y','N','N','Y','Y','N','N','Y','N'],
'B':['E','E','E','E','F','F','F','F','E'],
'C':[1,1,0,0,1,1,0,0,1],
'D':['P','Q','R','S','P','Q','R','P','Q'],
'E':['E','E','E','E','F','F','G','G','G'],
'F':[1,1,0,0,1,1,0,0,1],
'G':['N','N','N','N','Y','N','N','Y','N'],
'H':['G','G','G','E','F','F','G','F','E'],
'I':[1,1,0,0,1,1,0,0,1],
'J':['Y','N','N','Y','Y','N','N','Y','N'],
'K':['E','E','E','E','F','F','F','F','E'],
'L':[1,1,0,0,1,1,0,0,1],
})
Selected columns are: sel_cols = ['A','B','D','E','G','H','J','K']
Total 8 columns.
Expected output is bar charts for value_counts() of each of these columns arranged in subplots in a figure with 3 columns. Rows to be decided based on number of columns selected, here 8 so 3 rows.
Given OP's sample data:
df = pd.DataFrame({'A':['Y','N','N','Y','Y','N','N','Y','N'],'B':['E','E','E','E','F','F','F','F','E'],'C':[1,1,0,0,1,1,0,0,1],'D':['P','Q','R','S','P','Q','R','P','Q'],'E':['E','E','E','E','F','F','G','G','G'],'F':[1,1,0,0,1,1,0,0,1],'G':['N','N','N','N','Y','N','N','Y','N'],'H':['G','G','G','E','F','F','G','F','E'],'I':[1,1,0,0,1,1,0,0,1],'J':['Y','N','N','Y','Y','N','N','Y','N'],'K':['E','E','E','E','F','F','F','F','E'],'L':[1,1,0,0,1,1,0,0,1]})
sel_cols = list('ABDEGHJK')
data = df[sel_cols].apply(pd.Series.value_counts)  # the top-level pd.value_counts is deprecated in recent pandas
We can plot the columns of data in several ways (in order of simplicity):
DataFrame.plot with subplots param
seaborn.catplot
Loop through plt.subplots
1. DataFrame.plot with subplots param
Set subplots=True with the desired layout dimensions. Unused subplots will be auto-disabled:
data.plot.bar(subplots=True, layout=(3, 3), figsize=(8, 6),
sharex=False, sharey=True, legend=False)
plt.tight_layout()
2. seaborn.catplot
melt the data into long-form (i.e., 1 variable per column, 1 observation per row) and pass it to seaborn.catplot:
import seaborn as sns
melted = data.melt(var_name='var', value_name='count', ignore_index=False).reset_index()
sns.catplot(data=melted, kind='bar', x='index', y='count',
col='var', col_wrap=3, sharex=False)
3. Loop through plt.subplots
zip the columns and axes to iterate in pairs. Use the ax param to place each column onto its corresponding subplot.
If the grid size is larger than the number of columns (e.g., 3*3 > 8), disable the leftover axes with set_axis_off:
fig, axes = plt.subplots(3, 3, figsize=(8, 8), constrained_layout=True, sharey=True)
# plot each col onto one ax
for col, ax in zip(data.columns, axes.flat):
    data[col].plot.bar(ax=ax, rot=0)
    ax.set_title(col)
# disable leftover axes
for ax in axes.flat[data.columns.size:]:
    ax.set_axis_off()
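The 3x3 grid above is hard-coded. The row count can instead be derived from the number of selected columns. The following is a sketch, assuming a fixed width of 3 columns as in the question. math.ceil rounds up, so 8 columns give 3 rows, whereas the floor division len(sel_cols)//3 tried in the question gives only 2. The helper name plot_value_counts is hypothetical, and only four of the sample columns are used to keep the example short.

```python
import math

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd


def plot_value_counts(df, cols, ncols=3):
    """Draw one value_counts() bar chart per column on an ncols-wide grid."""
    nrows = math.ceil(len(cols) / ncols)  # round up so every column gets a cell
    fig, axes = plt.subplots(nrows, ncols, figsize=(4 * ncols, 3 * nrows),
                             squeeze=False, constrained_layout=True)
    for col, ax in zip(cols, axes.flat):
        df[col].value_counts().plot.bar(ax=ax, rot=0, title=col)
    for ax in axes.flat[len(cols):]:  # switch off the leftover cells
        ax.set_axis_off()
    return fig, axes


# Small subset of the sample data from the question
df = pd.DataFrame({
    'A': ['Y', 'N', 'N', 'Y', 'Y', 'N', 'N', 'Y', 'N'],
    'B': ['E', 'E', 'E', 'E', 'F', 'F', 'F', 'F', 'E'],
    'D': ['P', 'Q', 'R', 'S', 'P', 'Q', 'R', 'P', 'Q'],
    'E': ['E', 'E', 'E', 'E', 'F', 'F', 'G', 'G', 'G'],
})
fig, axes = plot_value_counts(df, ['A', 'B', 'D', 'E'])
```

Because the grid shape is computed, the same call works for any number of selected columns without checking the layout by hand.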
As an alternative to the answer by tdy, here is a version without seaborn, using Matplotlib and a for loop.
It may suit those who want explicit control over each subplot's formatting and other parameters:
fig = plt.figure(1, figsize=(16,12))
for i, col in enumerate(sel_cols, 1):
    fig.add_subplot(3, 4, i)
    data[col].value_counts().plot(kind='bar', ax=plt.gca())
    plt.title(col)
plt.tight_layout()
plt.show()
fig.add_subplot creates a subplot and makes it the current one, while plt.gca() returns the current subplot. Here data is the original dataframe, as in the loop from the question.

Personalize pandas boxplot with colors

I've been trying to make a boxplot of some gender data that I divided into two separate dataframes, one for male and one for female.
I managed to make the graph basically how I wanted it, but now I would like to make it look better. I'd like it to look like a seaborn graph, but I wasn't able to find a way to do this with the seaborn library. I tried some ideas I found for coloring the pandas boxplot, but nothing worked.
Is there a way to color these graphs? Or is there a way to make these side-by-side boxplots with seaborn?
dados_generos = dados_sem_zeros[["NU_NOTA_CN","NU_NOTA_CH","NU_NOTA_MT","NU_NOTA_LC","NU_NOTA_REDACAO", "TP_SEXO"]]
sexo_f = dados_generos[dados_generos["TP_SEXO"].str.contains("F")]
sexo_m = dados_generos[dados_generos["TP_SEXO"].str.contains("M")]
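labels = ["CN", "CH", "MT", "LC", "REDAÇÃO"]
# provas is not defined in the original snippet; assumed to be the five score columns
provas = ["NU_NOTA_CN","NU_NOTA_CH","NU_NOTA_MT","NU_NOTA_LC","NU_NOTA_REDACAO"]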
fig, (ax, ax2) = plt.subplots(figsize = (10,7), ncols=2, sharey=True)
#Setting axis titles
ax.set_xlabel('Provas')
ax2.set_xlabel('Provas')
ax.set_ylabel('Notas')
#Making plots
chart1 = sexo_f[provas].boxplot(ax=ax)
chart2 = sexo_m[provas].boxplot(ax=ax2)
#Setting axis labels
chart1.set_xticklabels(labels,rotation=45)
chart2.set_xticklabels(labels,rotation=45)
plt.show()
This is the result I have:
This is the link to the data I'm using:
https://github.com/KarolDuarte/dados_generos/blob/main/dados_generos.csv
Since seaborn works best with long-form data, let's melt the data and then plot it with seaborn.
# melting the data
plot_data = df.melt('TP_SEXO')
fig, axes = plt.subplots(figsize = (10,7), ncols=2, sharey=True)
for ax, (gender, data) in zip(axes, plot_data.groupby('TP_SEXO')):
    sns.boxplot(x='variable', y='value', data=data, ax=ax)
Output:
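As an alternative to two side-by-side panels, seaborn can draw both groups in a single axes through the hue parameter, which also takes care of the coloring. A minimal sketch on made-up scores: the column names follow the question, the values are random, and the names "prova" and "nota" for the melted columns are chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Random stand-in data; only the column names are taken from the question.
rng = np.random.default_rng(0)
n = 100
df = pd.DataFrame({
    "NU_NOTA_CN": rng.normal(500, 80, n),
    "NU_NOTA_MT": rng.normal(520, 100, n),
    "TP_SEXO": rng.choice(["F", "M"], n),
})

# Long form: one row per (student, exam) pair
plot_data = df.melt(id_vars="TP_SEXO", var_name="prova", value_name="nota")

fig, ax = plt.subplots(figsize=(8, 5))
# hue splits every exam into one box per gender, colored from the palette
sns.boxplot(data=plot_data, x="prova", y="nota", hue="TP_SEXO",
            palette="Set2", ax=ax)
ax.set_xlabel("Provas")
ax.set_ylabel("Notas")
```

With hue, the two genders sit next to each other for every exam, which makes them easier to compare than in two separate panels.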

Python, histogram, data fitting

I wish to make a histogram in Python 3 from an input file containing the raw energy data (.dat). On the same plot I want to draw a formula (analytical distribution, pho vs energy). It is easy to plot them separately, but I need the combined version. Can you help?
If you want 2 plots in the same figure look into this:
https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/subplots_demo.html
fig, (ax1, ax2) = plt.subplots(2)
If you want 2 plots in the same axes that share the x-axis but have separate y-scales, use this:
https://matplotlib.org/3.1.0/gallery/subplots_axes_and_figures/two_scales.html#sphx-glr-gallery-subplots-axes-and-figures-two-scales-py
ax2 = ax1.twinx() # instantiate a second axes that shares the same x-axis
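If the histogram and the analytical curve are on the same scale, both can simply be drawn on the same Axes, with no second subplot or twin axis. A minimal sketch, assuming the energies are a single column of numbers and using an exponential density purely as a placeholder formula (the question does not state the real one). The data are generated here; for a real file, something like np.loadtxt("energy.dat") would replace the synthetic sample, where the filename is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
scale = 2.0  # parameter of the placeholder distribution
energy = rng.exponential(scale, 5000)  # synthetic stand-in for the raw data

fig, ax = plt.subplots()
# density=True normalizes the histogram to unit area, so it can be
# compared directly with a probability density
counts, edges, _ = ax.hist(energy, bins=50, density=True, alpha=0.5, label="data")

# Analytical curve evaluated over the same range as the histogram
x = np.linspace(edges[0], edges[-1], 400)
ax.plot(x, np.exp(-x / scale) / scale, label="analytical")

ax.set_xlabel("energy")
ax.set_ylabel("density")
ax.legend()
```

If the formula is not a normalized density, the twinx() approach above is the one to use, so that the curve gets its own y-scale.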

Matplotlib Subplot axes sharing: Apply to every other plot?

I am trying to find a way to apply the shared axes parameters of subplot() to every other plot in a series of subplots.
I've got the following code, which uses data from RPM4, based on rows in fpD
fig, ax = plt.subplots(2*(fpD['name'].count()), sharex=True, figsize=(6,fpD['name'].count()*2),
                       gridspec_kw={'height_ratios':[5,1]*fpD['name'].count()})
for i, r in fpD.iterrows():
    RPM4[RPM4['name'] == RPM3.iloc[i,0]].plot(x='date', y='RPM', ax=ax[(2*i)], legend=False)
    RPM4[RPM4['name'] == RPM3.iloc[i,0]].plot(kind='area', color='lightgrey', x='date', y='total', ax=ax[(2*i)+1],
                                              legend=False,)
    ax[2*i].set_title('test', fontsize=12)
plt.tight_layout()
This produces an output that is very close to what I need. It loops through the 'name' column in a table, produces two plots for each name, and displays them as subplots:
As you can see, the sharex parameter works fine for me here, since I want all the plots to share the same axis.
However, what I'd really like is for all the even-numbered (bigger) plots to share the same y axis, and for the odd-numbered (small grey) plots to all share a different y axis.
Any help on accomplishing this is much appreciated, thanks!
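One possible approach (a sketch, not tested against the data in the question): create the subplots with sharex=True only, then link the y-axes by hand. Axes.sharey, available since Matplotlib 3.3, makes one Axes follow the y-axis of another. Here every even-indexed Axes is tied to the first one and every odd-indexed Axes to the second. The number of names and the plotted values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

n_names = 3  # stand-in for fpD['name'].count()
fig, ax = plt.subplots(2 * n_names, sharex=True, figsize=(6, 2 * n_names),
                       gridspec_kw={'height_ratios': [5, 1] * n_names})

# Link the y-axes before plotting: big plots follow ax[0], small ones ax[1]
for i in range(1, n_names):
    ax[2 * i].sharey(ax[0])
    ax[2 * i + 1].sharey(ax[1])

x = np.arange(10)
for i in range(n_names):
    ax[2 * i].plot(x, (i + 1) * x)  # made-up "RPM" line
    ax[2 * i + 1].fill_between(x, (i + 1) * 100, color='lightgrey')  # made-up "total" area
fig.tight_layout()
```

In the original loop, the two pandas plot calls would take the place of the plot and fill_between calls, with the sharey links set up once beforehand.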
