I want to add a single legend to the whole figure, showing the color that corresponds to each model in the individual subplots.
My current code is as follows:
import matplotlib.pyplot as plt
from matplotlib.legend import _get_legend_handles_labels
colors = ['red', 'blue','darkblue', 'purple', 'orange', 'brown', 'pink', 'darkgreen', 'gray']
models = ['Logistic Regression', 'SVM', 'Decision Tree', 'Random Forest', 'XGBoost', 'ADABoost', 'Gaussian NB', 'KNN', 'MLP']
# English
fig, axs = plt.subplots(4,2, figsize=(25,15))
fig.suptitle('Performance measures for English and Arabic data', fontsize = 25)
axs[0,0].bar(models, en_f1_scores, color=colors)
axs[0,0].set_title("English F1 score", fontsize = 20)
axs[1,0].bar(models, en_acc_scores, color=colors)
axs[1,0].set_title("English accuracy score", fontsize = 20)
axs[2,0].bar(models, en_recall_scores, color=colors)
axs[2,0].set_title("English recall score", fontsize = 20)
axs[3,0].bar(models, en_precision_scores, color=colors)
axs[3,0].set_title("English precision score", fontsize = 20)
# Arabic
axs[0,1].bar(models, ar_f1_scores, color=colors)
axs[0, 1].set_title("Arabic F1 score", fontsize = 20)
axs[1,1].bar(models, ar_accuracy_scores, color=colors)
axs[1,1].set_title("Arabic accuracy score", fontsize = 20)
axs[2,1].bar(models, ar_recall_scores, color=colors)
axs[2,1].set_title("Arabic recall score", fontsize = 20)
axs[3,1].bar(models, ar_precision_scores, color=colors)
axs[3,1].set_title("Arabic precision score", fontsize = 20)
fig.tight_layout(pad=3.0)
And the current output looks like this:
Adding this code:
lines, labels = fig.axes[-1].get_legend_handles_labels()
fig.legend(lines, labels, loc = 'upper center')
does not do anything; all it shows me is:
<matplotlib.legend.Legend at 0x266574402b0>
Moreover, both lines and labels are empty lists: []
What can I do to add a legend at the top of the subplot figure? (if it is a horizontal legend, even better!)
Thank you!
First, I would suggest saving all the information in lists, so the plots can be made in one loop. That way, if some detail changes, it only needs to be changed in one spot.
A legend automatically picks up every graphical element that has a "label". Normally, a complete bar plot gets at most one label (and none was given here, which is why lines and labels are empty). By iterating over the generated bars, individual labels can be assigned.
The code first creates a dummy legend, so fig.tight_layout() can adapt all the spacings and leave some room for the legend. After calling fig.tight_layout(), the real legend is created. (With the real legend in place, fig.tight_layout() would try to assign it completely to one subplot, and create a wide gap between the two columns of subplots.)
import matplotlib.pyplot as plt
import numpy as np
colors = ['red', 'blue', 'darkblue', 'purple', 'orange', 'brown', 'pink', 'darkgreen', 'gray']
models = ['Logistic Regression', 'SVM', 'Decision Tree', 'Random Forest', 'XGBoost', 'ADABoost', 'Gaussian NB', 'KNN', 'MLP']
titles = ["F1 score", "accuracy score", "recall score", "precision score"]
N = len(models)
en_f1_scores = np.random.rand(N)
en_acc_scores = np.random.rand(N)
en_recall_scores = np.random.rand(N)
en_precision_scores = np.random.rand(N)
en_scores = [en_f1_scores, en_acc_scores, en_recall_scores, en_precision_scores]
ar_f1_scores = np.random.rand(N)
ar_acc_scores = np.random.rand(N)
ar_recall_scores = np.random.rand(N)
ar_precision_scores = np.random.rand(N)
ar_scores = [ar_f1_scores, ar_acc_scores, ar_recall_scores, ar_precision_scores]
fig, axs = plt.subplots(4, 2, figsize=(25, 15), sharex=True, sharey='row')
fig.suptitle('Performance measures for English and Arabic data', fontsize=25)
for axs_row, en_score, ar_score, title in zip(axs, en_scores, ar_scores, titles):
    for language, score, ax in zip(['English', 'Arabic'], [en_score, ar_score], axs_row):
        ax.bar(models, score, color=colors)
        ax.set_title(language + ' ' + title, fontsize=20)
        ax.set_xticks([])  # remove the x ticks and their labels
        ax.grid(axis='y', ls=':', color='black')  # add some gridlines
        ax.set_axisbelow(True)  # gridlines behind the bars
        for spine in ['top', 'right', 'left']:  # remove part of the surrounding box, as it gets busy with the grid lines
            ax.spines[spine].set_visible(False)
        ax.margins(x=0.01)  # less white space left and right
# the legend is created for each graphical element that has a "label"
for bar, model in zip(axs[0, 0].containers[0], models):
    bar.set_label(model)
# first create a dummy legend, so fig.tight_layout() makes enough space
axs[0, 0].legend(handles=axs[0, 0].containers[0][:1],
                 bbox_to_anchor=(0, 1.12), loc='lower left')
fig.tight_layout(pad=3.0)
# now create the real legend; if fig.tight_layout() were called on this,
# it would create a large empty space between the columns of subplots
# as it wants the legend to belong to only one of the subplots
axs[0, 0].legend(handles=axs[0, 0].containers[0], ncol=len(models),
                 bbox_to_anchor=(1.03, 1.12), loc='lower center', fontsize=18)
plt.show()
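As a quick check of the point about labels, here is a minimal sketch showing that get_legend_handles_labels() returns nothing until the individual bars are labelled, and that fig.legend() can then draw a horizontal legend at the top. The shortened model list and the scores are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

# Hypothetical, shortened data just for this check
models = ['SVM', 'KNN', 'MLP']
colors = ['red', 'blue', 'gray']
scores = [0.7, 0.8, 0.6]

fig, ax = plt.subplots()
bars = ax.bar(models, scores, color=colors)

# Without labels on the bars, there is nothing for a legend to pick up
handles_before, labels_before = ax.get_legend_handles_labels()

# Give every individual bar its own label
for bar, model in zip(bars, models):
    bar.set_label(model)

handles_after, labels_after = ax.get_legend_handles_labels()

# One horizontal, figure-level legend: ncol puts all entries on one row
legend = fig.legend(handles_after, labels_after,
                    loc='upper center', ncol=len(models))

print(labels_before)
print(labels_after)
```

Whether the legend then overlaps the suptitle depends on the figure size and padding, which is what the dummy-legend trick above takes care of.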
I have made a Seaborn stripplot on top of a barplot that has the experience group on the x-axis, grouped by two different conditions (target present or target not present), from a dataframe using the following code:
IZ_colors = ['#E1F3DC','#56B567']
ax1 = sns.barplot(data=IZ_df, x='Group', y='Time in IZ (%)', hue='Condition',
                  order=['Std_Ctrl','ELS_Ctrl','Std_CSDS','ELS_CSDS'], hue_order=['Empty','Aggressor'],
                  palette=IZ_colors)
hatches = ['','//']
# Loop over the bars
for bars, hatch in zip(ax1.containers, hatches):
    # Set a different hatch for each group of bars
    for bar in bars:
        bar.set_hatch(hatch)
sns.stripplot(data=IZ_df ,x='Group', y='Time in IZ (%)', hue='Condition', dodge=True,
              order=['Std_Ctrl','ELS_Ctrl','Std_CSDS','ELS_CSDS'], hue_order=['Empty','Aggressor'],
              palette=IZ_colors, marker='o', size=7, edgecolor='#373737', linewidth=1, color='black',)
plt.legend(bbox_to_anchor=(1.35, 0.7))
However, I would like the markers of the stripplot to be colored by sex (not by condition like how they are now), which is another column in the dataframe. I would still like them to be grouped by hue='Condition'. Is this possible?
You could create two stripplots, one for each sex, and draw them at the same spot. The duplicate legend entries can be removed by calling get_legend_handles_labels() and keeping only a subset of the handles and labels.
Here is an example using the titanic dataset:
import matplotlib.pyplot as plt
import seaborn as sns
titanic = sns.load_dataset('titanic')
IZ_colors = ['#E1F3DC', '#56B567']
ax1 = sns.barplot(data=titanic, x='class', y='age', hue='alive',
                  order=['First', 'Second', 'Third'], hue_order=['no', 'yes'],
                  palette=IZ_colors)
hatches = ['', '//']
for bars, hatch in zip(ax1.containers, hatches):
    for bar in bars:
        bar.set_hatch(hatch)
for sex, color in zip(['male', 'female'], ['orange', 'turquoise']):
    df_per_sex = titanic[titanic['sex'] == sex]
    sns.stripplot(data=df_per_sex, x='class', y='age', hue='alive',
                  order=['First', 'Second', 'Third'], hue_order=['no', 'yes'],
                  dodge=True, palette=[color] * 2,
                  marker='o', size=4, edgecolor='#373737', linewidth=1)
handles, labels = ax1.get_legend_handles_labels()
handles = [handles[0], handles[2]] + handles[4:]
labels = ['Male', 'Female'] + labels[4:]
ax1.legend(handles, labels, bbox_to_anchor=(1.01, 0.7), loc='upper left')
plt.tight_layout()
plt.show()
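Slicing the handles by position depends on the order in which the artists end up in the legend. If that order turns out differently in your setup, one possible alternative is to build the legend entries by hand from proxy artists. This is only a sketch in plain matplotlib: the colors and hatches are copied from the example above, the marker size is an arbitrary choice, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

fig, ax = plt.subplots()

# Colors and hatches taken from the example above
sex_colors = {'Male': 'orange', 'Female': 'turquoise'}
condition_styles = {'no': ('#E1F3DC', ''), 'yes': ('#56B567', '//')}

# Proxy markers for the strip plot points (no line, just a dot)
sex_handles = [Line2D([], [], linestyle='', marker='o', markersize=6,
                      markerfacecolor=color, markeredgecolor='#373737',
                      label=name)
               for name, color in sex_colors.items()]

# Proxy patches for the hatched bars
condition_handles = [Patch(facecolor=face, hatch=hatch, edgecolor='black',
                           label=name)
                     for name, (face, hatch) in condition_styles.items()]

legend = ax.legend(handles=sex_handles + condition_handles,
                   bbox_to_anchor=(1.01, 0.7), loc='upper left')
legend_labels = [text.get_text() for text in legend.get_texts()]
print(legend_labels)
```

In the real plot, the same ax.legend(handles=...) call would go after the barplot and stripplot calls, replacing the legend seaborn creates.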
I want to display two pie charts (well, donut charts) side by side, but with the code below all I get is overlapping graphs. I've tried various values for subplots_adjust, but the legends always end up overlapping. I've removed all the non-relevant code from the function.
#Function to draw graphs for sports data
#Create figure with two subplots
fig,ax=plt.subplots(1,2,subplot_kw=dict(aspect="equal"))
j=0
#Loop through all columns we want to graph
for type in types:
    #Create a pie chart
    wedges, texts, autotexts = ax[j].pie(to_plot,
                                         explode=explode,
                                         labels=labels,
                                         colors=colors,
                                         autopct=lambda pct: func(pct, data),
                                         pctdistance=0.8,
                                         counterclock=False,
                                         startangle=90,
                                         wedgeprops={'width': 0.75},
                                         radius=1.75
                                         )
    #Set label colors
    for text in texts:
        text.set_color('grey')
    #Create legend
    ax[j].legend(wedges, leg_labels,
                 title=title,
                 title_fontsize="x-large",
                 loc="center left",
                 bbox_to_anchor=(1.5, 0, 0.5, 1),
                 prop={'size': 12})
    j += 1
plt.subplots_adjust(left=None, bottom=None, right=None, top=None, wspace=0.5, hspace=None)
plt.show()
return
The bbox_to_anchor setting places each legend to the right of its pie chart, so the legend of the left chart ends up on top of the right chart. The overlap can be avoided by giving each chart its own legend position: to the left of the left chart and to the right of the right chart.
import matplotlib.pyplot as plt
# Some data
labels = 'Frogs', 'Hogs', 'Dogs', 'Logs'
fracs = [15, 30, 45, 10]
titles = ['20 members\nfollow Basketball','23 members\nfollow Basketball']
legend_pos = ['center left','center right']
bboxes = [(-1.0, 0, 0.5, 1),(1.5, 0, 0.5, 1)]
# Make figure and axes
fig, axs = plt.subplots(1, 2, subplot_kw=dict(aspect="equal"))
for i in range(2):
    wedges, texts,_ = axs[i].pie(fracs,
                                 labels=labels,
                                 autopct='%.0f%%',
                                 shadow=True,
                                 explode=(0, 0.1, 0, 0),
                                 wedgeprops=dict(width=0.6))
    axs[i].legend(wedges,
                  labels,
                  title=titles[i],
                  title_fontsize="x-large",
                  loc=legend_pos[i],
                  bbox_to_anchor=bboxes[i],
                  prop={'size': 12})
plt.show()
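Another option, if the horizontal space is tight, might be to put each legend underneath its own chart so the two legends can never collide sideways. The following is only a sketch: the titles are placeholders, the figure size and bottom margin are guesses that may need tuning, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

labels = ['Frogs', 'Hogs', 'Dogs', 'Logs']
fracs = [15, 30, 45, 10]
titles = ['Chart A', 'Chart B']  # placeholder titles

fig, axs = plt.subplots(1, 2, figsize=(10, 6),
                        subplot_kw=dict(aspect="equal"))
for ax, title in zip(axs, titles):
    # The first returned item holds the wedge patches
    wedges = ax.pie(fracs, labels=labels, autopct='%.0f%%',
                    wedgeprops=dict(width=0.6))[0]
    # Anchor the top center of the legend at the bottom center of the axes
    ax.legend(wedges, labels, title=title,
              loc='upper center', bbox_to_anchor=(0.5, 0), ncol=2)

# Leave room below the axes for the legends
fig.subplots_adjust(bottom=0.25)
```

Here bbox_to_anchor is given in axes coordinates, so (0.5, 0) is the middle of the bottom edge of each subplot, and loc says which point of the legend box is pinned there.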
I am plotting a pie chart with the pandas plot function and matplotlib, using the following code:
plt.figure(figsize=(16,8))
# plot chart
ax1 = plt.subplot(121, aspect='equal')
dfhelp.plot(kind='pie', y = 'Prozentuale Gesamt', ax=ax1, autopct='%1.1f%%',
startangle=90, shadow=False, labels=dfhelp['Anzahl Geschäfte in der Gruppe'], legend = False, fontsize=14)
plt.show()
The output looks like:
The problem is that the percentages and the legend overlap. Do you have any idea how to fix that? For the plotting I used this question.
This is an easier and more readable version of this answer in my opinion (but credits to that answer for making it possible).
import matplotlib.pyplot as plt
import pandas as pd
d = {'col1': ['Tesla', 'GM', 'Ford', 'Nissan', 'Other'],
'col2': [117, 95, 54, 10, 7]}
df = pd.DataFrame(data=d)
print(df)
# Calculate percentages
percent = 100.*df.col2/df.col2.sum()
# Write label in the format "Manufacturer - Percentage %"
labels = ['{0} - {1:1.2f} %'.format(i,j) for i,j in zip(df.col1, percent)]
ax = df.col2.plot(kind='pie', labels=None) # the pie plot
ax.axis('equal') # Equal aspect ratio ensures that pie is drawn as a circle
ax.yaxis.label.set_visible(False) # disable y-axis label
# add the legend
ax.legend(labels, loc='best', bbox_to_anchor=(-0.1, 1.), fontsize=8)
plt.show()
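If the legend still covers part of the pie with loc='best', a variation would be to pin it outside the axes. This sketch uses plain matplotlib's ax.pie instead of the pandas wrapper (a swap made here for illustration, not part of the answer above), reuses the same made-up data, and selects the Agg backend only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'col1': ['Tesla', 'GM', 'Ford', 'Nissan', 'Other'],
                   'col2': [117, 95, 54, 10, 7]})

# Percentages go into the legend text instead of onto the wedges
percent = 100.0 * df['col2'] / df['col2'].sum()
labels = [f'{name} - {pct:1.2f} %' for name, pct in zip(df['col1'], percent)]

fig, ax = plt.subplots(figsize=(8, 4))
wedges = ax.pie(df['col2'], startangle=90)[0]  # first item: the wedge patches
ax.axis('equal')  # draw the pie as a circle

# Pin the left edge of the legend to the right edge of the axes
legend = ax.legend(wedges, labels, loc='center left',
                   bbox_to_anchor=(1.0, 0.5), fontsize=8)
fig.tight_layout()

print(labels[0])  # Tesla - 41.34 %
```

Because the wedges carry no text at all, nothing on the pie itself can collide with the legend.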
I made a scatter plot using seaborn from three columns ('Category', 'Installs' and 'Gross Income'), with a hue mapping on the Category column of my dataset. However, in the legend, besides the Category entries that I want to appear, there is a big smudge at the end showing one of the columns used in the scatter plot, Installs. I'd like to remove this element, but after searching through other questions here and the documentation of seaborn and matplotlib, I'm at a loss on how to proceed.
Here is a snippet of the code I'm working with:
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(12,6))
ax=sns.scatterplot( x="Installs", y="Gross Income", data=comp_income_inst, hue='Category',
palette=sns.color_palette("cubehelix",len(comp_income_inst)),
size='Installs', sizes=(100,5000), legend='brief', ax=ax)
ax.set(xscale="log", yscale="log")
ax.set(ylabel="Average Income")
ax.set_title("Distribution showing the Earnings of Apps in Various Categories\n", fontsize=18)
plt.rcParams["axes.labelsize"] = 15
# Move the legend to an empty part of the plot
plt.legend(loc='upper left', bbox_to_anchor=(-0.2, -0.06),fancybox=True, shadow=True, ncol=5)
#plt.legend(loc='upper left')
plt.show()
Actually, that is not a smudge but the size legend for the size='Installs' mapping. Because the bubble sizes (100, 5000) are so large, the legend markers overlap each other in that part of the legend, creating the "smudge" effect. By default, the legend combines the color and size entries.
But rather than removing the size markers as you intend, consider that readers may need to know which Installs values the bubble sizes represent. Hence, consider splitting the one legend into two and using borderpad and the prop size to fit the bubbles and labels.
Data (seeded, random data)
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
categs = ['GAME', 'EDUCATION', 'FAMILY', 'WEATHER', 'ENTERTAINMENT', 'PHOTOGRAPHY', 'LIFESTYLE',
'SPORTS', 'PRODUCTIVITY', 'COMMUNICATION', 'PERSONALIZATION', 'HEALTH_AND_FITNESS', 'FOOD_AND_DRINK', 'PARENTING',
'MAPS_AND_NAVIGATION', 'TOOLS', 'VIDEO_PLAYERS', 'BUSINESS', 'AUTO_AND_VEHICLES', 'TRAVEL_AND_LOCAL',
'FINANCE', 'MEDICAL', 'ART_AND_DESIGN', 'SHOPPING', 'NEWS_AND_MAGAZINES', 'SOCIAL', 'DATING', 'BOOKS_AND REFERENCES',
'LIBRARIES_AND_DEMO', 'EVENTS']
np.random.seed(11222018)
comp_income_inst = pd.DataFrame({'Category': categs,
'Installs': np.random.randint(100, 5000, 30),
'Gross Income': np.random.uniform(0, 30, 30) * 100000
}, columns=['Category', 'Installs', 'Gross Income'])
Graph
fig, ax = plt.subplots(figsize=(13,6))
ax = sns.scatterplot(x="Installs", y="Gross Income", data=comp_income_inst, hue='Category',
palette=sns.color_palette("cubehelix",len(comp_income_inst)),
size='Installs', sizes=(100, 5000), legend='brief', ax=ax)
ax.set(xscale="log", yscale="log")
ax.set(ylabel="Average Income")
ax.set_title("Distribution showing the Earnings of Apps in Various Categories\n", fontsize=20)
plt.rcParams["axes.labelsize"] = 15
# EXTRACT CURRENT HANDLES AND LABELS
h,l = ax.get_legend_handles_labels()
# COLOR LEGEND (FIRST 30 ITEMS)
col_lgd = plt.legend(h[:30], l[:30], loc='upper left',
bbox_to_anchor=(-0.05, -0.50), fancybox=True, shadow=True, ncol=5)
# SIZE LEGEND (LAST 5 ITEMS)
size_lgd = plt.legend(h[-5:], l[-5:], loc='lower center', borderpad=1.6, prop={'size': 20},
bbox_to_anchor=(0.5,-0.45), fancybox=True, shadow=True, ncol=5)
# ADD FORMER (OVERWRITTEN BY LATTER)
plt.gca().add_artist(col_lgd)
plt.show()
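The add_artist step is what keeps both legends alive, since an axes only tracks one legend at a time. Here is a minimal sketch of just that mechanism in plain matplotlib, without seaborn. The categories and sizes are invented for illustration, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Hypothetical data: three colored categories
cat_points = [ax.scatter([i], [i], s=40, label=f'cat {i}') for i in range(3)]
# Empty scatters that serve only as size samples for the legend
size_points = [ax.scatter([], [], s=s, color='gray', label=str(s))
               for s in (50, 200, 800)]

# First legend: colors
color_legend = ax.legend(handles=cat_points, title='Category',
                         loc='upper left')
# Second legend: sizes. This call replaces the first legend on the axes.
size_legend = ax.legend(handles=size_points, title='Installs',
                        loc='lower right', borderpad=1.6)
# Re-add the first legend as an ordinary artist so both are drawn
ax.add_artist(color_legend)
```

If you really do want to drop the size entries, passing only the category handles and labels to a single legend call would be enough.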
Output
Even consider seaborn's theme with sns.set() just before plotting: