I'm trying to display a custom legend for a bar graph, but only the first entry of the legend list is displayed. How can I display all the values in the legend?
df.time_to_travel_grouping.value_counts().plot(kind="bar",
color = ["b","tab:green","tab:red","c","m","y","tab:blue","tab:orange"],
xlabel="TTT", ylabel="Total Counts",
title="Fig4: Total Counts by Time to Travel Category (TTT)", figsize=(20,15))
plt.legend(["a","b","c","d","e","f","g","h"])
plt.subplots_adjust(bottom=0.15)
plt.subplots_adjust(left=0.15)
Let's get the patch handles from the axes using ax.get_legend_handles_labels:
s = pd.Series(np.arange(100,50,-5), index=[*'abcdefghij'])
ax = s.plot(kind="bar",
color = ["b","tab:green","tab:red","c","m","y","tab:blue","tab:orange"],
xlabel="TTT", ylabel="Total Counts",
title="Fig4: Total Counts by Time to Travel Category (TTT)", figsize=(20,15))
patches, _ = ax.get_legend_handles_labels()
labels = [*'abcdefghij']
ax.legend(*patches, labels, loc='best')
plt.subplots_adjust(bottom=0.15)
plt.subplots_adjust(left=0.15)
Output:
To create an automatic legend, matplotlib stores labels for graphical elements. In the case of this bar plot, pandas assigns one label to the complete 'container'.
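A minimal sketch of why only one entry shows up, using plain matplotlib rather than the pandas wrapper (the data and the label "counts" are made up for illustration): a single bar call yields one labelled container, so the legend sees one handle no matter how many bars were drawn.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
# Three bars, but one call and therefore one labelled container
ax.bar(["a", "b", "c"], [5, 3, 8], label="counts")

# The legend machinery collects labelled handles; here there is only the container
handles, labels = ax.get_legend_handles_labels()
print(len(ax.patches), "bars,", len(handles), "legend handle(s):", labels)
```

Passing eight strings to plt.legend therefore pairs only the first string with that single handle.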
You could remove the label of the container (assigning a label starting with _), and assign individual labels to the bars. The xtick labels can be used, as they are already in the desired order.
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
df = pd.DataFrame({'time_to_travel_grouping': np.random.choice([*'abcdefgh'], 200)})
ax = df.time_to_travel_grouping.value_counts().plot(kind="bar",
color=["b", "tab:green", "tab:red", "c", "m", "y", "tab:blue", "tab:orange"],
xlabel="TTT", ylabel="Total Counts",
title="Fig4: Total Counts by Time to Travel Category (TTT)",
figsize=(20, 15))
ax.containers[0].set_label('_nolegend')
for bar, tick_label in zip(ax.containers[0], ax.get_xticklabels()):
bar.set_label(tick_label.get_text())
ax.legend()
plt.tight_layout()
plt.show()
With a little bit less internal manipulation, something similar can be obtained via seaborn:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
import seaborn as sns
df = pd.DataFrame({'time_to_travel_grouping': np.random.choice([*'abcdefgh'], 200)})
plt.figure(figsize=(20, 15))
ax = sns.countplot(data=df, x='time_to_travel_grouping', hue='time_to_travel_grouping',
palette=["b", "tab:green", "tab:red", "c", "m", "y", "tab:blue", "tab:orange"],
order=df.time_to_travel_grouping.value_counts().index,
dodge=False)
plt.setp(ax, xlabel="TTT", ylabel="Total Counts", title="Fig4: Total Counts by Time to Travel Category (TTT)")
plt.tight_layout()
plt.show()
Just passing a list of strings to the legend function does not work the way you expect in matplotlib. To add all the desired entries, you can build a patch object for each label with its color and pass those patches as handles. This piece of code does the job, and I think it is more general than the other solutions:
## include this library
import matplotlib.patches as mpatches
## desired legends
legend_list = ["a","b","c","d","e","f","g","h"]
## corresponding colors in the same order
color_list = ["b","tab:green","tab:red","c","m","y","tab:blue","tab:orange"]
## make patches from the legends and corresponding colors
patch_list = []
for each_legend, each_color in zip(legend_list, color_list):
    patch_list.append(mpatches.Patch(label=each_legend, color=each_color))
## add made patches to the plot
plt.legend(handles=patch_list, fontsize=12, loc=(1, 0))
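As a variation on the same idea, here is a small self-contained sketch that reads each color back from the drawn bars instead of keeping a second color list in sync. The category names and counts are hypothetical placeholders, not the data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

names = ["a", "b", "c"]                  # hypothetical category names
counts = [5, 3, 8]                       # hypothetical counts
colors = ["b", "tab:green", "tab:red"]

fig, ax = plt.subplots()
bars = ax.bar(names, counts, color=colors)

# One proxy patch per bar, colored from the bar itself so legend and plot cannot drift apart
patch_list = [mpatches.Patch(color=bar.get_facecolor(), label=name)
              for bar, name in zip(bars, names)]
legend = ax.legend(handles=patch_list, loc="upper left")
legend_labels = [text.get_text() for text in legend.get_texts()]
```

The trade-off is that this needs the return value of the bar call (or ax.containers[0] when pandas did the plotting), whereas the explicit color list works without touching the plot objects.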
I generated a boxplot using seaborn. On the x axis, I would like to have both the number of days (20, 25, 32) and the actual dates they refer to (2022-05-08, 2022-05-13, 2022-05-20).
I found a potential solution at the following link add custom tick with matplotlib. I'm trying to adapt it to my problem but I could only get the number of days or the dates, not both.
I really would appreciate any help. Thank you in advance for your time.
Please, find below my code and the desired output.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'nb_days':[20,20,20,25,25,20,32,32,25,32,32],
'Dates':['2022-05-08','2022-05-08','2022-05-08','2022-05-13','2022-05-13','2022-05-08','2022-05-20','2022-05-20','2022-05-13','2022-05-20','2022-05-20'],
'score':[3,3.5,3.4,2,2.2,3,5,5.2,4,4.3,5]})
df['Dates'] = df['Dates'].apply(pd.to_datetime)
tick_label = dict(zip(df['nb_days'],df['Dates'].apply(lambda x: x.strftime('%Y-%m-%d')))) #My custom xtick label
#Plot
fig,ax = plt.subplots(figsize=(6,6))
ax = sns.boxplot(x='nb_days',y='score',data=df,color=None)
# iterate over boxes to change color
for i,box in enumerate(ax.artists):
    box.set_edgecolor('red')
    box.set_facecolor('white')
sns.stripplot(x='nb_days',y='score',data=df,color='black')
ticks = sorted(df['nb_days'].unique())
labels = [tick_label.get(t, ticks[i]) for i,t in enumerate(ticks)]
ax.set_xticklabels(labels)
plt.tight_layout()
plt.show()
plt.close()
Here is the desired output.
You can do that by using these lines in place of ax.set_xticklabels(labels):
new_labels=["{}\n{}".format(a_, b_) for a_, b_ in zip(ticks, labels)]
ax.set_xticklabels(new_labels)
Output
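The two-line label trick can be sketched in isolation with plain matplotlib. The bar heights below are placeholder values, and a bar chart stands in for the seaborn boxplot; only the tick-label handling is the point.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

days = [20, 25, 32]
dates = ["2022-05-08", "2022-05-13", "2022-05-20"]

fig, ax = plt.subplots()
positions = list(range(len(days)))
ax.bar(positions, [3.2, 2.7, 4.9])   # placeholder heights, not the real scores

# A newline inside the label string puts the date under the day count
two_line = [f"{d}\n{date}" for d, date in zip(days, dates)]
ax.set_xticks(positions)             # fix the tick positions before labelling them
ax.set_xticklabels(two_line)
tick_texts = [t.get_text() for t in ax.get_xticklabels()]
```

Setting the tick positions explicitly before the labels keeps the two lists aligned even if the automatic locator would have picked different ticks.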
Try this:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'nb_days':[20,20,20,25,25,20,32,32,25,32,32],
'Dates':['2022-05-08','2022-05-08','2022-05-08','2022-05-13','2022-05-13','2022-05-08','2022-05-20','2022-05-20','2022-05-13','2022-05-20','2022-05-20'],
'score':[3,3.5,3.4,2,2.2,3,5,5.2,4,4.3,5]})
df['Dates'] = df['Dates'].apply(pd.to_datetime)
tick_label = dict(zip(df['nb_days'],df['Dates'].apply(lambda x: x.strftime('%Y-%m-%d')))) #My custom xtick label
#Plot
fig,ax = plt.subplots(figsize=(6,6))
ax = sns.boxplot(x='nb_days',y='score',data=df,color=None)
# iterate over boxes to change color
for i,box in enumerate(ax.artists):
    box.set_edgecolor('red')
    box.set_facecolor('white')
sns.stripplot(x='nb_days',y='score',data=df,color='black')
ticks = sorted(df['nb_days'].unique())
labels = ["{}\n".format(t)+tick_label.get(t, ticks[i]) for i, t in enumerate(ticks)]
ax.set_xticklabels(labels)
plt.tight_layout()
plt.show()
plt.close()
I am plotting a series of boxplots on the same axes and want to add a legend to identify them.
Very simplified, my script looks like this:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df={}
bp={}
positions = [1,2,3,4]
df[0]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
df[1]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
colour=['red','blue']
fig, ax = plt.subplots()
for i in [0,1]:
    bp[i] = df[i].plot.box(ax=ax,
                           positions = positions,
                           color={'whiskers': colour[i],
                                  'caps': colour[i],
                                  'medians': colour[i],
                                  'boxes': colour[i]}
                           )
plt.legend([bp[i] for i in [0,1]], ['first plot', 'second plot'])
fig.show()
The plot is fine, but the legend is not drawn and I get this warning
UserWarning: Legend does not support <matplotlib.axes._subplots.AxesSubplot object at 0x000000000A7830F0> instances.
A proxy artist may be used instead.
(I have had this warning before when adding a legend to a scatter plot, but the legend was still drawn, so I could ignore it.)
Here is a link to a description of proxy artists, but it is not clear how to apply this to my script. Any suggestions?
pandas plots return AxesSubplot objects, which cannot be used as legend handles. You must build your own legend using proxy artists instead. I have modified your code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches
df={}
bp={}
positions = [1,2,3,4]
df[0]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
df[1]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
colour=['red','blue']
fig, ax = plt.subplots()
for i in [0,1]:
    bp[i] = df[i].plot.box(ax=ax,
                           positions = positions,
                           color={'whiskers': colour[i],
                                  'caps': colour[i],
                                  'medians': colour[i],
                                  'boxes': colour[i]}
                           )
red_patch = mpatches.Patch(color='red', label='The red data')
blue_patch = mpatches.Patch(color='blue', label='The blue data')
plt.legend(handles=[red_patch, blue_patch])
plt.show()
The results are shown below:
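An alternative sketch, assuming you are free to call matplotlib's own ax.boxplot instead of the pandas wrapper: it returns a dictionary of the drawn artists, so one box from each call can serve as the legend handle and no proxy patch is needed. The random data and the first_boxes name are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()

first_boxes = []
for colour in ["red", "blue"]:
    data = rng.random((4, 4))  # 4 columns give 4 boxes
    bp = ax.boxplot(data, positions=[1, 2, 3, 4],
                    boxprops={"color": colour},
                    whiskerprops={"color": colour},
                    capprops={"color": colour},
                    medianprops={"color": colour})
    # bp["boxes"] holds Line2D artists, which legend accepts as handles
    first_boxes.append(bp["boxes"][0])

legend = ax.legend(first_boxes, ["first plot", "second plot"])
legend_labels = [text.get_text() for text in legend.get_texts()]
```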
I am plotting multiple dataframes as point plots using seaborn, all on the same axis.
How would I add a legend to the plot?
My code takes each of the dataframe and plots it one after another on the same figure.
Each dataframe has same columns
date count
2017-01-01 35
2017-01-02 43
2017-01-03 12
2017-01-04 27
My code :
f, ax = plt.subplots(1, 1, figsize=figsize)
x_col='date'
y_col = 'count'
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_1,color='blue')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_2,color='green')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df_3,color='red')
This plots 3 lines on the same plot. However, the legend is missing, and according to the documentation pointplot does not accept a label argument.
One workaround that worked was creating a new dataframe and using hue argument.
df_1['region'] = 'A'
df_2['region'] = 'B'
df_3['region'] = 'C'
df = pd.concat([df_1,df_2,df_3])
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df,hue='region')
But I would like to know if there is a way to create a legend for the code that first adds sequentially point plot to the figure and then add a legend.
Sample output :
I would suggest not using seaborn pointplot for this plot. It makes things unnecessarily complicated.
Instead, use matplotlib plot_date. It lets you set a label on each plot, and the labels are then put into a legend automatically by ax.legend().
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np
date = pd.date_range("2017-03", freq="M", periods=15)
count = np.random.rand(15,4)
df1 = pd.DataFrame({"date":date, "count" : count[:,0]})
df2 = pd.DataFrame({"date":date, "count" : count[:,1]+0.7})
df3 = pd.DataFrame({"date":date, "count" : count[:,2]+2})
f, ax = plt.subplots(1, 1)
x_col='date'
y_col = 'count'
ax.plot_date(df1.date, df1["count"], color="blue", label="A", linestyle="-")
ax.plot_date(df2.date, df2["count"], color="red", label="B", linestyle="-")
ax.plot_date(df3.date, df3["count"], color="green", label="C", linestyle="-")
ax.legend()
plt.gcf().autofmt_xdate()
plt.show()
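The same labelling approach also works with plain ax.plot, which accepts date values directly; as far as I know, newer Matplotlib documentation discourages plot_date in favour of it. A minimal sketch, where series "A" reuses the sample counts from the question and "B" and "C" are invented values:

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

dates = [dt.date(2017, 1, 1) + dt.timedelta(days=i) for i in range(4)]
series = {
    "A": [35, 43, 12, 27],   # sample counts from the question
    "B": [30, 40, 20, 25],   # invented for illustration
    "C": [10, 15, 12, 18],   # invented for illustration
}

fig, ax = plt.subplots()
for (name, counts), color in zip(series.items(), ["blue", "green", "red"]):
    # The label set here is what ax.legend() picks up later
    ax.plot(dates, counts, color=color, marker="o", label=name)

legend = ax.legend()
fig.autofmt_xdate()
legend_labels = [text.get_text() for text in legend.get_texts()]
```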
In case one is still interested in obtaining the legend for pointplots, here is a way to go:
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df1,color='blue')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df2,color='green')
sns.pointplot(ax=ax,x=x_col,y=y_col,data=df3,color='red')
ax.legend(handles=ax.lines[::len(df1)+1], labels=["A","B","C"])
ax.set_xticklabels([t.get_text().split("T")[0] for t in ax.get_xticklabels()])
plt.gcf().autofmt_xdate()
plt.show()
Old question, but there's an easier way.
sns.pointplot(x=x_col,y=y_col,data=df_1,color='blue')
sns.pointplot(x=x_col,y=y_col,data=df_2,color='green')
sns.pointplot(x=x_col,y=y_col,data=df_3,color='red')
plt.legend(labels=['legendEntry1', 'legendEntry2', 'legendEntry3'])
This lets you add the plots sequentially, and not have to worry about any of the matplotlib crap besides defining the legend items.
I tried using Adam B's answer, however, it didn't work for me. Instead, I found the following workaround for adding legends to pointplots.
import matplotlib.patches as mpatches
red_patch = mpatches.Patch(color='#bb3f3f', label='Label1')
black_patch = mpatches.Patch(color='#000000', label='Label2')
In the pointplots, the color can be specified as mentioned in previous answers. Once these patches corresponding to the different plots are set up,
plt.legend(handles=[red_patch, black_patch])
And the legend ought to appear in the pointplot.
This goes a bit beyond the original question, but it builds on @PSub's response toward something more general. I do know some of this is easier in Matplotlib directly, but many of Seaborn's default styling options are quite nice, so I wanted to work out how you could have more than one legend for a point plot (or other Seaborn plot) without dropping into Matplotlib right at the start.
Here's one solution:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# We will need to access some of these matplotlib classes directly
from matplotlib.lines import Line2D # For points and lines
from matplotlib.patches import Patch # For KDE and other plots
from matplotlib.legend import Legend
from matplotlib import cm
# Initialise random number generator
rng = np.random.default_rng(seed=42)
# Generate sample of 25 numbers
n = 25
clusters = []
for c in range(0,3):
    # Crude way to get different distributions
    # for each cluster
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Cluster {c+1}"
    })
    clusters.append(df)
# Flatten to a single data frame
clusters = pd.concat(clusters)
# Now do the same for data to feed into
# the second (scatter) plot...
n = 8
points = []
for c in range(0,2):
    p = rng.integers(low=1, high=6, size=4)
    df = pd.DataFrame({
        'x': rng.normal(p[0], p[1], n),
        'y': rng.normal(p[2], p[3], n),
        'name': f"Group {c+1}"
    })
    points.append(df)
points = pd.concat(points)
# And create the figure
f, ax = plt.subplots(figsize=(8,8))
# The KDE-plot generates a Legend 'as usual'
k = sns.kdeplot(
data=clusters,
x='x', y='y',
hue='name',
shade=True,
thresh=0.05,
n_levels=2,
alpha=0.2,
ax=ax,
)
# Notice that we access this legend via the
# axis to turn off the frame, set the title,
# and adjust the patch alpha level so that
# it closely matches the alpha of the KDE-plot
ax.get_legend().set_frame_on(False)
ax.get_legend().set_title("Clusters")
for lh in ax.get_legend().get_patches():
    lh.set_alpha(0.2)
# You would probably want to sort your data
# frame or set the hue and style order in order
# to ensure consistency for your own application
# but this works for demonstration purposes
groups = points.name.unique()
markers = ['o', 'v', 's', 'X', 'D', '<', '>']
colors = cm.get_cmap('Dark2').colors
# Generate the scatterplot: notice that Legend is
# off (otherwise this legend would overwrite the
# first one) and that we're setting the hue, style,
# markers, and palette using the 'name' parameter
# from the data frame and the number of groups in
# the data.
p = sns.scatterplot(
data=points,
x="x",
y="y",
hue='name',
style='name',
markers=markers[:len(groups)],
palette=colors[:len(groups)],
legend=False,
s=30,
alpha=1.0
)
# Here's the 'magic' -- we use zip to link together
# the group name, the color, and the marker style. You
# *cannot* retrieve the marker style from the scatterplot
# since that information is lost when rendered as a
# PathCollection (as far as I can tell). Anyway, this allows
# us to loop over each group in the second data frame and
# generate a 'fake' Line2D plot (with zero elements and no
# line-width in our case) that we can add to the legend. If
# you were overlaying a line plot or a second plot that uses
# patches you'd have to tweak this accordingly.
patches = []
for x in zip(groups, colors[:len(groups)], markers[:len(groups)]):
    patches.append(Line2D([0],[0], linewidth=0.0, linestyle='',
                          color=x[1], markerfacecolor=x[1],
                          marker=x[2], label=x[0], alpha=1.0))
# And add these patches (with their group labels) to the new
# legend item and place it on the plot.
leg = Legend(ax, patches, labels=groups,
loc='upper left', frameon=False, title='Groups')
ax.add_artist(leg);
# Done
plt.show();
Here's the output:
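Stripped of the seaborn parts, the two-legend mechanism comes down to re-adding the first legend as a plain artist before creating the second. A minimal matplotlib-only sketch, where the handle colors, labels, and titles are placeholders:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

fig, ax = plt.subplots()

# Proxy handles: filled patches for the first legend, marker-only lines for the second
cluster_handles = [Patch(color="tab:blue", alpha=0.2, label="Cluster 1"),
                   Patch(color="tab:orange", alpha=0.2, label="Cluster 2")]
group_handles = [Line2D([0], [0], linestyle="", marker="o", color="k", label="Group 1"),
                 Line2D([0], [0], linestyle="", marker="v", color="k", label="Group 2")]

first = ax.legend(handles=cluster_handles, title="Clusters", loc="upper right")
ax.add_artist(first)   # keep the first legend when ax.legend is called again
second = ax.legend(handles=group_handles, title="Groups", loc="upper left")
```

Without the add_artist call, the second ax.legend call would replace the first legend rather than sit alongside it.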
I've been trying to follow this How to make custom legend in matplotlib SO question but I think a few things are getting lost in translation. I used a custom color mapping for the different classes of points in my plot and I want to be able to put a table with those color-label pairs. I stored the info in a dictionary D_color_label and then made 2 parallel lists colors and labels. I tried using it in the ax.legend but it didn't seem to work.
np.random.seed(0)
# Create dataframe
DF_0 = pd.DataFrame(np.random.random((100,2)), columns=["x","y"])
# Label to colors
D_idx_color = {**dict(zip(range(0,25), ["#91FF61"]*25)),
**dict(zip(range(25,50), ["#BA61FF"]*25)),
**dict(zip(range(50,75), ["#916F61"]*25)),
**dict(zip(range(75,100), ["#BAF1FF"]*25))}
D_color_label = {"#91FF61":"label_0",
"#BA61FF":"label_1",
"#916F61":"label_2",
"#BAF1FF":"label_3"}
# Add color column
DF_0["color"] = pd.Series(list(D_idx_color.values()), index=list(D_idx_color.keys()))
# Plot
fig, ax = plt.subplots(figsize=(8,8))
sns.regplot(data=DF_0, x="x", y="y", scatter_kws={"c":DF_0["color"]}, ax=ax)
# Add custom legend
colors = list(set(DF_0["color"]))
labels = [D_color_label[x] for x in set(DF_0["color"])]
# If I do this, I get the following error:
# ax.legend(colors, labels)
# UserWarning: Legend does not support '#BA61FF' instances.
# A proxy artist may be used instead.
According to http://matplotlib.org/users/legend_guide.html, you have to pass the legend function the artists that should be labeled. To label the scatter points individually, you have to group your data by color and plot each color group separately, so that every artist gets its own label:
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
import seaborn as sns
np.random.seed(0)
# Create dataframe
DF_0 = pd.DataFrame(np.random.random((100, 2)), columns=["x", "y"])
DF_0['color'] = ["#91FF61"]*25 + ["#BA61FF"]*25 + ["#91FF61"]*25 + ["#BA61FF"]*25
#print DF_0
D_color_label = {"#91FF61": "label_0", "#BA61FF": "label_1",
"#916F61": "label_2", "#BAF1FF": "label_3"}
colors = list(DF_0["color"].unique())
labels = [D_color_label[x] for x in DF_0["color"].unique()]
ax = sns.regplot(data=DF_0, x="x", y="y", scatter_kws={'c': DF_0['color'], 'zorder':1})
# Make a legend
# groupby and plot points of one color
for color, grp in DF_0.groupby('color'):
    grp.plot(kind='scatter', x='x', y='y', c=color, ax=ax, label=D_color_label[color], zorder=0)
ax.legend(loc=2)
plt.show()
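The group-then-label idea can also be sketched without regplot, calling ax.scatter once per color group. The color_to_label name and the two-color split are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 2)), columns=["x", "y"])
df["color"] = ["#91FF61"] * 50 + ["#BA61FF"] * 50
color_to_label = {"#91FF61": "label_0", "#BA61FF": "label_1"}  # hypothetical mapping

fig, ax = plt.subplots()
# One scatter call per color, so each call produces one labelled artist
for color, group in df.groupby("color"):
    ax.scatter(group["x"], group["y"], color=color, label=color_to_label[color])

legend = ax.legend(loc="upper left")
legend_labels = [text.get_text() for text in legend.get_texts()]
```

Looking the label up by the color key, rather than by position in a list, keeps the pairing correct regardless of the order in which groupby yields the groups.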
I have the following code:
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
import seaborn as sns
sns.set(style="white")
# Create a dataset with many short random walks
rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (10, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
step = np.tile(range(5), 10)
walk = np.repeat(range(10), 5)
df = pd.DataFrame(np.c_[pos.flat, step, walk],
columns=["position", "step", "walk"])
# Initialize a grid of plots with an Axes for each walk
grid = sns.FacetGrid(df, col="walk", hue="walk", col_wrap=5, size=5,
aspect=1)
# Draw a bar plot to show the trajectory of each random walk
grid.map(sns.barplot, "step", "position", palette="Set3").add_legend();
grid.savefig("/Users/mymacmini/Desktop/test_fig.png")
#sns.plt.show()
Which makes this plot:
As you can see I get the legend wrong. How can I make it right?
Somehow there is one legend item for each subplot. It looks like, if we want the legend to correspond to the bars in each subplot, we have to build it manually.
# Let's just make a 1-by-2 plot
df = df.head(10)
# Initialize a grid of plots with an Axes for each walk
grid = sns.FacetGrid(df, col="walk", hue="walk", col_wrap=2, size=5,
aspect=1)
# Draw a bar plot to show the trajectory of each random walk
bp = grid.map(sns.barplot, "step", "position", palette="Set3")
# The color cycles are all going to be the same, so it doesn't matter which axes we use
Ax = bp.axes[0]
# Somehow, for a plot of 5 bars, there are 6 patches; what is the 6th one? (presumably the Axes background patch)
Boxes = [item for item in Ax.get_children()
if isinstance(item, matplotlib.patches.Rectangle)][:-1]
# There are no labels, so we need to define them
legend_labels = ['a', 'b', 'c', 'd', 'e']
# Create the legend patches
legend_patches = [matplotlib.patches.Patch(color=C, label=L) for
C, L in zip([item.get_facecolor() for item in Boxes],
legend_labels)]
# Plot the legend
plt.legend(handles=legend_patches)
When the legend doesn't work out you can always make your own easily like this:
import matplotlib.patches
import matplotlib.pyplot
name_to_color = {
'Expected': 'green',
'Provided': 'red',
'Difference': 'blue',
}
patches = [matplotlib.patches.Patch(color=v, label=k) for k,v in name_to_color.items()]
matplotlib.pyplot.legend(handles=patches)