I have a data frame and I want to plot a legend with 'A', 'B', and 'C'. However, what I have only produces a legend with an 'A' label:
data = {'A1_mean': [0.457, 1],
'A2_median': [0.391,1],
'A3_range': [0.645,1],
'A4_std': [0.111,1],
'B1_mean': [0.132,3],
'B2_median': [0.10,3],
'B3_range': [0.244,3],
'B4_std': [0.297,3],
'C1_mean': [0.286,2],
'C2_median': [0.231,2],
'C3_range': [0.554,2],
'C4_std': [0.147,2]}
df = pd.DataFrame(data).T
color = {1:'red',2:'green',3:'blue'}
ax=df[0].plot(kind='bar',color=df[1].map(color).tolist())
ax.legend(['A','B','C'])
gives:
How can I change this so that I have a legend with A B and C, with the appropriate color (A:red, B:blue, C:green) ?
Per the Legend guide you could place Proxy Artists in the legend:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
data = {'A1_mean': [0.457, 1],
'A2_median': [0.391,1],
'A3_range': [0.645,1],
'A4_std': [0.111,1],
'B1_mean': [0.132,3],
'B2_median': [0.10,3],
'B3_range': [0.244,3],
'B4_std': [0.297,3],
'C1_mean': [0.286,2],
'C2_median': [0.231,2],
'C3_range': [0.554,2],
'C4_std': [0.147,2]}
df = pd.DataFrame(data).T
color = {1:'red',2:'green',3:'blue'}
labels = ['A','C','B']
fig, ax = plt.subplots()
df[0].plot(ax=ax, kind='bar', color=df[1].map(color))
handles = []
for i, c in color.items():
    handles.append(mpatches.Patch(color=c, label=labels[i-1]))
plt.legend(handles=handles, loc='best')
# auto-rotate xtick labels
fig.autofmt_xdate()
plt.show()
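If you would rather not maintain the labels list by hand, the group names can be derived from the row labels. This is only a sketch: it assumes the group is the first character of each index entry, and group_color is a name made up for the illustration. It uses a shortened copy of the data and a headless backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt
import pandas as pd

data = {'A1_mean': [0.457, 1], 'A2_median': [0.391, 1],
        'B1_mean': [0.132, 3], 'B2_median': [0.10, 3],
        'C1_mean': [0.286, 2], 'C2_median': [0.231, 2]}
df = pd.DataFrame(data).T
color = {1: 'red', 2: 'green', 3: 'blue'}

# Per-row colours; the codes are stored as floats after the transpose, so cast first.
row_colors = df[1].astype(int).map(color)

# Assumption: the group name is the first character of each row label.
groups = df.index.str[0]
# Map each group letter to its colour (later rows overwrite earlier ones,
# which is harmless as long as a group only ever has one colour).
group_color = dict(zip(groups, row_colors))

fig, ax = plt.subplots()
df[0].plot(ax=ax, kind='bar', color=row_colors.tolist())

# One proxy patch per group, sorted so the legend reads A, B, C.
handles = [mpatches.Patch(color=c, label=g) for g, c in sorted(group_color.items())]
ax.legend(handles=handles, loc='best')
```

Sorting the items also puts the legend in A, B, C order rather than the order of the colour dictionary.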
I generated a boxplot using seaborn. On the x axis, I would like to have both the number of days (20, 25, 32) and the actual dates they refer to (2022-05-08, 2022-05-13, 2022-05-20).
I found a potential solution in the question "add custom tick with matplotlib". I'm trying to adapt it to my problem, but I could only get the number of days or the dates, not both.
I really would appreciate any help. Thank you in advance for your time.
Please, find below my code and the desired output.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'nb_days':[20,20,20,25,25,20,32,32,25,32,32],
'Dates':['2022-05-08','2022-05-08','2022-05-08','2022-05-13','2022-05-13','2022-05-08','2022-05-20','2022-05-20','2022-05-13','2022-05-20','2022-05-20'],
'score':[3,3.5,3.4,2,2.2,3,5,5.2,4,4.3,5]})
df['Dates'] = df['Dates'].apply(pd.to_datetime)
tick_label = dict(zip(df['nb_days'],df['Dates'].apply(lambda x: x.strftime('%Y-%m-%d')))) #My custom xtick label
#Plot
fig,ax = plt.subplots(figsize=(6,6))
ax = sns.boxplot(x='nb_days',y='score',data=df,color=None)
# iterate over boxes to change color
for i,box in enumerate(ax.artists):
    box.set_edgecolor('red')
    box.set_facecolor('white')
sns.stripplot(x='nb_days',y='score',data=df,color='black')
ticks = sorted(df['nb_days'].unique())
labels = [tick_label.get(t, ticks[i]) for i,t in enumerate(ticks)]
ax.set_xticklabels(labels)
plt.tight_layout()
plt.show()
plt.close()
Here is the desired output.
You can do that by adding these lines in place of ax.set_xticklabels(labels):
new_labels=["{}\n{}".format(a_, b_) for a_, b_ in zip(ticks, labels)]
ax.set_xticklabels(new_labels)
Output
Try this:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame({'nb_days':[20,20,20,25,25,20,32,32,25,32,32],
'Dates':['2022-05-08','2022-05-08','2022-05-08','2022-05-13','2022-05-13','2022-05-08','2022-05-20','2022-05-20','2022-05-13','2022-05-20','2022-05-20'],
'score':[3,3.5,3.4,2,2.2,3,5,5.2,4,4.3,5]})
df['Dates'] = df['Dates'].apply(pd.to_datetime)
tick_label = dict(zip(df['nb_days'],df['Dates'].apply(lambda x: x.strftime('%Y-%m-%d')))) #My custom xtick label
#Plot
fig,ax = plt.subplots(figsize=(6,6))
ax = sns.boxplot(x='nb_days',y='score',data=df,color=None)
# iterate over boxes to change color
for i,box in enumerate(ax.artists):
    box.set_edgecolor('red')
    box.set_facecolor('white')
sns.stripplot(x='nb_days',y='score',data=df,color='black')
ticks = sorted(df['nb_days'].unique())
labels = ["{}\n{}".format(t, tick_label.get(t, t)) for t in ticks]  # fall back to the day count if no date is known
ax.set_xticklabels(labels)
plt.tight_layout()
plt.show()
plt.close()
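The same two-line label idea also works without seaborn. The sketch below is an assumption-laden illustration, not the original poster's code: it uses Matplotlib's own ax.boxplot on a shortened copy of the data, and date_by_day is a made-up name. It assumes each nb_days value maps to exactly one date.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "nb_days": [20, 20, 25, 25, 32, 32],
    "Dates": ["2022-05-08", "2022-05-08", "2022-05-13",
              "2022-05-13", "2022-05-20", "2022-05-20"],
    "score": [3, 3.5, 2, 2.2, 5, 5.2],
})

# One date string per day count (assumes the mapping is one-to-one).
date_by_day = df.groupby("nb_days")["Dates"].first()

ticks = sorted(df["nb_days"].unique())
# "\n" inside a tick label puts the date on a second line under the day count.
labels = [f"{day}\n{date_by_day[day]}" for day in ticks]

fig, ax = plt.subplots(figsize=(6, 6))
# One box per day count; boxes are drawn at positions 1..n by default.
ax.boxplot([df.loc[df["nb_days"] == day, "score"].to_numpy() for day in ticks])
ax.set_xticks(range(1, len(ticks) + 1))
ax.set_xticklabels(labels)
fig.tight_layout()
```

Building the labels from a groupby keeps the day-to-date mapping in one place, which may be easier to check than a dictionary built with zip.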
Question
I have used the secondary_y argument in pd.DataFrame.plot().
While trying to change the fontsize of legends by .legend(fontsize=20), I ended up having only 1 column name in the legend when I actually have 2 columns to be printed on the legend.
This problem (only 1 column name in the legend) does not occur when I do not use the secondary_y argument.
I want all the column names in my dataframe to be printed in the legend, and change the fontsize of the legend even when I use secondary_y while plotting dataframe.
Example
The following example with secondary_y shows only 1 column name A, when I have actually 2 columns, which are A and B.
The fontsize of the legend is changed, but only for 1 column name.
import pandas as pd
import numpy as np
np.random.seed(42)
df = pd.DataFrame(np.random.randn(24*3, 2),
index=pd.date_range('1/1/2019', periods=24*3, freq='h'))
df.columns = ['A', 'B']
df.plot(secondary_y = ["B"], figsize=(12,5)).legend(fontsize=20, loc="upper right")
When I do not use secondary_y, then legend shows both of the 2 columns A and B.
import pandas as pd
import numpy as np
np.random.seed(42)
df = pd.DataFrame(np.random.randn(24*3, 2),
index=pd.date_range('1/1/2019', periods=24*3, freq='h'))
df.columns = ['A', 'B']
df.plot(figsize=(12,5)).legend(fontsize=20, loc="upper right")
One way to customize it is to build the plot yourself with Matplotlib's subplots function and a twin axis:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)
df = pd.DataFrame(np.random.randn(24*3, 2),
index=pd.date_range('1/1/2019', periods=24*3, freq='h'))
df.columns = ['A', 'B']
#define colors to use
col1 = 'steelblue'
col2 = 'red'
#define subplots
fig,ax = plt.subplots()
#add first line to plot
lns1=ax.plot(df.index,df['A'], color=col1)
#add x-axis label
ax.set_xlabel('dates', fontsize=14)
#add y-axis label
ax.set_ylabel('A', color=col1, fontsize=16)
#define second y-axis that shares x-axis with current plot
ax2 = ax.twinx()
#add second line to plot
lns2=ax2.plot(df.index,df['B'], color=col2)
#add second y-axis label
ax2.set_ylabel('B', color=col2, fontsize=16)
#legend
ax.legend(lns1+lns2,['A','B'],loc="upper right",fontsize=20)
#another option is to create the legend on the figure instead:
#fig.legend(['A','B'],loc="upper right")
plt.show()
This is a somewhat late response, but what worked for me was simply calling plt.legend(fontsize=wanted_fontsize) after the plot call.
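If you want to keep the secondary_y argument, another possibility is to let pandas draw the lines but build the legend yourself from both axes. This is a sketch under a few assumptions: it relies on the right_ax attribute that pandas attaches to the primary axes when secondary_y is used, it passes legend=False and mark_right=False so the labels stay plain, and it uses a daily index only to keep the example small.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

np.random.seed(42)
df = pd.DataFrame(np.random.randn(72, 2), columns=["A", "B"],
                  index=pd.date_range("2019-01-01", periods=72, freq="D"))

# legend=False stops pandas from drawing its own legend;
# mark_right=False keeps the label "B" instead of "B (right)".
ax = df.plot(secondary_y=["B"], figsize=(12, 5), legend=False, mark_right=False)

# The line for A lives on the primary axes, the line for B on ax.right_ax.
handles_left, labels_left = ax.get_legend_handles_labels()
handles_right, labels_right = ax.right_ax.get_legend_handles_labels()

# One combined legend with the requested font size.
legend = ax.legend(handles_left + handles_right, labels_left + labels_right,
                   fontsize=20, loc="upper right")
```

Because the handles are collected from both axes, both column names end up in a single legend and the fontsize applies to all of them.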
I have written Python code to generate a multiple-line graph. I want to change it to a bar graph, such that for every point (Time, PktCount) I get a bar depicting that value at that time.
code
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('C:\\Users\\Hp\\Documents\\XYZ.csv')
fig, ax = plt.subplots()
for i, group in df.groupby('Source'):
    group.plot(x='Time', y='PktCount', ax=ax,label=group["Source"].iloc[0])
ax.set_title("PktCount Sent by nodes")
ax.set_ylabel("PktCount")
ax.set_xlabel("Time (milliSeconds)")
#optionally only set ticks at years present in the years column
plt.legend(title="Source Nodes", loc=0, fontsize='medium', fancybox=True)
plt.show()
This is my CSV file:
Source,Destination,Bits,Energy,PktCount,Time
1,3,320,9.999983999999154773195,1,0
3,1,96,9.999979199797145566758,1,1082
3,4,320,9.999963199267886912408,2,1773
4,3,96,9.999974399702292006927,1,2842
1,3,320,9.999947199998309546390,2,7832
3,1,96,9.999937599065032479166,3,8965
3,4,320,9.999921598535773824816,4,10421
4,3,96,9.999948799404584013854,2,11822
2,3,384,9.999907199998736846248,1,13796
3,2,96,9.999892798283143074166,5,14990
1,3,320,9.999886399997464319585,3,18137
3,4,384,9.999873597648032688946,6,18488
3,4,384,9.999854397012922303726,7,25385
4,3,96,9.999919999106876020781,3,26453
1,3,320,9.999831999996619092780,4,27220
3,1,96,9.999828796810067870484,8,28366
2,3,384,9.999823999997473692496,2,31677
3,2,96,9.999804796557437119834,9,32873
1,3,320,9.999787199995773865975,5,34239
3,1,96,9.999783996354582686592,10,35370
1,3,320,9.999766399994928639170,6,41536
3,1,96,9.999763196151728253350,11,42667
1,3,320,9.999745599994083412365,7,49060
3,1,96,9.999742395948873820108,12,50192
2,3,384,9.999742399996210538744,3,50720
3,2,96,9.999718395696243069458,13,51925
You can explicitly collect the x and y values for each group in two lists, then plot each group separately:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv('C:\\Users\\Hp\\Documents\\XYZ.csv')
fig, ax = plt.subplots()
x1, y1 = [], []
for i, group in df.groupby('Source'):
    #collecting all values in these lists
    x1.append(group['Time'].values.tolist())
    y1.append(group['PktCount'].values.tolist())
ax.set_title("PktCount Sent by nodes")
ax.set_ylabel("PktCount")
ax.set_xlabel("Time (milliSeconds)")
color_l = ['r', 'y', 'g', 'b']
i = 0
for a, b in zip(x1, y1):
    ax.bar(a, b, width = 400, color = color_l[i])
    i += 1
plt.legend(('1', '2', '3', '4'))
plt.show()
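A variation on the same idea is to draw the bars directly inside the groupby loop and pass a label to each ax.bar call, so the legend entries come from the data instead of a hard-coded tuple. This is a rough sketch: it reads the first ten rows of the CSV from an in-memory string rather than from the file path, and leaves the bar colours to Matplotlib's default cycle.

```python
import io
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# First ten rows of the CSV from the question, inlined so the sketch is self-contained.
csv_text = """Source,Destination,Bits,Energy,PktCount,Time
1,3,320,9.999983999999154773195,1,0
3,1,96,9.999979199797145566758,1,1082
3,4,320,9.999963199267886912408,2,1773
4,3,96,9.999974399702292006927,1,2842
1,3,320,9.999947199998309546390,2,7832
3,1,96,9.999937599065032479166,3,8965
3,4,320,9.999921598535773824816,4,10421
4,3,96,9.999948799404584013854,2,11822
2,3,384,9.999907199998736846248,1,13796
3,2,96,9.999892798283143074166,5,14990
"""
df = pd.read_csv(io.StringIO(csv_text))

fig, ax = plt.subplots()
for source, group in df.groupby("Source"):
    # One ax.bar call per source node; the label feeds the legend.
    ax.bar(group["Time"], group["PktCount"], width=400, label=str(source))

ax.set_title("PktCount Sent by nodes")
ax.set_ylabel("PktCount")
ax.set_xlabel("Time (milliSeconds)")
ax.legend(title="Source Nodes")
```

With labels attached to the bars, the legend stays correct even if a source node is missing from the file.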
I am plotting a series of boxplots on the same axes and want to add a legend to identify them.
Very simplified, my script looks like this:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df={}
bp={}
positions = [1,2,3,4]
df[0]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
df[1]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
colour=['red','blue']
fig, ax = plt.subplots()
for i in [0,1]:
    bp[i] = df[i].plot.box(ax=ax,
                           positions = positions,
                           color={'whiskers': colour[i],
                                  'caps': colour[i],
                                  'medians': colour[i],
                                  'boxes': colour[i]}
                           )
plt.legend([bp[i] for i in [0,1]], ['first plot', 'second plot'])
fig.show()
The plot is fine, but the legend is not drawn and I get this warning:
UserWarning: Legend does not support <matplotlib.axes._subplots.AxesSubplot object at 0x000000000A7830F0> instances.
A proxy artist may be used instead.
(I have had this warning before when adding a legend to a scatter plot, but the legend was still drawn, so I could ignore it.)
Here is a link to a description of proxy artists, but it is not clear how to apply this to my script. Any suggestions?
pandas plots return AxesSubplot objects, which cannot be used as legend handles. You must build your own legend using proxy artists instead. I have modified your code:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.patches as mpatches
df={}
bp={}
positions = [1,2,3,4]
df[0]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
df[1]= pd.DataFrame (np.random.rand(4,4),columns =['A','B','C','D'])
colour=['red','blue']
fig, ax = plt.subplots()
for i in [0,1]:
    bp[i] = df[i].plot.box(ax=ax,
                           positions = positions,
                           color={'whiskers': colour[i],
                                  'caps': colour[i],
                                  'medians': colour[i],
                                  'boxes': colour[i]}
                           )
red_patch = mpatches.Patch(color='red', label='The red data')
blue_patch = mpatches.Patch(color='blue', label='The blue data')
plt.legend(handles=[red_patch, blue_patch])
plt.show()
The legend is now drawn, with one red and one blue entry.
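As an alternative to proxy patches, the drawn box artists themselves can serve as legend handles. Note that this sketch swaps pandas' plot.box for Matplotlib's own ax.boxplot, which returns a dictionary of artists; the names frames and first_boxes are made up for the illustration and the random data is only a stand-in.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frames = [pd.DataFrame(rng.random((4, 4)), columns=list("ABCD")) for _ in range(2)]
colours = ["red", "blue"]
names = ["first plot", "second plot"]
positions = [1, 2, 3, 4]

fig, ax = plt.subplots()
first_boxes = []
for frame, colour in zip(frames, colours):
    style = dict(color=colour)
    # ax.boxplot draws one box per column and returns its artists in a dict.
    bp = ax.boxplot(frame.to_numpy(), positions=positions,
                    boxprops=style, whiskerprops=style,
                    capprops=style, medianprops=style)
    # Keep one box outline per data set to use as its legend handle.
    first_boxes.append(bp["boxes"][0])

ax.set_xticks(positions)
ax.set_xticklabels(list(frames[0].columns))
legend = ax.legend(first_boxes, names)
```

The legend entries then show the actual line style of the boxes rather than a filled patch.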
I have the following code:
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as plt
matplotlib.style.use('ggplot')
import seaborn as sns
sns.set(style="white")
# Create a dataset with many short random walks
rs = np.random.RandomState(4)
pos = rs.randint(-1, 2, (10, 5)).cumsum(axis=1)
pos -= pos[:, 0, np.newaxis]
step = np.tile(range(5), 10)
walk = np.repeat(range(10), 5)
df = pd.DataFrame(np.c_[pos.flat, step, walk],
columns=["position", "step", "walk"])
# Initialize a grid of plots with an Axes for each walk
# (`height` was named `size` before seaborn 0.9)
grid = sns.FacetGrid(df, col="walk", hue="walk", col_wrap=5, height=5,
                     aspect=1)
# Draw a bar plot to show the trajectory of each random walk
grid.map(sns.barplot, "step", "position", palette="Set3").add_legend();
grid.savefig("/Users/mymacmini/Desktop/test_fig.png")
#sns.plt.show()
Which makes this plot:
As you can see I get the legend wrong. How can I make it right?
Somehow there is one legend item for each subplot. It looks like if we want the legend to correspond to the bars in each subplot, we have to build it manually.
# Let's just make a 1-by-2 plot
df = df.head(10)
# Initialize a grid of plots with an Axes for each walk
# (`height` was named `size` before seaborn 0.9)
grid = sns.FacetGrid(df, col="walk", hue="walk", col_wrap=2, height=5,
                     aspect=1)
# Draw a bar plot to show the trajectory of each random walk
bp = grid.map(sns.barplot, "step", "position", palette="Set3")
# The color cycle is the same in every subplot, so it doesn't matter which axes we use
Ax = bp.axes[0]
# For a plot of 5 bars there are 6 Rectangle children; the last one is the
# Axes background patch, so drop it
Boxes = [item for item in Ax.get_children()
         if isinstance(item, matplotlib.patches.Rectangle)][:-1]
# There are no labels, so we need to define them
legend_labels = ['a', 'b', 'c', 'd', 'e']
# Create the legend patches
legend_patches = [matplotlib.patches.Patch(color=C, label=L) for
                  C, L in zip([item.get_facecolor() for item in Boxes],
                              legend_labels)]
# Plot the legend
plt.legend(handles=legend_patches)
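On the extra rectangle: the Axes background (ax.patch) is itself a Rectangle and is among the results of get_children(), which is why filtering by type gives one more item than there are bars. A possibly simpler route is to read the bars from ax.patches, which does not include the background. The sketch below illustrates this with plain Matplotlib bars instead of a seaborn FacetGrid, so treat the transfer to the grid as an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.patches as mpatches
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
colours = ["tab:blue", "tab:orange", "tab:green", "tab:red", "tab:purple"]
ax.bar(range(5), [1, 3, 2, 5, 4], color=colours)

# Filtering children by type also picks up the Axes background rectangle.
rects = [child for child in ax.get_children()
         if isinstance(child, mpatches.Rectangle)]

# ax.patches only holds patches added to the Axes, i.e. the five bars here.
bars = list(ax.patches)

legend_labels = ["a", "b", "c", "d", "e"]
# One proxy patch per bar, reusing the bar's face colour.
legend_patches = [mpatches.Patch(color=bar.get_facecolor(), label=label)
                  for bar, label in zip(bars, legend_labels)]
ax.legend(handles=legend_patches)
```

Using ax.patches avoids the [:-1] slice, which depends on the order in which get_children() happens to return things.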
When the legend doesn't work out you can always make your own easily like this:
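# import the submodules explicitly; a bare `import matplotlib` does not load them
import matplotlib.patches
import matplotlib.pyplot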
name_to_color = {
'Expected': 'green',
'Provided': 'red',
'Difference': 'blue',
}
patches = [matplotlib.patches.Patch(color=v, label=k) for k,v in name_to_color.items()]
matplotlib.pyplot.legend(handles=patches)