Make single legend for two subplots of DataFrame - python

I create a plot with two axes on different subplots. Currently one overlays the other. How can I make a single legend that contains both labels stacked together?
d = data.groupby('atemp_rounded').sum().reset_index()
fig = plt.figure()
ax1 = fig.add_subplot(111) # don't know what 111 stands for...
ax2 = ax1.twinx()
d.plot(ax=ax1, y='casual')
d.plot(ax=ax2, y='registered', color='g')
plt.show()

You can turn off the legends of the individual plots and create a single figure legend instead. To place it within the axes boundaries, specify its position in axes coordinates (bbox_transform=ax1.transAxes).
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({"A" : [3,2,1], "B" : [2,2,1]})
fig = plt.figure()
ax1 = fig.add_subplot(111) # 111 = 1 row, 1 column, first subplot
ax2 = ax1.twinx()
df.plot(ax=ax1, y='A', legend=False)
df.plot(ax=ax2, y='B', color='g', legend=False)
fig.legend(loc="upper right", bbox_to_anchor=(0,0,1,1), bbox_transform=ax1.transAxes)
plt.show()
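An alternative to the figure legend is to collect the handles and labels from both axes and pass them to a single axes legend. The following is a minimal sketch of that idea, not the answer's own method; the "Agg" backend line and the variable names h1, h2, l1, l2 and combined_labels are assumptions added so the example runs headless.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"A": [3, 2, 1], "B": [2, 2, 1]})

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
df.plot(ax=ax1, y="A", legend=False)
df.plot(ax=ax2, y="B", color="g", legend=False)

# Collect the line handles and their labels from both axes
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()

# Draw one combined legend on the twin axes, which sits on top
legend = ax2.legend(h1 + h2, l1 + l2, loc="upper right")
combined_labels = [text.get_text() for text in legend.get_texts()]

fig.canvas.draw()  # replace with plt.show() in an interactive session
```

Placing the legend on ax2 rather than ax1 keeps the lines of the twin axes from being drawn over it.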

Related

How to properly plot a line over bars?

This one used to work fine, but somehow it stopped working (I must have changed something mistakenly but I can't find the issue).
I'm plotting a set of 3 bars per date, plus a line that shows the accumulated value of one of them. But only one or another (either the bars or the line) is properly being plotted. If I left the code for the bars last, only the bars are plotted. If I left the code for the line last, only the line is plotted.
fig, ax = plt.subplots(figsize = (15,8))
df.groupby("date")["result"].sum().cumsum().plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[selected_columns].sum().plot(
ax=ax,
kind="bar",
color=["blue", "red", "gold"])
ax.legend(["LINE", "X", "Y", "Z"])
Appreciate the help!
Pandas draws bar plots with a categorical x-axis: the bars are internally positioned at 0, 1, 2, ... and the labels are set afterwards. The line plot uses the dates as its x-axis. To combine them, both need to be categorical. The easiest way is to drop the index from the line plot. Make sure the line plot is drawn first, so that the bar plot can set the labels correctly.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('20210101', periods=10),
'earnings': np.random.randint(100, 600, 10),
'costs': np.random.randint(0, 200, 10)})
df['result'] = df['earnings'] - df['costs']
fig, ax = plt.subplots(figsize=(15, 8))
df.groupby("date")["result"].sum().cumsum().reset_index(drop=True).plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[['earnings', 'costs', 'result']].sum().plot(
ax=ax,
kind="bar",
rot=0,
width=0.8,
color=["blue", "red", "gold"])
ax.legend(['Cumul.result', 'earnings', 'costs', 'result'])
# shorten the tick labels to only the date
ax.set_xticklabels([tick.get_text()[:10] for tick in ax.get_xticklabels()])
ax.set_ylim(ymin=0) # bar plots are nicer when bars start at zero
plt.tight_layout()
plt.show()
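To see why dropping the index makes the two plots line up, it can help to inspect the x positions directly. This is a small sketch with made-up data (the series s and the names bar_positions and line_positions are hypothetical); the "Agg" backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: three daily values
s = pd.Series([5, 7, 6], index=pd.date_range("2021-01-01", periods=3))

# After reset_index(drop=True) the line is plotted against 0, 1, 2
cumulative = s.cumsum().reset_index(drop=True)
line_positions = list(cumulative.index)

fig, ax = plt.subplots()
cumulative.plot(ax=ax, marker="D", color="purple")  # line first
s.plot(kind="bar", ax=ax, rot=0)                    # bars second, they set the labels

# The bar plot puts its ticks at the integer positions 0, 1, 2
bar_positions = [int(tick) for tick in ax.get_xticks()]
print(line_positions, bar_positions)

fig.canvas.draw()  # replace with plt.show() in an interactive session
```

Both lists contain the same integer positions, which is what lets the line sit on top of the bars.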
Here I post the solution:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
a=[11.3,222,22, 63.8,9]
b=[0.12,-1.0,1.82,16.67,6.67]
l=[i for i in range(5)]
plt.rcParams['font.sans-serif']=['SimHei']
fmt='%.1f%%'
yticks = mtick.FormatStrFormatter(fmt)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(l, b,'og-',label=u'A')
ax1.yaxis.set_major_formatter(yticks)
for i,(_x,_y) in enumerate(zip(l,b)):
    plt.text(_x,_y,b[i],color='black',fontsize=8,)
ax1.legend(loc=1)
ax1.set_ylim([-20, 30])
ax1.set_ylabel('ylabel')
plt.legend(prop={'family':'SimHei','size':8})
ax2 = ax1.twinx()
plt.bar(l,a,alpha=0.1,color='blue',label=u'label')
ax2.legend(loc=2)
plt.legend(prop={'family':'SimHei','size':8},loc="upper left")
plt.show()
The key to this is the command
ax2 = ax1.twinx()

Is there a way to replace a matplotlib subplot with a legend (rather than have the legend outside the subplots)?

I have a figure with 11 scatter plots as subplots. I would like the legend (same across all 11 subplots) to replace the 12th subplot. Is there a way to put the legend there and have it be the same size as the subplots?
[Image: Matplotlib scatter plot of 11 subplots]
Sort of a manual approach, but here it is:
You can "remove" an axis using ax.clear() and ax.set_axis_off(). Then you can create patches with specific colors and labels, and create a legend in the desired ax based on them.
Try this:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
# Create figure with subplots
fig, axes = plt.subplots(figsize=(16, 16), ncols=4, nrows=3, sharex=True, sharey=True)
# Plot some random data
for row in axes:
    for ax in row:
        ax.scatter(np.random.random(5), np.random.random(5), color='green')
        ax.scatter(np.random.random(2), np.random.random(2), color='red')
        ax.scatter(np.random.random(3), np.random.random(3), color='orange')
        ax.set_title('some title')
# Clear bottom-right ax
bottom_right_ax = axes[-1][-1]
bottom_right_ax.clear() # clears the random data I plotted previously
bottom_right_ax.set_axis_off() # removes the XY axes
# Manually create legend handles (patches)
red_patch = mpatches.Patch(color='red', label='Red data')
green_patch = mpatches.Patch(color='green', label='Green data')
orange_patch = mpatches.Patch(color='orange', label='Orange data')
# Add legend to bottom-right ax
bottom_right_ax.legend(handles=[red_patch, green_patch, orange_patch], loc='center')
# Show figure
plt.show()
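If the scatter calls carry labels, the handles do not have to be built by hand: they can be taken from one of the populated subplots. The sketch below is a variation on the code above under that assumption; the label names, the colors dict and legend_ax are illustrative, and the "Agg" backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axes = plt.subplots(nrows=3, ncols=4, figsize=(16, 12),
                         sharex=True, sharey=True)

# Hypothetical label -> color mapping shared by all subplots
colors = {"Red data": "red", "Green data": "green", "Orange data": "orange"}

# Plot only on the first 11 subplots, giving every scatter a label
for ax in axes.flat[:-1]:
    for label, color in colors.items():
        ax.scatter(rng.random(4), rng.random(4), color=color, label=label)

# The 12th subplot holds only the legend
legend_ax = axes.flat[-1]
legend_ax.set_axis_off()

# Reuse the handles of the first subplot instead of creating patches
handles, labels = axes.flat[0].get_legend_handles_labels()
legend = legend_ax.legend(handles, labels, loc="center")
legend_labels = [text.get_text() for text in legend.get_texts()]

fig.canvas.draw()  # replace with plt.show() in an interactive session
```

This way the legend markers match the scatter markers, whereas patches always show up as colored rectangles.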

Why is Seaborn plotting two legends, how do I remove one and fix the other?

When I run the code shown below I get a figure containing 2 legends. I can't figure out why two are being plotted and I haven't been able to remove one of them. My aim is to keep the legend that is outside of the figure, remove the one that's inside the figure and also somehow stop the weird cropping that is cutting off the right side of the legend outside the figure.
I had a previous question asking something similar, but that issue was solved by using seaborn's scatterplot instead of the relplot. Sadly neither of the answers that worked in that question work here. If this problem is arising out of an "unconventional" way of plotting the type of figure I'm trying to make, then please let me know. Doing it properly is better than hacking your way to the solution...
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
#setup
sns.set(font_scale=2)
sns.set_context('poster')
#figure and axes
fig = plt.figure(figsize=(20,20))
axs = {i:fig.add_subplot(330+i) for i in range(1,10)}
#create random data
r = np.random.randint
N=10
df = pd.DataFrame(columns=['No.','x1','x2','x3','y1','y2','y3'])
for i in range(N):
    df.loc[i] = i+1,r(50,high=100),r(50,high=100),r(50,high=100),r(50,high=100),r(50,high=100),r(50,high=100)
#create axes labels
x_labels = ['x1','x2','x3']
y_labels = ['y1','y2','y3']
xy_labels = [(x,y) for y in y_labels for x in x_labels ]
#plot on axes
for i,(x_label,y_label) in enumerate(xy_labels):
    if i ==0:#if statement so only one of the plots has legend='full'
        a = sns.scatterplot(
            data=df,
            x=x_label,
            y=y_label,
            legend='full', #create the legend
            ax=axs[i+1],
            hue='No.',
            palette=sns.color_palette("hls", N)
        )
        fig.legend(bbox_to_anchor=(1, 0.7), loc=2, borderaxespad=0.) #Move the legend outside the plot
        a.legend_.remove() #attempt to remove the legend
    else:
        a = sns.scatterplot(
            data=df,
            x=x_label,
            y=y_label,
            legend=False,
            ax=axs[i+1],
            hue='No.',
            palette=sns.color_palette("hls", N)
        )
    #remove axes labels from specific plots
    if i not in [0,3,6]: axs[i+1].set_ylabel('')
    if i not in [6,7,8]: axs[i+1].set_xlabel('')
#add line plots and set limits
for ax in axs.values():
    sns.lineplot(x=range(50,100),y=range(50,100), ax=ax, linestyle='-')
    ax.set_xlim([50,100])
    ax.set_ylim([50,100])
fig.tight_layout()
You can add legend=False to the sns.lineplot call in the last part of your code.
#setup
sns.set(font_scale=2)
sns.set_context('poster')
#figure and axes
fig = plt.figure(figsize=(20,20))
axs = {i:fig.add_subplot(330+i) for i in range(1,10)}
#create axes labels
x_labels = ['x1','x2','x3']
y_labels = ['y1','y2','y3']
xy_labels = [(x,y) for y in y_labels for x in x_labels ]
#plot on axes
for i,(x_label,y_label) in enumerate(xy_labels):
    if i ==0:#if statement so only one of the plots has legend='full'
        a = sns.scatterplot(
            data=df,
            x=x_label,
            y=y_label,
            legend='full', #create the legend
            ax=axs[i+1],
            hue='No.',
            palette=sns.color_palette("hls", N)
        )
        fig.legend(bbox_to_anchor=(1, 0.7), loc=2, borderaxespad=0.) #Move the legend outside the plot
        a.legend_.remove() #attempt to remove the legend
    else:
        a = sns.scatterplot(
            data=df,
            x=x_label,
            y=y_label,
            legend=False,
            ax=axs[i+1],
            hue='No.',
            palette=sns.color_palette("hls", N)
        )
    #remove axes labels from specific plots
    if i not in [0,3,6]: axs[i+1].set_ylabel('')
    if i not in [6,7,8]: axs[i+1].set_xlabel('')
#add line plots and set limits
for ax in axs.values():
    sns.lineplot(x=range(50,100),y=range(50,100), ax=ax, linestyle='-', legend=False)
    ax.set_xlim([50,100])
    ax.set_ylim([50,100])
fig.tight_layout()
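For the cropping part of the question, one option is to save the figure with bbox_inches="tight", which enlarges the saved area to include a legend placed outside the axes. The sketch below uses plain Matplotlib scatter calls instead of seaborn to stay self-contained (seaborn's axes-level functions draw onto the same kind of axes); the random data, the group labels and the in-memory buffer are assumptions for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(9, 9))

# Hypothetical data: three labelled groups on every subplot
for ax in axs.flat:
    for group in ("1", "2", "3"):
        ax.scatter(rng.uniform(50, 100, 5), rng.uniform(50, 100, 5), label=group)

# One figure-level legend built from the handles of the first subplot
handles, labels = axs.flat[0].get_legend_handles_labels()
fig.legend(handles, labels, title="No.", loc="upper left",
           bbox_to_anchor=(1, 0.7), borderaxespad=0.0)
fig.tight_layout()

# bbox_inches="tight" expands the saved area so the outside legend is kept
buffer = io.BytesIO()
fig.savefig(buffer, format="png", bbox_inches="tight")
png_size = buffer.getbuffer().nbytes
```

In a script you would pass a file name to savefig instead of the buffer. No axes-level legend is created here, so there is nothing to remove afterwards.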

Adjust y-axis in Seaborn multiplot

I'm plotting a CSV file from my simulation results. The plot has three graphs in the same figure fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6)).
However, for comparison purposes I want the y-axis in all graphs to start at zero and end at a specific value. I tried the solution mentioned here from the Seaborn author. I don't get any errors, but the solution does not work for me either.
Here's my script:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
fname = 'results/filename.csv'
def plot_file():
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6))
    df = pd.read_csv(fname, sep='\t')
    profits = \
        df.groupby(['providerId', 'periods'], as_index=False)['profits'].sum()
    # y-axis needs to start at zero and end at 10
    g = sns.lineplot(x='periods',
                     y='profits',
                     data=profits,
                     hue='providerId',
                     legend='full',
                     ax=axes[0])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='price',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[1])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='quality',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[2])
    g.set(ylim=(0, None))
    plt.show()
    print(g) # -> AxesSubplot(0.672059,0.11;0.227941x0.77)
The resulting figure is as follows:
How can I adjust each individual plot?
Based on the way you've written your code, you can refer to each subplot's axes with g.axes and call g.axes.set_ylim(low, high). (A difference compared to the linked answer is that your graphs are not being plotted on a seaborn FacetGrid.)
An example using dummy data and different axis ranges to illustrate:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.uniform(0,10,(100,2)), columns=['a','b'])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(8,4))
g = sns.lineplot(x='a',
y='b',
data=df.sample(10),
ax=axes[0])
g.axes.set_ylim(0,25)
g = sns.scatterplot(x='a',
y='b',
data=df.sample(10),
ax=axes[1])
g.axes.set_ylim(0,3.5)
g = sns.scatterplot(x='a',
y='b',
data=df.sample(10),
ax=axes[2])
g.axes.set_ylim(0,0.3)
plt.tight_layout()
plt.show()
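Since every subplot is an ordinary Matplotlib axes, the limits can also be set directly on the entries of the axes array, without keeping the return value g. A minimal sketch, assuming the ranges from the question's comments (0 to 10 for the first plot, 0 to 1 for the other two); the random data and the names limits and ylims are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = np.arange(20)

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(12, 4))

# One (low, high) pair per subplot, taken from the question's comments
limits = [(0, 10), (0, 1), (0, 1)]

for ax, (low, high) in zip(axes, limits):
    ax.plot(x, rng.uniform(low, high, size=x.size))  # stand-in for the seaborn call
    ax.set_ylim(low, high)                           # fix the y-range of this subplot

ylims = [ax.get_ylim() for ax in axes]
print(ylims)

fig.tight_layout()
fig.canvas.draw()  # replace with plt.show() in an interactive session
```

The same loop works when the plotting line is a seaborn call with ax=ax, because seaborn draws on the axes it is given.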

How can I make a barplot and a lineplot in the same seaborn plot with different Y axes nicely?

I have two different sets of data with a common index, and I want to represent the first one as a barplot and the second one as a lineplot in the same graph. My current approach is similar to the following.
ax = pt.a.plot(alpha = .75, kind = 'bar')
ax2 = ax.twinx()
ax2.plot(ax.get_xticks(), pt.b.values, alpha = .75, color = 'r')
And the result is similar to this
This image is really nice and almost right. My only problem is that ax.twinx() seems to create a new canvas on top of the previous one, and the white lines are clearly seen on top of the barplot.
Is there any way to plot this without including the white lines?
You can use the twinx() method together with seaborn to create a separate y-axis, one for the line plot and the other for the bar plot. To control the style of the plot (seaborn's default style is darkgrid), use the set_style method and specify the preferred theme, for example white or whitegrid. In the code below, matplotlib.rc_file_defaults() restores the settings from Matplotlib's rc file (by default a white background without grid lines), and calling set_style with style=None keeps those settings as they are. If you want to customize the grid lines further, you can do it at the axes level with ax2.grid(False).
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
matplotlib.rc_file_defaults()
sns.set_style(style=None, rc=None)  # returns None, so there is nothing to assign
fig, ax1 = plt.subplots(figsize=(12,6))
sns.lineplot(data = df['y_var_1'], marker='o', sort = False, ax=ax1)
ax2 = ax1.twinx()
sns.barplot(data = df, x='x_var', y='y_var_2', alpha=0.5, ax=ax2)
You have to remove the grid lines of the second axis: add ax2.grid(False) to the code. However, the y-ticks of the second axis will not be aligned with the y-ticks of the first y-axis, as here:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(pd.Series(np.random.uniform(0,1,size=10)), color='g')
ax2 = ax1.twinx()
ax2.plot(pd.Series(np.random.uniform(0,17,size=10)), color='r')
ax2.grid(False)
plt.show()
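If the misaligned ticks bother you, one possible workaround is to give both y-axes the same number of evenly spaced ticks with matplotlib.ticker.LinearLocator. This is only a sketch: the limits (0 to 1 and 0 to 20) and the tick count n_ticks are assumptions chosen to give round tick values.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove for interactive use
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import LinearLocator

rng = np.random.default_rng(0)

fig, ax1 = plt.subplots()
ax1.plot(rng.uniform(0, 1, size=10), color="g")
ax2 = ax1.twinx()
ax2.plot(rng.uniform(0, 17, size=10), color="r")
ax2.grid(False)

# Fix the limits first, then spread the same number of ticks over each range
ax1.set_ylim(0, 1)
ax2.set_ylim(0, 20)
n_ticks = 6
ax1.yaxis.set_major_locator(LinearLocator(numticks=n_ticks))
ax2.yaxis.set_major_locator(LinearLocator(numticks=n_ticks))

ticks1 = ax1.get_yticks()
ticks2 = ax2.get_yticks()

fig.canvas.draw()  # replace with plt.show() in an interactive session
```

Both axes then have ticks at the same heights, at the cost of tick values that are only round when the limits divide evenly by n_ticks - 1.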
