I'm plotting on the same axes with both pandas (pd) and matplotlib.pyplot (plt). I don't want the pandas plot to appear in the legend, but I still need the legend entry for the plt plot. Is there a way to remove the pandas plot's legend entry? (legend=False doesn't work.)
import pandas as pd
import matplotlib.pyplot as plt
xs = [i for i in range(1, 11)]
ys = [i for i in range(1, 11)]
df = pd.DataFrame(list(zip(xs, ys)), columns=['xs', 'ys'])
fig, ax = plt.subplots()
# plot pd data-frame, I don't want this to show legend
df.plot(x='xs', y='ys', ax=ax, kind='line', legend=False)
# these don't work
ax.legend([])
ax.get_legend().remove()
ax.legend().set_visible(False)
# plot by plt, I only want this to show legend
ax.plot(xs, ys, label='I only need this label to be shown')
ax.legend()
plt.show() # still showing both legends
Note: I'd prefer not to change the plotting order (plotting with plt first and then with pd would show only the plt legend, but the plt line would then be hidden behind the pd line), and I'd prefer not to use plt to plot the DataFrame's data.
You can remove the 1st set of lines and labels from the legend:
fig, ax = plt.subplots()
df.plot(x='xs', y='ys', ax=ax, kind='line', label='Something')
ax.plot(xs, ys, label='I only need this label to be shown')
# Legend except 1st lines/labels
lines, labels = ax.get_legend_handles_labels()
ax.legend(lines[1:], labels[1:])
plt.show()
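Slicing by position works here, but it depends on the plotting order. A minimal sketch of the same idea that filters by label text instead; the helper name `legend_without` is hypothetical (it is not part of matplotlib or pandas), and the "Agg" backend line is only there so the snippet runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd


def legend_without(ax, hidden_labels):
    """Hypothetical helper: rebuild the legend of `ax`, skipping the given labels."""
    handles, labels = ax.get_legend_handles_labels()
    kept = [(h, l) for h, l in zip(handles, labels) if l not in hidden_labels]
    # Passing explicit handles and labels overrides the automatic selection.
    return ax.legend([h for h, _ in kept], [l for _, l in kept])


xs = list(range(1, 11))
ys = list(range(1, 11))
df = pd.DataFrame({"xs": xs, "ys": ys})

fig, ax = plt.subplots()
df.plot(x="xs", y="ys", ax=ax, kind="line", label="Something")
ax.plot(xs, ys, label="I only need this label to be shown")
legend = legend_without(ax, hidden_labels={"Something"})
```

Filtering by label keeps working if more lines are added later or the plotting order changes.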
You can use matplotlib to plot DataFrame data (and other data from other sources) on the same plot without using df.plot(). Do you need to use df.plot(), or would this be okay?
import pandas as pd
import matplotlib.pyplot as plt
xs = [i for i in range(1, 11)]
ys = [i for i in range(1, 11)]
df = pd.DataFrame(list(zip(xs, ys)), columns=['xs', 'ys'])
fig, ax = plt.subplots()
#just keep using mpl but reference the data in the dataframe, basically what df.plot() does
ax.plot(df['xs'], df['ys'])
ax.plot(xs, ys, label='I only need this label to be shown')
ax.legend()
plt.show()
If you do insist on using df.plot(), you can still take advantage of the underscore trick, as described in the documentation:
Specific lines can be excluded from the automatic legend element selection by defining a label starting with an underscore.
import pandas as pd
import matplotlib.pyplot as plt
xs = [i for i in range(1, 11)]
ys = [i for i in range(1, 11)]
df = pd.DataFrame(list(zip(xs, ys)), columns=['xs', 'ys'])
fig, ax = plt.subplots()
# plot pd data-frame, I don't want this to show legend
df.plot(x='xs', y='ys', ax=ax, kind='line', label='_hidden')
# plot by plt, I only want this to show legend
ax.plot(xs, ys, label='I only need this label to be shown')
ax.legend()
plt.show() # only the plt line appears in the legend
This will yield the same result as above, but I get a warning (UserWarning: The handle <matplotlib.lines.Line2D object at 0x00000283F0FFDB38> has a label of '_hidden' which cannot be automatically added to the legend.). This feels messier and more hacky, so I prefer the first option.
Use label='_nolegend_' as recommended here. This worked for me:
import pandas as pd
import matplotlib.pyplot as plt
xs = [i for i in range(1, 11)]
ys = [i for i in range(1, 11)]
df = pd.DataFrame(list(zip(xs, ys)), columns=['xs', 'ys'])
fig, ax = plt.subplots()
# plot pd data-frame, I don't want this to show legend
df.plot(x='xs', y='ys', ax=ax, kind='line', label='_nolegend_')
# plot by plt, I only want this to show legend
ax.plot(xs, ys, label='I only need this label to be shown')
ax.legend()
plt.show() # now showing one legend
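The same underscore rule can also be applied after the pandas call, without passing a special label into df.plot(). A sketch, assuming the DataFrame line is the first (and so far only) line on the axes, which holds here because it is plotted first on a fresh axes:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

xs = list(range(1, 11))
ys = list(range(1, 11))
df = pd.DataFrame({"xs": xs, "ys": ys})

fig, ax = plt.subplots()
df.plot(x="xs", y="ys", ax=ax, kind="line", legend=False)

# Assumption: the DataFrame line is the only line on the axes at this point.
pandas_line = ax.lines[0]
# Labels starting with an underscore are skipped by the automatic legend.
pandas_line.set_label("_nolegend_")

ax.plot(xs, ys, label="I only need this label to be shown")
legend = ax.legend()
```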
This is the output of my code.
As you can see, both legends ('pl' and 'ppl') overlap at the top right. How do I move one of them to the top left?
I searched for an answer and tried using "loc" to fix the issue, but I keep getting an error. Can someone help, please?
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax1.set_xlabel('Date')
ax1.set_ylabel('percent change / 100')
dd = pd.DataFrame(np.random.randint(1,10,(30,2)),columns=['pl','ppl'])
dd['pl'].plot(ax=ax1,legend=True)
dd['ppl'].plot(ax=ax2, style=['g--', 'b--', 'r--'],legend=True)
ax2.set_ylabel('difference')
plt.show()
Perhaps plot directly with matplotlib instead of using DataFrame.plot:
ax1.plot(dd['pl'], label='pl')
ax1.legend(loc='upper left')
ax2.plot(dd['ppl'], ls='--', c='g', label='ppl')
ax2.legend(loc='upper right')
I think you need to call legend on plot and position the legend accordingly. Please see below.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax1.set_xlabel('Date')
ax1.set_ylabel('percent change / 100')
dd = pd.DataFrame(np.random.randint(1,10,(30,2)),columns=['pl','ppl'])
dd['pl'].plot(ax=ax1, legend=True).legend(loc='center left',bbox_to_anchor=(1.0, 0.5))
dd['ppl'].plot(ax=ax2, style=['g--', 'b--', 'r--'],legend=True).legend(loc='upper right')
You can create the legend in several ways:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax2 = ax1.twinx()
ax1.set_xlabel("Date")
ax1.set_ylabel("percent change / 100")
dd = pd.DataFrame(np.random.randint(1, 10, (30, 2)), columns=["pl", "ppl"])
dd["pl"].plot(ax=ax1)
dd["ppl"].plot(ax=ax2, style=["g--", "b--", "r--"])
# # two separate legends
# ax1.legend()
# ax2.legend(loc="upper left")
# # a single legend for the whole fig
# fig.legend(loc="upper right")
# # a single legend for the axis
# get the lines in the axis
lines1 = ax1.lines
lines2 = ax2.lines
all_lines = lines1 + lines2
# get the label for each line
all_labels = [lin.get_label() for lin in all_lines]
# place the legend
ax1.legend(all_lines, all_labels, loc="upper left")
ax2.set_ylabel("difference")
plt.show()
The last one I left uncommented creates a single legend inside the ax, with both lines listed.
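If the axes hold more than plain lines (bars, for example), `ax.lines` will not include them. A sketch of the same single-legend idea using `get_legend_handles_labels()` on both axes; the random data and the choice of a bar plot on the second axes are assumptions made for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dd = pd.DataFrame(rng.integers(1, 10, (30, 2)), columns=["pl", "ppl"])

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
ax1.plot(dd.index, dd["pl"], label="pl")
ax2.bar(dd.index, dd["ppl"], alpha=0.3, color="g", label="ppl")

# Collect the labelled artists from each axes, whatever their type.
handles1, labels1 = ax1.get_legend_handles_labels()
handles2, labels2 = ax2.get_legend_handles_labels()

# Attach the combined legend to the twin axes, which is drawn on top of ax1.
legend = ax2.legend(handles1 + handles2, labels1 + labels2, loc="upper left")
```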
Cheers!
This one used to work fine, but somehow it stopped working (I must have changed something mistakenly but I can't find the issue).
I'm plotting a set of 3 bars per date, plus a line that shows the accumulated value of one of them. But only one of the two (either the bars or the line) is plotted properly. If I put the code for the bars last, only the bars are plotted. If I put the code for the line last, only the line is plotted.
fig, ax = plt.subplots(figsize = (15,8))
df.groupby("date")["result"].sum().cumsum().plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[selected_columns].sum().plot(
ax=ax,
kind="bar",
color=["blue", "red", "gold"])
ax.legend(["LINE", "X", "Y", "Z"])
Appreciate the help!
Pandas draws bar plots with a categorical x-axis: the bars are placed internally at positions 0, 1, 2, ... and the tick labels are set afterwards. The line plot, by contrast, uses the dates as its x-axis. To combine them, both need to use the categorical positions. The easiest way is to drop the index from the line plot. Make sure that the line plot is drawn first, so that the bar plot can set the tick labels correctly.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('20210101', periods=10),
'earnings': np.random.randint(100, 600, 10),
'costs': np.random.randint(0, 200, 10)})
df['result'] = df['earnings'] - df['costs']
fig, ax = plt.subplots(figsize=(15, 8))
df.groupby("date")["result"].sum().cumsum().reset_index(drop=True).plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[['earnings', 'costs', 'result']].sum().plot(
ax=ax,
kind="bar",
rot=0,
width=0.8,
color=["blue", "red", "gold"])
ax.legend(['Cumul.result', 'earnings', 'costs', 'result'])
# shorten the tick labels to only the date
ax.set_xticklabels([tick.get_text()[:10] for tick in ax.get_xticklabels()])
ax.set_ylim(ymin=0) # bar plots are nicer when bars start at zero
plt.tight_layout()
plt.show()
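To see the positional x-axis directly, the line can also be drawn with matplotlib at explicit integer positions. A sketch with made-up data; the column names `earnings` and `costs` follow the example above, and the variable names `daily`, `cumulative` and `positions` are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame({"date": pd.date_range("20210101", periods=5),
                   "earnings": rng.integers(100, 600, 5),
                   "costs": rng.integers(0, 200, 5)})
daily = df.groupby("date")[["earnings", "costs"]].sum()
cumulative = (daily["earnings"] - daily["costs"]).cumsum()

fig, ax = plt.subplots()
# pandas places the bars for row i around x = i, not at the date itself.
daily.plot(ax=ax, kind="bar", rot=0, color=["blue", "red"])

# Draw the line at the same integer positions 0, 1, 2, ...
positions = np.arange(len(cumulative))
(line,) = ax.plot(positions, cumulative.to_numpy(), marker="D",
                  color="purple", label="cumulative result")
ax.legend()
```

Because the line is placed explicitly, the order of the two plotting calls no longer affects the tick labels.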
Here I post the solution:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
a=[11.3,222,22, 63.8,9]
b=[0.12,-1.0,1.82,16.67,6.67]
l=[i for i in range(5)]
plt.rcParams['font.sans-serif']=['SimHei']
fmt='%.1f%%'
yticks = mtick.FormatStrFormatter(fmt)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(l, b,'og-',label=u'A')
ax1.yaxis.set_major_formatter(yticks)
for i,(_x,_y) in enumerate(zip(l,b)):
    plt.text(_x,_y,b[i],color='black',fontsize=8,)
ax1.legend(loc=1)
ax1.set_ylim([-20, 30])
ax1.set_ylabel('ylabel')
plt.legend(prop={'family':'SimHei','size':8})
ax2 = ax1.twinx()
plt.bar(l,a,alpha=0.1,color='blue',label=u'label')
ax2.legend(loc=2)
plt.legend(prop={'family':'SimHei','size':8},loc="upper left")
plt.show()
The key to this is the command
ax2 = ax1.twinx()
I have a figure with 11 scatter plots as subplots. I would like the legend (same across all 11 subplots) to replace the 12th subplot. Is there a way to put the legend there and have it be the same size as the subplots?
Matplotlib scatter plot of 11 subplots
Sort of a manual approach, but here it is:
You can "remove" an axis using ax.clear() and ax.set_axis_off(). Then you can create patches with specific colors and labels, and create a legend in the desired ax based on them.
Try this:
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np
# Create figure with subplots
fig, axes = plt.subplots(figsize=(16, 16), ncols=4, nrows=3, sharex=True, sharey=True)
# Plot some random data
for row in axes:
    for ax in row:
        ax.scatter(np.random.random(5), np.random.random(5), color='green')
        ax.scatter(np.random.random(2), np.random.random(2), color='red')
        ax.scatter(np.random.random(3), np.random.random(3), color='orange')
        ax.set_title('some title')
# Clear bottom-right ax
bottom_right_ax = axes[-1][-1]
bottom_right_ax.clear() # clears the random data I plotted previously
bottom_right_ax.set_axis_off() # removes the XY axes
# Manually create legend handles (patches)
red_patch = mpatches.Patch(color='red', label='Red data')
green_patch = mpatches.Patch(color='green', label='Green data')
orange_patch = mpatches.Patch(color='orange', label='Orange data')
# Add legend to bottom-right ax
bottom_right_ax.legend(handles=[red_patch, green_patch, orange_patch], loc='center')
# Show figure
plt.show()
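Instead of building the patches by hand, the handles can also be taken from one of the populated axes. A sketch, assuming each scatter call is given a `label` (the labels and random data below are placeholders):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axes = plt.subplots(ncols=4, nrows=3, figsize=(16, 12),
                         sharex=True, sharey=True)
legend_ax = axes[-1, -1]  # the 12th subplot is reserved for the legend

for ax in axes.flat:
    if ax is legend_ax:
        continue
    ax.scatter(rng.random(5), rng.random(5), color="green", label="Green data")
    ax.scatter(rng.random(2), rng.random(2), color="red", label="Red data")

# Reuse the handles of the first subplot; every subplot has the same series.
handles, labels = axes[0, 0].get_legend_handles_labels()
legend_ax.set_axis_off()
legend = legend_ax.legend(handles, labels, loc="center")
```

This way the legend markers match the scatter markers, and nothing has to be cleared from the last subplot.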
Hello, how can I make a figure with scatter subplots using pandas? It works with plot, but not with scatter.
Here is an example:
import numpy as np
import pandas as pd
matrix = np.random.rand(200,5)
df = pd.DataFrame(matrix,columns=['index','A','B','C','D'])
#single plot, working with
df.plot(
kind='scatter',
x='index',
y='A',
s= 0.5
)
# not working
df.plot(
subplots=True,
kind='scatter',
x='index',
y=['A','B','C'],
s= 0.5
)
Error
raise ValueError(self._kind + " requires an x and y column")
ValueError: scatter requires an x and y column
Edit:
Solution for making a figure with subplots using df.plot
(Thanks to @Fourier)
import numpy as np
import pandas as pd
matrix = np.random.rand(200,5)#random data
df = pd.DataFrame(matrix,columns=['index','A','B','C','D']) #make df
#get a list for subplots
labels = list(df.columns)
labels.remove('index')
df.plot(
layout=(-1, 5),
kind="line",
x='index',
y=labels,
subplots = True,
sharex = True,
ls="none",
marker="o")
Would this work for you:
import pandas as pd
import numpy as np
df = pd.DataFrame({"index":np.arange(5),"A":np.random.rand(5),"B":np.random.rand(5),"C":np.random.rand(5)})
df.plot(kind="line", x="index", y=["A","B","C"], subplots=True, sharex=True, ls="none", marker="o")
Note: This uses a line plot with invisible lines. For a scatter, I would go and loop over it.
for column in df.columns.drop("index"): # skip the "index" column, which is used as x
    df.plot(kind="scatter", x="index", y=column)
EDIT
In order to add custom ylabels you can do the following:
axes = df.plot(kind='line', x="index", y=["A","B","C"], subplots=True, sharex=True, ls="none", marker="o", legend=False)
ylabels = ["foo","bar","baz"]
for ax, label in zip(axes, ylabels):
    ax.set_ylabel(label)
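The scatter loop mentioned above opens a new figure for every column, because each df.plot call without `ax` creates its own axes. For true scatter subplots in a single figure, one option is to create the axes with matplotlib and pass each one to df.plot. A sketch with random data; the grid shape and figure size are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(200, 4), columns=["index", "A", "B", "C"])
y_columns = ["A", "B", "C"]

# One row per column to plot, all sharing the x-axis.
fig, axes = plt.subplots(nrows=len(y_columns), sharex=True, figsize=(6, 8))
for ax, column in zip(axes, y_columns):
    # kind="scatter" takes exactly one y column, so draw one per axes.
    df.plot(kind="scatter", x="index", y=column, s=0.5, ax=ax)
```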
My code is inside a Jupyter Notebook.
I can create a chart using Method 1 below, and have it look exactly as I'd like it to look.
But when I try with Method 2, which uses subplot, I don't know how to make it look the same (setting the figsize, colors, legend off to the right).
How do I use subplot, and have it look the same as Method 1?
Thank you in advance for your help!
# Using Numpy and Pandas
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.style as style
df = pd.DataFrame(np.random.randint(0,100,size=(4, 4)), columns=list('ABCD'))
style.use('fivethirtyeight')
# Colorblind-friendly colors
colors = [[0,0,0], [230/255,159/255,0], [86/255,180/255,233/255], [0,158/255,115/255]]
# Method 1
chart = df.plot(figsize = (10,5), color = colors)
chart.yaxis.label.set_visible(True)
chart.set_ylabel("Bitcoin Price")
chart.set_xlabel("Time")
chart.legend(bbox_to_anchor=(1.05, 1), loc=2)
plt.show()
# Method 2
fig, ax = plt.subplots()
ax.plot(df)
ax.set_ylabel("Bitcoin Price")
ax.set_xlabel("Time")
plt.show()
You just replace chart with ax, like this:
ax.yaxis.label.set_visible(True)
ax.set_ylabel("Bitcoin Price")
ax.set_xlabel("Time")
ax.legend(bbox_to_anchor=(1.05, 1), loc=2)
I can think of two ways to get a result that might be useful for you. pd.DataFrame.plot returns an Axes object on which you can call any Axes method, so both examples just replace chart with ax.
Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.style as style
df = pd.DataFrame(np.random.randint(0,100,size=(4, 4)), columns=list('ABCD'))
style.use('fivethirtyeight')
# Colorblind-friendly colors
colors = [[0,0,0], [230/255,159/255,0], [86/255,180/255,233/255], [0,158/255,115/255]]
Iterating over df
colors_gen = (x for x in colors) # we will also be iterating over the colors
fig, ax = plt.subplots(figsize = (10,5))
for i in df: # iterate over columns...
    ax.plot(df[i], color=next(colors_gen), label=i) # and plot one at a time; the label feeds ax.legend()
ax.set_ylabel("Bitcoin Price")
ax.set_xlabel("Time")
ax.legend(bbox_to_anchor=(1.05, 1), loc=2)
ax.yaxis.label.set_visible(True)
plt.show()
Use pd.DataFrame.plot but pass ax as an argument
fig, ax = plt.subplots(figsize = (10,5))
df.plot(color=colors, ax=ax)
ax.set_ylabel("Bitcoin Price")
ax.set_xlabel("Time")
ax.legend(bbox_to_anchor=(1.05, 1), loc=2)
ax.yaxis.label.set_visible(True)
plt.show()