Legend not showing with barless histogram plot in python - python

I am trying to plot a kde plot in seaborn using the histplot function, and removing later the bars of the histogram in the following way (see last part of the accepted answer here):
fig, ax = plt.subplots()
sns.histplot(data, kde=True, binwidth=5, stat="probability", label='data1', kde_kws={'cut': 3})
The reason for using histplot instead of kdeplot is that I need to set a specific binwidth. The problem is that I cannot get the legend to show, meaning that
ax.legend(loc='best')
does nothing, and I receive the following message: No handles with labels found to put in legend.
I have also tried with
handles, labels = ax.get_legend_handles_labels()
plt.legend(handles, labels, loc='best')
but without results. Does anybody have an idea of what is going on here? Thanks in advance!

You can add the label for the kde line via the line_kws={'label': ...} parameter.
sns.kdeplot can't be used directly, because currently the only option is the default scaling (density).
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
data = np.random.normal(0.01, 0.1, size=10000).cumsum()
ax = sns.histplot(data, kde=True, binwidth=5, stat="probability", label='data1',
kde_kws={'cut': 3}, line_kws={'label': 'kde scaled to probability'})
ax.containers[0].remove() # remove the bars of the histogram
ax.legend()
plt.show()
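One way to see why the original call ends in "No handles with labels found" is to look at what a legend collects: only artists that carry a label. The sketch below uses plain matplotlib (no seaborn) as a stand-in, so the data, the bin count and the label text are illustrative assumptions, and the unlabeled line only plays the role of the kde curve.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=500)

fig, ax = plt.subplots()
# The label is attached to the bars, similar to the label passed to histplot above.
_, edges, bars = ax.hist(data, bins=20, label="data1")
# Stand-in for the kde curve: a line created without any label.
(line,) = ax.plot(edges, np.zeros_like(edges))

bars.remove()  # removing the bars also removes the only labeled artist
_, labels_before = ax.get_legend_handles_labels()

line.set_label("kde (stand-in)")  # comparable to passing line_kws={'label': ...}
_, labels_after = ax.get_legend_handles_labels()
ax.legend()
```

After the bars are removed there is nothing left for the legend to pick up until the line itself gets a label, which is what line_kws does in the answer above.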

Related

set custom tick labels on heatmap color bar

I have a list of dataframes named merged_dfs that I am looping through to get the correlation and plot subplots of heatmap correlation matrix using seaborn.
I want to customize the colorbar tick labels, but I am having trouble figuring out how to do it with my example.
Currently, my colorbar scale values from top to bottom are
[1,0.5,0,-0.5,-1]
I want to keep these values, but change the tick labels to be
[1,0.5,0,0.5,1]
for my diverging color bar.
Here is the code and my attempt:
fig, ax = plt.subplots(nrows=6, ncols=2, figsize=(20,20))
for i, (title,merging) in enumerate (zip(new_name_data,merged_dfs)):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    a = sns.heatmap(graph.abs(), cmap=colormap, vmin=-1,vmax=1,center=0,annot = graph, ax=ax.flat[i])
    cbar = fig.colorbar(a)
    cbar.set_ticklabels(["1","0.5","0","0.5","1"])
fig.delaxes(ax[5,1])
plt.show()
plt.close()
I keep getting this error:
AttributeError: 'AxesSubplot' object has no attribute 'get_array'
Several things are going wrong:
fig.colorbar(...) would create a new colorbar, by default appended to the last subplot that was created.
sns.heatmap returns an ax (indicates a subplot). This is very different to matplotlib functions, e.g. plt.imshow(), which would return the graphical element that was plotted.
You can suppress the heatmap's colorbar (cbar=False), and then create it newly with the parameters you want.
fig.colorbar(...) needs a parameter ax=... when the figure contains more than one subplot.
Instead of creating a new colorbar, you can add the colorbar parameters to sns.heatmap via cbar_kws=.... The colorbar itself can be found via ax.collections[0].colorbar. (ax.collections[0] is where matplotlib stores the graphical object that contains the heatmap.)
Looping with an explicit index is usually discouraged in Python. It's more readable, easier to maintain and less error-prone to include everything in the zip command.
As your vmin is now -1, taking the absolute value for the coloring seems to be a mistake.
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
merged_dfs = [pd.DataFrame(data=np.random.rand(5, 7), columns=[*'ABCDEFG']) for _ in range(5)]
new_name_data = [f'Dataset {i + 1}' for i in range(len(merged_dfs))]
fig, axes = plt.subplots(nrows=2, ncols=3, figsize=(12, 7))
ticks = [-1, -0.5, 0, 0.5, 1]  # colorbar tick positions; their labels are replaced by absolute values below
for title, merging, ax in zip(new_name_data, merged_dfs, axes.flat):
    graph = merging.corr()
    colormap = sns.diverging_palette(250, 250, as_cmap=True)
    sns.heatmap(graph, cmap=colormap, vmin=-1, vmax=1, center=0, annot=True, ax=ax, cbar_kws={'ticks': ticks})
    ax.collections[0].colorbar.set_ticklabels([abs(t) for t in ticks])
fig.delaxes(axes.flat[-1])
fig.tight_layout()
plt.show()
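As a variation on the same idea, the absolute-value labels can also come from a tick formatter instead of a fixed list of label strings. This is a minimal sketch in plain matplotlib, using imshow as a stand-in for the heatmap; the random correlation matrix and the "coolwarm" colormap are assumptions for illustration. The same format entry could presumably be passed through cbar_kws, but that is not tested here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import numpy as np

rng = np.random.default_rng(0)
corr = np.corrcoef(rng.normal(size=(5, 50)))  # illustrative 5x5 correlation matrix

fig, ax = plt.subplots()
im = ax.imshow(corr, cmap="coolwarm", vmin=-1, vmax=1)

ticks = [-1, -0.5, 0, 0.5, 1]
# Show |value| on the colorbar while the tick positions keep their sign.
abs_formatter = FuncFormatter(lambda value, pos: f"{abs(value):g}")
cbar = fig.colorbar(im, ax=ax, ticks=ticks, format=abs_formatter)

fig.canvas.draw()  # make sure the tick label texts are populated
tick_labels = [t.get_text() for t in cbar.ax.get_yticklabels()]
```

A formatter keeps the labels correct even if the tick positions are changed later, whereas a fixed label list has to be kept in sync by hand.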

How to do a boxplot with individual data points using seaborn

I have a box plot that I create using the following command:
sns.boxplot(y='points_per_block', x='block', data=data, hue='habit_trial')
So the different colors represent whether the trial was a habit trial or not (0,1). I want to also plot the individual data points, which I tried to achieve using:
sns.stripplot(y='points_per_block', x='block', data=data, hue='habit_trial')
The result was the following
I want the individual points to display over the corresponding box plots. Is there a way to do this without resorting to hacking their positions in some manner? The problem comes from the fact that the separation of data using hue works differently for stripplot and boxplot but I would have thought that these would be easily combinable.
Thanks in advance.
Seaborn functions working with categorical data usually have a dodge= parameter indicating whether data with different hue should be separated a bit. For a boxplot, dodge defaults to True, as it usually would look bad without dodging. For a stripplot, the default is dodge=False.
The following example also shows how the legend can be updated (matplotlib 3.4 is needed for HandlerTuple):
import seaborn as sns
from matplotlib.legend_handler import HandlerTuple
tips = sns.load_dataset("tips")
ax = sns.boxplot(data=tips, x="day", y="total_bill",
hue="smoker", hue_order=['Yes', 'No'], boxprops={'alpha': 0.4})
sns.stripplot(data=tips, x="day", y="total_bill",
hue="smoker", hue_order=['Yes', 'No'], dodge=True, ax=ax)
handles, labels = ax.get_legend_handles_labels()
ax.legend(handles=[(handles[0], handles[2]), (handles[1], handles[3])],
labels=['Smoker', 'Non-smoker'],
loc='upper left', handlelength=4,
handler_map={tuple: HandlerTuple(ndivide=None)})
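The legend part of that example can be looked at in isolation. The sketch below builds the combined entries from hand-made proxy artists in plain matplotlib; the colors and proxy artists are assumptions for illustration and stand in for the box and strip handles that seaborn creates.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.legend_handler import HandlerTuple
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

fig, ax = plt.subplots()

# Proxy artists: one semi-transparent box and one marker per hue level.
box_yes = Patch(facecolor="tab:blue", alpha=0.4)
box_no = Patch(facecolor="tab:orange", alpha=0.4)
dot_yes = Line2D([], [], marker="o", linestyle="", color="tab:blue")
dot_no = Line2D([], [], marker="o", linestyle="", color="tab:orange")

# Each tuple becomes a single legend entry that draws both handles side by side.
legend = ax.legend(handles=[(box_yes, dot_yes), (box_no, dot_no)],
                   labels=["Smoker", "Non-smoker"],
                   loc="upper left", handlelength=4,
                   handler_map={tuple: HandlerTuple(ndivide=None)})
legend_labels = [t.get_text() for t in legend.get_texts()]
```

With ndivide=None the handler splits the handle area evenly between the members of each tuple.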

Difficulty combining and repositioning the legends of two charts in matplotlib and pandas

I am trying to plot two charts onto one figure, with both charts coming from the same dataframe, but one represented as a stacked bar chart and the other a simple line plot.
When I create the plot using the following code:
combined.iloc[:, 1:10].plot(kind='bar', stacked=True, figsize=(20,10))
combined.iloc[:, 0].plot(kind='line', secondary_y=True, use_index=False, linestyle='-', marker='o')
plt.legend(loc='upper left', fancybox=True, framealpha=1, shadow=True, borderpad=1)
plt.show()
With the combined data frame looking like this:
I get the following image:
I am trying to combine both legends into one, and position the legend in the upper left hand corner so all the chart is visible.
Can someone explain why plt.legend() only seems to be editing the line chart corresponding to the combined.iloc[:, 0] slice of my combined dataframe? If anyone can see a quick and easy way to combine and reposition the legends please let me know! I'd be most grateful.
Passing secondary_y=True means that the plot is created on a separate axes instance with a twin x-axis. Because this is a different axes instance, the usual solution is to create the legend manually, as in the answers to the question linked by @ImportanceOfBeingErnest. If you don't want to create the legend directly, you can get around this issue by calling plt.legend() after each call to pandas.DataFrame.plot and storing the result. You can then recover the handles and labels from the two stored legends. The following code is a complete example of this:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.DataFrame({'x' : np.random.random(25),
'y' : np.random.random(25)*5,
'z' : np.random.random(25)*2.5})
df.iloc[:, 1:10].plot(kind='bar', stacked=True)
leg = plt.legend()
df.iloc[:, 0].plot(kind='line', y='x', secondary_y=True)
leg2 = plt.legend()
plt.legend(leg.get_patches()+leg2.get_lines(),
[text.get_text() for text in leg.get_texts()+leg2.get_texts()],
loc='upper left', fancybox=True, framealpha=1, shadow=True, borderpad=1)
leg.remove()
plt.show()
This will produce
and should be fairly easy to modify to suit your specific use case.
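For comparison, the "create the legend manually" route mentioned above can be sketched with plain matplotlib. The twin axes here is created with twinx(), which is an assumption standing in for what secondary_y=True sets up; the column names and random data are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
idx = np.arange(10)
y, z, x = rng.random(10) * 5, rng.random(10) * 2.5, rng.random(10)

fig, ax = plt.subplots()
ax.bar(idx, y, label="y")
ax.bar(idx, z, bottom=y, label="z")  # stacked on top of y
ax2 = ax.twinx()                     # second axes sharing the x-axis
ax2.plot(idx, x, color="k", marker="o", label="x")

# Collect handles and labels from both axes and build one legend from them.
h1, l1 = ax.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
legend = ax2.legend(h1 + h2, l1 + l2, loc="upper left")
combined_labels = [t.get_text() for t in legend.get_texts()]
```

Attaching the legend to the twin axes keeps it drawn on top of the artists of both axes.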
Alternatively, you can use matplotlib.pyplot.figlegend(), but you will need to pass legend = False in all calls to pandas.DataFrame.plot(), i.e.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
df = pd.DataFrame({'x' : np.random.random(25),
'y' : np.random.random(25)*5,
'z' : np.random.random(25)*2.5})
df.iloc[:, 1:10].plot(kind='bar', stacked=True, legend=False)
df.iloc[:, 0].plot(kind='line', y='x', secondary_y=True, legend=False)
plt.figlegend(loc='upper left', fancybox=True, framealpha=1, shadow=True, borderpad=1)
plt.show()
This will however default to positioning the legend outside the axes, but you can override the automatic positioning via the bbox_to_anchor argument in calling plt.figlegend().
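A minimal sketch of that override, again in plain matplotlib with twinx() standing in for secondary_y=True (an assumption, as are the data and labels): passing bbox_transform=ax.transAxes makes bbox_to_anchor use axes coordinates, so (0, 1) refers to the upper left corner of the axes rather than of the figure.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
idx = np.arange(10)

fig, ax = plt.subplots()
ax.bar(idx, rng.random(10) * 5, label="y")
ax2 = ax.twinx()
ax2.plot(idx, rng.random(10), color="k", label="x")

# A figure-level legend gathers the labeled artists of all axes in the figure.
legend = fig.legend(loc="upper left",
                    bbox_to_anchor=(0, 1),        # upper left corner ...
                    bbox_transform=ax.transAxes)  # ... in axes coordinates
figure_legend_labels = [t.get_text() for t in legend.get_texts()]
```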

How to prevent overlapping x-axis labels in sns.countplot

For the plot
sns.countplot(x="HostRamSize",data=df)
I got the following graph with x-axis label mixing together, how do I avoid this? Should I change the size of the graph to solve this problem?
Having a DataFrame ds like this
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(136)
l = "1234567890123"
categories = [ l[i:i+5]+" - "+l[i+1:i+6] for i in range(6)]
x = np.random.choice(categories, size=1000,
                     p=np.diff(np.array([0,0.7,2.8,6.5,8.5,9.3,10])/10.))
# a DataFrame rather than a Series, so that data=ds works with x="Column"
ds = pd.DataFrame({"Column" : x})
there are several options to make the axis labels more readable.
Change figure size
plt.figure(figsize=(8,4)) # this creates a figure 8 inch wide, 4 inch high
sns.countplot(x="Column", data=ds)
plt.show()
Rotate the ticklabels
ax = sns.countplot(x="Column", data=ds)
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right")
plt.tight_layout()
plt.show()
Decrease Fontsize
ax = sns.countplot(x="Column", data=ds)
ax.set_xticklabels(ax.get_xticklabels(), fontsize=7)
plt.tight_layout()
plt.show()
Of course any combination of those would work equally well.
Setting rcParams
The figure size and the xlabel fontsize can be set globally using rcParams
plt.rcParams["figure.figsize"] = (8, 4)
plt.rcParams["xtick.labelsize"] = 7
This might be useful to put at the top of a Jupyter notebook such that those settings apply to any figure generated within. Unfortunately, rotating the xticklabels is not possible using rcParams.
I guess it's worth noting that the same strategies would naturally also apply for seaborn barplot, matplotlib bar plot or pandas.bar.
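Since rcParams does not cover the rotation, a per-axes alternative that does not re-set the label texts is tick_params. The following is a small sketch with a plain matplotlib bar plot; the category names and counts are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

categories = ["12345 - 23456", "23456 - 34567", "34567 - 45678", "45678 - 56789"]
counts = [70, 210, 370, 200]

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(categories, counts)

# Rotate the x tick labels without touching their text.
ax.tick_params(axis="x", labelrotation=40)
# tick_params has no alignment option, so set it on the label objects directly.
plt.setp(ax.get_xticklabels(), ha="right")

fig.tight_layout()
fig.canvas.draw()
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```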
You can rotate the x-axis tick labels and increase their font size using the xticks function of matplotlib.pyplot.
For example:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(10,5))
chart = sns.countplot(x="HostRamSize",data=df)
plt.xticks(
    rotation=45,
    horizontalalignment='right',
    fontweight='light',
    fontsize='x-large'
)
For more such modifications you can refer to this link:
Drawing from Data
If you just want to make sure xticks labels are not squeezed together, you can set a proper fig size and try fig.autofmt_xdate().
This function will automatically align and rotate the labels.
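A minimal sketch of the autofmt_xdate suggestion, using a plain matplotlib bar plot with made-up category names (both are assumptions for illustration). Despite its name, the function works on any x tick labels, not only dates.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

locations = ["North district", "South district", "East district", "West district"]
counts = [12, 30, 21, 17]

fig, ax = plt.subplots(figsize=(8, 4))
ax.bar(locations, counts)

# Rotate and right-align the x tick labels and make room for them at the bottom.
fig.autofmt_xdate(rotation=30, ha="right")

fig.canvas.draw()
label_rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```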
plt.figure(figsize=(15,10)) #adjust the size of plot
ax=sns.countplot(x=df['Location'],data=df,hue='label',palette='mako')
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right") #it will rotate text on x axis
plt.tight_layout()
plt.show()
You can try this code and change the size and rotation according to your needs.
I don't know whether it is an option for you but maybe turning the graphic could be a solution (instead of plotting on x=, do it on y=), such that:
sns.countplot(y="HostRamSize",data=df)

Move legend outside figure in seaborn tsplot [duplicate]

I would like to create a time series plot using seaborn.tsplot like in this example from tsplot documentation, but with the legend moved to the right, outside the figure.
Based on the lines 339-340 in seaborn's timeseries.py, it looks like seaborn.tsplot currently doesn't allow direct control of legend placement:
if legend:
    ax.legend(loc=0, title=legend_name)
Is there a matplotlib workaround?
I'm using seaborn 0.6-dev.
20220916 Update
Since version v0.11.2 of seaborn, there is an in-built control of the legend position, see seaborn.move_legend. To put the legend outside:
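import seaborn as sns
penguins = sns.load_dataset("penguins")  # example dataset shipped with seaborn
ax = sns.histplot(penguins, x="bill_length_mm", hue="species")
sns.move_legend(ax, "upper left", bbox_to_anchor=(1, 1))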
Old Answer
Indeed, seaborn doesn't handle legends well so far. You can use plt.legend() to control legend properties directly through matplotlib, in accordance with Matplotlib Legend Guide.
Note that in Seaborn 0.10.0 tsplot was removed, and you may replicate (with different values for the estimation if you please) the plots with lineplot instead of tsplot.
Snippet
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="darkgrid")
# Load the long-form example gammas dataset
gammas = sns.load_dataset("gammas")
# Plot the response with standard error
sns.lineplot(data=gammas, x="timepoint", y="BOLD signal", hue="ROI")
# Put the legend out of the figure
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
Output
Existing solutions seem to be making things unnecessarily complicated by using the "wrong" thing for the location parameter; think about it in terms of where the legend is in relation to an anchor. For example, if you want a legend on the right, then the anchor location is center left of it.
We can simplify Sergey Antopolskiy's answer down to:
import seaborn as sns
# Load the long-form example gammas dataset
gammas = sns.load_dataset("gammas")
g = sns.lineplot(data=gammas, x="timepoint", y="BOLD signal", hue="ROI")
# Put the legend out of the figure
g.legend(loc='center left', bbox_to_anchor=(1, 0.5))
bbox_to_anchor says we want the anchor on the right (i.e. 1 on the x-axis) and vertically centered (0.5 on the y-axis). loc says we want the legend center-left of this anchor.
In Seaborn version 0.11.0, this gives me something like:
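The anchor logic described in this answer can be checked numerically. The sketch below uses plain matplotlib with three made-up lines (an assumption; the answer itself uses the gammas dataset) and compares the rendered legend box with the axes box.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax = plt.subplots()
for name in ("A", "B", "C"):
    ax.plot(rng.random(10), label=name)

# Anchor at the right edge of the axes, vertically centered;
# the legend's own center-left point is placed at that anchor.
legend = ax.legend(loc="center left", bbox_to_anchor=(1, 0.5))

fig.canvas.draw()
renderer = fig.canvas.get_renderer()
legend_box = legend.get_window_extent(renderer)
axes_box = ax.get_window_extent(renderer)
legend_is_right_of_axes = legend_box.x0 >= axes_box.x1
```

A legend placed this way can fall outside the saved image; passing bbox_inches="tight" to savefig is a common way to include it.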
I tried to apply T.W.'s answer for seaborn lineplot, without success. A few modifications to his answer did the job... in case anyone is looking for the lineplot version as I was!
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# load data
df = sns.load_dataset("gammas")
# EDIT: I needed to add the fig
fig, ax1 = plt.subplots(1,1)
# EDIT:
# T.W.'s answer said: "create with hue but without legend" <- I needed to include it!
# So, removed: legend=False
g = sns.lineplot(x="timepoint", y="BOLD signal", hue="ROI", data=df, ax=ax1)
# EDIT:
# Removed 'ax' from T.W.'s answer here as well:
box = g.get_position()
g.set_position([box.x0, box.y0, box.width * 0.85, box.height]) # resize position
# Put a legend to the right side
g.legend(loc='center right', bbox_to_anchor=(1.25, 0.5), ncol=1)
plt.show()
The answer by Sergey worked great for me using seaborn.tsplot, but I was not able to get it working for seaborn.lmplot, so I looked a bit deeper and found another solution:
Example:
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
# load data (pd.DataFrame.from_csv was removed from pandas; read_csv replaces it)
df = pd.read_csv('mydata.csv')
# create with hue but without legend
g = sns.lmplot(x="x_data", y="y_data", hue="condition", legend=False, data=df)
# resize figure box to -> put the legend out of the figure
box = g.ax.get_position() # get position of the axes
g.ax.set_position([box.x0, box.y0, box.width * 0.85, box.height]) # resize position
# Put a legend to the right side
g.ax.legend(loc='center right', bbox_to_anchor=(1.25, 0.5), ncol=1)
plt.show() # sns.plt is no longer available in current seaborn
Maybe you have to play around with the values to fit them to your legend.
This answer will also be helpful if you need more examples.
A pure seaborn solution:
FacetGrid-based Seaborn plots can do this automatically using the legend_out kwarg. Using relplot, pass legend_out to the FacetGrid constructor via the facet_kws dictionary:
import seaborn as sns
sns.set(style="darkgrid")
gammas = sns.load_dataset("gammas")
sns.relplot(
data=gammas,
x="timepoint",
y="BOLD signal",
hue="ROI",
kind="line",
facet_kws={"legend_out": True}
)
