Setting flier (outlier) style in Seaborn boxplot is ignored - python

Using Seaborn, I can create boxplots of multiple columns of one pandas DataFrame on the same figure. I would like to apply a custom style to the fliers (outliers), e.g. setting the marker symbol, color and marker size.
The API documentation on seaborn.boxplot, however, only provides an argument fliersize which lets me control the size of the fliers but not the color and symbol.
Since Seaborn uses matplotlib for plotting, I thought I could provide a matplotlib styling dictionary to the boxplot function like so:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
# create a dataframe
df = pd.DataFrame({'column_a': [3, 6, 200, 100, 7], 'column_b': [1, 8, 4, 150, 290], 'column_c': [6, 7, 20, 80, 275]})
# set figure size
sns.set(rc={"figure.figsize": (14, 6)})
# define outlier properties
flierprops = dict(marker='o', markersize=5)
# create boxplot
ax = sns.boxplot(df, vert=False, showmeans=True, flierprops=flierprops)
plt.show()
Result:
According to the provided dictionary, I would expect a large red circle representing the flier of column_c, but instead the default settings are still used.
This thread describes a similar problem when matplotlib is used directly; however, from the discussion I gathered that this should have been fixed in recent versions of matplotlib.
I tried this with an iPython notebook (iPython 3.10), matplotlib 1.4.3 and seaborn 0.5.1.

flierprops = dict(marker='o', markerfacecolor='None', markersize=10, markeredgecolor='black')
sns.boxplot(y=df.Column,orient="v",flierprops=flierprops)

Seaborn's boxplot code ignores your flierprops argument and overwrites it with its own before passing arguments to Matplotlib's. Matplotlib's boxplot also returns all the flier objects as part of its return value, so you could modify this after running boxplot, but Seaborn doesn't return this.
The overwriting of flierprops (and sym) seems like a bug, so I'll see if I can fix it: see this issue. Meanwhile, you may want to consider using matplotlib's boxplot instead. Looking at seaborn's code may be useful (boxplot is in distributions.py).
Update: there is now a pull request that fixes this (flierprops and other *props, but not sym)
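As a stopgap along the lines suggested above, matplotlib's own boxplot can be called directly. The following is a minimal sketch, assuming the DataFrame from the question; the red fill, the marker size of 12 and the black edge are illustrative values rather than anything from the original post, and the boxes are drawn vertically for simplicity.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({'column_a': [3, 6, 200, 100, 7],
                   'column_b': [1, 8, 4, 150, 290],
                   'column_c': [6, 7, 20, 80, 275]})

# illustrative flier style (these values are assumptions, not from the question)
flierprops = dict(marker='o', markerfacecolor='red', markersize=12,
                  linestyle='none')

fig, ax = plt.subplots(figsize=(14, 6))
# one box per DataFrame column; matplotlib applies flierprops directly
result = ax.boxplot(df.to_numpy(), showmeans=True, flierprops=flierprops)
ax.set_xticklabels(df.columns)

# the returned dict exposes the flier Line2D objects for later tweaks
for flier in result['fliers']:
    flier.set_markeredgecolor('black')
```

Because matplotlib returns the flier artists, they can still be restyled after plotting, which is the part the answer notes seaborn does not expose.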

Related

SNS Heatmap, display one column label out of two [duplicate]

I have a pandas dataframe of shape (39, 67). When I plot its seaborn heatmap, I don't get all the labels on the X and Y axes. The .get_xticklabels() method also returns only 23 labels.
matplotlib doesn't show any labels either (only numbers).
Both these heatmaps are for the same dataframe (39, 67).
To ensure all the labels are visible, you have to set the parameters xticklabels and yticklabels to True, like so:
import seaborn as sns
sns.heatmap(dataframe, xticklabels=True, yticklabels=True)
Here's the documentation for the heatmap function.
import seaborn as sns
sns.heatmap(dataframe, xticklabels=1, yticklabels=1)
You may also adjust the figure size, for example with plt.figure(figsize=(7, 5)) before the heatmap call, to change the scale.
The answers here didn't work for me, so I followed the suggestions here.
Try opening a separate matplotlib window and tweaking the parameters there:
Python sns heatmap does not fully display x labels
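To check the effect of xticklabels=True and yticklabels=True, here is a small sketch on a random frame with the same shape as in the question. The data, the row and column names, and the figure size are all made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# hypothetical stand-in for the (39, 67) dataframe from the question
rng = np.random.default_rng(0)
dataframe = pd.DataFrame(
    rng.random((39, 67)),
    index=[f"row_{i}" for i in range(39)],
    columns=[f"col_{j}" for j in range(67)],
)

# a larger figure leaves room for every label
fig, ax = plt.subplots(figsize=(20, 10))
sns.heatmap(dataframe, xticklabels=True, yticklabels=True, ax=ax)

# count the tick labels that were actually placed on each axis
n_xlabels = len(ax.get_xticklabels())
n_ylabels = len(ax.get_yticklabels())
print(n_xlabels, n_ylabels)
```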

seaborn barplot padding between bars

A short while ago, I posted this question:
seaborn barplot: vary color with x and hue
My sample code from that question produces a barplot, which looks like this:
As you can see, there is a very tiny space between the bars.
The code is this:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
df = pd.DataFrame(columns=["model", "time", "value"])
df["model"] = ["on"]*2 + ["off"]*2
df["time"] = ["short", "long"] * 2
df["value"] = [1, 10, 2, 4]
sns.barplot(data=df, x="model", hue="time", y="value")
plt.show()
Now I executed the code on a different machine and it produced a different image:
The different colors are no concern; I'll specify my own palette in any case. But an important difference for me is that the bars now touch each other: there is no longer a white border between them.
How can I reproduce the original behaviour? How can I explicitly set the padding of barplot bars in seaborn?
My current seaborn version is 0.9.0. Unfortunately, I don't know with which version the original image was created.

changing size of seaborn plots and matplotlib library plots in a common way

from pylab import rcParams
rcParams['figure.figsize'] = (10, 10)
This works fine for histogram plots but not for factor plots.
sns.factorplot (.....) still shows the default size.
sns.factorplot('Pclass','Survived',hue='person',data = titanic_df,size = 6,aspect =1)
I have to specify size and aspect every time.
Please suggest something that works for both of them globally.
It's not possible to change the figure size of a factorplot via rcParams.
The figure size is hardcoded inside the FacetGrid class as
figsize = (ncol * size * aspect, nrow * size)
A new figure is then created using this figsize.
This makes it impossible to change the figure size by any means other than the arguments in the function call to factorplot. It also makes it impossible to first create a figure with other parameters and plot the factorplot to this figure. However, for a workaround in the case of a factorplot with a single axes, see @MartinEvans' answer.
The author of seaborn argues here that this is because a factorplot would need to have full control over the figure.
While one may question whether this needs to be the case, there is nothing you can do about it, other than adding a feature request at the GitHub site and/or writing your own wrapper, which wouldn't be too difficult, given that seaborn as well as matplotlib are open source.
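The behaviour described above can be observed directly. The sketch below assumes a newer seaborn in which factorplot is available as catplot and size as height; the sample data and the values height=3, aspect=2 are illustrative. The legend is switched off because an outside legend can widen the figure.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# global default figure size; the facet grid does not use it
plt.rcParams['figure.figsize'] = (10, 10)

df = pd.DataFrame({
    'product': ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
    'close': [1, 1, 0, 1, 1, 1, 0],
    'counts': [3, 3, 3, 2, 2, 2, 2]})

# single facet: figsize = (ncol * height * aspect, nrow * height)
g = sns.catplot(data=df, x='product', y='counts', hue='close',
                kind='bar', height=3, aspect=2, legend=False)

width, height = g.fig.get_size_inches()
print(width, height)
```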
The figure size can be modified by first creating a figure and axis and passing this as a parameter to the seaborn plot:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots(figsize=(10, 10))
df = pd.DataFrame({
'product' : ['A', 'A', 'A', 'B', 'B', 'C', 'C'],
'close' : [1, 1, 0, 1, 1, 1, 0],
'counts' : [3, 3, 3, 2, 2, 2, 2]})
sns.factorplot(y='counts', x='product', hue='close', data=df, kind='bar', palette='muted', ax=ax)
plt.close(2) # close empty figure
plt.show()
When using an axis-grid type of plot such as factorplot, seaborn automatically creates another figure. A workaround is to close the empty second figure.

pandas and seaborn - heatmap with no colors

I've been working with seaborn and its heatmap function. I would like to build a matrix with annotated values from pandas dataframe df:
C,L,N
a,x,10
a,y,2
a,z,4
b,x,1
b,y,22
b,z,11
c,x,3
c,y,1
c,z,0
So far it worked fine with:
# Read DataBase
df = pd.read_csv('myfile.csv')
# Save Users/City/Language Matrix
pdf = df.pivot(index='C',columns='L',values='N').fillna(0)
# Set Font Parameters
rc = {'font.size': 8, 'xtick.labelsize': 11, 'ytick.labelsize': 11}
# Set Figure
fig = plt.figure(figsize=(20, 9))
# Assign to Seaborn
sns.set(rc=rc)
with sns.axes_style('white'):
    sns.heatmap(pdf,
                cbar=False,
                square=False,
                annot=True,
                cmap='Blues',
                fmt='g',
                linewidths=0.5)
Which returns:
At the end I'm interested to keep only the values and save the structure as a simple table, discarding the colors. I tried to set cmap=None but it doesn't work, and without the cmap, seaborn assigns a default cmap to the heatmap.
If I understand you correctly, and all you want is to ignore the colormap, you can create a custom colormap with the background color of your choice using matplotlib's ListedColormap:
from matplotlib.colors import ListedColormap
with sns.axes_style('white'):
    sns.heatmap(pdf,
                cbar=False,
                square=False,
                annot=True,
                fmt='g',
                cmap=ListedColormap(['white']),
                linewidths=0.5)
Which will yield:
Replace the string 'white' with any color name, hex string or RGB tuple to set the background color of your choice.
However, I believe pandas offers better export options. Find below a screenshot of all the possible export options:
Depending on where you want to insert the table, to_html, to_json or to_latex may be better options than plotting with seaborn.
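As a rough illustration of the export route, the sketch below rebuilds the pivot table from the sample data in the question and writes it out in a few formats. The CSV is read from an in-memory string instead of myfile.csv, and the variable names are illustrative.

```python
import io
import pandas as pd

# sample data from the question, inlined instead of reading myfile.csv
csv_text = """C,L,N
a,x,10
a,y,2
a,z,4
b,x,1
b,y,22
b,z,11
c,x,3
c,y,1
c,z,0
"""

df = pd.read_csv(io.StringIO(csv_text))
pdf = df.pivot(index='C', columns='L', values='N').fillna(0)

# plain-table exports; no plotting involved
html_table = pdf.to_html()
json_table = pdf.to_json()
text_table = pdf.to_string()
print(text_table)
```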

How to change figuresize using seaborn factorplot

%pylab inline
import pandas as pd
import numpy as np
import matplotlib as mpl
import seaborn as sns
typessns = pd.DataFrame.from_csv('C:/data/testesns.csv', index_col=False, sep=';')
mpl.rc("figure", figsize=(45, 10))
sns.factorplot("MONTH", "VALUE", hue="REGION", data=typessns, kind="box", palette="OrRd");
I always get a small figure, no matter what size I've specified in figsize...
How to fix it?
Note added in 2019: In modern seaborn versions the size argument has been renamed to height.
To be a little more concrete:
%matplotlib inline
import seaborn as sns
exercise = sns.load_dataset("exercise")
# Defaults are size=5, aspect=1
sns.factorplot("kind", "pulse", "diet", exercise, kind="point", size=2, aspect=1)
sns.factorplot("kind", "pulse", "diet", exercise, kind="point", size=4, aspect=1)
sns.factorplot("kind", "pulse", "diet", exercise, kind="point", size=4, aspect=2)
You want to pass the arguments 'size' or 'aspect' to sns.factorplot() when constructing your plot.
Size will change the height while maintaining the aspect ratio (so the plot will also get wider if only size is changed).
Aspect will change the width while keeping the height constant.
The above code should run locally in an IPython notebook.
Plot sizes are reduced in these examples to show the effects, and because the plots from the above code were fairly large when saved as PNGs. This also shows that size/aspect includes the legend in the margin.
size=2, aspect=1
size=4, aspect=1
size=4, aspect=2
Also, once the 'sns' module is loaded, all other useful parameters and defaults for this plotting function can be viewed with:
help(sns.factorplot)
mpl.rc is stored in a global dictionary (see http://matplotlib.org/users/customizing.html).
So, if you only want to change the size of one figure (locally), it will do the trick:
plt.figure(figsize=(45,10))
sns.factorplot(...)
It worked for me using matplotlib-1.4.3 and seaborn-0.5.1
The size of the figure is controlled by the size and aspect arguments to factorplot. They correspond to the size of each facet ("size" really means "height", and size * aspect gives the width), so if you are aiming for a particular size for the whole figure you'll need to work backwards from there.
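Working backwards can be written down as a little arithmetic. The helper below is hypothetical (facet_dims is not part of seaborn) and assumes the relation figsize = (ncol * size * aspect, nrow * size), ignoring any extra width a legend may add.

```python
def facet_dims(fig_width, fig_height, nrow=1, ncol=1):
    """Return (size, aspect) per facet for a target overall figure size in inches.

    Hypothetical helper, assuming
    figsize = (ncol * size * aspect, nrow * size).
    """
    size = fig_height / nrow            # height of each facet
    aspect = fig_width / (ncol * size)  # width / height of each facet
    return size, aspect


# a single facet filling a 45 x 10 inch figure
size, aspect = facet_dims(45, 10)
print(size, aspect)

# a 2 x 3 grid filling a 12 x 8 inch figure
grid_size, grid_aspect = facet_dims(12, 8, nrow=2, ncol=3)
print(grid_size, grid_aspect)
```

The resulting values would then be passed as size (or height in newer seaborn versions) and aspect.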
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(rc={'figure.figsize':(12.7,8.6)})
plt.figure(figsize=(45,10))
Output
Do not use %pylab inline; it is deprecated. Use %matplotlib inline instead.
The question is not specific to IPython.
Use seaborn's set_style function and pass it your rc as the second parameter or as a keyword argument: http://web.stanford.edu/~mwaskom/software/seaborn/generated/seaborn.set_style.html
If you just want to scale the figure use the below code:
import matplotlib.pyplot as plt
plt.figure(figsize=(8, 6))
sns.factorplot("MONTH", "VALUE", hue="REGION", data=typessns, kind="box", palette="OrRd")  # or any other plot code
Note as of July 2018:
seaborn.__version__ == 0.9.0
Two main changes affect the above answers:
The factorplot function has been renamed to catplot()
The size parameter has been renamed to height for multi plot grid functions and those that use them.
https://seaborn.pydata.org/whatsnew.html
This means the answer provided by @Fernando Hernandez should be adjusted as follows:
%matplotlib inline
import seaborn as sns
exercise = sns.load_dataset("exercise")
# Defaults are height=5, aspect=1
sns.catplot(x="kind", y="pulse", hue="diet", data=exercise, kind="point", height=4, aspect=2)
