Custom Chart Formatting in Seaborn - python

I have a straightforward countplot in seaborn.
Code is:
ax = sns.countplot(x="days", data=df,color ='cornflowerblue')
ax.set_xticklabels(ax.get_xticklabels(),rotation=90)
ax.set(xlabel='days', ylabel='Conversions')
ax.set_title("Days to Conversion")
for p in ax.patches:
    count = p.get_height()
    x = p.get_x() + p.get_width()/1.25
    y = p.get_height()*1.01
    ax.annotate(count, (x, y),ha='right')
Which produces:
I'm trying to make the chart a bit 'prettier'. Specifically, I want to raise the height of the outline so it won't cross the count number on the first bar, and center the count numbers with a small space above each bar. I can't get it to work.
Guidance please.

To set the labels, matplotlib 3.4 and later provide a function bar_label() which takes care of positioning. In older versions you could use your code, but with x = p.get_x() + p.get_width()/2 and ha='center' (in ax.annotate(...) or ax.text(...)).
To make room for the labels, an extra margin can be added via ax.margins(y=0.1).
import matplotlib.pyplot as plt
import seaborn as sns
df = sns.load_dataset('tips')
ax = sns.countplot(x="day", data=df, color='cornflowerblue')
ax.tick_params(axis='x', labelrotation=90)
ax.bar_label(ax.containers[-1])
ax.margins(y=0.1)
plt.tight_layout()
plt.show()
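For matplotlib versions older than 3.4, the manual centering described above could look roughly like the sketch below. It uses plain matplotlib with made-up counts in place of the countplot (the `days` and `counts` values are assumptions for illustration), and the 3-point offset above each bar is an arbitrary choice.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical data standing in for the bars that countplot would draw
days = ["1", "2", "3", "4"]
counts = [120, 45, 30, 12]

fig, ax = plt.subplots()
ax.bar(days, counts, color="cornflowerblue")

labels = []
for p in ax.patches:
    x = p.get_x() + p.get_width() / 2   # horizontal center of the bar
    y = p.get_height()                  # top of the bar
    labels.append(
        ax.annotate(f"{int(y)}", (x, y),
                    xytext=(0, 3), textcoords="offset points",  # small gap above the bar
                    ha="center", va="bottom")
    )

ax.margins(y=0.1)  # extra headroom so the top label stays inside the frame
```

The offset is given in points rather than data units, so the gap stays the same regardless of the bar heights.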

Related

Seaborn - grouped bar plot with Bottom parameter

I want to create a vertical bar plot using seaborn. Each x-axis label will have n bars (2 in the example) of different colors, but each bar will be floating; in other words, it uses the matplotlib bar bottom parameter.
This works without the bottom part as follows, but fails with it:
import pandas as pd
import seaborn as sns
d = {'month':['202001','202002','202003','202001','202002','202003'],
     'range' : [0.94,4.47,0.97,4.70,0.98,1.23],
     'bottom' : [8.59,17.05,8.35,17.78,8.32,5.67],
     'group' : ['a','a','a','b','b','b']
     }
df = pd.DataFrame(data=d)
sns.barplot(data=df,x = "month", y = "range",hue='group')
(Sorry I can't upload the picture for some reason, I think the service is blocked from my work, but the code will display it if run)
but when I add the bottom parameters it fails
sns.barplot(data=df,x = "month", y = "range",hue='group',bottom='bottom')
I appreciate the help, and perhaps an explanation of why it is failing, as logically it should work
The bars indicate a range of forecasts for a measure, and I want to show them as a rectangle
Seaborn itself doesn't handle bottom, so it is passed on to plt.bar. But plt.bar requires bottom to have the same shape/size as x and y, which is not the case when the data is passed through seaborn.
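To make the shape requirement concrete, here is a minimal sketch in plain matplotlib where bottom is an array aligned element by element with the bar heights. It only uses group 'a' from the question's data; the variable names are chosen for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

months = ["202001", "202002", "202003"]
heights = np.array([0.94, 4.47, 0.97])    # 'range' values of group 'a'
bottoms = np.array([8.59, 17.05, 8.35])   # matching 'bottom' values of group 'a'

fig, ax = plt.subplots()
bars = ax.bar(months, heights, bottom=bottoms)  # one bottom value per bar

# Each rectangle starts at its own bottom and ends at bottom + height
starts = [rect.get_y() for rect in bars]
tops = [rect.get_y() + rect.get_height() for rect in bars]
```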
Let's try a workaround with the pandas plot function:
import matplotlib.pyplot as plt
to_plot = df.pivot(index='month',columns='group')
fig,ax = plt.subplots()
to_plot['range'].add(to_plot['bottom']).plot.bar(ax=ax)
# cover the bars up to `bottom`
# replace `w` with background color of your choice
to_plot['bottom'].plot.bar(ax=ax, color='w', legend=None)
Output:
For another approach that allows a specific style:
import numpy as np
# set sns plot style
sns.set()
fig,ax = plt.subplots()
for i,(label,r) in enumerate(to_plot.iterrows()):
    plt.bar([i-0.1,i+0.1],r['range'],
            bottom=r['bottom'],
            color=['C0','C1'],
            width=0.2)
plt.xticks(np.arange(len(to_plot)), to_plot.index)
Output:

No whitespace between Seaborn barplot bars

I created a Seaborn barplot using the code below (it comes from https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/)
I would like all the bars to stack up without whitespace, but have been unable to do so. If I add width, it complains about multiple values for width in barh. This is probably because seaborn has its own algorithm to determine the width. Is there any way around it?
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Read data
df = pd.read_csv("https://raw.githubusercontent.com/selva86/datasets/master/email_campaign_funnel.csv")
# Draw Plot
plt.figure(figsize=(13, 10), dpi=80)
group_col = 'Gender'
order_of_bars = df.Stage.unique()[::-1]
colors = [plt.cm.Spectral(i/float(len(df[group_col].unique())-1)) for i in
          range(len(df[group_col].unique()))]
for c, group in zip(colors, df[group_col].unique()):
    sns.barplot(x='Users', y='Stage', data=df.loc[df[group_col]==group, :],
                order=order_of_bars, color=c, label=group)
# Decorations
plt.xlabel("$Users$")
plt.ylabel("Stage of Purchase")
plt.yticks(fontsize=12)
plt.title("Population Pyramid of the Marketing Funnel", fontsize=22)
plt.legend()
plt.show()
Not a matplotlib expert by any means, so there may be a better way to do this. Perhaps you can do something like the following, which is similar to the approach in this answer:
# Draw Plot
fig, ax = plt.subplots(figsize=(13, 10), dpi=80)
...
for c, group in zip(colors, df[group_col].unique()):
    sns.barplot(x='Users', y='Stage', data=df.loc[df[group_col]==group, :],
                order=order_of_bars, color=c, label=group, ax=ax)
# Adjust height
for patch in ax.patches:
    current_height = patch.get_height()
    patch.set_height(1)
    patch.set_y(patch.get_y() + current_height - 1)
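If seaborn is not strictly required, another option is to draw the pyramid with matplotlib's own ax.barh, which accepts height directly. The sketch below is only an illustration under that assumption: the stage names and user counts are made up, not taken from the CSV in the question.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical funnel data; negative values extend to the left of zero
stages = ["Stage 01", "Stage 02", "Stage 03"]
male = [-1500, -900, -300]
female = [1400, 850, 320]

fig, ax = plt.subplots(figsize=(13, 10), dpi=80)
ax.barh(stages, male, height=1.0, label="Male")      # height=1.0 leaves no gap between rows
ax.barh(stages, female, height=1.0, label="Female")
ax.legend()

# Collect the vertical extent of the first group's bars to confirm they touch
edges = [(p.get_y(), p.get_y() + p.get_height()) for p in ax.patches[:3]]
```

With categorical positions spaced one unit apart, a bar height of 1.0 makes the top edge of each bar coincide with the bottom edge of the next.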

Multiple plots on common x axis in Matplotlib with common y-axis labeling

I have written the following minimal Python code in order to plot various functions of x on the same X-axis.
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from cycler import cycler
cycle = plt.rcParams['axes.prop_cycle'].by_key()['color']
xlabel='$X$'; ylabel='$Y$'
### Set tick features
plt.tick_params(axis='both',which='major',width=2,length=10,labelsize=18)
plt.tick_params(axis='both',which='minor',width=2,length=5)
#plt.set_axis_bgcolor('grey') # Doesn't work if I uncomment!
lines = ["-","--","-.",":"]
Nlayer=4
f, axarr = plt.subplots(Nlayer, sharex=True)
for a in range(1,Nlayer+1):
    X = np.linspace(0,10,100)
    Y = X**a
    index = a-1 + int((a-1)/Nlayer)
    axarr[a-1].plot(X, Y, linewidth=2.0+index, color=cycle[a], linestyle = lines[index], label='Layer = {}'.format(a))
    axarr[a-1].legend(loc='upper right', prop={'size':6})
#plt.legend()
# Axes labels
plt.xlabel(xlabel, fontsize=20)
plt.ylabel(ylabel, fontsize=20)
plt.show()
However, the plots don't join together on the X-axis and I failed to get a common Y-axis label; the label is applied only to the last plot (see attached figure). I also get an additional blank plot which I couldn't get rid of.
I am using Python3.
The following code produces the expected output:
without the blank plot, which was created by the two plt.tick_params calls made before the actual figure existed;
with the gridspec_kw argument of subplots, which controls the spacing between the rows and columns of subplots, so that the layer plots join together;
with a single, centered common y label placed using fig.text with relative positioning and the rotation argument (the x label is handled the same way for a consistent result). Note that this can also be done by repositioning the y label with ax.yaxis.set_label_coords() after a usual call to ax.set_ylabel().
import numpy as np
import matplotlib.pyplot as plt
cycle = plt.rcParams['axes.prop_cycle'].by_key()['color']
xlabel='$X$'; ylabel='$Y$'
lines = ["-","--","-.",":"]
Nlayer = 4
fig, axarr = plt.subplots(Nlayer, sharex='col',gridspec_kw={'hspace': 0, 'wspace': 0})
X = np.linspace(0,10,100)
for i,ax in enumerate(axarr):
    Y = X**(i+1)
    ax.plot(X, Y, linewidth=2.0+i, color=cycle[i], linestyle = lines[i], label='Layer = {}'.format(i+1))
    ax.legend(loc='upper right', prop={'size':6})
with axes labels, first option :
fig.text(0.5, 0.01, xlabel, va='center')
fig.text(0.01, 0.5, ylabel, va='center', rotation='vertical')
or alternatively :
# ax here is the last of the Nlayer axes plotted, i.e. the bottom one
ax.set_xlabel(xlabel)
ax.set_ylabel(ylabel)
# move the y label to the vertical center of all Nlayer axes, i.e. Nlayer/2 in axes coordinates
ax.yaxis.set_label_coords(-0.1,Nlayer/2)
which gives :
I also simplified your for loop by using enumerate to have an automatic counter i when looping over axarr.
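As a further possibility, assuming matplotlib 3.4 or newer is available, the figure-level fig.supxlabel and fig.supylabel methods place one shared label per figure without manual coordinates. The following is a minimal sketch of that idea, not part of the answer above; `n_layers` is just a renamed stand-in for Nlayer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

n_layers = 4
X = np.linspace(0, 10, 100)

# hspace=0 joins the stacked axes; sharex keeps a single x axis at the bottom
fig, axarr = plt.subplots(n_layers, sharex="col", gridspec_kw={"hspace": 0})
for i, ax in enumerate(axarr):
    ax.plot(X, X ** (i + 1), label="Layer = {}".format(i + 1))
    ax.legend(loc="upper right", prop={"size": 6})

# Figure-level labels (requires matplotlib >= 3.4)
x_text = fig.supxlabel("$X$")
y_text = fig.supylabel("$Y$")
```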

How to prevent overlapping x-axis labels in sns.countplot

For the plot
sns.countplot(x="HostRamSize",data=df)
I got the following graph, where the x-axis labels overlap. How do I avoid this? Should I change the size of the graph to solve this problem?
Having a DataFrame ds like this
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np; np.random.seed(136)
l = "1234567890123"
categories = [ l[i:i+5]+" - "+l[i+1:i+6] for i in range(6)]
x = np.random.choice(categories, size=1000,
                     p=np.diff(np.array([0,0.7,2.8,6.5,8.5,9.3,10])/10.))
ds = pd.DataFrame({"Column" : x})
there are several options to make the axis labels more readable.
Change figure size
plt.figure(figsize=(8,4)) # this creates a figure 8 inch wide, 4 inch high
sns.countplot(x="Column", data=ds)
plt.show()
Rotate the ticklabels
ax = sns.countplot(x="Column", data=ds)
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right")
plt.tight_layout()
plt.show()
Decrease Fontsize
ax = sns.countplot(x="Column", data=ds)
ax.set_xticklabels(ax.get_xticklabels(), fontsize=7)
plt.tight_layout()
plt.show()
Of course any combination of those would work equally well.
Setting rcParams
The figure size and the xlabel fontsize can be set globally using rcParams
plt.rcParams["figure.figsize"] = (8, 4)
plt.rcParams["xtick.labelsize"] = 7
This might be useful to put at the top of a Jupyter notebook so that the settings apply to every figure generated within it. Unfortunately, rotating the xticklabels is not possible using rcParams.
It's worth noting that the same strategies naturally also apply to seaborn barplot, matplotlib bar plots, or pandas plot.bar.
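If changing rcParams globally is too broad, the same settings can be limited to one block of code. The sketch below assumes matplotlib's plt.rc_context context manager and uses made-up labels and counts; because rotation cannot be set through rcParams, it is applied per axes with tick_params.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical categories and counts for illustration
labels = ["12345 - 23456", "23456 - 34567", "34567 - 45678"]
counts = [70, 210, 370]

before = plt.rcParams["xtick.labelsize"]  # remember the global setting

# Settings apply only to figures created inside the with-block
with plt.rc_context({"figure.figsize": (8, 4), "xtick.labelsize": 7}):
    fig, ax = plt.subplots()
    ax.bar(labels, counts)
    ax.tick_params(axis="x", labelrotation=40)  # rotation is set per axes

size_inside = tuple(fig.get_size_inches())
```

After the with-block ends, the global rcParams are restored, so later figures are unaffected.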
You can rotate the x labels and increase their font size using the xticks function of matplotlib.pyplot.
For Example:
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(10,5))
chart = sns.countplot(x="HostRamSize",data=df)
plt.xticks(
    rotation=45,
    horizontalalignment='right',
    fontweight='light',
    fontsize='x-large'
)
For more such modifications you can refer to this link:
Drawing from Data
If you just want to make sure xticks labels are not squeezed together, you can set a proper fig size and try fig.autofmt_xdate().
This function will automatically align and rotate the labels.
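A minimal sketch of the fig.autofmt_xdate() suggestion is shown below. It uses plain matplotlib bars with made-up RAM size labels rather than the original DataFrame; the method is intended for date labels, but it sets rotation and alignment on whatever tick labels the x axis has.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical categories and counts standing in for the countplot data
ram_sizes = ["4 GB", "8 GB", "16 GB", "32 GB", "64 GB"]
counts = [12, 40, 33, 9, 3]

fig, ax = plt.subplots(figsize=(10, 5))
ax.bar(ram_sizes, counts)

# Rotate and right-align all x tick labels in one call
fig.autofmt_xdate(rotation=40, ha="right")

rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```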
plt.figure(figsize=(15,10)) #adjust the size of plot
ax=sns.countplot(x=df['Location'],data=df,hue='label',palette='mako')
ax.set_xticklabels(ax.get_xticklabels(), rotation=40, ha="right") #it will rotate text on x axis
plt.tight_layout()
plt.show()
You can try this code and change the size and rotation to suit your needs.
I don't know whether it is an option for you, but rotating the plot could be a solution (instead of plotting on x=, plot on y=):
sns.countplot(y="HostRamSize",data=df)

matplotlib - zebra-stripe a figure's background color?

I'm building a simple line chart with matplotlib, and I'd like to zebra-stripe the background of the chart, so that each alternating row is colored differently. Is there a way to do this?
My chart already has gridding, and has major ticks only.
Edit: The code from my comment below, but more legible:
yTicks = ax.get_yticks()[:-1]
xTicks = ax.get_xticks()
ax.barh(yTicks, [max(xTicks)-min(xTicks)] * len(yTicks),
        height=(yTicks[1]-yTicks[0]), left=min(xTicks), color=['w','#F0FFFF'])
Here's a quick hack that uses a barchart (axes.barh) to simulate striping.
import matplotlib.pyplot as plt
# initial plot
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([1,2,3,4,5])
yTickPos,_ = plt.yticks()
yTickPos = yTickPos[:-1] #slice off the last as it is the top of the plot
# create bars at yTickPos that are the length of our greatest xtick and have a height equal to our tick spacing
ax.barh(yTickPos, [max(plt.xticks()[0])] * len(yTickPos), height=(yTickPos[1]-yTickPos[0]), color=['g','w'])
plt.show()
Produces:
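An alternative to the bar chart hack, sketched here as an assumption rather than as part of the answer above, is to shade every other band between major y ticks with ax.axhspan. The stripe color is taken from the question's edit; the axis limits are saved and restored so the stripes do not change the view.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4, 5])

ticks = ax.get_yticks()   # major tick positions define the band edges
ylim = ax.get_ylim()      # remember the view before adding stripes

stripes = []
for i, (low, high) in enumerate(zip(ticks[:-1], ticks[1:])):
    if i % 2 == 0:  # color every other band
        stripes.append(ax.axhspan(low, high, color="#F0FFFF", zorder=0))

ax.set_ylim(ylim)  # restore the original limits
```

Because axhspan spans the full axes width by default, no x tick arithmetic is needed.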
