Same scale for twinx() combo plot with seaborn - python

Let's use the classic example of weekly precipitation:
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt
from random import randint
data = {
'Week': [i for i in range(1,9)],
'Weekly Precipitation': [randint(1,10) for i in range(1,9)]
}
df = pd.DataFrame(data)
Let's also add a column with the cumulative precipitation:
df['Cumulative'] = df['Weekly Precipitation'].expanding(min_periods=2).sum()
Now, let's say I want a single chart with a barplot for the weekly precipitation, and a lineplot with the cumulative precipitation. So I do this:
fig, ax1 = plt.subplots(figsize=(10,5))
sns.barplot(x='Week', y='Weekly Precipitation', data=df, ax=ax1)
ax2 = ax1.twinx()
sns.lineplot(x='Week', y='Cumulative', data=df, ax=ax2)
Which yields this plot:
And you can see the problem: although both series are commensurate, the two y axes use different scales, which distorts the visualization, as the line should never be lower than the bars.
So, instead of twin axes, I'm trying to put both plots on the same axis:
fig, ax1 = plt.subplots(figsize=(10,5))
ax1.set_facecolor('white')
sns.barplot(x='Week', y='Weekly Precipitation', data=df, ax=ax1)
sns.lineplot(x='Week', y='Cumulative', data=df, ax=ax1)
ax1.set_ylabel('Precipitation')
Now, of course, the scale is right (although I have to make do with a single y label), but... the second plot is shifted to the right by one tick!
How does that even make sense?!
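One plausible explanation, in line with how seaborn's barplot treats its x variable as categorical: the bars sit at positions 0 to 7 and are only labeled 1 to 8, whereas lineplot places its points at the numeric week values 1 to 8. The following is a minimal sketch under that assumption, using fixed illustrative values instead of the random ones above; it draws the line at the bar positions on the same axis.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Fixed illustrative values (an assumption, replacing the random data)
df = pd.DataFrame({
    "Week": list(range(1, 9)),
    "Weekly Precipitation": [3, 7, 2, 9, 4, 6, 1, 8],
})
df["Cumulative"] = df["Weekly Precipitation"].cumsum()

fig, ax1 = plt.subplots(figsize=(10, 5))
sns.barplot(x="Week", y="Weekly Precipitation", data=df, ax=ax1)

# Bars are drawn at categorical positions 0..n-1, so reuse those positions
positions = np.arange(len(df))
(line,) = ax1.plot(positions, df["Cumulative"], color="tab:red", marker="o")
ax1.set_ylabel("Precipitation")

# Centers of the drawn bars, for comparison with the line's x positions
bar_centers = sorted(p.get_x() + p.get_width() / 2 for p in ax1.patches)
```

If twin axes are preferred, the same positions could be reused on the second axis, with one shared (bottom, top) pair passed to both ax1.set_ylim and ax2.set_ylim so that the two y scales match.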

Related

How to properly plot a line over bars?

This one used to work fine, but somehow it stopped working (I must have changed something mistakenly but I can't find the issue).
I'm plotting a set of 3 bars per date, plus a line that shows the accumulated value of one of them. But only one of them (either the bars or the line) is plotted properly. If I leave the code for the bars last, only the bars are plotted. If I leave the code for the line last, only the line is plotted.
fig, ax = plt.subplots(figsize = (15,8))
df.groupby("date")["result"].sum().cumsum().plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[selected_columns].sum().plot(
ax=ax,
kind="bar",
color=["blue", "red", "gold"])
ax.legend(["LINE", "X", "Y", "Z"])
Appreciate the help!
Pandas draws bar plots on a categorical x-axis: the bars are positioned internally at 0, 1, 2, ... and the tick labels are set afterwards. The line plot, by contrast, uses the dates as x values. To combine them, both need to use the categorical positions. The easiest way is to drop the index from the line plot. Make sure the line plot is drawn first, so that the bar plot can set the tick labels correctly.
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
df = pd.DataFrame({'date': pd.date_range('20210101', periods=10),
'earnings': np.random.randint(100, 600, 10),
'costs': np.random.randint(0, 200, 10)})
df['result'] = df['earnings'] - df['costs']
fig, ax = plt.subplots(figsize=(15, 8))
df.groupby("date")["result"].sum().cumsum().reset_index(drop=True).plot(
ax=ax,
marker='D',
lw=2,
color="purple")
df.groupby("date")[['earnings', 'costs', 'result']].sum().plot(
ax=ax,
kind="bar",
rot=0,
width=0.8,
color=["blue", "red", "gold"])
ax.legend(['Cumul.result', 'earnings', 'costs', 'result'])
# shorten the tick labels to only the date
ax.set_xticklabels([tick.get_text()[:10] for tick in ax.get_xticklabels()])
ax.set_ylim(ymin=0) # bar plots are nicer when bars start at zero
plt.tight_layout()
plt.show()
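To see the positions described above, the x values can be inspected directly. The following is a small sketch with made-up numbers (the values are illustrative assumptions): it reads the x data of the line drawn after reset_index(drop=True) and the tick positions set by the bar plot, so the two can be compared.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Made-up values for illustration
df = pd.DataFrame({
    "date": pd.date_range("20210101", periods=4),
    "earnings": [300, 450, 200, 500],
    "costs": [100, 150, 50, 120],
})
df["result"] = df["earnings"] - df["costs"]
daily = df.groupby("date")[["earnings", "costs", "result"]].sum()

fig, ax = plt.subplots()
# Line first, on a plain 0..n-1 index instead of the date index
daily["result"].cumsum().reset_index(drop=True).plot(
    ax=ax, marker="D", color="purple")
line_x = list(ax.lines[0].get_xdata())

# Bars second, so that the bar plot sets the tick labels
daily.plot(ax=ax, kind="bar", rot=0)
bar_ticks = list(ax.get_xticks())

print(line_x, bar_ticks)
```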
Here I post the solution:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
a=[11.3,222,22, 63.8,9]
b=[0.12,-1.0,1.82,16.67,6.67]
l=[i for i in range(5)]
plt.rcParams['font.sans-serif']=['SimHei']
fmt='%.1f%%'
yticks = mtick.FormatStrFormatter(fmt)
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(l, b,'og-',label=u'A')
ax1.yaxis.set_major_formatter(yticks)
for i,(_x,_y) in enumerate(zip(l,b)):
    plt.text(_x,_y,b[i],color='black',fontsize=8,)
ax1.legend(loc=1)
ax1.set_ylim([-20, 30])
ax1.set_ylabel('ylabel')
plt.legend(prop={'family':'SimHei','size':8})
ax2 = ax1.twinx()
plt.bar(l,a,alpha=0.1,color='blue',label=u'label')
ax2.legend(loc=2)
plt.legend(prop={'family':'SimHei','size':8},loc="upper left")
plt.show()
The key to this is the command
ax2 = ax1.twinx()
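As a rough illustration of what twinx() does, here is a pared-down sketch (data values reused from the snippet above; the labels are placeholders). The second axes shares the x axis with the first but keeps its own y scale, and the legend entries of both axes are merged into a single legend.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick

bars = [11.3, 222, 22, 63.8, 9]
pct = [0.12, -1.0, 1.82, 16.67, 6.67]
x = list(range(5))

fig, ax1 = plt.subplots()
ax1.plot(x, pct, "og-", label="A")
ax1.yaxis.set_major_formatter(mtick.FormatStrFormatter("%.1f%%"))
ax1.set_ylim(-20, 30)

ax2 = ax1.twinx()  # new axes: same x axis, independent y axis
ax2.bar(x, bars, alpha=0.1, color="blue", label="label")

# Collect handles from both axes so one legend lists line and bars together
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax2.legend(h1 + h2, l1 + l2, loc="upper left")
```

Calling the plotting methods on ax1 and ax2 explicitly, rather than through plt, makes it clear which y axis each series belongs to.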

Seaborn align plots in subplots

I'm using Seaborn to plot 3 graphs. I would like to know how I could vertically align the different plots.
This is my plot so far:
And this is my code:
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np
flatui = ["#636EFA", "#EF553B", "#00CC96", "#AB63FA"]
fig, ax = plt.subplots(figsize=(17, 7))
plot=sns.lineplot(ax=ax,x="number of weeks", y="avg streams", hue="year", data=df, palette=flatui)
ax.yaxis.set_major_formatter(ticker.FuncFormatter(lambda x, pos: '{:,.2f}'.format(x/1000) + 'K'))
plot.set(title='Streams trend')
plot.xaxis.set_major_locator(ticker.MultipleLocator(2))
fig, ax =plt.subplots(1,2, figsize=(17,7))
plot = sns.barplot(x="Artist", y="Releases", data = result.head(10), ax=ax[0])
plot.set_xticklabels(
plot.get_xticklabels(),
rotation=90,
horizontalalignment='center',
fontweight='light',
fontsize='x-large'
)
plot=sns.barplot(x="Artist", y="Streams", data = result.head(10), ax=ax[1])
plot.set_xticklabels(
plot.get_xticklabels(),
rotation=90,
horizontalalignment='center',
fontweight='light',
fontsize='x-large'
)
Basically I create a figure where I plot the trend graph and then a figure with 2 subplots where I plot my 2 bar plots.
What I would like to do is to align the trend plot and the 2 barplots. As you might notice on the left, the trend plot and the first barplot are not aligned, I would like to make the two figures start from the same point (like at the ending of the trend plot and the second barplot, where the 2 graphs are aligned).
How could I do that?
Here is a solution using GridSpec:
import matplotlib.gridspec
import matplotlib.pyplot as plt
fig = plt.figure()
gs0 = matplotlib.gridspec.GridSpec(2,2, figure=fig)
ax1 = fig.add_subplot(gs0[0,:])
ax2 = fig.add_subplot(gs0[1,0])
ax3 = fig.add_subplot(gs0[1,1])
sns.lineplot(ax=ax1, ...)
sns.barplot(ax=ax2, ...)
sns.barplot(ax=ax3, ...)
If you have a recent version of matplotlib (3.3 or later), you can also use subplot_mosaic, the newer semantic figure composition interface:
axd = plt.figure(constrained_layout=True).subplot_mosaic(
"""
AA
BC
"""
)
sns.lineplot(ax=axd['A'], ...)
sns.barplot(ax=axd['B'], ...)
sns.barplot(ax=axd['C'], ...)
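For reference, here is a self-contained sketch of the GridSpec layout with placeholder data (the plotted values and axes names are assumptions standing in for the streams and artists data). It also reads back the axes positions, which shows whether the wide top plot starts and ends at the same horizontal positions as the two plots below it.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

fig = plt.figure(figsize=(17, 14))
gs = GridSpec(2, 2, figure=fig)
ax_trend = fig.add_subplot(gs[0, :])  # top row spans both columns
ax_left = fig.add_subplot(gs[1, 0])
ax_right = fig.add_subplot(gs[1, 1])

# Placeholder data (assumption) in place of the real DataFrames
ax_trend.plot(range(10), [v * v for v in range(10)])
ax_left.bar(["a", "b", "c"], [3, 1, 2])
ax_right.bar(["a", "b", "c"], [30, 10, 20])

# Positions in figure coordinates: x0 is the left edge, x1 the right edge
trend_box = ax_trend.get_position()
left_box = ax_left.get_position()
right_box = ax_right.get_position()
print(trend_box.x0, left_box.x0, trend_box.x1, right_box.x1)
```

Because all three axes live in one figure and one grid, their edges are derived from the same layout, unlike two separate plt.subplots calls.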

Adjust y-axis in Seaborn multiplot

I'm plotting a CSV file from my simulation results. The plot has three graphs in the same figure fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6)).
However, for comparison purposes I want the y-axis in all graphs to start at zero and end at a specific value. I tried the solution mentioned here from the Seaborn author. I don't get any errors, but the solution does not work for me either.
Here's my script:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
fname = 'results/filename.csv'
def plot_file():
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6))
    df = pd.read_csv(fname, sep='\t')
    profits = \
        df.groupby(['providerId', 'periods'], as_index=False)['profits'].sum()
    # y-axis needs to start at zero and end at 10
    g = sns.lineplot(x='periods',
                     y='profits',
                     data=profits,
                     hue='providerId',
                     legend='full',
                     ax=axes[0])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='price',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[1])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='quality',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[2])
    g.set(ylim=(0, None))
    plt.show()
    print(g) # -> AxesSubplot(0.672059,0.11;0.227941x0.77)
The resulting figure is as follows:
How can I adjust each individual plot?
Based on the way you've written your code, you can refer to each subplot axis with g.axes and use g.axes.set_ylim(low,high). (A difference compared to the linked answer is that your graphs are not being plotted on a seaborn FacetGrid.)
An example using dummy data and different axis ranges to illustrate:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.uniform(0,10,(100,2)), columns=['a','b'])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(8,4))
g = sns.lineplot(x='a',
y='b',
data=df.sample(10),
ax=axes[0])
g.axes.set_ylim(0,25)
g = sns.scatterplot(x='a',
y='b',
data=df.sample(10),
ax=axes[1])
g.axes.set_ylim(0,3.5)
g = sns.scatterplot(x='a',
y='b',
data=df.sample(10),
ax=axes[2])
g.axes.set_ylim(0,0.3)
plt.tight_layout()
plt.show()
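Since each seaborn axes-level function draws on the matplotlib Axes passed through ax=, the limits can also be set on the entries of axes directly. Here is a small sketch with plain matplotlib and made-up data (the limit values are illustrative, not taken from the question's CSV):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(8, 4))

# One (bottom, top) pair per subplot; None leaves that side as it is
# (here, the value chosen by autoscaling)
limits = [(0, 10), (0, 1), (0, None)]
for ax, (low, high) in zip(axes, limits):
    ax.scatter(rng.uniform(0, 10, 20), rng.uniform(0.1, 0.9, 20))
    ax.set_ylim(low, high)

final_limits = [ax.get_ylim() for ax in axes]
plt.tight_layout()
```

Setting the limits after plotting keeps the loop simple, because the upper bound left as None then reflects the data that was just drawn.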

Draw a mean indexed bar chart?

How can I draw the following graph, showing the difference against the average, using matplotlib, seaborn, Plotly, or any other framework?
I have found that some call this plot a mean-indexed bar chart. Using seaborn, it can be done with code like the following:
import seaborn as sns
import matplotlib.pyplot as plt
sns.set(style="white", context="talk")
f, ax1 = plt.subplots(figsize=(7, 5), sharex=True)
mean = df["your column"].mean()
y2 = mean - df["your column"]
sns.barplot(x=df.index, y=y2, palette="deep", ax=ax1)
ax1.axhline(0, color="k", clip_on=False)
ax1.set_ylabel("Diverging")
# Finalize the plot
sns.despine(bottom=True)
plt.setp(f.axes, yticks=[])
plt.tight_layout(h_pad=2)
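The same idea can be sketched with plain matplotlib and a handful of hypothetical values (the labels and numbers below are made up). Note that the sign convention here is value minus mean, the opposite of the snippet above, so bars above the zero line are above average.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

labels = ["A", "B", "C", "D", "E"]              # hypothetical categories
values = np.array([12.0, 7.0, 15.0, 9.0, 7.0])  # hypothetical values
deviation = values - values.mean()               # positive = above average

fig, ax = plt.subplots(figsize=(7, 5))
# Color each bar by the sign of its deviation
colors = ["tab:green" if d >= 0 else "tab:red" for d in deviation]
ax.bar(labels, deviation, color=colors)
ax.axhline(0, color="k")  # the mean corresponds to the zero line
ax.set_ylabel("Difference from mean")
plt.tight_layout()
```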

How can I make a barplot and a lineplot in the same seaborn plot with different Y axes nicely?

I have two different sets of data with a common index, and I want to represent the first one as a barplot and the second one as a lineplot in the same graph. My current approach is similar to the following.
ax = pt.a.plot(alpha = .75, kind = 'bar')
ax2 = ax.twinx()
ax2.plot(ax.get_xticks(), pt.b.values, alpha = .75, color = 'r')
And the result is similar to this
This image is really nice and almost right. My only problem is that ax.twinx() seems to create a new canvas on top of the previous one, and the white lines are clearly seen on top of the barplot.
Is there any way to plot this without including the white lines?
You can use the twinx() method along with seaborn to create a separate y-axis, one for the lineplot and the other for the barplot. To control the style of the plot (seaborn's default style is darkgrid), you can use the set_style method and specify the preferred theme. Setting style=None resets to a white background without gridlines. You can also try whitegrid. To customize the gridlines further, you can do so at the axis level, for example with ax2.grid(False).
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
matplotlib.rc_file_defaults()
sns.set_style(style=None, rc=None)
fig, ax1 = plt.subplots(figsize=(12,6))
sns.lineplot(data = df['y_var_1'], marker='o', sort = False, ax=ax1)
ax2 = ax1.twinx()
sns.barplot(data = df, x='x_var', y='y_var_2', alpha=0.5, ax=ax2)
You have to remove the grid lines of the second axis by adding ax2.grid(False) to the code. However, the y-ticks of the second axis will not be aligned with the y-ticks of the first, as here:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(pd.Series(np.random.uniform(0,1,size=10)), color='g')
ax2 = ax1.twinx()
ax2.plot(pd.Series(np.random.uniform(0,17,size=10)), color='r')
ax2.grid(False)
plt.show()
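One possible way to make the two sets of y-ticks coincide, sketched here with fixed limits chosen for illustration, is to give both axes the same number of evenly spaced ticks across their own ranges. The helper tick_heights is a hypothetical name used only to compare the relative tick positions.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
fig, ax1 = plt.subplots()
ax1.plot(rng.uniform(0, 1, size=10), color="g")
ax2 = ax1.twinx()
ax2.plot(rng.uniform(0, 17, size=10), color="r")
ax2.grid(False)

# Same number of evenly spaced ticks on each axis, over fixed limits
n_ticks = 6
for ax, (low, high) in ((ax1, (0, 1)), (ax2, (0, 17))):
    ax.set_ylim(low, high)
    ax.set_yticks(np.linspace(low, high, n_ticks))

def tick_heights(ax):
    """Tick positions as fractions of the axis height (0 = bottom, 1 = top)."""
    low, high = ax.get_ylim()
    return (np.asarray(ax.get_yticks()) - low) / (high - low)
```

The trade-off is that the tick values on one of the axes may no longer be round numbers, as with the steps of 3.4 on the second axis here.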
