Combine Bar and Line subplots in pandas - python

I have 5 time series that I want to graph in a subplot. Essentially I've been using subplotting:
fig, axes = plt.subplots(nrows=5, ncols=1, figsize=(16,10), sharex=True)
xlim = (start, end)
ax1=df.hr.plot(ax=axes[0], color='green', xlim=xlim)
ax2=df.act.plot(ax=axes[1], color='orange', xlim=xlim)
ax3=df.rr.plot(ax=axes[2], color='blue', xlim=xlim)
ax4=df2.set_index('timestamp').rmssd.plot(color='purple', ax=axes[3], xlim=xlim)
ax5=ma_df.tz_convert('US/Eastern')['any_act'].resample('10Min', how='count').plot(kind='line',ax=axes[4])
Which produces
Due to the nature of the data, I want to visualize the last subplot as bar chart. So naturally, I changed the last line to:
ax5=ma_df.tz_convert('US/Eastern')['any_act'].resample('10Min', how='count').plot(kind='bar',ax=axes[4])
Which then creates the following figure:
This produces what I expect in the last subplot but makes the other plots useless, which is not what I want.
How can I combine the 4 line time series with one bar chart in the same plot, but different subplots, all sharing the same x-axis?
In other words, I want the first 4 subplots to look like the first image and the last subplot like the second image.
Update
I made a simple example, which unfortunately works as expected, and does not replicate my problem, which is even more baffling. Code is below
import pandas as pd
from matplotlib import pyplot as plt
%matplotlib inline
df = pd.read_csv('https://s3.amazonaws.com/temp-leonsas-qsaeamu0sl5v4b/df.csv')
bar_df = pd.read_csv('https://s3.amazonaws.com/temp-leonsas-qsaeamu0sl5v4b/bar_df.csv')
fig, axes = plt.subplots(nrows=4, ncols=1, figsize=(16,10), sharex=True)
ax1=df.hr.plot(ax=axes[0], color='green', kind='line')
ax2=df.act.plot(ax=axes[1], color='orange', kind='line')
ax3=df.rr.plot(ax=axes[2], color='blue', kind='line')
ax4=bar_df.occ_count.plot(ax=axes[3], kind='bar')
Whereas the code in my codebase which replicates the problem is
fig, axes = plt.subplots(nrows=4, ncols=1, figsize=(16,10), sharex=True)
ax1=df.hr.plot(ax=axes[0], color='green', kind='line')
ax2=df.act.plot(ax=axes[1], color='orange', kind='line')
ax3=df.rr.plot(ax=axes[2], color='blue', kind='line')
ax4=bar_df.occ_count.plot(ax=axes[3], kind='bar')
The main difference is that in my codebase the DataFrames are being generated and not just loaded up from s3. Is there an implicit config inside a DataFrame that can somehow make this happen? I just used df.to_csv to dump those 2 dataframes into S3.

I think you just need to explicitly pass kind='line' to the first three plots. Here's a simpler example:
import matplotlib.pyplot as plt
import pandas as pd
%matplotlib inline
s = pd.Series([1,2,3,2,1])
fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(16,10), sharex=True)
s.plot(ax=axes[0], color='green', kind='line')
s.plot(ax=axes[1], color='red', kind='bar')
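If the bar subplot still distorts the shared x-axis, one possible cause is that pandas draws kind='bar' at integer positions 0, 1, 2, ... rather than at the index values, so a shared datetime axis no longer lines up. A minimal sketch of a workaround is to draw the bars with Matplotlib's Axes.bar directly on the datetime index. The data here is made up: idx, line_s and counts are hypothetical stand-ins for the asker's frames.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in data: one day sampled every 10 minutes.
idx = pd.date_range("2024-01-01", periods=144, freq="10min")
line_s = pd.Series(np.sin(np.linspace(0, 6, len(idx))), index=idx)
counts = pd.Series(np.arange(len(idx)) % 5, index=idx)

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(16, 6), sharex=True)

# Line subplot drawn on the real datetime values.
axes[0].plot(line_s.index, line_s.values, color="green")

# Bar subplot drawn on the same datetime values.
# On a date axis a numeric width is measured in days, so 10 minutes is 10 / (24 * 60).
bar_width_days = 10 / (24 * 60)
axes[1].bar(counts.index, counts.values, width=bar_width_days,
            align="edge", color="purple")
```

Because both subplots now use the same datetime coordinates, sharex=True keeps their limits in sync.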

Related

Put shared axis labels to upper plot

from matplotlib import pyplot as plt
fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)
fig.show()
Returns this figure:
But I want the x-axis labels below the first plot, not the second, like shown below. How can I achieve this?
The official reference has an example of this, which I followed: in tick_params, set labelbottom to False.
import matplotlib.pyplot as plt
ax0 = plt.subplot(211)
ax1 = plt.subplot(212, sharex=ax0, sharey=ax0)
#plt.plot([],[])
plt.tick_params('x', labelbottom=False)
#print(ax1.get_xticks())
plt.show()
The answer from @r-beginners brought me to a solution that also works when using the plt.subplots shortcut instead of instantiating each axis separately.
from matplotlib import pyplot as plt
fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)
plt.tick_params('x', labelbottom=False, labeltop=True)
fig.show()
The essential part is plt.tick_params, which takes the keyword arguments labeltop and labelbottom (as well as labelleft and labelright for shared axes across several columns) to select or deselect the labels on each side individually.
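The same effect can be written against the Axes objects themselves, which avoids relying on which subplot pyplot treats as current. A minimal sketch, assuming the goal is simply tick labels under the upper plot and none under the lower one:

```python
import matplotlib.pyplot as plt

fig, (ax0, ax1) = plt.subplots(nrows=2, sharex=True)

# sharex=True hides the x tick labels of the upper plot; switch them back on.
ax0.tick_params(axis="x", labelbottom=True)
# Hide the x tick labels of the lower plot.
ax1.tick_params(axis="x", labelbottom=False)
```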

Unable to set xlabel when Pandas creates scatter plot in multiple subplots with color label

I'm unable to set the xlabel of plots when I use Pandas to make scatter plots inside multiple subplots and specify the c color attribute column.
For example, this works fine: I can see the xlabel:
import numpy as np
import pandas as pd
import pylab as plt
plt.ion()
foo = pd.DataFrame(np.random.randn(5, 5), columns='a b c d e'.split())
fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
foo.plot.scatter(x='a', y='b', ax=ax1)
foo.plot.scatter(x='a', y='c', ax=ax2)
ax2.set_xlabel('xxx') # works
However, the following slight twist, where I set the color c field, does not set the xlabel:
fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
foo.plot.scatter(x='a', y='b', ax=ax1, c='c')
foo.plot.scatter(x='a', y='c', ax=ax2, c='c')
ax2.set_xlabel('xx') # NO x label
plt.xlabel doesn't work either. ax2.get_xlabel() returns the "xx" that I expect, but it's not visible:
How can I get an xlabel in this case? The Pandas GitHub repo has 3000+ open issues, so rather than filing this as a bug, I'd rather find a Matplotlib-oriented workaround to render an xlabel. (Python 3.8.1, Pandas 1.0.3, Matplotlib 3.2.0.)
Edit: moved from numeric column names to textual column names since it was causing confusion.
The visibility of the xlabel is being set to False for some reason. To get around it, set it back to True explicitly:
fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)
foo.plot.scatter(x='a', y='b', ax=ax1, c='c')
foo.plot.scatter(x='a', y='c', ax=ax2, c='c')
ax2.set_xlabel('xxx')
ax2.xaxis.get_label().set_visible(True)
This makes the x label visible again.
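Since the question asks for a Matplotlib-oriented workaround, another option is to bypass pandas' scatter wrapper and call Axes.scatter directly, adding the colorbar by hand. This is a minimal sketch under the assumption that a per-subplot colorbar is wanted; foo mirrors the question's frame, and the axis labels other than 'xxx' are illustrative.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

foo = pd.DataFrame(np.random.randn(5, 5), columns="a b c d e".split())

fig, (ax1, ax2) = plt.subplots(nrows=2, sharex=True)

# Color each point by column 'c' and attach a colorbar to the matching subplot.
sc1 = ax1.scatter(foo["a"], foo["b"], c=foo["c"])
fig.colorbar(sc1, ax=ax1, label="c")
sc2 = ax2.scatter(foo["a"], foo["c"], c=foo["c"])
fig.colorbar(sc2, ax=ax2, label="c")

ax1.set_ylabel("b")
ax2.set_ylabel("c")
ax2.set_xlabel("xxx")  # set through Matplotlib only, so the label stays visible
```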

Adjust y-axis in Seaborn multiplot

I'm plotting a CSV file from my simulation results. The plot has three graphs in the same figure fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6)).
However, for comparison purposes I want the y-axis in every graph to start at zero and end at a specific value. I tried the solution mentioned here from the Seaborn author. I don't get any errors, but the solution does not work for me either.
Here's my script:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
fname = 'results/filename.csv'
def plot_file():
    fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6))
    df = pd.read_csv(fname, sep='\t')
    profits = \
        df.groupby(['providerId', 'periods'], as_index=False)['profits'].sum()
    # y-axis needs to start at zero and end at 10
    g = sns.lineplot(x='periods',
                     y='profits',
                     data=profits,
                     hue='providerId',
                     legend='full',
                     ax=axes[0])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='price',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[1])
    # y-axis need to start at zero and end at one
    g = sns.scatterplot(x='periods',
                        y='quality',
                        hue='providerId',
                        style='providerId',
                        data=df,
                        legend=False,
                        ax=axes[2])
    g.set(ylim=(0, None))
    plt.show()
    print(g) # -> AxesSubplot(0.672059,0.11;0.227941x0.77)
The resulting figure is as follows:
How can I adjust each individual plot?
Based on the way you've written your code, you can refer to each subplot's axes with g.axes and use g.axes.set_ylim(low, high). (A difference compared to the linked answer is that your graphs are not being plotted on a seaborn FacetGrid.)
An example using dummy data and different axis ranges to illustrate:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
df = pd.DataFrame(np.random.uniform(0,10,(100,2)), columns=['a','b'])
fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(8,4))
g = sns.lineplot(x='a',
                 y='b',
                 data=df.sample(10),
                 ax=axes[0])
g.axes.set_ylim(0,25)
g = sns.scatterplot(x='a',
                    y='b',
                    data=df.sample(10),
                    ax=axes[1])
g.axes.set_ylim(0,3.5)
g = sns.scatterplot(x='a',
                    y='b',
                    data=df.sample(10),
                    ax=axes[2])
g.axes.set_ylim(0,0.3)
plt.tight_layout()
plt.show()
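Because seaborn's axes-level functions draw onto the Axes passed through ax=, the limits can also be set on the entries of axes directly. A minimal sketch using plain Matplotlib calls and made-up data; the limits 0 to 10 and 0 to 1 come from the comments in the question's script, and everything else is a placeholder:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
x = np.arange(20)

fig, axes = plt.subplots(nrows=1, ncols=3, figsize=(24, 6))
axes[0].plot(x, rng.uniform(0, 10, x.size))    # stands in for the profits lineplot
axes[1].scatter(x, rng.uniform(0, 1, x.size))  # stands in for the price scatterplot
axes[2].scatter(x, rng.uniform(0, 1, x.size))  # stands in for the quality scatterplot

# One (low, high) pair per subplot, applied in order.
y_limits = [(0, 10), (0, 1), (0, 1)]
for ax, (low, high) in zip(axes, y_limits):
    ax.set_ylim(low, high)
```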

How can I make a barplot and a lineplot in the same seaborn plot with different Y axes nicely?

I have two different sets of data with a common index, and I want to represent the first one as a barplot and the second one as a lineplot in the same graph. My current approach is similar to the following.
ax = pt.a.plot(alpha = .75, kind = 'bar')
ax2 = ax.twinx()
ax2.plot(ax.get_xticks(), pt.b.values, alpha = .75, color = 'r')
And the result is similar to this
This image is really nice and almost right. My only problem is that ax.twinx() seems to create a new canvas on top of the previous one, and the white lines are clearly seen on top of the barplot.
Is there any way to plot this without including the white lines?
You can use the twinx() method along with seaborn to create a separate y-axis, one for the lineplot and the other for the barplot. To control the style of the plot (seaborn's default style is darkgrid), use set_style and specify the preferred theme, for example white or whitegrid. In the code below, matplotlib.rc_file_defaults() restores Matplotlib's default look (white background, no gridlines), and set_style(style=None) leaves it unchanged. To customize the gridlines further, work at the axis level, for example with ax2.grid(False).
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
matplotlib.rc_file_defaults()
sns.set_style(style=None, rc=None)  # returns None, so there is nothing to assign
fig, ax1 = plt.subplots(figsize=(12,6))
sns.lineplot(data = df['y_var_1'], marker='o', sort = False, ax=ax1)
ax2 = ax1.twinx()
sns.barplot(data = df, x='x_var', y='y_var_2', alpha=0.5, ax=ax2)
You have to remove the grid lines of the second axis by adding ax2.grid(False) to the code. However, the y-ticks of the second axis will not align with the y-ticks of the first axis, as in this example:
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import pandas as pd
fig = plt.figure()
ax1 = fig.add_subplot(111)
ax1.plot(pd.Series(np.random.uniform(0,1,size=10)), color='g')
ax2 = ax1.twinx()
ax2.plot(pd.Series(np.random.uniform(0,17,size=10)), color='r')
ax2.grid(False)
plt.show()
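If the misaligned ticks are a problem, one possible approach (not part of the answer above) is to give both axes the same number of evenly spaced ticks across their current limits, so the ticks sit at the same heights. A minimal sketch with made-up data; n_ticks is an arbitrary choice:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

fig, ax1 = plt.subplots()
ax1.plot(rng.uniform(0, 1, size=10), color="g")
ax2 = ax1.twinx()
ax2.plot(rng.uniform(0, 17, size=10), color="r")
ax2.grid(False)

# Place the same number of ticks at the same relative heights on both axes.
n_ticks = 6
ax1.set_yticks(np.linspace(*ax1.get_ylim(), n_ticks))
ax2.set_yticks(np.linspace(*ax2.get_ylim(), n_ticks))
```

The tick values produced this way are generally not round numbers. Setting round limits with set_ylim on each axis before computing the ticks is one way to tidy them up.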

Why do I get an additional empty plot in matplotlib?

I have the following code in my IPython notebook:
import matplotlib.pyplot as plt
plt.setp(plt.xticks()[1], rotation=45)
plt.figure(figsize=(17, 10)) # <--- This is the problematic line!!!!!!!!!!!!!
plt.plot_date(df['date'],df['x'], color='black', linestyle='-')
plt.plot_date(df['date'],df['y'], color='red', linestyle='-')
plt.plot_date(df['date'],df['z'], color='green', linestyle='-')
In the above example df is pandas data frame.
Without the marked line (containing figsize) the plot is too small. With that line the image is as large as I want, but an additional empty plot appears before it.
Does anybody know why this happens and how it can be resolved?
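Try reversing the first two lines after the import. The plt.xticks() call inside plt.setp opens a figure of its own, and plt.figure then creates a second one, which leaves the first one empty.
Here's how I would do this: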
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(17, 10))
plt.setp(plt.xticks()[1], rotation=45)
ax.plot_date(df['date'],df['x'], color='black', linestyle='-')
ax.plot_date(df['date'],df['y'], color='red', linestyle='-')
ax.plot_date(df['date'],df['z'], color='green', linestyle='-')
It's good practice to explicitly create and operate on your Figure and Axes objects.
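Taking that advice one step further, a minimal sketch that never touches pyplot's implicit current figure might look like this. The DataFrame is a hypothetical stand-in for the asker's df, and Axes.plot with tick_params(labelrotation=...) is swapped in here for plot_date and plt.setp:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data with the same column names as in the question.
dates = pd.date_range("2024-01-01", periods=30, freq="D")
df = pd.DataFrame({
    "date": dates,
    "x": np.arange(30),
    "y": np.arange(30) * 2,
    "z": np.arange(30) ** 0.5,
})

# Create the only figure up front, then work on its Axes.
fig, ax = plt.subplots(figsize=(17, 10))
ax.plot(df["date"], df["x"], color="black", linestyle="-")
ax.plot(df["date"], df["y"], color="red", linestyle="-")
ax.plot(df["date"], df["z"], color="green", linestyle="-")

# Rotate the x tick labels through the Axes, not through plt.xticks().
ax.tick_params(axis="x", labelrotation=45)
```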
