Subplot with two panels (plot and pie) - python

I would like to combine two graphs (panels) in one figure using subplot. I tried this script, but I didn't get what I wanted. I have one count plot and one pie chart. In the image, the plot should be on the left and the pie on the right. How can I do this?
import matplotlib.pyplot as plt
import seaborn as sns
fig1, ax1 = plt.subplots()
#pie
plt.subplot(2,1,1)
sns.countplot(df['data_si_o_no'])
#pie
ax1.pie(df['data_si_o_no'].value_counts(),
labels=['Not Disaster', 'Disaster'],
autopct='%1.2f%%',
shadow=True,
explode=(0.05, 0),
startangle=60)
fig1.suptitle('Distribution of the data', fontsize=24)
plt.subplot(2,2,1)

A few things to note:
To place two subplots next to each other, call plt.subplots(1, 2): the arguments are the number of rows and the number of columns.
plt.subplots(1, 2) returns an array of two axes, so ax1 is split into ax1[0] and ax1[1] for the two plots. I have used the name ax instead of ax1.
Because the count plot is drawn with seaborn, pass ax=ax[0] to sns.countplot so that it is drawn in the first panel.
For the matplotlib pie chart, call the method on the second axes: ax[1].pie(...). Writing ax[1] = plt.pie(...) draws on whatever the current axes happens to be and then overwrites ax[1] with the return value of pie.
With these changes, you should see the required plots. Note that, as no data was provided, I used dummy data (Female/Male) from the Titanic dataset instead.
Code
import matplotlib.pyplot as plt
import seaborn as sns
fig1, ax = plt.subplots(1, 2, figsize=(10, 5))  # 1 row, 2 columns
# count plot in the left panel
sns.countplot(x=df['data_si_o_no'], ax=ax[0])
# pie chart in the right panel
ax[1].pie(df['data_si_o_no'].value_counts(),
          labels=['male', 'female'],
          autopct='%1.2f%%',
          shadow=True,
          explode=(0.05, 0),
          startangle=60)
fig1.suptitle('Distribution of the data', fontsize=24)
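Since the original df was not posted, here is a minimal self-contained sketch of the same side-by-side layout using only matplotlib and pandas. The DataFrame contents and the names ax_bar and ax_pie are made up for illustration. The pie labels are taken from the value_counts() index, which is one way to keep labels and wedge sizes in the same order.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df['data_si_o_no']
df = pd.DataFrame({"data_si_o_no": ["no", "si", "no", "no", "si", "no"]})
counts = df["data_si_o_no"].value_counts()  # sorted by frequency, most common first

# 1 row, 2 columns: bar chart on the left, pie on the right
fig, (ax_bar, ax_pie) = plt.subplots(1, 2, figsize=(10, 5))
ax_bar.bar(counts.index, counts.values)
ax_bar.set_ylabel("count")
ax_pie.pie(counts,
           labels=counts.index,  # labels follow the order of the counts
           autopct="%1.2f%%",
           explode=(0.05, 0),
           startangle=60)
fig.suptitle("Distribution of the data", fontsize=24)
```

Call plt.show() or fig.savefig(...) afterwards to see the result.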
Plot

Related

Overlaying Pandas plot with Matplotlib is sensitive to the plotting order

I have the following problem: I'm trying to overlay two plots: one Pandas plot via plot.area() for a dataframe, and a second plot that is a standard Matplotlib plot. Depending on the order of the code for those two, the Matplotlib plot is displayed only if its code comes before the Pandas plot.area() on the same axes.
Example: I have a Pandas dataframe called revenue that has a DatetimeIndex and a single column with "revenue" values (float). Separately, I have a dataset called projection with data along the same index (revenue.index).
If the code looks like this:
import pandas as pd
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(10, 6))
# First -- Pandas area plot
revenue.plot.area(ax = ax)
# Second -- Matplotlib line plot
ax.plot(revenue.index, projection, color='black', linewidth=3)
plt.tight_layout()
plt.show()
Then the only thing displayed is the pandas plot.area() like this:
1/ Pandas plot.area() and 2/ Matplotlib line plot
However, if the order of the plotting is reversed:
fig, ax = plt.subplots(figsize=(10, 6))
# First -- Matplotlib line plot
ax.plot(revenue.index, projection, color='black', linewidth=3)
# Second -- Pandas area plot
revenue.plot.area(ax = ax)
plt.tight_layout()
plt.show()
Then the plots are overlayed properly, like this:
1/ Matplotlib line plot and 2/ Pandas plot.area()
Can someone please explain what I'm doing wrong, or what I need to do to make the code more robust? Thanks in advance.
The values on the x-axis are different in the two plots. I think DataFrame.plot.area() formats the DatetimeIndex in its own way, which is not compatible with pyplot.plot().
If you plot the projection first, plot.area() can still plot the data and does not reformat the x-axis.
Mixing the two seems tricky to me, so I would use either pyplot or DataFrame.plot for both the area and the line:
import pandas as pd
from matplotlib import pyplot as plt
projection = [1000, 2000, 3000, 4000]
datetime_series = pd.to_datetime(["2021-12","2022-01", "2022-02", "2022-03"])
datetime_index = pd.DatetimeIndex(datetime_series.values)
revenue = pd.DataFrame({"value": [1200, 2200, 2800, 4100]})
revenue = revenue.set_index(datetime_index)
fig, ax = plt.subplots(1, 2, figsize=(10, 4))
# Option 1: only pyplot
ax[0].fill_between(revenue.index, revenue.value)
ax[0].plot(revenue.index, projection, color='black', linewidth=3)
ax[0].set_title("Pyplot")
# Option 2: only DataFrame.plot
revenue["projection"] = projection
revenue.plot.area(y='value', ax=ax[1])
revenue.plot.line(y='projection', ax=ax[1], color='black', linewidth=3)
ax[1].set_title("DataFrame.plot")
The results then look like this, where DataFrame.plot gives a much cleaner looking result:
If you do not want the projection in the revenue DataFrame, you can put it in a separate DataFrame and set the index to match revenue:
projection_df = pd.DataFrame({"projection": projection})
projection_df = projection_df.set_index(datetime_index)
projection_df.plot.line(ax=ax[1], color='black', linewidth=3)
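If you do want to keep mixing the two, one thing that might be worth trying is the x_compat=True option of the pandas plotting functions, which asks pandas to skip its own time-series tick handling. The sketch below reuses the dummy data from above; I have not checked that it fixes the overlay on every pandas/matplotlib version, so treat it as an experiment rather than a guaranteed fix.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Same dummy data as in the answer above
index = pd.to_datetime(["2021-12-01", "2022-01-01", "2022-02-01", "2022-03-01"])
revenue = pd.DataFrame({"value": [1200, 2200, 2800, 4100]}, index=index)
projection = [1000, 2000, 3000, 4000]

fig, ax = plt.subplots(figsize=(10, 6))
# First the pandas area plot, with pandas' time-series axis handling disabled
revenue.plot.area(ax=ax, x_compat=True)
# Then the matplotlib line on the same axes
ax.plot(revenue.index, projection, color="black", linewidth=3)
fig.tight_layout()
```

The idea is that both calls then place the dates on the x-axis in matplotlib's own date units, so the plotting order should no longer matter.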

Superimposing plots in seaborn causes x-axis to misalign

I am having an issue trying to superimpose plots with seaborn. I am able to generate the two plots separately as
fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax2,width=0.4)
The output looks like this:
But the problem appears when I try to superimpose both plots by assigning both to the same ax object:
fig, (ax1,ax2) = plt.subplots(ncols=2,figsize=(30, 7))
sns.lineplot(data=data1, y='MSE',x='pct_gc',ax=ax1)
sns.boxplot(x="pct_gc", y="MSE", data=data2,ax=ax1,width=0.4)
I cannot work out why the x-axis of the line plot changes when the two plots are superimposed (in both separate plots the x-axis goes from 0 to 0.069).
My goal is for both plots to be superimposed, while keeping the same X axis range.
Seaborn's boxplot creates a categorical x-axis, with all boxes evenly spaced. Internally the positions on the x-axis are numbered 0, 1, 2, ..., and only the tick labels show the values from 0 to 0.069.
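To see the categorical positions for yourself, here is a minimal sketch with made-up data (three unevenly spaced pct_gc values). With seaborn's default settings, the tick positions should come out as 0, 1, 2 rather than 0.01, 0.02, 0.05, which is why a line plotted in real x units does not line up with the boxes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data: three unevenly spaced numeric x values
data = pd.DataFrame({
    "pct_gc": [0.01, 0.01, 0.02, 0.02, 0.05, 0.05],
    "MSE": [1.0, 2.0, 2.0, 3.0, 3.0, 5.0],
})

fig, ax = plt.subplots()
sns.boxplot(x="pct_gc", y="MSE", data=data, ax=ax)

# Positions used internally for the boxes (not the values shown as labels)
positions = [float(t) for t in ax.get_xticks()]
print(positions)
```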
To combine a line plot with a boxplot, matplotlib's boxplot can be called directly, so that positions and widths can be set explicitly. With patch_artist=True, a rectangle is created (instead of just lines), for which a facecolor can be given. manage_ticks=False prevents boxplot from changing the x ticks and their limits. Optionally, notch=True would accentuate the median a bit more, but depending on the data, the confidence interval might be too large and look weird.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
data1 = pd.DataFrame({'pct_gc': np.linspace(0, 0.069, 200), 'MSE': np.random.normal(0.02, 0.1, 200).cumsum()})
data1['pct_range'] = pd.cut(data1['pct_gc'], 10)
fig, ax1 = plt.subplots(ncols=1, figsize=(20, 7))
sns.lineplot(data=data1, y='MSE', x='pct_gc', ax=ax1)
for interval, color in zip(np.unique(data1['pct_range']), plt.cm.tab10.colors):
    ax1.boxplot(data1[data1['pct_range'] == interval]['MSE'],
                positions=[interval.mid], widths=0.4 * interval.length,
                patch_artist=True, boxprops={'facecolor': color},
                notch=False, medianprops={'color':'yellow', 'linewidth':2},
                manage_ticks=False)
plt.show()

Multi colored bars based on category using matplotlib

I have created a chart based on three values in my 'result' field. Do you know how I can change the colors based on the three values (Abandoned, Connected, To voice mail) using the code I already have below:
data.direction.describe()
import seaborn as sns
%matplotlib inline
import matplotlib.pyplot as plt
sns.set() # use Seaborn styles
df = data.pivot_table('call_id', index='timeslot', columns='result', aggfunc='count')
ax = df.plot(kind='bar', width=0.7, align='center', stacked=False, rot=90, figsize=(12,6), legend=False, zorder=3)
plt.grid(zorder=0)
plt.legend(loc='upper center', bbox_to_anchor=(0.5, -0.3))
plt.xlabel("Timeslot")
plt.ylabel("Number of calls")
plt.title("Figure 4: Number of calls connected to the admin line by time slot")
ax.spines['bottom'].set_color('black')
ax.spines['top'].set_color('white')
ax.spines['right'].set_color('white')
ax.spines['left'].set_color('black')
ax.yaxis.label.set_color('black')
ax.xaxis.label.set_color('black')
ax.title.set_color('black')
ax.patch.set_facecolor('white')
plt.savefig('figure4.png', dpi=300, facecolor=ax.get_facecolor(), transparent=True, bbox_inches='tight', pad_inches=0.1)
plt.show()
This is how it currently looks but I want to be able to choose the colors. Chart
Since you are using pandas, I'd suggest that you add your colors to the call to df.plot by using the color parameter.
ax = df.plot(color=my_colors, kind='bar', width=0.7,
Where 'my_colors' is a list of the colors you want to use.
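As a minimal sketch of what that could look like: the table below is a made-up stand-in for the pivot_table output, and the palette dictionary and its colors are only an example. Building my_colors from the column order keeps each color attached to the right 'result' value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import pandas as pd

# Hypothetical pivot-table output: one column per 'result' value
df = pd.DataFrame(
    {"Abandoned": [3, 5], "Connected": [10, 12], "To voice mail": [2, 1]},
    index=pd.Index(["09:00", "10:00"], name="timeslot"),
)

# Hypothetical color choice for each result
palette = {"Abandoned": "tab:red", "Connected": "tab:green", "To voice mail": "tab:blue"}
my_colors = [palette[col] for col in df.columns]  # same order as the columns

ax = df.plot(kind="bar", color=my_colors, width=0.7, rot=90, figsize=(12, 6))
ax.set_xlabel("Timeslot")
ax.set_ylabel("Number of calls")
```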

Controlling legend across multiple subplots with windrose axes

I cannot figure out how to make the legends not overlap with my figures (see below figure) in subplots. The problem is my axes are complicated because they are from a windrose. To get the axes:
1) I have downloaded the windrose.py from https://github.com/akrherz/windrose/tree/darylchanges
2) I copied the windrose.py into the same path with my python script, example.py
3) I changed windrose.py so that it is able to do subplots, according to the steps from Subplot of Windrose in matplotlib . Those steps were to make WindroseAxes as a projection into matplotlib. I edited the file windrose.py:
3a) Include the import
from matplotlib.projections import register_projection
at the beginning of the file.
3b) Then add a name variable:
class WindroseAxes(PolarAxes):
    name = 'windrose'
    ...
3c) Finally, at the end of windrose.py, you add:
register_projection(WindroseAxes)
Once that is done, you can easily create your windrose axes using the projection argument to the matplotlib axes.
4) Now I ran my script below (example of my real script)
from windrose import WindroseAxes
import numpy as np
import matplotlib.pyplot as plt
from windrose_subplot import WindroseAxes
wind_speeds1 = np.array([12,10,13,15])
wind_dirs1 = np.array([60,76,32,80]) # in degrees
wind_speeds2 = np.array([23,12,10,8])
wind_dirs2 = np.array([23,45,29,13])
fig = plt.figure()
ax1 = fig.add_subplot(231,projection='windrose')
ax1.bar(wind_dirs1,wind_speeds1,normed=True,opening=0.8,edgecolor='white')
ax2 = fig.add_subplot(232,projection='windrose')
ax2.bar(wind_dirs2,wind_speeds2,normed=True,opening=0.8,edgecolor='white')
ax1.legend()
ax2.legend()
plt.tight_layout()
plt.show()
Ideally, I would like to create one legend covering the min/max of all the subplots, because they all use the same units. The legend colors would have to correspond to the same values in every subplot (i.e., a single normal legend relevant to all subplots). There will be 6 subplots in the real script, but the 2 here are enough to show the point.
This is simple to fix. In order to only plot one legend, comment out or delete where you plot the first legend. In order to move the legend off of the plot, use bbox_to_anchor=() with some logical location. See below for an example that works for this example.
import numpy as np
import matplotlib.pyplot as plt
from windrose_subplot import WindroseAxes
wind_speeds1 = np.array([12,10,13,15])
wind_dirs1 = np.array([60,76,32,80]) # in degrees
wind_speeds2 = np.array([23,12,10,8])
wind_dirs2 = np.array([23,45,29,13])
fig = plt.figure()
ax1 = fig.add_subplot(231,projection='windrose')
ax1.bar(wind_dirs1,wind_speeds1,normed=True,opening=0.8,edgecolor='white')
ax2 = fig.add_subplot(232,projection='windrose')
ax2.bar(wind_dirs2,wind_speeds2,normed=True,opening=0.8,edgecolor='white')
# ax1.legend()
ax2.legend(bbox_to_anchor=(1.2 , -0.1))
plt.tight_layout()
plt.show()
However, note the bbox_to_anchor is reliant on the axis that the legend comes from, so
ax1.legend(bbox_to_anchor=(1.2, -0.1))
#ax2.legend()
would display the legend underneath the second axis:
Thank you Hazard11, I found your answer very useful :) There is one issue with it, though: the legend does not represent the first subplot, because the bins are generated when creating the second subplot.
I just solved this issue by calculating the bins using numpy.histogram first and then passing that to windrose.WindroseAxes.bar() when creating each wind rose. Doing it this way means you need to pick which one you want to use to generate the bins. Another way to do it would be to define the bins manually or to create a function which generates some efficient binning for both which could then be used.
wind_speeds1 = np.array([12,10,13,15])
wind_dirs1 = np.array([60,76,32,80]) # in degrees
wind_speeds2 = np.array([23,12,10,8])
wind_dirs2 = np.array([23,45,29,13])
wind_speeds_bins = np.histogram(wind_speeds2, 5)[1]
fig = plt.figure()
ax1 = fig.add_subplot(231, projection='windrose')
ax1.bar(wind_dirs1 ,wind_speeds1, normed=True, opening=0.8, edgecolor='white', bins=wind_speeds_bins)
ax2 = fig.add_subplot(232, projection='windrose')
ax2.bar(wind_dirs2, wind_speeds2, normed=True, opening=0.8, edgecolor='white', bins=wind_speeds_bins)
# ax1.legend()
ax2.legend(bbox_to_anchor=(1.2 , -0.1))
plt.tight_layout()
plt.show()
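The "function which generates binning for both" mentioned above could be sketched like this, using only numpy. The helper name common_speed_bins and the default of 5 bins are my own choices for illustration. I have not checked how windrose treats the top edge, so you may need to adjust the edges before passing them as bins=.

```python
import numpy as np

wind_speeds1 = np.array([12, 10, 13, 15])
wind_speeds2 = np.array([23, 12, 10, 8])

def common_speed_bins(*speed_arrays, n_bins=5):
    """Return bin edges spanning the min/max of all arrays (hypothetical helper)."""
    all_speeds = np.concatenate(speed_arrays)
    # Equal-width edges from the overall minimum to the overall maximum
    return np.histogram_bin_edges(all_speeds, bins=n_bins)

wind_speeds_bins = common_speed_bins(wind_speeds1, wind_speeds2)
print(wind_speeds_bins)
```

The resulting wind_speeds_bins can then be used in place of the np.histogram(wind_speeds2, 5)[1] line, so that neither subplot is favoured when choosing the bins.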

matplotlib - pandas - No xlabel and xticks for twinx axes in subploted figures

I had a similar question, which was answered previously. However, this one differs in that the Pandas package is used as well.
Here is my previous question: matplotlib - No xlabel and xticks for twinx axes in subploted figures
So my question, like the last one, is why the xlabel and xticks are not shown for the first-row diagrams when using this Python code.
Two notes:
I also used subplots instead of gridspec but same result.
If you uncomment any of the commented lines in this code, which plot with Pandas on the axes of each diagram, the xlabel and xticks will disappear!
import matplotlib.pyplot as plt
import matplotlib.gridspec as gspec
import numpy as np
import pandas as pd
from math import sqrt
fig = plt.figure()
gs = gspec.GridSpec(2, 2)
gs.update(hspace=0.7, wspace=0.7)
ax1 = plt.subplot(gs[0, 0])
ax2 = plt.subplot(gs[0, 1])
ax3 = plt.subplot(gs[1, 0])
ax4 = plt.subplot(gs[1, 1])
x1 = np.linspace(1,10,10)
ax12 = ax1.twinx()
ax1.set_xlabel("Fig1")
ax12.set_xlabel("Fig1")
ax1.set_ylabel("Y1")
ax12.set_ylabel("Y2")
# pd.Series(range(10)).plot(ax=ax1)
ax12.plot(x1, x1**3)
ax22 = ax2.twinx()
ax2.set_xlabel("Fig2")
ax22.set_xlabel("Fig2")
ax2.set_ylabel("Y3")
ax22.set_ylabel("Y4")
# pd.Series(range(10)).plot(ax=ax2)
ax22.plot(x1, x1**0.5)
ax32 = ax3.twinx()
ax3.set_xlabel("Fig3")
ax32.set_xlabel("Fig3")
ax3.set_ylabel("Y5")
ax32.set_ylabel("Y6")
# pd.Series(range(200)).plot(ax=ax3)
ax42 = ax4.twinx()
ax4.set_xlabel("Fig4")
ax42.set_xlabel("Fig4")
ax4.set_ylabel("Y7")
ax42.set_ylabel("Y8")
# pd.Series(range(10)).plot(ax=ax42)
plt.subplots_adjust(wspace=0.8, hspace=0.8)
plt.show()
I just ran into the same issue because I was mixing plots made with matplotlib and plots made with Pandas.
Avoid plotting with Pandas here. For example, you could replace:
pd.Series(range(10)).plot(ax=ax42)
with
ax42.plot(pd.Series(range(10)))
As Scimonster mentioned above, it worked for me when I made all the pandas plots before creating the twinx axes.
I had a few plots on twinx axes which also came from Pandas DataFrame objects (x, t plots). I created separate lists from them before starting the plot, and plotted those lists after the first pandas plots were done.
To summarize, my workflow was:
1. Create lists for the twinx plots.
2. Open the figure and make all pandas plots on the normal axes.
3. Create the twinx axes.
4. Plot the lists on the twinx axes.
Fortunately, this workflow is working for me.
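The four steps above can be sketched as follows for a single panel. The data is made up, and only the first panel of a 2x2 grid is filled in to keep it short. The point is just the order: pandas draws on the normal axes before twinx() is called.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

x1 = np.linspace(1, 10, 10)
twin_values = (x1 ** 3).tolist()             # 1. data for the twinx plot as a plain list

fig, axes = plt.subplots(2, 2)
ax1 = axes[0, 0]
pd.Series(range(10)).plot(ax=ax1)            # 2. pandas plot on the normal axes first
ax12 = ax1.twinx()                           # 3. only now create the twinx axes
ax12.plot(x1, twin_values, color="tab:red")  # 4. plain matplotlib on the twinx axes

ax1.set_xlabel("Fig1")
ax1.set_ylabel("Y1")
ax12.set_ylabel("Y2")
fig.subplots_adjust(wspace=0.8, hspace=0.8)
```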
