Arrangement of pie charts using matplotlib subplot - python

I have 7 pie charts (4 are listed below). I am trying to create a dashboard with 4 pie charts in the first row and 3 pie charts in the second row. I am not sure where I am going wrong with the code below. Are there any alternatives to achieve this? Any help would be appreciated.
from matplotlib import pyplot as PLT
fig = PLT.figure()
ax1 = fig.add_subplot(221)
line1 = plt.pie(df_14,colors=("g","r"))
plt.title('EventLogs')
ax1 = fig.add_subplot(223)
line2 = plt.pie(df_24,colors=("g","r"))
plt.title('InstalledApp')
ax1 = fig.add_subplot(222)
line3 = plt.pie(df_34,colors=("g","r"))
plt.title('Drive')
ax1 = fig.add_subplot(224)
line4 = plt.pie(df_44,colors=("g","r"))
plt.title('SQL Job')
ax1 = fig.add_subplot(321)
line5 = plt.pie(df_54,colors=("g","r"))
plt.title('Administrators')
ax2 = fig.add_subplot(212)
PLT.show()

A better method, which I always use and find more intuitive, is subplot2grid:
import matplotlib.pyplot as plt
fig = plt.figure(figsize=(18,10), dpi=1600)
# subplot2grid((2, 4), ...) lays the figure out as a grid with 2 rows and 4 columns
# (0, 0) is the (row, column) position of this plot: the upper-left cell
ax1 = plt.subplot2grid((2,4),(0,0))
plt.pie(df_14,colors=("g","r"))
plt.title('EventLogs')
# next one: first row, second column
ax1 = plt.subplot2grid((2, 4), (0, 1))
plt.pie(df_24,colors=("g","r"))
plt.title('InstalledApp')
You can go on like this. To switch rows, use the coordinate (1, 0), which is the second row, first column.
An example with 2 rows and 2 columns:
fig = plt.figure(figsize=(18,10), dpi=1600)
#2 rows 2 columns
#first row, first column
ax1 = plt.subplot2grid((2,2),(0,0))
plt.pie(df.a,colors=("g","r"))
plt.title('EventLogs')
#first row sec column
ax1 = plt.subplot2grid((2,2), (0, 1))
plt.pie(df.a,colors=("g","r"))
plt.title('EventLog_2')
#Second row first column
ax1 = plt.subplot2grid((2,2), (1, 0))
plt.pie(df.a,colors=("g","r"))
plt.title('InstalledApp')
#second row second column
ax1 = plt.subplot2grid((2,2), (1, 1))
plt.pie(df.a,colors=("g","r"))
plt.title('InstalledApp_2')
Hope this helps!
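For the 4 + 3 layout asked about in the question, one option is to give subplot2grid a finer grid and let every pie span two columns, so the three pies in the second row can be shifted by one column and sit centered. The following is only a sketch: the count pairs and the titles 'Chart 6' and 'Chart 7' are made-up placeholders, and the Agg backend line is there only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical data: seven pairs of counts, one pair per pie chart.
titles = ["EventLogs", "InstalledApp", "Drive", "SQL Job",
          "Administrators", "Chart 6", "Chart 7"]
counts = [(8, 2), (5, 5), (9, 1), (7, 3), (6, 4), (4, 6), (3, 7)]

fig = plt.figure(figsize=(16, 8))
grid = (2, 8)  # 2 rows, 8 narrow columns; every pie spans 2 of them

# Top row starts at columns 0, 2, 4, 6. The bottom row is shifted by one
# column (1, 3, 5) so its three pies are centered under the top four.
locations = [(0, 0), (0, 2), (0, 4), (0, 6), (1, 1), (1, 3), (1, 5)]

for title, data, loc in zip(titles, counts, locations):
    ax = plt.subplot2grid(grid, loc, colspan=2, fig=fig)
    ax.pie(data, colors=("g", "r"))
    ax.set_title(title)
```

If centering does not matter, the plain (2, 4) grid from the answer above works just as well and simply leaves the last cell empty.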

Use this if you want to create quicker arrangements of subplots
In addition to hashcode55's code:
If you want to avoid making multiple DataFrames, I recommend assigning integers to your feature column and iterating through those. Make sure you create a dictionary for the features, though.
Here I am doing a plot with 4 columns and 2 rows.
fig = plt.figure(figsize=(25,10)) #,dpi=1600)
r, c = 0, 0 # current row (r) and column (c)
for i in range(7): # i is the integer assigned to each feature
    if c < 4:
        ax1 = plt.subplot2grid((2,4), (r, c))
        plt.pie(data[data.feature == i].something , labels = ..., autopct='%.0f%%')
        plt.title(feature[i])
        c += 1 # go one column to the right
    else:
        c = 0 # reset the column number as we exceeded 4 columns
        r = 1 # go into the second row
        ax1 = plt.subplot2grid((2,4), (r, c))
        plt.pie(data[data.feature == i].something , labels = ..., autopct='%.0f%%')
        plt.title(feature[i])
        c += 1
plt.show()
This loop runs until all features have been plotted (7 in this example).
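The row and column bookkeeping can also be left to matplotlib by creating the whole grid with plt.subplots and walking over axes.flat. This is a rough sketch under assumptions: the feature dictionary and the values are invented stand-ins for data[data.feature == i].something, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical stand-ins for the feature dictionary and the per-feature data.
feature = {i: f"feature {i}" for i in range(7)}
values = {i: [i + 1, 7 - i] for i in range(7)}

n_cols = 4
n_rows = -(-len(feature) // n_cols)  # ceiling division: 2 rows for 7 features

fig, axes = plt.subplots(n_rows, n_cols, figsize=(25, 10))
for i, ax in enumerate(axes.flat):  # axes.flat walks the grid row by row
    if i < len(feature):
        ax.pie(values[i], autopct="%.0f%%")
        ax.set_title(feature[i])
    else:
        ax.set_visible(False)  # hide the unused cell at the end of the grid
```

Changing n_cols is then enough to get a different arrangement, because the number of rows is derived from it.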

Related

Remove for loops when plotting matplotlib subplots

I have a large subplot-based figure to produce in Python using matplotlib. In total the figure has in excess of 500 individual plots, each with thousands of data points. It can be plotted using a for-loop-based approach modelled on the minimal example given below:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# define main plot names and subplot names
mains = ['A','B','C','D']
subs = list(range(9))
# generate mimic data in pd dataframe
col = [letter+str(number) for letter in mains for number in subs]
col.insert(0,'Time')
df = pd.DataFrame(columns=col)
for title in df.columns:
    df[title] = [i for i in range(100)]
# although alphabet and mains are the same in this minimal example this may not always be true
alphabet = ['A', 'B', 'C', 'D']
column_names = [column for column in df.columns if column != 'Time']
# define figure size and main gridshape
fig = plt.figure(figsize=(15, 15))
outer = gridspec.GridSpec(2, 2, wspace=0.2, hspace=0.2)
for i, letter in enumerate(alphabet):
    # define inner grid size and shape
    inner = gridspec.GridSpecFromSubplotSpec(3, 3,
                    subplot_spec=outer[i], wspace=0.1, hspace=0.1)
    # select only columns with correct letter
    plot_array = [col for col in column_names if col.startswith(letter)]
    # set title for each letter plot
    ax = plt.Subplot(fig, outer[i])
    ax.set_title(f'Letter {letter}')
    ax.axis('off')
    fig.add_subplot(ax)
    # create each subplot
    for j, col in enumerate(plot_array):
        ax = plt.Subplot(fig, inner[j])
        X = df['Time']
        Y = df[col]
        # plot waveform
        ax.plot(X, Y)
        # hide all axis ticks
        ax.axis('off')
        # set y_axis limits so all plots share same y_axis
        ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
        fig.add_subplot(ax)
However, this is slow, requiring minutes to plot the figure. Is there a more efficient (potentially for-loop-free) method to achieve the same result?
The issue with the loop is not the plotting but the setting of the axis limits with df[column_names].min().min() and df[column_names].max().max().
Testing with 6 main plots, 64 subplots and 375,000 data points, the plotting section of the example takes approx. 360 seconds to complete when the axis limits are set by searching df for the min and max values on every iteration. However, by moving the search for min and max outside the loops, e.g.
# set y_lims
y_upper = df[column_names].max().max()
y_lower = df[column_names].min().min()
and changing
ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
to
ax.set_ylim(y_lower,y_upper)
the plotting time is reduced to approx 24 seconds.
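Where the subplots sit in a regular grid, a related option is to let matplotlib share the y-axis, so the limits are computed once and set on a single Axes. This is only a sketch and not the nested GridSpec layout from the question: the random data is a made-up stand-in, and the Agg backend line is there only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(9, 100))  # hypothetical: 9 waveforms with 100 samples each

# Compute the global limits once, outside any loop.
y_lower, y_upper = data.min(), data.max()

# sharey=True links the y-axes of all nine subplots.
fig, axes = plt.subplots(3, 3, sharey=True, figsize=(6, 6))
for ax, y in zip(axes.flat, data):
    ax.plot(y)
    ax.axis("off")

# Setting the limits on one Axes propagates to every Axes sharing its y-axis.
axes[0, 0].set_ylim(y_lower, y_upper)
```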

Stacked bar plot color by a different category

I want to make a plot that shows the RSF by floor, broken down by unit_id, but I want the color of each unit to be defined by its year. So far I have made it work for a reduced set of data, but I think the code will be difficult to scale.
What I do is first identify the order each unit has on its floor. Then I plot all the units in the first position on each floor, then the ones in the second position, and so on.
Thanks!
df_build = pd.DataFrame({'floor':[1,1,1,2,2,2,3,3,3],'unidad':[100,101,102,200,201,202,300,301,302],
'rsf':[2000,1000,1500,1500,2000,1000,1000,1500,2000],'order':[0,1,2,0,1,2,0,1,2],
'year':[2008,2009,2010,2009,2010,2011,2010,2011,2012]})
assign_colors = {2008:'tab:red',2009:'tab:blue',2010:'tab:green',2011:'tab:pink',2012:'tab:olive'}
labels = list(df_build.floor.unique())
order_0 = df_build[df_build.order==0].rsf.values
c1=list(df_build[df_build.order==0].year.replace(assign_colors).values)
order_1 = df_build[df_build.order==1].rsf.values
c2=list(df_build[df_build.order==1].year.replace(assign_colors).values)
order_2 = df_build[df_build.order==2].rsf.values
c3=list(df_build[df_build.order==2].year.replace(assign_colors).values)
width = 0.35
fig, ax = plt.subplots()
ax.barh(labels, order_0, width,color=c1)
ax.barh(labels, order_1, width,left=order_0, color=c2)
ax.barh(labels, order_2, width,left=order_0+order_1, color=c3)
ax.set_ylabel('floor')
ax.set_title('Stacking Plan')
#ax.legend()
plt.show()
Try pivoting the data and loop:
# map the color
df_build['color'] = df_build['year'].map(assign_colors)
# pivot the data
plot_df = df_build.pivot(index='floor', columns='order')
# plot by row
fig, ax = plt.subplots()
for i in plot_df.index:
    # right and left edge of every unit on floor i
    rights = plot_df.loc[i,'rsf'].cumsum()
    lefts = rights.shift(fill_value=0)
    ax.barh(i, plot_df.loc[i,'rsf'], left=lefts, color=plot_df.loc[i,'color'])
    # label every unit at the center of its bar segment
    for j in range(len(rights)):
        label = plot_df.loc[i, 'unidad'].iloc[j]
        rsf = plot_df.loc[i, 'rsf'].iloc[j]
        x = (rights.iloc[j] + lefts.iloc[j]) / 2
        ax.text(x, i, f'{label}-{rsf}', ha='center')
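The stacking in the loop rests on two pandas calls: cumsum gives the right edge of every bar segment, and shift moves those values one position along to give the left edges. A small sketch for a single floor, using the floor 1 values from the question (the variable name centers is introduced here for illustration):

```python
import pandas as pd

# RSF of the three units on floor 1, in stacking order (values from the question).
rsf = pd.Series([2000, 1000, 1500], index=[0, 1, 2], name="rsf")

rights = rsf.cumsum()               # right edge of each bar segment
lefts = rights.shift(fill_value=0)  # left edge = previous right edge, 0 for the first
centers = (lefts + rights) / 2      # x position used for the text labels

print(pd.DataFrame({"left": lefts, "right": rights, "center": centers}))
```

Because each segment starts where the previous one ends, the same loop body works for any number of units per floor.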

How to add row titles to the following the matplotlib code?

I am trying to create a plot containing 8 subplots (4 rows and 2 columns). To do so, I have made this code that reads the x and y data and plots it in the following fashion:
fig, axs = plt.subplots(4, 2, figsize=(15,25))
y_labels = ['k0', 'k1']
for x in range(4):
    for y in range(2):
        axs[x, y].scatter([i[x] for i in X_vals], [i[y] for i in y_vals])
        axs[x, y].set_xlabel('Loss')
        axs[x, y].set_ylabel(y_labels[y])
This gives me the following result:
However, I want to add a title to each row (not to each plot) in the following way (the titles in yellow text):
I found this image and some ways to do that here, but I wasn't able to implement them for my use case and got an error. This is what I tried:
gridspec = axs[0].get_subplotspec().get_gridspec()
subfigs = [fig.add_subfigure(gs) for gs in gridspec]
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
which gave me the error : 'numpy.ndarray' object has no attribute 'get_subplotspec'
So I changed the code to :
gridspec = axs[0, 0].get_subplotspec().get_gridspec()
subfigs = [fig.add_subfigure(gs) for gs in gridspec]
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
but this returned the error : 'Figure' object has no attribute 'add_subfigure'
The solution in the answer that you linked is the correct one, but it is specific to the 3x3 case shown there. The following code should be a more general solution for different numbers of subplots. It should work provided your data and y_label arrays/lists are all the correct size.
Note that this requires matplotlib 3.4.0 or above to work:
import numpy as np
import matplotlib.pyplot as plt
# random data. Make sure these are the correct size if changing number of subplots
x_vals = np.random.rand(4, 10)
y_vals = np.random.rand(2, 10)
y_labels = ['k0', 'k1']
# change rows/cols accordingly
rows = 4
cols = 2
fig = plt.figure(figsize=(15,25), constrained_layout=True)
fig.suptitle('Figure title')
# create rows x 1 subfigs
subfigs = fig.subfigures(nrows=rows, ncols=1)
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
    # create 1 x cols subplots per subfig
    axs = subfig.subplots(nrows=1, ncols=cols)
    for col, ax in enumerate(axs):
        ax.scatter(x_vals[row], y_vals[col])
        ax.set_title("Subplot ax title")
        ax.set_xlabel('Loss')
        ax.set_ylabel(y_labels[col])
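The error 'Figure' object has no attribute 'add_subfigure' in the question suggests an installation older than matplotlib 3.4. One possible workaround there, sketched below and not taken from the answer above, is to add one frameless Axes per row that spans the full width and exists only to carry the row title. The random data mirrors the answer's example, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

rows, cols = 4, 2
x_vals = np.random.rand(rows, 10)  # random placeholder data
y_vals = np.random.rand(cols, 10)
y_labels = ["k0", "k1"]

fig, axs = plt.subplots(rows, cols, figsize=(15, 25))
for row in range(rows):
    for col in range(cols):
        axs[row, col].scatter(x_vals[row], y_vals[col])
        axs[row, col].set_xlabel("Loss")
        axs[row, col].set_ylabel(y_labels[col])

# One frameless Axes per row, spanning the full width, used only for its title.
row_axes = []
for row in range(rows):
    big_ax = fig.add_subplot(rows, 1, row + 1, frameon=False)
    big_ax.set_xticks([])
    big_ax.set_yticks([])
    big_ax.set_title(f"Subplot row title {row}", fontsize=16, pad=20)
    row_axes.append(big_ax)

fig.subplots_adjust(hspace=0.4)  # extra vertical space so row titles do not collide
```

The pad and hspace values are guesses that may need tuning for other figure sizes.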

Double loop to populate subplots in matplotlib

I have a dict of dataframes that I want to use to populate subplots.
Each dataframe has two columns of data for the x and y axes, and two categorical columns for hue.
Pseudo code:
for df in dict of dataframes:
    for cat in categories:
        plot(x=col_0, y=col_1, hue=cat)
Data for example:
dict_dfs = dict()
for i in range(5):
    dict_dfs['df_{}'.format(i)] = pd.DataFrame({
        'col_1':np.random.randn(10), # first column with data = x axis
        'col_2':np.random.randn(10), # second column with data = y axis
        'cat_0': ('Home '*5 + 'Car '*5).split(), # first category = hue of plots on the left
        'cat_1': ('kitchen '*3 + 'Bedroom '*2 + 'Care '*5).split() # second category = hue of plots on the right
    })
IN:
fig, axes = plt.subplots(len(dict_dfs.keys()), 2, figsize=(15,10*len(dict_dfs.keys())))
for i, (name, df) in enumerate(dict_dfs.items()):
    for j, cat in enumerate(['cat_0', 'cat_1']):
        sns.scatterplot(
            x="col_1", y="col_2", hue=cat, data=df, ax=axes[i,j], alpha=0.6)
        axes[i,j].set_title('df: {}, cat: {}'.format(name, cat), fontsize = 25, pad = 35, fontweight = 'bold')
        axes[i,j].set_xlabel('col_1', fontsize = 26, fontweight = 'bold')
        axes[i,j].set_ylabel('col_2', fontsize = 26, fontweight = 'bold')
        plt.show()
OUT:
The 10 subplots are created correctly (5 dfs * 2 categories), but only the first one (axes[0, 0]) gets populated. I am used to creating subplots with one loop, but this is the first time I have used two. I have checked the code without finding the issue. Can anyone help?
The plt.show() call is within the scope of the for loops, so the figure is shown after the first subplot has been drawn. If you move it out of the loops (un-indent it to the beginning of the line), the figure should be shown correctly with all subplots populated.
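A minimal sketch of the corrected structure, with plt.show() called once after both loops. To keep it self-contained it uses plain matplotlib scatter calls per category value instead of seaborn's hue argument (a swapped-in technique, not the asker's code), only three dataframes, and the Agg backend so that it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dict_dfs = {
    f"df_{i}": pd.DataFrame({
        "col_1": rng.normal(size=10),
        "col_2": rng.normal(size=10),
        "cat_0": ["Home"] * 5 + ["Car"] * 5,
        "cat_1": ["kitchen"] * 3 + ["Bedroom"] * 2 + ["Care"] * 5,
    })
    for i in range(3)
}

fig, axes = plt.subplots(len(dict_dfs), 2, figsize=(10, 4 * len(dict_dfs)))
for i, (name, df) in enumerate(dict_dfs.items()):
    for j, cat in enumerate(["cat_0", "cat_1"]):
        ax = axes[i, j]
        # One scatter call per category value stands in for seaborn's hue.
        for value, group in df.groupby(cat):
            ax.scatter(group["col_1"], group["col_2"], label=value, alpha=0.6)
        ax.set_title(f"df: {name}, cat: {cat}")
        ax.legend(title=cat)

plt.show()  # called once, after every subplot has been populated
```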

Multiple titles (suptitle) with subplots

I have a series of 9 subplots in a 3x3 grid, each subplot with a title.
I want to add a title for each row. To do so I thought about using suptitle.
The problem is that if I use 3 suptitles, they seem to overwrite each other and only the last one is shown.
Here is my basic code:
fig, axes = plt.subplots(3,3,sharex='col', sharey='row')
for j in range(9):
    axes.flat[j].set_title('plot '+str(j))
plt1 = fig.suptitle("row 1",x=0.6,y=1.8,fontsize=18)
plt2 = fig.suptitle("row 2",x=0.6,y=1.2,fontsize=18)
plt3 = fig.suptitle("row 3",x=0.6,y=0.7,fontsize=18)
fig.subplots_adjust(right=1.1,top=1.6)
You can tinker with the titles and labels. Check the following example adapted from your code:
import matplotlib.pyplot as plt
fig, axes = plt.subplots(3,3,sharex='col', sharey='row')
counter = 0
for j in range(9):
    if j in [0,3,6]:
        axes.flat[j].set_ylabel('Row '+str(counter), rotation=0, size='large',labelpad=40)
        axes.flat[j].set_title('plot '+str(j))
        counter = counter + 1
    if j in [0,1,2]:
        axes.flat[j].set_title('Column '+str(j)+'\n\nplot '+str(j))
    else:
        axes.flat[j].set_title('plot '+str(j))
plt.show()
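A figure holds only one suptitle, which is why the three calls in the question overwrite each other. If centered titles above each row are wanted, one possible approach is to place ordinary figure text above the middle Axes of every row. This is a sketch under assumptions: the 0.04 offset and the hspace value are guesses that may need tuning, and the Agg backend line is only there so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; drop this line for interactive use
import matplotlib.pyplot as plt

fig, axes = plt.subplots(3, 3, sharex="col", sharey="row", figsize=(8, 9))
fig.subplots_adjust(hspace=0.6)  # leave room above each row for its title

for j, ax in enumerate(axes.flat):
    ax.set_title(f"plot {j}")

row_titles = []
for row in range(3):
    # Bounding box of the middle Axes of this row, in figure coordinates.
    box = axes[row, 1].get_position()
    x_center = (box.x0 + box.x1) / 2
    text = fig.text(x_center, box.y1 + 0.04, f"row {row + 1}",
                    ha="center", va="bottom", fontsize=18)
    row_titles.append(text)
```

Unlike suptitle, fig.text can be called any number of times, so each row keeps its own label.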
