How to add row titles to the following matplotlib code? - python

I am trying to create a plot containing 8 subplots (4 rows and 2 columns). To do so, I have made this code that reads the x and y data and plots it in the following fashion:
fig, axs = plt.subplots(4, 2, figsize=(15,25))
y_labels = ['k0', 'k1']
for x in range(4):
    for y in range(2):
        axs[x, y].scatter([i[x] for i in X_vals], [i[y] for i in y_vals])
        axs[x, y].set_xlabel('Loss')
        axs[x, y].set_ylabel(y_labels[y])
This gives me the following result:
However, I want to add a title to each row (not to each plot) in the following way (the titles in yellow text):
I found this image and some ways to do that here, but I wasn't able to implement this for my use case and got an error. This is what I tried:
gridspec = axs[0].get_subplotspec().get_gridspec()
subfigs = [fig.add_subfigure(gs) for gs in gridspec]
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
which gave me the error: 'numpy.ndarray' object has no attribute 'get_subplotspec'
So I changed the code to:
gridspec = axs[0, 0].get_subplotspec().get_gridspec()
subfigs = [fig.add_subfigure(gs) for gs in gridspec]
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
but this returned the error: 'Figure' object has no attribute 'add_subfigure'

The solution in the answer that you linked is the correct one, however it is specific for the 3x3 case as shown there. The following code should be a more general solution for different numbers of subplots. This should work provided your data and y_label arrays/lists are all the correct size.
Note that this requires matplotlib 3.4.0 and above to work:
import numpy as np
import matplotlib.pyplot as plt
# random data. Make sure these are the correct size if changing number of subplots
x_vals = np.random.rand(4, 10)
y_vals = np.random.rand(2, 10)
y_labels = ['k0', 'k1']
# change rows/cols accordingly
rows = 4
cols = 2
fig = plt.figure(figsize=(15,25), constrained_layout=True)
fig.suptitle('Figure title')
# create rows x 1 subfigs
subfigs = fig.subfigures(nrows=rows, ncols=1)
for row, subfig in enumerate(subfigs):
    subfig.suptitle(f'Subplot row title {row}')
    # create 1 x cols subplots per subfig
    axs = subfig.subplots(nrows=1, ncols=cols)
    for col, ax in enumerate(axs):
        ax.scatter(x_vals[row], y_vals[col])
        ax.set_title("Subplot ax title")
        ax.set_xlabel('Loss')
        ax.set_ylabel(y_labels[col])
Which gives:
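
For matplotlib versions older than 3.4.0, where subfigures are not available (which is what the 'add_subfigure' error in the question points to), one possible fallback is to overlay an invisible axes across each row and use it only to carry the row title. The following is a minimal sketch under that assumption and is not part of the linked answer; the random data, the pad value and the hspace value are placeholders to tune.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to view the figure interactively
import matplotlib.pyplot as plt
import numpy as np

rows, cols = 4, 2
rng = np.random.default_rng(0)
x_vals = rng.random((rows, 10))  # placeholder data, one row of x values per subplot row
y_vals = rng.random((cols, 10))  # placeholder data, one row of y values per subplot column
y_labels = ['k0', 'k1']

fig, axs = plt.subplots(rows, cols, figsize=(15, 25))
# the grid that plt.subplots created, so the overlay axes line up with it
gs = axs[0, 0].get_subplotspec().get_gridspec()

row_title_axes = []
for row in range(rows):
    # invisible axes spanning the whole row; only its title is drawn
    title_ax = fig.add_subplot(gs[row, :])
    title_ax.axis('off')
    title_ax.set_title(f'Subplot row title {row}', pad=25)
    row_title_axes.append(title_ax)
    for col in range(cols):
        axs[row, col].scatter(x_vals[row], y_vals[col])
        axs[row, col].set_xlabel('Loss')
        axs[row, col].set_ylabel(y_labels[col])

# leave vertical room between rows for the row titles
fig.subplots_adjust(hspace=0.4)
# fig.savefig('row_titles.png') or plt.show() to inspect the result
```

The overlay axes sit on top of the real ones, so in an interactive window they may intercept mouse events; on matplotlib 3.4.0 and above the subfigure approach is the cleaner choice.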

Related

Remove for loops when plotting matplotlib subplots

I have a large subplot-based figure to produce in Python using matplotlib. In total the figure has in excess of 500 individual plots, each with thousands of data points. This can be plotted using a for-loop-based approach modelled on the minimal example given below:
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec
# define main plot names and subplot names
mains = ['A','B','C','D']
subs = list(range(9))
# generate mimic data in pd dataframe
col = [letter+str(number) for letter in mains for number in subs]
col.insert(0,'Time')
df = pd.DataFrame(columns=col)
for title in df.columns:
    df[title] = [i for i in range(100)]
# although alphabet and mains are the same in this minimal example this may not always be true
alphabet = ['A', 'B', 'C', 'D']
column_names = [column for column in df.columns if column != 'Time']
# define figure size and main gridshape
fig = plt.figure(figsize=(15, 15))
outer = gridspec.GridSpec(2, 2, wspace=0.2, hspace=0.2)
for i, letter in enumerate(alphabet):
    # define inner grid size and shape
    inner = gridspec.GridSpecFromSubplotSpec(3, 3,
                    subplot_spec=outer[i], wspace=0.1, hspace=0.1)
    # select only columns with correct letter
    plot_array = [col for col in column_names if col.startswith(letter)]
    # set title for each letter plot
    ax = plt.Subplot(fig, outer[i])
    ax.set_title(f'Letter {letter}')
    ax.axis('off')
    fig.add_subplot(ax)
    # create each subplot
    for j, col in enumerate(plot_array):
        ax = plt.Subplot(fig, inner[j])
        X = df['Time']
        Y = df[col]
        # plot waveform
        ax.plot(X, Y)
        # hide all axis ticks
        ax.axis('off')
        # set y_axis limits so all plots share same y_axis
        ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
        fig.add_subplot(ax)
However, this is slow, requiring minutes to plot the figure. Is there a more efficient (potentially for-loop-free) method to achieve the same result?
The issue with the loop is not the plotting but the setting of the axis limits with df[column_names].min().min() and df[column_names].max().max().
Testing with 6 main plots, 64 subplots and 375,000 data points, the plotting section of the example takes approx. 360 s to complete when the axis limits are set by searching df for the min and max values on every iteration. However, the search for min and max can be moved outside the loops, e.g.
# set y_lims
y_upper = df[column_names].max().max()
y_lower = df[column_names].min().min()
and changing
ax.set_ylim(df[column_names].min().min(),df[column_names].max().max())
to
ax.set_ylim(y_lower,y_upper)
the plotting time is reduced to approx 24 seconds.
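
The effect of hoisting the min/max search out of the loop can be checked in isolation, without any plotting. The sketch below is only an illustration: the DataFrame shape, the iteration count and the helper names (limits_inside_loop, limits_hoisted, timed) are made up for the example and are not taken from the test described above.

```python
import time
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((5000, 64)))  # placeholder data
n_subplots = 200                           # hypothetical number of loop iterations

def limits_inside_loop():
    # searches the whole DataFrame on every iteration, as in the question
    for _ in range(n_subplots):
        lims = (df.min().min(), df.max().max())
    return lims

def limits_hoisted():
    # searches the DataFrame once, then reuses the two scalars
    y_lower, y_upper = df.min().min(), df.max().max()
    for _ in range(n_subplots):
        lims = (y_lower, y_upper)
    return lims

def timed(func):
    # returns the function result together with its wall-clock run time
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start

slow_lims, slow_t = timed(limits_inside_loop)
fast_lims, fast_t = timed(limits_hoisted)
print(f"inside loop: {slow_t:.3f} s, hoisted: {fast_t:.3f} s")
```

Both variants produce the same limits; only the number of full passes over the data differs, which is why the saving grows with the number of subplots.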

Not able to use plt.subplots() for my data [duplicate]

This question already has answers here:
How to plot in multiple subplots
(12 answers)
Closed 1 year ago.
I am using the lombscargle function to output the power spectrum for a signal I pass as input. I am able to get the plots one after another, but the task at hand is to plot these graphs using subplots so that there are 5 rows and 4 columns.
An example for signal would be:
signal = [ '254.24', '254.32', '254.4', '254.84', '254.24', '254.28', '254.84', '253.56', '253.76', '253.32', '253.88', '253.72', '253.92', '251.56', '253.04', '244.72', '243.84', '246.08', '245.84', '249.0', '250.08', '248.2', '253.12', '253.2', '253.48', '253.88', '253.12', '253.4', '253.4']
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import lombscargle
def LSP_scipy(signal):
    start_ang_freq = 2 * np.pi * (60/60)
    end_ang_freq = 2 * np.pi * (240/60)
    SAMPLES = 5000
    SAMPLE_SPACING = 1/15
    t = np.linspace(0,len(signal)*SAMPLE_SPACING,len(signal))
    period_freq = np.linspace(start_ang_freq,end_ang_freq,SAMPLES)
    modified_signal_axis = []
    modified_time_axis = []
    for count,value in enumerate(signal):
        if value != 'None':
            modified_signal_axis.append(float(value))
            modified_time_axis.append(t[count])
            prog = lombscargle(modified_time_axis, modified_signal_axis, period_freq, normalize=False, precenter = True)
            # a new figure is created for every plot
            fig, ax = plt.subplots()
            ax.plot(period_freq,prog)
How do I plot these graphs in a matrix format?
Trying loop approach,
See inline comments to add and flatten the subplots.
This is an implementation of flattening the axes array from this answer of the duplicate.
from scipy.signal import lombscargle
from matplotlib.ticker import FormatStrFormatter
import numpy as np
import matplotlib.pyplot as plt
def LSP_scipy(signal):
    start_ang_freq = 2 * np.pi * (60/60)
    end_ang_freq = 2 * np.pi * (240/60)
    SAMPLES = 5000
    SAMPLE_SPACING = 1/15
    t = np.linspace(0, len(signal)*SAMPLE_SPACING, len(signal))
    period_freq = np.linspace(start_ang_freq, end_ang_freq, SAMPLES)
    modified_signal_axis = []
    modified_time_axis = []
    # create the figure and subplots
    fig, axes = plt.subplots(5, 6, figsize=(20, 9), sharex=False, sharey=False)
    # flatten the array
    axes = axes.ravel()
    for count, value in enumerate(signal):
        if value != 'None':
            modified_signal_axis.append(float(value))
            modified_time_axis.append(t[count])
            prog = lombscargle(modified_time_axis, modified_signal_axis, period_freq, normalize=False, precenter=True)
            # plot
            axes[count].plot(period_freq, prog)
            # format the axes
            axes[count].set(title=value)
            # some plots have an exponential offset on the yaxis, this turns it off
            axes[count].ticklabel_format(useOffset=False)
            # some yaxis values are long floats, this formats them to 3 decimal places
            axes[count].yaxis.set_major_formatter(FormatStrFormatter('%.3f'))
    # format the figure
    fig.tight_layout()
signal = [ '254.24', '254.32', '254.4', '254.84', '254.24', '254.28', '254.84', '253.56', '253.76', '253.32', '253.88', '253.72', '253.92', '251.56', '253.04', '244.72', '243.84', '246.08', '245.84', '249.0', '250.08', '248.2', '253.12', '253.2', '253.48', '253.88', '253.12', '253.4', '253.4']
LSP_scipy(signal[:20]) # as per comment, only first 20
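
The 5 x 6 grid above holds 30 axes while only 20 are drawn into, so the remaining ones show up as empty frames. One way to handle that, sketched here with a hypothetical helper named make_grid (not part of the answer above), is to size the grid from the number of plots and hide whatever is left over. The sine curves are placeholder data standing in for the periodograms.

```python
import math
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to view the figure interactively
import matplotlib.pyplot as plt
import numpy as np

def make_grid(n_plots, ncols=4):
    """Create just enough rows for n_plots and hide the unused axes."""
    nrows = math.ceil(n_plots / ncols)
    fig, axes = plt.subplots(nrows, ncols, figsize=(3 * ncols, 2 * nrows),
                             squeeze=False)
    axes = axes.ravel()          # flatten so a single index addresses each subplot
    for ax in axes[n_plots:]:    # leftover axes at the end of the grid
        ax.set_visible(False)
    return fig, axes[:n_plots]

# 20 plots in 6 columns -> 4 rows, 24 axes, 4 of them hidden
fig, axes = make_grid(20, ncols=6)
x = np.linspace(0, 2 * np.pi, 100)
for k, ax in enumerate(axes, start=1):
    ax.plot(x, np.sin(k * x))    # placeholder curve
    ax.set(title=f"k = {k}")
fig.tight_layout()
```

With ncols=4 and 20 plots the grid comes out as exactly 5 x 4, which matches the layout asked for in the question.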
You can use a for loop and iterate over the subplots. A very simple example is shown below. The subplots method creates the figure along with the subplots and stores them in the ax array.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 10, 10)  # 10 points, to match the length of y
y = range(10)
fig, ax = plt.subplots(nrows=2, ncols=2)
for row in ax:
    for col in row:
        col.plot(x, y)
# or, in place of the loop above, index the axes array directly;
# (i, j) visits (0, 0), (0, 1), (1, 0), (1, 1)
for i in range(2):
    for j in range(2):
        ax[i, j].plot(x, y)
plt.show()
Then one idea is to collect the signals into an array of shape=(20,1), where each row corresponds to a signal amplitude or some other measurable quantity. Then you could do as below (run only the lines up to plt.text first and check the output, and you will get the idea).
for i in range(1, 21):
    plt.subplot(5, 4, i)
    plt.text(0.5, 0.5, str((5, 4, i)),
             fontsize=18, ha='center')
    # Call the function here and get the values of period_freq and prog
    # (this assumes LSP_scipy returns them instead of plotting);
    # subplot numbers start at 1 but indices start at 0, hence i - 1
    period_freq, prog = LSP_scipy(signal[i - 1])
    plt.plot(period_freq, prog)

How to plot a list of figures in a single subplot?

I have 2 lists of figures and their axes.
I need to plot each figure as a subplot of a single figure, so that all of them end up in one big figure. How can I do that?
I tried a for loop, but it didn't work.
Here's what I have tried:
import ruptures as rpt
import matplotlib.pyplot as plt
# make random data with 100 samples and 9 columns
n_samples, n_dims, sigma = 100, 9, 2
n_bkps = 4
signal, bkps = rpt.pw_constant(n_samples, n_dims, n_bkps, noise_std=sigma)
figs, axs = [], []
for i in range(signal.shape[1]):
    points = signal[:,i]
    # detection of change points
    algo = rpt.Dynp(model='l2').fit(points)
    result = algo.predict(n_bkps=2)
    fig, ax = rpt.display(points, bkps, result, figsize=(15,3))
    figs.append(fig)
    axs.append(ax)
plt.show()
I had a look at the source code of ruptures.display(), and it accepts **kwargs that are passed on to matplotlib. This allows us to redirect the output to a single figure, and with gridspec, we can position individual subplots within this figure:
import ruptures as rpt
import matplotlib.pyplot as plt
n_samples, n_dims, sigma = 100, 9, 2
n_bkps = 4
signal, bkps = rpt.pw_constant(n_samples, n_dims, n_bkps, noise_std=sigma)
#number of subplots
n_subpl = signal.shape[1]
#give figure a name to refer to it later
fig = plt.figure(num = "ruptures_figure", figsize=(8, 15))
#define grid of nrows x ncols
gs = fig.add_gridspec(n_subpl, 1)
for i in range(n_subpl):
    points = signal[:,i]
    algo = rpt.Dynp(model='l2').fit(points)
    result = algo.predict(n_bkps=2)
    #rpt.display(points, bkps, result)
    #plot into predefined figure
    _, curr_ax = rpt.display(points, bkps, result, num="ruptures_figure")
    #position current subplot within grid
    curr_ax[0].set_position(gs[i].get_position(fig))
    curr_ax[0].set_subplotspec(gs[i])
plt.show()
Sample output:

Proper Matplotlib axes construction / reuse

I am currently building a set of scatter plot charts using pandas plot.scatter. The construction builds off of two base axes.
My current construction looks akin to
ax1 = pandas.scatter.plot()
ax2 = pandas.scatter.plot(ax=ax1)
for dataframe in list:
    output_ax = pandas.scatter.plot(ax2)
    output_ax.get_figure().save("outputfile.png")
total_output_ax = total_list.scatter.plot(ax2)
total_output_ax.get_figure().save("total_output.png")
This seems inefficient. For 1...N permutations I want to reuse a base axes that has 50% of the data already plotted. What I am trying to do is:
1. Add base data to scatter plot
2. For item x in y: (save data to base scatter and save image)
3. Add all data to scatter plot and save image
Here's one way to do it with plt.scatter.
I plot column 0 on x-axis, and all other columns on y axis, one at a time.
Notice that there is only 1 ax object, and I don't replot all points, I just add points using the same axes with a for loop.
Each time I get a corresponding png image.
import numpy as np
import pandas as pd
np.random.seed(2)
testdf = pd.DataFrame(np.random.rand(20,4))
testdf.head(5) looks like this:
          0         1         2         3
0  0.435995  0.025926  0.549662  0.435322
1  0.420368  0.330335  0.204649  0.619271
2  0.299655  0.266827  0.621134  0.529142
3  0.134580  0.513578  0.184440  0.785335
4  0.853975  0.494237  0.846561  0.079645
#I put the first axis out of a loop, that can be in the loop as well
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1,1,1)
ax.scatter(testdf[0],testdf[1], color='red')
fig.legend()
fig.savefig('fig_1.png')
colors = ['pink', 'green', 'black', 'blue']
for i in range(2,4):
    ax.scatter(testdf[0], testdf[i], color=colors[i])
    fig.legend()
    fig.savefig('full_' + str(i) + '.png')
Then you get these 3 images (fig_1.png, full_2.png, full_3.png)
Axes objects cannot be simply copied or transferred. However, it is possible to set artists to visible/invisible in a plot. Given your ambiguous question, it is not fully clear how your data are stored but it seems to be a list of dataframes. In any case, the concept can easily be adapted to different input data.
import matplotlib.pyplot as plt
#test data generation
import pandas as pd
import numpy as np
rng = np.random.default_rng(123456)
df_list = [pd.DataFrame(rng.integers(0, 100, (7, 2))) for _ in range(3)]
#plot all dataframes into an axis object to ensure
#that all plots have the same scaling
fig, ax = plt.subplots()
patch_collections = []
for i, df in enumerate(df_list):
    pc = ax.scatter(x=df[0], y=df[1], label=str(i))
    pc.set_visible(False)
    patch_collections.append(pc)
#store individual plots
for i, pc in enumerate(patch_collections):
    pc.set_visible(True)
    ax.set_title(f"Dataframe {i}")
    fig.savefig(f"outputfile{i}.png")
    pc.set_visible(False)
#store summary plot
for pc in patch_collections:
    pc.set_visible(True)
ax.set_title("All dataframes")
ax.legend()
fig.savefig(f"outputfile_0_{i}.png")
plt.show()

Generate multiple plots with for loop; display output in matplotlib subplots

Objective: To generate 100 barplots using a for loop, and display the output as a subplot image
Data format: Datafile with 101 columns. The last column is the X variable; the remaining 100 columns are the Y variables, against which x is plotted.
Desired output: Barplots in 5 x 20 subplot array, as in this example image:
Current approach: I've been using PairGrid in seaborn, which generates an n x 1 array,
where input == dataframe; input3 == list from which column headers are called:
for i in input3:
    plt.figure(i)
    g = sns.PairGrid(input,
                     x_vars=["key_variable"],
                     y_vars=i,
                     aspect=.75, size=3.5)
    g.map(sns.barplot, palette="pastel")
Does anyone have any ideas how to solve this?
To give an example of how to plot 100 dataframe columns over a grid of 5 x 20 subplots:
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
data = np.random.rand(3,101)
data[:,0] = np.arange(2,7,2)
df = pd.DataFrame(data)
fig, axes = plt.subplots(nrows=5, ncols=20, figsize=(21,9), sharex=True, sharey=True)
for i, ax in enumerate(axes.flatten()):
    ax.bar(df.iloc[:,0], df.iloc[:,i+1])
    ax.set_xticks(df.iloc[:,0])
plt.show()
You can try to use matplotlib's subplots to create the plot grid and pass the axis to the barplot. You could do the axis indexing using a nested loop...
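
The nested-loop indexing mentioned above could look roughly like the sketch below. To keep it free of extra dependencies it draws with matplotlib's own ax.bar rather than seaborn; with seaborn the same axes object would be handed over through the ax argument of sns.barplot. The column names (y0 to y99, key_variable) and the random data are placeholders, not the asker's data.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend; remove this line to view the figure interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

nrows, ncols = 5, 20
rng = np.random.default_rng(0)
# placeholder data: 100 Y columns plus one X column, three rows each
df = pd.DataFrame(rng.random((3, nrows * ncols)),
                  columns=[f"y{k}" for k in range(nrows * ncols)])
df["key_variable"] = [2, 4, 6]

fig, axes = plt.subplots(nrows, ncols, figsize=(21, 9), sharex=True, sharey=True)
for r in range(nrows):
    for c in range(ncols):
        # translate the (row, column) grid position into a flat column index
        y_col = df.columns[r * ncols + c]
        ax = axes[r, c]
        ax.bar(df["key_variable"], df[y_col])
        ax.set_title(y_col, fontsize=6)
fig.tight_layout()
```

The expression r * ncols + c walks the grid row by row, which is the same order that axes.flatten() yields in the single-loop version above.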
