Given a gridspec object in matplotlib, I want to automatically iterate through all its indices so I can add the corresponding Axes automatically, something like:
for i, j in gspec.indices: # whatever those indices are
    axs[i,j] = fig.add_subplot(gspec[i][j])
How do I do that, without knowing how many rows or columns the gridspec has in advance?
gspec.get_geometry() returns the number of rows and of columns. Here is some example code:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(constrained_layout=True)
gspec = fig.add_gridspec(3, 4)
nrows, ncols = gspec.get_geometry()
axs = np.array([[fig.add_subplot(gspec[i, j]) for j in range(ncols)] for i in range(nrows)])
t = np.linspace(0, 4 * np.pi, 1000)
for i in range(nrows):
    for j in range(ncols):
        axs[i, j].plot(np.sin((i + 1) * t), np.sin((j + 1) * t))
plt.show()
If axs isn't needed as a NumPy array, the conversion with np.array can be left out.
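As a variation closer to the `for i, j in ...` loop from the question, here is a minimal sketch that walks every grid position with np.ndindex and stores the Axes in a dict keyed by (i, j). The dict and the 3x4 shape are choices made for this example, not something the answer above prescribes.

```python
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(constrained_layout=True)
gspec = fig.add_gridspec(3, 4)

# get_geometry() returns (nrows, ncols); np.ndindex yields every (i, j) pair
axs = {
    (i, j): fig.add_subplot(gspec[i, j])
    for i, j in np.ndindex(*gspec.get_geometry())
}

# Individual Axes can then be looked up by their grid position
axs[2, 3].set_title("last cell")
print(len(axs))
```

The loop never mentions the number of rows or columns explicitly, so it keeps working if the gridspec is created with a different shape.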
Note that the code assumes you need a subplot in every possible grid position, which also can be obtained via fig, axs = plt.subplots(...). A gridspec is typically used when you want to combine grid positions to create custom layouts, as shown in the examples of the tutorial.
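To illustrate the kind of custom layout mentioned above, here is a small sketch in which slices of the gridspec span several cells. The 2x3 shape and the panel titles are arbitrary assumptions for the example.

```python
import matplotlib.pyplot as plt

fig = plt.figure(constrained_layout=True)
gspec = fig.add_gridspec(2, 3)

# A slice selects a block of grid cells instead of a single position
ax_top = fig.add_subplot(gspec[0, :])     # whole first row
ax_left = fig.add_subplot(gspec[1, 0])    # bottom-left cell
ax_right = fig.add_subplot(gspec[1, 1:])  # remaining two bottom cells

ax_top.set_title("spans 3 columns")
ax_left.set_title("single cell")
ax_right.set_title("spans 2 columns")
```

In a layout like this there is no longer one Axes per grid index, which is why an automatic loop over all indices only fits the regular case.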
I have a table that contains three different time characteristics according to two different parameters. I want to plot those parameters on the x- and y-axes and show bars of the three different times on the z-axis. I have created a simple bar plot where I plot one of the time characteristics:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
columns = ['R','Users','A','B','C']
df=pd.DataFrame({'R':[2,2,2,4,4,4,6,6,6,8,8],
'Users':[80,400,1000,80,400,1000,80,400,1000,80,400],
'A':[ 0.05381,0.071907,0.08767,0.04493,0.051825,0.05295,0.05285,0.0804,0.0967,0.09864,0.1097],
'B':[0.04287,0.83652,5.49683,0.02604,.045599,2.80836,0.02678,0.32621,1.41399,0.19025,0.2111],
'C':[0.02192,0.16217,0.71645, 0.25314,5.12239,38.92758,1.60807,262.4874,8493,11.6025,6288]},
columns=columns)
fig = plt.figure()
ax = plt.axes(projection="3d")
num_bars = 11
x_pos = df["R"]
y_pos = df["Users"]
z_pos = [0] * num_bars
x_size = np.ones(num_bars)/4
y_size = np.ones(num_bars)*50
z_size = df["A"]
ax.bar3d(x_pos, y_pos, z_pos, x_size, y_size, z_size, color='aqua')
plt.show()
This produces a simple 3d barplot:
However, I would like to plot similar bars next to the existing ones for the remaining two columns (B and C) in a different color, and add a plot legend as well. I could not figure out how to achieve this.
As a side question, is it also possible to show only the values from df on the x- and y-axes? The values are 2-4-6-8 and 80-400-1000; I do not want pyplot to add additional values on those axes.
I have managed to find a solution myself. To deal with the very different magnitudes of the values, I added one to all times (so that the logarithm is never negative) and applied np.log to all time columns. This brings the values onto a scale of roughly 0 to 10 and makes the plot much easier to read. I then loop over the time columns and build the values, positions and colors for each one, collecting them in shared lists. The y_pos of each column is shifted so that its bars are not drawn at the same position as those of the other columns.
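As a small aside on the transform described above: log(1 + t) is also available directly as np.log1p. The sketch below uses only a handful of values taken from column C and checks that both spellings agree and that the result stays within the 0 to 10 range.

```python
import numpy as np

# A few values from column C, spanning several orders of magnitude
times = np.array([0.02192, 0.71645, 38.92758, 8493.0])

shifted_log = np.log(times + 1)  # the transform used in this answer
same_thing = np.log1p(times)     # NumPy's built-in equivalent

print(np.allclose(shifted_log, same_thing))
print(shifted_log.min(), shifted_log.max())
```

Adding one before taking the logarithm maps a time of 0 to 0, so no bar ends up with a negative height.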
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
columns = ['R','Users','A','B','C']
df=pd.DataFrame({'R':[2,2,2,4,4,4,6,6,6,8,8],
'Users':[80,400,1000,80,400,1000,80,400,1000,80,400],
'A':[ 0.05381,0.071907,0.08767,0.04493,0.051825,0.05295,0.05285,0.0804,0.0967,0.09864,0.1097],
'B':[0.04287,0.83652,5.49683,0.02604,.045599,2.80836,0.02678,0.32621,1.41399,0.19025,0.2111],
'C':[0.02192,0.16217,0.71645, 0.25314,5.12239,38.92758,1.60807,262.4874,8493,11.6025,6288]},
columns=columns)
fig = plt.figure(figsize=(10, 10))
ax = plt.axes(projection="3d")
df["A"] = np.log(df["A"]+1)
df["B"] = np.log(df["B"]+1)
df["C"] = np.log(df["C"]+1)
colors = ['r', 'g', 'b']
num_bars = 11
x_pos = []
y_pos = []
x_size = np.ones(num_bars*3)/4
y_size = np.ones(num_bars*3)*50
c = ['A','B','C']
z_pos = []
z_size = []
z_color = []
for i,col in enumerate(c):
    x_pos.append(df["R"])
    y_pos.append(df["Users"]+i*50)
    z_pos.append([0] * num_bars)
    z_size.append(df[col])
    z_color.append([colors[i]] * num_bars)
x_pos = np.reshape(x_pos,(33,))
y_pos = np.reshape(y_pos,(33,))
z_pos = np.reshape(z_pos,(33,))
z_size = np.reshape(z_size,(33,))
z_color = np.reshape(z_color,(33,))
ax.bar3d(x_pos, y_pos, z_pos, x_size, y_size, z_size, color=z_color)
plt.xlabel('R')
plt.ylabel('Users')
ax.set_zlabel('Time')
from matplotlib.lines import Line2D
legend_elements = [Line2D([0], [0], marker='o', color='w', label='A',markerfacecolor='r', markersize=10),
Line2D([0], [0], marker='o', color='w', label='B',markerfacecolor='g', markersize=10),
Line2D([0], [0], marker='o', color='w', label='C',markerfacecolor='b', markersize=10)
]
# Make legend
ax.legend(handles=legend_elements, loc='best')
# Set view
ax.view_init(elev=35., azim=35)
plt.show()
Final plot:
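The side question about showing only the values from df on the axes is not covered by the code above. One possible approach, sketched here under the assumption that fixed tick positions are what is wanted, is to pass the unique data values to set_xticks and set_yticks. The arrays repeat the R and Users columns so the snippet runs on its own.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same values as the R and Users columns of df
r_values = np.array([2, 2, 2, 4, 4, 4, 6, 6, 6, 8, 8])
user_values = np.array([80, 400, 1000, 80, 400, 1000, 80, 400, 1000, 80, 400])

fig = plt.figure()
ax = plt.axes(projection="3d")

# Restrict the ticks to the values that actually occur in the data
ax.set_xticks(np.unique(r_values))
ax.set_yticks(np.unique(user_values))
```

Because the bars for B and C are shifted along y, the ticks at 80, 400 and 1000 mark the start of each group of three bars rather than its center.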
I am trying to use a for loop to create a histogram for each field in a dataframe. The dataframe here is labeled 'df4'.
There are 3 fields/columns.
Then I want to draw vertical lines at the quantiles of each column, as defined in the following series: p, exp, eng.
My code below only successfully creates the vertical lines on the histogram of the last field/column.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df4 = pd.read_csv("xyz.csv", index_col = "abc_id" )
# dataframe
# x coordinates for the lines
p = df4['abc'].quantile([0.25,0.5,0.75,0.9,0.95])
exp = df4['efg'].quantile([0.25,0.5,0.75,0.9,0.95])
eng = df4['xyz'].quantile([0.25,0.5,0.75,0.9,0.95])
# colors for the lines
colors = ['r','k','b','g','y']
bins = [0,100,200,300,400,500,600,700,800,900,1000,1100,1200,1300,1400,1500,1600,1700,1800,1900,2000]
fig, axs = plt.subplots(len(df4.columns), figsize=(10, 25))
for n, col in enumerate(df4.columns):
    if (n==0):
        for xc,c in zip(exp,colors):
            plt.axvline(x=xc, label='line at x = {}'.format(xc), c=c)
    if (n==1):
        for xc,c in zip(eng,colors):
            plt.axvline(x=xc, label='line at x = {}'.format(xc), c=c)
    if (n==2):
        for xc,c in zip(p,colors):
            plt.axvline(x=xc, label='line at x = {}'.format(xc), c=c)
    df4[col].hist(ax=axs[n],bins=50)
plt.legend()
plt.show()
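A likely cause, offered as a sketch rather than a confirmed diagnosis: plt.axvline draws on the current Axes, which after plt.subplots is the last subplot created, so every line lands on the final histogram. Calling axvline on each Axes object avoids that. The data below is synthetic (the real xyz.csv is not available), and the quantiles are computed per column inside the loop, which is an assumption about the intended pairing.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for df4 (assumption: three numeric columns)
rng = np.random.default_rng(0)
df4 = pd.DataFrame(rng.uniform(0, 2000, size=(500, 3)),
                   columns=["abc", "efg", "xyz"])

quantile_levels = [0.25, 0.5, 0.75, 0.9, 0.95]
colors = ["r", "k", "b", "g", "y"]

fig, axs = plt.subplots(len(df4.columns), figsize=(10, 25))
for ax, col in zip(axs, df4.columns):
    df4[col].hist(ax=ax, bins=50)
    # Draw each quantile line on this subplot, not on the current Axes
    for xc, c in zip(df4[col].quantile(quantile_levels), colors):
        ax.axvline(x=xc, label="line at x = {:.1f}".format(xc), c=c)
    ax.set_title(col)
    ax.legend()
```

Calling ax.legend() inside the loop gives every histogram its own legend, whereas a single plt.legend() at the end only affects the current Axes.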
I am looping through a list containing 6 column names. I take 3 columns at a time so that I can draw 3 subplots per iteration.
I have 2 dataframes with the same column names, so they look identical except for the histograms of each column.
I want to plot the matching columns of both dataframes on the same subplot. Right now, I am plotting their histograms on 2 separate subplots.
Currently, for columns 'A', 'B', 'C' in df_plot:
and for columns 'A', 'B', 'C' in df_plot2:
I only want 3 charts, with matching column names combined in the same chart so that there are blue and yellow bars in each chart.
Adding df_plot2 as below doesn't work. I think I am not defining my second axs properly, but I am not sure how to do that.
col_name_list = ['A','B','C','D','E','F']
chunk_list = [col_name_list[i:i + 3] for i in range(0, len(col_name_list), 3)]
for k,g in enumerate(chunk_list):
    df_plot = df[g]
    df_plot2 = df[g][df[g] != 0]
    fig, axs = plt.subplots(1,len(g),figsize = (50,20))
    axs = axs.ravel()
    for j,x in enumerate(g):
        df_plot[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=0, title = x, fontsize = 30)
        # adding this doesnt work.
        df_plot2[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=1, fontsize = 30)
        axs[j].title.set_size(40)
    fig.tight_layout()
The solution is to plot both series on the same Axes, that is, to pass the same ax object to both plot calls. Because axs is an array of Axes here, that object is axs[j]; a bare axs only works when plt.subplots returns a single Axes:
for k,g in enumerate(chunk_list):
    df_plot = df[g]
    df_plot2 = df[g][df[g] != 0]
    fig, axs = plt.subplots(1,len(g),figsize = (50,20))
    axs = axs.ravel()
    for j,x in enumerate(g):
        df_plot[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=0, title = x, fontsize = 30)
        # second series, drawn on the same Axes
        df_plot2[x].value_counts(normalize=True).head().plot(kind='bar',ax=axs[j], position=1, fontsize = 30)
        axs[j].title.set_size(40)
    fig.tight_layout()
Then call plt.show() to display the figures.
For example, this will plot x and y on the same subplot:
import numpy as np
import matplotlib.pyplot as plt
x = np.arange(0, 10, 1)
y = np.arange(0, 20, 2)
fig = plt.figure()
ax = fig.gca()
ax.plot(x)
ax.plot(y)
plt.show()
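For the bar charts in the question, a possible alternative to two separate plot calls is to combine both value_counts results into one DataFrame and plot it once, so pandas draws the two series as grouped bars on a single Axes. This is a sketch on hypothetical stand-in data (a single column "A"), not the asker's dataframe; the column labels and colors are assumptions.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data: zeros are masked out in df_plot2, as in the question
df = pd.DataFrame({"A": [0, 0, 1, 1, 1, 2, 2, 3]})
df_plot = df
df_plot2 = df[df != 0]

fig, ax = plt.subplots()

# One column per dataframe, aligned on the category values
counts = pd.concat(
    [
        df_plot["A"].value_counts(normalize=True).head(),
        df_plot2["A"].value_counts(normalize=True).head(),
    ],
    axis=1,
    keys=["all rows", "non-zero rows"],
)

# A single call draws both series side by side with distinct colors
counts.sort_index().plot(kind="bar", ax=ax, color=["tab:blue", "gold"], title="A")
```

Aligning on the category values matters because each separate bar plot places its bars at positions 0, 1, 2, ... in the order of its own series, so two value_counts results with different categories may not line up when plotted one after the other.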
EDIT:
There is now a squeeze keyword argument. Passing squeeze=False makes sure the result is always a 2D NumPy array.
fig, ax2d = plt.subplots(2, 2, squeeze=False)
If needed, turning that into a 1D array is easy:
axli = ax2d.flatten()
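A short sketch of the behavior described in this edit, using a 1x3 grid chosen for illustration: with squeeze=False the result keeps two dimensions even for a single row, and flatten gives a 1D array of the same Axes objects.

```python
import matplotlib.pyplot as plt

fig, ax2d = plt.subplots(1, 3, squeeze=False)
print(ax2d.shape)      # two dimensions, even with a single row

axli = ax2d.flatten()  # 1D array holding the same Axes objects
for k, ax in enumerate(axli):
    ax.set_title("panel {}".format(k))
```

The flat array is convenient for simple loops, while the 2D array keeps [row, col] indexing available.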
It is possible to refer to different subplots using two indices to index their axes, as in the following example:
import numpy as np
import matplotlib.pyplot as plt
def plotFunction(xdata, ydata, ax, i, j):
    ax[i,j].plot(xdata, ydata, marker='o', label='quadratic')
rows = 2
cols = 2
f, ax = plt.subplots(rows, cols)
x = np.arange(12)
y = x**2
plotFunction(x,y,ax,0,1)
However, if either rows or cols is 1, plt.subplots returns a 1D array of axes, which cannot be indexed with two indices. This precludes the generic use of my plotting function, which relies on double-index access. So the following won't work:
rows = 1
cols = 2
f, ax = plt.subplots(rows, cols)
x = np.arange(12)
y = x**2
plotFunction(x,y,ax,0,1)
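The failure can be reproduced in isolation. In this minimal sketch, the array returned for a 1x2 grid has a single dimension, so indexing it with two indices raises an IndexError; the variable two_index_failed exists only for this illustration.

```python
import matplotlib.pyplot as plt

f, ax = plt.subplots(1, 2)  # default squeeze=True drops the length-1 dimension
print(ax.shape)

two_index_failed = False
try:
    ax[0, 1]
except IndexError as err:
    two_index_failed = True
    print("two-index access failed:", err)
```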
One way is to use the squeeze=False option when calling subplots:
def plotFunction(xdata, ydata, ax, i, j, label):
    ax[i,j].plot(xdata, ydata, marker='o', label=label)
rows = 1
cols = 2
f, ax = plt.subplots(rows, cols, squeeze=False)
x = np.arange(12)
y = x**2
plotFunction(x,y,ax,0,1,label='quadratic')
This permits [row,col] indexing in all cases.