How to plot multiple plots using for loop - python

import pandas as pd
import seaborn as sns
a = pd.DataFrame({'length': [20, 10, 30, 40, 50],
                  'width': [5, 10, 15, 20, 25],
                  'height': [7, 14, 21, 28, 35]})
for i, feature in enumerate(a, 1):
    sns.regplot(x=feature, y='height', data=a)
    print("{} plotting {} ".format(i, feature))
I want to draw three separate plots, with 'length', 'width' and 'height' in turn on the x-axis and 'height' on the y-axis in each of them.
This is the code I wrote, but it overlays the three plots on top of one another. I want three separate plots instead.

It depends on what you want to do. If you want several individual plots, you can create a new figure for each dataset:
import matplotlib.pyplot as plt
for i, feature in enumerate(a, 1):
    plt.figure()  # forces a new figure
    sns.regplot(data=a, x=feature, y='height')
    print("{} plotting {} ".format(i, feature))
Alternatively, you can draw them all on the same figure, but in different subplots, i.e. next to each other:
import matplotlib.pyplot as plt
# create a figure with 3 subplots
fig, axes = plt.subplots(1, a.shape[1])
for i, (feature, ax) in enumerate(zip(a, axes), 1):
    sns.regplot(data=a, x=feature, y='height', ax=ax)
    print("{} plotting {} ".format(i, feature))
plt.subplots has several options that let you arrange the plots the way you like. Check the docs for more on that!
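As a rough illustration of those options, here is a minimal sketch that puts the three panels side by side with a shared y-axis. It uses plain ax.scatter in place of sns.regplot so that it only needs Matplotlib and pandas; the figsize and sharey values are illustrative choices, not something the answer above prescribes.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
import pandas as pd

a = pd.DataFrame({'length': [20, 10, 30, 40, 50],
                  'width': [5, 10, 15, 20, 25],
                  'height': [7, 14, 21, 28, 35]})

# One row, one subplot per DataFrame column, all sharing the same y-axis
fig, axes = plt.subplots(1, a.shape[1], figsize=(12, 4), sharey=True)

for feature, ax in zip(a.columns, axes):
    ax.scatter(a[feature], a['height'])  # stand-in for sns.regplot(..., ax=ax)
    ax.set_xlabel(feature)

axes[0].set_ylabel('height')  # the y-axis is shared, so one label is enough
fig.tight_layout()            # avoid overlapping labels between panels
```

Swapping the ax.scatter call for sns.regplot(data=a, x=feature, y='height', ax=ax) should give the regression version of the same layout.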

Related

Matplotlib multiple colorbars

I need four different color bars for a single plot. My plot consists of 5 subplots, but only the first one requires four different color bar plots. I would like to position them below a graph. I achieved it, but my color bars are of different sizes:
For your reference, my figure is split into 5 subplots using the matplotlib subplot2grid method. A minimal working example (without subplot2grid) is shown below:
import matplotlib.pyplot as plt
import numpy as np
frame = np.zeros([512, 512])
fig = plt.figure(figsize=(16, 8))
ax = plt.imshow(frame)
for i in range(4):
    plt.colorbar(fraction=0.046, pad=0.04, location="bottom")
plt.show()
How do I position the color bars below the plot, next to each other and all of the same length (e.g. the same length as the first color bar, or the image size)?
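One possible approach, sketched here under my own assumptions rather than taken from an answer in this thread, is to reserve a dedicated axes for each color bar with a GridSpec and pass it to colorbar through cax. Because the four color bar axes sit in equally sized grid cells, they come out the same length. The height_ratios, hspace and wspace values are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
import numpy as np

frame = np.zeros([512, 512])
fig = plt.figure(figsize=(8, 8))

# Top row: the image, spanning all four columns. Bottom row: four thin cells.
gs = fig.add_gridspec(2, 4, height_ratios=[20, 1], hspace=0.15, wspace=0.3)
ax_img = fig.add_subplot(gs[0, :])
im = ax_img.imshow(frame)

# One axes per color bar; equal grid cells give equal bar lengths
cbar_axes = [fig.add_subplot(gs[1, i]) for i in range(4)]
cbars = [fig.colorbar(im, cax=cax, orientation="horizontal") for cax in cbar_axes]
```

In this sketch all four bars are tied to the same image; in a real figure each one would be given its own mappable. The bars span the grid width, which is not necessarily the drawn image width when imshow keeps a square aspect.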

How to combine 2 dataframe histograms in 1 plot?

I would like to show the histograms of all columns in a dataframe, which df.hist(bins=10) does. However, I would also like to add a second set of histograms showing the CDF: df_hist=df.hist(cumulative=True,bins=100,density=1,histtype="step")
I tried separating their matplotlib axes using fig=plt.figure() and plt.subplot(211), but df.hist is a pandas function, not a matplotlib function. I also tried creating axes and passing ax=ax1 and ax=ax2 to each histogram, but that didn't work.
How can I combine these histograms together?
Any help?
The histograms that I want to combine look like these. I want to show them side by side or put the second one on top of the first one.
Sorry that I didn't care to make them look good.
It is possible to draw them together:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# toy data frame
df = pd.DataFrame(np.random.normal(0,1,(100,20)))
# draw hist
fig, axes = plt.subplots(5,4, figsize=(16,10))
df.plot(kind='hist', subplots=True, ax=axes, alpha=0.5)
# clone axes so they have different scales
ax_new = [ax.twinx() for ax in axes.flatten()]
df.plot(kind='kde', ax=ax_new, subplots=True)
plt.show()
Output:
It's also possible to draw them side-by-side. For example:
fig, axes = plt.subplots(10,4, figsize=(16,10))
hist_axes = axes.flatten()[:20]
df.plot(kind='hist', subplots=True, ax=hist_axes, alpha=0.5)
kde_axes = axes.flatten()[20:]
df.plot(kind='kde', subplots=True, ax=kde_axes, alpha=0.5)
This plots the histograms in the top five rows of the grid and the KDE curves in the bottom five rows.
You can find more info in "Multiple histograms in Pandas" (a possible duplicate, by the way); apparently pandas cannot draw multiple histograms on the same graph.
That's fine, because np.histogram and matplotlib.pyplot can. Check the question linked above for a more complete answer.
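As a hedged illustration of the np.histogram plus Matplotlib route, the sketch below bins one toy column with np.histogram, draws the counts as bars, and overlays the cumulative fraction on a twin y-axis. The data and variable names are made up for the example, and ax.stairs assumes Matplotlib 3.4 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
values = rng.normal(0, 1, 100)  # toy data standing in for one DataFrame column

# Bin once, then derive the cumulative fraction from the same counts
counts, edges = np.histogram(values, bins=10)
cdf = np.cumsum(counts) / counts.sum()

fig, ax = plt.subplots()
ax.bar(edges[:-1], counts, width=np.diff(edges), align='edge', alpha=0.5)
ax.set_ylabel('count')

# Second y-axis on the same plot, so counts and fractions keep their own scales
ax_cdf = ax.twinx()
ax_cdf.stairs(cdf, edges, color='tab:orange', linewidth=2)
ax_cdf.set_ylabel('cumulative fraction')
```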
Solution for overlapping histograms with df.hist with any number of subplots
You can combine two dataframe histogram figures by creating twin axes using the grid of axes returned by df.hist. Here is an example of normal histograms combined with cumulative step histograms where the size of the figure and the layout of the grid of subplots are taken care of automatically:
import numpy as np # v 1.19.2
import pandas as pd # v 1.1.3
import matplotlib.pyplot as plt # v 3.3.2
# Create sample dataset stored in a pandas dataframe
rng = np.random.default_rng(seed=1) # random number generator
letters = [chr(i) for i in range(ord('A'), ord('G')+1)]
df = pd.DataFrame(rng.exponential(1, size=(100, len(letters))), columns=letters)
# Set parameters for figure dimensions and grid layout
nplots = df.columns.size
ncols = 3
nrows = int(np.ceil(nplots/ncols))
subp_w = 10/ncols # 10 is the total figure width in inches
subp_h = 0.75*subp_w
bins = 10
# Plot grid of histograms with pandas function (with a shared y-axis)
grid = df.hist(grid=False, sharey=True, figsize=(ncols*subp_w, nrows*subp_h),
               layout=(nrows, ncols), bins=bins, edgecolor='white', linewidth=0.5)
# Create list of twin axes containing second y-axis: note that due to the
# layout, the grid object may contain extra unused axes that are not shown
# (here in the H and I positions). The ax parameter of df.hist only accepts
# a number of axes that corresponds to the number of numerical variables
# in df, which is why the flattened array of grid axes is sliced here.
grid_twinx = [ax.twinx() for ax in grid.flat[:nplots]]
# Plot cumulative step histograms over normal histograms: note that the grid layout is
# preserved in grid_twinx so no need to set the layout parameter a second time here.
df.hist(ax=grid_twinx, histtype='step', bins=bins, cumulative=True, density=True,
        color='tab:orange', linewidth=2, grid=False)
# Adjust space between subplots after generating twin axes
plt.gcf().subplots_adjust(wspace=0.4, hspace=0.4)
plt.show()
Solution for displaying histograms of different types side-by-side with matplotlib
To my knowledge, it is not possible to show the different types of plots side-by-side with df.hist. You need to create the figure from scratch, like in this example using the same dataset as before:
# Set parameters for figure dimensions and grid layout
nvars = df.columns.size
plot_types = 2 # normal histogram and cumulative step histogram
ncols_vars = 2
nrows = int(np.ceil(nvars/ncols_vars))
subp_w = 10/(plot_types*ncols_vars) # 10 is the total figure width in inches
subp_h = 0.75*subp_w
bins = 10
# Create figure with appropriate size
fig = plt.figure(figsize=(plot_types*ncols_vars*subp_w, nrows*subp_h))
fig.subplots_adjust(wspace=0.4, hspace=0.7)
# Create subplots by adding a new axes per type of plot for each variable
# and create lists of axes of normal histograms and their y-axis limits
axs_hist = []
axs_hist_ylims = []
for idx, var in enumerate(df.columns):
    axh = fig.add_subplot(nrows, plot_types*ncols_vars, idx*plot_types+1)
    axh.hist(df[var], bins=bins, edgecolor='white', linewidth=0.5)
    axh.set_title(f'{var} - Histogram', size=11)
    axs_hist.append(axh)
    axs_hist_ylims.append(axh.get_ylim())
    axc = fig.add_subplot(nrows, plot_types*ncols_vars, idx*plot_types+2)
    axc.hist(df[var], bins=bins, density=True, cumulative=True,
             histtype='step', color='tab:orange', linewidth=2)
    axc.set_title(f'{var} - Cumulative step hist.', size=11)
# Set shared y-axis for histograms
for ax in axs_hist:
    ax.set_ylim(max(axs_hist_ylims))
plt.show()

plot matplotlib subplots indefinitely?

Say I am testing a range of parameters of a clustering algorithm and want to write Python code that plots all the results of the algorithm in subplots, two to a row.
Is there a way to do this without pre-calculating how many plots you need in total?
Something like:
for c in range(3,10):
    k = KMeans(n_clusters=c)
    plt.subplots(_, 2, _)
    plt.scatter(data=data, x='x', y='y', c=k.fit_predict(data))
... and then it would just plot 'data' with 'c' clusters 2 plots per row until it ran out of stuff to plot.
thanks!
This answer from the question Dynamically add/create subplots in matplotlib explains a way to do it:
https://stackoverflow.com/a/29962074/3827277
verbatim copy & paste:
import matplotlib.pyplot as plt
# Start with one
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([1,2,3])
# Now later you get a new subplot; change the geometry of the existing
n = len(fig.axes)
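# (change_geometry was deprecated in Matplotlib 3.4 and is absent from recent
# releases, where ax.set_subplotspec with a GridSpec is the replacement)
for i in range(n):
    fig.axes[i].change_geometry(n+1, 1, i+1)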
# Add the new
ax = fig.add_subplot(n+1, 1, n+1)
ax.plot([4,5,6])
plt.show()
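On recent Matplotlib versions the same idea can, as far as I can tell, be expressed with GridSpec and set_subplotspec. The sketch below is my own adaptation, not part of the linked answer: add_panel is a hypothetical helper name, and ax.plot(range(c)) is a placeholder for the KMeans scatter in the question. It lays the panels out two per row and regrids the existing axes each time a new one is added.

```python
import math
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec


def add_panel(fig, ncols=2):
    """Hypothetical helper: append one axes to fig, regridding the existing ones."""
    n_total = len(fig.axes) + 1
    nrows = math.ceil(n_total / ncols)
    gs = GridSpec(nrows, ncols, figure=fig)
    # Move every existing axes into its cell of the new, possibly taller grid
    for i, ax in enumerate(fig.axes):
        ax.set_subplotspec(gs[i])
    # The new axes takes the next free cell
    return fig.add_subplot(gs[n_total - 1])


fig = plt.figure()
for c in range(3, 10):
    ax = add_panel(fig)
    ax.plot(range(c))              # placeholder for the clustering scatter plot
    ax.set_title(f"{c} clusters")
fig.tight_layout()
```

If the results can be collected in a list first, it is usually simpler to compute the number of rows once and call plt.subplots a single time.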

Avoid overlapping on seaborn plots

I'm making some EDA using pandas and seaborn, this is the code I have to plot the histograms of a group of features:
skewed_data = pd.DataFrame.skew(data)
skewed_features =skewed_data.index
fig, axs = plt.subplots(ncols=len(skewed_features))
plt.ticklabel_format(style='sci', axis='both', scilimits=(0,0))
for i,skewed_feature in enumerate(skewed_features):
    g = sns.distplot(data[column])
    sns.distplot(data[skewed_feature], ax=axs[i])
This is the result I'm getting:
It is not readable. How can I avoid that issue?
I know you are concerned about the layout of the figures. However, you first need to decide how to represent your data. Here are two choices for your case:
(1) Multiple lines in one figure, and
(2) Multiple subplots in a 2x2 grid, with each subplot drawing one line.
I am not very familiar with seaborn, but seaborn's plotting is built on matplotlib, so I can give you some basic ideas.
To achieve (1), first create the figure and ax, then add every line to that ax. Example code:
import matplotlib.pyplot as plt
import seaborn as sns
fig, ax = plt.subplots()
# YOUR LOOP, use the ax parameter
for i in range(3):
    sns.distplot(data[i], ax=ax)
To achieve (2), do the same as above, but create several subplots and put each line in a different subplot.
# Four subplots, 2x2
fig, axarr = plt.subplots(2,2)
# YOUR LOOP, use different cell
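To make option (2) concrete, here is a minimal sketch of what that loop might look like. The data frame is a made-up stand-in for the asker's data, and ax.hist is used in place of sns.distplot so the example needs only Matplotlib, NumPy and pandas; passing ax=ax to the seaborn call should work the same way.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical stand-in for the asker's `data` DataFrame
data = pd.DataFrame(rng.exponential(1.0, size=(200, 4)),
                    columns=['f1', 'f2', 'f3', 'f4'])

fig, axarr = plt.subplots(2, 2, figsize=(8, 6))

# axarr is a 2x2 array of axes; .flat walks through it row by row
for ax, column in zip(axarr.flat, data.columns):
    ax.hist(data[column], bins=20)
    ax.set_title(column)

fig.tight_layout()  # adds spacing so titles and tick labels do not collide
```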
You may want to check the matplotlib subplots demo. Producing a good visualization is hard work, and there is a lot of documentation to read. Browsing the matplotlib or seaborn gallery is a good, quick way to see how particular kinds of visualization are implemented.
Thanks.

How to put the legend on first subplot of seaborn.FacetGrid?

I have a pandas DataFrame df which I visualize with subplots of a seaborn.barplot. My problem is that I want to move my legend inside one of the subplots.
To create subplots based on a condition (in my case Area), I use seaborn.FacetGrid. This is the code I use:
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
# .. load data
grid = sns.FacetGrid(df, col="Area", col_order=['F1','F2','F3'])
bp = grid.map(sns.barplot,'Param','Time','Method')
bp.add_legend()
bp.set_titles("{col_name}")
bp.set_ylabels("Time (s)")
bp.set_xlabels("Number")
plt.show()
Which generates this plot:
You see that the legend here is totally at the right, but I would like to have it inside one of the plots (for example the left one) since my original data labels are quite long and the legend occupies too much space. This is the example for only 1 plot where the legend is inside the plot:
and the code:
mask = df['Area']=='F3'
ax=sns.barplot(x='Param',y='Time',hue='Method',data=df[mask])
plt.show()
Test 1:
I tried the example of an answer where they have the legend in one of the subplots:
grid = sns.FacetGrid(df, col="Area", col_order=['F1','F2','F3'])
bp = grid.map(sns.barplot,'Param','Time','Method')
Ax = bp.axes[0]
Boxes = [item for item in Ax.get_children()
         if isinstance(item, matplotlib.patches.Rectangle)][:-1]
legend_labels = ['So1', 'So2', 'So3', 'So4', 'So5']
# Create the legend patches
legend_patches = [matplotlib.patches.Patch(color=C, label=L) for
                  C, L in zip([item.get_facecolor() for item in Boxes],
                              legend_labels)]
# Plot the legend
plt.legend(legend_patches)
plt.show()
Note that plt.legend(handles=legend_patches) did not work for me, so I used plt.legend(legend_patches) instead, as suggested in a comment on that answer. The result, however, is:
As you see the legend is in the third subplot and neither the colors nor labels match.
Test 2:
Finally I tried to create a subplot with a column wrap of 2 (col_wrap=2) with the idea of having the legend in the right-bottom square:
grid = sns.FacetGrid(df, col="MapPubName", col_order=['F1','F2','F3'],col_wrap=2)
but this also results in the legend being at the right:
Question: How can I get the legend inside the first subplot? Or how can I move the legend to anywhere in the grid?
You can set the legend on the specific axes you want, by using grid.axes[i][j].legend()
For your case of a 1 row, 3 column grid, you want to set grid.axes[0][0].legend() to plot on the left hand side.
Here's a simple example derived from your code, but changed to account for the sample dataset.
import matplotlib.pyplot as plt
import matplotlib
import seaborn as sns
df = sns.load_dataset("tips")
grid = sns.FacetGrid(df, col="day")
bp = grid.map(sns.barplot,"time",'total_bill','sex')
grid.axes[0][0].legend()
bp.set_titles("{col_name}")
bp.set_ylabels("Time (s)")
bp.set_xlabels("Number")
plt.show()
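The underlying mechanism is that a legend belongs to whichever axes you call legend() on. The sketch below shows this in plain Matplotlib, without seaborn, using made-up numbers; the area and method names only mirror the question for readability. Every subplot gets labelled bars, but only the first one draws a legend.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; omit interactively
import matplotlib.pyplot as plt
import numpy as np

# Made-up numbers, for illustration only
areas = ['F1', 'F2', 'F3']
methods = ['So1', 'So2']
params = np.arange(3)
times = {area: {m: np.arange(1, 4) * (k + 1) for k, m in enumerate(methods)}
         for area in areas}

fig, axes = plt.subplots(1, len(areas), sharey=True, figsize=(9, 3))
bar_width = 0.4
for ax, area in zip(axes, areas):
    for k, method in enumerate(methods):
        # Offset each method's bars so they sit side by side within a group
        ax.bar(params + k * bar_width, times[area][method],
               width=bar_width, label=method)
    ax.set_title(area)

# Only the left-most subplot gets a legend, drawn inside its axes
axes[0].legend(loc='upper left')
```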
Use the legend_out=False option.
If you are making a faceted bar plot, you should use factorplot (renamed catplot in seaborn 0.9) with kind="bar". Otherwise, if you don't explicitly specify the order for each facet, your plot may end up being wrong.
import seaborn as sns
tips = sns.load_dataset("tips")
sns.catplot(x="sex", y="total_bill", hue="smoker", col="day",
            data=tips, kind="bar", aspect=.7, legend_out=False)
