Is there a way to make multiple horizontal boxplots in matplotlib? - python

I am trying to make a matplotlib figure that will have multiple horizontal boxplots stacked on one another. The documentation shows both how to make a single horizontal boxplot and how to make multiple vertically oriented plots in this section.
I tried using subplots as in the following code:
import numpy as np
import pylab as plt
totfigs = 5
plt.figure()
plt.hold = True
for i in np.arange(totfigs):
    x = np.random.random(50)
    plt.subplot('{0}{1}{2}'.format(totfigs,1,i+1))
    plt.boxplot(x,vert=0)
plt.show()
My output results in just a single horizontal boxplot though.
Any suggestions anyone?
Edit: Thanks to @joaquin, I fixed the plt.subplot call line. Now the subplot version works, but I would still like the boxplots all in one figure...

If I'm understanding you correctly, you just need to pass boxplot a list (or a 2d array) containing each array you want to plot.
import numpy as np
import matplotlib.pyplot as plt
totfigs = 5
plt.figure()
# collect one array per box
boxes = []
for i in np.arange(totfigs):
    x = np.random.random(50)
    boxes.append(x)
# a list of arrays draws one box per array on the same axes
plt.boxplot(boxes, vert=0)
plt.show()

Try:
plt.subplot('{0}{1}{2}'.format(totfigs, 1, i+1)) # n rows, 1 column
or
plt.subplot('{0}{1}{2}'.format(1, totfigs, i+1)) # 1 row, n columns
from the docstring:
subplot(*args, **kwargs)
Create a subplot command, creating axes with::
subplot(numRows, numCols, plotNum)
where plotNum = 1 is the first plot number and increasing plotNums
fill rows first. max(plotNum) == numRows * numCols
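The three-argument form from the docstring can be sketched as follows. This is only an illustration: it passes the three integers directly instead of a formatted string, on the assumption that newer Matplotlib releases may reject a string argument, and it uses fig.add_subplot, which takes the same (numRows, numCols, plotNum) arguments.

```python
import numpy as np
import matplotlib.pyplot as plt

totfigs = 5
fig = plt.figure()
for i in range(totfigs):
    x = np.random.random(50)
    # three integers: number of rows, number of columns, 1-based plot index
    ax = fig.add_subplot(totfigs, 1, i + 1)
    ax.boxplot(x, vert=False)
```

Call plt.show() afterwards to display the five stacked axes.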
If you want them all in one axes, shift the data of each box. As an example with a constant shift:
for i in np.arange(totfigs):
    x = np.random.random(50)
    plt.boxplot(x+(i*2),vert=0)
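Shifting the data also changes the values shown on the axis. As a possible alternative, the sketch below leaves the data untouched and uses the positions argument of boxplot to place each box on its own row. The helper name stacked_horizontal_boxplots and the "set N" bookkeeping are made up for illustration; manage_ticks=False is used so the ticks can be set once at the end.

```python
import numpy as np
import matplotlib.pyplot as plt

def stacked_horizontal_boxplots(datasets):
    """Draw one horizontal box per dataset on a single axes (hypothetical helper)."""
    fig, ax = plt.subplots()
    for i, x in enumerate(datasets):
        # positions places the box at row i + 1 without altering the data
        ax.boxplot(x, positions=[i + 1], vert=False, manage_ticks=False)
    ax.set_yticks(range(1, len(datasets) + 1))
    ax.set_ylim(0.5, len(datasets) + 0.5)
    return fig, ax

rng = np.random.default_rng(0)
fig, ax = stacked_horizontal_boxplots([rng.random(50) for _ in range(5)])
```

Call plt.show() to display the result. Recent Matplotlib releases also offer an orientation argument in place of vert, so the exact keyword may depend on the installed version.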

Related

filling a Mat Plot Lib Scatter plot with points using a loop

I tried this but got an error that they are not the same size
x = np.linspace(0,501,num=50)
y = np.linspace(0,501,num=50)
for i in range(10,510,10):
    plt.scatter(x,i,c='dimgrey')
ax = plt.gca()
ax.set_facecolor('darkgrey')
plt.xlim(0,501)
plt.ylim(0,501);
My overall goal is to have N points plotted in a grid orientation in the scatter plot. I was trying to plot 2500 points like this.
All I could come up with was one row or column would equal 50 points,
and I made this loop.
I want to fill the plot like this: a line of points at y= 10 as I have here, then at 20,30,40... so on. I realize I could do this manually but is there an easier way I could incorporate it into the loop? I am planning on putting it into an animation later.
Here is a simple example, starting from your code.
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0,501,num=50)
for i in range(10,40,10):
    y = i * np.ones(50)
    plt.scatter(x,y)
This gives the following plot:
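If the goal is the full 50 x 50 grid (2500 points), the loop can probably be avoided altogether. The sketch below builds all coordinates at once with np.meshgrid. The helper name grid_points and the range 10 to 500 are assumptions made for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def grid_points(n_per_side=50, low=10, high=500):
    """Return flat x and y arrays for an n_per_side x n_per_side grid."""
    ticks = np.linspace(low, high, num=n_per_side)
    xx, yy = np.meshgrid(ticks, ticks)
    return xx.ravel(), yy.ravel()

x, y = grid_points()
fig, ax = plt.subplots()
ax.scatter(x, y, c="dimgrey", s=4)
ax.set_facecolor("darkgrey")
ax.set_xlim(0, 501)
ax.set_ylim(0, 501)
```

Because x and y are plain arrays, slices of them can be passed to scatter later on, which may be convenient for an animation.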

Second y axis and vertical line

I am creating a violinplot using the following code:
import seaborn as sns
ax = sns.violinplot(data=df[['SoundProduction','SoundForecast','diff']])
ax.set_ylabel("Sound power level [dB(A)]")
It gives me the following result:
Is there any way I can plot diff on a second y-axis so that all three series become clearly visible?
Also, is there a way to plot a vertical line in between 2 series? In this case I want a vertical line between SoundForecast and diff once they are plotted on two different axes.
You can achieve this using multiple subplots, which are easily set up using plt.subplots (see lots more subplot examples).
This allows you to display your distributions on appropriate scales without "wasting" display space. Most (all?) of seaborn's plotting functions accept the ax= argument, so you can set the axes where the plot will be rendered. The axes also have clear separations between them.
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# generate some random distribution data
n = 800 # samples
prod = 95 + 5 * np.random.beta(0.6, 0.5, size=n) # a bimodal distribution
forecast = prod + 3*np.random.randn(n) # forecast is a noisy estimate around the "true" production
diff = prod-forecast # should have mean 0 and standard deviation 3
df = pd.DataFrame(np.array([prod, forecast, diff]).T, columns=['SoundProduction','SoundForecast','diff'])
# set up two subplots, with one wider than the other
fig, ax = plt.subplots(1,2, num=1, gridspec_kw={'width_ratios':[2,1]})
# plot violin distribution estimates separately so the y-scaling makes sense in each group
sns.violinplot(data=df[['SoundProduction','SoundForecast']], ax=ax[0])
sns.violinplot(data=df[['diff']], ax=ax[1])
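For the literal request (a second y-axis plus a vertical separator), one possible sketch is shown below. Note that it swaps in Matplotlib's own violinplot instead of seaborn, so that the x positions of the violins are explicit. The function name violins_with_second_axis and the positions 1, 2, 3 are assumptions chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def violins_with_second_axis(prod, forecast, diff):
    """Plot prod/forecast on the left y-axis and diff on a twin right y-axis."""
    fig, ax_left = plt.subplots()
    ax_right = ax_left.twinx()  # shares the x-axis, has its own y scale
    ax_left.violinplot([prod, forecast], positions=[1, 2])
    ax_right.violinplot([diff], positions=[3])
    # dashed separator halfway between SoundForecast (2) and diff (3)
    ax_left.axvline(2.5, color="k", linestyle="--")
    ax_left.set_xticks([1, 2, 3])
    ax_left.set_xticklabels(["SoundProduction", "SoundForecast", "diff"])
    ax_left.set_xlim(0.5, 3.5)
    ax_left.set_ylabel("Sound power level [dB(A)]")
    ax_right.set_ylabel("diff [dB(A)]")
    return fig, ax_left, ax_right

rng = np.random.default_rng(0)
prod = 95 + 5 * rng.beta(0.6, 0.5, size=800)
forecast = prod + 3 * rng.standard_normal(800)
fig, ax_left, ax_right = violins_with_second_axis(prod, forecast, prod - forecast)
```

Call plt.show() to display the figure. The two y-axes scale independently, so the diff violin is no longer squeezed by the range of the other two series.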

how to change the colors of multiple subplots at once?

I am looping through a bunch of CSV files containing various measurements.
Each file might be from one of 4 different data sources.
In each file, I merge the data into monthly datasets, that I then plot in a 3x4 grid. After this plot has been saved, the loop moves on and does the same to the next file.
This part I have figured out; however, I would like to add a visual cue to the plots indicating which data source they come from. As far as I understand it (and have tried it),
plt.subplot(4,3,1)
plt.hist(Jan_Data,facecolor='Red')
plt.ylabel('value count')
plt.title('January')
does work; however, this way I would have to add facecolor='Red' by hand to each of the 12 subplots. Looping through the plots won't work for this situation, since I want the ylabel only for the leftmost plots, and xlabels for the bottom row.
Setting facecolor at the beginning in
fig = plt.figure(figsize=(20,15),facecolor='Red')
does not work, since it only changes the background color of the 20 by 15 figure now, which subsequently gets ignored when I save it to a PNG, since it only gets set for screen output.
So is there just a simple setthecolorofallbars='Red' command for plt.hist(… or plt.savefig(… I am missing, or should I just copy n' paste it to all twelve months?
You can use mpl.rc("axes", prop_cycle=cycler(color=["red"])) to set the default color cycle for all your axes. (Older Matplotlib versions used the color_cycle parameter instead, which has since been removed.)
In this little toy example, I use the with mpl.rc_context block to limit the effects of mpl.rc to just the block. This way you don't spoil the default parameters for your whole session.
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
from cycler import cycler
np.random.seed(42)
# create some toy data
n, m = 2, 2
data = []
for i in range(n*m):
    data.append(np.random.rand(30))
# and do the plotting
with mpl.rc_context():
    # every axes created inside this block cycles through red only
    mpl.rc("axes", prop_cycle=cycler(color=["red"]))
    fig, axes = plt.subplots(n, m, figsize=(8,8))
    for ax, d in zip(axes.flat, data):
        ax.hist(d)
The problem with the x- and y-labels (when you use loops) can be solved by using plt.subplots, as you can access every axis separately.
import matplotlib.pyplot as plt
import numpy.random
# creating figure with 4 plots
fig,ax = plt.subplots(2,2)
# some data
data = numpy.random.randn(4,1000)
# some titles
title = ['Jan','Feb','Mar','April']
xlabel = ['xlabel1','xlabel2']
ylabel = ['ylabel1','ylabel2']
for i in range(ax.size):
    a = ax[i//2, i%2]  # integer division gives the row index
    a.hist(data[i],facecolor='r',bins=50)
    a.set_title(title[i])
# write the ylabels on all axes on the left-hand side
for j in range(ax.shape[0]):
    ax[j,0].set_ylabel(ylabel[j])
# write the xlabels on all axes on the bottom
for j in range(ax.shape[1]):
    ax[-1,j].set_xlabel(xlabel[j])
fig.tight_layout()
All features (like titles) which are not constant can be put into arrays and placed at the appropriate axis.
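Applied to the 4x3 monthly grid from the question, the same idea might look like the sketch below: the color is chosen once per file, and labels are written only on the left column and bottom row. The SOURCE_COLORS mapping, the source names, and the function name plot_monthly_histograms are all hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical mapping from data source to bar color
SOURCE_COLORS = {"source_a": "red", "source_b": "blue",
                 "source_c": "green", "source_d": "orange"}
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

def plot_monthly_histograms(monthly_data, source):
    """Draw 12 histograms in a 4x3 grid, colored by data source."""
    color = SOURCE_COLORS[source]
    fig, axes = plt.subplots(4, 3, figsize=(20, 15))
    for ax, month, values in zip(axes.flat, MONTHS, monthly_data):
        ax.hist(values, facecolor=color)
        ax.set_title(month)
    for ax in axes[:, 0]:   # leftmost column only
        ax.set_ylabel("value count")
    for ax in axes[-1, :]:  # bottom row only
        ax.set_xlabel("value")
    return fig, axes

rng = np.random.default_rng(42)
fig, axes = plot_monthly_histograms([rng.random(30) for _ in range(12)], "source_a")
```

Saving with fig.savefig then writes the colored bars to the PNG, since the color belongs to the bars rather than to the figure background.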

matplotlib.pyplot issue with subplot, np.ones and np.arange?

I am plotting several data types which share the x axis, so I am using the matplotlib.pyplot subplots command.
The shared x axis is time (in years AD). The last subplot I have is the number of independent observations as a function of the time. I have the following code
import numpy as np
import matplotlib.pyplot as plt
#
# There's a bunch of data analysis here
#
f, ax = plt.subplots(4, sharex=True)
# Here I plot the first 3 subplots with no issue
x = np.arange(900, 2000, 1)#make x array in steps of 1
ax[3].plot(x[0:28], np.ones(len(x[0:28])),'k')#one observation from 900-927 AD
ax[3].plot(x[29:62], 2*np.ones(len(x[29:62])),'k')#two observations from 928-961 AD
Now when I run this code, the subplot I get only shows the second ax[3] plot and not the first. How can I fix this?? Thanks
Ok, I think I found an answer. The first plot was plotting but I couldn't see it with the axes so I changed the y limits
ax[3].set_ylim([0, 7])
That seemed to work, although is there a way to connect these horizontal lines, perhaps with dashed lines?
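One way to connect the horizontal lines, sketched under the assumption that the observation counts are known as contiguous (start, end, count) segments: draw each segment with hlines and join neighbours with dashed vlines. The function name plot_observation_counts and the segment format are made up for this example.

```python
import matplotlib.pyplot as plt

def plot_observation_counts(ax, segments):
    """segments: contiguous (start_year, end_year, count) tuples."""
    for start, end, count in segments:
        ax.hlines(count, start, end, colors="k")
    # dashed vertical connector where one segment ends and the next begins
    for (_, end, c0), (_, _, c1) in zip(segments, segments[1:]):
        ax.vlines(end, min(c0, c1), max(c0, c1), colors="k", linestyles="dashed")
    ax.set_ylim(0, max(count for _, _, count in segments) + 1)

fig, ax = plt.subplots()
plot_observation_counts(ax, [(900, 928, 1), (928, 962, 2)])
```

Setting the y limits explicitly also avoids the original problem of a line hiding on the edge of the axes.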

Dynamically add/create subplots in matplotlib

I want to create a plot consisting of several subplots with shared x/y axes.
It should look something like this from the documentation (though my subplots will be scatterplots): (code here)
But I want to create the subplots dynamically!
So the number of subplots depends on the output of a previous function. (It will probably be around 3 to 15 subplots per diagram, each from a distinct dataset, depending on the input of my script.)
Can anyone tell me how to accomplish that?
Suppose you know total subplots and total columns you want to use:
import matplotlib.pyplot as plt
# Subplots are organized in a Rows x Cols Grid
# Tot and Cols are known
Tot = number_of_subplots
Cols = number_of_columns
# Compute Rows required
Rows = Tot // Cols
# EDIT for correct number of rows:
# If one additional row is necessary -> add one:
if Tot % Cols != 0:
    Rows += 1
# Create a Position index
Position = range(1,Tot + 1)
The first computation of Rows counts only the rows completely filled by subplots; one more row is then added if 1, 2, ..., Cols - 1 subplots still need a place.
Then create figure and add subplots with a for loop.
# Create main figure
fig = plt.figure(1)
for k in range(Tot):
    # add every single subplot to the figure with a for loop
    ax = fig.add_subplot(Rows,Cols,Position[k])
    ax.plot(x,y) # Or whatever you want in the subplot
plt.show()
Please note that you need the range Position to move the subplots into the right place.
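Put together, the row computation and the loop above might look like the following runnable sketch. The function names grid_shape and plot_datasets, the random data, and the default of three columns are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def grid_shape(total, cols):
    """Rows needed to fit `total` subplots into `cols` columns."""
    rows = total // cols
    if total % cols != 0:
        rows += 1  # one extra, partly filled row
    return rows, cols

def plot_datasets(datasets, cols=3):
    """Create one scatter subplot per (x, y) pair."""
    rows, cols = grid_shape(len(datasets), cols)
    fig = plt.figure()
    for k, (x, y) in enumerate(datasets):
        ax = fig.add_subplot(rows, cols, k + 1)  # positions are 1-based
        ax.scatter(x, y)
    return fig

rng = np.random.default_rng(0)
datasets = [(rng.random(20), rng.random(20)) for _ in range(7)]
fig = plot_datasets(datasets)
```

With seven datasets and three columns this yields a 3x3 grid in which the last two cells stay empty.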
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0, 2*np.pi, 400)
y = np.sin(x**2)
plt.subplots_adjust(hspace=0.000)
number_of_subplots=3
for v in range(1, number_of_subplots + 1):
    # v is the 1-based position in the grid
    ax1 = plt.subplot(number_of_subplots,1,v)
    ax1.plot(x,y)
plt.show()
This code works, but you will need to correct the axes. I used subplot to plot 3 graphs all in the same column. All you need to do is assign an integer to the number_of_subplots variable. If the X and Y values are different for each plot, you will need to assign them for each plot.
subplot works as follows: if, for example, I had subplot values of 3,1,1, this creates a 3x1 grid and places the plot in the 1st position. In the next iteration, if my subplot values were 3,1,2, it again creates a 3x1 grid but places the plot in the 2nd position, and so forth.
Based on this post, what you want to do is something like this:
import matplotlib.pyplot as plt
# Start with one
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot([1,2,3])
# Now later you get a new subplot; change the geometry of the existing
n = len(fig.axes)
for i in range(n):
    # note: change_geometry was deprecated in Matplotlib 3.4 and is not available in recent releases
    fig.axes[i].change_geometry(n+1, 1, i+1)
# Add the new
ax = fig.add_subplot(n+1, 1, n+1)
ax.plot([4,5,6])
plt.show()
However, Paul H's answer points to the submodule called gridspec which might make the above easier. I am leaving that as an exercise for the reader ^_~.
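As a rough sketch of the gridspec route, assuming a recent Matplotlib in which set_subplotspec also repositions the axes: build a new GridSpec with one more row, move the existing axes into it, and add the new subplot in the last cell. The helper name append_subplot is hypothetical.

```python
import matplotlib.pyplot as plt
from matplotlib.gridspec import GridSpec

def append_subplot(fig):
    """Re-stack the existing axes in one column and add a new one below."""
    n = len(fig.axes)
    gs = GridSpec(n + 1, 1, figure=fig)
    for i, ax in enumerate(fig.axes):
        ax.set_subplotspec(gs[i, 0])  # move existing axes to row i
    return fig.add_subplot(gs[n, 0])

fig = plt.figure()
first = fig.add_subplot(111)
first.plot([1, 2, 3])
second = append_subplot(fig)
second.plot([4, 5, 6])
```

On older Matplotlib versions set_subplotspec may not update the axes position by itself, so this sketch should be checked against the installed release.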
Instead of counting your own number of rows and columns, I found it easier to create the subplots using plt.subplots first, then iterate through the axes object to add plots.
import matplotlib.pyplot as plt
import numpy as np
fig, axes = plt.subplots(nrows=3, ncols=2, figsize=(12, 8))
x_array = np.random.randn(6, 10)
y_array = np.random.randn(6, 10)
i = 0
for row in axes:
    for ax in row:
        x = x_array[i]
        y = y_array[i]
        ax.scatter(x, y)
        ax.set_title("Plot " + str(i))
        i += 1
plt.tight_layout()
plt.show()
Here I use i to iterate through elements of x_array and y_array, but you can likewise easily iterate through functions, or columns of dataframes to dynamically generate graphs.
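When the number of datasets is only known at run time, the same plt.subplots approach can be sketched as below. The function name scatter_grid and the two-column default are assumptions. squeeze=False keeps axes two-dimensional even for a single row, and any unused cells are hidden.

```python
import numpy as np
import matplotlib.pyplot as plt

def scatter_grid(datasets, ncols=2):
    """One scatter subplot per (x, y) pair, with shared x/y axes."""
    n = len(datasets)
    nrows = -(-n // ncols)  # ceiling division
    fig, axes = plt.subplots(nrows, ncols, sharex=True, sharey=True,
                             squeeze=False)
    for i, (ax, (x, y)) in enumerate(zip(axes.flat, datasets)):
        ax.scatter(x, y)
        ax.set_title("Plot " + str(i))
    for ax in axes.flat[n:]:
        ax.set_visible(False)  # hide cells without data
    return fig, axes

rng = np.random.default_rng(1)
datasets = [(rng.standard_normal(10), rng.standard_normal(10)) for _ in range(5)]
fig, axes = scatter_grid(datasets)
```

Calling fig.tight_layout() and plt.show() afterwards displays the grid as in the example above.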
