Currently, I have the first y axis (probability) of my subplots aligned. However, I am attempting to get the secondary y axis (sample size) of the subplots aligned. I've tried to simply set the y-axis limit, but this solution isn't very generalizable.
Here is my code:
attacks = 5
crit_rate = .5
idealdata = fullMatrix(attacks, crit_rate)
crit_rate = ("crit_%.0f" % (crit_rate*100))
actualdata = trueDataM(attacks, crit_rate)
fig, axs = plt.subplots(attacks+1, sharex=True, sharey=True)
axs2 = [ax.twinx() for ax in axs]
fig.text(0.5, 0.04, 'State', ha='center')
fig.text(0.04, 0.5, 'Probability', va='center', rotation='vertical')
fig.text(.95, .5, 'Sample Size', va='center', rotation='vertical')
fig.text(.45, .9, 'Ideal vs. Actual Critical Strike Rate', va='center')
cmap = plt.get_cmap('rainbow')
samplesize = datasample(attacks, 'crit_50')
fig.set_size_inches(18.5, 10.5)
for i in range(attacks+1):
    axs[i].plot(idealdata[i], color=cmap(i/attacks), marker='o', lw=3)
    axs[i].plot(actualdata[i], 'gray', marker='o', lw=3, ls='--')
    axs2[i].bar(range(len(samplesize[i])), samplesize[i], width=.1, color=cmap(i/attacks), alpha=.6)
plt.show()
Without data to confirm my assumptions it's hard to tell if this will be correct.
You are not making any attempt to scale the left y-axes, so that data must already share the same range. To ensure the right y-axes all have the same scale/limits, you need to determine the range (max and min) of all the data being plotted on those axes, then apply it to every one of them.
It isn't clear whether samplesize is a NumPy ndarray or a list of lists; I'm also assuming that it is a 2-d structure with range(attacks+1) rows. Since you are making bar charts on the second y-axes, you only need to find the largest height in all the data.
# for a list of lists
biggest = max(max(row) for row in samplesize)
# or
biggest = max(map(max,samplesize))
# for an ndarray
biggest = samplesize.max()
Then apply that scale to all the right y-axes before they are shown
for ax in axs2:
    ax.set_ylim(top=biggest)
If you determine biggest prior to the plot loop you can just add a line to that loop:
for i in range(attacks+1):
    ...
    axs2[i].set_ylim(top=biggest)
You'll find plenty of related SO Q&As by searching with terms such as: matplotlib subplots same y scale, matplotlib subplots y axis limits, or something similar.
Here is a toy example:
from matplotlib import pyplot as plt
import numpy as np
lines = np.random.randint(0,200,(5,10))
# the inner upper bound starts at 1 so the outer randint never gets an empty range
bars = [np.random.randint(0, np.random.randint(1, 10000), 10) for _ in range(5)]
fig, axs = plt.subplots(lines.shape[0], sharex=True, sharey=True)
axs2 = [ax.twinx() for ax in axs]
#xs = np.arange(lines.shape[1])
xs = np.arange(1,11)
biggest = max(map(max,bars))
for ax,ax2,line,row in zip(axs,axs2,lines,bars):
    bars = ax2.bar(xs,row)
    ax.plot(line)
    ax2.set_ylim(top=biggest)
plt.show()
plt.close()
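If you would rather not compute the maximum from the data yourself, a more general variant is to let matplotlib autoscale each right-hand axis and then read the limits back. The following is only a sketch under assumptions: the random data and the names `twins` and `shared_limits` are made up for illustration and are not from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)  # made-up data, only for illustration
fig, axs = plt.subplots(4, sharex=True, sharey=True)
twins = [ax.twinx() for ax in axs]

for ax, twin in zip(axs, twins):
    ax.plot(rng.random(10), marker='o')
    # Bars on a deliberately different scale in every row.
    heights = rng.integers(1, rng.integers(10, 10_000), size=10)
    twin.bar(np.arange(10), heights, alpha=0.4)

# Read back the limits matplotlib autoscaled for each right-hand axis...
lows, highs = zip(*(twin.get_ylim() for twin in twins))
shared_limits = (min(lows), max(highs))
# ...and apply the widest span to all of them.
for twin in twins:
    twin.set_ylim(shared_limits)
```

Because the limits come from the axes rather than from the data structure, this should work whether the bars came from a list of lists or an ndarray. Call plt.show() afterwards to display the figure.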
I have a function that takes a string (the name of the dataframe we're visualizing) and returns two histograms that visualize that data. The first plot (on the left) is the raw data; the one on the right is the same data after being normalized (plotted using the matplotlib parameter density=True). But as you can see, this leads to transparency issues when the plots overlap. This is my code for this particular plot:
plt.rcParams["figure.figsize"] = [12, 8]
plt.rcParams["figure.autolayout"] = True
ax0_1 = plt.subplot(121)
_,bins,_ = ax0_1.hist(filtered_0,alpha=1,color='b',bins=15,label='All apples')
ax0_1.hist(filtered_1,alpha=0.9,color='gold',bins=bins,label='Less than two apples')
ax0_1.set_title('Condition 0 vs Condition 1: '+'{}'.format(apple_data),fontsize=14)
ax0_1.set_xlabel('{}'.format(apple_data),fontsize=13)
ax0_1.set_ylabel('Frequency',fontsize=13)
ax0_1.grid(axis='y',linewidth=0.4)
ax0_1.tick_params(axis='x',labelsize=13)
ax0_1.tick_params(axis='y',labelsize=13)
ax0_1_norm = plt.subplot(122)
_,bins,_ = ax0_1_norm.hist(filtered_0,alpha=1,color='b',bins=15,label='All apples',density=True)
ax0_1_norm.hist(filtered_1,alpha=0.9,color='gold',bins=bins,label='Less than two apples',density=True)
ax0_1_norm.set_title('Condition 0 vs Condition 1: '+'{} - Normalized'.format(apple_data),fontsize=14)
ax0_1_norm.set_xlabel('{}'.format(apple_data),fontsize=13)
ax0_1_norm.set_ylabel('Frequency',fontsize=13)
ax0_1_norm.legend(bbox_to_anchor=(2, 0.95))
ax0_1_norm.grid(axis='y',linewidth=0.4)
ax0_1_norm.tick_params(axis='x',labelsize=13)
ax0_1_norm.tick_params(axis='y',labelsize=13)
plt.tight_layout(pad=0.5)
plt.show()
What my current plot looks like
Any ideas on how to make the colors blend a bit better would be helpful. Alternatively, if there are any other combinations you know of that would work instead, feel free to share. I'm not picky about the color choice. Thanks!
I think it is better to distinguish overlapping histograms by their shape (outline versus filled bars) or by a difference in transparency rather than by color alone. I have adapted an example from the official reference and added overlap.
import matplotlib.pyplot as plt
import numpy as np
np.random.seed(20211021)
N_points = 100000
n_bins = 20
x = np.random.randn(N_points)
y = .4 * x + np.random.randn(100000) + 2
fig, axs = plt.subplots(2, 2, sharey=True, tight_layout=True)
# We can set the number of bins with the `bins` kwarg
axs[0,0].hist(x, color='b', alpha=0.9, bins=n_bins, ec='b', fc='None')
axs[0,0].hist(y, color='gold', alpha=0.6, bins=21)
axs[0,0].set_title('edgecolor and facecolor None')
axs[0,1].hist(x, color='b', alpha=0.9, bins=n_bins)
axs[0,1].hist(y, color='gold', alpha=0.6, bins=21, ec='b')
axs[0,1].set_title('edgecolor and facecolor')
axs[1,0].hist(x, alpha=0.9, bins=n_bins, histtype='step', facecolor='b')
axs[1,0].hist(y, color='gold', alpha=0.6, bins=21)
axs[1,0].set_title('step')
axs[1,1].hist(x, color='b', alpha=0.9, bins=n_bins, histtype='bar', rwidth=0.8)
axs[1,1].hist(y, color='gold', alpha=0.6, bins=21, ec='b')
axs[1,1].set_title('bar')
plt.show()
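Applied to the two-panel layout in the question, the same idea could look roughly like this. It is a sketch, not the asker's function: `all_apples` and `few_apples` are hypothetical stand-ins for filtered_0 and filtered_1, and the distributions are invented.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
all_apples = rng.normal(5, 2, 2000)  # hypothetical stand-in for filtered_0
few_apples = rng.normal(3, 1, 600)   # hypothetical stand-in for filtered_1

# One set of bin edges for both samples so the histograms line up exactly.
bins = np.histogram_bin_edges(np.concatenate([all_apples, few_apples]), bins=15)

fig, (ax_raw, ax_norm) = plt.subplots(1, 2, figsize=(12, 5))
for ax, density in ((ax_raw, False), (ax_norm, True)):
    # Light filled shape for the first sample...
    ax.hist(all_apples, bins=bins, density=density, histtype='stepfilled',
            color='b', alpha=0.35, label='All apples')
    # ...and an unfilled outline for the second, so nothing is hidden.
    ax.hist(few_apples, bins=bins, density=density, histtype='step',
            color='darkorange', linewidth=2, label='Less than two apples')
    ax.set_ylabel('Density' if density else 'Frequency')
ax_norm.legend()
```

Drawing one sample as an outline means the overlap region never depends on how two alpha values blend.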
I have created a histogram in a Jupyter notebook to show the distribution of time on page in seconds for 100 web visits.
Code as follows:
from matplotlib.ticker import StrMethodFormatter
ax = df.hist(column='time_on_page', bins=25, grid=False, figsize=(12,8), color='#86bf91', zorder=2, rwidth=0.9)
ax = ax[0]
for x in ax:
    # Despine
    x.spines['right'].set_visible(False)
    x.spines['top'].set_visible(False)
    x.spines['left'].set_visible(False)
    # Switch off ticks (booleans; the "on"/"off" strings are no longer accepted)
    x.tick_params(axis="both", which="both", bottom=False, top=False, labelbottom=True, left=False, right=False, labelleft=True)
    # Draw horizontal axis lines
    vals = x.get_yticks()
    for tick in vals:
        x.axhline(y=tick, linestyle='dashed', alpha=0.4, color='#eeeeee', zorder=1)
    # Set title
    x.set_title("Time on Page Histogram", weight='bold', size=12)
    # Set x-axis label
    x.set_xlabel("Time on Page Duration (Seconds)", labelpad=20, weight='bold', size=12)
    # Set y-axis label
    x.set_ylabel("Page Views", labelpad=20, weight='bold', size=12)
    # Format y-axis label
    x.yaxis.set_major_formatter(StrMethodFormatter('{x:,g}'))
This produces the following visualisation:
I'm generally happy with the appearance; however, I'd like the axis to be a little more descriptive, perhaps showing the bin range for each bin and the percentage of the total that each bin constitutes.
I have looked for this in the Matplotlib documentation but cannot seem to find anything that would allow me to achieve my end goal.
Any help greatly appreciated.
When you set bins=25, 25 equally spaced bins are set between the lowest and highest values encountered. If you use these ranges to mark the bins, things can be confusing due to the arbitrary values. It seems more adequate to round these bin boundaries, for example to multiples of 20. Then, these values can be used as tick marks on the x-axis, nicely between the bins.
The percentages can be added by looping through the bars (rectangular patches). Their height indicates the number of rows belonging to the bin, so dividing by the total number of rows and multiplying by 100 gives a percentage. The bar height, x and half width can position the text.
from matplotlib import pyplot as plt
import numpy as np
import pandas as pd
df = pd.DataFrame({'time_on_page': np.random.lognormal(4, 1.1, 100)})
max_x = df['time_on_page'].max()
bin_width = max(20, np.round(max_x / 25 / 20) * 20) # round to multiple of 20, use max(20, ...) to avoid rounding to zero
bins = np.arange(0, max_x + bin_width, bin_width)
axes = df.hist(column='time_on_page', bins=bins, grid=False, figsize=(12, 8), color='#86bf91', rwidth=0.9)
ax = axes[0, 0]
total = len(df)
ax.set_xticks(bins)
for p in ax.patches:
    h = p.get_height()
    if h > 0:
        ax.text(p.get_x() + p.get_width() / 2, h, f'{h / total * 100.0 :.0f} %\n', ha='center', va='center')
ax.grid(True, axis='y', ls=':', alpha=0.4)
ax.set_axisbelow(True)
for dir in ['left', 'right', 'top']:
    ax.spines[dir].set_visible(False)
ax.tick_params(axis="y", length=0) # Switch off y ticks
ax.margins(x=0.02) # tighter x margins
plt.show()
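If you want each bin labelled with its range explicitly, one option is to compute the histogram yourself with np.histogram and draw it with ax.bar. This is a sketch under assumptions: `values` is a made-up stand-in for df['time_on_page'], and the bin width of 50 is an arbitrary choice.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
values = rng.lognormal(4, 1.1, 100)  # stand-in for df['time_on_page']
bin_width = 50                       # assumed round bin width

edges = np.arange(0, values.max() + bin_width, bin_width)
counts, edges = np.histogram(values, bins=edges)
percent = counts / counts.sum() * 100

# One "low-high" label per bin, placed at the bin centre.
labels = [f'{lo:.0f}-{hi:.0f}' for lo, hi in zip(edges[:-1], edges[1:])]
centers = (edges[:-1] + edges[1:]) / 2

fig, ax = plt.subplots(figsize=(12, 6))
bars = ax.bar(centers, counts, width=bin_width * 0.9, color='#86bf91')
ax.set_xticks(centers)
ax.set_xticklabels(labels, rotation=90)

# Write the share of the total above every non-empty bar.
for rect, pct in zip(bars, percent):
    if pct > 0:
        ax.annotate(f'{pct:.0f}%',
                    (rect.get_x() + rect.get_width() / 2, rect.get_height()),
                    ha='center', va='bottom')
```

The trade-off is that the tick labels get crowded when there are many bins, which is why the answer above marks only the bin boundaries.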
I have written code that plots a graph of time vs. amplitude. Now I want to change the index shown on the horizontal axis. I want to know how to do this for a single plot and also for subplots. I want the range of the horizontal axis to be from 0 to 2*pi.
#the following code was written for plotting
fig, (ax1, ax2 ,ax3) = plt.subplots(3 ,constrained_layout = True)
fig.suptitle('AMPLITUDE MODULATION' ,color = 'Red')
ax1.plot(message_signal)
ax1.set_title('Message Signal' ,color = 'green')
I expect the x-axis to go from 0 to 2*pi only. In short, I want to customize the indexing of the x-axis
You can use xlim to set the limits of the x-axis for whole plot or specific axes, e.g. plt.xlim(0, 1) or ax1.set_xlim(0, 1).
Here I set the limits for the x-axis to be [0, 3*pi]
import numpy as np
import matplotlib.pyplot as plt
fig, (ax1, ax2, ax3) = plt.subplots(3, constrained_layout=True)
fig.suptitle('AMPLITUDE MODULATION', color = 'Red')
x = np.linspace(0, 2*np.pi, 1000)
ax1.plot(x, np.sin(x))
ax1.set_title('Message Signal', color = 'green')
ax1.set_xlim(0, 3*np.pi)
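For the 0 to 2*pi range asked about, across all three subplots, something like the following may help. It is a sketch with invented signals: `message`, `carrier` and `modulated` are placeholders, not the asker's arrays. The key point is to pass explicit x values to plot, since plotting the signal alone uses the sample index as x.

```python
import numpy as np
import matplotlib.pyplot as plt

# Explicit x values: without them, plot() uses the sample index 0..N-1.
t = np.linspace(0, 2 * np.pi, 1000)
message = np.sin(t)                        # placeholder signals
carrier = np.sin(20 * t)
modulated = (1 + 0.5 * message) * carrier

fig, axes = plt.subplots(3, sharex=True, constrained_layout=True)
fig.suptitle('AMPLITUDE MODULATION', color='red')
titles = ('Message Signal', 'Carrier Signal', 'Modulated Signal')
for ax, signal, title in zip(axes, (message, carrier, modulated), titles):
    ax.plot(t, signal)
    ax.set_title(title, color='green')

# With sharex=True, limits and ticks set on one axes apply to all of them.
axes[-1].set_xlim(0, 2 * np.pi)
axes[-1].set_xticks(np.linspace(0, 2 * np.pi, 5))
axes[-1].set_xticklabels(['0', r'$\pi/2$', r'$\pi$', r'$3\pi/2$', r'$2\pi$'])
```

For a single plot, the same two steps apply: plot against `t` and call set_xlim(0, 2 * np.pi) on that axes.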
I need to create a Python script that plots a list of (sorted) values as a vertical bar plot. I'd like to plot all the values and save the result as a long vertical plot, so that both the ytick labels and the bars are clearly visible. That is, I'd like a "long" vertical plot. The number of elements in the list varies (e.g. from 500 to 1000), so the use of figsize does not help, as I don't know how long that should be.
import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots()
# Example data
n = 500
y_pos = np.arange(n)
performance = 3 + 10 * np.random.rand(n)
ax.barh(y_pos, np.sort(performance), align='center', color='green', ecolor='black')
ax.set_yticks(y_pos)
ax.set_yticklabels([str(x) for x in y_pos])
ax.set_xlabel('X')
How can I modify the script so that I can stretch the figure vertically and make it readable?
Change the figsize depending on the number of data values. Also, manage the y-axis limit accordingly.
The following works perfectly:
n = 500
fig, ax = plt.subplots(figsize=(5,n//5)) # Changing figsize depending upon data
# Example data
y_pos = np.arange(n)
performance = 3 + 10 * np.random.rand(n)
ax.barh(y_pos, np.sort(performance), align='center', color='green', ecolor='black')
ax.set_yticks(y_pos)
ax.set_yticklabels([str(x) for x in y_pos])
ax.set_xlabel('X')
ax.set_ylim(0, n) # Manage y-axis properly
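Since the number of values varies, the same idea can be wrapped in a small helper that derives the height from the data. This is a sketch: the function name `long_barh` and the `inches_per_bar`, `width` and `min_height` parameters are made up here, and 0.2 inches per bar is only a guess at a readable spacing.

```python
import numpy as np
import matplotlib.pyplot as plt

def long_barh(values, inches_per_bar=0.2, width=6, min_height=3):
    """Horizontal bar plot whose figure height grows with len(values)."""
    values = np.sort(np.asarray(values))
    n = len(values)
    height = max(min_height, n * inches_per_bar)
    fig, ax = plt.subplots(figsize=(width, height))
    y_pos = np.arange(n)
    ax.barh(y_pos, values, align='center', color='green')
    ax.set_yticks(y_pos)
    ax.set_yticklabels([str(i) for i in y_pos])
    ax.set_ylim(-0.5, n - 0.5)  # no empty band above or below the bars
    ax.set_xlabel('X')
    return fig

fig = long_barh(3 + 10 * np.random.rand(500))
# fig.savefig('long_plot.png', dpi=100, bbox_inches='tight')  # example file name
```

With 500 values this gives a figure 100 inches tall, so saving to a file is more practical than showing it on screen.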
Given below is the output picture for n=200
What I want to do is add a single colorbar (at the right side of the figure shown below) that serves both subplots (they are on the same scale).
Another thing that doesn't make sense to me is why the lines I try to draw at the end of the code are not drawn (they are supposed to be horizontal lines across the center of both plots).
Thanks for the help.
Here is the code:
idx=0
b=plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
B=np.zeros((2*len(self.Chan),len(b[0])))
B[idx,:]=20*log10(b[0])
c=plt.psd(dOD_filt[:,idx],Fs=self.fs,NFFT=512)
C=np.zeros((2*len(self.Chan),len(b[0])))
C[idx,:]=20*log10(c[0])
for idx in range(2*len(self.Chan)):
    b=plt.psd(dOD[:,idx],Fs=self.fs,NFFT=512)
    B[idx,:]=20*log10(b[0])
    c=plt.psd(dOD_filt[:,idx],Fs=self.fs,NFFT=512)
    C[idx,:]=20*log10(c[0])
## Calculate the color scaling for the imshow()
aux1 = max(max(B[i,:]) for i in range(size(B,0)))
aux2 = min(min(B[i,:]) for i in range(size(B,0)))
bux1 = max(max(C[i,:]) for i in range(size(C,0)))
bux2 = min(min(C[i,:]) for i in range(size(C,0)))
scale1 = 0.75*max(aux1,bux1)
scale2 = 0.75*min(aux2,bux2)
fig, axes = plt.subplots(nrows=2, ncols=1,figsize=(7,7))#,sharey='True')
fig.subplots_adjust(wspace=0.24, hspace=0.35)
ii=find(c[1]>=frange)[0]
## Making the plots
cax=axes[0].imshow(B, origin = 'lower',vmin=scale2,vmax=scale1)
axes[0].set_ylim((0,2*len(self.Chan)))
axes[0].set_xlabel(' Frequency (Hz) ')
axes[0].set_ylabel(' Channel Number ')
axes[0].set_title('Pre-Filtered')
cax2=axes[1].imshow(C, origin = 'lower',vmin=scale2,vmax=scale1)
axes[1].set_ylim(0,2*len(self.Chan))
axes[1].set_xlabel(' Frequency (Hz) ')
axes[1].set_ylabel(' Channel Number ')
axes[1].set_title('Post-Filtered')
axes[0].annotate('690nm', xy=((ii+1)/2, len(self.Chan)/2-1),
                 xycoords='data', va='center', ha='right')
axes[0].annotate('830nm', xy=((ii+1)/2, len(self.Chan)*3/2-1 ),
                 xycoords='data', va='center', ha='right')
axes[1].annotate('690nm', xy=((ii+1)/2, len(self.Chan)/2-1),
                 xycoords='data', va='center', ha='right')
axes[1].annotate('830nm', xy=((ii+1)/2, len(self.Chan)*3/2-1 ),
                 xycoords='data', va='center', ha='right')
axes[0].axis('tight')
axes[1].axis('tight')
## Set up the xlim to aprox frange Hz
axes[0].set_xlim(left=0,right=ii)
axes[1].set_xlim(left=0,right=ii)
## Make the xlabels become the actual frequency number
ticks = linspace(0,ii,10)
tickslabel = linspace(0.,frange,10)
for i in range(10):
    tickslabel[i]="%.1f" % tickslabel[i]
axes[0].set_xticks(ticks)
axes[0].set_xticklabels(tickslabel)
axes[1].set_xticks(ticks)
axes[1].set_xticklabels(tickslabel)
## Draw a line to separate the two different wave lengths, and name each region
l1 = Line2D([0,frange],[28,28],ls='-',color='black')
axes[0].add_line(l1)
axes[1].add_line(l1)
And here is the figure it makes:
If any more info is needed, just ask.
Basically, figure.colorbar() works for both images, as long as their scales are not too different. So you could let matplotlib do it for you, or you can manually position your colorbar on its own axes in the figure. Here is how to control the location of the colorbar:
import numpy as np
from matplotlib import pyplot as plt
# randint's upper bound is exclusive, so 11 keeps the original inclusive 0..10 range
A = np.random.randint(0, 11, 100).reshape(10, 10)
B = np.random.randint(0, 11, 100).reshape(10, 10)
fig = plt.figure()
ax1 = fig.add_subplot(221)
ax2 = fig.add_subplot(222)
mapable = ax1.imshow(A, interpolation="nearest")
cax = ax2.imshow(A, interpolation="nearest")
# set the tickmarks *if* you want custom (i.e., arbitrary) tick labels:
cbar = fig.colorbar(cax, ax=None)
fig = plt.figure(2)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122)
mapable = ax1.imshow(A, interpolation="nearest")
cax = ax2.imshow(A, interpolation="nearest")
# [left, bottom, width, height], as fractions of the total figure size
ax3 = fig.add_axes([0.1, 0.1, 0.8, 0.05]) # setup colorbar axes.
# put the colorbar on new axes
cbar = fig.colorbar(mapable,cax=ax3,orientation='horizontal')
plt.show()
Note that you can of course position ax3 as you wish (on the side, on the top, wherever), as long as it is within the boundaries of the figure.
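For the layout in the question (two stacked images and one colorbar on the right), a shorter route may be to give both images the same Normalize object and pass both axes to fig.colorbar. This is a sketch: `pre` and `post` are random placeholders for the asker's B and C arrays.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
pre = rng.normal(0, 1.0, (20, 50))   # placeholder for B
post = rng.normal(0, 0.5, (20, 50))  # placeholder for C

# One normalization for both images, so a single colorbar describes both.
norm = Normalize(vmin=min(pre.min(), post.min()),
                 vmax=max(pre.max(), post.max()))

fig, axes = plt.subplots(nrows=2, ncols=1, figsize=(7, 7))
im_pre = axes[0].imshow(pre, origin='lower', aspect='auto', norm=norm)
im_post = axes[1].imshow(post, origin='lower', aspect='auto', norm=norm)
axes[0].set_title('Pre-Filtered')
axes[1].set_title('Post-Filtered')

# Passing both axes makes matplotlib take the colorbar space from both of them.
cbar = fig.colorbar(im_pre, ax=axes)
```

By default the colorbar is placed to the right of the axes it takes space from, which matches the placement asked for.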
I don't know why your line2D is not appearing.
I added to my code before plt.show() the following and everything is showing:
from mpl_toolkits.axes_grid1 import anchored_artists
from matplotlib.patheffects import withStroke
txt = anchored_artists.AnchoredText("SC",
                                    loc=2,
                                    frameon=False,
                                    prop=dict(size=12))
if withStroke:
    txt.txt._text.set_path_effects([withStroke(foreground="w",
                                               linewidth=3)])
ax1.add_artist(txt)
## Draw a line to separate the two different wave lengths, and name each region
l1 = plt.Line2D([-1,10],[5,5],ls='-',color='black',linewidth=10)
ax1.add_line(l1)
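As for the missing line in the question, one possible cause (an assumption, not checked against the asker's data) is that the same Line2D instance is added to both axes; matplotlib does not support reusing one artist in more than one Axes. The x coordinates [0, frange] are also in Hz, while the image's x-axis is in bin indices, so the line may span only a sliver of the plot. A sketch that sidesteps both points by calling axhline on each axes, with a random placeholder image:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
data = rng.random((56, 40))  # placeholder image: 56 channels by 40 frequency bins

fig, axes = plt.subplots(2, 1, figsize=(7, 7))
for ax in axes:
    ax.imshow(data, origin='lower', aspect='auto')
    # axhline creates a separate line per axes and spans the full width,
    # whatever the x limits are.
    ax.axhline(28, ls='-', color='black')
```

Here 28 is the y position used in the question; each call creates its own artist, so nothing is shared between the two axes.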