Shifting violin plot horizontally in python

I have 8 different arrays that I want to plot using violin plots to compare their distributions. This is how I plotted them:
plt.violinplot(alpha_g159)
plt.violinplot(alpha_g108)
plt.violinplot(alpha_g141)
plt.violinplot(alpha_g110)
plt.violinplot(alpha_g115)
plt.violinplot(alpha_g132)
plt.violinplot(alpha_g105)
plt.violinplot(alpha_g126)
And I have this plot:
Actually what I want to do, is to shift each plot horizontally (along the x-axis) so they would not overlap, and then add on the x-axis the label of each plot.
Could anyone guide me on how to do that? I tried adding an offset, for example alpha_g108 + x0 with x0 = 2, but it just shifts the plot vertically.

You can achieve this by putting your data into a list. Matplotlib then plots the individual violins side by side.
import matplotlib.pyplot as plt
import numpy as np
# put your data in a list like this:
# data = [alpha_g159, alpha_g108, alpha_g141, alpha_g110, alpha_g115, alpha_g132, alpha_g105, alpha_g126]
# as I do not have your data I created some test data
data = [sorted(np.random.normal(0, std, 100)) for std in range(1, 9)]
plt.violinplot(data)
labels = ["alpha_g159", "alpha_g108", "alpha_g141", "alpha_g110", "alpha_g115", "alpha_g132", "alpha_g105", "alpha_g126"]
# add the labels (rotated by 45 degrees so that they do not overlap)
plt.xticks(range(1, 9), labels, rotation=45)
# Tweak spacing to prevent clipping of tick-labels
plt.subplots_adjust(bottom=0.3)
plt.show()
resulting plot
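If the violins need to sit at specific x positions rather than the default 1, 2, 3, ..., violinplot also accepts a positions argument. A minimal sketch, assuming random test data in place of the original arrays; the helper name shifted_violins and the spacing of 2 are illustrative choices, not part of the answer above:

```python
import matplotlib.pyplot as plt
import numpy as np


def shifted_violins(data, labels, positions):
    """Draw one violin per dataset at the given x positions and label them."""
    fig, ax = plt.subplots()
    parts = ax.violinplot(data, positions=positions, widths=1.5)
    ax.set_xticks(positions)
    ax.set_xticklabels(labels, rotation=45)
    fig.subplots_adjust(bottom=0.3)  # leave room for the rotated labels
    return fig, ax, parts


# test data standing in for alpha_g159, alpha_g108, ...
rng = np.random.default_rng(0)
data = [rng.normal(0, std, 100) for std in range(1, 9)]
labels = ["alpha_g159", "alpha_g108", "alpha_g141", "alpha_g110",
          "alpha_g115", "alpha_g132", "alpha_g105", "alpha_g126"]
positions = [2 * i for i in range(len(data))]  # x = 0, 2, 4, ... (arbitrary spacing)

fig, ax, parts = shifted_violins(data, labels, positions)
```

Calling plt.show() afterwards displays the figure. Changing the entries of positions moves each violin along the x-axis without touching the data.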

Related

Set log xticks in matplotlib for a linear plot

Consider
xdata=np.random.normal(5e5,2e5,int(1e4))
plt.hist(np.log10(xdata), bins=100)
plt.show()
plt.semilogy(xdata)
plt.show()
Is there any way to display the xticks of the first plot (plt.hist) like the yticks of the second plot? For good reasons I want to histogram the np.log10(xdata) of xdata, but I'd like the minor ticks to be displayed as usual on a log scale (even though the exponent is linear...)
In other words, I want the x-axis of the first plot to look like the y-axis of the second plot, without changing the spacing between major ticks (e.g., adding log marks between 5.5 and 6.0, without altering these values).
Proper histogram plot with logarithmic x-axis:
Explanation:
- Cut off negative values
  - the randomly generated example data likely still contains some negative values
    - activate the commented code lines at the beginning to see the effect
  - the logarithmic function isn't defined for values <= 0
  - while the 2nd plot just deals with y-axis log scaling (negative values are just out of range), the 1st plot doesn't work with negative values in the BINs range
  - real-world working data probably won't be <= 0; otherwise keep that in mind
- BINs should be aligned to log scale as well
  - otherwise the distribution of the 'BINs widths' looks off
  - switch the # on the plt.hist( statements in the 1st plot section to see the effect
- xdata (not np.log10(xdata)) is to be plotted in the histogram
  - that 'workaround' of plotting np.log10(xdata) was probably the root cause of the misunderstanding in the comments
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}") # note the negative values
# cut off potential negative values (log function isn't defined for <= 0 )
xdata = np.ma.masked_less_equal(xdata, 0)
MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}")
# align the bins to fit a log scale
bins = 100
bins_log_aligned = np.logspace(np.log10(MIN_xdata), np.log10(MAX_xdata), bins)
# 1st plot
plt.hist(xdata, bins = bins_log_aligned) # note: xdata (not np.log10(xdata) )
# plt.hist(xdata, bins = 100)
plt.xscale('log')
plt.show()
# 2nd plot
plt.semilogy(xdata)
plt.show()
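The point about aligning the bins can be checked numerically without plotting. A small sketch, assuming the same seeded test data as above and using boolean indexing instead of a masked array to drop the non-positive values:

```python
import numpy as np

np.random.seed(42)
xdata = np.random.normal(5e5, 2e5, int(1e4))
positive = xdata[xdata > 0]  # the logarithm is undefined for values <= 0

# 100 edges (99 bins) spaced evenly in log10
edges = np.logspace(np.log10(positive.min()), np.log10(positive.max()), 100)
counts, _ = np.histogram(positive, bins=edges)

linear_widths = np.diff(edges)         # grow from left to right
log_widths = np.diff(np.log10(edges))  # constant across all bins
print(linear_widths[0], linear_widths[-1])
print(log_widths.min(), log_widths.max())
```

On a log-scaled x-axis every one of these bins is drawn with the same width, which is why the histogram no longer looks distorted.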
Just kept for now for clarification purpose. Will be deleted when the question is revised.
Disclaimer:
As Lucas M. Uriarte already mentioned, that isn't an expected way of changing axis ticks.
x axis ticks and labels don't represent the plotted data
You should at least always provide that information along with such a plot.
The plot
From seeing the result I kinda understand where that special plot idea is coming from - still there should be a preferred way (e.g. conversion of the data in advance) to do such a plot instead of 'faking' the axis.
Explanation how that special axis transfer plot is done:
- original x-axis is hidden
- a twiny axis is added
  - note that its y-axis is hidden by default, so that doesn't need handling
- twiny x-axis is set to log and the 2nd plot y-axis limits are transferred
  - subplots are used to directly transfer the 2nd plot y-axis limits
  - use variables if you need to stick with your two plots
- twiny x-axis is moved from top (twiny default position) to bottom (where the original x-axis was)
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# plt.subplots creates the figure itself; a separate plt.figure() call would open an extra empty figure
fig, axs = plt.subplots(2, figsize=(7,10), facecolor=(1, 1, 1))
# 1st plot
axs[0].hist(np.log10(xdata), bins=100) # plot the data on the normal x axis
axs[0].axes.xaxis.set_visible(False) # hide the normal x axis
# 2nd plot
axs[1].semilogy(xdata)
# 1st plot - twin axis
axs0_y_twin = axs[0].twiny() # set a twiny axis, note twiny y axis is hidden by default
axs0_y_twin.set(xscale="log")
# transfer the limits from the 2nd plot y axis to the twin axis
axs0_y_twin.set_xlim(axs[1].get_ylim()[0],
                     axs[1].get_ylim()[1])
# move the twin x axis from top to bottom
axs0_y_twin.tick_params(axis="x", which="both", bottom=True, top=False,
                        labelbottom=True, labeltop=False)
# Disclaimer
disclaimer_text = "Disclaimer: x axis ticks and labels don't represent the plotted data"
axs[0].text(0.5,-0.09, disclaimer_text, size=12, ha="center", color="red",
            transform=axs[0].transAxes)
plt.tight_layout()
plt.subplots_adjust(hspace=0.2)
plt.show()
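A different route that keeps the tick labels truthful is sketched here. It is not the method used in this answer: the histogram of np.log10(xdata) stays on its real axis, and minor ticks are placed by hand at log10(2*10^n), ..., log10(9*10^n), so the major ticks and their spacing are untouched. The helper names log_minor_ticks and hist_with_log_minor_ticks are hypothetical:

```python
import numpy as np
import matplotlib.pyplot as plt


def log_minor_ticks(lo, hi):
    """Minor-tick positions log10(k * 10**n), k = 2..9, inside [lo, hi]."""
    decades = np.arange(np.floor(lo), np.floor(hi) + 1)
    ticks = np.concatenate([d + np.log10(np.arange(2, 10)) for d in decades])
    return ticks[(ticks >= lo) & (ticks <= hi)]


def hist_with_log_minor_ticks(xdata, bins=100):
    """Histogram of log10(xdata) with log-style minor ticks on the x axis."""
    positive = xdata[xdata > 0]  # log10 is undefined for values <= 0
    fig, ax = plt.subplots()
    ax.hist(np.log10(positive), bins=bins)
    lo, hi = ax.get_xlim()
    ax.set_xticks(log_minor_ticks(lo, hi), minor=True)
    ax.set_xlim(lo, hi)  # keep the limits chosen by hist
    ax.set_xlabel("log10(xdata)")
    return fig, ax


np.random.seed(42)
xdata = np.random.normal(5e5, 2e5, int(1e4))
fig, ax = hist_with_log_minor_ticks(xdata)
```

Because the axis still shows the plotted quantity, no disclaimer is needed; call plt.show() to display the result.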

How to plot for frequency only?

Question
How can I plot the following scenario, just like shown in the attached image? This is for the purpose of visualising frequency allocation in a network
Scenario
I have a range of frequency values in a list-tuple like so, where the 1st value is the centre frequency, 2nd is total width, 3rd is guard band:
frequencies = [('195.71250000', '59.00000000', '2.50000000'), ('195.78750000', '59.00000000', '2.50000000'), ('195.86250000', '59.00000000', '2.50000000')]
and the range of these values are:
range = [('191.32500000', '196.12500000')]
Note: These are dummy values, the actual data is much larger but follows the same general structure
There are several ways to create this plot. One way is to use ax.vlines to plot the dashed lines for the frequencies and to use ax.bar for the rectangles representing the frequency ranges.
Here is an example where the frequencies are occupied at regular intervals within the range you have given (boundaries included) but with widths of randomly varying size. No guards are computed seeing as they should be automatically apparent thanks to the position of the frequencies and the widths, as far as I understand.
Also, the widths are much smaller compared to the sample data you have provided, else the bars will be very wide and will all overlap with one another, which would look very different from the image you have shared.
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
# Create sample dataset
rng = np.random.default_rng(seed=1) # random number generator
frequencies = np.arange(191.325, 196.125, step=0.3)
widths = rng.uniform(0.05, 0.25, size=frequencies.size)
# Create figure with single Axes and loop through frequencies and widths to plot
# vertical dashed lines for the frequencies and bars for the widths
fig, ax = plt.subplots(figsize=(10,3))
for freq, width in zip(frequencies, widths):
    ax.vlines(x=freq, ymin=0, ymax=10, colors='tab:blue', linestyle='--', zorder=1)
    ax.bar(x=freq, height=6, width=width, color='tab:blue', zorder=2)
# Additional formatting
ax.set_xlabel('Frequency (THz)', labelpad=15, size=12)
ax.set_xticks(frequencies[::2])
ax.yaxis.set_visible(False)
for spine in ['top', 'left', 'right']:
    ax.spines[spine].set_visible(False)
plt.show()
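To feed the list of string tuples from the question into the same kind of plot, the values first have to be converted to numbers. A minimal sketch under a loudly stated assumption: the question does not give units, so the code below assumes centre frequencies in THz and widths and guard bands in GHz. The names freq_range and GHZ_TO_THZ are introduced here for illustration:

```python
import numpy as np
import matplotlib.pyplot as plt

frequencies = [('195.71250000', '59.00000000', '2.50000000'),
               ('195.78750000', '59.00000000', '2.50000000'),
               ('195.86250000', '59.00000000', '2.50000000')]
freq_range = ('191.32500000', '196.12500000')  # renamed to avoid shadowing range()

# ASSUMPTION: centres are in THz, widths and guard bands are in GHz
GHZ_TO_THZ = 1e-3

values = np.array([[float(v) for v in row] for row in frequencies])
centres = values[:, 0]
widths = values[:, 1] * GHZ_TO_THZ
guards = values[:, 2] * GHZ_TO_THZ  # parsed but not drawn in this sketch

fig, ax = plt.subplots(figsize=(10, 3))
# dashed lines at the centre frequencies, bars for the occupied widths
ax.vlines(centres, ymin=0, ymax=10, colors='tab:blue', linestyles='--', zorder=1)
ax.bar(centres, height=6, width=widths, color='tab:blue', alpha=0.5, zorder=2)
ax.set_xlim(float(freq_range[0]), float(freq_range[1]))
ax.set_xlabel('Frequency (THz)')
ax.yaxis.set_visible(False)
```

If the widths are in a different unit, only the GHZ_TO_THZ factor needs to change. Call plt.show() to display the figure.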

Turning matplotlib grid of shaded values into a series of bar charts, one per row?

Using matplotlib, I can create figures that look like this:
Here, each row consists of a series of numbers from 0 to 0.6. The left hand axis text indicates the maximum value in each row. The bottom axis text represents the column indices.
The code for the actual grid essentially involves this line:
im = ax[r,c].imshow(info_to_use, vmin=0, vmax=0.6, cmap='gray')
where ax[r,c] is the current subplot axes at row r and column c, and info_to_use is a numpy array of shape (num_rows, num_cols) and has values between 0 and 0.6.
I am wondering if there is a way to convert the code above so that it instead displays bar charts, one per row? Something like this hand-drawn figure:
(The number of columns is not the same in my hand-drawn figure compared to the earlier one.) I know this would result in a very hard-to-read plot if it were embedded into a plot like the first one here. I would have this for a plot with fewer rows, which would make the bars easier to read.
The references that helped me make the first plot above were mostly from:
Python - Plotting colored grid based on values
custom matplotlib plot : chess board like table with colored cells
https://matplotlib.org/3.1.1/gallery/subplots_axes_and_figures/colorbar_placement.html#sphx-glr-gallery-subplots-axes-and-figures-colorbar-placement-py
https://matplotlib.org/3.1.1/gallery/images_contours_and_fields/image_annotated_heatmap.html#sphx-glr-gallery-images-contours-and-fields-image-annotated-heatmap-py
But I'm not sure how to make the jump from these to a bar chart in each row. Or at least something that could mirror it, e.g., instead of shading the full cell gray, only shade as much of it based on the percentage of the vmax?
import numpy as np
from matplotlib import pyplot as plt
a = np.random.rand(10,20)*.6
In a loop, call plt.subplot then plt.bar for each row in the 2-d array.
for i, thing in enumerate(a,1):
    plt.subplot(a.shape[0],1,i)
    plt.bar(range(a.shape[1]),thing)
plt.show()
plt.close()
Or, create all the subplots; then in a loop make a bar plot with each Axes.
fig, axes = plt.subplots(a.shape[0],1,sharex=True)
for ax, data in zip(axes, a):
    ax.bar(range(a.shape[1]), data)
plt.show()
plt.close()
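To stay closer to the original grid figure, each row can also get a fixed y range of 0 to 0.6 and its maximum value as the left-hand label. A sketch along those lines; the helper name row_bar_charts and the styling choices are illustrative assumptions:

```python
import numpy as np
from matplotlib import pyplot as plt


def row_bar_charts(values, vmax=0.6):
    """One bar chart per row of a 2-D array, with a shared y range of 0..vmax."""
    n_rows, n_cols = values.shape
    # squeeze=False keeps axes 2-D even when there is only one row
    fig, axes = plt.subplots(n_rows, 1, sharex=True, squeeze=False)
    for ax, row in zip(axes[:, 0], values):
        ax.bar(range(n_cols), row, color='gray')
        ax.set_ylim(0, vmax)
        ax.set_yticks([])
        # left-hand label: the maximum value in this row
        ax.set_ylabel(f"{row.max():.2f}", rotation=0, ha='right', va='center')
    axes[-1, 0].set_xlabel("column index")
    return fig, axes[:, 0]


a = np.random.rand(4, 20) * 0.6
fig, axes = row_bar_charts(a)
```

With the same y limits in every row, bar heights are comparable across rows in the same way the gray levels were in the imshow version.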

Properly adding a second set of ticks to python matplotlib colorbar

I have a figure with three subplots. The top two subplots share a similar data range, while the bottom one shows data with a different data range. I'd like to use only one colorbar for the whole figure by having ticks for the top two subplots to the left of the colorbar and having ticks for the bottom subplot to the right of the colorbar (see figure below).
I have been able to do this using a dirty hack, namely by displaying two colorbars on top of each other and moving the ticks of one of them to the left. As an example I've modified this matplotlib example:
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
# create three subplots
fig, axes = plt.subplots(3)
# filling subplots with figures and saving the map of the first and third figure.
# fig 1-2 have a data range of 0 - 1
map12 =axes[0].imshow(np.random.random((100, 100)), cmap=plt.cm.BuPu_r)
axes[1].imshow(np.random.random((100, 100)), cmap=plt.cm.BuPu_r)
# figure 3 has a larger data range from 0 - 5
map3 = axes[2].imshow(np.random.random((100, 100))*5, cmap=plt.cm.BuPu_r)
# Create two axes for the colorbar on the same place.
# They have to be very slightly misplaced, else a warning will appear and only the second colorbar will show.
cax12 = plt.axes([0.85, 0.1, 0.075, 0.8])
cax3 = plt.axes([0.85, 0.100000000000001, 0.075, 0.8])
# plot the two colorbars
cbar12 = plt.colorbar(map12, cax=cax12, label='ticks for top two figs')
cbar3 = plt.colorbar(map3, cax=cax3, label='ticks for bottom fig')
# move ticks and label of second plot to the left
cbar12.ax.yaxis.set_ticks_position('left')
cbar12.ax.yaxis.set_label_position('left')
## display image
plt.show()
While I'm happy with the visual result, I think there has to be a better way to do this. One problem is that if you save it as a vector graphic, you end up with overlapping shapes. Also, if you make a mistake with the colors of the lower colorbar you might not notice, because the colors are hidden, and it might give you a headache if you want to make the colorbar slightly transparent for some reason. I therefore wonder how one would do this properly or, if that is not possible, whether there is a better hack.
You can achieve the same result without drawing the second colorbar: create a twin axes with its ticks on the right, and set its y-axis limits to the data range of your third plot.
import matplotlib.pyplot as plt
import numpy as np
# Fixing random state for reproducibility
np.random.seed(19680801)
# create three subplots
fig, axes = plt.subplots(3)
# filling subplots with figures and saving the map of the first and third figure.
# fig 1-2 have a data range of 0 - 1
map12 =axes[0].imshow(np.random.random((100, 100)), cmap=plt.cm.BuPu_r)
axes[1].imshow(np.random.random((100, 100)), cmap=plt.cm.BuPu_r)
# figure 3 has a larger data range from 0 - 5
map3 = axes[2].imshow(np.random.random((100, 100))*5, cmap=plt.cm.BuPu_r)
# Create two axes for the colorbar on the same place.
cax12 = plt.axes([0.85, 0.1, 0.075, 0.8])
cax3 = cax12.twinx()
# plot first colorbar
cbar12 = plt.colorbar(map12, cax=cax12, label='ticks for top two figs')
# move ticks and label of colorbar to the left
cbar12.ax.yaxis.set_ticks_position('left')
cbar12.ax.yaxis.set_label_position('left')
# adjust limits of right axis to match data range of 3rd plot
cax3.set_ylim(0,5)
cax3.set_ylabel('ticks for bottom fig')
## display image
plt.show()
For some reason, the above answer did not work for me. I do not know why. What worked for me is as follows:
cax2 = fig.add_axes([<xposition>, <yposition>, <xlength>, <ylegth>])
cax21 = cax2.twinx()
cax2.set_ylabel('right-label',size=<right_lable_size>)
cax2.tick_params(labelsize=<right_tick_size>)
'''
These did not work for me
cbar21.ax.yaxis.set_ticks_position('left')
cbar21.ax.yaxis.set_label_position('left')
'''
# This worked.
cax21.yaxis.tick_left()
cax21.yaxis.label_position='left'
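# set_ylim only takes the lower and upper limit; tick spacing can be set separately, e.g. with cax21.set_yticks(...)
cax21.set_ylim(<minVal>,<maxVal>)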
cax21.set_ylabel("left-label",size=<left_lable_size>)
cax21.tick_params(labelsize=<left_tick_size>)
Hopefully, this helps.

Displaying 3 histograms on 1 axis in a legible way - matplotlib

I have produced 3 sets of data which are organised in numpy arrays. I'm interested in plotting the probability distribution of these three sets of data as normed histograms. All three distributions should look almost identical so it seems sensible to plot all three on the same axis for ease of comparison.
By default matplotlib histograms are plotted as bars which makes the image I want look very messy. Hence, my question is whether it is possible to force pyplot.hist to only draw a box/circle/triangle where the top of the bar would be in the default form so I can cleanly display all three distributions on the same graph or whether I have to calculate the histogram data and then plot it separately as a scatter graph.
Thanks in advance.
There are two ways to plot three histograms simultaneously, but neither is what you've asked for. To do what you ask, you must calculate the histogram, e.g. by using numpy.histogram, then plot it using the plot method. Use scatter only if you want to associate other information with your points by setting a size for each point.
The first alternative approach to using hist involves passing all three data sets at once to the hist method. The hist method then adjusts the widths and placements of each bar so that all three sets are clearly presented.
The second alternative is to use the histtype='step' option, which makes clear plots for each set.
Here is a script demonstrating this:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(101)
a = np.random.normal(size=1000)
b = np.random.normal(size=1000)
c = np.random.normal(size=1000)
common_params = dict(bins=20,
                     range=(-5, 5),
                     density=True)  # 'density' replaces the 'normed' argument, which newer Matplotlib releases no longer accept
plt.subplots_adjust(hspace=.4)
plt.subplot(311)
plt.title('Default')
plt.hist(a, **common_params)
plt.hist(b, **common_params)
plt.hist(c, **common_params)
plt.subplot(312)
plt.title('Skinny shift - 3 at a time')
plt.hist((a, b, c), **common_params)
plt.subplot(313)
common_params['histtype'] = 'step'
plt.title('With steps')
plt.hist(a, **common_params)
plt.hist(b, **common_params)
plt.hist(c, **common_params)
plt.savefig('3hist.png')
plt.show()
And here is the resulting plot:
Keep in mind you could do all this with the object oriented interface as well, e.g. make individual subplots, etc.
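The approach named at the start of this answer (compute the histogram with numpy.histogram, then draw markers with plot) is not shown in the script above. A minimal sketch of it, assuming the same test data and binning; the marker choices are arbitrary:

```python
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(101)
datasets = {
    'a': np.random.normal(size=1000),
    'b': np.random.normal(size=1000),
    'c': np.random.normal(size=1000),
}
markers = {'a': 'o', 'b': 's', 'c': '^'}  # circle, box, triangle

fig, ax = plt.subplots()
for name, values in datasets.items():
    # normalised histogram: bar heights integrate to 1 over the range
    density, edges = np.histogram(values, bins=20, range=(-5, 5), density=True)
    centres = 0.5 * (edges[:-1] + edges[1:])
    # one marker per bin, placed where the top of the bar would be
    ax.plot(centres, density, linestyle='none', marker=markers[name], label=name)
ax.legend()
```

Each data set contributes only one marker per bin, so all three distributions stay readable on a single axis. Call plt.show() to display the figure.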
