Seaborn heatmap with varying cell sizes - python

I have a heatmap whose tick values are not equally spaced.
For example, in the attached image, the deltas range from 0.015 to 0.13. The current scale doesn't show the real scenario, since all cell sizes are equal.
Is there a way to place the ticks at their realistic positions, such that the cell sizes would also change accordingly?
Alternatively, is there another method to generate this figure such that it would provide a realistic representation of the tick values?

As mentioned in the comments, a Seaborn heatmap uses categorical labels. However, the underlying structure is a pcolormesh, which can have a different size for each cell.
Also as mentioned in the comments, updating the private attributes of the pcolormesh isn't recommended. Instead, the heatmap can be created directly by calling pcolormesh.
Note that if there are N cells, there will be N+1 boundaries. The example code below supposes the x-values are the centers of the cells. It places the boundaries midway between successive centers; the first and the last boundary spacing is repeated to obtain the two outer boundaries.
The ticks and tick labels for the x and y axes can be set from the given x-values.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
sns.set()
N = 10
xs = np.random.uniform(0.015, 0.13, N).cumsum().round(3) # some random x values
values = np.random.rand(N, N) # a random matrix
# set bounds in the middle of successive cells, add extra bounds at start and end
bounds = (xs[:-1] + xs[1:]) / 2
bounds = np.concatenate([[2 * bounds[0] - bounds[1]], bounds, [2 * bounds[-1] - bounds[-2]]])
fig, ax = plt.subplots()
ax.pcolormesh(bounds, bounds, values)
ax.set_xticks(xs)
ax.set_xticklabels(xs, rotation=90)
ax.set_yticks(xs)
ax.set_yticklabels(xs, rotation=0)
plt.tight_layout()
plt.show()
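The boundary step can also be isolated in a small helper. The following is a minimal sketch, not part of the original answer: the name centers_to_edges is hypothetical, and it uses a slight variant of the extrapolation above in which each outer edge lies half a gap beyond the outermost center, so that the first and last centers sit exactly mid-cell.

```python
import numpy as np

def centers_to_edges(centers):
    """Return N+1 cell edges for N increasing cell centers (hypothetical helper)."""
    centers = np.asarray(centers, dtype=float)
    if centers.size < 2:
        raise ValueError("need at least two centers")
    # inner edges: midpoints between successive centers (N-1 values)
    inner = (centers[:-1] + centers[1:]) / 2
    # outer edges: half of the first/last gap beyond the outermost centers
    first = centers[0] - (centers[1] - centers[0]) / 2
    last = centers[-1] + (centers[-1] - centers[-2]) / 2
    return np.concatenate([[first], inner, [last]])

edges = centers_to_edges([1.0, 2.0, 4.0])  # edges at 0.5, 1.5, 3.0 and 5.0
```

The result can be passed to ax.pcolormesh(edges, edges, values) in place of bounds.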
PS: In case the ticks are meant to be the boundaries, the code can be simplified. One extra boundary is needed, for example a zero at the start.
bounds = np.concatenate([[0], xs])
ax.tick_params(bottom=True, left=True)
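Put together, the boundary variant might look like the sketch below. It assumes the tick values are the upper edges of the cells and that the missing first edge is 0; the function name heatmap_with_edge_ticks and the first_edge parameter are hypothetical, introduced only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

def heatmap_with_edge_ticks(xs, values, first_edge=0.0):
    """Draw values (N x N) on cells whose edges are first_edge followed by xs."""
    edges = np.concatenate([[first_edge], xs])  # N+1 edges for N cells
    fig, ax = plt.subplots()
    mesh = ax.pcolormesh(edges, edges, values)
    ax.set_xticks(edges)
    ax.set_yticks(edges)
    ax.tick_params(axis="x", labelrotation=90)
    fig.colorbar(mesh, ax=ax)
    return fig, ax, edges

rng = np.random.default_rng(0)
xs = rng.uniform(0.015, 0.13, 10).cumsum().round(3)  # made-up tick values
values = rng.random((10, 10))                        # made-up matrix
fig, ax, edges = heatmap_with_edge_ticks(xs, values)
```

Call plt.show() afterwards to display the figure.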

Related

Set log xticks in matplotlib for a linear plot

Consider
xdata=np.random.normal(5e5,2e5,int(1e4))
plt.hist(np.log10(xdata), bins=100)
plt.show()
plt.semilogy(xdata)
plt.show()
Is there any way to display the xticks of the first plot (plt.hist) like the second plot's yticks? For good reasons I want to histogram np.log10(xdata) rather than xdata, but I'd like the minor ticks to be displayed as usual on a log scale (even though the exponent itself is linear...).
In other words, I want the x-axis of the first plot to look like the y-axis of the 2nd plot, without changing the spacing between major ticks (e.g., adding log marks between 5.5 and 6.0, without altering these values).
Proper histogram plot with logarithmic x-axis:
Explanation:
- Cut off negative values
  - The randomly generated example data likely still contains some negative values; activate the commented code lines at the beginning to see the effect.
  - The logarithm isn't defined for values <= 0.
  - While the 2nd plot only deals with y-axis log scaling (negative values are just out of range), the 1st plot doesn't work with negative values in the range of the bins.
  - Real-world data probably won't be <= 0; otherwise keep that in mind.
- Bins should be aligned to the log scale as well
  - Otherwise the distribution of the bin widths looks off; swap the commented plt.hist( statement in the 1st plot section to see the effect.
- xdata (not np.log10(xdata)) is plotted in the histogram
  - Plotting np.log10(xdata) as a workaround was probably the root cause of the misunderstanding in the comments.
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}") # note the negative values
# cut off potential negative values (log function isn't defined for <= 0 )
xdata = np.ma.masked_less_equal(xdata, 0)
MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}")
# align the bins to fit a log scale
bins = 100
bins_log_aligned = np.logspace(np.log10(MIN_xdata), np.log10(MAX_xdata), bins + 1) # bins + 1 edges give `bins` bins
# 1st plot
plt.hist(xdata, bins = bins_log_aligned) # note: xdata (not np.log10(xdata) )
# plt.hist(xdata, bins = 100)
plt.xscale('log')
plt.show()
# 2nd plot
plt.semilogy(xdata)
plt.show()
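To see why log-aligned bins look evenly wide on a log axis, it can help to inspect the edges directly. This is only a sketch with made-up data: it drops the non-positive samples with a plain boolean filter (instead of the masked array used above) and checks that successive edges differ by a constant factor.

```python
import numpy as np

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, 10_000)
positive = xdata[xdata > 0]  # the logarithm is undefined for values <= 0

n_bins = 100
# n_bins + 1 edges, equally spaced in log10, delimit n_bins bins
edges = np.logspace(np.log10(positive.min()), np.log10(positive.max()), n_bins + 1)

ratios = edges[1:] / edges[:-1]                 # constant factor between successive edges
counts, _ = np.histogram(positive, bins=edges)  # the counts plt.hist would draw
```

Because every ratio is the same, every bin spans the same distance once the axis is logarithmic.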
Earlier answer, kept for now for clarification purposes. It will be deleted when the question is revised.
Disclaimer:
As Lucas M. Uriarte already mentioned, this isn't an expected way of changing axis ticks.
The x axis ticks and labels don't represent the plotted data.
You should always provide at least that information along with such a plot.
The plot
From seeing the result I kinda understand where that special plot idea is coming from - still there should be a preferred way (e.g. conversion of the data in advance) to do such a plot instead of 'faking' the axis.
Explanation of how that special axis-transfer plot is done:
- The original x-axis is hidden.
- A twiny axis is added (its y-axis is hidden by default, so that doesn't need handling).
- The twiny x-axis is set to log and the 2nd plot's y-axis limits are transferred to it.
  - Subplots are used to transfer the 2nd plot's y-axis limits directly.
  - Use variables if you need to stick with your two separate plots.
- The twiny x-axis is moved from the top (the twiny default position) to the bottom (where the original x-axis was).
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
fig, axs = plt.subplots(2, figsize=(7,10), facecolor=(1, 1, 1)) # creates the figure and both axes
# 1st plot
axs[0].hist(np.log10(xdata), bins=100) # plot the data on the normal x axis
axs[0].axes.xaxis.set_visible(False) # hide the normal x axis
# 2nd plot
axs[1].semilogy(xdata)
# 1st plot - twin axis
axs0_y_twin = axs[0].twiny() # set a twiny axis, note twiny y axis is hidden by default
axs0_y_twin.set(xscale="log")
# transfer the limits from the 2nd plot y axis to the twin axis
axs0_y_twin.set_xlim(axs[1].get_ylim()[0],
                     axs[1].get_ylim()[1])
# move the twin x axis from top to bottom
axs0_y_twin.tick_params(axis="x", which="both", bottom=True, top=False,
                        labelbottom=True, labeltop=False)
# Disclaimer
disclaimer_text = "Disclaimer: x axis ticks and labels don't represent the plotted data"
axs[0].text(0.5,-0.09, disclaimer_text, size=12, ha="center", color="red",
            transform=axs[0].transAxes)
plt.tight_layout()
plt.subplots_adjust(hspace=0.2)
plt.show()
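A possible alternative that keeps the real axis of the np.log10(xdata) histogram is to add minor ticks at the log10 of 2, 3, ..., 9 times each power of ten. The sketch below only computes those positions; the helper name log_minor_tick_positions is hypothetical and this is not the approach taken in the answer above.

```python
import numpy as np

def log_minor_tick_positions(lo, hi):
    """Positions, in log10 units, of 2..9 times each power of ten within [lo, hi]."""
    decades = np.arange(np.floor(lo), np.ceil(hi))  # integer exponents covering the range
    offsets = np.log10(np.arange(2, 10))            # log10(2), ..., log10(9)
    positions = (decades[:, None] + offsets[None, :]).ravel()
    return positions[(positions >= lo) & (positions <= hi)]

# marks between the major ticks 5.5 and 6.0, i.e. at 4e5, 5e5, ..., 9e5
minor = log_minor_tick_positions(5.5, 6.0)
```

On the axes holding the histogram, the positions could then be applied with ax.set_xticks(minor, minor=True), which leaves the major ticks and their labels unchanged.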

How to plot for frequency only?

Question
How can I plot the following scenario, just like shown in the attached image? This is for the purpose of visualising frequency allocation in a network
Scenario
I have a range of frequency values in a list-tuple like so, where the 1st value is the centre frequency, 2nd is total width, 3rd is guard band:
frequencies = [('195.71250000', '59.00000000', '2.50000000'), ('195.78750000', '59.00000000', '2.50000000'), ('195.86250000', '59.00000000', '2.50000000')]
and the range of these values are:
range = [('191.32500000', '196.12500000')]
Note: These are dummy values, the actual data is much larger but follows the same general structure
There are several ways to create this plot. One way is to use ax.vlines to plot the dashed lines for the frequencies and to use ax.bar for the rectangles representing the frequency ranges.
Here is an example where the frequencies are occupied at regular intervals within the range you have given (boundaries included) but with widths of randomly varying size. No guards are computed seeing as they should be automatically apparent thanks to the position of the frequencies and the widths, as far as I understand.
Also, the widths are much smaller compared to the sample data you have provided, else the bars will be very wide and will all overlap with one another, which would look very different from the image you have shared.
import numpy as np # v 1.19.2
import matplotlib.pyplot as plt # v 3.3.2
# Create sample dataset
rng = np.random.default_rng(seed=1) # random number generator
frequencies = np.arange(191.325, 196.125, step=0.3)
widths = rng.uniform(0.05, 0.25, size=frequencies.size)
# Create figure with single Axes and loop through frequencies and widths to plot
# vertical dashed lines for the frequencies and bars for the widths
fig, ax = plt.subplots(figsize=(10,3))
for freq, width in zip(frequencies, widths):
    ax.vlines(x=freq, ymin=0, ymax=10, colors='tab:blue', linestyle='--', zorder=1)
    ax.bar(x=freq, height=6, width=width, color='tab:blue', zorder=2)
# Additional formatting
ax.set_xlabel('Frequency (THz)', labelpad=15, size=12)
ax.set_xticks(frequencies[::2])
ax.yaxis.set_visible(False)
for spine in ['top', 'left', 'right']:
    ax.spines[spine].set_visible(False)
plt.show()
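To feed the original list of string tuples into a plot like this, the values first have to be converted to numbers. The sketch below rests on an assumption that the question does not confirm: the centre frequency is taken to be in THz and the width and guard band in GHz, so the latter two are scaled by 1e-3. The helper name parse_allocations and the width_scale parameter are hypothetical; adjust the scale if your data uses other units.

```python
def parse_allocations(raw, width_scale=1e-3):
    """Convert (centre, width, guard) string tuples into numeric dicts.

    ASSUMPTION: centre is in THz, width and guard are in GHz (hence width_scale=1e-3).
    """
    parsed = []
    for centre, width, guard in raw:
        c = float(centre)
        w = float(width) * width_scale
        g = float(guard) * width_scale
        parsed.append({
            "centre": c,
            "width": w,
            "guard": g,
            "left": c - w / 2,   # lower edge of the occupied band
            "right": c + w / 2,  # upper edge of the occupied band
        })
    return parsed

frequencies = [('195.71250000', '59.00000000', '2.50000000'),
               ('195.78750000', '59.00000000', '2.50000000'),
               ('195.86250000', '59.00000000', '2.50000000')]
allocations = parse_allocations(frequencies)
```

Each entry can then be drawn with ax.bar(x=entry["centre"], width=entry["width"], height=6) and ax.vlines(x=entry["centre"], ymin=0, ymax=10), as in the loop above.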

Matplotlib - Contourf - How to have a non-uniform ticks spacing?

I would like to plot contourf with (lat, depth, temp) and have spacing similar to the figure below (the temperature varies more near the surface than at depth, so I want to emphasize this region).
My depth array is not uniform (i.e. depth = [5,15,...,4975,5185,...]). I want to have such non-uniform vertical spacing.
I would like to show yticks = [10,100,500,1000,1500,2000,3000,4000,5000], but the depth array does not contain those exact values.
z = np.arange(0,50) # I want uniform spacing
pos = ([0,2,5,10,15,20,30,40,48]) # I want some yticks (not all of them)
ax=plt.contourf(lat,z,temp) # temp is a variable with dimensions (lat,depth)
plt.colorbar()
plt.gca().yaxis.set_ticks(pos) # Set some yticks, not all of them
plt.yticks(z[pos],depth[pos].astype(int)) # Replace the dummy values of z-array by something meaningful
plt.gca().invert_yaxis()
plt.grid(linestyle=':')
plt.gca().set(ylabel='depth (m)',xlabel='Latitude')
Potential Temperature of the Atlantic Ocean:
Per the matplotlib docs on yticks, you can specify the labels you want to use. In your case, if you want to show the labels [10,100,500,1000,1500,2000,3000,4000,5000] you can simply pass that list as the second argument in plt.yticks(), like so
plt.yticks(z[pos], [10,100,500,1000,1500,2000,3000,4000,5000])
and it will display the yticks accordingly. The issue arises in the specification of the positions - since the depth array does not have points corresponding exactly to the desired ytick values you will need to interpolate in order to find the exact position at which to place the labels. Unless the approximate positions specified in pos are already sufficient, in which case the above suffices.
If the depth data are not uniformly spaced then you can use numpy.interp to perform the interpolation, as shown below
import matplotlib.pyplot as plt
import numpy as np
# Create some depth data that is not uniformly spaced over [0, 5500]
depth = [(np.random.random() - 0.5)*25 + ii for ii in np.linspace(0, 5500, 50)]
lat = np.linspace(-75, 75, 50)
z = np.linspace(0,50, 50)
yticks = [10,100,500,1000,1500,2000,3000,4000,5000]
# Interpolate depths to get z-positions
pos = np.interp(yticks, depth, z)
temp = np.outer(lat, z) # Arbitrarily populate temp for demonstration
ax = plt.contourf(lat,z,temp)
plt.colorbar()
plt.gca().yaxis.set_ticks(pos)
plt.yticks(pos,yticks) # Place yticks at interpolated z-positions
plt.gca().invert_yaxis()
plt.grid(linestyle=':')
plt.gca().set(ylabel='Depth (m)',xlabel='Latitude')
plt.show()
This will find the exact positions where the yticks would fall if the depth array had data at those positions and place them accordingly as shown below.
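The interpolation step is easier to follow on a tiny example. The sketch below uses made-up depth values; note that np.interp expects the second argument (here depth) to be increasing.

```python
import numpy as np

depth = np.array([5.0, 15.0, 25.0, 50.0, 100.0])  # non-uniform depths (made-up values)
z = np.arange(depth.size)                          # uniform plotting coordinate 0, 1, 2, ...

wanted = [10, 75]                   # tick labels that are not present in depth
pos = np.interp(wanted, depth, z)   # fractional z-positions for those depths
# 10 lies halfway between 5 and 15, so it lands at z = 0.5;
# 75 lies halfway between 50 and 100, so it lands at z = 3.5
```

These positions are what plt.yticks(pos, wanted) would use to place the labels.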

Can't get rid of leading zeros on y axis

I am trying to plot graphs in Matplotlib and embed them into pyqt5 GUI. Everything is working fine, except for the fact that my y axis has loads of leading zeros which I cannot seem to get rid of.
I have tried googling how to format the axis, but nothing seems to work! I can't set the ticks directly because there's no way of determining what they will be, as I am going to be working with varying sized data sets.
num_bins = 50
# create an axis
ax = self.figure.add_subplot(111)
# discards the old graph
ax.clear()
##draws the bars and legend
colours = ['blue','red']
ax.hist(self.histoSets, num_bins, density=True, histtype='bar', color=colours, label=colours)
ax.legend(prop={'size': 10})
##set x ticks
min,max = self.getMinMax()
scaleMax = math.ceil((max/10000))*10000
scaleMin = math.floor((min/10000))*10000
scaleRange = scaleMax - scaleMin
ax.xaxis.set_ticks(np.arange(scaleMin, scaleMax+1, scaleRange/4))
# refresh canvas
self.draw()
All those numbers on your y-axis are tiny, i.e. on the order of 1e-5. This is because the integral of the density is defined to be 1 and your x-axis spans such a large range.
I can mostly reproduce your plot with:
import matplotlib.pyplot as plt
import numpy as np
y = np.random.normal([190000, 220000], 20000, (5000, 2))
a, b, c = plt.hist(y, 40, density=True)
giving me:
The tuple returned from hist contains useful information: the first element (a above) holds the densities, and the second element (b above) holds the bin edges that it picked. You can see that this all sums to one by doing:
sum(a[0] * np.diff(b))
and getting 1 back (up to floating-point rounding).
As ImportanceOfBeingErnest says, you can use tight_layout() to resize the plot if it doesn't fit into the area.
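The size of the density values can be checked without plotting. The sketch below uses np.histogram on made-up data with a spread similar to the example above; it confirms that the area under the bars is 1 and that the bar heights are therefore on the order of 1e-5.

```python
import numpy as np

rng = np.random.default_rng(0)
samples = rng.normal(200_000, 20_000, 5_000)  # made-up data with a wide x range

density, edges = np.histogram(samples, bins=40, density=True)
area = np.sum(density * np.diff(edges))  # bar heights times bar widths
peak = density.max()                     # roughly 1 / (20000 * sqrt(2 * pi)), about 2e-5
```

If the long tick labels are the only problem, one option is to ask matplotlib for scientific notation on that axis, for example with ax.ticklabel_format(axis="y", style="sci", scilimits=(0, 0)), so that the common factor is shown once above the axis.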

matplotlib: manually change yaxis values to differ from the actual value (NOT: change ticks!) [duplicate]

I am trying to plot data and a function with matplotlib 2.0 under python 2.7.
The x values of the function evolve with time: x first decreases to a certain value, then increases again.
If the function is plotted against time, it looks like this (image: plot of data against time).
I need the same x axis evolution when plotting against the real x values. Unfortunately, as the x values are the same for both parts before and after the turning point, both parts are drawn on top of each other. This gives me the wrong data plot:
In this example it means I need the x-axis to start at 2.4, decrease to 1.0, then increase again to 2.4. I swear I found before that this is possible, but unfortunately I can't find a trace of it again.
A matplotlib axis is by default linearly increasing. More importantly, there must be an injective mapping of the number line to the axis units. So changing the data range is not really an option (at least when the aim is to keep things simple).
It would hence be good to keep the original numbers and only change the ticks and ticklabels on the axis. E.g. you could use a FuncFormatter to map the original numbers to
np.abs(x-tp)+tp
where tp would be the turning point.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
x = np.linspace(-10,20,151)
y = np.exp(-(x-5)**2/19.)
plt.plot(x,y)
tp = 5
fmt = lambda x,pos:"{:g}".format(np.abs(x-tp)+tp)
plt.gca().xaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(fmt))
plt.show()
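For data like that in the question, where x itself decreases and then increases, the plotting coordinate first has to be made monotonic before such a formatter can relabel it. The sketch below is one way to do that, under the assumption that x has a single turning point at its minimum; the helper names unfold and fold_label are hypothetical.

```python
import numpy as np

def unfold(x):
    """Mirror the decreasing part of x about its minimum so the result increases."""
    x = np.asarray(x, dtype=float)
    turn = int(np.argmin(x))       # index of the turning point (assumed unique)
    tp = x[turn]
    u = x.copy()
    u[:turn] = 2 * tp - x[:turn]   # reflect everything before the turning point
    return u, tp

def fold_label(u, tp):
    """Map an unfolded coordinate back to the original x value for tick labels."""
    return abs(u - tp) + tp

x = [2.4, 1.7, 1.0, 1.7, 2.4]      # decreases to 1.0, then increases again
u, tp = unfold(x)
labels = [fold_label(v, tp) for v in u]
```

Plotting y against u and using fold_label inside a FuncFormatter (with tp fixed) would then give an axis that reads 2.4 down to 1.0 and back up to 2.4.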
Another option would be to use two axes and plot your two timespans separately, one on each axes.
For instance, if you have the following data:
myX = np.linspace(1,2.4,100)
myY1 = -1*myX
myY2 = -0.5*myX-0.5
plt.plot(myX,myY1, c='b')
plt.plot(myX,myY2, c='g')
You can instead create two subplots with a shared y-axis and no space between the two axes, plot each time span independently, and finally adjust the limits of one of the x-axes to reverse the order of its points:
fig, (ax1,ax2) = plt.subplots(1,2, gridspec_kw={'wspace':0}, sharey=True)
ax1.plot(myX,myY1, c='b')
ax2.plot(myX,myY2, c='g')
ax1.set_xlim((2.4,1))
ax2.set_xlim((1,2.4))
