I am trying to plot graphs with Matplotlib and embed them in a PyQt5 GUI. Everything works fine, except that the y-axis labels have loads of leading zeros which I cannot seem to get rid of.
I have tried searching for how to format the axis, but nothing seems to work! I can't set the ticks directly because I cannot know in advance what they will be, as I am going to be working with data sets of varying size.
num_bins = 50
# create an axis
ax = self.figure.add_subplot(111)
# discards the old graph
ax.clear()
##draws the bars and legend
colours = ['blue','red']
ax.hist(self.histoSets, num_bins, density=True, histtype='bar', color=colours, label=colours)
ax.legend(prop={'size': 10})
##set x ticks
min,max = self.getMinMax()
scaleMax = math.ceil((max/10000))*10000
scaleMin = math.floor((min/10000))*10000
scaleRange = scaleMax - scaleMin
ax.xaxis.set_ticks(np.arange(scaleMin, scaleMax+1, scaleRange/4))
# refresh canvas
self.draw()
All those numbers on your y-axis are tiny, i.e. on the order of 1e-5. This is because the integral of the density is defined to be 1 and your x-axis spans such a large range.
I can mostly reproduce your plot with:
import matplotlib.pyplot as plt
import numpy as np
y = np.random.normal([190000, 220000], 20000, (5000, 2))
a, b, c = plt.hist(y, 40, density=True)
giving me:
The tuple returned from hist contains useful information: the first element (a above) holds the densities, and the second element (b above) holds the bin edges that it picked. You can see that this all sums to one by doing:
sum(a[0] * np.diff(b))
and getting 1 back (up to floating-point rounding).
As ImportanceOfBeingErnest says, you can use tight_layout() to resize the plot if it doesn't fit into the area.
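If the main goal is to get rid of the long runs of zeros in the tick labels, one option is to let Matplotlib factor out a common power of ten. The following is only a minimal sketch: it assumes a standalone figure rather than the PyQt5 canvas from the question, and the data are made up to resemble the plot.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit inside a GUI
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal([190000, 220000], 20000, (5000, 2))  # made-up data, two data sets

fig, ax = plt.subplots()
densities, edges, _ = ax.hist(y, 40, density=True)
densities = np.asarray(densities)  # one row of bar heights per data set

# Print the y tick labels as multiples of a common power of ten, shown once
# at the top of the axis, instead of 0.00001, 0.00002, ...
ax.ticklabel_format(axis="y", style="sci", scilimits=(0, 0))
fig.tight_layout()

# Each data set integrates to 1, which is why the bar heights are so small.
areas = (densities * np.diff(edges)).sum(axis=1)
```

With an embedded canvas the same two calls would presumably go right before the canvas is redrawn.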
Related
Consider
xdata=np.random.normal(5e5,2e5,int(1e4))
plt.hist(np.log10(xdata), bins=100)
plt.show()
plt.semilogy(xdata)
plt.show()
Is there any way to display the xticks of the first plot (plt.hist) like the second plot's yticks? For good reasons I want to histogram np.log10(xdata), but I'd like the minor ticks to be displayed as usual on a log scale (even though the exponent is linear...)
In other words, I want the x-axis of the first plot to look like the y-axis of the second plot, without changing the spacing between major ticks (e.g., adding log marks between 5.5 and 6.0, without altering these values).
Proper histogram plot with logarithmic x-axis:
Explanation:
- Cut off negative values
  - the randomly generated example data likely still contains some negative values
    - activate the commented code lines at the beginning to see the effect
  - the logarithm isn't defined for values <= 0
  - while the 2nd plot only log-scales the y-axis (negative values are just out of range), the 1st plot doesn't work with negative values in the bin range
  - real-world data probably won't be <= 0; otherwise keep that in mind
- Bins should be aligned to the log scale as well
  - otherwise the distribution of bin widths looks off
    - switch the # on the plt.hist( statements in the 1st plot section to see the effect
- xdata (not np.log10(xdata)) is plotted in the histogram
  - that 'workaround' of plotting np.log10(xdata) was probably the root cause of the misunderstanding in the comments
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}") # note the negative values
# drop non-positive values (the log isn't defined for <= 0, and plt.hist does not support masked arrays)
xdata = xdata[xdata > 0]
MIN_xdata, MAX_xdata = np.min(xdata), np.max(xdata)
# print(f"{MIN_xdata}, {MAX_xdata}")
# align the bin edges to a log scale (bins + 1 edges give exactly `bins` bins)
bins = 100
bins_log_aligned = np.logspace(np.log10(MIN_xdata), np.log10(MAX_xdata), bins + 1)
# 1st plot
plt.hist(xdata, bins = bins_log_aligned) # note: xdata (not np.log10(xdata) )
# plt.hist(xdata, bins = 100)
plt.xscale('log')
plt.show()
# 2nd plot
plt.semilogy(xdata)
plt.show()
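To see why log-aligned bins matter, the widths can be compared in both scales. This is a small sketch with the same kind of random data; the variable names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, 10_000)
xdata = xdata[xdata > 0]  # the log is undefined for values <= 0

n_bins = 100
# n_bins + 1 edges delimit exactly n_bins bins
edges = np.logspace(np.log10(xdata.min()), np.log10(xdata.max()), n_bins + 1)
counts, _ = np.histogram(xdata, bins=edges)

# On a log axis every bin has the same visual width ...
log_widths = np.diff(np.log10(edges))
# ... although the widths in data units grow from left to right.
linear_widths = np.diff(edges)
```

Equal-width linear bins would do the opposite: on a log axis they look wide on the left and very thin on the right.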
Just kept for now for clarification purpose. Will be deleted when the question is revised.
Disclaimer:
As Lucas M. Uriarte already mentioned, this isn't an expected way of changing axis ticks.
The x-axis ticks and labels don't represent the plotted data.
You should at least always provide that information along with such a plot.
The plot
From seeing the result I kinda understand where that special plot idea is coming from - still there should be a preferred way (e.g. conversion of the data in advance) to do such a plot instead of 'faking' the axis.
Explanation how that special axis transfer plot is done:
original x-axis is hidden
a twiny axis is added
note that its y-axis is hidden by default, so that doesn't need handling
twiny x-axis is set to log and the 2nd plot y-axis limits are transferred
subplots used to directly transfer the 2nd plot y-axis limits
use variables if you need to stick with your two plots
twiny x-axis is moved from top (twiny default position) to bottom (where the original x-axis was)
Code:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42) # just to have repeatable results for the answer
xdata=np.random.normal(5e5,2e5,int(1e4))
# plt.subplots creates its own figure, so no separate plt.figure() call is needed
fig, axs = plt.subplots(2, figsize=(7,10), facecolor=(1, 1, 1))
# 1st plot
axs[0].hist(np.log10(xdata), bins=100) # plot the data on the normal x axis
axs[0].axes.xaxis.set_visible(False) # hide the normal x axis
# 2nd plot
axs[1].semilogy(xdata)
# 1st plot - twin axis
axs0_y_twin = axs[0].twiny() # set a twiny axis, note twiny y axis is hidden by default
axs0_y_twin.set(xscale="log")
# transfer the limits from the 2nd plot y axis to the twin axis
axs0_y_twin.set_xlim(axs[1].get_ylim()[0],
axs[1].get_ylim()[1])
# move the twin x axis from top to bottom
axs0_y_twin.tick_params(axis="x", which="both", bottom=True, top=False,
labelbottom=True, labeltop=False)
# Disclaimer
disclaimer_text = "Disclaimer: x axis ticks and labels don't represent the plotted data"
axs[0].text(0.5,-0.09, disclaimer_text, size=12, ha="center", color="red",
transform=axs[0].transAxes)
plt.tight_layout()
plt.subplots_adjust(hspace=0.2)
plt.show()
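A possible alternative that avoids 'faking' the axis is to keep the real axis of the np.log10(xdata) histogram and only add minor ticks at the positions where a log scale would put them. The following is a sketch under that assumption, not the method of the answer above; non-positive samples are dropped first because their logarithm is undefined.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FixedLocator

rng = np.random.default_rng(42)
xdata = rng.normal(5e5, 2e5, 10_000)
logx = np.log10(xdata[xdata > 0])

fig, ax = plt.subplots()
ax.hist(logx, bins=100)  # major ticks stay at plain exponents, e.g. 5.5, 6.0

# Minor ticks at log10(2 * 10**d), ..., log10(9 * 10**d) for every decade d
decades = np.arange(np.floor(logx.min()), np.ceil(logx.max()))
minor = np.concatenate([d + np.log10(np.arange(2, 10)) for d in decades])
ax.xaxis.set_minor_locator(FixedLocator(minor))
```

The tick positions and labels still describe the plotted data, so no disclaimer is needed on the plot.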
I have a heatmap with ticks which have non equal deltas between themselves:
For example, in the attached image, the deltas are between 0.015 and 0.13. The current scale doesn't show the real scenario, since all cell sizes are equal.
Is there a way to place the ticks in their realistic positions, such that cell sizes would also change accordingly?
Alternatively, is there another method to generate this figure such that it would provide a realistic representation of the tick values?
As mentioned in the comments, a Seaborn heatmap uses categorical labels. However, the underlying structure is a pcolormesh, which can have different sizes for each cell.
Also mentioned in the comments: updating the private attributes of the pcolormesh isn't recommended. Moreover, the heatmap can be created directly by calling pcolormesh.
Note that if there are N cells, there will be N+1 boundaries. The example code below supposes you have x-positions for the centers of the cells. It then calculates boundaries in the middle between successive cells. The first and last boundaries are extrapolated by repeating the distance between their two neighbouring boundaries.
The ticks and tick labels for the x and y axes can be set from the given x-values.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
sns.set()
N = 10
xs = np.random.uniform(0.015, 0.13, N).cumsum().round(3) # some random x values
values = np.random.rand(N, N) # a random matrix
# set bounds in the middle of successive cells, add extra bounds at start and end
bounds = (xs[:-1] + xs[1:]) / 2
bounds = np.concatenate([[2 * bounds[0] - bounds[1]], bounds, [2 * bounds[-1] - bounds[-2]]])
fig, ax = plt.subplots()
ax.pcolormesh(bounds, bounds, values)
ax.set_xticks(xs)
ax.set_xticklabels(xs, rotation=90)
ax.set_yticks(xs)
ax.set_yticklabels(xs, rotation=0)
plt.tight_layout()
plt.show()
PS: In case the ticks are meant to be the boundaries, the code can be simplified. One extra boundary is needed, for example a zero at the start.
bounds = np.concatenate([[0], xs])
ax.tick_params(bottom=True, left=True)
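The boundary calculation for the centered case can be isolated in a small function. This is a sketch only; centers_to_edges is a hypothetical helper name, and it needs at least three centers.

```python
import numpy as np

def centers_to_edges(centers):
    """Return N+1 cell boundaries for N cell centers (hypothetical helper)."""
    centers = np.asarray(centers, dtype=float)
    mids = (centers[:-1] + centers[1:]) / 2   # N-1 inner boundaries
    first = 2 * mids[0] - mids[1]             # extrapolated boundary before the first cell
    last = 2 * mids[-1] - mids[-2]            # extrapolated boundary after the last cell
    return np.concatenate([[first], mids, [last]])

xs = np.array([1.0, 2.0, 4.0, 5.0])           # made-up cell centers
edges = centers_to_edges(xs)
print(len(xs), len(edges))  # 4 5
```

The resulting array can be passed as both the x and y boundaries to pcolormesh, as in the example above.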
I would like to plot contourf with (lat,depth,temp) and then have similar spacing as in the figure below (the temperature varies more near the surface than at depth, so I want to emphasize this region).
My depth array is not uniform (i.e. depth = [5,15,...,4975,5185,...]). I want to have such non-uniform vertical spacing.
I would like to show yticks = [10,100,500,1000,1500,2000,3000,4000,5000], but the depth array does not have those exact values.
z = np.arange(0,50) # I want uniform spacing
pos = ([0,2,5,10,15,20,30,40,48]) # I want some yticks (not all of them)
ax=plt.contourf(lat,z,temp) # temp is a variable with dimensions (lat,depth)
plt.colorbar()
plt.gca().yaxis.set_ticks(pos) # Set some yticks, not all of them
plt.yticks(z[pos],depth[pos].astype(int)) # Replace the dummy values of z-array by something meaningful
plt.gca().invert_yaxis()
plt.grid(linestyle=':')
plt.gca().set(ylabel='depth (m)',xlabel='Latitude')
Potential Temperature of the Atlantic Ocean:
Per the matplotlib docs on yticks, you can specify the labels you want to use. In your case, if you want to show the labels [10,100,500,1000,1500,2000,3000,4000,5000] you can simply pass that list as the second argument in plt.yticks(), like so
plt.yticks(z[pos], [10,100,500,1000,1500,2000,3000,4000,5000])
and it will display the yticks accordingly. The issue arises in the specification of the positions - since the depth array does not have points corresponding exactly to the desired ytick values you will need to interpolate in order to find the exact position at which to place the labels. Unless the approximate positions specified in pos are already sufficient, in which case the above suffices.
If the depth data are not uniformly spaced then you can use numpy.interp to perform the interpolation, as shown below
import matplotlib.pyplot as plt
import numpy as np
# Create some depth data that is not uniformly spaced over [0, 5500]
depth = [(np.random.random() - 0.5)*25 + ii for ii in np.linspace(0, 5500, 50)]
lat = np.linspace(-75, 75, 50)
z = np.linspace(0,50, 50)
yticks = [10,100,500,1000,1500,2000,3000,4000,5000]
# Interpolate depths to get z-positions
pos = np.interp(yticks, depth, z)
temp = np.outer(lat, z) # Arbitrarily populate temp for demonstration
ax = plt.contourf(lat,z,temp)
plt.colorbar()
plt.gca().yaxis.set_ticks(pos)
plt.yticks(pos,yticks) # Place yticks at interpolated z-positions
plt.gca().invert_yaxis()
plt.grid(linestyle=':')
plt.gca().set(ylabel='Depth (m)',xlabel='Latitude')
plt.show()
This will find the exact positions where the yticks would fall if the depth array had data at those positions and place them accordingly as shown below.
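The interpolation step is easy to check by hand on a tiny example. The depths below are made up for illustration; note that np.interp expects its second argument to be increasing and clamps requested values that fall outside that range.

```python
import numpy as np

depth = np.array([5.0, 15.0, 25.0, 100.0])  # made-up, non-uniform depths (m)
z = np.arange(len(depth))                    # uniform plotting coordinate 0, 1, 2, 3

wanted = [10, 20, 62.5]                      # depths that should get a tick label
pos = np.interp(wanted, depth, z)
print(pos)  # [0.5 1.5 2.5]
```

A depth of 10 m lies halfway between the samples at 5 m and 15 m, so its tick goes halfway between z = 0 and z = 1, and likewise for the others.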
I am generating plots like this one:
When using fewer ticks, the plot fits nicely and the bars are wide enough to see them correctly. However, when there are lots of ticks, instead of making the plot larger, it just compresses the y-axis, resulting in thin bars and overlapping tick text.
This happens both with plt.show() and plt.savefig().
Is there any solution so it plots the figure in a scale which guarantees that bars have the specified width, not more (if too few ticks) and not less (too many, overlapping)?
EDIT:
Yes, I'm using barh, and yes, I'm setting height to a fixed value (8):
height = 8
ax.barh(yvalues-width/2, xvalues, height=height, color='blue', align='center')
ax.barh(yvalues+width/2, xvalues, height=height, color='red', align='center')
I don't quite understand your code: it seems you draw two plots with the same (only shifted) yvalues, but the image doesn't look like that. And are you sure you want to shift by width/2 if you have align=center? Anyway, on changing the image size:
I am not sure whether there is another way, but I don't see anything in the manual at a glance. To set the image size by hand:
fig = plt.figure(figsize=(5, 80))
ax = fig.add_subplot(111)
...your_code
The size is in inches. You can compute it beforehand; try for example
import numpy as np
# assumes yvalues is sorted in increasing order
fig_height = (max(yvalues) - min(yvalues)) / np.min(np.diff(yvalues))
This would (approximately) set the minimum distance between ticks to one inch, which is too much, but try to adjust it.
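One way to apply this idea is to derive the figure height from the number of bars, so that every bar gets roughly the same amount of vertical space. The following is a rough sketch; barh_figure and its default values are hypothetical, and the data are made up.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

def barh_figure(n_bars, inches_per_bar=0.3, margin_inches=1.0, width_inches=6.0):
    """Hypothetical helper: create a figure whose height grows with the bar count."""
    height = n_bars * inches_per_bar + margin_inches
    return plt.subplots(figsize=(width_inches, height))

labels = [f"item {i}" for i in range(60)]  # made-up tick labels
values = np.arange(60)                     # made-up bar lengths

fig, ax = barh_figure(len(labels))
positions = np.arange(len(labels))
ax.barh(positions, values, height=0.8, align="center")
ax.set_yticks(positions)
ax.set_yticklabels(labels)
ax.margins(y=0.01)  # keep the padding above and below the bars small
```

With 60 bars this gives a figure that is 19 inches tall. The bar thickness on screen is only approximately fixed, because the axes margins also take part of the height.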
I think of two solutions for your case:
If you are trying to plot a histogram, use hist function [1]. This will automatically bin your data. You can even plot multiple overlapping histograms as long as you set alpha value lower than 1. See this post
import matplotlib.pyplot as plt
import numpy as np
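mu, sigma = 100, 15  # example values; the original snippet left them undefined
x = mu + sigma*np.random.randn(10000)
# density=True replaces the old normed argument, which newer Matplotlib releases no longer accept
plt.hist(x, 50, density=True, facecolor='green',
         alpha=0.75, orientation='horizontal')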
You can also set the interval between your axis ticks. This will place a tick at every multiple of 10 on the axis. But I doubt this will solve your problem.
import matplotlib.ticker as ticker
...
ax.yaxis.set_major_locator(ticker.MultipleLocator(10))
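To make the effect of the locator concrete, here is a short sketch with made-up data (100 horizontal bars); it only thins out the ticks and does not change the figure size.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import numpy as np

fig, ax = plt.subplots()
ax.barh(np.arange(100), np.arange(100))  # made-up data: 100 bars
ax.yaxis.set_major_locator(ticker.MultipleLocator(10))

# Every major tick now sits at a multiple of 10 in data coordinates.
ticks = ax.yaxis.get_majorticklocs()
```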
I'm trying to plot the contour map of a given function f(x,y), but since the function's output grows really fast, I'm losing a lot of information for lower values of x and y. I found on the forums that this can be worked around using vmax=vmax. It actually worked, but only when plotting for specific limits of x and y and a specific number of colormap levels.
Say I have this plot:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
u = np.linspace(-2,2,1000)
x,y = np.meshgrid(u,u)
z = (1-x)**2+100*(y-x**2)**2
cont = plt.contour(x,y,z,500,colors='black',linewidths=.3)
cont = plt.contourf(x,y,z,500,cmap="jet",vmax=100)
plt.colorbar(cont)
plt.show()
I want to uncover what's beyond the axis limits while keeping the same scale, but if I change the x and y limits to -3 and 3 I get:
See how I lost most of my levels, since the max value of the function within these limits is much higher. A workaround for this problem is to increase the number of levels to 1000, but that takes a lot of computational time.
Is there a way to plot only the contour levels that I need? That is, between 0 and 100.
An example of a desired output would be:
With the white space being the continuation of the plot without resizing the levels.
The code I'm using is the one given after the first image.
There are a few possible ideas here. The one I very much prefer is a logarithmic representation of the data. An example would be
from matplotlib import ticker
fig = plt.figure(1)
cont1 = plt.contourf(x,y,z,cmap="jet",locator=ticker.LogLocator(numticks=10))
plt.colorbar(cont1)
plt.show()
fig = plt.figure(2)
cont2 = plt.contourf(x,y,np.log10(z),100,cmap="jet")
plt.colorbar(cont2)
plt.show()
The first example uses Matplotlib's LogLocator. The second one just directly computes the logarithm of the data and plots that normally.
The third example just caps all data above 100.
fig = plt.figure(3)
zcapped = z.copy()
zcapped[zcapped>100]=100
cont3 = plt.contourf(x,y,zcapped,100,cmap="jet")
cbar = plt.colorbar(cont3)
plt.show()
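Another possibility, closer to the "only the levels between 0 and 100" wording of the question, is to pass the level boundaries to contourf explicitly. This is a sketch rather than part of the answer above; the grid size and the number of levels are arbitrary choices.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

u = np.linspace(-3, 3, 300)
x, y = np.meshgrid(u, u)
z = (1 - x)**2 + 100 * (y - x**2)**2

# 51 boundaries give 50 filled bands between 0 and 100
levels = np.linspace(0, 100, 51)

fig, ax = plt.subplots()
cont = ax.contourf(x, y, z, levels=levels, cmap="jet")
fig.colorbar(cont)
```

With the default extend setting, regions where z exceeds the last level are left unfilled, so they show up as white space. Passing extend="max" would fill them with the over-range color instead.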