Setting axis values in numpy/matplotlib.plot

I am in the process of learning numpy. I wish to plot a graph of Planck's law for different temperatures and so have two np.arrays, T and l for temperature and wavelength respectively.
import scipy.constants as sc
import numpy as np
import matplotlib.pyplot as plt
lhinm = 10000 # Highest wavelength in nm
T = np.linspace(200, 1000, 10) # Temperature in K
l = np.linspace(0, lhinm*1E-9, 101) # Wavelength in m
labels = np.linspace(0, lhinm, 6) # Axis labels giving l in nm
B = (2*sc.h*sc.c**2/l[:, np.newaxis]**5)/(np.exp((sc.h*sc.c)/(T*l[:, np.newaxis]*sc.Boltzmann))-1)
for ticks in [True, False]:
    plt.plot(B)
    plt.xlabel("Wavelength (nm)")
    if ticks:
        plt.xticks(l, labels)
        plt.title("With xticks()")
        plt.savefig("withticks.png")
    else:
        plt.title("Without xticks()")
        plt.savefig("withoutticks.png")
    plt.show()
I would like to label the x-axis with the wavelength in nm. If I don't call plt.xticks(), the labels on the x-axis appear to be the index into the array B (which holds the calculated values).
I've seen answer 7559542, but when I call plt.xticks() all the values are scrunched up on the left of the axis, rather than being evenly spread along it.
So what's the best way to define my own set of values (in this case a subset of the values in l) and place them on the axis?

The problem is that you're not giving your wavelength values to plt.plot(), so Matplotlib puts the index into the array on the horizontal axis as a default. Quick solution:
plt.plot(l, B)
Without explicitly setting tick labels, that gives you this:
Of course, the values on the horizontal axis in this plot are actually in meters, not nanometers (despite the labeling), because the values you passed as the first argument to plot() (namely the array l) are in meters. That's where xticks() comes in. The two-argument version xticks(locations, labels) places the labels at the corresponding locations on the x axis. For example, xticks([1], ['one']) would put a label "one" at the location x=1, if that location is in the plot.
However, it doesn't change the range displayed on the axis. In your original example, your call to xticks() placed a bunch of labels at coordinates of order 1e-9, but it didn't change the axis range, which was still 0 to 100. No wonder all the labels were squished over to the left.
What you need to do is call xticks() with the points at which you want to place the labels, and the desired text of the labels. The way you were doing it, xticks(l, labels), would work except that l has length 101 and labels only has length 6, so it only uses the first 6 elements of l. To fix that, you can do something like
plt.xticks(labels * 1e-9, labels)
where the multiplication by 1e-9 converts from nanometers (what you want displayed) to meters (which are the coordinates Matplotlib actually uses in the plot).
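Putting the two steps together, a minimal sketch of the corrected script might look like the following. Several things here are assumptions of the sketch rather than part of the original code: the physical constants are written out by hand (scipy.constants provides the same values), the wavelength grid starts at 200 nm to avoid dividing by zero, and the Agg backend is selected so the figure renders off-screen.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering; drop for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Exact SI values, written out so the sketch needs only NumPy and Matplotlib
h = 6.62607015e-34   # Planck constant, J s
c = 299792458.0      # speed of light, m/s
k = 1.380649e-23     # Boltzmann constant, J/K

lhinm = 10000                               # Highest wavelength in nm
T = np.linspace(200, 1000, 10)              # Temperature in K
l = np.linspace(200e-9, lhinm * 1e-9, 99)   # Wavelength in m (grid starts above 0)
labels = np.linspace(0, lhinm, 6)           # Tick labels in nm

lcol = l[:, np.newaxis]                     # column vector so T broadcasts across it
B = (2 * h * c**2 / lcol**5) / (np.exp(h * c / (T * lcol * k)) - 1)

plt.plot(l, B)                              # x coordinates are wavelengths in m
# Tick locations in m, tick text in nm
plt.xticks(labels * 1e-9, [f"{v:.0f}" for v in labels])
plt.xlabel("Wavelength (nm)")
```

The plotted data stay in meters; only the text shown at each tick is in nanometers.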

You can supply the x values to plt.plot, and let matplotlib take care of setting the tick labels.
In your case, you could plot plt.plot(l, B), but then you still have the ticks in m, not nm.
You could therefore convert your l array to nm before plotting (or during plotting). Here's a working example:
import scipy.constants as sc
import numpy as np
import matplotlib.pyplot as plt
lhinm = 10000 # Highest wavelength in nm
T = np.linspace(200, 1000, 10) # Temperature in K
l = np.linspace(0, lhinm*1E-9, 101) # Wavelength in m
l_nm = l*1e9 # Wavelength in nm
labels = np.linspace(0, lhinm, 6) # Axis labels giving l in nm
B = (2*sc.h*sc.c**2/l[:, np.newaxis]**5)/(np.exp((sc.h*sc.c)/(T*l[:, np.newaxis]*sc.Boltzmann))-1)
plt.plot(l_nm, B)
# Alternatively:
# plt.plot(l*1e9, B)
plt.xlabel("Wavelength (nm)")
plt.title("With xticks()")
plt.savefig("withticks.png")
plt.show()

You need to pass lists of the same size to xticks(). Try setting the axis values separately from the plot values, as below.
import scipy.constants as sc
import numpy as np
import matplotlib.pyplot as plt
lhinm = 10000 # Highest wavelength in nm
T = np.linspace(200, 1000, 10) # Temperature in K
l = np.linspace(0, lhinm*1E-9, 101) # Wavelength in m
ll = np.linspace(0, len(l) - 1, 6) # Tick positions as indices into l, since plt.plot(B) plots against the index
labels = np.linspace(0, lhinm, 6) # Axis labels giving l in nm
B = (2*sc.h*sc.c**2/l[:, np.newaxis]**5)/(np.exp((sc.h*sc.c)/(T*l[:, np.newaxis]*sc.Boltzmann))-1)
for ticks in [True, False]:
    plt.plot(B)
    plt.xlabel("Wavelength (nm)")
    if ticks:
        plt.xticks(ll, labels)
        plt.title("With xticks()")
        plt.savefig("withticks.png")
    else:
        plt.title("Without xticks()")
        plt.savefig("withoutticks.png")
    plt.show()

Related

Make a 2d histogram show if a certain value is above or below average?

I made a 2d histogram of two variables(x and y) and each of them are long, 1d arrays. I then calculated the average of x in each bin and want to make the colorbar show how much each x is above or below average in the respective bin.
So far I have tried to make a new array, z, that contains the values for how far above/below average each x is. When I try to use this with pcolormesh I run into issues that it is not a 2-D array. I also tried to solve this issue by following the solution from this problem (Using pcolormesh with 3 one dimensional arrays in python). The length of each array (x, y and z) are equal in this case and there is a respective z value for each x value.
My overall goal is to just have the colorbar not dependent on counts but to have it show how much above/below average each x value is from the average x of the bin. I suspect that it may make more sense to just plot x vs. z but I do not think that would fix my colorbar issue.

As LoneWanderer mentioned some sample code would be useful; however let me make an attempt at what you want.
import numpy as np
import matplotlib.pyplot as plt
N = 10000
x = np.random.uniform(0, 1, N)
y = np.random.uniform(0, 1, N) # Generating x and y data (you will already have this)
# Histogram data
xbins = np.linspace(0, 1, 100)
ybins = np.linspace(0, 1, 100)
hdata, xedges, yedges = np.histogram2d(x, y, bins=(xbins, ybins))
# compute the histogram average value and the difference
hdataMean = np.mean(hdata)
hdataRelDifference = (hdata - hdataMean) / hdataMean
# Plot the relative difference
fig, ax = plt.subplots(1, 1)
cax = ax.imshow(hdataRelDifference)
fig.colorbar(cax, ax=ax)
If this is not what you intended, hopefully there are enough pieces here to adapt it for your needs.
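If the colour should instead encode how far the mean of x in each bin lies from the overall mean of x (one possible reading of the question), a sketch along the following lines may help. It uses the weights argument of np.histogram2d to sum x per bin. The random data, the bin count, the variable names and the diverging colormap are all assumptions of this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 10000)   # stand-in data; you will already have x and y
y = rng.uniform(0, 1, 10000)

xbins = np.linspace(0, 1, 11)
ybins = np.linspace(0, 1, 11)

# Number of points per bin, and sum of x per bin (via weights)
counts, xedges, yedges = np.histogram2d(x, y, bins=(xbins, ybins))
sums, _, _ = np.histogram2d(x, y, bins=(xbins, ybins), weights=x)

# Mean of x per bin; an empty bin would give NaN rather than a warning
with np.errstate(invalid="ignore", divide="ignore"):
    bin_mean = sums / counts
deviation = bin_mean - x.mean()

fig, ax = plt.subplots()
limit = np.nanmax(np.abs(deviation))          # symmetric colour range around zero
mesh = ax.pcolormesh(xedges, yedges, deviation.T,
                     cmap="RdBu_r", vmin=-limit, vmax=limit)
fig.colorbar(mesh, ax=ax, label="bin mean of x minus overall mean of x")
```

The transpose is needed because np.histogram2d puts x along the first axis, while pcolormesh expects rows to run along y.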

Plot 2D histogram data with pcolormesh

I need to plot a binned statistic, as one would get from scipy.stats.binned_statistic_2d. Basically, that means I have edge values and within-bin data. This also means I cannot (to my knowledge) use plt.hist2d. Here's a code snippet to generate the sort of data I might need to plot:
import numpy as np
x_edges = np.arange(6)
y_edges = np.arange(6)
bin_values = np.random.randn(5, 5)
One would imagine that I could use pcolormesh for this, but the issue is that pcolormesh does not allow for bin edge values. The following will only plot the values in bins 1 through 4. The 5th value is excluded, since while pcolormesh "knows" that the value at 4.0 is some value, there is no later value to plot, so the width of the 5th bin is zero.
import matplotlib.pyplot as plt
X, Y = np.broadcast_arrays(x_edges[:5, None], y_edges[None, :5])
plt.figure()
plt.pcolormesh(X, Y, bin_values)
plt.show()
I can get around this with an ugly hack by adding an additional set of values equal to the last values:
import matplotlib.pyplot as plt
X, Y = np.broadcast_arrays(x_edges[:, None], y_edges[None, :])
dummy_bin_values = np.zeros([6, 6])
dummy_bin_values[:5, :5] = bin_values
dummy_bin_values[5, :] = dummy_bin_values[4, :]
dummy_bin_values[:, 5] = dummy_bin_values[:, 4]
plt.figure()
plt.pcolormesh(X, Y, dummy_bin_values)
plt.show()
However, this is an ugly hack. Is there any cleaner way to plot 2D histogram data with bin edge values? "No" is possibly the correct answer, but convince me that's the case if it is.

I do not understand the problem with either of the two options. So here is simply some code that uses both: numpy histogrammed data with pcolormesh, as well as plain plt.hist2d.
import numpy as np
import matplotlib.pyplot as plt
x_edges = np.arange(6)
y_edges = np.arange(6)
data = np.random.rand(340,2)*5
### using numpy.histogram2d
bin_values,_,__ = np.histogram2d(data[:,0],data[:,1],bins=(x_edges, y_edges) )
X, Y = np.meshgrid(x_edges,y_edges)
fig, (ax,ax2) = plt.subplots(ncols=2)
ax.set_title("numpy.histogram2d \n + plt.pcolormesh")
ax.pcolormesh(X, Y, bin_values.T)
### using plt.hist2d
ax2.set_title("plt.hist2d")
ax2.hist2d(data[:,0],data[:,1],bins=(x_edges, y_edges))
plt.show()
Of course this would equally work with scipy.stats.binned_statistic_2d.
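To make the point about bin edges concrete: pcolormesh accepts coordinate arrays that are one element longer than the data along each axis and treats them as cell edges. A minimal sketch using the arrays from the question (the Agg backend line is an assumption for off-screen rendering):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering
import matplotlib.pyplot as plt
import numpy as np

x_edges = np.arange(6)               # 6 edges delimit 5 bins
y_edges = np.arange(6)
bin_values = np.random.randn(5, 5)   # bin_values[i, j]: x-bin i, y-bin j

fig, ax = plt.subplots()
# pcolormesh expects C with shape (ny, nx), hence the transpose
mesh = ax.pcolormesh(x_edges, y_edges, bin_values.T)
fig.colorbar(mesh, ax=ax)
```

All five bins along each axis are drawn, so no dummy row or column is needed.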

Setting font size in matplotlib plot [duplicate]

I have too many ticks on my graph and they are running into each other.
How can I reduce the number of ticks?
For example, I have ticks:
1E-6, 1E-5, 1E-4, ... 1E6, 1E7
And I only want:
1E-5, 1E-3, ... 1E5, 1E7
I've tried playing with the LogLocator, but I haven't been able to figure this out.

Alternatively, if you want to simply set the number of ticks while allowing matplotlib to position them (currently only with MaxNLocator), there is pyplot.locator_params,
pyplot.locator_params(nbins=4)
You can restrict this to a single axis as shown below; the default is both:
# To specify the number of ticks on both or any single axes
pyplot.locator_params(axis='y', nbins=6)
pyplot.locator_params(axis='x', nbins=10)

To solve the issue of customisation and appearance of the ticks, see the Tick Locators guide on the matplotlib website
ax.xaxis.set_major_locator(plt.MaxNLocator(3))
would set the total number of ticks in the x-axis to 3, and evenly distribute them across the axis.
There is also a nice tutorial about this

If somebody still gets this page in search results:
fig, ax = plt.subplots()
plt.plot(...)
every_nth = 4
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)

There's a set_ticks() function for axis objects.
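For instance, tick positions can be passed directly to the axis object. This is a small sketch; the plotted data and the chosen positions are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(range(100))
ax.xaxis.set_ticks([0, 25, 50, 75, 100])   # explicit tick positions on the x axis
ax.yaxis.set_ticks([0, 50, 100])           # and on the y axis
```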

In case somebody still needs it, and since nothing here really worked for me, I came up with a very simple way that keeps the appearance of the generated plot "as is" while fixing the number of ticks to exactly N:
import numpy as np
import matplotlib.pyplot as plt
f, ax = plt.subplots()
ax.plot(range(100))
ymin, ymax = ax.get_ylim()
N = 5 # desired number of ticks (example value)
ax.set_yticks(np.round(np.linspace(ymin, ymax, N), 2))

The solution @raphael gave is straightforward and quite helpful.
Still, the displayed tick labels will not be values sampled from the original distribution but from the indexes of the array returned by np.linspace(ymin, ymax, N).
To display N values evenly spaced from your original tick labels, use the set_yticklabels() method. Here is a snippet for the y axis, with integer labels:
import numpy as np
import matplotlib.pyplot as plt
ax = plt.gca()
ymin, ymax = ax.get_ylim()
N = 5 # desired number of ticks (example value)
custom_ticks = np.linspace(ymin, ymax, N, dtype=int)
ax.set_yticks(custom_ticks)
ax.set_yticklabels(custom_ticks)

If you need one tick every N=3 ticks :
N = 3 # 1 tick every 3
xticks_pos, xticks_labels = plt.xticks() # get all axis ticks
myticks = [j for i,j in enumerate(xticks_pos) if not i%N] # index of selected ticks
newlabels = [label for i,label in enumerate(xticks_labels) if not i%N]
or with fig,ax = plt.subplots() :
N = 3 # 1 tick every 3
xticks_pos = ax.get_xticks()
xticks_labels = ax.get_xticklabels()
myticks = [j for i,j in enumerate(xticks_pos) if not i%N] # index of selected ticks
newlabels = [label for i,label in enumerate(xticks_labels) if not i%N]
(obviously you can adjust the offset with (i+offset)%N).
Note that you can get uneven ticks if you wish, e.g. myticks = [1, 3, 8].
Then you can use
plt.gca().set_xticks(myticks) # set new X axis ticks
or if you want to replace labels as well
plt.xticks(myticks, newlabels) # set new X axis ticks and labels
Beware that axis limits must be set after the axis ticks.
Finally, you may wish to draw only an arbitrary set of ticks :
mylabels = ['03/2018', '09/2019', '10/2020']
plt.draw() # needed to populate xticks with actual labels
xticks_pos, xticks_labels = plt.xticks() # get all axis ticks
myticks = [i for i,j in enumerate(xticks_labels) if j.get_text() in mylabels]
plt.xticks(myticks, mylabels)
(assuming mylabels is ordered ; if it is not, then sort myticks and reorder it).

plt.xticks() accepts any sequence of positions, so you can generate them with range():
start_number = 0
end_number = len(data) # data: placeholder for the sequence you are plotting
step_number = 10 # show one tick every step_number points (example value)
# rotation=90 tilts the labels, which helps with long tick labels
plt.xticks(range(start_number, end_number, step_number), rotation=90)

If you want about 10 ticks:
for y axis: ax.set_yticks(ax.get_yticks()[::max(1, len(ax.get_yticks())//10)])
for x axis: ax.set_xticks(ax.get_xticks()[::max(1, len(ax.get_xticks())//10)])
This takes the current ticks, keeps every (len // 10)-th one, and sets the result as the new ticks, which leaves roughly 10 of them. The max(1, ...) guards against a zero step when there are fewer than 10 ticks. Change the 10 to get a different number of ticks.

When a log scale is used the number of major ticks can be fixed with the following command
import matplotlib.pyplot as plt
....
plt.locator_params(numticks=12)
plt.show()
The value set to numticks determines the number of axis ticks to be displayed.
Credits to @bgamari's post for introducing the locator_params() function, but the nticks parameter throws an error when a log scale is used.
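For the specific case in the question (keeping every other decade on a log axis), the major tick positions can also be given explicitly. This is a sketch with made-up data; only the exponent range is taken from the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering
import matplotlib.pyplot as plt
import numpy as np

x = np.logspace(-6, 7, 50)               # made-up data spanning 1e-6 .. 1e7
fig, ax = plt.subplots()
ax.loglog(x, x)

decades = 10.0 ** np.arange(-5, 8, 2)    # 1e-5, 1e-3, ..., 1e5, 1e7
ax.set_xticks(decades)                   # fixes the major tick positions on the x axis
```

This only sets the major ticks; minor ticks, if any are drawn, are controlled separately.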

Fit a distribution to a histogram

I want to know the distribution of my data points, so first I plotted the histogram of my data. My histogram looks like the following:
Second, in order to fit them to a distribution, here's the code I wrote:
size = 20000
x = scipy.arange(size)
# fit
param = scipy.stats.gamma.fit(y)
pdf_fitted = scipy.stats.gamma.pdf(x, *param[:-2], loc = param[-2], scale = param[-1]) * size
plt.plot(pdf_fitted, color = 'r')
# plot the histogram
plt.hist(y)
plt.xlim(0, 0.3)
plt.show()
The result is:
What am I doing wrong?

Your data does not appear to be gamma-distributed, but assuming it is, you could fit it like this:
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
gamma = stats.gamma
a, loc, scale = 3, 0, 2
size = 20000
y = gamma.rvs(a, loc, scale, size=size)
x = np.linspace(0, y.max(), 100)
# fit
param = gamma.fit(y, floc=0)
pdf_fitted = gamma.pdf(x, *param)
plt.plot(x, pdf_fitted, color='r')
# plot the histogram
plt.hist(y, density=True, bins=30)
plt.show()
The area under the pdf (over the entire domain) equals 1.
The area under the histogram equals 1 if you use density=True (called normed=True in older Matplotlib versions).
In your code, x has length size (i.e. 20000), and pdf_fitted has the same shape as x. If we call plot and specify only the y-values, e.g. plt.plot(pdf_fitted), then the values are plotted over the x-range [0, size].
That is much too large an x-range. Since the histogram is going to use an x-range of [min(y), max(y)], we must choose x to span a similar range: x = np.linspace(0, y.max()), and call plot with both the x- and y-values specified, e.g. plt.plot(x, pdf_fitted).
As Warren Weckesser points out in the comments, for most applications you know the gamma distribution's domain begins at 0. If that is the case, use floc=0 to hold the loc parameter to 0. Without floc=0, gamma.fit will try to find the best-fit value for the loc parameter too, which given the vagaries of data will generally not be exactly zero.
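The statement about the histogram area can be checked numerically. A short sketch using np.histogram, which applies the same normalisation as a density histogram (the sample parameters and seed are arbitrary choices for this sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.gamma(shape=3.0, scale=2.0, size=20000)   # gamma-distributed sample

# density=True rescales bar heights so that sum(height * width) is 1
heights, edges = np.histogram(y, bins=30, density=True)
area = np.sum(heights * np.diff(edges))
print(area)
```

Because both the pdf and the density histogram integrate to 1, they can be drawn on the same axes without multiplying the pdf by the sample size.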

Python - Line colour of 3D parametric curve

I have two lists, tab_x (containing the values of x) and tab_z (containing the values of z), which have the same length, and a single value of y.
I want to plot a 3D curve that is colored by the value of z. I know it can be plotted as a 2D plot, but I want to plot several of these curves with different values of y to compare them, so I need it to be 3D.
My tab_z also contains negative values.
I've found code to color the curve by time (index) in this question, but I don't know how to transform it to work in my case.
Thanks for the help.
I add my code to be more specific:
fig8 = plt.figure()
ax8 = fig8.gca(projection = '3d')
tab_y=[]
for i in range (0,len(tab_x)):
    tab_y.append(y)
ax8.plot(tab_x, tab_y, tab_z)
I have this for now
I've tried this code
for i in range (0,len(tab_t)):
    ax8.plot(tab_x[i:i+2], tab_y[i:i+2], tab_z[i:i+2],color=plt.cm.rainbow(255*tab_z[i]/max(tab_z)))
A total failure:

Your second attempt almost has it. The only change is that the input to the colormap cm.jet() needs to be on the range of 0 to 1. You can scale your z values to fit this range with Normalize.
import numpy as np
from matplotlib import pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
from matplotlib import colors
fig = plt.figure()
ax = fig.add_subplot(projection='3d') # fig.gca(projection='3d') no longer works in recent Matplotlib
N = 100
y = np.ones((N,1))
x = np.arange(1,N + 1)
z = 5*np.sin(x/5.)
cn = colors.Normalize(min(z), max(z)) # creates a Normalize object for these z values
for i in range(N-1): # use xrange on Python 2
    ax.plot(x[i:i+2], y[i:i+2], z[i:i+2], color=plt.cm.jet(cn(z[i])))
plt.show()
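As an alternative to calling plot() once per segment, the segments can be collected into a single Line3DCollection and coloured through a colormap. This is a sketch of a different technique from the loop above, not the answer's method; the data mirror the example, and the axis limits and Agg backend are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: off-screen rendering
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d.art3d import Line3DCollection

N = 100
x = np.arange(1, N + 1)
y = np.ones(N)
z = 5 * np.sin(x / 5.0)

# Build N-1 segments, each made of two consecutive (x, y, z) points
points = np.column_stack([x, y, z])                      # shape (N, 3)
segments = np.stack([points[:-1], points[1:]], axis=1)   # shape (N-1, 2, 3)

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
lc = Line3DCollection(segments, cmap="jet")
lc.set_array(z[:-1])          # colour each segment by the z value at its start
ax.add_collection3d(lc)

# Set the limits explicitly so the whole curve is visible
ax.set_xlim(x.min(), x.max())
ax.set_ylim(0, 2)
ax.set_zlim(z.min(), z.max())
fig.colorbar(lc, ax=ax, label="z")
```

The collection normalises the z values itself, so negative values need no special handling, and the colorbar comes for free.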
