How would I go about plotting a histogram when I already have the bins and their sizes?
If I use:
plt.hist(x, bins)
it treats x as a list of raw observations rather than as the already-computed value of each bin.
Thanks
In that case you can simply create a bar chart with plt.bar:
plt.bar(bins[:, 0], x, bins[:, 1] - bins[:, 0], align='edge')
I simply assumed bins is an array of shape (n, 2), where n is the number of bins. The first column is the lower edge of each bin and the second column is the upper edge. Passing align='edge' makes each bar start at its lower edge; by default plt.bar centers the bars on the x values.
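A minimal sketch of the bar-chart approach, assuming bins is an (n, 2) array of [lower, upper] edges and x holds one precomputed value per bin. The sample numbers are made up for illustration, and the Agg backend is selected only so the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: three bins of unequal width and their precomputed values.
bins = np.array([[0.0, 1.0], [1.0, 3.0], [3.0, 4.0]])
x = np.array([5.0, 2.0, 7.0])

fig, ax = plt.subplots()
# align="edge" places the left side of each bar at the bin's lower edge,
# and the width is the distance between the upper and lower edge.
bars = ax.bar(bins[:, 0], x, width=bins[:, 1] - bins[:, 0], align="edge")
```

Call plt.show() or fig.savefig(...) afterwards to view the result.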
I have a set of positive integers ranging from 250 to 1200, with a normal distribution. I have found the answer to creating bins of equal density (Matplotlib: How to make a histogram with bins of equal area?). What I am actually looking for is a way to retrieve the upper and lower boundaries of each bin. Is there a library or function that does this, or can this information be pulled out of matplotlib?
Let's take a look at the code provided in the question you linked:
def histedges_equalN(x, nbin):
    npt = len(x)
    return np.interp(np.linspace(0, npt, nbin + 1),
                     np.arange(npt),
                     np.sort(x))
x = np.random.randn(1000)
n, bins, patches = plt.hist(x, histedges_equalN(x, 10))
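bins actually gives you the edges of each bin, as described in the documentation of the hist function: it is an array of length nbins + 1 holding the left edge of every bin plus the right edge of the last bin.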
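As a sketch of how the lower and upper boundaries can be read off, the snippet below reuses histedges_equalN from the linked question and pairs consecutive edges. It uses np.histogram instead of plt.hist so that no figure is needed; the seed and the names lower and upper are illustrative choices, not part of the original code:

```python
import numpy as np

def histedges_equalN(x, nbin):
    """Bin edges chosen so that each bin holds about the same number of points."""
    npt = len(x)
    return np.interp(np.linspace(0, npt, nbin + 1),
                     np.arange(npt),
                     np.sort(x))

rng = np.random.default_rng(0)      # fixed seed, for reproducibility only
x = rng.normal(size=1000)

edges = histedges_equalN(x, 10)     # nbin + 1 edges
counts, _ = np.histogram(x, bins=edges)

# Bin i spans lower[i] to upper[i]; only the last bin includes its upper edge.
lower, upper = edges[:-1], edges[1:]
for lo, hi, c in zip(lower, upper, counts):
    print(f"{lo:8.3f} to {hi:8.3f}: {c} points")
```

The same slicing works on the bins array returned by plt.hist.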
I would like to plot a histogram using matplotlib.
I am wondering how I can set up the bin ranges (<9.0, 9.0-10.0, 11.0-12.0, 12.0-13.0, ... up to the max element in an array).
<9.0 stands for elements smaller than 9.0
I have used the smallest and biggest value in an array:
plt.hist(results, bins=np.arange(np.amin(results),np.amax(results),0.1))
I'll be grateful for any hints
The list or array supplied to bins contains the edges of the histogram bins. You may therefore create a bin ranging from the minimal value in results to 9.0.
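# list(...) is required on Python 3, where a range cannot be added to a list; np.arange also accepts float limits
bins = [np.min(results)] + list(np.arange(9.0, np.ceil(np.max(results)) + 1.0, 1.0))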
plt.hist(results, bins=bins)
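A small worked sketch of such an open-ended first bin, using np.histogram so the counts can be inspected directly. The values in results are invented, and the construction assumes the smallest value lies below 9.0 (otherwise the edges would not be increasing):

```python
import numpy as np

# Hypothetical measurements; the minimum is assumed to be below 9.0.
results = np.array([7.2, 8.5, 9.1, 9.9, 10.4, 11.0, 12.7])

# First edge at the minimum, then unit-width edges from 9.0 up to
# the first whole number at or above the maximum.
bins = np.concatenate(([results.min()],
                       np.arange(9.0, np.ceil(results.max()) + 1.0, 1.0)))

counts, edges = np.histogram(results, bins=bins)
print(edges)   # 7.2, 9, 10, 11, 12, 13
print(counts)  # 2, 2, 1, 1, 1
```

Passing the same bins array to plt.hist(results, bins=bins) draws the corresponding bars.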
I'm trying to make a scaled scatter plot from a histogram. The scatter plot is fairly straight-forward, make the histogram, find bin centers, scatter plot.
nbins=7
# Some example data
A = np.random.randint(0, 10, 100)
B = np.random.rand(100)
counts, binEdges=np.histogram(A,bins=nbins)
bincenters = 0.5*(binEdges[1:]+binEdges[:-1])
fig = plt.figure(figsize=(7,5))
ax = fig.add_subplot(111)
ax.scatter(bincenters,counts,c='k', marker='.')
ax_setup(ax, 'X', 'Y')
plt.show()
but I want each element of A to only contribute a scaled value to its bin, and that scaled value is stored in B. (i.e. instead of each bin being the count of elements from A for that bin, I want each bin to be the sum of the corresponding values from B)
To do this I tried creating a list C (same length as A, and B) that had the bin number allocation for each element of A, then summing all of the values from B that go into the same bin. I thought numpy.searchsorted() is what I needed e.g.,
C = bincenters.searchsorted(A, 'right')
but this doesn't get the allocation right, and doesn't seem to return the correct number of bins.
So, how do I create a list that tells me which histogram bin each element of my data goes into?
You write
but I want each element of A to only contribute a scaled value to its bin, and that scaled value is stored in B. (i.e. instead of each bin being the count of elements from A for that bin, I want each bin to be the sum of the corresponding values from B)
IIUC, this functionality is already supported in numpy.histogram via the weights parameter:
An array of weights, of the same shape as a. Each value in a only contributes its associated weight towards the bin count (instead of 1). If normed is True, the weights are normalized, so that the integral of the density over the range remains 1.
So, for your case, it would just be
counts, binEdges=np.histogram(A, bins=nbins, weights=B)
Another point: if your intent is to plot the histogram, note that you can directly use matplotlib.pyplot's utility functions for this (which take weights as well):
from matplotlib import pyplot as plt
plt.hist(A, bins=nbins, weights=B);
Finally, if you're intent on getting the assignments to bins, then that's exactly what numpy.digitize does:
nbins=7
# Some example data
A = np.random.randint(0, 10, 10)
B = np.random.rand(10)
counts, binEdges=np.histogram(A,bins=nbins)
>>> binEdges, np.digitize(A, binEdges)
array([ 0. , 1.28571429, 2.57142857, 3.85714286, 5.14285714,
6.42857143, 7.71428571, 9. ])
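A sketch of how the assignments from np.digitize could be used to sum B per bin by hand. The clipping step is an assumption added here so that values equal to the last edge land in the last bin, which is how np.histogram treats them; the seed and the names idx, sums, and weighted are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, for reproducibility only
nbins = 7
A = rng.integers(0, 10, 100)
B = rng.random(100)

_, binEdges = np.histogram(A, bins=nbins)

# np.digitize returns 1-based bin numbers, and a value equal to the last
# edge gets nbins + 1, so shift to 0-based and clip into the valid range.
idx = np.clip(np.digitize(A, binEdges) - 1, 0, nbins - 1)

# Sum the entries of B that share a bin index.
sums = np.bincount(idx, weights=B, minlength=nbins)

# Cross-check against the weights parameter of np.histogram.
weighted, _ = np.histogram(A, bins=binEdges, weights=B)
print(np.allclose(sums, weighted))
```

Here idx is the list asked for in the question: idx[i] is the bin that A[i] falls into.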
plt.hist(np.zeros((784,1)), bins=2)
This should produce a histogram with all values in the bin at 0, but the output is:
What's wrong?
Not sure what you are expecting, but maybe this helps:
The bins represent intervals. The function computes the occurrences of the input data that fall within each bin (or interval).
Consider this example:
plt.hist(np.zeros((784)), bins=(0,1,2))
There are 2 intervals, the first for values from 0 to 1, the second for values from 1 to 2. So you will have 784 counts in the first and no counts in the second interval. This will produce the following:
Now if you replace bins=(0,1,2) with bins=2, it will use 2 intervals of equal width between the minimum and the maximum input value. Since the input contains only zeros, the range is widened to -0.5 to +0.5, resulting in the histogram you showed above: no counts between -0.5 and 0, and all 784 zeros between 0 and +0.5.
So I guess what you want is a thin bar centered at zero. You can get this by, e.g., setting bins to some larger odd number:
plt.hist(np.zeros((784)), bins=7)
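The binning described above can be inspected without plotting. This sketch calls np.histogram, which plt.hist uses for its binning, on the same all-zero input; the variable names are illustrative:

```python
import numpy as np

data = np.zeros(784)

# bins=2: the degenerate range [0, 0] is widened to [-0.5, 0.5].
counts2, edges2 = np.histogram(data, bins=2)
print(edges2)   # edges at -0.5, 0 and 0.5
print(counts2)  # all 784 zeros land in the second bin

# bins=7: an odd number of bins puts zero inside the middle bin.
counts7, edges7 = np.histogram(data, bins=7)
print(counts7)  # only the middle (fourth) bin is non-empty
```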
That's how plt.hist works. For example, say you have a list like (3, 5, 1, 7, 4, 3, 9, 0, 2) and pass it to plt.hist with bins=3. hist distributes the numbers into 3 intervals (here 0-3, 3-6, and 6-9, with only the last one including its right edge) and draws 3 bars. The height of each bar is the number of values that fell into the corresponding interval. In this case, the heights will be (3, 4, 2). In your case, bins=2, and the intervals are -0.5 to 0 and 0 to 0.5. All 784 zeros land in the second bin, and the first bin is empty.
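The nine-number example can be checked with np.histogram, which performs the same binning as plt.hist; this is only a verification sketch and the variable names are illustrative:

```python
import numpy as np

values = [3, 5, 1, 7, 4, 3, 9, 0, 2]

# Three equal-width bins between min(values) = 0 and max(values) = 9.
counts, edges = np.histogram(values, bins=3)
print(edges)   # edges at 0, 3, 6 and 9
print(counts)  # bar heights 3, 4 and 2
```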
There is another function in matplotlib that works the way you probably expected plt.hist to work: plt.bar. You pass it the heights of the bars and it draws them as they are, without any binning. You can use it like this:
plt.bar(np.arange(784), np.zeros(784))
and it will give you 784 zero-height bars.
I have a list.
The index of the list is the degree number.
The value is the probability of that degree.
For example, x[1] = 0.01 means that the probability of degree 1 is 0.01.
I want to draw a distribution graph of this list, and I tried
hist = plt.figure(1)
plt.hist(PrDeg, bins = 1)
plt.title("Degree Probability Histogram")
plt.xlabel("Degree")
plt.ylabel("Prob.")
hist.savefig("Prob_Hist")
PrDeg is the list I mentioned above.
But the saved figure is not correct.
The x axis shows the probability and the y axis shows the degree (the index of the list).
How can I swap the x and y axes using pyplot?
Histograms do not usually show you probabilities; they show the count or frequency of observations within different intervals of values, called bins. pyplot defines the bins by splitting the range between the minimum and maximum value of your array into n equally sized intervals, where n is the number you specified with the argument bins = 1. So, in this case your histogram has a single bin, which gives it its odd appearance. By increasing that number you will be able to better see what actually happens there.
The only information that we can get from such a histogram is that the values of your data range from 0.0 to ~0.122 and that len(PrDeg) is close to 1800. If I am right about that much, it means your graph looks like what one would expect from a histogram and it is therefore not incorrect.
To answer your question about swapping the axes, the argument orientation='horizontal' is what you are looking for. I used it in the example below, renaming the axes accordingly:
import numpy as np
import matplotlib.pyplot as plt
PrDeg = np.random.normal(0,1,10000)
print(PrDeg)
hist = plt.figure(1)
plt.hist(PrDeg, bins=100, orientation='horizontal')
plt.title("Degree Probability Histogram")
plt.xlabel("count")
plt.ylabel("Values randomly generated by numpy")
hist.savefig("Prob_Hist")
plt.show()
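If PrDeg already holds one probability per degree, as the question describes, a bar chart may be closer to the intended picture than a histogram, because nothing needs to be binned. This is a hedged sketch with made-up probabilities; the Agg backend is selected only so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line for interactive use
import matplotlib.pyplot as plt

# Hypothetical probabilities: PrDeg[k] is the probability of degree k.
PrDeg = [0.0, 0.01, 0.25, 0.40, 0.24, 0.10]
degrees = range(len(PrDeg))

fig, ax = plt.subplots()
# One bar per degree, with its probability as the bar height.
bars = ax.bar(degrees, PrDeg)
ax.set_title("Degree Probability")
ax.set_xlabel("Degree")
ax.set_ylabel("Prob.")
```

With this layout the degree is on the x axis and the probability on the y axis, so no axis swap is needed.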