Matplotlib displaying histogram with a specific value on x and y axis

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
value = [8904,8953,8977,9147,9243,9320]
bin = np.arange(0,70,10)
ax.hist(value, bins=bin)
plt.grid(True)
plt.show()
I am trying to plot a histogram with the value array on the x-axis and the bins on the y-axis. But when I run the code I get an empty chart. Could anyone please help me out? Thank you.

First, make sure the data points in your value list are separated by commas.
Second, your values are outside the range of your bins. All your values are well into the thousands, while np.arange(0,70,10) produces bin edges from 0 to 60 (the stop value 70 is excluded).
Here is my edited version of your code (I included my import statements to make things clear). I changed the values to being within your bin ranges:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
value = [7, 8, 15, 45, 50, 80]
bin = np.arange(0,70,10)
ax.hist(value, bins=bin)
plt.grid(True)
plt.show()
The result I get is this image, which illustrates what's going on. The data point 80 is outside the bin range, and therefore isn't shown at all, just like the data points you originally had. Other than that, all data points are shown in the histogram.
Hope this helps!
Edit: you said in a comment to this answer that you want it to be horizontal, not vertical. To do that, add orientation="horizontal" as an argument to your ax.hist call. The new code looks like this:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
value = [7, 8, 15, 45, 50, 80]
bin = np.arange(0,70,10)
ax.hist(value, bins=bin, orientation="horizontal")
plt.grid(True)
plt.show()
Your plot should now look like this.
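If the goal is to plot the original values from the question rather than change them, the bin edges have to cover the data instead. A minimal sketch, assuming a bin width of 100 (my choice, not something the question specifies); np.histogram is used only to print the counts that ax.hist would draw:

```python
import numpy as np
import matplotlib.pyplot as plt

value = [8904, 8953, 8977, 9147, 9243, 9320]

# Original bins: edges 0, 10, ..., 60, so every value falls outside them.
old_bins = np.arange(0, 70, 10)
old_counts, _ = np.histogram(value, bins=old_bins)

# Assumed alternative: edges 8900, 9000, ..., 9400 that span the data.
new_bins = np.arange(8900, 9500, 100)
new_counts, _ = np.histogram(value, bins=new_bins)

print(old_counts)  # all zeros: nothing lands in the 0-60 range
print(new_counts)

fig, ax = plt.subplots()
ax.hist(value, bins=new_bins)
ax.grid(True)
# Call plt.show() here to display the figure.
```

Any set of edges that spans the smallest and largest value would work; passing an integer such as bins=5 lets matplotlib pick the edges from the data.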

Related

How to customize bar graph (matplotlib)?

My code:
import matplotlib.pyplot as plt
import numpy as np
f = plt.figure()
production_level = [54, 83, 21, 3] #list_of_prod
periods = [x+1 for x in range(len(production_level))] #list_of_order
plt.bar(periods, production_level, color='orange')
plt.title('Dynamic lot-size problem chart')
plt.ylabel('Units')
plt.xlabel('Periods')
plt.grid(True)
plt.show()
f.savefig("bar.png", bbox_inches='tight')
Output:
How can I have just whole numbers on the x-axis (1, 2, 3, 4), without 0.5, 1.5, 2.5, etc.? How can I add the bars' values on or above them?

Add text by using plt.text() and tweaking the coordinates (hint, hardcoding these values might not be the best idea).
Change ticks by using plt.xticks() (see also this question).
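A minimal sketch of both suggestions, using the data from the question. The label positions are derived from each bar's geometry instead of being hardcoded; the ha/va alignment choices are my own assumption about what looks reasonable:

```python
import matplotlib.pyplot as plt

production_level = [54, 83, 21, 3]
periods = list(range(1, len(production_level) + 1))

fig, ax = plt.subplots()
bars = ax.bar(periods, production_level, color='orange')

# One tick per bar, so only the whole numbers 1..4 appear on the x-axis.
ax.set_xticks(periods)

# Write each value just above its bar, centered horizontally.
for bar, level in zip(bars, production_level):
    x_center = bar.get_x() + bar.get_width() / 2
    ax.text(x_center, bar.get_height(), str(level), ha='center', va='bottom')

ax.set_title('Dynamic lot-size problem chart')
ax.set_ylabel('Units')
ax.set_xlabel('Periods')
# Call plt.show() here to display the figure.
```

The same calls exist on the pyplot interface (plt.xticks(periods), plt.text(...)) if you prefer to keep the style of the original code.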

How to remove an histogram in Matplotlib

I am used to working with plots that change over time in order to show differences when a parameter is changed. Here I provide an easy example:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid(True)
x = np.arange(-3, 3, 0.01)
for j in range(1, 15):
    y = np.sin(np.pi*x*j) / (np.pi*x*j)
    line, = ax.plot(x, y)
    plt.draw()
    plt.pause(0.5)
    line.remove()
You can clearly see that as the parameter j increases, the plot becomes narrower and narrower.
Now if I want to do the same job with a contour plot, then I just have to remove the comma after "line". From my understanding, this little modification comes from the fact that the contour plot is not an element of a tuple anymore, but just an attribute, as the contour plot completely "fills up" all the space available.
But it looks like there is no way to remove (and plot again) a histogram. In fact, if I type
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
ax = fig.add_subplot(111)
ax.grid(True)
x = np.random.randn(100)
for j in range(15):
    hist, = ax.hist(x, 40)*j
    plt.draw()
    plt.pause(0.5)
    hist.remove()
It doesn't matter whether I type that comma or not, I just get an error message.
Could you help me with this, please?

ax.hist doesn't return what you think it does.
The returns section of the docstring of hist (access via ax.hist? in an ipython shell) states:
Returns
-------
n : array or list of arrays
    The values of the histogram bins. See **normed** and **weights**
    for a description of the possible semantics. If input **x** is an
    array, then this is an array of length **nbins**. If input is a
    sequence arrays ``[data1, data2,..]``, then this is a list of
    arrays with the values of the histograms for each of the arrays
    in the same order.
bins : array
    The edges of the bins. Length nbins + 1 (nbins left edges and right
    edge of last bin). Always a single array even when multiple data
    sets are passed in.
patches : list or list of lists
    Silent list of individual patches used to create the histogram
    or list of such list if multiple input datasets.
So you need to unpack your output (without the *j, which would repeat the returned tuple rather than scale anything):
counts, bins, bars = ax.hist(x, 40)
for b in bars:
    b.remove()
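As a quick check of what ax.hist hands back, the sketch below (with made-up random data) inspects the returned tuple, shows what multiplying it by an integer does, as the question's code attempts, and then removes the bars one by one:

```python
import numpy as np
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = np.random.randn(100)

result = ax.hist(x, 40)
print(len(result))      # 3 items: counts, bin edges, bar patches
print(len(result * 2))  # 6 items: the tuple is repeated, not scaled

counts, bins, bars = result
n_before = len(ax.patches)  # the 40 rectangles drawn by hist
for b in bars:
    b.remove()
n_after = len(ax.patches)   # nothing left on the axes
print(n_before, n_after)
```

Because the unpacking expects exactly three items, it only works on the unmultiplied return value.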

Here is the right way to iteratively draw and delete histograms in matplotlib:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure(figsize = (20, 10))
ax = fig.add_subplot(111)
ax.grid(True)
for j in range(1, 15):
    x = np.random.randn(100)
    count, bins, bars = ax.hist(x, 40)
    plt.draw()
    plt.pause(1.5)
    t = [b.remove() for b in bars]

Bars on polar bar plots are cut off when rlim is set

I am making some bar plots using a polar projection. The data are all large numbers far from the origin and thus I'm using the ax.set_rlim to make them easier to distinguish. However, when I set the rlim, some of the bars are cut off around the origin. This is not an issue when I do not set the rlim, but I can't present my data like this. Why is this happening and is there a way I can fix it?
Here is an example of the issue:
import matplotlib.pyplot as plt
import numpy as np
Sectors = np.arange(0,2*np.pi,np.pi/4)
Data = np.array([100,99,100,101,100.5,100.25,99.25,99.75])
fig, ax = plt.subplots(nrows = 1, ncols = 1, subplot_kw={'projection': 'polar'})
ax.bar(Sectors,Data)
ax.set_rlim(98,102)
plt.show()
Note, this does not happen if I don't apply the rlim. eg:
import matplotlib.pyplot as plt
import numpy as np
Sectors = np.arange(0,2*np.pi,np.pi/4)
Data = np.array([100,99,100,101,100.5,100.25,99.25,99.75])
fig, ax = plt.subplots(nrows = 1, ncols = 1, subplot_kw={'projection': 'polar'})
ax.bar(Sectors,Data)
#ax.set_rlim(98,102)
plt.show()
Any help is greatly appreciated!

This is a very strange effect indeed.
But there seems to be a workaround using the bottom keyword to bar. The trick is to set the bottom to the inner rlim (in this case 98) and specify the data relative to the bottom value.
import matplotlib.pyplot as plt
import numpy as np
Sectors = np.arange(0,2*np.pi,np.pi/4)
Data = np.array([100,99,100,101,100.5,100.25,99.25,99.75])
fig, ax = plt.subplots(nrows = 1, ncols = 1, subplot_kw={'projection': 'polar'})
ax.bar(Sectors,Data-98, bottom=98)
ax.set_rlim(98,102)
plt.show()

Looks like a silly round-off error in matplotlib. I bumped up your numbers by a factor of 10 and all but one wedge showed correctly. Setting the rlim() to a larger range also shows some improvement. If you need to put this in a presentation, cover up the middle with a drawn-in circle.
All bandaids I am afraid....

How to change axis range displayed in a histogram

I want to plot a histogram of my df with about 60 thousand values. After I used plt.hist(x, bins = 30) it gave me something like
The problem is that there are more values bigger than 20, but the frequencies of those values may be smaller than 10. So how can I adjust the displayed axis range to show more bins? I want to look at the whole distribution here.

The problem with histograms that skew so much towards one value is you're going to essentially flatten out any outlying values. A solution might be just to present the data with two charts.
Can you create another histogram containing only the values greater than 20?
(pseudo-code, since I don't know your data structure from your post)
plt.hist(x[x.column > 20], bins = 30)
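A runnable version of that idea, as a sketch only: the data here is made up (a large cluster of small values plus a thin tail above 20), and x is assumed to be a plain NumPy array rather than a DataFrame column:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Made-up data: 60,000 small values plus a thin tail above 20.
x = np.concatenate([rng.uniform(0, 5, 60_000), rng.uniform(20, 100, 300)])

tail = x[x > 20]  # boolean mask keeps only the values greater than 20

fig, (ax_all, ax_tail) = plt.subplots(1, 2, figsize=(12, 5))
ax_all.hist(x, bins=30)
ax_all.set_title('all values')
ax_tail.hist(tail, bins=30)
ax_tail.set_title('values > 20 only')
# Call plt.show() here to display the figure.
```

With a pandas DataFrame the mask would be built from the relevant column instead, in the same way as the pseudo-code above.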

Finally, it could look like this example:
import matplotlib.pyplot as plt
import numpy as np
values1 = np.random.rand(1000,1)*100
values2 = np.random.rand(100000,1)*5
values3 = np.random.rand(10000,1)*20
values = np.vstack((values1,values2,values3))
fig = plt.figure(figsize=(12,5))
ax1 = fig.add_subplot(121)
ax1.hist(values,bins=30)
ax1.set_yscale('log')
ax1.set_title('with log scale')
ax2 = fig.add_subplot(122)
ax2.hist(values,bins=30)
ax2.set_title('no log scale')
fig.savefig('test.jpg')

You could use plt.xscale('log')
PyPlot Logarithmic and other nonlinear axis
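A minimal sketch of the log-scaled x-axis, on made-up, strictly positive data (a log axis cannot show zero or negative values). The log-spaced bin edges from np.logspace are my own addition, not part of the suggestion above; they keep the bars at a similar visual width once the axis is logarithmic:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
# Made-up, heavily skewed, strictly positive data.
x = rng.lognormal(mean=1.0, sigma=1.5, size=60_000)

# 31 edges give 30 bins, evenly spaced in log10 between min and max.
bins = np.logspace(np.log10(x.min()), np.log10(x.max()), 31)

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(x, bins=bins)
ax.set_xscale('log')
# Call plt.show() here to display the figure.
```

plt.xscale('log') does the same thing as ax.set_xscale('log') when working through the pyplot interface.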

x axis with duplicate values (loading profile) plot in matplotlib

I have load-profile data where the x-axis is the load, so for multiple identical values of x (constant load) I have different values of y.
Until now, in Excel, I would make a line plot of y, then right-click the graph -> Select Data -> change the horizontal axis data by giving it the range of the x-axis data, and that gave me the graph I wanted.
The problem I have is that when I call
plot(x,y), matplotlib plots y against the unique values of x, i.e. it neglects all the remaining values for the same value of x.
When I plot with plot(y), I get sequence numbers on the x-axis.
I tried xticks([0,5,10,15]) to check, but couldn't get the required result.
My question is:
Is it possible to plot a graph the way Excel does?
The other alternative I could think of was plotting plot(y) and plot(x) on the same horizontal axis, which at least gives a pictorial idea, but is there any way to do it the Excel way?

From your description, it sounds to me like you want to use the "scatter" plotting command instead of the "plot" plotting command. This will allow the use of redundant x-values. Sample code:
import numpy as np
import matplotlib.pyplot as plt
# Generate some data that has non-unique x-values
x1 = np.linspace(1,50)
y1 = x1**2
y2 = 2*x1
x3 = np.append(x1,x1)
y3 = np.append(y1,y2)
# Now plot it using the scatter command
# Note that some of the abbreviations that work with plot,
# such as 'ro' for red circles don't work with scatter
plt.scatter(x3,y3,color='red',marker='o')
As I mentioned in the comments, some of the handy "plot" shortcuts don't work with "scatter" so you may want to check the documentation: http://matplotlib.sourceforge.net/api/pyplot_api.html#matplotlib.pyplot.scatter

If you want to plot the y-values for a given x-value, you need to get the indices that share that x-value. If you are working with numpy then you can try
import matplotlib.pyplot as plt
import numpy as np
x=np.array([1]*5+[2]*5+[3]*5)
y=np.array([1,2,3,4,5]*3)
idx=(x==1) # Get the index where x-values are 1
plt.plot(y[idx],'o-')
plt.show()
If you are working with lists you can get the index by
# Get the index where x-values are 1
idx=[i for i, j in enumerate(x) if j == 1]

Just answering my own question; I found this lying around from when I posted this question years back :)
import matplotlib.pyplot as plt

def plotter(y1,y2,y1name,y2name):
    # Averages are computed here but not used in the plot below.
    averageY1=float(sum(y1)/len(y1))
    averageY2=float(sum(y2)/len(y2))
    fig = plt.figure()
    ax1 = fig.add_subplot(111)
    ax1.plot(y1,'b-',linewidth=2.0)
    ax1.set_xlabel("SNo")
    # Make the y1-axis label and tick labels match the line color.
    ax1.set_ylabel(y1name, color='b')
    for tl in ax1.get_yticklabels():
        tl.set_color('b')
    ax1.axis([0,len(y2),0,max(y1)+50])
    # Second y-axis sharing the same x-axis, for y2.
    ax2 = ax1.twinx()
    ax2.plot(y2, 'r-')
    ax2.axis([0,len(y2),0,max(y2)+50])
    ax2.set_ylabel(y2name, color='r')
    for tl in ax2.get_yticklabels():
        tl.set_color('r')
    plt.title(y1name + " vs " + y2name)
    #plt.fill_between(y2,1,y1)
    plt.grid(True,linestyle='-',color='0.75')
    plt.savefig(y1name+"VS"+y2name+".png",dpi=200)

You can use
import numpy as np
import matplotlib.pyplot as plt
x = np.array([1, 1, 1, 2, 2, 2])
y = np.array([1, 2, 1, 5, 6, 7])
fig, ax = plt.subplots()
ax.plot(np.arange(len(x)), y)
ax.set_xticks(np.arange(len(x)))  # one tick per data point, so the labels line up
ax.set_xticklabels(x)
plt.show()
