Creating a barchart from histogram, python matplotlib - python

I have a histogram like this:
My data is collected with append while parsing the file:
{
[...]
a.append(int(number))
#a = [1,1,2,1,1, ]...
}
plt.hist(a, 180)
But as you can see from the image, there are a lot of blank areas, so I would like to build a bar chart from this data instead. How can I reorganize the data like this:
#a = [ 1: 4023, 2: 3043, 3:...]
where 1 is the "number" and 4023 is an example of how many "hits" the number 1 has? From what I have seen, with the data in this form I can call:
plt.bar(...)
to create the chart, so that only the relevant numbers are shown and the plot is more readable.
A simple way to cut the white areas out of the histogram would also be welcome.
I would also like to show the count on top of each column, but I have no idea how to do it.

Assuming you have a numpy array a full of integers, the code below will produce the bar chart you want.
It uses np.bincount to count the occurrences of each value; note that this only works for non-negative integers.
np.nonzero then selects the values that actually occur, so the empty positions are not plotted. Also note that the bars are centred on their values (align='center') rather than starting at them.
import matplotlib.pyplot as plt
import numpy as np
# Generate some random data (integers from 0 to 20 inclusive).
N = 300
a = np.random.randint(low=0, high=21, size=N)
# Use bincount and nonzero to generate your data in the correct format.
b = np.bincount(a)       # b[k] is the number of times k occurs in a
ind = np.nonzero(b)[0]   # the values that occur at least once
width = 0.8
fig, ax = plt.subplots()
ax.bar(ind, b[ind], width, align='center')
plt.show()
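The question also asks how to show the count on top of each column. A minimal sketch of one way to do that is below: it draws the bars as in the answer and then writes each bar's height above it with ax.text. The function name labelled_count_bars and the sample data are made up for illustration, and the Agg backend line is only an assumption so that the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to use an interactive one
import matplotlib.pyplot as plt
import numpy as np


def labelled_count_bars(values, width=0.8):
    """Draw one bar per distinct non-negative integer and write its count above it.

    Hypothetical helper, not part of matplotlib.
    """
    counts = np.bincount(values)      # counts[k] = occurrences of k
    present = np.nonzero(counts)[0]   # values that occur at least once
    fig, ax = plt.subplots()
    bars = ax.bar(present, counts[present], width, align="center")
    for bar in bars:
        height = bar.get_height()
        # Place the label at the horizontal centre of the bar, just above its top.
        ax.text(bar.get_x() + bar.get_width() / 2.0, height,
                str(int(height)), ha="center", va="bottom")
    ax.set_xticks(present)
    hits = dict(zip(present.tolist(), counts[present].tolist()))
    return fig, ax, hits


# Made-up sample data in the same shape as the list a from the question.
fig, ax, hits = labelled_count_bars([1, 1, 2, 1, 1, 5, 5])
```

The returned hits dictionary has the "number: hits" form the question describes, and fig.savefig(...) or plt.show() (with an interactive backend) displays the result.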

Related

Adding values and strings in Matplotlib subplot texts

I'm new to programming so this is a basic question. I am creating a number of subplots in a big loop and wish to annotate each one with both a description and a value for that plot, e.g. Alpha = 5. But I find that using ax.text I can create one part or the other, but not both. The following code snippet produces roughly the desired outcome, but only when I run ax.text twice and position them manually, which of course is impractical.
import matplotlib.pyplot as plt
plt.figure(figsize=(10,10))
i=0
for alpha,beta in [(5,10),(100,20)]:
    for omega in ['A','B']:
        i+=1
        ax=plt.subplot(2,2,i)
        ax.text(0.1,0.9,'Alpha = ')
        ax.text(0.25,0.9,alpha)
plt.show()
I've tried various combinations of commas, plus signs and indices in ax.text but can't seem to get it to work.
Use string formatting to do this:
import matplotlib.pyplot as plt
plt.figure(figsize=(10,10))
i=0
for alpha,beta in [(5,10),(100,20)]:
    for omega in ['A','B']:
        i+=1
        ax=plt.subplot(2,2,i)
        ax.text(0.1,0.9,'Alpha = {}'.format(alpha))
plt.show()
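The same formatting approach extends to several values per subplot. The sketch below builds one multi-line label from name/value pairs and places it with a single ax.text call. The helper make_label is hypothetical, and passing transform=ax.transAxes (so the position is a fraction of the axes rather than a data coordinate) is a choice made here, not something the original answer does.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to use an interactive one
import matplotlib.pyplot as plt


def make_label(**params):
    """Join name/value pairs into one annotation string, one pair per line.

    Hypothetical helper for illustration.
    """
    return "\n".join("{} = {}".format(name, value) for name, value in params.items())


fig, axes = plt.subplots(2, 2, figsize=(10, 10))
settings = [(alpha, beta, omega)
            for alpha, beta in [(5, 10), (100, 20)]
            for omega in ["A", "B"]]
for ax, (alpha, beta, omega) in zip(axes.flat, settings):
    # One call per subplot; the position is in axes-fraction coordinates.
    ax.text(0.1, 0.9, make_label(Alpha=alpha, Beta=beta, Omega=omega),
            transform=ax.transAxes, va="top")
```

Because the whole label is a single string, there is no need to position the description and the value separately.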

Use a list to determine matplotlib colours

I am making a basic program using matplotlib which graphs a large number of points, and calculates a value to colour those points. My issue is that as the number of points gets very large, the time it takes to individually plot each point through a for loop also gets very large. Is there any way I can use one plot statement and specify a list to use the colours for each individual point? As an example,
Current method:
colours = [(1,0,0),(0,1,0),(0,1,1)] #The length of these lists is usually in the thousands
x = [0,1,2]
y = [2,1,0]
for i in range(len(colours)):
    plot([x[i]],[y[i]],'o', color = colours[i])
Whereas what I would like to use would be something more like:
plot(x,y,'o', color=colours)
Which would use each colour for each point. Is there any better way to approach this than a for loop?
You want scatter rather than plot; its c argument accepts one colour per point.
import matplotlib.pyplot as plt
colours = [(1,0,0),(0,1,0),(0,1,1)]
x = [0,1,2]
y = [2,1,0]
plt.scatter(x,y, c=colours)
plt.show()
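Since the question mentions calculating a value for each point, it may be simpler to hand those values to scatter directly and let a colormap turn them into colours. The sketch below assumes a made-up per-point quantity (x + y) and the "viridis" colormap; both are placeholders for whatever the real program computes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; remove to use an interactive one
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(5000)
y = rng.random(5000)
value = x + y  # hypothetical per-point quantity used for colouring

fig, ax = plt.subplots()
# A single call draws every point; c holds one number per point and
# cmap maps those numbers to colours.
points = ax.scatter(x, y, c=value, cmap="viridis", s=4)
fig.colorbar(points, ax=ax, label="value")
```

This keeps the plotting to one call regardless of the number of points, and the colour bar documents how values map to colours.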

Plotting a histogram from lists using pylab

I am struggling with plotting a histogram from two lists using the pylab module (which I am required to use).
The first list, totalTimes, is populated with 7 float values calculated within the program.
The second list, raceTrack, is populated with 7 string values, each the name of a race track.
totalTimes[0] is the time taken on raceTrack[0], totalTimes[3] is the time taken on raceTrack[3], etc.
I sorted the list and rounded the values to 2 decimal places:
totalTimes.sort()
myFormattedTotalTimes = ['%.2f' % elem for elem in totalTimes]
myFormattedTotalTimes' output (when the value entered is 100) is
['68.17', '71.43', '71.53', '84.23', '84.55', '87.20', '102.85']
I need to use the values in the list to create a histogram where the x-axis shows the name of the race track and the y-axis shows the time on that particular track. I quickly made an Excel histogram to help illustrate.
I have attempted the following, but to no avail:
for i in range (7):
    pylab.hist([myFormattedTotalTimes[i]],7,[0,120])
pylab.show()
Any help would be very appreciated, I am quite lost on this one.
As @John Doe states, I think you want a bar chart. Adapted from the matplotlib example, the following does what you want:
import matplotlib.pyplot as plt
import numpy as np
myFormattedTotalTimes = ['68.17', '71.43', '71.53', '84.23', '84.55', '87.20', '102.85']
#Setup track names
raceTrack = ["track " + str(i+1) for i in range(7)]
#Convert to float
racetime = [float(i) for i in myFormattedTotalTimes]
#Plot a bar chart (not a histogram)
width = 0.35 # the width of the bars
ind = np.arange(7) #Bar indices
fig, ax = plt.subplots(1,1)
ax.bar(ind, racetime, width, align='center')
ax.set_xticks(ind)  # bars are centred on ind, so the labels go there too
ax.set_xticklabels(raceTrack)
plt.show()
Which looks like,
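One thing to watch in the question: calling totalTimes.sort() on its own reorders the times but not raceTrack, so totalTimes[i] no longer belongs to raceTrack[i]. A minimal sketch of sorting the two lists together is below; the unsorted times are made up for illustration, and the track names follow the placeholder naming used in the answer.

```python
# Made-up unsorted times, one per track (illustrative values only).
total_times = [84.231, 68.172, 102.85, 71.43, 87.2, 71.53, 84.55]
race_tracks = ["track " + str(i + 1) for i in range(7)]

# Sort (time, track) pairs so each time stays attached to its track.
pairs = sorted(zip(total_times, race_tracks))
sorted_times = [round(t, 2) for t, _ in pairs]
sorted_tracks = [name for _, name in pairs]
```

sorted_times can then be passed to ax.bar as the heights and sorted_tracks to ax.set_xticklabels, so each bar carries the right label.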

How to update pyplot histogram

I have a 100.000.000 sample dataset and I want to make a histogram with pyplot. But reading this large file drains my memory critically (cursor not moving anymore, ...), so I'm looking for ways to 'help' pyplot.hist. I was thinking breaking up the file into several smaller files might help. But I wouldn't know how to combine them afterwards.
You can combine the output of pyplot.hist or, as @titusjan suggested, numpy.histogram, as long as you keep the bins fixed each time you call it. For example:
import matplotlib.pyplot as plt
import numpy as np
# Generate some fake data
data = np.random.rand(1000)
# The fixed bins (change depending on your data)
bins = np.arange(0, 1.1, 0.1)
sub_hist = []
# Split into 100 sub histograms of 10 samples each
for i in np.arange(0, 1000, 10):
    sub_hist_temp, bins_out = np.histogram(data[i:i+10], bins=bins)
    sub_hist.append(sub_hist_temp)
# Sum the histograms
hist_sum = np.array(sub_hist).sum(axis=0)
# Plot the new summed data, using plt.bar
fig = plt.figure()
ax1 = fig.add_subplot(211)
ax1.bar(bins[:-1], hist_sum, width=0.1, align='edge') # Change width depending on your bins
# Plot the histogram of all data to check
ax2 = fig.add_subplot(212)
hist_all, bins_out, patches = ax2.hist(data, bins=bins)
fig.savefig('histsplit.png')
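For a dataset too large to hold in memory, the same idea can keep a single running total instead of a list of sub-histograms. The sketch below is one way to do that under stated assumptions: chunked_histogram and iter_chunks are hypothetical helpers, and the in-memory random array only stands in for pieces read from the real file.

```python
import numpy as np


def chunked_histogram(chunks, bins):
    """Accumulate one histogram over an iterable of 1-D sample arrays.

    Hypothetical helper: only one chunk needs to be in memory at a time.
    """
    total = np.zeros(len(bins) - 1, dtype=np.int64)
    for chunk in chunks:
        counts, _ = np.histogram(chunk, bins=bins)  # same fixed bins every call
        total += counts
    return total


def iter_chunks(data, size):
    """Yield consecutive slices of data; stands in for reading a file in pieces."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


rng = np.random.default_rng(0)
data = rng.random(100_000)            # fake data in [0, 1)
bins = np.linspace(0.0, 1.0, 11)      # 10 fixed bins
hist_sum = chunked_histogram(iter_chunks(data, 10_000), bins)
```

In real use, iter_chunks would be replaced by any generator that yields arrays read from the file piece by piece, and hist_sum can be plotted with bar as in the answer above.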

Histogram in Python

I have a list of numbers. The list is like [0,0,1,0,1 .... ]. At present it contains binary digits only, but later on it may contain decimal digits as well. I want to plot a histogram of this sequence.
When I use the standard hist function of the matplotlib library, I get only two bars: it counts all the zeros and all the ones and shows a histogram with two bars. But I want to plot it differently.
I want the number of bars to equal the length of the list, and the height of each bar to equal the value in the list at that position (position = bar number).
Here is the code:
def plot_histogram(self,li_input,):
    binseq = numpy.arange(len(li_input))
    tupl = matplotlib.pyplot.hist(li_input,bins=binseq)
    matplotlib.pyplot.show()
li_input is the list discussed above.
I can do it in a nasty way like:
li_input_mod = []
for x in range(len(li_input)):
    li_input_mod += [x]*li_input[x]
and then plot it, but I want something better.
The behavior you describe is the way a histogram works; it shows you the distribution of values. It sounds to me like you want to create a bar chart:
import matplotlib.pyplot as plt
x = [0,0,1,0,1,1,0,1,1,0,0,0,1]
# One bar per list position; the bar height is the value at that position.
plt.bar(range(len(x)), x, align='center')
plt.show()
which would produce:
