I have plotted a histogram. But I want to plot discrete lines instead of three bars. Is there any way to do that?
import matplotlib.pyplot as plt
w1 = [-2,-2,-2,-2,0,0,0,1,1,1,1,1,1]
n,bins,patches = plt.hist(w1,bins=10)
plt.xlabel("bins")
plt.ylabel("counts")
plt.show()
If you just want to plot bars with smaller width
Use the argument rwidth, which sets the width of each histogram bar relative to the bin size. Experiment with different values to get different visual results. Example:
w1=[-2,-2,-2,-2,0,0,0,1,1,1,1,1,1]
n,bins,patches=plt.hist(w1,bins=10, rwidth=0.1)
plt.xlabel("bins")
plt.ylabel("counts")
plt.show()
If you actually want to plot lines instead of bars
Loop over each value inside w1 and call plt.plot on a line from XY (value, 0) to XY (value, number of times value appears in w1). Example:
for value in w1:
    plt.plot([value, value], [0, w1.count(value)], color='b')
plt.show()
Note that I've used the argument color='b' so that matplotlib wouldn't make different colors for each line. Also, by default matplotlib adds some whitespace to surrounding lines when we call plt.plot, so you may want to call plt.ylim(bottom=0), so that the bars do not appear to "float" above the plot.
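The loop above redraws the same line once for every repeated value. As a possible alternative, here is a minimal sketch that counts each distinct value once with collections.Counter and draws all lines in a single plt.vlines-style call. The variable names values and heights are my own and not part of the original answer.

```python
from collections import Counter

import matplotlib.pyplot as plt

w1 = [-2, -2, -2, -2, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# Count every distinct value once instead of calling w1.count() per element.
counts = Counter(w1)
values = sorted(counts)                  # distinct x-positions
heights = [counts[v] for v in values]    # number of occurrences of each value

fig, ax = plt.subplots()
# One vertical line per distinct value, from 0 up to its count.
ax.vlines(values, 0, heights, color='b')
ax.set_ylim(bottom=0)                    # keep the lines anchored on the x-axis
ax.set_xlabel("bins")
ax.set_ylabel("counts")
plt.show()
```

For a list this short the difference is cosmetic, but with many repeated values this avoids plotting thousands of overlapping lines.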
Inside the plt.hist(...) call, add the argument rwidth (relative width) with a value below 1; that way you will get narrower bars.
Read more about that here: https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html#matplotlib.pyplot.hist
I would propose to use a stem plot with the counts of the unique elements of the data. Of course this only makes sense for discrete data.
import numpy as np
import matplotlib.pyplot as plt
w1 = [-2,-2,-2,-2,0,0,0,1,1,1,1,1,1]
u, c = np.unique(w1, return_counts=True)
plt.stem(u, c, basefmt="none")  # older Matplotlib versions also accepted use_line_collection=True; recent releases no longer have that argument
plt.ylim(0,None)
plt.xlabel("bins")
plt.ylabel("counts")
plt.show()
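To see what the stem plot is fed, here is a small sketch that prints the two arrays returned by np.unique for the data in the question. It only uses the call already shown above.

```python
import numpy as np

w1 = [-2, -2, -2, -2, 0, 0, 0, 1, 1, 1, 1, 1, 1]

# u holds the sorted distinct values, c the number of times each one occurs.
u, c = np.unique(w1, return_counts=True)

print(u.tolist())  # [-2, 0, 1]
print(c.tolist())  # [4, 3, 6]
```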
I am quite new to python programming. I have a script with me that plots out a heat map using matplotlib. Range of X-axis value = (-180 to +180) and Y-axis value =(0 to 180). The 2D heatmap colours areas in Rainbow according to the number of points occuring in a specified area in the x-y graph (defined by the 'bin' (see below)).
In this case, x = values_Rot and y = values_Tilt (see below for code).
As of now, this script colours the 2D-heatmap in the linear scale. How do I change this script such that it colours the heatmap in the log scale? Please note that I only want to change the heatmap colouring scheme to log-scale, i.e. only the number of points in a specified area. The x and y-axis stay the same in linear scale (not in logscale).
A portion of the code is here.
rot_number = get_header_number(headers, AngleRot)
tilt_number = get_header_number(headers, AngleTilt)
psi_number = get_header_number(headers, AnglePsi)
values_Rot = []
values_Tilt = []
values_Psi = []
for line in data:
    try:
        values_Rot.append(float(line.split()[rot_number]))
        values_Tilt.append(float(line.split()[tilt_number]))
        values_Psi.append(float(line.split()[psi_number]))
    except:
        print ('This line didnt work, it may just be a blank space. The line is:' + line)
# Change the values here if you want to plot something else, such as psi.
# You can also change how the data is binned here.
plt.hist2d(values_Rot, values_Tilt, bins=25,)
plt.colorbar()
plt.show()
plt.savefig('name_of_output.png')
You can use a LogNorm for the colors, using plt.hist2d(...., norm=LogNorm()). Here is a comparison.
To have the ticks in base 2, the developers suggest adding the base to the LogLocator and the LogFormatter. As in this case the LogFormatter seems to write the numbers with one decimal (.0), a StrMethodFormatter can be used to show the number without decimals. Depending on the range of numbers, sometimes the minor ticks (shorter marker lines) also get a string, which can be suppressed assigning a NullFormatter for the minor colorbar ticks.
Note that base 2 and base 10 define exactly the same color transformation. The position and the labels of the ticks are different. The example below creates two colorbars to demonstrate the different look.
import matplotlib.pyplot as plt
from matplotlib.ticker import NullFormatter, StrMethodFormatter, LogLocator
from matplotlib.colors import LogNorm
import numpy as np
from copy import copy
# create some toy data for a standalone example
values_Rot = np.random.randn(100, 10).cumsum(axis=1).ravel()
values_Tilt = np.random.randn(100, 10).cumsum(axis=1).ravel()
fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(15, 4))
cmap = copy(plt.get_cmap('hot'))
cmap.set_bad(cmap(0))
_, _, _, img1 = ax1.hist2d(values_Rot, values_Tilt, bins=40, cmap='hot')
ax1.set_title('Linear norm for the colors')
fig.colorbar(img1, ax=ax1)
_, _, _, img2 = ax2.hist2d(values_Rot, values_Tilt, bins=40, cmap=cmap, norm=LogNorm())
ax2.set_title('Logarithmic norm for the colors')
fig.colorbar(img2, ax=ax2) # default log 10 colorbar
cbar2 = fig.colorbar(img2, ax=ax2) # log 2 colorbar
cbar2.ax.yaxis.set_major_locator(LogLocator(base=2))
cbar2.ax.yaxis.set_major_formatter(StrMethodFormatter('{x:.0f}'))
cbar2.ax.yaxis.set_minor_formatter(NullFormatter())
plt.show()
Note that log(0) is minus infinity. Therefore, the zero values in the left plot (darkest color) are left empty (white background) in the plot with the logarithmic color values. If you just want to use the lowest color for these zeros, you need to set a 'bad' color. In order not to change a standard colormap, the latest matplotlib versions want you to first make a copy of the colormap.
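A small numerical sketch of the two remarks above, using made-up bin counts: LogNorm masks zero counts (which is why they are drawn in the 'bad' color), and the normalized position of a value does not depend on the base of the logarithm. The explicit vmin and vmax are my own choice for illustration.

```python
import numpy as np
from matplotlib.colors import LogNorm

counts = np.array([0.0, 1.0, 10.0, 100.0])  # hypothetical bin counts

norm = LogNorm(vmin=1, vmax=100)
normed = norm(counts)

# The zero count cannot be placed on a log scale, so it comes back masked.
print(np.ma.getmaskarray(normed).tolist())  # [True, False, False, False]

# Position of the non-zero counts between vmin and vmax, in base 2 and base 10.
pos_base2 = np.log2(counts[1:]) / np.log2(100.0)
pos_base10 = np.log10(counts[1:]) / np.log10(100.0)
print(np.allclose(pos_base2, pos_base10))   # True
```

Masked entries are what cmap.set_bad(...) controls, so setting the bad color to cmap(0) paints empty bins with the lowest color.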
PS: When calling plt.savefig() it is important to call it before plt.show() because plt.show() clears the plot.
Also, try to avoid the 'jet' colormap, as it has a bright yellow region which is not at the extreme. It may look nice, but can be very misleading. This blog article contains a thorough explanation. The matplotlib documentation contains an overview of available colormaps.
Note that to compare two plots, plt.subplots() needs to be used, and instead of plt.hist2d, ax.hist2d is needed (see this post). Also, with two colorbars, the elements on which the colorbars are based need to be given as parameter. A minimal change to your code would look like:
from matplotlib.ticker import NullFormatter, StrMethodFormatter, LogLocator
from matplotlib.colors import LogNorm
from matplotlib import pyplot as plt
from copy import copy
# ...
# reading the data as before
cmap = copy(plt.get_cmap('magma'))
cmap.set_bad(cmap(0))
plt.hist2d(values_Rot, values_Tilt, bins=25, cmap=cmap, norm=LogNorm())
cbar = plt.colorbar()
cbar.ax.yaxis.set_major_locator(LogLocator(base=2))
cbar.ax.yaxis.set_major_formatter(StrMethodFormatter('{x:.0f}'))
cbar.ax.yaxis.set_minor_formatter(NullFormatter())
plt.savefig('name_of_output.png') # needs to be called prior to plt.show()
plt.show()
I am trying to plot a cumulative histogram similar to the one shown below. It shows the number of occurrences (y-axis) of the French pronoun “vous” in a text corpus (x-axis) represented from word 0 to 92,633. It’s been created using a corpus analysis application named TXM. TXM’s plots, however, are not adapted to the specific requirements of my publisher. I would like to produce my own plots by exporting the data to python. The problem is that the data exported by TXM is a bit puzzling, and I am wondering how it can be used to make plots:
it’s a one-column txt file with integers.
Each one of them indicates the position of “vous” in the text corpus. Word 2620 is one “vous”, 3376 another one, etc.
One of my attempts with Matplotlib:
from matplotlib import pyplot as plt
pos = [2620,3367,3756,4522,4546,9914,9972,9979,9987,10013,10047,10087,10114,13635,13645,13646,13758,13771,13783,13796,23410,23420,28179,28265,28274,28297,28344,34579,34590,34612,40280,40449,40570,40932,40938,40969,40983,41006,41040,41069,41096,41120,41214,41474,41478,42524,42533,42534,45569,45587,45598,56450,57574,57587]
plt.bar(pos, 1)
plt.show()
But this doesn't come close.
What steps should I follow to complete the plot?
Desired plot:
With matplotlib, you could create the step plot as follows. where='post' means the value changes at every x-position and stays so until the next x-position.
The x-values are the positions in the text, a zero is prepended to let the graph start with zero occurrences. The text-length is appended at the end. The y-values are the numbers 0, 1, 2, ..., where the last value is repeated to draw the last step in full.
from matplotlib import pyplot as plt
from matplotlib.ticker import MultipleLocator, StrMethodFormatter
import numpy as np
pos = [2620,3367,3756,4522,4546,9914,9972,9979,9987,10013,10047,10087,10114,13635,13645,13646,13758,13771,13783,13796,23410,23420,28179,28265,28274,28297,28344,34579,34590,34612,40280,40449,40570,40932,40938,40969,40983,41006,41040,41069,41096,41120,41214,41474,41478,42524,42533,42534,45569,45587,45598,56450,57574,57587]
text_len = 92633
cum = np.arange(0, len(pos) + 1)
fig, ax = plt.subplots(figsize=(12, 3))
ax.step([0] + pos + [text_len], np.pad(cum, (0, 1), 'edge'), where='post', label=f'vous {len(pos)}')
ax.xaxis.set_major_locator(MultipleLocator(5000)) # x-ticks every 5000
ax.xaxis.set_major_formatter(StrMethodFormatter('{x:,.0f}')) # use the thousands separator
ax.yaxis.set_major_locator(MultipleLocator(5)) # have a y-tick every 5
ax.grid(True, ls=':') # show a grid with dotted lines (the keyword b= is not accepted by recent matplotlib versions)
ax.autoscale(enable=True, axis='x', tight=True) # disable padding x-direction
ax.set_xlabel(f'T={text_len:,d}')
ax.set_ylabel('Occurrences')
ax.set_title("Progression of 'vous' in TCN")
plt.legend() # add a legend (uses the label of ax.step)
plt.tight_layout()
plt.show()
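To make the construction of the x- and y-values concrete, here is a tiny sketch with three made-up positions in a made-up text of 12 words. It prints the arrays that would be passed to ax.step.

```python
import numpy as np

pos = [3, 5, 9]   # hypothetical positions of three occurrences
text_len = 12     # hypothetical length of the text

# x: start at 0, one entry per occurrence, end at the text length.
x = [0] + pos + [text_len]

# y: 0, 1, 2, ... occurrences so far; the last count is repeated once
# so that the final step extends to the end of the text.
cum = np.arange(0, len(pos) + 1)
y = np.pad(cum, (0, 1), 'edge')

print(x)           # [0, 3, 5, 9, 12]
print(y.tolist())  # [0, 1, 2, 3, 3]
```

Both sequences have the same length, which is what ax.step requires.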
I am trying to plot a data and function with matplotlib 2.0 under python 2.7.
The x values of the function evolve with time: x first decreases to a certain value, then increases again.
If the function is plotted against time, it looks like this (plot of data against time):
I need the same x axis evolution for plotting against real x values. Unfortunately as the x values are the same for both parts before and after, both values are mixed together. This gives me the wrong data plot:
In this example it means I need the x-axis to start at 2.4, decrease to 1.0, then increase again to 2.4. I swear I found before that this is possible, but unfortunately I can't find any trace of it again.
A matplotlib axis is by default linearly increasing. More importantly, there must be an injective mapping of the number line to the axis units. So changing the data range is not really an option (at least when the aim is to keep things simple).
It would hence be good to keep the original numbers and only change the ticks and ticklabels on the axis. E.g. you could use a FuncFormatter to map the original numbers to
np.abs(x-tp)+tp
where tp would be the turning point.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
x = np.linspace(-10,20,151)
y = np.exp(-(x-5)**2/19.)
plt.plot(x,y)
tp = 5
fmt = lambda x,pos:"{:g}".format(np.abs(x-tp)+tp)
plt.gca().xaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(fmt))
plt.show()
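For data like that in the question, where the real x values fall to a turning point and rise again, one way to get the "original numbers" is to mirror the falling part around the turning point. The following is only a sketch under that assumption; x_real, x_plot and labels are illustrative names and the sample values are made up.

```python
import numpy as np

# Hypothetical x values: decreasing to the turning point, then increasing.
x_real = np.array([2.4, 1.7, 1.0, 1.7, 2.4])

tp = x_real.min()               # turning point
i = int(np.argmin(x_real))      # index where the turning point is reached

# Mirror the decreasing part around tp so the plotting coordinate is monotonic.
x_plot = np.concatenate([2 * tp - x_real[:i], x_real[i:]])

# The formatter formula maps the plotting coordinate back to the real value.
labels = np.abs(x_plot - tp) + tp

print(np.all(np.diff(x_plot) > 0))  # True
print(np.allclose(labels, x_real))  # True
```

Plotting y against x_plot and installing the FuncFormatter from the answer would then label the ticks with the real x values.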
One option would be to use two axes, and plot your two timespans separately on each axes.
For instance, if you have the following data:
import numpy as np
import matplotlib.pyplot as plt
myX = np.linspace(1,2.4,100)
myY1 = -1*myX
myY2 = -0.5*myX-0.5
plt.plot(myX,myY1, c='b')
plt.plot(myX,myY2, c='g')
You can instead create two subplots with a shared y-axis and no space between the two axes, plot each time span independently, and finally adjust the limits of one of the x-axes to reverse the order of the points:
fig, (ax1,ax2) = plt.subplots(1,2, gridspec_kw={'wspace':0}, sharey=True)
ax1.plot(myX,myY1, c='b')
ax2.plot(myX,myY2, c='g')
ax1.set_xlim((2.4,1))
ax2.set_xlim((1,2.4))
My program produces two arrays and I have to plot one of them in the X axis and the other one on the Y axis (the latter are taken from the row of a matrix).
The problem is that I have to repeat this operation for a number of times (I am running a loop) but all the graphs should be on the same plot. Every time the dots should be of a different colour. Then I should save the file.
I have tried with
for row in range(6):
    plt.plot(betaArray, WabArray[row], 'ro')
    plt.show()
but this only shows one plot each for every iteration and always of the same colour.
You could try something like this:
import numpy as np
import matplotlib.pylab as plt
import matplotlib as mpl
x = [1,2,3,4]
y_mat = np.array([[1,2,3,4], [5,6,7,8]])
n, _ = y_mat.shape
colors = mpl.cm.rainbow(np.linspace(0, 1, n))
fig, ax = plt.subplots()
for color, y in zip(colors, y_mat):
    ax.scatter(x, y, color=color)
plt.show()
This creates n colors from the rainbow color map and uses scatter to plot the points in the respective color. You may want to switch to a different color map or even choose the colors manually.
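Applied to the six-row case from the question, the same idea could look like the sketch below. The contents of betaArray and WabArray are random placeholders, and the output file name is hypothetical. The key points are that all rows are drawn on one axes, and that saving and showing happen once, after the loop.

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder data standing in for the arrays produced by the program.
rng = np.random.default_rng(0)
betaArray = np.linspace(0.0, 1.0, 20)
WabArray = rng.random((6, 20))

# One RGBA color per row, sampled evenly from the rainbow colormap.
colors = plt.get_cmap("rainbow")(np.linspace(0, 1, len(WabArray)))

fig, ax = plt.subplots()
for row, (color, y) in enumerate(zip(colors, WabArray)):
    ax.plot(betaArray, y, "o", color=color, label=f"row {row}")
ax.legend()

fig.savefig("all_rows.png")  # hypothetical file name; save before showing
plt.show()
```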
I have a figure that consists of an image displayed by imshow(), a contour and a vector field set by quiver(). I have colored the vector field based on another scalar quantity. On the right of my figure, I have made a colorbar(). This colorbar() represents the values displayed by imshow() (which can be positive and negative in my case). I'd like to know how I could setup another colorbar which would be based on the values of the scalar quantity upon which the color of the vectors is based. Does anyone know how to do that?
Here is an example of the image I've been able to make. Notice that the colors of the vectors go from blue to red. According to the current colorbar, blue means negative. However I know that the quantity represented by the color of the vector is always positive.
Simply call colorbar twice, right after each plotting call. Pylab will create a new colorbar matching the latest plot. Note that, as in your example, the quiver values range from 0 to 1 while the imshow values can be negative. For clarity (not shown in this example), I would use different colormaps to distinguish the two types of plots.
import numpy as np
import pylab as plt
# Create some sample data
dx = np.linspace(0,1,20)
X,Y = np.meshgrid(dx,dx)
Z = X**2 - Y
Z2 = X
plt.imshow(Z)
plt.colorbar()
plt.quiver(X,Y,Z2,width=.01,linewidth=1)
plt.colorbar()
plt.show()
Running quiver doesn't necessarily return the type of mappable object that colorbar() requires. I think it might be because I explicitly "have colored the vector field based on another scalar quantity" like Heimdall says they did. Therefore, Hooked's answer didn't work for me.
I had to create my own mappable for the color bar to read. I did this by using Normalize from matplotlib.colors on the data that I wanted to use to color my quiver vectors (which I'll call C, which is an array of the same shape as X, Y, U, and V.)
My quiver call looks like this:
import matplotlib.pyplot as pl
import matplotlib.cm as cm
import matplotlib.colors as mcolors
import matplotlib.colorbar as mcolorbar
pl.figure()
nz = mcolors.Normalize()
nz.autoscale(C)
pl.quiver(X, Y, U, V, color=cm.jet(nz(C)))
cax,_ = mcolorbar.make_axes(pl.gca())
cb = mcolorbar.ColorbarBase(cax, cmap=cm.jet, norm=nz)
cb.set_label('color data meaning')
Giving any other arguments to the colorbar function gave me a variety of errors.
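If the scalar quantity is passed to quiver as the fifth positional argument (instead of precomputed colors via color=...), the returned Quiver object carries the data and colormap itself, so it can be handed to colorbar directly. The following is a minimal sketch with made-up fields; Z, U, V, C and the colormap names are assumptions for illustration, not the data from the question.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up sample data.
dx = np.linspace(0, 1, 20)
X, Y = np.meshgrid(dx, dx)
Z = X**2 - Y                          # image values, positive and negative
U, V = np.cos(3 * X), np.sin(3 * Y)   # vector field components
C = np.hypot(U, V)                    # non-negative scalar used to color the arrows

fig, ax = plt.subplots()
im = ax.imshow(Z, extent=(0, 1, 0, 1), origin="lower", cmap="coolwarm")
q = ax.quiver(X, Y, U, V, C, cmap="viridis")

# Pass each plotted object explicitly so every colorbar describes the right data.
fig.colorbar(im, ax=ax, label="image values")
fig.colorbar(q, ax=ax, label="vector color values")
plt.show()
```

Using two visibly different colormaps, as suggested in the first answer, makes it easier to tell which colorbar belongs to which layer.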