Heatmap with matplotlib using matshow - python

I am trying to generate a heatmap of a 10x10 matrix. All values in the matrix are probabilities; sum of all elements equal to 1.0. I decided to use the matshow plot type (it seemed easy to use), however I cannot generate the output I'd like to have so far.
1. Visually it looks kind of ugly. Would you recommend a fitting color map for use in a heatmap?
2. Is there a way to assign predefined bins to the color map when using matshow? E.g. take a gradient of 1000 colors and always use the same colors for the corresponding probabilities. By default, I think matshow checks the minimum and maximum values, assigns the first and last colors in the gradient to those values, then colors the values in between by interpolation.
Sometimes I have very similar probabilities in the matrix, and other times the range of probabilities may be great. Due to the default behavior I tried to explain above, I get similar plots, which makes comparisons harder.
My code for generating the said heat maps (and an example plot) is below by the way.
Thanks!
import numpy as np
import matplotlib
import matplotlib.pyplot as plt

def pickcoord():
    i = np.random.randint(0,10)
    j = np.random.randint(0,10)
    return [i,j]

board = np.zeros((10,10))
for i in range(1000000):
    try:
        direction = np.random.randint(0,2)
        new_board = np.zeros((10,10))
        coords = pickcoord()
        if direction == 1:
            for k in range(2):
                new_board[coords[0]][coords[1]+k] = 1
        else:
            for k in range(2):
                new_board[coords[0]+k][coords[1]] = 1
    except IndexError:
        new_board = np.zeros((10,10))
    board = board + new_board
board_prob = board/np.sum(board)

plt.figure(figsize=(6,6))
plt.matshow(board_prob, cmap=matplotlib.cm.Spectral_r, interpolation='none')
plt.xticks(np.arange(0.5,10.5), [])
plt.yticks(np.arange(0.5,10.5), [])
plt.grid()

Your second problem can be solved using the vmin and vmax arguments of the matshow function:
matshow(board_prob, cmap=cm.Spectral_r, interpolation='none', vmin=0, vmax=1)
Considering your first problem, it depends on what you want to emphasize or display. Choose a fitting colormap from the default colormaps of matplotlib.
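As a minimal sketch of the fixed-scale idea: passing the same vmin and vmax to every matshow call maps a given probability to the same colour in every plot. The two random matrices, the variable names, the "viridis" colormap and the hand-picked upper limit are assumptions for illustration, not part of the question. With 100 cells summing to 1.0, an upper limit well below 1 probably uses the colour range better than vmax=1.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Two hypothetical 10x10 probability matrices (each sums to 1.0)
flat = rng.random((10, 10))
flat /= flat.sum()
peaked = rng.random((10, 10)) ** 4
peaked /= peaked.sum()

# Shared colour limits; the upper bound is an assumption picked by hand
vmin, vmax = 0.0, 0.05

fig, axes = plt.subplots(1, 2, figsize=(10, 5))
images = []
for ax, data in zip(axes, (flat, peaked)):
    # Same vmin/vmax for both plots, so equal values get equal colours
    images.append(ax.matshow(data, cmap="viridis", vmin=vmin, vmax=vmax))
fig.colorbar(images[0], ax=axes, shrink=0.8)
# Call plt.show() or fig.savefig(...) to view the result
```

Values above vmax are drawn in the last colour of the map, so the limit should be chosen to cover the probabilities you expect to compare.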

Related

Use a list to determine matplotlib colours

I am making a basic program using matplotlib which graphs a large number of points, and calculates a value to colour those points. My issue is that as the number of points gets very large, the time it takes to individually plot each point through a for loop also gets very large. Is there any way I can use one plot statement and specify a list to use the colours for each individual point? As an example,
Current method:
colours = [(1,0,0),(0,1,0),(0,1,1)] #The length of these lists is usual in the thousands
x = [0,1,2]
y = [2,1,0]
for i in range(len(colours)):
    plot([x[i]],[y[i]],'o', color = colours[i])
Whereas what I would like to use would be something more like:
plot(x,y,'o', color=colours)
Which would use each colour for each point. Is there any better way to approach this than a for loop?
You do not want to use plot, but scatter.
import matplotlib.pyplot as plt
colours = [(1,0,0),(0,1,0),(0,1,1)]
x = [0,1,2]
y = [2,1,0]
plt.scatter(x,y, c=colours)
plt.show()
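If the colours come from a computed value rather than explicit RGB tuples, scatter can also map scalars through a colormap in a single call. A small sketch, where the random points, the values array and the colormap choice are made up for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
n = 5000
x = rng.random(n)
y = rng.random(n)

# Hypothetical per-point value: distance from the centre of the unit square
values = np.hypot(x - 0.5, y - 0.5)

fig, ax = plt.subplots()
# One call draws all points; c takes one scalar per point, cmap turns it into a colour
sc = ax.scatter(x, y, c=values, cmap="viridis", s=4)
fig.colorbar(sc, ax=ax, label="value")
# Call plt.show() or fig.savefig(...) to view the result
```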

Scale colormap for contour and contourf

I'm trying to plot the contour map of a given function f(x,y), but since the function's output grows really fast, I'm losing a lot of information for lower values of x and y. I found on the forums that this can be worked around using vmax=vmax; it did work, but only when plotting for specific limits of x and y and specific levels of the colormap.
Say I have this plot:
import matplotlib.pyplot as plt
import numpy as np
fig = plt.figure()
u = np.linspace(-2,2,1000)
x,y = np.meshgrid(u,u)
z = (1-x)**2+100*(y-x**2)**2
cont = plt.contour(x,y,z,500,colors='black',linewidths=.3)
cont = plt.contourf(x,y,z,500,cmap="jet",vmax=100)
plt.colorbar(cont)
plt.show()
I want to uncover what's beyond the axis limits while keeping the same scale, but if I change the x and y limits to -3 and 3 I get:
See how I lost most of my levels since my max value for the function at these limits are much higher. A work around to this problem is to increase the levels to 1000, but that takes a lot of computational time.
Is there a way to plot only the contour levels that I need? That is, between 0 and 100.
An example of a desired output would be:
With the white space being the continuation of the plot without resizing the levels.
The code I'm using is the one given after the first image.
There are a few possible ideas here. The one I very much prefer is a logarithmic representation of the data. An example would be
from matplotlib import ticker
fig = plt.figure(1)
cont1 = plt.contourf(x,y,z,cmap="jet",locator=ticker.LogLocator(numticks=10))
plt.colorbar(cont1)
plt.show()
fig = plt.figure(2)
cont2 = plt.contourf(x,y,np.log10(z),100,cmap="jet")
plt.colorbar(cont2)
plt.show()
The first example uses matplotlib's LogLocator. The second one just directly computes the logarithm of the data and plots that normally.
The third example just caps all data above 100.
fig = plt.figure(3)
zcapped = z.copy()
zcapped[zcapped>100]=100
cont3 = plt.contourf(x,y,zcapped,100,cmap="jet")
cbar = plt.colorbar(cont3)
plt.show()
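Another option, not taken from the examples above, is to pass the contour levels explicitly instead of a level count. The sketch below assumes 50 evenly spaced bands between 0 and 100 and a coarser grid than the question uses. With the default extend setting, regions where z lies outside the given levels are simply left unfilled, which should resemble the white area described in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

# Wider domain than the original -2..2; grid size is an arbitrary choice
u = np.linspace(-3, 3, 300)
x, y = np.meshgrid(u, u)
z = (1 - x)**2 + 100 * (y - x**2)**2

# Explicit level boundaries: only values between 0 and 100 get a colour
levels = np.linspace(0, 100, 51)

fig, ax = plt.subplots()
cont = ax.contourf(x, y, z, levels=levels, cmap="jet")
fig.colorbar(cont, ax=ax)
# Call plt.show() or fig.savefig(...) to view the result
```

Because the levels no longer depend on the maximum of z, changing the axis limits leaves the colour scale unchanged.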

How to control the cell size of a pyplot pcolor heatmap?

I have a pair of lists of numbers representing points in a 2-D space, and I want to represent the y/x ratios for these points as a 1-dimensional heatmap, with a diverging color map centered around 1, or the logs of my ratios, with a diverging color map centered around 0.
How do I do that?
My current attempt (borrowing somewhat from Heatmap in matplotlib with pcolor?):
import numpy as np
import matplotlib.pyplot as plt
# There must be a better way to generate arrays of random values
x_values = [np.random.random() for _ in range(10)]
y_values = [np.random.random() for _ in range(10)]
labels = list("abcdefghij")
ratios = np.asarray(y_values) / np.asarray(x_values)
axis = plt.gca()
# I transpose the array to get the points arranged vertically
heatmap = axis.pcolor(np.log2([ratios]).T, cmap=plt.cm.PuOr)
# Put labels left of the colour cells
axis.set_yticks(np.arange(len(labels)) + 0.5, minor=False)
# (Not sure I get the label order correct...)
axis.set_yticklabels(labels)
# I don't want ticks on the x-axis: this has no meaning here
axis.set_xticks([])
plt.show()
Some points I'm not satisfied with:
The coloured cells I obtain are horizontally-elongated rectangles. I would like to control the width of these cells and obtain a column of cells.
I would like to add a legend for the color map. heatmap.colorbar = plt.colorbar() fails with RuntimeError: No mappable was found to use for colorbar creation. First define a mappable such as an image (with imshow) or a contour set (with contourf).
One important point:
matplotlib/pyplot always leaves me confused: there seems to be a lot of ways to do things and I get lost in the documentation. I never know what would be the "clean" way to do what I want: I welcome suggestions of reading material that would help me clarify my very approximative understanding of these things.
Just 2 more lines:
axis.set_aspect('equal') # X scale matches Y scale
plt.colorbar(mappable=heatmap) # Tells plt where it should find the color info.
I can't answer your final question very well. Part of the confusion comes from matplotlib having two ways of doing things: the axis way (axis.do_something...) and the MATLAB clone way (plt.some_plot_method). Unfortunately we can't change that, and it is a good feature for people migrating to matplotlib. As far as the "clean way" is concerned, I prefer to use whatever produces the shorter code. I guess that is in line with the Python mottos: Simple is better than complex and Readability counts.
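Putting the two extra lines together with the question's code, a complete sketch might look like the following. The symmetric vmin/vmax, which centres the diverging colormap on 0, is an addition not found in the answer, and the random data (offset by 0.1 to avoid near-zero denominators) is illustrative only.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(2)
x_values = rng.random(10) + 0.1  # illustrative data, kept away from zero
y_values = rng.random(10) + 0.1
labels = list("abcdefghij")

# One column of cells: shape (10, 1)
log_ratios = np.log2(y_values / x_values).reshape(-1, 1)

# Symmetric limits so that 0 sits at the middle of the diverging colormap
limit = np.abs(log_ratios).max()

fig, ax = plt.subplots()
heatmap = ax.pcolor(log_ratios, cmap="PuOr", vmin=-limit, vmax=limit)
ax.set_aspect("equal")                       # square cells
ax.set_yticks(np.arange(len(labels)) + 0.5)  # tick at the centre of each cell
ax.set_yticklabels(labels)
ax.set_xticks([])
fig.colorbar(heatmap, ax=ax)                 # pass the mappable explicitly
# Call plt.show() or fig.savefig(...) to view the result
```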

Adding a 4th variable to a 3D plot in Python

I have 3 different parameters X,Y and Z over a range of values, and for each combination of these a certain value of V. To make it clearer, the data would look something like this.
X Y Z V
1 1 2 10
1 2 3 15
etc...
I'd like to visualize the data with a surface/contour plot, using V as a colour to see its value at that point, but I do not see how to add my custom colouring scheme into the mix using Python. Any idea on how to do this (or is this visualization outright silly)?
Thanks a lot!
Matplotlib allows one to pass facecolors as an argument to e.g. ax.plot_surface.
That implies that you have to perform 2D interpolation on your current array of colors, because you currently only have the colors at the corners of the rectangular faces (you did mention that you have a rectilinear grid).
You could use scipy.interpolate.interp2d for that, but its documentation suggests using scipy.interpolate.RectBivariateSpline instead (interp2d has also been deprecated in recent SciPy releases).
To give you a simple example:
import numpy as np
y,x = np.mgrid[1:10:10j, 1:10:10j] # returns 2D arrays
# You have 1D arrays that would make a rectangular grid if properly reshaped.
y,x = y.ravel(), x.ravel() # so let's convert to 1D arrays
z = x*(x-y)
colors = np.cos(x**2) - np.sin(y)**2
Now I have a dataset similar to yours (one-dimensional arrays for x, y, z and colors). Note that the colors are defined for each point (x,y). But when you plot with plot_surface, you'll generate rectangular patches, of which the corners are given by those points.
So, on to interpolation then:
from scipy.interpolate import RectBivariateSpline
# from scipy.interpolate import interp2d # could 've used this too, but docs suggest the faster RectBivariateSpline
# Define the points at the centers of the faces:
y_coords, x_coords = np.unique(y), np.unique(x)
y_centers, x_centers = [ arr[:-1] + np.diff(arr)/2 for arr in (y_coords, x_coords)]
# Convert back to a 2D grid, required for plot_surface:
Y = y.reshape(y_coords.size, -1)
X = x.reshape(-1, x_coords.size)
Z = z.reshape(X.shape)
C = colors.reshape(X.shape)
#Normalize the colors to fit in the range 0-1, ready for using in the colormap:
C -= C.min()
C /= C.max()
interp_func = RectBivariateSpline(x_coords, y_coords, C.T, kx=1, ky=1) # the kx, ky define the order of interpolation. Keep it simple, use linear interpolation.
In this last step, you could also have used interp2d (with kind='linear' replacing the kx=1, ky=1), but the docs suggest using the faster RectBivariateSpline.
Now you're ready to plot it:
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
r = ax.plot_surface(X,Y,Z,
                    facecolors=cm.hot(interp_func(x_centers, y_centers).T),
                    rstride=1, cstride=1) # only added because of this very limited dataset
As you can see, the colors on the faces have nothing to do anymore with the height of the dataset.
Note that you could have thought simply passing the 2D array C to facecolors would work, and matplotlib would not have complained. However, the result isn't accurate then, because matplotlib will use only a subset of C for the facecolors (it seems to ignore the last column and last row of C). It is equivalent to using only the color defined by one coordinate (e.g. the top-left) over the entire patch.
An easier method would have been to let matplotlib do the interpolation and obtain the facecolors and then pass those in to the real plot:
r = ax.plot_surface(X,Y,C, cmap='hot') # first plot the 2nd dataset, i.e. the colors
fc = r.get_facecolors()
ax.clear()
ax.plot_surface(X, Y, Z, facecolors=fc)
However, that won't work in releases <= 1.4.1 due to this recently submitted bug.
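A simpler variant of the same idea, offered here as an assumption-laden sketch rather than the method above, skips the spline and takes each face's colour as the mean of its four corner values. That is equivalent to linear interpolation at the face centre on a regular grid. The names v, v_faces and face_colors are hypothetical, and the data reuses the example arrays from earlier in this answer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.colors import Normalize

# Same illustrative grid as above, kept in 2D form
y, x = np.mgrid[1:10:10j, 1:10:10j]
z = x * (x - y)
v = np.cos(x**2) - np.sin(y)**2  # the "4th variable", defined at grid points

# One value per face: average of the four surrounding corners -> shape (9, 9)
v_faces = 0.25 * (v[:-1, :-1] + v[1:, :-1] + v[:-1, 1:] + v[1:, 1:])

# Map values to RGBA colours through a normalisation and a colormap
norm = Normalize(vmin=v.min(), vmax=v.max())
cmap = plt.get_cmap("hot")
face_colors = cmap(norm(v_faces))

fig = plt.figure()
ax = fig.add_subplot(111, projection="3d")
ax.plot_surface(x, y, z, facecolors=face_colors,
                rstride=1, cstride=1, shade=False)

# A stand-alone mappable gives the colorbar the same norm and colormap
mappable = cm.ScalarMappable(norm=norm, cmap=cmap)
mappable.set_array(v)
fig.colorbar(mappable, ax=ax, label="V")
# Call plt.show() or fig.savefig(...) to view the result
```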
It really depends on how you plan on plotting this data. I like to plot graphs with gnuplot: it's easy, free and intuitive. To plot your example with gnuplot, you'd have to print those lines into a file (with only those four columns) and plot it using code like the following:
reset
set terminal png
set output "out.png"
splot "file.txt" using 1:2:3:4 with lines palette
Assuming that you save your data into the file file.txt. splot stands for surface plot. Of course, this is a minimum example.
Alternatively you can use matplotlib, but that is not, in my opinion, as intuitive. Although it has the advantage of centering all the processing in python.

Contour plot in square points

I have 3 data sets, X, Y and Z, which are my axes and my data respectively. They are well defined, i.e.
len(X) = len(Y) = len(Z) = len(Z[i]) = N for i in range(0,N).
I would like to make something similar to a contourf plot (which I have already made), but using discrete axes, like "contour squares", where each square (x,y) has a color given by the Z value (which is a float value).
So far I'm using contourf(X,Y,Z), but it interpolates in ways I don't want; I need a better visualization with squares.
Does anyone know how to do it?
Thanks
You should use matshow or imshow plotting functions.
An important argument here is the interpolation one.
Check this example from the matplotlib gallery to see some examples.
By using matshow(), keyword arguments are passed to imshow().
matshow() sets defaults for origin, interpolation (='nearest'), and aspect.
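As a rough illustration of the "one square per (x, y)" idea with imshow: the sketch below assumes evenly spaced X and Y with unit spacing and uses made-up Z data. The extent argument is there so that each square is centred on its axis value.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import numpy as np
import matplotlib.pyplot as plt

N = 8
X = np.arange(N)  # assumed evenly spaced axis values
Y = np.arange(N)
Z = np.add.outer(Y, X).astype(float)  # hypothetical data, Z[i][j] at (X[j], Y[i])

fig, ax = plt.subplots()
# interpolation='nearest' keeps each cell a flat-coloured square;
# extent places cell centres on the X and Y values (unit spacing assumed)
im = ax.imshow(Z, origin="lower", interpolation="nearest",
               extent=(X[0] - 0.5, X[-1] + 0.5, Y[0] - 0.5, Y[-1] + 0.5))
fig.colorbar(im, ax=ax)
# Call plt.show() or fig.savefig(...) to view the result
```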
here is an example from my own work...
# level, time and conc are previously read from a file
X,Y=[level,time]
Z=conc.transpose() # Create the data to be plotted
cax = matshow(Z, origin='lower', vmin=0, vmax=500)
# I am telling all the Z values above 500 will have the same color
# in the plot (if vmin or vmax are not given, they are taken from
# the input’s minimum and maximum value respectively)
grid(True)
cbar = colorbar(cax)
...which returns this plot:
