I want to plot true/false (active/inactive) binary data, similar to the following picture:
The horizontal axis is time and the vertical axis shows some entities (here, sensors), each of which is either active (white) or inactive (black). How can I plot such a graph using pyplot?
I searched for the name of this kind of graph but couldn't find it.

What you are looking for is imshow:
import matplotlib.pyplot as plt
import numpy as np
# get some data in which each element is True with probability 80 %
data = np.random.random((20, 500)) > .2
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(data, aspect='auto', interpolation='nearest')
Then you will just have to get the Y labels from somewhere.
It seems that the image in your question has some interpolation applied. Let us set a few more things:
import matplotlib.pyplot as plt
import numpy as np
# create a bit more realistic-looking data
# - looks complicated, but just has a constant switch-off and switch-on probabilities
# per column
# - the result is a 20 x 500 array of booleans
p_switchon = 0.02
p_switchoff = 0.05
data = np.empty((20,500), dtype='bool')
data[:,0] = np.random.random(20) < .2
for c in range(1, 500):
    r = np.random.random(20)
    # rows that were on stay on unless they switch off; ~ inverts a boolean array
    data[data[:,c-1],c] = (r > p_switchoff)[data[:,c-1]]
    data[~data[:,c-1],c] = (r < p_switchon)[~data[:,c-1]]
# create some labels
labels = [ "label_{0:d}".format(i) for i in range(20) ]
# this is the real plotting part
fig = plt.figure()
ax = fig.add_subplot(111)
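# assumption: gray colormap with smooth interpolation, to mimic the look of the question's image
ax.imshow(data, aspect='auto', cmap='gray', interpolation='bilinear')
ax.set_yticks(np.arange(len(labels)))
ax.set_yticklabels(labels)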
However, the interpolation is not necessarily a good thing here. To make the different rows easier to separate, one might use colors:
import matplotlib.pyplot as plt
import matplotlib.colors
import numpy as np
# create a bit more realistic-looking data
# - looks complicated, but just has a constant switch-off and switch-on probabilities
# per column
# - the result is a 20 x 500 array of booleans
p_switchon = 0.02
p_switchoff = 0.05
data = np.empty((20,500), dtype='bool')
data[:,0] = np.random.random(20) < .2
for c in range(1, 500):
    r = np.random.random(20)
    # rows that were on stay on unless they switch off; ~ inverts a boolean array
    data[data[:,c-1],c] = (r > p_switchoff)[data[:,c-1]]
    data[~data[:,c-1],c] = (r < p_switchon)[~data[:,c-1]]
# create some labels
labels = [ "label_{0:d}".format(i) for i in range(20) ]
# create a color map with random colors
colmap = matplotlib.colors.ListedColormap(np.random.random((21,3)))
colmap.colors[0] = [0,0,0]
# create some colorful data:
data_color = (1 + np.arange(data.shape[0]))[:, None] * data
# this is the real plotting part
fig = plt.figure()
ax = fig.add_subplot(111)
ax.imshow(data_color, aspect='auto', cmap=colmap, interpolation='nearest')
Of course, you will want to use something less strange as a coloring scheme, but that is really up to your artistic views. The trick here is that in data_color all True elements on row n have the value n+1 and all False elements are 0. This makes it possible to create a color map in which each row gets its own color. Naturally, if you want a cyclic color map with two or three colors, just use the modulus of data_color in imshow, e.g. data_color % 3.
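The modulus idea can be sketched as follows. Note that a plain data_color % 3 sends every third row back to 0, the value reserved for False, so this sketch shifts the cycle by one. That shift is my own assumption, not part of the answer above, and the helper name cyclic_color_index is hypothetical.

```python
import numpy as np

def cyclic_color_index(data, n_colors):
    """Map a boolean (rows x time) array to integer color indices.

    False -> 0 (background); True on row n -> 1 + (n % n_colors).
    """
    data = np.asarray(data, dtype=bool)
    rows = np.arange(data.shape[0])[:, None]   # column vector of row numbers
    return np.where(data, 1 + rows % n_colors, 0)

# Small made-up example: 4 rows, 2 time steps
demo = np.array([[True, False],
                 [True, True],
                 [False, True],
                 [True, True]])
idx = cyclic_color_index(demo, 3)
print(idx)
```

The result can be passed to imshow together with a ListedColormap of n_colors + 1 entries whose first entry is black.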


Is it possible to manipulate the data in a matplotlib histogram using Get and Set?

I have a stacked histogram made using matplotlib. It has, of course, multiple bins (one per sector), and each bin/bar is further segmented into subsectors (stacked histogram).
I'm wondering how I could get the data points, do some math (let's say divide each bin by its total value), and then set the new data points.
How I expect it to work:
import matplotlib.pyplot as plt
ax = plt.subplot(111)
h = ax.hist((subsector1,subsector2,subsector3), bins = 20, stacked=True)
y_data = h.get_yData
The shape of y_data would be something like 20 x 3 (bins x subsectors)
new_y_data = y_data normalized by total on each bin
The shape of new_y_data would also be like 20 x 3, but the sum on each bin would be 1 (or 100%)
new_h = h.set_yData(new_y_data)
new_h would look more like a bar plot, with equal-sized bars but a different subsector distribution on each bar.
Is this even possible in python matplotlib?
When you only want the values, it's easier to use np.histogram which does the same calculations without the need to draw.
When you have values, plt.bar draws them directly without needing plt.hist.
Pandas might be an alternative. Have a look at Creating percentage stacked bar chart using groupby for an example similar to yours.
Here is some example code using np.histogram and plt.bar:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import PercentFormatter
subsector1 = np.clip(np.random.normal(70, 20, 400), 0, 100)
subsector2 = np.clip(np.random.normal(50, 20, 1000), 0, 100)
subsector3 = np.clip(np.random.normal(25, 20, 500), 0, 100)
num_bins = 20
x_min = np.min(np.concatenate([subsector1, subsector2, subsector3]))
x_max = np.max(np.concatenate([subsector1, subsector2, subsector3]))
bounds = np.linspace(x_min, x_max, num_bins + 1)
values = np.zeros((num_bins, 3))
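for i, subsect in enumerate((subsector1, subsector2, subsector3)):
    values[:, i], _ = np.histogram(subsect, bins=bounds)
# normalize each bin so that its subsectors sum to 1
with np.errstate(divide='ignore', invalid='ignore'):
    values /= values.sum(axis=1, keepdims=True)
fig, ax = plt.subplots()
bottom = 0
for i in range(3):
    ax.bar((bounds[:-1] + bounds[1:]) / 2, values[:, i], bottom=bottom, width=np.diff(bounds) * 0.8)
    bottom += values[:, i]
plt.xlim(x_min, x_max)
# assumption: PercentFormatter is imported above but its use is missing here;
# showing the y axis as percentages is the likely intent
ax.yaxis.set_major_formatter(PercentFormatter(1.0))
plt.show()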
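If some bins are empty, the in-place division in the code above yields NaN for those rows; the errstate context only silences the warning. Here is a minimal sketch of one way to keep such bins at zero instead. The helper normalize_rows is hypothetical and not part of the answer.

```python
import numpy as np

def normalize_rows(counts):
    """Divide each row by its sum; rows that sum to zero stay all-zero."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    out = np.zeros_like(counts)
    # only divide where the row total is positive; other entries keep the 0 from `out`
    np.divide(counts, totals, out=out, where=totals > 0)
    return out

# Made-up counts: 3 bins x 3 subsectors, the middle bin is empty
counts = np.array([[2, 1, 1],
                   [0, 0, 0],
                   [0, 3, 1]])
fractions = normalize_rows(counts)
print(fractions)
```

Each non-empty row of fractions sums to 1, so the stacked bars all reach the same height, while empty bins simply draw nothing.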

Setting discrete colormap corresponding to specific data range in Matplotlib

Some background
I have a 2-d array of shape (50,50) whose values range from -40 to 40.
But I want to plot the data in three ranges: [<0], [0,20], [>20].
So I need to generate a colormap corresponding to these three sections.
Here is what I have come up with so far:
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.axes_grid1 import make_axes_locatable
## ratio is the original 2-d array
binlabel = np.zeros_like(ratio)
binlabel[ratio<0] = 1
binlabel[(ratio>=0)&(ratio<=20)] = 2
binlabel[ratio>20] = 3
def discrete_cmap(N, base_cmap=None):
    # build an N-color colormap by sampling the base colormap
    base = plt.get_cmap(base_cmap)
    color_list = base(np.linspace(0, 1, N))
    cmap_name = base.name + str(N)
    return base.from_list(cmap_name, color_list, N)
fig = plt.figure()
ax = plt.gca()
ratio_plot = plt.pcolormesh(binlabel, cmap = discrete_cmap(3, 'jet'))
divider = make_axes_locatable(ax)
cax = divider.append_axes("bottom", size="4%", pad=0.45)
cbar = plt.colorbar(ratio_plot, cax=cax, orientation="horizontal")
labels = [1.35,2,2.65]
loc = labels
cbar.set_ticks(loc)
cbar.ax.set_xticklabels(['< 0', '0~20', '>20'])
Is there a better approach? Any advice would be appreciated.
There are various answers to other questions using ListedColormap and BoundaryNorm, but here's an alternative. I've ignored the placement of your colorbar, as that's not relevant to your question.
You can replace your binlabel calculation with a call to np.digitize() and replace your discrete_cmap() function by using the lut argument to get_cmap(). Also, I find it easier to place the color bounds at .5 midpoints between the indexes rather than scale to awkward fractions of odd numbers:
import matplotlib.colors as mcol
import matplotlib.pyplot as plt
import numpy as np
ratio = np.random.random((50,50)) * 50.0 - 20.0
fig2, ax2 = plt.subplots(figsize=(5,5))
# Turn the data into an array of N bin indexes (i.e., 0, 1 and 2).
bounds = [0,20]
iratio = np.digitize(ratio.flat,bounds).reshape(ratio.shape)
# Create a colormap containing N colors and a Normalizer that defines where
# the boundaries of the colors should be relative to the indexes (i.e., -0.5,
# 0.5, 1.5, 2.5).
cmap = plt.get_cmap("jet",lut=len(bounds)+1)
cmap_bounds = np.arange(len(bounds)+2) - 0.5
norm = mcol.BoundaryNorm(cmap_bounds,cmap.N)
# Plot using the colormap and the Normalizer.
ratio_plot = plt.pcolormesh(iratio,cmap=cmap,norm=norm)
cbar = plt.colorbar(ratio_plot,ticks=[0,1,2],orientation="horizontal")
cbar.set_ticklabels(["< 0","0~20",">20"])
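To see what np.digitize does with the boundary values themselves, here is a small sketch with made-up sample values. With the default right=False a value equal to a bound falls into the bin above it; right=True puts it into the bin below.

```python
import numpy as np

bounds = [0, 20]
samples = np.array([-5.0, 0.0, 10.0, 20.0, 30.0])

# default: bins are closed on the left, i.e. bounds[i-1] <= x < bounds[i]
indexes = np.digitize(samples, bounds)
print(indexes)        # [0 1 1 2 2]

# right=True: bins are closed on the right, i.e. bounds[i-1] < x <= bounds[i]
indexes_right = np.digitize(samples, bounds, right=True)
print(indexes_right)  # [0 0 1 1 2]
```

So with the default setting, exactly 0 is labelled "0~20" and exactly 20 is labelled ">20"; pick the variant that matches how you want the edges treated.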

Python - Randomly subsample a range of points to plot

I have two lists, x and y, that I wish to plot together in a scatter plot.
The lists contain too many data points. I would like a graph with far fewer points. I cannot crop or trim these lists; I need to randomly subsample a set number of points from both of them. What would be the best way to approach this?
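You could subsample the lists using (after converting them to NumPy arrays, so that they can be indexed with an index array):
x, y = np.asarray(x), np.asarray(y)
idx = np.random.choice(np.arange(len(x)), num_samples)
plt.scatter(x[idx], y[idx])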
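Note that np.random.choice samples with replacement by default, so the same point can be drawn more than once. A small sketch that draws distinct indices instead, using made-up x and y arrays and an arbitrary seed:

```python
import numpy as np

rng = np.random.default_rng(seed=0)   # seed chosen only to make the sketch repeatable
x = np.arange(1000)                   # made-up data
y = x ** 2
num_samples = 100

# replace=False guarantees that no index is picked twice
idx = rng.choice(len(x), size=num_samples, replace=False)
x_sub, y_sub = x[idx], y[idx]
print(len(x_sub), len(np.unique(idx)))
```

Using the same idx for both arrays keeps each x value paired with its y value.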
However, this leaves the result a bit up to random luck. We can do better by making a heatmap. plt.hexbin makes this particularly easy:
plt.hexbin(x, y)
Here is an example, comparing the two methods:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
N = 10**5
val1 = np.random.normal(loc=10, scale=2,size=N)
val2 = np.random.normal(loc=0, scale=1, size=N)
fig, ax = plt.subplots(nrows=2, sharex=True, sharey=True)
cmap = plt.get_cmap('jet')
norm = mcolors.LogNorm()
num_samples = 10**4
idx = np.random.choice(np.arange(len(val1)), num_samples)
ax[0].scatter(val1[idx], val2[idx])
im = ax[1].hexbin(val1, val2, gridsize=50, cmap=cmap, norm=norm)
ax[1].set_title('hexbin heatmap')
fig.colorbar(im, ax=ax.ravel().tolist())
You can pick randomly from x and y using a random index mask:
import numpy as np
import matplotlib.pyplot as plt
N = 50
x = np.random.rand(N)
y = np.random.rand(N)
# Build a boolean mask of length N with exactly 10 randomly chosen True entries
subsample = np.zeros(N, dtype=bool)
subsample[np.random.choice(N, 10, replace=False)] = True
plt.scatter(x[subsample], y[subsample])
Alternatively, you can use hist2d to plot a 2D histogram, which shows point densities instead of individual data points:
plt.hist2d(x, y) # No need to subsample
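For reference, the kind of binned counts that such a 2D histogram displays can also be computed without plotting, via np.histogram2d. This is a sketch with made-up normally distributed data; the bin count of 20 is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # arbitrary seed, for repeatability only
x = rng.normal(size=5000)
y = rng.normal(size=5000)

# counts[i, j] is the number of points in x-bin i and y-bin j
counts, xedges, yedges = np.histogram2d(x, y, bins=20)
print(counts.shape, int(counts.sum()))
```

Every point lands in exactly one cell, so the figure stays readable no matter how many points there are.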
You can use random.sample():
import random
max_points = len(x)
# Assuming you only want 50 points.
random_indexes = random.sample(range(max_points), 50)
new_x = [x[i] for i in random_indexes]
new_y = [y[i] for i in random_indexes]
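A variation on the same idea, sketched with made-up lists: sampling the (x, y) pairs directly keeps the two lists in step without building an index list first.

```python
import random

x = list(range(100))                 # made-up data
y = [value * 2 for value in x]

random.seed(0)                       # only to make the sketch repeatable
pairs = random.sample(list(zip(x, y)), 10)   # 10 distinct (x, y) pairs
new_x, new_y = zip(*pairs)
print(new_x)
print(new_y)
```

new_x and new_y are tuples here; wrap them in list() if lists are needed.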

Color-coding a histogram

I have a set of N objects with two properties: x and y.
I would like to depict the distribution of x with a histogram in Matplotlib using hist(). Easy enough. Now, I would like to color-code each bar of the histogram with a color that represents the average value of y in that bin, using a colormap. Is there an easy way to do this? Here, x and y are both N-d numpy arrays. Thanks!
fig = plt.figure()
n, bins, patches = plt.hist(x, 100, density=True, histtype='stepfilled')
plt.setp(patches, 'facecolor', 'g', 'alpha', 0.1)
plt.ylabel('Normalized frequency')
import numpy as np
import matplotlib
import matplotlib.pyplot as plt
# set up the bins
Nbins = 10
bins = np.linspace(0, 1, Nbins +1, endpoint=True)
# get some fake data
x = np.random.rand(300)
y = np.arange(300)
# figure out which bin each x goes into
bin_num = np.digitize(x, bins, right=True) - 1
# compute the counts per bin
hist_vals = np.bincount(bin_num, minlength=Nbins)
# sum the y values by bin; bincount with weights accumulates repeated bin indices
means = np.bincount(bin_num, weights=y, minlength=Nbins)
# take the average
means /= hist_vals
# make the figure/axes objects
fig, ax = plt.subplots(1,1)
# get a color map
my_cmap = plt.get_cmap('jet')
# get normalize function (takes data in range [vmin, vmax] -> [0, 1])
my_norm = matplotlib.colors.Normalize()
# use bar plot; align='edge' places each bar at the left edge of its bin
ax.bar(bins[:-1], hist_vals, color=my_cmap(my_norm(means)), width=np.diff(bins), align='edge')
# make sure the figure is displayed
plt.show()
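A tiny worked example of the per-bin average, using made-up numbers so that the result can be checked by hand. np.bincount with the weights argument sums the y values that fall into each bin.

```python
import numpy as np

bins = np.array([0.0, 0.5, 1.0])          # two bins: [0, 0.5) and [0.5, 1)
x = np.array([0.10, 0.15, 0.60, 0.90])
y = np.array([1.0, 3.0, 10.0, 20.0])

bin_num = np.digitize(x, bins) - 1        # bin index of each x value
counts = np.bincount(bin_num, minlength=len(bins) - 1)
sums = np.bincount(bin_num, weights=y, minlength=len(bins) - 1)
means = sums / counts
print(bin_num, counts, means)
```

The first bin holds y = 1 and 3 (mean 2), the second holds y = 10 and 20 (mean 15); these means are what the colormap would be applied to.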

Shade 'cells' in polar plot with matplotlib

I've got a bunch of regularly distributed points (θ = n*π/6, r=1...8), each having a value in [0, 1]. I can plot them with their values in matplotlib using
polar(thetas, rs, c=values)
But rather than having just a meagre little dot, I'd like to shade the corresponding 'cell' (i.e. everything until halfway to the adjacent points) with the colour corresponding to the point's value:
(Note that here my values are just [0, .5, 1]; in reality they will be anything between 0 and 1.) Is there any straightforward way of realising this (or something close enough) with matplotlib? Maybe it's easier to think about it as a 2D histogram?
This can be done quite nicely by treating it as a polar stacked barchart:
import matplotlib.pyplot as plt
import numpy as np
from random import choice
fig = plt.figure()
ax = fig.add_axes([0.1, 0.1, 0.8, 0.8], polar=True)
for i in range(12*8):
    color = choice(['navy','maroon','lightgreen'])
    # one bar per cell: the angle cycles through 12 sectors, i // 12 selects the ring
    ax.bar(i * 2 * np.pi / 12, 1, width=2 * np.pi / 12, bottom=i // 12,
           color=color, edgecolor = color)
Sure! Just use pcolormesh on a polar axes.
import matplotlib.pyplot as plt
import numpy as np
# Generate some data...
# Note that all of these are _2D_ arrays, so that we can use meshgrid
# You'll need to "grid" your data to use pcolormesh if it's un-ordered points
theta, r = np.mgrid[0:2*np.pi:20j, 0:1:10j]
z = np.random.random(theta.size).reshape(theta.shape)
fig, (ax1, ax2) = plt.subplots(ncols=2, subplot_kw=dict(projection='polar'))
ax1.scatter(theta.flatten(), r.flatten(), c=z.flatten())
ax1.set_title('Scattered Points')
ax2.pcolormesh(theta, r, z)
for ax in [ax1, ax2]:
    ax.set_ylim([0, 1])
If your data isn't already on a regular grid, then you'll need to grid it to use pcolormesh.
It looks like it's on a regular grid from your plot, though. In that case, gridding it is quite simple. If it's already ordered, it may be as simple as calling reshape. Otherwise, a simple loop or exploiting numpy.histogram2d with your z values as weights will do what you need.
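As a rough sketch of the histogram2d route, assuming the layout from the question (θ = n·π/6 for n = 0..11, r = 1..8) and made-up values in shuffled order; the variable names are my own:

```python
import numpy as np

n_theta, n_r = 12, 8
rng = np.random.default_rng(seed=2)       # arbitrary seed, for repeatability only

# Made-up "true" grid of values, flattened and shuffled to mimic unordered points
z_true = rng.random((n_theta, n_r))
k, m = np.meshgrid(np.arange(n_theta), np.arange(1, n_r + 1), indexing="ij")
order = rng.permutation(n_theta * n_r)
theta = (k.ravel() * np.pi / 6)[order]
r = m.ravel().astype(float)[order]
z = z_true.ravel()[order]

# Cell edges halfway between neighbouring points
theta_edges = (np.arange(n_theta + 1) - 0.5) * np.pi / 6
r_edges = np.arange(n_r + 1) + 0.5

# With exactly one point per cell, the weighted histogram just places each z in its cell
z_grid, _, _ = np.histogram2d(theta, r, bins=[theta_edges, r_edges], weights=z)
print(z_grid.shape)
```

The edges can then serve as the cell corners on a polar axes, e.g. ax.pcolormesh(theta_edges, r_edges, z_grid.T). If several points could share a cell, divide by an unweighted histogram2d of the same points to get the mean per cell.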
Well, it's fairly unpolished overall, but here's a version that rounds out the sections.
from matplotlib.pylab import *
ax = subplot(111, projection='polar')
# starts grid and colors
th = array([pi/6 * n for n in range(13)]) # so n = 0..12, allowing for full wrapping
r = array(range(9)) # r = 0..8
c = array([[randint(0, 11)/10 for y in range(th.size)] for x in range(r.size)])  # random values in 0.0..1.0
# The smoothing
TH = cbook.simple_linear_interpolation(th, 10)
# Properly padding out C so the colors go with the right sectors (can't remember the proper word for such segments of wedges)
# A much more elegant version could probably be created using stuff from itertools or functools
C = zeros((r.size, TH.size))
oldfill = 0
TH_ = TH.tolist()
for i in range(th.size):
    fillto = TH_.index(th[i])
    for j, x in enumerate(c[:,i]):
        C[j, oldfill:fillto].fill(x)
    oldfill = fillto
# The plotting
th, r = meshgrid(TH, r)
ax.pcolormesh(th, r, C)

