This is a simple but common task: keeping a colormap fixed when displaying a 2D matrix of values.
To demonstrate, consider the problem in Matlab. The solution does not need to be in Matlab; the code presented here is for demonstration purposes only.
x = [0,1,2; 3,4,5; 6,7,8];
imagesc(x)
axis square
axis off
This produces the following output:
When some values rise above the previous maximum, the colormap is rescaled to the new range:
x = [0,1,2; 3,4,5; 6,7,18];
This looks logical, but it causes problems when we wish to compare or trace elements across two maps. Since the association between colors and values has changed, it is almost impossible to find an individual cell for comparison or tracing.
The solution I implemented is to mask the matrix as:
x = [0,1,2; 3,4,5; 6,7,18];
m = 8;
x(x>=m) = m;
which works perfectly.
Since this code requires searching/filtering, which takes extra time, I wonder whether there is a more general or more efficient way to do this in Matlab, Python, etc.
One case where this issue occurs is when we run many simulations sequentially and wish to make a meaningful animation of the progress; in this case each color should keep its association fixed.
In Python, using the Matplotlib package, the solution is as follows:
import pylab as pl
x = [[0,1,2],[3,4,5],[6,7,18]]
pl.matshow(x, vmin=0, vmax=8)
pl.axis('image')
pl.axis('off')
pl.show()
So vmin and vmax are the boundary limits for the full range of the colormap.
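For readers who want to see what vmin and vmax do to the data, the mapping can be sketched with NumPy alone. This is only an illustration of the idea, not Matplotlib's actual implementation, and the helper name normalize_clipped is made up for this example.

```python
import numpy as np

def normalize_clipped(values, vmin, vmax):
    """Map values linearly onto [0, 1], clamping anything outside [vmin, vmax]."""
    values = np.asarray(values, dtype=float)
    scaled = (values - vmin) / (vmax - vmin)
    return np.clip(scaled, 0.0, 1.0)

x = [[0, 1, 2], [3, 4, 5], [6, 7, 18]]
scaled = normalize_clipped(x, vmin=0, vmax=8)
# The outlier 18 lands at the top of the scale, exactly where 8 would be,
# so the other cells keep the positions (and hence colors) they had before.
print(scaled)
```

Because the limits are fixed rather than taken from the data, the same call can be reused for every frame of a sequence without the colors shifting.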
Indexing is pretty quick, so I don't think you need to worry.
However, in Matlab, you can pass in the clims argument to imagesc:
imagesc(x,[0 8]);
This maps all values above 8 to the top colour in the colour scale, and all values below 0 to the bottom colour in the colour scale, and then stretches the scale for colours in-between.
See the imagesc documentation for details.
f1 = figure;
x = [0,1,2; 3,4,5; 6,7,8];
imagesc(x)
axis square
axis off
limits = get(gca(f1),'CLim');
f2 = figure;
z = [0,1,2; 3,4,5; 6,7,18];
imagesc(z)
axis square
axis off
caxis(limits)
I'm trying to use the fastKDE package (https://pypi.python.org/pypi/fastkde/1.0.8) to find the KDE of a point in a 2D plot. However, I want to know the KDE beyond the limits of the data points, and cannot figure out how to do this.
Using the code listed on the site linked above:
#!python
import numpy as np
from fastkde import fastKDE
import pylab as PP
#Generate two random variables dataset (representing 200000 pairs of datapoints)
N = int(2e5)  # size must be an integer, 2e5 alone is a float
var1 = 50*np.random.normal(size=N) + 0.1
var2 = 0.01*np.random.normal(size=N) - 300
#Do the self-consistent density estimate
myPDF,axes = fastKDE.pdf(var1,var2)
#Extract the axes from the axis list
v1,v2 = axes
#Plot contours of the PDF; they should be a set of concentric ellipsoids centered on
#(0.1, -300). Comparatively, the y axis range should be tiny and the x axis range
#should be large
PP.contour(v1,v2,myPDF)
PP.show()
I'm able to find the KDE for any point within the limits of the data, but how do I find the KDE for, say, the point (0,300) without having to include it in var1 and var2? I don't want the KDE to be calculated with this data point; I want to know the KDE at that point.
I guess what I really want to be able to do is give fastKDE a histogram of the data, so that I can set its axes myself. I just don't know if this is possible.
Cheers
I, too, have been experimenting with this code and have run into the same issues. What I've done (in lieu of a good N-D extrapolator) is to build a KDTree (with scipy.spatial) from the grid points that fastKDE returns and find the nearest grid point to the point I want to evaluate. I then look up the corresponding pdf value at that point (it should be small near the edge of the pdf grid, if not identically zero) and assign that value accordingly.
I came across this post while searching for a solution to this problem. Similar to building a KDTree, you could just calculate the step size in every grid dimension, then get the index of your query point by subtracting the beginning of the axis from the point value and dividing by the step size of that dimension. Finally, round the result, convert it to an integer, and voila. For example, in 1D:
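def fastkde_test(test_x):
    # num_p (the number of grid points) is assumed to be defined elsewhere
    kde, axes = fastKDE.pdf(test_x, numPoints=num_p)
    # spacing between neighbouring grid points: N points span N-1 intervals
    x_step = (max(axes)-min(axes)) / (len(axes)-1)
    x_ind = np.int32(np.round((test_x-min(axes)) / x_step))
    return kde[x_ind]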
where test_x in this case is both the set defining the KDE and the query set. Doing it this way is faster by a factor of about 10 in my case (at least in 1D; higher dimensions not yet tested) and does basically the same thing as the KDTree query.
I hope this helps anyone coming across this problem in the future, as I just did.
Edit: if you're querying points outside of the range over which the KDE was calculated, this method can of course only give you the same result as the KDTree query, namely the value at the corresponding border of your KDE grid. You would, however, have to hardcode this by clipping the resulting x_ind to the valid index range, i.e. between 0 and `len(axes)-1`.
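The index arithmetic and the clipping described above can be sketched for the 2D case as follows. This is a minimal sketch that assumes evenly spaced axes and a pdf array indexed as pdf[iy, ix] (the layout that contour(v1, v2, pdf) expects). The grid here is synthetic rather than produced by fastKDE, and the helper names grid_index and lookup_pdf are made up for this example.

```python
import numpy as np

def grid_index(axis, value):
    """Index of the node of an evenly spaced axis nearest to value, clamped to the axis."""
    step = (axis[-1] - axis[0]) / (len(axis) - 1)
    idx = int(np.rint((value - axis[0]) / step))
    return min(max(idx, 0), len(axis) - 1)

def lookup_pdf(pdf, x_axis, y_axis, x, y):
    """Look up pdf[iy, ix] at the grid node nearest to (x, y)."""
    return pdf[grid_index(y_axis, y), grid_index(x_axis, x)]

# Synthetic stand-in for a KDE result (not produced by fastKDE).
x_axis = np.linspace(-1.0, 1.0, 5)   # nodes at -1, -0.5, 0, 0.5, 1
y_axis = np.linspace(0.0, 2.0, 3)    # nodes at 0, 1, 2
pdf = np.arange(15, dtype=float).reshape(3, 5)

inside = lookup_pdf(pdf, x_axis, y_axis, 0.4, 1.1)     # nearest node is (0.5, 1)
outside = lookup_pdf(pdf, x_axis, y_axis, 10.0, -5.0)  # clamped to the corner (1, 0)
print(inside, outside)
```

A point far outside the grid simply returns the value at the nearest border node, which is the same behaviour as the KDTree query; it is not an extrapolation of the density.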
I would like to create a visualization like the upper part of this image. Essentially, a heatmap where each point in time has a fixed number of components but these components are anchored to the y axis by means of labels (that I can supply) rather than by their first index in the heatmap's matrix.
I am aware of pcolormesh, but that does not seem to give me the y-axis functionality I seek.
Lastly, I am also open to solutions in R, although a Python option would be much preferable.
I am not completely sure if I understand your meaning correctly, but by looking at the picture you have linked, you might be best off with a roll-your-own solution.
First, you need to create an array with the heatmap values so that you have one row for each label and one column for each time slot. You fill the array with nans and then write whatever heatmap values you have to the correct positions.
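The array-building step might look like the sketch below. The labels, the number of time slots, and the (time, label, value) records are all made-up data for illustration; only the NaN-fill-then-write pattern comes from the description above.

```python
import numpy as np

labels = ["a", "b", "c", "d"]   # hypothetical y-axis labels, one row each
n_times = 3                     # number of time slots, one column each
# (time index, label, value) triples; made-up data for illustration
records = [(0, "a", 1.0), (0, "c", 2.0), (1, "b", 3.0), (2, "d", 4.0)]

row_of = {label: row for row, label in enumerate(labels)}

# Start with an array full of NaN, then write the known values into place.
heat = np.full((len(labels), n_times), np.nan)
for t, label, value in records:
    heat[row_of[label], t] = value

print(heat)
```

Cells that never receive a value stay NaN, so they can be told apart from real data when the array is drawn.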
Then you need to trick imshow a bit to scale and show the image in the correct way.
For example:
from pylab import *  # the snippet assumes the pylab namespace
# create some masked data
a=cumsum(random.random((20,200)), axis=0)
X,Y=meshgrid(arange(a.shape[1]),arange(a.shape[0]))
a[Y<15*sin(X/50.)]=nan
a[Y>10+15*sin(X/50.)]=nan
# draw the image along with some curves
imshow(a,interpolation='nearest',origin='lower',extent=[-2,2,0,3])
xd = linspace(-2, 2, 200)
yd = 1 + .1 * cumsum(random.random(200)-.5)
plot(xd, yd,'w',linewidth=3)
plot(xd, yd,'k',linewidth=1)
axis('auto')  # older matplotlib versions spelled this axis('normal')
Gives:
Let's say I have two histograms and I set the opacity using the hist parameter alpha=0.5.
I have plotted two histograms, yet I get three colors! I understand this makes sense from an opacity point of view.
But it is very confusing to show someone a graph of two things with three colors. Can I somehow set the smallest bar for each bin to be in front with no opacity?
Example graph
The usual way this issue is handled is to have the plots with some small separation. This is done by default when plt.hist is given multiple sets of data:
import pylab as plt
x = 200 + 25*plt.randn(1000)
y = 150 + 25*plt.randn(1000)
n, bins, patches = plt.hist([x, y])
You instead wish to stack them (this could be done above using the argument histtype='barstacked'), but notice that the ordering is incorrect.
This can be fixed by checking each pair of bars individually to see which is larger and then using zorder to set which one is drawn in front. For simplicity I am using the output of the code above (i.e. n is two stacked arrays holding the number of points in each bin for x and y):
n_x = n[0]
n_y = n[1]
for i in range(len(n[0])):
    if n_x[i] > n_y[i]:
        zorder=1
    else:
        zorder=0
    plt.bar(bins[:-1][i], n_x[i], width=10)
    plt.bar(bins[:-1][i], n_y[i], width=10, color="g", zorder=zorder)
Here is the resulting image:
Changing the ordering like this makes the image look very weird indeed, which is probably why it is not implemented and needs a hack. I would stick with the small-separation method; anyone used to these plots assumes the bars share the same x-value.
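If you do want the per-bin ordering, the comparison can also be done in one vectorised step instead of inside the loop. The sketch below uses made-up data and hand-picked shared bin edges; it only computes which bar should go in front and leaves the drawing itself out.

```python
import numpy as np

x = np.array([1, 1, 2, 3, 3, 3])
y = np.array([1, 2, 2, 2, 3, 4])
edges = np.array([0.5, 1.5, 2.5, 3.5, 4.5])   # shared bin edges for both data sets

n_x, _ = np.histogram(x, bins=edges)
n_y, _ = np.histogram(y, bins=edges)

# True where the x bar is the shorter one and should be drawn in front.
x_in_front = n_x < n_y
# Higher zorder is drawn later, i.e. on top; the shorter bar gets the 1.
zorder_x = np.where(x_in_front, 1, 0)
zorder_y = 1 - zorder_x
print(n_x, n_y, x_in_front)
```

The zorder_x[i] and zorder_y[i] values can then be passed to the two plt.bar calls for bin i. Sharing the bin edges matters: without them the two histograms would not line up bin for bin.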
I am trying to create a filled contour plot in matplotlib (Win7, 1.1.0). I want to highlight certain values, and the levels are closer to log than linear.
There are numerous colormaps that would suit me, but my choice of cmap is ignored.
Do I need to create a custom "normalize"? If so is each contour colored according to its edge value and then filled with the same color to the next lower value? Why is the symptom of this to ignore my color map ... is this some exception during construction that is being caught and my request is being silently ignored?
My original data had missing values. I have played with making these nan, large and small; in each case I have tried both masking and not masking the "outside" values. I have also tried all permutations using the default levels and norm.
lev = [0.1,0.2,0.5,1.0,2.0,4.0,8.0,16.0,32.0]
norml = colors.Normalize(0,32)
cs = plt.contourf(x,z,data,cmap=cm.gray, levels=lev, norm = norml)
I hope this snippet is sufficient to at least start the conversation.
Thanks,
Eli
If I understood you correctly, you need to rescale your data to colors using your levels as the basis rather than default linear scaling. If that's right, then you need to use colors.BoundaryNorm as the norm factor. Consider the following example:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm, colors
x = np.arange(0,8,0.1)
y = np.arange(0,8,0.1)
z = (x[:,None]-4) ** 2 + (y[None,:]-4) ** 2
lev = [0.1,0.2,0.5,1.0,2.0,4.0,8.0,16.0,32.0]
norml = colors.BoundaryNorm(lev, 256)
cs = plt.contourf(x, y, z, cmap = cm.jet, levels = lev, norm = norml)
plt.show()
This yields
Compare it to default Normalize behaviour:
Hope that helps.
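To see why a level-based norm behaves differently from linear scaling, it can help to look at which band each value falls into. The sketch below uses np.digitize to illustrate the idea; it is not BoundaryNorm's actual implementation, and the test values are made up.

```python
import numpy as np

lev = [0.1, 0.2, 0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
values = np.array([0.15, 3.0, 20.0])

# Band k covers lev[k] <= value < lev[k + 1]; there are len(lev) - 1 bands.
band = np.digitize(values, lev) - 1
# Spread the bands evenly over [0, 1] so that each band gets its own color.
fraction = band / (len(lev) - 2)
print(band, fraction)
```

With linear scaling over 0 to 32, the values 0.15 and 3.0 would both sit in the lowest tenth of the colormap and look almost identical. With band-based scaling they land in clearly separate bands.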
How do I invert a color mapped image?
I have a 2D image which plots data on a colormap. I'd like to read the image in and 'reverse' the color map, that is, look up a specific RGB value, and turn it into a float.
For example:
using this image: http://matplotlib.sourceforge.net/_images/mri_demo.png
I should be able to get a 440x360 matrix of floats, knowing the colormap was cm.jet
from pylab import imread
import matplotlib.cm as cm
a=imread('mri_demo.png')
b=colormap2float(a,cm.jet) #<-tricky part
There may be better ways to do this; I'm not sure.
If you read help(cm.jet) you will see the algorithm used to map values in the interval [0,1] to RGB 3-tuples. You could, with a little paper and pencil, work out formulas to invert the piecewise-linear functions which define the mapping.
However, there are a number of issues which make the paper and pencil solution somewhat unappealing:
1. It's a lot of laborious algebra, and the solution is specific for cm.jet. You'd have to do all this work again if you change the color map. How to automate the solving of these algebraic equations is interesting, but not a problem I know how to solve.
2. In general, the color map may not be invertible (more than one value may be mapped to the same color). In the case of cm.jet, values between 0.11 and 0.125 are all mapped to the RGB 3-tuple (0,0,1), for example. So if your image contains a pure blue pixel, there is really no way to tell if it came from a value of 0.11 or a value of, say, 0.125.
3. The mapping from [0,1] to 3-tuples is a curve in 3-space. The colors in your image may not lie perfectly on this curve. There might be round-off error, for example. So any practical solution has to be able to interpolate or somehow project points in 3-space onto the curve.
Due to the non-uniqueness issue, and the projection/interpolation issue, there can be many possible solutions to the problem you pose. Below is just one possibility.
Here is one way to resolve the uniqueness and projection/interpolation issues:
Create a gradient which acts as a "code book". The gradient is an array of RGBA 4-tuples in the cm.jet color map. The colors of the gradient correspond to values from 0 to 1. Use scipy's vector quantization function scipy.cluster.vq.vq to map all the colors in your image, mri_demo.png, onto the nearest color in gradient.
Since a color map may use the same color for many values, the gradient may contain duplicate colors. I leave it up to scipy.cluster.vq.vq to decide which (possibly) non-unique code book index to associate with a particular color.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import scipy.cluster.vq as scv
def colormap2arr(arr,cmap):
    # http://stackoverflow.com/questions/3720840/how-to-reverse-color-map-image-to-scalar-values/3722674#3722674
    gradient=cmap(np.linspace(0.0,1.0,100))
    # Reshape arr to something like (240*240, 4), all the 4-tuples in a long list...
    arr2=arr.reshape((arr.shape[0]*arr.shape[1],arr.shape[2]))
    # Use vector quantization to shift the values in arr2 to the nearest point in
    # the code book (gradient).
    code,dist=scv.vq(arr2,gradient)
    # code is an array of length arr2 (240*240), holding the code book index for
    # each observation. (arr2 are the "observations".)
    # Scale the values so they are from 0 to 1.
    values=code.astype('float')/gradient.shape[0]
    # Reshape values back to (240,240)
    values=values.reshape(arr.shape[0],arr.shape[1])
    values=values[::-1]
    return values
arr=plt.imread('mri_demo.png')
values=colormap2arr(arr,cm.jet)
# Proof that it works:
plt.imshow(values,interpolation='bilinear', cmap=cm.jet,
           origin='lower', extent=[-3,3,-3,3])
plt.show()
The image you see should be close to reproducing mri_demo.png:
(The original mri_demo.png had a white border. Since white is not a color in cm.jet, note that scipy.cluster.vq.vq maps white to the closest point in the gradient code book, which happens to be a pale green color.)
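For readers without scipy at hand, the nearest-code-book-entry idea can be sketched with a brute-force distance computation in NumPy. This swaps scipy.cluster.vq.vq for a plain argmin over squared Euclidean distances. The code book here is a made-up five-step blue-to-red ramp rather than cm.jet, and the helper name nearest_code is hypothetical.

```python
import numpy as np

def nearest_code(pixels, code_book):
    """Index of the closest code book row for each pixel (squared Euclidean distance)."""
    diff = pixels[:, None, :] - code_book[None, :, :]
    return np.argmin((diff ** 2).sum(axis=2), axis=1)

# Made-up code book: a 5-step ramp from blue to red (not cm.jet).
t = np.linspace(0.0, 1.0, 5)
code_book = np.column_stack([t, np.zeros_like(t), 1.0 - t])

# Pixels slightly off the ramp, as they might be after round-off or lossy compression.
pixels = np.array([[0.02, 0.01, 0.97],
                   [0.52, 0.00, 0.49],
                   [0.99, 0.03, 0.00]])

code = nearest_code(pixels, code_book)
# Dividing by len(code_book) - 1 makes the last entry map to exactly 1.0.
values = code / (len(code_book) - 1)
print(code, values)
```

The intermediate array has shape (number of pixels, number of code book entries, channels), so this brute-force version is only practical for small images or small code books.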
Here is a simpler approach, that works for many colormaps, e.g. viridis, though not for LinearSegmentedColormaps such as 'jet'.
The colormaps are stored as lists of [r,g,b] values. For lots of colormaps, this map has exactly 256 entries. A value between 0 and 1 is looked up using its nearest neighbor in the color list. So, you can't get the exact value back, only an approximation.
Some code to illustrate the concepts:
from matplotlib import pyplot as plt
def find_value_in_colormap(tup, cmap):
    # for a cmap like viridis, the result of the colormap lookup is a tuple (r, g, b, a), with a always being 1
    # but the colors array is stored as a list [r, g, b]
    # for some colormaps, the situation is reversed: the lookup returns a list, while the colors array contains tuples
    tup = list(tup)[:3]
    colors = cmap.colors
    if tup in colors:
        ind = colors.index(tup)
    elif tuple(tup) in colors:
        ind = colors.index(tuple(tup))
    else: # tup was not generated by this colormap
        return None
    return (ind + 0.5) / len(colors)
val = 0.3
tup = plt.cm.viridis(val)
print(find_value_in_colormap(tup, plt.cm.viridis))
This prints the approximate value:
0.298828125
being the value corresponding to the color triple.
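The arithmetic behind that number can be checked by hand. The sketch below assumes a 256-entry color list and assumes the forward lookup truncates val * N to an integer index, which is consistent with the value printed above.

```python
n_colors = 256   # size of the color list assumed above
val = 0.3

# Forward lookup: value -> index into the color list.
index = min(int(val * n_colors), n_colors - 1)
# Reverse lookup: the centre of the interval that maps to this index.
recovered = (index + 0.5) / n_colors

print(index, recovered)
```

Every value in the interval that shares an index comes back as the same centre point, so the round trip is off by at most half a step, i.e. 0.5 / 256.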
To illustrate what happens, here is a visualization of the function looking up a color for a value, followed by getting the value corresponding to that color.
from matplotlib import pyplot as plt
import numpy as np
x = np.linspace(-0.1, 1.1, 10000)
y = [find_value_in_colormap(plt.cm.viridis(xi), plt.cm.viridis) for xi in x]
fig, axes = plt.subplots(ncols=3, figsize=(12,4))
for ax in axes.ravel():
    ax.plot(x, x, label='identity: y = x')
    ax.plot(x, y, label='lookup, then reverse')
    ax.legend(loc='best')
axes[0].set_title('overall view')
axes[1].set_title('zoom near x=0')
axes[1].set_xlim(-0.02, 0.02)
axes[1].set_ylim(-0.02, 0.02)
axes[2].set_title('zoom near x=1')
axes[2].set_xlim(0.98, 1.02)
axes[2].set_ylim(0.98, 1.02)
plt.show()
For a colormap with only a few colors, a plot can show the exact position where one color changes to the next. The plot is colored corresponding to the x-values.
Hi unutbu,
Thanks for your reply. I understand the process you explain and have reproduced it. It works very well. I use it to convert IR camera shots into temperature grids, since a picture can easily be reworked/reshaped to suit my purpose using GIMP.
I'm able to create grids of scalars from camera shots, which is really useful in my tasks.
I use a palette file that I create using GIMP + Sample a Gradient Along a Path.
I pick the color bar of my original picture, convert it to a palette, then export it as a hex color sequence.
I read this palette file to create a colormap normalized by a temperature sample, to be used as the code book.
I read the original image and use vector quantization to convert colors back into values.
I slightly improved the Pythonic style of the code by using the code book indices as an index filter into the temperature sample array, and I apply some filter passes to smooth my results.
from numpy import linspace, savetxt
from matplotlib.colors import Normalize, LinearSegmentedColormap
from scipy.cluster.vq import vq
# hex_color_list, flat_image and image_file_name are assumed to be defined beforehand
# sample the values to find from colorbar extremums
vmin = -20.
vmax = 120.
precision = 1.
# parentheses are needed here, and linspace expects an integer count
resolution = int(1 + (vmax-vmin)/precision)
sample = linspace(vmin,vmax,resolution)
# create code_book from sample
cmap = LinearSegmentedColormap.from_list('Custom', hex_color_list)
norm = Normalize()
code_book = cmap(norm(sample))
# quantize colors
indices = vq(flat_image,code_book)[0]
# filter sample from quantization results **(improved)**
values = sample[indices]
# [:-3] keeps the dot of the original extension, so append 'csv' without one
savetxt(image_file_name[:-3]+'csv',values ,delimiter=',',fmt='%-8.1f')
The results are finally exported to a .csv file.
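The step that turns code book indices back into temperatures is plain NumPy indexing, which the sketch below shows in isolation. The 2x2 index array is made up; in the real workflow it would come from the vector quantization step.

```python
import numpy as np

vmin, vmax, precision = -20.0, 120.0, 1.0
n_samples = int(round((vmax - vmin) / precision)) + 1
sample = np.linspace(vmin, vmax, n_samples)   # one temperature per code book entry

# Made-up code book indices for a 2x2 image, as the quantization step
# would return them once reshaped to the image shape.
indices = np.array([[0, 70], [140, 20]])
temperatures = sample[indices]                # fancy indexing keeps the image shape
print(temperatures)
```

Because the sample array and the code book are built from the same points, index k in the code book always corresponds to sample[k], and no extra lookup table is needed.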
The most important thing is to create a representative palette file in order to obtain good precision. I start to get a good gradient (code book) with 12 colors or more.
This process is useful since camera shots sometimes cannot be translated to grayscale easily and linearly.
Thanks to all contributors: unutbu, Rob A, the scipy community ;)
As an added advantage, matplotlib is no longer required, since I integrate my code into existing software.
LinearSegmentedColormap doesn't give me the same interpolation as when I do it manually in my tests, so I prefer to use my own:
import numpy as np

def codeBook(color_list, N=256):
    """
    return N colors interpolated from rgb color list
    !!! workaround to matplotlib colormap to avoid dependency !!!
    """
    # separate r g b channel
    rgb = np.array(color_list).T
    # normalize data points sets
    new_x = np.linspace(0., 1., N)
    x = np.linspace(0., 1., len(color_list))
    # interpolate each color channel
    rgb = [np.interp(new_x, x, channel) for channel in rgb]
    # round elements of the array to the nearest integer.
    return np.rint(np.column_stack( rgb )).astype('int')