Zero-value colour in matplotlib hexbin - python

I have some spatially-distributed data. I'm plotting this with matplotlib.pyplot.hexbin and would like to change the "background" (i.e. zero-value) colour. An example is shown below - my colour-map of choice is matplotlib.cm.jet:
How can I change the base colour from blue to white? I have done something similar with masked arrays when using pcolormesh, but I can't see any way of doing so in the hexbin arguments. My instinct would be to edit the colourmap itself, but I've not had much experience with that.
I'm using matplotlib v.0.99.1.1

hexbin(x,y,mincnt=1) should do the trick. Essentially, you only want to display the hexagons with at least 1 count in them; empty hexagons are then not drawn, so the axes background shows through.
from numpy import linspace
from numpy.random import normal
from pylab import hexbin,show
n = 2**6
x = linspace(-1,1,n)
y = normal(0,1,n)
h = hexbin(x,y,gridsize=10,mincnt=0)
gives,
and h = hexbin(x,y,gridsize=10,mincnt=1) gives,
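To see what mincnt changes, one can compare the per-hexagon counts returned by the two calls. The following is a minimal sketch, assuming an off-screen Agg backend and a fixed random seed (both are choices made here, not part of the answer above); get_array() on the returned collection holds the count of each drawn hexagon.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display needed
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.RandomState(0)  # fixed seed, only so the sketch is repeatable
n = 2**6
x = np.linspace(-1, 1, n)
y = rng.normal(0, 1, n)

fig, (ax_all, ax_min) = plt.subplots(ncols=2)
h_all = ax_all.hexbin(x, y, gridsize=10)            # empty hexagons get the lowest colour
h_min = ax_min.hexbin(x, y, gridsize=10, mincnt=1)  # empty hexagons are not drawn

counts_all = h_all.get_array()   # one count per drawn hexagon, zeros included
counts_min = h_min.get_array()   # only hexagons holding at least one point
print(len(counts_all), len(counts_min))
print(counts_min.min())
```

If a white background is wanted, the hexagons that are left out simply expose the axes face colour, which is white by default.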

Related

How to get colour value from cmap with RGB tuple? [duplicate]

How do I invert a color mapped image?
I have a 2D image which plots data on a colormap. I'd like to read the image in and 'reverse' the color map, that is, look up a specific RGB value, and turn it into a float.
For example:
using this image: http://matplotlib.sourceforge.net/_images/mri_demo.png
I should be able to get a 440x360 matrix of floats, knowing the colormap was cm.jet
from pylab import imread
import matplotlib.cm as cm
a=imread('mri_demo.png')
b=colormap2float(a,cm.jet) #<-tricky part
There may be better ways to do this; I'm not sure.
If you read help(cm.jet) you will see the algorithm used to map values in the interval [0,1] to RGB 3-tuples. You could, with a little paper and pencil, work out formulas to invert the piecewise-linear functions which define the mapping.
However, there are a number of issues which make the paper and pencil solution somewhat unappealing:
It's a lot of laborious algebra, and the solution is specific for cm.jet. You'd have to do all this work again if you change the color map. How to automate the solving of these algebraic equations is interesting, but not a problem I know how to solve.
In general, the color map may not be invertible (more than one value may be mapped to the same color). In the case of cm.jet, values between 0.11 and 0.125 are all mapped to the RGB 3-tuple (0,0,1), for example. So if your image contains a pure blue pixel, there is really no way to tell if it came from a value of 0.11 or a value of, say, 0.125.
The mapping from [0,1] to 3-tuples is a curve in 3-space. The colors in your image may not lie perfectly on this curve. There might be round-off error, for example. So any practical solution has to be able to interpolate or somehow project points in 3-space onto the curve.
Due to the non-uniqueness issue, and the projection/interpolation issue, there can be many possible solutions to the problem you pose. Below is just one possibility.
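The non-uniqueness point can be checked directly. This is a small sketch, assuming cm.jet with its default 256-entry lookup table; the two sample values 0.115 and 0.120 are picked here only because they fall inside the flat stretch mentioned above.

```python
import numpy as np
import matplotlib.cm as cm

# Two different scalar values inside the flat stretch of cm.jet
color_a = cm.jet(0.115)
color_b = cm.jet(0.120)
print(color_a, color_b)

# Both come back as the same RGBA tuple, so the mapping cannot be inverted there
same_color = np.allclose(color_a, color_b)
print(same_color)
```

Any inversion scheme therefore has to pick one representative value for such a colour.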
Here is one way to resolve the uniqueness and projection/interpolation issues:
Create a gradient which acts as a "code book". The gradient is an array of RGBA 4-tuples in the cm.jet color map. The colors of the gradient correspond to values from 0 to 1. Use scipy's vector quantization function scipy.cluster.vq.vq to map all the colors in your image, mri_demo.png, onto the nearest color in gradient.
Since a color map may use the same color for many values, the gradient may contain duplicate colors. I leave it up to scipy.cluster.vq.vq to decide which (possibly) non-unique code book index to associate with a particular color.
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import scipy.cluster.vq as scv
def colormap2arr(arr,cmap):
    # http://stackoverflow.com/questions/3720840/how-to-reverse-color-map-image-to-scalar-values/3722674#3722674
    gradient=cmap(np.linspace(0.0,1.0,100))
    # Reshape arr to something like (240*240, 4), all the 4-tuples in a long list...
    arr2=arr.reshape((arr.shape[0]*arr.shape[1],arr.shape[2]))
    # Use vector quantization to shift the values in arr2 to the nearest point in
    # the code book (gradient).
    code,dist=scv.vq(arr2,gradient)
    # code is an array of length arr2 (240*240), holding the code book index for
    # each observation. (arr2 are the "observations".)
    # Scale the values so they are from 0 to 1.
    values=code.astype('float')/gradient.shape[0]
    # Reshape values back to (240,240)
    values=values.reshape(arr.shape[0],arr.shape[1])
    values=values[::-1]
    return values
arr=plt.imread('mri_demo.png')
values=colormap2arr(arr,cm.jet)
# Proof that it works:
plt.imshow(values,interpolation='bilinear', cmap=cm.jet,
           origin='lower', extent=[-3,3,-3,3])
plt.show()
The image you see should be close to reproducing mri_demo.png:
(The original mri_demo.png had a white border. Since white is not a color in cm.jet, note that scipy.cluster.vq.vq maps white to the closest point in the gradient code book, which happens to be a pale green color.)
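One way to avoid assigning a value to pixels such as the white border is to use the distances that scipy.cluster.vq.vq returns alongside the codes. The sketch below is an illustration under assumptions: the pixels are synthetic (colours generated from cm.jet plus one white pixel) rather than read from mri_demo.png, and the tolerance of 0.1 is an arbitrary choice that would need tuning for a real image.

```python
import numpy as np
import matplotlib.cm as cm
import scipy.cluster.vq as scv

# Code book: the 256 colours of cm.jet, paired with the scalar each one stands for
levels = np.linspace(0.0, 1.0, 256)
gradient = cm.jet(levels)                      # shape (256, 4), RGBA

# Stand-in for image pixels: colours produced by cm.jet, plus one white pixel
true_values = np.linspace(0.2, 0.8, 12)
pixels = np.vstack([cm.jet(true_values), [[1.0, 1.0, 1.0, 1.0]]])

# vq returns the nearest code book index and the distance to that entry
code, dist = scv.vq(pixels, gradient)

tolerance = 0.1                                # assumption: tune per image
not_in_cmap = dist > tolerance                 # far from every colour in the code book
recovered = np.ma.masked_where(not_in_cmap, levels[code])
print(recovered)
```

Masked entries can then be left blank when the recovered array is plotted, instead of showing up as pale green.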
Here is a simpler approach, that works for many colormaps, e.g. viridis, though not for LinearSegmentedColormaps such as 'jet'.
The colormaps are stored as lists of [r,g,b] values. For lots of colormaps, this map has exactly 256 entries. A value between 0 and 1 is looked up using its nearest neighbor in the color list. So, you can't get the exact value back, only an approximation.
Some code to illustrate the concepts:
from matplotlib import pyplot as plt
def find_value_in_colormap(tup, cmap):
    # for a cmap like viridis, the result of the colormap lookup is a tuple (r, g, b, a), with a always being 1
    # but the colors array is stored as a list [r, g, b]
    # for some colormaps, the situation is reversed: the lookup returns a list, while the colors array contains tuples
    tup = list(tup)[:3]
    colors = cmap.colors
    if tup in colors:
        ind = colors.index(tup)
    elif tuple(tup) in colors:
        ind = colors.index(tuple(tup))
    else: # tup was not generated by this colormap
        return None
    return (ind + 0.5) / len(colors)
val = 0.3
tup = plt.cm.viridis(val)
print(find_value_in_colormap(tup, plt.cm.viridis))
This prints the approximate value:
0.298828125
being the value corresponding to the color triple.
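The printed number can be reproduced by hand. This is a worked sketch, assuming the forward lookup truncates the scaled value to an integer index (an assumption that is consistent with the output above); the variable names are illustrative.

```python
n_colors = 256            # length of the colour list, as stated above
val = 0.3

# Assumption: the forward lookup truncates val * n_colors to an integer index
ind = int(val * n_colors)

# The reverse lookup reports the midpoint of the bin that index covers
approx = (ind + 0.5) / n_colors
print(ind, approx)

# Every value inside this bin would be reported as the same approximation
bin_low, bin_high = ind / n_colors, (ind + 1) / n_colors
print(bin_low, bin_high)
```

So the round trip is accurate to within half a bin width, 0.5 / 256, for values inside [0, 1].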
To illustrate what happens, here is a visualization of the function looking up a color for a value, followed by getting the value corresponding to that color.
from matplotlib import pyplot as plt
import numpy as np
x = np.linspace(-0.1, 1.1, 10000)
y = [find_value_in_colormap(plt.cm.viridis(xi), plt.cm.viridis) for xi in x]
fig, axes = plt.subplots(ncols=3, figsize=(12,4))
for ax in axes.ravel():
    ax.plot(x, x, label='identity: y = x')
    ax.plot(x, y, label='lookup, then reverse')
    ax.legend(loc='best')
axes[0].set_title('overall view')
axes[1].set_title('zoom near x=0')
axes[1].set_xlim(-0.02, 0.02)
axes[1].set_ylim(-0.02, 0.02)
axes[2].set_title('zoom near x=1')
axes[2].set_xlim(0.98, 1.02)
axes[2].set_ylim(0.98, 1.02)
plt.show()
For a colormap with only a few colors, a plot can show the exact position where one color changes to the next. The plot is colored corresponding to the x-values.
Hi unutbu,
Thanks for your reply. I understand the process you explain and have reproduced it. It works very well; I use it to convert IR camera shots into temperature grids, since a picture can easily be reworked/reshaped for my purpose using GIMP.
I'm able to create grids of scalars from camera shots, which is really useful in my tasks.
I use a palette file that I create using GIMP + Sample a Gradient Along a Path.
I pick the color bar of my original picture, convert it to a palette, then export it as a hex color sequence.
I read this palette file to create a colormap normalized by a temperature sample to be used as the code book.
I read the original image and use vector quantization to reverse colors into values.
I slightly improved the pythonic style of the code by using the code book indices as an index filter into the temperature sample array, and I apply some filter passes to smooth my results.
from numpy import linspace, savetxt
from matplotlib.colors import Normalize, LinearSegmentedColormap
from scipy.cluster.vq import vq
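# sample the values to find from colorbar extremums
# (hex_color_list, flat_image and image_file_name are assumed to be defined earlier)
vmin = -20.
vmax = 120.
precision = 1.
# parenthesize the range and cast to int: linspace needs an integer number of samples
resolution = int(1 + (vmax - vmin) / precision)
sample = linspace(vmin,vmax,resolution)
# create code_book from sample
cmap = LinearSegmentedColormap.from_list('Custom', hex_color_list)
norm = Normalize()
code_book = cmap(norm(sample))
# quantize colors
indices = vq(flat_image,code_book)[0]
# filter sample from quantization results **(improved)**
values = sample[indices]
# strip the dot and a three-letter extension such as ".png" before appending ".csv"
savetxt(image_file_name[:-4]+'.csv',values ,delimiter=',',fmt='%-8.1f')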
The results are finally exported in .csv
The most important thing is to create a representative palette file in order to obtain good precision. I start to get a good gradient (code book) with 12 colors or more.
This process is useful since sometimes camera shots cannot be translated to gray-scale easily and linearly.
Thanks to all contributors unutbu, Rob A, scipy community ;)
In my tests, LinearSegmentedColormap doesn't give me the same interpolation as doing it manually, so I prefer to use my own:
As an advantage, matplotlib is no longer required, since I integrate my code within existing software.
import numpy as np
def codeBook(color_list, N=256):
    """
    return N colors interpolated from rgb color list
    !!! workaround to matplotlib colormap to avoid dependency !!!
    """
    # separate r g b channel
    rgb = np.array(color_list).T
    # normalize data points sets
    new_x = np.linspace(0., 1., N)
    x = np.linspace(0., 1., len(color_list))
    # interpolate each color channel
    rgb = [np.interp(new_x, x, channel) for channel in rgb]
    # round elements of the array to the nearest integer.
    return np.rint(np.column_stack( rgb )).astype('int')
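The whole palette-to-temperature workflow described above can be sketched without matplotlib or scipy. Everything specific here is an assumption made for illustration: the three-colour hex palette, the helper names hex_to_rgb and code_book, the three test pixels, and the plain NumPy nearest-neighbour search that stands in for scipy.cluster.vq.vq.

```python
import numpy as np

def hex_to_rgb(hex_color):
    """Convert a '#rrggbb' string to a list of three 0-255 integers."""
    h = hex_color.lstrip('#')
    return [int(h[i:i + 2], 16) for i in (0, 2, 4)]

def code_book(color_list, n=256):
    """Return n RGB colours linearly interpolated from color_list."""
    rgb = np.array(color_list, dtype=float).T
    new_x = np.linspace(0., 1., n)
    x = np.linspace(0., 1., len(color_list))
    return np.column_stack([np.interp(new_x, x, channel) for channel in rgb])

# Hypothetical palette exported from the colour bar, coldest colour first
palette = ['#0000ff', '#00ff00', '#ff0000']
vmin, vmax, precision = -20., 120., 1.
resolution = int(1 + (vmax - vmin) / precision)   # 141 temperature samples

sample = np.linspace(vmin, vmax, resolution)
book = code_book([hex_to_rgb(c) for c in palette], n=resolution)

# Hypothetical flattened image: one RGB row per pixel
pixels = np.array([[0, 0, 255], [0, 255, 0], [255, 0, 0]], dtype=float)

# Nearest code book entry per pixel (squared Euclidean distance in RGB space)
d2 = ((pixels[:, None, :] - book[None, :, :]) ** 2).sum(axis=2)
indices = d2.argmin(axis=1)
values = sample[indices]
print(values)
```

The broadcasted distance matrix holds one entry per pixel and code book colour, so for large images a chunked loop or vq itself is likely the more practical choice.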

How can I highlight a dot in a cloud of dots with Matplotlib?

I'm using Matplotlib and my goal is highlighting some points in a scatterplot.
I used the following code:
colors = {'true':'red', 'false':'blue'}
plt.scatter(data[T[j]], data[T[i]], c=data['upgrade'].apply(lambda x: colors[x]))
This code makes my dots red if the condition is "true", and blue otherwise.
I had no problems until I ran into the following case:
20k dots and just 1 is TRUE.
The plot that I obtained doesn't show my only red point, because it is hidden in a cloud of blue dots (2k) and just one should be red.
My question is whether there is some way to show my only red dot or, in general, to make the red dots stand out more than the blue ones.
Thank you.
You can take advantage of numpy's indexing arrays with array conditions. And then efficiently call scatter twice. The ones you want on top you call last.
Additional tricks include using a less intense blue, and playing with the size of the dots. (Note that the dot size is relative to its area, not its diameter.)
import numpy as np
import matplotlib.pyplot as plt
N = 2000
x = np.random.rand(N)
y = np.random.rand(N)
z = np.random.rand(N)
c = np.where(z < 0.001, 1, 0)
plt.scatter(x[c==0], y[c==0], c='#2c7bb6', s=10)
plt.scatter(x[c==1], y[c==1], c='#ff0000', s=80)
plt.show()

How to normalise plotted points and get a circle?

Given 2000 random points in a unit circle (using numpy.random.normal(0,1)), I want to normalize them such that the output is a circle, how do I do that?
I was requested to show my efforts. This is part of a larger question: Write a program that samples 2000 points uniformly from the circumference of a unit circle. Plot and show it is indeed picked from the circumference. To generate a point (x,y) from the circumference, sample (x,y) from std normal distribution and normalise them.
I'm almost certain my code isn't correct, but this is where I am up to. Any advice would be helpful.
This is the new updated code, but it still doesn't seem to be working.
import numpy as np
import matplotlib.pyplot as plot
def plot():
    xy = np.random.normal(0,1,(2000,2))
    for i in range(2000):
        s=np.linalg.norm(xy[i,])
        xy[i,]=xy[i,]/s
    plot.plot(xy)
    plot.show()
I think the problem is in
plot.plot(xy)
even if I use
plot.plot(xy[:,0],xy[:,1])
it doesn't work.
Connected lines are not a good visualization here. You essentially connect random points on the circle. Since you do this quite often, you will get a filled circle. Try drawing points instead.
Also avoid name space mangling. You import matplotlib.pyplot as plot and also name your function plot. This will lead to name conflicts.
import numpy as np
import matplotlib.pyplot as plt
def plot():
    xy = np.random.normal(0,1,(2000,2))
    for i in range(2000):
        s=np.linalg.norm(xy[i,])
        xy[i,]=xy[i,]/s
    fig, ax = plt.subplots(figsize=(5,5))
    # scatter draws dots instead of lines
    ax.scatter(xy[:,0], xy[:,1])
If you use dots instead, you will see that your points indeed lie on the unit circle.
Your code has many problems:
Why use np.random.normal (a Gaussian distribution) when the problem text is about uniform (flat) sampling?
To pick points on a circle you need to correlate x and y; i.e. randomly sampling x and y will not give a point on the circle as x**2+y**2 must be 1 (for example for the unit circle centered in (x=0, y=0)).
A couple of ways to get the second point is to either "project" a random point from [-1...1]x[-1...1] on the unit circle or to pick instead uniformly the angle and compute a point on that angle on the circle.
First of all, if you look at the documentation for numpy.random.normal (and, by the way, you could just use numpy.random.randn), it takes an optional size parameter, which lets you create as large of an array as you'd like. You can use this to get a large number of values at once. For example: xy = numpy.random.normal(0,1,(2000,2)) will give you all the values that you need.
At that point, you need to normalize them such that xy[:,0]**2 + xy[:,1]**2 == 1. This should be relatively trivial after computing what xy[:,0]**2 + xy[:,1]**2 is. Simply using norm on each dimension separately isn't going to work.
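The normalisation described above can be sketched as follows, together with a numerical check that the result sits on the circumference. The fixed seed and the choice of 8 angle bins are assumptions made only for this illustration. Because a 2-D standard normal sample has no preferred direction, the angles of the normalised points should be spread roughly evenly.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed, only for repeatability
xy = rng.normal(0, 1, (2000, 2))

# Divide each row by its Euclidean norm, sqrt(x**2 + y**2)
norms = np.sqrt(xy[:, 0]**2 + xy[:, 1]**2)
xy = xy / norms[:, np.newaxis]

# Every point should now be at distance 1 from the origin
radii = np.hypot(xy[:, 0], xy[:, 1])
print(radii.min(), radii.max())

# Count how many points fall into each of 8 equal angular sectors
angles = np.arctan2(xy[:, 1], xy[:, 0])
counts, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi))
print(counts)
```

Roughly equal sector counts are what sampling uniformly from the circumference should produce.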
Usual boilerplate
import numpy as np
import matplotlib.pyplot as plt
generate the random sample with two rows, so that it's more convenient to refer to x's and y's
xy = np.random.normal(0,1,(2,2000))
Normalize the random sample, using a library function to compute the norm. axis=0 means the norm is taken along the first array index, so the result is a (2000,)-shaped array that is broadcast against xy in the /= operation; every point then has unit norm and hence lies on the unit circle
xy /= np.linalg.norm(xy, axis=0)
Finally, the plot... here the key is the add_subplot() method, and in particular the keyword argument aspect='equal', which requires that the scale from user units to output units is the same for both axes
plt.figure().add_subplot(111, aspect='equal').scatter(xy[0], xy[1])
plt.show()
to have

Trying to plot a quadratic regression, getting multiple lines

I'm making a demonstration of different types of regression in numpy with ipython, and so far, I've been able to plot a simple linear regression without difficulty. Now, when I go on to make a quadratic fit to my data and go to plot it, I don't get a quadratic curve but instead get many lines. Here's the code I'm running that generates the problem:
import numpy
from numpy import random
from matplotlib import pyplot as plt
import math
# Generate random data
X = random.random((100,1))
epsilon=random.randn(100,1)
f = 3+5*X+epsilon
# least squares system
A =numpy.array([numpy.ones((100,1)),X,X**2])
A = numpy.squeeze(A)
A = A.T
quadfit = numpy.linalg.solve(numpy.dot(A.transpose(),A),numpy.dot(A.transpose(),f))
# plot the data and the fitted parabola
qdbeta0,qdbeta1,qdbeta2 = quadfit[0][0],quadfit[1][0],quadfit[2][0]
plt.scatter(X,f)
plt.plot(X,qdbeta0+qdbeta1*X+qdbeta2*X**2)
plt.show()
What I get is this picture (zoomed in to show the problem):
You can see that rather than having a single parabola that fits the data, I have a huge number of individual lines doing something that I'm not sure of. Any help would be greatly appreciated.
Your X is ordered randomly, so it's not a good set of x values to use to draw one continuous line, because it has to double back on itself. You could sort it, I guess, but TBH I'd just make a new array of x coordinates and use those:
plt.scatter(X,f)
x = numpy.linspace(0, 1, 1000)
plt.plot(x,qdbeta0+qdbeta1*x+qdbeta2*x**2)
gives me
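The other option mentioned, sorting, can be sketched like this. It is an illustration under assumptions: the data are regenerated here with a fixed seed as 1-D arrays, and numpy.linalg.lstsq is used for the fit instead of solving the normal equations by hand.

```python
import numpy as np

rng = np.random.default_rng(0)              # fixed seed, only for repeatability
X = rng.random(100)
f = 3 + 5 * X + rng.standard_normal(100)

# Design matrix with columns 1, X, X**2
A = np.column_stack([np.ones_like(X), X, X**2])
beta, *_ = np.linalg.lstsq(A, f, rcond=None)

# Sort by X so that a line plot visits the points from left to right
order = np.argsort(X)
X_sorted = X[order]
fit_sorted = A[order] @ beta
print(beta)
```

Passing X_sorted and fit_sorted to plt.plot would then draw a single curve, because consecutive points no longer double back along the x axis.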

