I have an array of values in the range 1500 to 4500.
I managed to convert the data to an RGBA image using matplotlib's colormap functions. The code is as follows:
import matplotlib.pyplot as plt
import numpy as np
norm = plt.Normalize(vmin=1500, vmax=4500)
jet = plt.cm.jet
# generate 100x100 with value in range 1500-4500
original = np.random.randint(1500, 4500, (100, 100))
# array in shape (100,100)
# convert the array to rgba image
converted = jet(norm(original))
# image in shape (100,100,4)
How can I get the original array back from the converted image?
Some rounding will take place because of the limited amount of colors in the colormap, so a perfect reversal is not possible.
But you can get close by simply inverting the colormap and subsequent normalization.
Starting with some sample data:
import matplotlib as mpl
import numpy as np
rng = np.random.default_rng(seed=0)
data = rng.integers(1500,4500, (3,3))
# array([[4051, 3410, 3033],
# [2309, 2423, 1622],
# [1725, 1549, 2025]], dtype=int64)
Which can be converted to RGBA:
norm = mpl.colors.Normalize(vmin=1500, vmax=4500)
cmap = mpl.colormaps["jet"].copy()
data_rgb = cmap(norm(data))
Next, convert the colormap to a lookup table that maps each RGB color back to its colormap index. I'll drop the alpha for simplicity, since this colormap doesn't use it.
lut = np.zeros((256,) * 3, dtype=np.uint8)
for i in range(cmap.N):
    r, g, b, a = cmap(i)
    lut[int(r*255), int(g*255), int(b*255)] = i
The lookup table can then be indexed with the RGB expressed as bytes:
data_rgb_byte = (data_rgb*255).astype(np.uint16)
data_inv_norm = lut[
    data_rgb_byte[:,:,0],
    data_rgb_byte[:,:,1],
    data_rgb_byte[:,:,2],
]/255
data_recovered = norm.inverse(data_inv_norm).data
data_recovered
# array([[4052.94117647, 3405.88235294, 3029.41176471],
# [2311.76470588, 2417.64705882, 1617.64705882],
# [1723.52941176, 1547.05882353, 2017.64705882]])
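I guess the loss in accuracy comes from the range of the initial normalization (4500 - 1500 = 3000) compared to the resolution of the colormap (N=256): each color covers a band of roughly 3000/256 ~= 11.7 data units, so values inside the same band can't be told apart.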
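As a rough check of that estimate, here is a minimal sketch of an alternative inversion that looks for the nearest colormap entry instead of building a 256x256x256 table, and then measures the round-trip error. The helper name `invert_colormap` is made up for illustration, and returning the center of each color band is my own choice rather than something taken from the code above. It compares every pixel with every colormap entry, so it is only meant for small arrays.

```python
import numpy as np
import matplotlib as mpl


def invert_colormap(rgba, cmap, norm):
    """Approximate the data behind an RGBA image (hypothetical helper)."""
    # All colors the colormap can produce, alpha dropped: shape (N, 3)
    table = cmap(np.arange(cmap.N))[:, :3]
    pixels = rgba[..., :3].reshape(-1, 1, 3)
    # Squared distance from each pixel to each table entry: shape (n_pixels, N)
    dist = ((pixels - table[np.newaxis, :, :]) ** 2).sum(axis=2)
    idx = dist.argmin(axis=1).reshape(rgba.shape[:-1])
    # Entry i covers normalized values in [i/N, (i+1)/N); return the band center
    return np.asarray(norm.inverse((idx + 0.5) / cmap.N))


norm = mpl.colors.Normalize(vmin=1500, vmax=4500)
cmap = mpl.colormaps["jet"]

rng = np.random.default_rng(seed=0)
data = rng.integers(1500, 4500, (20, 20))

recovered = invert_colormap(cmap(norm(data)), cmap, norm)
step = (norm.vmax - norm.vmin) / cmap.N  # about 11.7 data units per color
max_error = np.abs(recovered - data).max()
print(step, max_error)
```

Where a colormap repeats a color, the index is ambiguous and the error can exceed half a step for those pixels, so the printed maximum should be read as an empirical figure rather than a guaranteed bound.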
I have tens of thousands of images. I want to generate a histogram for each pixel. I have come up with the following code using NumPy to do this that works:
import numpy as np
import matplotlib.pyplot as plt
nimages = 1000
im_shape = (64,64)
nbins = 100
#predefine the histogram bins
hist_bins = np.linspace(0,1,nbins)
#create an array to store histograms for each pixel
perpix_hist = np.zeros((64,64,nbins))
for ni in range(nimages):
    #create a simple image with normally distributed pixel values
    im = np.random.normal(loc=0.5,scale=0.05,size=im_shape)
    #sort each pixel into the predefined histogram
    bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
    bins_for_this_image = bins_for_this_image.reshape(im_shape)
    #this next part adds one to each of those bins
    #but this is slow as it loops through each pixel
    #how to vectorize?
    for i in range(im_shape[0]):
        for j in range(im_shape[1]):
            perpix_hist[i,j,bins_for_this_image[i,j]] += 1
#plot histogram for a single pixel
plt.plot(hist_bins,perpix_hist[0,0])
plt.xlabel('pixel values')
plt.ylabel('counts')
plt.title('histogram for a single pixel')
plt.show()
I would like to know if anyone can help me vectorize the for loops? I can't think of how to index into the perpix_hist array properly. I have tens/hundreds of thousands of images and each image is ~1500x1500 pixels, and this is too slow.
You can vectorize it by using np.meshgrid to build index arrays for the first and second dimensions and indexing all three dimensions at once (the indices for the last dimension you already have).
y_grid, x_grid = np.meshgrid(np.arange(64), np.arange(64))
for i in range(nimages):
    #create a simple image with normally distributed pixel values
    im = np.random.normal(loc=0.5,scale=0.05,size=im_shape)
    #sort each pixel into the predefined histogram
    bins_for_this_image = np.searchsorted(hist_bins, im.ravel())
    bins_for_this_image = bins_for_this_image.reshape(im_shape)
    perpix_hist[x_grid, y_grid, bins_for_this_image] += 1
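A quick way to convince yourself that the fancy-indexed update matches the double loop is to run both on a small non-square image. This is only a sketch: the shapes and seed are arbitrary, `np.indices` stands in for `np.meshgrid`, and the extra overflow bin (`nbins + 1`) is my own addition so that values above the last bin edge don't index out of bounds.

```python
import numpy as np

rng = np.random.default_rng(0)
nimages = 50
im_shape = (8, 6)  # deliberately non-square to catch swapped axes
nbins = 10
hist_bins = np.linspace(0, 1, nbins)

# rows[i, j] == i and cols[i, j] == j
rows, cols = np.indices(im_shape)

hist_vec = np.zeros(im_shape + (nbins + 1,))
hist_loop = np.zeros_like(hist_vec)

for _ in range(nimages):
    im = rng.normal(loc=0.5, scale=0.05, size=im_shape)
    # searchsorted accepts the 2D image directly and keeps its shape
    bins = np.searchsorted(hist_bins, im)

    # vectorized update: one increment per pixel
    hist_vec[rows, cols, bins] += 1

    # reference implementation with explicit loops
    for i in range(im_shape[0]):
        for j in range(im_shape[1]):
            hist_loop[i, j, bins[i, j]] += 1

print(np.array_equal(hist_vec, hist_loop))
```

The in-place `+=` is safe here because every (row, column) pair occurs exactly once per image; if the same index could appear more than once in a single update, `np.add.at` would be the safer tool.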
I wrote a function with this purpose:
1. to create a matplotlib figure, but not display it
2. with no frames, axes, etc.
3. to plot in the figure an input 2D array using a user-passed colormap
4. to save the colormapped 2D array from the canvas to a numpy array
5. that the output array should be the same size as the input
There are lots of questions with answers for tasks similar to either points 1-2 or point 4; for me it was also important to automate point 5. So I started by combining parts from both @joe-kington's answer and @matehat's answer (and the comments on it), and with small modifications I arrived at this:
def mk_cmapped_data(data, mpl_cmap_name):
    # This is to define figure & output dimensions from input
    r, c = data.shape
    dpi = 72
    w = round(c/dpi, 2)
    h = round(r/dpi, 2)
    # This part modified from @matehat's SO answer:
    # https://stackoverflow.com/a/8218887/1034648
    fig = plt.figure(frameon=False)
    fig.set_size_inches((w, h))
    ax = plt.Axes(fig, [0., 0., 1., 1.])
    ax.set_axis_off()
    fig.add_axes(ax)
    plt.set_cmap(mpl_cmap_name)
    ax.imshow(data, aspect='auto', cmap = mpl_cmap_name, interpolation = 'none')
    fig.canvas.draw()
    # This part is to save the canvas to numpy array
    # Adapted from Joe Kington's SO answer:
    # https://stackoverflow.com/a/7821917/1034648
    mat = np.frombuffer(fig.canvas.tostring_rgb(), dtype=np.uint8)
    mat = mat.reshape(fig.canvas.get_width_height()[::-1] + (3,))
    mat = normalise(mat) # this is just using a helper function to normalize output range
    plt.close(fig=None)
    return mat
The function does what it is supposed to do and is fast enough.
My question is whether I can make it more efficient and/or more pythonic in any way.
If you're wanting RGB output that exactly matches the shape of the input array, it's probably easiest to not create a figure, and instead use the colormap objects directly. For example:
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
# Random data with a non 0-1 range.
data = 500 * np.random.random((100, 100)) - 200
# We'll use `Colormap` and `Normalize` instances directly
cmap = plt.get_cmap('viridis')
norm = plt.Normalize(data.min(), data.max())
# The norm instance scales data to a 0-1 range, cmap makes it RGB
rgb = cmap(norm(data))
# MPL uses a 0-1 float RGB representation, so we'll scale to 0-255
rgb = (255 * rgb).astype(np.uint8)
Image.fromarray(rgb).save('test.png')
Note that you likely don't want the additional step of saving it as a PNG, but I wanted to be able to show the result visually. This is exactly a 100x100 image where each pixel corresponds to the original input data.
This is what matplotlib does behind-the-scenes when you call imshow. The data is first run through a Normalize instance to scale it from its original range to 0-1. Then any Colormap instance can be called directly with the 0-1 results to turn the scalar data into RGB data.
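If the goal is a reusable function like the one in the question, the same idea might be wrapped up as below. This is a sketch rather than a drop-in replacement: the name `colormap_to_rgb` is made up, the data is scaled to its own min/max as in the example above, and it relies on the `bytes=True` argument of the colormap call to get `uint8` output directly.

```python
import numpy as np
import matplotlib as mpl


def colormap_to_rgb(data, cmap_name):
    """Return a uint8 RGB array with the same rows/cols as `data` (hypothetical helper)."""
    data = np.asarray(data, dtype=float)
    # Scale the data to 0-1 using its own range
    norm = mpl.colors.Normalize(vmin=data.min(), vmax=data.max())
    cmap = mpl.colormaps[cmap_name]
    # bytes=True asks the colormap for uint8 RGBA instead of 0-1 floats
    rgba = cmap(norm(data), bytes=True)
    return rgba[..., :3]  # drop the alpha channel


data = 500 * np.random.random((100, 150)) - 200
rgb = colormap_to_rgb(data, "viridis")
print(rgb.shape, rgb.dtype)  # (100, 150, 3) uint8
```

No figure or canvas is created, so there is nothing to size, draw, or close, and the output shape follows the input by construction.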
One-letter variable names are hard to understand.
Change:
r -> n_rows
c -> n_cols
w -> width
h -> height
I have a HealPix plot, made with HEALPY, as in Healpy: From Data to Healpix map (with less pixels, for instance taking nside=2, see code below).
import healpy as hp
import numpy as np
import matplotlib.pyplot as plt
# Set the number of sources and the coordinates for the input
nsources = int(1.e4)
nside = 2
npix = hp.nside2npix(nside)
# Coordinates and the density field f
thetas = np.random.random(nsources) * np.pi
phis = np.random.random(nsources) * np.pi * 2.
fs = np.random.randn(nsources)
# Go from HEALPix coordinates to indices
indices = hp.ang2pix(nside, thetas, phis)
# Initiate the map and fill it with the values
# (the `np.float` alias was removed in NumPy 1.24; the builtin `float` is equivalent)
hpxmap = np.zeros(npix, dtype=float)
hpxmap[indices] += fs[indices]
# Inspect the map
hp.mollview(hpxmap)
How can I write a text label with a value in the center of each HEALPix pixel shown on the plot?
For example, how can I write an identifier for each 'pixel', using an array like range(len(hpxmap))?
Thanks a lot in advance for your help!
I am confused about how matplotlib handles float32 pixel intensities. To my understanding, it rescales the values between the max and min values of the image. However, when I take images originally in [0,1], rescale their pixel intensities to [-1,1] (by im*2-1) and view them with imshow(), the image appears differently colored. How do I rescale so that the images don't differ?
EDIT: Please look at the image.
PS: I need to do this as part of a program that outputs those values in [-1,1]
Following is the code used for this:
img = np.float32(misc.face(gray=False))
fig,ax = plt.subplots(1,2)
img = img/255 # Convert to 0,1 range
print (np.max(img), np.min(img))
img0 = ax[0].imshow(img)
plt.colorbar(img0,ax=ax[0])
print (np.max(2*img-1), np.min(2*img-1))
img1 = ax[1].imshow(2*img-1) # Convert to -1,1 range
plt.colorbar(img1,ax=ax[1])
plt.show()
The max,min output is :
(1.0, 0.0)
(1.0, -1.0)
You are probably using matplotlib wrong here.
The normalization step should work correctly if it's active. The docs tell us that it is only active by default if the input image is of type float!
Code
import numpy as np
import matplotlib.pyplot as plt
from scipy import misc
fig, ax = plt.subplots(2,2)
# This usage shows different colors because there is no normalization
# FIRST ROW
f = misc.face(gray=True)
print(f.dtype)
g = f*2 # just some operation to show the difference between usages
ax[0,0].imshow(f)
ax[0,1].imshow(g)
# This usage makes sure that the input-image is of type float
# -> automatic normalization is used!
# SECOND ROW
f = np.asarray(misc.face(gray=True), dtype=float) # TYPE!
print(f.dtype)
g = f*2 # just some operation to show the difference between usages
ax[1,0].imshow(f)
ax[1,1].imshow(g)
plt.show()
Output
uint8
float64
Analysis
The first row shows the wrong usage, because the input is of type int and therefore no normalization will be used.
The second row shows the correct usage!
EDIT:
sascha has correctly pointed out in the comments that rescaling is not applied to RGB images, so RGB inputs of type float must be ensured to be in the [0,1] range.
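Given that, one way to display float RGB data that lives in [-1,1] is to map it back to [0,1] before handing it to imshow. A minimal sketch, assuming the values really span [-1,1] (the helper name `to_unit_range` and its `lo`/`hi` parameters are made up for illustration):

```python
import numpy as np


def to_unit_range(img, lo=-1.0, hi=1.0):
    """Linearly map values from [lo, hi] to [0, 1] (hypothetical helper)."""
    img = np.asarray(img, dtype=np.float32)
    # Clip afterwards so small rounding overshoots stay inside [0, 1]
    return np.clip((img - lo) / (hi - lo), 0.0, 1.0)


rng = np.random.default_rng(0)
img = rng.random((4, 5, 3), dtype=np.float32)  # RGB image in [0, 1]
shifted = 2 * img - 1                          # same image in [-1, 1]
restored = to_unit_range(shifted)              # pass this to imshow
print(np.allclose(restored, img, atol=1e-5))
```

With this, `ax.imshow(to_unit_range(2*img-1))` should show the same colors as `ax.imshow(img)`, up to float rounding.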
How to do histogram equalization for multiple grayscaled images stored in a NumPy array easily?
I have the 96x96 pixel NumPy data in this 4D format:
(1800, 1, 96,96)
Moose's comment which points to this blog entry does the job quite nicely.
For completeness, I give an example here using nicer variable names and a looped execution on 1000 96x96 images which are in a 4D array as in the question. It is fast (1-2 seconds on my computer) and only needs NumPy.
import numpy as np
def image_histogram_equalization(image, number_bins=256):
    # from http://www.janeriksolem.net/histogram-equalization-with-python-and.html
    # get image histogram
    image_histogram, bins = np.histogram(image.flatten(), number_bins, density=True)
    cdf = image_histogram.cumsum() # cumulative distribution function
    cdf = (number_bins-1) * cdf / cdf[-1] # normalize
    # use linear interpolation of cdf to find new pixel values
    image_equalized = np.interp(image.flatten(), bins[:-1], cdf)
    return image_equalized.reshape(image.shape), cdf
if __name__ == '__main__':
    # generate some test data with shape 1000, 1, 96, 96
    data = np.random.rand(1000, 1, 96, 96)
    # loop over them
    data_equalized = np.zeros(data.shape)
    for i in range(data.shape[0]):
        image = data[i, 0, :, :]
        data_equalized[i, 0, :, :] = image_histogram_equalization(image)[0]
A very fast and easy way is to use the cumulative distribution function provided by the skimage module. It is basically the same thing you do when deriving histogram equalization mathematically.
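from skimage import exposure
from skimage.color import rgb2gray
import numpy as np
def histogram_equalize(img):
    # convert to grayscale, then map each pixel through the image's CDF
    img = rgb2gray(img)
    img_cdf, bin_centers = exposure.cumulative_distribution(img)
    return np.interp(img, bin_centers, img_cdf)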
As of today janeriksolem's url is broken.
I found however this gist that links the same page and claims to perform histogram equalization without computing the histogram.
The code is:
img_eq = np.sort(img.ravel()).searchsorted(img)
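To see what that one-liner produces, here is a small sketch that wraps it in a function and rescales the result to [0, 1]. The function name `equalize_by_rank` and the rescaling are my additions, not part of the gist:

```python
import numpy as np


def equalize_by_rank(img):
    """Replace each pixel by its rank among all pixels, scaled to [0, 1]."""
    img = np.asarray(img)
    sorted_values = np.sort(img.ravel())
    # For each pixel, count how many pixels are strictly smaller (its rank)
    ranks = sorted_values.searchsorted(img)
    return ranks / max(img.size - 1, 1)


rng = np.random.default_rng(0)
img = rng.normal(0.5, 0.05, (96, 96))
img_eq = equalize_by_rank(img)
print(img_eq.min(), img_eq.max())
```

Because each pixel is replaced by its rank, the output values are spread evenly over the range whenever the input values are all distinct. Tied pixels share the same rank, so heavily quantized images (for example uint8) will not come out perfectly flat.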
Here's an alternate implementation for a single-channel image that is fast. See skimage.exposure.histogram for reference. Using timeit, 'image_histogram_equalization' in Trilarion's answer had a mean execution time of 0.3696 seconds, while this function had a mean execution time of 0.0534 seconds. However, this implementation also relies on skimage.
import numpy as np
from skimage import exposure
def hist_eq(image):
    hist, bins = exposure.histogram(image, nbins=256, normalize=False)
    # append any remaining 0 values to the histogram
    hist = np.hstack((hist, np.zeros((255 - bins[-1]))))
    cdf = 255*(hist/hist.sum()).cumsum()
    equalized = cdf[image].astype(np.uint8)
    return equalized