I am implementing color interpolation using a look-up-table (LUT) with NumPy. At one point I am using the 4 most significant bits of RGB values to choose corresponding CMYK values from a 17x17x17x4 LUT. Right now it looks something like this:
import numpy as np
rgb = np.random.randint(16, size=(3, 1000, 1000))
lut = np.random.randint(256, size=(17, 17, 17, 4))
cmyk = lut[rgb[0], rgb[1], rgb[2]]
Here comes the first question... Is there no better way? It sort of seems natural that you could tell NumPy that the indices for lut are stored along axis 0 of rgb, without having to actually write it out. So is there anything like cmyk = lut.fancier_take(rgb, axis=0) in NumPy?
Furthermore, I am left with an array of shape (1000, 1000, 4), so to be consistent with the input, I need to rotate it all around using a couple of swapaxes:
cmyk = cmyk.swapaxes(2, 1).swapaxes(1, 0).copy()
And I also need to add the copy statement, because if not the resulting array is not contiguous in memory, and that brings trouble later on.
Right now I am leaning towards rotating the LUT before the fancy indexing and then do something along the lines of:
swapped_lut = lut.swapaxes(2, 1).swapaxes(1, 0)
cmyk = swapped_lut[np.arange(4), rgb[0], rgb[1], rgb[2]]
But again, it just does not seem right... There has to be a more elegant way to do this, right? Something like cmyk = lut.even_fancier_take(rgb, in_axis=0, out_axis=0)...
I'd suggest using tuple to force indexing rowwise, and np.rollaxis or transpose instead of swapaxes:
lut[tuple(rgb)].transpose(2, 0, 1).copy()
or
np.rollaxis(lut[tuple(rgb)], 2).copy()
To roll the axis first, use:
np.rollaxis(lut, -1)[(Ellipsis,) + tuple(rgb)]
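If you swap the axes of lut first, np.arange(4) will not work as an index, because its shape (4,) cannot broadcast against the (1000, 1000) index arrays. Use a slice for the first axis instead: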
swapped_lut = np.rollaxis(lut, -1)
cmyk = swapped_lut[:, rgb[0], rgb[1], rgb[2]].copy()
Or you can replace
cmyk = lut[rgb[0], rgb[1], rgb[2]]
cmyk = cmyk.swapaxes(2, 1).swapaxes(1, 0).copy()
with:
cmyk = lut[tuple(rgb)]
cmyk = np.rollaxis(cmyk, -1).copy()
But to try and do it all in one step, ... Maybe:
rng = np.arange(4).reshape(4, 1, 1)
cmyk = lut[rgb[0], rgb[1], rgb[2], rng]
That's not very readable at all is it?
Take a look at the answer to this question, Numpy multi-dimensional array indexing swaps axis order. It does a good job of explaining how numpy broadcasts multiple arrays to get the output size. Here you want to create indices into lut that broadcast to (4, 1000, 1000). Hope that makes some sense.
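As a rough sketch of that broadcasting rule, using a small random image rather than the 1000x1000 one: the (H, W) index arrays and a channel index of shape (4, 1, 1) broadcast together to (4, H, W), and the result should match the variants that roll the axis before or after indexing. The sizes, the seed and the names a, b, c and chan are only illustrative.

```python
import numpy as np

gen = np.random.default_rng(0)  # seeded only to make the sketch repeatable
rgb = gen.integers(16, size=(3, 5, 7))          # small stand-in for the image
lut = gen.integers(256, size=(17, 17, 17, 4))

# Index first, then move the channel axis to the front.
a = np.rollaxis(lut[tuple(rgb)], -1)

# Move the channel axis first, then index the remaining three axes.
b = np.rollaxis(lut, -1)[(Ellipsis,) + tuple(rgb)]

# One step: a (4, 1, 1) channel index broadcasts against the (5, 7) index arrays.
chan = np.arange(4).reshape(4, 1, 1)
c = lut[rgb[0], rgb[1], rgb[2], chan]

print(a.shape, b.shape, c.shape)
```

All three results have the channel axis first, so no extra swapaxes call is needed; a .copy() is still required if a contiguous array matters later on.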
Related
I'm trying to create a multidimensional matrix with numpy, to describe a RGB image. Something like this does not work:
numpy.full(100, 100, [0,0,0])
This fails with TypeError: data type not understood. What I'm trying to get at is a pixel matrix with rgb values at every pixel.
Edit: this gets me halfway there:
n = numpy.empty((3,3,3))
n[:] = [0,0,0]
However, this gives a float array for every point, while a uint8 array would suffice. How would I fix that?
Apparently, this does the trick:
n = numpy.empty((3,3,3), dtype=numpy.uint8)
n[:] = [0,0,0]
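A short sketch of building a uint8 pixel array: the dtype has to be given when the array is allocated, because assigning into an existing array casts the values to that array's dtype rather than changing it. The image size and fill colour below are made-up values.

```python
import numpy as np

height, width = 3, 3              # hypothetical image size
color = [0, 128, 255]             # hypothetical RGB fill value

# Give the dtype when the array is allocated; assignment then casts into it.
img = np.empty((height, width, 3), dtype=np.uint8)
img[:] = color                    # broadcasts the 3 values to every pixel

# For an all-black image, zeros with a dtype is enough.
black = np.zeros((height, width, 3), dtype=np.uint8)

print(img.dtype, img.shape)
```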
import numpy as np
ts = np.random.rand(40,45,40,1000)
mask = np.random.randint(2, size=(40,45,40),dtype=bool)
#creating a masked array
ts_m = np.ma.array(ts, mask=ts*~mask[:,:,:,np.newaxis])
#demeaning
ts_md = ts_m - ts_m.mean(axis=3)[:,:,:,np.newaxis]
#standardisation
ts_mds = ts_md / ts_md.std(ddof=1,axis=3)[:,:,:,np.newaxis]
I would like to demean ts (along axis 3), and divide by its standard deviation (along axis 3), all within the mask.
Am I doing this correctly?
Is there a faster method?
You have a couple of options available to you.
The first is to use masked arrays as you are doing, but provide a proper mask and use the masked functions. Right now, your code is computing all the means and standard deviations, and slapping a mask on the result. To skip masked elements, use np.ma.mean and np.ma.std, and thereby avoid doing a whole lot of extra work.
As you correctly understood, the size of the mask must match that of the data. While multiplying by the data gives you the correct size, it is expensive and gives the wrong result in the general case since the mask will be zero whenever either data or mask is zero. A better approach would be to create a view of the mask repeated along the last (new) dimension. You can use np.broadcast_to if you get the trailing dimensions to match up first:
ts = np.random.rand(40, 45, 40, 1000)
mask = np.random.randint(2, size=(40, 45, 40), dtype=bool)
#creating a masked array
ts_m = np.ma.array(ts, mask=np.broadcast_to(mask[..., None], ts.shape))
#demeaning
ts_md = ts_m - np.ma.mean(ts_m, axis=3)[..., None]
#standardisation
ts_mds = ts_md / np.ma.std(ts_m, ddof=1,axis=3)[..., None]
The mask is read only, and because it likely has a dimension with zero stride, can sometimes do unexpected things. The broadcasted version here is roughly equivalent to
np.lib.stride_tricks.as_strided(mask, ts.shape, (*mask.strides, 0), writeable=False)
Both versions create views to the original data, so are very fast. They just allocate a new array object that points to the existing data, which is not copied. Keep in mind that np.lib.stride_tricks.as_strided is a sledgehammer that should be used with the utmost care. It will crash your interpreter any day if you let it.
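To make the "view, not copy" point concrete, here is a minimal sketch on a tiny mask. The data shape is a made-up example; the checks only inspect the array returned by np.broadcast_to.

```python
import numpy as np

mask = np.array([[True, False], [False, True]])
data_shape = (2, 2, 5)            # hypothetical data shape; the last axis is new

view = np.broadcast_to(mask[..., None], data_shape)

print(view.shape)                     # (2, 2, 5)
print(view.strides[-1])               # 0: the last axis repeats the same bytes
print(view.flags.writeable)           # False
print(np.shares_memory(view, mask))   # True: no data was copied
```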
Note: The mask in a masked array is interpreted as True being masked, while Boolean indexing arrays are interpreted with False masked. Depending on how it's obtained and its meaning in your real code, you may want to invert the mask:
mask=np.broadcast_to(~mask[..., None], ...)
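The two conventions are easy to mix up, so here is a small sketch with made-up numbers: Boolean indexing keeps the True positions, while a masked array ignores them.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 10.0])
flags = np.array([True, False, False, True])

# Boolean indexing keeps the elements where the flag is True.
kept = data[flags]
print(kept.mean())                              # 5.5, the mean of 1.0 and 10.0

# A masked array hides the elements where the mask is True.
masked = np.ma.array(data, mask=flags)
print(masked.mean())                            # 2.5, the mean of 2.0 and 3.0

# Inverting the mask makes the two conventions agree.
inverted = np.ma.array(data, mask=~flags)
print(inverted.mean())                          # 5.5
```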
Another option is to implement the masking yourself. There are two ways you can do that. If you do it up-front, the mask will be applied to the leading dimensions of your data:
ts = np.random.rand(40, 45, 40, 1000)
mask = np.random.randint(2, size=(40, 45, 40), dtype=bool)
#selecting the elements to keep
mask = ~mask # optional, see note above
ts_m = ts[mask]
#demeaning (keepdims lets the per-row mean broadcast against ts_m)
ts_md = ts_m - ts_m.mean(axis=-1, keepdims=True)
#standardisation
ts_mds = ts_md / ts_md.std(ddof=1, axis=-1, keepdims=True)
# reshaping
result = np.empty_like(ts) # alternatively, np.zeros_like
result[mask] = ts_mds
This option may be cheaper than a masked array because the initial masking step creates a smaller array of shape (number of selected elements, 1000), and the result is only written back into the selected area at the end, instead of every operation running on the full-sized data and preserving its shape.
The third option is only really useful if you have only a small number of elements masked out. It's essentially what your original code is doing: perform all the computations, and apply the mask to the result.
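A scaled-down sketch of the up-front approach, assuming True in keep marks the elements to process. The array sizes, the seed and the names keep and sel are only illustrative. Note that the per-row mean and standard deviation need a trailing axis of length 1 to broadcast against the selected rows.

```python
import numpy as np

rng = np.random.default_rng(0)
ts = rng.random((4, 5, 6, 50))              # small stand-in for (40, 45, 40, 1000)
keep = rng.integers(2, size=(4, 5, 6)).astype(bool)   # True = element to process

sel = ts[keep]                              # shape (keep.sum(), 50)

# keepdims=True keeps a trailing axis of length 1 so the result broadcasts.
sel = sel - sel.mean(axis=-1, keepdims=True)
sel = sel / sel.std(axis=-1, ddof=1, keepdims=True)

result = np.zeros_like(ts)                  # unselected elements stay at zero
result[keep] = sel

print(sel.shape, result.shape)
```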
More Tips
Ellipsis is a special object that means "all the remaining dimensions". It's usually abbreviated ... in slice notation. np.newaxis is an alias for None. Combine those pieces of information, and you get that [:, :, :, np.newaxis] can be written more cleanly and elegantly as [..., None]. The latter is more general since it works for an arbitrary number of dimensions.
Numpy allows for negative axis indices. A nicer way to say "last axis" is generally axis=-1.
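A quick sketch of those two tips on a small 3-D array; keepdims=True is shown as a third spelling that avoids the re-indexing altogether.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)

m_long = a.mean(axis=2)[:, :, np.newaxis]   # explicit slices, fixed to 3 dimensions
m_short = a.mean(axis=-1)[..., None]        # works for any number of dimensions
m_keep = a.mean(axis=-1, keepdims=True)     # keeps the reduced axis with length 1

print(m_long.shape, m_short.shape, m_keep.shape)
```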
import numpy as np
ts = np.random.rand(40,45,40,1000)
mask = np.random.randint(2, size=(40,45,40)).astype(bool)
#creating a masked array
ts_m = np.ma.array(ts, mask=np.broadcast_to(~mask.reshape(40,45,40,1),ts.shape))
#demeaning
ts_md = ts_m - ts_m.mean(axis=3)[:,:,:,np.newaxis]
#standardisation
ts_mds = ts_md / ts_md.std(ddof=1,axis=3)[:,:,:,np.newaxis]
I'm having some trouble reshaping a 4D numpy array to a 2D numpy array. Currently the shape of the numpy array is as follows: (35280L, 1L, 32L, 32L). The format is number of images, channel, width, height. Basically, I have 35280 image blocks that are 32x32 and I want to combine the image blocks (keeping the indices) to create one big image.
Reshaping alone is not sufficient; you must also rearrange your data carefully with swapaxes.
Sample data :
import numpy as np
from matplotlib.pyplot import figure, subplot, imshow
dims = nbim, _, h, w = np.array([6, 1, 7, 6])
data = np.arange(dims.prod()).reshape(dims) % 256
The images :
figure()
for i in range(nbim):
    subplot(1,nbim,i+1)
    imshow(data[i,0],vmin=0,vmax=255)
and the big image :
#number of images in each dim :
nh = 2 # a choice
nw=nbim // nh
bigim=data.reshape(nh,nw,h,w).swapaxes(1,2).reshape(nh*h,nw*w)
figure()
imshow(bigim)
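Without plotting, the reshape/swapaxes step can be checked numerically. This sketch reuses the sample dimensions from above and verifies that tile (i, j) of the mosaic is image number i * nw + j; the loop is only a sanity check, not part of the method.

```python
import numpy as np

nbim, h, w = 6, 7, 6
data = np.arange(nbim * h * w).reshape(nbim, 1, h, w)

nh = 2                # number of tile rows, a choice
nw = nbim // nh       # number of tile columns

# (nh, nw, h, w) -> (nh, h, nw, w) -> (nh*h, nw*w)
bigim = data.reshape(nh, nw, h, w).swapaxes(1, 2).reshape(nh * h, nw * w)

# Tile (i, j) of the mosaic should be image number i * nw + j.
for i in range(nh):
    for j in range(nw):
        tile = bigim[i * h:(i + 1) * h, j * w:(j + 1) * w]
        assert np.array_equal(tile, data[i * nw + j, 0])

print(bigim.shape)    # (14, 18)
```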
You have an array like this:
images = np.random.randint(0,256,(35280, 1, 32, 32))
The first thing you need is to figure out (somehow) what the width of the final image is supposed to be. Let's say for this example that it's (441 * 32, 80 * 32).
Then you can do:
image = images.swapaxes(0,2).reshape((441 * 32, -1))
This gives you almost what you need, except the rows are interleaved, so you have:
AAABBBCCC
DDDEEEFFF
GGGHHHIII
AAABBBCCC
DDDEEEFFF
GGGHHHIII
You can then use "fancy indexing" to rearrange the rows:
image[np.array([0,3,1,4,2,5])]
Now you have:
AAABBBCCC
AAABBBCCC
DDDEEEFFF
DDDEEEFFF
GGGHHHIII
GGGHHHIII
I will leave as an exercise the part where you generate the fancy indexing sequence.
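One possible way to generate that sequence, assuming the rows are interleaved exactly as in the toy example (a pattern of n_groups row kinds repeated n_repeats times; both names are hypothetical): lay the row numbers out as a (n_repeats, n_groups) grid, transpose it and flatten.

```python
import numpy as np

n_groups = 3      # distinct row kinds in the toy example (A.., D.., G..)
n_repeats = 2     # how many times the whole pattern repeats

order = np.arange(n_groups * n_repeats).reshape(n_repeats, n_groups).T.ravel()
print(order)      # [0 3 1 4 2 5]

rows = np.array(list("ADGADG"))   # first letter of each interleaved row
print("".join(rows[order]))       # AADDGG
```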
I'm trying to find the dominant colors in an image and then threshold the most dominant one. However, I'm having trouble with data types.
My formula gives the most dominant color as:
color=[10,10,10] # type=numpy.ndarray ,uint8
But it gives assertion error when I try to convert it:
color=cv2.cvtColor(color, cv2.COLOR_BGR2HSV) #gives assertion error
What cv2.cvtColor wants as an input is that:
color_ideal=[[[ 10, 10, 10 ]]] #type=numpy.ndarray, uint8
To obtain it, I managed to manipulate color as such:
color=np.uint8(np.atleast_3d(clr).astype(int).reshape(1,1,3))
This seems to work, but now I cannot append multiple colors to the numpy array. Somehow, after appending, the dimension is reduced to 1. My code is:
color=np.uint8([[[]]])
for item in clt.cluster_centers_:
    color=np.append(color,(np.uint8(np.atleast_3d(item).astype(int).reshape(1,1,3))))
#returns: color=[10,10,10] somehow its dimension is down to 1
My questions are:
1-How do I properly append color data without losing its dimensions?
2-Is there an easier way to handle this? I'm surprised how difficult it is to manipulate a custom color pixel.
The full code is here in case it helps:
import cv2
import numpy as np
from sklearn.cluster import KMeans
def find_kmean_colors(img,no_cluster=2):
    clt = KMeans(no_cluster).fit(img)
    return clt
def initialize(img='people_frontal.jpg'):
    img=cv2.imread('people_frontal_close_body.jpg')
    img=cv2.bilateralFilter(img,9,75,75)
    return img
img=initialize()
img_hsv =cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
img_list= img.reshape((img.shape[0] * img_hsv.shape[1], 3))
clt=(find_kmean_colors(img_list,1))
color=np.uint8([[[]]])
for i in clt.cluster_centers_:
    color=np.append(color,(np.uint8(np.atleast_3d(i).astype(int).reshape(1,1,3))))
#color=np.uint8(np.atleast_3d(clt.cluster_centers_).astype(int).reshape(1,1,3))
up=cv2.cvtColor(color,cv2.COLOR_BGR2HSV)
Without the cv2 code, I'm guessing about shapes here. But it looks like img is an (n,m,3) array, img_list is (m1,3), clt is a list of m1 items, and clt.cluster_centers_ is a list of m1 arrays of shape (3,).
For testing's sake, let's make a list of lists (it could just as well be a list of arrays):
ctrs=[[10,10,10], [3,5,3], [20,10,10], [0,0,0]]
color = np.array(ctrs,dtype=np.uint8) # (4,3) array
color = color.reshape(len(ctrs),1,3)
Just wrap it in np.array, and reshape to 3d.
array([[[10, 10, 10]],
[[ 3, 5, 3]],
[[20, 10, 10]],
[[ 0, 0, 0]]], dtype=uint8)
Or it could be reshaped to (1,4,3) or (2,2,3).
Or closer to what you are trying:
np.concatenate([np.array(i,np.uint8).reshape(1,1,3) for i in ctrs])
You don't want to use atleast_3d here since it reshapes a (N,) array to (1,N,1) (see its docs). np.concatenate joins on the 1st axis, whereas np.array adds a 1st dimension and then joins.
You probably could get append to work, but it just does a step by step concatenate, which is slower. In general if you need to append, do it with lists, and then convert to an array at the end.
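A small sketch of why the append loop loses its dimensions, next to the list-then-convert pattern. The cluster centres are made-up values standing in for clt.cluster_centers_.

```python
import numpy as np

centers = [[10.2, 10.7, 10.1], [3.0, 5.5, 3.9]]   # hypothetical cluster centres

# np.append without an axis argument flattens both inputs before joining.
flat = np.uint8([[[]]])
for c in centers:
    flat = np.append(flat, np.array(c).astype(np.uint8).reshape(1, 1, 3))
print(flat.shape)     # (6,)

# Collect in a list, convert once, then reshape to the 3-D layout.
pixels = []
for c in centers:
    pixels.append(c)
color = np.array(pixels).astype(np.uint8).reshape(-1, 1, 3)
print(color.shape)    # (2, 1, 3)
```

With the second form, each color[[i]] is a (1, 1, 3) uint8 array, which is the layout the question says cv2.cvtColor expects.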
There are various ways of preserving or restoring dimensions after slicing. If color is 3d and you need the i'th row also as 3d:
color[[i]]
color[i].reshape(1, *color[i].shape)
color[i][np.newaxis,...]
Reshaping operations like this do not add significant time to the processing, so don't be afraid to use them.
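For illustration, the shapes produced by a few of these forms on a made-up (4, 1, 3) array; a length-1 slice is included as another common spelling.

```python
import numpy as np

color = np.arange(12, dtype=np.uint8).reshape(4, 1, 3)
i = 2

print(color[i].shape)                    # (1, 3): plain indexing drops an axis
print(color[[i]].shape)                  # (1, 1, 3)
print(color[i:i + 1].shape)              # (1, 1, 3)
print(color[i][np.newaxis, ...].shape)   # (1, 1, 3)
```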
I have an image img:
>>> img.shape
(200, 200, 3)
On pixel (100, 100) I have a nice color:
>>> img[100,100]
array([ 0.90980393, 0.27450982, 0.27450982], dtype=float32)
Now my question is: How many different colors are there in this image, and how do I enumerate them?
My first idea was numpy.unique(), but somehow I am using this wrong.
Your initial idea to use numpy.unique() can do the job directly (the axis argument requires NumPy 1.13 or later):
numpy.unique(img.reshape(-1, img.shape[2]), axis=0)
First, we flatten the rows and columns of the matrix. The matrix now has as many rows as there are pixels in the image, and its columns are the color components of each pixel.
Then we find the unique rows of the flattened matrix; the number of rows in the result is the number of distinct colors.
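A tiny worked example of the flatten-then-unique approach, on a made-up 2x2 image with two distinct colors. Passing return_counts=True also reports how often each color occurs.

```python
import numpy as np

img = np.array([[[1, 0, 0], [0, 1, 0]],
                [[1, 0, 0], [1, 0, 0]]], dtype=np.float32)

pixels = img.reshape(-1, img.shape[2])          # one row per pixel
colors, counts = np.unique(pixels, axis=0, return_counts=True)

print(len(colors))   # 2
print(colors)
print(counts)        # [1 3]
```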
You could do this:
set( tuple(v) for m2d in img for v in m2d )
One straightforward way to do this is to leverage the de-duplication that occurs when casting a list of all pixels as a set:
unique_pixels = np.vstack(list({tuple(r) for r in img.reshape(-1,3)}))
Another way that might be of practical use, depending on your reasons for extracting unique pixels, would be to use Numpy’s histogramdd function to bin image pixels to some pre-specified fidelity as follows (where it is assumed pixel values range from 0 to 1 for a given image channel):
n_bins = 10
bin_edges = np.linspace(0, 1, n_bins + 1)
bin_centres = (bin_edges[0:-1] + bin_edges[1::]) / 2.
hist, _ = np.histogramdd(img.reshape(-1, 3), bins=np.vstack(3 * [bin_edges]))
unique_pixels = np.column_stack([bin_centres[dim] for dim in np.where(hist)])
If for any reason you will need to count the number of times each unique color appears, you can use this:
from collections import Counter
Counter([tuple(colors) for i in img for colors in i])
The question about unique colors (or more generally unique values along a given axis) has also been asked here (in particular, see this answer). If you're looking for the fastest available option, then the "void view" would be your weapon of choice:
axis=2
np.unique(
img.view(np.dtype((np.void, img.dtype.itemsize*img.shape[axis])))
).view(img.dtype).reshape(-1, img.shape[axis])
For any questions related to what the script actually does, I refer the reader to the links above.