I'm trying to apply a function over all pixels of an image in an efficient way (in my specific case I want to do some colour approximation per pixel, but I don't think this is relevant to the problem). The thing is that I've found different approaches to do so, but all of them apply a function over each component of a pixel, whereas what I want is to apply a function that receives a whole pixel (not a pixel component), that is, the 3 RGB components (I guess as a tuple, but I don't care about the format as long as I have the 3 components as parameters to my function).
If you are interested in what I have, this is the non-efficient solution to my issue (it works fine but is too slow):
def closest_colour(pixel, colours):
    closest_colours = sorted(colours, key=lambda colour: colours_distance(colour, pixel))
    return closest_colours[0]
# reduces the colours of the image based on the results of KMeans
# img is an image opened with opencv.imread()
# colours is an array of colours
def image_color_reduction(img, colours):
    start = time.time()
    print("Color reduction...")
    reduced_img = img.copy()[..., ::-1]
    width = reduced_img.shape[0]
    height = reduced_img.shape[1]
    for x in range(width):
        for y in range(height):
            reduced_img[x, y] = closest_colour(reduced_img[x, y], colours)
    end = time.time()
    print(f"Successfully reduced in {end - start} seconds")
    return reduced_img
I've followed this post: PIL - apply the same operation to every pixel, which seemed to be pretty clear and aligned with my issue. I've tried every kind of image format, I've tried multithreading (both with pool.map and pool.imap), I've tried numpy.apply_along_axis, and I finally tried PIL's .point(), which I thought was the most similar solution to what I was looking for. Indeed, if you take a look at the official documentation for .point(), it says exactly: "The function is called once for each possible pixel value." I find this really misleading, because after trying it I realized that "pixel value" in this context does not refer to an RGB tuple, but to each of the 3 RGB components separately (seriously, in what world?).
I would really appreciate it if somebody could share a bit of their experience and shed some light on this issue. Thank you in advance!
(UPDATE)
As per your request, I'm adding more information on the specific problem I am approaching:
Given:

- an image M of size 1022x1080
- an array of colours N with 1 < |N| < 16

Reduce the colours of M by replacing each pixel's colour with the most similar one in N (thanks to your answers I know this is defined as Nearest Neighbor Color Quantization).
This is the missing implementation of colours_distance:
def colours_distance(c1, c2):
    (r1, g1, b1) = c1
    (r2, g2, b2) = c2
    return math.sqrt((r1 - r2)**2 + (g1 - g2)**2 + (b1 - b2)**2)
And these are the imports needed to run this code:
import cv2
import time
import math
The solution shown in my question solves the problem described in slightly less than 40s on average.
Assuming that your image is an (M, N, 3) numpy array, your color table is (K, 3), and you measure color distance as some sane vector norm, you can use scipy's cKDTree (or just KDTree) to optimize and vectorize the lookup for you.
First make a tree out of your color table:
from scipy.spatial import cKDTree

colors = ...  # (K, 3) array of palette colours
color_tree = cKDTree(colors)
Now you can query the tree directly to get the output image:
_, output = color_tree.query(img)
output will be an (M, N) array of indices into colors. The important thing is that KD trees are optimized to perform O(log K) lookups per pixel, not the O(K) or O(K log K) work your current implementation does for every pixel. And since the loops are implemented in C, you'll get a major boost from that too.
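Putting the pieces together, here is a minimal end-to-end sketch; the random test image and the 4-colour palette are stand-ins, not data from the question:

import numpy as np
from scipy.spatial import cKDTree

# Stand-in data: any (M, N, 3) uint8 image and (K, 3) palette will do
img = np.random.randint(0, 256, (1080, 1022, 3), dtype=np.uint8)
colors = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [255, 255, 255]])

color_tree = cKDTree(colors)       # build once per palette
_, output = color_tree.query(img)  # (M, N) indices of nearest palette entries
quantized = colors[output]         # (M, N, 3) image using only palette colours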
Note about "vectorization"
With numpy, there isn't a generic, efficient way to apply an arbitrary function across some axis of your image. For calculations to be fast, numpy needs to do them in its backend, not in Python at all. The same is true of OpenCV: when you call e.g. np.mean() or cv.meanStdDev(), these libraries iterate through your images in C/C++/Fortran, so the code that gets executed needs to live there. However, you want to apply a function you've defined in Python to these values, which means the libraries would have to operate on Python objects directly; that removes all of the efficiency of numpy/OpenCV and is why there's no instantly fast way to do these calculations.

You mentioned df.apply() from Pandas in your post. Note that apply() is actually slow: it does the looping in Python, like you're currently doing, and you generally want to stay away from it for that reason. Numpy and OpenCV don't expose a method like apply() because it is not really a good way to do things.
Generally, to do operations efficiently, you need to vectorize your code, which in Python-land means to only utilize built-in numpy/opencv/etc functions that can operate on your data all at once, without writing loops (or without them implicitly being called in Python, like df.apply()).
Note that nothing here is specific to working on pixels (or their individual components); this is a general problem in trying to achieve fast computations in Python. That is, even if any of the solutions you've tried were to work on a pixel (as opposed to its components), it'd still be slow anyway.
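To make the contrast concrete, here is a toy sketch (the names and sizes are illustrative) comparing a Python-level per-pixel loop with the equivalent vectorized expression; both compute each pixel's distance to a single colour:

import numpy as np

img = np.random.randint(0, 256, (1022, 1080, 3)).astype(float)
target = np.array([255.0, 0.0, 0.0])

# Python-level loop: every pixel is handled as Python objects, so this is slow
def loop_distance(img, target):
    out = np.empty(img.shape[:2])
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            out[x, y] = np.sqrt(((img[x, y] - target) ** 2).sum())
    return out

# Vectorized: one expression, all iteration happens in compiled code
def vectorized_distance(img, target):
    return np.sqrt(((img - target) ** 2).sum(axis=2))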
Solution
The specific problem you give as an example (nearest neighbor color quantization) is non-trivial to make fast, since for each pixel you need to figure out which colour in the list is nearest. If you only have a few colors, like 8, it isn't terrible to just calculate the distance to all of them, but if you are trying to reduce the palette to 256 colors or so, it's a lot of compute. If you have only a few colors, you can vectorize the whole operation by creating a 3d array representing the distance to each color at each (x, y) location, taking the argmin across the color axis, and then using that with a lookup table.
Here's an example implementation, reducing an image to 8 colors. We'll start with an image and some defined colors
In [80]: img.shape
Out[80]: (90, 160, 3)
In [81]: colors
Out[81]:
array([[  0,   0,   0],
       [255,   0,   0],
       [  0, 255,   0],
       [  0,   0, 255],
       [255, 255,   0],
       [255,   0, 255],
       [  0, 255, 255],
       [255, 255, 255]])
Now we want, for each pixel location in the image, the distance to each color (we'll use the absolute-difference distance as an example, but any vectorizable operation here will do). Here we can utilize broadcasting to get a resulting array of shape (h, w, n_colors):
In [83]: distances = np.sum(np.abs(img[..., np.newaxis] - colors.T), axis=2)
In [84]: distances.shape
Out[84]: (90, 160, 8)
Now you want to know which color resulted in the minimum distance for each pixel:
In [87]: nearest_colors = np.argmin(distances, axis=2)
In [88]: nearest_colors
Out[88]:
array([[7, 4, 3, ..., 2, 2, 4],
       [5, 7, 6, ..., 3, 2, 5],
       [5, 3, 7, ..., 3, 5, 7],
       ...,
       [6, 5, 0, ..., 7, 6, 1],
       [1, 6, 5, ..., 2, 0, 3],
       [0, 1, 0, ..., 7, 5, 4]])
So at the first pixel, the closest color was the last one in my color list (all white), at the next pixel to the right the closest color was [255, 255, 0], and so on. Now you can use a lookup table to map from these to their actual color values. The way to do this with numpy is to use fancy indexing:
In [91]: quantized = colors[nearest_colors]
In [92]: quantized.shape
Out[92]: (90, 160, 3)
And here's your image with the new quantized colors.
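For convenience, here are those transcript steps collected into a single runnable snippet; the random test image is a stand-in for the 90x160 example image:

import numpy as np

img = np.random.randint(0, 256, (90, 160, 3), dtype=np.uint8)
colors = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255],
                   [255, 255, 0], [255, 0, 255], [0, 255, 255], [255, 255, 255]])

# (h, w, n_colors) distances via broadcasting, argmin over the color axis,
# then fancy indexing as the lookup table
distances = np.sum(np.abs(img[..., np.newaxis].astype(int) - colors.T), axis=2)
nearest_colors = np.argmin(distances, axis=2)
quantized = colors[nearest_colors]  # shape (90, 160, 3)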
A more efficient solution for this problem is to utilize a kd-tree, as MadPhysicist answered. However, color distance functions can be non-linear, and those distances may not map well to spatial data structures, in which case there are usually specialized implementations or very specific ways to make them faster; but that is closer to research and not appropriate for SO.
For other color quantization algorithms, this question has a lot of good examples: Fast color quantization in OpenCV
Try this solution and see if it is faster:
import cv2
import numpy as np

img = cv2.imread(path)  # path to your input image
result = np.zeros_like(img)

colors_arr = [[0, 0, 255], [255, 0, 0], [0, 255, 0], [0, 255, 255], [255, 0, 255], [255, 255, 0]]

# Normalize the image and the colors to [0, 1].
colors = np.array(colors_arr, np.float32) / 255
img = img.astype(np.float32) / 255

# For each color, make an array of weights (squared distances per pixel).
weights = []
for i in range(colors.shape[0]):
    weights.append(np.sum(np.square(img - colors[i]), axis=2))
weights = np.array(weights, np.float32)

# Find the index of the minimum weight for each pixel.
weights = np.transpose(weights, axes=[1, 2, 0])
color_inds = np.argmin(weights, axis=2)

# Depending on the minimum-weight index, assign the color to the result.
for i in range(len(colors_arr)):
    idx = np.where(color_inds == i)
    result[idx] = colors_arr[i]

cv2.imshow('', result)
cv2.waitKey()
I was interested in the relative performance of linalg.norm() versus cKDTree() for your dataset sizes: a 1022x1080 image with N (palette length) in the range 1..16.
#!/usr/bin/env python3

import numpy as np
import cv2

def QuantizeToGivenPalette(im, palette):
    """Quantize image to a given palette.

    The input image is expected to be a Numpy array.
    The palette is expected to be a list of R,G,B values."""

    # Calculate the distance to each palette entry from each pixel
    distance = np.linalg.norm(im[:,:,None].astype(float) - palette[None,None,:].astype(float), axis=3)

    # Now choose whichever one of the palette colours is nearest for each pixel
    palettised = np.argmin(distance, axis=2).astype(np.uint8)
    return palettised

################################################################################
# main
################################################################################

# Let's get some repeatable randomness
np.random.seed(42)

# Open a colorwheel, resize to match dimensions of question
M = 1022, 1080
im = cv2.imread('colorwheel.png', cv2.IMREAD_COLOR)
im = cv2.resize(im, M, interpolation=cv2.INTER_AREA)

# Make a full 256-entry palette of random colours, but we'll just use the first N
palette = np.random.randint(0, 256, (256, 3), dtype=np.uint8)

# Try quantizing with linalg.norm, for various palette lengths
pLengths = [4, 8, 12, 16]
for pLength in pLengths:
    indices = QuantizeToGivenPalette(im, palette[:pLength])

    # Write image of just palette indices
    cv2.imwrite(f'DEBUG-indices-linalg{pLength}.png', indices)

    # Look up each pixel in the palette, revert to BGR and save
    BGR = palette[indices]
    cv2.imwrite(f'DEBUG-result-linalg{pLength}.png', BGR)

################################################################################
# NOW DO SAME THING BUT WITH KDTREE
################################################################################
from scipy.spatial import cKDTree

# Try quantizing with cKDTree, for various palette lengths
for pLength in pLengths:
    # Build our tree from the palette, only necessary once for any given palette
    treeFromPalette = cKDTree(palette[:pLength])

    # Look up nearest palette index for each pixel in the image
    _, indices = treeFromPalette.query(im)

    # Write image of just palette indices
    cv2.imwrite(f'DEBUG-indices-cKDTree{pLength}.png', indices.astype(np.uint8))

    # Look up each pixel in the palette, revert to BGR and save
    BGR = palette[indices]
    cv2.imwrite(f'DEBUG-result-cKDTree{pLength}.png', BGR)
I used a colour wheel image as input and resized it to your stated dimensions. The results were the same for both methods, for 4, 8 and 16 colours.
The interesting thing was the timings, all in milliseconds:

N    norm()   cKDTree()
4    147      485
8    307      308
12   449      530
16   601      542

If we plot those, you can see that cKDTree() only really comes into its own at the higher end of your N-values.
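If you want to reproduce this kind of measurement yourself, a sketch along these lines should work; random data stands in for the colour wheel, and absolute numbers will differ by machine:

import timeit
import numpy as np
from scipy.spatial import cKDTree

im = np.random.randint(0, 256, (1080, 1022, 3), dtype=np.uint8)
palette = np.random.randint(0, 256, (16, 3), dtype=np.uint8)

def with_norm():
    d = np.linalg.norm(im[:, :, None].astype(float) - palette[None, None, :].astype(float), axis=3)
    return np.argmin(d, axis=2)

tree = cKDTree(palette)  # built once per palette
def with_tree():
    return tree.query(im)[1]

print('norm   :', min(timeit.repeat(with_norm, number=1, repeat=3)), 's')
print('cKDTree:', min(timeit.repeat(with_tree, number=1, repeat=3)), 's')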
Keywords: Python, image processing, KDTree, linalg.norm, palette, quantisation.
Editor's comment:

How to count pixel occurrences in an image?
I have a set of images where each pixel consists of 3 integers in the range 0-255.
I am interested in finding one pixel that is "representative" (as much as possible) for the entire pixel-population as a whole, and that pixel must occur in the pixel-population.
I think determining which pixel is the most common (the mode) in my set of images makes the most sense.
I am using python, but I am not sure how to go about it.
The images are stored as a numpy array with dimensions [n, h, w, c], where n is the number of images, h is the height, w is the width and c is the channels (RGB).
I'm going to assume you need to find the most common element, which as Cris Luengo mentioned is called the mode. I'm also going to assume that the bit depth of the channels is 8-bit (value between 0 and 255, i.e. modulo 256).
Here is an implementation independent approach:
The aim is to maintain a count of all the different kinds of pixels encountered. It makes sense to use a dictionary for this, which would be of the form {pixel_value : count}.
Once this dictionary is populated, we can find the pixel with the highest count.
Now, 'pixels' are not hashable and hence cannot be stored in a dictionary directly. We need a way to assign an integer (which I'll be referring to as the pixel_value) to each unique pixel, i.e., you should be able to convert pixel_value <--> RGB value of a pixel.
This function converts RGB values to an integer in the range of 0 to 16,777,215:
def get_pixel_value(pixel):
    return pixel.red + 256*pixel.green + 256*256*pixel.blue
and to convert pixel_value back into RGB values:
def get_rgb_values(pixel_value):
    red = pixel_value % 256
    pixel_value //= 256
    green = pixel_value % 256
    pixel_value //= 256
    blue = pixel_value
    return [red, green, blue]
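A quick round-trip check of the two helpers; the namedtuple pixel type is just a stand-in for whatever pixel object your image library provides:

from collections import namedtuple

Pixel = namedtuple('Pixel', ['red', 'green', 'blue'])  # stand-in pixel type

p = Pixel(12, 34, 56)
v = get_pixel_value(p)  # 12 + 34*256 + 56*65536 = 3678732
assert get_rgb_values(v) == [12, 34, 56]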
This function can find the most frequent pixel in an image:
def find_most_common_pixel(image):
    histogram = {}  # Dictionary keeps count of different kinds of pixels in image
    for pixel in image:
        pixel_val = get_pixel_value(pixel)
        if pixel_val in histogram:
            histogram[pixel_val] += 1  # Increment count
        else:
            histogram[pixel_val] = 1  # pixel_val encountered for the first time
    mode_pixel_val = max(histogram, key=histogram.get)  # Find pixel_val whose count is maximum
    return get_rgb_values(mode_pixel_val)  # Return a list containing the RGB value of the mode pixel
If you wish to find the most frequent pixel in a set of images, simply add another loop for image in image_set and populate the dictionary for all pixel_values in all images.
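That extension might look like this (a minimal sketch reusing get_pixel_value and get_rgb_values from above):

def find_most_common_pixel_in_set(image_set):
    histogram = {}
    for image in image_set:  # the extra loop over the images
        for pixel in image:
            pixel_val = get_pixel_value(pixel)
            histogram[pixel_val] = histogram.get(pixel_val, 0) + 1
    mode_pixel_val = max(histogram, key=histogram.get)
    return get_rgb_values(mode_pixel_val)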
You can iterate over the x/y of the image:

- a pixel will be img_arr[x, y, :] (the : selects the RGB channels)
- add each pixel to a Counter (from collections)

Here is an example of the concept on an image:
from PIL import Image
import numpy as np
from collections import Counter

# img_path is the path to your image
cnt = Counter()
img = Image.open(img_path)
img_arr = np.array(img)
for x in range(img_arr.shape[0]):
    for y in range(img_arr.shape[1]):
        cnt[str(img_arr[x, y, :])] += 1

print(cnt)
# Counter({'[255 255 255]': 89916, '[143 143 143]': 1491, '[0 0 0]': 891, '[211 208 209]': 185, ...
A more efficient way to do it is by using the power of numpy and some math manipulation (because we know the values are bounded to [0, 255]):
img = Image.open(img_path)
# widen the dtype so the multiplications below can't overflow uint8
img_arr = np.array(img).astype(np.uint32)
pixels_arr = (img_arr[:, :, 0] + img_arr[:, :, 1]*256 + img_arr[:, :, 2]*(256**2)).flatten()
cnt = Counter(pixels_arr)
# print(cnt)
# Counter({16777215: 89916, 9408399: 1491, 0: 891, 13750483: 185, 14803425: 177, 5263440: 122 ...
# print(cnt.most_common(1))
# [(16777215, 89916)]
pixel_value = cnt.most_common(1)[0][0]
Now a conversion back to the original 3 values is exactly like Aayush Mahajan has written in his answer, but I've shortened it for the sake of simplicity:
r, g, b = pixel_value % 256, (pixel_value//256) % 256, pixel_value//(256**2)
So you are using the power of numpy's fast computation (a significant improvement in run time), and Counter, which is an extension of the Python dictionary, dedicated to counting.
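As an aside (not part of the answer above), modern numpy can also do the whole count in one call with np.unique over the pixel rows, which avoids the manual integer encoding entirely:

import numpy as np

# img_arr: (h, w, 3) array of pixels
pixels = img_arr.reshape(-1, 3)
values, counts = np.unique(pixels, axis=0, return_counts=True)
mode_pixel = values[np.argmax(counts)]  # the most common [R, G, B] value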
I have a big RGB image as a numpy array. I want to set every pixel that has R=0, G=0, B=0 to R=255, G=0, B=0.
What is the fastest way?
I tried:
for pix in result:
    if np.all(np.logical_and(pix[0]==pix[1], pix[2]==0, pix[2]==pix[1])):
        pix[0] = 255
but this way I don't get a single pixel at a time. Is there a similar way that doesn't iterate over the indices?
Here is a vectorized solution. Your image is basically a w by h by 3 (colors) array. We can make use of the broadcasting rules, which are not easy to grasp but are very powerful.

Basically, we compare the whole array to a 3-vector with the values that you are looking for. Due to the broadcasting rules, Numpy will then compare each pixel to that 3-vector and tell you if it matched (so in this specific case, whether the red, green and blue matched). You end up with a boolean array of trues and falses of the same shape as the image.

Now we only want the pixels where all three colors matched. For that we use the "all" method, which is true if all values of an array are true. If we apply that along a certain axis, in this case the color axis, we get a w by h array that is true wherever all the colors matched.

Now we can apply this 2D boolean mask to our original w by h by 3 array, get the pixels that match our color, and reassign them, again with broadcasting.

Here is the example code:
import numpy as np

# create a 2x2x3 image with ones
img = np.ones((2, 2, 3))

# make the off-diagonal pixels into zeros
img[0, 1] = [0, 0, 0]
img[1, 0] = [0, 0, 0]

# find the all-zero pixels with the mask
# (of course any other color combination would work just as well)
# ... and apply "all" along the color axis
mask = (img == [0., 0., 0.]).all(axis=2)

# apply the mask to overwrite the pixels
img[mask] = [255, 0, 0]
Since all values are non-negative, a simple and efficient way is:

img[img.sum(axis=2)==0, 0] = 255

img.sum(axis=2)==0 selects the all-black pixels in the first two dimensions, and 0 selects the red channel in the third.
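A quick self-contained check of that one-liner (the tiny test array is illustrative):

import numpy as np

img = np.zeros((2, 2, 3), dtype=np.uint8)
img[0, 0] = [10, 20, 30]  # one non-black pixel

img[img.sum(axis=2) == 0, 0] = 255  # black pixels get R=255

print(img[0, 0])  # [10 20 30]  -- untouched
print(img[1, 1])  # [255   0   0] -- was black, now red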