Binarize image "manually" - python

I am trying to modify an image by looping through the pixels. The code is this
for x in range(image.shape[0]):
    for y in range(image.shape[1]):
        if tuple(image[x, y]) in possible_colors_rgb:
            image[x, y] = [255, 255, 255]
        else:
            image[x, y] = [0, 0, 0]
In possible_colors_rgb I have a list of tuples of 3 elements that represent RGB values. The problem is that the if never evaluates as true, even though I am sure that some pixels should satisfy the equality. How can I work out what's wrong?

cv2 keeps pixel values in BGR order. If your tuples in possible_colors_rgb are in RGB order, they won't match.
possible_colors_bgr = [(b,g,r) for r,g,b in possible_colors_rgb]
If the number of colors is large, you might consider using a set instead of a tuple or list for the possible colors, for faster membership tests.
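Putting both fixes together, a minimal sketch of the corrected loop (the file name and the colour list are placeholders, not from the question):
import cv2

image = cv2.imread("street.png")  # hypothetical input; cv2 loads pixels in BGR order

possible_colors_rgb = [(255, 0, 0), (0, 128, 255)]  # example RGB tuples
# Convert each tuple to BGR and store in a set for O(1) membership tests
possible_colors_bgr = {(b, g, r) for r, g, b in possible_colors_rgb}

for x in range(image.shape[0]):
    for y in range(image.shape[1]):
        if tuple(image[x, y]) in possible_colors_bgr:
            image[x, y] = [255, 255, 255]
        else:
            image[x, y] = [0, 0, 0]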

Related

How to apply an operation to every PIXEL (not every rgb component!) of a python image (either using numpy, opencv or PIL)?

I'm trying to apply a function over all pixels of an image (in my specific case I want to do some colour approximation per pixel, but I don't think this is relevant to the problem) in an efficient way. The thing is that I've found different approaches, but all of them apply a function over each component of a pixel, whereas what I want is to apply a function that receives a whole pixel (not a single component), that is, the 3 RGB components (I guess as a tuple, but I don't care about the format as long as I have the 3 components as parameters in my function).
If you are interested on what I have, this is the non-efficient solution to my issue (works fine but it is too slow):
def closest_colour(pixel, colours):
    closest_colours = sorted(colours, key=lambda colour: colours_distance(colour, pixel))
    return closest_colours[0]
# reduces the colours of the image based on the results of KMeans
# img is an image opened with cv2.imread()
# colours is an array of colours
def image_color_reduction(img, colours):
    start = time.time()
    print("Color reduction...")
    reduced_img = img.copy()[..., ::-1]
    width = reduced_img.shape[0]
    height = reduced_img.shape[1]
    for x in range(width):
        for y in range(height):
            reduced_img[x, y] = closest_colour(reduced_img[x, y], colours)
    end = time.time()
    print(f"Successfully reduced in {end-start} seconds")
    return reduced_img
I've followed this post: PIL - apply the same operation to every pixel, which seemed to be pretty clear and aligned with my issue. I've tried every kind of image format, I've tried multithreading (both with pool.map and pool.imap), I've tried numpy.apply_along_axis, and I finally tried PIL's .point(), which I thought was the closest solution to what I was looking for. Indeed, if you take a look at the official documentation for .point(), it says exactly: The function is called once for each possible pixel value. I find this really misleading, because after trying it I realized that pixel value in this context does not refer to an RGB tuple, but to each of the 3 RGB components (seriously, in what world?).
I would really appreciate it if somebody could share a bit of their experience and give me some light on this issue. Thank you in advance!!
(UPDATE)
As per your request I add more information on the specific problem I am approaching:
Given:
- an image M of size 1022*1080
- an array of colours N with size 1 < |N| < 16

Reduce the colours of M by replacing each pixel's colour with the most similar one in N (thanks to your answers I know this is defined as Nearest Neighbor Color Quantization).
This is the missing implementation of colours_distance:
def colours_distance(c1, c2):
    (r1, g1, b1) = c1
    (r2, g2, b2) = c2
    return math.sqrt((r1 - r2)**2 + (g1 - g2)**2 + (b1 - b2)**2)
And these are the imports needed to run this code:
import cv2
import time
import math
The solution shown in my question solves the problem described in slightly less than 40s on average.
Assuming that your image is an (M, N, 3) numpy array, your color table is (K, 3), and you measure color distance as some sane vector norm, you can use scipy's cKDTree (or just KDTree) to optimize and vectorize the lookup for you.
First make a tree out of your color table:
colors = ... # K, 3 array
color_tree = cKDTree(colors)
Now you can query the tree directly to get the output image:
_, output = color_tree.query(img)
output will be an (M, N) array of indices into colors. The important thing is that k-d trees are optimized to perform O(log K) lookups per pixel, not O(K) or O(K log K) as in your current implementation. Since the loops are implemented in C, you'll get a major boost from that too.
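For completeness, a minimal end-to-end sketch of this approach (the random image and the small colour table are stand-ins for your real data):
import numpy as np
from scipy.spatial import cKDTree

img = np.random.randint(0, 256, (1022, 1080, 3))   # stand-in for your loaded image
colors = np.array([[0, 0, 0], [255, 0, 0],
                   [0, 255, 0], [255, 255, 255]])  # K x 3 colour table

color_tree = cKDTree(colors)       # build once per colour table
_, output = color_tree.query(img)  # (M, N) indices of the nearest colour
quantized = colors[output]         # (M, N, 3) image with reduced colours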
Note about "vectorization"
With numpy, there isn't a generic efficient way to apply an arbitrary function across some axis of your image. In order to do calculations efficiently, numpy needs to be able to do those calculations in the backend for you, not using Python at all. The same is true of OpenCV. When you call e.g. np.mean() or cv.meanStdDev() or similar, these libraries iterate through your images in C/C++/Fortran/etc, so the code that gets executed needs to live there. However, you want to apply a function you've defined in Python on these values, which means you need to operate on Python objects directly, which removes all of the efficiency of doing operations in numpy/OpenCV/etc and is why there's no instantly fast way to do these calculations. You mentioned df.apply() from Pandas: note that apply() is actually slow, as it does the looping in Python like you're currently doing, and generally you want to stay away from it for that reason. Numpy and OpenCV don't expose a method like apply() because it is not really a good way to do things.
Generally, to do operations efficiently, you need to vectorize your code, which in Python-land means to only utilize built-in numpy/opencv/etc functions that can operate on your data all at once, without writing loops (or without them implicitly being called in Python, like df.apply()).
Note that nothing here is specific to working on pixels (or their individual components), this is a general problem in trying to achieve fast computations in Python. That is, even if any of the solutions you've tried were to work on a pixel (as opposed to components), it'd still be slow anyways.
Solution
The specific problem you give as an example (nearest neighbor color quantization) is non-trivial to make fast, since for each pixel you need to figure out where you sit nearest in the list of colors. If you only have a few colors, like 8, it isn't terrible to just calculate the distance to all, but if you are trying to reduce the palette to 256 colors or something like that, it's a lot of compute. If you have only a few colors, then you can vectorize the whole operation by creating a 3d array representing the distance to each color at each x, y location, and taking the argmin across the color axis, which you can then use with a lookup table.
Here's an example implementation, reducing an image to 8 colors. We'll start with an image and some defined colors
In [80]: img.shape
Out[80]: (90, 160, 3)
In [81]: colors
Out[81]:
array([[  0,   0,   0],
       [255,   0,   0],
       [  0, 255,   0],
       [  0,   0, 255],
       [255, 255,   0],
       [255,   0, 255],
       [  0, 255, 255],
       [255, 255, 255]])
Now we want, for each pixel location in the image, the distance to each color (we'll use the absolute-difference distance as an example, but any vectorizable operation here will do). Here we can utilize broadcasting to get a resulting array of shape (h, w, n_colors):
In [83]: distances = np.sum(np.abs(img[..., np.newaxis] - colors.T), axis=2)
In [84]: distances.shape
Out[84]: (90, 160, 8)
Now you want to know which color resulted in the minimum distance for each pixel:
In [87]: nearest_colors = np.argmin(distances, axis=2)
In [88]: nearest_colors
Out[88]:
array([[7, 4, 3, ..., 2, 2, 4],
[5, 7, 6, ..., 3, 2, 5],
[5, 3, 7, ..., 3, 5, 7],
...,
[6, 5, 0, ..., 7, 6, 1],
[1, 6, 5, ..., 2, 0, 3],
[0, 1, 0, ..., 7, 5, 4]])
So at the first pixel, the closest color was the last one in my color list (all white), at the next pixel to the right the closest color was [255, 255, 0], and so on. Now you can use a lookup table to map from these to their actual color values. The way to do this with numpy is to use fancy indexing:
In [91]: quantized = colors[nearest_colors]
In [92]: quantized.shape
Out[92]: (90, 160, 3)
And here's your image with the new quantized colors.
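Wrapping the three steps above into one function (a sketch; the int cast is my addition, to guard against uint8 wrap-around in the subtraction):
import numpy as np

def quantize(img, colors):
    # Cast to int so uint8 inputs can't wrap around in the subtraction
    diffs = img[..., np.newaxis].astype(int) - colors.T  # (h, w, 3, n_colors)
    distances = np.sum(np.abs(diffs), axis=2)            # (h, w, n_colors)
    nearest_colors = np.argmin(distances, axis=2)        # (h, w) palette indices
    return colors[nearest_colors]                        # (h, w, 3) quantized image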
A more efficient solution for this problem is to utilize a k-d tree, as MadPhysicist answered. However, color distance functions can be non-linear and those distances may not map well to spatial data structures, in which case there are usually specialized implementations or very specific ways to make them faster, but this is closer to research and not appropriate for SO.
For other color quantization algorithms, this question has a lot of good examples: Fast color quantization in OpenCV
Try this solution and see if it is faster:
img = cv2.imread(path)
result = np.zeros_like(img)
colors_arr = [[0, 0, 255], [255, 0, 0], [0, 255, 0], [0, 255, 255], [255, 0, 255], [255, 255, 0]]
# Normalizing images and colors to 1.
colors = np.array(colors_arr, np.float32) / 255
img = img.astype(np.float32) / 255
# For each color making an array of weights.
weights = []
for i in range(colors.shape[0]):
    weights.append(np.sum(np.square(img - colors[i]), axis=2))
weights = np.array(weights, np.float32)
# Finding the index of minimum weight
weights = np.transpose(weights, axes=[1, 2, 0])
color_inds = np.argmin(weights, axis=2)
# Depending on minimum weight index assigning the color to the result
for i in range(len(colors_arr)):
    idx = np.where(color_inds == i)
    result[idx] = colors_arr[i]
cv2.imshow('', result)
cv2.waitKey()
I was interested in the relative performance of using linalg.norm() versus the cKDTree() for your dataset sizes of 1022x1080 image with N (palette length) in the range 1..16.
#!/usr/bin/env python3

import numpy as np
import cv2

def QuantizeToGivenPalette(im, palette):
    """Quantize image to a given palette.

    The input image is expected to be a Numpy array.
    The palette is expected to be a list of R,G,B values."""

    # Calculate the distance to each palette entry from each pixel
    distance = np.linalg.norm(im[:,:,None].astype(float) - palette[None,None,:].astype(float), axis=3)

    # Now choose whichever one of the palette colours is nearest for each pixel
    palettised = np.argmin(distance, axis=2).astype(np.uint8)
    return palettised

################################################################################
# main
################################################################################

# Let's get some repeatable randomness
np.random.seed(42)

# Open a colorwheel, resize to match dimensions of question
M = 1022, 1080
im = cv2.imread('colorwheel.png', cv2.IMREAD_COLOR)
im = cv2.resize(im, M, interpolation=cv2.INTER_AREA)

# Make a full 256-entry palette of random colours, but we'll just use the first N
palette = np.random.randint(0, 256, (256, 3), dtype=np.uint8)

# Try quantizing with linalg.norm, for various palette lengths
pLengths = [4, 8, 12, 16]
for pLength in pLengths:
    indices = QuantizeToGivenPalette(im, palette[:pLength])

    # Write image of just palette indices
    cv2.imwrite(f'DEBUG-indices-linalg{pLength}.png', indices)

    # Look up each pixel in the palette, revert to BGR and save
    BGR = palette[indices]
    cv2.imwrite(f'DEBUG-result-linalg{pLength}.png', BGR)

################################################################################
# NOW DO SAME THING BUT WITH KDTREE
################################################################################
from scipy.spatial import cKDTree

# Try quantizing with cKDTree, for various palette lengths
for pLength in pLengths:
    # Build our tree from the palette, only necessary once for any given palette
    treeFromPalette = cKDTree(palette[:pLength])

    # Lookup nearest indices for each pixel in image
    _, indices = treeFromPalette.query(im)

    # Write image of just palette indices
    cv2.imwrite(f'DEBUG-indices-cKDTree{pLength}.png', indices)

    # Look up each pixel in the palette, revert to BGR and save
    BGR = palette[indices]
    cv2.imwrite(f'DEBUG-result-cKDTree{pLength}.png', BGR)
I used this image as input and resized it to your stated dimensions.
The results were the same for both methods, at 4, 8 and 16 colours.
The interesting thing was the timings - all in milliseconds:
N    norm()   cKDTree()
 4    147      485
 8    307      308
12    449      530
16    601      542
If we plot those, you can see that cKDTree() only really comes into its own at the higher end of your N-values.
Keywords: Python, image processing, KDTree, linalg.norm, palette, quantisation, prime.

Numpy Arrays - Replacing Elements

I am new to numpy and I want to replace specific elements in a 3D numpy array. My 3D numpy array represents an image. The shape of the array is (1080, 1920, 3); the number 3 represents the RGB values of each pixel in the image.
All I want to know is how to change all the elements that are equal to [0,0,0] into [255,255,255], which means I want all black pixels in the image to become white. How can I do it?
Thanks!
Say you have stored your array in data; this should work:
data[(data == 0).all(axis=2)] = [255, 255, 255]
This is due to numpy's broadcasting rules, which compare each value to 0, resulting in a boolean array with True values where they compare equal and False elsewhere.
The next step is to take only those sub-arrays where all of the individual values do compare equal, with .all(axis=2) - the last axis, which is the one you want.
Then, with the resulting boolean array, you can index back into data, which will give you only those sub-arrays equal to [0, 0, 0], and set those to [255, 255, 255].
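To see the one-liner in action, here is a quick self-contained check on a toy array:
import numpy as np

data = np.zeros((2, 2, 3), dtype=np.uint8)  # all-black toy image
data[0, 1] = [10, 20, 30]                   # one non-black pixel
data[(data == 0).all(axis=2)] = [255, 255, 255]
print(data[0, 0])  # [255 255 255]: black pixel turned white
print(data[0, 1])  # [ 10  20  30]: left untouched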

How to find the most frequent pixel value in an image?

Editor's comment:
How to count pixel occurrences in an image?
I have a set of images where each pixel consists of 3 integers in the range 0-255.
I am interested in finding one pixel that is "representative" (as much as possible) for the entire pixel-population as a whole, and that pixel must occur in the pixel-population.
I think determining which pixel is the most common (the mode) in my set of images makes the most sense.
I am using python, but I am not sure how to go about it.
The images are stored as a numpy array with dimensions [n, h, w, c], where n is the number of images, h is the height, w is the width and c is the channels (RGB).
I'm going to assume you need to find the most common element, which as Cris Luengo mentioned is called the mode. I'm also going to assume that the bit depth of the channels is 8-bit (value between 0 and 255, i.e. modulo 256).
Here is an implementation independent approach:
The aim is to maintain a count of all the different kinds of pixels encountered. It makes sense to use a dictionary for this, which would be of the form {pixel_value : count}.
Once this dictionary is populated, we can find the pixel with the highest count.
Now, 'pixels' are not hashable and hence cannot be stored in a dictionary directly. We need a way to assign an integer (which I'll be referring to as the pixel_value) to each unique pixel, i.e., you should be able to convert pixel_value <--> RGB value of a pixel.
This function converts RGB values to an integer in the range of 0 to 16,777,215:
def get_pixel_value(pixel):
    return pixel.red + 256*pixel.green + 256*256*pixel.blue
and to convert pixel_value back into RGB values:
def get_rgb_values(pixel_value):
    red = pixel_value % 256
    pixel_value //= 256
    green = pixel_value % 256
    pixel_value //= 256
    blue = pixel_value
    return [red, green, blue]
This function can find the most frequent pixel in an image:
def find_most_common_pixel(image):
    histogram = {}  # Dictionary keeps count of different kinds of pixels in image
    for pixel in image:
        pixel_val = get_pixel_value(pixel)
        if pixel_val in histogram:
            histogram[pixel_val] += 1  # Increment count
        else:
            histogram[pixel_val] = 1  # pixel_val encountered for the first time
    mode_pixel_val = max(histogram, key=histogram.get)  # Find pixel_val whose count is maximum
    return get_rgb_values(mode_pixel_val)  # Returns a list containing the RGB value of the mode pixel
If you wish to find the most frequent pixel in a set of images, simply add another loop for image in image_set and populate the dictionary for all pixel_values in all images.
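A minimal sketch of that extension, reusing the two helpers above:
def find_most_common_pixel_in_set(image_set):
    histogram = {}
    for image in image_set:      # the extra loop over the set of images
        for pixel in image:
            pixel_val = get_pixel_value(pixel)
            histogram[pixel_val] = histogram.get(pixel_val, 0) + 1
    mode_pixel_val = max(histogram, key=histogram.get)
    return get_rgb_values(mode_pixel_val)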
You can iterate over the x/y of the image.
A pixel will be img_arr[x, y, :] (the : for the RGB channels).
You add each pixel to a Counter (from collections).
Here is an example of the concept over an Image
from PIL import Image
import numpy as np
from collections import Counter

# img_path is the path to your image
cnt = Counter()
img = Image.open(img_path)
img_arr = np.array(img)
for x in range(img_arr.shape[0]):
    for y in range(img_arr.shape[1]):
        cnt[str(img_arr[x, y, :])] += 1
print(cnt)
# Counter({'[255 255 255]': 89916, '[143 143 143]': 1491, '[0 0 0]': 891, '[211 208 209]': 185, ...
A more efficient way to do it is by using the power of numpy and some math manipulation (because we know the values are bound to [0, 255]):
img = Image.open(img_path)
img_arr = np.array(img).astype(np.uint32)  # widen the dtype so the *256 multiplications can't overflow uint8
pixels_arr = (img_arr[:, :, 0] + img_arr[:, :, 1]*256 + img_arr[:, :, 2]*(256**2)).flatten()
cnt = Counter(pixels_arr)
# print(cnt)
# Counter({16777215: 89916, 9408399: 1491, 0: 891, 13750483: 185, 14803425: 177, 5263440: 122 ...
# print(cnt.most_common(1))
# [(16777215, 89916)]
pixel_value = cnt.most_common(1)[0][0]
Now the conversion back to the original 3 values is exactly as Aayush Mahajan has written in his answer, but I've shortened it for the sake of simplicity:
r, g, b = pixel_value%256, (pixel_value//256)%256, pixel_value//(256**2)
So you are using the power of numpy's fast computation (a significant improvement in run time), and you use Counter, which is an extension of the python dictionary, dedicated to counting.
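As a further alternative (my addition, not from either answer): numpy can count unique rows itself with np.unique, which avoids the encode/decode round-trip entirely:
import numpy as np

def most_common_pixel(img_arr):
    # Flatten to (n_pixels, 3), then count unique rows directly
    pixels = img_arr.reshape(-1, img_arr.shape[-1])
    values, counts = np.unique(pixels, axis=0, return_counts=True)
    return values[np.argmax(counts)]  # RGB triple of the mode pixel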

how would you take the white pixels generated by a mask and put them into a list

I have a picture of a street (the street has small variations of color), and with some help I was able to crop part of the street for a sample of the color. I then took the color, calculated the mean and stdev, and created the lower and upper boundary for a mask.
I took the mask output and ran closing = cv2.morphologyEx(output, cv2.MORPH_CLOSE, kernel)
I would like to take all the white pixels that are in closing and create a list of their x,y coordinates.
Then I would like to take the x,y coordinates and create another list of their b,g,r values,
so that I can run it back through
blue=cropimg[:,:,0]; green=cropimg[:,:,1]; red=cropimg[:,:,2];
See image, showing: original, mask, closing, cropped areas:
It sounds like you're simply looking for:
rgb = cropimg[mask,:] # or mask > 0, if mask is not a boolean array
which will return an Nx3 array of the pixels under the mask
ebeneditos, I found that this works, but for large images it takes too much time:
coords, colors = [], []
for y in range(closing.shape[0]):
    for x in range(closing.shape[1]):
        if np.all(closing[y, x] > 0):
            coords.append((y, x))
            colors.append(img[y, x])
In order to get the lists you asked, you can do it like:
coords, colors = [], []
for y in range(closing.shape[0]):
    for x in range(closing.shape[1]):
        if np.all(closing[y, x] > 0):
            coords.append((y, x))
            colors.append(original[y, x])
Thus you get a list with the white coordinates and the BGR values in those coordinates in the original image.
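If the double loop is too slow on large images (as noted in the comment above), here is a vectorized sketch of the same extraction, assuming the same closing and original arrays:
import numpy as np

# Boolean mask of the white pixels in closing (handles 1- or 3-channel masks)
mask = np.all(closing > 0, axis=2) if closing.ndim == 3 else closing > 0
coords = np.argwhere(mask)  # N x 2 array of (y, x) coordinates
colors = original[mask]     # N x 3 array of BGR values at those coordinates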

Fast pixel comparisons with opencv and python [duplicate]

I'm looping through this image pixel by pixel and it's really slow. I have the 2 images I'm comparing sliced and flattened so each element is a 3-component RGB value, named e1 and e2. Is there some method using opencv or numpy that can speed this up?
What I'm doing here is performing pixel comparisons on images with binned colors (8 colors).
I'm reading from a JPEG, though, so what should be [255,0,0] becomes [230,12,11]; what clean_key does is threshold the values back to the clean ones. Then I increment the count for this combination in a dictionary. So for example dict["255,0,0 0,0,255"] might occur 300 times in this image, which means there were 300 instances where im1 had a red pixel and im2 had a blue pixel.
for e1, e2 in itertools.izip(im1_slice.reshape(-1,3), im2_slice.reshape(-1,3)):
    key = str(clean_key(e1)) + str(clean_key(e2))
    if key in proportion_dict:
        proportion_dict[key] += 1
    else:
        proportion_dict[key] = 1
return (proportion_dict, total)
The way you want to do this is first compare each image to the color you want to see in that image, which makes a boolean mask where that image is the given color. You don't need to flatten the images to do this. This can be done by saying:
image == color
This works fine for grayscale images, but if color is actually along a third dimension, you want to make sure everything along that dimension matches (i.e., you want all of the r, g, and b components to match, so you use np.all along the last axis (-1 gives the last axis):
np.all(image == color, axis=-1)
Which gives a 2d array of booleans where each element is True if that pixel matches color and False if not. Do this for both images (and both colors) and then you'll have a mask where the color matches both images:
np.all(im1==c1, -1) & np.all(im2==c2, -1)
This not only tells you how many pixels match, but where they are (you could plot the above line and see dot at the points where they match). If you just want the count, just use np.sum on the mask which counts True as 1, and False as 0. All together:
def compare_colors(im1, im2, c1, c2):
    matches = np.all(im1==c1, -1) & np.all(im2==c2, -1)
    return matches.sum()
And to use/test it with random data:
>>> a = np.random.choice([0, 255], (20,20,3))
>>> b = np.random.choice([0, 255], (20,20,3))
>>> compare_colors(a, b, [255, 0, 255], [0, 255, 0])
12
But before you do that, with your real input, you want to "clean" your colors by a threshold. You could easily do that with np.where which looks at each element of an array, and if a condition is met, gives one thing, and if not, gives another. Here, if the value is less than 128, it uses 0, and otherwise uses 255:
np.where(a<128, 0, 255)
In general, you could write a function like this, with the values above as defaults:
def clean(a, thresh=128, under=0, over=255):
    return np.where(a < thresh, under, over)
Of course to build up your dict of counts, you still have to loop through each color combination, but that's a short loop (8*8). Here's a full run through:
# some fake data (has values between 0 and 255 for r, g, and b)
H, W = 20, 20
a = np.random.randint(0, 256, (H,W,3))
b = np.random.randint(0, 256, (H,W,3))
# clean the images:
ac = clean(a)
bc = clean(b)
# build a list of all pairs of all 8 colors using itertools.product:
col_combos = itertools.product(itertools.product((0,255), repeat=3), repeat=2)
# now apply the comparison to the images for each pair of colors
col_dict = { (c1,c2): compare_colors(ac, bc, c1, c2) for c1,c2 in col_combos }
Then, the keys for col_dict are actually tuples of tuples, which are much easier to deal with than strings, in my opinion. Here's how you'd access an example key:
>>> col_dict[((0, 255, 255), (255, 0, 255))]
8
