I want to convert a 3-channel RGB image to an index image with Python. It's used for handling the labels when training a deep net for semantic segmentation. By index image I mean a single-channel image whose pixel values are indices, starting from zero; it should of course have the same size as the input. The conversion is based on the following mapping, given as a Python dict:
color2index = {
    (255, 255, 255) : 0,
    (0, 0, 255) : 1,
    (0, 255, 255) : 2,
    (0, 255, 0) : 3,
    (255, 255, 0) : 4,
    (255, 0, 0) : 5
}
I've implemented a naive function:
import numpy as np

def im2index(im):
    """
    Turn a 3-channel RGB image into a 1-channel index image.
    """
    assert len(im.shape) == 3
    height, width, ch = im.shape
    assert ch == 3
    m_label = np.zeros((height, width, 1), dtype=np.uint8)
    for w in range(width):
        for h in range(height):
            b, g, r = im[h, w, :]  # cv2.imread() returns channels in BGR order
            m_label[h, w, :] = color2index[(r, g, b)]
    return m_label
The input im is a numpy array created by cv2.imread(). However, this code is really slow.
Since im is a numpy array, I first tried numpy's ufuncs with something like this:
RGB2index = np.frompyfunc(lambda x: color2index[tuple(x)], 1, 1)
indices = RGB2index(im)
But it turns out that the ufunc takes only one scalar element at a time; I was unable to give the function all three channel values (the RGB triple) at once.
So, is there any other way to do the optimization?
The mapping doesn't have to be done that way if a more efficient data structure exists. I noticed that accessing a Python dict does not cost much time, but casting a numpy array to a (hashable) tuple does.
PS:
One idea I had is to implement a kernel in CUDA, but that would be more complicated.
UPDATE 1:
Dan Mašek's answer works fine, but first we have to convert the RGB image to grayscale, which could be problematic when two colors have the same grayscale value.
I paste the working code here; hope it can help others.
lut = np.ones(256, dtype=np.uint8) * 255
lut[[255,29,179,150,226,76]] = np.arange(6, dtype=np.uint8)
im_out = cv2.LUT(cv2.cvtColor(im, cv2.COLOR_BGR2GRAY), lut)
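For reference, those grayscale values can be derived from the BT.601 weights that cv2.COLOR_BGR2GRAY uses (0.299 R + 0.587 G + 0.114 B); a small sketch to recompute them and check that no two colors collide:

import numpy as np

colors = [(255, 255, 255), (0, 0, 255), (0, 255, 255),
          (0, 255, 0), (255, 255, 0), (255, 0, 0)]  # the (R, G, B) keys of color2index
grays = [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in colors]
print(grays)  # [255, 29, 179, 150, 226, 76]
assert len(set(grays)) == len(colors), "two colors collide in grayscale"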
What about this?
color2index = {
    (255, 255, 255) : 0,
    (0, 0, 255) : 1,
    (0, 255, 255) : 2,
    (0, 255, 0) : 3,
    (255, 255, 0) : 4,
    (255, 0, 0) : 5
}
def rgb2mask(img):
    assert len(img.shape) == 3
    height, width, ch = img.shape
    assert ch == 3

    W = np.power(256, [[0], [1], [2]])  # pack each pixel's 3 channels into one 24-bit id

    img_id = img.dot(W).squeeze(-1)
    values = np.unique(img_id)
    mask = np.zeros(img_id.shape)

    for i, c in enumerate(values):
        try:
            mask[img_id == c] = color2index[tuple(img[img_id == c][0])]
        except KeyError:  # colors missing from the mapping are left at 0
            pass
    return mask
Then just call:
mask = rgb2mask(img)
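The trick here is that img.dot(W) packs each 3-channel pixel into a single 24-bit integer (a base-256 positional encoding), so equal colors collapse to equal ids. A tiny sketch of the packing:

import numpy as np

W = np.power(256, [[0], [1], [2]])  # weights 1, 256, 65536
pixel = np.array([[[0, 0, 255]]])   # one BGR pixel (pure red in OpenCV order)
print(pixel.dot(W).squeeze(-1))     # [[16711680]] == 255 * 256**2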
Here's a small utility function to convert images (np.array) to per-pixel labels (indices), which can also be a one-hot encoding:
def rgb2label(img, color_codes=None, one_hot_encode=False):
    if color_codes is None:
        color_codes = {val: i for i, val in enumerate(set(tuple(v) for m2d in img for v in m2d))}

    n_labels = len(color_codes)
    result = np.ndarray(shape=img.shape[:2], dtype=int)
    result[:, :] = -1
    for rgb, idx in color_codes.items():
        result[(img == rgb).all(2)] = idx

    if one_hot_encode:
        one_hot_labels = np.zeros((img.shape[0], img.shape[1], n_labels))
        # one-hot encoding
        for c in range(n_labels):
            one_hot_labels[:, :, c] = (result == c).astype(int)
        result = one_hot_labels

    return result, color_codes
img = cv2.imread("input_rgb_for_labels.png")
img_labels, color_codes = rgb2label(img)
print(color_codes) # e.g. to see what the codebook is
img1 = cv2.imread("another_rgb_for_labels.png")
img1_labels, _ = rgb2label(img1, color_codes) # use the same codebook
It calculates (and returns) the color codebook if None is supplied.
Actually, the for loop takes most of the time.
binary_mask = (im_array[:,:,0] == 255) & (im_array[:,:,1] == 255) & (im_array[:,:,2] == 0)
Maybe the code above can help you.
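Building on that idea, here is a hedged sketch of a fully vectorized conversion: one boolean mask per color instead of a per-pixel loop. It assumes im_array's channel order matches the keys of color2index (convert BGR to RGB first if the image comes from cv2.imread()):

import numpy as np

def im2index_vectorized(im_array, color2index):
    mask = np.zeros(im_array.shape[:2], dtype=np.uint8)
    for color, index in color2index.items():
        # one (H, W) boolean mask per color, no Python-level pixel loop
        mask[(im_array == color).all(axis=-1)] = index
    return mask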
I've implemented a naive function: …
I firstly tried the ufunc of numpy with something like this: …
I suggest using an even more naive function which converts just one pixel:
def rgb2index(rgb):
    """
    Turn a 3-channel RGB color into a 1-channel index color.
    """
    return color2index[tuple(rgb)]
Then using a numpy routine is a good idea, but we don't need a ufunc:
np.apply_along_axis(rgb2index, 2, im)
Here numpy.apply_along_axis() is used to apply our rgb2index() function to the RGB slices along the last of the three axes (0, 1, 2) for the whole image im.
We could even do without the function and just write:
np.apply_along_axis(lambda rgb: color2index[tuple(rgb)], 2, im)
Similar to what Armali and Mendrika proposed, I somehow had to tweak it a little bit to get it to work (maybe totally my fault). So I just wanted to share a snippet that works.
COLORS = np.array([
    [0, 0, 0],
    [0, 0, 255],
    [255, 0, 0]
])
W = np.power(255, [0, 1, 2])  # note: base 256 would guarantee collision-free hashes for arbitrary color sets
HASHES = np.sum(W * COLORS, axis=-1)
HASH2COLOR = {h: c for h, c in zip(HASHES, COLORS)}
HASH2IDX = {h: i for i, h in enumerate(HASHES)}
def rgb2index(segmentation_rgb):
    """
    Turn a 3-channel RGB color into a 1-channel index color.
    """
    s_shape = segmentation_rgb.shape
    s_hashes = np.sum(W * segmentation_rgb, axis=-1)
    func = lambda x: HASH2IDX[int(x)]
    segmentation_idx = np.apply_along_axis(func, 0, s_hashes.reshape((1, -1)))
    segmentation_idx = segmentation_idx.reshape(s_shape[:2])
    return segmentation_idx
segmentation = np.array([[0, 0, 0], [0, 0, 255], [255, 0, 0]] * 3).reshape((3, 3, 3))
rgb2index(segmentation)
The code is also available here:
https://github.com/theRealSuperMario/supermariopy/blob/dev/scripts/rgb2labels.py
Did you check the Pillow library https://python-pillow.org/? As I remember, it has some classes and methods to deal with color conversion. See: https://pillow.readthedocs.io/en/4.0.x/reference/Image.html#PIL.Image.Image.convert
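For what it's worth, a minimal sketch of the Pillow route using palette quantization; note that the resulting palette indices are chosen by Pillow and won't necessarily match the color2index ordering from the question:

import numpy as np
from PIL import Image

img = Image.open("input_rgb_for_labels.png").convert("RGB")
pal_img = img.convert("P", palette=Image.ADAPTIVE, colors=6)  # paletted image: each pixel is an index
indices = np.array(pal_img)       # 1-channel index image
print(pal_img.getpalette()[:18])  # the first 6 palette entries as flat RGB triples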
If you are happy using MATLAB - maybe saving the result as *.mat and loading with scipy.io.loadmat - there is the rgb2ind function in MATLAB, which does exactly what you are asking for. If not, it could be used as inspiration for a similar implementation in Python.
Related
What I'm trying to achieve: lookup tables to create a duotone effect, also called false color.
Say I have two colors: pure red and pure green, provided in hex format as ff0000 and 00ff00 respectively. We know they are essentially (255, 0, 0) and (0, 255, 0). I need to create a 256x1 gradient image in numpy with red and green at the two ends of the gradient.
I would strongly prefer to limit the dependencies to numpy and cv2.
Below is code that works for me just fine; however, all the RGB values are hardcoded, and I need to compute the LUT gradient map dynamically for any given left and right colors (LUT tables truncated for brevity):
lut = np.zeros((256, 1, 3), dtype=np.uint8)
lut[:, 0, 0] = [250,248,246,244,242,240,238,236,234,232,230, ...]
lut[:, 0, 1] = [109,107,105,103,101,99,97,95,93,91,89,87,85, ...]
lut[:, 0, 2] = [127,127,127,127,127,127,127,127,127,127,127, ...]
im_color = cv2.LUT(image, lut)
Starting from here, modified to return numpy arrays:
def hex_to_rgb(hex):
    hex = hex.lstrip('#')
    hlen = len(hex)
    return np.array([int(hex[i:i + hlen // 3], 16) for i in range(0, hlen, hlen // 3)])
Then the numpy part:
def gradient(hex1, hex2):
    np1 = hex_to_rgb(hex1)
    np2 = hex_to_rgb(hex2)
    return np.linspace(np1[:, None], np2[:, None], 256, dtype=int)
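To feed that into cv2.LUT, the output presumably still needs reshaping to the (256, 1, 3) layout used above and casting to uint8; also, hex_to_rgb returns RGB while OpenCV images are BGR, so the channels likely need reversing. A sketch, assuming image is an 8-bit, 3-channel BGR array:

lut = gradient('#ff0000', '#00ff00')           # shape (256, 3, 1), int
lut = lut.transpose(0, 2, 1).astype(np.uint8)  # reshape to (256, 1, 3) for cv2.LUT
lut = lut[:, :, ::-1]                          # RGB -> BGR to match OpenCV channel order
im_color = cv2.LUT(image, lut)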
I know the question has been answered, but I just want to ask the author if the code for the duotone effect can be shared. I have a brute-force solution that updates an image pixel by pixel; it works but is really inefficient. So I'm looking for a more efficient algorithm, and found this post inspiring, but haven't figured out a working solution using the clues. @Pono, it'd be great if you could share the code to create a duotone image using any 2 colors.
Never mind, I figured it out, and share the code below in case someone else looks for the same thing.
def gradient1d(rbg1, rbg2):
    bgr1 = np.array((rbg1[2], rbg1[1], rbg1[0]))
    bgr2 = np.array((rbg2[2], rbg2[1], rbg2[0]))
    return np.linspace(bgr2, bgr1, 256, dtype=int)

def duotone(image, color1, color2):
    img = image.copy()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    table = gradient1d(color1, color2)
    result = np.zeros((*gray.shape, 3), dtype=np.uint8)
    np.take(table, gray, axis=0, out=result)
    return result
I have a PNG image I am loading within Tensorflow using:
image = tf.io.decode_png(tf.io.read_file(path), channels=3)
The image contains pixels that match a lookup like this:
image_colors = [
    (0, 0, 0),        # black
    (0.5, 0.5, 0.5),  # grey
    (1, 0.5, 1),      # pink
]
How can I convert it so that the output has the pixels mapped into one-hot encodings where the hot component would be the matching color?
Let me assume for convenience that all values in image_colors are in [0, 255]:
image_colors = [
    (0, 0, 0),        # black
    (127, 127, 127),  # grey
    (255, 127, 255),  # pink
]
My approach maps pixels into one-hot values as follows:
# Create a "color reference" tensor from image_colors
color_reference = tf.cast(tf.constant(image_colors), dtype=tf.uint8)
# Load the image and obtain tensor with one-hot values
image = tf.io.decode_png(tf.io.read_file(path), channels=3)
comp = tf.equal(image[..., None, :], color_reference)
one_hot = tf.cast(tf.reduce_all(comp, axis=-1), dtype=tf.float32)
Note that you can easily add new colors to image_colors without changing the TF implementation. Also, this assumes that all pixels in the image are in image_colors. If that is not the case, one could define a range for each color and then use other tf.math operations (e.g. tf.greater and tf.less) instead of tf.equal.
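For instance, a hedged sketch of such range-based matching, using a hypothetical per-channel tolerance tol instead of exact equality:

tol = 10  # hypothetical slack per channel, in [0, 255] units
diff = tf.abs(tf.cast(image[..., None, :], tf.int32) - tf.cast(color_reference, tf.int32))
one_hot = tf.cast(tf.reduce_all(diff <= tol, axis=-1), dtype=tf.float32)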
There might be better approaches than this.
def map_colors(pixel):
    if pixel[0] < 10 and pixel[1] < 10 and pixel[2] < 10:        # black
        return 0
    elif pixel[0] > 245 and pixel[1] > 245 and pixel[2] > 245:   # white
        return 1
    else:
        return 11
image = tf.io.decode_png(tf.io.read_file(path), channels=3)
img_shape = image.shape
# Arrange the pixels in RGB format from 3D array to 2D array.
image = tf.reshape(image, [-1, 3])
colors = tf.map_fn(lambda x: map_colors(x), image)
one_hot = tf.one_hot(colors, depth=12)
print(one_hot)
# If necessary
image_one_hot = tf.reshape(one_hot, [img_shape[0], img_shape[1], 12])
Ensure that in map_colors you list all 12 of your colors with the ranges of RGB values they can accept. Make sure all combinations are covered; otherwise, add an extra "none of the above" class (the snippet above uses index 11 for that) and increase the one-hot depth accordingly.
I'm trying to set up a path-finder in which I pass a maze (an array of 1/0's, with 1 being an obstacle) plus start and end points, and have it return the optimal path.
I have code from the following as my base, with the 'main' function modified as shown below.
https://medium.com/@nicholas.w.swift/easy-a-star-pathfinding-7e6689c7f7b2
def main():
    maze = [[0,1,0,0,...],[0,0,0,0...],[...]...,[...]]  # example 2D list
    start = (4, 33)
    end = (200, 200)
    path = astar(maze, start, end)
    print(path)

    # Create blank image for OpenCV
    img = np.zeros((221, 221, 3), np.uint8)
    x, y = 0, 0
    red = [0, 0, 255]

    # Draw obstacles
    for row in maze:
        y += 1
        x = 0
        for value in row:
            x += 1
            if value == 1: img[y, x] = red

    # Draw path
    for x, y in path:
        img[y, x] = (255, 0, 0)

    cv2.imshow("Image", img)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
Full maze used to make the map is here: https://pastebin.com/wT6dGQnj
This is a simplified case of a larger project, so the list has been hard-coded here.
Below is the output, which seems to be incorrect since the path crosses multiple obstacles. (Output image omitted.)
I think the problem is in how you're drawing the walls, not the path. You start with y=1 and also seem to switch the x and y there. I used the 10x10 maze in the original code, and things looked correct once I changed the wall-drawing part as follows:
n = 10
img = np.zeros((n, n, 3), np.uint8)
x, y = 0, 0
red = [0, 0, 255]

# Draw the walls in red. This is the part I changed.
for i in range(n):
    for j in range(n):
        if maze[i][j] == 1:
            img[i, j] = red

# Draw the path in blue
for x, y in path:
    img[x, y] = (255, 0, 0)
There's probably a more efficient way, though: you can map maze onto img without the two nested for loops I used here, as shown below.
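For instance, a sketch of the vectorized version (assuming all rows of maze have the same length, and reusing red from above):

maze_arr = np.asarray(maze, dtype=np.uint8)    # maze as a 2D array of 0/1
img = np.zeros((*maze_arr.shape, 3), np.uint8)
img[maze_arr == 1] = red                       # draw all walls in one step
for x, y in path:                              # path drawing unchanged
    img[x, y] = (255, 0, 0)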
As stated in the help section of Stack Overflow, one can ask about a "software algorithm," so I believe this question is on topic. I'm viewing the following algorithm and having a hard time understanding why it is used. I've explained the mechanics below. The code was pulled from the following GitHub repo.
import numpy as np
import cv2
import sys
def calc_sloop_change(histo, mode, tolerance):
    sloop = 0
    for i in range(0, len(histo)):
        if histo[i] > max(1, tolerance):
            sloop = i
            return sloop
        else:
            sloop = i
def process(inpath, outpath, tolerance):
    original_image = cv2.imread(inpath)
    tolerance = int(tolerance) * 0.01

    # Get properties
    width, height, channels = original_image.shape

    color_image = original_image.copy()

    blue_hist = cv2.calcHist([color_image], [0], None, [256], [0, 256])
    green_hist = cv2.calcHist([color_image], [1], None, [256], [0, 256])
    red_hist = cv2.calcHist([color_image], [2], None, [256], [0, 256])

    blue_mode = blue_hist.max()
    blue_tolerance = np.where(blue_hist == blue_mode)[0][0] * tolerance
    green_mode = green_hist.max()
    green_tolerance = np.where(green_hist == green_mode)[0][0] * tolerance
    red_mode = red_hist.max()
    red_tolerance = np.where(red_hist == red_mode)[0][0] * tolerance

    sloop_blue = calc_sloop_change(blue_hist, blue_mode, blue_tolerance)
    sloop_green = calc_sloop_change(green_hist, green_mode, green_tolerance)
    sloop_red = calc_sloop_change(red_hist, red_mode, red_tolerance)

    gray_image = cv2.cvtColor(original_image, cv2.COLOR_BGR2GRAY)
    gray_hist = cv2.calcHist([original_image], [0], None, [256], [0, 256])
    largest_gray = gray_hist.max()
    threshold_gray = np.where(gray_hist == largest_gray)[0][0]

    # Red cells
    gray_image = cv2.adaptiveThreshold(gray_image, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 85, 4)
    _, contours, hierarchy = cv2.findContours(gray_image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    c2 = [i for i in contours if cv2.boundingRect(i)[3] > 15]
    cv2.drawContours(color_image, c2, -1, (0, 0, 255), 1)
    cp = [cv2.approxPolyDP(i, 0.015 * cv2.arcLength(i, True), True) for i in c2]

    countRedCells = len(c2)

    for c in cp:
        xc, yc, wc, hc = cv2.boundingRect(c)
        cv2.rectangle(color_image, (xc, yc), (xc + wc, yc + hc), (0, 255, 0), 1)

    # Malaria cells
    gray_image = cv2.inRange(original_image, np.array([sloop_blue, sloop_green, sloop_red]), np.array([255, 255, 255]))
    _, contours, hierarchy = cv2.findContours(gray_image, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

    c2 = [i for i in contours if cv2.boundingRect(i)[3] > 8]
    cv2.drawContours(color_image, c2, -1, (0, 0, 0), 1)
    cp = [cv2.approxPolyDP(i, 0.15 * cv2.arcLength(i, True), True) for i in c2]

    countMalaria = len(c2)

    for c in cp:
        xc, yc, wc, hc = cv2.boundingRect(c)
        cv2.rectangle(color_image, (xc, yc), (xc + wc, yc + hc), (0, 0, 0), 1)

    # Write image
    cv2.imwrite(outpath, color_image)

    # Write statistics
    with open(outpath + '.stats', mode='w') as f:
        f.write(str(countRedCells) + '\n')
        f.write(str(countMalaria) + '\n')
The above code looks at images of cells (irregular shapes) and identifies whether there are black spots/blobs inside them. It then draws contours around the cells and blobs.
I don't understand why the algorithm works the following way:
Let me illustrate with an example:
Let's say the tolerance passed into process() is 50. Let's say blue_hist returns the array [1, 2, 3, 4, 100, 0, ..., 0], whose largest value is 100, at index 4. This indicates that there are 100 pixels with an intensity of 4 when just the blue channel of the color image is extracted. In this situation, np.where(blue_hist == blue_mode) returns 4. This value is multiplied by 0.01 * tolerance, giving us 2.
So, if the value 4 is a pixel intensity, then multiplying it by a scalar only gives another pixel intensity (in our case, 4 * (0.01 * 50) = 2). This new pixel intensity is passed into calc_sloop_change(). That function compares histo[i], the number of pixels at intensity i, with tolerance (the pixel value we calculated earlier). So in our case, the first value greater than 2 occurs at i = 2 (histo[2] = 3), and that index is returned.
This is where I'm confused. Why is this being done? It seems illogical to compare a number of pixels with a pixel intensity; they are not even the same kind of quantity. So why does the algorithm do this? I must add that this code actually performs really well, so something must be right.
Lastly, the three values calculated by calc_sloop_change(), one for each color channel, act as a lower cutoff to produce a binary image: anything below those values (which are pixel intensities) becomes black, and everything above them becomes white.
I have an RGB image and am trying to set every pixel on my RGB to black where the corresponding alpha pixel is black as well. So basically I am trying to "bake" the alpha into my RGB.
I have tried this using PIL pixel access objects, PIL ImageMath.eval and numpy arrays:
PIL pixel access objects:
def alphaCutoutPerPixel(im):
    pixels = im.load()
    for x in range(im.size[0]):
        for y in range(im.size[1]):
            px = pixels[x, y]
            r, g, b, a = px
            if px[3] == 0:  # if alpha is black...
                pixels[x, y] = (0, 0, 0, 0)
    return im
PIL ImageMath.eval:
def alphaCutoutPerBand(im):
    newBands = []
    r, g, b, a = im.split()
    for band in (r, g, b):
        out = ImageMath.eval("convert(min(band, alpha), 'L')", band=band, alpha=a)
        newBands.append(out)
    newImg = Image.merge("RGB", newBands)
    return newImg
Numpy array:
def alphaCutoutNumpy(im):
    data = numpy.array(im)
    r, g, b, a = data.T
    blackAlphaAreas = (a == 0)
    # This fails; why?
    data[..., :-1][blackAlphaAreas] = (0, 255, 0)
    return Image.fromarray(data)
The first method works fine, but is really slow. The second method works fine for a single image, but will stop after the first when asked to convert multiple. The third method I created based on this example (first answer): Python: PIL replace a single RGBA color
But it fails at the marked command:
data[..., :-1][blackAlphaAreas] = (0, 255, 0, 0)
IndexError: index (295) out of range (0<=index<294) in dimension 0
Numpy seems promising for this kind of stuff, but I don't really get the syntax for setting parts of the array in one step. Any help? Maybe other ideas to achieve what I described above quickly?
Cheers
This doesn't use advanced indexing but is easier to read, imho:
def alphaCutoutNumpy(im):
    data = numpy.array(im)
    data_T = data.T
    r, g, b, a = data_T
    blackAlphaAreas = (a == 0)
    data_T[0][blackAlphaAreas] = 0
    data_T[1][blackAlphaAreas] = 0
    data_T[2][blackAlphaAreas] = 0
    # data_T[3][blackAlphaAreas] = 255
    return Image.fromarray(data[:, :, :3])
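For what it's worth, the third method in the question likely failed because data.T transposes the (H, W, 4) array to (4, W, H), so the mask built from the transposed alpha channel has shape (W, H) and no longer lines up with data's (H, W) leading dimensions. A sketch that avoids the transpose entirely, assuming an RGBA input:

import numpy
from PIL import Image

def alpha_cutout(im):
    data = numpy.array(im)    # shape (H, W, 4) for an RGBA image
    mask = data[..., 3] == 0  # (H, W) boolean mask from the alpha channel
    data[mask, :3] = 0        # zero the RGB channels where alpha is 0
    return Image.fromarray(data)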