I'm a beginner in image processing.
I work with an RGB image image.shape = (4512,3000,3)
I saw the value of the first pixel: image[0][0] = [210 213 220]
When I use the rgb2gray function the result is rgb2gray(image[0][0]) = 0.8347733333333334
But I saw that the relation used by the function is Y = 0.2125 * R + 0.7154 * G + 0.0721 * B. When I do the calculation myself, I get Y = image[0,0,0] * 0.2125 + image[0,0,1] * 0.7154 + image[0,0,2] * 0.0721 = 212.8672
It seems the function's result is my value divided by 255: 212.8672/255 = 0.8347733333333334
Why is the result between 0 and 1 and not between 0 and 255?
I assume you are using scikit-image's rgb2gray. In that case, you can see in the code at https://github.com/scikit-image/scikit-image/blob/main/skimage/color/colorconv.py that every color conversion in the color module starts with the _prepare_colorarray function, which converts the input to a floating-point representation.
def _prepare_colorarray(arr, force_copy=False, *, channel_axis=-1):
    """Check the shape of the array and convert it to
    floating point representation.
    """
    arr = np.asanyarray(arr)
    if arr.shape[channel_axis] != 3:
        msg = (f'the input array must have size 3 along `channel_axis`, '
               f'got {arr.shape}')
        raise ValueError(msg)
    float_dtype = _supported_float_type(arr.dtype)
    if float_dtype == np.float32:
        _func = dtype.img_as_float32
    else:
        _func = dtype.img_as_float64
    return _func(arr, force_copy=force_copy)
The module does (thankfully) accept 8-bit integer input, but it converts the image array to a float representation, rescaled to the range [0, 1], and uses that representation throughout. That is why the result is your value divided by 255.
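The arithmetic from the question can be reproduced in plain Python. This is only a sketch of the weighted sum using the coefficients quoted above, not scikit-image's actual code path; the helper name luma and its scale parameter are made up for illustration.

```python
# Coefficients quoted in the question for rgb2gray.
WEIGHTS = (0.2125, 0.7154, 0.0721)

def luma(pixel, scale=1.0):
    """Weighted sum of an (R, G, B) pixel after dividing each channel by `scale`."""
    return sum(w * (c / scale) for w, c in zip(WEIGHTS, pixel))

pixel = (210, 213, 220)
y_255 = luma(pixel)           # computed on the 8-bit scale, about 212.8672
y_unit = luma(pixel, 255.0)   # computed after rescaling to [0, 1], about 0.8348

print(y_255, y_unit, y_unit * 255)
```

Multiplying the float result by 255 brings it back to the 8-bit scale, so no information is lost by the conversion beyond rounding.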
I've rewritten a bit of what was done here in an attempt to not have to use recursion so as to produce the images. While I can get what appears to be the correct string of random functions, I am unable to get the correct output arrays so as to build the image.
You'll notice I've put the xVar function first in the random functions because it will operate on an empty string and give me back values. This is similar to what the original code does except that (by recursion) uses the value 0 to pick out one of three functions that will operate on empty strings. I am thinking that the results are passed back in so that functions such as np.sin will work.
I think the issue might lie in my usage of the identity decorator func(*testlist), perhaps I'm using it incorrectly.
import numpy as np, random
from PIL import Image
width, height = 256,256
xArray = np.linspace(0.0, 1.0, width).reshape((1, width, 1))
yArray = np.linspace(0.0, 1.0, height).reshape((height, 1, 1))
def xVar(): return xArray
def yVar(): return yArray
def safeDivide(a, b): return np.divide(a, np.maximum(b, 0.001))
def add(x,y):
    added = np.add(x, y)
    return added
def Color():
    randColorarray = np.array([random.random(), random.random(), random.random()]).reshape((1, 1, 3))
    return randColorarray
# def circle(x,y):
#     circles = (x- 100) ** 2 + (y - 100) ** 2
#     return circles
functions = (Color, xVar, yVar, np.sin, np.multiply, safeDivide)
depth = 5
def functionArray(depth = 0):
    FunctList = []
    FunctList.append(xVar)
    for x in range(depth):
        func = random.choice(functions)
        FunctList.append(func)
    return FunctList
def ImageBuilder():
    FunctionList = functionArray(depth)
    testlist = []
    for func in FunctionList:
        values = func(*testlist)
    return values
vals = ImageBuilder()
repetitions = (int(xArray / vals.shape[0]), int(yArray / vals.shape[1]), int(3 / vals.shape[2]))
img = np.tile(vals, repetitions)
# Convert to 8-bit, send to PIL and save
img8Bit = np.uint8(np.rint(img.clip(0.0, 1.0) * 255.0))
Image.fromarray(img8Bit).save('Images/' + '.png', "PNG")
Depending on which random function is chosen, I'll either get
values = func(*testlist)
ValueError: invalid number of arguments
or
TypeError: safeDivide() missing 2 required positional arguments: 'a' and 'b'
Note however that the linked program does not get a safe divide error and both a and b are not being explicitly passed in (as is the same with np.multiply).
Thanks for any help.
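In the posted code, testlist is never filled, so func(*testlist) calls every function with no arguments, which matches both errors above. One possible non-recursive approach, sketched here with plain floats instead of arrays, is to record how many arguments each function takes and feed it earlier results from a stack. The (function, arity) table, leaf and build_value are hypothetical names for illustration, not part of the linked program.

```python
import math
import random

def leaf():
    """Zero-argument function standing in for xVar/yVar/Color."""
    return 0.5

def safe_divide(a, b):
    return a / max(b, 0.001)

# Hypothetical table pairing each function with the number of arguments it takes.
FUNCTIONS = [(leaf, 0), (math.sin, 1), (lambda a, b: a * b, 2), (safe_divide, 2)]

def build_value(steps, rng):
    """Apply `steps` random functions, feeding each one results from a stack."""
    stack = []
    for _ in range(steps):
        func, arity = rng.choice(FUNCTIONS)
        while len(stack) < arity:      # not enough inputs yet: add leaves
            stack.append(leaf())
        args = [stack.pop() for _ in range(arity)]
        stack.append(func(*args))      # unpack exactly `arity` arguments
    return stack[-1]

result = build_value(5, random.Random(0))
print(result)
```

The same idea should carry over to the NumPy version by replacing the floats with xArray, yArray and colour arrays, since those broadcast against each other.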
I'm using Python 3.7.7. I'm trying to resize a Numpy image array with this function:
def resize_image_array(image_array, rows_standard, cols_standard):
    # image_array.shape = (3929, 2, 256, 256, 1)
    # rows_standard = 200
    # cols_standard = 200
    # Height or row number.
    image_rows_Dataset = np.shape(image_array)[2]
    # Width or column number.
    image_cols_Dataset = np.shape(image_array)[3]
    num_rows_1 = ((image_rows_Dataset // 2) - (rows_standard / 2)) # num_rows_1 = 28.0
    num_rows_2 = ((image_rows_Dataset // 2) + (rows_standard / 2)) # num_rows_2 = 228.0
    num_cols_1 = ((image_cols_Dataset // 2) - (cols_standard / 2)) # num_cols_1 = 28.0
    num_cols_2 = ((image_cols_Dataset // 2) + (cols_standard / 2)) # num_cols_2 = 228.0
    return image_array[..., num_rows_1:num_rows_2, num_cols_1:num_cols_2, :]
But in the last statement I get this error:
TypeError: slice indices must be integers or None or have an __index__ method
I have also tried:
return image_array[:, :, num_rows_1:num_rows_2, num_cols_1:num_cols_2, :]
But with the same error as shown above.
How can I fix this error?
The issue, as mentioned in the comments, is that true division (/) on plain Python scalars returns a float, even if both operands are integers. The operator does not check for integer divisibility before performing the division. Floats do not have an __index__ method, which is the hook that converts int-like objects to an actual int for indexing and slicing.
The simple solution is to replace / with //. However, the computation of num_rows_2 and num_cols_2 seems superfluous. If you know the values of rows_standard and cols_standard that you want, just add them to num_rows_1 and num_cols_1, respectively. This will result in a much more robust expression:
row_start = (image_array.shape[2] - rows_standard) // 2
row_end = row_start + rows_standard
col_start = (image_array.shape[3] - cols_standard) // 2
col_end = col_start + cols_standard
image_array[..., row_start:row_end, col_start:col_end, :]
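The index arithmetic above can be checked without NumPy. The sketch below uses a hypothetical helper, center_crop_bounds, and also shows why the float version fails: operator.index calls __index__, the same hook that slicing relies on.

```python
import operator

def center_crop_bounds(size, target):
    """Return (start, end) integer indices of a centered window of length `target`."""
    if target > size:
        raise ValueError("target must not exceed size")
    start = (size - target) // 2
    return start, start + target

row_start, row_end = center_crop_bounds(256, 200)

# True division yields a float even when the quotient is a whole number.
float_index = 256 // 2 - 200 / 2
try:
    operator.index(float_index)   # calls __index__, as slicing does
    float_is_valid_index = True
except TypeError:
    float_is_valid_index = False

print(row_start, row_end, float_index, float_is_valid_index)
```

Computing the end as start + target also guarantees the crop has exactly the requested length, even when size - target is odd.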
I'm trying to improve the speed of my image manipulation as it's been too slow for actual use.
What I need to do is apply a complex transformation to the colour of every pixel in an image. The manipulation basically applies a vector transform like T(r, g, b, a) => (r * x, g * x, b * y, a); in layman's terms, it multiplies the Red and Green values by one constant, multiplies Blue by a different constant, and keeps Alpha. But I also need to handle some specific RGB colours differently: those must follow a dictionary/transformation table where RGB => newRGB, again keeping alpha.
The algorithm would be:
for each pixel in image:
    if pixel[r, g, b] in special:
        return special[pixel[r, g, b]] + pixel[a]
    else:
        return T(pixel)
It's simple but speed has been sub-optimal. I believe there's some way using numpy vectors, but I could not find how.
Important details about the implementation:
I don't care about the original buffer/image (manipulation can be in place)
I can use wxPython, Pillow and NumPy
Order or dimension of the array is not important as long as the buffer keeps the length
The buffer is obtained from a wxPython Bitmap and special and (RG|B)_pal are transformation tables, the end result will become a wxPython Bitmap too. They're obtained like these:
# buffer
bitmap = wx.Bitmap # it's valid wxBitmap here, this is just to let you know it exists
buf = bytearray(bitmap.GetWidth() * bitmap.GetHeight() * 4)
bitmap.CopyToBuffer(buf, wx.BitmapBufferFormat_RGBA)
self.RG_mult= 0.75
self.B_mult = 0.83
self.RG_pal = []
self.B_pal = []
for i in range(0, 256):
    self.RG_pal.append(int(i * self.RG_mult))
    self.B_pal.append(int(i * self.B_mult))
self.special = {
    # RGB: new_RGB
    # Implementation specific for the fastest access
    # with buffer keys are 24bit numbers, with PIL keys are tuples
}
Implementations I tried include direct buffer manipulation:
for x in range(0, bitmap.GetWidth() * bitmap.GetHeight()):
    index = x * 4
    r = buf[index]
    g = buf[index + 1]
    b = buf[index + 2]
    rgb = buf[index:index + 3]
    if rgb in self.special:
        special = self.special[rgb]
        buf[index] = special[0]
        buf[index + 1] = special[1]
        buf[index + 2] = special[2]
    else:
        buf[index] = self.RG_pal[r]
        buf[index + 1] = self.RG_pal[g]
        buf[index + 2] = self.B_pal[b]
Use Pillow with getdata():
pil = Image.frombuffer("RGBA", (bitmap.GetWidth(), bitmap.GetHeight()), buf)
pil_buf = []
for colour in pil.getdata():
    colour_idx = colour[0:3]
    if (colour_idx in self.special):
        special = self.special[colour_idx]
        pil_buf.append((
            special[0],
            special[1],
            special[2],
            colour[3],
        ))
    else:
        pil_buf.append((
            self.RG_pal[colour[0]],
            self.RG_pal[colour[1]],
            self.B_pal[colour[2]],
            colour[3],
        ))
pil.putdata(pil_buf)
buf = pil.tobytes()
Pillow with point() and getdata() (the fastest I achieved, more than twice as fast as the others):
pil = Image.frombuffer("RGBA", (bitmap.GetWidth(), bitmap.GetHeight()), buf)
r, g, b, a = pil.split()
r = r.point(lambda r: r * self.RG_mult)
g = g.point(lambda g: g * self.RG_mult)
b = b.point(lambda b: b * self.B_mult)
pil = Image.merge("RGBA", (r, g, b, a))
i = 0
for colour in pil.getdata():
    colour_idx = colour[0:3]
    if (colour_idx in self.special):
        special = self.special[colour_idx]
        pil.putpixel(
            (i % bitmap.GetWidth(), i // bitmap.GetWidth()),
            (
                special[0],
                special[1],
                special[2],
                colour[3],
            )
        )
    i += 1
buf = pil.tobytes()
I also tried working with numpy.where, but I could not get it to work. With numpy.apply_along_axis it worked, but the performance was terrible. In my other attempts with numpy I could not access the RGB values together, only as separate bands.
Pure Numpy Version
This first optimization relies on the fact that there are probably far fewer special colors than pixels. I use numpy to do all the inner loops. This works well with images of up to 1MP. If you have multiple images, I'd recommend the parallel approach.
Let's define a test case:
import requests
from io import BytesIO
from PIL import Image
import numpy as np
# Load some image, so we have the same test input
response = requests.get("https://upload.wikimedia.org/wikipedia/commons/4/41/Rick_Astley_Dallas.jpg")
# Make areas of known color
img = Image.open(BytesIO(response.content)).rotate(10, expand=True).rotate(-10,expand=True, fillcolor=(255,255,255)).convert('RGBA')
print("height: %d, width: %d (%.2f MP)"%(img.height, img.width, img.width*img.height/1e6))
height: 5034, width: 5792 (29.16 MP)
Define our special colors
specials = {
(4,1,6):(255,255,255),
(0, 0, 0):(255, 0, 255),
(255, 255, 255):(0, 255, 0)
}
Algorithm
def transform_map(img, specials, R_factor, G_factor, B_factor):
    # Your transform: scale a channel by a factor and clip back to 8 bits
    def transform(x, a):
        return (x * a).clip(0, 255).astype(np.uint8)
    # Convert to array
    img_array = np.asarray(img)
    # Extract channels
    R = img_array.T[0]
    G = img_array.T[1]
    B = img_array.T[2]
    A = img_array.T[3]
    # Find special colors
    # First, calculate a unique hash per pixel (cast to uint32 so the
    # 8-bit channel values cannot overflow when shifted)
    color_hashes = (R.astype(np.uint32)
                    + 2**8 * G.astype(np.uint32)
                    + 2**16 * B.astype(np.uint32))
    # Find indices of special colors
    special_idxs = []
    for k, v in specials.items():
        key_arr = np.array(list(k))
        val_arr = np.array(list(v))
        spec_hash = key_arr[0] + 2**8 * key_arr[1] + 2**16 * key_arr[2]
        special_idxs.append(
            {
                'mask': np.where(np.isin(color_hashes, spec_hash)),
                'value': val_arr
            }
        )
    # Apply transform to whole image
    R = transform(R, R_factor)
    G = transform(G, G_factor)
    B = transform(B, B_factor)
    # Replace values where special colors were found
    for idx in special_idxs:
        R[idx['mask']] = idx['value'][0]
        G[idx['mask']] = idx['value'][1]
        B[idx['mask']] = idx['value'][2]
    return Image.fromarray(np.array([R,G,B,A]).T, mode='RGBA')
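For checking a vectorized version on small inputs, a plain-Python reference can be handy. The sketch below mirrors the hash idea (three 8-bit channels packed into one 24-bit key) and the question's multipliers; pack_rgb and remap_pixels are illustrative names, and this per-pixel loop is meant as a correctness reference, not a fast implementation.

```python
def pack_rgb(r, g, b):
    """Pack three 8-bit channels into one 24-bit integer key."""
    return r | (g << 8) | (b << 16)

def remap_pixels(pixels, specials, rg_mult, b_mult):
    """Apply the special-colour table where it matches, the multipliers elsewhere."""
    table = {pack_rgb(*key): value for key, value in specials.items()}
    result = []
    for r, g, b, a in pixels:
        replacement = table.get(pack_rgb(r, g, b))
        if replacement is not None:
            result.append((*replacement, a))          # keep alpha
        else:
            result.append((min(255, int(r * rg_mult)),
                           min(255, int(g * rg_mult)),
                           min(255, int(b * b_mult)), a))
    return result

specials = {(0, 0, 0): (255, 0, 255), (255, 255, 255): (0, 255, 0)}
pixels = [(0, 0, 0, 255), (100, 200, 50, 128), (255, 255, 255, 0)]
remapped = remap_pixels(pixels, specials, rg_mult=0.75, b_mult=0.83)
print(remapped)
```

Because each channel occupies its own byte of the key, two different colours can never collide, which is what makes a single integer comparison per pixel sufficient.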
And finally some benchmarks on an Intel Core i5-6300U @ 2.40GHz:
import time
times = []
for i in range(10):
    t0 = time.time()
    # Test
    transform_map(img, specials, 1.2, .9, 1.2)
    #
    t1 = time.time()
    times.append(t1-t0)
np.round(times, 2)
print('average run time: %.2f +/-%.2f'%(np.mean(times), np.std(times)))
average run time: 9.72 +/-0.91
EDIT Parallelization
With the same setup as above, we can get a 2x speed increase on large images. (Small ones are faster without numba)
from numba import njit, prange
from numba.core import types
from numba.typed import Dict
# Map dict of special colors or transform over array of pixel values
@njit(parallel=True, locals={'px_hash': types.uint32})
def check_and_transform(img_array, d, T):
    # Save shape for later
    shape = img_array.shape
    # Flatten image for 1-d iteration
    img_array_flat = img_array.reshape(-1,3).copy()
    N = img_array_flat.shape[0]
    # Replace or map
    for i in prange(N):
        px_hash = np.uint32(0)
        px_hash += img_array_flat[i,0]
        px_hash += types.uint32(2**8) * img_array_flat[i,1]
        px_hash += types.uint32(2**16) * img_array_flat[i,2]
        try:
            img_array_flat[i] = d[px_hash]
        except Exception:
            img_array_flat[i] = (img_array_flat[i] * T).astype(np.uint8)
    # return image
    return img_array_flat.reshape(shape)
# Wrapper for function above
def map_or_transform_jit(image: Image.Image, specials: dict, T: np.ndarray):
    # assemble numba typed dict
    d = Dict.empty(
        key_type=types.uint32,
        value_type=types.uint8[:],
    )
    for k, v in specials.items():
        k = types.uint32(k[0] + 2**8 * k[1] + 2**16 * k[2])
        v = np.array(v, dtype=np.uint8)
        d[k] = v
    # get rgb channels (use the `image` argument, not a global)
    img_arr = np.array(image)
    rgb = img_arr[:,:,:3].copy()
    img_shape = img_arr.shape
    # apply map
    rgb = check_and_transform(rgb, d, T)
    # set color channels
    img_arr[:,:,:3] = rgb
    return Image.fromarray(img_arr, mode='RGBA')
# Benchmark
import time
times = []
for i in range(10):
    t0 = time.time()
    # Test
    test_img = map_or_transform_jit(img, specials, np.array([1, .5, .5]))
    #
    t1 = time.time()
    times.append(t1-t0)
np.round(times, 2)
print('average run time: %.2f +/- %.2f'%(np.mean(times), np.std(times)))
test_img
average run time: 3.76 +/- 0.08
I'm trying to implement Reinhard's method to use the color distribution of a target image to color normalize a passed in image for a research project. I've gotten the code to work and it outputs correctly but it's pretty slow. It takes about 20 minutes to iterate through 300 images. I'm pretty sure the bottleneck is how I'm handling applying the function to each image. I'm currently iterating through each pixel of the image and applying the functions below to each channel.
def reinhard(target, img):
    #converts image and target from BGR colorspace to l alpha beta
    lAB_img = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)
    lAB_tar = cv2.cvtColor(target, cv2.COLOR_BGR2Lab)
    #finds mean and standard deviation for each color channel across the entire image
    (mean, std) = cv2.meanStdDev(lAB_img)
    (mean_tar, std_tar) = cv2.meanStdDev(lAB_tar)
    #iterates over image implementing formula to map color normalized pixels to target image
    for y in range(512):
        for x in range(512):
            lAB_tar[x, y, 0] = (lAB_img[x, y, 0] - mean[0]) / std[0] * std_tar[0] + mean_tar[0]
            lAB_tar[x, y, 1] = (lAB_img[x, y, 1] - mean[1]) / std[1] * std_tar[1] + mean_tar[1]
            lAB_tar[x, y, 2] = (lAB_img[x, y, 2] - mean[2]) / std[2] * std_tar[2] + mean_tar[2]
    mapped = cv2.cvtColor(lAB_tar, cv2.COLOR_Lab2BGR)
    return mapped
My supervisor told me that I could try using a matrix to apply the function all at once to improve the runtime but I'm not exactly sure how to go about doing that.
The original and the target:
Color transfer results using Reinhard's method, computed in 5 ms:
I prefer to implement the formula with NumPy vectorized operations rather than Python loops.
# implementing the formula
#(Io - mo)/so*st + mt = Io * (st/so) + mt - mo*(st/so)
ratio = (std_tar/std_ori).reshape(-1)
offset = (mean_tar - mean_ori*std_tar/std_ori).reshape(-1)
lab_tar = cv2.convertScaleAbs(lab_ori*ratio + offset)
Here is the code:
# 2019/02/19 by knight-金
# https://stackoverflow.com/a/54757659/3547485
import numpy as np
import cv2
def reinhard(target, original):
    # cvtColor: COLOR_BGR2Lab
    lab_tar = cv2.cvtColor(target, cv2.COLOR_BGR2Lab)
    lab_ori = cv2.cvtColor(original, cv2.COLOR_BGR2Lab)
    # meanStdDev: calculate mean and standard deviation
    mean_tar, std_tar = cv2.meanStdDev(lab_tar)
    mean_ori, std_ori = cv2.meanStdDev(lab_ori)
    # implementing the formula
    #(Io - mo)/so*st + mt = Io * (st/so) + mt - mo*(st/so)
    ratio = (std_tar/std_ori).reshape(-1)
    offset = (mean_tar - mean_ori*std_tar/std_ori).reshape(-1)
    lab_tar = cv2.convertScaleAbs(lab_ori*ratio + offset)
    # convert back
    mapped = cv2.cvtColor(lab_tar, cv2.COLOR_Lab2BGR)
    return mapped
if __name__ == "__main__":
    ori = cv2.imread("ori.png")
    tar = cv2.imread("tar.png")
    mapped = reinhard(tar, ori)
    cv2.imwrite("mapped.png", mapped)
    mapped_inv = reinhard(ori, tar)
    # save the inverse mapping (the original line wrote `mapped` again)
    cv2.imwrite("mapped_inv.png", mapped_inv)
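The identity used in the comment, (Io - mo)/so*st + mt = Io*(st/so) + mt - mo*(st/so), can be checked on a single channel with the standard library alone. This is a sketch under the assumption that the standard deviation is the population one; channel_stats and match_channel are illustrative names, and the sample values are arbitrary.

```python
import statistics

def channel_stats(values):
    """Mean and population standard deviation of one channel."""
    return statistics.mean(values), statistics.pstdev(values)

def match_channel(source, target):
    """Map `source` so its mean/std match `target`, using one scale and one offset."""
    mean_src, std_src = channel_stats(source)
    mean_tar, std_tar = channel_stats(target)
    ratio = std_tar / std_src
    offset = mean_tar - mean_src * ratio
    return [v * ratio + offset for v in source]

source = [10.0, 20.0, 30.0, 40.0]
target = [100.0, 110.0, 150.0, 160.0]
mapped = match_channel(source, target)
mapped_mean, mapped_std = channel_stats(mapped)
print(mapped_mean, mapped_std, channel_stats(target))
```

Since the whole mapping collapses to one multiplication and one addition per channel, it vectorizes naturally over an entire image.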
I managed to figure it out after looking at the numpy documentation. I just needed to replace my nested for loop with proper array accessing. It took less than a minute to iterate through all 300 images with this.
lAB_tar[:,:,0] = (lAB_img[:,:,0] - mean[0])/std[0] * std_tar[0] + mean_tar[0]
lAB_tar[:,:,1] = (lAB_img[:,:,1] - mean[1])/std[1] * std_tar[1] + mean_tar[1]
lAB_tar[:,:,2] = (lAB_img[:,:,2] - mean[2])/std[2] * std_tar[2] + mean_tar[2]
I have a 16-bit grayscale image and I want to convert it to an 8-bit grayscale image in OpenCV for Python to use it with various functions (like findContours etc.). How can I do this in Python?
You can use numpy conversion methods as an OpenCV mat is a numpy array.
This works:
img8 = (img16/256).astype('uint8')
You can do this in Python using NumPy by mapping the image through a lookup table.
import numpy as np
def map_uint16_to_uint8(img, lower_bound=None, upper_bound=None):
    '''
    Map a 16-bit image through a lookup table to convert it to 8-bit.
    Parameters
    ----------
    img: numpy.ndarray[np.uint16]
        image that should be mapped
    lower_bound: int, optional
        lower bound of the range that should be mapped to ``[0, 255]``,
        value must be in the range ``[0, 65535]`` and smaller than `upper_bound`
        (defaults to ``numpy.min(img)``)
    upper_bound: int, optional
        upper bound of the range that should be mapped to ``[0, 255]``,
        value must be in the range ``[0, 65535]`` and larger than `lower_bound`
        (defaults to ``numpy.max(img)``)
    Returns
    -------
    numpy.ndarray[uint8]
    '''
    # Validate only explicit bounds; comparing None with an int raises TypeError
    if lower_bound is not None and not (0 <= lower_bound < 2**16):
        raise ValueError(
            '"lower_bound" must be in the range [0, 65535]')
    if upper_bound is not None and not (0 <= upper_bound < 2**16):
        raise ValueError(
            '"upper_bound" must be in the range [0, 65535]')
    # Cast to int so the LUT sizes below are computed with Python integers
    if lower_bound is None:
        lower_bound = int(np.min(img))
    if upper_bound is None:
        upper_bound = int(np.max(img))
    if lower_bound >= upper_bound:
        raise ValueError(
            '"lower_bound" must be smaller than "upper_bound"')
    lut = np.concatenate([
        np.zeros(lower_bound, dtype=np.uint16),
        np.linspace(0, 255, upper_bound - lower_bound).astype(np.uint16),
        np.ones(2**16 - upper_bound, dtype=np.uint16) * 255
    ])
    return lut[img].astype(np.uint8)
# Let's generate an example image (normally you would load the 16-bit image: cv2.imread(filename, cv2.IMREAD_UNCHANGED))
img = (np.random.random((100, 100)) * 2**16).astype(np.uint16)
# Convert it to 8-bit
map_uint16_to_uint8(img)
Yes, you can do this in Python. To get the expected result, choose a method based on how you want the uint16 values to be mapped to uint8.
For instance,
if you do img8 = (img16/256).astype('uint8'), values below 256 are mapped to 0
if you do img8 = img16.astype('uint8'), values above 255 wrap around modulo 256 (for example, 256 becomes 0 and 257 becomes 1)
In the LUT method described above, you define the mapping yourself.
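The difference between these mappings is easy to see on a handful of values. The sketch below works per pixel in plain Python; the function names are illustrative, and min_max_scale is a simplified linear stretch rather than the exact LUT shown earlier.

```python
def scale_by_256(value):
    """Keep the high byte: what (img16 / 256).astype('uint8') does per pixel."""
    return value // 256

def wrap_low_byte(value):
    """Keep the low byte: what img16.astype('uint8') does per pixel."""
    return value % 256

def min_max_scale(value, lower, upper):
    """Stretch [lower, upper] linearly onto [0, 255], clamping values outside."""
    clamped = min(max(value, lower), upper)
    return round((clamped - lower) * 255 / (upper - lower))

samples = [0, 255, 256, 257, 1000, 65535]
high_bytes = [scale_by_256(v) for v in samples]
low_bytes = [wrap_low_byte(v) for v in samples]
stretched = [min_max_scale(v, 0, 1000) for v in samples]

print(high_bytes)
print(low_bytes)
print(stretched)
```

Dividing by 256 preserves ordering but discards detail in dark images, wrapping destroys ordering, and a min-max stretch keeps contrast within the chosen range at the cost of clamping everything outside it.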
OpenCV provides the function cv2.convertScaleAbs():
image_8bit = cv2.convertScaleAbs(image, alpha=0.03)
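Alpha is just an optional scale factor. Scaled values above 255 saturate to 255, so 0.03 suits data whose maximum is around 8500; for full-range 16-bit data, 255/65535 (about 0.0039) keeps the whole range. It also works for multi-channel images.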
OpenCV documentation:
Scales, calculates absolute values, and converts the result to 8-bit.
On each element of the input array, the function convertScaleAbs
performs three operations sequentially: scaling, taking an absolute
value, conversion to an unsigned 8-bit type:
dst(I) = saturate_cast<uchar>(|src(I) * alpha + beta|)
Other Info on Stackoverflow: OpenCV: How to use the convertScaleAbs() function
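To get a feel for how alpha interacts with saturation, the three documented steps (scale, absolute value, saturating conversion to 8 bits) can be imitated per value in plain Python. This is only an illustration of the arithmetic, not OpenCV's implementation; convert_scale_abs_value is a made-up name.

```python
def convert_scale_abs_value(value, alpha=1.0, beta=0.0):
    """Scale, take the absolute value, then saturate to the 0..255 range."""
    return min(255, round(abs(value * alpha + beta)))

small = convert_scale_abs_value(1000, alpha=0.03)              # well inside 0..255
saturated = convert_scale_abs_value(65535, alpha=0.03)         # clipped to 255
full_range = convert_scale_abs_value(65535, alpha=255 / 65535)
negative = convert_scale_abs_value(-10)

print(small, saturated, full_range, negative)
```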
It's really easy to convert to 8-bit using scipy.misc.bytescale. The OpenCV matrix is a numpy array, so bytescale will do exactly what you want.
from scipy.misc import bytescale
img8 = bytescale(img16)
Code from SciPy (bytescale is deprecated and has been removed from recent SciPy releases):
import numpy as np
def bytescaling(data, cmin=None, cmax=None, high=255, low=0):
    """
    Converting the input image to uint8 dtype and scaling
    the range to ``(low, high)`` (default 0-255). If the input image already has
    dtype uint8, no scaling is done.
    :param data: 16-bit image data array
    :param cmin: bias scaling of small values (def: data.min())
    :param cmax: bias scaling of large values (def: data.max())
    :param high: scale max value to high. (def: 255)
    :param low: scale min value to low. (def: 0)
    :return: 8-bit image data array
    """
    if data.dtype == np.uint8:
        return data
    if high > 255:
        high = 255
    if low < 0:
        low = 0
    if high < low:
        raise ValueError("`high` should be greater than or equal to `low`.")
    if cmin is None:
        cmin = data.min()
    if cmax is None:
        cmax = data.max()
    cscale = cmax - cmin
    if cscale == 0:
        cscale = 1
    scale = float(high - low) / cscale
    bytedata = (data - cmin) * scale + low
    return (bytedata.clip(low, high) + 0.5).astype(np.uint8)
This is the simplest way I found: img8 = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX, dtype=cv2.CV_8U)