I implemented computation of average RGB value of a Python Imaging Library image in 2 ways:
1 - using lists
def getAverageRGB(image):
    """
    Given PIL Image, return average value of color as (r, g, b)
    """
    # no. of pixels in image
    npixels = image.size[0]*image.size[1]
    # get colors as [(cnt1, (r1, g1, b1)), ...]
    cols = image.getcolors(npixels)
    # get [(c1*r1, c1*g1, c1*b1),...]
    sumRGB = [(x[0]*x[1][0], x[0]*x[1][1], x[0]*x[1][2]) for x in cols]
    # calculate (sum(ci*ri)/np, sum(ci*gi)/np, sum(ci*bi)/np)
    # the zip gives us [(c1*r1, c2*r2, ..), (c1*g1, c2*g2,...)...]
    avg = tuple([sum(x)/npixels for x in zip(*sumRGB)])
    return avg
2 - using numpy
import numpy as np

def getAverageRGBN(image):
    """
    Given PIL Image, return average value of color as (r, g, b)
    """
    # get image as numpy array
    im = np.array(image)
    # get shape (numpy arrays from PIL are height x width x depth)
    h,w,d = im.shape
    # change shape
    im.shape = (h*w, d)
    # get average
    return tuple(np.average(im, axis=0))
I was surprised to find that #1 runs about 20% faster than #2.
Am I using numpy correctly? Is there a better way to implement the average computation?
Surprising indeed.
You may want to use:
tuple(im.mean(axis=0))
to compute your mean (r,g,b), but I doubt it's gonna improve things a lot. Have you tried to profile getAverageRGBN and find the bottleneck?
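A first step towards finding the bottleneck is to time the candidates separately. The sketch below assumes a synthetic array in place of np.array(image); the names pixels, average_flat and mean_flat are made up for illustration, and the numbers it prints will vary with machine and image size. Note that it does not time the PIL-to-numpy conversion itself, which may also be worth measuring.

```python
import timeit
import numpy as np

# Hypothetical stand-in for np.array(image): random H x W x 3 pixel data.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(120, 160, 3), dtype=np.uint8)

def average_flat(arr):
    # Same idea as getAverageRGBN: flatten to (H*W, 3), then average the rows.
    return tuple(np.average(arr.reshape(-1, arr.shape[2]), axis=0))

def mean_flat(arr):
    # The suggested alternative: ndarray.mean on the flattened array.
    return tuple(arr.reshape(-1, arr.shape[2]).mean(axis=0))

# Time each candidate over a few repetitions.
t_average = timeit.timeit(lambda: average_flat(pixels), number=20)
t_mean = timeit.timeit(lambda: mean_flat(pixels), number=20)
print(f"np.average: {t_average:.4f}s  ndarray.mean: {t_mean:.4f}s")
```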
One-liner w/o changing dimension or writing getAverageRGBN:
np.array(image).mean(axis=(0,1))
Again, it might not improve any performance.
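To check that the one-liner computes the same thing as the reshape version, here is a small sketch on a synthetic array (pixels is a hypothetical stand-in for np.array(image), not real image data):

```python
import numpy as np

# Hypothetical stand-in for np.array(image): a small H x W x 3 uint8 array.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(4, 6, 3), dtype=np.uint8)

# Approach from the question: flatten to (H*W, 3), then average over rows.
flat_mean = pixels.reshape(-1, 3).mean(axis=0)

# One-liner: average over both spatial axes at once, no reshape needed.
axis_mean = pixels.mean(axis=(0, 1))

# Both give one mean per channel.
print(flat_mean, axis_mean)
```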
In PIL or Pillow, in Python 3.4+:
from statistics import mean
average_color = [mean(image.getdata(band)) for band in range(3)]
While searching for an algorithm to correct luminance, I came across this article about prospective and retrospective correction. I'm mostly interested in prospective correction. Basically, we take a picture of the scene with the object in it (the original image), plus two other pictures, one bright and one dark, that show only the background of the original picture.
My problem is that I couldn't find any adaptation of these formulas in OpenCV, or any code example. I tried to use the formulas as written in my own code, but then I ran into a problem with data types. This happened when I tried to compute the constant C by applying operations to the images.
This is how I implemented the formula in my code:
def calculate_C(im, im_b):
    fx_mean = cv.mean(im)
    fx_over_bx = np.divide(im,im_b)
    mean_fx_bx = cv.mean(fx_over_bx)
    c = np.divide(fx_mean, mean_fx_bx)
    return c

#Basic image reading and resizing
# Original image
img = cv.imread(image_path)
img = cv.resize(img, (1000,750))
# Bright image
b_img = cv.imread(bright_image_path)
b_img = cv.resize(b_img, (1000,750))
# Calculating C constant from the formula
c_constant = calculate_C(img, b_img)
# Because I have only the bright image I am using second formula from the article
img = np.multiply(np.divide(img,b_img), c_constant)
When I try to run this code I get the error:
img = np.multiply(np.divide(img,b_img), c_constant)
ValueError: operands could not be broadcast together with shapes (750,1000,3) (4,)
So, is there anything I can do to fix my code? Or are there any hints you can share for handling luminance correction with this method, or with better methods?
You are using the cv2.mean function, which returns four values (shape (4,)): the mean of each channel, padded to four entries. You need to drop the last entry so that the result broadcasts correctly against the 3-channel numpy image.
Or you could use numpy for the calculations instead of OpenCV.
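The shape mismatch from the error message can be reproduced, and fixed, with numpy alone. In this sketch mean4 is a hypothetical stand-in for the 4-entry value that cv2.mean gives for a 3-channel image, and img is a small synthetic array rather than a real picture:

```python
import numpy as np

# Small synthetic 3-channel "image" (rows x cols x channels).
img = np.full((4, 5, 3), 100.0, dtype=np.float32)

# Hypothetical stand-in for a cv2.mean result: three channel values plus padding.
mean4 = (2.0, 3.0, 4.0, 0.0)

# Using all four entries fails: trailing dimensions 3 and 4 do not match.
try:
    img * np.asarray(mean4)
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Keeping only the first three entries gives shape (3,), which broadcasts
# over every pixel of the (rows, cols, 3) image.
c = np.asarray(mean4[:3], dtype=np.float32)
scaled = img * c
```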
I took the example images from the article.
grain.png:
grain_background.png:
Complete example:
import cv2
import numpy as np
from numpy.ma import divide, mean
f = cv2.imread("grain.png")
b = cv2.imread("grain_background.png")
f = f.astype(np.float32)
b = b.astype(np.float32)
C = mean(f) / divide(f, b).mean()
g = divide(f, b) * C
g = g.astype(np.uint8)
cv2.imwrite("grain_out.png", g)
You need to use the masked divide operation because ordinary division could divide by zero, producing inf or nan values.
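A minimal sketch of the difference, on made-up one-dimensional data (f and b here are toy arrays, not the images): the masked version marks the division by zero as invalid and leaves it out of the mean, while the plain version lets inf propagate.

```python
import numpy as np

f = np.array([10.0, 20.0, 30.0])
b = np.array([5.0, 0.0, 10.0])  # the middle entry triggers a division by zero

# Plain division: the zero divisor produces inf, which poisons the mean.
with np.errstate(divide="ignore", invalid="ignore"):
    plain = np.divide(f, b)
plain_mean = plain.mean()

# Masked division: the offending entry is masked and skipped by mean().
masked = np.ma.divide(f, b)
masked_mean = masked.mean()
```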
Resulting image (grain_out.png):
I'm working with OpenCV and Python.
I separated the blue, green and red components of an image with OpenCV and Python, then subdivided each of these matrices into 8x8 submatrices in order to work with them. This part is already done.
For each of the 8x8 submatrices, I need to compute the mean and then order the submatrices in descending order of that mean. I'm stuck on this and need help.
The code that I have so far is the following:
import cv2
import numpy as np

img = cv2.imread("6.jpg")
b,g,r = cv2.split(img)

def sub_matrices(color_channel):
    matrices = []
    for i in range(int(color_channel.shape[0]/8)):
        for j in range(int(color_channel.shape[1]/8)):
            matrices.append(color_channel[i*8:i*8 + 8, j*8:j*8+8])
    return matrices

#returns list of sub matrices
r_submatrices = sub_matrices(r)
g_submatrices = sub_matrices(g)
b_submatrices = sub_matrices(b)
print (r_submatrices)
print (g_submatrices)
print (b_submatrices)

for i in r_submatrices:
    x = np.mean(i)
    print(i)
I am using numpy to get the mean, but I do not understand how to order these matrices by the mean value I get.
The easiest way is to calculate all the means, save each matrix and its mean as a pair (you can use a tuple for this), and then sort.
matrix_mean_list = []
for i in r_submatrices:
    x = np.mean(i)
    matrix_mean_list.append((i, x))
# reverse=True gives the descending order asked for in the question
matrix_mean_list = sorted(matrix_mean_list, key=lambda m: m[1], reverse=True)
Now matrix_mean_list is sorted by mean in descending order. You can iterate through it to get the matrices back.
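The same result can also be obtained without Python-level loops. This is only a sketch of one alternative, assuming the channel is a 2-D numpy array; the variable channel below is synthetic test data, not the image from the question:

```python
import numpy as np

# Synthetic single-channel "image"; rows and columns are multiples of 8 here.
channel = np.arange(16 * 16).reshape(16, 16)

h, w = channel.shape
# Crop to a multiple of 8, then split into (rows_of_blocks, cols_of_blocks, 8, 8)
# and flatten to a list-like array of 8x8 blocks, in the same order as the loops.
blocks = (channel[:h // 8 * 8, :w // 8 * 8]
          .reshape(h // 8, 8, w // 8, 8)
          .swapaxes(1, 2)
          .reshape(-1, 8, 8))

# One mean per block, then the block indices from largest to smallest mean.
means = blocks.mean(axis=(1, 2))
order = np.argsort(means)[::-1]

sorted_blocks = blocks[order]
sorted_means = means[order]
```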
I want to create a salt-and-pepper noise function.
The input is noise_density, i.e. the proportion of pixels in the output image that are noise, and the return value should be the noisy image as a data source:
def salt_pepper(noise_density):
    noisesource = ColumnDataSource(data={'image': [noiseImage]})
    return noisesource
This function returns an image that is [density]x[density] pixels, using numpy to generate a random array and using PIL to generate the image itself from the array.
def salt_pepper(density):
    imarray = numpy.random.rand(density,density,3) * 255
    return Image.fromarray(imarray.astype('uint8')).convert('L')
Now, for example, you could run
salt_pepper(500)
To generate an image file that is 500x500px.
Of course, make sure to
import numpy
from PIL import Image
I came up with a vectorized solution which I'm sure can be improved/simplified. Although the interface is not exactly as the requested one, the code is pretty straightforward (and fast 😬) and I'm sure it can be easily adapted.
import numpy as np
from PIL import Image
def salt_and_pepper(image, prob=0.05):
    # If the specified `prob` is negative or zero, we don't need to do anything.
    if prob <= 0:
        return image

    arr = np.asarray(image)
    original_dtype = arr.dtype

    # Derive the number of intensity levels from the array datatype.
    intensity_levels = 2 ** (arr[0, 0].nbytes * 8)

    min_intensity = 0
    max_intensity = intensity_levels - 1

    # Generate an array with the same shape as the image's:
    # Each entry will have:
    # 1 with probability: 1 - prob
    # 0 or np.nan (50% each) with probability: prob
    random_image_arr = np.random.choice(
        [min_intensity, 1, np.nan], p=[prob / 2, 1 - prob, prob / 2], size=arr.shape
    )

    # This results in an image array with the following properties:
    # - With probability 1 - prob: the pixel KEEPS ITS VALUE (it was multiplied by 1)
    # - With probability prob/2: the pixel has value zero (it was multiplied by 0)
    # - With probability prob/2: the pixel has value np.nan (it was multiplied by np.nan)
    # We need `arr.astype(np.float64)` to make sure np.nan is a valid value
    # (the old `np.float` alias no longer exists in recent NumPy versions).
    salt_and_peppered_arr = arr.astype(np.float64) * random_image_arr

    # Since we want SALT instead of NaN, we replace it.
    # We cast the array back to its original dtype so we can pass it to PIL.
    salt_and_peppered_arr = np.nan_to_num(
        salt_and_peppered_arr, nan=max_intensity
    ).astype(original_dtype)

    return Image.fromarray(salt_and_peppered_arr)
You can load a black and white version of Lena like so:
lena = Image.open("lena.ppm")
bwlena = Image.fromarray(np.asarray(lena).mean(axis=2).astype(np.uint8))
Finally, you can save a couple of examples:
salt_and_pepper(bwlena, prob=0.1).save("sp01lena.png", "PNG")
salt_and_pepper(bwlena, prob=0.3).save("sp03lena.png", "PNG")
Results:
https://i.ibb.co/J2y9HXS/sp01lena.png
https://i.ibb.co/VTm5Vy2/sp03lena.png
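The same idea can be sketched without the float/NaN round trip, by drawing one uniform random number per entry and overwriting the two tails. This is a variant rather than the code above: it works on plain arrays, assumes an integer dtype, and the names salt_and_pepper_array and rng are made up for illustration.

```python
import numpy as np

def salt_and_pepper_array(arr, prob=0.05, rng=None):
    """Return a copy of an integer array with roughly `prob` of its entries
    replaced by the minimum (pepper) or maximum (salt) intensity."""
    rng = np.random.default_rng() if rng is None else rng
    out = arr.copy()
    # One uniform draw in [0, 1) per entry.
    draws = rng.random(arr.shape)
    # Lower tail becomes pepper, upper tail becomes salt, the rest is untouched.
    out[draws < prob / 2] = 0
    out[draws > 1 - prob / 2] = np.iinfo(arr.dtype).max
    return out

# Usage on a flat gray test array (not a real image).
gray = np.full((50, 50), 128, dtype=np.uint8)
noisy = salt_and_pepper_array(gray, prob=0.2, rng=np.random.default_rng(0))
```

The result can be passed to Image.fromarray in the same way as in the function above.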
I'm looking for an efficient way to gamma-blend images.
A regular (additive) blend of pixels A and B with a factor r is expressed as:
C = (1-r) A + r B
Gamma (multiplicative) blend is done as follows:
C = A^(1-r) B^r
This would require a way to raise pixel channels to a non-integer power, a bit like a gamma correction.
Since I have a large batch of 4K images to process, I need this to be done efficiently (without looping through all pixels and performing the computation individually).
Thanks!
Posting an implementation of the solution that @Pascal Mount mentioned in the comments, as he has yet to post his own:
import numpy as np
from PIL import Image

def blend_gamma_mul(img_A, img_B, r):
    arr_A = np.array(img_A)
    arr_B = np.array(img_B)
    # element-wise power and product over the whole array, no pixel loop
    arr_C = arr_A**(1-r) * arr_B**r
    return Image.fromarray(np.array(arr_C, dtype=np.uint8))
Use the function like so:
from PIL import Image
img_A = Image.open("A.jpg")
img_B = Image.open("B.jpg")
img_C = blend_gamma_mul(img_A, img_B, 0.7)
img_C.save("C.jpg")
Took 3.47s on my computer to blend two 4k images.
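One possible variation, offered as a sketch and not benchmarked here: do the arithmetic in float32 and round and clip explicitly before converting back to uint8, instead of relying on truncation. It works on plain arrays, and the name blend_gamma_arrays is made up for illustration.

```python
import numpy as np

def blend_gamma_arrays(arr_a, arr_b, r):
    """Multiplicative blend C = A^(1-r) * B^r of two uint8 arrays."""
    a = arr_a.astype(np.float32)
    b = arr_b.astype(np.float32)
    # Vectorized power and product over the whole array.
    c = np.power(a, 1 - r) * np.power(b, r)
    # Round to the nearest integer and keep the result inside the uint8 range.
    return np.clip(np.rint(c), 0, 255).astype(np.uint8)

# Tiny made-up inputs standing in for two images.
arr_a = np.array([[0, 50], [100, 255]], dtype=np.uint8)
arr_b = np.array([[10, 200], [100, 0]], dtype=np.uint8)
blended = blend_gamma_arrays(arr_a, arr_b, 0.7)
```

With r = 0 the result is A, with r = 1 it is B, and any pixel that is zero in either input stays zero for 0 < r < 1.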
I have the following code, which uses Python and OpenCV. Briefly, I have a stack of images taken at different focal depths. The code picks, at every (x, y) position, the pixel with the largest Laplacian of Gaussian (LoG) response among all focal depths (z), thus creating a focus-stacked image. The function get_fmap creates a 2D array where each pixel contains the index of the focal plane with the largest LoG response. In the following code, the commented-out lines are my current VIPS implementation. They don't fit within the function definition because they are only a partial solution.
# from gi.repository import Vips
import sys
import numpy as np
import cv2

def get_log_kernel(siz, std):
    x = y = np.linspace(-siz, siz, 2*siz+1)
    x, y = np.meshgrid(x, y)
    arg = -(x**2 + y**2) / (2*std**2)
    h = np.exp(arg)
    h[h < sys.float_info.epsilon * h.max()] = 0
    h = h/h.sum() if h.sum() != 0 else h
    h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
    return h1 - h1.mean()

def get_fmap(img): # img is a 3-d numpy array.
    log_response = np.zeros_like(img[:, :, 0], dtype='single')
    fmap = np.zeros_like(img[:, :, 0], dtype='uint8')
    log_kernel = get_log_kernel(11, 2)
    # kernel = get_log_kernel(11, 2)
    # kernel = [list(row) for row in kernel]
    # kernel = Vips.Image.new_from_array(kernel)
    # img = Vips.new_from_file("testimg.tif")
    for ii in range(img.shape[2]):
        # img_filtered = img.conv(kernel)
        img_filtered = cv2.filter2D(img[:, :, ii].astype('single'), -1, log_kernel)
        index = img_filtered > log_response
        log_response[index] = img_filtered[index]
        fmap[index] = ii
    return fmap
fmap is then used to pick out pixels from the different focal planes to create a focus-stacked image.
This is done on an extremely large image, and I feel VIPS might do a better job than OpenCV on this. However, the official documentation provides rather scant information on its Python binding. From the information I can find on the internet, I'm only able to make image convolution work (which, in my case, is an order of magnitude faster than OpenCV). I'm wondering how to implement this in VIPS, especially these lines?
log_response = np.zeros_like(img[:, :, 0], dtype = 'single')
index = img_filtered > log_response
log_response[index] = img_filtered[index]
fmap[index] = ii
log_response and fmap are initialized as 3D arrays in the question code, whereas the question text states that the output, fmap is a 2D array. So, I am assuming that log_response and fmap are to be initialized as 2D arrays with their shapes same as each image. Thus, the edits would be -
log_response = np.zeros_like(img[:,:,0], dtype='single')
fmap = np.zeros_like(img[:,:,0], dtype='uint8')
Now, back to the theme of the question: you are performing 2D filtering on each image one by one and getting the index of the maximum filtered output across all stacked images. In case you didn't know, per the documentation of cv2.filter2D, it can also be used on a multi-channel array, giving us a multi-channel array as output. Then, getting the index of the maximum across all images is as simple as .argmax(2). Thus, the implementation should be very efficient and would simply be -
fmap = cv2.filter2D(img,-1,log_kernel).argmax(2)
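To see why .argmax(2) can replace the loop, here is a numpy-only sketch on made-up data. The array responses is a hypothetical stand-in for the already filtered stack, and it is kept strictly positive on purpose: the loop in the question starts from a zero response, so the two approaches are only guaranteed to agree when the best response at each pixel is above zero.

```python
import numpy as np

# Hypothetical filtered stack: rows x cols x focal planes, all values > 0.
rng = np.random.default_rng(0)
responses = rng.random((5, 7, 4)).astype(np.float32) + 0.1

# Loop version, following the question's logic plane by plane.
log_response = np.zeros(responses.shape[:2], dtype=np.float32)
fmap_loop = np.zeros(responses.shape[:2], dtype=np.uint8)
for ii in range(responses.shape[2]):
    plane = responses[:, :, ii]
    index = plane > log_response
    log_response[index] = plane[index]
    fmap_loop[index] = ii

# Vectorized version: index of the largest response along the plane axis.
fmap_argmax = responses.argmax(axis=2)
```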
After consulting the Python VIPS manual and some trial-and-error, I've come up with my own answer. My numpy and OpenCV implementation in question can be translated into VIPS like this:
import pyvips

img = []
for ii in range(num_z_levels):
    img.append(pyvips.Image.new_from_file("testimg_z" + str(ii) + ".tif"))

def get_fmap(img):
    log_kernel = get_log_kernel(11,2)  # get_log_kernel is my own function, which generates a 2-d numpy array.
    log_kernel = [list(row) for row in log_kernel]  # Convert the numpy array to a list of lists for new_from_array.
    log_kernel = pyvips.Image.new_from_array(log_kernel)  # Turn the kernel into a Vips image so it can be used by Vips.
    log_response = img[0].conv(log_kernel)
    fmap = pyvips.Image.black(img[0].width, img[0].height)  # Start with plane 0 everywhere.
    for ii in range(1, len(img)):
        img_filtered = img[ii].conv(log_kernel)
        better = img_filtered > log_response  # Compare before log_response is updated.
        fmap = better.ifthenelse(ii, fmap)  # Keep earlier plane indices where this plane is not better.
        log_response = better.ifthenelse(img_filtered, log_response)
    return fmap
Logical indexing is achieved through the ifthenelse method:
result_img = (test_condition).ifthenelse(value_if_true, value_if_false)
The syntax is rather flexible. The test condition can be a comparison between two images of the same size or between an image and a value, e.g. img1 > img2 or img > 5. Likewise, value_if_true can be a single value or a Vips image.
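For readers coming from numpy, ifthenelse plays a role similar to np.where. The sketch below is numpy, not pyvips, and uses tiny made-up arrays; plane_index is a hypothetical focal plane number. It shows the same pattern of choosing per pixel between a scalar or an array depending on a condition.

```python
import numpy as np

# Made-up responses for a 2 x 2 image.
log_response = np.array([[0.2, 0.9], [0.5, 0.1]])
img_filtered = np.array([[0.7, 0.3], [0.5, 0.8]])
fmap = np.zeros((2, 2), dtype=np.uint8)
plane_index = 3  # hypothetical index of the plane being compared

# The condition is computed once, before either array is updated.
better = img_filtered > log_response

# Scalar where the condition holds, existing array value elsewhere.
fmap = np.where(better, plane_index, fmap)
# Array value where the condition holds, previous best elsewhere.
log_response = np.where(better, img_filtered, log_response)
```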