I am trying to calculate Delta E (see e.g. here) in order to measure the color difference between two different images.
I am following the method at How to compute the Delta E between two images, but (partly in order to reduce dependencies on other libraries), I would like to calculate Delta E in python using only opencv (and/or numpy/scipy) and its dependencies.
How?
I think it is pretty straightforward: just compute the math from the Wikipedia reference. For CIE76 this is ΔE = sqrt(ΔL² + Δa² + Δb²), averaged over all pixels. Here is a Python/OpenCV/Numpy-only solution.
Input A:
Input B:
import cv2
import numpy as np
# read image_A and convert to float in the range [0,1]
# (OpenCV's BGR2LAB conversion expects 32-bit float input scaled to [0,1])
image_A = cv2.imread('barn.jpg').astype("float32")/255
# read image_B and convert to float in the range [0,1]
image_B = cv2.imread('barn_mod.jpg').astype("float32")/255
# convert image_A and image_B from BGR to LAB
image_A = cv2.cvtColor(image_A, cv2.COLOR_BGR2LAB)
image_B = cv2.cvtColor(image_B, cv2.COLOR_BGR2LAB)
# compute difference
diff = image_A - image_B
# separate into L, A, B channel diffs
diff_L = diff[:, :, 0]
diff_A = diff[:, :, 1]
diff_B = diff[:, :, 2]
# compute delta_e as the mean over every pixel using the CIE76 equation from
# https://en.wikipedia.org/wiki/Color_difference#CIELAB_ΔE*
delta_e = np.mean(np.sqrt(diff_L*diff_L + diff_A*diff_A + diff_B*diff_B))
# print result
print(delta_e)
delta_e:
0.29771116
See also:
https://python-colormath.readthedocs.io/en/latest/delta_e.html
https://python-colormath.readthedocs.io/en/latest/_modules/colormath/color_diff.html
https://github.com/scikit-image/scikit-image/blob/master/skimage/color/delta_e.py
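If adding scikit-image as a dependency is acceptable, the mean CIE76 value can be cross-checked against its implementation. A minimal sketch, assuming the same two input files as above (results should agree approximately, not bit-for-bit):

import cv2
import numpy as np
from skimage import color

# load as RGB floats in [0,1], then convert to Lab with scikit-image
rgb_A = cv2.cvtColor(cv2.imread('barn.jpg'), cv2.COLOR_BGR2RGB) / 255.0
rgb_B = cv2.cvtColor(cv2.imread('barn_mod.jpg'), cv2.COLOR_BGR2RGB) / 255.0
delta_e_map = color.deltaE_cie76(color.rgb2lab(rgb_A), color.rgb2lab(rgb_B))
print(np.mean(delta_e_map))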
I want to create a salt and pepper noise function.
The input is noise_density, i.e. the fraction of pixels in the output image that are noise, and the return value should be the noisy image data source.
def salt_pepper(noise_density):
    noisesource = ColumnDataSource(data={'image': [noiseImage]})
    return noisesource
This function returns an image that is [density]x[density] pixels, using numpy to generate a random array and using PIL to generate the image itself from the array.
def salt_pepper(density):
    imarray = numpy.random.rand(density, density, 3) * 255
    return Image.fromarray(imarray.astype('uint8')).convert('L')
Now, for example, you could run
salt_pepper(500)
to generate a 500x500 px image.
Of course, make sure to
import numpy
from PIL import Image
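Note this produces gray-valued noise. If strictly black-and-white salt-and-pepper pixels are wanted, a minimal variant could threshold the random array instead (a sketch; salt_pepper_bw is a hypothetical name, not part of the answer above):

import numpy
from PIL import Image

def salt_pepper_bw(density):
    # each pixel is black (pepper) or white (salt) with equal probability
    imarray = (numpy.random.rand(density, density) > 0.5) * 255
    return Image.fromarray(imarray.astype('uint8'), mode='L')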
I came up with a vectorized solution which I'm sure can be improved/simplified. Although the interface is not exactly the requested one, the code is pretty straightforward (and fast 😬) and I'm sure it can be easily adapted.
import numpy as np
from PIL import Image
def salt_and_pepper(image, prob=0.05):
    # If the specified `prob` is negative or zero, we don't need to do anything.
    if prob <= 0:
        return image

    arr = np.asarray(image)
    original_dtype = arr.dtype

    # Derive the number of intensity levels from the array datatype.
    intensity_levels = 2 ** (arr[0, 0].nbytes * 8)

    min_intensity = 0
    max_intensity = intensity_levels - 1

    # Generate an array with the same shape as the image's:
    # Each entry will have:
    # 1 with probability: 1 - prob
    # 0 or np.nan (50% each) with probability: prob
    random_image_arr = np.random.choice(
        [min_intensity, 1, np.nan], p=[prob / 2, 1 - prob, prob / 2], size=arr.shape
    )

    # This results in an image array with the following properties:
    # - With probability 1 - prob: the pixel KEEPS ITS VALUE (it was multiplied by 1)
    # - With probability prob/2: the pixel has value zero (it was multiplied by 0)
    # - With probability prob/2: the pixel has value np.nan (it was multiplied by np.nan)
    # We need `arr.astype(float)` to make sure np.nan is a valid value.
    salt_and_peppered_arr = arr.astype(float) * random_image_arr

    # Since we want SALT instead of NaN, we replace it.
    # We cast the array back to its original dtype so we can pass it to PIL.
    salt_and_peppered_arr = np.nan_to_num(
        salt_and_peppered_arr, nan=max_intensity
    ).astype(original_dtype)

    return Image.fromarray(salt_and_peppered_arr)
You can load a black and white version of Lena like so:
lena = Image.open("lena.ppm")
bwlena = Image.fromarray(np.asarray(lena).mean(axis=2).astype(np.uint8))
Finally, you can save a couple of examples:
salt_and_pepper(bwlena, prob=0.1).save("sp01lena.png", "PNG")
salt_and_pepper(bwlena, prob=0.3).save("sp03lena.png", "PNG")
Results:
https://i.ibb.co/J2y9HXS/sp01lena.png
https://i.ibb.co/VTm5Vy2/sp03lena.png
I have the following code that uses Python and OpenCV. Briefly, I have a stack of images taken at different focal depths. The code picks out, at every (x, y) position, the pixel that has the largest Laplacian of Gaussian response among all focal depths (z), thus creating a focus-stacked image. The function get_fmap creates a 2-D array where each pixel contains the number of the focal plane having the largest LoG response. In the following code, lines that are commented out are my current VIPS implementation. They don't look compatible within the function definition because they are only a partial solution.
import sys
import numpy as np
import cv2
# from gi.repository import Vips

def get_log_kernel(siz, std):
    x = y = np.linspace(-siz, siz, 2*siz+1)
    x, y = np.meshgrid(x, y)
    arg = -(x**2 + y**2) / (2*std**2)
    h = np.exp(arg)
    h[h < sys.float_info.epsilon * h.max()] = 0
    h = h/h.sum() if h.sum() != 0 else h
    h1 = h*(x**2 + y**2 - 2*std**2) / (std**4)
    return h1 - h1.mean()

def get_fmap(img):  # img is a 3-d numpy array.
    log_response = np.zeros_like(img[:, :, 0], dtype='single')
    fmap = np.zeros_like(img[:, :, 0], dtype='uint8')
    log_kernel = get_log_kernel(11, 2)
    # kernel = get_log_kernel(11, 2)
    # kernel = [list(row) for row in kernel]
    # kernel = Vips.Image.new_from_array(kernel)
    # img = Vips.new_from_file("testimg.tif")
    for ii in range(img.shape[2]):
        # img_filtered = img.conv(kernel)
        img_filtered = cv2.filter2D(img[:, :, ii].astype('single'), -1, log_kernel)
        index = img_filtered > log_response
        log_response[index] = img_filtered[index]
        fmap[index] = ii
    return fmap
and then fmap will be used to pick out pixels from different focal planes to create a focus-stacked image.
This is done on an extremely large image, and I feel VIPS might do a better job than OpenCV here. However, the official documentation provides rather scant information on its Python binding. From the information I can find on the internet, I'm only able to make image convolution work (which, in my case, is an order of magnitude faster than OpenCV). I'm wondering how to implement this in VIPS, especially these lines:
log_response = np.zeros_like(img[:, :, 0], dtype='single')
index = img_filtered > log_response
log_response[index] = img_filtered[index]
fmap[index] = ii
log_response and fmap were initialized as 3D arrays in the question's original code, whereas the question text states that the output fmap is a 2D array. So, I am assuming that log_response and fmap are to be initialized as 2D arrays with their shapes the same as each image. Thus, the edits would be:
log_response = np.zeros_like(img[:,:,0], dtype='single')
fmap = np.zeros_like(img[:,:,0], dtype='uint8')
Now, back to the theme of the question: you are performing 2D filtering on each image one by one and taking the index of the maximum filtered output across all stacked images. In case you didn't know, as per the documentation of cv2.filter2D, it can also be used on a multi-dimensional array, giving a multi-dimensional array as output. Then, getting the index of the maximum across all images is as simple as .argmax(2). Thus, the implementation should be extremely efficient and is simply:
fmap = cv2.filter2D(img,-1,log_kernel).argmax(2)
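For reference, a minimal sketch of this one-liner on a synthetic stack (the random stack, its shape, and the dtype are placeholders; get_log_kernel is the question's function):

import numpy as np
import cv2

stack = np.random.rand(64, 64, 5).astype('float32')   # stands in for a (H, W, Z) focal stack
log_kernel = get_log_kernel(11, 2).astype('float32')

# filter2D treats the third axis as channels and filters each plane independently
fmap = cv2.filter2D(stack, -1, log_kernel).argmax(2)
print(fmap.shape)  # (64, 64)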
After consulting the Python VIPS manual and some trial and error, I've come up with my own answer. My numpy and OpenCV implementation in the question can be translated into VIPS like this:
import pyvips

img = []
for ii in range(num_z_levels):
    img.append(pyvips.Image.new_from_file("testimg_z" + str(ii) + ".tif"))

def get_fmap(img):
    log_kernel = get_log_kernel(11, 2)  # get_log_kernel is my own function, which generates a 2-d numpy array.
    log_kernel = [list(row) for row in log_kernel]  # pyvips.Image.new_from_array takes a list of lists.
    log_kernel = pyvips.Image.new_from_array(log_kernel)  # turn the kernel into a VIPS image so VIPS can use it.
    log_response = img[0].conv(log_kernel)
    fmap = log_response * 0  # focal-plane index map, starts as all plane 0
    for ii in range(1, len(img)):
        img_filtered = img[ii].conv(log_kernel)
        index = img_filtered > log_response  # build the mask before updating log_response
        fmap = index.ifthenelse(ii, fmap)
        log_response = index.ifthenelse(img_filtered, log_response)
    return fmap
Logical indexing is achieved through the ifthenelse method:
result_img = (test_condition).ifthenelse(value_if_true, value_if_false)
The syntax is rather flexible. The test condition can be a comparison between two images of the same size or between an image and a value, e.g. img1 > img2 or img > 5. Likewise, value_if_true can be a single value or a Vips image.
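As a small illustration (a sketch; the file names are placeholders):

import pyvips

a = pyvips.Image.new_from_file("a.tif")
b = pyvips.Image.new_from_file("b.tif")

maxed = (a > b).ifthenelse(a, b)        # per-pixel maximum of two images
clipped = (a > 128).ifthenelse(255, a)  # compare against a value, write a value
maxed.write_to_file("max.tif")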
I have already achieved the goal described in the title, but I was wondering if there is a more efficient (or generally better) way to do it. First of all, let me introduce the problem.
I have a set of images of different sizes, all with a width/height ratio less than or equal to 2 (the bound could be anything, but let's say 2 for now). I want to normalize each one, meaning I want all of them to have the same size. Specifically, I am going to do so like this:
Extract the max height above all images
Zoom the image so that each image reaches the max height keeping its ratio
Add a padding to the right with just white pixels until the image has a width/height ratio of 2
Keep in mind the images are represented as numpy matrices of grayscale values in [0, 255].
This is how I'm doing it now in Python:
max_height = numpy.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
for obs in data:
    if len(obs[0])/len(obs) <= 2:
        new_img = ndimage.zoom(obs, round(max_height/len(obs), 2), order=3)
        missing_cols = max_height * 2 - len(new_img[0])
        norm_img = []
        for row in new_img:
            norm_img.append(np.pad(row, (0, missing_cols), mode='constant', constant_values=255))
        norm_img = np.resize(norm_img, (max_height, max_height*2))
There's a note about this code:
I'm rounding the zoom ratio because doing so makes the final height equal to max_height. I'm sure this is not the best approach, but it works (any suggestion is appreciated here). What I'd like to do is expand the image, keeping its ratio, until it reaches a height equal to max_height. This is the only solution I have found so far; it worked right away, and the interpolation works pretty well.
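(One way to make ndimage.zoom hit max_height exactly is to pass a per-axis zoom tuple instead of a single rounded factor. A minimal sketch, where zoom_to_height is a hypothetical helper rather than part of the code above:)

import numpy as np
from scipy import ndimage

def zoom_to_height(obs, max_height):
    h, w = obs.shape
    zoom_y = max_height / h         # output height rounds to exactly max_height
    zoom_x = round(w * zoom_y) / w  # keep the aspect ratio as closely as possible
    return ndimage.zoom(obs, (zoom_y, zoom_x), order=3)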
So my final questions are:
Is there a better approach to achieve what is explained above (image normalization)? Do you think I could have done this differently? Is there a common good practice I'm not following?
Thanks in advance for your time.
Instead of ndimage.zoom you could use scipy.misc.imresize. This function allows you to specify the target size as a tuple, instead of by zoom factor. Thus you won't have to call np.resize later to get the size exactly as desired.
Note that scipy.misc.imresize calls PIL.Image.resize under the hood, so PIL (or Pillow) is a dependency.
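(Aside: scipy.misc.imresize was deprecated in SciPy 1.0 and removed in 1.3. On a modern install, an equivalent resize can be done with Pillow directly; a sketch for 8-bit grayscale arrays, with imresize_pil a hypothetical stand-in:)

import numpy as np
from PIL import Image

def imresize_pil(arr, target_size):
    # target_size is (height, width); PIL's resize takes (width, height)
    h, w = target_size
    return np.array(Image.fromarray(arr).resize((w, h), resample=Image.BICUBIC))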
Instead of using np.pad in a for-loop, you could allocate space for the desired array, norm_arr, first:
norm_arr = np.full((max_height, max_width), fill_value=255)
and then copy the resized image, new_arr into norm_arr:
nh, nw = new_arr.shape
norm_arr[:nh, :nw] = new_arr
For example,
from __future__ import division
import numpy as np
from scipy import misc

data = [np.linspace(255, 0, i*10).reshape(i, 10)
        for i in range(5, 100, 11)]

max_height = np.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
max_width = 2*max_height

result = []
for obs in data:
    norm_arr = obs
    h, w = obs.shape
    if float(w)/h <= 2:
        scale_factor = max_height/float(h)
        target_size = (max_height, int(round(w*scale_factor)))
        new_arr = misc.imresize(obs, target_size, interp='bicubic')
        norm_arr = np.full((max_height, max_width), fill_value=255)
        # check the shapes
        # print(obs.shape, new_arr.shape, norm_arr.shape)
        nh, nw = new_arr.shape
        norm_arr[:nh, :nw] = new_arr
    result.append(norm_arr)
    # visually check the result
    # misc.toimage(norm_arr).show()
I am trying to convert some code from Matlab to Python, but I am unfamiliar with a considerable amount of the Matlab syntax and functionality. I have managed to do some of the conversion using the PIL and Numpy python packages, but I was hoping someone would be able to explain what is going on with some elements of this code.
clear all; close all; clc;

% Set gray scale to 0 for color images. Will need more memory
GRAY_SCALE = 1

% The physical mask placed close to the sensor has 4 harmonics, therefore
% we will have 9 angular samples in the light field
nAngles = 9;
cAngles = (nAngles+1)/2;

% The fundamental frequency of the cosine in the mask in pixels
F1Y = 238; F1X = 191; %Cosine Frequency in Pixels from Calibration Image
F12X = floor(F1X/2);
F12Y = floor(F1Y/2);

%PhaseShift due to Mask In-Plane Translation wrt Sensor
phi1 = 300; phi2 = 150;

%read 2D image
disp('Reading Input Image...');
I = double(imread('InputCones.png'));
if(GRAY_SCALE)
    %take green channel only
    I = I(:,:,2);
end

%make image odd size
I = I(1:end,1:end-1,:);

%find size of image
[m,n,CH] = size(I);

%Compute Spectral Tile Centers, Peak Strengths and Phase
for i = 1:nAngles
    for j = 1:nAngles
        CentY(i,j) = (m+1)/2 + (i-cAngles)*F1Y;
        CentX(i,j) = (n+1)/2 + (j-cAngles)*F1X;
        %Mat(i,j) = exp(-sqrt(-1)*((phi1*pi/180)*(i-cAngles) + (phi2*pi/180)*(j-cAngles)));
    end
end

Mat = ones(nAngles,nAngles);

% 20 is because we cannot have negative values in the mask. So strength of
% DC component is 20 times that of harmonics
Mat(cAngles,cAngles) = Mat(cAngles,cAngles) * 20;

% Beginning of 4D light field computation
% do for all color channels
for ch = 1:CH
    disp('=================================');
    disp(sprintf('Processing channel %d',ch));

    % Find FFT of image
    disp('Computing FFT of 2D image');
    f = fftshift(fft2(I(:,:,ch)));

    %If you want to visualize the FFT of the input 2D image (Figure 8 of
    %the paper), uncomment the next 2 lines
    % figure;imshow(log10(abs(f)),[]);colormap gray;
    % title('2D FFT of captured image (Figure 8 of paper). Note the spectral replicas');

    %Rearrange Tiles of 2D FFT into 4D Planes to obtain FFT of 4D Light-Field
    disp('Rearranging 2D FFT into 4D');
    for i = 1:nAngles
        for j = 1:nAngles
            FFT_LF(:,:,i,j) = f( CentY(i,j)-F12Y:CentY(i,j)+F12Y, CentX(i,j)-F12X:CentX(i,j)+F12X)/Mat(i,j);
        end
    end
    clear f

    k = sqrt(-1);
    for i = 1:nAngles
        for j = 1:nAngles
            shift = (phi1*pi/180)*(i-cAngles) + (phi2*pi/180)*(j-cAngles);
            FFT_LF(:,:,i,j) = FFT_LF(:,:,i,j)*exp(k*shift);
        end
    end

    disp('Computing inverse 4D FFT');
    LF = ifftn(ifftshift(FFT_LF)); %Compute Light-Field by 4D Inverse FFT
    clear FFT_LF

    if(ch==1)
        LF_R = LF;
    elseif(ch==2)
        LF_G = LF;
    elseif(ch==3)
        LF_B = LF;
    end
    clear LF
end
clear I

%Now we have the 4D light field
disp('Light Field Computed. Done...');
disp('==========================================');

% Digital Refocusing Code
% Take a 2D slice of the 4D light field
% For refocusing, we only need the FFT of the light field, not the light field itself
disp('Synthesizing Refocused Images by taking 2D slice of 4D Light Field');
if(GRAY_SCALE)
    FFT_LF_R = fftshift(fftn(LF_R));
    clear LF_R
else
    FFT_LF_R = fftshift(fftn(LF_R));
    clear LF_R
    FFT_LF_G = fftshift(fftn(LF_G));
    clear LF_G
    FFT_LF_B = fftshift(fftn(LF_B));
    clear LF_B
end

% height and width of refocused image
H = size(FFT_LF_R,1);
W = size(FFT_LF_R,2);

count = 0;
for theta = -14:14
    count = count + 1;
    disp('===============================================');
    disp(sprintf('Calculating New ReFocused Image: theta = %d',theta));
    if(GRAY_SCALE)
        RefocusedImage = Refocus2D(FFT_LF_R,[theta,theta]);
    else
        RefocusedImage = zeros(H,W,3);
        RefocusedImage(:,:,1) = Refocus2D(FFT_LF_R,[theta,theta]);
        RefocusedImage(:,:,2) = Refocus2D(FFT_LF_G,[theta,theta]);
        RefocusedImage(:,:,3) = Refocus2D(FFT_LF_B,[theta,theta]);
    end

    str = sprintf('RefocusedImage%03d.png',count);

    %Scale RefocusedImage to [0,255]
    RefocusedImage = RefocusedImage - min(RefocusedImage(:));
    RefocusedImage = 255*RefocusedImage/max(RefocusedImage(:));

    %write as png image
    clear tt
    for ii = 1:CH
        tt(:,:,ii) = fliplr(RefocusedImage(:,:,ii)');
    end
    imwrite(uint8(tt),str);
    disp(sprintf('Refocused image written as %s',str));
end
Here is the Refocus2D function:
function IOut = Refocus2D(FFTLF,theta)
[m,n,p,q] = size(FFTLF);
Theta1 = theta(1);
Theta2 = theta(2);
cTem = floor(size(FFTLF)/2) + 1;
% find the coordinates of 2D slice
[XX,YY] = meshgrid(1:n,1:m);
cc = (XX - cTem(2))/size(FFTLF,2);
cc = Theta2*cc + cTem(4);
dd = (YY - cTem(1))/size(FFTLF,1);
dd = Theta1*dd + cTem(3);
% Resample 4D light field along the 2D slice
v = interpn(FFTLF,YY,XX,dd,cc,'cubic');
%set nan values to zero
idx = find(isnan(v)==1);
disp(sprintf('Number of Nans in sampling = %d',size(idx,1)))
v(isnan(v)) = 0;
% take inverse 2D FFT to get the image
IOut = real(ifft2(ifftshift(v)));
If anyone could help it would be greatly appreciated.
Thanks in advance
Apologies: here is a brief description of what the code does:
The code reads in an image of a light field. With prior knowledge of the plenoptic mask, we store the relevant nAngles, the fundamental frequencies of the mask, and the phase shift; these are used to find multiple spectral replicas of the image.
Once the image is read in and the green channel is extracted, we perform a Fast Fourier Transform on the image and start taking slices from the image matrix that represent one of the spectral replicas.
We then take the Inverse Fourier Transform of all the spectral replicas to produce the light field.
The Refocus2D function then takes a 2-dimensional slice of the 4D data to recreate a refocused image.
The things I am struggling with specifically are:
FFT_LF(:,:,i,j) = f( CentY(i,j)-F12Y:CentY(i,j)+F12Y, CentX(i,j)-F12X:CentX(i,j)+F12X)/Mat(i,j);
We are taking a slice from the matrix f, but where does that data go in FFT_LF? What does (:,:,i,j) mean? Is it a multidimensional array?
And what does the size function return:
[m,n,p,q] = size(FFTLF);
Just a brief explanation of how this translates to python would be a great help.
Thanks everyone so far :)
How about getting started with this page: http://www.scipy.org/NumPy_for_Matlab_Users? Also, if you have a brief description of what this is supposed to do, that would be good.
You're correct: FFT_LF(:,:,i,j) refers to a multidimensional array. In this case, FFT_LF is a 4-D array, but the calculations result in a 2-D array. The (:,:,i,j) tells MATLAB exactly how to place the 2-D results into the 4-D variable.
In effect, it is storing one MxN array for each pair of indices (i,j). The colons (:) effectively mean "get every element in that dimension."
What [m,n,p,q] = size(FFTLF) will do is return the length of each dimension in your array. So, if FFTLF ends up being a 5x5x3x2 array, you get:
m=5, n=5, p=3, q=2.
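In NumPy terms, the same two ideas look like this (a sketch with placeholder shapes; note Python's 0-based indexing):

import numpy as np

FFT_LF = np.zeros((5, 5, 3, 2), dtype=complex)  # a 4-D array
m, n, p, q = FFT_LF.shape                       # equivalent of [m,n,p,q] = size(FFTLF)

slice_2d = np.ones((5, 5))      # a 2-D result, like one spectral tile
FFT_LF[:, :, 2, 1] = slice_2d   # equivalent of FFT_LF(:,:,i,j) = ..., with 0-based i, j
print(m, n, p, q)               # 5 5 3 2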
If you have MATLAB available, typing "help size" should give a good explanation of what it does. The same can be said for most MATLAB functions: I've always been quite impressed with their documentation.
Hope that helps.