I'm trying to improve the speed of my image manipulation as it's been too slow for actual use.
What I need to do is apply a complex transformation to the colour of every pixel in an image. The manipulation is basically a vector transform like T(r, g, b, a) => (r * x, g * x, b * y, a): in layman's terms, the Red and Green values are multiplied by one constant, Blue by a different constant, and Alpha is kept. But I also need to handle pixels differently if their RGB colour matches certain specific colours; in those cases they must follow a dictionary/transformation table mapping RGB => newRGB, again keeping alpha.
The algorithm would be:
for each pixel in image:
    if pixel[r, g, b] in special:
        return special[pixel[r, g, b]] + pixel[a]
    else:
        return T(pixel)
It's simple, but speed has been sub-optimal. I believe there's some way to do this with NumPy vectors, but I could not find how.
Important details about the implementation:
I don't care about the original buffer/image (manipulation can be in place)
I can use wxPython, Pillow and NumPy
Order or dimension of the array is not important as long as the buffer keeps the same length
The buffer is obtained from a wxPython Bitmap; special and the RG_pal/B_pal tables are transformation tables, and the end result will become a wxPython Bitmap too. They're obtained like this:
# buffer
bitmap = wx.Bitmap  # it's a valid wxBitmap here, this is just to let you know it exists
buf = bytearray(bitmap.GetWidth() * bitmap.GetHeight() * 4)
bitmap.CopyToBuffer(buf, wx.BitmapBufferFormat_RGBA)

self.RG_mult = 0.75
self.B_mult = 0.83

self.RG_pal = []
self.B_pal = []
for i in range(0, 256):
    self.RG_pal.append(int(i * self.RG_mult))
    self.B_pal.append(int(i * self.B_mult))

self.special = {
    # RGB: new_RGB
    # Implementation specific for the fastest access
    # with buffer keys are 24-bit numbers, with PIL keys are tuples
}
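For completeness, writing the modified buffer back into the bitmap is the mirror call of CopyToBuffer above (a minimal sketch, assuming the buffer length is unchanged):
# write the transformed RGBA buffer back into the wx.Bitmap
bitmap.CopyFromBuffer(buf, wx.BitmapBufferFormat_RGBA)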
Implementations I tried include direct buffer manipulation:
for x in range(0, bitmap.GetWidth() * bitmap.GetHeight()):
    index = x * 4
    r = buf[index]
    g = buf[index + 1]
    b = buf[index + 2]
    # pack RGB into a 24-bit key matching the keys in self.special
    # (a bytearray slice is not hashable, so it cannot be used directly)
    rgb = (r << 16) | (g << 8) | b
    if rgb in self.special:
        special = self.special[rgb]
        buf[index] = special[0]
        buf[index + 1] = special[1]
        buf[index + 2] = special[2]
    else:
        buf[index] = self.RG_pal[r]
        buf[index + 1] = self.RG_pal[g]
        buf[index + 2] = self.B_pal[b]
Using Pillow with getdata():
pil = Image.frombuffer("RGBA", (bitmap.GetWidth(), bitmap.GetHeight()), buf)
pil_buf = []
for colour in pil.getdata():
    colour_idx = colour[0:3]
    if colour_idx in self.special:
        special = self.special[colour_idx]
        pil_buf.append((
            special[0],
            special[1],
            special[2],
            colour[3],
        ))
    else:
        pil_buf.append((
            self.RG_pal[colour[0]],
            self.RG_pal[colour[1]],
            self.B_pal[colour[2]],
            colour[3],
        ))
pil.putdata(pil_buf)
buf = pil.tobytes()
Pillow with point() and getdata() (the fastest I achieved, more than twice as fast as the others):
pil = Image.frombuffer("RGBA", (bitmap.GetWidth(), bitmap.GetHeight()), buf)
r, g, b, a = pil.split()
r = r.point(lambda r: r * self.RG_mult)
g = g.point(lambda g: g * self.RG_mult)
b = b.point(lambda b: b * self.B_mult)
pil = Image.merge("RGBA", (r, g, b, a))

i = 0
for colour in pil.getdata():
    colour_idx = colour[0:3]
    if colour_idx in self.special:
        special = self.special[colour_idx]
        pil.putpixel(
            (i % bitmap.GetWidth(), i // bitmap.GetWidth()),
            (
                special[0],
                special[1],
                special[2],
                colour[3],
            )
        )
    i += 1
buf = pil.tobytes()
I also tried working with numpy.where, but I could not get it to work. With numpy.apply_along_axis it worked, but the performance was terrible. In my other NumPy attempts I could not access the RGB values together, only as separate bands.
Pure Numpy Version
This first optimization relies on the fact that there are usually far fewer special colors than pixels. I use numpy to do all the inner loops. This works well with images of up to about 1 MP. If you have multiple images, I'd recommend the parallel approach below.
Let's define a test case:
import requests
from io import BytesIO
from PIL import Image
import numpy as np
# Load some image, so we have the same
response = requests.get("https://upload.wikimedia.org/wikipedia/commons/4/41/Rick_Astley_Dallas.jpg")
# Make areas of known color
img = Image.open(BytesIO(response.content)).rotate(10, expand=True).rotate(-10,expand=True, fillcolor=(255,255,255)).convert('RGBA')
print("height: %d, width: %d (%.2f MP)"%(img.height, img.width, img.width*img.height/10e6))
height: 5034, width: 5792 (2.92 MP)
Define our special colors
specials = {
    (4, 1, 6): (255, 255, 255),
    (0, 0, 0): (255, 0, 255),
    (255, 255, 255): (0, 255, 0)
}
Algorithm
def transform_map(img, specials, R_factor, G_factor, B_factor):
    # Your transform
    def transform(channel, factor):
        out = channel * factor
        return out.clip(0, 255).astype(np.uint8)
    # Convert to array
    img_array = np.asarray(img)
    # Extract channels
    R = img_array.T[0]
    G = img_array.T[1]
    B = img_array.T[2]
    A = img_array.T[3]
    # Find special colors
    # First, calculate a unique hash per pixel
    # (cast to uint32 so the shifted channels cannot overflow uint8)
    color_hashes = (R.astype(np.uint32)
                    + 2**8 * G.astype(np.uint32)
                    + 2**16 * B.astype(np.uint32))
    # Find indices of special colors
    special_idxs = []
    for k, v in specials.items():
        spec_hash = k[0] + 2**8 * k[1] + 2**16 * k[2]
        special_idxs.append(
            {
                'mask': np.where(color_hashes == spec_hash),
                'value': np.array(v)
            }
        )
    # Apply transform to whole image
    R = transform(R, R_factor)
    G = transform(G, G_factor)
    B = transform(B, B_factor)
    # Replace values where special colors were found
    for idx in special_idxs:
        R[idx['mask']] = idx['value'][0]
        G[idx['mask']] = idx['value'][1]
        B[idx['mask']] = idx['value'][2]
    return Image.fromarray(np.array([R, G, B, A]).T, mode='RGBA')
And finally some benchmarks on an Intel Core i5-6300U @ 2.40GHz:
import time
times = []
for i in range(10):
    t0 = time.time()
    # Test
    transform_map(img, specials, 1.2, .9, 1.2)
    t1 = time.time()
    times.append(t1 - t0)
np.round(times, 2)
print('average run time: %.2f +/- %.2f' % (np.mean(times), np.std(times)))
average run time: 9.72 +/-0.91
EDIT: Parallelization
With the same setup as above, we can get more than a 2x speed increase on large images. (Small ones are faster without numba.)
from numba import njit, prange
from numba.core import types
from numba.typed import Dict

# Map dict of special colors or transform over array of pixel values
@njit(parallel=True, locals={'px_hash': types.uint32})
def check_and_transform(img_array, d, T):
    # Save shape for later
    shape = img_array.shape
    # Flatten image for 1-d iteration
    img_array_flat = img_array.reshape(-1, 3).copy()
    N = img_array_flat.shape[0]
    # Replace or map
    for i in prange(N):
        px_hash = np.uint32(0)
        px_hash += img_array_flat[i, 0]
        px_hash += types.uint32(2**8) * img_array_flat[i, 1]
        px_hash += types.uint32(2**16) * img_array_flat[i, 2]
        try:
            img_array_flat[i] = d[px_hash]
        except Exception:
            img_array_flat[i] = (img_array_flat[i] * T).astype(np.uint8)
    # return image
    return img_array_flat.reshape(shape)
# Wrapper for the function above
def map_or_transform_jit(image: Image, specials: dict, T: np.ndarray):
    # assemble a numba typed dict
    d = Dict.empty(
        key_type=types.uint32,
        value_type=types.uint8[:],
    )
    for k, v in specials.items():
        k = types.uint32(k[0] + 2**8 * k[1] + 2**16 * k[2])
        v = np.array(v, dtype=np.uint8)
        d[k] = v
    # get rgb channels
    img_arr = np.array(image)
    rgb = img_arr[:, :, :3].copy()
    img_shape = img_arr.shape
    # apply map
    rgb = check_and_transform(rgb, d, T)
    # set color channels
    img_arr[:, :, :3] = rgb
    return Image.fromarray(img_arr, mode='RGBA')
# Benchmark
import time
times = []
for i in range(10):
    t0 = time.time()
    # Test
    test_img = map_or_transform_jit(img, specials, np.array([1, .5, .5]))
    t1 = time.time()
    times.append(t1 - t0)
np.round(times, 2)
print('average run time: %.2f +/- %.2f' % (np.mean(times), np.std(times)))
test_img
average run time: 3.76 +/- 0.08
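One caveat on the numbers above: numba compiles check_and_transform on its first call, so the first timing iteration includes JIT compilation. A minimal sketch of a warm-up call before benchmarking, with the same arguments as the test:
# warm-up: trigger JIT compilation once so it is excluded from the timings
_ = map_or_transform_jit(img, specials, np.array([1, .5, .5]))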
Related
I'm trying to implement this paper for a university project. The idea is to insert an invisible watermark into a grayscale image, which can later be extracted to verify ownership of the image.
This is the code I wrote for the watermark embedding process:
import pywt
import numpy as np
import cv2
from PIL import Image
from math import sqrt, log10
from scipy.fftpack import dct, idct

def Get_MSB_LSB_Watermark():  # Function that separates the watermark into MSB and LSB images
    MSBs = []
    LSBs = []
    for i in range(len(Watermark)):
        binary = '{:0>8}'.format(str(bin(Watermark[i]))[2:])
        MSB = binary[0:4]
        LSB = binary[4:]
        MSB = int(MSB, 2)
        LSB = int(LSB, 2)
        MSBs.append(MSB)
        LSBs.append(LSB)
    MSBs = np.array(MSBs)
    LSBs = np.array(LSBs)
    return MSBs.reshape(64, 64), LSBs.reshape(64, 64)

def split(array, nrows, ncols):  # Split array into blocks of size nrows * ncols
    r, h = array.shape
    return (array.reshape(h // nrows, nrows, -1, ncols)
            .swapaxes(1, 2)
            .reshape(-1, nrows, ncols))

def unblockshaped(arr, h, w):  # the inverse of the split function
    n, nrows, ncols = arr.shape
    return (arr.reshape(h // nrows, -1, nrows, ncols)
            .swapaxes(1, 2)
            .reshape(h, w))

def ISVD(U, S, V):  # the inverse of singular value decomposition
    s = np.zeros(np.shape(U))
    for i in range(4):
        s[i, i] = S[i]
    recon_image = U @ s @ V
    return recon_image

def Watermark_Embedding(blocks, watermark):
    Watermarked_blocks = []
    k1 = []
    k2 = []
    # convert the watermark to a list
    w = list(np.ndarray.flatten(watermark))
    for i in range(len(blocks)):
        B = blocks[i]
        # Apply singular value decomposition to the block
        U, s, V = np.linalg.svd(B)
        # Modify the singular values of the block
        P = s[1] - s[2]
        delta = abs(w[i]) - P
        s[1] = s[1] + delta
        if s[0] >= s[1]:
            k1.append(1)
        else:
            k1.append(-1)
        # the inverse of SVD after watermark embedding
        reconstructed_B = ISVD(U, s, V)
        Watermarked_blocks.append(reconstructed_B)
    for j in range(len(w)):
        if w[j] >= 0:
            k2.append(1)
        else:
            k2.append(-1)
    return k1, k2, np.array(Watermarked_blocks)

def apply_dct(image_array):
    size = len(image_array[0])
    all_subdct = np.empty((size, size))
    for i in range(0, size, 4):
        for j in range(0, size, 4):
            subpixels = image_array[i:i + 4, j:j + 4]
            subdct = dct(dct(subpixels.T, norm="ortho").T, norm="ortho")
            all_subdct[i:i + 4, j:j + 4] = subdct
    return all_subdct

def inverse_dct(all_subdct):
    size = len(all_subdct[0])
    all_subidct = np.empty((size, size))
    for i in range(0, size, 4):
        for j in range(0, size, 4):
            subidct = idct(idct(all_subdct[i:i + 4, j:j + 4].T, norm="ortho").T, norm="ortho")
            all_subidct[i:i + 4, j:j + 4] = subidct
    return all_subidct

# read watermark
Watermark = Image.open('Copyright.png').convert('L')
Watermark = list(Watermark.getdata())
# Separate the watermark into LSB and MSB images
Watermark1, Watermark2 = Get_MSB_LSB_Watermark()
# Apply discrete cosine transform on the two generated images
DCT_Watermark1 = apply_dct(Watermark1)
DCT_Watermark2 = apply_dct(Watermark2)
# read cover image
Cover_Image = Image.open('10.png').convert('L')
# Apply 1-level discrete wavelet transform
LL1, (LH1, HL1, HH1) = pywt.dwt2(Cover_Image, 'haar')
# Split the LH1 and HL1 subbands into blocks of size 4*4
blocks_LH1 = split(LH1, 4, 4)
blocks_HL1 = split(HL1, 4, 4)
# Watermark embedding in LH1 and HL1, and key generation
Key1, Key3, WatermarkedblocksLH1 = Watermark_Embedding(blocks_LH1, DCT_Watermark1)
Key2, Key4, WatermarkedblocksHL1 = Watermark_Embedding(blocks_HL1, DCT_Watermark2)
# Merge the watermarked blocks
reconstructed_LH1 = unblockshaped(WatermarkedblocksLH1, 256, 256)
reconstructed_HL1 = unblockshaped(WatermarkedblocksHL1, 256, 256)
# Apply the inverse discrete wavelet transform to get the watermarked image
IDWT = pywt.idwt2((LL1, (reconstructed_LH1, reconstructed_HL1, HH1)), 'haar')
cv2.imwrite('Watermarked_img.png', IDWT)
This is the code I wrote for the extraction process:
import pywt
from scipy import fftpack
import numpy as np
import cv2
from PIL import Image
import scipy
from math import sqrt, log10
from Watermark_Embedding import *

def Watermark_Extraction(blocks, key1, key2):
    Extracted_Watermark = []
    for i in range(len(blocks)):
        B = blocks[i]
        # apply SVD on the block
        U, s, V = np.linalg.svd(B)
        if key1[i] == 1:
            P = s[1] - s[2]
            Extracted_Watermark.append(P)
        else:
            P = s[0] - s[2]
            Extracted_Watermark.append(P)
    for j in range(len(Extracted_Watermark)):
        if key2[j] == 1:
            Extracted_Watermark[j] = Extracted_Watermark[j]
        else:
            Extracted_Watermark[j] = -Extracted_Watermark[j]
    return np.array(Extracted_Watermark)

def Merge_W1_W2():
    Merged_watermark = []
    w1 = list(np.ndarray.flatten(IDCTW1))
    w2 = list(np.ndarray.flatten(IDCTW2))
    for i in range(len(w2)):
        bw1 = '{:0>4}'.format(bin(int(abs(w1[i])))[2:])
        bw2 = '{:0>4}'.format(bin(int(abs(w2[i])))[2:])
        P = bw1 + bw2
        pixel = int(P, 2)
        Merged_watermark.append(pixel)
    return Merged_watermark

Watermarked_Image = Image.open('Watermarked_img.png')
LL1, (LH1, HL1, HH1) = pywt.dwt2(Watermarked_Image, 'haar')
blocks_LH1 = split(LH1, 4, 4)
blocks_HL1 = split(HL1, 4, 4)
W1 = Watermark_Extraction(blocks_LH1, Key1, Key3)
W2 = Watermark_Extraction(blocks_HL1, Key2, Key4)
W1 = W1.reshape(64, 64)
W2 = W2.reshape(64, 64)
IDCTW1 = inverse_dct(W1)
IDCTW2 = inverse_dct(W2)
Merged = np.array(Merge_W1_W2())
Merged = Merged.reshape(64, 64)
cv2.imwrite('Extracted_Watermark.png', Merged)
The cover image of size 512*512:
The 64*64 watermark I used:
The watermarked image:
The extracted watermark I get:
I calculated the similarity between the two watermarks using SSIM:
from skimage.metrics import structural_similarity
original_watermark = cv2.imread('Copyright.png')
extracted_watermark = cv2.imread('Extracted_Watermark.png')
# Convert images to grayscale
original_gray = cv2.cvtColor(original_watermark, cv2.COLOR_BGR2GRAY)
extracted_gray = cv2.cvtColor(extracted_watermark, cv2.COLOR_BGR2GRAY)
# Compute SSIM between the two grayscale images
(score, diff) = structural_similarity(original_gray, extracted_gray, full=True)
print("SSIM = ", score)
I didn't apply any modification to the watermarked image, and the SSIM I got is 0.8445354561524052. However, the SSIM of the extracted watermark should be 0.99 according to the paper.
I don't know what's wrong with my code, and I have a deadline in two days, so I really need help.
Thanks in advance.
There are two issues:
In Merge_W1_W2 you are using int to convert from float to int, but that introduces errors for numbers whose floating-point representation is not exact (e.g. 14.99999999999997 truncates to 14); this can be fixed by using round instead.
Saving with cv2.imwrite('Watermarked_img.png', IDWT) is a lossy operation because it rounds the values in IDWT to the nearest integer; if you use Watermarked_Image = IDWT directly, you will get back the exact same watermark image.
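A minimal sketch of both fixes, using the variable names from the code above:
# Fix 1 (in Merge_W1_W2): round instead of truncating with int
bw1 = '{:0>4}'.format(bin(round(abs(w1[i])))[2:])
bw2 = '{:0>4}'.format(bin(round(abs(w2[i])))[2:])
# Fix 2: skip the lossy PNG round-trip and pass the float array
# straight into the extraction step
Watermarked_Image = IDWT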
import cv2
import numpy as np
from scipy import ndimage

cap = cv2.VideoCapture('video.mp4')  # added for completeness: the capture source was not shown
k = 1                                # added: first-frame flag used below
borderType = cv2.BORDER_CONSTANT     # added: the border type was not shown

n = 3
array = np.ones((n, n)) / (n * n)
n = array.shape[0] * array.shape[1]

while True:
    ret, frame = cap.read()
    if ret is True:
        print("newframe")
        gframe = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        dst = cv2.copyMakeBorder(gframe, 1, 1, 1, 1, borderType, None, None)
        blur = cv2.blur(dst, (3, 3))
        if k == 1:
            lastframe = gframe
            curframe = gframe
            nextframe = gframe
            newFrame = gframe
            k = 0
        else:
            lf = ndimage.convolve(lastframe, array, mode='constant', cval=0.0)
            cf = ndimage.convolve(curframe, array, mode='constant', cval=0.0)
            nf = ndimage.convolve(nextframe, array, mode='constant', cval=0.0)
            lastframe = curframe
            curframe = nextframe
            nextframe = gframe
            b = np.zeros((3, 528, 720))
            b[0] = lf
            b[1] = cf
            b[2] = nf
            result = np.mean(b, axis=0)
            cv2.imshow('frame', result)
            cv2.imshow('frame2', gframe)
I am trying to add all the pixel values in a 3x3 neighbourhood and then average them. I need to do that for every pixel and every frame, replacing the primary pixel with the averaged one. However, the way I am doing it makes it really slow and not very accurate.
This sounds like a convolution.
import numpy as np
from scipy import ndimage
a = np.random.random((5, 5))
a
[[0.14742615 0.83548453 0.67433445 0.59162829 0.21160044]
[0.1700598 0.89074466 0.84155171 0.65092969 0.3842437 ]
[0.22662423 0.2266929 0.47757456 0.34480112 0.06261333]
[0.89402116 0.00101947 0.90503461 0.93112109 0.44817247]
[0.21788789 0.3338606 0.07323461 0.28944439 0.91217591]]
Convolution operation with window size 3x3
n = 3
k = np.ones((n, n)) / (n * n)
b = ndimage.convolve(a, k, mode='constant', cval=0.0)
b
[[0.22707946 0.39551126 0.49829704 0.3726987 0.2042669 ]
[0.27744803 0.49894366 0.61486021 0.47103081 0.24953517]
[0.26768469 0.51481368 0.58549664 0.56067136 0.31354238]
[0.21112292 0.37288334 0.39808704 0.4937969 0.33203648]
[0.16075435 0.26945093 0.28152386 0.39546479 0.28676821]]
Now you just have to do it for the current frame, and the two prior frames.
-------- EDIT: For three frames -----------
For 3D you could write a convolution function like in this post, but it's quite complex as it uses FFTs.
If you just want to average across three frames, you could do:
f1 = np.random.random((5, 5)) # Frame1
f2 = np.random.random((5, 5)) # Frame2
f3 = np.random.random((5, 5)) # Frame3
n = 3
k = np.ones((n, n)) / (n * n)
b0 = ndimage.convolve(f1, k, mode='constant', cval=0.0)
b1 = ndimage.convolve(f2, k, mode='constant', cval=0.0)
b2 = ndimage.convolve(f3, k, mode='constant', cval=0.0)
# Create a 3D matrix, with each frame placed along the first dimension
b = np.zeros((3, 5, 5))
b[0] = b0
b[1] = b1
b[2] = b2
# Take the average across the first dimension (across frames)
result = np.mean(b, axis=0)
There probably is a more elegant solution than this, but it gets the job done.
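For reference, one more compact route (a sketch under the same assumptions as the snippet above, not benchmarked): stack the three frames into a 3D array and let scipy.ndimage.uniform_filter take the 3x3x3 box average in a single call; its centre slice matches the convolve-then-mean result.
from scipy import ndimage
import numpy as np
stack = np.stack([f1, f2, f3])  # shape (3, 5, 5)
avg = ndimage.uniform_filter(stack, size=(3, 3, 3), mode='constant')
result = avg[1]  # centre frame: 3x3 spatial and 3-frame temporal mean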
-------- EDIT: For Movies -----------
Based on all the questions in the comments I've decided to attempt to add some more code to help with implementation.
Firstly I'm starting out with these 7 consecutive stills from a movie:
I have not verified that the following code is bug proof or actually returns the correct result.
import cv2
import numpy as np
from scipy import ndimage

# this is a function to do the previous code
def mean_frames(frames, kernel):
    b = np.zeros(frames.shape)
    for i in range(frames.shape[0]):
        b[i] = ndimage.convolve(frames[i], kernel, mode='constant', cval=0.0)
    # note: np.mean already averages across frames; the extra division by
    # frames.shape[0] scales the result down again (see the remark below)
    b = np.mean(b, axis=0) / frames.shape[0]
    return b

mean_N = 3  # frames to average

# read in 1 file to get dimensions
im = cv2.imread(f'{root}1.png', cv2.IMREAD_GRAYSCALE)

# set up numpy matrix that will hold mean_N frames at a time
frames = np.zeros((mean_N, im.shape[0], im.shape[1]))
avg_frames = []  # list to store our averaged frames
count = 0  # counter to position frames in 1st dim of 3D matrix for avg
k = np.ones((3, 3)) / (3 * 3)  # kernel for 2D convolution

for j in range(1, 8):  # 7 images
    file_name = root + str(j) + '.png'
    im = cv2.imread(file_name, cv2.IMREAD_GRAYSCALE)
    frames[count, ::] = im  # store in 3D matrix
    # if loaded more than min req. for avg, we average
    if j >= mean_N:
        # average and store to list
        avg_frames.append(mean_frames(frames, k))
    # if the count is mean_N - 1, that means we need to replace
    # the 0th matrix in frames so that we are doing a 'moving avg'
    if count == (mean_N - 1):
        count = 0
    else:
        count += 1  # increase position in 0th dim for 3D matrix storage

# output averaged frames
for i, f in enumerate(avg_frames):
    cv2.imwrite(f'{path}output{i}.jpg', f)
Then looking at the folder, there are 5 files (as expected if we do a moving average of 3 frames over 7 stills):
looking at before and after:
Image 3:
and averaged image #1:
The image not only is in grayscale (as expected) but also seems quite dark; the extra division flagged in mean_frames above scales everything down by a factor of 3, so some brightening would make things look better/more apparent.
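If brightening is wanted, a quick sketch using OpenCV's min-max normalisation (this assumes the avg_frames list and path from the code above; the output name is just an example):
# stretch each averaged frame to the full 0-255 range before saving
for i, f in enumerate(avg_frames):
    bright = cv2.normalize(f, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    cv2.imwrite(f'{path}bright{i}.jpg', bright)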
Your question is very interesting.
I saw that you use many loops to do this, so let's analyse the process.
First, consider a single frame.
You want to add all the pixel values of a 3x3 neighbourhood, so I think image interpolation is very suitable for this case. In OpenCV we use resize() to interpolate the pixels of an image, and INTER_NEAREST is the best choice for this situation.
This is the formula for INTER_NEAREST.
Now you have the pixel-added image.
Then you want to do that for every pixel of every frame and replace the primary pixel with the averaged one. For this, I think average filtering is the better solution.
The filter runs over every pixel.
Here is a quick example of each step.
Interpolation
img = cv2.resize(img, (img.shape[1] * 3, img.shape[0] * 3), interpolation=cv2.INTER_NEAREST)
Filter
img = cv2.blur(img, (3, 3))
I'm trying to implement Reinhard's method, which uses the color distribution of a target image to color-normalize a passed-in image, for a research project. I've gotten the code to work and it outputs correctly, but it's pretty slow: it takes about 20 minutes to iterate through 300 images. I'm pretty sure the bottleneck is how I'm applying the function to each image. I'm currently iterating over each pixel of the image and applying the functions below to each channel.
def reinhard(target, img):
    # converts image and target from BGR colorspace to l alpha beta
    lAB_img = cv2.cvtColor(img, cv2.COLOR_BGR2Lab)
    lAB_tar = cv2.cvtColor(target, cv2.COLOR_BGR2Lab)
    # finds mean and standard deviation for each color channel across the entire image
    (mean, std) = cv2.meanStdDev(lAB_img)
    (mean_tar, std_tar) = cv2.meanStdDev(lAB_tar)
    # iterates over the image, implementing the formula to map color-normalized pixels to the target image
    for y in range(512):
        for x in range(512):
            lAB_tar[x, y, 0] = (lAB_img[x, y, 0] - mean[0]) / std[0] * std_tar[0] + mean_tar[0]
            lAB_tar[x, y, 1] = (lAB_img[x, y, 1] - mean[1]) / std[1] * std_tar[1] + mean_tar[1]
            lAB_tar[x, y, 2] = (lAB_img[x, y, 2] - mean[2]) / std[2] * std_tar[2] + mean_tar[2]
    mapped = cv2.cvtColor(lAB_tar, cv2.COLOR_Lab2BGR)
    return mapped
My supervisor told me that I could try using a matrix to apply the function all at once to improve the runtime but I'm not exactly sure how to go about doing that.
The original and the target:
Color transfer results using Reinhard's method in 5 ms:
I prefer to implement the formula with numpy vectorized operations rather than Python loops.
# implementing the formula
#(Io - mo)/so*st + mt = Io * (st/so) + mt - mo*(st/so)
ratio = (std_tar/std_ori).reshape(-1)
offset = (mean_tar - mean_ori*std_tar/std_ori).reshape(-1)
lab_tar = cv2.convertScaleAbs(lab_ori*ratio + offset)
Here is the code:
# 2019/02/19 by knight-金
# https://stackoverflow.com/a/54757659/3547485
import numpy as np
import cv2
def reinhard(target, original):
    # cvtColor: COLOR_BGR2Lab
    lab_tar = cv2.cvtColor(target, cv2.COLOR_BGR2Lab)
    lab_ori = cv2.cvtColor(original, cv2.COLOR_BGR2Lab)
    # meanStdDev: calculate mean and standard deviation
    mean_tar, std_tar = cv2.meanStdDev(lab_tar)
    mean_ori, std_ori = cv2.meanStdDev(lab_ori)
    # implementing the formula
    # (Io - mo)/so*st + mt = Io * (st/so) + mt - mo*(st/so)
    ratio = (std_tar / std_ori).reshape(-1)
    offset = (mean_tar - mean_ori * std_tar / std_ori).reshape(-1)
    lab_tar = cv2.convertScaleAbs(lab_ori * ratio + offset)
    # convert back
    mapped = cv2.cvtColor(lab_tar, cv2.COLOR_Lab2BGR)
    return mapped

if __name__ == "__main__":
    ori = cv2.imread("ori.png")
    tar = cv2.imread("tar.png")

    mapped = reinhard(tar, ori)
    cv2.imwrite("mapped.png", mapped)

    mapped_inv = reinhard(ori, tar)
    cv2.imwrite("mapped_inv.png", mapped_inv)
I managed to figure it out after looking at the numpy documentation. I just needed to replace my nested for loop with proper array accessing. It took less than a minute to iterate through all 300 images with this.
lAB_tar[:,:,0] = (lAB_img[:,:,0] - mean[0])/std[0] * std_tar[0] + mean_tar[0]
lAB_tar[:,:,1] = (lAB_img[:,:,1] - mean[1])/std[1] * std_tar[1] + mean_tar[1]
lAB_tar[:,:,2] = (lAB_img[:,:,2] - mean[2])/std[2] * std_tar[2] + mean_tar[2]
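For what it's worth, the three lines can also collapse into a single broadcast expression (a sketch; mean and std come from cv2.meanStdDev as (3, 1) arrays, hence the reshape, and the result is clipped back to uint8):
scale = (std_tar / std).reshape(1, 1, 3)
shift = (mean_tar - mean * std_tar / std).reshape(1, 1, 3)
lAB_tar = np.clip(lAB_img * scale + shift, 0, 255).astype(np.uint8)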
I have an RGB image with 12 distinct colours, but I do not know the colours (pixel values) beforehand. I want to convert the pixel values to values between 0 and 11, each symbolising a unique colour of the original RGB image.
e.g. all [230,100,140] converted to [0,0,0], all [130,90,100] converted to [0,0,1], and so on... all [210,80,50] converted to [0,0,11].
Quick and dirty application. Much can be improved; in particular, going through the whole image pixel by pixel is not very numpy nor very opencv, but I was too lazy to remember exactly how to threshold and replace RGB pixels.
import cv2
import numpy as np

# finding unique rows
# comes from this answer: http://stackoverflow.com/questions/8560440/removing-duplicate-columns-and-rows-from-a-numpy-2d-array
def unique_rows(a):
    a = np.ascontiguousarray(a)
    unique_a = np.unique(a.view([('', a.dtype)] * a.shape[1]))
    return unique_a.view(a.dtype).reshape((unique_a.shape[0], a.shape[1]))

img = cv2.imread(your_image)

# listing all pixels
pixels = []
for p in img:
    for k in p:
        pixels.append(k)

# finding all different colors
colors = unique_rows(pixels)

# comparing each color to every pixel
res = np.zeros(img.shape)
cpt = 0
for color in colors:
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            if (img[i, j, :] == color).all():  # if pixel is this color
                res[i, j, :] = [0, 0, cpt]     # set the pixel to [0,0,counter]
    cpt += 1
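As a sketch of the vectorised version of that last triple loop (same variables as above), each colour can be replaced through a boolean mask in one shot:
res = np.zeros(img.shape)
for cpt, color in enumerate(colors):
    mask = (img == color).all(axis=2)  # H x W boolean mask for this colour
    res[mask] = [0, 0, cpt]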
You can use np.unique with a bit of trickery:
import numpy as np

def safe_method(image, k):
    # a bit of black magic to make np.unique handle triplets
    out = np.zeros(image.shape[:-1], dtype=np.int32)
    out8 = out.view(np.int8)
    # should really check endianness here
    out8.reshape(image.shape[:-1] + (4,))[..., 1:] = image
    uniq, map_ = np.unique(out, return_inverse=True)
    assert uniq.size == k
    map_.shape = image.shape[:-1]
    # map_ contains the desired result. However, order of colours is most
    # probably different from original
    colours = uniq.view(np.uint8).reshape(-1, 4)[:, 1:]
    return colours, map_
However, if the number of pixels is much larger than the number of colours, the following heuristic algorithm may deliver huge speedups. It tries to find a cheap hash function (such as only looking at the red channel), and if it succeeds it uses that to create a lookup table. If not, it falls back to the safe method above.
CHEAP_HASHES = [lambda x: x[..., 0], lambda x: x[..., 1], lambda x: x[..., 2]]

def fast_method(image, k):
    # find all colours
    chunk = int(4 * k * np.log(k)) + 1
    colours = set()
    for chunk_start in range(0, image.size // 3, chunk):
        colours |= set(
            map(tuple, image.reshape(-1, 3)[chunk_start:chunk_start + chunk]))
        if len(colours) == k:
            break
    colours = np.array(sorted(colours))
    # find hash method
    for method in CHEAP_HASHES:
        if len(set(method(colours))) == k:
            break
    else:
        return safe_method(image, k)
    # create lookup table
    hashed = method(colours)
    # should really provide for unexpected colours here
    lookup = np.empty((hashed.max() + 1,), int)
    lookup[hashed] = np.arange(k)
    return colours, lookup[method(image)]
Testing and timings:
from timeit import timeit

def create_image(k, M, N):
    colours = np.random.randint(0, 256, (k, 3)).astype(np.uint8)
    map_ = np.random.randint(0, k, (M, N))
    image = colours[map_, :]
    return colours, map_, image

k, M, N = 12, 1000, 1000
colours, map_, image = create_image(k, M, N)
for f in fast_method, safe_method:
    print('{:16s} {:10.6f} ms'.format(f.__name__, timeit(
        lambda: f(image, k), number=10) * 100))
    rec_colours, rec_map_ = f(image, k)
    print('solution correct:', np.all(rec_colours[rec_map_, :] == image))
Sample output (12 colours, 1000x1000 pixels):
fast_method 3.425885 ms
solution correct: True
safe_method 73.622813 ms
solution correct: True
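As an aside, on NumPy 1.13+ the view trickery in safe_method can be avoided, since np.unique accepts an axis argument; a minimal sketch:
def unique_axis_method(image, k):
    # deduplicate pixels directly as rows of an (N, 3) array
    flat = image.reshape(-1, 3)
    colours, map_ = np.unique(flat, axis=0, return_inverse=True)
    assert colours.shape[0] == k
    return colours, map_.reshape(image.shape[:-1])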
I open a TIFF LAB image and return a big numpy array (4928x3264x3 float64) using Python with this function:
def readTIFFLAB(filename):
    """Read TIFF LAB and return a float matrix
    read 16 bit (2 byte) each time without any multiprocessing
    about 260 sec"""
    import numpy as np
    ....
    ....
    # Data read
    # Matrix creation
    dim = (int(ImageLength), int(ImageWidth), int(SamplePerPixel))
    Image = np.empty(dim, np.float64)
    contatore = 0
    for address in range(0, len(StripOffsets)):
        offset = StripOffsets[address]
        f.seek(offset)
        for lung in range(0, (StripByteCounts[address] / SamplePerPixel / 2)):
            v = np.array(f.read(2))
            v.dtype = np.uint16
            v1 = np.array(f.read(2))
            v1.dtype = np.int16
            v2 = np.array(f.read(2))
            v2.dtype = np.int16
            v = np.array([v / 65535.0 * 100])
            v1 = np.array([v1 / 32768.0 * 128])
            v2 = np.array([v2 / 32768.0 * 128])
            v = np.append(v, [v1, v2])
            riga = contatore // ImageWidth
            colonna = contatore % ImageWidth
            # print(contatore, riga, colonna)
            Image[riga, colonna, :] = v
            contatore += 1
    return Image
but this routine needs about 270 seconds to do all the work and return a numpy array.
I tried to use multiprocessing, but it is not possible to share an array or to use a queue to pass one, and sharedmem is not usable on Windows systems (at home I use openSUSE, but at work I must use Windows).
Could someone help me reduce the elaboration time? I read about threading and about writing some parts in C, but I don't understand what the best (and easiest) solution is... I'm a food technologist, not a real programmer :-)
Thanks
Wow, your method is really slow indeed; try the tifffile library, you can find it here. That library will open your file very fast, then you just need to make the proper conversion. Here's the simple usage:
import numpy as np
import tifffile
from skimage import color
import time
import matplotlib.pyplot as plt

def convert_to_tifflab(image):
    # divide the color channel
    L = image[:, :, 0]
    a = image[:, :, 1]
    b = image[:, :, 2]
    # correct interpretation of a/b channel
    a.dtype = np.int16
    b.dtype = np.int16
    # scale the result
    L = L / 65535.0 * 100
    a = a / 32768.0 * 128
    b = b / 32768.0 * 128
    # join the result
    lab = np.dstack([L, a, b])
    # view the image
    start = time.time()
    rgb = color.lab2rgb(lab)
    print "Lab2Rgb: {0}".format(time.time() - start)
    return rgb

if __name__ == "__main__":
    filename = '/home/cilladani1/FERRERO/Immagini Digi Eye/Test Lettura CIELAB/TestLetturaCIELAB (LAB).tif'
    start = time.time()
    I = tifffile.imread(filename)
    end = time.time()
    print "Image fetching: {0}".format(end - start)
    rgb = convert_to_tifflab(I)
    print "Image conversion: {0}".format(time.time() - end)
    plt.imshow(rgb)
    plt.show()
The benchmark gives this data:
Image fetching: 0.0929999351501
Lab2Rgb: 12.9520001411
Image conversion: 13.5920000076
As you can see, the bottleneck in this case is lab2rgb, which converts from xyz to rgb space. I'd recommend reporting an issue to the author of tifffile requesting a feature to read your file format; I'm sure he'll be able to speed up the C code directly.
After doing what BPL suggested, I modified the result array as follows:
# divide the color channel
L = I[:, :, 0]
a = I[:, :, 1]
b = I[:, :, 2]
# correct interpretation of a/b channel
a.dtype = np.int16
b.dtype = np.int16
# scale the result
L = L / 65535.0 * 100
a = a / 32768.0 * 128
b = b / 32768.0 * 128
# join the result
lab = np.dstack([L, a, b])
# view the image
from skimage import color
rgb = color.lab2rgb(lab)
plt.imshow(rgb)
So now it is easier to read a TIFF LAB image.
Thanks BPL