I am using Python 3.6 to perform basic image manipulation through Pillow. Currently, I am attempting to take 32-bit PNG images (RGBA) of arbitrary color compositions and sizes and quantize them to a known palette of 16 colors. Optimally, this quantization method should be able to leave fully transparent (A = 0) pixels alone, while forcing all semi-transparent pixels to be fully opaque (A = 255). I have already devised working code that performs this, but I wonder if it may be inefficient:
import math
from PIL import Image
# a list of 16 RGBA tuples
palette = [
(0, 0, 0, 255),
# ...
]
with Image.open('some_image.png').convert('RGBA') as img:
for py in range(img.height):
for px in range(img.width):
pix = img.getpixel((px, py))
if pix[3] == 0: # Ignore fully transparent pixels
continue
# Perform exhaustive search for closest Euclidean distance
dist = 450
best_fit = (0, 0, 0, 0)
for c in palette:
if pix[:3] == c: # If pixel matches exactly, break
best_fit = c
break
tmp = sqrt(pow(pix[0]-c[0], 2) + pow(pix[1]-c[1], 2) + pow(pix[2]-c[2], 2))
if tmp < dist:
dist = tmp
best_fit = c
img.putpixel((px, py), best_fit + (255,))
img.save('quantized.png')
I think of two main inefficiencies of this code:
Image.putpixel() is a slow operation
Calculating the distance function multiple times per pixel is computationally wasteful
Is there a faster method to do this?
I've noted that Pillow has a native function Image.quantize() that seems to do exactly what I want. But as it is coded, it forces dithering in the result, which I do not want. This has been brought up in another StackOverflow question. The answer to that question was simply to extract the internal Pillow code and tweak the control variable for dithering, which I tested, but I find that Pillow corrupts the palette I give it and consistently yields an image where the quantized colors are considerably darker than they should be.
Image.point() is a tantalizing method, but it only works on each color channel individually, where color quantization requires working with all channels as a set. It'd be nice to be able to force all of the channels into a single channel of 32-bit integer values, which seems to be what the ill-documented mode "I" would do, but if I run img.convert('I'), I get a completely greyscale result, destroying all color.
An alternative method seems to be using NumPy and altering the image directly. I've attempted to create a lookup table of RGB values, but the three-dimensional indexing of NumPy's syntax is driving me insane. Ideally I'd like some kind of code that works like this:
img_arr = numpy.array(img)
# Find all unique colors
unique_colors = numpy.unique(arr, axis=0)
# Generate lookup table
colormap = numpy.empty(unique_colors.shape)
for i, c in enumerate(unique_colors):
dist = 450
best_fit = None
for pc in palette:
tmp = sqrt(pow(c[0] - pc[0], 2) + pow(c[1] - pc[1], 2) + pow(c[2] - pc[2], 2))
if tmp < dist:
dist = tmp
best_fit = pc
colormap[i] = best_fit
# Hypothetical pseudocode I can't seem to write out
for iy in range(arr.size):
for ix in range(arr[0].size):
if arr[iy, ix, 3] == 0: # Skip transparent
continue
index = # Find index of matching color in unique_colors, somehow
arr[iy, ix] = colormap[index]
I note with this hypothetical example that numpy.unique() is another slow operation, since it sorts the output. Since I cannot seem to finish the code the way I want, I haven't been able to test if this method is faster anyway.
I've also considered attempting to flatten the RGBA axis by converting the values to a 32-bit integer and desiring to create a one-dimensional lookup table with the simpler index:
def shift(a):
return a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3]
img_arr = numpy.apply_along_axis(shift, 1, img_arr)
But this operation seemed noticeably slow on its own.
I would prefer answers that involve only Pillow and/or NumPy, please. Unless using another library demonstrates a dramatic computational speed increase over any PIL- or NumPy-native solution, I don't want to import extraneous libraries to do something these two libraries should be reasonably capable of on their own.
for loops should be avoided for speed.
I think you should make a tensor like:
d2[x,y,color_index,rgb] = distance_squared
where rgb = 0..2 (0 = r, 1 = g, 2 = b).
Then compute the distance:
d[x,y,color_index] =
sqrt(sum(rgb,d2))
Then select the color_index with the minimal distance:
c[x,y] = min_index(color_index, d)
Finally replace alpha as needed:
alpha = ceil(orig_image.alpha)
img = c,alpha
Related
Finally, I succeeded in changing the MATLAB code to Python code.
However, contrary to what i have heard, Python's execution speed was remarkably slow.
Is MATLAB better on the image processing side these days?
Of course, I have no choice but to use Python because the company doesn't buy MATLAB...
ps. run Python in visual studio environment and IDLE.
If there's a way to speed Python up, please help me.
## Exercise Python image processing ##
import numpy as np
import cv2
import matplotlib.pyplot as plt
B = cv2.imread(r'D:\remedi\Exercise\Xray\Offset.png', -1) # offset image
for i in range(2,3):
org_I = cv2.imread(r'D:\remedi\Exercise\Xray\objects\object (' + str(i) + ').png', -1) # original image
w = cv2.imread(r'D:\remedi\Exercise\Xray\white\white (' + str(i) + ').png', -1) # white image
# dead & bad pixel correction
corrected_w = w.copy()
corrected_org_I = org_I.copy()
c = np.mean(corrected_w)
p = np.abs(corrected_w - c)
sens = 0.7
[num_y, num_x] = np.where((p < c*sens) | (p > c*sens))
ar = np.zeros((3,3))
ar2 = np.zeros((3,3))
for n in range(0, num_y.shape[0]):
for j in range(-1,2):
for k in range(-1,2):
if num_y[n]+j+1 == 0 or num_x[n]+k+1 == 0 or num_y[n]+j+1 == 577 or num_x[n]+k+1 == 577:
ar[j+1][k+1] = 0
ar2[j+1][k+1] = 0
else:
ar[j+1][k+1] = corrected_w[num_y[n]+j-1][num_x[n]+k-1]
ar2[j+1][k+1] = corrected_org_I[num_y[n]+j-1][num_x[n]+k-1]
ar[1][1] = 0
ar2[1][1] = 0
corrected_w[num_y[n]][num_x[n]] = np.sum(ar)/np.count_nonzero(ar)
corrected_org_I[num_y[n]][num_x[n]] = np.sum(ar2)/np.count_nonzero(ar2)
c = np.mean(corrected_w) # constant
FFC = np.uint16(np.divide(c*(corrected_org_I-B), (corrected_w-B))) # flat field correction
plt.subplot(2,3,1), plt.imshow(org_I, cmap='gray'), plt.title('Original Image')
plt.subplot(2,3,2), plt.imshow(corrected_org_I, cmap='gray'), plt.title('corrected original Image')
plt.subplot(2,3,3), plt.imshow(FFC, cmap='gray'), plt.title('FFC')
plt.subplot(2,3,4), plt.imshow(w, cmap='gray'), plt.title('w')
plt.subplot(2,3,5), plt.imshow(corrected_w, cmap='gray'), plt.title('corrected w')
plt.subplot(2,3,6), plt.imshow(B, cmap='gray'), plt.title('B')
plt.show()
cv2 calls into c++ library code, there is some overhead to each of these calls. These can add up depending on what you are doing. Additionally opencv takes advantage of some processor optimizations such as SIMD and NEON.
Here is a concrete example, I have implemented my own threshold function by iterating through the pixels. I then use opencv's built-in threshold function. I print timing for each function and compare the final result to verify they are the same.
img = cv2.imread('image.jpg', cv2.IMREAD_GRAYSCALE)
thresh = 128
thresh_loops = img.copy()
# implement threshold with loops
t1 = cv2.getTickCount()
for c in range(img.shape[0]):
for r in range(img.shape[1]):
if thresh_loops[c,r] > thresh:
thresh_loops[c,r] = 255
else:
thresh_loops[c,r] = 0
t2 = cv2.getTickCount()
print((t2-t1)/cv2.getTickFrequency())
# use the threshold function
t1 = cv2.getTickCount()
thr, thresh_fun = cv2.threshold(img, thresh, 255, cv2.THRESH_BINARY)
t2 = cv2.getTickCount()
print((t2-t1)/cv2.getTickFrequency())
print(np.all(thresh_loops == thresh_fun)) # verify the same result
I ran on a 720p image (720x1280 - or approximately 1 million pixels), on my machine the first function too 2.319724256 seconds the second took 0.013217389 seconds. So the second function is approximately 200x faster! You will run into this problem if you use Java or other languages that call into the OpenCV library (or any other C library).
Use it as good motivation to learn the api.
I'll also add that if you are doing image processing in Python with OpenCV, you should also learn the Numpy api, since the matrix (Mat) class is represented as a numpy array. You can get massive speedups by knowing the numpy api as well.
I switched to python a year ago from my beloved Matlab. Yes you are right, Matlab is faster than python, usually at least by 3-time or more.
2 ways I found to save python is here
Try to read Numpy Functional programming Routines and replace some for-loops with methods there. They are much faster than slicing an ndarray in side a for-loop.
Try Multiprocessing. I know that Matlab parfor is also better than any parallel computation package in python. Try 'joblib' or 'multiprocessing'. They might help a little bit.
Previously, I had created a Mandelbrot generator in python using turtle. Now, I am re-writing the program to use the Python Imaging Library in order to increase speed and reduce limits on size of images.
However, the program below only outputs RGB nonsense, almost noise. I think it is something to do with a difference in the way NumPy and PIL deal with arrays, since saying l[x,y] = [1,1,1] where l = np.zeros((height,width,3)) doesn't just make 1 pixel white when img = Image.fromarray(l) and img.show() are performed.
def imagebrot(mina=-1.25, maxa=1.25, minb=-1.25, maxb=1.25, width=100, height=100, maxit=300, inf=2):
l,b = np.zeros((height,width,3), dtype=np.float64), minb
for y in range(0, height):
a = mina
for x in range(0, width):
ab = mandel(a, b, maxit, inf)
if ab[0] == maxit:
l[x,y:] = [1,1,1]
#if ab[0] < maxit:
#smoothit = mandelc(ab[0], ab[1], ab[2])
#l[x, y] = colorsys.hsv_to_rgb(smoothit, 1, 1)
a += abs(mina-maxa)/width
b += abs(minb-maxb)/height
img = Image.fromarray(l, "RGB")
img.show()
def mandel(re, im, maxit, inf):
z = complex(re, im)
c,it = z,0
for i in range(0, maxit):
if abs(z) > inf:
break
z,it = z*z+c,it+1
return it,z,inf
def mandelc(it,z,inf):
return (it+1-log(log(abs(z)))/log(2))
UPDATE 1:
I realised that one of the major errors in this program (I'm sure there are many) is the fact that I was using the x,y coords as the complex coefficients! So, 0 to 100 instead of -1.25 to 1.25! I have changed this so that the code now uses variables a,b to describe them, incremented in a manner I've stolen from some of my code in the turtle version. The code above has been updated accordingly. Since the Smooth Colouring Algorithm code is currently commented out for debugging, the inf variable has been reduced to 2 in size.
UPDATE 2:
I have edited the numpy index with help from a great user. The program now outputs this when set to 200 by 200:
As you can see, it definitely shows some mathematical shape and yet is filled with all these strange red, green and blue pixels! Why could these be here? My program can only set RGB values to [1,1,1] or leave it as a default [0,0,0]. It can't be [1,0,0] or anything like that - this must be a serious flaw...
UPDATE 3:
I think there is an error with NumPy and PIL's integration. If I make l = np.zeros((100, 100, 3)) and then state l[0,0,:] = 1 and finally img = Image.fromarray(l) & img.show(), this is what we get:
Here we get a series of coloured pixels. This calls for another question.
UPDATE 4:
I have no idea what was happening previously, but it seems with a np.uint8 array, Image.fromarray() uses colour values from 0-255. With this piece of wisdom, I move one step closer to understanding this Mandelbug!
Now, I do get something vaguely mathematical, however it still outputs strange things.
This dot is all there is... I get even stranger things if I change to np.uint16, I presume due to the different byte-shape and encoding scheme.
You are indexing the 3D array l incorrectly, try
l[x,y,:] = [1,1,1]
instead. For more details on how to access and modify numpy arrays have a look at numpy indexing
As a side note: the quickstart documentation of numpy actually has an implementation of the mandelbrot set generation and plotting.
I already achieved the goal described in the title but I was wondering if there was a more efficient (or generally better) way to do it. First of all let me introduce the problem.
I have a set of images of different sizes but with a width/height ratio less than (or equal) 2 (could be anything but let's say 2 for now), I want to normalize each one, meaning I want all of them to have the same size. Specifically I am going to do so like this:
Extract the max height above all images
Zoom the image so that each image reaches the max height keeping its ratio
Add a padding to the right with just white pixels until the image has a width/height ratio of 2
Keep in mind the images are represented as numpy matrices of grey scale values [0,255].
This is how I'm doing it now in Python:
max_height = numpy.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
for obs in data:
if len(obs[0])/len(obs) <= 2:
new_img = ndimage.zoom(obs, round(max_height/len(obs), 2), order=3)
missing_cols = max_height * 2 - len(new_img[0])
norm_img = []
for row in new_img:
norm_img.append(np.pad(row, (0, missing_cols), mode='constant', constant_values=255))
norm_img = np.resize(norm_img, (max_height, max_height*2))
There's a note about this code:
I'm rounding the zoom ratio because it makes the final height equal to max_height, I'm sure this is not the best approach but it's working (any suggestion is appreciated here). What I'd like to do is to expand the image keeping the ratio until it reaches a height equal to max_height. This is the only solution I found so far and it worked right away, the interpolation works pretty good.
So my final questions are:
Is there a better approach to achieve what explained above (image normalization) ? Do you think I could have done this differently ? Is there a common good practice I'm not following ?
Thanks in advance for your time.
Instead of ndimage.zoom you could use
scipy.misc.imresize. This
function allows you to specify the target size as a tuple, instead of by zoom
factor. Thus you won't have to call np.resize later to get the size exactly as
desired.
Note that scipy.misc.imresize calls
PIL.Image.resize
under the hood, so PIL (or Pillow) is a dependency.
Instead of using np.pad in a for-loop, you could allocate space for the desired array, norm_arr, first:
norm_arr = np.full((max_height, max_width), fill_value=255)
and then copy the resized image, new_arr into norm_arr:
nh, nw = new_arr.shape
norm_arr[:nh, :nw] = new_arr
For example,
from __future__ import division
import numpy as np
from scipy import misc
data = [np.linspace(255, 0, i*10).reshape(i,10)
for i in range(5, 100, 11)]
max_height = np.max([len(obs) for obs in data if len(obs[0])/len(obs) <= 2])
max_width = 2*max_height
result = []
for obs in data:
norm_arr = obs
h, w = obs.shape
if float(w)/h <= 2:
scale_factor = max_height/float(h)
target_size = (max_height, int(round(w*scale_factor)))
new_arr = misc.imresize(obs, target_size, interp='bicubic')
norm_arr = np.full((max_height, max_width), fill_value=255)
# check the shapes
# print(obs.shape, new_arr.shape, norm_arr.shape)
nh, nw = new_arr.shape
norm_arr[:nh, :nw] = new_arr
result.append(norm_arr)
# visually check the result
# misc.toimage(norm_arr).show()
I have a problem in which a have a bunch of images for which I have to generate histograms. But I have to generate an histogram for each pixel. I.e, for a collection of n images, I have to count the values that the pixel 0,0 assumed and generate an histogram, the same for 0,1, 0,2 and so on. I coded the following method to do this:
class ImageData:
def generate_pixel_histogram(self, images, bins):
"""
Generate a histogram of the image for each pixel, counting
the values assumed for each pixel in a specified bins
"""
max_value = 0.0
min_value = 0.0
for i in range(len(images)):
image = images[i]
max_entry = max(max(p[1:]) for p in image.data)
min_entry = min(min(p[1:]) for p in image.data)
if max_entry > max_value:
max_value = max_entry
if min_entry < min_value:
min_value = min_entry
interval_size = (math.fabs(min_value) + math.fabs(max_value))/bins
for x in range(self.width):
for y in range(self.height):
pixel_histogram = {}
for i in range(bins+1):
key = round(min_value+(i*interval_size), 2)
pixel_histogram[key] = 0.0
for i in range(len(images)):
image = images[i]
value = round(Utils.get_bin(image.data[x][y], interval_size), 2)
pixel_histogram[value] += 1.0/len(images)
self.data[x][y] = pixel_histogram
Where each position of a matrix store a dictionary representing an histogram. But, how I do this for each pixel, and this calculus take a considerable time, this seems to me to be a good problem to be parallelized. But I don't have experience with this and I don't know how to do this.
EDIT:
I tried what #Eelco Hoogendoorn told me and it works perfectly. But applying it to my code, where the data are a large number of images generated with this constructor (after the values are calculated and not just 0 anymore), I just got as h an array of zeros [0 0 0]. What I pass to the histogram method is an array of ImageData.
class ImageData(object):
def __init__(self, width=5, height=5, range_min=-1, range_max=1):
"""
The ImageData constructor
"""
self.width = width
self.height = height
#The values range each pixel can assume
self.range_min = range_min
self.range_max = range_max
self.data = np.arange(width*height).reshape(height, width)
#Another class, just the method here
def generate_pixel_histogram(realizations, bins):
"""
Generate a histogram of the image for each pixel, counting
the values assumed for each pixel in a specified bins
"""
data = np.array([image.data for image in realizations])
min_max_range = data.min(), data.max()+1
bin_boundaries = np.empty(bins+1)
# Function to wrap np.histogram, passing on only the first return value
def hist(pixel):
h, b = np.histogram(pixel, bins=bins, range=min_max_range)
bin_boundaries[:] = b
return h
# Apply this for each pixel
hist_data = np.apply_along_axis(hist, 0, data)
print hist_data
print bin_boundaries
Now I get:
hist_data = np.apply_along_axis(hist, 0, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/numpy/lib/shape_base.py", line 104, in apply_along_axis
outshape[axis] = len(res)
TypeError: object of type 'NoneType' has no len()
Any help would be appreciated.
Thanks in advance.
As noted by john, the most obvious solution to this is to look for library functionality that will do this for you. It exists, and it will be orders of magnitude more efficient than what you are doing here.
Standard numpy has a histogram function that can be used for this purpose. If you have only few values per pixel, it will be relatively inefficient; and it creates a dense histogram vector rather than the sparse one you produce here. Still, chances are good the code below solves your problem efficiently.
import numpy as np
#some example data; 128 images of 4x4 pixels
voxeldata = np.random.randint(0,100, (128, 4,4))
#we need to apply the same binning range to each pixel to get sensibble output
globalminmax = voxeldata.min(), voxeldata.max()+1
#number of output bins
bins = 20
bin_boundaries = np.empty(bins+1)
#function to wrap np.histogram, passing on only the first return value
def hist(pixel):
h, b = np.histogram(pixel, bins=bins, range=globalminmax)
bin_boundaries[:] = b #simply overwrite; result should be identical each time
return h
#apply this for each pixel
histdata = np.apply_along_axis(hist, 0, voxeldata)
print bin_boundaries
print histdata[:,0,0] #print the histogram of an arbitrary pixel
But the more general message id like to convey, looking at your code sample and the type of problem you are working on: do yourself a favor, and learn numpy.
Parallelization certainly would not be my first port of call in optimizing this kind of thing. Your main problem is that you're doing lots of looping at the Python level. Python is inherently slow at this kind of thing.
One option would be to learn how to write Cython extensions and write the histogram bit in Cython. This might take you a while.
Actually, taking a histogram of pixel values is a very common task in computer vision and it has already been efficiently implemented in OpenCV (which has python wrappers). There are also several functions for taking histograms in the numpy python package (though they are slower than the OpenCV implementations).
I have a script which uses Google Maps API to download a sequence of equal-sized square satellite images and generates a PDF. The images need to be rotated beforehand, and I already do so using PIL.
I noticed that, due to different light and terrain conditions, some images are too bright, others are too dark, and the resulting pdf ends up a bit ugly, with less-than-ideal reading conditions "in the field" (which is backcountry mountain biking, where I want to have a printed thumbnail of specific crossroads).
(EDIT) The goal then is to make all images end up with similar apparent brightness and contrast. So, the images that are too bright would have to be darkened, and the dark ones would have to be lightened. (by the way, I once used imagemagick autocontrast, or auto-gamma, or equalize, or autolevel, or something like that, with interesting results in medical images, but don't know how to do any of these in PIL).
I already used some image corrections after converting to grayscale (had a grayscale printer a time ago), but the results weren't good, either. Here is my grayscale code:
#!/usr/bin/python
def myEqualize(im)
im=im.convert('L')
contr = ImageEnhance.Contrast(im)
im = contr.enhance(0.3)
bright = ImageEnhance.Brightness(im)
im = bright.enhance(2)
#im.show()
return im
This code works independently for each image. I wonder if it would be better to analyze all images first and then "normalize" their visual properties (contrast, brightness, gamma, etc).
Also, I think it would be necessary to perform some analysis in the image (histogram?), so as to apply a custom correction depending on each image, and not an equal correction for all of them (although any "enhance" function implicitly considers initial contitions).
Does anybody had such problem and/or know a good alternative to do this with the colored images (no grayscale)?
Any help will be appreciated, thanks for reading!
What you are probably looking for is a utility that performs "histogram stretching". Here is one implementation. I am sure there are others. I think you want to preserve the original hue and apply this function uniformly across all color bands.
Of course there is a good chance that some of the tiles will have a noticeable discontinuity in level where they join. Avoiding this, however, would involve spatial interpolation of the "stretch" parameters and is a much more involved solution. (...but would be a good exercise if there is that need.)
Edit:
Here is a tweak that preserves image hue:
import operator
def equalize(im):
h = im.convert("L").histogram()
lut = []
for b in range(0, len(h), 256):
# step size
step = reduce(operator.add, h[b:b+256]) / 255
# create equalization lookup table
n = 0
for i in range(256):
lut.append(n / step)
n = n + h[i+b]
# map image through lookup table
return im.point(lut*im.layers)
The following code works on images from a microscope (which are similar), to prepare them prior to stitching. I used it on a test set of 20 images, with reasonable results.
The brightness average function is from another Stackoverflow question.
from PIL import Image
from PIL import ImageStat
import math
# function to return average brightness of an image
# Source: https://stackoverflow.com/questions/3490727/what-are-some-methods-to-analyze-image-brightness-using-python
def brightness(im_file):
im = Image.open(im_file)
stat = ImageStat.Stat(im)
r,g,b = stat.mean
return math.sqrt(0.241*(r**2) + 0.691*(g**2) + 0.068*(b**2)) #this is a way of averaging the r g b values to derive "human-visible" brightness
myList = [0.0]
deltaList = [0.0]
b = 0.0
num_images = 20 # number of images
# loop to auto-generate image names and run prior function
for i in range(1, num_images + 1): # for loop runs from image number 1 thru 20
a = str(i)
if len(a) == 1: a = '0' + str(i) # to follow the naming convention of files - 01.jpg, 02.jpg... 11.jpg etc.
image_name = 'twenty/' + a + '.jpg'
myList.append(brightness(image_name))
avg_brightness = sum(myList[1:])/num_images
print myList
print avg_brightness
for i in range(1, num_images + 1):
deltaList.append(i)
deltaList[i] = avg_brightness - myList[i]
print deltaList
At this point, the "correction" values (i.e. difference between value and mean) are stored in deltaList. The following section applies this correction to all the images one by one.
for k in range(1, num_images + 1): # for loop runs from image number 1 thru 20
a = str(k)
if len(a) == 1: a = '0' + str(k) # to follow the naming convention of files - 01.jpg, 02.jpg... 11.jpg etc.
image_name = 'twenty/' + a + '.jpg'
img_file = Image.open(image_name)
img_file = img_file.convert('RGB') # converts image to RGB format
pixels = img_file.load() # creates the pixel map
for i in range (img_file.size[0]):
for j in range (img_file.size[1]):
r, g, b = img_file.getpixel((i,j)) # extracts r g b values for the i x j th pixel
pixels[i,j] = (r+int(deltaList[k]), g+int(deltaList[k]), b+int(deltaList[k])) # re-creates the image
j = str(k)
new_image_name = 'twenty/' +'image' + j + '.jpg' # creates a new filename
img_file.save(new_image_name) # saves output to new file name