I am using Python 3.6 to perform basic image manipulation through Pillow. Currently, I am attempting to take 32-bit PNG images (RGBA) of arbitrary color compositions and sizes and quantize them to a known palette of 16 colors. Optimally, this quantization method should be able to leave fully transparent (A = 0) pixels alone, while forcing all semi-transparent pixels to be fully opaque (A = 255). I have already devised working code that performs this, but I wonder if it may be inefficient:
import math
from PIL import Image

# a list of 16 RGBA tuples
palette = [
    (0, 0, 0, 255),
    # ...
]

with Image.open('some_image.png').convert('RGBA') as img:
    for py in range(img.height):
        for px in range(img.width):
            pix = img.getpixel((px, py))
            if pix[3] == 0:  # Ignore fully transparent pixels
                continue
            # Perform exhaustive search for closest Euclidean distance
            dist = 450
            best_fit = (0, 0, 0, 0)
            for c in palette:
                if pix[:3] == c[:3]:  # If pixel matches exactly, break
                    best_fit = c
                    break
                tmp = math.sqrt(pow(pix[0]-c[0], 2) + pow(pix[1]-c[1], 2) + pow(pix[2]-c[2], 2))
                if tmp < dist:
                    dist = tmp
                    best_fit = c
            # Palette entries are RGBA, so keep only RGB and force full opacity
            img.putpixel((px, py), best_fit[:3] + (255,))
    img.save('quantized.png')
I can think of two main inefficiencies in this code:
Image.putpixel() is a slow operation
Calculating the distance function multiple times per pixel is computationally wasteful
Is there a faster method to do this?
I've noted that Pillow has a native function Image.quantize() that seems to do exactly what I want. But as it is coded, it forces dithering in the result, which I do not want. This has been brought up in another StackOverflow question. The answer to that question was simply to extract the internal Pillow code and tweak the control variable for dithering, which I tested, but I find that Pillow corrupts the palette I give it and consistently yields an image where the quantized colors are considerably darker than they should be.
Image.point() is a tantalizing method, but it only works on each color channel individually, where color quantization requires working with all channels as a set. It'd be nice to be able to force all of the channels into a single channel of 32-bit integer values, which seems to be what the ill-documented mode "I" would do, but if I run img.convert('I'), I get a completely greyscale result, destroying all color.
An alternative method seems to be using NumPy and altering the image directly. I've attempted to create a lookup table of RGB values, but the three-dimensional indexing of NumPy's syntax is driving me insane. Ideally I'd like some kind of code that works like this:
img_arr = numpy.array(img)
# Find all unique colors (one row per RGBA value)
unique_colors = numpy.unique(img_arr.reshape(-1, 4), axis=0)
# Generate lookup table
colormap = numpy.empty(unique_colors.shape)
for i, c in enumerate(unique_colors):
    dist = 450
    best_fit = None
    for pc in palette:
        tmp = math.sqrt(pow(c[0] - pc[0], 2) + pow(c[1] - pc[1], 2) + pow(c[2] - pc[2], 2))
        if tmp < dist:
            dist = tmp
            best_fit = pc
    colormap[i] = best_fit
# Hypothetical pseudocode I can't seem to write out
for iy in range(img_arr.shape[0]):
    for ix in range(img_arr.shape[1]):
        if img_arr[iy, ix, 3] == 0:  # Skip transparent
            continue
        index = ...  # Find index of matching color in unique_colors, somehow
        img_arr[iy, ix] = colormap[index]
I note with this hypothetical example that numpy.unique() is another slow operation, since it sorts the output. Since I cannot seem to finish the code the way I want, I haven't been able to test if this method is faster anyway.
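One way the missing lookup step might be filled in, as a sketch rather than a tested answer: `numpy.unique` can also return, via `return_inverse=True`, the index of each pixel's color within `unique_colors`, so no per-pixel search is needed. The function name `quantize_via_unique` and the RGB-only `palette_rgb` argument are illustrative assumptions, not part of the original code.

```python
import numpy as np

def quantize_via_unique(rgba, palette_rgb):
    """Quantize an (H, W, 4) uint8 image via a per-unique-color lookup table."""
    rgba = np.asarray(rgba, dtype=np.uint8)
    palette_rgb = np.asarray(palette_rgb, dtype=np.int64)
    flat = rgba.reshape(-1, 4)

    # Unique RGB triples, plus for every pixel the index of its triple
    unique_rgb, inverse = np.unique(flat[:, :3], axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # guard against shape differences between NumPy versions

    # Nearest palette entry per unique color (squared distance is enough for argmin)
    diff = unique_rgb.astype(np.int64)[:, None, :] - palette_rgb[None, :, :]
    nearest = (diff ** 2).sum(axis=2).argmin(axis=1)
    colormap = palette_rgb[nearest].astype(np.uint8)

    out = flat.copy()
    visible = flat[:, 3] > 0              # fully transparent pixels stay untouched
    out[visible, :3] = colormap[inverse[visible]]
    out[visible, 3] = 255                 # semi-transparent becomes opaque
    return out.reshape(rgba.shape)
```

Whether the sort inside `numpy.unique` pays off depends on how many distinct colors the image has; that would need measuring on real data.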
I've also considered flattening the RGBA axis by packing each pixel into a 32-bit integer, so that I could build a one-dimensional lookup table with a simpler index:
def shift(a):
    return a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3]

img_arr = numpy.apply_along_axis(shift, 1, img_arr)
But this operation seemed noticeably slow on its own.
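A vectorized way to do the same packing, sketched under the assumption that the image is an (H, W, 4) `uint8` array: shifting whole channel planes avoids calling a Python function per pixel. The helper name `pack_rgba` is made up for this example. The channels are widened to `uint32` first, since shifting `uint8` values by 24 bits would overflow.

```python
import numpy as np

def pack_rgba(arr):
    """Pack the last axis (R, G, B, A) of a uint8 array into one uint32 per pixel."""
    a = np.asarray(arr).astype(np.uint32)  # widen before shifting
    return (a[..., 0] << 24) | (a[..., 1] << 16) | (a[..., 2] << 8) | a[..., 3]
```

The result has shape (H, W), so it can be used directly as a key array for a one-dimensional lookup.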
I would prefer answers that involve only Pillow and/or NumPy, please. Unless using another library demonstrates a dramatic computational speed increase over any PIL- or NumPy-native solution, I don't want to import extraneous libraries to do something these two libraries should be reasonably capable of on their own.
for loops should be avoided for speed.
I think you should make a tensor like:
d2[x,y,color_index,rgb] = distance_squared
where rgb = 0..2 (0 = r, 1 = g, 2 = b).
Then compute the distance:
d[x,y,color_index] = sqrt(sum(rgb,d2))
Then select the color_index with the minimal distance:
c[x,y] = min_index(color_index, d)
Finally replace alpha as needed:
alpha = ceil(orig_image.alpha)
img = c,alpha
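The tensor recipe above can be sketched with NumPy broadcasting. This is one possible reading of it, not a definitive implementation: the names `quantize_to_palette` and `palette_rgb` are assumptions, the square root is skipped because it does not change which distance is smallest, and the array is indexed [row, column] rather than [x, y].

```python
import numpy as np

def quantize_to_palette(rgba, palette_rgb):
    """Map every non-transparent pixel of an (H, W, 4) image to its nearest palette color."""
    rgba = np.asarray(rgba, dtype=np.uint8)
    palette_rgb = np.asarray(palette_rgb, dtype=np.int32)   # shape (K, 3)
    rgb = rgba[..., :3].astype(np.int32)                    # avoid uint8 wrap-around

    # diff[row, col, color_index, channel]
    diff = rgb[:, :, None, :] - palette_rgb[None, None, :, :]
    d2 = (diff ** 2).sum(axis=-1)          # squared distance to each palette color
    nearest = d2.argmin(axis=-1)           # index of the closest palette color

    out = np.empty_like(rgba)
    out[..., :3] = palette_rgb[nearest]
    visible = rgba[..., 3] > 0
    out[..., 3] = np.where(visible, 255, 0)
    out[~visible] = rgba[~visible]         # leave fully transparent pixels alone
    return out
```

The intermediate `diff` array holds H x W x K x 3 integers, so for very large images it may be worth processing the image in horizontal strips.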
I have an arbitrary input curve, given as numpy array. I want to create a smoothed version of it, similar to a rolling mean, but which is strictly greater than the original and strictly smooth. I could use the rolling mean value but if the input curve has a negative peak, the smoothed version will drop below the original around that peak. I could then simply use the maximum of this and the original but that would introduce non-smooth spots where the transition occurs.
Furthermore, I would like to be able to parameterize the algorithm with a look-ahead and a look-behind for this resulting curve, so that given a large look-ahead and a small look-behind the resulting curve would rather stick to the falling edges, and with a large look-behind and a small look-ahead it would rather be close to rising edges.
I tried using the pandas.Series(a).rolling() facility to get rolling means, rolling maxima, etc., but so far I have found no way to generate a smoothed version of my input which stays above the input in all cases.
I guess there is a way to combine rolling maxima and rolling means somehow to achieve what I want, so here is some code for computing these:
import pandas as pd
import numpy as np
my input curve:
original = np.array([ 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7 ])
This can be padded left (pre) and right (post) with the edge values as a preparation for any rolling function:
pre = 2
post = 3
padded = np.pad(original, (pre, post), 'edge')
Now we can apply a rolling mean:
smoothed = pd.Series(padded).rolling(
    pre + post + 1).mean().values[pre+post:]
But now the smoothed version is below the original, e. g. at index 4:
print(original[4], smoothed[4]) # 8 and 5.5
To compute a rolling maximum, you can use this:
maximum = pd.Series(padded).rolling(
    pre + post + 1).max().values[pre+post:]
But a rolling maximum alone would of course not be smooth in many cases and would display a lot of flat tops around the peaks of the original. I would prefer a smooth approach to these peaks.
If you have also pyqtgraph installed, you can easily plot such curves:
import pyqtgraph as pg
p = pg.plot(original)
p.plotItem.plot(smoothed, pen=(255,0,0))
(Of course, other plot libraries would do as well.)
What I would like to have as a result is a curve which is e. g. like the one formed by these values:
goal = np.array([ 5, 7, 7.8, 8, 8, 8, 7, 5, 3.5, 3, 4, 5.5, 6.5, 7 ])
Here is an image of the curves. The white line is the original (input), the red the rolling mean, the green is about what I would like to have:
EDIT: I just found the functions baseline() and envelope() of a module named peakutils. These two functions can compute polynomials of a given degree fitting the lower and upper peaks of the input, respectively. For small samples this can be a good solution. I'm looking for something which can also be applied to very large samples with millions of values; there the degree would need to be very high, and the computation would take a considerable amount of time. Doing it piecewise (section by section) opens up a bunch of new questions and problems (how to stitch properly while staying smooth and guaranteed above the input, performance when processing a massive number of pieces, etc.), so I'd like to avoid that if possible.
EDIT 2: I have a promising approach: repeatedly applying a filter which creates a rolling mean, shifts it slightly to the left and to the right, and then takes the maximum of these two and the original sample. After applying this several times, it smooths out the curve in the way I wanted. Some unsmooth spots can remain in deep valleys, though. Here is the code for this:
pre = 30
post = 30
margin = 10
s = [ np.array(sum([[ x ] * 100 for x in
    [ 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7 ]], [])) ]
for _ in range(30):
    # element-wise maximum of: left-shifted mean, right-shifted mean, previous curve
    s.append(np.max([
        pd.Series(np.pad(s[-1], (margin+pre, post), 'edge')).rolling(
            1 + pre + post).mean().values[pre+post:-margin],
        pd.Series(np.pad(s[-1], (pre, post+margin), 'edge')).rolling(
            1 + pre + post).mean().values[pre+post+margin:],
        s[-1]], 0))
This applies the filter 30 times. The iterations can be plotted using pyqtgraph like so:
p = pg.plot(original)
for q in s:
    p.plotItem.plot(q, pen=(255, 100, 100))
The resulting image looks like this:
There are two aspects I don't like about this approach: ① it needs to iterate many times (which slows me down), ② it still has unsmooth parts in the valleys (although in my use case this might be acceptable).
I have now played around quite a bit and I think I found two main answers which solve my direct need. I will give them below.
import numpy as np
import pandas as pd
from scipy import signal
import pyqtgraph as pg
These are just the necessary imports, used in all the code below. pyqtgraph is only used for displaying the results, so you do not really need it.
Symmetrical Smoothing
This can be used to create a smooth line which is always above the signal, but it cannot distinguish between rising and falling edges, so the curve around a single peak will look symmetrical. In many cases this might be quite okay, and since it is far less complex than the asymmetrical solution below (and has no quirks that I know of), it may be the better choice.
s = np.repeat([5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7], 400) + 0.1
s *= np.random.random(len(s))
pre = post = 400
# rolling maximum, centered by padding with edge values
x = pd.Series(np.pad(s, (pre, post), 'edge')).rolling(
    pre + 1 + post).max().values[pre+post:]
# Blackman-weighted rolling mean of the rolling maximum
y = pd.Series(np.pad(x, (pre, post), 'edge')).rolling(
    pre + 1 + post, win_type='blackman').mean().values[pre+post:]
p = pg.plot(s, pen=(100,100,100))
for c, pen in ((x, (0, 200, 200)),
               (y, pg.mkPen((255, 255, 255), width=3, style=3))):
    p.plotItem.plot(c, pen=pen)
Create a rolling maximum (x, cyan), and
create a windowed rolling mean of this (y, white dotted).
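The same two steps can be sketched without pandas, here using `scipy.ndimage` (a swapped-in library, not what the code above uses). The function name `upper_envelope` and the single `half_width` parameter are assumptions for illustration. The result stays on or above the input because every value averaged for position i is a maximum over a window that contains position i, provided both windows have the same width and the weights are non-negative.

```python
import numpy as np
from scipy import ndimage

def upper_envelope(values, half_width):
    """Smooth curve lying on or above `values`: rolling max, then a windowed mean."""
    s = np.asarray(values, dtype=float)
    size = 2 * half_width + 1

    # Step 1: centered rolling maximum, edges padded with the edge value
    peak = ndimage.maximum_filter1d(s, size=size, mode='nearest')

    # Step 2: Blackman-weighted mean over the same window width
    window = np.clip(np.blackman(size), 0, None)  # remove tiny negative rounding at the ends
    window /= window.sum()
    return ndimage.convolve1d(peak, window, mode='nearest')
```

For example, `upper_envelope(s, 400)` would correspond roughly to `pre = post = 400` above.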
Asymmetrical Smoothing
My use case called for a version which can distinguish between rising and falling edges: the output should move at a different speed when falling than when rising.
Comment: Used as an envelope for a compressor/expander, a quickly rising curve would mean to dampen the effect of a sudden loud noise almost completely, while a slowly rising curve would mean to slowly compress the signal for a long time before the loud noise, keeping the dynamics when the bang appears. On the other hand, if the curve falls quickly after a loud noise, this would make quiet stuff shortly after a bang audible while a slowly falling curve would keep the dynamics there as well and only slowly expanding the signal back to normal levels.
s = np.repeat([5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7], 400) + 0.1
s *= np.random.random(len(s))
pre, post = 100, 1000
# rolling maximum (t)
t = pd.Series(np.pad(s, (post, pre), 'edge')).rolling(
    pre + 1 + post).max().values[pre+post:]
# one-sided boxcar kernel and convolution for the falling edges (u)
g = signal.get_window('boxcar', pre*2)[pre:]
g /= g.sum()
u = np.convolve(np.pad(t, (pre, 0), 'edge'), g)[pre:]
# one-sided boxcar kernel and convolution for the rising edges (v)
g = signal.get_window('boxcar', post*2)[:post]
g /= g.sum()
v = np.convolve(np.pad(t, (0, post), 'edge'), g)[post:]
u, v = u[:len(v)], v[:len(u)]
w = np.min(np.array([ u, v ]),0)
# symmetrical smoothing of the result (x, y)
pre = post = max(100, min(pre, post)*3)
x = pd.Series(np.pad(w, (pre, post), 'edge')).rolling(
    pre + 1 + post).max().values[pre+post:]
y = pd.Series(np.pad(x, (pre, post), 'edge')).rolling(
    pre + 1 + post, win_type='blackman').mean().values[pre+post:]
p = pg.plot(s, pen=(100,100,100))
for c, pen in ((t, (200, 0, 0)),
               (u, (200, 200, 0)),
               (v, (0, 200, 0)),
               (w, (200, 0, 200)),
               (x, (0, 200, 200)),
               (y, pg.mkPen((255, 255, 255), width=3))):
    p.plotItem.plot(c, pen=pen)
This sequence ruthlessly combines several signal-processing methods.
The input signal is shown in grey. It is a noisy version of the input mentioned above.
A rolling maximum is applied to this (t, red).
Then a specially designed convolution curve for the falling edges is created (g) and the convolution is computed (u, yellow).
This is repeated for the rising edges with a different convolution curve (g again) and the convolution is computed (v, green).
The minimum of u and v is a curve having the desired slopes but is not very smooth yet; especially it has ugly spikes when the falling and the rising slope reach into each other (w, purple).
On this the symmetrical method above is applied:
Create a rolling maximum of this curve (x, cyan).
Create a windowed rolling mean of this curve (y, white dotted).
As an initial stab at part of the problem, I've produced a function which fits a polynomial to the data by minimising the integral subject to constraints that the polynomial be strictly above the points. I suspect if you do this piecewise over your data, it may work for you.
import numpy as np
import scipy.optimize

def upperpoly(xdata, ydata, order):
    def objective(p):
        """Minimize integral"""
        pint = np.polyint(p)
        integral = np.polyval(pint, xdata[-1]) - np.polyval(pint, xdata[0])
        return integral

    def constraints(p):
        """Polynomial values be > data at every point"""
        return np.polyval(p, xdata) - ydata

    # Start from a least-squares fit, shifted up so it lies on or above every point
    p0 = np.polyfit(xdata, ydata, order)
    y0 = np.polyval(p0, xdata)
    shift = (ydata - y0).max()
    p0[-1] += shift

    result = scipy.optimize.minimize(objective, p0,
                                     constraints={'type':'ineq',
                                                  'fun': constraints})
    return result.x
As pointed out in my note, the behaviour of your green line is inconsistent in the regions before and after the eight-high plateau. If the left region behavior is what you want, you could do something like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.spatial import ConvexHull
# %matplotlib inline # for interactive notebooks
y=np.array([ 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7])
x=np.array(range(len(y)))
#######
# This essentially selects the vertices that you'd touch when stretching a
# rubber band over the top of the function
vs = ConvexHull(np.asarray([x,y]).transpose()).vertices
indices_of_upper_hull_verts = list(reversed(np.concatenate([vs[np.where(vs == len(x)-1)[0][0]: ],vs[0:1]])))
newX = x[indices_of_upper_hull_verts]
newY = y[indices_of_upper_hull_verts]
#########
x_smooth = np.linspace(newX.min(), newX.max(),500)
f = interp1d(newX, newY, kind='quadratic')
y_smooth=f(x_smooth)
plt.plot (x,y)
plt.plot (x_smooth,y_smooth)
plt.scatter (x, y)
which yields:
UPDATE:
Here's an alternative that might suit you better. If, instead of a rolling average, you convolve with a kernel whose central weight is 1 and whose wings are non-negative, the resulting curve is never below the input (provided the input itself is non-negative). The wings of the convolution kernel can be adjusted for look-ahead/look-behind.
Like this:
import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d
from scipy.ndimage import convolve
## For interactive notebooks
#%matplotlib inline
y=np.array([ 5, 5, 5, 8, 8, 8, 2, 2, 2, 2, 2, 3, 3, 7]).astype(float)
preLength = 1
postLength = 1
preWeight = 0.2
postWeight = 0.2
kernal = [preWeight/preLength for i in range(preLength)] + [1] + [postWeight/postLength for i in range(postLength)]
output = convolve(y,kernal)
x=np.array(range(len(y)))
plt.plot (x,y)
plt.plot (x,output)
plt.scatter (x, y)
A drawback is that the kernel weights sum to more than one. That is what keeps the output from dropping below the input, but it also means the output generally overshoots the input, e.g. it rises above the large peak instead of sitting right on top of it as you drew.
I want to apply rigid body transformations to a large set of 2D image matrices. Ideally, I'd like to be able to just supply an affine transformation matrix specifying both the translation and rotation, apply this in one go, then do cubic spline interpolation on the output.
Unfortunately it seems that affine_transform in scipy.ndimage.interpolation doesn't do translation. I know I could use a combination of shift and rotate, but this is kind of messy and involves interpolating the output multiple times.
I've also tried using the generic geometric_transform like this:
import numpy as np
from scipy.ndimage import geometric_transform

# make the affine matrix
def maketmat(xshift,yshift,rotation,dimin=(0,0)):
    # centre on the origin
    in2orig = np.identity(3)
    in2orig[:2,2] = -dimin[0]/2.,-dimin[1]/2.
    # rotate about the origin
    theta = np.deg2rad(rotation)
    rotmat = np.identity(3)
    rotmat[:2,:2] = [np.cos(theta),np.sin(theta)],[-np.sin(theta),np.cos(theta)]
    # translate to new position
    orig2out = np.identity(3)
    orig2out[:2,2] = xshift,yshift
    # the final affine matrix is just the product
    tmat = np.dot(orig2out,np.dot(rotmat,in2orig))
    return tmat

# function that maps output space to input space
def out2in(outcoords,affinemat):
    outcoords = np.asarray(outcoords)
    outcoords = np.concatenate((outcoords,(1.,)))
    incoords = np.dot(affinemat,outcoords)
    incoords = tuple(incoords[0:2])
    return incoords

def rbtransform(source,xshift,yshift,rotation,outdims):
    # source --> target
    forward = maketmat(xshift,yshift,rotation,source.shape)
    # target --> source
    backward = np.linalg.inv(forward)
    # now we can use geometric_transform to do the interpolation etc.
    tformed = geometric_transform(source,out2in,output_shape=outdims,extra_arguments=(backward,))
    return tformed
This works, but it's horribly slow, since it's essentially looping over pixel coordinates! What's a good way to do this?
Can you use the scikit image?
If so, you could try applying a homography. A homography can be used to represent both translation and rotation at the same time through a 3x3 matrix.
You can use the skimage.transform.fast_homography function.
import numpy as np
import scipy
import skimage.transform
im = scipy.misc.lena()
H = np.asarray([[1, 0, 10], [0, 1, 20], [0, 0, 1]])
skimage.transform.fast_homography(im, H)
The transform took about 30 ms on my old Core 2 Duo.
About homography : http://en.wikipedia.org/wiki/Homography
I think affine_transform does do translation --- there's the offset parameter.
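A minimal sketch of how the `offset` parameter might be used for a rigid-body transform in a single interpolation pass. `scipy.ndimage.affine_transform` maps each output coordinate `o` to the input coordinate `matrix @ o + offset`, so the inverse (output to input) transform is what gets passed in. The function name `rigid_transform`, the (row, column) ordering of `shift`, and rotating about the array centre are assumptions made for this example.

```python
import numpy as np
from scipy import ndimage

def rigid_transform(image, shift, angle_deg, order=3):
    """Rotate `image` about its centre, then translate by `shift` = (rows, cols)."""
    theta = np.deg2rad(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])             # forward rotation in (row, col) space
    centre = (np.array(image.shape) - 1) / 2.0

    # forward:  out = rot @ (in - centre) + centre + shift
    # inverse:  in  = rot.T @ (out - centre - shift) + centre
    inv = rot.T
    offset = centre - inv @ (centre + np.asarray(shift, dtype=float))

    # order=3 requests cubic spline interpolation
    return ndimage.affine_transform(image, inv, offset=offset, order=order)
```

Because the whole transform is handed to compiled code as one matrix and one offset, there is no Python callback per pixel as with `geometric_transform`.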
I'm currently using this code to calculate the magnitude of the Sobel gradient:
sobel_x = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_16S, 1)
sobel_y = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_16S, 1)
cv.Sobel(im, sobel_x, 1, 0, 3)
cv.Sobel(im, sobel_y, 0, 1, 3)
width, height = cv.GetSize(im)
for i in range(width*height):
    x, _, _, _ = cv.Get1D(sobel_x, i)
    y, _, _, _ = cv.Get1D(sobel_y, i)
    px = int(math.sqrt(x*x + y*y))
    cv.Set1D(sobel, i, px)
It's simple enough, but it's not very efficient, because I'm accessing each pixel one by one. I was hoping for a better way to do this in OpenCV:
sobel_x2 = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_32S, 1)
sobel_y2 = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_32S, 1)
sobel_2 = cv.CreateImage(cv.GetSize(im), cv.IPL_DEPTH_32S, 1)
cv.Mul(sobel_x, sobel_x, sobel_x2)
cv.Mul(sobel_y, sobel_y, sobel_y2)
cv.Add(sobel_x2, sobel_y2, sobel_2)
Here I'm just squaring the images and adding them. It uses more memory but should be faster because now some operations will be done in parallel. What I'm stuck on is there's no element-wise square root function (cv.Sqrt seems to only work with scalars).
Any ideas?
As you've already noted, cv.Sqrt() only accepts a scalar in the Python bindings. Since there is an equivalent function, cv::sqrt(), that performs an element-wise square-root, it should also be in the mostly auto-generated Python bindings. Perhaps this is a bug in the version of OpenCV that you are using.
Regardless, you should be able to use cv.Pow() to get the same result:
cv.Pow(src, dst, 0.5)
This is likely not as fast as cv.Sqrt() would be, but should still dramatically outperform an element-wise computation.
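If the image is available as a NumPy array, the whole magnitude computation can be written element-wise without a loop. The sketch below uses `scipy.ndimage.sobel` and `numpy.hypot` rather than the OpenCV calls from the question, so it illustrates the idea and is not a drop-in replacement; the name `sobel_magnitude` is made up for this example.

```python
import numpy as np
from scipy import ndimage

def sobel_magnitude(im):
    """Gradient magnitude sqrt(gx**2 + gy**2) of a 2D image, computed element-wise."""
    im = np.asarray(im, dtype=np.float64)   # float avoids integer overflow when squaring
    gx = ndimage.sobel(im, axis=1)          # derivative along columns
    gy = ndimage.sobel(im, axis=0)          # derivative along rows
    return np.hypot(gx, gy)
```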
I'm writing a library to process gaze tracking in Python, and I'm rather new to the whole numpy / scipy world. Essentially, I'm looking to take an array of (x,y) values in time and "paint" some shape onto a canvas at those coordinates. For example, the shape might be a blurred circle.
The operation I have in mind is more or less identical to using the paintbrush tool in Photoshop.
I've got an iterative algorithm that trims my "paintbrush" to be within the bounds of my image and adds each point to an accumulator image, but it's slow(!), and it seems like there's probably a fundamentally easier way to do this.
Any pointers as to where to start looking?
In your question you describe a Gaussian filter, which SciPy supports through its scipy.ndimage package.
For example:
from numpy.random import rand
from pylab import * # figure, imshow
from scipy.ndimage import gaussian_filter
# random "image"
I = rand(100, 100)
figure(1)
imshow(I)
# gaussian filter
J = gaussian_filter(I, sigma=10)
figure(2)
imshow(J)
Of course, you can apply this on the whole image, or just on a patch, using slicing:
J = array(I) # copy image
J[30:70, 30:70] = gaussian_filter(I[30:70, 30:70], sigma=1) # apply filter to subregion
figure(2)
imshow(J)
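For the gaze-tracking case, one way this might be applied (a sketch, assuming the blurred-circle brush can be approximated by a Gaussian): accumulate a count at every (x, y) sample, then blur the whole accumulator once instead of painting the brush point by point. The function name `paint_points`, the rounding of coordinates to whole pixels, and the choice to drop out-of-bounds samples are all assumptions of this example.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def paint_points(points, shape, sigma=3.0):
    """Accumulate (x, y) points on a canvas of `shape` (rows, cols), then blur once."""
    canvas = np.zeros(shape, dtype=float)
    pts = np.rint(np.asarray(points, dtype=float)).astype(int)
    cols, rows = pts[:, 0], pts[:, 1]        # x is the column index, y the row index

    # Drop samples that fall outside the canvas
    inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])

    # np.add.at accumulates correctly even when the same pixel is hit repeatedly
    np.add.at(canvas, (rows[inside], cols[inside]), 1.0)
    return gaussian_filter(canvas, sigma=sigma)
```

The cost is then one pass over the points plus one filter pass over the image, regardless of how large the brush is.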
For basic image manipulation, the Python Image library (PIL) is probably what you want.
NOTE:
for "painting" with a "brush", I think you could just create a boolean mask array with your brush. For instance:
# 7x7 boolean mask with the "brush" (example: a _crude_ circle)
mask = array([[0, 0, 1, 1, 1, 0, 0],
[0, 1, 1, 1, 1, 1, 0],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[1, 1, 1, 1, 1, 1, 1],
[0, 1, 1, 1, 1, 1, 0],
[0, 0, 1, 1, 1, 0, 0]], dtype=bool)
# random image
I = rand(100, 100)
# apply the filter only where the mask is set:
# filter the 7x7 subregion (not the whole image), then write back the masked pixels
sub = I[40:47, 40:47]
sub[mask] = gaussian_filter(sub, sigma=1)[mask]
You should really look into Andrew Straw's motmot and libcamiface. He uses it for fly behaviour experiments but it's a flexible library for doing just the kind of image acquisition and processing you're doing I think. There's a video of his presentation at SciPy2009.
As for the paintbrush scenario you mention, I'd make a copy of the image with the .copy() method, keep the paintbrush image in an array, and simply add it with
arr[first_br_row:last_br_row, first_br_col:last_br_col] += brush[first_row:last_row, first_col:last_col]
where you set first_br_row, last_br_row, first_br_col, last_br_col to address the subimage where you want to add the brush, and first_row, last_row, first_col, last_col to clip the brush (normally 0 and the number of brush rows/cols, since slice ends are exclusive, but adjusted when you're close enough to the image boundary that only part of the brush should be painted).
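The clipping arithmetic described here can be sketched as a small helper. The function name `stamp` and the convention that (row, col) is the brush centre are assumptions of this example, not part of the answer.

```python
import numpy as np

def stamp(canvas, brush, row, col):
    """Add `brush` to `canvas` in place, centred on (row, col) and clipped to the canvas."""
    bh, bw = brush.shape
    top, left = row - bh // 2, col - bw // 2          # where the brush's corner would land

    # Intersection of the brush rectangle with the canvas
    r0, c0 = max(top, 0), max(left, 0)
    r1 = min(top + bh, canvas.shape[0])
    c1 = min(left + bw, canvas.shape[1])
    if r0 >= r1 or c0 >= c1:
        return canvas                                  # brush lies entirely outside

    # Matching part of the brush, in brush coordinates
    canvas[r0:r1, c0:c1] += brush[r0 - top:r1 - top, c0 - left:c1 - left]
    return canvas
```

Calling `stamp` once per gaze sample still loops in Python, but each call is a single slice addition rather than a per-pixel loop.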
Hope all that helps.
Doing a little math in Fourier space may help: a translation (convolution with a Dirac delta) is equal to a simple multiplication by a phase in Fourier space. This moves your paintbrush to the exact place (similar to the solutions by catchmeifyoutry and dwf, but it allows translations finer than a pixel, like 2.5, alas with some ringing). A sum of such strokes is then the sum of these operations.
In code:
import numpy
import pylab
from numpy import mgrid

def FTfilter(image, FTfilter):
    from scipy.fftpack import fftn, fftshift, ifftn, ifftshift
    FTimage = fftshift(fftn(image)) * FTfilter
    return numpy.real(ifftn(ifftshift(FTimage)))

def translate(image, vec):
    """
    Translate image by vec (in pixels)
    """
    u = ((vec[0]+image.shape[0]/2) % image.shape[0]) - image.shape[0]/2
    v = ((vec[1]+image.shape[1]/2) % image.shape[1]) - image.shape[1]/2
    f_x, f_y = mgrid[-1:1:1j*image.shape[0], -1:1:1j*image.shape[1]]
    trans = numpy.exp(-1j*numpy.pi*(u*f_x + v*f_y))
    return FTfilter(image, trans)

def occlude(image, mask):
    # combine in occlusive mode
    return numpy.max(numpy.dstack((image, mask)), axis=2)

if __name__ == '__main__':
    Image = numpy.random.rand(100, 100)
    X, Y = mgrid[-1:1:1j*Image.shape[0], -1:1:1j*Image.shape[1]]
    brush = X**2 + Y**2 < .05  # relative size of the brush
    # shows the brush
    pylab.imshow(brush)
    # move it to some other position / use a threshold to avoid ringing
    brushed = translate(brush, [20, -10.51]) > .6
    pylab.imshow(brushed)
    pylab.imshow(occlude(Image, brushed))
    more_strokes = [[40, -15.1], [-40, -15.1], [-25, 15.1], [20, 10], [0, -10], [25, -10.51]]
    for stroke in more_strokes:
        brushed = brushed + translate(brush, stroke) > .6
    pylab.imshow(occlude(Image, brushed))
OpenCV uses numpy arrays and has basic drawing functions: circles, ellipses, polylines...
To draw a line you can call
cv.line(array,previous_point,new_point,colour,thickness=x)
each time you get a mouse event.
Have you looked into Tkinter?
Python Image Library may be some help too.