Coding a circular filter in Python - python

I found a code snippet for making a circular filter using scipy and I'd like to understand how it works. I know there's a better one in skimage, but I'm interested in what's going on in this one.
import numpy as np
from scipy.ndimage import generic_filter as gf

def circular_filter(image_data, radius):
    # Define physical shape of filter mask
    kernel = np.zeros((2*radius+1, 2*radius+1))
    y, x = np.ogrid[-radius:radius+1, -radius:radius+1]
    mask = x**2 + y**2 <= radius**2
    kernel[mask] = 1
    filtered_image = gf(image_data, np.median, footprint=kernel)
    return filtered_image
But I'm not sure I understand perfectly what's going on. In particular, what exactly do the following lines do?
y, x = np.ogrid[-radius:radius+1, -radius:radius+1]
mask = x**2 + y**2 <= radius**2
kernel[mask] = 1
I posted this as an answer to one of my previous questions, but it wasn't replied to, so I'm posting it as a new question.

Looking at your code in detail:
kernel = np.zeros((2*radius+1, 2*radius+1))
y, x = np.ogrid[-radius:radius+1, -radius:radius+1]
mask = x**2 + y**2 <= radius**2
kernel[mask] = 1
The first line:
kernel = np.zeros((2*radius+1, 2*radius+1))
creates a 2-d array of zeros, with a center point and "radius" points on either side. For radius = 2, you would get:
# __r__ +1 __r__
[ 0, 0, 0, 0, 0, ] #\
[ 0, 0, 0, 0, 0, ] #_} r
[ 0, 0, 0, 0, 0, ] # +1
[ 0, 0, 0, 0, 0, ] #\
[ 0, 0, 0, 0, 0, ] #_} r
Next, you get two arrays from the open mesh grid created by numpy.ogrid. Mesh grids are a "trick" in numpy that involves storing a "parallel" array or matrix that holds the x or y coordinate of a particular cell at the location of that cell.
For example, a y-mesh grid might look like this:
[ 0, 0, 0, 0, 0, ]
[ 1, 1, 1, 1, 1, ]
[ 2, 2, 2, 2, 2, ]
[ 3, 3, 3, 3, 3, ]
[ 4, 4, 4, 4, 4, ]
And an x-mesh grid might look like this:
[ 0, 1, 2, 3, 4, ]
[ 0, 1, 2, 3, 4, ]
[ 0, 1, 2, 3, 4, ]
[ 0, 1, 2, 3, 4, ]
[ 0, 1, 2, 3, 4, ]
If you look at them, you'll realize that Y_grid[y][x] == y and X_grid[y][x] == x (the row index comes first), which is so often useful that numpy has more than one function to support it. ;-)
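As a small illustration of the dense grids described above, here is a sketch using np.mgrid. The names y_grid and x_grid are only illustrative; they are not part of the original snippet.

```python
import numpy as np

# Dense ("closed") mesh grids: every cell stores its own coordinate.
y_grid, x_grid = np.mgrid[0:5, 0:5]

# NumPy indexes the row first, then the column, so the row index is y.
row, col = 3, 1
print(y_grid[row, col], x_grid[row, col])  # 3 1
```

Both arrays have shape (5, 5), which is exactly the redundancy that an open grid avoids.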
An open mesh grid is similar to a closed one, except that it only has "one dimension." That is, instead of a pair of (for example) 5x5 arrays, you get a 1x5 array and a 5x1 array. That's what ogrid does - it returns two open grids. The values are from -radius to radius+1, according to python rules (meaning the radius+1 is left out):
y, x = np.ogrid[-radius:radius+1, -radius:radius+1]
So y is a numpy array storing from e.g., -2..2 (inclusive), and x is an array from -2..2 inclusive. The next step is to build a boolean mask - that is, an array full of boolean values. As you know, when you operate on a numpy array, you get another numpy array. So involving two arrays in an expression with a constant produces another array:
mask = x**2 + y**2 <= radius**2
The value of mask is going to be a 2-color bitmap, where one color is "True" and the other color is "False." The bitmap will describe a solid circle or disk. (Because of the <= relation. Remember that x and y contain -2..2, not 0..4.)
Finally, you convert from type Boolean to int by using the masking array as an overlay on the kernel array (of zeroes), setting the zeroes to ones whenever the mask is "True":
kernel[mask] = 1
At this point, kernel looks like:
# __r__ +1 __r__
[ 0, 0, 1, 0, 0, ] #\
[ 0, 1, 1, 1, 0, ] #_} r
[ 1, 1, 1, 1, 1, ] # +1
[ 0, 1, 1, 1, 0, ] #\
[ 0, 0, 1, 0, 0, ] #_} r
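Putting the steps together, a minimal sketch (the helper name disk_kernel is made up for this example) reproduces the kernel shown above for radius = 2:

```python
import numpy as np

def disk_kernel(radius):
    """Return a (2*radius+1)-square array holding 1 inside the disk, 0 outside."""
    # Open grids: y has shape (2r+1, 1), x has shape (1, 2r+1).
    y, x = np.ogrid[-radius:radius + 1, -radius:radius + 1]
    # Broadcasting expands the comparison to a full (2r+1, 2r+1) boolean array.
    mask = x**2 + y**2 <= radius**2
    # Casting True/False to 1/0 has the same effect as kernel[mask] = 1 on zeros.
    return mask.astype(int)

kernel = disk_kernel(2)
print(kernel)
```

Casting the boolean mask directly is only a shortcut; the original two-step version (zeros, then kernel[mask] = 1) gives the same values.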

I'm not familiar with SciPy but I'll give it a shot trying to explain the basic concepts.
This entire function's purpose is to alter the original image by applying a filter. This filter could do a lot of things, from changing the contrast of the image, or adding special effects, etc.
Let's go through the different lines:
kernel = np.zeros((2*radius+1, 2*radius+1))
In this line, a small square array of zeros is being created (hence the zeros function), with 2*radius+1 rows and columns. It is not a copy of the image data; it is the template for the filter's neighbourhood, and the mask is applied onto it later.
y, x = np.ogrid[-radius:radius+1, -radius:radius+1]
This is creating what is known as a "meshgrid" or a multi-dimensional grid. This is to create the circular "mask". Just like how on a graph, x and y axes have evenly spaced scaling, the same is necessary here in the meshgrid.
The x and y variables in this case store evenly spaced values that serve as the axes' scaling.
mask = x**2 + y**2 <= radius**2
Here, a "mask" is being created. The mask marks which cells of the kernel lie inside the circle, that is, which neighbouring pixels the filter will look at. Notice how the x and y variables are used here in a Pythagorean inequality (important to see that it's not just a circle but a disk), just like they would be in a mathematical sense. This creates a disk with the given radius. The mask variable is now a boolean array that is True at every offset (x, y) inside the disk.
kernel[mask] = 1
This is where the mask is applied to the array of zeros that was created earlier. All the points covered by the disk are set to 1, and the rest of the values are still zero, as set in the first line. Also, notice how the dimensions of kernel and mask match: both are two-dimensional, with 2*radius+1 rows and columns.
filtered_image = gf(image_data, np.median, footprint = kernel)
This is the final part where everything is pieced together. There is the original data stored in image_data and there is the kernel, which marks the disk-shaped neighbourhood around each pixel. Both are passed as parameters into the actual filter function gf (which stands for generic filter) together with np.median: for every pixel, gf collects the neighbours where the footprint is nonzero, computes their median, and writes it to the output, which is a new filtered image.
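To make the role of the footprint concrete, here is a NumPy-only sketch of a median filter restricted to a footprint. It is a slow illustration, not SciPy's implementation; the function name median_under_footprint is hypothetical, and the 'symmetric' border padding is an assumption that, as far as I know, matches the default mode='reflect' of generic_filter.

```python
import numpy as np

def median_under_footprint(image, footprint):
    """Median filter that only looks at cells where `footprint` is set.

    Assumes a 2-D image and a footprint with odd side lengths.
    """
    footprint = np.asarray(footprint, dtype=bool)
    ry, rx = footprint.shape[0] // 2, footprint.shape[1] // 2
    # Assumption: mirror the image at its borders so every window is full.
    padded = np.pad(image, ((ry, ry), (rx, rx)), mode="symmetric")
    out = np.empty(image.shape, dtype=float)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + footprint.shape[0], j:j + footprint.shape[1]]
            # Only pixels under the footprint take part in the median.
            out[i, j] = np.median(window[footprint])
    return out

# A flat image with one bright outlier: the median removes the spike.
image = np.zeros((7, 7))
image[3, 3] = 100.0
y, x = np.ogrid[-2:3, -2:3]
footprint = x**2 + y**2 <= 4
filtered = median_under_footprint(image, footprint)
print(filtered.max())  # 0.0
```

The footprint therefore selects which neighbours are read; it does not protect any part of the image from being changed.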
This is a core concept in image filtering and if you want to learn more about it, I suggest starting out by learning basic signal processing concepts. Signal processing courses cover the mathematics of how these concepts work, but are usually explained in really abstract mathematics because this concept can be applied to numerous different examples.


2d numpy array, making each value the sum of the 3x3 square it is centered at

I have a square 2D numpy array, A, and an array of zeros, B, with the same shape.
For every index (i, j) in A, other than the first and last rows and columns, I want to assign to B[i, j] the value of np.sum(A[i - 1:i + 2, j - 1:j + 2]).
A =
array([[0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 1, 1, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 0, 0, 0]])
B =
array([[0, 0, 0, 0, 0],
       [0, 3, 4, 2, 0],
       [0, 4, 6, 3, 0],
       [0, 3, 4, 2, 0],
       [0, 0, 0, 0, 0]])
Is there an efficient way to do this? Or should I simply use a for loop?
There is a clever (read "borderline smartass") way to do this with np.lib.stride_tricks.as_strided. as_strided allows you to create views into your buffer that simulate windows by adding another dimension to the view. For example, if you had a 1D array like
>>> x = np.arange(10)
>>> np.lib.stride_tricks.as_strided(x, shape=(3, x.shape[0] - 2), strides=x.strides * 2)
array([[0, 1, 2, 3, 4, 5, 6, 7],
[1, 2, 3, 4, 5, 6, 7, 8],
[2, 3, 4, 5, 6, 7, 8, 9]])
Hopefully it is clear that you can just sum along axis=0 to get the sum of each size 3 window. There is no reason you couldn't extend that to two or more dimensions. I've written the shape and index of the previous example in a way that suggests a solution:
A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 1, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 0, 0, 0]])
view = np.lib.stride_tricks.as_strided(A,
    shape=(3, 3, A.shape[0] - 2, A.shape[1] - 2),
    strides=A.strides * 2
)
B = np.zeros_like(A)
B[1:-1, 1:-1] = view.sum(axis=(0, 1))
Summing along multiple axes simultaneously has been supported in np.sum since v1.7.0. For older versions of numpy, just sum repeatedly (twice) along axis=0.
Filling in the edges of B is left as an exercise for the reader (since it's not really part of the question).
As an aside, the solution here is a one-liner if you want it to be. Personally, I think anything with as_strided is already illegible enough, and doesn't need any further obfuscation. I'm not sure if a for loop is going to be bad enough performance-wise to justify this method in fact.
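If a recent NumPy is available (1.20 or later), a possibly more readable alternative to hand-written strides is numpy.lib.stride_tricks.sliding_window_view. This is a swapped-in technique, not the as_strided code above; note that it puts the window axes last rather than first.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

A = np.array([[0, 0, 0, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 1, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 0, 0, 0]])

# windows[i, j] is the 3x3 block whose top-left corner is A[i, j].
# The window axes come last here, so the overall shape is (3, 3, 3, 3).
windows = sliding_window_view(A, (3, 3))

B = np.zeros_like(A)
B[1:-1, 1:-1] = windows.sum(axis=(-2, -1))
print(B)
```

The returned view is read-only by default, which guards against the accidental writes that make raw as_strided risky.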
For future reference, here is a generic window-making function that can be used to solve this sort of problem:
def window_view(a, window=3):
    """
    Create a (read-only) view into `a` that defines window dimensions.

    The first ``a.ndim`` dimensions of the returned view will be sized according to `window`.
    The remaining ``a.ndim`` dimensions will be the original dimensions of `a`, truncated by `window - 1`.

    The result can be post-processed by reducing the leading dimensions. For example, a multi-dimensional moving average could look something like ::

        window_view(a, window).sum(axis=tuple(range(a.ndim))) / window**a.ndim

    If the window size were different for each dimension (`window` were a sequence rather than a scalar), the normalization would be the product of the window sizes instead of ``window**a.ndim``.

    Parameters
    ----------
    a : array-like
        The array to window into. Due to numpy dimension constraints, can not have > 16 dims.
    window :
        Either a scalar indicating the window size for all dimensions, or a sequence of length `a.ndim` providing one size for each dimension.

    Returns
    -------
    view : numpy.ndarray
        A read-only view into `a` whose leading dimensions represent the requested windows into `a`.
        ``view.ndim == 2 * a.ndim``.
    """
    a = np.asanyarray(a)
    window = np.asarray(window, dtype=int)
    if window.size == 1:
        window = np.full(a.ndim, window)
    elif window.size == a.ndim:
        window = window.ravel()
    else:
        raise ValueError('Number of window sizes must match number of array dimensions')
    shape = np.concatenate((window, a.shape))
    shape[a.ndim:] -= window - 1
    strides = a.strides * 2
    # writeable=False enforces the read-only promise made in the docstring
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides, writeable=False)
I have found no 'simple' ways of doing this. But here are two ways:
Still involves a for loop
# Basically, get the sum for each location and then pad the result with 0's
B = [[np.sum(A[j-1:j+2, i-1:i+2]) for i in range(1, len(A[0])-1)] for j in range(1, len(A)-1)]
B = np.pad(B, ((1,1)), "constant", constant_values=(0))
Is longer but no for loops (this will be a lot more efficient on big arrays):
# Roll basically slides the array in the desired direction
A_right = np.roll(A, -1, 1)
A_left = np.roll(A, 1, 1)
A_top = np.roll(A, 1, 0)
A_bottom = np.roll(A, -1, 0)
A_bot_right = np.roll(A_bottom, -1, 1)
A_bot_left = np.roll(A_bottom, 1, 1)
A_top_right = np.roll(A_top, -1, 1)
A_top_left = np.roll(A_top, 1, 1)
# After doing that, you can just add all those arrays and these operations
# are handled better directly by numpy compared to when you use for loops
B = A_right + A_left + A_top + A_bottom + A_top_left + A_top_right + A_bot_left + A_bot_right + A
# You can then return the edges to 0 or whatever you like
B[:, 0] = 0
B[:, -1] = 0
B[0, :] = 0
B[-1, :] = 0
You can just sum the 9 arrays that make up a block, each one being shifted by 1 w.r.t. the previous in either dimension. Using slice notation this can be done for the whole array A at once:
B = np.zeros_like(A)
B[1:-1, 1:-1] = sum(A[i:A.shape[0]-2+i, j:A.shape[1]-2+j]
                    for i in range(0, 3) for j in range(0, 3))
General version for arbitrary rectangular windows
def sliding_window_sum(a, size):
    """Compute the sum of elements of a rectangular sliding window over the input array.

    Parameters
    ----------
    a : array_like
        Two-dimensional input array.
    size : int or tuple of int
        The size of the window in row and column dimension; if int then a square window is used.

    Returns
    -------
    array
        Shape is ``(a.shape[0] - size[0] + 1, a.shape[1] - size[1] + 1)``.
    """
    if isinstance(size, int):
        size = (size, size)
    m = a.shape[0] - size[0] + 1
    n = a.shape[1] - size[1] + 1
    return sum(a[i:m+i, j:n+j] for i in range(0, size[0]) for j in range(0, size[1]))
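One way to gain confidence in the shifted-slice idea is to compare it with explicit loops on a non-square window. The sketch below is self-contained; shifted_slice_sum and brute_force_sum are names made up for this check, and the random test array is arbitrary.

```python
import numpy as np

def shifted_slice_sum(a, size):
    """Sum over every size[0] x size[1] window by adding shifted slices."""
    m = a.shape[0] - size[0] + 1
    n = a.shape[1] - size[1] + 1
    return sum(a[i:m + i, j:n + j] for i in range(size[0]) for j in range(size[1]))

def brute_force_sum(a, size):
    """Reference implementation with explicit loops, for checking only."""
    m = a.shape[0] - size[0] + 1
    n = a.shape[1] - size[1] + 1
    out = np.empty((m, n), dtype=a.dtype)
    for i in range(m):
        for j in range(n):
            out[i, j] = a[i:i + size[0], j:j + size[1]].sum()
    return out

rng = np.random.default_rng(0)
a = rng.integers(0, 10, size=(6, 8))
fast = shifted_slice_sum(a, (2, 3))
slow = brute_force_sum(a, (2, 3))
print(fast.shape, np.array_equal(fast, slow))  # (5, 6) True
```

The slice version performs size[0] * size[1] whole-array additions, so it pays off when the window is small compared with the array.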

vectorized/broadcasted Dot product of numpy arrays with different dimensions

The Problem:
I want to calculate the dot product of a very large set of data. I am able to do this in a nested for-loop, but this is way too slow.
Here is a small example:
import numpy as np
points = np.array([[0.5, 2, 3, 5.5, 8, 11], [1, 2, -1.5, 0.5, 4, 5]])
lines = np.array([[0, 2, 4, 6, 10, 10, 0, 0], [0, 0, 0, 0, 0, 4, 4, 0]])
x1 = lines[0][0:-1]
y1 = lines[1][0:-1]
L1 = np.asarray([x1, y1])
# calculate the relative length of the projection
# of each point onto each line
a = np.diff(lines)
b = points[:,:,None] - L1[:,None,:]
[rows, cols, pages] = np.shape(b)
Z = np.zeros((cols, pages))
for k in range(cols):
    for l in range(pages):
        Z[k][l] = a[0][l]*b[0][k][l] + a[1][l]*b[1][k][l]
N = np.linalg.norm(a, axis=0)**2
relativeProjectionLength = np.squeeze(np.asarray(Z/N))
In this example, the first two dimensions of both a and b represent the x- and y-coordinates that I need for the dot product.
The shape of a is (2,7) and b has (2,6,7). Since the dot product reduces the first dimension I would expect the result to be of the shape (6,7). How can I calculate this without the slow loops?
What I have tried:
I think that correct broadcasting could do the job; however, I have trouble setting up the dimensions correctly.
a = a[:, None, :]
Z = np.dot(a, b)
This gives me the following error:
shapes (2,1,7) and (2,6,7) not aligned: 7 (dim 2) != 6 (dim 1)
You can use np.einsum -
Explanation :
Keep the last axes aligned for the two inputs.
Sum-reduce the first from those.
Let the rest stay, which is the second axis of b.
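The three rules above can be sketched as an einsum call. The subscript string 'ij,ikj->kj' is reconstructed here from the explanation and is an assumption, not a quote from the answer; the loop from the question serves as a reference.

```python
import numpy as np

points = np.array([[0.5, 2, 3, 5.5, 8, 11], [1, 2, -1.5, 0.5, 4, 5]])
lines = np.array([[0, 2, 4, 6, 10, 10, 0, 0], [0, 0, 0, 0, 0, 4, 4, 0]], dtype=float)

a = np.diff(lines)                             # shape (2, 7)
b = points[:, :, None] - lines[:, None, :-1]   # shape (2, 6, 7)

# Assumed subscripts: i = x/y component (summed away), k = point index,
# j = line index (last axes of a and b kept aligned).
Z = np.einsum('ij,ikj->kj', a, b)              # shape (6, 7)

# Reference: the nested loops from the question.
Z_loop = np.zeros((6, 7))
for k in range(6):
    for l in range(7):
        Z_loop[k, l] = a[0, l] * b[0, k, l] + a[1, l] * b[1, k, l]

print(Z.shape, np.allclose(Z, Z_loop))  # (6, 7) True
```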
Usual rules on whether to use einsum or stick to a loopy-dot based method apply here.
np.dot does not reduce the first dimension. From the docs:
For N dimensions it is a sum product over the last axis of a and the second-to-last of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
That is exactly what the error is telling you: it is attempting to match axis 2 in the first vector to axis 1 in the second.
You can fix this using numpy.rollaxis or better yet numpy.moveaxis. Instead of a = a[:, None, :], do
a = np.moveaxis(a, 0, -1)
b = np.moveaxis(b, 0, -2)
Z = np.dot(a, b)
Better yet, you can construct your arrays to have the correct shape up front. For example, transpose lines and do a = np.diff(lines, axis=0).
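For comparison, plain broadcasting (multiply, then sum over the component axis) also gives the (6, 7) result the question asks for. The sketch also checks the np.dot route: as far as I can tell, with the axes moved as above np.dot pairs every line with every line and returns shape (7, 6, 7), so the wanted values are the entries where both line indices agree. The variable names full and Z_from_dot are illustrative only.

```python
import numpy as np

points = np.array([[0.5, 2, 3, 5.5, 8, 11], [1, 2, -1.5, 0.5, 4, 5]])
lines = np.array([[0, 2, 4, 6, 10, 10, 0, 0], [0, 0, 0, 0, 0, 4, 4, 0]], dtype=float)

a = np.diff(lines)                             # shape (2, 7)
b = points[:, :, None] - lines[:, None, :-1]   # shape (2, 6, 7)

# Broadcasting: a[:, None, :] has shape (2, 1, 7) and stretches over the 6 points.
Z = (a[:, None, :] * b).sum(axis=0)            # shape (6, 7)

# np.dot on the moved axes combines every line of a with every line of b.
full = np.dot(np.moveaxis(a, 0, -1), np.moveaxis(b, 0, -2))   # shape (7, 6, 7)

# Keep only the entries where the two line indices agree.
idx = np.arange(a.shape[1])
Z_from_dot = full[idx, :, idx].T               # shape (6, 7)

print(Z.shape, full.shape, np.allclose(Z, Z_from_dot))  # (6, 7) (7, 6, 7) True
```

Because the dot route computes seven times more products than needed, the broadcasting or einsum form is likely the better fit for large inputs.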

How to make a circular kernel?

For the Code below, I am wondering how to make a circular kernel instead of a rectangular one. I am currently looking at something circular, and I want to find the BGR average values for it. By adjusting my kernel, my data will be more accurate.
for center in c_1:
    b = img2[center[0]-4: center[0]+5, center[1]-4: center[1]+5, 0]
    g = img2[center[0]-4: center[0]+5, center[1]-4: center[1]+5, 1]
    r = img2[center[0]-4: center[0]+5, center[1]-4: center[1]+5, 2]
We manually created structuring elements in the previous examples with the help of Numpy. They were rectangular. But in some cases, you may need elliptical/circular shaped kernels. For this purpose, OpenCV has a function, cv2.getStructuringElement(). You just pass the shape and size of the kernel, and you get the desired kernel.
# Elliptical Kernel
>>> cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(5,5))
array([[0, 0, 1, 0, 0],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[1, 1, 1, 1, 1],
[0, 0, 1, 0, 0]], dtype=uint8)
To get the circular region for a given center, you could try the following function:
def circleAverage(center, r=4):
    # +1 so that the far edge of the circle is included
    for i in range(center[0]-r, center[0]+r+1):
        for j in range(center[1]-r, center[1]+r+1):
            if (center[0] - i) ** 2 + (center[1] - j) ** 2 <= r**2:
                pass  # do your computation here
Hope this helps you.
Came here to find how to make a circular (symmetric) kernel. Ended up with my own implementation.
import numpy as np
def get_circular_kernel(diameter):
    mid = (diameter - 1) / 2
    distances = np.indices((diameter, diameter)) - np.array([mid, mid])[:, None, None]
    kernel = ((np.linalg.norm(distances, axis=0) - mid) <= 0).astype(int)
    return kernel
Note that for low diameters, behavior is perhaps unexpected. Variable mid when used for the second time can for example be replaced by diameter / 2.
I've implemented it in the following way:
r = 16
kernel = np.fromfunction(lambda x, y: ((x-r)**2 + (y-r)**2 <= r**2)*1, (2*r+1, 2*r+1), dtype=int).astype(np.uint8)
Extra type conversion is needed to avoid overflow
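Tying this back to the original question, here is a minimal sketch of averaging the B, G and R channels over a disk using a boolean mask. The function name circular_mean and the synthetic image are assumptions for illustration, and the sketch assumes the disk lies fully inside the image.

```python
import numpy as np

def circular_mean(img, center, r=4):
    """Mean of each channel over the disk of radius r around center=(row, col)."""
    cy, cx = center
    y, x = np.ogrid[-r:r + 1, -r:r + 1]
    mask = x**2 + y**2 <= r**2                 # (2r+1, 2r+1) boolean disk
    # Assumes the disk lies fully inside the image (no bounds handling).
    patch = img[cy - r:cy + r + 1, cx - r:cx + r + 1]
    # Boolean indexing keeps only pixels inside the disk: shape (n_pixels, 3).
    return patch[mask].mean(axis=0)

# Synthetic stand-in for img2: constant B, G, R planes of 10, 20, 30.
img = np.empty((20, 20, 3), dtype=np.uint8)
img[...] = (10, 20, 30)
b_mean, g_mean, r_mean = circular_mean(img, (10, 10))
print(b_mean, g_mean, r_mean)  # 10.0 20.0 30.0
```

The slice bounds match the center[0]-4: center[0]+5 pattern from the question when r = 4.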

Python: Generating from geometric distribution

Is this the best or most efficient way to generate random numbers from a geometric distribution with an array of parameters that may contain 0?
[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
Note I am somewhat concerned about optimization.
A bit of context: I have a sequence of characters (e.g. ATCGGGA) and I would like to expand/contract runs of a single character (e.g. if the original sequence had a run of 2 'A's, I want to simulate a sequence that will have an expected value of 2 'A's, but vary according to a geometric distribution). I do NOT want runs of length 1 to be of variable length.
So if
rep=[ 0 if x == 0 else numpy.random.geometric(1./x) for x in allids["c"]]
"".join([s*r for r, s in zip(rep, seq)])
will output (when rep is [1, 0, 1, 1, 3, 0, 0, 1, 0])
You can use a masked array to avoid the division by zero.
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.random.geometric(1. / a)
rep[a.mask] = 0
This generates a random sample for each element of a, and then deletes some of them later. If you're concerned about this waste of random numbers, you could generate just enough, like so:
import numpy as np
a = np.ma.masked_equal([2, 0, 1, 1, 3, 0, 0, 2, 0], 0)
rep = np.zeros(a.shape, dtype=int)
rep[~a.mask] = np.random.geometric(1. / a[~a.mask])
What about this:
import numpy
from numpy import array, logical_not
counts = array([2, 0, 1, 1, 3, 0, 0, 2, 0], dtype=float)
counts_ma = numpy.ma.masked_array(counts, mask=(counts == 0))
counts[logical_not(counts_ma.mask)] = \
    array([numpy.random.geometric(v) for v in 1.0 / counts[logical_not(counts_ma.mask)]])
You could potentially precompute the distribution of homopolymer runs and limit the number of calls to geometric as fetching large numbers of values from RNGs is more efficient than individual calls
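A plain boolean mask is enough to draw all nonzero entries in one vectorized call. The sketch below uses NumPy's Generator API (np.random.default_rng) rather than the legacy np.random.geometric used above; the seed and the name nonzero are only there for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)   # seeded only to make the sketch repeatable

counts = np.array([2, 0, 1, 1, 3, 0, 0, 2, 0])
nonzero = counts > 0

rep = np.zeros(counts.shape, dtype=int)
# One vectorized call: each nonzero entry x gets a draw with p = 1/x (mean x).
rep[nonzero] = rng.geometric(1.0 / counts[nonzero])
print(rep)
```

Entries with a count of 1 use p = 1 and therefore always come out as 1, which matches the requirement that runs of length 1 stay fixed.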

A 3-D grid of regularly spaced points

I want to create a list containing the 3-D coords of a grid of regularly spaced points, each as a 3-element tuple. I'm looking for advice on the most efficient way to do this.
In C++ for instance, I simply loop over three nested loops, one for each coordinate. In Matlab, I would probably use the meshgrid function (which would do it in one command). I've read about meshgrid and mgrid in Python, and I've also read that using numpy's broadcasting rules is more efficient. It seems to me that using the zip function in combination with the numpy broadcast rules might be the most efficient way, but zip doesn't seem to be overloaded in numpy.
Use ndindex:
import numpy as np
ind = np.ndindex(3, 3, 2)
for i in ind:
    print(i)
# (0, 0, 0)
# (0, 0, 1)
# (0, 1, 0)
# (0, 1, 1)
# (0, 2, 0)
# (0, 2, 1)
# (1, 0, 0)
# (1, 0, 1)
# (1, 1, 0)
# (1, 1, 1)
# (1, 2, 0)
# (1, 2, 1)
# (2, 0, 0)
# (2, 0, 1)
# (2, 1, 0)
# (2, 1, 1)
# (2, 2, 0)
# (2, 2, 1)
Instead of meshgrid and mgrid, you can use ogrid, which is a "sparse" version of mgrid. That is, only the dimension along which the values change are filled in. The others are simply broadcast. This uses much less memory for large grids than the non-sparse alternatives.
For example:
>>> import numpy as np
>>> x, y = np.ogrid[-1:2, -2:3]
>>> x
array([[-1],
       [ 0],
       [ 1]])
>>> y
array([[-2, -1, 0, 1, 2]])
>>> x**2 + y**2
array([[5, 2, 1, 2, 5],
[4, 1, 0, 1, 4],
[5, 2, 1, 2, 5]])
I would say go with meshgrid or mgrid, in particular if you need non-integer coordinates. I'm surprised that Numpy's broadcasting rules would be more efficient, as meshgrid was designed especially for the problem that you want to solve.
For multi-d (greater than 2) meshgrids, use numpy.lib.index_tricks.nd_grid like so:
import numpy
grid = numpy.lib.index_tricks.nd_grid()
g1 = grid[:3,:3,:3]
g2 = grid[0:1:0.5, 0:1, 0:2]
g3 = grid[0:1:3j, 0:1:2j, 0:2:2j]
where g1 has x values of [0, 1, 2],
g2 has x values of [0, 0.5],
and g3 has x values of [0.0, 0.5, 1.0] (the 3j defines the number of points instead of the step increment; see the documentation for more details).
Here's an efficient option similar to your C++ solution, which I've used for exactly the same purpose:
import numpy, itertools, collections
def grid(xmin, xmax, xstep, ymin, ymax, ystep, zmin, zmax, zstep):
    "return nested tuples of grid-sampled coordinates that include maxima"
    return collections.deque( itertools.product(
        numpy.arange(xmin, xmax+xstep, xstep).tolist(),
        numpy.arange(ymin, ymax+ystep, ystep).tolist(),
        numpy.arange(zmin, zmax+zstep, zstep).tolist() ) )
Performance is best (in my tests) when using a.tolist(), as shown above, but you can use a.flat instead and drop the deque() to get an iterator that will sip memory. Of course, you can also use a plain old tuple() or list() instead of deque() for a slight performance penalty (again, in my tests).
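For completeness, here is a sketch of the meshgrid route that keeps the coordinates in a single (N, 3) array and only converts to tuples at the end. The axis ranges are arbitrary example values.

```python
import numpy as np

x = np.arange(0.0, 1.5, 0.5)     # 0.0, 0.5, 1.0
y = np.arange(0, 2)              # 0, 1
z = np.arange(0, 2)              # 0, 1

# indexing="ij" keeps the axis order (x, y, z) instead of swapping x and y.
X, Y, Z = np.meshgrid(x, y, z, indexing="ij")

# One row per grid point: shape (len(x) * len(y) * len(z), 3).
coords = np.stack([X, Y, Z], axis=-1).reshape(-1, 3)

# Only convert to tuples if a list of tuples is really required.
points = [tuple(p) for p in coords.tolist()]
print(len(points), points[0], points[-1])  # 12 (0.0, 0.0, 0.0) (1.0, 1.0, 1.0)
```

If the points are going to be processed numerically afterwards, staying with the coords array is usually cheaper than building Python tuples.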

