Math behind scipy.ndimage.convolve - python

While I have already found the documentation for the scipy.ndimage.convolve function and I "practically know what it does", when I try to calculate the resulting array I can't follow the mathematical formula. Let's take for example:
import numpy as np
a = np.array([[1, 2, 0, 0],
              [5, 3, 0, 4],
              [0, 0, 0, 7],
              [9, 3, 0, 0]])
k = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
from scipy import ndimage
ndimage.convolve(a, k, mode='constant', cval=0.0)
# Why is the result like this?
array([[11, 10,  7,  4],
       [10,  3, 11, 11],
       [15, 12, 14,  7],
       [12,  3,  7,  0]])
I would appreciate a step by step calculation.

Details on NDImage.convolve
I stumbled on this ndimage convolution even though I know the basic np.convolve, and the documentation is not very self-explanatory, so I took the effort to crunch through it and supplement the earlier explanatory post:
A. Basics:
Reference: see the following if your grasp of convolution is not well grounded:
https://en.wikipedia.org/wiki/Kernel_(image_processing),
https://en.wikipedia.org/wiki/Convolution
Essentially ndimage.convolve supports several boundary modes; this post focuses on the constant mode, where the padding uses the value specified by cval (0 here, or whatever you choose) and padded rows and columns are added as needed (explained in a little bit)
The convolution essentially slides the kernel from left to right, then steps down a row and sweeps from left to right again, until the output has the same number of elements as the input
The function will calculate the padded rows/columns needed. In this case the kernel k is a 3 x 3 matrix and the source image a is 4 x 4, so you need one padded row each at the top and bottom and one padded column each at the left and right (4 + 2 = 6; the first window covers 3 rows and each of the remaining three slides downward needs one extra row, 3 + 1 + 1 + 1 = 6, and the same holds for columns)
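As a quick check of that padding arithmetic, np.pad can build the 6 x 6 matrix we slide over below (a minimal sketch using the a from the question):
import numpy as np
a = np.array([[1, 2, 0, 0],
              [5, 3, 0, 4],
              [0, 0, 0, 7],
              [9, 3, 0, 0]])
padded = np.pad(a, 1, mode='constant', constant_values=0)
print(padded.shape)
# (6, 6)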
B. Operations:
Add a row and column of padded zeros to the top and left of array a, and another row/column of padded zeros to the bottom and right (to convolve a 3 x 3 kernel evenly over a 4 x 4 image, you need the extra padded row/column for the 1st and 4th positions of the sliding window)
Flip the kernel k to get Kflip: [[0,0,1], [0,1,1], [1,1,1]]
you can use np.flip for this (why it needs to be flipped relates to the concept of convolution vs correlation, which are like twins running in opposite directions)
Slide the flipped kernel over this 6 x 6 expanded matrix:
[[0, 0, 0, 0, 0, 0],
 [0, 1, 2, 0, 0, 0],
 [0, 5, 3, 0, 4, 0],
 [0, 0, 0, 0, 7, 0],
 [0, 9, 3, 0, 0, 0],
 [0, 0, 0, 0, 0, 0]]
For the first position of the sliding window (note that the first row and first column of the window overlap the padded zeros), you get:
Flipped K dot-sum [[0,0,0], [0,1,2], [0,5,3]] = 11 (1*1 + 1*2 + 1*5 + 1*3; everything else is zero)
(dot-sum refers to the sum of the element-wise products: you just multiply the corresponding elements in the same positions of the two matrices and add everything up)
Slide the flipped kernel one step to the right and you get 10 (the first row is all zeros due to the padding; the second row contributes 1*2; the third row contributes 1*5 + 1*3)
Likewise, slide to the right another two steps to get all four elements of this output row (note that the 4th window of the row again partially overlaps the padded column on the right)
Then slide the kernel one row down and reset it to the far left of the expanded/padded matrix
You get the same 10 again (the first row contributes 1*2; the second row contributes 1*5 + 1*3), and so on and so forth
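To verify all of these steps programmatically, here is a minimal sketch that pads, flips, and slides by hand and compares the result with ndimage.convolve (using the a and k from the question):
import numpy as np
from scipy import ndimage
a = np.array([[1, 2, 0, 0],
              [5, 3, 0, 4],
              [0, 0, 0, 7],
              [9, 3, 0, 0]])
k = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
kflip = np.flip(k)                          # flip along both axes
padded = np.pad(a, 1, mode='constant', constant_values=0)
out = np.zeros_like(a)
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        window = padded[i:i+3, j:j+3]       # the 3 x 3 sliding window
        out[i, j] = (window * kflip).sum()  # the "dot-sum"
assert (out == ndimage.convolve(a, k, mode='constant', cval=0.0)).all()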

Just to warm up, consider
k = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 0]])
instead of your k. Then if you run
ndimage.convolve(a, k, mode='constant', cval=0.0)
you get
array([[4, 2, 4, 0],
       [5, 3, 7, 4],
       [3, 0, 0, 7],
       [9, 3, 0, 0]])
and note that every element of the result is the sum of the value at its own position (due to the 2nd 1 in k) and the value one step down and to the right (due to the 1st 1 in k), i.e. the 4 in the top corner is the original 1 in the top corner plus the 3 diagonally down from it.
The (possibly) confusing part is that the effect of k is the opposite of what you might expect: for the k above you might expect the first 1 to add the value up and to the left, instead of down and to the right.
Now back to your k: the 12 in your result (3rd row down, 2nd column across) is the sum 9 + 3 + 0 + 0 + 0 + 0, the 9 and 3 coming from the bottom row of a.
Note that anything outside the matrix is assumed to be 0.
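One way to see the kernel flip concretely is to compare against ndimage.correlate, which slides the kernel without flipping it (a small sketch reusing a and k from above):
import numpy as np
from scipy import ndimage
a = np.array([[1, 2, 0, 0],
              [5, 3, 0, 4],
              [0, 0, 0, 7],
              [9, 3, 0, 0]])
k = np.array([[1, 1, 1],
              [1, 1, 0],
              [1, 0, 0]])
conv = ndimage.convolve(a, k, mode='constant', cval=0.0)
corr = ndimage.correlate(a, np.flip(k), mode='constant', cval=0.0)
# Convolution is correlation with a flipped kernel:
assert (conv == corr).all()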

Is there any way to vectorize a rolling cross-correlation in python based on my example?

Let's suppose I have two arrays that represent pixels in pictures.
I want to build an array of tensordot products of pixels of a smaller picture with a bigger picture as it "scans" the latter. By "scanning" I mean iteration over rows and columns while creating overlays with the original picture.
For instance, a 2x2 picture can be overlaid on top of 3x3 in four different ways, so I want to produce a four-element array that contains tensordot products of matching pixels.
Tensordot is calculated by multiplying a[i,j] with b[i,j] element-wise and summing the terms.
Please examine this code:
import numpy as np
a = np.array([[0, 1, 2],
              [3, 4, 5],
              [6, 7, 8]])
b = np.array([[0, 1],
              [2, 3]])
shape_diff = (a.shape[0] - b.shape[0] + 1,
              a.shape[1] - b.shape[1] + 1)
def compute_pixel(x, y):
    sub_matrix = a[x : x + b.shape[0],
                   y : y + b.shape[1]]
    return np.tensordot(sub_matrix, b, axes=2)
def process():
    arr = np.zeros(shape_diff)
    for i in range(shape_diff[0]):
        for j in range(shape_diff[1]):
            arr[i, j] = compute_pixel(i, j)
    return arr
print(process())
Computing a single pixel is very easy: all I need is the starting location coordinates within a. From there I take a slice matching the size of b and compute the tensordot product.
However, because I need to do this all over again for each x and y location as I iterate over rows and columns, I've had to use a loop, which is of course suboptimal.
In the next piece of code I have tried to utilize a handy feature of tensordot, which also accepts higher-dimensional tensors as arguments. In other words, I can feed it an array of sub-blocks of a while keeping b the same.
But in order to create that array of sub-blocks, I couldn't think of anything better than another loop, which sounds rather silly in this case.
def try_vector():
    tensor = np.zeros(shape_diff + b.shape)
    for i in range(shape_diff[0]):
        for j in range(shape_diff[1]):
            tensor[i, j] = a[i : i + b.shape[0],
                             j : j + b.shape[1]]
    return np.tensordot(tensor, b, axes=2)
print(try_vector())
Note: the tensor's shape is the concatenation of the two tuples, which in this case gives (2, 2, 2, 2)
Yet even if I produced such an array, it would be prohibitively large to be of any practical use: doing this for a 1000x1000 picture would probably consume all the available memory.
So, are there any other ways to avoid loops in this problem?
In [111]: process()
Out[111]:
array([[19., 25.],
       [37., 43.]])
tensordot with axes=2 is the same as an element-wise multiply and sum:
In [116]: np.tensordot(a[0:2,0:2],b, axes=2)
Out[116]: array(19)
In [126]: (a[0:2,0:2]*b).sum()
Out[126]: 19
A lower-memory way of generating your tensor is:
In [121]: np.lib.stride_tricks.sliding_window_view(a, (2, 2))
Out[121]:
array([[[[0, 1],
         [3, 4]],
        [[1, 2],
         [4, 5]]],
       [[[3, 4],
         [6, 7]],
        [[4, 5],
         [7, 8]]]])
We can do a broadcasted multiply, and sum on the last 2 axes:
In [129]: (Out[121]*b).sum((2,3))
Out[129]:
array([[19, 25],
       [37, 43]])
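Putting both pieces together as a single runnable sketch (sliding_window_view requires NumPy 1.20+; a and b as in the question):
import numpy as np
a = np.arange(9).reshape(3, 3)
b = np.array([[0, 1],
              [2, 3]])
# Build all 2x2 overlays as views into a (no big copy up front):
windows = np.lib.stride_tricks.sliding_window_view(a, b.shape)
# Broadcast-multiply each window by b and sum over the window axes:
result = (windows * b).sum(axis=(2, 3))
print(result)
# [[19 25]
#  [37 43]]
If even the temporary windows * b product is too large, np.einsum('ijkl,kl->ij', windows, b) computes the same result without materializing the element-wise product.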

NumPy array with largest value on diagonal and other values shuffled

I am trying to create a square NumPy (or PyTorch, since PyTorch code can be turned into NumPy with minimal effort) matrix with the following property: given a set of values, the diagonal element in each row is the largest value, and the other values are randomly shuffled over the remaining positions.
For example, if I have [1, 2, 3, 4], a possible desired output is:
[[4, 3, 1, 2],
 [1, 4, 3, 2],
 [2, 1, 4, 3],
 [2, 3, 1, 4]]
There can be (several) other possible outputs, as long as the diagonal elements are the largest value (4 in this case) and the off-diagonal elements in each row contain the other values but shuffled.
A hacky/inefficient way of doing this could be to first create a square (4x4) matrix of zeros, put the largest value (4) in all the diagonal positions, and then traverse the matrix row by row, populating each row i's elements other than index i with a shuffled version of the remaining values (shuffled copies of [1, 2, 3]); a reference sketch of this follows below. This would be very slow as the matrix size increases. Is there a cleaner/faster/more Pythonic way of doing it? Thank you.
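For reference, that loop-based baseline might look like this (a sketch; it assumes the largest value appears exactly once in the set):
import numpy as np
vals = np.array([1, 2, 3, 4])
n = len(vals)
out = np.full((n, n), vals.max())      # largest value on the whole diagonal
rest = vals[vals != vals.max()]        # the values other than the maximum
for i in range(n):
    row = np.random.permutation(rest)  # shuffle the remaining values
    out[i, :i] = row[:i]               # fill positions left of the diagonal
    out[i, i+1:] = row[i:]             # fill positions right of the diagonal
print(out)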
First you can generate an array whose rows each contain the values in shuffled order, then I've used a (not so easy to understand) mathematical trick to shift each row:
import numpy as np
from numpy.fft import fft, ifft
# First create your randomized array, e.g. with np.random.shuffle()
x = np.array([[1, 2, 3, 4],
              [2, 4, 3, 1],
              [4, 1, 2, 3],
              [2, 3, 1, 4]])
# We use np.where to determine in which column each 4 sits.
_, s = np.where(x == 4)
# We compute the left shift that needs to be applied to each row in order to get each 4 onto the diagonal
s = s - np.r_[0:x.shape[0]]
# And here is the trick: we can use the fast Fourier transform to circularly left-shift each row by a given amount:
L = np.real(ifft(fft(x, axis=1) * np.exp(2 * 1j * np.pi / x.shape[1] * s[:, None] * np.r_[0:x.shape[1]][None, :]), axis=1).round())
# Notice that we could also use a right shift; we simply have to negate the exponent of the exponential:
# np.exp(-2*1j*np.pi...
And we obtain the following matrix:
[[4. 1. 2. 3.]
 [2. 4. 1. 3.]
 [2. 3. 4. 1.]
 [3. 2. 1. 4.]]
No hidden for loop, only pure linear algebra stuff.
To give you an idea, it takes only a few milliseconds for a 1000x1000 matrix on my computer and ~20 s for a 10000x10000 matrix.
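For comparison, the same per-row circular shift can also be written with plain advanced indexing instead of the FFT; a minimal sketch under the same assumptions (each row contains the maximum exactly once), which keeps the integer dtype and avoids the float round-trip:
import numpy as np
x = np.array([[1, 2, 3, 4],
              [2, 4, 3, 1],
              [4, 1, 2, 3],
              [2, 3, 1, 4]])
n_rows, n_cols = x.shape
_, s = np.where(x == x.max())      # column holding the max in each row
shift = s - np.arange(n_rows)      # left shift needed per row
cols = (np.arange(n_cols)[None, :] + shift[:, None]) % n_cols
result = x[np.arange(n_rows)[:, None], cols]
print(result)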

Minimum absolute difference between elements in two numpy arrays

Consider two 1d numpy arrays.
import numpy as np
X = np.array([-43, 21, 4, 6, -1, 22, 8])
Y = np.array([13, 5, -12, 0])
I want to find the value(s) from X that have the minimum absolute difference with the value(s) from Y. In the example shown, the minimum absolute difference is 1, given by the pairs [[4, 5], [6, 5], [-1, 0]]. There are lots of resources on this site about finding the minimum element of an array, but that's not what I'm after.
For the present question, both starting arrays are 1d, though their sizes may differ. I'd also be interested, though, in tips on how to proceed if the starting arrays had different shapes. Is it simply a matter of flattening both and then proceeding as before?
You can calculate the absolute distance array and then find the minimum in that array. This method works for different X and Y lengths. If they are multi-dimensional, simply flatten them first (using X.flatten(), ...) and apply this solution to the flattened arrays:
If you want ALL pairs with minimum absolute distance:
#absolute distance between X and Y
dist = np.abs(X[:,None]-Y)
#elements of X with minimum absolute distance
X[np.where(dist==dist.min())[0]]
#corresponding elements of Y with minimum absolute distance
Y[np.where(dist==dist.min())[1]]
output:
[ 4 6 -1]
[5 5 0]
And if you want them in a single array format:
idx = np.where(dist==dist.min())
np.stack((X[idx[0]], Y[idx[1]])).T
[[ 4  5]
 [ 6  5]
 [-1  0]]
If you want the first occurrence of the minimum absolute distance, there is a faster solution:
X[dist.argmin() // Y.size]
Y[dist.argmin() % Y.size]
(the flat index returned by argmin decomposes as row * Y.size + column, so the row is the integer quotient and the column is the remainder)
or equivalently, another solution (which I think would be faster):
idx = np.unravel_index(np.argmin(dist), dist.shape)
X[idx[0]]
Y[idx[1]]
output:
4
5
Note: Another way of getting the absolute distance array is:
dist = np.abs(np.subtract.outer(X,Y))
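Putting the pieces together as one runnable check (with the arrays from the question):
import numpy as np
X = np.array([-43, 21, 4, 6, -1, 22, 8])
Y = np.array([13, 5, -12, 0])
d1 = np.abs(X[:, None] - Y)             # broadcasting version
d2 = np.abs(np.subtract.outer(X, Y))    # ufunc outer version
assert (d1 == d2).all()
i, j = np.unravel_index(np.argmin(d1), d1.shape)
print(X[i], Y[j])
# 4 5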

How to account for neighboring cells in a 2D array in Python

Prompt:
Given a 2D integer matrix M representing the gray scale of an image, you need to design a smoother to make the gray scale of each cell become the average gray scale (rounding down) of all the 8 surrounding cells and itself. If a cell has fewer than 8 surrounding cells, then use as many as you can.
Example:
Input:
[[1,1,1],
[1,0,1],
[1,1,1]]
Output:
[[0, 0, 0],
[0, 0, 0],
[0, 0, 0]]
Explanation:
For the point (0,0), (0,2), (2,0), (2,2) -> floor(3/4) = floor(0.75) = 0
For the point (0,1), (1,0), (1,2), (2,1) -> floor(5/6) = floor(0.83333333) = 0
For the point (1,1): floor(8/9) = floor(0.88888889) = 0
Solution:
class Solution:
    def imageSmoother(self, grid):
        """
        :type M: List[List[int]]
        :rtype: List[List[int]]
        """
        rows, cols = len(grid), len(grid[0])
        # Go through each cell
        for r in range(rows):
            for c in range(cols):
                # Metrics for calculating the average; starting values are zero since the loop includes the current cell, grid[r][c]
                total = 0
                n = 0
                # Checking the neighbors
                for ri in [-1, 0, 1]:
                    for ci in [-1, 0, 1]:
                        if (r + ri >= 0 and r + ri <= rows - 1 and c + ci >= 0 and c + ci <= cols - 1):
                            total += grid[r + ri][c + ci]
                            n += 1
                # Now we convert the cell value to the average
                grid[r][c] = int(total / n)
        return grid
My solution is incorrect. It passes some test cases, but for this one I fail.
Input: [[2,3,4],[5,6,7],[8,9,10],[11,12,13],[14,15,16]]
Output: [[4,4,5],[6,6,6],[8,9,9],[11,11,12],[12,12,12]]
Expected: [[4,4,5],[5,6,6],[8,9,9],[11,12,12],[13,13,14]]
As you can see, my solution is really close. I'm not sure where I'm messing up since when I changed the parameters around I started failing other basic test cases. The solutions I see online use other packages which I'd prefer not to use since I want to approach this problem more intuitively.
How do you check where you're going wrong with 2D array problems? Thanks!
Leetcode solution:
import itertools
def imageSmoother(self, M):
    R, C = len(M), len(M[0])
    M2 = [[0] * C for i in range(R)]
    for i in range(R):
        for j in range(C):
            temp = [M[i+x][j+y] for x, y in itertools.product([-1, 0, 1], [-1, 0, 1])
                    if 0 <= i+x < R and 0 <= j+y < C]
            M2[i][j] = sum(temp) // len(temp)
    return M2
The problem with your code is that you're modifying grid as you go along. So, for each cell, you're using the input values for the down/right neighbors, but the output values for the up/left neighbors.
So, for your given example, when you're computing the neighbors of grid[1][0], you've already replaced two of the neighbors, grid[0][0] and grid[0][1], so they're now 4, 4 instead of 2, 3. Which means you're averaging 4, 4, 5, 6, 8, 9 instead of 2, 3, 5, 6, 8, 9. So, instead of getting a 5.5 that you round down to 5, you get a 6.0 that you round down to 6.
The simplest fix is to just build up a new output grid as you go along, then return that:
rows, cols = len(grid), len(grid[0])
outgrid = []
# Go through each cell
for r in range(rows):
    outrow = []
    for c in range(cols):
        # … same code as before, but instead of the grid[r][c] = …
        outrow.append(int(total / n))
    outgrid.append(outrow)
return outgrid
If you need to modify the grid in place, you can instead copy the original grid, and iterate over that copy:
rows, cols = len(grid), len(grid[0])
ingrid = [list(row) for row in grid]
# Go through each cell
for r in range(rows):
    for c in range(cols):
        # … same code as before, but instead of total += grid[r+ri][c+ci]
        total += ingrid[r + ri][c + ci]
If you used a 2D NumPy array instead of a list of lists, you could solve this at a higher level.
NumPy lets you add entire arrays all at once, divide them by scalars, etc., so you can get rid of those loops over r and c and just do the work array-wide. But you still have to think about your boundaries. You can't just add arr and arr[:-1] and arr[1:] and so on; you need to pad them out to the same size. And if you just pad with 0s, then for that grid[1][0] cell you'll end up averaging 0, 2, 3, 0, 5, 6, 0, 8, 9 over nine values, which is no good. But if you pad with NaN values, so you're averaging NaN, 2, 3, NaN, 5, 6, NaN, 8, 9, then you can use the nanmean function, which ignores the NaN values and averages only the 6 real values.
So, this is still a few lines of code to iterate over the 9 directions, pad the 9 arrays, and nanmean the results. (Or you could cram it into a giant expression with product, like the leetcode answer, but that isn't exactly more readable or easier to understand.)
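Here is a minimal sketch of that NaN-padding idea, using sliding_window_view (NumPy 1.20+) instead of padding nine shifted copies, with the failing test case from the question:
import numpy as np
grid = np.array([[2, 3, 4], [5, 6, 7], [8, 9, 10], [11, 12, 13], [14, 15, 16]], dtype=float)
padded = np.pad(grid, 1, mode='constant', constant_values=np.nan)
windows = np.lib.stride_tricks.sliding_window_view(padded, (3, 3))
# nanmean ignores the NaN padding; floor matches the "rounding down" in the prompt:
smoothed = np.floor(np.nanmean(windows, axis=(2, 3))).astype(int)
print(smoothed.tolist())
# [[4, 4, 5], [5, 6, 6], [8, 9, 9], [11, 12, 12], [13, 13, 14]]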
But if you can drag in SciPy, a collection of algorithms for almost anything you'd ever want to build on top of NumPy, it has a function in its ndimage library called generic_filter that can do every conceivable variation of "gather the N neighbors, padding like X, and run function Y on the resulting arrays".
In our case, we want to gather the 3-per-axis neighbors, pad with the constant value NaN, and run the nanmean function, so this one-liner does everything you need (make sure grid is a float array first, e.g. np.asarray(grid, dtype=float), since NaN cannot be represented in an integer array):
scipy.ndimage.generic_filter(grid, function=np.nanmean, size=3, mode='constant', cval=np.nan)

How to choose axis value in numpy array

I am a new user to numpy and I was using numpy delete, where the documentation mentions that to delete a horizontal row we should use axis=0, but in other documentation, the numpy glossary, it says the horizontal axis is 1. It would be great if someone could let me know what is wrong in my understanding.
An array is a systematic way of structuring numbers in grids of any dimensionality. The grid directions have labels, and these labels come from a convention of how new dimensions are added to a grid.
Here's the convention:
The simplest such grid is a 0-dimensional (0D) array, which has no axes and can only hold a scalar. This is a 0D array:
42
If we start putting scalars into a list we get a 1D array. This new grid only has one axis, and if we want to label that axis with a number, we had better start with something simple - like axis=0! A 1D array could be:
# ----0--->
[42, π, √2]
Now we want to create an array of 1D arrays, which will give us a 2D array. The new vertical axis that picks out a row takes the label axis=0, and the old horizontal axis gets bumped up to axis=1 (in NumPy, new dimensions are added at the front; this is also why np.stack stacks along a new axis 0 by default). Here's what it could look like:
# ----1---->
[[42, π, √2], # |
[1, 2, 3], # 0
[10, 20, 30]] # V
The true beauty is that this generalizes to infinity. If we need a box of numbers we'd create a 3D array by stacking 2D arrays, and the direction that traces the depth of the box becomes the new axis=0, bumping the existing axes to 1 and 2. If we wanted a 4D array, we would just make a list of boxes (3D arrays) and index every box along yet another new front axis. This can go on forever: the highest-numbered axis always runs along a single row, and axis 0 always indexes the outermost grouping.
In NumPy:
Any function/method that takes an axis argument uses this convention. For a 2D array this means that doing something like np.delete(X, [1, 2, 3], axis=0) will iterate over the subarrays stacked along the 0th axis to return X without rows 1, 2 and 3. The same logic applies for getting values from an array:
X[rows_along_0th_axis, columns_along_1st_axis, ..., vectors_along_nth_axis]
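A quick way to internalize this is to watch which direction collapses when you reduce along each axis (a small sketch):
import numpy as np
X = np.array([[1, 2, 3],
              [4, 5, 6]])
print(X.sum(axis=0))   # collapses the vertical axis (down the rows)    -> [5 7 9]
print(X.sum(axis=1))   # collapses the horizontal axis (across columns) -> [ 6 15]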
Taking from the links that you provided, here are the excerpts from numpy delete and the glossary that probably caused you some confusion, with a clarification following.
Excerpt
>>> arr = np.array([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
>>> arr
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
>>> np.delete(arr, 1, 0)
array([[ 1, 2, 3, 4],
[ 9, 10, 11, 12]])
Excerpt
the first running vertically downwards across rows (axis 0), and the
second running horizontally across columns (axis 1)
I think the confusion derives from the words vertically and horizontally in the second excerpt.
What the second excerpt means is that by setting axis you decide over which dimension to move. For example, in a 2D matrix, axis=0 corresponds to iterating over the rows (thus moving vertically over the array), while axis=1 corresponds to iterating over the columns (moving horizontally over the array). It does not say that axis=1 is the horizontal axis, as the OP understood.
The delete function follows this description: with np.delete(arr, 1, axis=0), the function iterates over the rows and deletes the row with index 1. If, instead, columns should be deleted, use axis=1. For example, on the same array arr
>>> np.delete(arr, [0,1,4], axis=1)
array([[ 3, 4],
[ 7, 8],
[11, 12]])
in which delete iterates over the columns and deletes the columns with indices 0 and 1. (Be careful with the out-of-range index 4: older NumPy versions silently ignored it with a DeprecationWarning, as shown here, but recent versions raise an IndexError instead.)
