I'm trying to implement fprop for a MaxPooling layer in a convolutional network, with non-overlapping 2x2 pooling regions. To do so, I need to split my input matrix into 2x2 matrices so that I can extract the maximum of each. I then create a mask which I can use later in bprop. To carry out the splitting, I split my input matrix first vertically and then horizontally using vsplit and hsplit, and then find the maximum using amax. However, this keeps crashing with index-out-of-bounds exceptions and I am not sure where the error is. Is there a simpler way to split the 24 x 24 input matrix into 144 2x2 matrices so that I can obtain the maxima?
I am doing the following to do so:
for i in range(inputs.shape[0]):
    for j in range(inputs.shape[1]):
        for k in range(inputs.shape[2] // 2):
            for h in range(inputs.shape[3] // 2):
                outputs[i,j,k,h] = np.amax(np.hsplit(np.vsplit(inputs[i,j], inputs.shape[2] // 2)[k], inputs.shape[1] // 2)[h])
                max_ind = np.argmax(np.hsplit(np.vsplit(inputs[i,j], inputs.shape[2] // 2)[k], inputs.shape[1] // 2)[h])
                max_ind_y = max_ind // inputs.shape[2]
                if (max_ind_y == 0):
                    max_ind_x = max_ind
                else:
                    max_ind_x = max_ind % inputs.shape[3]
                self.mask[i,j,max_ind_y + 2 * k, max_ind_x + 2 * h] = outputs[i,j,k,h]
EDIT:
This is the output produced by reshape:
What I would like instead is
[0 1
4 5]
[2 3
6 7]
and so on...
This is implemented as view_as_blocks in skimage.util:
import skimage.util
blocks = skimage.util.view_as_blocks(a, (2, 2))
maxs = blocks.max((2, 3))
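If adding scikit-image as a dependency is not an option, the same blocks can probably be obtained with plain NumPy. The following is a minimal sketch, assuming a 2D array whose sides are both even; the 4x4 array a is only an illustrative input, chosen so that the first blocks match the layout asked for in the question.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)  # illustrative input, not taken from the question
m, n = a.shape

# Split each axis into (block index, offset inside the block), then move the
# two block indices to the front so that blocks[i, j] is the (i, j)-th 2x2 tile.
blocks = a.reshape(m // 2, 2, n // 2, 2).swapaxes(1, 2)

# Reduce over the two within-block axes to get one maximum per tile.
maxs = blocks.max(axis=(2, 3))

print(blocks[0, 0])
print(maxs)
```

Here blocks has shape (m // 2, n // 2, 2, 2), the same layout that view_as_blocks returns for a (2, 2) block shape.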
Step #1 : Getting max_ind_x, max_ind_y
We need to get the row, column indices of the max element per block -
m,n = inputs.shape
a = inputs.reshape(m//2,2,n//2,2).swapaxes(1,2)
row, col = np.unravel_index(a.reshape(a.shape[:-2] + (4,)).argmax(-1), (2,2))
Step #2 : Setting output array with argmax places from the input
Then, looking at your code it seems you are trying to create an output array with those argmax places set with values from the input array. Hence, we could do -
out = np.zeros_like(a)
M,N = a.shape[:2]
indx_tuple = np.arange(M)[:,None],np.arange(N), row, col
out[indx_tuple] = a[indx_tuple]
Finally, we could get the 2D shape back for the output, which also serves as a good verification step against the original input array, inputs -
out2d = out.reshape(a.shape[:2]+(2,2)).swapaxes(1,2).reshape(m,n)
Sample input, output -
In [291]: np.random.seed(0)
...: inputs = np.random.randint(11,99,(6,4))
In [292]: inputs
Out[292]:
array([[55, 58, 75, 78],
[78, 20, 94, 32],
[47, 98, 81, 23],
[69, 76, 50, 98],
[57, 92, 48, 36],
[88, 83, 20, 31]])
In [286]: out2d
Out[286]:
array([[ 0, 0, 0, 0],
[78, 0, 94, 0],
[ 0, 98, 0, 0],
[ 0, 0, 0, 98],
[ 0, 92, 48, 0],
[ 0, 0, 0, 0]])
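For the 4D input in the question, the same reshape idea can be applied to all images and channels at once. The following is a minimal sketch, assuming inputs has shape (batch, channels, height, width) with even height and width. The function name maxpool2x2_with_mask is hypothetical, and the mask stores the maximum values at their argmax positions, which is what the loop in the question appears to intend.

```python
import numpy as np

def maxpool2x2_with_mask(inputs):
    """Hypothetical helper: 2x2 non-overlapping max pooling plus a mask."""
    n, c, h, w = inputs.shape
    # (n, c, h/2, 2, w/2, 2) -> (n, c, h/2, w/2, 2, 2): one 2x2 tile per output cell
    blocks = inputs.reshape(n, c, h // 2, 2, w // 2, 2).swapaxes(3, 4)
    # Flatten each tile to 4 values so max/argmax run over a single axis
    flat = blocks.reshape(n, c, h // 2, w // 2, 4)
    outputs = flat.max(axis=-1)
    arg = flat.argmax(axis=-1)
    # One-hot selector of the argmax position inside each tile
    onehot = np.arange(4) == arg[..., None]
    mask_blocks = np.where(onehot, flat, 0).reshape(n, c, h // 2, w // 2, 2, 2)
    # Undo the tiling to get back to (n, c, h, w)
    mask = mask_blocks.swapaxes(3, 4).reshape(n, c, h, w)
    return outputs, mask

# Usage with the 6x4 sample above, wrapped as a single image with one channel
inputs = np.array([[55, 58, 75, 78],
                   [78, 20, 94, 32],
                   [47, 98, 81, 23],
                   [69, 76, 50, 98],
                   [57, 92, 48, 36],
                   [88, 83, 20, 31]])
outputs, mask = maxpool2x2_with_mask(inputs[None, None])
```

When a tile contains the same maximum more than once, argmax keeps only the first occurrence, so each tile contributes exactly one nonzero mask entry.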
I want to find the row and column index of the maximum in a numpy matrix, but the maximum must not lie in a given set of rows or columns. That is, those rows and columns should be skipped while computing the max.
Example:
# finding max in numpy matrix
[row,col] = np.where(mat == mat.max())
But it should skip the rows listed in removed_rows=[] and the columns listed in columns_rows=[].
I don't want to create a new sub matrix for the computation.
Let a be the input array, and let rows_rem and cols_rem be the row and column indices to be skipped. One approach uses masking. Note that it zeroes out the skipped entries by multiplication, so it assumes the maximum of the remaining entries is positive -
m,n = a.shape
d0,d1 = np.ogrid[:m,:n]
a_masked = a*~(np.in1d(d0,rows_rem)[:,None] | np.in1d(d1,cols_rem))
max_row, max_col = np.where(a_masked == a_masked.max())
Sample run -
In [204]: # Inputs
...: a = np.random.randint(11,99,(4,5))
...: rows_rem = [1,3]
...: cols_rem = [1,2,4]
...:
In [205]: a
Out[205]:
array([[36, 51, 72, 18, 31],
[78, 42, 12, 71, 72],
[38, 46, 42, 67, 12],
[87, 56, 76, 14, 21]])
In [206]: a_masked
Out[206]:
array([[36, 0, 0, 18, 0],
[ 0, 0, 0, 0, 0],
[38, 0, 0, 67, 0],
[ 0, 0, 0, 0, 0]])
In [207]: max_row, max_col
Out[207]: (array([2]), array([3]))
Please note that if there's more than one element with the same max value, we would have all of those in the output. So, if you want any or the first of those, we can use argmax, like so -
max_row, max_col = np.unravel_index(a_masked.argmax(),a.shape)
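Because the masking above sets skipped entries to zero, it can pick a skipped cell when every remaining value is negative. The following is a minimal sketch of a variant that fills skipped cells with -inf instead; the function name masked_argmax is hypothetical, and the array is converted to float as a side effect of using -inf.

```python
import numpy as np

def masked_argmax(a, rows_rem, cols_rem):
    """Hypothetical helper: (row, col) of the max outside the skipped rows/cols."""
    keep = np.ones(a.shape, dtype=bool)
    keep[rows_rem, :] = False   # drop whole rows
    keep[:, cols_rem] = False   # drop whole columns
    # Skipped cells become -inf so they can never win, whatever the sign of the data
    filled = np.where(keep, a, -np.inf)
    return np.unravel_index(filled.argmax(), a.shape)

a = np.array([[36, 51, 72, 18, 31],
              [78, 42, 12, 71, 72],
              [38, 46, 42, 67, 12],
              [87, 56, 76, 14, 21]])
row, col = masked_argmax(a, [1, 3], [1, 2, 4])
neg_row, neg_col = masked_argmax(-a, [1, 3], [1, 2, 4])
```

Like the argmax variant above, this returns only the first position when the maximum occurs more than once.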
remove_rows = [2,3]
remove_cols = [0,1]
a = np.random.randint(11,99,(4,5))
>>> a
array([[60, 86, 89, 66, 20],
[77, 86, 78, 90, 44],
[68, 57, 83, 48, 25],
[30, 81, 42, 11, 63]])
>>>
Get the row and column indices that you are interested in by filtering out the indices you want removed:
r, c = a.shape
r = [x for x in range(r) if x not in remove_rows]
c = [x for x in range(c) if x not in remove_cols]
>>> r,c
([0, 1], [2, 3, 4])
>>>
Now r and c can be used for integer indexing, numpy.ix_ helps with this.
>>> a[np.ix_(r,c)]
array([[89, 66, 20],
[78, 90, 44]])
>>>
Tack on ndarray.max() to get the max value:
>>> a[np.ix_(r,c)].max()
90
>>>
Finally, use numpy.where to find where it is in the original array:
>>> row, col = np.where(a == a[np.ix_(r,c)].max())
>>> row, col
(array([1]), array([3]))
>>>
This method will also work if removing non-sequential rows or columns.
For example:
remove_rows = [0,3]
remove_cols = [1,4]
A is a (d, e) numpy array. I compute a (d, e) numpy array B, where the entry B[i,j] is computed as follows:
b=0
for k in range(i+1,d):
    for l in range(j+1,e):
        b=b+A[k,l]
B[i,j]=b
In other words, B[i,j] is the sum of A[k,l] taken over all indices k>i, l>j; this is sort of the opposite of the usual cumsum applied to both axes. I am wondering if there is a more elegant and faster way to do this (e.g. using np.cumsum)?
Assuming you're trying to do this:
import numpy as np

A = np.arange(15).reshape((5, -1))

def cumsum2_reverse(arr):
    out = np.empty_like(arr)
    d, e = arr.shape
    for i in range(d):
        for j in range(e):
            b = 0
            # Sum everything strictly below and to the right of (i, j)
            for k in range(i + 1, d):
                for l in range(j + 1, e):
                    b += arr[k, l]
            out[i, j] = b
    return out
Then if you do,
In [1]: A_revsum = cumsum2_reverse(A)
In [2]: A_revsum
Out[2]:
array([[72, 38, 0],
[63, 33, 0],
[48, 25, 0],
[27, 14, 0],
[ 0, 0, 0]])
You could use np.cumsum on the reverse-ordered arrays to compute the sum. For example, at first you might try something similar to what @Jaime suggested:
In [3]: np.cumsum(np.cumsum(A[::-1, ::-1], 0), 1)[::-1, ::-1]
Out[3]:
array([[105, 75, 40],
[102, 72, 38],
[ 90, 63, 33],
[ 69, 48, 25],
[ 39, 27, 14]])
Note that np.cumsum includes the first element along each axis (here, the last row and column of A), whereas the sum we want runs over strictly greater indices. To get zeros in the last row and column, you could shift the output of this operation by one position. This might look like:
def cumsum2_reverse_alt(arr):
    out = np.zeros_like(arr)
    out[:-1, :-1] = np.cumsum(np.cumsum(arr[:0:-1, :0:-1], 0), 1)[::-1, ::-1]
    return out
This gives the same values as above.
In [4]: (cumsum2_reverse(A) == cumsum2_reverse_alt(A)).all()
Out[4]: True
Note that the version that uses np.cumsum is much faster for large arrays. For example:
In [5]: A=np.arange(3000).reshape((50, -1))
In [6]: %timeit cumsum2_reverse(A)
1 loops, best of 3: 453 ms per loop
In [7]: %timeit cumsum2_reverse_alt(A)
10000 loops, best of 3: 24.7 us per loop
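The shift can also be written by padding the full reversed cumulative sum with a trailing row and column of zeros and dropping the first row and column. This is only an alternative sketch of the same idea; the function name cumsum2_reverse_pad is hypothetical.

```python
import numpy as np

def cumsum2_reverse_pad(arr):
    """Hypothetical variant: B[i, j] = sum of arr[k, l] over k > i, l > j."""
    # full[i, j] sums arr[k, l] over k >= i, l >= j (non-strict inequalities)
    full = arr[::-1, ::-1].cumsum(axis=0).cumsum(axis=1)[::-1, ::-1]
    # Append a zero row and column, then shift by one to make both inequalities strict
    return np.pad(full, ((0, 1), (0, 1)), mode="constant")[1:, 1:]

A = np.arange(15).reshape((5, -1))
B = cumsum2_reverse_pad(A)
```

For the 5x3 example array this should reproduce the A_revsum values shown earlier.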
I have 4 2D numpy arrays, called a, b, c, d, each with n rows and m columns. I need to assign to each element of b and d a value calculated as follows (pseudo-code):
min_coords = min_of_neighbors_coords(x, y)
b[x,y] = a[x,y] * a[min_coords];
d[x,y] = c[min_coords];
Where min_of_neighbors_coords is a function that, given the coordinates of an element of the array, returns the coordinates of the 'neighbor' element that has the lowest value. For example, considering the array:
1, 2, 5
3, 7, 2
2, 3, 6
min_of_neighbors_coords(1, 1) will refer to the central element with the value of 7, and will return the tuple (0, 0): the coordinates of the number 1.
I managed to do this using for loops (element by element), but the algorithm is VERY slow and I'm looking for a way to improve it, avoiding loops and delegating the calculations to numpy.
Is it possible?
EDIT I have kept my original answer at the bottom. As Paul points out in the comments, the original answer didn't really answer the OP's question, and could be more easily achieved with an ndimage filter. The following much more cumbersome function should do the right thing. It takes two arrays, a and c, and returns the windowed minimum of a and the values in c at the positions of the windowed minimums in a:
import numpy as np
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a, c):
    ac = np.concatenate((a[None], c[None]))
    rows, cols = ac.shape[1:]
    ret = np.empty_like(ac)
    # Fill in the center
    # (index tuples are built explicitly, as current NumPy requires)
    win_ac = as_strided(ac, shape=(2, rows-2, cols, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :cols]) +
                    (np.argmin(win_ac[0], axis=2),)]
    win_ac = as_strided(win_ac, shape=(2, rows-2, cols-2, 3),
                        strides=win_ac.strides+win_ac.strides[2:3])
    ret[:, 1:-1, 1:-1] = win_ac[tuple(np.ogrid[:2, :rows-2, :cols-2]) +
                                (np.argmin(win_ac[0], axis=2),)]
    # Fill the top, bottom, left and right borders
    win_ac = as_strided(ac[:, :2, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[tuple(np.ogrid[:2, :2, :cols-2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 0, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                             np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, -2:, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[tuple(np.ogrid[:2, :2, :cols-2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, -1, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                              np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, :, :2], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 1:-1, 0] = win_ac[:, np.ogrid[:rows-2],
                             np.argmin(win_ac[0], axis=1)]
    win_ac = as_strided(ac[:, :, -2:], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 1:-1, -1] = win_ac[:, np.ogrid[:rows-2],
                              np.argmin(win_ac[0], axis=1)]
    # Fill the corners
    win_ac = ac[:, :2, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, :2, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    return ret
The return is a (2, rows, cols) array that can be unpacked into the two arrays:
>>> a = np.random.randint(100, size=(5,5))
>>> c = np.random.randint(100, size=(5,5))
>>> a
array([[42, 54, 18, 88, 26],
[80, 65, 83, 31, 4],
[51, 52, 18, 88, 52],
[ 1, 70, 5, 0, 89],
[47, 34, 27, 67, 68]])
>>> c
array([[94, 94, 29, 6, 76],
[81, 47, 67, 21, 26],
[44, 92, 20, 32, 90],
[81, 25, 32, 68, 25],
[49, 43, 71, 79, 77]])
>>> neighbor_min(a, c)
array([[[42, 18, 18, 4, 4],
[42, 18, 18, 4, 4],
[ 1, 1, 0, 0, 0],
[ 1, 1, 0, 0, 0],
[ 1, 1, 0, 0, 0]],
[[94, 29, 29, 26, 26],
[94, 29, 29, 26, 26],
[81, 81, 68, 68, 68],
[81, 81, 68, 68, 68],
[81, 81, 68, 68, 68]]])
The OP's case could then be solved as:
def bd_from_ac(a, c):
    b,d = neighbor_min(a, c)
    return a*b, d
And while there is a serious performance hit, it is pretty fast still:
In [3]: a = np.random.rand(1000, 1000)
In [4]: c = np.random.rand(1000, 1000)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 570 ms per loop
You are not really using the coordinates of the minimum neighboring element for anything else than fetching it, so you may as well skip that part and create a min_neighbor function. If you don't want to resort to cython for fast looping, you are going to have to go with rolling window views, such as outlined in Paul's link. This will typically convert your (m, n) array into a (m-2, n-2, 3, 3) view of the same data, and you would then apply np.min over the last two axes.
Unfortunately you have to apply it one axis at a time, so you will have to create a (m-2, n-2, 3) copy of your data. Fortunately, you can compute the minimum in two steps, first windowing and minimizing along one axis, then along the other, and obtain the same result. So at most you are going to have intermediate storage the size of your input. If needed, you could even reuse the output array as intermediate storage and avoid memory allocations, but that is left as exercise...
The following function does that. It is kind of lengthy because it has to deal not only with the central area, but also with the special cases of the four edges and four corners. Other than that it is a pretty compact implementation:
import numpy as np
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a):
    rows, cols = a.shape
    ret = np.empty_like(a)
    # Fill in the center
    win_a = as_strided(a, shape=(rows-2, cols, 3),
                       strides=a.strides+a.strides[:1])
    win_a = win_a.min(axis=2)
    win_a = as_strided(win_a, shape=(rows-2, cols-2, 3),
                       strides=win_a.strides+win_a.strides[1:])
    ret[1:-1, 1:-1] = win_a.min(axis=2)
    # Fill the top, bottom, left and right borders
    win_a = as_strided(a[:2, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[0, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[-2:, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[-1, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[:, :2], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, 0] = win_a.min(axis=2).min(axis=1)
    win_a = as_strided(a[:, -2:], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, -1] = win_a.min(axis=2).min(axis=1)
    # Fill the corners
    ret[0, 0] = a[:2, :2].min()
    ret[0, -1] = a[:2, -2:].min()
    ret[-1, -1] = a[-2:, -2:].min()
    ret[-1, 0] = a[-2:, :2].min()
    return ret
You can now do things like:
>>> a = np.random.randint(10, size=(5, 5))
>>> a
array([[0, 3, 1, 8, 9],
[7, 2, 7, 5, 7],
[4, 2, 6, 1, 9],
[2, 8, 1, 2, 3],
[7, 7, 6, 8, 0]])
>>> neighbor_min(a)
array([[0, 0, 1, 1, 5],
[0, 0, 1, 1, 1],
[2, 1, 1, 1, 1],
[2, 1, 1, 0, 0],
[2, 1, 1, 0, 0]])
And your original question can be solved as:
def bd_from_ac(a, c):
    return a*neighbor_min(a), neighbor_min(c)
As a performance benchmark:
In [2]: m, n = 1000, 1000
In [3]: a = np.random.rand(m, n)
In [4]: c = np.random.rand(m, n)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 123 ms per loop
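The rolling-window idea described above can also be sketched without hand-written strides, using sliding_window_view (available in NumPy 1.20 and later). This is a minimal sketch, not the answer's implementation: it swaps in edge padding so that the borders and corners need no special cases, and the function name neighbor_min_windows is hypothetical.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighbor_min_windows(a):
    """Hypothetical helper: 3x3 windowed minimum, with windows clipped at the borders."""
    # Replicating edge values only repeats values already inside each clipped
    # window, so the minimum at borders and corners is unchanged.
    padded = np.pad(a, 1, mode="edge")
    # Read-only view of shape (rows, cols, 3, 3); no data is copied here
    windows = sliding_window_view(padded, (3, 3))
    return windows.min(axis=(2, 3))

a = np.array([[0, 3, 1, 8, 9],
              [7, 2, 7, 5, 7],
              [4, 2, 6, 1, 9],
              [2, 8, 1, 2, 3],
              [7, 7, 6, 8, 0]])
result = neighbor_min_windows(a)
```

Like neighbor_min(a) above, each window includes the central element itself.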
Finding a[min_coords] is a rolling window operation. Several clever solutions are outlined in this post. You'll want to make the creation of the c[min_coords] array a side-effect of whichever solution you choose.
I hope this helps. I can post some sample code later when I have some time.
I have interest in helping you, and I believe there are possibly better solutions outside the scope of your question, but in order to put my own time into writing code, I must have some feedback of yours, because I am not 100% sure I understand what you need.
One thing to consider: if you are a C# developer, maybe a "brute-force" implementation of C# can outperform a clever implementation of Numpy, so you could consider at least testing your rather simple operations implemented in C#. Geotiff (which I suppose you are reading) has a relatively friendly specification, and I guess there might be .NET GeoTiff libraries around.
But supposing you want to give Numpy a try (and I believe you should), let's take a look at what you're trying to achieve:
If you are going to run min_coords(array) on every element of arrays a and c, you might consider stacking nine copies of the same array, each copy rolled by some offset, using numpy.dstack() and numpy.roll(). Then you apply numpy.argmin(stacked_array, axis=2) and get an array containing values between 0 and 8, where each of these values maps to a tuple containing the offset indexes.
Using this principle, your min_coords() function would be vectorized: it operates on the whole array at once and returns an array of indices into a lookup table of offsets.
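The stacking idea can be sketched as follows. This is a minimal sketch under stated assumptions: numpy.roll wraps around the array edges, so padding with +inf is swapped in to keep out-of-bounds neighbors from ever being selected; the window includes the central element (drop (0, 0) from OFFSETS to exclude it); and the names OFFSETS, min_neighbor_coords and bd_from_ac_stacked are hypothetical.

```python
import numpy as np

# Lookup table: index 0..8 -> (row offset, column offset)
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

def min_neighbor_coords(a):
    """Hypothetical helper: coordinates of the minimum in each 3x3 neighborhood."""
    rows, cols = a.shape
    # +inf border so positions outside the array can never be the minimum
    padded = np.pad(a.astype(float), 1, mode="constant", constant_values=np.inf)
    # One shifted copy of the array per offset, stacked along a third axis
    stacked = np.dstack([padded[1 + dy:1 + dy + rows, 1 + dx:1 + dx + cols]
                         for dy, dx in OFFSETS])
    k = stacked.argmin(axis=2)        # values 0..8, one per element
    offs = np.array(OFFSETS)
    yy, xx = np.indices(a.shape)
    return yy + offs[k, 0], xx + offs[k, 1]

def bd_from_ac_stacked(a, c):
    ny, nx = min_neighbor_coords(a)
    return a * a[ny, nx], c[ny, nx]

a = np.array([[42, 54, 18, 88, 26],
              [80, 65, 83, 31, 4],
              [51, 52, 18, 88, 52],
              [1, 70, 5, 0, 89],
              [47, 34, 27, 67, 68]])
c = np.array([[94, 94, 29, 6, 76],
              [81, 47, 67, 21, 26],
              [44, 92, 20, 32, 90],
              [81, 25, 32, 68, 25],
              [49, 43, 71, 79, 77]])
ny, nx = min_neighbor_coords(a)
b, d = bd_from_ac_stacked(a, c)
```

When several neighbors share the minimum, argmin keeps the first one in OFFSETS order, which determines which value of c is picked.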
If you have interest in elaborating this, please leave a comment.
Hope this helps!