I am reading the numpy.r_ docs, and I gather that I cannot place the new length-1 axis in the middle position. For example:
a = np.array( [[3,4,5],[ 33,44,55]])
b = np.array( [[-3,-4,-5],[ -33,-44,-55]])
np.r_['0,3,1',a,b]
First, the shape (2,3) of a is upgraded to (1,2,3), and likewise for b. Then the two (1,2,3) shapes are combined into (2,2,3), the final shape of the result; note that only the first number is added, because of the '0' in '0,3,1'.
Now the question: according to the docs, I can upgrade the shape of a to (1,2,3) or (2,3,1), but how can I upgrade it to (2,1,3)?
In [381]: a = np.array( [[3,4,5],[ 33,44,55]])
     ...:
     ...: b = np.array( [[-3,-4,-5],[ -33,-44,-55]])
     ...:
     ...: np.r_['0,3,1',a,b]
Out[381]:
array([[[  3,   4,   5],
        [ 33,  44,  55]],

       [[ -3,  -4,  -5],
        [-33, -44, -55]]])
Your question should have displayed this result. It helps the reader visualize the action, and better understand your question. Not everyone can run your sample (I couldn't when I first read it on my phone).
You can do the same concatenation with stack (or even np.array((a,b))):
In [382]: np.stack((a,b))
Out[382]:
array([[[  3,   4,   5],
        [ 33,  44,  55]],

       [[ -3,  -4,  -5],
        [-33, -44, -55]]])
stack with axis produces what you want (again, a good question would display the desired result):
In [383]: np.stack((a,b), axis=1)
Out[383]:
array([[[  3,   4,   5],
        [ -3,  -4,  -5]],

       [[ 33,  44,  55],
        [-33, -44, -55]]])
We can add the dimension to a by itself with:
In [384]: np.expand_dims(a,1)
Out[384]:
array([[[ 3,  4,  5]],

       [[33, 44, 55]]])
In [385]: _.shape
Out[385]: (2, 1, 3)
a[:,None] and a.reshape(2,1,3) also do it.
As you found, I can't do the same with np.r_:
In [413]: np.r_['0,3,0',a].shape
Out[413]: (2, 3, 1)
In [414]: np.r_['0,3,1',a].shape
Out[414]: (1, 2, 3)
In [415]: np.r_['0,3,-1',a].shape
Out[415]: (1, 2, 3)
Even looking at the code it is hard to tell how r_ is handling this 3rd parameter. It looks like it uses the ndmin parameter to expand the arrays (which prepends new axes if needed), and then some sort of transpose to move the new axis.
This could be classed as a bug in r_, but it's been around so long that I doubt anyone will care. It's more useful for expanding "slices" than for fancy concatenation.
While the syntax of np.r_ may be convenient on occasion, it isn't an essential function. It's just another front end to np.concatenate (with the added arange/linspace functionality).
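Since r_ with these directives is just a front end to concatenate, the call above can be reproduced directly; a minimal sketch (variable names are illustrative):

```python
import numpy as np

a = np.array([[3, 4, 5], [33, 44, 55]])
b = np.array([[-3, -4, -5], [-33, -44, -55]])

# '0,3,1' pads each (2,3) input to (1,2,3) and joins along axis 0,
# which is the same as concatenating with an explicit new axis:
equivalent = np.concatenate((a[None], b[None]), axis=0)

assert np.array_equal(np.r_['0,3,1', a, b], equivalent)
print(equivalent.shape)  # (2, 2, 3)
```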
Suppose I have a 3D array (tensor)
print a.shape
(100, 100, 100)
and want to index its first axis using one array:
print i.shape
(20,)
and its last axis using another array
print j.shape
(30,)
My intent is to get a (20, 100, 30) array, for example, to be used in assignments like
a[?!, :, ?!] = b
However, I can't figure out how.
print a[i.reshape(20, 1, 1), :, j.reshape(1, 1, 30)].shape
(20, 1, 30, 100)
print a[i.reshape(20, 1, 1), :, j].shape
(20, 1, 30, 100)
print a[i.reshape(20, 1), :, j].shape
(20, 30, 100)
print a[i.reshape(20, 1, 1), j.reshape(1, 1, 30)].shape
(20, 1, 30, 100)
As I understand the "advanced indexing" rules, the first attempt should have worked, but I didn't even end up with a 3D array, and the full dimension (100) came at the end instead of the middle.
Approach #1 : You can use np.ix_ to get such meshes of indices; simply indexing into the input array with them gives the desired output. Thus, an implementation would be like so -
a[np.ix_(i,np.arange(a.shape[1]),j)]
Approach #2 : Simpler way using broadcasted indexing -
a[i[:,None],:,j].swapaxes(1,2)
Verify with a sample case -
In [24]: a = np.random.randint(0,99,(5,3,5))
In [25]: i = np.random.randint(0,5,(2))
In [26]: j = np.random.randint(0,5,(2))
In [27]: a[i[0],:,j[0]]
Out[27]: array([15, 7, 74])
In [28]: a[i[0],:,j[1]]
Out[28]: array([32, 19, 85])
In [29]: a[i[1],:,j[0]]
Out[29]: array([76, 65, 96])
In [30]: a[i[1],:,j[1]]
Out[30]: array([54, 65, 66])
In [31]: a[np.ix_(i,np.arange(a.shape[1]),j)]
Out[31]:
array([[[15, 32],
        [ 7, 19],
        [74, 85]],

       [[76, 54],
        [65, 65],
        [96, 66]]])
In [50]: a[i[:,None],:,j].swapaxes(1,2)
Out[50]:
array([[[15, 32],
        [ 7, 19],
        [74, 85]],

       [[76, 54],
        [65, 65],
        [96, 66]]])
Assigning values with the indexing
For approach #1, it's straightforward -
a[np.ix_(i,np.arange(a.shape[1]),j)] = b
For approach #2, if b is a scalar, it's straightforward too -
a[i[:,None],:,j] = b
For approach #2 again, if you are assigning from an ndarray b of shape (20,100,30), we need to swap the axes of b before assigning, like so -
a[i[:,None],:,j] = np.swapaxes(b,1,2)
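To see that the two assignment forms agree, here is a quick consistency check on a small random case (shapes are scaled down from the question; i and j are assumed to have no repeated values):

```python
import numpy as np

a1 = np.random.randint(0, 99, (5, 3, 5))
a2 = a1.copy()
i = np.array([0, 2])
j = np.array([1, 4])
b = np.random.randint(0, 99, (2, 3, 2))  # (len(i), a.shape[1], len(j))

# Approach #1: np.ix_ builds the full open mesh of indices
a1[np.ix_(i, np.arange(a1.shape[1]), j)] = b
# Approach #2: broadcasted indexing, with b's last two axes swapped
a2[i[:, None], :, j] = np.swapaxes(b, 1, 2)

assert np.array_equal(a1, a2)
```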
I have a function that I would like to apply to an array of tuples and I am wondering if there is a clean way to do it.
Normally, I could use np.vectorize to apply the function to each item in the array, however, in this case "each item" is a tuple so numpy interprets the array as a 3d array and applies the function to each item within the tuple.
So I can assume that the incoming array is one of:
tuple
1 dimensional array of tuples
2 dimensional array of tuples
I can probably write some looping logic but it seems like numpy most likely has something that does this more efficiently and I don't want to reinvent the wheel.
This is an example. I am trying to apply the tuple_converter function to each tuple in the array.
array_of_tuples1 = np.array([
    [(1,2,3),(2,3,4),(5,6,7)],
    [(7,2,3),(2,6,4),(5,6,6)],
    [(8,2,3),(2,5,4),(7,6,7)],
])

array_of_tuples2 = np.array([
    (1,2,3),(2,3,4),(5,6,7),
])

plain_tuple = (1,2,3)

# Convert each set of tuples
def tuple_converter(tup):
    return tup[0]**2 + tup[1] + tup[2]
# Vectorizing applies the formula to each integer rather than each tuple
tuple_converter_vectorized = np.vectorize(tuple_converter)
print(tuple_converter_vectorized(array_of_tuples1))
print(tuple_converter_vectorized(array_of_tuples2))
print(tuple_converter_vectorized(plain_tuple))
Desired Output for array_of_tuples1:
[[ 6 11 38]
[54 14 37]
[69 13 62]]
Desired Output for array_of_tuples2:
[ 6 11 38]
Desired Output for plain_tuple:
6
But the code above produces this error, because it is trying to apply the function to an integer rather than a tuple:
<ipython-input-209-fdf78c6f4b13> in tuple_converter(tup)
10
11 def tuple_converter(tup):
---> 12 return tup[0]**2 + tup[1] + tup[2]
13
14
IndexError: invalid index to scalar variable.
array_of_tuples1 and array_of_tuples2 are not actually arrays of tuples, but just 3- and 2-dimensional arrays of integers:
In [1]: array_of_tuples1 = np.array([
   ...:     [(1,2,3),(2,3,4),(5,6,7)],
   ...:     [(7,2,3),(2,6,4),(5,6,6)],
   ...:     [(8,2,3),(2,5,4),(7,6,7)],
   ...: ])
In [2]: array_of_tuples1
Out[2]:
array([[[1, 2, 3],
        [2, 3, 4],
        [5, 6, 7]],

       [[7, 2, 3],
        [2, 6, 4],
        [5, 6, 6]],

       [[8, 2, 3],
        [2, 5, 4],
        [7, 6, 7]]])
So, instead of vectorizing your function (which would effectively for-loop through the individual integers of the array), you should apply it along the suitable axis (the axis of the "tuples") without caring about the type of the sequence:
In [6]: np.apply_along_axis(tuple_converter, 2, array_of_tuples1)
Out[6]:
array([[ 6, 11, 38],
       [54, 14, 37],
       [69, 13, 62]])
In [9]: np.apply_along_axis(tuple_converter, 1, array_of_tuples2)
Out[9]: array([ 6, 11, 38])
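The plain-tuple case works the same way, since apply_along_axis coerces its input to an array first; applying the function along axis 0 of a 1-d input gives back a 0-d array:

```python
import numpy as np

def tuple_converter(tup):
    return tup[0]**2 + tup[1] + tup[2]

plain_tuple = (1, 2, 3)

# The tuple becomes a (3,) array; the function is applied to that
# single "row" as a whole, so tup[0], tup[1], tup[2] index into it
result = np.apply_along_axis(tuple_converter, 0, plain_tuple)
print(result)  # 6
```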
The other answer above is certainly correct, and probably what you're looking for. But I noticed you put the word "clean" into your question, and so I'd like to add this answer as well.
If we can make the assumption that all the tuples are 3 element tuples (or that they have some constant number of elements), then there's a nice little trick you can do so that the same piece of code will work on any single tuple, 1d array of tuples, or 2d array of tuples without an if/else for the 1d/2d cases. I'd argue that avoiding switches is always cleaner (although I suppose this could be contested).
import numpy as np

def map_to_tuples(x):
    x = np.array(x)
    flattened = x.flatten().reshape(-1, 3)
    return np.array([tup[0]**2 + tup[1] + tup[2] for tup in flattened]).reshape(x.shape[:-1])
Outputs the following for your inputs (respectively), as desired:
[[ 6 11 38]
 [54 14 37]
 [69 13 62]]
[ 6 11 38]
6
If you are serious about the tuples bit, you could define a structured dtype.
In [535]: dt=np.dtype('int,int,int')
In [536]: x1 = np.array([
     ...:     [(1,2,3),(2,3,4),(5,6,7)],
     ...:     [(7,2,3),(2,6,4),(5,6,6)],
     ...:     [(8,2,3),(2,5,4),(7,6,7)],
     ...: ], dtype=dt)
In [537]: x1
Out[537]:
array([[(1, 2, 3), (2, 3, 4), (5, 6, 7)],
       [(7, 2, 3), (2, 6, 4), (5, 6, 6)],
       [(8, 2, 3), (2, 5, 4), (7, 6, 7)]],
      dtype=[('f0', '<i4'), ('f1', '<i4'), ('f2', '<i4')])
Note that the display uses tuples. x1 is a 3x3 array with dtype dt. The elements, or records, are displayed as tuples. This is more useful if the tuple elements differ - float, integer, string, etc.
Now define a function that works with fields of such an array:
In [538]: def foo(tup):
     ...:     return tup['f0']**2 + tup['f1'] + tup['f2']
It applies neatly to x1.
In [539]: foo(x1)
Out[539]:
array([[ 6, 11, 38],
       [54, 14, 37],
       [69, 13, 62]])
It also applies to a 1d array of the same dtype.
In [540]: x2=np.array([(1,2,3),(2,3,4),(5,6,7) ],dtype=dt)
In [541]: foo(x2)
Out[541]: array([ 6, 11, 38])
And a 0d array of matching type:
In [542]: foo(np.array(plain_tuple,dtype=dt))
Out[542]: 6
But foo(plain_tuple) won't work, since the function is written to work with named fields, not indexed ones.
The function could be modified to cast the input to the correct dtype if needed:
In [545]: def foo1(tup):
     ...:     temp = np.asarray(tup, dtype=dt)
     ...:     return temp['f0']**2 + temp['f1'] + temp['f2']
In [548]: plain_tuple
Out[548]: (1, 2, 3)
In [549]: foo1(plain_tuple)
Out[549]: 6
In [554]: foo1([(1,2,3),(2,3,4),(5,6,7)]) # list of tuples
Out[554]: array([ 6, 11, 38])
Let's say I have a tensor of the following form:
import numpy as np
a = np.array([[[1,2],
               [3,4]],
              [[5,6],
               [7,3]]])
# a.shape : (2,2,2) is a tensor containing 2x2 matrices
indices = np.argmax(a, axis=2)
#print indices
for mat in a:
    max_i = np.argmax(mat, axis=1)
    # Not really working; I would like to
    # change 4 in the first matrix to -1
    # and 3 in the last to -1
    mat[max_i] = -1
print a
Now what I would like to do is to use indices as a mask on a to replace every max element with, say, -1. Is there a numpy way of doing this? So far, all I have figured out is using for loops.
Here's one way using linear indexing in 3D -
m,n,r = a.shape
offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
np.put(a,indices + offset,-1)
Sample run -
In [92]: a
Out[92]:
array([[[28, 59, 26, 70],
        [57, 28, 71, 49],
        [33,  6, 10, 90]],

       [[24, 16, 83, 67],
        [96, 16, 72, 56],
        [74,  4, 71, 81]]])
In [93]: indices = np.argmax(a, axis=2)
In [94]: m,n,r = a.shape
...: offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
...: np.put(a,indices + offset,-1)
...:
In [95]: a
Out[95]:
array([[[28, 59, 26, -1],
        [57, 28, -1, 49],
        [33,  6, 10, -1]],

       [[24, 16, -1, 67],
        [-1, 16, 72, 56],
        [74,  4, 71, -1]]])
Here's another way with linear indexing again, but in 2D -
m,n,r = a.shape
a.reshape(-1,r)[np.arange(m*n),indices.ravel()] = -1
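To see why the 2-D form works: after reshape(-1,r), every (row, column) pair of the first two axes becomes a single row of a 2-D view, and indices.ravel() picks exactly one element from each of those rows. A small sketch (the example array here is illustrative):

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)
indices = np.argmax(a, axis=2)      # all 3s here: the last column is largest
m, n, r = a.shape

flat = a.reshape(-1, r)             # (6, 4) view: one row per (i, j) pair
flat[np.arange(m * n), indices.ravel()] = -1   # writes through to `a`

assert (a[..., -1] == -1).all()
```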
Runtime tests and verify output -
In [156]: def vectorized_app1(a,indices): # 3D linear indexing
     ...:     m,n,r = a.shape
     ...:     offset = n*r*np.arange(m)[:,None] + r*np.arange(n)
     ...:     np.put(a,indices + offset,-1)
     ...:
     ...: def vectorized_app2(a,indices): # 2D linear indexing
     ...:     m,n,r = a.shape
     ...:     a.reshape(-1,r)[np.arange(m*n),indices.ravel()] = -1
     ...:
In [157]: # Generate random 3D array and the corresponding indices array
...: a = np.random.randint(0,99,(100,100,100))
...: indices = np.argmax(a, axis=2)
...:
...: # Make copies for feeding into functions
...: ac1 = a.copy()
...: ac2 = a.copy()
...:
In [158]: vectorized_app1(ac1,indices)
In [159]: vectorized_app2(ac2,indices)
In [160]: np.allclose(ac1,ac2)
Out[160]: True
In [161]: # Make copies for feeding into functions
...: ac1 = a.copy()
...: ac2 = a.copy()
...:
In [162]: %timeit vectorized_app1(ac1,indices)
1000 loops, best of 3: 311 µs per loop
In [163]: %timeit vectorized_app2(ac2,indices)
10000 loops, best of 3: 145 µs per loop
You can use indices to index into the last dimension of a provided that you also specify index arrays into the first two dimensions as well:
import numpy as np
a = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 3]]])
indices = np.argmax(a, axis=2)
print(repr(a[range(a.shape[0]), range(a.shape[1]), indices]))
# array([[2, 3],
#        [2, 7]])
a[range(a.shape[0]), range(a.shape[1]), indices] = -1
print(repr(a))
# array([[[ 1, -1],
#         [ 3,  4]],
#        [[ 5,  6],
#         [-1, -1]]])
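For reference, newer NumPy (1.15+) has np.put_along_axis, which performs this replacement along the argmax axis directly, without building the index arrays by hand; a minimal sketch:

```python
import numpy as np

a = np.array([[[1, 2],
               [3, 4]],
              [[5, 6],
               [7, 3]]])

# Keep the reduced axis (via None) so the index array broadcasts
# against `a` along axis 2
idx = np.argmax(a, axis=2)[..., None]
np.put_along_axis(a, idx, -1, axis=2)
print(a)
# [[[ 1 -1]
#   [ 3 -1]]
#  [[ 5 -1]
#   [-1  3]]]
```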
I have 4 2D numpy arrays, called a, b, c, d, each of them made of n rows and m columns. What I need to do is giving to each element of b and d a value calculated as follows (pseudo-code):
min_coords = min_of_neighbors_coords(x, y)
b[x,y] = a[x,y] * a[min_coords];
d[x,y] = c[min_coords];
Where min_of_neighbors_coords is a function that, given the coordinates of an element of the array, returns the coordinates of the 'neighbor' element that has the lowest value. I.e., considering the array:
1, 2, 5
3, 7, 2
2, 3, 6
min_of_neighbors_coords(1, 1) will refer to the central element with the value of 7, and will return the tuple (0, 0): the coordinates of the number 1.
I managed to do this using for loops (element by element), but the algorithm is VERY slow and I'm searching for a way to improve it, avoiding loops and delegating the calculations to numpy.
Is it possible?
EDIT I have kept my original answer at the bottom. As Paul points out in the comments, the original answer didn't really answer the OP's question, and could be more easily achieved with an ndimage filter. The following much more cumbersome function should do the right thing. It takes two arrays, a and c, and returns the windowed minimum of a and the values in c at the positions of the windowed minimums in a:
import numpy as np
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a, c):
    ac = np.concatenate((a[None], c[None]))
    rows, cols = ac.shape[1:]
    ret = np.empty_like(ac)
    # Fill in the center
    win_ac = as_strided(ac, shape=(2, rows-2, cols, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[np.ogrid[:2, :rows-2, :cols] +
                    [np.argmin(win_ac[0], axis=2)]]
    win_ac = as_strided(win_ac, shape=(2, rows-2, cols-2, 3),
                        strides=win_ac.strides+win_ac.strides[2:3])
    ret[:, 1:-1, 1:-1] = win_ac[np.ogrid[:2, :rows-2, :cols-2] +
                                [np.argmin(win_ac[0], axis=2)]]
    # Fill the top, bottom, left and right borders
    win_ac = as_strided(ac[:, :2, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[np.ogrid[:2, :2, :cols-2] +
                    [np.argmin(win_ac[0], axis=2)]]
    ret[:, 0, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                             np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, -2:, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[np.ogrid[:2, :2, :cols-2] +
                    [np.argmin(win_ac[0], axis=2)]]
    ret[:, -1, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                              np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, :, :2], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[np.ogrid[:2, :rows-2, :2] +
                    [np.argmin(win_ac[0], axis=2)]]
    ret[:, 1:-1, 0] = win_ac[:, np.ogrid[:rows-2],
                             np.argmin(win_ac[0], axis=1)]
    win_ac = as_strided(ac[:, :, -2:], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[np.ogrid[:2, :rows-2, :2] +
                    [np.argmin(win_ac[0], axis=2)]]
    ret[:, 1:-1, -1] = win_ac[:, np.ogrid[:rows-2],
                              np.argmin(win_ac[0], axis=1)]
    # Fill the corners
    win_ac = ac[:, :2, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, :2, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    return ret
The return is a (2, rows, cols) array that can be unpacked into the two arrays:
>>> a = np.random.randint(100, size=(5,5))
>>> c = np.random.randint(100, size=(5,5))
>>> a
array([[42, 54, 18, 88, 26],
       [80, 65, 83, 31,  4],
       [51, 52, 18, 88, 52],
       [ 1, 70,  5,  0, 89],
       [47, 34, 27, 67, 68]])
>>> c
array([[94, 94, 29,  6, 76],
       [81, 47, 67, 21, 26],
       [44, 92, 20, 32, 90],
       [81, 25, 32, 68, 25],
       [49, 43, 71, 79, 77]])
>>> neighbor_min(a, c)
array([[[42, 18, 18,  4,  4],
        [42, 18, 18,  4,  4],
        [ 1,  1,  0,  0,  0],
        [ 1,  1,  0,  0,  0],
        [ 1,  1,  0,  0,  0]],

       [[94, 29, 29, 26, 26],
        [94, 29, 29, 26, 26],
        [81, 81, 68, 68, 68],
        [81, 81, 68, 68, 68],
        [81, 81, 68, 68, 68]]])
The OP's case could then be solved as:
def bd_from_ac(a, c):
    b, d = neighbor_min(a, c)
    return a*b, d
And while there is a serious performance hit, it is pretty fast still:
In [3]: a = np.random.rand(1000, 1000)
In [4]: c = np.random.rand(1000, 1000)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 570 ms per loop
You are not really using the coordinates of the minimum neighboring element for anything else than fetching it, so you may as well skip that part and create a min_neighbor function. If you don't want to resort to cython for fast looping, you are going to have to go with rolling window views, such as outlined in Paul's link. This will typically convert your (m, n) array into a (m-2, n-2, 3, 3) view of the same data, and you would then apply np.min over the last two axes.
Unfortunately you have to apply it one axis at a time, so you will have to create a (m-2, n-2, 3) copy of your data. Fortunately, you can compute the minimum in two steps, first windowing and minimizing along one axis, then along the other, and obtain the same result. So at most you are going to need intermediate storage the size of your input. If needed, you could even reuse the output array as intermediate storage and avoid memory allocations, but that is left as an exercise...
The following function does that. It is kind of lengthy because it has to deal not only with the central area, but also with the special cases of the four edges and four corners. Other than that it is a pretty compact implementation:
import numpy as np
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a):
    rows, cols = a.shape
    ret = np.empty_like(a)
    # Fill in the center
    win_a = as_strided(a, shape=(rows-2, cols, 3),
                       strides=a.strides+a.strides[:1])
    win_a = win_a.min(axis=2)
    win_a = as_strided(win_a, shape=(rows-2, cols-2, 3),
                       strides=win_a.strides+win_a.strides[1:])
    ret[1:-1, 1:-1] = win_a.min(axis=2)
    # Fill the top, bottom, left and right borders
    win_a = as_strided(a[:2, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[0, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[-2:, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[-1, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[:, :2], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, 0] = win_a.min(axis=2).min(axis=1)
    win_a = as_strided(a[:, -2:], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, -1] = win_a.min(axis=2).min(axis=1)
    # Fill the corners
    ret[0, 0] = a[:2, :2].min()
    ret[0, -1] = a[:2, -2:].min()
    ret[-1, -1] = a[-2:, -2:].min()
    ret[-1, 0] = a[-2:, :2].min()
    return ret
You can now do things like:
>>> a = np.random.randint(10, size=(5, 5))
>>> a
array([[0, 3, 1, 8, 9],
       [7, 2, 7, 5, 7],
       [4, 2, 6, 1, 9],
       [2, 8, 1, 2, 3],
       [7, 7, 6, 8, 0]])
>>> neighbor_min(a)
array([[0, 0, 1, 1, 5],
       [0, 0, 1, 1, 1],
       [2, 1, 1, 1, 1],
       [2, 1, 1, 0, 0],
       [2, 1, 1, 0, 0]])
And your original question can be solved as:
def bd_from_ac(a, c):
    return a*neighbor_min(a), neighbor_min(c)
As a performance benchmark:
In [2]: m, n = 1000, 1000
In [3]: a = np.random.rand(m, n)
In [4]: c = np.random.rand(m, n)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 123 ms per loop
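On newer NumPy (1.20+), np.lib.stride_tricks.sliding_window_view removes most of the boilerplate above: replicating the edges with np.pad first gives every cell a full 3x3 window, and a minimum over replicated values equals the minimum over the true neighborhood. A sketch of the same windowed minimum (not the original answer's code):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def neighbor_min_swv(a):
    # Pad by replicating the border so edge cells still get 3x3 windows;
    # duplicated values cannot change a minimum
    p = np.pad(a, 1, mode='edge')
    return sliding_window_view(p, (3, 3)).min(axis=(2, 3))

a = np.array([[1, 2, 5],
              [3, 7, 2],
              [2, 3, 6]])
print(neighbor_min_swv(a))
# [[1 1 2]
#  [1 1 2]
#  [2 2 2]]
```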
Finding a[min_coords] is a rolling window operation. Several clever solutions are outlined in this post. You'll want to make the creation of the c[min_coords] array a side effect of whichever solution you choose.
I hope this helps. I can post some sample code later when I have some time.
I have interest in helping you, and I believe there are possibly better solutions outside the scope of your question, but in order to put my own time into writing code, I must have some feedback of yours, because I am not 100% sure I understand what you need.
One thing to consider: if you are a C# developer, maybe a "brute-force" implementation in C# can outperform a clever Numpy implementation, so you could at least test your rather simple operations implemented in C#. GeoTIFF (which I suppose you are reading) has a relatively friendly specification, and I guess there might be .NET GeoTIFF libraries around.
But supposing you want to give Numpy a try (and I believe you should), let's take a look at what you're trying to achieve:
If you are going to run min_coords(array) on every element of arrays a and c, you might consider "stacking" nine copies of the same array, each rolled by some offset, using numpy.dstack() and numpy.roll(). Then you apply numpy.argmin(stacked_array, axis=2) and get an array of values between 0 and 8, where each value maps to a tuple of offset indexes.
Using this principle, your min_coords() function would be vectorized, operating on the whole array at once and returning an array of indexes into a lookup table containing the offsets.
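That idea can be sketched as follows (border cells wrap around because of np.roll, so they would need separate handling; the names here are illustrative):

```python
import numpy as np

def rolled_argmin(a):
    # Layer k of `stacked` holds, at each cell, the value of the
    # neighbor at offsets[k]; np.roll wraps at the borders
    offsets = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]
    stacked = np.dstack([np.roll(np.roll(a, -dx, axis=0), -dy, axis=1)
                         for dx, dy in offsets])
    k = np.argmin(stacked, axis=2)   # values 0..8, one winner per cell
    return np.asarray(offsets)[k]    # (rows, cols, 2) array of offsets

a = np.array([[1, 2, 5],
              [3, 7, 2],
              [2, 3, 6]])
print(rolled_argmin(a)[1, 1])  # [-1 -1]: the neighbor holding the 1
```

Adding these offsets to each cell's own coordinates then gives the absolute coordinates that min_of_neighbors_coords would return.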
If you have interest in elaborating this, please leave a comment.
Hope this helps!