As I read through numpy tutorials, I give myself challenges to build my understanding. I was reading through tutorialspoint.com's numpy resource when I saw that the end product of their last example, the modified array, was not really an array.
Bottom of page, Broadcasting Iteration example:
https://www.tutorialspoint.com/numpy/numpy_iterating_over_array.htm
So I decided it would be a nice challenge to create the same end product as an array. I succeeded, but I was not able to use np.nditer, nor was I able to use broadcasting, although I'm sure there must be a way to use either or both.
Here is my code:
a = np.arange(0,60,5)
a = a.reshape(12,1)
b = np.arange(1,5)
arr = np.zeros((12,2))
counter = 0
for i in range(arr.shape[0]):
    if counter < 4:
        arr[i,:] = np.array([a[i],b[counter]])
        counter += 1
    else:
        counter = 0
        arr[i,:] = np.array([a[i],b[counter]])
print(arr)
How can I do this more efficiently?
I haven't seen that particular nditer tutorial before.
https://docs.scipy.org/doc/numpy-1.13.0/reference/arrays.nditer.html
is the one I've used. And I keep telling people that nditer, with this Python interface, is not efficient. That page is most useful as a stepping stone to using nditer in C code, as illustrated in its last cython example.
There aren't many numpy functions that use np.nditer (in Python code). np.ndindex is one of the few. It's worth reading its code. np.einsum uses this iterator, but in compiled code.
I'll take time later to read and comment on the example in question. Learning to use broadcasting well is more important than using nditer.
In [212]: a=np.arange(0,60,5).reshape(3,4)
In [213]: a
Out[213]:
array([[ 0, 5, 10, 15],
[20, 25, 30, 35],
[40, 45, 50, 55]])
In [214]: b=np.arange(1,5)
In [215]: b
Out[215]: array([1, 2, 3, 4])
In [225]: for x,y in np.nditer([a,b]):
     ...:     print("%d:%d"%(x,y), end=' ')
     ...: print()
0:1 5:2 10:3 15:4 20:1 25:2 30:3 35:4 40:1 45:2 50:3 55:4
Equivalent plain Python iteration:
In [231]: for row in a:
     ...:     for x,y in zip(row,b):
     ...:         print("%d:%d"%(x,y), end=' ')
     ...: print()
     ...:
0:1 5:2 10:3 15:4 20:1 25:2 30:3 35:4 40:1 45:2 50:3 55:4
np.broadcast will broadcast the (3,4) array with (4,):
In [234]: np.broadcast(a,b)
Out[234]: <numpy.broadcast at 0x9c2a7f8>
In [235]: list(_)
Out[235]:
[(0, 1),
(5, 2),
(10, 3),
(15, 4),
(20, 1),
(25, 2),
(30, 3),
(35, 4),
(40, 1),
(45, 2),
(50, 3),
(55, 4)]
Use np.array(list(np.broadcast(a,b))) to make a (12,2) array.
Or with the same print:
In [237]: for x,y in np.broadcast(a,b):
     ...:     print("%d:%d"%(x,y), end=' ')
     ...: print()
     ...:
0:1 5:2 10:3 15:4 20:1 25:2 30:3 35:4 40:1 45:2 50:3 55:4
Your iteration:
In [251]: arr = np.zeros((12,2),dtype=int)
     ...: counter = 0
     ...: for i in range(arr.shape[0]):
     ...:     if counter < 4:
     ...:         arr[i,:] = np.array([a.flat[i],b[counter]])
     ...:         counter += 1
     ...:     else:
     ...:         counter = 0
     ...:         arr[i,:] = np.array([a.flat[i],b[counter]])
     ...:
In [252]: arr
Out[252]:
array([[ 0, 1],
[ 5, 2],
[10, 3],
[15, 4],
[20, 1],
[25, 1],
[30, 2],
[35, 3],
[40, 4],
[45, 1],
[50, 1],
[55, 2]])
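Oops, something looks off if you expect the second column to be a repeated b. The else branch resets counter to 0 but does not increment it, so b[0] is used twice in a row after every wrap-around.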
There are many ways of combining a and b into this kind of array.
This flattens the 2d a to 1d, replicates b with tile, and joins the two with stack (column_stack would work as well):
In [264]: np.stack((a.flat, np.tile(b,3)),1)
Out[264]:
array([[ 0, 1],
[ 5, 2],
[10, 3],
[15, 4],
[20, 1],
[25, 2],
[30, 3],
[35, 4],
[40, 1],
[45, 2],
[50, 3],
[55, 4]])
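Since the question asked about broadcasting specifically, here is a minimal sketch that lets np.broadcast_arrays expand b against a and then pairs the flattened results. The names a_b, b_b and pairs are made up for this illustration.

```python
import numpy as np

a = np.arange(0, 60, 5).reshape(3, 4)
b = np.arange(1, 5)

# Expand both operands to the common (3, 4) shape (views, no data copied).
a_b, b_b = np.broadcast_arrays(a, b)

# Flatten in row-major order and pair the values column-wise: shape (12, 2).
pairs = np.stack((a_b.ravel(), b_b.ravel()), axis=1)
print(pairs)
```

The second column is b repeated three times, which is what np.tile(b,3) builds by hand.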
I want to find the row and column index of the maximum in a numpy matrix, but the maximum must not lie in a given set of rows or columns. The search should therefore skip those rows and columns.
Example:
# finding max in numpy matrix
[row,col] = np.where(mat == mat.max())
But it should skip rows removed_rows=[] and columns columns_rows=[]
I don't want to create a new sub matrix for the computation.
Let a be the input array, rows_rem and cols_rem be the rows and column indices to be skipped respectively. We would have an approach using masking, like so -
m,n = a.shape
d0,d1 = np.ogrid[:m,:n]
a_masked = a*~(np.in1d(d0,rows_rem)[:,None] | np.in1d(d1,cols_rem))
max_row, max_col = np.where(a_masked == a_masked.max())
Sample run -
In [204]: # Inputs
...: a = np.random.randint(11,99,(4,5))
...: rows_rem = [1,3]
...: cols_rem = [1,2,4]
...:
In [205]: a
Out[205]:
array([[36, 51, 72, 18, 31],
[78, 42, 12, 71, 72],
[38, 46, 42, 67, 12],
[87, 56, 76, 14, 21]])
In [206]: a_masked
Out[206]:
array([[64, 0, 0, 90, 0],
[ 0, 0, 0, 0, 0],
[17, 0, 0, 40, 0],
[ 0, 0, 0, 0, 0]])
In [207]: max_row, max_col
Out[207]: (array([0]), array([3]))
Please note that if there's more than one element with the same max value, we would have all of those in the output. So, if you want any or the first of those, we can use argmax, like so -
max_row, max_col = np.unravel_index(a_masked.argmax(),a.shape)
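One caveat with multiplying by the mask is that skipped entries become 0, which would win if every remaining value were negative. A hedged variant, assuming a temporary float copy of the data is acceptable, fills the skipped entries with -inf instead. The function name masked_argmax is hypothetical, and np.isin is used here as the newer spelling of np.in1d.

```python
import numpy as np

def masked_argmax(a, rows_rem, cols_rem):
    """Return (row, col) of the first maximum outside the skipped rows/columns."""
    m, n = a.shape
    # True wherever the entry sits in a skipped row or a skipped column.
    skip = (np.isin(np.arange(m), rows_rem)[:, None]
            | np.isin(np.arange(n), cols_rem))
    # Skipped entries become -inf, so they can never be the maximum.
    filled = np.where(skip, -np.inf, a)
    return np.unravel_index(filled.argmax(), a.shape)

a = np.array([[36, 51, 72, 18, 31],
              [78, 42, 12, 71, 72],
              [38, 46, 42, 67, 12],
              [87, 56, 76, 14, 21]])
row, col = masked_argmax(a, [1, 3], [1, 2, 4])
print(row, col)
```

With rows 1 and 3 and columns 1, 2 and 4 skipped, the candidates are 36, 18, 38 and 67, so the result points at the 67 in row 2, column 3.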
remove_rows = [2,3]
remove_cols = [0,1]
a = np.random.randint(11,99,(4,5))
>>> a
array([[60, 86, 89, 66, 20],
[77, 86, 78, 90, 44],
[68, 57, 83, 48, 25],
[30, 81, 42, 11, 63]])
>>>
Get the row and column indices that you are interested in by filtering out the indices you want removed:
r, c = a.shape
r = [x for x in range(r) if x not in remove_rows]
c = [x for x in range(c) if x not in remove_cols]
>>> r,c
([0, 1], [2, 3, 4])
>>>
Now r and c can be used for integer indexing, numpy.ix_ helps with this.
>>> a[np.ix_(r,c)]
array([[89, 66, 20],
[78, 90, 44]])
>>>
Tack on ndarray.max() to get the max value:
>>> a[np.ix_(r,c)].max()
90
>>>
Finally, use numpy.where to find where it is in the original array:
>>> row, col = np.where(a == a[np.ix_(r,c)].max())
>>> row, col
(array([1]), array([3]))
>>>
This method will also work if removing non-sequential rows or columns.
For example:
remove_rows = [0,3]
remove_cols = [1,4]
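One thing to watch with the final np.where step is that it compares against the whole array, so it would also report a position inside a removed row or column if the same value happened to occur there. A small sketch that avoids this, under the assumption that the first maximum is enough, takes argmax on the sub-array and maps the position back through r and c:

```python
import numpy as np

a = np.array([[60, 86, 89, 66, 20],
              [77, 86, 78, 90, 44],
              [68, 57, 83, 48, 25],
              [30, 81, 42, 11, 63]])
remove_rows = [2, 3]
remove_cols = [0, 1]

# Indices that survive the filtering, as in the answer above.
r = [x for x in range(a.shape[0]) if x not in remove_rows]
c = [x for x in range(a.shape[1]) if x not in remove_cols]

sub = a[np.ix_(r, c)]
# Position of the first maximum inside the sub-array ...
i, j = np.unravel_index(sub.argmax(), sub.shape)
# ... translated back to coordinates of the original array.
row, col = r[i], c[j]
print(row, col)
```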
A is a (d,e) numpy array. I compute a (d,e) numpy array B where I compute the entry B[i,j] as follows
b=0
for k in range(i+1,d):
    for l in range(j+1,e):
        b=b+A[k,l]
B[i,j]=b
In other words, B[i,j] is the sum of A[k,l] taken over all indices k>i, l>j; this is sort of the opposite of the usual cumsum applied to both axes. I am wondering if there is a more elegant and faster way to do this (e.g. using np.cumsum)?
Assuming you're trying to do this:
import numpy as np

A = np.arange(15).reshape((5, -1))

def cumsum2_reverse(arr):
    out = np.empty_like(arr)
    d, e = arr.shape
    for i in range(d):
        for j in range(e):
            b = 0
            # sum every entry strictly below and to the right of (i, j)
            for k in range(i + 1, d):
                for l in range(j + 1, e):
                    b += arr[k, l]
            out[i, j] = b
    return out
Then if you do,
In [1]: A_revsum = cumsum2_reverse(A)
In [2]: A_revsum
Out[2]:
array([[72, 38, 0],
[63, 33, 0],
[48, 25, 0],
[27, 14, 0],
[ 0, 0, 0]])
You could use np.cumsum on the reverse-ordered arrays to compute the sum. For example, at first you might try something similar to what @Jaime suggested:
In [3]: np.cumsum(np.cumsum(A[::-1, ::-1], 0), 1)[::-1, ::-1]
Out[3]:
array([[105, 75, 40],
[102, 72, 38],
[ 90, 63, 33],
[ 69, 48, 25],
[ 39, 27, 14]])
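Remember that np.cumsum includes the first element along each axis (here the last row and last column of A, because the array is reversed), so every entry also counts its own row and column. To get zeros in the last row and column, shift the output by one position along both axes. This might look like: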
def cumsum2_reverse_alt(arr):
    out = np.zeros_like(arr)
    out[:-1, :-1] = np.cumsum(np.cumsum(arr[:0:-1, :0:-1], 0), 1)[::-1, ::-1]
    return out
This gives the same values as above.
In [4]: (cumsum2_reverse(A) == cumsum2_reverse_alt(A)).all()
Out[4]: True
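As an extra sanity check on something other than np.arange data, the shifted-cumsum idea can be compared against a direct slice-and-sum translation of the definition. This is only a sketch; the function names and the random test array are made up for the illustration.

```python
import numpy as np

def reverse_cumsum2(arr):
    # Cumulative sums over the reversed array without its first row/column,
    # flipped back and written into the top-left block of a zero array.
    out = np.zeros_like(arr)
    out[:-1, :-1] = arr[:0:-1, :0:-1].cumsum(axis=0).cumsum(axis=1)[::-1, ::-1]
    return out

def reverse_cumsum2_slices(arr):
    # Direct translation: B[i, j] = sum of A[k, l] over k > i and l > j.
    d, e = arr.shape
    return np.array([[arr[i + 1:, j + 1:].sum() for j in range(e)]
                     for i in range(d)])

rng = np.random.default_rng(0)
A = rng.integers(-10, 10, size=(6, 4))
print(np.array_equal(reverse_cumsum2(A), reverse_cumsum2_slices(A)))
```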
Note that the version that uses np.cumsum is much faster for large arrays. For example:
In [5]: A=np.arange(3000).reshape((50, -1))
In [6]: %timeit cumsum2_reverse(A)
1 loops, best of 3: 453 ms per loop
In [7]: %timeit cumsum2_reverse_alt(A)
10000 loops, best of 3: 24.7 us per loop
Say I have two 3 dimensional matrices, like so (taken from this matlab example http://www.mathworks.com/help/matlab/ref/dot.html):
A = cat(3,[1 1;1 1],[2 3;4 5],[6 7;8 9])
B = cat(3,[2 2;2 2],[10 11;12 13],[14 15; 16 17])
If I want to take pairwise dot products along the third dimension, I could do so like this in matlab:
C = dot(A,B,3)
Which would give the result:
C =
106 140
178 220
What would be the equivalent operation in numpy, preferably a vectorized one that avoids writing a double for loop over the entire array? I can't make sense of what np.tensordot or np.inner are supposed to do, but they might be options.
In [169]:
A = np.dstack([[[1, 1],[1 ,1]],[[2 ,3],[4, 5]],[[6, 7],[8, 9]]])
B = np.dstack([[[2, 2],[2, 2]],[[10, 11],[12, 13]],[[14, 15], [16, 17]]])
c=np.tensordot(A, B.T,1)
np.vstack([np.diag(c[:,i,i]) for i in range(A.shape[0])]).T
Out[169]:
array([[106, 140],
[178, 220]])
But surprisingly it is the slowest:
In [170]:
%%timeit
c=np.tensordot(A, B.T,1)
np.vstack([np.diag(c[:,i,i]) for i in range(A.shape[0])]).T
10000 loops, best of 3: 95.2 µs per loop
In [171]:
%timeit np.einsum('i...,i...',a,b)
100000 loops, best of 3: 6.93 µs per loop
In [172]:
%timeit inner1d(A,B)
100000 loops, best of 3: 4.51 µs per loop
Using np.einsum:
In [9]: B = np.array([[[2, 2],[2, 2]],[[10, 11],[12, 13]],[[14, 15],[16, 17]]])
In [10]: A = np.array([[[1, 1],[1, 1]],[[2, 3],[4, 5]],[[6, 7],[8, 9]]])
In [11]: np.einsum('i...,i...',A,B)
Out[11]:
array([[106, 140],
[178, 220]])
Or here's another fun one:
In [37]: from numpy.core.umath_tests import inner1d
In [38]: inner1d(A,B)
Out[38]:
array([[106, 140],
[178, 220]])
Edit in response to @flebool's comment, inner1d works for both (2,2,3) and (3,2,2) shaped arrays:
In [41]: A = dstack([[[1, 1],[1 ,1]],[[2 ,3],[4, 5]],[[6, 7],[8, 9]]])
In [42]: B = dstack([[[2, 2],[2, 2]],[[10, 11],[12, 13]],[[14, 15], [16, 17]]])
In [43]: inner1d(A,B)
Out[43]:
array([[106, 140],
[178, 220]])
Here's a solution:
A = np.dstack([[[1, 1],[1 ,1]],[[2 ,3],[4, 5]],[[6, 7],[8, 9]]])
B = np.dstack([[[2, 2],[2, 2]],[[10, 11],[12, 13]],[[14, 15], [16, 17]]])
C = np.einsum('...k,...k',A,B)
Basically dstack concatenates along the third axis, and then you use einsum, the powerful Einstein summation tool provided by numpy. The subscripts '...k,...k' multiply matching entries of the two arrays and sum over the last axis k.
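To make the reduction explicit, here is a small sketch that spells out all three subscripts and compares the result with a plain elementwise product summed over the third axis. It uses only public numpy functions, which may be preferable to inner1d given that it is imported from an internal test module.

```python
import numpy as np

A = np.dstack([[[1, 1], [1, 1]], [[2, 3], [4, 5]], [[6, 7], [8, 9]]])
B = np.dstack([[[2, 2], [2, 2]], [[10, 11], [12, 13]], [[14, 15], [16, 17]]])

# Explicit subscripts: multiply matching entries and sum over the third axis k.
C_einsum = np.einsum('ijk,ijk->ij', A, B)

# The same reduction written as an elementwise product followed by a sum.
C_sum = (A * B).sum(axis=2)

print(C_einsum)
print(np.array_equal(C_einsum, C_sum))
```

Both give the 2x2 result from the MATLAB example, for instance 1*2 + 2*10 + 6*14 = 106 in the top-left corner.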
I have 4 2D numpy arrays, called a, b, c, d, each of them made of n rows and m columns. What I need to do is giving to each element of b and d a value calculated as follows (pseudo-code):
min_coords = min_of_neighbors_coords(x, y)
b[x,y] = a[x,y] * a[min_coords];
d[x,y] = c[min_coords];
Where min_of_neighbors_coords is a function that, given the coordinates of an element of the array, returns the coordinates of the 'neighbor' element that has the lowest value. I.e., considering the array:
1, 2, 5
3, 7, 2
2, 3, 6
min_of_neighbors_coords(1, 1) will refer to the central element with the value of 7, and will return the tuple (0, 0): the coordinates of the number 1.
I managed to do this using for loops (element by element), but the algorithm is VERY slow and I'm looking for a way to improve it by avoiding loops and delegating the calculations to numpy.
Is it possible?
EDIT I have kept my original answer at the bottom. As Paul points out in the comments, the original answer didn't really answer the OP's question, and could be more easily achieved with an ndimage filter. The following much more cumbersome function should do the right thing. It takes two arrays, a and c, and returns the windowed minimum of a and the values in c at the positions of the windowed minimums in a:
import numpy as np
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a, c):
    ac = np.concatenate((a[None], c[None]))
    rows, cols = ac.shape[1:]
    ret = np.empty_like(ac)
    # Fill in the center
    # (index tuples are built with tuple(...) because newer NumPy versions
    # no longer accept a list of arrays as a multidimensional index)
    win_ac = as_strided(ac, shape=(2, rows-2, cols, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :cols]) +
                    (np.argmin(win_ac[0], axis=2),)]
    win_ac = as_strided(win_ac, shape=(2, rows-2, cols-2, 3),
                        strides=win_ac.strides+win_ac.strides[2:3])
    ret[:, 1:-1, 1:-1] = win_ac[tuple(np.ogrid[:2, :rows-2, :cols-2]) +
                                (np.argmin(win_ac[0], axis=2),)]
    # Fill the top, bottom, left and right borders
    win_ac = as_strided(ac[:, :2, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[tuple(np.ogrid[:2, :2, :cols-2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 0, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                             np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, -2:, :], shape=(2, 2, cols-2, 3),
                        strides=ac.strides+ac.strides[2:3])
    win_ac = win_ac[tuple(np.ogrid[:2, :2, :cols-2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, -1, 1:-1] = win_ac[:, np.argmin(win_ac[0], axis=0),
                              np.ogrid[:cols-2]]
    win_ac = as_strided(ac[:, :, :2], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 1:-1, 0] = win_ac[:, np.ogrid[:rows-2],
                             np.argmin(win_ac[0], axis=1)]
    win_ac = as_strided(ac[:, :, -2:], shape=(2, rows-2, 2, 3),
                        strides=ac.strides+ac.strides[1:2])
    win_ac = win_ac[tuple(np.ogrid[:2, :rows-2, :2]) +
                    (np.argmin(win_ac[0], axis=2),)]
    ret[:, 1:-1, -1] = win_ac[:, np.ogrid[:rows-2],
                              np.argmin(win_ac[0], axis=1)]
    # Fill the corners
    win_ac = ac[:, :2, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, :2, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, 0, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, -2:]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, -1] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    win_ac = ac[:, -2:, :2]
    win_ac = win_ac[:, np.ogrid[:2],
                    np.argmin(win_ac[0], axis=-1)]
    ret[:, -1, 0] = win_ac[:, np.argmin(win_ac[0], axis=-1)]
    return ret
The return is a (2, rows, cols) array that can be unpacked into the two arrays:
>>> a = np.random.randint(100, size=(5,5))
>>> c = np.random.randint(100, size=(5,5))
>>> a
array([[42, 54, 18, 88, 26],
[80, 65, 83, 31, 4],
[51, 52, 18, 88, 52],
[ 1, 70, 5, 0, 89],
[47, 34, 27, 67, 68]])
>>> c
array([[94, 94, 29, 6, 76],
[81, 47, 67, 21, 26],
[44, 92, 20, 32, 90],
[81, 25, 32, 68, 25],
[49, 43, 71, 79, 77]])
>>> neighbor_min(a, c)
array([[[42, 18, 18, 4, 4],
[42, 18, 18, 4, 4],
[ 1, 1, 0, 0, 0],
[ 1, 1, 0, 0, 0],
[ 1, 1, 0, 0, 0]],
[[94, 29, 29, 26, 26],
[94, 29, 29, 26, 26],
[81, 81, 68, 68, 68],
[81, 81, 68, 68, 68],
[81, 81, 68, 68, 68]]])
The OP's case could then be solved as:
def bd_from_ac(a, c):
    b,d = neighbor_min(a, c)
    return a*b, d
And while there is a serious performance hit, it is pretty fast still:
In [3]: a = np.random.rand(1000, 1000)
In [4]: c = np.random.rand(1000, 1000)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 570 ms per loop
You are not really using the coordinates of the minimum neighboring element for anything else than fetching it, so you may as well skip that part and create a min_neighbor function. If you don't want to resort to cython for fast looping, you are going to have to go with rolling window views, such as outlined in Paul's link. This will typically convert your (m, n) array into a (m-2, n-2, 3, 3) view of the same data, and you would then apply np.min over the last two axes.
Unfortunately you have to apply it one axis at a time, so you will have to create a (m-2, n-2, 3) copy of your data. Fortunately, you can compute the minimum in two steps, first windowing and minimizing along one axis, then along the other, and obtain the same result. So at most you are going to have intermediate storage the size of your input. If needed, you could even reuse the output array as intermediate storage and avoid memory allocations, but that is left as exercise...
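For readers on a recent NumPy, the rolling-window idea can be sketched more compactly. This assumes NumPy 1.20 or later, where numpy.lib.stride_tricks.sliding_window_view is available, and relies on min accepting a tuple of axes. It is not the implementation from this answer: the function name windowed_min is made up, and edges are handled by padding instead of separate border cases. Like the function in this answer, it includes the element itself in its window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def windowed_min(a):
    """Minimum over each element's 3x3 window (the element itself included)."""
    # Pad with the array maximum so padded cells can never lower a minimum.
    padded = np.pad(a, 1, mode='constant', constant_values=a.max())
    # windows[i, j] is the 3x3 block of `padded` centred on a[i, j].
    windows = sliding_window_view(padded, (3, 3))
    # Reduce over both window axes at once.
    return windows.min(axis=(2, 3))

a = np.array([[0, 3, 1, 8, 9],
              [7, 2, 7, 5, 7],
              [4, 2, 6, 1, 9],
              [2, 8, 1, 2, 3],
              [7, 7, 6, 8, 0]])
print(windowed_min(a))
```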
The following function does that. It is kind of lengthy because it has to deal not only with the central area, but also with the special cases of the four edges and four corners. Other than that it is a pretty compact implementation:
from numpy.lib.stride_tricks import as_strided

def neighbor_min(a):
    rows, cols = a.shape
    ret = np.empty_like(a)
    # Fill in the center: window along the rows first, then along the columns
    win_a = as_strided(a, shape=(rows-2, cols, 3),
                       strides=a.strides+a.strides[:1])
    win_a = win_a.min(axis=2)
    win_a = as_strided(win_a, shape=(rows-2, cols-2, 3),
                       strides=win_a.strides+win_a.strides[1:])
    ret[1:-1, 1:-1] = win_a.min(axis=2)
    # Fill the top, bottom, left and right borders
    win_a = as_strided(a[:2, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[0, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[-2:, :], shape=(2, cols-2, 3),
                       strides=a.strides+a.strides[1:])
    ret[-1, 1:-1] = win_a.min(axis=2).min(axis=0)
    win_a = as_strided(a[:, :2], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, 0] = win_a.min(axis=2).min(axis=1)
    win_a = as_strided(a[:, -2:], shape=(rows-2, 2, 3),
                       strides=a.strides+a.strides[:1])
    ret[1:-1, -1] = win_a.min(axis=2).min(axis=1)
    # Fill the corners
    ret[0, 0] = a[:2, :2].min()
    ret[0, -1] = a[:2, -2:].min()
    ret[-1, -1] = a[-2:, -2:].min()
    ret[-1, 0] = a[-2:, :2].min()
    return ret
You can now do things like:
>>> a = np.random.randint(10, size=(5, 5))
>>> a
array([[0, 3, 1, 8, 9],
[7, 2, 7, 5, 7],
[4, 2, 6, 1, 9],
[2, 8, 1, 2, 3],
[7, 7, 6, 8, 0]])
>>> neighbor_min(a)
array([[0, 0, 1, 1, 5],
[0, 0, 1, 1, 1],
[2, 1, 1, 1, 1],
[2, 1, 1, 0, 0],
[2, 1, 1, 0, 0]])
And your original question can be solved as:
def bd_from_ac(a, c):
    return a*neighbor_min(a), neighbor_min(c)
As a performance benchmark:
In [2]: m, n = 1000, 1000
In [3]: a = np.random.rand(m, n)
In [4]: c = np.random.rand(m, n)
In [5]: %timeit bd_from_ac(a, c)
1 loops, best of 3: 123 ms per loop
Finding a[min_coords] is a rolling window operation. Several clever solutions are outlined in this post. You'll want to make the creation of the c[min_coords] array a side effect of whichever solution you choose.
I hope this helps. I can post some sample code later when I have some time.
I am interested in helping you, and I believe there may be better solutions outside the scope of your question, but before I put time into writing code I need some feedback from you, because I am not 100% sure I understand what you need.
One thing to consider: if you are a C# developer, maybe a "brute-force" implementation of C# can outperform a clever implementation of Numpy, so you could consider at least testing your rather simple operations implemented in C#. Geotiff (which I suppose you are reading) has a relatively friendly specification, and I guess there might be .NET GeoTiff libraries around.
But supposing you want to give Numpy a try (and I believe you should), let's take a look at what you're trying to achieve:
If you are going to run min_coords(array) on every element of arrays a and c, you might consider stacking nine copies of the same array, each copy rolled by some offset, using numpy.dstack() and numpy.roll(). Then you apply numpy.argmin(stacked_array, axis=2) and you get an array containing values between 0 and 8, where each of these values maps to a tuple containing the offset indexes.
Then, using this principle, your min_coords() function would be vectorized, operating in the whole array at once, and giving back an array that gives you an offset which would be the index of a lookup table containing the offsets.
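A rough sketch of this stack-and-argmin idea follows. It deviates from the description in two labeled ways: it uses padded, shifted slices in place of numpy.roll so that edges do not wrap around, and it stacks only the eight true neighbours, leaving out the element itself, which is how I read the question. The names OFFSETS and min_of_neighbors are made up for the illustration, and ties are broken by offset order.

```python
import numpy as np

# Offsets of the eight neighbours; the centre (0, 0) is left out on purpose.
OFFSETS = np.array([(-1, -1), (-1, 0), (-1, 1),
                    (0, -1),           (0, 1),
                    (1, -1),  (1, 0),  (1, 1)])

def min_of_neighbors(a, c):
    """Return (a[min_coords], c[min_coords]) for every element at once."""
    rows, cols = a.shape
    # Pad with +inf so that out-of-bounds neighbours never win the argmin.
    padded = np.pad(a.astype(float), 1, mode='constant', constant_values=np.inf)
    # stacked[k, x, y] is the neighbour of (x, y) at offset k (inf if outside).
    stacked = np.stack([padded[1 + dx:1 + dx + rows, 1 + dy:1 + dy + cols]
                        for dx, dy in OFFSETS])
    k = stacked.argmin(axis=0)              # which offset holds the minimum
    x, y = np.indices(a.shape)
    nx, ny = x + OFFSETS[k, 0], y + OFFSETS[k, 1]   # lookup table -> coordinates
    return a[nx, ny], c[nx, ny]

a = np.array([[1, 2, 5],
              [3, 7, 2],
              [2, 3, 6]])
c = np.arange(9).reshape(3, 3)
a_min, c_min = min_of_neighbors(a, c)
b = a * a_min   # b[x,y] = a[x,y] * a[min_coords]
d = c_min       # d[x,y] = c[min_coords]
print(b)
print(d)
```

For the central element 7 the smallest neighbour is the 1 at (0, 0), so b[1, 1] is 7 and d[1, 1] is c[0, 0].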
If you have interest in elaborating this, please leave a comment.
Hope this helps!