python numpy argmax to max in multidimensional array - python

I have the following code:
import numpy as np
sample = np.random.random((10,10,3))
argmax_indices = np.argmax(sample, axis=2)
i.e. I take the argmax along axis=2, which gives me a (10,10) matrix. Now I want to set the values at these indices to 0. For this, I need to index the sample array. I tried:
max_values = sample[argmax_indices]
but it doesn't work. I want something like
max_values = sample[argmax_indices]
sample[argmax_indices] = 0
I simply validate by checking that max_values - np.max(sample, axis=2) should give a zero matrix of shape (10,10).
Any help will be appreciated.

Here's one approach -
m,n = sample.shape[:2]
I,J = np.ogrid[:m,:n]
max_values = sample[I,J, argmax_indices]
sample[I,J, argmax_indices] = 0
Sample step-by-step run
1) Sample input array :
In [261]: a = np.random.randint(0,9,(2,2,3))
In [262]: a
Out[262]:
array([[[8, 4, 6],
[7, 6, 2]],
[[1, 8, 1],
[4, 6, 4]]])
2) Get the argmax indices along axis=2 :
In [263]: idx = a.argmax(axis=2)
3) Get the shape and arrays for indexing into first two dims :
In [264]: m,n = a.shape[:2]
In [265]: I,J = np.ogrid[:m,:n]
4) Index using I, J and idx for storing the max values using advanced-indexing :
In [267]: max_values = a[I,J,idx]
In [268]: max_values
Out[268]:
array([[8, 7],
[8, 6]])
5) Verify that we are getting an all zeros array after subtracting np.max(a,axis=2) from max_values :
In [306]: max_values - np.max(a, axis=2)
Out[306]:
array([[0, 0],
[0, 0]])
6) Again using advanced-indexing assign those places as zeros and do one more level of visual verification :
In [269]: a[I,J,idx] = 0
In [270]: a
Out[270]:
array([[[0, 4, 6], # <=== Compare this against the original version
[0, 6, 2]],
[[1, 0, 1],
[4, 0, 4]]])
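For reference, the steps above can be collected into one self-contained script that runs on the (10,10,3) array from the question. This is only a sketch: the fixed seed and the names `before`, `diff` and `changed` are assumptions added here so the check is reproducible, and are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, only for reproducibility
sample = rng.random((10, 10, 3))
before = sample.copy()                     # untouched copy to compare against

argmax_indices = np.argmax(sample, axis=2)     # shape (10, 10)

# Open grids of shape (10, 1) and (1, 10) that broadcast against argmax_indices
m, n = sample.shape[:2]
I, J = np.ogrid[:m, :n]

max_values = sample[I, J, argmax_indices]      # shape (10, 10)
sample[I, J, argmax_indices] = 0

# Validation suggested in the question: this should be all zeros
diff = max_values - np.max(before, axis=2)
# Number of entries modified in each length-3 vector
changed = (sample != before).sum(axis=2)
```

If the indexing is right, `diff` contains only zeros and exactly one entry per vector along axis 2 has been overwritten.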

An alternative to np.ogrid is np.indices.
I, J = np.indices(argmax_indices.shape)
sample[I,J,argmax_indices] = 0
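The two helpers differ mainly in the shape of the index arrays they return: np.ogrid gives "open" arrays that rely on broadcasting, while np.indices gives fully expanded ones. A small sketch of that difference, using an arbitrary array made up for illustration:

```python
import numpy as np

sample = np.arange(24).reshape(2, 4, 3) % 7    # small array; the values are arbitrary
argmax_indices = sample.argmax(axis=2)          # shape (2, 4)

I_open, J_open = np.ogrid[:2, :4]               # shapes (2, 1) and (1, 4)
I_dense, J_dense = np.indices((2, 4))           # both of shape (2, 4)

# Both index sets broadcast to (2, 4) and select the same elements
picked_open = sample[I_open, J_open, argmax_indices]
picked_dense = sample[I_dense, J_dense, argmax_indices]
```

Either form works for the assignment; the open grids simply use less memory for large arrays.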

This can also be generalized to handle matrices of any dimensionality. The resulting function will set the largest value in every 1-d vector of the matrix along any dimension d desired (dimension 2 in the case of the original question) to 0 (or to whatever value is desired):
def set_zero(sample, d, val):
    """Set the max value along dimension d in matrix sample to value val."""
    argmax_idxs = sample.argmax(d)
    idxs = [np.indices(argmax_idxs.shape)[j].flatten() for j in range(len(argmax_idxs.shape))]
    idxs.insert(d, argmax_idxs.flatten())
    # Index with a tuple: newer NumPy no longer treats a list as a tuple of indices
    sample[tuple(idxs)] = val
    return sample
set_zero(sample, d=2, val=0)
(Tested for numpy 1.14.1 on python 3.6.4 and python 2.7.14)
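On NumPy 1.15 or later, a similar generalization can probably be written more compactly with np.put_along_axis. The sketch below is an alternative to the function above, not the answerer's method; the name `set_max_along_axis` is hypothetical, and unlike the original it works on a copy instead of modifying its input.

```python
import numpy as np

def set_max_along_axis(arr, axis, val):
    """Return a copy of arr with the first maximum along `axis` replaced by val."""
    out = arr.copy()
    # argmax drops `axis`; put_along_axis expects an index array that keeps it
    idx = np.expand_dims(out.argmax(axis=axis), axis)
    np.put_along_axis(out, idx, val, axis=axis)
    return out

a = np.array([[[8, 4, 6],
               [7, 6, 2]],
              [[1, 8, 1],
               [4, 6, 4]]])
result = set_max_along_axis(a, axis=2, val=0)
```

With the sample array from the first answer, `result` matches the array shown there after the assignment, and `a` itself is left unchanged.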

Related

How can I assign multiple rows and columns of one array to the same rows and columns of another array in Python?

As the title says, how do I assign multiple rows and columns of one array to the same rows and columns of another array in Python?
I want to do the following:
Kn[0, 0] = KeTrans[startPosRow, startPosCol];
Kn[0, 1] = KeTrans[startPosRow, endPosCol];
Kn[1, 0] = KeTrans[endPosRow, startPosCol];
Kn[1, 1] = KeTrans[endPosRow, endPosCol];
Kn is a 2X2 matrix and KeTrans is a 4X4.
I tried the following but with no luck.
Kn[0:1, 0:1] = KeTrans[startPosRow: endPosRow, startPosCol: endPosCol]
But they're not the same rows and columns :-) (unless startPosRow and friends have very specific values).
The problem is that the slice startPosRow:endPosRow (for example) does not mean "element startPosRow and element endPosRow". It means "elements in range(startPosRow, endPosRow)", which doesn't include endPosRow and which typically has more than two matching indices.
If you just want the four corners, you could use slices with a step size:
Kn[0:2, 0:2] = KeTrans[startPosRow:endPosRow + 1:endPosRow - startPosRow,
                       startPosCol:endPosCol + 1:endPosCol - startPosCol]
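If Kn and KeTrans are NumPy arrays, the same four corners can also be picked with np.ix_, which avoids the step arithmetic. A minimal sketch, assuming NumPy arrays; the 4x4 stand-in matrix and the concrete start/end positions are made up for illustration:

```python
import numpy as np

KeTrans = np.arange(16).reshape(4, 4)    # stand-in 4x4 matrix
startPosRow, endPosRow = 1, 3            # hypothetical positions
startPosCol, endPosCol = 0, 2

Kn = np.zeros((2, 2), dtype=KeTrans.dtype)
# np.ix_ builds an open mesh, so every listed row is paired with every listed column
Kn[:, :] = KeTrans[np.ix_([startPosRow, endPosRow], [startPosCol, endPosCol])]
```

Here Kn ends up holding KeTrans[1, 0], KeTrans[1, 2], KeTrans[3, 0] and KeTrans[3, 2], in that order.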
For multi-dimensional arrays, I highly suggest using NumPy.
import numpy as np
To create an N-dimensional array:
a = np.array([[4, 2, 4], [23, 4, 3], [2, 3, 4]])
The arrays are indexed very intuitively:
>>> a[0, 1]
2
You can even slice the NumPy array.
documentation of numpy multi-dimensional array: https://numpy.org/doc/stable/reference/arrays.ndarray.html
Is this what you are trying to do:
In [323]: X = np.arange(16).reshape(4,4)
In [324]: Y = np.zeros((2,2),int)
In [325]: Y[:] = X[:2,:2]
In [326]: Y
Out[326]:
array([[0, 1],
[4, 5]])
In [327]: Y[:] = X[1:3,2:]
In [328]: Y
Out[328]:
array([[ 6, 7],
[10, 11]])
For reference
In [329]: X
Out[329]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])

The best way to get the array index number for maximum element in the array

I wish to get the element index that is the highest in the array.
import numpy as np
a = np.array([1,2,3,4])
print(np.where(a==a.max()))
Current output:
(array([3]),)
Expected output:
3
Use argmax, which returns the indices of the maximum values along an axis:
np.argmax(a)
3
If you don't supply the axis, it returns the index into the flattened array:
a = np.array([[1, 2, 3, 4], [2, 3, 3, 9]])
np.argmax(a)
7
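When the row and column of that flattened index are needed, np.unravel_index converts it back. A short sketch using the same array; the variable names are chosen here for illustration:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [2, 3, 3, 9]])
flat_index = np.argmax(a)                          # index into the flattened array
row, col = np.unravel_index(flat_index, a.shape)   # back to (row, column)
```

For this array the flattened index 7 corresponds to row 1, column 3, which holds the value 9.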
You can use np.argmax(). It will return the index of the highest value in your array.
For more details on the function, see the documentation for np.argmax.
np.argmax() also works for 2D arrays:
a = np.array([[10, 11, 12],
              [13, 14, 15]])
np.argmax(a)
>>> 5
np.argmax(a, axis=0)
>>> array([1, 1, 1])
np.argmax(a, axis=1)
>>> array([2, 2])
Try this; it will return the index of the largest element in the array:
import numpy as np
a = np.array([1,2,3,4])
print(np.argmax(a))

Numpy indexing broadcasting introduces new dimension

I have an array I want to use for mapping. Let's call it my_map, type float, shape (m,c).
I have a second array with indexes, let's call it my_indexes, type int, shape (n,c); every value is between 0 and m.
When I try to index my_map with my_ans = my_map[my_indexes], I get an array of shape (n,c,c), when I was expecting (n,c). What would be the proper way to do it?
Just to be clear, what I am trying to do is something equivalent to:
my_ans = np.empty_like(touch_probability)
for i in range(c):
    my_ans[:,i] = my_map[:,i][my_indexes[:,i]]
To illustrate and test your problem, define simple, real arrays:
In [44]: arr = np.arange(12).reshape(3,4)
In [45]: idx = np.array([[0,2,1,0],[2,2,1,0]])
In [46]: arr.shape
Out[46]: (3, 4)
In [47]: idx.shape
Out[47]: (2, 4)
Your desired calculation:
In [48]: res = np.zeros((2,4), int)
In [49]: for i in range(4):
    ...:     res[:,i] = arr[:,i][idx[:,i]] # same as arr[idx[:,i], i]
    ...:
In [50]: res
Out[50]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
Doing the same with one indexing step:
In [51]: arr[idx, np.arange(4)]
Out[51]:
array([[0, 9, 6, 3],
[8, 9, 6, 3]])
This is broadcasting the two indexing arrays against each other, and then picking points:
In [52]: np.broadcast_arrays(idx, np.arange(4))
Out[52]:
[array([[0, 2, 1, 0],
[2, 2, 1, 0]]),
array([[0, 1, 2, 3],
[0, 1, 2, 3]])]
So we are indexing the (m,c) array with two (n,c) arrays.
The following are the same:
arr[idx]
arr[idx, :]
It is using idx to select whole rows from arr, so the result has the shape of idx plus the last dimension of arr, whereas what you want is just the ith element of the idx[j,i] row.
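The difference between the two indexing forms shows up directly in the result shapes. The sketch below reuses arr and idx from the session above; np.take_along_axis (NumPy 1.15 or later) is included as a possible alternative that was not part of the original answer.

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)                  # plays the role of my_map, shape (m, c)
idx = np.array([[0, 2, 1, 0], [2, 2, 1, 0]])       # plays the role of my_indexes, shape (n, c)

rows_selected = arr[idx]                           # whole rows: shape (n, c, c)
paired = arr[idx, np.arange(arr.shape[1])]         # one element per (row, column) pair: (n, c)
along_axis = np.take_along_axis(arr, idx, axis=0)  # same selection as `paired`
```

Only the paired form (or take_along_axis) gives the (n,c) result the question asks for.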

numpy argmax when values are equal

I got a numpy matrix and I want to get the index of the maximum value in each row. E.g.
[[1,2,3],[1,3,2],[3,2,1]]
will return
[2,1,0]
However, when there're more than 1 maximum value in each row, numpy.argmax will only return the smallest index. E.g.
[[0,0,0],[0,0,0],[0,0,0]]
will return
[0,0,0]
Can I change the default (smallest index) to some other values? E.g. when there're equal maximum values, return 1 or None, so that the above result will be
[1,1,1]
or
[None, None, None]
If I can do this in TensorFlow that'll be better.
Thanks!
You can use np.partition to find the two largest values and check if they are equal, and then use that as a mask in np.where to set the default value:
In [228]: a = np.array([[1, 2, 3, 2], [3, 1, 3, 2], [3, 5, 2, 1]])
In [229]: twomax = np.partition(a, -2)[:, -2:].T
In [230]: default = -1
In [231]: argmax = np.where(twomax[0] != twomax[1], np.argmax(a, -1), default)
In [232]: argmax
Out[232]: array([ 2, -1, 1])
A convenient value of "default" is -1, as argmax will not return that on its own. None does not fit in an integer array. A masked array is also an option, but I didn't go that far.
Here is a NumPy implementation:
def my_argmax(a):
    rows = np.where(a == a.max(axis=1)[:, None])[0]
    rows_multiple_max = rows[:-1][rows[:-1] == rows[1:]]
    my_argmax = a.argmax(axis=1)
    my_argmax[rows_multiple_max] = -1
    return my_argmax
Example of use:
import numpy as np
a = np.array([[0, 0, 0], [4, 5, 3], [3, 4, 4], [6, 2, 1]])
my_argmax(a) # array([-1, 1, -1, 0])
Explanation: where selects the indexes of all maximal elements in each row. If a row has multiple maxima, the row number will appear more than once in rows array. Since this array is already sorted, such repetition is detected by comparing consecutive elements. This identifies the rows with multiple maxima, after which they are masked in the output of NumPy's argmax method.
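The same idea can also be expressed by counting how often each row reaches its maximum. This is a sketch of a variant, not the answerer's code; the function name `argmax_or_default` and its `default` parameter are hypothetical.

```python
import numpy as np

def argmax_or_default(a, default=-1):
    """Row-wise argmax, with `default` for rows whose maximum occurs more than once."""
    row_max = a.max(axis=1, keepdims=True)
    n_maxima = (a == row_max).sum(axis=1)     # how many times each row hits its maximum
    return np.where(n_maxima > 1, default, a.argmax(axis=1))

a = np.array([[0, 0, 0], [4, 5, 3], [3, 4, 4], [6, 2, 1]])
result = argmax_or_default(a)
```

On the example array this gives the same answer as my_argmax above, with -1 marking the first and third rows.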

How to take elements along a given axis, given by their indices?

I have a 3D array and I need to "squeeze" it over the last axis, so that I get a 2D array. I need to do it in the following way. For each values of the indices for the first two dimensions I know the value of the index for the 3rd dimension from where the value should be taken.
For example, I know that if i1 == 2 and i2 == 7 then i3 == 11. It means that out[2,7] = inp[2,7,11]. This mapping from first two dimensions into the third one is given in another 2D array. In other words, I have an array in which on the position 2,7 I have 11 as a value.
So, my question is how to combine these two array (3D and 2D) to get the output array (2D).
In [635]: arr = np.arange(24).reshape(2,3,4)
In [636]: idx = np.array([[1,2,3],[0,1,2]])
In [637]: I,J = np.ogrid[:2,:3]
In [638]: arr[I,J,idx]
Out[638]:
array([[ 1, 6, 11],
[12, 17, 22]])
In [639]: arr
Out[639]:
array([[[ 0, 1, 2, 3], # 1
[ 4, 5, 6, 7], # 6
[ 8, 9, 10, 11]], # 11
[[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]]])
I,J broadcast together to select a (2,3) set of values, matching idx:
In [640]: I
Out[640]:
array([[0],
[1]])
In [641]: J
Out[641]: array([[0, 1, 2]])
This is a generalization to 3d of the easier 2d problem - selecting one item from each row:
In [649]: idx
Out[649]:
array([[1, 2, 3],
[0, 1, 2]])
In [650]: idx[np.arange(2), [0,1]]
Out[650]: array([1, 1])
In fact we could convert the 3d problem into a 2d one:
In [655]: arr.reshape(6,4)[np.arange(6), idx.ravel()]
Out[655]: array([ 1, 6, 11, 12, 17, 22])
Generalizing the original case:
In [55]: arr = np.arange(24).reshape(2,3,4)
In [56]: idx = np.array([[1,2,3],[0,1,2]])
In [57]: IJ = np.ogrid[[slice(i) for i in idx.shape]]
In [58]: IJ
Out[58]:
[array([[0],
[1]]), array([[0, 1, 2]])]
In [59]: (*IJ,idx)
Out[59]:
(array([[0],
[1]]), array([[0, 1, 2]]), array([[1, 2, 3],
[0, 1, 2]]))
In [60]: arr[_]
Out[60]:
array([[ 1, 6, 11],
[12, 17, 22]])
The key is in combining the IJ list of arrays with the idx to make a new indexing tuple. Constructing the tuple is a little messier if idx isn't the last index, but it's still possible. E.g.
In [61]: (*IJ[:-1],idx,IJ[-1])
Out[61]:
(array([[0],
[1]]), array([[1, 2, 3],
[0, 1, 2]]), array([[0, 1, 2]]))
In [62]: arr.transpose(0,2,1)[_]
Out[62]:
array([[ 1, 6, 11],
[12, 17, 22]])
Or, if it's easier, transpose arr so that the idx dimension is last. The key is that the index operation takes a tuple of index arrays, arrays which broadcast against each other to select specific items.
That's what ogrid is doing: creating the arrays that work with idx.
inp = np.random.random((20, 10, 5)) # simulate some input
i1, i2 = np.indices(inp.shape[:2])
# one i3 index per (i1, i2) pair, so its shape must be inp.shape[:2];
# or implement whatever mapping you want between (i1,i2) and i3
i3 = np.random.randint(0, 5, size=inp.shape[:2])
out = inp[(i1, i2, i3)]
See https://docs.scipy.org/doc/numpy/reference/arrays.indexing.html#integer-array-indexing for more details
Using numpy.einsum
This can be achieved by a combination of array indexing and usage of numpy.einsum:
>>> numpy.einsum('ijij->ij', inp[:, :, indices])
inp[:, :, indices] creates a four-dimensional array where, for each of the first two indices (the first two dimensions), all indices of the index array are applied to the third dimension. Because the index array is two-dimensional, this results in 4D. However, you only want those indices of the index array which correspond to the ones of the first two dimensions. This is achieved by using the string ijij->ij, which tells einsum to select only those elements where the indices of the 1st and 3rd axes are equal and the indices of the 2nd and 4th axes are equal. Because the last two dimensions (3rd and 4th) were added by the index array, this is the same as selecting only the index indices[i, j] for the third dimension of inp.
Note that this method can really blow up the memory consumption. Especially if inp.shape[:2] is much greater than inp.shape[2] then inp[:, :, indices].size will be approximately inp.size ** 2.
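The size of that intermediate array is easy to see on a small example. The sketch below uses made-up shapes and a fixed seed, and compares the einsum route with plain advanced indexing, which avoids the large intermediate:

```python
import numpy as np

rng = np.random.default_rng(0)
inp = rng.random((4, 3, 5))
indices = rng.integers(0, inp.shape[2], size=inp.shape[:2])   # shape (4, 3)

intermediate = inp[:, :, indices]                  # shape (4, 3, 4, 3)
via_einsum = np.einsum('ijij->ij', intermediate)   # keep entries where axes 0==2 and 1==3

i1, i2 = np.ogrid[:4, :3]
via_indexing = inp[i1, i2, indices]                # same result, no 4D intermediate
```

Here the intermediate holds 144 values to produce a 12-element result, and the gap grows quadratically with the size of the first two dimensions.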
Building the indices manually
First we prepare the new index array:
>>> idx = numpy.array(list(
... numpy.ndindex(*inp.shape[:2], 1) # Python 3 syntax
... ))
Then we update the column which corresponds to the third axis:
>>> idx[:, 2] = indices[idx[:, 0], idx[:, 1]]
Now we can select the elements and simply reshape the result:
>>> inp[tuple(idx.T)].reshape(*inp.shape[:2])
Using numpy.choose
Note: numpy.choose allows a maximum size of 32 for the axis which is chosen from.
According to this answer and the documentation of numpy.choose we can also use the following:
# First we need to bring the last axis to the front because
# `numpy.choose` chooses from the first axis.
>>> new_inp = numpy.moveaxis(inp, -1, 0)
# Now we can select the elements.
>>> numpy.choose(indices, new_inp)
Although the documentation discourages the use of a single array for the 2nd argument (the choices)
To reduce the chance of misinterpretation, even though the following “abuse” is nominally supported, choices should neither be, nor be thought of as, a single array, i.e., the outermost sequence-like container should be either a list or a tuple.
this seems to be the case only for preventing misunderstandings:
choices : sequence of arrays
Choice arrays. a and all of the choices must be broadcastable to the same shape. If choices is itself an array (not recommended), then its outermost dimension (i.e., the one corresponding to choices.shape[0]) is taken as defining the “sequence”.
So from my point of view there's nothing wrong with using numpy.choose that way, as long as one is aware of what they're doing.
I believe this should do it:
for i in range(n):
    for j in range(m):
        k = index_mapper[i][j]
        value = input_3d[i][j][k]
        out_2d[i][j] = value
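The loop is a useful reference for checking a vectorized version. A runnable sketch follows, assuming input_3d has shape (n, m, k) and index_mapper has shape (n, m); the array sizes and seed are made up, and np.take_along_axis (NumPy 1.15 or later) is one possible vectorized equivalent among those discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)
input_3d = rng.random((3, 4, 5))
index_mapper = rng.integers(0, 5, size=(3, 4))

# Loop version, following the answer above
n, m = index_mapper.shape
out_2d = np.empty((n, m))
for i in range(n):
    for j in range(m):
        out_2d[i, j] = input_3d[i, j, index_mapper[i, j]]

# Vectorized version: add a trailing axis to the indices, gather, then drop it again
out_vec = np.take_along_axis(input_3d, index_mapper[..., None], axis=2)[..., 0]
```

Both arrays should hold identical values; the loop is easier to read, while the vectorized form is usually much faster on large inputs.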
