ndarray row-wise index of values greater than array


I have one array of shape (X, 5):
M = [[1,2,3,4,5],
[6,7,8,9,1],
[2,5,7,8,3]
...]
and one array of shape (X, 1):
n = [[3],
[7],
[100],
...]
Now I need to get the first index of M >= n for each row, or nan if there is no such index.
For example:
np.where(np.array([1,2,3,4,5]) >= 3)[0][0] # Returns 2
np.searchsorted(np.array([1,2,3,4,5]), 3) # Returns 2
These examples are applied to each row individually (I could loop X times as both arrays have the length X).
I wonder, is there a way to do it in a multidimensional way to get an output of all indices at once?
Something like:
np.where(M>=n)
Thank you
Edit: Values in M are unsorted, I'm still looking for the first index/occurrence fitting M >= n (so probably not searchsorted)

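You could start by checking which elements of M are greater than or equal to n, then use argmax to get the index of the first True in each row. For rows where every column is False, argmax returns 0, so check whether the value at that position is actually True and use np.where to set those rows to np.nan:
import numpy as np
M = np.array([[1,2,3,4,5],
              [6,7,8,9,1],
              [2,5,7,8,3]])
n = np.array([[3],[7],[100]])
# n has shape (X, 1), so it broadcasts against each row of M
le = n<=M
# array([[False, False,  True,  True,  True],
#        [False,  True,  True,  True, False],
#        [False, False, False, False, False]])
# index of the first True per row (0 when the row has no True)
lea = le.argmax(1)
# True only for rows where argmax actually points at a True
has_any = le[np.arange(len(le)), lea]
np.where(has_any, lea, np.nan)
# array([ 2.,  1., nan])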
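The steps above can be wrapped in a small helper. This is only a sketch: the function name `first_index_ge` and the `missing` sentinel are assumptions made for illustration, not part of the answer. Returning an integer sentinel such as -1 instead of np.nan keeps the result an integer array, which may be more convenient if the indices are used for further indexing.

```python
import numpy as np

def first_index_ge(M, n, missing=-1):
    """Column index of the first M[i, j] >= n[i] per row, or `missing` (hypothetical helper)."""
    M = np.asarray(M)
    n = np.asarray(n).reshape(-1, 1)   # force a column so it broadcasts per row
    hit = M >= n                       # boolean mask with the same shape as M
    first = hit.argmax(axis=1)         # first True per row (0 if the row has none)
    return np.where(hit.any(axis=1), first, missing)

M = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 1],
              [2, 5, 7, 8, 3]])
n = np.array([[3], [7], [100]])
result = first_index_ge(M, n)
print(result)
```

Here `hit.any(axis=1)` plays the same role as `has_any` in the answer; both detect rows that contain no True at all.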

Related

How to compare one 2d array with 1d array to check for elements?

So I have a matrix of 3xn. Something like
A=[[1,2,3],
[6,2,5],
[8,1,7],
[2,9,8],
[1,9,3],
[1,4,3]]
and another list of B= [1,2,5,6,8,9]
So, if every element of A[i] is in list B, then I have to delete that row. E.g., rows 2 and 4 need to be removed.
I wrote something like.
copy=[]
for i in A:
    for j in B:
        if int(j) in i:
            for k in B[B.index(j):]:
                if int(k) in i:
                    for l in B[B.index(k):]:
                        if int(l) in i:
                            copy.append(i)
This keeps returning recurring values. It also removes more than what I have already written. What am I doing wrong?
I also tried
for i in A:
    copy=[x for x in i if x not in B]
    copy=np.array(copy)
    final.append(copy)
But it doesn't work.
Note: I am using a numpy array for A and a list for B, so I always need to convert between them when comparing.

This is straightforward with NumPy arrays: use isin to identify the values present in B, aggregate each row into a single boolean with all to find rows where every value is present, invert the mask with ~, and index A with it:
import numpy as np
A = np.array([[1,2,3],
              [6,2,5],
              [8,1,7],
              [2,9,8],
              [1,9,3],
              [1,4,3]])
B = np.array([1,2,5,6,8,9])
# or as list
B = [1,2,5,6,8,9]
# keep the rows that are NOT made up entirely of values from B
A2 = A[~np.isin(A, B).all(1)]
output:
array([[1, 2, 3],
[8, 1, 7],
[1, 9, 3],
[1, 4, 3]])
intermediates:
np.isin(A, B)
array([[ True, True, False],
[ True, True, True],
[ True, True, False],
[ True, True, True],
[ True, True, False],
[ True, False, False]])
np.isin(A, B).all(1)
array([False, True, False, True, False, False])
~np.isin(A, B).all(1)
array([ True, False, True, False, True, True])
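An equivalent way to build the same mask, shown here as a sketch, is to ask the question the other way round: keep a row if at least one of its elements is missing from B. The variable names `not_in_B` and `keep` are chosen for illustration only.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [6, 2, 5],
              [8, 1, 7],
              [2, 9, 8],
              [1, 9, 3],
              [1, 4, 3]])
B = [1, 2, 5, 6, 8, 9]

# True where an element of A is NOT present in B
not_in_B = np.isin(A, B, invert=True)
# keep rows that have at least one element missing from B
keep = not_in_B.any(axis=1)
A2 = A[keep]
print(A2)
```

By De Morgan's law, "not all elements in B" and "any element not in B" select the same rows, so this matches `A[~np.isin(A, B).all(1)]`.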

Loop through each sublist in list "A". For each item in that sublist, check whether it is present in list "B". If it is, increment count by 1.
When count equals the length of the sublist, remove that sublist with the .remove() method, which takes the value to remove (here, the sublist itself). The loop iterates over a copy of "A" (A[:]) because removing items from a list while iterating over it skips elements. This assumes "A" is a plain list of lists; a numpy array has no .remove() method.
for lst in A[:]:  # iterate over a copy so removing from A is safe
    count=0
    for subList_itm in lst:
        if subList_itm in B:
            count = count+1
    if count== len(lst):
        A.remove(lst)
print(A)

Concept
Iterate through all of the elements inside A, and check if each array is a subset of B. If it is not then put the array into a result array.
Code
A=[[1,2,3],[6,2,5],[8,1,7],[2,9,8],[1,9,3],[1,4,3]]
B=[1,2,5,6,8,9]
set_B=set(B)
result=[]
for arr in A:
    if not set(arr).issubset(set_B):
        result.append(arr)
print(result)
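Since the question mentions that A is actually a numpy array, the same subset test can be adapted as follows. This is a sketch under that assumption; converting with tolist() gives plain Python integers, and the resulting list of booleans is turned into a mask for indexing.

```python
import numpy as np

A = np.array([[1, 2, 3], [6, 2, 5], [8, 1, 7], [2, 9, 8], [1, 9, 3], [1, 4, 3]])
B = [1, 2, 5, 6, 8, 9]
set_B = set(B)

# one boolean per row: True when the row is NOT a subset of B
keep = np.array([not set(row).issubset(set_B) for row in A.tolist()])
result = A[keep]
print(result)
```

This still loops over rows in Python, so for large arrays a fully vectorized test such as np.isin is likely to be faster.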

Is there a 2-D "where" in numpy?

This might seem an odd question, but it boils down to quite a simple operation that I can't find a numpy equivalent for. I've looked at np.where as well as many other operations but can't find anything that does this:
a = np.array([1,2,3])
b = np.array([1,2,3,4])
c = np.array([i<b for i in a])
The output is a 2-D array (3,4), of booleans comparing each value.

If you're asking how to get c without a loop, try this:
# make "a" a column vector
# > broadcasts to produce a len(a) x len(b) array
c = b > a[:, None]
c
array([[False, True, True, True],
[False, False, True, True],
[False, False, False, True]])
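To check that the broadcast comparison really matches the loop from the question, the two can be compared directly. The sketch below also shows np.less.outer, the ufunc's outer method, as another way to get the same pairwise result; the variable names are chosen for illustration.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])

c_loop = np.array([i < b for i in a])   # the loop from the question
c_broadcast = a[:, None] < b            # (3, 1) against (4,) broadcasts to (3, 4)
c_outer = np.less.outer(a, b)           # pairwise a[i] < b[j] via the ufunc's outer method

print(c_broadcast.shape)
print(np.array_equal(c_loop, c_broadcast), np.array_equal(c_loop, c_outer))
```

Writing `a[:, None] < b` is the same comparison as `b > a[:, None]`, only with the operands swapped.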

You can extend the approach in the other answer to get the values of a and b. Given a mask of
c = b > a[:, None]
You can extract the indices for each dimension using np.where or np.nonzero:
row, col = np.nonzero(c)
And use the indices to get the corresponding values:
ag = a[row]
bg = b[col]
Elements of a and b may be repeated in the result.
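Put together with the arrays from the question, the extraction might look like the sketch below. The `pairs` array is an addition for illustration; it simply places the matching values side by side.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([1, 2, 3, 4])

c = b > a[:, None]
row, col = np.nonzero(c)      # coordinates of every True entry, in row-major order

ag = a[row]                   # value of a for each True entry
bg = b[col]                   # value of b for each True entry
pairs = np.column_stack((ag, bg))
print(pairs)
```

Each row of `pairs` is one (a, b) combination for which a < b holds, which is why values such as 1 from a show up several times.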

Unsorted segment argmax workaround tensorflow

I am trying to create a tf_boolean_mask that filters duplicate indices from a tensor, by the value of the indices. If the value is greater than that of a duplicate it should be kept, the others discarded. If index and value are identical, only one should remain:
[Pseudocode]
for index in indices:
    if index is unique:
        keep index = True
    else:
        if val[index] > val[index of all other duplicate indices]:
            keep index = True
        elif val[index] < val[index of any other duplicate indices]:
            keep index = False
        elif val[index] == val[index of any other duplicate indices]:
            keep only a single one of the equal indices (doesn't matter which)
A short example for the problem is the following:
import tensorflow as tf
tf.enable_eager_execution()
index = tf.convert_to_tensor([ 10, 5, 20, 20, 30, 30])
value = tf.convert_to_tensor([ 1., 0., 2., 0., 0., 0.])
# bool_mask = [True, True, True, False, True, False]
# or [True, True, True, False, False, True]
# the index 3 is filtered because index 2 has a greater value (2 comp. to 0)
# The index 4 and 5 are identical in their respective values, that's why both
# of them can be kept, but at maximum one of them.
...
bool_mask = ?
My current approach successfully removes duplicates with different values but fails for duplicates with identical values. Unfortunately, this edge case does appear in my data:
import tensorflow as tf
y, idx = tf.unique(index)
num_segments = tf.shape(y)[0]
maximum_vals = tf.unsorted_segment_max(value, idx, num_segments)
fused_filt = tf.stack([tf.cast(y, tf.float32), maximum_vals],axis=1)
fused_orig = tf.stack([tf.cast(index, tf.float32), value], axis=1)
fused_orig_tiled = tf.tile(fused_orig, [1, tf.shape(fused_filt)[0]])
fused_orig_res = tf.reshape(fused_orig_tiled, [-1, tf.shape(fused_filt)[0], 2])
comp_1 = tf.equal(fused_orig_res, fused_filt)
comp_2 = tf.reduce_all(comp_1, -1)
comp_3 = tf.reduce_any(comp_2, -1)
# comp_3 = [True, True, True, False, True, True]
A pure tensorflow solution would be nice, since a For loop over the indices could be implemented rather simply. Thank you.
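No answer to this question is included here. The following is a NumPy sketch of one way to express the intended mask, not a TensorFlow solution: it sorts by index and then by descending value, and keeps the first entry of each run of equal indices. The function name `keep_max_per_index` is hypothetical, and porting the idea to TensorFlow ops is left open.

```python
import numpy as np

def keep_max_per_index(index, value):
    """Boolean mask keeping one maximal-value entry per distinct index (hypothetical helper)."""
    index = np.asarray(index)
    value = np.asarray(value)
    # lexsort uses the LAST key as primary: sort by index, then by descending value
    order = np.lexsort((-value, index))
    sorted_index = index[order]
    # the first entry of each run of equal indices carries that group's maximum
    is_first = np.ones(len(index), dtype=bool)
    is_first[1:] = sorted_index[1:] != sorted_index[:-1]
    mask = np.zeros(len(index), dtype=bool)
    mask[order[is_first]] = True
    return mask

index = np.array([10, 5, 20, 20, 30, 30])
value = np.array([1., 0., 2., 0., 0., 0.])
bool_mask = keep_max_per_index(index, value)
print(bool_mask)
```

Because exactly one position per run is marked, ties between equal values are resolved by keeping a single entry, which is the edge case the question describes.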

how to add two numpy.array with indices or mask

I have two arrays, as follows.
a=numpy.array((1,2,3,4))
b=numpy.array((1,2,3,4))
I just want to add the first two elements. How can I do it with a mask [True, True, False, False], or with indices [0,1]?
After the addition, b = (2, 4, 3, 4)

It's simply:
b[:2] += a[:2]
Or:
mask = numpy.array((True, True, False, False))
b[mask] += a[mask]
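The question also asks about a list of indices. The sketch below shows that variant, plus one caveat worth knowing: with repeated indices, `+=` through fancy indexing applies each position only once, while np.add.at accumulates. The arrays `c` and `d` are made up for the illustration.

```python
import numpy as np

a = np.array((1, 2, 3, 4))
b = np.array((1, 2, 3, 4))

idx = [0, 1]
b[idx] += a[idx]              # b is now [2, 4, 3, 4]
print(b)

# Caveat with repeated indices (illustrative arrays, not from the question)
c = np.zeros(3, dtype=int)
c[[0, 0, 1]] += 1             # position 0 is incremented only once
d = np.zeros(3, dtype=int)
np.add.at(d, [0, 0, 1], 1)    # unbuffered: position 0 is incremented twice
print(c, d)
```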

numpy argmin elegant solution required.

In Python, to find the index of the minimum value of an array I use y = numpy.argmin(someMat)
Can I find the minimum value of this matrix such that it does not lie within a specified range, in a neat way?

"Can i find the minimum value of this matrix such that it does not lie within a specified range in a neat way?"
If you only care about the minimum value satisfying some condition and not the location, then
>>> numpy.random.seed(1)
>>> m = numpy.random.randn(5, 5)
>>> m
array([[ 1.62434536, -0.61175641, -0.52817175, -1.07296862, 0.86540763],
[-2.3015387 , 1.74481176, -0.7612069 , 0.3190391 , -0.24937038],
[ 1.46210794, -2.06014071, -0.3224172 , -0.38405435, 1.13376944],
[-1.09989127, -0.17242821, -0.87785842, 0.04221375, 0.58281521],
[-1.10061918, 1.14472371, 0.90159072, 0.50249434, 0.90085595]])
>>> m[~ ((m < 0.5) | (m > 0.8))].min()
0.50249433890186823
If you do want the location via argmin, then that's a bit trickier, but one way is to use masked arrays:
>>> numpy.ma.array(m,mask=((m<0.5) | (m > 0.8))).argmin()
23
>>> m.flat[23]
0.50249433890186823
Note that the condition here is flipped, as the mask is True for the excluded values, not the included ones.
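The flat position returned by the masked argmin can be turned into a (row, column) pair with numpy.unravel_index. The sketch below uses a small hand-written matrix instead of the random one so the result is easy to follow; the values are made up for illustration.

```python
import numpy as np

# small illustrative matrix (not the random one from the answer)
m = np.array([[0.90, 0.20, 0.60],
              [0.55, 1.50, 0.70]])

# the mask is True for EXCLUDED values: everything outside [0.5, 0.8]
excluded = (m < 0.5) | (m > 0.8)
flat = int(np.ma.array(m, mask=excluded).argmin())   # position in the flattened array
pos = np.unravel_index(flat, m.shape)                # convert to (row, col)

print(flat, pos, m[pos])
```

Here the candidates inside the range are 0.60, 0.55 and 0.70, so the smallest allowed value is 0.55 in row 1, column 0.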
Update: it appears that by "within a specified range" you don't mean the minimum value isn't within some bounds, but that you want to exclude portions of the matrix from the search based on the x,y coordinates. Here's one way (same matrix as before):
>>> xx, yy = numpy.indices(m.shape)
>>> points = ((xx == 0) & (yy == 0)) | ((xx > 2) & (yy < 3))
>>> points
array([[ True, False, False, False, False],
[False, False, False, False, False],
[False, False, False, False, False],
[ True, True, True, False, False],
[ True, True, True, False, False]], dtype=bool)
>>> m[points]
array([ 1.62434536, -1.09989127, -0.17242821, -0.87785842, -1.10061918,
1.14472371, 0.90159072])
>>> m[points].min()
-1.1006191772129212
with the corresponding masked array variant if you need the locations. [Edited to use indices instead of mgrid; I'd actually forgotten about it until it was used in another answer today!]
If I'm still wrong :^) and this also isn't what you're after, please edit your question to include a 3x3 example of your desired input and output.
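For the coordinate-based version, the masked array variant mentioned above might look like the following sketch. The matrix and the region are invented for illustration; the only point is that the mask passed to numpy.ma.array is the complement of the region to search.

```python
import numpy as np

# illustrative matrix; the global minimum (0.) lies outside the search region
m = np.array([[5., 1., 7.],
              [0., 3., 2.],
              [4., 6., 8.]])

xx, yy = np.indices(m.shape)
points = (xx >= 1) & (yy >= 1)           # region to search: rows 1-2, columns 1-2

masked = np.ma.array(m, mask=~points)    # mask is True for everything OUTSIDE the region
flat = int(masked.argmin())
pos = np.unravel_index(flat, m.shape)

print(pos, m[pos])
```

The result agrees with `m[points].min()`, but also tells you where in the original matrix that minimum sits.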

I'm guessing this is what you are trying to achieve:
Argmin with arrays:
>>> from numpy import *
>>> a = array( [2,3,4] )
>>> argmin(a)
0
>>> print(a[argmin(a)])
2
Argmin with matrices:
>>> b=array( [[6,5,4],[3,2,1]] )
>>> argmin(b)
5
>>> print(b[argmin(b)])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: index 5 is out of bounds for axis 0 with size 2
The same indexing approach doesn't work for matrices. The reason is that argmin (as well as argmax) returns an index into the flattened array, so for an n-dimensional matrix you need to index a 1-dimensional view of it.
To do this, call ravel:
>>> print(b)
[[6 5 4]
 [3 2 1]]
>>> ravel(b)
array([6, 5, 4, 3, 2, 1])
When you combine ravel with argmin, you must write:
>>> print(ravel(b)[argmin(b)])
1
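If the row and column of the minimum are needed rather than just its value, numpy.unravel_index converts the flat index back into coordinates. This sketch uses explicit `np.` names instead of `from numpy import *`, which is a stylistic choice made here and not part of the answer.

```python
import numpy as np

b = np.array([[6, 5, 4],
              [3, 2, 1]])

flat = int(np.argmin(b))                 # index into the flattened array
pos = np.unravel_index(flat, b.shape)    # corresponding (row, col)

print(flat, pos)
print(b[pos], b.flat[flat])              # both give the minimum value
```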
