For example I have this np.array:
[[True, True, False, False]
[True, False, True, False]
[False, True, False, True]
[False, False, True, True]]
I want to get the first index that is True in each row, but counting from the back of the row. So the expected output is a (4,) array with one entry per row:
[1, # First index that is True is at index 1
2, # First index that is True is at index 2
3, # First index that is True is at index 3
3] # First index that is True is at index 3
a = np.array(
[[True, True, False, False],
[True, False, True, False],
[False, True, False, True],
[False, False, True, True]]
)
idx = a.shape[1] - 1 - np.argmax(a[:,::-1], axis=1)
np.argmax() returns the index of the maximum value (per row with axis=1); when there are ties it returns the first occurrence. Since True is equivalent to 1 and False to 0, it gives the index of the first True value. Reverse each row so that this first hit is the "last" True, then subtract the result from the last column index (a.shape[1] - 1) to account for the fact that you're counting backwards.
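One caveat with the argmax trick: a row with no True at all also gives argmax 0, so it would be reported as the last column. A minimal sketch of one way to handle that, assuming you want a sentinel value for such rows (the helper name last_true_index and the fill=-1 default are my own choices, not part of the answer above):

```python
import numpy as np

def last_true_index(a, fill=-1):
    """Index of the last True in each row; `fill` for rows without any True."""
    a = np.asarray(a, dtype=bool)
    # argmax on the reversed rows finds the first True counted from the back
    idx = a.shape[1] - 1 - np.argmax(a[:, ::-1], axis=1)
    # rows with no True would otherwise be reported as the last column
    return np.where(a.any(axis=1), idx, fill)

a = np.array([[True, True, False, False],
              [True, False, True, False],
              [False, True, False, True],
              [False, False, True, True],
              [False, False, False, False]])  # extra all-False row for illustration
result = last_true_index(a)
print(result)  # [ 1  2  3  3 -1]
```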
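In plain Python you can reverse a row and find an element with list methods: row.reverse() and row.index(True). In NumPy you can use numpy.flip(row) to reverse a row and numpy.where(row == True) to find the True elements in it. The position found in the reversed row counts from the end, so convert it back to a regular index with len(row) - 1 - position.
import numpy as np
x = np.array([[True, True, False, False],
[True, False, True, False],
[False, True, False, True],
[False, False, True, True]])
result = []
for row in x:
    # Reverse the row so the last True comes first
    flipped = np.flip(row)
    index = np.where(flipped == True)
    # Convert the position in the reversed row back to an index in the original row
    result.append(len(row) - 1 - int(index[0][0]))
print(result)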
Take the following example. I have an array test and want to get a boolean mask with True's for all elements that are equal to elements of ref.
import numpy as np
test = np.array([[2, 3, 1, 0], [5, 4, 2, 3], [6, 7, 5 ,4]])
ref = np.array([3, 4, 5])
I am looking for something equivalent to
mask = (test == ref[0]) | (test == ref[1]) | (test == ref[2])
which in this case should yield
>>> print(mask)
[[False, True, False, False],
[ True, True, False, True],
[False, False, True, True]]
but without having to resort to any loops.
NumPy comes with a function, isin, that does exactly this:
np.isin(test, ref)
which returns
array([[False, True, False, False],
[ True, True, False, True],
[False, False, True, True]])
You can use numpy broadcasting:
mask = (test[:,None] == ref[:,None]).any(1)
output:
array([[False, True, False, False],
[ True, True, False, True],
[False, False, True, True]])
NB. This can be faster than numpy.isin, but it creates an intermediate array of shape (X, R, Y), where (X, Y) is the shape of test and R is the length of ref, so it will consume a lot of memory on very large arrays.
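To make the memory note concrete, here is a small sketch that inspects the intermediate array and checks that the broadcast mask agrees with np.isin. The shorter ref_short array is a hypothetical addition, used only to show that the middle dimension follows the length of the reference array rather than the shape of test.

```python
import numpy as np

test = np.array([[2, 3, 1, 0], [5, 4, 2, 3], [6, 7, 5, 4]])
ref = np.array([3, 4, 5])
ref_short = np.array([3, 4])  # hypothetical second reference array

# One boolean "layer" per element of ref: shape (rows, len(ref), cols)
intermediate = test[:, None] == ref[:, None]
print(intermediate.shape)  # (3, 3, 4)

intermediate_short = test[:, None] == ref_short[:, None]
print(intermediate_short.shape)  # (3, 2, 4)

# Collapsing the middle axis gives the same mask as np.isin
mask_broadcast = intermediate.any(1)
mask_isin = np.isin(test, ref)
print(np.array_equal(mask_broadcast, mask_isin))  # True
```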
I am new to Python, and I need to count runs of consecutive True values in a nested list or a two-dimensional NumPy array filled with Booleans. Say I have a nested list like listX = [[True, False, True, True, True], [False, True, True, False, True], [False, True, False, False, False], [True, True, False, False, True]]. I want the lengths of the runs of consecutive True values in each list, i.e. for listX[0] I would want the answer to be [1, 3]. (In reality, the nested list can hold anywhere from 10 to 25 lists, and each list contains 100 Boolean values.)
Based on the itertools approach from an answer to a previous question about one-dimensional arrays, "Count consecutive occurences of values varying in length in a numpy array", I can answer my simple example like this:
listX = [[True, False, True, True, True], [False, True, True, False, True], [False, True, False, False, False], [True, True, False, False, True]]
import numpy as np
arr = np.array(listX)
arr
>>> array([[ True, False, True, True, True],
[False, True, True, False, True],
[False, True, False, False, False],
[ True, True, False, False, True]])
import itertools
c1 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[0]) if key]
c2 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[1]) if key]
c3 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[2]) if key]
c4 = [sum(1 for _ in group) for key, group in itertools.groupby(arr[3]) if key]
c1, c2, c3, c4
>>> ([1, 3], [2, 1], [1], [2, 1])
Since this example has only 4 rows, I can index each row of the 2D array by hand, but in reality I can have anywhere from 10 to 25 rows, each containing 100 Boolean values. Is there a simpler way than this?
Convert your code applied to each row to the following lambda function:
myCount = lambda ar: [sum(1 for _ in group) for key, group in itertools.groupby(ar) if key]
Then assemble results for each row the following way:
res = []
for i in range(arr.shape[0]):
    res.append(myCount(arr[i]))
To test other cases as well, I extended your sample data with a row full of
False values and another row full of True values:
array([[ True, False, True, True, True],
[False, True, True, False, True],
[False, True, False, False, False],
[ True, True, False, False, True],
[False, False, False, False, False],
[ True, True, True, True, True]])
The result for the above array is:
[[1, 3], [2, 1], [1], [2, 1], [], [5]]
I think this result should be left as a plain Python nested list.
The reason is that NumPy does not support "jagged" arrays (with
rows of different lengths).
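As a sketch of the same idea without itertools, the run lengths of one row can also be computed with NumPy alone by looking at where the row switches between False and True. The helper name true_run_lengths is hypothetical, and this is an alternative technique rather than the approach used in the answer above. The result is still collected into a nested list.

```python
import numpy as np

def true_run_lengths(row):
    """Lengths of the runs of consecutive True values in a 1D boolean sequence."""
    # Pad with False on both sides so every run has a clear start and end
    padded = np.concatenate(([False], np.asarray(row, dtype=bool), [False]))
    changes = np.diff(padded.astype(np.int8))
    starts = np.flatnonzero(changes == 1)   # False -> True transitions
    ends = np.flatnonzero(changes == -1)    # True -> False transitions
    return (ends - starts).tolist()

listX = [[True, False, True, True, True],
         [False, True, True, False, True],
         [False, True, False, False, False],
         [True, True, False, False, True]]
arr = np.array(listX)

# One list of run lengths per row, whatever the number of rows
res = [true_run_lengths(row) for row in arr]
print(res)  # [[1, 3], [2, 1], [1], [2, 1]]
```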
I have a 1D (NumPy) array with boolean values. For example:
x = [True, True, False, False, False, True, False, True, True, True, False, True, True, False]
The array contains 8 True values. I would like to keep, for example, exactly 3 (must be less than 8 in this case) as True values randomly from the 8 that exist. In other words I would like to randomly set 5 of those 8 True values as False.
A possible result can be:
x = [True, True, False, False, False, False, False, False, False, False, False, False, True, False]
How to implement it?
One approach would be -
# Get the indices of True values
idx = np.flatnonzero(x)
# Pick len(idx) - 3 distinct indices among the True positions and
# set those in x to False, leaving exactly 3 True values
x[np.random.choice(idx, len(idx)-3, replace=False)] = False
Sample run -
# Input array
In [79]: x
Out[79]:
array([ True, True, False, False, False, True, False, True, True,
True, False, True, True, False], dtype=bool)
# Get indices
In [80]: idx = np.flatnonzero(x)
# Set (number of True indices - 3) randomly chosen True values to False
In [81]: x[np.random.choice(idx, len(idx)-3, replace=0)] = 0
# Verify output to have exactly three True values
In [82]: x
Out[82]:
array([ True, False, False, False, False, False, False, True, False,
False, False, True, False, False], dtype=bool)
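The same approach can be wrapped in a function that leaves the input untouched. This is only a sketch: the name keep_k_true, the use of np.random.default_rng, and the ValueError for an oversized k are my assumptions, not part of the answer above.

```python
import numpy as np

def keep_k_true(x, k, rng=None):
    """Return a copy of boolean array x with exactly k of its True values kept."""
    rng = np.random.default_rng() if rng is None else rng
    x = np.asarray(x, dtype=bool)
    idx = np.flatnonzero(x)  # positions of the True values
    if k > idx.size:
        raise ValueError("k exceeds the number of True values")
    out = x.copy()
    # Choose which True positions to switch off, without repetition
    drop = rng.choice(idx, size=idx.size - k, replace=False)
    out[drop] = False
    return out

x = np.array([True, True, False, False, False, True, False,
              True, True, True, False, True, True, False])
out = keep_k_true(x, 3, rng=np.random.default_rng(0))
print(out.sum())  # 3
```

Every True in the output is at a position that was already True in x; only the choice of which three survive is random.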
Build an array with the number of desired True and False, then just shuffle it
import random
def buildRandomArray(size, numberOfTrues):
    res = [False]*(size-numberOfTrues) + [True]*numberOfTrues
    random.shuffle(res)
    return res
As the title says, I want np.where() to return a coordinate multiple times if it comes across the same value. Example:
import numpy as np
a = 2*np.arange(5)
b = [8,8]
condition = np.isin(a,b)
print(np.where(condition))
>>> (array([4], dtype=int64),)
It returns [4] because a[4] = 8, but since b has two 8s, I want it to return [4, 4]. Is there a way to do this without iterating through each value of b?
With your a,b:
In [687]: condition=isin(a,b)
In [688]: condition
Out[688]: array([False, False, False, False, True], dtype=bool)
where just tells us the index of that one True value.
Switch the test, and you find that both items of b are in a.
In [697]: isin(b,a)
Out[697]: array([ True, True], dtype=bool)
You could use a simple broadcasted comparison:
In [700]: a[:,None]==b
Out[700]:
array([[False, False],
[False, False],
[False, False],
[False, False],
[ True, True]], dtype=bool)
In [701]: np.where(a[:,None]==b)
Out[701]: (array([4, 4], dtype=int32), array([0, 1], dtype=int32))
isin (and in1d, which it uses) worries about uniqueness, but you don't. So testing the arrays with == gives you more control.
Test if both values in b match the same a element:
In [703]: (a[:,None]==b).all(axis=1)
Out[703]: array([False, False, False, False, True], dtype=bool)
Test if any value matches, which is essentially what in1d does:
In [704]: (a[:,None]==b).any(axis=1)
Out[704]: array([False, False, False, False, True], dtype=bool)
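For reference, the broadcasted comparison from the session above as a small standalone script (Python 3 syntax). The variable names matches, a_idx and b_idx are mine.

```python
import numpy as np

a = 2 * np.arange(5)   # [0 2 4 6 8]
b = np.array([8, 8])

# Compare every element of a with every element of b: shape (5, 2)
matches = a[:, None] == b

# Row indices point into a, column indices point into b
a_idx, b_idx = np.nonzero(matches)
print(a_idx)  # [4 4]
print(b_idx)  # [0 1]
```

a_idx repeats index 4 once per matching element of b, which is the [4, 4] result asked for in the question.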
Starting from an array:
a = np.array([1,1,1,2,3,4,5,5])
and a filter:
m = np.array([1,5])
I am now building a mask with:
b = np.in1d(a,m)
that correctly returns:
array([ True, True, True, False, False, False, True, True], dtype=bool)
I need to limit the number of True values per unique value to a maximum of 2, so that 1 is masked only two times instead of three. The resulting mask would then look like one of the following (it does not matter which of the matching elements stay True):
array([ True, True, False, False, False, False, True, True], dtype=bool)
or
array([ True, False, True, False, False, False, True, True], dtype=bool)
or
array([ False, True, True, False, False, False, True, True], dtype=bool)
Ideally this is a kind of "random" masking with a limit on how often each value is selected. So far I have tried randomly selecting the original unique elements in the array, but the mask then selects all the True values regardless of their frequency.
For a generic case with unsorted input array, here's one approach based on np.searchsorted -
N = 2 # Parameter to decide how many duplicates are allowed
sortidx = a.argsort()
idx = np.searchsorted(a,m,sorter=sortidx)[:,None] + np.arange(N)
lim_counts = (a[:,None] == m).sum(0).clip(max=N)
idx_clipped = idx[lim_counts[:,None] > np.arange(N)]
out = np.in1d(np.arange(a.size),idx_clipped)[sortidx.argsort()]
Sample run -
In [37]: a
Out[37]: array([5, 1, 4, 2, 1, 3, 5, 1])
In [38]: m
Out[38]: [1, 2, 5]
In [39]: N
Out[39]: 2
In [40]: out
Out[40]: array([ True, True, False, True, True, False, True, False], dtype=bool)
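A different way to get the same kind of mask, sketched here under my own assumptions, is to rank each element within its group of equal values and keep only ranks below N. This is not the searchsorted method from the answer above; the function name limited_mask and the optional rng argument (which randomizes which occurrences survive) are hypothetical.

```python
import numpy as np

def limited_mask(a, m, n, rng=None):
    """Mask of elements of a that are in m, with at most n True per distinct value."""
    a = np.asarray(a)
    mask = np.isin(a, m)
    if a.size == 0:
        return mask
    # Optional shuffle so that the kept occurrences are chosen at random
    order = np.arange(a.size) if rng is None else rng.permutation(a.size)
    # Stable sort groups equal values while preserving the (shuffled) order
    sorted_pos = order[np.argsort(a[order], kind="stable")]
    sorted_vals = a[sorted_pos]
    # Rank of each element inside its run of equal values
    is_start = np.r_[True, sorted_vals[1:] != sorted_vals[:-1]]
    positions = np.arange(a.size)
    run_start = np.maximum.accumulate(np.where(is_start, positions, 0))
    rank = positions - run_start
    keep = np.zeros(a.size, dtype=bool)
    keep[sorted_pos] = rank < n
    return mask & keep

a = np.array([1, 1, 1, 2, 3, 4, 5, 5])
m = np.array([1, 5])
out = limited_mask(a, m, 2)
print(out)  # [ True  True False False False False  True  True]
```

Passing rng=np.random.default_rng() keeps two randomly chosen occurrences of each value instead of the first two.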