I'd like to do something like this:
>>> y = np.arange(5)
>>> y in (0, 1, 2)
array([True, True, True, False, False])
This syntax doesn't work. What's the best way to achieve the desired result?
(I'm looking for a general solution. Obviously in this specific case I could do y < 3.)
I'll spell this out a little more clearly for you guys, since at least a few people seem to be confused.
Here is a long way of getting my desired behavior:
new_y = np.empty_like(y, dtype=bool)  # bool dtype so the result holds True/False, not 1/0
for i in range(len(y)):
    if y[i] in (0, 1, 2):
        new_y[i] = True
    else:
        new_y[i] = False
I'm looking for this behavior in a more compact form.
Here's another solution:
new_y = np.array([True if item in (0, 1, 2) else False for item in y])
Again, just looking for a simpler way.
A good general purpose tool is a broadcasted, or 'outer', comparison between elements of two arrays:
In [35]: y=np.arange(5)
In [36]: x=np.array([0,1,2])
In [37]: y[:,None]==x
Out[37]:
array([[ True, False, False],
[False, True, False],
[False, False, True],
[False, False, False],
[False, False, False]])
This is doing a fast comparison between every element of y and every element of x. Depending on your needs, you can condense this array along one of the axes:
In [38]: (y[:,None]==x).any(axis=1)
Out[38]: array([ True, True, True, False, False])
A comment suggested in1d. I think it's a good idea to look at its code. It has several strategies depending on the relative sizes of the inputs.
In [40]: np.in1d(y,x)
Out[40]: array([ True, True, True, False, False])
In [41]: np.array([True if item in x else False for item in y])
Out[41]: array([ True, True, True, False, False])
Which is fastest may depend on the size of the inputs. If you start with lists, your list comprehension might be faster. For inputs this small, this pure-list version is by far the fastest:
[True if item in (0,1,2) else False for item in (0,1,2,3,4)]
[item in (0,1,2) for item in (0,1,2,3,4)] # simpler
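In newer NumPy releases, np.isin is the documented replacement for np.in1d (as far as I know, in1d is deprecated as of NumPy 2.0). It also keeps the shape of its first argument. A minimal sketch using the question's data; the name allowed is just for illustration:

```python
import numpy as np

y = np.arange(5)
allowed = (0, 1, 2)  # illustrative name: any sequence of permitted values

# Element-wise membership test: True where the element of y is in allowed
mask = np.isin(y, allowed)
print(mask)

# Negate the mask to select everything that is NOT in allowed
print(y[~mask])
```

This gives the boolean array the question asks for without building the intermediate len(y) x len(x) comparison matrix yourself.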
I would like to only allow lists where the first contiguous group of elements are True and then all of the remaining elements are False. I want lists like these examples to return True:
[True]
[False]
[True, False]
[True, False, False]
[True, True, True, False]
And lists like these to return False:
[False, True]
[True, False, True]
I am currently using this function, but I feel like there is probably a better way of doing this:
def my_function(x):
    n_trues = sum(x)
    should_be_true = x[:n_trues]  # get the first n items
    should_be_false = x[n_trues:len(x)]  # get the remaining items
    # return True only if all of the first n elements are True and the remaining
    # elements are all False
    return all(should_be_true) and all([not element for element in should_be_false])
Testing:
test_cases = [[True], [False],
[True, False],
[True, False, False],
[True, True, True, False],
[False, True],
[True, False, True]]
print([my_function(test_case) for test_case in test_cases])
# expected output: [True, True, True, True, True, False, False]
Is it possible to use a comprehension instead to make this a one/two line function? I know I could not define the two temporary lists and instead put their definitions in place of their names on the return line, but I think that would be too messy.
Method 1
You could use itertools.groupby. This would avoid doing multiple passes over the list and would also avoid creating the temp lists in the first place:
from itertools import groupby

def check(x):
    status = list(k for k, g in groupby(x))
    return len(status) <= 2 and (status[0] is True or status[-1] is False)
This assumes that your input is non-empty and already all boolean. If that's not always the case, adjust accordingly:
def check(x):
    status = list(k for k, g in groupby(map(bool, x)))
    return status and len(status) <= 2 and (status[0] or not status[-1])
If you want to have empty arrays evaluate to True, either special case it, or complicate the last line a bit more:
return not status or (len(status) <= 2 and (status[0] or not status[-1]))
Method 2
You can also do this in one pass using an iterator directly. This relies on the fact that any and all are guaranteed to short-circuit:
def check(x):
    iterator = iter(x)
    # process the true elements
    all(iterator)
    # check that there are no true elements left
    return not any(iterator)
Personally, I think method 1 is total overkill. Method 2 is much nicer and simpler, and achieves the same goals faster. It also stops immediately if the test fails, rather than having to process the whole group. It also doesn't allocate any temporary lists at all, even for the group aggregation. Finally, it handles empty and non-boolean inputs out of the box.
Since I'm writing on mobile, here's an IDEOne link for verification: https://ideone.com/4MAYYa
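For an offline check, here is a self-contained run of Method 2 against the test cases from the question. It is only a sketch of how to verify the function; the extra empty-list case is my own addition.

```python
def check(x):
    iterator = iter(x)
    # all() consumes the leading truthy items and stops right after
    # the first falsy one (or when the iterator is exhausted)
    all(iterator)
    # whatever is left must contain no truthy item
    return not any(iterator)

test_cases = [[True], [False],
              [True, False],
              [True, False, False],
              [True, True, True, False],
              [False, True],
              [True, False, True]]

results = [check(case) for case in test_cases]
print(results)  # [True, True, True, True, True, False, False]

# Added case: an empty input is accepted as well
print(check([]))  # True
```

Because both all and any pull from the same iterator, the second call picks up exactly where the first one stopped, which is what makes the single pass work.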
Shouldn't the results be the same?
I do not understand.
[True,False] and [True, True]
Out[1]: [True, True]
[True, True] and [True,False]
Out[2]: [True, False]
No, because that's not how the and operator works in Python. First, it doesn't and the list items element by element. Second, and works on two whole objects: if the first one is falsy (evaluated as False; see note 1), it returns that one, otherwise it returns the second one. Here is an example:
>>> [] and [False]
[]
>>>
>>> [False] and []
[]
>>> [False] and [True]
[True]
x and y : if x is false, then x, else y
If you want to apply the logical operation to the lists element by element, you can use NumPy arrays:
>>> import numpy as np
>>> a = np.array([True, False])
>>> b = np.array([True, True])
>>>
>>> np.logical_and(a,b)
array([ True, False], dtype=bool)
>>> np.logical_and(b,a)
array([ True, False], dtype=bool)
1. Since you are dealing with lists here, an empty list evaluates as False.
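If you would rather stay with plain lists, a comprehension over zip gives the element-wise result without NumPy. A small sketch using the lists from the question; the variable names are only for illustration:

```python
a = [True, False]
b = [True, True]

# `and` on the whole lists returns one of the operands:
# a is non-empty (truthy), so the result is simply b
whole = a and b
print(whole)  # [True, True]

# Element-wise: pair the items up with zip and apply `and` to each pair
elementwise = [x and y for x, y in zip(a, b)]
print(elementwise)  # [True, False]

# Swapping the operands changes the whole-list result but not the element-wise one
print(b and a)                             # [True, False]
print([x and y for x, y in zip(b, a)])     # [True, False]
```

Note that zip stops at the shorter list, so lists of unequal length are silently truncated.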
Numpy's logical_or function takes no more than two arrays to compare. How can I find the union of more than two arrays? (The same question could be asked with regard to Numpy's logical_and and obtaining the intersection of more than two arrays.)
If you're asking about numpy.logical_or, then no, as the docs explicitly say, the only parameters are x1, x2, and optionally out:
numpy.logical_or(x1, x2[, out]) = <ufunc 'logical_or'>
You can of course chain together multiple logical_or calls like this:
>>> x = np.array([True, True, False, False])
>>> y = np.array([True, False, True, False])
>>> z = np.array([False, False, False, False])
>>> np.logical_or(np.logical_or(x, y), z)
array([ True, True, True, False], dtype=bool)
The way to generalize this kind of chaining in NumPy is with reduce:
>>> np.logical_or.reduce((x, y, z))
array([ True, True, True, False], dtype=bool)
And of course this will also work if you have one multi-dimensional array instead of separate arrays—in fact, that's how it's meant to be used:
>>> xyz = np.array((x, y, z))
>>> xyz
array([[ True, True, False, False],
[ True, False, True, False],
[False, False, False, False]], dtype=bool)
>>> np.logical_or.reduce(xyz)
array([ True, True, True, False], dtype=bool)
But a tuple of three equal-length 1D arrays is an array_like in NumPy terms, and can be used as a 2D array.
Outside of NumPy, you can also use Python's reduce:
>>> import functools
>>> functools.reduce(np.logical_or, (x, y, z))
array([ True, True, True, False], dtype=bool)
However, unlike NumPy's reduce, Python's is not often needed. For most cases, there's a simpler way to do things—e.g., to chain together multiple Python or operators, don't reduce over operator.or_, just use any. And when there isn't, it's usually more readable to use an explicit loop.
And in fact NumPy's any can be used for this case as well, although it's not quite as trivial; if you don't explicitly give it an axis, you'll end up with a scalar instead of an array. So:
>>> np.any((x, y, z), axis=0)
array([ True, True, True, False], dtype=bool)
As you might expect, logical_and is similar: you can chain it, reduce it with np.logical_and.reduce, functools.reduce it, or substitute all with an explicit axis.
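To make the logical_and variants concrete, here is a short sketch that runs all four and checks that they agree. The arrays are made-up sample data (z differs from the one used above so the result is not all False):

```python
import functools
import numpy as np

x = np.array([True, True, False, False])
y = np.array([True, False, True, False])
z = np.array([True, True, True, False])  # sample data for illustration

chained = np.logical_and(np.logical_and(x, y), z)     # manual chaining
reduced = np.logical_and.reduce((x, y, z))            # ufunc reduce over axis 0
folded = functools.reduce(np.logical_and, (x, y, z))  # Python-level fold
via_all = np.all((x, y, z), axis=0)                   # all with an explicit axis

print(reduced)
for other in (chained, folded, via_all):
    assert np.array_equal(reduced, other)
```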
What about other operations, like logical_xor? Again, same deal… except that in this case there is no all/any-type function that applies. (What would you call it? odd?)
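For xor, the reduce form still works, and the result is True wherever an odd number of inputs is True. A sketch with made-up sample arrays; the parity check via np.sum is my own stand-in for the missing "odd" function, not a NumPy built-in:

```python
import numpy as np

x = np.array([True, True, False, False])
y = np.array([True, False, True, False])
z = np.array([True, True, True, False])  # sample data for illustration

# Fold logical_xor across the three arrays
xor_all = np.logical_xor.reduce((x, y, z))
print(xor_all)

# Equivalent parity test: count the True values per position, keep the odd counts
odd_count = np.sum((x, y, z), axis=0) % 2 == 1
assert np.array_equal(xor_all, odd_count)
```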
In case someone still needs this: say you have three Boolean arrays a, b, c with the same shape. This gives the element-wise and:
a * b * c
this gives or:
a + b + c
Is this what you want?
Stacking a lot of logical_and or logical_or is not practical.
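A quick sketch of why this works: for arrays of dtype bool, NumPy treats * as logical and and + as logical or, so the result stays boolean. The sample arrays are made up; the & and | lines show the bitwise operators, which do the same thing on boolean arrays and are arguably easier to read. This only holds for boolean dtype; with integer arrays, + and * do ordinary arithmetic.

```python
import numpy as np

a = np.array([True, True, False, False])
b = np.array([True, False, True, False])
c = np.array([True, True, True, False])  # sample data for illustration

and_result = a * b * c   # element-wise and on bool arrays
or_result = a + b + c    # element-wise or on bool arrays
print(and_result, or_result)

# Same results with the bitwise operators and with the ufunc reduce
assert np.array_equal(and_result, a & b & c)
assert np.array_equal(or_result, a | b | c)
assert np.array_equal(and_result, np.logical_and.reduce((a, b, c)))
assert np.array_equal(or_result, np.logical_or.reduce((a, b, c)))
```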
Building on abarnert's answer for n-dimensional case:
TL;DR: np.logical_or.reduce(np.array(list))
As boolean algebras are both commutative and associative by definition, the following statements are equivalent for boolean values of a, b and c.
a or b or c
(a or b) or c
a or (b or c)
(b or a) or c
So if you have a "logical_or" which is dyadic and you need to pass it three arguments (a, b, and c), you can call
logical_or(logical_or(a, b), c)
logical_or(a, logical_or(b, c))
logical_or(c, logical_or(b, a))
or whatever permutation you like.
Back to python, if you want to test whether a condition (yielded by a function test that takes a testee and returns a boolean value) applies to a or b or c or any element of list L, you normally use
any(test(x) for x in L)
I use this workaround which can be extended to n arrays:
>>> a = np.array([False, True, False, False])
>>> b = np.array([True, False, False, False])
>>> c = np.array([False, False, False, True])
>>> d = (a + b + c > 0) # That's an "or" between multiple arrays
>>> d
array([ True, True, False, True], dtype=bool)
I've tried the following three different methods to get the logical_and of a list l of k arrays of size n:
Using a recursive numpy.logical_and (see below)
Using numpy.logical_and.reduce(l)
Using numpy.vstack(l).all(axis=0)
Then I did the same for the logical_or function. Surprisingly enough, the recursive method is the fastest one.
import numpy
import perfplot
def and_recursive(*l):
    if len(l) == 1:
        return l[0].astype(bool)
    elif len(l) == 2:
        return numpy.logical_and(l[0],l[1])
    elif len(l) > 2:
        return and_recursive(and_recursive(*l[:2]),and_recursive(*l[2:]))

def or_recursive(*l):
    if len(l) == 1:
        return l[0].astype(bool)
    elif len(l) == 2:
        return numpy.logical_or(l[0],l[1])
    elif len(l) > 2:
        return or_recursive(or_recursive(*l[:2]),or_recursive(*l[2:]))

def and_reduce(*l):
    return numpy.logical_and.reduce(l)

def or_reduce(*l):
    return numpy.logical_or.reduce(l)

def and_stack(*l):
    return numpy.vstack(l).all(axis=0)

def or_stack(*l):
    return numpy.vstack(l).any(axis=0)
k = 10 # number of arrays to be combined
perfplot.plot(
    setup=lambda n: [numpy.random.choice(a=[False, True], size=n) for j in range(k)],
    kernels=[
        lambda l: and_recursive(*l),
        lambda l: and_reduce(*l),
        lambda l: and_stack(*l),
        lambda l: or_recursive(*l),
        lambda l: or_reduce(*l),
        lambda l: or_stack(*l),
    ],
    labels=['and_recursive', 'and_reduce', 'and_stack', 'or_recursive', 'or_reduce', 'or_stack'],
    n_range=[2 ** j for j in range(20)],
    logx=True,
    logy=True,
    xlabel="len(a)",
    equality_check=None
)
[Plot: performance for k = 4]
[Plot: performance for k = 10]
It seems that there is an approximately constant time overhead also for higher n.
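If you don't have perfplot installed, a rough comparison can be done with the standard library's timeit. This is only a sketch under my own assumptions: and_pairwise is a simple loop standing in for the recursive version, the sizes k and n are arbitrary, and the timings will vary by machine and NumPy version, so no numbers are claimed here.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
k, n = 10, 1000  # arbitrary sizes chosen for illustration
arrays = [rng.integers(0, 2, size=n).astype(bool) for _ in range(k)]

def and_pairwise(arrs):
    # Simplified stand-in for the recursive version: fold two arrays at a time
    result = arrs[0].astype(bool)
    for arr in arrs[1:]:
        result = np.logical_and(result, arr)
    return result

candidates = {
    "pairwise": and_pairwise,
    "reduce": np.logical_and.reduce,
    "stack": lambda arrs: np.vstack(arrs).all(axis=0),
}

expected = and_pairwise(arrays)
for name, fn in candidates.items():
    # First make sure every approach computes the same thing, then time it
    assert np.array_equal(fn(arrays), expected)
    seconds = timeit.timeit(lambda: fn(arrays), number=200)
    print(f"{name}: {seconds:.4f} s for 200 calls")
```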
using the sum function:
a = np.array([True, False, True])
b = np.array([False, False, True])
c = np.vstack([a,b,b])
Out[172]:
array([[ True, False, True],
[False, False, True],
[False, False, True]], dtype=bool)
np.sum(c,axis=0)>0
Out[173]: array([ True, False, True], dtype=bool)
a = np.array([True, False, True])
b = np.array([False, False, True])
c = np.array([True, True, True])
d = np.array([True, True, True])
# logical or
lor = (a+b+c+d).astype(bool)
# logical and
land = (a*b*c*d).astype(bool)
If you want a short (maybe not optimal) function for performing logical AND on multidimensional boolean masks, you may use this recursive lambda function:
masks_and = lambda *masks : masks[0] if len(masks) == 1 else masks_and(np.logical_and(masks[0], masks[-1]), *masks[1:-1])
result = masks_and(mask1, mask2, ...)
You can also generalize the lambda to apply any associative binary operator (a function of 2 arguments, such as multiplication/AND or sum/OR) to any objects, keeping the operands in the order given:
fn2args_reduce = lambda fn2args, *args : args[0] if len(args) == 1 else fn2args_reduce(fn2args, fn2args(args[0], args[1]), *args[2:])
result = fn2args_reduce(np.dot, matrix1, matrix2, ... matrixN)
which gives you the same result as nesting the NumPy calls by hand:
np.dot(...(np.dot(np.dot(matrix1, matrix2), matrix3)...), matrixN)
For example fn2args_reduce(lambda a,b: a+b, 1,2,3,4,5) gives you 15 - sum of these numbers (of course you have a much more efficient sum function for this, but I like it).
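For comparison, the standard library's functools.reduce performs the same left-to-right fold without recursion. A minimal sketch; the small matrices are made-up sample data:

```python
import functools
import numpy as np

m1 = np.array([[1, 2], [3, 4]])
m2 = np.array([[0, 1], [1, 0]])
m3 = np.array([[2, 0], [0, 2]])  # sample matrices for illustration

# Left fold: np.dot(np.dot(m1, m2), m3)
folded = functools.reduce(np.dot, (m1, m2, m3))
manual = np.dot(np.dot(m1, m2), m3)
assert np.array_equal(folded, manual)

# The summing example from above
total = functools.reduce(lambda a, b: a + b, (1, 2, 3, 4, 5))
print(total)  # 15
```

Since functools.reduce iterates instead of recursing, it also avoids hitting Python's recursion limit on long argument lists.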
Even more generalized model for functions of N arguments could look like this:
fnNargs_reduce = lambda fnNargs, N, *args : args[0] if len(args) == 1 else fnNargs_reduce(fnNargs, N, fnNargs(*args[:N]), *args[N:])
fnNargs = lambda x1, x2, x3=neutral, ..., xN=neutral: x1 (?) x2 (?) ... (?) xN
Where neutral means the neutral element of the (?) operator, e.g. 0 for +, 1 for *, etc.
Why? Just for fun :-)