Multi-argument null coalesce and built-in "or" function in Python - python

Python has a great syntax for null coalescing:
c = a or b
This sets c to a if a is truthy (that is, not False, None, 0, or an empty container); otherwise c is set to b.
(Yes, technically this is not null coalescing, it's more like bool coalescing, but it's close enough for the purpose of this question.)
There is not an obvious way to do this for a collection of objects, so I wrote a function to do this:
from functools import reduce
def or_func(x, y):
    return x or y

def null_coalesce(*a):
    return reduce(or_func, a)
This works, but writing my own or_func seems suboptimal - surely there is a built-in like __or__? I've attempted to use object.__or__ and operator.__or__, but the first gives an AttributeError and the second refers to the bitwise | (or) operator.
As a result I have two questions:
Is there a built-in function which acts like a or b?
Is there a built-in implementation of such a null coalesce function?
The answer to both seems to be no, but that would be somewhat surprising to me.

It's not exactly a single built-in, but what you want to achieve can be easily done with:
def null_coalesce(*a):
    return next(x for x in a if x)
It's lazy, so it short-circuits like a or b or c, unlike reduce.
You can also make it null-specific with:
def null_coalesce(*a):
    return next(x for x in a if x is not None)
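One caveat with both generator versions: next() raises StopIteration when no argument qualifies, whereas a or b simply returns its last operand. A minimal sketch of a variant with a fallback follows; the default keyword and the name null_coalesce_strict are my own additions for illustration, not part of the answer above.

```python
def null_coalesce(*args, default=None):
    """Return the first truthy argument, or `default` if there is none."""
    # `default` is a hypothetical parameter added for this sketch.
    return next((x for x in args if x), default)


def null_coalesce_strict(*args, default=None):
    """Return the first argument that is not None, or `default`."""
    # Hypothetical name; falsy values such as 0 or "" are kept here.
    return next((x for x in args if x is not None), default)


print(null_coalesce(0, "", [], "fallback"))  # fallback
print(null_coalesce(0, None))                # None
print(null_coalesce_strict(0, None, 5))      # 0
```

Note that the laziness only applies to the scan over the arguments: the arguments themselves are still all evaluated before the function is called.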

Is there a built-in function which I can use which acts like a or b?
No. Quoting from this answer on why:
The or and and operators can't be expressed as functions because of their short-circuiting behavior:
False and some_function()
True or some_function()
in these cases, some_function() is never called.
A hypothetical or_(True, some_function()), on the other hand, would have to call some_function(), because function arguments are always evaluated before the function is called.
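The difference described in the quote can be observed directly. This is a small sketch: or_ below is the hypothetical function from the quote, and the calls list exists only to record whether some_function ran.

```python
calls = []


def some_function():
    # Record every invocation so we can see whether it was called.
    calls.append("called")
    return True


def or_(x, y):
    # Hypothetical function form of `or`; both arguments arrive already evaluated.
    return x or y


result_operator = True or some_function()     # short-circuits, no call
calls_after_operator = len(calls)

result_function = or_(True, some_function())  # argument is evaluated first
calls_after_function = len(calls)

print(calls_after_operator, calls_after_function)  # 0 1
```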
Is there a built-in implementation of such a null coalesce function?
No, there isn't. However, the Python documentation page for itertools suggests the following:
def first_true(iterable, default=False, pred=None):
    """Returns the first true value in the iterable.
    If no true value is found, returns *default*
    If *pred* is not None, returns the first item
    for which pred(item) is true.
    """
    # first_true([a,b,c], x) --> a or b or c or x
    # first_true([a,b], x, f) --> a if f(a) else b if f(b) else x
    return next(filter(pred, iterable), default)
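As a quick illustration of how the recipe behaves, here is a short usage sketch. The recipe is repeated so the block runs on its own; the sample values a, b, c and the "x" default are made up for the example.

```python
def first_true(iterable, default=False, pred=None):
    # Same recipe as above, repeated so this block is self-contained.
    return next(filter(pred, iterable), default)


a, b, c = 0, "", "spam"

print(first_true([a, b, c], "x"))  # spam
print(first_true([a, b], "x"))     # x
# With a predicate, the first item passing the predicate wins, even if falsy.
print(first_true([a, b, c], "x", lambda v: v is not None))  # 0
```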

Marco has it right, there's no built-in, and itertools has a recipe. You can also pip install boltons to use the boltons.iterutils.first() utility, which is perfect if you want short-circuiting.
from boltons.iterutils import first
c = first([a, b])
There are a few other related and handy reduction tools in iterutils, too, like one().
I've done enough of the above that I actually ended up wanting a higher-level tool that could capture the entire interaction (including the a and b references) in a Python data structure, yielding glom and its Coalesce functionality.
from glom import glom, Coalesce
target = {'b': 1}
spec = Coalesce('a', 'b')
c = glom(target, spec)
# c = 1
(Full disclosure, as hinted above, I maintain glom and boltons, which is good news, because you can bug me if you find bugs.)

Related

Correct way of using "not in" operator

We know that,
a = 1
b = 2
print(not a > b)
is the correct way of using the "not" keyword and the below throws an error
a = 1
b = 2
print(a not > b)
since "not" inverts the output Boolean.
Thus, by this logic the correct way for checking the presence of a member in a list should be
a = 1
b = [2,3,4,5]
print(not a in b)
But I find the most common way is
a = 1
b = [2,3,4,5]
print(a not in b)
which from the logic given in previous example should throw an error.
So what is the correct way of using the "not in" operator in Python3.x?
not in is a special case that simplifies to exactly what you tried first. Namely,
a not in b
literally simplifies to not (a in b). It also works (slightly differently, but same idea) for is.
a is not b
is equivalent to not (a is b). Python added these because they flow naturally like English prose. On the other hand, a not < b doesn't look or feel natural, so it's not allowed. The not in and is not are special cases in the grammar, not small parts of a general rule about where not can go. The only general rule in play is that not can always be used as a prefix operator (like in not (a < b))
what is the correct way of using the "not in" operator
There is only one way to use the not in operator. Your not a in b instead uses the not operator and the in operator.
PEP 8 doesn't seem to have an opinion about which to use, but about the similar is not operator (thanks Silvio) it says:
Use is not operator rather than not ... is. While both expressions are functionally identical, the former is more readable and preferred:
# Correct:
if foo is not None:
# Wrong:
if not foo is None:
So I'd say not in should also be preferred, for the same reason.
not, not in and in are all valid operators. Transitively, not (in_expression) is also valid
Correct way? Refer to the Zen of Python.
First of all, not in is not two separate operators; it is a single operator, known as a membership operator. The other membership operator is in. Membership operators have higher precedence than the logical not, and, and or.
print(not a in b) -> This first evaluates a in b, then the logical not inverts the result, and that is printed.
So in your example it prints True: a in b gives False, which the logical not inverts to True.
print(a not in b) -> Here Python checks whether a is absent from b: it gives False if a is in b, otherwise True.
So in your example it prints True, as a is not a part of b.
I think a not in b is clearer than not a in b. I would suggest using the membership operator for testing membership.
The result is the same for both expressions, although they are parsed differently: one is a single membership test, the other a membership test followed by a negation.
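To see how the two spellings are parsed, the standard ast module can be used. A small sketch; the variable names single and combined are just labels chosen for this example.

```python
import ast

# `a not in b` parses as one comparison using the NotIn operator.
single = ast.parse("a not in b", mode="eval").body
# `not a in b` parses as a unary Not applied to an In comparison.
combined = ast.parse("not a in b", mode="eval").body

print(type(single).__name__, type(single.ops[0]).__name__)
# Compare NotIn
print(type(combined).__name__, type(combined.operand.ops[0]).__name__)
# UnaryOp In

a, b = 1, [2, 3, 4, 5]
print((a not in b) == (not a in b))  # True
```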

Where is the logical 'or' equivalent in the 'operator' module?

Adding or multiplying a large list of numbers in Python can elegantly be done by folding the list with the addition or multiplication operator:
import functools, operator
lst = range(1,100)
sum = functools.reduce(operator.add, lst)
prod = functools.reduce(operator.mul, lst)
This needs the function equivalents of the operators + and * which
are provided by the operator module as operator.add and
operator.mul, respectively.
If I want to use the same idiom with the operator or:
ingredients = ['onion', 'celery', 'cyanide', 'chicken stock']
soup_is_poisonous = functools.reduce(operator.or, map(is_poisonous, ingredients))
... then I discover that operator doesn't have a function equivalent of the logical and and or operators (though it has one for logical not)
Of course, I can trivially write one that works:
def operator_or(x,y):
    return x or y
But I wonder: why are there no operator.or and operator.and in operator? Bitwise and and or are there, but not the logical ones.
Of course this is just a minor annoyance, and the answer may well be
the same as with the missing identity function:
that it is easy to write one. But this holds for * and + as well, so why the difference?
To wrap up all your helpful answers
and comments, in order of somewhat decreasing (to me) convincingness:
the addition of operator.or would break an important promise made by the module
For all operators <op> that have function equivalents
operator.op in the operator module, it is the case that a <op> b is equivalent to (i.e. can always, without changing program
behaviour, replace or be replaced by) operator.op(a, b). This
equivalence is actually mentioned in the module docstring. This is
impossible to do for the operators and and or as their
evaluation is short-circuiting while Python function calls are always evaluated after all of their arguments are.
On the values True and False, | and &, hence also the existing (bitwise) operator.and_ and operator.or_
already return the same results (if they return at all, that is) as or and and.
If is_poisonous() returns either True or False (not an unreasonable requirement), I could use
soup_is_poisonous = reduce(operator.or_, map(is_poisonous, ingredients), False)
in the example from the original question. However, many Python
programs conveniently use any "truthy" value as True in idioms like
your_model_T_color = "black" or any_color_you_like
using | or operator.or_ instead of or here will result in a
TypeError or, even worse, some unexpected value (if the operands
are ints)
The functions any and all can be used instead of
functools.reduce(operator.or, ....)
I'm not convinced by this
argument: operator functions are used in many more contexts than
as a first argument to reduce. Moreover, any always returns
either True or False, not the first truthy value:
any([0,0,0,5,6,7]) # returns True
reduce(lambda x, y: x or y, [0,0,0,5,6,7]) # returns 5
so any and reduce(operator.or would not really be equivalent
any([x,y]) does the same (and more, as it accepts iterables) as operator.or(x,y) would.
That is not quite true (see above), any([0,5]) returns True while operator.or(0,5) would return 5. Moreover, the number of arguments matters greatly if we use a function as an argument to another function like reduce()
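The differences between the candidates discussed above can be checked on the same sample list. A minimal sketch; the variable names are only labels for this example.

```python
from functools import reduce
import operator

values = [0, 0, 0, 5, 6, 7]

logical = reduce(lambda x, y: x or y, values)  # first truthy value
bitwise = reduce(operator.or_, values)         # 5 | 6 | 7, bit by bit
as_bool = any(values)                          # always a bool

print(logical, bitwise, as_bool)  # 5 7 True

# On non-numeric truthy values the bitwise version fails outright.
try:
    operator.or_("black", "red")
    bitwise_error = None
except TypeError as exc:
    bitwise_error = exc

print(type(bitwise_error).__name__)  # TypeError
```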
all is short-circuiting logical-and.
any is short-circuiting logical-or.
No need to put versions that take exactly two arguments (instead of an iterable) into the operator module, I guess.
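The short-circuiting of any is easy to observe with an endless iterator. A small sketch; is_big and the checked list are hypothetical helpers that only record which items were examined.

```python
from itertools import count

checked = []


def is_big(n):
    # Record each number that any() actually asks about.
    checked.append(n)
    return n > 3


# count() never ends, so this only terminates because any() stops early.
found = any(is_big(n) for n in count())

print(found)    # True
print(checked)  # [0, 1, 2, 3, 4]
```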

Why is operator module missing `and` and `or`?

operator module makes it easy to avoid unnecessary functions and lambdas
in situations like this:
import operator
def mytest(op, list1, list2):
    ok = [op(i1, i2) for i1, i2 in zip(list1, list2)]
    return all(ok)
mytest(operator.eq, [1, 2, 3], [1, 2, 3]) # True
mytest(operator.add, [-1, 2, -3], [1, -2, 33]) # False
Well, now I need to do i1 and i2, but to my surprise, I can't find and in the operator module! And the same applies to or! I know and is not exactly an operator, it's a keyword, but not, along with is and even del, are all keywords and all are included.
So what's the story? Why are they missing?
Because you cannot convert boolean operators into Python functions. Functions always evaluate their arguments, and boolean operators do not. Adding and and or to the operator module would also require adding a special kind of function (like Lisp "macros") that evaluates its arguments on demand. Obviously, this is not something Python's designers ever wanted. Consider:
if obj is not None and obj.is_valid():
    ....
you cannot write this in a functional form. An attempt like
if operator.xyz(obj is not None, obj.is_valid())
will fail if obj is actually None.
You can write these yourself, but you'll need to pass a function (e.g. lambda) for the second argument to prevent it from being evaluated at call time, assuming that the usual short-circuiting behavior is important to you.
def func_or(val1, fval2):
    return val1 or fval2()

def func_and(val1, fval2):
    return val1 and fval2()
Usage:
func_or(False, lambda: True)
func_and(True, lambda: False)
The reason there's no operator.and is that and is a keyword, so that would be a SyntaxError.
As tgh435 explained, the reason there's no renamed and function in operator is that it would be misleading: a function call always evaluates its operands, but the and operator doesn't. (It would also be an exception to an otherwise consistent and simple rule.)
In your case, it looks like you don't actually care about short-circuiting at all, so you can build your own version trivially:
def and_(a, b):
    return a and b
Or, if you're just using it once, even inline:
mytest(lambda a, b: a and b, [-1, 2, -3], [1, -2, 33])
In some cases, it's worth looking at all (and, for or, any). It is effectively short-circuited and expanded to arbitrary operands. Of course it has a different API than the operator functions, taking a single iterable of operands instead of two separate operands. And the way it short-circuits is different; it just stops iterating the iterable, which only helps if you've set things up so the iterable is only evaluating things as needed. So, it's usually not usable as a drop-in replacement—but it's sometimes usable if you refactor your code a bit.
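The point about all only helping when the iterable itself is lazy can be sketched as follows. The check helper and the log list are hypothetical, added only to record which checks actually ran.

```python
log = []


def check(name, result):
    # Record that this check ran, then return its (pre-set) result.
    log.append(name)
    return result


# A list is built in full before all() sees it, so both checks run.
log.clear()
eager = all([check("first", False), check("second", True)])
eager_log = list(log)

# A generator over callables runs each check only when all() asks for it.
log.clear()
thunks = (lambda: check("first", False), lambda: check("second", True))
lazy = all(thunk() for thunk in thunks)
lazy_log = list(log)

print(eager, eager_log)  # False ['first', 'second']
print(lazy, lazy_log)    # False ['first']
```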
Python's and and or syntax cannot be mapped directly to functions. These operators are lazily evaluated: if the left part of the expression is enough to determine the value of the whole expression, the right part is skipped. Since they introduce flow control, their behavior cannot be reproduced by a function in the operator module.
To reduce confusion, Python simply does not provide these functions.
georg gives a good example of a situation where and laziness matters:
if obj is not None and obj.is_valid():
    ...
Now, if you don't need lazy evaluation, you can use abarnert's answer implementation:
def and_(a, b):
    return a and b

def or_(a, b):
    return a or b
Usage:
>>> or_(False, True)
>>> and_(True, False)
If you need lazy evaluation, you can use kindall's answer implementation:
def func_or(val1, fval2):
    return val1 or fval2()

def func_and(val1, fval2):
    return val1 and fval2()
Usage:
>>> func_or(False, lambda: True)
>>> func_and(True, lambda: False)
Note:
As mentioned in the comments, the functions operator.and_ and operator.or_ correspond to the bitwise operators & and |. See: https://docs.python.org/3/library/operator.html#mapping-operators-to-functions
Note that the names operator.and and operator.or aren't used: and and or are Python keywords, so using them as attribute names would be a syntax error.

Check if all values of iterable are zero

Is there a good, succinct/built-in way to see if all the values in an iterable are zeros? Right now I am using all() with a little list comprehension, but (to me) it seems like there should be a more expressive method. I'd view this as somewhat equivalent to a memcmp() in C.
values = (0, 0, 0, 0, 0)
# Test if all items in values tuple are zero
if all([ v == 0 for v in values ]) :
    print('indeed they are')
I would expect a built-in function that does something like:
def allcmp(iter, value) :
    for item in iter :
        if item != value :
            return False
    return True
Does that function exist in python and I'm just blind, or should I just stick with my original version?
Update
I'm not suggesting that allcmp() is the solution. It is an example of what I think might be more meaningful. This isn't the place where I would suggest new built-ins for Python.
In my opinion, all() isn't that meaningful. It doesn't express what "all" is checking for. You could assume that all() takes an iterable, but it doesn't express what the function is looking for (an iterable of bools that tests all of them for True). What I'm asking for is some function like my allcmp() that takes two parameters: an iterable and a comparison value. I'm asking if there is a built-in function that does something similar to my made up allcmp().
I called mine allcmp() because of my C background and memcmp(), the name of my made up function is irrelevant here.
Use generators rather than lists in cases like that:
all(v == 0 for v in values)
Edit:
all is a standard Python built-in. If you want to be an efficient Python programmer, you should probably know more than half of them (http://docs.python.org/library/functions.html). Arguing that alltrue is a better name than all is like arguing that C's while should be called whiletrue. It is subjective, but I think most people prefer shorter names for built-ins, because you should know what they do anyway and you have to type them a lot.
Using generators is better than using numpy because generators have more elegant syntax. numpy may be faster, but you will benefit only in rare cases (generators like the one shown are fast; you will benefit only if this code is the bottleneck in your program).
You probably can't expect anything more descriptive from Python.
PS. Here is the code if you want to do this in memcmp style (I like the all version more, but maybe you will like this one):
list(l) == [0] * len(l)
If you know that the iterable will contain only integers then you can just do this:
if not any(values):
    # etc...
If values is a numpy array you can write
import numpy as np
values = np.array((0, 0, 0, 0, 0))
all(values == 0)
The any() function may be the simplest way to achieve just that. If all elements are zero (or the iterable is empty), it returns False.
values = (0, 0, 0, 0, 0)
print(any(values))  # prints False
The built-in set is given an iterable and returns a collection (set) of unique values.
So it can be used here as:
set(it) == {0}
assuming it is the iterable
{0} is a set containing only zero
More info on set and frozenset is available in the Python docs.
I prefer using negation:
all(not v for v in values)
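The approaches suggested in these answers agree on plain integer input but differ on edge cases, which may matter when choosing one. A small comparison sketch; the three function names are hypothetical wrappers around the expressions shown above.

```python
def all_equal_zero(values):
    # Explicit comparison of every item against 0.
    return all(v == 0 for v in values)


def none_truthy(values):
    # Treats every falsy item ("" or None included) as "zero".
    return not any(values)


def set_is_zero(values):
    # Requires at least one item, and all items equal to 0.
    return set(values) == {0}


samples = [(0, 0, 0), (0, 1, 0), (), (0, "", None), (0, 0.0, False)]
for sample in samples:
    print(sample, all_equal_zero(sample), none_truthy(sample), set_is_zero(sample))
```

The three agree on the first two samples. On the empty tuple only the set version says False, and on (0, "", None) only the not any version says True, because it tests truthiness rather than equality with zero.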

Setting numpy slice in lambda function

I want to create a lambda function that takes two numpy arrays and sets a slice of the first to the second and returns the newly set numpy array.
Considering you can't assign things in lambda functions is there a way to do something similar to this?
The context of this is that I want to set the centre of a zeros array to another array in a single line, and the only solution I could come up with is to use reduce and lambda functions.
I.e. I'm thinking about the condensation of this (where b is given):
a = numpy.zeros( numpy.array(b.shape) + 2)
a[1:-1,1:-1] = b
Into a single line. Is this possible?
This is just an exercise in oneliners. I have the code doing what I want it to do, I'm just wondering about this for the fun of it :).
This is ugly; you should not use it. But it is a one-line lambda, as you asked:
f = lambda b, a=None, s=slice(1,-1): f(b, numpy.zeros(numpy.array(b.shape) + 2))\
    if a is None else (a.__setitem__((s,)*a.ndim, b), a)[1]
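If the underlying goal is just to surround b with a border of zeros, NumPy's own numpy.pad may be a simpler one-liner than any lambda trick. This is a sketch of that alternative, not the approach from the answer; the sample array b is made up, and note that pad keeps b's dtype whereas numpy.zeros defaults to float.

```python
import numpy as np

b = np.arange(6.0).reshape(2, 3)  # sample input, assumed for illustration

# The two-line version from the question.
a = np.zeros(tuple(n + 2 for n in b.shape))
a[1:-1, 1:-1] = b

# One-line alternative: pad every axis with one zero on each side.
padded = np.pad(b, 1, mode="constant", constant_values=0)

print(np.array_equal(a, padded))  # True
```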
What is __setitem__?
obj.__setitem__(index, value) is equivalent to obj[index] = value in this case. Example:
class A:
    def __setitem__(self, index, value):
        print('index=%s, value=%s' % (index, value))
a = A()
a[1, 2] = 3
It prints:
index=(1, 2), value=3
Why does __setitem__() return None?
There is a general convention in Python that methods such as list.extend(), list.append() that modify an object in-place should return None. There are exceptions e.g., list.pop().
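Because __setitem__ returns None, the lambda above uses a tuple to perform the assignment and then hand back the object. The same trick can be sketched on a plain list; set_and_return is a hypothetical name for this illustration.

```python
items = [1, 2, 3]

# The in-place assignment itself returns None.
ret = items.__setitem__(0, 9)

# Build a tuple (None, seq) and pick element 1 to return the mutated object.
set_and_return = lambda seq, index, value: (seq.__setitem__(index, value), seq)[1]
result = set_and_return(items, 1, 8)

print(ret, result)  # None [9, 8, 3]
```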
Y Combinator in Python
Here's a blog post, On writing Python one-liners, which shows how to write nameless recursive functions using lambdas (the link was suggested by @Peter Hansen).
