I recently came across some python code I don't understand completely.
s = "abcdef"
x = "bde"
it = iter(s)
print(all(c in it for c in x))
I understand that this code checks if x is a subsequence of s. Can someone explain or point me towards an article that explains what's exactly happening at c in it. What is calling the next method of iterator it?
It’s good to start with reading the documentation for the built-in function all():
Return True if all elements of the iterable are true (or if the iterable is empty).
That means that c in it for c in x is a “generator expression”: it produces values. The values it produces are of the boolean expression c in it (see the in operator) for all characters c in string x.
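Here, the in operator is responsible for advancing the iterator: an iterator has no __contains__ method, so c in it falls back to calling next(it) repeatedly until it finds c or the iterator is exhausted. The True result is therefore not luck. The iterator it can only move forward, so each search resumes where the previous one stopped, and x = "bde" is found because its letters appear in s = "abcdef" in the same order. Reverse x to "edb" and the expression is False: searching for "e" consumes everything up to and including "e", so "d" is no longer among the remaining items, and that failed search exhausts the iterator.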
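To see the iterator being consumed, here is a minimal sketch. The helper name is_subsequence is made up for illustration; it just wraps the code from the question.

```python
def is_subsequence(x, s):
    """Return True if the items of x appear in s in the same order."""
    it = iter(s)
    # Each `c in it` pulls items from the iterator until it finds c
    # (or runs out), so the next search resumes after that point.
    return all(c in it for c in x)


it = iter("abcdef")
found = "b" in it        # consumes "a" and "b"
remaining = list(it)     # whatever the membership test left behind

print(found, remaining)                   # True ['c', 'd', 'e', 'f']
print(is_subsequence("bde", "abcdef"))    # True
print(is_subsequence("edb", "abcdef"))    # False
```

The same pattern should work for any iterable s, not only strings, since it relies only on iter() and the membership test.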
In Python, an empty list is considered a Falsey value
Therefore this is how things should work:
>>> [] and False
False
But in reality, python returns an empty list.
>>> [] and False
[]
Is this intended or a bug?
It's intended. Both and and or are defined to return the last thing evaluated (based on short-circuiting), not actually True or False. For and, this means it returns the first falsy value, if any, and the last value (regardless of truthiness) if all the others are truthy.
It was especially useful back before the conditional expression was added, as it let you do some almost-equivalent hacks. For example, what is now written as:
b if a else c
could be written as:
a and b or c
and, assuming b itself was some truthy thing, it would behave equivalently (the conditional expression lacked that limitation and was more clear about intent, which is why it was added). Even today this feature is occasionally useful for replacing all falsy values with some more specifically-typed default, e.g. when lst might be passed as None or a list, you can ensure it's a list with:
lst = lst or []
to cheaply replace None (and any other falsy thing) with a new empty list.
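A small sketch of where the old hack and the conditional expression diverge, and of the lst or [] default. The function names old_style, conditional and normalize are made up for this illustration.

```python
def old_style(a, b, c):
    # Pre-2.5 idiom: only equivalent to the conditional when b is truthy.
    return a and b or c


def conditional(a, b, c):
    return b if a else c


def normalize(lst=None):
    # Replaces None, and also any other falsy argument, with a new list.
    return lst or []


print(old_style(True, "yes", "no"))    # yes
print(conditional(True, "yes", "no"))  # yes
print(old_style(True, 0, "no"))        # no  (b is falsy, so c leaks through)
print(conditional(True, 0, "no"))      # 0
print(normalize(None), normalize([1]), normalize(()))  # [] [1] []
```

Note that normalize(()) also returns a list, because an empty tuple is falsy. If only None should be replaced, an explicit `is None` check is the safer choice.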
This is how it is supposed to work. and will only return the right hand operand if the left hand operand is truthy. Since [] is falsy, and returns the left hand operand.
That's a totally expected behaviour. To understand it, you need to know how the Boolean operators (and, or, not) work. From the Boolean Operations documentation:
The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Now let's consider your example: [] and False. Here, since [] is falsy, its value, [], is what the expression returns.
Above linked Python documentation explicitly mentions:
Note: Neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument.
However, if you need the result as a boolean, you can explicitly convert the value to True or False using the bool() function.
For example, in your case it will return as False:
>>> bool([] and False)
False
I have a script that checks if there are one or more of the same items in a list. Here's the code:
items = ["Blue", "Black", "Red"]
def isUnique(item):
    seen = list()
    return not any(i in seen or seen.append(i) for i in item)
print(isUnique(items))
It prints "True" if all the items in the given list are unique and "False" if one or more items in the list are duplicated. Can someone please explain the any() part of the script for me, as I don't fully understand how it works?
This code is kind of a hack, since it uses a generator expression with side-effects and exploits the fact that append returns None, which is falsy.
The equivalent code written in the imperative style is like so:
def isUnique(items):
    seen = list()
    for i in items:
        if i in seen or seen.append(i):
            return False
    return True
The or is still a bit strange there - it is being used for its short-circuiting behaviour, so that append is only called when i in seen is false - so we could rewrite it like this:
def isUnique(items):
    seen = list()
    for i in items:
        if i in seen:
            return False
        else:
            seen.append(i)
    return True
This is equivalent because append is only called when i in seen is false, and the call to append returns None which means the return False line shouldn't execute in that case.
First you need to understand how the or operator works.
In exp1 or exp2, Python returns exp1 if it is truthy; otherwise it evaluates exp2 and returns that, whatever its truthiness.
For example:
>>> 2 or 3
2
>>> 5 or 0.0
5
>>> [] or 3
3
>>> 0 or {}
{}
Now for your list comprehension, [i in seen or seen.append(i) for i in items]: for each new item, i in seen evaluates to False, so seen.append(i) is evaluated and its return value becomes the result. Since list.append returns None, the comprehension contains only None values:
>>> seen = []
>>> items = ["Blue", "Black", "Red"]
>>> res = [i in seen or seen.append(i) for i in items]
>>> res
[None, None, None]
>>> any(res)
False
As per the any documentation, it returns False because none of the elements in the iterable is truthy (bool(None) is False).
>>> help(any)
Help on built-in function any in module builtins:
any(iterable, /)
Return True if bool(x) is True for any x in the iterable.
If the iterable is empty, return False.
The any function in Python takes an iterable and returns True if at least one of its elements is truthy, i.e. the OR of all of them.
The expression i in seen or seen.append(i) for i in item appends i to seen if it's not in seen already. But if it is already in seen, then append() does not run: the first part is already True, and Python doesn't need to evaluate the second part, since True OR'd with anything is True. So the seen list ends up being a list of the unique colours seen so far.
i in seen or seen.append(i) for i in item is also a generator expression,
which generates either True (for a repeated item) or None (the return value of append), and any checks the values it generates. If even one of them evaluates to True, the whole any returns True.
So the first time an item that is already in the seen list is found, any stops consuming the generator and returns True.
So if a duplicate element happens to be in the list, no more conditions are evaluated and no more elements are appended to the seen list.
So if the list had duplicate elements, like,
items = ["Blue", "Blue", "Black", "Red"]
def isUnique(item):
    seen = list()
    unique = not any(i in seen or seen.append(i) for i in item)
    print(seen)
    return unique
isUnique(items)
would result in the output, just
['Blue']
EDIT: there are great answers. Adding some simpler ways to achieve the wanted result:
Method 1:
items = ["Blue", "Black", "Red"]
items_set = set(items)
if len(items_set) != len(items):
    # there are duplications
    print("The list contains duplicates")
This works because a set object ‘removes’ duplications.
Method 2:
contains_duplicates = any(items.count(element) > 1 for element in items) # true if contains duplications and false otherwise.
See https://www.kite.com/python/answers/how-to-check-for-duplicates-in-a-list-in-python
———————————————
any is a great function
Return True if any element of the iterable is true. If the iterable is empty, return False
Your function isUnique, however, does a bit more logic. Let's break it down:
First you create an empty list object and store it in 'seen' variable.
for i in item - iterates the list of items.
i in seen - This statement returns True if 'i' is a member of 'seen', and false otherwise.
seen.append(i) - adds i to seen. This call always returns None, which is falsy.
Notice the or between the two in i in seen or seen.append(i). If i in seen is True, the or expression returns True without calling append; otherwise it calls append and returns its result, None.
At this point, I'd run [i in seen or seen.append(i) for i in item], see the result and experiment with it. The result for your example is [None, None, None].
Basically, for each item, you both add it to the list and check if it is already in the list.
Finally, you use the any() function, which returns True if the iterable has a truthy value. This happens only if i in seen returns True for some item.
Notice you are using not any(...), which returns True in case there are no repetitions and False otherwise.
There are simpler and clearer ways to implement this. You should try!
It is quite simple: the expression inside any() is a generator. any() draws from that generator and returns True (and stops) at the first element from the generator that is True. If it exhausts the generator, then it returns False.
The expression in the generator (i in seen or seen.append(i)) is a trick to express as a one-liner the logic that: if i is in the list, the expression is True and any() stops immediately, otherwise, i is added to the list and the generator continues.
The function can be significantly improved by using a set instead of a list:
def isUnique(item):
    seen = set()
    return not any(i in seen or seen.add(i) for i in item)
It is much faster to test for presence of an item in a set (O(1) on average) than in a list (O(n)).
One interesting and perhaps underappreciated aspect of this code is that it works on a (potentially infinite) generator. It will stop drawing from the generator at the first repeated item. Subsequent items that would be obtained by the generator are not evaluated at all (with potential side-effects, desirable or not).
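As a rough illustration of that point, the sketch below feeds the set-based version an infinite generator. The names is_unique and residues are hypothetical and only serve this example.

```python
from itertools import count


def is_unique(items):
    seen = set()
    # set.add returns None, so the `or` only yields True for a repeat.
    return not any(i in seen or seen.add(i) for i in items)


def residues():
    # Infinite generator: 0, 1, 2, 3, 4, 0, 1, 2, ...
    for n in count():
        yield n % 5


stream = residues()
result = is_unique(stream)   # stops at the first repeat (the second 0)
next_value = next(stream)    # items after the repeat were never consumed

print(result, next_value)    # False 1
```

With a generator that never repeats, the call would of course never return, so this only helps when a repeat is expected to show up eventually.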
A different approach, suitable for known and finite collections of items, would be the following:
def isUnique(items):
    items = tuple(items)  # in case items is a generator
    return len(set(items)) == len(items)
This assumes that all the items fit in memory. Obviously this won't work if items is a generator of a very large or infinite number of elements.
Kindly help me understand why this works. The code below lists duplicates in an iterable. However, the use of the or operator behaves like the else in an if..else statement..
j = set()
my_list = [1, 2, 3 ,3 , 3 ,4, 4]
j_add = j.add
twice = set(x for x in my_list if x in j or j_add(x))
print(list(twice))
Would expect the line to be:
twice = set(x for x in my_list if x in j else j_add(x))
Thought or returns a boolean not a value
The or operator returns the last evaluated argument, which may or may not be a Boolean.
This behavior is explained in the Documentation:
Note that neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument. This is sometimes useful, e.g., if s is a string that should be replaced by a default value if it is empty, the expression s or 'foo' yields the desired value.
Of course, it helps to remember what is interpreted as false and what is interpreted as true:
[T]he following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true.
So in the expression:
A = B or C
As @MartijnPieters points out in a comment, an or expression short-circuits. If the first argument (B in this case) is interpreted as true, the entire expression must be true, so the second argument (C) is never evaluated. Therefore the first argument (B) is "the last evaluated argument" and is what is returned. However, if the first argument (B) is interpreted as false, the second argument (C) must still be evaluated to determine the truthiness of the expression (no short-circuit takes place). In that case, "the last evaluated argument" is the second argument (C), which is returned regardless of whether the expression evaluates true or false.
It effectively accomplishes the same as the Conditional Expression:
A = B if B else C
However, Conditional Expressions were only added to Python in version 2.5, while the Boolean Operator behavior has existed from the beginning (or at least for a very long time). Most seasoned Python programmers will easily recognize and are in the habit of using A = B or C. Conditional Expressions are commonly reserved for more complex conditions that won't work with a simple or (for example in A = B if X else C the condition is not based on the truthiness of B but X, which could be anything from a simple value to a complex expression).
However, you need to be careful because, as JaredGoguen points out in his answer, changing the or to an else in the OP's sample actually changes the behavior of the code. That code was written to depend on this specific behavior of the or operator. You can't just replace any use of or for assignment with a Conditional Expression. Additional refactoring may be needed as well.
I might make a value judgment here and say that this is not good code because it is using the short-circuiting behavior of or to produce a side-effect.
Consider the given conditional: if x in j or j_add(x).
When x in j, the or short-circuits, skips the j_add(x) part of the conditional, and evaluates as True.
When x not in j, the expression j_add(x) is evaluated and checked for its truthiness. This method returns None, which is falsy, and so the or evaluates to a falsy value.
So, the entire conditional will evaluate the same as x in j. However, j_add(x) has the side-effect of adding x to j! This side-effect is being exploited in order to record the unique members of my_list in a quick-and-dirty comprehension.
Changing the or to an else would still construct j as desired, but it would inappropriately add None, the return value of j_add(x), to twice.
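For comparison, here is a sketch of the same logic with the side-effect made explicit. The function names duplicates_with_or and duplicates_explicit are made up for this example; the first is the code from the question wrapped in a function.

```python
def duplicates_with_or(values):
    seen = set()
    # seen.add(x) returns None, so the filter is truthy only when
    # x has been seen before; the add happens as a side-effect.
    return set(x for x in values if x in seen or seen.add(x))


def duplicates_explicit(values):
    seen = set()
    twice = set()
    for x in values:
        if x in seen:
            twice.add(x)   # second or later occurrence
        else:
            seen.add(x)    # first occurrence
    return twice


my_list = [1, 2, 3, 3, 3, 4, 4]
print(duplicates_with_or(my_list) == duplicates_explicit(my_list))  # True
print(sorted(duplicates_explicit(my_list)))                         # [3, 4]
```

The explicit loop is longer, but it does not depend on the reader knowing that set.add returns None.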
Numpy has a great method .all() for arrays of booleans, that tests if all the values are true. I'd like to do the same without adding numpy to my project. Is there something similar in the standard library? Otherwise, how would you implement it?
I can of course think of the obvious way to do it:
def all_true(list_of_booleans):
    for v in list_of_booleans:
        if not v:
            return False
    return True
Is there a more elegant way, perhaps a one-liner?
There is; it is called all(), surprisingly. It is implemented exactly as you describe, albeit in C. Quoting the docs:
Return True if all elements of the iterable are true (or if the
iterable is empty). Equivalent to:
def all(iterable):
    for element in iterable:
        if not element:
            return False
    return True
New in version 2.5.
This is not limited to just booleans. Note that this takes an iterable; passing in a generator expression means only enough of the generator expression is going to be evaluated to test the hypothesis:
>>> from itertools import count
>>> c = count()
>>> all(i < 10 for i in c)
False
>>> next(c)
11
There is an equivalent any() function as well.
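A short sketch of the short-circuiting for both functions. The generator noisy is a hypothetical helper that records which items were actually pulled from it.

```python
def noisy(values, log):
    # Yield each value, recording it in log as it is consumed.
    for v in values:
        log.append(v)
        yield v


all_log = []
all_result = all(v < 3 for v in noisy([1, 2, 5, 7], all_log))

any_log = []
any_result = any(v > 1 for v in noisy([1, 2, 5, 7], any_log))

print(all_result, all_log)   # False [1, 2, 5]
print(any_result, any_log)   # True [1, 2]
print(all([]), any([]))      # True False
```

all() stops at the first falsy element and any() at the first truthy one, so in both cases 7 is never requested from the generator.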
There is a similar function, called all().
I'm a little curious about the difference between if and inline if, in Python. Which one is better?
Is there any reason to use inline if, other than the fact that it's shorter?
Also, is there anything wrong with this statement? I'm getting a syntax error: SyntaxError: can't assign to conditional expression
a = a*2 if b == 2 else a = a/w
The advantage of the inline if expression is that it's an expression, which means you can use it inside other expressions—list comprehensions, lambda functions, etc.
The disadvantage of the inline if expression is also that it's an expression, which means you can't use any statements inside of it.
A perfect example of the disadvantage is exactly what's causing your error: a = a/w is a statement, so you can't use it inside an expression. You have to write this:
if b == 2:
    a = a*2
else:
    a = a/w
Except that in this particular case, you just want to assign something to a in either case, so you can just write this:
a = a*2 if b==2 else a/w
As for the advantage, consider this:
odd_numbers = [number if number%2 else number+1 for number in numbers]
Without the if expression, you'd have to wrap the conditional in a named function—which is a good thing for non-trivial cases, but overly verbose here:
def oddify(number):
    if number%2:
        return number
    else:
        return number+1
odd_numbers = [oddify(number) for number in numbers]
Also, note that the following example is not using an if (ternary conditional) expression, but an if (conditional filter) clause:
odd_numbers = [number for number in numbers if number % 2]
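To make the difference concrete, here is a small sketch with an assumed sample list; the variable names mapped, filtered and both are only for illustration.

```python
numbers = [1, 2, 3, 4]  # sample input, assumed for this example

# Conditional expression: every element is kept, some are transformed.
mapped = [n if n % 2 else n + 1 for n in numbers]

# Filter clause: elements failing the test are dropped.
filtered = [n for n in numbers if n % 2]

# Both at once: filter first, then choose a value for each survivor.
both = [n * 10 if n > 2 else n for n in numbers if n % 2]

print(mapped)    # [1, 3, 3, 5]
print(filtered)  # [1, 3]
print(both)      # [1, 30]
```

The conditional expression sits before the for and needs an else; the filter clause sits after the for and cannot have one.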
The correct way to use the conditional expression is:
result = X if C else Y
what you have is:
result = X if C else result = Y
So, you should remove the result = part from there. The major advantage of a conditional expression is that it's an expression. You can use it wherever you would use a normal expression: as the right-hand side of an assignment, as a method/function argument, in lambdas, in list comprehensions, and so on. However, you can't put arbitrary statements in it, such as an assignment or a Python 2 print statement.
For example, suppose you want all even integers from a list, but for all odd numbers, you want the values as 0. You would use it in a list comprehension like this:
result = [x if x % 2 == 0 else 0 for x in li]
Inline if is an expression, so you can not put assignments inside.
Correct syntax would be:
a = a*2 if b == 2 else a/w
As for the usefulness, it's a question of style, and perhaps it would be a good question for Programmers StackExchange.