Is OR and ELSE similar in list comprehension statement - python

Kindly help me understand why this works. The code below lists the duplicates in an iterable. However, the or operator seems to behave like the else in an if..else statement.
j = set()
my_list = [1, 2, 3 ,3 , 3 ,4, 4]
j_add = j.add
twice = set(x for x in my_list if x in j or j_add(x))
print(list(twice))
I would expect the line to be:
twice = set(x for x in my_list if x in j else j_add(x))
I thought or returned a boolean, not a value.

The or operator returns the last evaluated argument, which may or may not be a Boolean.
This behavior is explained in the Documentation:
Note that neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument. This is sometimes useful, e.g., if s is a string that should be replaced by a default value if it is empty, the expression s or 'foo' yields the desired value.
Of course, it helps to remember what is interpreted as false and what is interpreted as true:
[T]he following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true.
So in the expression:
A = B or C
As @MartijnPieters points out in a comment, an or expression short-circuits. If the first argument (B in this case) is interpreted as true, the entire expression must be true, so the second argument (C) is never evaluated. Therefore the first argument (B) is "the last evaluated argument" and is what is returned. However, if the first argument (B) is interpreted as false, the second argument (C) must still be evaluated to determine the truthiness of the expression (no short-circuit takes place). In that case, "the last evaluated argument" is the second argument (C), which is returned regardless of whether the expression evaluates true or false.
It effectively accomplishes the same as the Conditional Expression:
A = B if B else C
However, Conditional Expressions were only added to Python in version 2.5, while the Boolean Operator behavior has existed from the beginning (or at least for a very long time). Most seasoned Python programmers will easily recognize and are in the habit of using A = B or C. Conditional Expressions are commonly reserved for more complex conditions that won't work with a simple or (for example in A = B if X else C the condition is not based on the truthiness of B but X, which could be anything from a simple value to a complex expression).
However, you need to be careful because, as JaredGoguen points out in his answer, changing the or to an else in the OP's sample actually changes the behavior of the code. That code was written to depend on this specific behavior of the or operator. You can't just replace any use of or for assignment with a Conditional Expression. Additional refactoring may be needed as well.
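A minimal sketch of the short-circuit behavior described above. The helper noisy and the list calls are made up for illustration; they only record which operands were actually evaluated.

```python
calls = []  # records which operands were actually evaluated

def noisy(value):
    """Hypothetical helper: log the value, then return it unchanged."""
    calls.append(value)
    return value

# First operand is truthy: `or` returns it and never evaluates the second.
first = noisy("B") or noisy("C")

# First operand is falsy: the second is evaluated and returned as-is.
second = noisy("") or noisy("C")

# Both operands are falsy: the last evaluated one (None) still comes back.
third = noisy(0) or noisy(None)

print(repr(first), repr(second), repr(third))
print(calls)
```

Note that "C" shows up in calls only once: the first expression never reached its second operand.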

I might make a value judgment here and say that this is not good code because it is using the short-circuiting behavior of or to produce a side-effect.
Consider the given conditional: if x in j or j_add(x).
When x in j, the or short-circuits, skips the j_add(x) part of the conditional, and evaluates as True.
When x not in j, the expression j_add(x) is checked for its truthiness. This method returns None, which is falsy, and so the or evaluates as false.
So, the entire conditional will evaluate the same as x in j. However, j_add(x) has the side-effect of adding x to j! This side-effect is being exploited in order to record the unique members of my_list in a quick-and-dirty comprehension.
Changing the or to an else would still construct j as desired, but it would inappropriately add None, the return value of j_add(x), to twice.
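As a rough sketch, the three variants can be compared side by side. The line in the question with else in the filter is not valid syntax, so the else variant below is only one possible reading of it (a conditional expression in the output position). The names twice_or, twice_else, twice_loop and seen are made up for this example.

```python
my_list = [1, 2, 3, 3, 3, 4, 4]

# Original trick: j.add(x) returns None, so the filter is true only for repeats.
j = set()
twice_or = set(x for x in my_list if x in j or j.add(x))

# Conditional expression instead of `or`: every first occurrence
# contributes the None returned by j.add(x) to the result.
j = set()
twice_else = set(x if x in j else j.add(x) for x in my_list)

# Explicit loop: no side effect hidden inside a condition.
seen = set()
twice_loop = set()
for x in my_list:
    if x in seen:
        twice_loop.add(x)
    else:
        seen.add(x)

print(sorted(twice_or))
print(None in twice_else)
print(sorted(twice_loop))
```

The explicit loop is longer, but it makes the bookkeeping of seen values visible instead of relying on the return value of set.add.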

Related

Python: `and` operator does not return a boolean value

In Python, an empty list is considered a Falsey value
Therefore this is how things should work:
>>> [] and False
False
But in reality, python returns an empty list.
>>> [] and False
[]
Is this intended or a bug?
It's intended. Both and and or are defined to return the last thing evaluated (based on short-circuiting), not actually True or False. For and, this means it returns the first falsy value, if any, and the last value (regardless of truthiness) if all the others are truthy.
It was especially useful back before the conditional expression was added, as it let you do some almost-equivalent hacks, e.g. before the conditional expression:
b if a else c
could be written as:
a and b or c
and, assuming b itself was some truthy thing, it would behave equivalently (the conditional expression lacked that limitation and was more clear about intent, which is why it was added). Even today this feature is occasionally useful for replacing all falsy values with some more specifically-typed default, e.g. when lst might be passed as None or a list, you can ensure it's a list with:
lst = lst or []
to cheaply replace None (and any other falsy thing) with a new empty list.
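A short sketch of both points, assuming a falsy b to show where the old hack breaks. The helper ensure_list is hypothetical and only wraps the lst or [] idiom.

```python
a = True
b = 0            # a falsy "then" value
c = "fallback"

old_style = a and b or c      # b is falsy, so the result falls through to c
new_style = b if a else c     # picks b whenever a is truthy

def ensure_list(lst=None):
    """Hypothetical helper: swap None (or any falsy value) for a new list."""
    lst = lst or []
    return lst

print(repr(old_style), repr(new_style))
print(ensure_list(), ensure_list([1, 2]))
```

One consequence of the idiom: an empty list passed in by the caller is also replaced with a new list, which matters if the caller expects the function to mutate its argument.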
This is how it is supposed to work. and will only return the right hand operand if the left hand operand is truthy. Since [] is falsy, and returns the left hand operand.
That's a totally expected behaviour. To understand it, you need to know how the Boolean operators (and, or, not) work. From the Boolean Operations documentation:
The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Now let's consider your example: [] and False. Here, since [] is falsey, its value, [], is what the expression returns.
Above linked Python documentation explicitly mentions:
Note: Neither and nor or restrict the value and type they return to False and True, but rather return the last evaluated argument.
However, in case you need the return value as boolean, you can explicitly type-cast the value to True or False using the bool() function.
For example, in your case it will return as False:
>>> bool([] and False)
False
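The rule can be checked with a few small cases. This is only an illustrative sketch; the dictionary results is made up to pair each expression with its value.

```python
results = {
    "[] and False": [] and False,        # left operand is falsy: returned as-is
    "False and []": False and [],        # same rule, different left operand
    "[1] and 'x'": [1] and "x",          # left operand is truthy: right one returned
    "bool([] and False)": bool([] and False),  # explicit conversion to a bool
}

for expr, value in results.items():
    print(expr, "->", repr(value))
```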

Or keyword inside a print statement in Python?

How does Python decide the output of this ?
print([1, 2] or ["hello"])
I mean, why will print([2] or ["az"]) always output [2] and not ["az"]?
Since those lists contain elements, they will evaluate to True so Python prints whichever True literal comes first.
There are two things you have to understand here. First:
x or y
If x is truthy, it has the value of x (without even evaluating y, so 23 or launch_nukes() doesn't launch the nukes).
Otherwise, it has the value of y.
Or, as the docs put it:
The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.
Notice that it uses the word "true" here, not the value True. This is a bit confusing (even more so if you're talking out loud, or typing in ASCII without formatting…), which is why everyone says "truthy".
So, what do "truthy" and "falsey" mean? [1]
"x is truthy" does not mean x == True, it means bool(x) == True.
"x is falsey" does not mean x == False, it means bool(x) == False.
For all builtin and stdlib types:
False is falsey.
None is falsey.
Numeric zero values are falsey.
Empty containers are falsey.
Everything else is truthy.
Notice that None and empty containers are falsey, but they're not equal to False.
By convention, third-party types (including types that you define [2]) should follow the same rules. But sometimes there are good reasons not to. (For example, a NumPy array with more than one element is neither truthy nor falsey. [3])
This is covered loosely in the same docs section:
In the context of Boolean operations, and also when expressions are used by control flow statements, the following values are interpreted as false: False, None, numeric zero of all types, and empty strings and containers (including strings, tuples, lists, dictionaries, sets and frozensets). All other values are interpreted as true. User-defined objects can customize their truth value by providing a __bool__() method.
The exact details for all builtin types are buried in the standard type hierarchy, which is where you learn things like whether bytes is covered by "strings and containers" (yes) or whether there's anything special about NotImplemented (nope, it's truthy).
So, let's go through your examples:
[1, 2] or ["hello"]
Since [1, 2] is a non-empty container, it's truthy. So this equals [1, 2].
[] or ["hello"]
Since [] is an empty container, it's falsey. So this equals ["hello"].
[] == False
[] may be falsey, but it's not False, or even equal to False. Only numbers equal other numbers, and False is the number 0 in the numeric type bool [4], but [] is not a number. So this is False.
Just be glad you didn't ask about is. :)
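The three expressions above can be checked directly. A small sketch; the variable names are only there for illustration.

```python
non_empty_first = [1, 2] or ["hello"]   # non-empty list is truthy
empty_first = [] or ["hello"]           # empty list is falsey
list_equals_false = [] == False         # falsey, but not equal to False
zero_equals_false = 0 == False          # False is the number 0
bool_is_int = isinstance(False, int)    # bool is a subclass of int

print(non_empty_first, empty_first)
print(list_equals_false, zero_equals_false, bool_is_int)
print(bool([]) == False)
```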
1. Technically, these terms aren't defined, even though everyone, even the core devs, uses them all the time. The official reference defines things in terms of evaluating to true or false as a boolean, and then explains what that means elsewhere.
2. You can control whether your types' values are truthy by defining a __bool__ method—or by defining __len__. The only things you're allowed to do are return True, return False, or raise an exception; if you try to return anything different, it raises a TypeError. So, everything is either truthy, or falsey, or untestable.
3. If you try to check its truthiness, it will raise an exception. That's because NumPy uses boolean arrays widely—e.g., array([1, 2]) < array([2, 1]) is array([True, False]), and you don't want people writing if array([1, 2]) < array([2, 1]):, since whatever they're expecting it to do probably doesn't make sense.
4. Yes, bool is a numeric type—in fact, a subclass of int. This is a little weird when you're first learning, but it turns out to be useful more often than it's dangerous, so it's not just preserved for historic reasons.
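Footnote 2 can be sketched with three toy classes, assuming Python 3 (where the hook is named __bool__). The class names Empty, Sized and Broken are invented for this example.

```python
class Empty:
    """Always falsey, via __bool__."""
    def __bool__(self):
        return False

class Sized:
    """No __bool__, so truthiness falls back to __len__."""
    def __init__(self, n):
        self.n = n
    def __len__(self):
        return self.n

class Broken:
    """Returns an int from __bool__, which is not allowed."""
    def __bool__(self):
        return 1

truthiness = [bool(Empty()), bool(Sized(0)), bool(Sized(3))]

try:
    bool(Broken())
    error = None
except TypeError as exc:
    error = exc

print(truthiness)
print(type(error).__name__)
```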
x or y [or z or z1 or z2 or ...] returns the first Truthy element in sequence, or the last Falsey element if all are Falsey.
x and y [and z and z1 and z2 and ...] returns the first Falsey element in sequence, or the last Truthy element if all are Truthy.
Python has a notion of Truthiness and Falsiness that is separate from True and False. An empty list is not False, but it is Falsey. bool(something_truthy) == True and bool(something_falsey) == False.
Most things are Truthy, so it's easier to list the Falsey things:
0 (note that -1 is Truthy)
None
Empty collections ([], {}, set(), "", etc. Note that non-empty collections containing entirely Falsey elements are still Truthy, e.g. [None, None, None, None])
False
Everything else is Truthy.
In your example: ([1, 2] or ["hello"]) == [1, 2] because the first element, [1, 2], is Truthy (the fact that ["hello"] is also Truthy is irrelevant in this case). Note that ([1, 2] and ["hello"]) == ["hello"].
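A quick sketch of the two chaining rules, with variable names chosen only for this example:

```python
# `or` chain: first Truthy element, or the last element if all are Falsey.
first_truthy = 0 or "" or [] or "x" or 5
last_falsy = 0 or "" or []

# `and` chain: first Falsey element, or the last element if all are Truthy.
first_falsy = 1 and "a" and 0 and "b"
last_truthy = 1 and "a" and [0]

# A non-empty collection is Truthy even if every element is Falsey.
all_none = bool([None, None, None, None])
minus_one = bool(-1)

print(repr(first_truthy), repr(last_falsy))
print(repr(first_falsy), repr(last_truthy))
print(all_none, minus_one)
```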

Python iterator over string using keyword in

I recently came across some python code I don't understand completely.
s = "abcdef"
x = "bde"
it = iter(s)
print(all(c in it for c in x))
I understand that this code checks if x is a subsequence of s. Can someone explain or point me towards an article that explains what's exactly happening at c in it. What is calling the next method of iterator it?
It’s good to start with reading the documentation for the built-in function all():
Return True if all elements of the iterable are true (or if the iterable is empty).
That means that c in it for c in x is a “generator expression”: it produces values. The values it produces are of the boolean expression c in it (see the in operator) for all characters c in string x.
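Here, the in operator is responsible for advancing the iterator: a string iterator defines no __contains__ method, so c in it falls back to iterating, calling next on it until it finds c or runs out of items. Note, however, that the True result depends on order. The iterator it can only move forward, and because x = "bde" contains its letters in the same order as they appear in s = "abcdef", the whole expression works out to the expected result True. Reverse it to x = "edb" and the expression is False, because the search for "e" consumes everything up to and including "e", and the search for "d" then exhausts the iterator.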
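A small sketch that makes the consumption visible. The function name is_subsequence is made up; it simply wraps the expression from the question.

```python
# A membership test on an iterator consumes items up to and including the match.
it = iter("abcdef")
found = "b" in it
remaining = list(it)   # whatever the search did not consume

def is_subsequence(x, s):
    """Hypothetical wrapper around the one-liner from the question."""
    it = iter(s)
    return all(c in it for c in x)

in_order = is_subsequence("bde", "abcdef")
reversed_order = is_subsequence("edb", "abcdef")

print(found, remaining)
print(in_order, reversed_order)
```

Because each search resumes where the previous one stopped, the letters of x must appear in s in the same relative order, which is exactly the subsequence test.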

Python assignment quirk w/ list index assign, dict index assign, and dict.get

In ruby 2.4:
x = ['a']
y = {}
x[0] = y[x[0]] = y.fetch(x[0], y.length)
puts y #=> {"a"=>0}
In python 3.5:
x = ['a']
y = {}
x[0] = y[x[0]] = y.get(x[0], len(y))
print(y) #=> {0: 0}
Why this?
ETA:
y[x[0]] = x[0] = y.get(x[0], len(y))
produces expected behavior (much to my chagrin.)
Ruby and Python are different languages, and make different choices. In Python, assignments are statements, and multiple assignment targets are assigned from left to right. Ruby made other choices; assignments are expressions and as a result are evaluated in the opposite order.
So in Ruby this happens:
Evaluate y.fetch(x[0], y.length), produces 0 (key is missing, y is empty).
Evaluate y[x[0]] = 0, so y['a'] = 0. This is an expression resulting in 0.
Evaluate x[0] = 0 (0 being the result of the y[x[0]] = 0 assignment expression).
Note that in Ruby, an assignment is an expression. It can be nested inside other expressions, and the result of the assignment expression is the value of the target after assignment.
In Python this happens instead:
Evaluate y.get(x[0], len(y)), produces 0 (key is missing, y is empty).
Evaluate x[0] = 0.
Evaluate y[x[0]] = 0, so y[0] = 0.
From the Python assignment statements documentation:
An assignment statement evaluates the expression list (remember that this can be a single expression or a comma-separated list, the latter yielding a tuple) and assigns the single resulting object to each of the target lists, from left to right.
So the expression on the right-hand-side is evaluated first, and then assignment takes place to each target from left to right.
Python made assignments statements on purpose, because the difference between:
if a = 42:
and
if a == 42:
is so damn hard to spot, and, even if intentional, really hurts the readability of code. In Python, readability counts. A lot.
Generally speaking, you really want to avoid making assignments to names that are then also used in subsequent assignments in the same statement. Don't assign to x[0] and then use x[0] again in the same assignment, that's just hugely confusing.
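The Python side can be reproduced and then untangled. This is a sketch of one way to spell out the intended order; the names key, value, x_chained and so on are introduced only for this example.

```python
# Chained form from the question: the right-hand side is evaluated once,
# then x[0] is assigned first, so y is indexed with the NEW x[0].
x_chained = ['a']
y_chained = {}
x_chained[0] = y_chained[x_chained[0]] = y_chained.get(x_chained[0], len(y_chained))

# Explicit form: capture the key before anything is reassigned.
x_explicit = ['a']
y_explicit = {}
key = x_explicit[0]
value = y_explicit.get(key, len(y_explicit))
y_explicit[key] = value
x_explicit[0] = value

print(x_chained, y_chained)
print(x_explicit, y_explicit)
```

The explicit form takes a few more lines but no longer depends on the order in which targets are assigned.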

The differences between the operators "==" and "="

What are the differences between the operators "==" and "="? When would each be used? Why would each be used?
In python and other languages like C,
"=" is an assignment operator and is used to assign a value to a variable.
Example: a = 2  # the value of a is now 2
whereas "==" is a comparison operator and is used to check whether two expressions give the same value. The equality check returns true if they do and false otherwise.
Example:
a = 2; b = 3; c = 2
a == b  # false because 2 is not equal to 3
a == c  # true because 2 is equal to 2
= is used for assignment: e.g.: apple = 'apple'.
It states what is what.
== compares one value to another. Is 5 equal to 5 should be written like this: 5 == 5
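A minimal Python sketch of the difference. It also uses the built-in compile to show, without crashing the script, that Python rejects = where a condition is expected; the names comparisons and assignment_in_if are made up for this example.

```python
a = 2          # assignment: bind the name a to the value 2
b = 3
c = 2

comparisons = (a == b, a == c)   # comparison: each == produces a bool

# In Python, assignment is a statement, so `if a = 42:` does not even compile.
try:
    compile("if a = 42:\n    pass\n", "<demo>", "exec")
    assignment_in_if = "accepted"
except SyntaxError:
    assignment_in_if = "SyntaxError"

print(comparisons)
print(assignment_in_if)
```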
== is the equality operator: an == expression evaluates to true when the values of its two operands are equal, which makes the condition or statement true.
= is the assignment operator: it assigns a value to a variable, an array element, or an object.
Both operators are very important, and they work in different ways. Note that == is based on the values of objects, not on their identity.
For example, two cars from the same company with the same features and the same looks compare equal with ==, even though they are two separate cars.
In that case the statement or condition is true.
Use = to give a variable a value. Use == to evaluate whether both sides of the operator are the same: if they are not, the expression is false, and if they are, it is true.
