Python operator precedence - and vs greater than - python

I have a line of code in my script that has both these operators chained together. From the documentation reference BOOLEAN AND has a lower precedence than COMPARISON GREATER THAN. I am getting unexpected results here in this code:
>>> def test(msg, value):
... print(msg)
... return value
>>> test("First", 10) and test("Second", 15) > test("Third", 5)
First
Second
Third
True
I was expecting Second or Third test to happen before the fist one, since > operator has a higher precedence. What am I doing wrong here?
https://docs.python.org/3/reference/expressions.html#operator-precedence

Because you are looking at the wrong thing. call (or function call) takes higher precendence over both and as well as > (greater than) . So first function calls occur from left to right.
Python will get the results for all function calls before either comparison happens. The only thing that takes precendence over here would be short circuiting , so if test("First",10) returned False, it would short circuit and return False.
The comparisons and and still occur in the same precendence , that is first the result of test("Second", 15) is compared against test("Third", 5) (please note only the return values (the function call already occured before)) . Then the result of test("Second", 15) > test("Third", 5) is used in the and operation.
From the documentation on operator precedence -

One way to see what's happening is to look at exactly how Python is interpreting this result:
>>> x = lambda: test("First", 10) and test("Second", 15) > test("Third", 5)
>>> dis.dis(x)
1 0 LOAD_GLOBAL 0 (test)
3 LOAD_CONST 1 ('First')
6 LOAD_CONST 2 (10)
9 CALL_FUNCTION 2
12 JUMP_IF_FALSE_OR_POP 42
15 LOAD_GLOBAL 0 (test)
18 LOAD_CONST 3 ('Second')
21 LOAD_CONST 4 (15)
24 CALL_FUNCTION 2
27 LOAD_GLOBAL 0 (test)
30 LOAD_CONST 5 ('Third')
33 LOAD_CONST 6 (5)
36 CALL_FUNCTION 2
39 COMPARE_OP 4 (>)
>> 42 RETURN_VALUE
If you do the same for 10 and 15 > 5, you get:
>>> x = lambda: 10 and 15 > 5
>>> dis.dis(x)
1 0 LOAD_CONST 1 (10)
3 JUMP_IF_FALSE_OR_POP 15
6 LOAD_CONST 2 (15)
9 LOAD_CONST 3 (5)
12 COMPARE_OP 4 (>)
>> 15 RETURN_VALUE

Related

Is it possible to call a function from within a list comprehension without the overhead of calling the function?

In this trivial example, I want to factor out the i < 5 condition of a list comprehension into it's own function. I also want to eat my cake and have it too, and avoid the overhead of the CALL_FUNCTION bytecode/creating a new frame in the python virtual machine.
Is there any way to factor out the conditions inside of a list comprehension into a new function but somehow get a disassembled result that avoids the large overhead of CALL_FUNCTION?
import dis
import sys
import timeit
def my_filter(n):
return n < 5
def a():
# list comprehension with function call
return [i for i in range(10) if my_filter(i)]
def b():
# list comprehension without function call
return [i for i in range(10) if i < 5]
assert a() == b()
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)
>>> timeit.timeit(a)
1.2616060493517098
>>> timeit.timeit(b)
0.685117881097812
>>> dis.dis(a)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F4890B660, file "<stdin>", line 3>)
# ...
>>> dis.dis(b)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F48A42270, file "<stdin>", line 3>)
# ...
# list comprehension with function call
# big overhead with that CALL_FUNCTION at address 12
>>> dis.dis(a.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_GLOBAL 0 (my_filter)
10 LOAD_FAST 1 (i)
12 CALL_FUNCTION 1
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
# list comprehension without function call
>>> dis.dis(b.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
I'm willing to take a hacky solution that I would never use in production, like somehow replacing the bytecode at run time.
In other words, is it possible to replace a's addresses 8, 10, and 12 with b's 8, 10, and 12 at runtime?
Consolidating all of the excellent answers in the comments into one.
As georg says, this sounds like you are looking for a way to inline a function or an expression, and there is no such thing in CPython attempts have been made: https://bugs.python.org/issue10399
Therefore, along the lines of "metaprogramming", you can build the lambda's inline and eval:
from typing import Callable
import dis
def b():
# list comprehension without function call
return [i for i in range(10) if i < 5]
def gen_list_comprehension(expr: str) -> Callable:
return eval(f"lambda: [i for i in range(10) if {expr}]")
a = gen_list_comprehension("i < 5")
dis.dis(a.__code__.co_consts[1])
print("=" * 10)
dis.dis(b.__code__.co_consts[1])
which when run under 3.7.6 gives:
6 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
==========
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
From a security standpoint "eval" is dangerous, athough here it is less so because what you can do inside a lambda. And what can be done in an IfExp expression is even more limited, but still dangerous like call a function that does evil things.
However, if you want the same effect that is more secure, instead of working with strings you can modify AST's. I find that a lot more cumbersome though.
A hybrid approach would be the call ast.parse() and check the result. For example:
import ast
def is_cond_str(s: str) -> bool:
try:
mod_ast = ast.parse(s)
expr_ast = isinstance(mod_ast.body[0])
if not isinstance(expr_ast, ast.Expr):
return False
compare_ast = expr_ast.value
if not isinstance(compare_ast, ast.Compare):
return False
return True
except:
return False
This is a little more secure, but there still may be evil functions in the condition so you could keep going. Again, I find this a little tedious.
Coming from the other direction of starting off with bytecode, there is my cross-version assembler; see https://pypi.org/project/xasm/

PyCharm not hitting Quick and Dirty breakpoint on "pass"

I want to add a quick & dirty breakpoint, e.g when I am interested in stopping in the middle of iterating a long list.
for item in list:
if item == 'curry':
pass
I put a breakpoint on pass, and it is not hit(!).
If I add a following (empty) print
for item in list:
if item = 'curry':
pass
print('')
and breakpoint both pass and print, only print is hit.
Any idea why? Windows 7, (portable) Python 3.7
[Update] as per the comment form #Adam.Er8 I tried inserting and breakpointing the ellipsis literal, ... but that was not hit, although the following print('') was.
[Updtae++] Hmm, it does hit a breakpoint on the pass in
for key, value in dictionary.items():
pass
The pass doesn't actually make it into the bytecode. The code is exactly the same as if it wasn't there. You can see this using the dis module. (examples using 3.7 on linux).
>>> import dis
>>> dis.dis(dis.dis('for i in a:\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
2 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
>>> dis.dis('for i in a:\n\tpass\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
3 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
What the bytecode is doing isn't as relevant as the fact both blocks are identical. the pass is just ignored so there is nothing for the debugger to break on.
try replacing pass with ...:
for item in list:
if item = 'curry':
...
you should be able to break-point there
this is called the ellipsis literal, unlike pass it is actually "executed" (well, sort of), and this is why you can break on it, like you would on any other statement, but it has 0 side effects and reads like "nothing" (before discovering this trick I'd just write _ = 0)
EDIT:
you can just set a conditional breakpoint.
In PyCharm this is done by right-clicking the bp and writing the condition:

Can you simplify chained comparisons with not equal or equal in Python? [duplicate]

This question already has answers here:
Simplify Chained Comparison
(2 answers)
How do chained comparisons in Python actually work?
(1 answer)
Closed 4 years ago.
Probably some of you might think this is duplicate, and yes, I found a lot of examples similar to this:
But Pycharm says that I could simply this:
if y > x and x != -1:
# do something
And I did some searching and could not find something similar.
My question is it correct to simplify this function like this:
if y > x != -1:
# do something
And if so, is it safe and is there is any difference at all between this version and not simplified version except that it's shorter?
If it is not correct way to simplify it, what is then?
this is functionnaly equivalent, but when this:
10 < x < 40
is nice to read, mixing different operators to use chained comparisons isn't the best choice.
Is that really the same? let's disassemble to find out:
def f1(x,y):
if y > x and x != -1:
return 0
def f2(x,y):
if y > x != -1:
return 0
import dis
print("function 1")
dis.dis(f1)
print("function 2")
dis.dis(f2)
result:
function 1
2 0 LOAD_FAST 1 (y)
3 LOAD_FAST 0 (x)
6 COMPARE_OP 4 (>)
9 POP_JUMP_IF_FALSE 28
12 LOAD_FAST 0 (x)
15 LOAD_CONST 3 (-1)
18 COMPARE_OP 3 (!=)
21 POP_JUMP_IF_FALSE 28
3 24 LOAD_CONST 2 (0)
27 RETURN_VALUE
>> 28 LOAD_CONST 0 (None)
31 RETURN_VALUE
function 2
6 0 LOAD_FAST 1 (y)
3 LOAD_FAST 0 (x)
6 DUP_TOP
7 ROT_THREE
8 COMPARE_OP 4 (>)
11 JUMP_IF_FALSE_OR_POP 23
14 LOAD_CONST 3 (-1)
17 COMPARE_OP 3 (!=)
20 JUMP_FORWARD 2 (to 25)
>> 23 ROT_TWO
24 POP_TOP
>> 25 POP_JUMP_IF_FALSE 32
7 28 LOAD_CONST 2 (0)
31 RETURN_VALUE
>> 32 LOAD_CONST 0 (None)
35 RETURN_VALUE
>>>
Surprisingly they aren't the same, and the chained version has more instructions.
Not sure what's going on here (some took some more time to explain it better: How do chained comparisons in Python actually work?), but really I'd stick to the and version which shortcuts and is so much readable (think of future maintainers too...).
That said, one interesting thing about chained comparisons would be if the central argument is computed/takes a long time to compute/has a side effect in the computation and you don't want to store it in a variable:
if y > super_long_computation(x) != -1:
In that case, the central argument is only evaluated once. In the case of and, you'd have to store it beforehand.

What's the difference between in-place assignment and assignment using the variable's name again?

In Python, while assigning a value to a variable, we can either do:
variable = variable + 20
or
variable += 20.
While I do understand that both the operations are semantically same, i.e., they achieve the same goal of increasing the previous value of variable by 20, I was wondering if there are subtle run-time performance differences between the two, or any other slight differences which might deem one better than the other.
Is there any such difference, or are they exactly the same?
If there is any difference, is it the same for other languages such as C++?
Thanks.
Perhaps this can help you understand better:
import dis
def a():
x = 0
x += 20
return x
def b():
x = 0
x = x + 20
return x
print 'In place add'
dis.dis(a)
print 'Binary add'
dis.dis(b)
We get the following outputs:
In place add
4 0 LOAD_CONST 1 (0)
3 STORE_FAST 0 (x)
5 6 LOAD_FAST 0 (x)
9 LOAD_CONST 2 (20)
12 INPLACE_ADD
13 STORE_FAST 0 (x)
6 16 LOAD_FAST 0 (x)
19 RETURN_VALUE
Binary add
9 0 LOAD_CONST 1 (0)
3 STORE_FAST 0 (x)
10 6 LOAD_FAST 0 (x)
9 LOAD_CONST 2 (20)
12 BINARY_ADD
13 STORE_FAST 0 (x)
11 16 LOAD_FAST 0 (x)
19 RETURN_VALUE
You could do a loop a thousand or so times using a timer to compare perfomance, but the main difference is that one. I suppose binary add should be faster tho.

Why does removing the else slow down my code?

Consider the following functions:
def fact1(n):
if n < 2:
return 1
else:
return n * fact1(n-1)
def fact2(n):
if n < 2:
return 1
return n * fact2(n-1)
They should be equivalent. But there's a performance difference:
>>> T(lambda : fact1(1)).repeat(number=10000000)
[2.5754408836364746, 2.5710129737854004, 2.5678811073303223]
>>> T(lambda : fact2(1)).repeat(number=10000000)
[2.8432059288024902, 2.834425926208496, 2.8364310264587402]
The version without the else is 10% slower. This is pretty significant. Why?
What is happening here is that fact2 has a hash conflict with __name__ in your module globals. That makes the lookup of the global fact2 ever so slightly slower.
>>> [(k, hash(k) % 32) for k in globals().keys() ]
[('__builtins__', 8), ('__package__', 15), ('fact2', 25), ('__name__', 25), ('fact1', 26), ('__doc__', 29)]
i.e. The same answer as for Why is early return slower than else? except that there the hash conflict was with __builtins__
For me, they are virtually the same speed: (Python 2.6.6 on Debian)
In [4]: %timeit fact1(1)
10000000 loops, best of 3: 151 ns per loop
In [5]: %timeit fact2(1)
10000000 loops, best of 3: 154 ns per loop
The byte code is also very similar:
In [6]: dis.dis(fact1)
2 0 LOAD_FAST 0 (n)
3 LOAD_CONST 1 (2)
6 COMPARE_OP 0 (<)
9 JUMP_IF_FALSE 5 (to 17)
12 POP_TOP
3 13 LOAD_CONST 2 (1)
16 RETURN_VALUE
>> 17 POP_TOP
5 18 LOAD_FAST 0 (n)
21 LOAD_GLOBAL 0 (fact)
24 LOAD_FAST 0 (n)
27 LOAD_CONST 2 (1)
30 BINARY_SUBTRACT
31 CALL_FUNCTION 1
34 BINARY_MULTIPLY
35 RETURN_VALUE
36 LOAD_CONST 0 (None)
39 RETURN_VALUE
In [7]: dis.dis(fact2)
2 0 LOAD_FAST 0 (n)
3 LOAD_CONST 1 (2)
6 COMPARE_OP 0 (<)
9 JUMP_IF_FALSE 5 (to 17)
12 POP_TOP
3 13 LOAD_CONST 2 (1)
16 RETURN_VALUE
>> 17 POP_TOP
4 18 LOAD_FAST 0 (n)
21 LOAD_GLOBAL 0 (fact)
24 LOAD_FAST 0 (n)
27 LOAD_CONST 2 (1)
30 BINARY_SUBTRACT
31 CALL_FUNCTION 1
34 BINARY_MULTIPLY
35 RETURN_VALUE
The only difference is that the version with the else includes code to return None in case control reaches the end of the function body.
I question the timings. The two functions aren't recursing to themselves. fact1 and fact2 both call fact which isn't shown.
Once that is fixed, the disassembly (in both Py2.6 and Py2.7) shows that both are running the same op codes except for the name of the recursed into function. The choice of name trigger a small difference in timings because fact1 may insert in the module dictionary with no name collisions while *fact2) may have a hash value that collides with another name in the module.
In other words, any differences you see in timings are not due to the choice of whether the else-clause is present :-)

Categories

Resources