What's the effect of `pass` in Python debug mode - python

I just found this phenomenon by coincidence.
mylist = [('1',), ('2',), ('3',), ('4',)]
for l in mylist:
print(l)
pass # first pass
pass # second pass
print("end")
If I set the red stop point at the first pass and debug, the program will stop here and the output is:
('1',)
However, if I set the red stop point at the second pass and debug, the output include the end in the last line. It seems like the pass avoid stopping at this point and just let the program run further.
I thought pass should have no real meaning, but it seems not. So how can understand the pass?
Thank you all

pass is just syntactic sugar for the parser to know that a statement is intentionally left empty. It does not generate an opcode, and thus, the debugger can't pause when it gets hit. Instead you're seeing it halt when the next instruction is executed.
You can see this by printing the opcodes generated by an empty function:
>>> def test():
... pass
...
>>> import dis
>>> dis.dis(test)
2 0 LOAD_CONST 0 (None)
3 RETURN_VALUE

pass doesn't do anything. It compiles to no bytecode. However, the bytecode to jump back to the start of the loop is associated with the line of the last statement in the loop, and pass counts. Here's what it looks like if we decompile it, on Python 3.7.3:
import dis
dis.dis(r'''mylist = [('1',), ('2',), ('3',), ('4',)]
for l in mylist:
print(l)
pass # first pass
pass # second pass
print("end")''')
Output:
1 0 LOAD_CONST 0 (('1',))
2 LOAD_CONST 1 (('2',))
4 LOAD_CONST 2 (('3',))
6 LOAD_CONST 3 (('4',))
8 BUILD_LIST 4
10 STORE_NAME 0 (mylist)
2 12 SETUP_LOOP 20 (to 34)
14 LOAD_NAME 0 (mylist)
16 GET_ITER
>> 18 FOR_ITER 12 (to 32)
20 STORE_NAME 1 (l)
3 22 LOAD_NAME 2 (print)
24 LOAD_NAME 1 (l)
26 CALL_FUNCTION 1
28 POP_TOP
4 30 JUMP_ABSOLUTE 18
>> 32 POP_BLOCK
6 >> 34 LOAD_NAME 2 (print)
36 LOAD_CONST 4 ('end')
38 CALL_FUNCTION 1
40 POP_TOP
42 LOAD_CONST 5 (None)
44 RETURN_VALUE
The JUMP_ABSOLUTE and POP_BLOCK get associated with line 4, the first pass.
When you set a breakpoint on the first pass, Python breaks before the JUMP_ABSOLUTE. When you set a breakpoint on the second pass, no bytecode is associated with line 5, so Python breaks on line 6, which does have bytecode.

pass is just a null operator, if your looking to exit the for loop, you need to use break. The reason you see the end of the output from mylist at the second pass is that the first pass just continues the for loop.

Related

Is it possible to call a function from within a list comprehension without the overhead of calling the function?

In this trivial example, I want to factor out the i < 5 condition of a list comprehension into it's own function. I also want to eat my cake and have it too, and avoid the overhead of the CALL_FUNCTION bytecode/creating a new frame in the python virtual machine.
Is there any way to factor out the conditions inside of a list comprehension into a new function but somehow get a disassembled result that avoids the large overhead of CALL_FUNCTION?
import dis
import sys
import timeit
def my_filter(n):
return n < 5
def a():
# list comprehension with function call
return [i for i in range(10) if my_filter(i)]
def b():
# list comprehension without function call
return [i for i in range(10) if i < 5]
assert a() == b()
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)
>>> timeit.timeit(a)
1.2616060493517098
>>> timeit.timeit(b)
0.685117881097812
>>> dis.dis(a)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F4890B660, file "<stdin>", line 3>)
# ...
>>> dis.dis(b)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F48A42270, file "<stdin>", line 3>)
# ...
# list comprehension with function call
# big overhead with that CALL_FUNCTION at address 12
>>> dis.dis(a.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_GLOBAL 0 (my_filter)
10 LOAD_FAST 1 (i)
12 CALL_FUNCTION 1
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
# list comprehension without function call
>>> dis.dis(b.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
I'm willing to take a hacky solution that I would never use in production, like somehow replacing the bytecode at run time.
In other words, is it possible to replace a's addresses 8, 10, and 12 with b's 8, 10, and 12 at runtime?
Consolidating all of the excellent answers in the comments into one.
As georg says, this sounds like you are looking for a way to inline a function or an expression, and there is no such thing in CPython attempts have been made: https://bugs.python.org/issue10399
Therefore, along the lines of "metaprogramming", you can build the lambda's inline and eval:
from typing import Callable
import dis
def b():
# list comprehension without function call
return [i for i in range(10) if i < 5]
def gen_list_comprehension(expr: str) -> Callable:
return eval(f"lambda: [i for i in range(10) if {expr}]")
a = gen_list_comprehension("i < 5")
dis.dis(a.__code__.co_consts[1])
print("=" * 10)
dis.dis(b.__code__.co_consts[1])
which when run under 3.7.6 gives:
6 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
==========
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
From a security standpoint "eval" is dangerous, athough here it is less so because what you can do inside a lambda. And what can be done in an IfExp expression is even more limited, but still dangerous like call a function that does evil things.
However, if you want the same effect that is more secure, instead of working with strings you can modify AST's. I find that a lot more cumbersome though.
A hybrid approach would be the call ast.parse() and check the result. For example:
import ast
def is_cond_str(s: str) -> bool:
try:
mod_ast = ast.parse(s)
expr_ast = isinstance(mod_ast.body[0])
if not isinstance(expr_ast, ast.Expr):
return False
compare_ast = expr_ast.value
if not isinstance(compare_ast, ast.Compare):
return False
return True
except:
return False
This is a little more secure, but there still may be evil functions in the condition so you could keep going. Again, I find this a little tedious.
Coming from the other direction of starting off with bytecode, there is my cross-version assembler; see https://pypi.org/project/xasm/

PyCharm not hitting Quick and Dirty breakpoint on "pass"

I want to add a quick & dirty breakpoint, e.g when I am interested in stopping in the middle of iterating a long list.
for item in list:
if item == 'curry':
pass
I put a breakpoint on pass, and it is not hit(!).
If I add a following (empty) print
for item in list:
if item = 'curry':
pass
print('')
and breakpoint both pass and print, only print is hit.
Any idea why? Windows 7, (portable) Python 3.7
[Update] as per the comment form #Adam.Er8 I tried inserting and breakpointing the ellipsis literal, ... but that was not hit, although the following print('') was.
[Updtae++] Hmm, it does hit a breakpoint on the pass in
for key, value in dictionary.items():
pass
The pass doesn't actually make it into the bytecode. The code is exactly the same as if it wasn't there. You can see this using the dis module. (examples using 3.7 on linux).
>>> import dis
>>> dis.dis(dis.dis('for i in a:\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
2 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
>>> dis.dis('for i in a:\n\tpass\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
3 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
What the bytecode is doing isn't as relevant as the fact both blocks are identical. the pass is just ignored so there is nothing for the debugger to break on.
try replacing pass with ...:
for item in list:
if item = 'curry':
...
you should be able to break-point there
this is called the ellipsis literal, unlike pass it is actually "executed" (well, sort of), and this is why you can break on it, like you would on any other statement, but it has 0 side effects and reads like "nothing" (before discovering this trick I'd just write _ = 0)
EDIT:
you can just set a conditional breakpoint.
In PyCharm this is done by right-clicking the bp and writing the condition:

python `for i in iter` vs `while True; i = next(iter)`

To my understanding, both these approach work for operating on every item in a generator:
let i be our operator target
let my_iter be our generator
let callable do_something_with return None
While Loop + StopIteratioon
try:
while True:
i = next(my_iter)
do_something_with(i)
except StopIteration:
pass
For loop / list comprehension
for i in my_iter:
do_something_with(i)
[do_something_with(i) for i in my_iter]
Minor Edit: print(i) replaced with do_something_with(i) as suggested by #kojiro to disambiguate a use case with the interpreter mechanics.
As far as I am aware, these are both applicable ways to iterate over a generator, Is there any reason to prefer one over the other?
Right now the for loop is looking superior to me. Due to: less lines/clutter and readability in general, plus single indent.
I really only see the while approach being advantages if you want to handily break the loop on particular exceptions.
the third option is definitively NOT the same as the first two. the third example creates a list, one each for the return value of print(i), which happens to be None, so not a very interesting list.
the first two are semantically similar. There is a minor, technical difference; the while loop, as presented, does not work if my_iter is not, in fact an iterator (ie, has a __next__() method); for instance, if it's a list. The for loop works for all iterables (has an __iter__() method) in addition to iterators.
The correct version is thus:
my_iter = iter(my_iterable)
try:
while True:
i = next(my_iter)
print(i)
except StopIteration:
pass
Now, aside from readability reasons, there in fact is a technical reason you should prefer the for loop; there is a penalty you pay (in CPython, anyhow) for the number of bytecodes executed in tight inner loops. lets compare:
In [1]: def forloop(my_iter):
...: for i in my_iter:
...: print(i)
...:
In [57]: dis.dis(forloop)
2 0 SETUP_LOOP 24 (to 27)
3 LOAD_FAST 0 (my_iter)
6 GET_ITER
>> 7 FOR_ITER 16 (to 26)
10 STORE_FAST 1 (i)
3 13 LOAD_GLOBAL 0 (print)
16 LOAD_FAST 1 (i)
19 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
22 POP_TOP
23 JUMP_ABSOLUTE 7
>> 26 POP_BLOCK
>> 27 LOAD_CONST 0 (None)
30 RETURN_VALUE
7 bytecodes called in inner loop vs:
In [55]: def whileloop(my_iterable):
....: my_iter = iter(my_iterable)
....: try:
....: while True:
....: i = next(my_iter)
....: print(i)
....: except StopIteration:
....: pass
....:
In [56]: dis.dis(whileloop)
2 0 LOAD_GLOBAL 0 (iter)
3 LOAD_FAST 0 (my_iterable)
6 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
9 STORE_FAST 1 (my_iter)
3 12 SETUP_EXCEPT 32 (to 47)
4 15 SETUP_LOOP 25 (to 43)
5 >> 18 LOAD_GLOBAL 1 (next)
21 LOAD_FAST 1 (my_iter)
24 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
27 STORE_FAST 2 (i)
6 30 LOAD_GLOBAL 2 (print)
33 LOAD_FAST 2 (i)
36 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
39 POP_TOP
40 JUMP_ABSOLUTE 18
>> 43 POP_BLOCK
44 JUMP_FORWARD 18 (to 65)
7 >> 47 DUP_TOP
48 LOAD_GLOBAL 3 (StopIteration)
51 COMPARE_OP 10 (exception match)
54 POP_JUMP_IF_FALSE 64
57 POP_TOP
58 POP_TOP
59 POP_TOP
8 60 POP_EXCEPT
61 JUMP_FORWARD 1 (to 65)
>> 64 END_FINALLY
>> 65 LOAD_CONST 0 (None)
68 RETURN_VALUE
9 Bytecodes in the inner loop.
We can actually do even better, though.
In [58]: from collections import deque
In [59]: def deqloop(my_iter):
....: deque(map(print, my_iter), 0)
....:
In [61]: dis.dis(deqloop)
2 0 LOAD_GLOBAL 0 (deque)
3 LOAD_GLOBAL 1 (map)
6 LOAD_GLOBAL 2 (print)
9 LOAD_FAST 0 (my_iter)
12 CALL_FUNCTION 2 (2 positional, 0 keyword pair)
15 LOAD_CONST 1 (0)
18 CALL_FUNCTION 2 (2 positional, 0 keyword pair)
21 POP_TOP
22 LOAD_CONST 0 (None)
25 RETURN_VALUE
everything happens in C, collections.deque, map and print are all builtins. (for cpython) so in this case, there are no bytecodes executed for looping. This is only a useful optimization when the iteration step is a c function (as is the case for print. Otherwise, the overhead of a python function call is larger than the JUMP_ABSOLUTE overhead.
The for loop is the most pythonic. Note that you can break out of for loops as well as while loops.
Don't use the list comprehension unless you need the resulting list, otherwise you are needlessly storing all the elements. Your example list comprehension will only work with the print function in Python 3, it won't work with the print statement in Python 2.
I would agree with you that the for loop is superior. As you mentioned it is less clutter and it is a lot easier to read. Programmers like to keep things as simple as possible and the for loop does that. It is also better for novice Python programmers who might not have learned try/except. Also, as Alasdair mentioned, you can break out of for loops. Also the while loop runs an error if you are using a list unless you use iter() on my_iter first.

Lambda function behavior with and without keyword arguments

I am using lambda functions for GUI programming with tkinter.
Recently I got stuck when implementing buttons that open files:
self.file=""
button = Button(conf_f, text="Tools opt.",
command=lambda: tktb.helpers.openfile(self.file))
As you see, I want to define a file path that can be updated, and that is not known when creating the GUI.
The issue I had is that earlier my code was :
button = Button(conf_f, text="Tools opt.",
command=lambda f=self.file: tktb.helpers.openfile(f))
The lambda function had a keyword argument to pass the file path. In this case, the parameter f was not updated when self.file was.
I got the keyword argument from a code snippet and I use it everywhere. Obviously I shouldn't...
This is still not clear to me... Could someone explain me the difference between the two lambda forms and when to use one an another?
PS: The following comment led me to the solution but I'd like a little more explanations:
lambda working oddly with tkinter
I'll try to explain it more in depth.
If you do
i = 0
f = lambda: i
you create a function (lambda is essentially a function) which accesses its enclosing scope's i variable.
Internally, it does so by having a so-called closure which contains the i. It is, loosely spoken, a kind of pointer to the real variable which can hold different values at different points of time.
def a():
# first, yield a function to access i
yield lambda: i
# now, set i to different values successively
for i in range(100): yield
g = a() # create generator
f = next(g) # get the function
f() # -> error as i is not set yet
next(g)
f() # -> 0
next(g)
f() # -> 1
# and so on
f.func_closure # -> an object stemming from the local scope of a()
f.func_closure[0].cell_contents # -> the current value of this variable
Here, all values of i are - at their time - stored in that said closure. If the function f() needs them. it gets them from there.
You can see that difference on the disassembly listings:
These said functions a() and f() disassemble like this:
>>> dis.dis(a)
2 0 LOAD_CLOSURE 0 (i)
3 BUILD_TUPLE 1
6 LOAD_CONST 1 (<code object <lambda> at 0xb72ea650, file "<stdin>", line 2>)
9 MAKE_CLOSURE 0
12 YIELD_VALUE
13 POP_TOP
3 14 SETUP_LOOP 25 (to 42)
17 LOAD_GLOBAL 0 (range)
20 LOAD_CONST 2 (100)
23 CALL_FUNCTION 1
26 GET_ITER
>> 27 FOR_ITER 11 (to 41)
30 STORE_DEREF 0 (i)
33 LOAD_CONST 0 (None)
36 YIELD_VALUE
37 POP_TOP
38 JUMP_ABSOLUTE 27
>> 41 POP_BLOCK
>> 42 LOAD_CONST 0 (None)
45 RETURN_VALUE
>>> dis.dis(f)
2 0 LOAD_DEREF 0 (i)
3 RETURN_VALUE
Compare that to a function b() which looks like
>>> def b():
... for i in range(100): yield
>>> dis.dis(b)
2 0 SETUP_LOOP 25 (to 28)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (100)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 11 (to 27)
16 STORE_FAST 0 (i)
19 LOAD_CONST 0 (None)
22 YIELD_VALUE
23 POP_TOP
24 JUMP_ABSOLUTE 13
>> 27 POP_BLOCK
>> 28 LOAD_CONST 0 (None)
31 RETURN_VALUE
The main difference in the loop is
>> 13 FOR_ITER 11 (to 27)
16 STORE_FAST 0 (i)
in b() vs.
>> 27 FOR_ITER 11 (to 41)
30 STORE_DEREF 0 (i)
in a(): the STORE_DEREF stores in a cell object (closure), while STORE_FAST uses a "normal" variable, which (probably) works a little bit faster.
The lambda as well makes a difference:
>>> dis.dis(lambda: i)
1 0 LOAD_GLOBAL 0 (i)
3 RETURN_VALUE
Here you have a LOAD_GLOBAL, while the one above uses LOAD_DEREF. The latter, as well, is for the closure.
I completely forgot about lambda i=i: i.
If you have the value as a default parameter, it finds its way into the function via a completely different path: the current value of i gets passed to the just created function via a default parameter:
>>> i = 42
>>> f = lambda i=i: i
>>> dis.dis(f)
1 0 LOAD_FAST 0 (i)
3 RETURN_VALUE
This way the function gets called as f(). It detects that there is a missing argument and fills the respective parameter with the default value. All this happens before the function is called; from within the function you just see that the value is taken and returned.
And there is yet another way to accomplish your task: Just use the lambda as if it would take a value: lambda i: i. If you call this, it complains about a missing argument.
But you can cope with that with the use of functools.partial:
ff = [functools.partial(lambda i: i, x) for x in range(100)]
ff[12]()
ff[54]()
This wrapper gets a callable and a number of arguments to be passed. The resulting object is a callable which calls the original callable with these arguments plus any arguments you give to it. It can be used here to keep locked to the value intended.

Checking for __debug__ and some other condition in Python

In Python sometimes I want to do something like (1)
if __debug__ and verbose: print "whatever"
If Python is run with -O, then I'd like for that whole piece of code to disappear, as it would if I just had (2)
if __debug__: print "whatever"
or even (3)
if __debug__:
if verbose: print foo
However, that doesn't seem to happen (see below). Is there a way I can get the run-time efficiency of #3 with compact code more like #1?
Here's how I tested that I'm not getting the efficient code I want:
#!/usr/bin/python2.7
from dis import dis
import sys
cmds = ["""
def func ():
if __debug__ and 1+1: sys.stdout.write('spam')""", """
def func():
if __debug__: sys.stdout.write('ham')""", """
def func():
__debug__ and sys.stdout.write('eggs')"""]
print "__debug__ is", __debug__, "\n\n\n"
for cmd in cmds:
print "*"*80, "\nSource of {}\n\ncompiles to:".format(cmd)
exec(cmd)
dis(func)
print "\n"*4
Running this gives
__debug__ is False
********************************************************************************
Source of
def func ():
if __debug__ and 1+1: sys.stdout.write('spam')
compiles to:
3 0 LOAD_GLOBAL 0 (__debug__)
3 POP_JUMP_IF_FALSE 31
6 LOAD_CONST 3 (2)
9 POP_JUMP_IF_FALSE 31
12 LOAD_GLOBAL 1 (sys)
15 LOAD_ATTR 2 (stdout)
18 LOAD_ATTR 3 (write)
21 LOAD_CONST 2 ('spam')
24 CALL_FUNCTION 1
27 POP_TOP
28 JUMP_FORWARD 0 (to 31)
>> 31 LOAD_CONST 0 (None)
34 RETURN_VALUE
********************************************************************************
Source of
def func():
if __debug__: sys.stdout.write('ham')
compiles to:
3 0 LOAD_CONST 0 (None)
3 RETURN_VALUE
********************************************************************************
Source of
def func():
__debug__ and sys.stdout.write('eggs')
compiles to:
3 0 LOAD_GLOBAL 0 (__debug__)
3 JUMP_IF_FALSE_OR_POP 21
6 LOAD_GLOBAL 1 (sys)
9 LOAD_ATTR 2 (stdout)
12 LOAD_ATTR 3 (write)
15 LOAD_CONST 1 ('eggs')
18 CALL_FUNCTION 1
>> 21 POP_TOP
22 LOAD_CONST 0 (None)
25 RETURN_VALUE
No, you can't. Python's compiler is not nearly smart enough to detect in what cases it could remove the code block and if statement.
Python would have to do a whole lot of logic inference otherwise. Compare:
if __debug__ or verbose:
with
if __debug__ and verbose:
for example. Python would have to detect the difference between these two expressions at compile time; one can be optimised away, the other cannot.
Note that the difference in runtime between code with and without if __debug__ statements is truly minute, everything else being equal. A small constant value test and jump is not anything to fuss about, really.

Categories

Resources