How to cleanly keep below 80-char width with long strings? - python

I'm attempting to keep my code to 80 chars or less nowadays as I think it looks more aesthetically pleasing, for the most part. Sometimes, though, the code ends up looking worse if I have to put line breaks in weird places.
One thing I haven't figured out how to handle very nicely yet is long strings. For example:
#0.........1........2........3........4.........5.........6.........7.........8xxxxxxxxx9xxxxxx
def foo():
    if conditional():
        logger.info("<Conditional's meaning> happened, so we're not setting up the interface.")
        return
#.....
It's over! Putting it on the next line won't help either:
#0.........1........2........3........4.........5.........6.........7.........8xxxxxxxxx9xxxxxx
def foo():
    if conditional():
        logger.info(
            "<Conditional's meaning> happened, so we're not setting up the interface.")
        return
#.....
I could use line breaks but that looks awful:
#0.........1........2........3........4.........5.........6.........7.........8
def foo():
    if conditional():
        logger.info(
            "<Conditional's meaning> happened, so we're not setting \
up the interface.")
        return
#.....
What to do? Shortening the string is one option but I don't want the readability of my messages to be affected by something as arbitrary as how many indentation levels the code happened to have at that point.

You can split the string into two:
def foo():
    if conditional():
        logger.info("<Conditional's meaning> happened, so we're not "
                    "setting up the interface.")
Multiple consecutive string literals within the same expression are automatically concatenated into one at compile time:
>>> def foo():
...     if conditional():
...         logger.info("<Conditional's meaning> happened, so we're not "
...                     "setting up the interface.")
...
>>> import dis
>>> dis.dis(foo)
2 0 LOAD_GLOBAL 0 (conditional)
3 CALL_FUNCTION 0
6 POP_JUMP_IF_FALSE 25
3 9 LOAD_GLOBAL 1 (logger)
12 LOAD_ATTR 2 (info)
15 LOAD_CONST 1 ("<Conditional's meaning> happened, so we're not setting up the interface.")
18 CALL_FUNCTION 1
21 POP_TOP
22 JUMP_FORWARD 0 (to 25)
>> 25 LOAD_CONST 0 (None)
28 RETURN_VALUE
Note the LOAD_CONST for line 3: the bytecode for the function contains one string, already concatenated.
If you were to add a + to the expression, two separate constants are created:
>>> def foo():
...     if conditional():
...         logger.info("<Conditional's meaning> happened, so we're not " +
...                     "setting up the interface.")
...
>>> dis.dis(foo)
2 0 LOAD_GLOBAL 0 (conditional)
3 CALL_FUNCTION 0
6 POP_JUMP_IF_FALSE 29
3 9 LOAD_GLOBAL 1 (logger)
12 LOAD_ATTR 2 (info)
15 LOAD_CONST 1 ("<Conditional's meaning> happened, so we're not ")
4 18 LOAD_CONST 2 ('setting up the interface.')
21 BINARY_ADD
22 CALL_FUNCTION 1
25 POP_TOP
26 JUMP_FORWARD 0 (to 29)
>> 29 LOAD_CONST 0 (None)
32 RETURN_VALUE
Python does fold binary operations on constants (+, *, - and so on) at compile time, in the peephole optimizer of the bytecode compiler. So for certain string concatenations the compiler may also replace + concatenation of constants with the concatenated result. See peephole.c: for sequences (including strings), this optimization is only applied if the result is limited to 20 items (characters) or fewer.
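As a quick check of the implicit concatenation described above, here is a minimal sketch. The function name build_message is made up for illustration, and the check assumes CPython, where adjacent literals are joined before the code object is created, so the whole message should show up as a single constant:

```python
def build_message():
    # Adjacent string literals inside parentheses are joined by the
    # compiler; no + operator and no backslash continuation is needed.
    return ("<Conditional's meaning> happened, so we're not "
            "setting up the interface.")


message = build_message()

# True on CPython if the joined string is stored as one constant.
stored_as_one_constant = message in build_message.__code__.co_consts
print(message)
print(stored_as_one_constant)
```

Wrapping the literals in parentheses (or placing them inside a call, as in the logger.info example) is what lets the expression span several lines.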


What's the effect of `pass` in Python debug mode

I just found this phenomenon by coincidence.
mylist = [('1',), ('2',), ('3',), ('4',)]
for l in mylist:
    print(l)
    pass # first pass
pass # second pass
print("end")
If I set the breakpoint (the red dot) at the first pass and debug, the program stops there and the output is:
('1',)
However, if I set the breakpoint at the second pass and debug, the output includes end in the last line. It seems that the pass avoids stopping at this point and just lets the program run further.
I thought pass had no real meaning, but apparently that is not the case. So how should I understand pass?
Thank you all
pass is just syntactic sugar for the parser to know that a statement is intentionally left empty. It does not generate an opcode, and thus, the debugger can't pause when it gets hit. Instead you're seeing it halt when the next instruction is executed.
You can see this by printing the opcodes generated by an empty function:
>>> def test():
... pass
...
>>> import dis
>>> dis.dis(test)
2 0 LOAD_CONST 0 (None)
3 RETURN_VALUE
pass doesn't do anything. It compiles to no bytecode. However, the bytecode to jump back to the start of the loop is associated with the line of the last statement in the loop, and pass counts. Here's what it looks like if we decompile it, on Python 3.7.3:
import dis
dis.dis(r'''mylist = [('1',), ('2',), ('3',), ('4',)]
for l in mylist:
    print(l)
    pass # first pass
pass # second pass
print("end")''')
Output:
1 0 LOAD_CONST 0 (('1',))
2 LOAD_CONST 1 (('2',))
4 LOAD_CONST 2 (('3',))
6 LOAD_CONST 3 (('4',))
8 BUILD_LIST 4
10 STORE_NAME 0 (mylist)
2 12 SETUP_LOOP 20 (to 34)
14 LOAD_NAME 0 (mylist)
16 GET_ITER
>> 18 FOR_ITER 12 (to 32)
20 STORE_NAME 1 (l)
3 22 LOAD_NAME 2 (print)
24 LOAD_NAME 1 (l)
26 CALL_FUNCTION 1
28 POP_TOP
4 30 JUMP_ABSOLUTE 18
>> 32 POP_BLOCK
6 >> 34 LOAD_NAME 2 (print)
36 LOAD_CONST 4 ('end')
38 CALL_FUNCTION 1
40 POP_TOP
42 LOAD_CONST 5 (None)
44 RETURN_VALUE
The JUMP_ABSOLUTE and POP_BLOCK get associated with line 4, the first pass.
When you set a breakpoint on the first pass, Python breaks before the JUMP_ABSOLUTE. When you set a breakpoint on the second pass, no bytecode is associated with line 5, so Python breaks on line 6, which does have bytecode.
pass is just a null operation; if you're looking to exit the for loop, you need to use break. The reason you see the end of the output from mylist at the second pass is that the first pass just lets the for loop continue.
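To see for yourself which lines a debugger can stop on, you can ask the compiled code which source lines own at least one instruction. This is only a sketch: the helper name lines_with_bytecode is made up, the indentation of the snippet is assumed from the disassembly above, and the result depends on the Python version (newer releases may report different lines than 3.7 does):

```python
import dis

SOURCE = """\
mylist = [('1',), ('2',), ('3',), ('4',)]
for l in mylist:
    print(l)
    pass  # first pass
pass  # second pass
print("end")
"""


def lines_with_bytecode(source):
    """Return the sorted source line numbers that own at least one instruction."""
    code = compile(source, "<example>", "exec")
    # findlinestarts yields (offset, line) pairs; entries without a real
    # line number (None or 0 on some versions) are skipped.
    return sorted({line for _, line in dis.findlinestarts(code) if line})


print(lines_with_bytecode(SOURCE))
```

A line missing from the printed list has no instruction of its own, so a breakpoint placed on it can only take effect at the next line that does.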

Is it possible to call a function from within a list comprehension without the overhead of calling the function?

In this trivial example, I want to factor out the i < 5 condition of a list comprehension into its own function. I also want to eat my cake and have it too, and avoid the overhead of the CALL_FUNCTION bytecode/creating a new frame in the Python virtual machine.
Is there any way to factor out the conditions inside of a list comprehension into a new function but somehow get a disassembled result that avoids the large overhead of CALL_FUNCTION?
import dis
import sys
import timeit
def my_filter(n):
    return n < 5
def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]
def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]
assert a() == b()
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)
>>> timeit.timeit(a)
1.2616060493517098
>>> timeit.timeit(b)
0.685117881097812
>>> dis.dis(a)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F4890B660, file "<stdin>", line 3>)
# ...
>>> dis.dis(b)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F48A42270, file "<stdin>", line 3>)
# ...
# list comprehension with function call
# big overhead with that CALL_FUNCTION at address 12
>>> dis.dis(a.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_GLOBAL 0 (my_filter)
10 LOAD_FAST 1 (i)
12 CALL_FUNCTION 1
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
# list comprehension without function call
>>> dis.dis(b.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
I'm willing to take a hacky solution that I would never use in production, like somehow replacing the bytecode at run time.
In other words, is it possible to replace a's addresses 8, 10, and 12 with b's 8, 10, and 12 at runtime?
Consolidating all of the excellent answers in the comments into one.
As georg says, this sounds like you are looking for a way to inline a function or an expression. There is no such thing in CPython, although attempts have been made: https://bugs.python.org/issue10399
Therefore, along the lines of "metaprogramming", you can build the lambda inline and eval it:
from typing import Callable
import dis
def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]
def gen_list_comprehension(expr: str) -> Callable:
    return eval(f"lambda: [i for i in range(10) if {expr}]")
a = gen_list_comprehension("i < 5")
dis.dis(a.__code__.co_consts[1])
print("=" * 10)
dis.dis(b.__code__.co_consts[1])
which when run under 3.7.6 gives:
6 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
==========
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
From a security standpoint, eval is dangerous, although here it is less so because of the limits on what you can do inside a lambda. What can be done in the if condition is even more limited, but it is still dangerous: it could, for example, call a function that does evil things.
However, if you want the same effect in a more secure way, you can modify ASTs instead of working with strings. I find that a lot more cumbersome, though.
A hybrid approach would be to call ast.parse() and check the result. For example:
import ast
def is_cond_str(s: str) -> bool:
    # Accept only source whose first statement is a bare comparison.
    try:
        mod_ast = ast.parse(s)
        expr_ast = mod_ast.body[0]
        if not isinstance(expr_ast, ast.Expr):
            return False
        compare_ast = expr_ast.value
        if not isinstance(compare_ast, ast.Compare):
            return False
        return True
    except Exception:
        return False
This is a little more secure, but there still may be evil functions in the condition so you could keep going. Again, I find this a little tedious.
Coming from the other direction of starting off with bytecode, there is my cross-version assembler; see https://pypi.org/project/xasm/
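Putting the eval trick and the AST check together might look like the sketch below. The name checked_list_comprehension and the extra rule that rejects any function call are my own assumptions, not part of the answer above, and this is still not a real sandbox:

```python
import ast
from typing import Callable, List


def checked_list_comprehension(expr: str) -> Callable[[], List[int]]:
    """Build the comprehension only if expr is a call-free comparison."""
    # mode="eval" accepts exactly one expression and nothing else.
    tree = ast.parse(expr, mode="eval")
    if not isinstance(tree.body, ast.Compare):
        raise ValueError(f"not a comparison: {expr!r}")
    if any(isinstance(node, ast.Call) for node in ast.walk(tree)):
        raise ValueError(f"function calls are not allowed: {expr!r}")
    # The condition is pasted into the source text, so no call to a
    # separate filter function is needed inside the loop.
    return eval(f"lambda: [i for i in range(10) if {expr}]")


inlined = checked_list_comprehension("i < 5")
print(inlined())
```

The one remaining call is to the generated lambda itself, once per list rather than once per element.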

PyCharm not hitting Quick and Dirty breakpoint on "pass"

I want to add a quick & dirty breakpoint, e.g. when I am interested in stopping in the middle of iterating over a long list.
for item in list:
    if item == 'curry':
        pass
I put a breakpoint on pass, and it is not hit(!).
If I add an (empty) print after it
for item in list:
    if item == 'curry':
        pass
        print('')
and put breakpoints on both pass and print, only print is hit.
Any idea why? Windows 7, (portable) Python 3.7
[Update] As per the comment from @Adam.Er8, I tried inserting the ellipsis literal, ..., and putting a breakpoint on it, but that was not hit, although the following print('') was.
[Update++] Hmm, it does hit a breakpoint on the pass in
for key, value in dictionary.items():
    pass
The pass doesn't actually make it into the bytecode. The code is exactly the same as if it wasn't there. You can see this using the dis module (examples using 3.7 on Linux).
>>> import dis
>>> dis.dis('for i in a:\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
2 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
>>> dis.dis('for i in a:\n\tpass\n\tprint("i")')
1 0 SETUP_LOOP 20 (to 22)
2 LOAD_NAME 0 (a)
4 GET_ITER
>> 6 FOR_ITER 12 (to 20)
8 STORE_NAME 1 (i)
3 10 LOAD_NAME 2 (print)
12 LOAD_CONST 0 ('i')
14 CALL_FUNCTION 1
16 POP_TOP
18 JUMP_ABSOLUTE 6
>> 20 POP_BLOCK
>> 22 LOAD_CONST 1 (None)
24 RETURN_VALUE
What the bytecode is doing isn't as relevant as the fact that both blocks contain identical instructions. The pass is simply ignored, so there is nothing for the debugger to break on.
Try replacing pass with ...:
for item in list:
    if item == 'curry':
        ...
You should be able to set a breakpoint there.
This is called the ellipsis literal. Unlike pass, it is actually "executed" (well, sort of), which is why you can break on it like you would on any other statement, but it has no side effects and reads like "nothing" (before discovering this trick I'd just write _ = 0).
EDIT:
Alternatively, you can just set a conditional breakpoint.
In PyCharm this is done by right-clicking the breakpoint and entering the condition.
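If you would rather keep the quick & dirty breakpoint in the code itself, Python 3.7 and later also offer the built-in breakpoint(). The sketch below is one way to use it; the function name scan and its on_hit parameter are made up for illustration so that the pause can be swapped out:

```python
def scan(items, target, on_hit=breakpoint):
    """Call on_hit() each time target is seen and return the hit count."""
    hits = 0
    for item in items:
        if item == target:
            hits += 1
            # Unlike pass, this call is a real statement, so execution
            # can stop here. The default, breakpoint(), calls
            # sys.breakpointhook(), which starts pdb unless overridden.
            on_hit()
    return hits
```

Calling scan(['rice', 'curry', 'naan'], 'curry') would pause once at the matching item, while passing on_hit=lambda: None turns it into a plain counter.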

Would a StopIteration make python slow? [closed]

As far as I know, monitoring for exceptions makes a program slower.
Would monitoring for an iterator exception, such as StopIteration, make a for loop slower?
While exception monitoring has some small overhead in the usual case, in the case of iterators there does not appear to be any overhead involved in handling StopIteration exceptions. Python optimises iterators as a special case so that StopIteration doesn't involve any exception handlers. (I'll also observe---and I may be missing something---that it's hard to come up with a Python for loop that doesn't implicitly use iterators).
Here are some examples, first using the built-in range function and a simple for loop:
Python 2.7.5
>>> import dis
>>> def x():
...     for i in range(1,11):
...         pass
...
>>> dis.dis(x)
2 0 SETUP_LOOP 23 (to 26)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (1)
9 LOAD_CONST 2 (11)
12 CALL_FUNCTION 2
15 GET_ITER
>> 16 FOR_ITER 6 (to 25)
19 STORE_FAST 0 (i)
3 22 JUMP_ABSOLUTE 16
>> 25 POP_BLOCK
>> 26 LOAD_CONST 0 (None)
29 RETURN_VALUE
Note that range is essentially being treated as an iterator.
Now, using a simple generator function:
>>> def g(x):
...     while x < 11:
...         yield x
...         x = x + 1
...
>>> def y():
...     for i in g(1):
...         pass
...
>>> dis.dis(y)
2 0 SETUP_LOOP 20 (to 23)
3 LOAD_GLOBAL 0 (g)
6 LOAD_CONST 1 (1)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 6 (to 22)
16 STORE_FAST 0 (i)
3 19 JUMP_ABSOLUTE 13
>> 22 POP_BLOCK
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>> dis.dis(g)
2 0 SETUP_LOOP 31 (to 34)
>> 3 LOAD_FAST 0 (x)
6 LOAD_CONST 1 (11)
9 COMPARE_OP 0 (<)
12 POP_JUMP_IF_FALSE 33
3 15 LOAD_FAST 0 (x)
18 YIELD_VALUE
19 POP_TOP
4 20 LOAD_FAST 0 (x)
23 LOAD_CONST 2 (1)
26 BINARY_ADD
27 STORE_FAST 0 (x)
30 JUMP_ABSOLUTE 3
>> 33 POP_BLOCK
>> 34 LOAD_CONST 0 (None)
37 RETURN_VALUE
Note that y here is basically the same as x above, the difference being one LOAD_CONST instruction, since x references the number 11. Likewise, our simple generator is basically equivalent to the same thing written as a while loop:
>>> def q():
...     x = 1
...     while x < 11:
...         x = x + 1
...
>>> dis.dis(q)
2 0 LOAD_CONST 1 (1)
3 STORE_FAST 0 (x)
3 6 SETUP_LOOP 26 (to 35)
>> 9 LOAD_FAST 0 (x)
12 LOAD_CONST 2 (11)
15 COMPARE_OP 0 (<)
18 POP_JUMP_IF_FALSE 34
4 21 LOAD_FAST 0 (x)
24 LOAD_CONST 1 (1)
27 BINARY_ADD
28 STORE_FAST 0 (x)
31 JUMP_ABSOLUTE 9
>> 34 POP_BLOCK
>> 35 LOAD_CONST 0 (None)
38 RETURN_VALUE
Again, there's no specific overhead to handle the iterator or the generator (range may be somewhat more optimised than the generator version, simply because it's a built-in, but not due to the way Python handles it).
Finally, let's look at an actual explicit iterator written with StopIteration:
>>> class G(object):
...     def __init__(self, x):
...         self.x = x
...     def __iter__(self):
...         return self
...     def next(self):
...         x = self.x
...         if x >= 11:
...             raise StopIteration
...         x = x + 1
...         return x - 1
...
>>> dis.dis(G.next)
7 0 LOAD_FAST 0 (self)
3 LOAD_ATTR 0 (x)
6 STORE_FAST 1 (x)
8 9 LOAD_FAST 1 (x)
12 LOAD_CONST 1 (11)
15 COMPARE_OP 5 (>=)
18 POP_JUMP_IF_FALSE 30
9 21 LOAD_GLOBAL 1 (StopIteration)
24 RAISE_VARARGS 1
27 JUMP_FORWARD 0 (to 30)
10 >> 30 LOAD_FAST 1 (x)
33 LOAD_CONST 2 (1)
36 BINARY_ADD
37 STORE_FAST 1 (x)
11 40 LOAD_FAST 1 (x)
43 LOAD_CONST 2 (1)
46 BINARY_SUBTRACT
47 RETURN_VALUE
Now, here we can see that the generator function involves a few fewer instructions than this simple iterator, mostly related to the differences in implementation and a couple of instructions related to raising the StopIteration exception. Nevertheless, a function using this iterator is exactly equivalent to y above:
>>> def z():
...     for i in G(1):
...         pass
...
>>> dis.dis(z)
2 0 SETUP_LOOP 20 (to 23)
3 LOAD_GLOBAL 0 (G)
6 LOAD_CONST 1 (1)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 6 (to 22)
16 STORE_FAST 0 (i)
3 19 JUMP_ABSOLUTE 13
>> 22 POP_BLOCK
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
Of course, these results are based around the fact that Python for-loops optimise iterators to remove the need for explicit handlers for the StopIteration exception. After all, the StopIteration exception essentially forms a normal part of the operation of a Python for-loop.
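To make it concrete what the for statement does on your behalf, here is a rough Python 3 sketch (so the method is __next__ rather than next) that spells out the StopIteration handling by hand. The names CountTo and manual_for are made up for illustration:

```python
class CountTo:
    """Hypothetical iterator yielding start, start + 1, ..., stop - 1."""

    def __init__(self, start, stop):
        self.current = start
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value


def manual_for(iterable):
    """Collect items the way a for loop would, with the handler spelled out."""
    collected = []
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            break  # the for statement performs this check internally
        collected.append(item)
    return collected


with_for = [i for i in CountTo(1, 11)]
by_hand = manual_for(CountTo(1, 11))
print(with_for == by_hand)
```

Both versions produce the same items; the difference is that the hand-written loop needs an explicit try/except, while FOR_ITER handles the end of iteration inside the interpreter.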
Regarding why it is implemented this way, see PEP-234 which defines iterators. This specifically addresses the issue of the expense of the exception:
It has been questioned whether an exception to signal the end of
the iteration isn't too expensive. Several alternatives for the
StopIteration exception have been proposed: a special value End
to signal the end, a function end() to test whether the iterator
is finished, even reusing the IndexError exception.
A special value has the problem that if a sequence ever
contains that special value, a loop over that sequence will
end prematurely without any warning. If the experience with
null-terminated C strings hasn't taught us the problems this
can cause, imagine the trouble a Python introspection tool
would have iterating over a list of all built-in names,
assuming that the special End value was a built-in name!
Calling an end() function would require two calls per
iteration. Two calls is much more expensive than one call
plus a test for an exception. Especially the time-critical
for loop can test very cheaply for an exception.
Reusing IndexError can cause confusion because it can be a
genuine error, which would be masked by ending the loop
prematurely.
Looking at the bytecode generated for a function with a try and except block, it looks like it would be slightly slower. However, this is honestly negligible in most circumstances, as the performance hit is extremely small. I think the real thing to consider when doing an optimization like this would be scoping the exceptions properly.
Output of an example function with a try/except block when compiled to bytecode:
Python 2.7.3 (default, Apr 10 2012, 23:31:26) [MSC v.1500 32 bit (Intel)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import dis
>>> def x():
    try:
        sd="lol"
    except:
        raise
>>> dis.dis(x)
2 0 SETUP_EXCEPT 10 (to 13)
3 3 LOAD_CONST 1 ('lol')
6 STORE_FAST 0 (sd)
9 POP_BLOCK
10 JUMP_FORWARD 10 (to 23)
4 >> 13 POP_TOP
14 POP_TOP
15 POP_TOP
5 16 RAISE_VARARGS 0
19 JUMP_FORWARD 1 (to 23)
22 END_FINALLY
>> 23 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>>

Checking for __debug__ and some other condition in Python

In Python sometimes I want to do something like (1)
if __debug__ and verbose: print "whatever"
If Python is run with -O, then I'd like for that whole piece of code to disappear, as it would if I just had (2)
if __debug__: print "whatever"
or even (3)
if __debug__:
    if verbose: print foo
However, that doesn't seem to happen (see below). Is there a way I can get the run-time efficiency of #3 with compact code more like #1?
Here's how I tested that I'm not getting the efficient code I want:
#!/usr/bin/python2.7
from dis import dis
import sys
cmds = ["""
def func ():
    if __debug__ and 1+1: sys.stdout.write('spam')""", """
def func():
    if __debug__: sys.stdout.write('ham')""", """
def func():
    __debug__ and sys.stdout.write('eggs')"""]
print "__debug__ is", __debug__, "\n\n\n"
for cmd in cmds:
    print "*"*80, "\nSource of {}\n\ncompiles to:".format(cmd)
    exec(cmd)
    dis(func)
    print "\n"*4
Running this gives
__debug__ is False
********************************************************************************
Source of
def func ():
if __debug__ and 1+1: sys.stdout.write('spam')
compiles to:
3 0 LOAD_GLOBAL 0 (__debug__)
3 POP_JUMP_IF_FALSE 31
6 LOAD_CONST 3 (2)
9 POP_JUMP_IF_FALSE 31
12 LOAD_GLOBAL 1 (sys)
15 LOAD_ATTR 2 (stdout)
18 LOAD_ATTR 3 (write)
21 LOAD_CONST 2 ('spam')
24 CALL_FUNCTION 1
27 POP_TOP
28 JUMP_FORWARD 0 (to 31)
>> 31 LOAD_CONST 0 (None)
34 RETURN_VALUE
********************************************************************************
Source of
def func():
if __debug__: sys.stdout.write('ham')
compiles to:
3 0 LOAD_CONST 0 (None)
3 RETURN_VALUE
********************************************************************************
Source of
def func():
__debug__ and sys.stdout.write('eggs')
compiles to:
3 0 LOAD_GLOBAL 0 (__debug__)
3 JUMP_IF_FALSE_OR_POP 21
6 LOAD_GLOBAL 1 (sys)
9 LOAD_ATTR 2 (stdout)
12 LOAD_ATTR 3 (write)
15 LOAD_CONST 1 ('eggs')
18 CALL_FUNCTION 1
>> 21 POP_TOP
22 LOAD_CONST 0 (None)
25 RETURN_VALUE
No, you can't. Python's compiler is not nearly smart enough to detect in what cases it could remove the code block and if statement.
Python would have to do a whole lot of logic inference otherwise. Compare:
if __debug__ or verbose:
with
if __debug__ and verbose:
for example. Python would have to detect the difference between these two expressions at compile time; one can be optimised away, the other cannot.
Note that the difference in runtime between code with and without if __debug__ statements is truly minute, everything else being equal. A small constant value test and jump is not anything to fuss about, really.
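If you do want the block to vanish under -O, the nested form (3) from the question is the one to rely on. The sketch below checks this without restarting the interpreter by using compile() with its optimize argument. It assumes Python 3 on CPython (the question uses 2.7), and the helper name body_opnames and the verbose parameter are made up for illustration:

```python
import dis
import types

NESTED = """
def func(verbose):
    if __debug__:
        if verbose:
            print("whatever")
"""


def body_opnames(source, optimize):
    """Compile source at the given -O level and list func's opcode names."""
    module_code = compile(source, "<example>", "exec", optimize=optimize)
    # The function body is stored as a code object among the module constants.
    func_code = next(c for c in module_code.co_consts
                     if isinstance(c, types.CodeType))
    return [ins.opname for ins in dis.get_instructions(func_code)]


normal = body_opnames(NESTED, optimize=0)     # like running without -O
optimized = body_opnames(NESTED, optimize=1)  # like running with -O
print(len(normal), len(optimized))
```

With optimize=1 the instruction list for func should shrink to little more than returning None, since the whole outer if block is dropped along with the verbose test inside it.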
