Understanding Python tuple-to-list performance optimization - Python

I have a tuple, say atup = (1, 3, 4, 5, 6, 6, 7, 78, 8), produced dynamically by a generator that yields tuples from a list of tuples. Each tuple needs to be converted to a list so that each element can be transformed further and used in a method. While doing this, I was surprised to learn that just doing list(atup) is much faster than using a list comprehension like [i for i in atup]. Here is what I did:
Performance Test 1:
timeit.timeit('list((1,3,4,5,6,6,7,78,8))', number=100000)
0.02268475245609025
Performance Test 2:
timeit.timeit('[i for i in (1,3,4,5,6,6,7,78,8)]', number=100000)
0.05304025196801376
Can you please explain this?

The list comprehension has to iterate over the tuple at the Python level:
>>> dis.dis("[i for i in (1,2,3)]")
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x1075c0c90, file "<dis>", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_CONST 5 ((1, 2, 3))
8 GET_ITER
10 CALL_FUNCTION 1
12 RETURN_VALUE
list iterates over the tuple itself, and uses the C API to do it without going through (as much of) the Python data model.
>>> dis.dis("list((1,2,3))")
1 0 LOAD_NAME 0 (list)
2 LOAD_CONST 3 ((1, 2, 3))
4 CALL_FUNCTION 1
6 RETURN_VALUE
The Python-level iteration is more clearly seen in Python 2, which implements list comprehensions in a different fashion.
>>> def f():
...     return [i for i in (1,2,3)]
...
>>> dis.dis(f)
2 0 BUILD_LIST 0
3 LOAD_CONST 4 ((1, 2, 3))
6 GET_ITER
>> 7 FOR_ITER 12 (to 22)
10 STORE_FAST 0 (i)
13 LOAD_FAST 0 (i)
16 LIST_APPEND 2
19 JUMP_ABSOLUTE 7
>> 22 RETURN_VALUE
As @blhsing points out, you can disassemble the code object generated by the list comprehension in Python 3 to see the same thing.
>>> code = compile('[i for i in (1,2,3)]', '', 'eval')
>>> dis(code.co_consts[0])
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE

The list constructor is implemented purely in C and therefore has minimal overhead. With a list comprehension, the compiler has to build a temporary function, create an iterator, store the iterator's output in the variable i, and then load i again to append it to the list; that is many more Python bytecodes to execute than simply loading a tuple and calling the list constructor.
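To reproduce the gap in one self-contained snippet (absolute timings are machine- and CPython-version-dependent; the variable names here are just for illustration):

```python
import timeit

atup = (1, 3, 4, 5, 6, 6, 7, 78, 8)

# Time the constructor path vs. the Python-level comprehension path.
t_ctor = timeit.timeit('list(atup)', globals={'atup': atup}, number=100_000)
t_comp = timeit.timeit('[i for i in atup]', globals={'atup': atup}, number=100_000)

print(f"list(atup):        {t_ctor:.4f}s")
print(f"[i for i in atup]: {t_comp:.4f}s")

# Both produce the same result; only the construction path differs.
assert list(atup) == [i for i in atup]
```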

Related

Time complexity for function converting list to set and vice versa

I am converting a list to a set and back to a list. I know that set(list) takes O(n) time, but I am converting it back to a list in the same line, list(set(list)). Since both of these operations take O(n) time, would the time complexity be O(n^2) now?
Logic 1:
final = list(set(list1)-set(list2))
Logic 2:
s = set(list1)-set(list2)
final = list(s)
Do both these implementations have different time complexities, and if they do which of them is more efficient?
One set conversion is unnecessary, as set.difference works with any iterable as it should:
final = list(set(list1).difference(list2))
But the asymptotic time and space complexity of the whole thing is still O(m+n) (the sizes of the two lists).
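A minimal sketch of that point: difference() accepts the raw list directly, so the second set() conversion can be dropped (the sample lists here are illustrative).

```python
list1 = [1, 2, 2, 3, 4]
list2 = [2, 4]

# set.difference accepts any iterable, so list2 need not become a set first.
final = list(set(list1).difference(list2))
assert sorted(final) == [1, 3]
```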
In both versions you are doing:
convert a list to a set
convert another list to a set
subtract the two sets
convert the resultant set to a list
Each of those is O(n), where n is a bound on the size of both your starting lists.
That's O(n) four times, which makes it overall O(n).
Your two versions are essentially identical.
Your two versions are identical, as from dis:
>>> dis.dis("list(set(list1) - set(list2))")
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (set)
4 LOAD_NAME 2 (list1)
6 CALL_FUNCTION 1
8 LOAD_NAME 1 (set)
10 LOAD_NAME 3 (list2)
12 CALL_FUNCTION 1
14 BINARY_SUBTRACT
16 CALL_FUNCTION 1
18 RETURN_VALUE
>>> dis.dis("s = set(list1) - set(list2);list(s)")
1 0 LOAD_NAME 0 (set)
2 LOAD_NAME 1 (list1)
4 CALL_FUNCTION 1
6 LOAD_NAME 0 (set)
8 LOAD_NAME 2 (list2)
10 CALL_FUNCTION 1
12 BINARY_SUBTRACT
14 STORE_NAME 3 (s)
16 LOAD_NAME 4 (list)
18 LOAD_NAME 3 (s)
20 CALL_FUNCTION 1
22 POP_TOP
24 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>>
The only difference between these two is that the second one stores the result in a variable s, so it has to LOAD_NAME the name s. Also, in the first version list is loaded first, while in the second it gets loaded later.
But in @user2390182's answer, instead of loading set a second time with LOAD_NAME, it loads the method name difference instead, which IMO is the most efficient one here:
>>> dis.dis("list(set(list1).difference(list2))")
1 0 LOAD_NAME 0 (list)
2 LOAD_NAME 1 (set)
4 LOAD_NAME 2 (list1)
6 CALL_FUNCTION 1
8 LOAD_METHOD 3 (difference)
10 LOAD_NAME 4 (list2)
12 CALL_METHOD 1
14 CALL_FUNCTION 1
16 RETURN_VALUE
>>>
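If you want numbers to go with the bytecode, here is a quick benchmark of the two-set version against the difference() version (results depend heavily on data sizes and machine; the sample data is illustrative):

```python
import timeit

setup = "list1 = list(range(1000)); list2 = list(range(500, 1500))"

t_sub = timeit.timeit("list(set(list1) - set(list2))", setup=setup, number=2_000)
t_diff = timeit.timeit("list(set(list1).difference(list2))", setup=setup, number=2_000)

print(f"set - set:        {t_sub:.3f}s")
print(f"set.difference(): {t_diff:.3f}s")

# Both spellings compute the same elements.
assert set(range(1000)) - set(range(500, 1500)) == set(range(500))
```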

Is it possible to call a function from within a list comprehension without the overhead of calling the function?

In this trivial example, I want to factor out the i < 5 condition of a list comprehension into its own function. I also want to eat my cake and have it too, and avoid the overhead of the CALL_FUNCTION bytecode/creating a new frame in the Python virtual machine.
Is there any way to factor out the conditions inside of a list comprehension into a new function but somehow get a disassembled result that avoids the large overhead of CALL_FUNCTION?
import dis
import sys
import timeit

def my_filter(n):
    return n < 5

def a():
    # list comprehension with function call
    return [i for i in range(10) if my_filter(i)]

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

assert a() == b()
>>> sys.version_info[:]
(3, 6, 5, 'final', 0)
>>> timeit.timeit(a)
1.2616060493517098
>>> timeit.timeit(b)
0.685117881097812
>>> dis.dis(a)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F4890B660, file "<stdin>", line 3>)
# ...
>>> dis.dis(b)
3 0 LOAD_CONST 1 (<code object <listcomp> at 0x0000020F48A42270, file "<stdin>", line 3>)
# ...
# list comprehension with function call
# big overhead with that CALL_FUNCTION at address 12
>>> dis.dis(a.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_GLOBAL 0 (my_filter)
10 LOAD_FAST 1 (i)
12 CALL_FUNCTION 1
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
# list comprehension without function call
>>> dis.dis(b.__code__.co_consts[1])
3 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
I'm willing to take a hacky solution that I would never use in production, like somehow replacing the bytecode at run time.
In other words, is it possible to replace a's addresses 8, 10, and 12 with b's 8, 10, and 12 at runtime?
Consolidating all of the excellent answers in the comments into one.
As georg says, this sounds like you are looking for a way to inline a function or an expression, and there is no such thing in CPython, though attempts have been made: https://bugs.python.org/issue10399
Therefore, along the lines of "metaprogramming", you can build the lambdas inline and eval them:
from typing import Callable
import dis

def b():
    # list comprehension without function call
    return [i for i in range(10) if i < 5]

def gen_list_comprehension(expr: str) -> Callable:
    return eval(f"lambda: [i for i in range(10) if {expr}]")

a = gen_list_comprehension("i < 5")
dis.dis(a.__code__.co_consts[1])
print("=" * 10)
dis.dis(b.__code__.co_consts[1])
which when run under 3.7.6 gives:
6 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
==========
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 16 (to 22)
6 STORE_FAST 1 (i)
8 LOAD_FAST 1 (i)
10 LOAD_CONST 0 (5)
12 COMPARE_OP 0 (<)
14 POP_JUMP_IF_FALSE 4
16 LOAD_FAST 1 (i)
18 LIST_APPEND 2
20 JUMP_ABSOLUTE 4
>> 22 RETURN_VALUE
From a security standpoint, eval is dangerous, although here it is less so because what you can do inside a lambda is limited. What can be done in an IfExp expression is even more limited, but still dangerous, like calling a function that does evil things.
However, if you want the same effect but more securely, instead of working with strings you can modify ASTs. I find that a lot more cumbersome though.
A hybrid approach would be to call ast.parse() and check the result. For example:
import ast

def is_cond_str(s: str) -> bool:
    try:
        mod_ast = ast.parse(s)
        expr_ast = mod_ast.body[0]
        if not isinstance(expr_ast, ast.Expr):
            return False
        compare_ast = expr_ast.value
        if not isinstance(compare_ast, ast.Compare):
            return False
        return True
    except SyntaxError:
        return False
This is a little more secure, but there still may be evil functions in the condition so you could keep going. Again, I find this a little tedious.
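A self-contained variant of that check, using ast.parse in eval mode (the function name and exact acceptance rules here are illustrative, not from the original answer):

```python
import ast

def looks_like_comparison(source: str) -> bool:
    """Accept only source that parses to a single bare comparison expression."""
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError:
        return False
    # In eval mode, tree is an ast.Expression; its body is the expression node.
    return isinstance(tree.body, ast.Compare)

assert looks_like_comparison("i < 5")
assert not looks_like_comparison("evil() or i < 5")  # top level is BoolOp, not Compare
assert not looks_like_comparison("import os")        # not an expression at all
```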
Coming from the other direction of starting off with bytecode, there is my cross-version assembler; see https://pypi.org/project/xasm/

What are these extra symbols in a comprehension's symtable?

I'm using symtable to get the symbol tables of a piece of code. Curiously, when using a comprehension (listcomp, setcomp, etc.), there are some extra symbols I didn't define.
Reproduction (using CPython 3.6):
import symtable
root = symtable.symtable('[x for x in y]', '?', 'exec')
# Print symtable of the listcomp
print(root.get_children()[0].get_symbols())
Output:
[<symbol '.0'>, <symbol '_[1]'>, <symbol 'x'>]
Symbol x is expected. But what are .0 and _[1]?
Note that with any other non-comprehension construct I'm getting exactly the identifiers I used in the code. E.g., lambda x: y only results in the symbols [<symbol 'x'>, <symbol 'y'>].
Also, the docs say that symtable.Symbol is...
An entry in a SymbolTable corresponding to an identifier in the source.
...although these identifiers evidently don't appear in the source.
The two names are used to implement list comprehensions as a separate scope, and they have the following meaning:
.0 is an implicit argument, used for the iterable (sourced from y in your case).
_[1] is a temporary name in the symbol table, used for the target list. This list eventually ends up on the stack.*
A list comprehension (as well as dict and set comprehension and generator expressions) is executed in a new scope. To achieve this, Python effectively creates a new anonymous function.
Because it is a function, really, you need to pass in the iterable you are looping over as an argument. This is what .0 is for: it is the first implicit argument (so at index 0). The symbol table you produced explicitly lists .0 as an argument:
>>> root = symtable.symtable('[x for x in y]', '?', 'exec')
>>> type(root.get_children()[0])
<class 'symtable.Function'>
>>> root.get_children()[0].get_parameters()
('.0',)
The first child of your table is a function with one argument named .0.
A list comprehension also needs to build the output list, and that list could be seen as a local too. This is the _[1] temporary variable. It never actually becomes a named local variable in the code object that is produced; this temporary variable is kept on the stack instead.
You can see the code object produced when you use compile():
>>> code_object = compile('[x for x in y]', '?', 'exec')
>>> code_object
<code object <module> at 0x11a4f3ed0, file "?", line 1>
>>> code_object.co_consts[0]
<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>
So there is an outer code object, and in the constants, is another, nested code object. That latter one is the actual code object for the loop. It uses .0 and x as local variables. It also takes 1 argument; the names for arguments are the first co_argcount values in the co_varnames tuple:
>>> code_object.co_consts[0].co_varnames
('.0', 'x')
>>> code_object.co_consts[0].co_argcount
1
So .0 is the argument name here.
The _[1] temporary variable is handled on the stack, see the disassembly:
>>> import dis
>>> dis.dis(code_object.co_consts[0])
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (x)
8 LOAD_FAST 1 (x)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE
Here we see .0 referenced again. _[1] corresponds to the BUILD_LIST opcode pushing a list object onto the stack; then .0 is put on the stack for the FOR_ITER opcode to iterate over (when the iterator is exhausted, FOR_ITER removes it from the stack again).
Each iteration result is pushed onto the stack by FOR_ITER, popped again and stored in x with STORE_FAST, then loaded onto the stack again with LOAD_FAST. Finally LIST_APPEND takes the top element from the stack, and adds it to the list referenced by the next element on the stack, so to _[1].
JUMP_ABSOLUTE then brings us back to the top of the loop, where we continue to iterate until the iterable is done. Finally, RETURN_VALUE returns the top of the stack, again _[1], to the caller.
The outer code object does the work of loading the nested code object and calling it as a function:
>>> dis.dis(code_object)
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (y)
8 GET_ITER
10 CALL_FUNCTION 1
12 POP_TOP
14 LOAD_CONST 2 (None)
16 RETURN_VALUE
So this makes a function object, with the function named <listcomp> (helpful for tracebacks), loads y, produces an iterator for it (the moral equivalent of iter(y)), and calls the function with that iterator as the argument.
If you wanted to translate that to pseudo-code, it would look like:
def <listcomp>(.0):
    _[1] = []
    for x in .0:
        _[1].append(x)
    return _[1]

<listcomp>(iter(y))
The _[1] temporary variable is of course not needed for generator expressions:
>>> symtable.symtable('(x for x in y)', '?', 'exec').get_children()[0].get_symbols()
[<symbol '.0'>, <symbol 'x'>]
Instead of appending to a list, the generator expression function object yields the values:
>>> dis.dis(compile('(x for x in y)', '?', 'exec').co_consts[0])
1 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 10 (to 14)
4 STORE_FAST 1 (x)
6 LOAD_FAST 1 (x)
8 YIELD_VALUE
10 POP_TOP
12 JUMP_ABSOLUTE 2
>> 14 LOAD_CONST 0 (None)
16 RETURN_VALUE
Together with the outer bytecode, the generator expression is equivalent to:
def <genexpr>(.0):
    for x in .0:
        yield x

<genexpr>(iter(y))
* The temporary variable is actually no longer needed; it was used in the initial implementation of comprehensions, but this commit from April 2007 moved the compiler to just using the stack, and this has been the norm for all of the 3.x releases, as well as Python 2.7. It is still easier to think of the generated name as a reference to the stack. Because the variable is no longer needed, I filed issue 32836 to have it removed, and Python 3.8 onwards no longer includes it in the symbol table.
In Python 2.6, you can still see the actual temporary name in the disassembly:
>>> import dis
>>> dis.dis(compile('[x for x in y]', '?', 'exec'))
1 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_NAME 0 (_[1])
7 LOAD_NAME 1 (y)
10 GET_ITER
>> 11 FOR_ITER 13 (to 27)
14 STORE_NAME 2 (x)
17 LOAD_NAME 0 (_[1])
20 LOAD_NAME 2 (x)
23 LIST_APPEND
24 JUMP_ABSOLUTE 11
>> 27 DELETE_NAME 0 (_[1])
30 POP_TOP
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
Note how the name actually has to be deleted again!
So, the way list comprehensions are implemented is actually by creating a code object; it is sort of like creating a one-time-use anonymous function, for scoping purposes:
>>> import dis
>>> def f(y): [x for x in y]
...
>>> dis.dis(f)
1 0 LOAD_CONST 1 (<code object <listcomp> at 0x101df9db0, file "<stdin>", line 1>)
3 LOAD_CONST 2 ('f.<locals>.<listcomp>')
6 MAKE_FUNCTION 0
9 LOAD_FAST 0 (y)
12 GET_ITER
13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
16 POP_TOP
17 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>>
Inspecting the code-object, I can find the .0 symbol:
>>> dis.dis(f.__code__.co_consts[1])
1 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
Note: the LOAD_FAST in the list-comp code object seems to be loading the unnamed argument .0, which would correspond to the iterator produced by GET_ITER.

Idiomatic Python: `in` keyword on literal

When using the in operator on a literal, is it most idiomatic for that literal to be a list, set, or tuple?
e.g.
for x in {'foo', 'bar', 'baz'}:
doSomething(x)
...
if val in {1, 2, 3}:
doSomethingElse(val)
I don't see any benefit to the list, but the tuple's immutability means it could be hoisted or reused by an efficient interpreter. And in the case of the if, if it's reused, there's an efficiency benefit.
Which is the most idiomatic, and which is the most performant in CPython?
Python provides a disassembler, so you can often just check the bytecode:
In [4]: def checktup():
   ...:     for _ in range(10):
   ...:         if val in (1, 2, 3):
   ...:             print("foo")
   ...:

In [5]: def checkset():
   ...:     for _ in range(10):
   ...:         if val in {1, 2, 3}:
   ...:             print("foo")
   ...:
In [6]: import dis
For the tuple literal:
In [7]: dis.dis(checktup)
2 0 SETUP_LOOP 32 (to 34)
2 LOAD_GLOBAL 0 (range)
4 LOAD_CONST 1 (10)
6 CALL_FUNCTION 1
8 GET_ITER
>> 10 FOR_ITER 20 (to 32)
12 STORE_FAST 0 (_)
3 14 LOAD_GLOBAL 1 (val)
16 LOAD_CONST 6 ((1, 2, 3))
18 COMPARE_OP 6 (in)
20 POP_JUMP_IF_FALSE 10
4 22 LOAD_GLOBAL 2 (print)
24 LOAD_CONST 5 ('foo')
26 CALL_FUNCTION 1
28 POP_TOP
30 JUMP_ABSOLUTE 10
>> 32 POP_BLOCK
>> 34 LOAD_CONST 0 (None)
36 RETURN_VALUE
For the set-literal:
In [8]: dis.dis(checkset)
2 0 SETUP_LOOP 32 (to 34)
2 LOAD_GLOBAL 0 (range)
4 LOAD_CONST 1 (10)
6 CALL_FUNCTION 1
8 GET_ITER
>> 10 FOR_ITER 20 (to 32)
12 STORE_FAST 0 (_)
3 14 LOAD_GLOBAL 1 (val)
16 LOAD_CONST 6 (frozenset({1, 2, 3}))
18 COMPARE_OP 6 (in)
20 POP_JUMP_IF_FALSE 10
4 22 LOAD_GLOBAL 2 (print)
24 LOAD_CONST 5 ('foo')
26 CALL_FUNCTION 1
28 POP_TOP
30 JUMP_ABSOLUTE 10
>> 32 POP_BLOCK
>> 34 LOAD_CONST 0 (None)
36 RETURN_VALUE
You'll notice that in both cases the function will LOAD_CONST, i.e., both literals have been optimized into constants. Even better, in the case of the set literal, the compiler has stored a frozenset: while building the function, the peephole optimizer figured out that the literal can become the immutable equivalent of a set.
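You can also check for the frozenset constant programmatically rather than reading the disassembly (this is CPython-specific behavior, but it holds on all current 3.x releases; the function here is a minimal stand-in):

```python
def checkset(val):
    return val in {1, 2, 3}

# The set literal was folded into a frozenset constant at compile time.
assert any(isinstance(c, frozenset) for c in checkset.__code__.co_consts)
assert checkset(2) and not checkset(9)
```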
Note, on Python 2, the compiler builds a set every time!:
In [1]: import dis
In [2]: def checkset():
   ...:     for _ in range(10):
   ...:         if val in {1, 2, 3}:
   ...:             print("foo")
   ...:
In [3]: dis.dis(checkset)
2 0 SETUP_LOOP 49 (to 52)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (10)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 35 (to 51)
16 STORE_FAST 0 (_)
3 19 LOAD_GLOBAL 1 (val)
22 LOAD_CONST 2 (1)
25 LOAD_CONST 3 (2)
28 LOAD_CONST 4 (3)
31 BUILD_SET 3
34 COMPARE_OP 6 (in)
37 POP_JUMP_IF_FALSE 13
4 40 LOAD_CONST 5 ('foo')
43 PRINT_ITEM
44 PRINT_NEWLINE
45 JUMP_ABSOLUTE 13
48 JUMP_ABSOLUTE 13
>> 51 POP_BLOCK
>> 52 LOAD_CONST 0 (None)
55 RETURN_VALUE
IMO, there's essentially no such thing as "idiomatic" usage of literal values as shown in the question. Such values look like "magic numbers" to me. Using literals for "performance" is probably misguided because it sacrifices readability for marginal gains. In cases where performance really matters, using literals is unlikely to help much and there are better options regardless.
I think the idiomatic thing to do would be to store such values in a global or class variable, especially if you're using them in multiple places (but also even if you aren't). This provides some documentation as to what a value's purpose is and makes it easier to update. You can then memoize these values in function/method definitions to improve performance if necessary.
As to what type of data structure is most appropriate, that would depend on what your program does and how it uses the data. For example, does ordering matter? With an if x in y, it won't, but maybe you're using the data in a for and an if. Without context, it's hard to say what the best choice would be.
Here's an example I think is readable, extensible, and also efficient. Memoizing the global ITEMS in the function definitions makes lookup fast because items is in the local namespace of the function. If you look at the disassembled code, you'll see that items is looked up via LOAD_FAST instead of LOAD_GLOBAL. This approach also avoids making multiple copies of the list of items, which might be relevant if it's big enough (although, if it was big enough, you probably wouldn't try to inline it anyway). Personally, I wouldn't bother with these kinds of optimizations most of the time, but they can be useful in some cases.
# In real code, this would have a domain-specific name instead of the
# generic `ITEMS`.
ITEMS = {'a', 'b', 'c'}

def filter_in_items(values, items=ITEMS):
    matching_items = []
    for value in values:
        if value in items:
            matching_items.append(value)
    return matching_items

def filter_not_in_items(values, items=ITEMS):
    non_matching_items = []
    for value in values:
        if value not in items:
            non_matching_items.append(value)
    return non_matching_items

print(filter_in_items(('a', 'x')))      # -> ['a']
print(filter_not_in_items(('a', 'x')))  # -> ['x']

import dis
dis.dis(filter_in_items)
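You can also confirm the LOAD_FAST claim programmatically. This sketch redefines a minimal version of the function so it stands alone; note that on Python 3.12+ some loads may use fused LOAD_FAST variants, hence the startswith check:

```python
import dis

ITEMS = {'a', 'b', 'c'}

def filter_in_items(values, items=ITEMS):
    matching_items = []
    for value in values:
        if value in items:
            matching_items.append(value)
    return matching_items

ops = [ins.opname for ins in dis.get_instructions(filter_in_items)]
assert any(op.startswith('LOAD_FAST') for op in ops)  # items is a local (default arg)
assert 'LOAD_GLOBAL' not in ops                       # no global lookup at call time
assert filter_in_items(('a', 'x')) == ['a']
```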

Are list comprehensions syntactic sugar for `list(generator expression)` in Python 3?

In Python 3, is a list comprehension simply syntactic sugar for a generator expression fed into the list function?
e.g. is the following code:
squares = [x**2 for x in range(1000)]
actually converted in the background into the following?
squares = list(x**2 for x in range(1000))
I know the output is identical, and Python 3 fixes the surprising side-effects to surrounding namespaces that list comprehensions had, but in terms of what the CPython interpreter does under the hood, is the former converted to the latter, or are there any difference in how the code gets executed?
Background
I found this claim of equivalence in the comments section to this question, and a quick google search showed the same claim being made here.
There was also some mention of this in the What's New in Python 3.0 docs, but the wording is somewhat vague:
Also note that list comprehensions have different semantics: they are closer to syntactic sugar for a generator expression inside a list() constructor, and in particular the loop control variables are no longer leaked into the surrounding scope.
Both work differently. The list comprehension version takes advantage of the special bytecode LIST_APPEND which calls PyList_Append directly for us. Hence it avoids an attribute lookup to list.append and a function call at the Python level.
>>> def func_lc():
...     [x**2 for x in y]
...
>>> dis.dis(func_lc)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>)
3 LOAD_CONST 2 ('func_lc.<locals>.<listcomp>')
6 MAKE_FUNCTION 0
9 LOAD_GLOBAL 0 (y)
12 GET_ITER
13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
16 POP_TOP
17 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>> lc_object = list(dis.get_instructions(func_lc))[0].argval
>>> lc_object
<code object <listcomp> at 0x10d3c6780, file "<ipython-input-42-ead395105775>", line 2>
>>> dis.dis(lc_object)
2 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 16 (to 25)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LOAD_CONST 0 (2)
18 BINARY_POWER
19 LIST_APPEND 2
22 JUMP_ABSOLUTE 6
>> 25 RETURN_VALUE
On the other hand the list() version simply passes the generator object to list's __init__ method which then calls its extend method internally. As the object is not a list or tuple, CPython then gets its iterator first and then simply adds the items to the list until the iterator is exhausted:
>>> def func_ge():
...     list(x**2 for x in y)
...
>>> dis.dis(func_ge)
2 0 LOAD_GLOBAL 0 (list)
3 LOAD_CONST 1 (<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>)
6 LOAD_CONST 2 ('func_ge.<locals>.<genexpr>')
9 MAKE_FUNCTION 0
12 LOAD_GLOBAL 1 (y)
15 GET_ITER
16 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
19 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
22 POP_TOP
23 LOAD_CONST 0 (None)
26 RETURN_VALUE
>>> ge_object = list(dis.get_instructions(func_ge))[1].argval
>>> ge_object
<code object <genexpr> at 0x10cde6ae0, file "<ipython-input-41-f9a53483f10a>", line 2>
>>> dis.dis(ge_object)
2 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 15 (to 21)
6 STORE_FAST 1 (x)
9 LOAD_FAST 1 (x)
12 LOAD_CONST 0 (2)
15 BINARY_POWER
16 YIELD_VALUE
17 POP_TOP
18 JUMP_ABSOLUTE 3
>> 21 LOAD_CONST 1 (None)
24 RETURN_VALUE
>>>
Timing comparisons:
>>> %timeit [x**2 for x in range(10**6)]
1 loops, best of 3: 453 ms per loop
>>> %timeit list(x**2 for x in range(10**6))
1 loops, best of 3: 478 ms per loop
>>> %%timeit
out = []
for x in range(10**6):
    out.append(x**2)
...
1 loops, best of 3: 510 ms per loop
Normal loops are slightly slower due to the repeated attribute lookup of out.append. Cache it and time again.
>>> %%timeit
out = []; append = out.append
for x in range(10**6):
    append(x**2)
...
1 loops, best of 3: 467 ms per loop
Apart from the fact that list comprehensions no longer leak their variables, one more difference is that something like this is no longer valid:
>>> [x**2 for x in 1, 2, 3] # Python 2
[1, 4, 9]
>>> [x**2 for x in 1, 2, 3] # Python 3
File "<ipython-input-69-bea9540dd1d6>", line 1
[x**2 for x in 1, 2, 3]
^
SyntaxError: invalid syntax
>>> [x**2 for x in (1, 2, 3)] # Add parenthesis
[1, 4, 9]
>>> for x in 1, 2, 3: # Python 3: for normal loops it still works
...     print(x**2)
...
1
4
9
Both forms create and call an anonymous function. However, the list(...) form creates a generator function and passes the returned generator-iterator to list, while with the [...] form, the anonymous function builds the list directly with LIST_APPEND opcodes.
The following code shows the disassembly of the anonymous functions for an example comprehension and its corresponding genexp-passed-to-list:
import dis

def f():
    [x for x in []]

def g():
    list(x for x in [])

dis.dis(f.__code__.co_consts[1])
dis.dis(g.__code__.co_consts[1])
The output for the comprehension is
4 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
The output for the genexp is
7 0 LOAD_FAST 0 (.0)
>> 3 FOR_ITER 11 (to 17)
6 STORE_FAST 1 (x)
9 LOAD_FAST 1 (x)
12 YIELD_VALUE
13 POP_TOP
14 JUMP_ABSOLUTE 3
>> 17 LOAD_CONST 0 (None)
20 RETURN_VALUE
You can actually show that the two can have different outcomes to prove they are inherently different:
>>> list(next(iter([])) if x > 3 else x for x in range(10))
[0, 1, 2, 3]
>>> [next(iter([])) if x > 3 else x for x in range(10)]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <listcomp>
StopIteration
The expression inside the comprehension is not treated as a generator since the comprehension does not handle the StopIteration, whereas the list constructor does.
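One caveat worth noting: since Python 3.7 (PEP 479), a StopIteration raised inside a generator is converted to RuntimeError, so on modern interpreters the generator-expression version above no longer silently truncates to [0, 1, 2, 3]; it raises instead. The two forms still behave differently (RuntimeError vs. StopIteration):

```python
gen = (next(iter([])) if x > 3 else x for x in range(10))
try:
    result = list(gen)           # pre-3.7 this produced [0, 1, 2, 3]
except RuntimeError as exc:      # PEP 479 behavior on 3.7+
    print("raised:", type(exc).__name__)
```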
They aren't the same: list() will evaluate whatever is given to it after what is in the parentheses has finished executing, not before. The [] in Python is a bit magical; it tells Python to wrap whatever is inside it in a list, more like a type hint for the language.
