Disclaimer: I'm not new to programming, but I am new to Python. This may be a pretty basic question.
I have the following block of code:
for x in range(0, 100):
    y = 1 + 1;
Is the calculation of 1 + 1 in the second line executed 100 times?
I have two suspicions why it might not:
1) The compiler sees 1 + 1 as a constant value, and thus compiles this line into y = 2;.
2) The compiler sees that y is only set and never referenced, so it omits this line of code.
Are either/both of these correct, or does it actually get executed each iteration over the loop?
Option 1 is what happens: the CPython compiler simplifies mathematical expressions on constants in the peephole optimiser.
Python will not eliminate the loop body however.
You can introspect what Python produces by looking at the bytecode; use the dis module to take a look:
>>> import dis
>>> def f():
...     for x in range(100):
...         y = 1 + 1
...
>>> dis.dis(f)
2 0 SETUP_LOOP 26 (to 29)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (100)
9 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
12 GET_ITER
>> 13 FOR_ITER 12 (to 28)
16 STORE_FAST 0 (x)
3 19 LOAD_CONST 3 (2)
22 STORE_FAST 1 (y)
25 JUMP_ABSOLUTE 13
>> 28 POP_BLOCK
>> 29 LOAD_CONST 0 (None)
32 RETURN_VALUE
The bytecode at offset 19, LOAD_CONST, loads the value 2 to store in y.
You can see the constants associated with a code object in its co_consts attribute; for functions you can find the code object under the __code__ attribute:
>>> f.__code__.co_consts
(None, 100, 1, 2)
None is the default return value for any function, 100 is the literal passed to the range() call, 1 is the original literal (left in place by the peephole optimiser), and 2 is the result of the optimisation.
The work is done in peephole.c, in the fold_binops_on_constants() function:
/* Replace LOAD_CONST c1. LOAD_CONST c2 BINOP
with LOAD_CONST binop(c1,c2)
The consts table must still be in list form so that the
new constant can be appended.
Called with codestr pointing to the first LOAD_CONST.
Abandons the transformation if the folding fails (i.e. 1+'a').
If the new constant is a sequence, only folds when the size
is below a threshold value. That keeps pyc files from
becoming large in the presence of code like: (None,)*1000.
*/
Take into account that Python is a highly dynamic language; such optimisations can only be applied to literals and constants that cannot later be dynamically replaced.
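As a quick illustration of that limit (a minimal sketch; exact offsets and opcode names vary between CPython versions), compare an expression built only from literals with one that involves a name:

import dis

def folded():
    y = 1 + 1       # both operands are literals, so the peephole optimiser folds this

def not_folded(a):
    y = a + 1       # 'a' is a name, so the addition has to happen at run time

dis.dis(folded)     # shows a single LOAD_CONST 2 for the right-hand side
dis.dis(not_folded) # shows LOAD_FAST (a), LOAD_CONST (1), then a binary-add opcode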
Related
I wrote the following function:
def f():
    for i in range(100000):
        print(i)
    some_function_that_doesnt_exist()
When I run my file, this will print out numbers in range 100000 and then raise the error: NameError: name 'some_function_that_doesnt_exist' is not defined.
This is an example of a broader question I have about Python. If the Python code is compiled to bytecode before the Python VM interprets it (or JITs it), why don't we see this error during compilation? Why are 100000 numbers printing out before we finally see an error? In a purely compiled language like C we would see the error at compile time, so why don't we see it in this "partially compiled" language?
You can look at what the Python VM runs using dis:
import dis

def f():
    for i in range(100000):
        print(i)
    some_function_that_doesnt_exist()

dis.dis(f)
This will output:
4 0 SETUP_LOOP 24 (to 26)
2 LOAD_GLOBAL 0 (range)
4 LOAD_CONST 1 (100000)
6 CALL_FUNCTION 1
8 GET_ITER
>> 10 FOR_ITER 12 (to 24)
12 STORE_FAST 0 (i)
5 14 LOAD_GLOBAL 1 (print)
16 LOAD_FAST 0 (i)
18 CALL_FUNCTION 1
20 POP_TOP
22 JUMP_ABSOLUTE 10
>> 24 POP_BLOCK
6 >> 26 LOAD_GLOBAL 2 (some_function_that_doesnt_exist)
28 CALL_FUNCTION 0
30 POP_TOP
32 LOAD_CONST 0 (None)
34 RETURN_VALUE
This is what the Python VM actually runs. As you can see, offsets 10 to 24 deal with the printing.
Only when execution reaches offset 26 does the Python VM try to load the function, which results in the NameError.
A bit on the difference: when a compiled language translates to machine code (C to assembly, for example), a function call needs to know exactly where in memory the function lives. So if you try to compile a call to a non-existent function, the compiler cannot translate it, because it cannot look up the function's location.
The Python VM, by contrast, looks names up in a table of globals that can be modified at run time, and the lookup itself also happens at run time.
Doing:
dis.dis("def some_function_that_doesnt_exist(): 1")
will result in:
1 0 LOAD_CONST 0 (<code object some_function_that_doesnt_exist at 0x7f38dc90ed20, file "<dis>", line 1>)
2 LOAD_CONST 1 ('some_function_that_doesnt_exist')
4 MAKE_FUNCTION 0
6 STORE_NAME 0 (some_function_that_doesnt_exist)
8 LOAD_CONST 2 (None)
10 RETURN_VALUE
Offset 2 loads the function's name from the constants table, and after MAKE_FUNCTION builds the function at offset 4, STORE_NAME at offset 6 stores it under that name. The next time that name is looked up, the Python VM will be able to find it.
Python code is compiled to bytecode and that bytecode is then interpreted at runtime. Your code is perfectly valid Python, so it will compile to bytecode successfully. That bytecode, however, will throw an error at runtime when the interpreter attempts to execute it.
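To see that the lookup really is deferred to run time (a small sketch, not from the original question), you can define the missing function after f, but before f() is actually called:

def f():
    for i in range(3):
        print(i)
    some_function_that_doesnt_exist()    # only looked up when this line executes

def some_function_that_doesnt_exist():   # defined after f, but before the call below
    print("found at runtime")

f()   # prints 0, 1, 2 and then "found at runtime" - no NameError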
I'm using symtable to get the symbol tables of a piece of code. Curiously, when using a comprehension (listcomp, setcomp, etc.), there are some extra symbols I didn't define.
Reproduction (using CPython 3.6):
import symtable
root = symtable.symtable('[x for x in y]', '?', 'exec')
# Print symtable of the listcomp
print(root.get_children()[0].get_symbols())
Output:
[<symbol '.0'>, <symbol '_[1]'>, <symbol 'x'>]
Symbol x is expected. But what are .0 and _[1]?
Note that with any other non-comprehension construct I'm getting exactly the identifiers I used in the code. E.g., lambda x: y only results in the symbols [<symbol 'x'>, <symbol 'y'>].
Also, the docs say that symtable.Symbol is...
An entry in a SymbolTable corresponding to an identifier in the source.
...although these identifiers evidently don't appear in the source.
The two names are used to implement list comprehensions as a separate scope, and they have the following meaning:
.0 is an implicit argument, used for the iterable (sourced from y in your case).
_[1] is a temporary name in the symbol table, used for the target list. This list eventually ends up on the stack.*
A list comprehension (as well as dict and set comprehension and generator expressions) is executed in a new scope. To achieve this, Python effectively creates a new anonymous function.
Because it is a function, really, you need to pass in the iterable you are looping over as an argument. This is what .0 is for: it is the first implicit argument (so at index 0). The symbol table you produced explicitly lists .0 as an argument:
>>> root = symtable.symtable('[x for x in y]', '?', 'exec')
>>> type(root.get_children()[0])
<class 'symtable.Function'>
>>> root.get_children()[0].get_parameters()
('.0',)
The first child of your table is a function with one argument named .0.
A list comprehension also needs to build the output list, and that list could be seen as a local too. This is the _[1] temporary variable. It never actually becomes a named local variable in the code object that is produced; this temporary variable is kept on the stack instead.
You can see the code object produced when you use compile():
>>> code_object = compile('[x for x in y]', '?', 'exec')
>>> code_object
<code object <module> at 0x11a4f3ed0, file "?", line 1>
>>> code_object.co_consts[0]
<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>
So there is an outer code object, and in the constants, is another, nested code object. That latter one is the actual code object for the loop. It uses .0 and x as local variables. It also takes 1 argument; the names for arguments are the first co_argcount values in the co_varnames tuple:
>>> code_object.co_consts[0].co_varnames
('.0', 'x')
>>> code_object.co_consts[0].co_argcount
1
So .0 is the argument name here.
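The same co_varnames[:co_argcount] convention applies to ordinary functions, if you want to sanity-check it on something you wrote yourself:

>>> def g(a, b, c=1): pass
...
>>> g.__code__.co_varnames[:g.__code__.co_argcount]
('a', 'b', 'c')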
The _[1] temporary variable is handled on the stack, see the disassembly:
>>> import dis
>>> dis.dis(code_object.co_consts[0])
1 0 BUILD_LIST 0
2 LOAD_FAST 0 (.0)
>> 4 FOR_ITER 8 (to 14)
6 STORE_FAST 1 (x)
8 LOAD_FAST 1 (x)
10 LIST_APPEND 2
12 JUMP_ABSOLUTE 4
>> 14 RETURN_VALUE
Here we see .0 referenced again. _[1] is the list that the BUILD_LIST opcode pushes onto the stack; then .0 is put on the stack for the FOR_ITER opcode to iterate over (when the iterator is exhausted, FOR_ITER removes it from the stack again).
Each iteration result is pushed onto the stack by FOR_ITER, popped again and stored in x with STORE_FAST, then loaded onto the stack again with LOAD_FAST. Finally LIST_APPEND takes the top element from the stack, and adds it to the list referenced by the next element on the stack, so to _[1].
JUMP_ABSOLUTE then brings us back to the top of the loop, where we continue to iterate until the iterable is done. Finally, RETURN_VALUE returns the top of the stack, again _[1], to the caller.
The outer code object does the work of loading the nested code object and calling it as a function:
>>> dis.dis(code_object)
1 0 LOAD_CONST 0 (<code object <listcomp> at 0x11a4ea8a0, file "?", line 1>)
2 LOAD_CONST 1 ('<listcomp>')
4 MAKE_FUNCTION 0
6 LOAD_NAME 0 (y)
8 GET_ITER
10 CALL_FUNCTION 1
12 POP_TOP
14 LOAD_CONST 2 (None)
16 RETURN_VALUE
So this makes a function object, with the function named <listcomp> (helpful for tracebacks), loads y, produces an iterator for it (the moral equivalent of iter(y)), and calls the function with that iterator as the argument.
If you wanted to translate that to pseudo-code, it would look like:
def <listcomp>(.0):
    _[1] = []
    for x in .0:
        _[1].append(x)
    return _[1]

<listcomp>(iter(y))
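If you want something you can actually run, here is a rough equivalent with ordinary identifiers standing in for .0 and _[1] (those two are deliberately not legal Python names, so this is only an approximation):

def _listcomp(dot0):      # dot0 plays the role of the implicit argument .0
    result = []           # result plays the role of the temporary _[1]
    for x in dot0:
        result.append(x)
    return result

y = [1, 2, 3]
assert _listcomp(iter(y)) == [x for x in y]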
The _[1] temporary variable is of course not needed for generator expressions:
>>> symtable.symtable('(x for x in y)', '?', 'exec').get_children()[0].get_symbols()
[<symbol '.0'>, <symbol 'x'>]
Instead of appending to a list, the generator expression function object yields the values:
>>> dis.dis(compile('(x for x in y)', '?', 'exec').co_consts[0])
1 0 LOAD_FAST 0 (.0)
>> 2 FOR_ITER 10 (to 14)
4 STORE_FAST 1 (x)
6 LOAD_FAST 1 (x)
8 YIELD_VALUE
10 POP_TOP
12 JUMP_ABSOLUTE 2
>> 14 LOAD_CONST 0 (None)
16 RETURN_VALUE
Together with the outer bytecode, the generator expression is equivalent to:
def <genexpr>(.0):
    for x in .0:
        yield x

<genexpr>(iter(y))
* The temporary variable is actually no longer needed; they were used in the initial implementation of comprehensions, but this commit from April 2007 moved the compiler to just using the stack, and this has been the norm for all of the 3.x releases, as well as Python 2.7. It still is easier to think of the generated name as a reference to the stack. Because the variable is no longer needed, I filed issue 32836 to have it removed, and Python 3.8 and onwards will no longer include it in the symbol table.
In Python 2.6, you can still see the actual temporary name in the disassembly:
>>> import dis
>>> dis.dis(compile('[x for x in y]', '?', 'exec'))
1 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_NAME 0 (_[1])
7 LOAD_NAME 1 (y)
10 GET_ITER
>> 11 FOR_ITER 13 (to 27)
14 STORE_NAME 2 (x)
17 LOAD_NAME 0 (_[1])
20 LOAD_NAME 2 (x)
23 LIST_APPEND
24 JUMP_ABSOLUTE 11
>> 27 DELETE_NAME 0 (_[1])
30 POP_TOP
31 LOAD_CONST 0 (None)
34 RETURN_VALUE
Note how the name actually has to be deleted again!
So, the way list comprehensions are implemented is actually by creating a code object; it is sort of like creating a one-time-use anonymous function, for scoping purposes:
>>> import dis
>>> def f(y): [x for x in y]
...
>>> dis.dis(f)
1 0 LOAD_CONST 1 (<code object <listcomp> at 0x101df9db0, file "<stdin>", line 1>)
3 LOAD_CONST 2 ('f.<locals>.<listcomp>')
6 MAKE_FUNCTION 0
9 LOAD_FAST 0 (y)
12 GET_ITER
13 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
16 POP_TOP
17 LOAD_CONST 0 (None)
20 RETURN_VALUE
>>>
Inspecting the code-object, I can find the .0 symbol:
>>> dis.dis(f.__code__.co_consts[1])
1 0 BUILD_LIST 0
3 LOAD_FAST 0 (.0)
>> 6 FOR_ITER 12 (to 21)
9 STORE_FAST 1 (x)
12 LOAD_FAST 1 (x)
15 LIST_APPEND 2
18 JUMP_ABSOLUTE 6
>> 21 RETURN_VALUE
Note, the LOAD_FAST in the list-comp code object seems to be loading the unnamed argument .0, which would correspond to the iterator produced by the GET_ITER in the outer bytecode and passed in by the call.
I am going through the docs of Python 3.x, and I have a doubt about the speed of execution of list comprehensions and how exactly they work.
Let's take the following example:
Listing 1
...
L = range(0,10)
L = [x ** 2 for x in L]
...
Now, to my knowledge, this returns a new list, and it's equivalent to writing:
Listing 2
...
res = []
for x in L:
    res.append(x ** 2)
...
The main difference, if I am correct, is the speed of execution: Listing 1 is supposed to be performed at C-language speed inside the interpreter, whereas Listing 2 is not.
But Listing 2 is (as far as I can tell) what the list comprehension does internally, so why is Listing 1 executed at C speed inside the interpreter while Listing 2 is not? Both are converted to bytecode before being processed, or am I missing something?
Look at the actual bytecode that is produced. I've put the two fragments of code into functions called f1 and f2.
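For reference, the setup behind the listings below probably looked something like this (a sketch; the exact offsets shown are from one particular CPython 3.x build):

import dis

def f1():
    L = range(0, 10)
    L = [x ** 2 for x in L]

def f2():
    L = range(0, 10)
    res = []
    for x in L:
        res.append(x ** 2)

dis.dis(f1)   # produces the comprehension bytecode shown first
dis.dis(f2)   # produces the explicit-loop bytecode shown after it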
The comprehension does this:
3 15 LOAD_CONST 3 (<code object <listcomp> at 0x7fbf6c1b59c0, file "<stdin>", line 3>)
18 LOAD_CONST 4 ('f1.<locals>.<listcomp>')
21 MAKE_FUNCTION 0
24 LOAD_FAST 0 (L)
27 GET_ITER
28 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
31 STORE_FAST 0 (L)
Notice there is no loop in the bytecode. The loop happens in C.
Now the for loop does this:
4 21 SETUP_LOOP 31 (to 55)
24 LOAD_FAST 0 (L)
27 GET_ITER
>> 28 FOR_ITER 23 (to 54)
31 STORE_FAST 2 (x)
34 LOAD_FAST 1 (res)
37 LOAD_ATTR 1 (append)
40 LOAD_FAST 2 (x)
43 LOAD_CONST 3 (2)
46 BINARY_POWER
47 CALL_FUNCTION 1 (1 positional, 0 keyword pair)
50 POP_TOP
51 JUMP_ABSOLUTE 28
>> 54 POP_BLOCK
In contrast to the comprehension, the loop is clearly here in the bytecode. So the loop occurs in Python.
The bytecodes are different, and the first should be faster.
The answer is actually in your question.
When you run any built-in Python function you are running something that has been written in C and compiled into machine code.
When you write your own version of it, that code must be compiled to bytecode that the interpreter works through one operation at a time.
As a consequence, the built-in approach or function is almost always quicker (or takes less space) in Python than writing your own version.
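If you want to measure the difference rather than reason about it, a rough timeit comparison looks like this (a sketch; the numbers vary by machine and Python version, but the comprehension is usually the faster of the two):

import timeit

setup = "L = list(range(1000))"
loop_stmt = "res = []\nfor x in L:\n    res.append(x ** 2)"

print(timeit.timeit("[x ** 2 for x in L]", setup=setup, number=10000))
print(timeit.timeit(loop_stmt, setup=setup, number=10000))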
I am using lambda functions for GUI programming with tkinter.
Recently I got stuck when implementing buttons that open files:
self.file=""
button = Button(conf_f, text="Tools opt.",
command=lambda: tktb.helpers.openfile(self.file))
As you see, I want to define a file path that can be updated, and that is not known when creating the GUI.
The issue I had is that earlier my code was :
button = Button(conf_f, text="Tools opt.",
command=lambda f=self.file: tktb.helpers.openfile(f))
The lambda function had a keyword argument to pass the file path. In this case, the parameter f was not updated when self.file was.
I got the keyword argument from a code snippet and I use it everywhere. Obviously I shouldn't...
This is still not clear to me... Could someone explain to me the difference between the two lambda forms and when to use one or the other?
PS: The following comment led me to the solution but I'd like a little more explanation:
lambda working oddly with tkinter
I'll try to explain it more in depth.
If you do
i = 0
f = lambda: i
you create a function (lambda is essentially a function) which accesses its enclosing scope's i variable.
Internally, it does so by having a so-called closure which contains i. It is, loosely speaking, a kind of pointer to the real variable, which can hold different values at different points in time.
def a():
    # first, yield a function to access i
    yield lambda: i
    # now, set i to different values successively
    for i in range(100): yield
g = a() # create generator
f = next(g) # get the function
f() # -> error as i is not set yet
next(g)
f() # -> 0
next(g)
f() # -> 1
# and so on
f.func_closure # -> an object stemming from the local scope of a()
f.func_closure[0].cell_contents # -> the current value of this variable
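(func_closure is the Python 2 spelling; in Python 3 the same information is available as __closure__:)

f.__closure__                    # the closure tuple, Python 3 spelling
f.__closure__[0].cell_contents   # the value currently held in the cell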
Here, all values of i are - at their time - stored in that closure. If the function f() needs them, it gets them from there.
You can see that difference on the disassembly listings:
The functions a() and f() from above disassemble like this:
>>> dis.dis(a)
2 0 LOAD_CLOSURE 0 (i)
3 BUILD_TUPLE 1
6 LOAD_CONST 1 (<code object <lambda> at 0xb72ea650, file "<stdin>", line 2>)
9 MAKE_CLOSURE 0
12 YIELD_VALUE
13 POP_TOP
3 14 SETUP_LOOP 25 (to 42)
17 LOAD_GLOBAL 0 (range)
20 LOAD_CONST 2 (100)
23 CALL_FUNCTION 1
26 GET_ITER
>> 27 FOR_ITER 11 (to 41)
30 STORE_DEREF 0 (i)
33 LOAD_CONST 0 (None)
36 YIELD_VALUE
37 POP_TOP
38 JUMP_ABSOLUTE 27
>> 41 POP_BLOCK
>> 42 LOAD_CONST 0 (None)
45 RETURN_VALUE
>>> dis.dis(f)
2 0 LOAD_DEREF 0 (i)
3 RETURN_VALUE
Compare that to a function b() which looks like
>>> def b():
...     for i in range(100): yield
>>> dis.dis(b)
2 0 SETUP_LOOP 25 (to 28)
3 LOAD_GLOBAL 0 (range)
6 LOAD_CONST 1 (100)
9 CALL_FUNCTION 1
12 GET_ITER
>> 13 FOR_ITER 11 (to 27)
16 STORE_FAST 0 (i)
19 LOAD_CONST 0 (None)
22 YIELD_VALUE
23 POP_TOP
24 JUMP_ABSOLUTE 13
>> 27 POP_BLOCK
>> 28 LOAD_CONST 0 (None)
31 RETURN_VALUE
The main difference in the loop is
>> 13 FOR_ITER 11 (to 27)
16 STORE_FAST 0 (i)
in b() vs.
>> 27 FOR_ITER 11 (to 41)
30 STORE_DEREF 0 (i)
in a(): the STORE_DEREF stores in a cell object (closure), while STORE_FAST uses a "normal" variable, which (probably) works a little bit faster.
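You can also see which names the compiler turned into closure cells without reading the bytecode, by checking the code objects of the a() and b() defined above (a small sketch using CPython's code-object attributes):

print(a.__code__.co_cellvars)   # ('i',) - i lives in a closure cell in a()
print(b.__code__.co_cellvars)   # ()     - i is a plain fast local in b()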
The lambda as well makes a difference:
>>> dis.dis(lambda: i)
1 0 LOAD_GLOBAL 0 (i)
3 RETURN_VALUE
Here you have a LOAD_GLOBAL, while the one above uses LOAD_DEREF. The latter, as well, is for the closure.
I completely forgot about lambda i=i: i.
If you have the value as a default parameter, it finds its way into the function via a completely different path: the current value of i gets passed to the just created function via a default parameter:
>>> i = 42
>>> f = lambda i=i: i
>>> dis.dis(f)
1 0 LOAD_FAST 0 (i)
3 RETURN_VALUE
This way the function gets called as f(). It detects that there is a missing argument and fills the respective parameter with the default value. All this happens before the function is called; from within the function you just see that the value is taken and returned.
And there is yet another way to accomplish your task: Just use the lambda as if it would take a value: lambda i: i. If you call this, it complains about a missing argument.
But you can cope with that with the use of functools.partial:
import functools

ff = [functools.partial(lambda i: i, x) for x in range(100)]
ff[12]()   # -> 12
ff[54]()   # -> 54
This wrapper gets a callable and a number of arguments to be passed. The resulting object is a callable which calls the original callable with these arguments plus any arguments you give to it. It can be used here to lock in the intended value.
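Putting the three approaches side by side may make the practical difference clearer (a small sketch, not taken from the original tkinter code):

import functools

late     = [lambda: i for i in range(3)]                          # closure over i
default  = [lambda i=i: i for i in range(3)]                      # default captures i now
partials = [functools.partial(lambda i: i, x) for x in range(3)]  # argument locked in

print([f() for f in late])       # [2, 2, 2] - all share the final value of i
print([f() for f in default])    # [0, 1, 2] - each lambda froze its own value
print([f() for f in partials])   # [0, 1, 2] - partial bound the argument per call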
I was trying to figure out which integers python only instantiates once (-6 to 256 it seems), and in the process stumbled on some string behaviour I can't see the pattern in. Sometimes, equal strings created in different ways share the same id, sometimes not. This code:
A = "10000"
B = "10000"
C = "100" + "00"
D = "%i"%10000
E = str(10000)
F = str(10000)
G = str(100) + "00"
H = "0".join(("10","00"))
for obj in (A,B,C,D,E,F,G,H):
    print obj, id(obj), obj is A
prints:
10000 4959776 True
10000 4959776 True
10000 4959776 True
10000 4959776 True
10000 4959456 False
10000 4959488 False
10000 4959520 False
10000 4959680 False
I don't even see the pattern - save for the fact that the first four don't have an explicit function call - but surely that can't be it, since the "+" in C for example implies a function call to add. I especially don't understand why C and G are different, seeing as that implies that the ids of the components of the addition are more important than the outcome.
So, what is the special treatment that A-D undergo, making them come out as the same instance?
In terms of language specification, any compliant Python compiler and runtime is fully allowed, for any instance of an immutable type, to make a new instance OR find an existing instance of the same type that's equal to the required value and use a new reference to that same instance. This means it's always incorrect to use is or by-id comparison among immutables, and any minor release may tweak or change strategy in this matter to enhance optimization.
In terms of implementations, the tradeoffs are pretty clear: trying to reuse an existing instance may mean time spent (perhaps wasted) trying to find such an instance, but if the attempt succeeds then some memory is saved (as well as the time to allocate and later free the memory bits needed to hold a new instance).
How to solve those implementation tradeoffs is not entirely obvious -- if you can identify heuristics that indicate that finding a suitable existing instance is likely and the search (even if it fails) will be fast, then you may want to attempt the search-and-reuse when the heuristics suggest it, but skip it otherwise.
In your observations you seem to have found a particular dot-release implementation that performs a modicum of peephole optimization when that's entirely safe, fast, and simple, so the assignments A to D all boil down to exactly the same as A (but E to H don't, as they involve named functions or methods whose semantics the optimizer's authors may reasonably have considered not 100% safe to assume -- and low-ROI even if that were done -- so they're not peephole-optimized).
Thus, A to D reusing the same instance boils down to A and B doing so (as C and D get peephole-optimized to exactly the same construct).
That reuse, in turn, clearly suggests compiler tactics/optimizer heuristics whereby identical literal constants of an immutable type in the same function's local namespace are collapsed to references to just one instance in the function's .func_code.co_consts (to use current CPython's terminology for attributes of functions and code objects) -- reasonable tactics and heuristics, as reuse of the same immutable constant literal within one function is somewhat frequent, AND the price is only paid once (at compile time) while the advantage is accrued many times (every time the function runs, maybe within loops etc etc).
(It so happens that these specific tactics and heuristics, given their clearly-positive tradeoffs, have been pervasive in all recent versions of CPython, and, I believe, IronPython, Jython, and PyPy as well;-).
This is a somewhat worthy and interesting area of study if you're planning to write compilers, runtime environments, peephole optimizers, etc etc, for Python itself or similar languages. I guess that deep study of the internals (ideally of many different correct implementations, of course, so as not to fixate on the quirks of a specific one -- good thing Python currently enjoys at least 4 separate production-worthy implementations, not to mention several versions of each!) can also help, indirectly, make one a better Python programmer -- but it's particularly important to focus on what's guaranteed by the language itself, which is somewhat less than what you'll find in common among separate implementations, because the parts that "just happen" to be in common right now (without being required to be so by the language specs) may perfectly well change under you at the next point release of one or another implementation and, if your production code was mistakenly relying on such details, that might cause nasty surprises;-). Plus -- it's hardly ever necessary, or even particularly helpful, to rely on such variable implementation details rather than on language-mandated behavior (unless you're coding something like an optimizer, debugger, profiler, or the like, of course;-).
Python is allowed to inline string constants; A, B, C and D are actually the same literal (if Python sees a constant expression, it treats it as a constant).
str is actually a class, so str(whatever) is calling this class' constructor, which should yield a fresh object. This explains E,F,G (note that each of these has separate identity).
As for H, I am not sure, but I'd go for explanation that this expression is too complicated for Python to figure out it's actually a constant, so it computes a new string.
I believe short strings that can be evaluated at compile time, will be interned automatically. In the last examples, the result can't be evaluated at compile time because str or join might be redefined.
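A contrived sketch of why the compiler cannot fold those calls (shadowing a builtin like this is a bad idea in real code; it is only here to make the point):

str = lambda x: "surprise"   # hypothetical shadowing of the builtin
E = str(10000)
print(E)                     # surprise - the call is only resolved at run time,
                             # so it could never be folded to '10000' at compile time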
In answer to S.Lott's suggestion of examining the bytecode:
import dis

def moo():
    A = "10000"
    B = "10000"
    C = "100" + "00"
    D = "%i"%10000
    E = str(10000)
    F = str(10000)
    G = "1000"+str(0)
    H = "0".join(("10","00"))
    I = str("10000")

    for obj in (A,B,C,D,E,F,G,H, I):
        print obj, id(obj), obj is A

moo()
print dis.dis(moo)
yields:
10000 4968128 True
10000 4968128 True
10000 4968128 True
10000 4968128 True
10000 2840928 False
10000 2840896 False
10000 2840864 False
10000 2840832 False
10000 4968128 True
4 0 LOAD_CONST 1 ('10000')
3 STORE_FAST 0 (A)
5 6 LOAD_CONST 1 ('10000')
9 STORE_FAST 1 (B)
6 12 LOAD_CONST 10 ('10000')
15 STORE_FAST 2 (C)
7 18 LOAD_CONST 11 ('10000')
21 STORE_FAST 3 (D)
8 24 LOAD_GLOBAL 0 (str)
27 LOAD_CONST 5 (10000)
30 CALL_FUNCTION 1
33 STORE_FAST 4 (E)
9 36 LOAD_GLOBAL 0 (str)
39 LOAD_CONST 5 (10000)
42 CALL_FUNCTION 1
45 STORE_FAST 5 (F)
10 48 LOAD_CONST 6 ('1000')
51 LOAD_GLOBAL 0 (str)
54 LOAD_CONST 7 (0)
57 CALL_FUNCTION 1
60 BINARY_ADD
61 STORE_FAST 6 (G)
11 64 LOAD_CONST 8 ('0')
67 LOAD_ATTR 1 (join)
70 LOAD_CONST 12 (('10', '00'))
73 CALL_FUNCTION 1
76 STORE_FAST 7 (H)
12 79 LOAD_GLOBAL 0 (str)
82 LOAD_CONST 1 ('10000')
85 CALL_FUNCTION 1
88 STORE_FAST 8 (I)
14 91 SETUP_LOOP 66 (to 160)
94 LOAD_FAST 0 (A)
97 LOAD_FAST 1 (B)
100 LOAD_FAST 2 (C)
103 LOAD_FAST 3 (D)
106 LOAD_FAST 4 (E)
109 LOAD_FAST 5 (F)
112 LOAD_FAST 6 (G)
115 LOAD_FAST 7 (H)
118 LOAD_FAST 8 (I)
121 BUILD_TUPLE 9
124 GET_ITER
>> 125 FOR_ITER 31 (to 159)
128 STORE_FAST 9 (obj)
15 131 LOAD_FAST 9 (obj)
134 PRINT_ITEM
135 LOAD_GLOBAL 2 (id)
138 LOAD_FAST 9 (obj)
141 CALL_FUNCTION 1
144 PRINT_ITEM
145 LOAD_FAST 9 (obj)
148 LOAD_FAST 0 (A)
151 COMPARE_OP 8 (is)
154 PRINT_ITEM
155 PRINT_NEWLINE
156 JUMP_ABSOLUTE 125
>> 159 POP_BLOCK
>> 160 LOAD_CONST 0 (None)
163 RETURN_VALUE
So it would seem that indeed the compiler understands A-D to mean the same thing, and so it saves memory by only generating it once (as suggested by Alex, Maciej and Greg). (The added case I seems to just be str() realising it's trying to make a string from a string, and passing it through.)
Thanks everyone, that's a lot clearer now.