Uncommon behaviour of the is operator in Python

From some of the answers on Stack Overflow, I learned that integers from -5 to 256 are cached, so the same object is referenced and we get True for:
>>> a = 256
>>> a is 256
True
Now comes the twist (see this line before marking duplicate):
>>> a = 257
>>> a is 257
False
This is completely understood, but now if I do:
>>> a = 257; a is 257
True
>>> a = 12345; a is 12345
True
Why?

What you're seeing is an optimization in the compiler in CPython (which compiles your source code into the bytecode that the interpreter runs). Whenever the same immutable constant value is used in several different places within a chunk of code that is being compiled in one step, the compiler will try to use a reference to the same object for each place.
So if you do multiple assignments on the same line in an interactive session, you'll get two references to the same object, but you won't if you use two separate lines:
>>> x = 257; y = 257 # multiple statements on the same line are compiled in one step
>>> print(x is y) # prints True
>>> x = 257
>>> y = 257
>>> print(x is y) # prints False this time, since the assignments were compiled separately
Another place this optimization comes up is in the body of a function. The whole function body will be compiled together, so any constants used anywhere in the function can be combined, even if they're on separate lines:
def foo():
    x = 257
    y = 257
    return x is y  # this will always return True
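Calling the function confirms this, though (per the caveats below) the result is a CPython implementation detail rather than a guarantee:
print(foo())  # True: both 257 literals in the body share one constant slot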
While it's interesting to investigate optimizations like this one, you should never rely upon this behavior in your normal code. Different Python interpreters, and even different versions of CPython may do these optimizations differently or not at all. If your code depends on a specific optimization, it may be completely broken for somebody else who tries to run it on their own system.
As an example, the two assignments on the same line that I show in my first code block above don't result in two references to the same object when I do it in the interactive shell inside Spyder (my preferred IDE). I have no idea why that specific situation doesn't work the same way it does in a conventional interactive shell, but the different behavior is my fault, since my code relies upon implementation-specific behavior.

Generally speaking, numbers outside the range -5 to 256 are not guaranteed to get the caching that numbers within that range do. However, Python is free to apply other optimizations as appropriate. In your case, you're seeing that the same literal value used multiple times on one line is stored in a single memory location, no matter how many times it's used on that line. Here are some other examples of this behavior:
>>> s = 'a'; s is 'a'
True
>>> s = 'asdfghjklzxcvbnmsdhasjkdhskdja'; s is 'asdfghjklzxcvbnmsdhasjkdhskdja'
True
>>> x = 3.14159; x is 3.14159
True
>>> t = 'a' + 'b'; t is 'a' + 'b'
True
>>>

From python2 docs:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. x is not y yields the
inverse truth value. [6]
From python3 docs:
The operators is and is not test for object identity: x is y is true
if and only if x and y are the same object. Object identity is
determined using the id() function. x is not y yields the inverse
truth value. [4]
So basically the key to understanding the tests you've run in the REPL console is to use the id() function accordingly; here's an example that will show you what's going on behind the curtains:
>>> a=256
>>> id(a);id(256);a is 256
2012996640
2012996640
True
>>> a=257
>>> id(a);id(257);a is 257
36163472
36162032
False
>>> a=257;id(a);id(257);a is 257
36162496
36162496
True
>>> a=12345;id(a);id(12345);a is 12345
36162240
36162240
True
That said, usually a good way to understand what's going on behind the curtains with these types of snippets is to use either dis.dis or dis.disco. Let's take a look, for instance, at what this snippet would look like:
import dis
import textwrap
dis.disco(compile(textwrap.dedent("""\
a=256
a is 256
a=257
a is 257
a=257;a is 257
a=12345;a is 12345\
"""), '', 'exec'))
the output would be:
1 0 LOAD_CONST 0 (256)
2 STORE_NAME 0 (a)
2 4 LOAD_NAME 0 (a)
6 LOAD_CONST 0 (256)
8 COMPARE_OP 8 (is)
10 POP_TOP
3 12 LOAD_CONST 1 (257)
14 STORE_NAME 0 (a)
4 16 LOAD_NAME 0 (a)
18 LOAD_CONST 1 (257)
20 COMPARE_OP 8 (is)
22 POP_TOP
5 24 LOAD_CONST 1 (257)
26 STORE_NAME 0 (a)
28 LOAD_NAME 0 (a)
30 LOAD_CONST 1 (257)
32 COMPARE_OP 8 (is)
34 POP_TOP
6 36 LOAD_CONST 2 (12345)
38 STORE_NAME 0 (a)
40 LOAD_NAME 0 (a)
42 LOAD_CONST 2 (12345)
44 COMPARE_OP 8 (is)
46 POP_TOP
48 LOAD_CONST 3 (None)
50 RETURN_VALUE
As we can see, in this case the disassembly doesn't tell us very much; lines 3-4 are basically the "same" instructions as line 5. So my recommendation would be, once again, to use id() smartly so you'll know what is will actually compare. In case you want to know exactly which optimizations CPython is doing, I'm afraid you'd need to dig into its source code.
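For instance, one quick way to see the sharing without reading the disassembly is to inspect the co_consts tuple of the compiled code object (its exact contents and ordering are a CPython implementation detail, so treat this as an illustration only):
>>> code = compile("a=257;a is 257", '', 'exec')
>>> code.co_consts   # the literal 257 occupies a single shared slot
(257, None)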

After discussion and testing in various versions, the following conclusions can be drawn.
Python compiles and interprets instructions in blocks. Which instructions end up in one block depends on the syntax used, the Python version, the operating system, and the distribution, so different results may be achieved.
The general rules are:
(from official documentation)
The current implementation keeps an array of integer objects for all
integers between -5 and 256
Therefore:
a = 256
id(a)
Out[2]: 1997190544
id(256)
Out[3]: 1997190544 # int actually stored once within Python
a = 257
id(a)
Out[5]: 2365489141456
id(257)
Out[6]: 2365489140880 #literal, temporary. as you see the ids differ
id(257)
Out[7]: 2365489142192 # literal, temporary. as you see it gets a new id everytime
# since it is not pre-stored
The part below returns False in Python 3.6.3 |Anaconda custom (64-bit)| (default, Oct 17 2017, 23:26:12) [MSC v.1900 64 bit (AMD64)]
a = 257; a is 257
Out[8]: False
But
a=257; print(a is 257); a=258; print(a is 257)
True
False
As is evident, whatever Python takes in "one block" is non-deterministic and can be swayed by how the code is written (single line or not), as well as by the version, operating system, and distribution used.
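As a hedged illustration of the "one block" idea: forcing both statements through a single compile() call typically makes them share one constant, regardless of how you would have typed them interactively:
src = "a = 257\nprint(a is 257)"
exec(compile(src, '<one block>', 'exec'))   # typically prints True on CPython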

Related

Why does `x = 1234`, `y = 1234`, `x is y` return False in the REPL, but True in a stand-alone script?

I know there is something called interning in python, so basically
x, y = 1, 1
print(x is y) # True
x = 1234
y = 1234
print(x is y) # False
However, when I wrap it into a script and run it with the python command, it prints True twice. My guess is there are some optimizations under the hood, but I cannot find any reference to them. Could someone explain what causes such behaviour and how to run that script without it?
I am on Ubuntu 20 and use CPython, version Python 3.9.9+ [GCC 9.3.0] on linux if that matters.
First, and the only important thing you have to know:
you can't rely on the "sameness" of Python literals, be they ints, strings, or whatever.
So keep in mind that this is absolutely irrelevant, except for the fact that one always has to compare numbers, strings, and even True and False with ==, never with the is operator, in any code intended to actually work in a consistent way.
That said, the reason the script will always print True in the case of a saved script, and will depend on version, runtime, lunar phase, CPU architecture in the interactive mode is simple:
with a script, the code is only executed after all of it has been compiled. While in interactive mode, each line of code is compiled and executed independently as you go.
So, when the compiler "sees" the same constant in the same block of code (the 1234 integer), it simply reuses the object it already created as a constant: it is a straightforward optimization.
While in the interactive mode, the literal will be "seen" only when compiling an entire new block of code, with a different internal state.
Regardless of the outputs and of this reasoning: this is not to be trusted. It is not part of the language specification. Compare numbers with == only - and forget there is a chance they might or not be the same object in memory. It is irrelevant either way.
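A minimal contrast, assuming nothing about the interpreter:
x = 1234
y = 1234
print(x == y)   # True: value equality, the check you almost always want
print(x is y)   # identity: implementation-dependent, may be True or False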
It’s called constant pooling, and it’s a pretty standard technique when implementing interpreters.
>>> def f():
...     x = 1234
...     y = 1234
...     return x is y
...
>>> f()
True
>>> import dis
>>> dis.dis(f)
2 0 LOAD_CONST 1 (1234)
2 STORE_FAST 0 (x)
3 4 LOAD_CONST 1 (1234)
6 STORE_FAST 1 (y)
4 8 LOAD_FAST 0 (x)
10 LOAD_FAST 1 (y)
12 IS_OP 0
14 RETURN_VALUE
Each closed (self-contained) piece of bytecode carries a constant pool with it. When the compiler parses a suite as a single unit, literals found in the code at compile time are added into the pool; when the same value is encountered again, the constant pool slot is reused. When the function bytecode is later executed, the values are loaded from the pool onto the value stack, and then manipulated there. Here, both instances of the literal 1234 end up as reads from the same pool slot 1 (slot 0 is reserved for None). Because they read from the same slot, they end up reading the same object, which is of course, the same as itself.
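On CPython you can peek at that pool directly through the code object's co_consts attribute (an implementation detail, but it makes the shared slot visible):
>>> f.__code__.co_consts   # slot 0 holds None, slot 1 holds the single shared 1234
(None, 1234)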
Pooling can be applied not only to literals, but also to values obtained by constant folding:
>>> def g():
...     x = 4
...     y = 2 + 2
...     return x is y
...
>>> dis.dis(g)
2 0 LOAD_CONST 1 (4)
2 STORE_FAST 0 (x)
3 4 LOAD_CONST 1 (4)
6 STORE_FAST 1 (y)
4 8 LOAD_FAST 0 (x)
10 LOAD_FAST 1 (y)
12 IS_OP 0
14 RETURN_VALUE
At the REPL prompt, every prompt triggers a separate compilation, which does not share a constant pool with any other; doing otherwise would arguably amount to having a memory leak. As such, number literals that are not otherwise interned end up referring to different objects when they are provided at different prompts.
>>> x = 1234
>>> y = 1234
>>> id(x)
140478281905648
>>> id(y)
140478281906160
>>> x is y
False
Constant pooling is pretty fundamental to the design of CPython and cannot be disabled as such. After all, the bytecode has no way to refer to a hardcoded value other than by referring to the constant pool. There is also no option that disables reusing constant pool slots for already-encountered values. But if you’re crazy enough…
def deadpool(func):
    import dis
    import opcode
    import functools
    new_cpool = [None]                  # fresh constant pool; slot 0 stays None
    new_bcode = bytearray(func.__code__.co_code)
    _Func = type(lambda: 0)             # the function type

    def pool(value):
        # Always append, so every constant load gets its own slot.
        idx = len(new_cpool)
        new_cpool.append(value)
        return idx

    def clone(val):
        # Produce a distinct-but-equal object where possible.
        if isinstance(val, int):
            return int(str(val))
        return val

    op_EXTENDED_ARG = opcode.opmap['EXTENDED_ARG']
    op_LOAD_CONST = opcode.opmap['LOAD_CONST']
    insn_ext = None
    for insn in dis.get_instructions(func):
        if insn.opcode == op_LOAD_CONST:
            idx = pool(clone(func.__code__.co_consts[insn.arg]))
            # Slot indices above 255 need a preceding EXTENDED_ARG for the high byte.
            assert idx < 256 or (idx < 65536 and insn_ext is not None)
            new_bcode[insn.offset + 1] = idx & 0xff
            if insn_ext:
                new_bcode[insn_ext.offset + 1] = idx >> 8
            insn_ext = None
        elif insn.opcode == op_EXTENDED_ARG:
            assert insn_ext is None
            insn_ext = insn
        else:
            insn_ext = None

    return functools.wraps(func)(_Func(
        func.__code__.replace(
            co_code=bytes(new_bcode),
            co_consts=tuple(new_cpool)
        ),
        func.__globals__,
        func.__name__,
        func.__defaults__,
        func.__closure__
    ))

def f():
    x = 1234
    y = 1234
    return x is y

@deadpool
def g():
    x = 1234
    y = 1234
    return x is y

print(f())  # True
print(g())  # False
…you can re-write the bytecode so that each constant load refers to a different slot, and then attempt to put a distinct, though otherwise indistinguishable object in each slot. (The above is just a proof-of-concept; there are some corner cases on which it fails, which would be much more laborious to cover fully.)
The above can be made to run in PyPy with only slight modifications. The results, however, will be different, because PyPy does not expose the identity of integer objects and always compares them by value, even when using the is operator. And after all, why should it not? As the other answer rightly points out, identity of primitives is an implementation detail with which you should not be concerned when writing ordinary code, and even most extraordinary code.

Why is True if foo else False faster than bool(foo)?

Why is it that according to the timeit.timeit function the code boolean = True if foo else False runs faster than the code boolean = bool(foo)?
How is it that the if statement is able to determine the trueness of foo faster than the bool function itself?
Why doesn't the bool function simply use the same mechanic?
And what is the purpose of the bool function when it can be outperformed by a factor of four by a different technique?
Or, is it so that I am misusing the timeit function and that bool(foo) is, in fact, faster?
>>> timeit.timeit("boolean = True if foo else False", setup="foo='zon-zero'")
0.021019499999965774
>>> timeit.timeit("boolean = bool(foo)", setup="foo='zon-zero'")
0.0684856000000309
>>> timeit.timeit("boolean = True if foo else False", setup="foo=''")
0.019911300000103438
>>> timeit.timeit("boolean = bool(foo)", setup="foo=''")
0.09232059999999365
Looking at these results, True if foo else False seems to be four to five times faster than bool(foo).
I suspect that the difference in speed is caused by the overhead of calling a function and that does indeed seem to be the case when I use the dis module.
>>> dis.dis("boolean = True if foo else False")
1 0 LOAD_NAME 0 (foo)
2 POP_JUMP_IF_FALSE 8
4 LOAD_CONST 0 (True)
6 JUMP_FORWARD 2 (to 10)
>> 8 LOAD_CONST 1 (False)
>> 10 STORE_NAME 1 (boolean)
12 LOAD_CONST 2 (None)
14 RETURN_VALUE
>>> dis.dis("boolean = bool(foo)")
1 0 LOAD_NAME 0 (bool)
2 LOAD_NAME 1 (foo)
4 CALL_FUNCTION 1
6 STORE_NAME 2 (boolean)
8 LOAD_CONST 0 (None)
10 RETURN_VALUE
According to the dis module, the difference between the two techniques is:
2 POP_JUMP_IF_FALSE 8
4 LOAD_CONST 0 (True)
6 JUMP_FORWARD 2 (to 10)
>> 8 LOAD_CONST 1 (False)
versus
0 LOAD_NAME 0 (bool)
4 CALL_FUNCTION 1
which makes it look like either the call to a function is far too expensive for something as simple as determining a boolean value or the bool function has been written very inefficiently.
But that actually makes me wonder why anyone would use the bool function when it is this much slower and why the bool function even exists when python does not even seem to use it internally.
So, is the bool function slower because it has been written inefficiently, because of the function overhead, or because of a different reason?
And why would anyone use the bool function when a much faster and equally clear alternative is available?
As per the Python documentation:
class bool([x])
Return a Boolean value, i.e. one of True or False. x is converted using the standard truth testing
procedure. If x is false or omitted, this returns False; otherwise it returns True. The bool class is a
subclass of int (see Numeric Types — int, float, complex). It cannot be subclassed further. Its only
instances are False and True
So, when you use the object itself in a condition (like foo), the interpreter runs its truth test (the __bool__/__len__ protocol) directly. The bool function is a wrapper that performs the same truth test, but only after the name bool has been looked up and the call has been made.
As you said, it's the function call that makes it expensive.
As for the use of bool: there are certain situations where you need the boolean value of an object and need to refer to it through a variable.
x = bool(my_object)
Writing x = my_object doesn't do that.
That's where bool is useful.
Sometimes bool(foo) is more readable where you can ignore small time lags.
You might also be interested in knowing that
x = {}
is faster than
x = dict()
Find out why... :)
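If you want to check that teaser yourself, a rough timeit comparison (numbers are illustrative and will vary by machine and Python version):
import timeit

# {} compiles to a single BUILD_MAP instruction, while dict() has to look up
# the name dict and make a call - the same kind of overhead as bool(foo).
print(timeit.timeit("x = {}"))      # e.g. ~0.02
print(timeit.timeit("x = dict()"))  # e.g. ~0.05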

What is faster, `if x` or `if x != 0`?

I was wondering, what code runs faster? For example, we have variable x:
if x!=0 : return
or
if x: return
I tried to check with timeit, and here are results:
>>> def a():
...     if 0 == 0: return
...
>>> def b():
...     if 0: return
...
>>> timeit(a)
0.18059834650234943
>>> timeit(b)
0.13115053638194007
>>>
I can't quite understand it.
This is too hard to show in a comment: there's more (or less ;-) ) going on here than any of the comments so far noted. With a() and b() defined as you showed, let's go on:
>>> from dis import dis
>>> dis(b)
2 0 LOAD_CONST 0 (None)
3 RETURN_VALUE
What happens is that when the CPython compiler sees if 0: or if 1:, it evaluates them at compile time, and doesn't generate any code to do the testing at run time. So the code for b() just loads None and returns it.
But the code generated for a() is much more involved:
>>> dis(a)
2 0 LOAD_CONST 1 (0)
3 LOAD_CONST 1 (0)
6 COMPARE_OP 2 (==)
9 POP_JUMP_IF_FALSE 16
12 LOAD_CONST 0 (None)
15 RETURN_VALUE
>> 16 LOAD_CONST 0 (None)
19 RETURN_VALUE
Nothing is evaluated at compile time in this case - it's all done at run time. That's why a() is much slower.
Beyond that, I endorse @Charles Duffy's comment: worrying about micro-optimization is usually counterproductive in Python. But, if you must ;-) , learn how to use dis.dis so you're not fooled by gross differences in generated code, as happened in this specific case.
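If you do want to time the question as originally asked, use a runtime variable so that neither branch can be folded away at compile time; a minimal sketch:
from timeit import timeit

# Both snippets now do real work at run time; `if x` skips loading the
# constant 0 and the COMPARE_OP, so it is usually slightly faster.
print(timeit("if x != 0: pass", setup="x = 5"))
print(timeit("if x: pass", setup="x = 5"))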

How does tuple unpacking differ from normal assignment? [duplicate]

This question already has answers here:
The `is` operator behaves unexpectedly with non-cached integers
(2 answers)
What's with the integer cache maintained by the interpreter?
(1 answer)
"is" operator behaves unexpectedly with integers
(11 answers)
Closed 1 year ago.
From this link I learnt that
The current implementation keeps an array of integer objects for all integers between -5 and 256, when you create an int in that range you actually just get back a reference to the existing object
But when I tried to come up with some examples in my own session, I found that it behaves differently with assignment and tuple unpacking.
Here is the snippet:
>>> a,b = 300,300
>>> a is b
True
>>> c = 300
>>> d = 300
>>> c is d
False
Because int is immutable, Python may or may not reuse an existing object. If you save the following code into a script file and run it, it will output True twice.
a, b = 300, 300
print a is b
c = 300
d = 300
print c is d
When Python compiles the code, it may reuse all the constants. Because you typed your code into a Python session, the code was compiled line by line, so Python can't reuse all the constants as one object.
The documentation only says that there will be only one instance for -5 to 256, but it doesn't define the behavior for other values. For immutable types, is and is not are not that important, because you can't modify the objects anyway.
import dis

def testMethod1():
    a, b = 300, 300

print dis.dis(testMethod1)
Prints:
4 0 LOAD_CONST 2 ((300, 300))
3 UNPACK_SEQUENCE 2
6 STORE_FAST 0 (a)
9 STORE_FAST 1 (b)
12 LOAD_CONST 0 (None)
15 RETURN_VALUE None
def testMethod2():
    a = 300
    b = 300

print dis.dis(testMethod2)
Prints:
7 0 LOAD_CONST 1 (300)
3 STORE_FAST 0 (a)
8 6 LOAD_CONST 1 (300)
9 STORE_FAST 1 (b)
12 LOAD_CONST 0 (None)
15 RETURN_VALUE None
So, it looks essentially the same, but with LOAD_CONST in one step in the first method and two steps in the second method....
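A hedged way to see this from outside a function: on versions that fold the right-hand side into a constant tuple (as the LOAD_CONST ((300, 300)) above shows), the two 300s inside that folded tuple are the very same object, which is why a is b came out True:
# Illustrative only; newer CPython versions may compile the unpacking differently.
code = compile("a, b = 300, 300", "", "exec")
tuples = [c for c in code.co_consts if isinstance(c, tuple)]
if tuples:
    t = tuples[0]
    print(t, t[0] is t[1])   # typically: (300, 300) True
else:
    print("no folded tuple constant on this version")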
EDIT
After some testing, I discovered that both methods eventually return False; however, on a single run, i.e. without putting the methods in a loop, they seem to always return True. Sometimes Python uses a single reference, and sometimes it does not.
The documentation only states that -5 to 256 will return the same reference; hence, you simply shouldn't be using is for comparison (in this case), as the number's current id has no guarantee on it.
NB: You never want to use is for comparison of values, as that's not what it's for; it's for comparing identities. My point was that the return value of is is not always going to be True when you're outside of the defined range.

Is the resulting bytecode generated in Python deterministic?

Given a Python interpreter (CPython, Jython, etc), is the bytecode generated deterministic?
That is, if I compile 2 different scripts that differ only in whitespace, but otherwise syntactically equivalent, would the chosen compiler generate exactly the same bytecodes?
It is not clear what you are looking for, exactly. Syntactically the same code is going to result in the same instructions being executed, certainly. But even syntactically equivalent python files can generate different .pyc cached bytecode files. Adding or removing newlines will result in different line offsets:
>>> import dis
>>> def foo():
...     # in the interpreter, comments will do the same job as newlines
...     baz
...     # extra newlines or comments push the bytecode offsets
...     return 42
...
>>> def bar():
...     baz
...     return 42
...
>>> dis.dis(foo)
3 0 LOAD_GLOBAL 0 (baz)
3 POP_TOP
5 4 LOAD_CONST 1 (42)
7 RETURN_VALUE
>>> dis.dis(bar)
2 0 LOAD_GLOBAL 0 (baz)
3 POP_TOP
3 4 LOAD_CONST 1 (42)
7 RETURN_VALUE
Note the different values in the left-hand column; the interpreter will still behave exactly the same, but the offsets differ.
The bytecode and the line-number offsets are stored and accessed separately, so what the interpreter actually executes is equal:
>>> foo.__code__.co_lnotab
'\x00\x02\x04\x02'
>>> bar.__code__.co_lnotab
'\x00\x01\x04\x01'
>>> foo.__code__.co_code == bar.__code__.co_code
True
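You can run the same check without writing any .pyc files, by compiling two whitespace variants directly (the attributes used here are CPython-specific, so treat it as a sketch):
import dis

src_a = "x = 1\n\n\ny = 2\n"   # extra blank lines
src_b = "x = 1\ny = 2\n"
code_a = compile(src_a, "<a>", "exec")
code_b = compile(src_b, "<b>", "exec")

print(code_a.co_code == code_b.co_code)   # True: the instruction stream is identical
dis.dis(code_a)   # the line numbers in the left-hand column here...
dis.dis(code_b)   # ...differ from the ones here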
