What is faster, `if x` or `if x != 0`? - python

I was wondering which code runs faster. For example, given a variable x:
if x != 0: return
or
if x: return
I tried to check with timeit, and here are the results:
>>> from timeit import timeit
>>> def a():
...     if 0 == 0: return
...
>>> def b():
...     if 0: return
...
>>> timeit(a)
0.18059834650234943
>>> timeit(b)
0.13115053638194007
>>>
I can't quite understand it.

This is too hard to show in a comment: there's more (or less ;-) ) going on here than any of the comments so far noted. With a() and b() defined as you showed, let's go on:
>>> from dis import dis
>>> dis(b)
  2          0 LOAD_CONST            0 (None)
             3 RETURN_VALUE
What happens is that when the CPython compiler sees if 0: or if 1:, it evaluates them at compile time, and doesn't generate any code to do the testing at run time. So the code for b() just loads None and returns it.
But the code generated for a() is much more involved:
>>> dis(a)
  2          0 LOAD_CONST            1 (0)
             3 LOAD_CONST            1 (0)
             6 COMPARE_OP            2 (==)
             9 POP_JUMP_IF_FALSE    16
            12 LOAD_CONST            0 (None)
            15 RETURN_VALUE
       >>   16 LOAD_CONST            0 (None)
            19 RETURN_VALUE
Nothing is evaluated at compile time in this case - it's all done at run time. That's why a() is much slower.
Beyond that, I endorse @Charles Duffy's comment: worrying about micro-optimization is usually counterproductive in Python. But, if you must ;-) , learn how to use dis.dis so you're not fooled by gross differences in generated code, as happened in this specific case.
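To compare the two tests from the question fairly, the condition has to depend on a value the compiler cannot see at compile time. The following is a minimal sketch for a recent CPython 3; the names with_compare, with_truthiness and opnames are made up for illustration, and the timings it prints will vary with machine and interpreter version.

```python
import dis
import timeit


def with_compare(x):
    # Explicit comparison against zero, as in `if x != 0`.
    if x != 0:
        return True
    return False


def with_truthiness(x):
    # Plain truth test, as in `if x`.
    if x:
        return True
    return False


def opnames(func):
    """Return the set of bytecode instruction names used by func."""
    return {instr.opname for instr in dis.get_instructions(func)}


if __name__ == "__main__":
    for func in (with_compare, with_truthiness):
        seconds = timeit.timeit(lambda: func(5), number=200_000)
        print(f"{func.__name__}: {seconds:.4f} s")
    # Instructions that only the comparison version needs.
    print(sorted(opnames(with_compare) - opnames(with_truthiness)))
```

Keep in mind that the two tests agree for numbers but not for every type: an empty string or empty list is falsy, yet it is not equal to 0.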

Related

While loop without content

I'm writing a Python program for a guessing game and the current way I'm implementing the main loop works, but it feels wrong:
# Guess Loop
while not(guess(ans)):
    pass
It will work but I was wondering if this is bad practice (I assume it is).
There is nothing against this in PEP-8, Python's de facto coding standard. This is exactly what while loops are for.
I think you're concerned because the while loop is empty, but in a non-trivial program it wouldn't be, and you'd have something to put there.
I'd prefer
while True:
    if guess(ans):
        break
but your code
while not(guess(ans)):
    pass
functions identically. It may even be slightly more performant!
from dis import dis
from random import choice

def foo(): return choice([True, False])  # flip a coin

# My code
def bar1():
    while True:
        if foo(): break

# Your code
def bar2():
    while not(foo()):
        pass

>>> dis(bar1)
  2          0 SETUP_LOOP           12 (to 14)
  3    >>    2 LOAD_GLOBAL           0 (foo)
             4 CALL_FUNCTION         0
             6 POP_JUMP_IF_FALSE     2
             8 BREAK_LOOP
            10 JUMP_ABSOLUTE         2
            12 POP_BLOCK
       >>   14 LOAD_CONST            0 (None)
            16 RETURN_VALUE
>>> dis(bar2)
  2          0 SETUP_LOOP           10 (to 12)
       >>    2 LOAD_GLOBAL           0 (foo)
             4 CALL_FUNCTION         0
             6 POP_JUMP_IF_TRUE     10
  3          8 JUMP_ABSOLUTE         2
       >>   10 POP_BLOCK
       >>   12 LOAD_CONST            0 (None)
            14 RETURN_VALUE
Your snippet can omit the BREAK_LOOP there, which might improve performance. Of course, POP_JUMP_IF_TRUE might also be slightly slower than POP_JUMP_IF_FALSE. Or vice versa.
Regardless, they function identically. Use whichever one you like. I prefer the look of "Loop forever, and if guess(ans) is true, break" to "Loop until the negation of guess(ans) is false."
There are multiple ways to handle this idea. Any concerns about looping (specifically the bugs that a while(true) loop can invite) don't apply here, though they might in other languages, which is perhaps what gave you pause.
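That the two loop shapes behave identically can be checked with a deterministic stand-in for guess. This is only a sketch: make_guess, loop_with_break and loop_with_pass are hypothetical helpers, and the replayed outcomes replace the real game logic.

```python
def make_guess(outcomes):
    """Build a hypothetical guess() that replays outcomes and records calls."""
    remaining = iter(outcomes)
    calls = []

    def guess():
        result = next(remaining)
        calls.append(result)
        return result

    return guess, calls


def loop_with_break(guess):
    # "Loop forever, and break once guess() is true."
    while True:
        if guess():
            break


def loop_with_pass(guess):
    # "Keep looping while guess() is false."
    while not guess():
        pass


if __name__ == "__main__":
    for loop in (loop_with_break, loop_with_pass):
        guess, calls = make_guess([False, False, True, True])
        loop(guess)
        print(loop.__name__, "called guess", len(calls), "times")
```

Both loops stop at the first true result and leave the remaining outcome unconsumed, so the choice between them is purely one of style.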

Script run time difference for code in loop vs code in called function [closed]

I run 2 functions. Both of them have for-loops to execute instructions. Both functions accomplish the same task, but one takes much longer.
Function 1 executes and is self contained, performs TaskA.
f1:
    for x in X:
        do task a
Function 2 executes and calls Function 3. Function 3 performs TaskA
f2:
    for x in X:
        call function 3
f3:
    do task a
Why does function 2 generally take 10x as long to execute as function 1?
EDIT: Previous phrasing confused people.
Another factor could be the preparation or setup done before TaskA runs. It's possible that in f1 you do it once, before the for loop, whereas in f3 it sits inside the function, so f2 triggers it for every x in X rather than just once at the beginning. Without any real code, it's hard to say.
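The setup scenario described above might look like the following sketch. Since the question shows no real code, everything here is assumed: build_lookup is a hypothetical stand-in for whatever preparation TaskA needs, and a call counter makes the difference visible.

```python
def build_lookup():
    """Hypothetical stand-in for costly preparation work."""
    build_lookup.calls += 1
    return set(range(0, 1000, 7))


build_lookup.calls = 0


def f1(items):
    lookup = build_lookup()  # prepared once, before the loop
    return [x in lookup for x in items]


def f3(x):
    lookup = build_lookup()  # prepared again on every call
    return x in lookup


def f2(items):
    return [f3(x) for x in items]


if __name__ == "__main__":
    items = range(50)
    build_lookup.calls = 0
    f1(items)
    print("f1 ran the setup", build_lookup.calls, "time(s)")
    build_lookup.calls = 0
    f2(items)
    print("f2 ran the setup", build_lookup.calls, "time(s)")
```

If the real code has this shape, passing the prepared object into f3 as an argument would remove the repeated work.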
As for the potential complexity of calling f3 for every x, it's unlikely that that's the cause of the 10x slowness.
Only in an oversimplified example with pass do we see this behaviour. Let's take these 3 bad versions of f1, f2 and f3:
>>> def f1():
...     for x in X:
...         pass
...
>>> def f2():
...     for x in X:
...         f3()
...
>>> def f3():
...     pass
...
Using dis, here's what the bytecode looks like for f1:
>>> import dis
>>> dis.dis(f1)
  2          0 SETUP_LOOP           14 (to 17)
             3 LOAD_GLOBAL           0 (X)
             6 GET_ITER
       >>    7 FOR_ITER              6 (to 16)
            10 STORE_FAST            0 (x)
  3         13 JUMP_ABSOLUTE         7
       >>   16 POP_BLOCK
       >>   17 LOAD_CONST            0 (None)
            20 RETURN_VALUE
...vs f2:
>>> dis.dis(f2)
  2          0 SETUP_LOOP           21 (to 24)
             3 LOAD_GLOBAL           0 (X)
             6 GET_ITER
       >>    7 FOR_ITER             13 (to 23)
            10 STORE_FAST            0 (x)
  3         13 LOAD_GLOBAL           1 (f3)
            16 CALL_FUNCTION         0
            19 POP_TOP
            20 JUMP_ABSOLUTE         7
       >>   23 POP_BLOCK
       >>   24 LOAD_CONST            0 (None)
            27 RETURN_VALUE
Those look nearly the same except for the CALL_FUNCTION and POP_TOP. However, they are very different with timeit:
>>> X = range(1000) # [0, 1, 2, ...999]
>>>
>>> import timeit
>>> timeit.timeit(f1)
10.290941975496747
>>> timeit.timeit(f2)
81.18860785875617
>>>
Now that's about 8x the time, not because calling a function is slow in absolute terms, but because a for loop whose body is just pass is extremely fast, so almost anything looks expensive next to it, including a call to a function that does nothing. So hopefully these were not the kind of examples that made you wonder.
Now, if you actually do something in the task, like say x * x then you'll see the timing/performance difference between the two becomes smaller:
>>> def f1():
...     for x in X:
...         _ = x*x
...
>>> def f2():
...     for x in X:
...         _ = f3(x)  # didn't pass in `x` to `f3` in the previous example
...
>>> def f3(x):
...     return x*x
...
>>> timeit.timeit(f1)
38.76545268807092
>>> timeit.timeit(f2)
113.72242594670047
>>>
Now that's only 2.9x the time. The function call does add some overhead, but how much it matters depends on what the function actually does compared with a bare pass.
If you replace _ = x * x with print x * x in both places (printing is quite "slow") and use just X = range(5):
>>> timeit.timeit(f1, number=10000)
3.640433839719143
>>> timeit.timeit(f2, number=10000)
3.6921612171574765
And now there's much less difference in their performance.
So measure with real code rather than reasoning from simple pseudocode. A loop with no calls may look much faster, but the call overhead is really small compared with the slower work that code inside functions usually does.
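The transcripts above are from Python 2. A rough Python 3 version of the same measurement could look like this sketch; the names square, inline_loop and call_loop and the repeat count are arbitrary choices, and the ratio it prints depends on machine and interpreter version.

```python
import timeit

X = range(1000)


def square(x):
    return x * x


def inline_loop():
    # Does the work directly in the loop body.
    total = 0
    for x in X:
        total += x * x
    return total


def call_loop():
    # Does the same work through a function call per item.
    total = 0
    for x in X:
        total += square(x)
    return total


if __name__ == "__main__":
    inline_s = timeit.timeit(inline_loop, number=2000)
    call_s = timeit.timeit(call_loop, number=2000)
    print(f"inline: {inline_s:.3f} s")
    print(f"call:   {call_s:.3f} s")
    print(f"ratio:  {call_s / inline_s:.2f}x")
```

Replacing the multiplication with heavier work should shrink the ratio, in line with the argument above.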

Is the resulting bytecode generated in Python deterministic?

Given a Python interpreter (CPython, Jython, etc), is the bytecode generated deterministic?
That is, if I compile 2 different scripts that differ only in whitespace, but otherwise syntactically equivalent, would the chosen compiler generate exactly the same bytecodes?
It is not clear what you are looking for, exactly. Syntactically the same code is going to result in the same instructions being executed, certainly. But even syntactically equivalent python files can generate different .pyc cached bytecode files. Adding or removing newlines will result in different line offsets:
>>> import dis
>>> def foo():
...     # in the interpreter, comments will do the same job as newlines
...     baz
...     # extra newlines or comments shift the line numbers
...     return 42
...
>>> def bar():
...     baz
...     return 42
...
>>> dis.dis(foo)
  3          0 LOAD_GLOBAL           0 (baz)
             3 POP_TOP
  5          4 LOAD_CONST            1 (42)
             7 RETURN_VALUE
>>> dis.dis(bar)
  2          0 LOAD_GLOBAL           0 (baz)
             3 POP_TOP
  3          4 LOAD_CONST            1 (42)
             7 RETURN_VALUE
Note the different line numbers in the left-hand column; the interpreter will still behave exactly the same, but the line offsets differ.
The bytecode and the line-number table are stored separately, so what the interpreter actually executes is identical:
>>> foo.__code__.co_lnotab
'\x00\x02\x04\x02'
>>> bar.__code__.co_lnotab
'\x00\x01\x04\x01'
>>> foo.__code__.co_code == bar.__code__.co_code
True
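The same check can be made on whole scripts rather than functions. The sketch below assumes CPython 3 and uses two made-up sources that differ only in blank lines and a comment; line_numbers is a hypothetical helper built on dis.findlinestarts.

```python
import dis

COMPACT = "x = 1\ny = 2\n"
SPACED = "x = 1\n\n\n# a comment\ny = 2\n"


def line_numbers(code):
    """Return the distinct source line numbers recorded in a code object."""
    return sorted({line for _, line in dis.findlinestarts(code) if line})


compact_code = compile(COMPACT, "<compact>", "exec")
spaced_code = compile(SPACED, "<spaced>", "exec")

if __name__ == "__main__":
    print("same instructions:", compact_code.co_code == spaced_code.co_code)
    print("compact lines:", line_numbers(compact_code))
    print("spaced lines: ", line_numbers(spaced_code))
```

The instruction bytes match while the recorded line numbers do not, which is why a cached .pyc file for the two scripts can still differ.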

Is it preferable to use an "else" in Python when it's not necessary?

I use Python and Flask for my devblog. I know that, depending on the language, it is sometimes advisable to use an explicit else even when it isn't required, but I don't know how this works in Python.
For example, I have a function with an if that returns something when the condition is true. So the else is not necessary, because with or without it, execution continues normally.
def foo(bar):
    if not isinstance(bar, list):
        return "an error"
    else:  # not necessary
        return "something"
So, should I write it like this, or like this:
def foo(bar):
    if not isinstance(bar, list):
        return "an error"
    return "something"
In the first case, CPython (at least the version used for the disassembly below) appends a return None to the end of the function, even though we can see it can never be reached. In the second case it doesn't.
I don't see any advantage to having the else: there.
>>> import dis
>>> def f():
...     if 1>2:
...         return 2
...     return 3
...
>>> def g():
...     if 1>2:
...         return 2
...     else:
...         return 3
...
>>> dis.dis(f)
  2          0 LOAD_CONST            1 (1)
             3 LOAD_CONST            2 (2)
             6 COMPARE_OP            4 (>)
             9 POP_JUMP_IF_FALSE    16
  3         12 LOAD_CONST            2 (2)
            15 RETURN_VALUE
  4    >>   16 LOAD_CONST            3 (3)
            19 RETURN_VALUE
>>> dis.dis(g)
  2          0 LOAD_CONST            1 (1)
             3 LOAD_CONST            2 (2)
             6 COMPARE_OP            4 (>)
             9 POP_JUMP_IF_FALSE    16
  3         12 LOAD_CONST            2 (2)
            15 RETURN_VALUE
  5    >>   16 LOAD_CONST            3 (3)
            19 RETURN_VALUE
            20 LOAD_CONST            0 (None)
            23 RETURN_VALUE
This has already been discussed here: If-Else-Return or just if-Return?
Essentially, the two forms are equivalent in terms of efficiency because the machine has to make a jump anyway. So it boils down to coding style and you'll have to decide that on your own (or with your team).
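For anyone who wants to confirm the equivalence on their own interpreter, a small sketch follows. The names with_else and without_else are invented here, the isinstance check uses the bar parameter, and the timings it prints will differ between runs and Python versions.

```python
import timeit


def with_else(bar):
    if not isinstance(bar, list):
        return "an error"
    else:
        return "something"


def without_else(bar):
    if not isinstance(bar, list):
        return "an error"
    return "something"


SAMPLES = [[], [1, 2], "text", 42, None]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(repr(sample), with_else(sample), without_else(sample))
    for func in (with_else, without_else):
        seconds = timeit.timeit(lambda: func([1, 2]), number=200_000)
        print(f"{func.__name__}: {seconds:.4f} s")
```

Any gap between the two timings is expected to be within measurement noise, which leaves style as the only real criterion.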
Personally, I prefer the version with the else statement for readability.
It really makes no difference. Either way will do the same thing.
I prefer the latter, because it's one less line of code.
I prefer the latter.
From Chromium's style guide:
Don't use else after return:
# Bad
if (foo)
  return 1;
else
  return 2;
# Good
if (foo)
  return 1;
return 2;
return foo ? 1 : 2;
I am quite new to programming, and this question reminded me that I am often unsure whether I should use an "else" or an "elif", e.g., in a scenario like this:
1)
if numA < numB:
    print('numA is smaller')
elif numB < numA:
    print('numB is smaller')
else:
    print('both numbers are equal')
2)
if numA < numB:
    print('numA is smaller')
elif numB < numA:
    print('numB is smaller')
elif numA == numB:
    print('both numbers are equal')
I think it would not make a huge difference, or am I wrong? In other examples the second variant might be more "robust" in general, I think.
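One case where the two variants actually behave differently involves values that are neither smaller, larger, nor equal, such as float('nan'). The sketch below wraps each variant in a hypothetical function that returns the message instead of printing it, so the results can be compared.

```python
def compare_with_else(num_a, num_b):
    if num_a < num_b:
        return "numA is smaller"
    elif num_b < num_a:
        return "numB is smaller"
    else:
        return "both numbers are equal"


def compare_with_elif(num_a, num_b):
    if num_a < num_b:
        return "numA is smaller"
    elif num_b < num_a:
        return "numB is smaller"
    elif num_a == num_b:
        return "both numbers are equal"
    # No branch matched: falls through and returns None.


if __name__ == "__main__":
    nan = float("nan")
    for pair in [(1, 2), (2, 1), (3, 3), (nan, 1)]:
        print(pair, compare_with_else(*pair), compare_with_elif(*pair))
```

With a NaN operand the else variant claims the numbers are equal, while the elif variant matches no branch at all. Which behavior counts as more robust depends on whether a silent fall-through or a wrong message is the lesser problem.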

"x not in y" or "not x in y"

When testing for membership, we can use:
x not in y
Or alternatively:
not x in y
There can be many possible contexts for this expression depending on x and y. It could be for a substring check, list membership, dict key existence, for example.
Are the two forms always equivalent?
Is there a preferred syntax?
They always give the same result.
In fact, not 'ham' in 'spam and eggs' appears to be special cased to perform a single "not in" operation, rather than an "in" operation and then negating the result:
>>> import dis
>>> def notin():
...     'ham' not in 'spam and eggs'
...
>>> dis.dis(notin)
  2          0 LOAD_CONST            1 ('ham')
             3 LOAD_CONST            2 ('spam and eggs')
             6 COMPARE_OP            7 (not in)
             9 POP_TOP
            10 LOAD_CONST            0 (None)
            13 RETURN_VALUE
>>> def not_in():
...     not 'ham' in 'spam and eggs'
...
>>> dis.dis(not_in)
  2          0 LOAD_CONST            1 ('ham')
             3 LOAD_CONST            2 ('spam and eggs')
             6 COMPARE_OP            7 (not in)
             9 POP_TOP
            10 LOAD_CONST            0 (None)
            13 RETURN_VALUE
>>> def not__in():
...     not ('ham' in 'spam and eggs')
...
>>> dis.dis(not__in)
  2          0 LOAD_CONST            1 ('ham')
             3 LOAD_CONST            2 ('spam and eggs')
             6 COMPARE_OP            7 (not in)
             9 POP_TOP
            10 LOAD_CONST            0 (None)
            13 RETURN_VALUE
>>> def noteq():
...     not 'ham' == 'spam and eggs'
...
>>> dis.dis(noteq)
  2          0 LOAD_CONST            1 ('ham')
             3 LOAD_CONST            2 ('spam and eggs')
             6 COMPARE_OP            2 (==)
             9 UNARY_NOT
            10 POP_TOP
            11 LOAD_CONST            0 (None)
            14 RETURN_VALUE
I had thought at first that they always gave the same result, but that not on its own was simply a low precedence logical negation operator, which could be applied to a in b just as easily as any other boolean expression, whereas not in was a separate operator for convenience and clarity.
The disassembly above was revealing! It seems that while not obviously is a logical negation operator, the form not a in b is special cased so that it's not actually using the general operator. This makes not a in b literally the same expression as a not in b, rather than merely an expression that results in the same value.
No, there is no difference.
The operator not in is defined to have the inverse true value of in.
—Python documentation
I would assume not in is preferred because it is more obvious and they added a special case for it.
They are identical in meaning, but the pycodestyle Python style guide checker (formerly called pep8) prefers the not in operator in rule E713:
E713: test for membership should be not in
See also "Python if x is not None or if not x is None?" for a very similar choice of style.
Others have already made it very clear that the two statements are, down to a quite low level, equivalent.
However, I don't think anyone has yet stressed enough that, since this leaves the choice up to you, you should choose the form that makes your code as readable as possible.
And not necessarily as readable as possible to anyone, even if that's of course a nice thing to aim for. No, make sure the code is as readable as possible to you, since you are the one who is the most likely to come back to this code later and try to read it.
In Python, there is no difference. And there is no preference.
Syntactically they're the same statement. I would be quick to state that 'ham' not in 'spam and eggs' conveys clearer intent, but I've seen code and scenarios in which not 'ham' in 'spam and eggs' conveys a clearer meaning than the other.
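The equivalence is easy to spot-check across the container types mentioned in the question. This is a sketch with invented helper names; negate_first is included only to show the one grouping that does change the meaning, since not binds more loosely than in.

```python
def not_in(x, y):
    return x not in y


def not_x_in(x, y):
    # Parsed as not (x in y), so it matches not_in.
    return not x in y


def negate_first(x, y):
    # A different expression: negates x before the membership test.
    return (not x) in y


SAMPLES = [
    ("ham", "spam and eggs"),   # substring check
    ("spam", "spam and eggs"),
    (3, [1, 2, 3]),             # list membership
    ("k", {"k": 1}),            # dict key existence
    ("a", [False, "a"]),
]

if __name__ == "__main__":
    for x, y in SAMPLES:
        print(repr(x), repr(y), not_in(x, y), not_x_in(x, y), negate_first(x, y))
```

Only the explicitly parenthesized (not x) in y can disagree with the other two, as the last sample shows.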
