Python performance: repeating calculations vs temp variable

Does python recalculate every repeating expression in code?
For example does
a = [1,23,45,45,456,34]
b = len(a) + 213
c = len(a) + 3432
differ in performance from
a = [1,23,45,45,456,34]
l = len(a)
b = l + 213
c = l + 3432
I would guess the second one uses more memory (to store l) but less CPU. Am I correct?

Does python recalculate every repeating expression in code?
This is unspecified in the language specification; in fact, it is highly dependent on the Python implementation. The mainstream implementation, CPython, does recompute the expression. PyPy (an alternative implementation focusing on performance) usually does not recompute the expression in hot portions of the code, thanks to just-in-time compilation. There are many other implementations of Python (e.g. Pyston, Jython, IronPython) and each one can behave differently.
I would guess second one uses more memory (to store l) but less cpu.
Yes, but the difference is marginal and still dependent on the Python implementation used (e.g. PyPy may not require more memory in this case). Note that calling len on a list is very fast and runs in constant time.
While the second version should be slightly faster, such a micro-optimization will likely have no significant impact on a large codebase. Keep in mind that readable code is generally easier to maintain, improve and optimize.
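If you want to measure the difference yourself, a minimal timeit sketch along these lines works (the snippets and counts are illustrative; absolute numbers depend on your machine and Python version, and the gap is usually tiny):
import timeit

setup = "a = [1, 23, 45, 45, 456, 34]"

# Recompute len(a) in each expression
recompute = "b = len(a) + 213; c = len(a) + 3432"

# Compute len(a) once and reuse it
cached = "l = len(a); b = l + 213; c = l + 3432"

print(timeit.timeit(recompute, setup=setup, number=1000000))
print(timeit.timeit(cached, setup=setup, number=1000000))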

Related

Why is branchless programming and built-ins slower?

I found 2 branchless functions that find the maximum of two numbers in python, and compared them to an if statement and the built-in max function. I thought the branchless or the built-in functions would be the fastest, but the fastest was the if-statement function by a large margin. Does anybody know why this is? Here are the functions:
If-statement (2.16 seconds for 25000 operations):
def max1(a, b):
    if a > b:
        return a
    return b
Built-in (4.69 seconds for 25000 operations):
def max2(a, b):
    return max(a, b)
Branchless 1 (4.12 seconds for 25000 operations):
def max3(a, b):
    return (a > b) * a + (a <= b) * b
Branchless 2 (5.34 seconds for 25000 operations):
def max4(a, b):
    diff = a - b
    return a - (diff & diff >> 31)
Your expectations about branching vs. branchless code apply to low-level languages like assembly and C. Branchless code can be faster in low-level languages because it prevents slowdowns caused by branch prediction misses. (Note: this means branchless code can be faster, but it will not necessarily be.)
Python is a high-level language. Assuming you are using the CPython interpreter: for every bytecode instruction you execute, the interpreter has to branch on the kind of opcode, and typically many other things. For example, even the simple < operator requires a branch to check for the < opcode, another branch to check whether the object's class implements a __lt__ method, more branches to check whether the right-hand-side value is of a valid type for the comparison to be performed, and probably several other branches. Even your so-called "branchless" code will in practice result in a lot of branching for these reasons.
Because Python is so high-level, each bytecode instruction is actually doing quite a lot of work compared to a single machine-code instruction. So the performance of simple code like this will mainly depend on how many bytecode instructions have to be interpreted:
Your max1 function has to do three loads of local variables, a comparison, a conditional jump and a return. That's six.
Your max2 function does two loads of local variables, one load of a global variable (referencing the built-in max), and also makes a function call; that requires passing arguments, and is relatively expensive compared to other bytecode instructions. On top of that, the built-in function itself has to do the same work as your own max1 function, so no wonder max2 is slower.
Your max3 function does six loads of local variables, two comparisons, two multiplications, one addition, and one return. That's twelve instructions, so we should expect it to take about twice as long as max1.
Likewise max4 does five loads of local variables, one store to a local variable, one load of a constant, two subtractions, one bitshift, one bitwise "and", and one return. That's twelve instructions again.
That said, note that if we compare your max1 with the built-in function max directly, instead of your max2 which has an extra function call, your max1 function is still a bit faster than the built-in max. This is probably because the built-in max accepts a variable number of arguments, which may involve building a tuple of arguments, and the built-in max function also has a branch to check if it was called with a single iterable argument (e.g. max([3, 1, 4, 2])), and handle that case differently; your max1 function doesn't do those things.
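If you want to see those instruction counts for yourself, the dis module will print each function's bytecode. A quick sketch (the exact opcodes differ between CPython versions, but the relative sizes are what matter):
import dis

def max1(a, b):
    if a > b:
        return a
    return b

def max3(a, b):
    return (a > b) * a + (a <= b) * b

dis.dis(max1)  # a handful of instructions: loads, one compare, one jump, returns
dis.dis(max3)  # roughly twice as many: loads, two compares, two multiplies, an add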
Python code is not compiled to optimized machine code. It is highly unlikely that you get any "branchless" optimization in interpreted code.
Branchless code is sometimes faster when it effectively does less work, or when the hardware can do better branch prediction because of it.
A function call has a cost, so if the code inside the function is trivial, the relative overhead of the call is high.
There is a missing control case: call the built-in max function directly in the loop and compare (as in max2, but without the extra function-call overhead). The built-in max is implemented in C (in CPython) and is already optimized for your hardware.
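A minimal sketch of that missing control case using timeit (timings and the iteration count are machine-dependent and only illustrative):
import timeit

setup = """
def max1(a, b):
    if a > b:
        return a
    return b
"""

# Control case: built-in max called directly, no wrapper function
print(timeit.timeit("max(3, 7)", number=1000000))
print(timeit.timeit("max1(3, 7)", setup=setup, number=1000000))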

Operations: Saving in Variables Then Operating vs Single Liners

I am writing a program in Python (using the numpy package) that contains a very long function involving many terms:
result = a + b + c + d +...
...whatever. These terms a, b, c, d, etc. are themselves matrices that involve many operations, for example in Python code:
a = np.identity(3, dtype = np.double)/3.0
b = np.kron(vec1, vec2).reshape(3,3) # Also with np.double precision.
Just taking two variables, I have been wondering if doing:
a = np.identity(3, dtype = np.double)/3.0
b = np.kron(vec1, vec2).reshape(3,3) # Also with np.double precision.
c = a + b
is the same as doing:
c = np.identity(3, dtype = np.double)/3.0 + np.kron(vec1, vec2).reshape(3,3)
This may sound silly, but I require very high numerical stability, i.e., introducing numerical errors, however subtle, might ruin the program or yield a weird result. Of course, this question can be extended to other programming languages.
Which is suggested? Does it matter? Any suggested references?
Under "normal" circumstances, both approaches are equivalent.
In other words, whether you use a value through an explicit expression (eg, np.identity(3, dtype = np.double)/3.0) or through a variable-name that has been initialized with that expression (here, a), the outcome would "normally" be the same.
There are some not-so-normal circumstances where they may produce different results. As far as I can see, all of these have to do with situations in which there are side-effects, such that the outcome depends upon the order in which things happen. For example:
Consider a scenario where the initialization of the variable-name b involves a side-effect that affects the initialization of the variable-name a, and your code depends on that side-effect. In the first approach (where you first initialize the variable-names and then use only those variables), your code would have to initialize b first and a later; the order of initialization of the variable-names matters. In the second approach (where explicit expressions, rather than variable-names, participate in a larger expression), to achieve the same effect you would have to pay attention to the order in which the Python interpreter evaluates sub-expressions within an expression. If you don't, the order of evaluation of sub-expressions may not produce the side-effect your code needs, and you might end up with a different result.
As for other programming languages, the answer is a big yes: the two approaches can yield different results in languages (such as Java) where variable-names have associated data types, which can cause silent numerical conversions (such as truncations) during variable assignment.
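As a quick sanity check that the two styles produce bit-identical results in the ordinary, side-effect-free NumPy case, you can compare them directly (vec1 and vec2 here are just illustrative placeholder vectors):
import numpy as np

vec1 = np.array([1.0, 2.0, 3.0])
vec2 = np.array([4.0, 5.0, 6.0])

# With intermediate variables
a = np.identity(3, dtype=np.double) / 3.0
b = np.kron(vec1, vec2).reshape(3, 3)
c1 = a + b

# As a single expression
c2 = np.identity(3, dtype=np.double) / 3.0 + np.kron(vec1, vec2).reshape(3, 3)

print(np.array_equal(c1, c2))  # True: same operations performed in the same order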

Why is `word == word[::-1]` to test for palindrome faster than a more algorithmic solution in Python?

I wrote a disaster of a question on Code Review asking why Python programmers normally test if a string is a palindrome by comparing the string to itself reversed, instead of a more algorithmic way with lower complexity, assuming that the normal way would be faster.
Here is the pythonic way:
def is_palindrome_pythonic(word):
    # The slice requires N operations, plus memory,
    # and the equality requires N operations in the worst case
    return word == word[::-1]
Here is my attempt at a more efficient way to accomplish this:
def is_palindrome_normal(word):
    # This requires N/2 operations in the worst case
    low = 0
    high = len(word) - 1
    while low < high:
        if word[low] != word[high]:
            return False
        low += 1
        high -= 1
    return True
I would expect the normal way would be faster than the pythonic way. See for example this great article
Timing it with timeit, however, brought exactly the opposite result:
setup = '''
def is_palindrome_pythonic(word):
    # ...
def is_palindrome_normal(word):
    # ...
# N here is 2000
first_half = ''.join(map(str, (i for i in range(1000))))
word = first_half + first_half[::-1]
'''
timeit.timeit('is_palindrome_pythonic(word)', setup=setup, number=1000)
# 0.0052
timeit.timeit('is_palindrome_normal(word)', setup=setup, number=1000)
# 0.4268
I then figured that my n was too small, so I changed the length of word from 2000 to 2,000,000. The pythonic way took about 16 seconds on average, whereas the normal way ran several minutes before I canceled it.
Incidentally, in the best case scenario, where the very first letter does not match the very last letter, the normal algorithm was much faster.
What explains the extreme difference between the speeds of the two algorithms?
Because the "Pythonic" way with slicing is implemented in C. The interpreter / VM doesn't need to execute more than approximately once. The bulk of the algorithm is spent in a tight loop of native code.
As much as I love Python, I have to say that if you want maximum speed you probably shouldn't be using Python. ;)
The rule of thumb in Python time optimization is to use operators or module functions that do the bulk of the work at C speed rather than equivalent code running at Python speed. Even if the two equivalent approaches are using algorithms with the same big-O complexity, the time scaling factor of (mostly) running directly on the CPU vs running on the Python virtual machine has a big impact.
This is even true of an algorithm that's mostly just integer arithmetic, since Python integers are immutable objects, so when you do arithmetic there's the overhead of allocating and initialising a new integer object and disposing of the old one. CPython tries to be frugal, and is pretty smart at managing memory (so every new object doesn't require a system call to allocate memory), and of course the CPython interpreter maintains a cache of integers from -5 to 256 (inclusive) so that arithmetic with small numbers isn't so bad. But it's certainly slower than doing arithmetic at C speed with machine integers.
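A small illustration of that per-object overhead and of the small-int cache (sizes and identity results are typical for CPython on a 64-bit build; the int("...") calls are only there to defeat constant folding):
import sys

print(sys.getsizeof(1))        # typically 28 bytes for a small int object
print(sys.getsizeof(10**100))  # larger ints need even more storage

a = int("256")
b = int("255") + 1
print(a is b)   # True on CPython: 256 comes from the small-int cache (-5 to 256)
c = int("257")
d = int("256") + 1
print(c is d)   # usually False: 257 is outside the cache, so new objects are made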
You can see the difference even with a simple counting loop. On my admittedly ancient 32 bit machine running Python 3.6, using the Bash time command to do the timings,
m = 5000000
for i in range(m):
    i
is roughly twice as fast as
m = 5000000
i = 0
while i < m:
    i += 1
because range can do the arithmetic at C speed, even though it still has to create a new integer object on each iteration. If you replace the i line in the range version with pass the time is roughly halved.
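If you prefer to reproduce that comparison with timeit instead of the Bash time command, a sketch like this works (the loop bound is reduced so it runs quickly; the ratio is what matters, not the absolute times):
import timeit

range_loop = """
for i in range(100000):
    i
"""

while_loop = """
i = 0
while i < 100000:
    i += 1
"""

print(timeit.timeit(range_loop, number=100))
print(timeit.timeit(while_loop, number=100))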
With more complicated algorithms the time differences can be much more significant, eg string or list copying that happens at the C level can often be done with efficient CPU operators that are much faster than chugging along on the Python virtual machine with Python code.
I agree that this can take a while to get used to if you come from a language that gets compiled to native machine code. And I admit that even after over 10 years of using Python it still feels a little weird to me that when (for example) you need to do some bit manipulation, it can often be faster in Python to do it using string operations on a string composed of '0's and '1's than to do it using the traditional bitwise and arithmetic integer operators.
OTOH, I think it's useful to know the traditional algorithms as well as the Pythonic ones. It's rare that a programmer will work only in Python, so it's good to know how to do things in languages that don't work the way that Python does.

Python: is the iteration of the multidimensional array super slow?

I have to iterate over all items in a two-dimensional array of integers and change each value (according to some rule, not important here).
I'm surprised how significant the performance difference is between the Python runtime and the C# or Java runtime. Did I write totally wrong Python code (v2.7.2)?
import numpy
a = numpy.ndarray((5000,5000), dtype = numpy.int32)
for x in numpy.nditer(a.T):
    x = 123
>python -m timeit -n 2 -r 2 -s "import numpy; a = numpy.ndarray((5000,5000), dtype=numpy.int32)" "for x in numpy.nditer(a.T):" " x = 123"
2 loops, best of 2: 4.34 sec per loop
For example, the equivalent C# code takes only 50 ms, i.e. Python is almost 100 times slower! (Assume the matrix variable is already initialized.)
for (y = 0; y < 5000; y++)
    for (x = 0; x < 5000; x++)
        matrix[y][x] = 123;
Yep! Iterating through numpy arrays in python is slow. (Slower than iterating through a python list, as well.)
Typically, you avoid iterating through them directly.
If you can give us an example of the rule you're changing things based on, there's a good chance that it's easy to vectorize.
As a toy example:
import numpy as np
x = np.linspace(0, 8*np.pi, 100)
y = np.cos(x)
x[y > 0] = 100
However, in many cases you have to iterate, either due to the algorithm (e.g. finite difference methods) or to lessen the memory cost of temporary arrays.
In that case, have a look at Cython, Weave, or something similar.
The example you gave was presumably meant to set all items of a two-dimensional NumPy array to 123. This can be done efficiently like this:
a.fill(123)
or
a[:] = 123
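To get a feel for how large the gap is, here is a rough timeit sketch comparing the interpreted nditer loop with the whole-array assignment (the array is shrunk to 1000x1000 here so the slow version finishes quickly; absolute times will vary):
import timeit

setup = "import numpy as np; a = np.zeros((1000, 1000), dtype=np.int32)"

# Interpreted element-by-element loop
loop = """
with np.nditer(a, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = 123
"""

# Whole-array assignment done in C
vectorized = "a[:] = 123"

print(timeit.timeit(loop, setup=setup, number=1))
print(timeit.timeit(vectorized, setup=setup, number=1))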
Python is a much more dynamic language than C or C#. The main reason why the loop is so slow is that on every pass, the CPython interpreter is doing some extra work that wastes time: specifically, it is binding the name x with the next object from the iterator, then when it evaluates the assignment it has to look up the name x again.
As @Sven Marnach noted, you can call the array method fill() (i.e. a.fill(123)) and it is fast. That method is compiled C (or maybe Fortran), and it simply loops over the addresses of the numpy array's data and fills in the values. Much less dynamic than Python, which is good for this simple case.
But now consider PyPy. Once you run your program under PyPy, a JIT analyzes what your code is actually doing. In this example, it notes that the name x isn't used for anything but the assignment, and it can optimize away binding the name. This example should be one that PyPy speeds up tremendously; likely PyPy will be ten times faster than plain Python (so only one-tenth as fast as C, rather than 1/100 as fast).
http://pypy.org
As I understand it, PyPy won't be working with Numpy for a while yet, so you can't just run your existing Numpy code under PyPy yet. But the day is coming.
I'm excited about PyPy. It offers the hope that we can write in a very high-level language (Python) and yet get nearly the performance of writing things in "portable assembly language" (C). For examples like this one, the Numpy might even beat the performance of naive C code, by using SIMD instructions from the CPU (SSE2, NEON, or whatever). For this example, with SIMD, you could set four integers to 123 with each loop, and that would be faster than a plain C loop. (Unless the C compiler used a SIMD optimization also! Which, come to think of it, is likely for this case. So we are back to "nearly the speed of C" rather than faster for this example. But we can imagine trickier cases that the C compiler isn't smart enough to optimize, where a future PyPy might.)
But never mind PyPy for now. If you will be working with Numpy, it is a good idea to learn all the functions like ndarray.fill() that are there to speed up your code.
C++ emphasizes machine time over programmer time.
Python emphasizes programmer time over machine time.
PyPy is a Python implementation written in Python, and it has the beginnings of NumPy support; you might try that. PyPy has a nice JIT that makes things quite fast.
You could also try cython, which allows you to translate a dialect of Python to C, and compile the C to a Python C extension module; this allows one to continue using CPython for most of your code, while still getting a bit of a speedup. However, in the one microbenchmark I've tried comparing Pypy and Cython, Pypy was quite a bit faster than Cython.
Cython uses a highly pythonish syntax, but it allows you to pretty freely intermix Python datatypes with C datatypes. If you redo your hotspots with C datatypes, it should be pretty fast. Continuing to use Python datatypes is sped up by Cython too, but not as much.
The nditer code does not assign a value to the elements of a. This doesn't affect the timings issue, but I mention it because it should not be taken as a good use of nditer.
A correct version is:
for i in np.nditer(a, op_flags=[["readwrite"]]):
    i[...] = 123
The [...] is needed to retain the reference to loop value, which is an array of shape ().
There's no point in using a.T, since it's the values of the base a that get changed.
I agree that the proper way of doing this assignment is a[:]=123.
If you need to do operations on a multidimensional array that depend on the value of the array but don't depend on the position inside the array then .itemset is 5 times faster than nditer for me.
So instead of doing something like
image = np.random.random_sample((200, 200, 3))
with np.nditer(image, op_flags=['readwrite']) as it:
    for x in it:
        x[...] = x*4.5 if x < 0.2 else x
You can do this
image2 = np.random.random_sample((200, 200, 3))
for i in range(0, image2.size):
    x = image2.item(i)
    image2.itemset(i, x*4.5 if x < 0.2 else x)
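If the rule depends only on the element values, you can often skip the explicit loop entirely with a vectorized expression such as np.where, which is typically much faster than either loop above. A sketch of the same rule applied that way:
import numpy as np

image3 = np.random.random_sample((200, 200, 3))
# Same rule as above, applied to the whole array at once in C
image3 = np.where(image3 < 0.2, image3 * 4.5, image3)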

Why are there no ++ and --​ operators in Python?

Why are there no ++ and -- operators in Python?
It's not because it doesn't make sense; it makes perfect sense to define "x++" as "x += 1, evaluating to the previous binding of x".
If you want to know the original reason, you'll have to either wade through old Python mailing lists or ask somebody who was there (eg. Guido), but it's easy enough to justify after the fact:
Simple increment and decrement aren't needed as much as in other languages. You don't write things like for(int i = 0; i < 10; ++i) in Python very often; instead you do things like for i in range(0, 10).
Since it's not needed nearly as often, there's much less reason to give it its own special syntax; when you do need to increment, += is usually just fine.
It's not a decision of whether it makes sense, or whether it can be done--it does, and it can. It's a question of whether the benefit is worth adding to the core syntax of the language. Remember, this is four operators--postinc, postdec, preinc, predec, and each of these would need to have its own class overloads; they all need to be specified, and tested; it would add opcodes to the language (implying a larger, and therefore slower, VM engine); every class that supports a logical increment would need to implement them (on top of += and -=).
This is all redundant with += and -=, so it would become a net loss.
This original answer I wrote is a myth from the folklore of computing: debunked by Dennis Ritchie as "historically impossible" as noted in the letters to the editors of Communications of the ACM July 2012 doi:10.1145/2209249.2209251
The C increment/decrement operators were invented at a time when the C compiler wasn't very smart and the authors wanted to be able to state the direct intent that a machine-language operator should be used, saving a handful of cycles for a compiler which might do a
load memory
load 1
add
store memory
instead of
inc memory
and the PDP-11 even supported "autoincrement" and "autoincrement deferred" instructions corresponding to *++p and *p++, respectively. See section 5.3 of the manual if horribly curious.
As compilers are smart enough to handle the high-level optimization tricks built into the syntax of C, they are just a syntactic convenience now.
Python doesn't have tricks to convey intentions to the assembler because it doesn't use one.
I always assumed it had to do with this line of the zen of python:
There should be one — and preferably only one — obvious way to do it.
x++ and x+=1 do the exact same thing, so there is no reason to have both.
Of course, we could say "Guido just decided that way", but I think the question is really about the reasons for that decision. I think there are several reasons:
It mixes together statements and expressions, which is not good practice. See http://norvig.com/python-iaq.html
It generally encourages people to write less readable code
Extra complexity in the language implementation, which is unnecessary in Python, as already mentioned
Because, in Python, integers are immutable (int's += actually returns a different object).
Also, with ++/-- you need to worry about pre- versus post- increment/decrement, and it takes only one more keystroke to write x+=1. In other words, it avoids potential confusion at the expense of very little gain.
Clarity!
Python is a lot about clarity and no programmer is likely to correctly guess the meaning of --a unless s/he's learned a language having that construct.
Python is also a lot about avoiding constructs that invite mistakes and the ++ operators are known to be rich sources of defects.
These two reasons are enough not to have those operators in Python.
The decision that Python uses indentation to mark blocks rather than syntactical means such as some form of begin/end bracketing or mandatory end marking is based largely on the same considerations.
For illustration, have a look at the discussion around introducing a conditional operator (in C: cond ? resultif : resultelse) into Python in 2005.
Read at least the first message and the decision message of that discussion (which had several precursors on the same topic previously).
Trivia:
The PEP frequently mentioned therein is the "Python Enhancement Proposal" PEP 308. LC means list comprehension, GE means generator expression (and don't worry if those confuse you; they are not among the few complicated spots of Python).
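For reference, the conditional expression that eventually came out of that discussion (PEP 308) looks like this in modern Python:
x = 5
result = "big" if x > 3 else "small"   # resultif if cond else resultelse
print(result)  # big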
My understanding of why Python does not have a ++ operator is the following: when you write a = b = c = 1 in Python, you get three variables (labels) pointing at the same object (whose value is 1). You can verify this by using the id function, which returns an object's memory address:
In [19]: id(a)
Out[19]: 34019256
In [20]: id(b)
Out[20]: 34019256
In [21]: id(c)
Out[21]: 34019256
All three variables (labels) point to the same object. Now increment one of the variables and see how it affects the memory addresses:
In [22]: a = a + 1
In [23]: id(a)
Out[23]: 34019232
In [24]: id(b)
Out[24]: 34019256
In [25]: id(c)
Out[25]: 34019256
You can see that variable a now points to a different object than variables b and c. Because you used a = a + 1, this is explicitly clear: in other words, you assigned a completely different object to the label a. Imagine that you could write a++; it would suggest that you did not assign a new object to a but rather incremented the old one. All this is, IMHO, meant to minimize confusion. For a better understanding, see how Python variables work:
In Python, why can a function modify some arguments as perceived by the caller, but not others?
Is Python call-by-value or call-by-reference? Neither.
Does Python pass by value, or by reference?
Is Python pass-by-reference or pass-by-value?
Python: How do I pass a variable by reference?
Understanding Python variables and Memory Management
Emulating pass-by-value behaviour in python
Python functions call by reference
Code Like a Pythonista: Idiomatic Python
It was just designed that way. Increment and decrement operators are just shortcuts for x = x + 1. Python has typically adopted a design strategy which reduces the number of alternative means of performing an operation. Augmented assignment is the closest thing to increment/decrement operators in Python, and they weren't even added until Python 2.0.
I'm very new to Python, but I suspect the reason is the emphasis on the distinction between mutable and immutable objects within the language. Now, I know that x++ can easily be interpreted as x = x + 1, but it LOOKS like you're incrementing in-place an object which could be immutable.
Just my guess/feeling/hunch.
To complete already good answers on that page:
Let's suppose we decided to add a prefix increment (++i); that would conflict with the existing unary + and - operators.
Today, prefixing with ++ or -- does nothing, because it applies the unary plus operator twice (which does nothing) or the unary minus operator twice (which cancels itself out):
>>> i=12
>>> ++i
12
>>> --i
12
So that would potentially break that logic.
Now, if one needs it for list comprehensions or lambdas, from Python 3.8 it's possible with the new := assignment expression operator (PEP 572).
pre-incrementing a and assign it to b:
>>> a = 1
>>> b = (a:=a+1)
>>> b
2
>>> a
2
post-incrementing just needs to compensate for the premature add by subtracting 1:
>>> a = 1
>>> b = (a:=a+1)-1
>>> b
1
>>> a
2
I believe it stems from the Python creed that "explicit is better than implicit".
First, Python is only indirectly influenced by C; it is heavily influenced by ABC, which apparently does not have these operators, so it should not be any great surprise not to find them in Python either.
Secondly, as others have said, increment and decrement are supported by += and -= already.
Third, full support for a ++ and -- operator set usually includes supporting both the prefix and postfix versions of them. In C and C++, this can lead to all kinds of "lovely" constructs that seem (to me) to be against the spirit of simplicity and straight-forwardness that Python embraces.
For example, while the C statement while(*t++ = *s++); may seem simple and elegant to an experienced programmer, to someone learning it, it is anything but simple. Throw in a mixture of prefix and postfix increments and decrements, and even many pros will have to stop and think a bit.
The ++ class of operators are expressions with side effects. This is something generally not found in Python.
For the same reason an assignment is not an expression in Python, thus preventing the common if (a = f(...)) { /* using a here */ } idiom.
Lastly, I suspect that these operators are not very consistent with Python's reference semantics. Remember, Python does not have variables (or pointers) with the semantics known from C/C++.
As I understood it, it is so you won't think the value in memory is changed.
In C, when you do x++, the value of x in memory changes.
But in Python all numbers are immutable, hence the address that x pointed to still holds x, not x+1. When you write x++ you would think that x itself changes; what really happens is that the reference x is changed to point to a location in memory where x+1 is stored, or that object is created if it does not already exist.
Other answers have described why it's not needed for iterators, but it is sometimes useful when assigning to increment a variable in-line; you can achieve the same effect using tuples and multiple assignment:
b = ++a becomes:
a,b = (a+1,)*2
and b = a++ becomes:
a,b = a+1, a
Python 3.8 introduces the assignment operator :=, allowing us to achieve foo(++a) with
foo(a:=a+1)
foo(a++) is still elusive though.
Maybe a better question would be to ask why these operators exist in C. K&R calls increment and decrement operators 'unusual' (Section 2.8, page 46). The Introduction calls them 'more concise and often more efficient'. I suspect that the fact that these operations always come up in pointer manipulation also played a part in their introduction.
In Python it has probably been decided that it made no sense to try to optimise increments (in fact I just did a test in C, and it seems that the gcc-generated assembly uses addl instead of incl in both cases), and there is no pointer arithmetic; so it would have been just One More Way To Do It, and we know Python loathes that.
This may be because @GlennMaynard is looking at the matter in comparison with other languages, but in Python, you do things the Python way. It's not a 'why' question. It's there, and you can do things to the same effect with x += 1. The Zen of Python says: "There should be one, and preferably only one, obvious way to do it." Multiple choices are great in art (freedom of expression) but lousy in engineering.
I think this relates to the concepts of mutability and immutability of objects. 2, 3, 4, 5 are immutable in Python; 2 has a fixed id for the lifetime of the Python process.
x++ would essentially suggest an in-place increment, as in C. In C, x++ performs an in-place increment: with x = 3, x++ would increment the 3 in memory to 4, unlike Python, where the 3 would still exist in memory.
Thus in python, you don't need to recreate a value in memory. This may lead to performance optimizations.
This is a hunch based answer.
I know this is an old thread, but the most common use case for ++i is not covered: manually indexing collections when no indices are provided. This situation is why Python provides enumerate().
Example: in any given language, when you use a construct like foreach to iterate over a set (for the sake of the example we'll even say it's an unordered set) and you need a unique index for everything to tell the items apart, you might write:
i = 0
stuff = {'a': 'b', 'c': 'd', 'e': 'f'}
uniquestuff = {}
for key, val in stuff.items():
    uniquestuff[key] = '{0}{1}'.format(val, i)
    i += 1
In cases like this, Python provides the built-in enumerate function, e.g.
for i, (key, val) in enumerate(stuff.items()):
    uniquestuff[key] = '{0}{1}'.format(val, i)
In addition to the other excellent answers here, ++ and -- are also notorious for undefined behavior. For example, what happens in this code?
foo[bar] = bar++;
It's so innocent-looking, but it's wrong C (and C++), because you don't know whether the first bar will have been incremented or not. One compiler might do it one way, another might do it another way, and a third might make demons fly out of your nose. All would be perfectly conformant with the C and C++ standards.
(EDIT: C++17 has changed the behavior of the given code so that it is defined; it will be equivalent to foo[bar+1] = bar; ++bar; — which nonetheless might not be what the programmer is expecting.)
Undefined behavior is seen as a necessary evil in C and C++, but in Python, it's just evil, and avoided as much as possible.