This question already has answers here:
Possible Duplicate: Multiple assignment in Python
Closed 10 years ago.
As we learn right from when we start with C, all operations in a single thread on a computer occur one by one.
I have a question about Python 3. I have seen code that swaps variable values using the expression:
a,b = b,a
Or for Fibonacci series using:
a,b = b,a+b
How can these work? But they do work :O
Does the Python system internally create some temporary variable for these? What is the order of assignment such that both effectively give the correct result?
At a high level, you are creating two tuples, the left-hand side and the right-hand side, and assigning the right one to the left, which rebinds the variables one by one to the swapped values. Python is a higher-level language, so there are more abstractions like this compared to a language like C.
At a low level, you can see quite clearly what is happening by using the dis module, which can show you the python bytecode for a function:
>>> import dis
>>> def test(x, y):
... x, y = y, x
...
>>> dis.dis(test)
2 0 LOAD_FAST 1 (y)
3 LOAD_FAST 0 (x)
6 ROT_TWO
7 STORE_FAST 0 (x)
10 STORE_FAST 1 (y)
13 LOAD_CONST 0 (None)
16 RETURN_VALUE
What happens is it uses ROT_TWO to swap the order of the items on the stack, which is a very efficient way of doing this.
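The stack manipulation can be sketched in plain Python. This is only a rough simulation, using an ordinary list as the value stack:

```python
# Rough simulation of the bytecode above, with a Python list as the value stack.
x, y = 1, 2
stack = []
stack.append(y)                              # LOAD_FAST y
stack.append(x)                              # LOAD_FAST x
stack[-1], stack[-2] = stack[-2], stack[-1]  # ROT_TWO: swap the two top items
x = stack.pop()                              # STORE_FAST x
y = stack.pop()                              # STORE_FAST y
assert (x, y) == (2, 1)
```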
When you write a, b, you create a tuple.
>>> 1, 2
(1, 2)
So there is nothing special about the evaluation order.
Take the Fibonacci example with a=1 and b=1. First, the right-hand side is evaluated: b,a+b, resulting in the tuple (1,2). Next, the right-hand side is assigned to the left-hand side, namely a and b. So yes, the evaluation on the right is stored in memory, and then a and b are changed to point to these new values.
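The two steps can be made explicit with a temporary name. This is a sketch; the rhs variable is only for illustration, since CPython keeps the intermediate values on its internal stack instead:

```python
a, b = 1, 1
rhs = (b, a + b)   # right-hand side is fully evaluated first -> (1, 2)
a, b = rhs         # then unpacked into a and b
assert (a, b) == (1, 2)
```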
So I have this lambda expression: (λf.λx.f(f(f(x)))) (λg.λy.g(g(y)))(λz.z + 1)(0) and I'm trying to evaluate it by hand. The way I'm thinking about this is that (λf.λx.f(f(f(x)))) basically represents the expression f(f(f(x))). Then likewise (λg.λy.g(g(y))) represents the expression g(g(y)). Then g(g(y)) is passed in to replace f. So we get g(g(g(g(g(g(y)))))). Or g composed with itself 6 times. Then we pass in z+1 for g and then plug in 0 into that final expression, and we wind up with 6.
>>>(((lambda f: lambda x: f(f(f(x))))(lambda g: lambda y: g(g(y))))(lambda z: z+1))(0)
8
The problem is that when I verify this answer by evaluating the expression directly in Python, I get 8 as the answer.
So clearly I'm doing my evaluation wrong. I'm thinking of 2 g compositions being multiplied by 3 f compositions to get 6. But clearly I'm supposed to be thinking of it as 2^3 compositions but I don't understand why.
Your intuition is failing you somewhat. It's tempting to try to compose those two expressions the "intuitive" way, but doing so oversimplifies the problem. Let's take a look at the first couple of steps.
Here's your lambda expression with some of the extraneous parentheses removed for readability purposes
(λf.λx.f(f(fx))) (λg.λy.g(gy)) (λz.z+1) 0
Now, your intuition tells you that we "compose" the first two functions by letting f be λy.g(gy). But that's not really the case. See, we're not letting f be λy.g(gy); we're letting f be λg.λy.g(gy) (the g is still an argument at this time). So the first simplification applies as
(λx.(λg.λy.g(gy))((λg.λy.g(gy))((λg.λy.g(gy))x))) (λz.z+1) 0
It's a confusing mess, but the point is the bit that's being repeated still has a g argument. Then we plug in λz.z+1 for x, which is admittedly fairly simple
(λg.λy.g(gy)) ((λg.λy.g(gy))((λg.λy.g(gy))(λz.z+1))) 0
Alright. Now in the innermost redex we're going to let g be (λz.z+1). So, glossing over a few steps of the process, we're going to get a function on y which says "add one to this thing twice". i.e.
(λg.λy.g(gy)) ((λg.λy.g(gy))(λy.y+2)) 0
Okay, now let's do it again. The left-hand side of the redex is the same as before, so we're going to do the thing on the right-hand side twice. The thing on the right-hand side is "add one to this number", so we get
(λg.λy.g(gy)) (λy.y+4) 0
Finally, do it one last time. We're adding four to a number twice.
(λy.y+8) 0
And for the grand prize, zero plus eight is...
8
It's definitely a good exercise to walk through it step by step as Silvio did, but more simply:
You apply f->f^3 to g->g^2, isn't that g->(((g^2)^2)^2) = g^8?
and so (+1)^8 (0) = 8
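You can check that count directly in Python by naming the three pieces (the names three, two and succ are just for illustration):

```python
three = lambda f: lambda x: f(f(f(x)))  # apply f three times
two = lambda g: lambda y: g(g(y))       # apply g twice
succ = lambda z: z + 1

# three(two) doubles three times: 2 * 2 * 2 = 8 applications of succ
assert three(two)(succ)(0) == 8
```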
I have a tuple as below:
t=(1,2,3,4,5,6)
I want to convert it to a list. Although there is a straightforward way of
l=list(t)
I wanted to understand whether the below is less efficient, and if so, in what way?
l=[*t]
This is more about understanding whether unpacking and packing back into a list has any overhead vs list(tuple).
I'll try and benchmark the two and post the results here, but if anybody can throw some insight it would be great.
This is pretty easy to check yourself with the timeit and dis modules. I slapped together this script:
import timeit
import dis

def func(t):
    return list(t)

def unpack(t):
    return [*t]

def func_wrapper():
    t = (1,2,3,4,5,6)
    func(t)

def unpack_wrapper():
    t = (1,2,3,4,5,6)
    unpack(t)

print("Disassembly with function:")
print(dis.dis(func))
print("Dissassembly with unpack:")
print(dis.dis(unpack))
print("Func time:")
print(timeit.timeit(func_wrapper, number=10000))
print("Unpack time:")
print(timeit.timeit(unpack_wrapper, number=10000))
And running it shows this output:
Disassembly with function:
5 0 LOAD_GLOBAL 0 (list)
2 LOAD_FAST 0 (t)
4 CALL_FUNCTION 1
6 RETURN_VALUE
None
Dissassembly with unpack:
8 0 LOAD_FAST 0 (t)
2 BUILD_LIST_UNPACK 1
4 RETURN_VALUE
None
Func time:
0.002832347317420137
Unpack time:
0.0016913349487029865
The disassembly shows that the function method requires one additional function call over the unpacking method. The timing results show that, as expected, the overhead of the function call versus using a built-in opcode causes a significant increase in execution time.
By execution time alone, unpacking is more "efficient." But remember that execution time is only one part of the equation - this has to be balanced with readability and in some cases, memory consumption (which is harder to benchmark). In most cases, I would recommend you just stick with the function because it's easier to read. I would only switch to the unpacking method if this code is executed frequently (like in a long-running loop) and is on the critical path of your script.
Here are two simple examples. In the first example, the append method lookup produces a LOAD_ATTR instruction inside the loop; in the second it is produced only once and the result is saved in a variable (i.e. cached). Note: I know there is an extend method for this task, which is much faster than this.
setup = \
"""LIST = []
ANOTHER_LIST = [i for i in range(10**7)]
def appender(list, another_list):
    for elem in another_list:
        list.append(elem)
def appender_optimized(list, another_list):
    append_method = list.append
    for elem in another_list:
        append_method(elem)"""
import timeit
print(timeit.timeit("appender(LIST, ANOTHER_LIST)", setup=setup, number=10))
print(timeit.timeit("appender_optimized(LIST, ANOTHER_LIST)", setup=setup, number=10))
Results:
11.92684596051036
7.384205785584728
A 4.6 second difference (even for such a big list) is no joke - in my opinion such a difference cannot be counted as "micro-optimization". Why does Python not do it for me? Because the bytecode must be an exact reflection of the source code? Does the compiler optimize anything at all? For example,
def te():
    a = 2
    a += 1
    a += 1
    a += 1
    a += 1
produces
LOAD_FAST 0 (a)
LOAD_CONST 2 (1)
INPLACE_ADD
STORE_FAST 0 (a)
four times instead of optimizing it into a += 4. Or does it optimize well-known things, like producing a bit shift instead of multiplying by 2? Am I misunderstanding something about basic language concepts?
Python is a dynamic language. This means that you have a lot of freedom in how you write code. Due to the crazy amounts of introspection that python exposes (which are incredibly useful BTW), many optimizations simply cannot be performed. For example, in your first example, python has no way of knowing what datatype list is going to be when you call it. I could create a really weird class:
class CrazyList(object):
    def append(self, value):
        def new_append(value):
            print("Hello world")
        self.append = new_append
Obviously this isn't useful, but I can write this and it is valid python. If I were to pass this type to your above function, the code would be different than the version where you "cache" the append function.
We could write a similar example for += (it could have side-effects that wouldn't get executed if the "compiler" optimized it away).
In order to optimize efficiently, python would have to know your types ... And for a vast majority of your code, it has no (fool-proof) way to get the type data so it doesn't even try for most optimizations.
Please note that this is a micro optimization (and a well documented one). It is useful in some cases, but in most cases it is unnecessary if you write idiomatic python. e.g. your list example is best written using the .extend method as you've noted in your post. Most of the time, if you have a loop that is tight enough for the lookup time of a method to matter in your overall program runtime, then either you should find a way to rewrite just that loop to be more efficient or even push the computation into a faster language (e.g. C). Some libraries are really good at this (numpy).
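For the list example above, the idiomatic rewrite is a single extend call, which runs the whole loop in C with one attribute lookup:

```python
lst = []
another = list(range(5))
lst.extend(another)   # one lookup, one call, loop runs in C
assert lst == [0, 1, 2, 3, 4]
```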
With that said, there are some optimizations that can be done safely by the "compiler" in a stage known as the "peephole optimizer". e.g. It will do some simple constant folding for you:
>>> import dis
>>> def foo():
... a = 5 * 6
...
>>> dis.dis(foo)
2 0 LOAD_CONST 3 (30)
3 STORE_FAST 0 (a)
6 LOAD_CONST 0 (None)
9 RETURN_VALUE
In some cases, it'll cache values for later use, or turn one type of object into another:
>>> def translate_tuple(a):
... return a in [1, 3]
...
>>> import dis
>>> dis.dis(translate_tuple)
2 0 LOAD_FAST 0 (a)
3 LOAD_CONST 3 ((1, 3))
6 COMPARE_OP 6 (in)
9 RETURN_VALUE
(Note the list got turned into a tuple and cached -- In python3.2+ set literals can also get turned into frozenset and cached).
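You can verify both transformations by inspecting the compiled constants. This was checked against recent CPython; the exact behavior may vary by version:

```python
def in_list(a):
    return a in [1, 3]   # the compiler turns the list literal into a tuple constant

def in_set(a):
    return a in {1, 3}   # and the set literal into a frozenset constant (CPython 3.2+)

assert (1, 3) in in_list.__code__.co_consts
assert frozenset({1, 3}) in in_set.__code__.co_consts
```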
In general Python optimises virtually nothing. It won't even optimise trivial things like x = x. Python is so dynamic that doing so correctly would be extremely hard. For example the list.append method can't be automatically cached in your first example because it could be changed in another thread, something which can't be done in a more static language like Java.
I'm curious in Python why x[0] retrieves the first element of x while x[-1] retrieves the first element when reading in the reverse order. The syntax seems inconsistent to me since in the one case we're counting distance from the first element, whereas we don't count distance from the last element when reading backwards. Wouldn't something like x[-0] make more sense? One thought I have is that intervals in Python are generally thought of as inclusive with respect to the lower bound but exclusive for the upper bound, and so the index could maybe be interpreted as distance from a lower or upper bound element. Any ideas on why this notation was chosen? (I'm also just curious why zero indexing is preferred at all.)
The case for zero-based indexing in general is succinctly described by Dijkstra here. On the other hand, you have to think about how Python array indexes are evaluated. Since the index expression is evaluated first:
x = arr[index]
will first resolve and calculate index, and -0 obviously evaluates to 0, so it would be quite impossible to have arr[-0] indicate the last element.
y = -0 (??)
x = arr[y]
would hardly make sense.
EDIT:
Let's have a look at the following function:
def test():
    y = x[-1]
Assume x has been declared above in a global scope. Now let's have a look at the bytecode:
0 LOAD_GLOBAL 0 (x)
3 LOAD_CONST 1 (-1)
6 BINARY_SUBSCR
7 STORE_FAST 0 (y)
10 LOAD_CONST 0 (None)
13 RETURN_VALUE
Basically the global variable x (more precisely a reference to it) is pushed onto the stack. Then the array index is evaluated and pushed onto the stack. Then comes the instruction BINARY_SUBSCR, which implements TOS = TOS1[TOS] (where TOS means Top Of Stack). Then the top of the stack is popped into the variable y.
As BINARY_SUBSCR handles negative array indices, and -0 is evaluated to 0 before being pushed onto the stack, it would take major (and unnecessary) changes to the interpreter to have arr[-0] indicate the last element of the array.
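A sketch of what BINARY_SUBSCR does, simulated with a Python list standing in for the interpreter stack:

```python
# TOS = TOS1[TOS], simulated; the subscript itself resolves the negative index.
arr = [10, 20, 30]
stack = [arr, -1]        # TOS1 = the sequence, TOS = the index
tos = stack.pop()
tos1 = stack.pop()
stack.append(tos1[tos])  # -1 is interpreted relative to the end here
assert stack == [30]
```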
It's mostly for a couple of reasons:
Computers work with 0-based numbers
Older programming languages used 0-based indexing since they were low-level and closer to machine code
Newer, Higher-level languages use it for consistency and the same reasons
For more information: https://en.wikipedia.org/wiki/Zero-based_numbering#Usage_in_programming_languages
In many other languages that use 0-based indexes but without negative index implemented as python, to access the last element of a list (array) requires finding the length of the list and subtracting 1 for the last element, like so:
items[len(items) - 1]
In python the len(items) part can simply be omitted with support for negative index, consider:
>>> items = list(range(10))
>>> items[len(items) - 1]
9
>>> items[-1]
9
In python: 0 == -0, so x[0] == x[-0].
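The equivalence follows from how a negative index is normalized. Here is a sketch of the rule; the normalize helper is illustrative, not CPython's actual code:

```python
def normalize(i, n):
    # sketch of the rule: a negative index has the sequence length added once
    return i + n if i < 0 else i

items = list(range(10))
assert items[-1] == items[normalize(-1, len(items))] == 9
assert normalize(0, 10) == normalize(-0, 10) == 0  # -0 is literally 0
```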
Why is sequence indexing zero-based instead of one-based? It is a choice the language designer makes. Most languages I know of use 0-based indexing; XPath uses 1-based indexing for selection.
Using negative indexing is also a convention of the language. I'm not sure why it was chosen, but it allows wrapping around the sequence by simple addition (subtraction) on the index.
This question already has answers here:
Is it possible to implement a Python for range loop without an iterator variable?
(15 answers)
Closed 7 months ago.
Suppose you want to write a program that asks the user for X numbers and stores them in A, then for Y numbers and stores them in B.
BEFORE YOU VOTE FOR CLOSING : yes, this question has been asked here, here, here, here and possibly elsewhere, but I reply to each of the proposed solutions in this question explaining why they're not what I'm looking for, so please keep reading before voting for closing IF you decide it's a duplicate. This is a serious question, see last paragraph for a small selection of languages supporting the feature I'm trying to achieve here.
A = []
B = []

# First possibility: using while loops

# you need to have a counter
i = 0
while (i < X):
    A.append(input())
    # and increment it yourself
    i += 1

# you need to reset it
i = 0
while (i < Y):
    B.append(input())
    # and increment it again
    i += 1
So basically you need to repeat a certain thing X times, then another thing Y times, but Python has only while loops and for loops. The while loop is not for repeating a certain thing N times, but for repeating things while a certain condition is true; that's why you have to use a counter, initialize it, increment it, and test against it.
Second solution is to use for loops :
# No need to create a counter
for x in xrange(X):
    A.append(input())

# No need to increment the counter
# no need to reset the counter
for x in xrange(Y):
    B.append(input())
This is a lot better :), but there's still something slightly annoying: why would I still have to supply "x" when I don't need it? In my code, I don't need to know which loop iteration I am in, I just need to repeat things N times, so why do I have to create a counter at all? Isn't there a way to completely get rid of counters? Imagine you could write something like:
# No need to supply any variable!
repeat(X):
    A.append(input())
repeat(Y):
    B.append(input())
Wouldn't that be more elegant? Wouldn't that reflect your intention more accurately to readers of your code? Unfortunately, Python isn't flexible enough to allow creating new language constructs (more advanced languages allow this). So I still have to use either for or while to loop over things. With that restriction in mind, how do I get as close as possible to the code above?
Here's what I tried
def repeat(limit):
    if hasattr(repeat, "counter"):
        repeat.counter += 1
        if repeat.counter < limit:
            return True
        del repeat.counter  # reset, so the next (non-nested) loop starts fresh
        return False
    repeat.counter = 0
    return limit > 0

while repeat(X):
    A.append(input())
while repeat(Y):
    B.append(input())
This solution works for simple loops, but obviously not for nested loops.
My question is
Do you know any other possible implementation that would allow me to get the same interface as repeat ?
Rebuttal of suggested solutions
suggested solution 1.1 : use itertools.repeat
You'll have to place your code inside a function and call that function via itertools.repeat somehow. That's different from the above repeat implementation, which you can use over arbitrary indented blocks of code.
suggested solution 1.2 : use itertools.repeat in a for loop
import itertools
for _ in itertools.repeat(None, N):
    do_something()
You're still writing a for loop, and for loops need a variable (you're providing _), which completely misses the point. Look again:
while repeat(5):
    do_x()
    do_y()
    do_z()
Nothing is provided just for the sake of looping. This is what I would like to achieve.
suggested solution 2 : use itertools.count
1) You're halfway there. Yes, you get rid of manually incrementing the counter, but you still need to create a counter manually. Example:
c = itertools.count(1, 1)  # You still need this line
while c.next() < 7:
    a = 1
    b = 4
    d = 59
    print "haha"
    # No need to increment anything, that's good.
    # c.incr() or whatever that would be
You also need to reset the counter to 0. This, I think, could be hidden inside a function if you used the with statement on it, so that at the end of every with statement the counter gets reset.
suggested solution 3 : use a with statement
I have never used it, but from what I've seen, it's used for factoring try/catch boilerplate into another function. How can you use this to repeat things? I'd be excited to learn how to use my first with statement :)
suggested solution 4 : use an iterator/generator, xrange, list comprehensions...
No, no, no...
If you write :
[X() for _ in xrange(4)]
you are :
1) creating a list for values you don't care about.
2) supplying a counter (_)
if you write :
for _ in xrange(4):
    X()
see point 2) above
if you write :
def repeat(n):
    i = 0
    while i < n:
        yield i
        i += 1
Then how are you supposed to use repeat ? like this ? :
for x in repeat(5):
    X()
then see point 2) above. Also, this is basically xrange, so why not use xrange instead.
Any other cool ideas ?
Some people asked me what language I come from, because they're not familiar with this repeat(N) syntax. My main programming language is Python, but I'm sure I've seen this syntax in other languages. A visit to my bookmarks followed by an online search shows that the following languages have this cool syntax:
Ruby
It's just called times instead of repeat.
#!/usr/bin/env ruby
3.times do
  puts "This will be printed 3 times"
end

print "Enter a number: "
num = gets.chomp.to_i
num.times do
  puts "Ruby is great!"
end
Netlogo
pd repeat 36 [ fd 1 rt 10 ]
;; the turtle draws a circle
AppleScript
repeat 3 times
    --commands to repeat
end repeat
Verilog
module repeat_example();
reg [3:0] opcode;
reg [15:0] data;
reg temp;

always @ (opcode or data)
begin
  if (opcode == 10) begin
    // Perform rotate
    repeat (8) begin
      #1 temp = data[15];
      data = data << 1;
      data[0] = temp;
    end
  end
end

// Simple test code
initial begin
  $display (" TEMP DATA");
  $monitor (" %b %b ", temp, data);
  #1 data = 16'hF0;
  #1 opcode = 10;
  #10 opcode = 0;
  #1 $finish;
end

endmodule
Rebol
repeat count 50 [print rejoin ["This is loop #: " count]]
Forth
10 0 DO something LOOP will repeat something 10 times, and you don't have to supply a counter (Forth automatically stores it in a special variable, I).
: TEST 10 0 DO CR ." Hello " LOOP ;
Groovy
You can either write 1.upto(4){ code } or 4.times{ code } to execute code 4 times. Groovy is a blast when it comes to looping; it has about half a dozen ways to do it (upto, times, step, Java for, Groovy for, foreach, and while).
// Using int.upto(max).
0.upto(4, createResult)
assert '01234' == result
// Using int.times.
5.times(createResult)
assert '01234' == result
// Using int.step(to, increment).
0.step 5, 1, createResult
assert '01234' == result
WARNING: DO NOT DO THIS IN PRODUCTION!
I highly recommend you go with one of the many list comprehension solutions that have been offered, because those are infinitely less hacky. However, if you're adventurous and really, really, really want to do this, read on...
We can adapt your repeat solution slightly so that nesting works, by storing state that changes depending on where it's being called. How do we do that? By inspecting the stack, of course!
import inspect

state = {}

def repeat(limit):
    s = tuple((frame, line) for frame, _, line, _, _, _ in inspect.stack()[1:])
    counter = state.setdefault(s, 0)
    if counter < limit:
        state[s] += 1
        return True
    else:
        del state[s]
        return False
We set up a state dict to keep track of current counters for each unique place the repeat function is called, by keying off of a tuple of (stack_frame, line_number) tuples. This has the same caveat as your original repeat function in that break literally breaks everything, but for the most part seems to work. Example:
while repeat(2):
    print("foo")
    while repeat(3):
        print("bar")
Output:
foo
bar
bar
bar
foo
bar
bar
bar
Surely you are over-thinking things. If I understand your (rather lengthy) question correctly, you want to avoid repeating code. In that case your first port of call should be to write a function. This seems relatively simple: what you need is a function that takes an argument n (I have used n simply because it reminds me of integers) and returns a list containing that many input elements. Something like (untested):
def read_inputs(n):
    ret_val = []
    for i in range(n):
        ret_val.append(input())
    return ret_val
The remainder of your code is simply:
A = read_inputs(X)
B = read_inputs(Y)
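A quick way to check a function like read_inputs without typing anything is to stub out the input source. The fake_input helper and the input_fn parameter are hypothetical additions, just for testing:

```python
def read_inputs(n, input_fn=input):
    # same function as above, with the input source parameterized for testing
    ret_val = []
    for i in range(n):
        ret_val.append(input_fn())
    return ret_val

feed = iter(["1", "2", "3"])
def fake_input():
    return next(feed)

assert read_inputs(3, fake_input) == ["1", "2", "3"]
```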