Odd threading behavior in python - python

I have a problem where I need to pass the index of an array to a function which I define inline. The function then gets passed as a parameter to another function which will eventually call it as a callback.
The thing is, when the code gets called, the value of the index is all wrong. I eventually solved this by creating an ugly workaround but I am interested in understanding what is happening here. I created a minimal example to demonstrate the problem:
from __future__ import print_function
import threading
def works_as_expected():
    for i in range(10):
        run_in_thread(lambda: print('the number is: {}'.format(i)))
def not_as_expected():
    for i in range(10):
        run_later_in_thread(lambda: print('the number is: {}'.format(i)))
def run_in_thread(f):
    threading.Thread(target=f).start()
threads_to_run_later = []
def run_later_in_thread(f):
    threads_to_run_later.append(threading.Thread(target=f))
print('this works as expected:\n')
works_as_expected()
print('\nthis does not work as expected:\n')
not_as_expected()
for t in threads_to_run_later: t.start()
Here is the output:
this works as expected:
the number is: 0
the number is: 1
the number is: 2
the number is: 3
the number is: 4
the number is: 6
the number is: 7
the number is: 7
the number is: 8
the number is: 9
this does not work as expected:
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
the number is: 9
Can someone explain what is happening here? I assume it has to do with enclosing scope or something, but an answer with a reference that explains this dark (to me) corner of python scoping would be valuable to me.
I'm running this on python 2.7.11

This is a result of how closures and scopes work in python.
What is happening is that i is bound within the scope of the not_as_expected function. So even though you're feeding a lambda function to the thread, the variable it's using is being shared between each lambda and each thread.
Consider this example:
def make_function():
    i = 1
    def inside_function():
        print(i)
    i = 2
    return inside_function
f = make_function()
f()
What number do you think it will print? The i = 1 before the function was defined or the i = 2 after?
It's going to print the current value of i (i.e. 2). It doesn't matter what the value of i was when the function was made, it's always going to use the current value. The same thing is happening with your lambda functions.
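One way to see this for yourself, as a minimal sketch: a nested function keeps its captured variables in cells, reachable through the function's __closure__ attribute. The example below restates make_function so that the inner function returns i instead of printing it (that change is mine, purely to make the value easy to inspect).

```python
def make_function():
    i = 1
    def inside_function():
        return i  # looked up in the enclosing scope at call time
    i = 2
    return inside_function

f = make_function()
free_names = f.__code__.co_freevars          # names captured from the enclosing scope
cell_value = f.__closure__[0].cell_contents  # what the cell for i holds now
print(free_names, cell_value, f())
```

The cell holds a reference to the variable, not a snapshot of the value it had when inside_function was defined.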
Even in your expected results you can see it didn't always work right, it skipped 5 and displayed 7 twice. What is happening in that case is that each lambda is usually running before the loop gets to the next iteration. But in some cases (like the 5) the loop manages to get through two iterations before control is passed to one of the other threads, and i increments twice and a number is skipped. In other cases (like the 7) two threads manage to run while the loop is still in the same iteration and since i doesn't change between the two threads, the same value gets printed.
If you instead did this:
def function_maker(i):
    return lambda: print('the number is: {}'.format(i))
def not_as_expected():
    for i in range(10):
        run_later_in_thread(function_maker(i))
The i variable gets bound inside function_maker along with the lambda function. Each lambda function will be referencing a different variable, and it will work as expected.
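A similar effect can be had without writing a helper, using functools.partial from the standard library, which stores the argument value at the moment the partial object is created. The sketch below is an illustration rather than the asker's code: record, results and results_lock are hypothetical names, and the threads are joined so the collected values can be checked.

```python
import threading
from functools import partial

results = []
results_lock = threading.Lock()

def record(i):
    # Append under a lock so concurrent threads do not interleave badly.
    with results_lock:
        results.append(i)

# Each partial object stores its own value of i at creation time.
threads = [threading.Thread(target=partial(record, i)) for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```

The order in which the threads run is still not guaranteed, which is why the results are sorted before printing; what is guaranteed is that every value from 0 to 9 appears exactly once.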

A closure in Python captures the free variables, not their current values at the time of the creation of the closure. For example:
def capture_test():
    i = 1
    def foo():
        return i
    def bar():
        return i
    print(foo(), bar()) # 1 1
    i = 2
    print(foo(), bar()) # 2 2
In Python you can also capture variables and write to them:
def incdec():
    counter = 0
    def inc(x):
        nonlocal counter
        counter += x
        return counter
    def dec(x):
        nonlocal counter
        counter -= x
        return counter
    return inc, dec
i1, d1 = incdec()
i2, d2 = incdec()
print(i1(10), i1(20), d1(3)) # 10 30 27
print(i2(100), d2(5), d2(20)) # 100 95 75
print(i1(7), d2(9)) # 34 66
As you see incdec returns a pair of two closures that captured the same variable and that are incrementing/decrementing it. The variable shared by i1/d1 is however different from the variable shared by i2/d2.
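Note that nonlocal only exists in Python 3. On Python 2.7 (which the question uses) a common workaround, sketched here, is to keep the value inside a mutable container such as a one-element list: the closures never rebind the name counter, they only mutate the list it refers to.

```python
def incdec():
    counter = [0]  # one-element list used as a mutable cell
    def inc(x):
        counter[0] += x  # mutates the list; the name counter is never rebound
        return counter[0]
    def dec(x):
        counter[0] -= x
        return counter[0]
    return inc, dec

i1, d1 = incdec()
values = (i1(10), i1(20), d1(3))
print(values)
```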
One common mistake is for example to expect that
L = []
for i in range(10):
    L.append(lambda : i)
for x in L:
    print(x())
will display the numbers from 0 to 9. In fact, all of the unnamed closures here captured the same loop variable i, so every one of them returns the same value (9) when called.
The common Python idiom to solve this problem is
L.append(lambda i=i: i)
i.e. using the fact that default values for parameters are evaluated at the time the function is created. With this approach each closure will return a different value because they're returning their private local variable (a parameter that has a default).
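Put together as a runnable sketch (late and early are names chosen here for illustration), the difference looks like this. One caveat of the idiom is that the default is only a default: a caller who passes an argument overrides it.

```python
late = []
early = []
for i in range(10):
    late.append(lambda: i)       # looks up i when called
    early.append(lambda i=i: i)  # default evaluated now, at definition time

late_values = [f() for f in late]
early_values = [f() for f in early]
print(late_values)
print(early_values)
print(early[3](42))  # an explicit argument overrides the stored default
```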


Python recursive function - refer to current and previous arguments

I pass new arguments (previous_high, previous_score) to announce_highest, so how do I refer to the old arguments in order to do a comparison?
Sorry if this question is quite basic, still learning how recursive functions work!
def announce_highest(who, previous_high=0, previous_score=0):
    """Return a commentary function that announces when WHO's score
    increases by more than ever before in the game.
    >>> f0 = announce_highest(1) # Only announce Player 1 score gains
    >>> f1 = f0(11, 0)
    >>> f2 = f1(11, 1)
    1 point! That's the biggest gain yet for Player 1
    >>> f3 = f2(20, 1)
    >>> f4 = f3(5, 20) # Player 1 gets 4 points, then Swine Swap applies
    19 points! That's the biggest gain yet for Player 1
    >>> f5 = f4(20, 40) # Player 0 gets 35 points, then Swine Swap applies
    20 points! That's the biggest gain yet for Player 1
    >>> f6 = f5(20, 55) # Player 1 gets 15 points; not enough for a new high
    """
    assert who == 0 or who == 1, 'The who argument should indicate a player.'
    # BEGIN PROBLEM 7
    gain = #compare previous argument to current argument
    if gain > previous_high:
        if gain == 1:
            print(gain, 'point! That\'s the biggest gain yet for Player', who)
        else:
            print(gain, 'points! That\'s the biggest gain yet for Player', who)
    return announce_highest(some args)
    # END PROBLEM 7
The problem from CS61A is described here.
Your first mistake is trying to make this a recursive function. They are asking you to write a higher-order function (a higher-order function is a function that either takes a function as arguments or returns a function as part of its output; in our case we want one that returns a function).
Specifically, we want announce_highest to return a closure. A closure is basically a function (something that takes input, processes it, and returns output) that additionally has its own local environment of defined internal variables that it inherited from the environment where the closure was defined.
Your closure should have three internal variables:
who, the player to announce when their score increases (either 0 or 1)
previous_high, the biggest gain the player you are tracking has made so far
previous_score, the score that the player you are tracking last had
and this closure takes two parameters current_player0_score and current_player1_score.
Without doing your homework for you, here are two examples of a similar higher-order function that returns a closure. First, I create a my_counter_factory closure creator. It's a function that, when called, creates a counter which announces when the count is even or odd (depending on the values of announce_if_even_count and announce_if_odd_count when it was initially created).
def my_counter_factory(initial_count=0, announce_if_even_count=True, announce_if_odd_count=False):
    count = initial_count
    def counter(increment_by):
        new_count = count + increment_by
        if announce_if_even_count and new_count % 2 == 0:
            print("Count is even %s" % new_count)
        if announce_if_odd_count and new_count % 2 == 1:
            print("Count is odd %s" % new_count)
        return my_counter_factory(new_count, announce_if_even_count, announce_if_odd_count)
    return counter
Which when run will work as:
>>> c0 = my_counter_factory(10, True, True) # count = 10, announce count if even or odd
>>> c1 = c0(5) # announces 15 creates new counter `c1` with count of 15
# note count of c0 is still 10.
Count is odd 15
>>> c2 = c1(3) # count is 18 (stored in new closure c2) by adding 3 to c1 count.
Count is even 18
# See value of closures by incrementing 0 to them:
>>> ignore = c0(0)
Count is even 10
>>> ignore = c1(0)
Count is odd 15
>>> ignore = c2(0)
Count is even 18
Note every time this closure is called, it returns a brand new closure (with its own new environment). An equally valid choice (that behaves slightly differently) is to not create new closures each time, but keep returning back the same closure.
def my_counter_factory(initial_count=0, announce_if_even_count=True, announce_if_odd_count=False):
    count = initial_count
    def counter(increment_by):
        nonlocal count # more on this below.
        count += increment_by
        if announce_if_even_count and count % 2 == 0:
            print("Count is even %s" % count)
        if announce_if_odd_count and count % 2 == 1:
            print("Count is odd %s" % count)
        return counter # This returns back same closure each time.
    return counter
This will work as:
>>> c0 = my_counter_factory(10, True, True) # count = 10, announce count if even or odd
>>> c1 = c0(5) # adds 5 to count of c0 closure (was 10, now 15) announces 15
# c1 is a reference to the same closure as c0 with the same count
Count is odd 15
>>> c2 = c1(3) # adds 3 to count of c1 closure (now at 18),
Count is even 18
# See value of closures by incrementing 0 to them:
>>> ignore = c0(0)
Count is even 18
>>> ignore = c1(0)
Count is even 18
>>> ignore = c2(0)
Count is even 18
>>> c0 == c1 == c2
True
# Note in this second example all three closures are identical,
# because the closure merely returns a reference to itself when called
# Granted you could create a brand new closure
>>> new_c = my_counter_factory(0, True, False)
# This one starts with a count of 0 and only announces when it's even
>>> new_c2 = new_c(5)
Final note: you may wonder why the second example needs the line nonlocal count while the first does not; if you took it out, you would get an error saying local variable 'count' referenced before assignment. A closure in Python may reference variables from the environment it was defined in without the nonlocal keyword (introduced in Python 3), as long as it doesn't re-assign them. Basically, when Python compiles your function it differentiates between local variables (assigned within the function) and variables defined elsewhere. For example:
>>> def print_0_to_4():
...     for i in range(5):
...         print(i, end=", ")
...     print("")
...
>>> i=-1
>>> print_0_to_4()
0, 1, 2, 3, 4,
>>> i
-1
The important thing to note is we called a function print_0_to_4 which assigned to a variable i, which made i a local variable. Changes to the local variable in the function don't modify the values of variables with the same name from the outer environment. (Otherwise programming would be ridiculously hard because we need to know the names of internal variables of every function we'd call for fear of unintentionally modifying our variables when calling a function).
Also note that if you don't assign to a variable in a function, it is fine to reference a variable defined in another scope (without needing the nonlocal keyword).
>>> i = -1
>>> def print_i_from_outer_scope():
...     print(i)
...
>>> print_i_from_outer_scope()
-1
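The error mentioned earlier is easy to reproduce. In this minimal sketch (make_counter_broken and bump are hypothetical names), the augmented assignment makes count local to bump, so reading it before it has a value fails:

```python
def make_counter_broken():
    count = 0
    def bump():
        count += 1  # assignment makes count a local variable of bump
        return count
    return bump

try:
    make_counter_broken()()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)
```

Adding nonlocal count as the first line of bump (Python 3 only) makes the assignment target the enclosing variable instead.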

How do you create new commands in Python when immutable data types are involved? [duplicate]

This question already has answers here:
How do I pass a variable by reference?
(39 answers)
Closed 5 years ago.
In Python, the following code prints '0', not '1'.
def inc(x):
    x = x+1
a = 0
inc(a)
print(a)
I understand why this happens; it's because integers are immutable. What I don't understand is how to get around this behaviour when it's undesirable. Suppose we want to create a new command such that the code
a = 0
inc(a)
print(a)
prints '1'.
Obviously, the naive approach won't do it. What can we do instead?
A similar (slightly more general) question can be found here, along with a discussion of how Python passes parameters to functions. In short, unless you make the x variable in your code a mutable object, I believe there's nothing we can do. Of course, you can alter your code to, e.g., return the changed value from inc() and print that (i.e. print(inc(x))), or just do the printing from inside inc(), but that's not really what you're looking for.
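As a rough sketch of what making it an object could look like (the Box class is a hypothetical helper, not something from the linked discussion): wrap the number in a small mutable object and let the function change an attribute instead of rebinding its parameter.

```python
class Box(object):
    """Tiny mutable holder for a single value."""
    def __init__(self, value=0):
        self.value = value

def inc(box):
    box.value += 1  # mutates the shared object; the caller's name is not rebound

a = Box(0)
inc(a)
print(a.value)
```

The call site keeps the shape inc(a) from the question, at the cost of reading the number through a.value.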
If I understand correctly, you are trying to increment the variable a using the function inc(var), passing a to inc() as an external variable.
As @Marko Andrijevic stated, the variable passed to inc() and the variable x defined inside the function are different. One way to achieve this is to return the value of x and collect it outside the function, which may not be what you are looking for.
Alternatively, since you have defined the variable a outside the function, it is a global variable.
If you want to manipulate it from inside a function, you need to declare that variable (a in your case) as global inside the function. Something like below.
def inc(x):
    global a
    a = x+1
Now the new value assigned to a by x+1 is retained after inc(x) has executed.
>>> a = 0
>>> inc(a)
>>> a
1
EDIT -1
As @DYZ correctly pointed out in the comments, declaring global a inside inc() means the function always updates a, whatever variable is passed in.
A better alternative, in that case, is to return the new value from inc() and assign it to whichever external variable you like.
Not an elegant solution, but it works as intended.
def inc(x):
    return x+1
Result
>>> a
0
>>> a = inc(a)
>>> a
1
>>> a = inc(a)
>>> a
2
>>> b = 0
>>> b = inc(b)
>>> b
1
>>> a
2
>>>
One can use yield to get variable values.
def inc(x, y, z):
    x += 1
    y += 1
    z += 1
    yield x, y, z  # inc doesn't stop here
    yield x + y + z
a = b = c = 0
gen = inc(a, b, c)
gen = list(gen)
a, b, c, total = gen[0] + (gen[1],)  # however, the index must still be known
print(a, b, c, total)

Deep copy index integer using lambda

func = []
for value in range(5):
    func.append(lambda: print_index(value))
def print_index(index_to_print):
    print(index_to_print)
for a in func:
    a()
When I run the code above, the output is
4
4
4
4
4
Why is it not the following?
0
1
2
3
4
What can I do to make it print the above?
I have tried importing copy and using copy.deepcopy(index). It didn't work, probably because index is an integer.
Is lambda part of the reason it is not working?
Thank you for your help!
Not quite sure what you are trying to achieve, but the reason it prints all 4s is that Python looks up variable names late: each lambda uses the value that value has when the function is executed, not the value it had when the function was defined.
If you really need a function to print the value then you need to create a closure, which means creating the function inside another function, e.g.:
def make_closure(v):
    return lambda: print_index(v)
func = []
for value in range(5):
    func.append(make_closure(value))
Now this outputs:
In []:
for a in func:
a()
Out[]:
0
1
2
3
4
As @AChampion said, Python looks up the value of value when it is needed, not while the loop is executing.
However, I found the following fix easier.
func = []
for value in range(5):
    func.append(lambda v = value: print_index(v))
def print_index(index_to_print):
    print(index_to_print)
for a in func:
    a()
This sets a default argument on the function, and defaults are evaluated when the function is defined, i.e. during each loop iteration. The code above is equivalent to:
func = []
for value in range(5):
    def anonymous_function(v=value):
        print_index(v)
    func.append(anonymous_function)
def print_index(index_to_print):
    print(index_to_print)
for a in func:
    a()
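To check that each function really carries its own value, you can look at the __defaults__ attribute, where Python stores the evaluated defaults. A small sketch (the lambdas return v directly here instead of calling print_index, to keep it short):

```python
func = []
for value in range(5):
    func.append(lambda v=value: v)

# Each function object holds its own tuple of already-evaluated defaults.
stored = [f.__defaults__ for f in func]
print(stored)
```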

Python Dispatcher Definitions in a Function [duplicate]

This question already has answers here:
Local variables in nested functions
(4 answers)
Closed 7 years ago.
I'm running into issues with Python dispatcher definitions changing each time I add a new function to the dispatcher. An example:
def dispatcher_create():
    dispatcher = {}
    for i in range(5):
        def square():
            return i**2
        dispatcher[i] = square
    for j in range(5):
        print(dispatcher[j]())
    return dispatcher
This code prints out the value 16 five times. I was hoping it would print out 0 1 4 9 16 instead. I'm sure it's an issue with me redefining square every time, but I'm not sure how best to fix it.
The i in return i**2 is bound to the name i, not the value i.
Try this to create a new variable, bound to the appropriate value:
def dispatcher_create():
    dispatcher = {}
    for i in range(5):
        def square(i=i):
            return i**2
        dispatcher[i] = square
    for j in range(5):
        print(dispatcher[j]())
    return dispatcher
dispatcher_create()
No, redefining square() every time is doing what you want, which you can check by printing the contents of dispatcher: all the functions in it will have different id's as desired.
The problem is that you're creating a closure so when you call any of the functions stored in dispatcher they are accessing the latest value of i, rather than using the value that i had when they were defined.
Robᵩ has shown one way around that, by passing i as an arg to square(); here's another way: using another closure which takes i as an arg so it can preserve it for the squaring function it makes.
def dispatcher_create():
    dispatcher = {}
    def make_square(j):
        def square():
            return j**2
        return square
    for i in range(5):
        dispatcher[i] = make_square(i)
    return dispatcher
dd = dispatcher_create()
print(dd)
for j in range(5):
    print(dd[j]())
typical output
{0: <function square at 0xb73a0a3c>, 1: <function square at 0xb73a0dbc>, 2: <function square at 0xb73a0d84>, 3: <function square at 0xb73a5534>, 4: <function square at 0xb73a517c>}
0
1
4
9
16
Robᵩ's version is a little simpler, but this version has the advantage that the functions in dispatcher have the desired argument signature, i.e., they take no argument, whereas Robᵩ's functions take a single argument with a default value that you can over-ride.
FWIW, you could use i instead of j as the parameter for make_square(), since it's just a local variable to make_square(). OTOH, using i there shadows the i in the outer scope, and I feel that using j is slightly less confusing.
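To make the signature difference concrete, here is a small comparison sketch. with_default and with_factory are names made up for this illustration; they build the same table using the two approaches.

```python
def with_default():
    table = {}
    for i in range(5):
        def square(i=i):  # value stored as a default argument
            return i ** 2
        table[i] = square
    return table

def with_factory():
    def make_square(j):
        def square():  # value stored in make_square's scope
            return j ** 2
        return square
    return {i: make_square(i) for i in range(5)}

d1 = with_default()
d2 = with_factory()
print(d1[3](), d2[3]())
print(d1[3](10))  # the default can be overridden, perhaps by accident
try:
    d2[3](10)
    rejected = False
except TypeError:
    rejected = True  # the factory version accepts no arguments
print(rejected)
```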
I believe it has to do with scope. The i variable remains defined after the for loop it is used in. If you print the value of i after the loop, you will see it is 4. When you then call the functions in the next for loop, the value of i that is in the current scope is used.
As for the solution, I think functools.partial would be a good choice.
from functools import partial
def dispatcher_create():
    dispatcher = {}
    for i in range(5):
        def square(value):
            return value**2
        dispatcher[i] = partial(square, i)
    for j in range(5):
        print(dispatcher[j]())
    return dispatcher

Hole-in-scope, dead code or why such output?

Code
def change1(list1):
    list1[1] = list1[1] + 5
def change2(number):
    number = number + 2
def main():
    numbers = [4, 8, 12]
    change1(numbers)
    variable = 15
    change2(variable)
    i = 0
    while i < 3:
        print(numbers[i])
        i += 1
    print(variable)
main()
When I read it, I thought it would output 4 8 12 15, but it outputs 4 13 12 15. I can see here that Python treats integers and lists differently; I assumed the latter was impossible without global. I cannot understand the output: in that case, why does it not output 4 13 12 17?
You can see here almost identical code with different types and different reference:
$ python test2.py
4
13
12
15
$ python test3.py
4
13
12
17
$ cat test2.py test3.py
Pass-by-reference examples
test2.py: pass-by-reference and mutable data type example. A table/list is not enough to affect the local variable in main; you need the reference!
def change1(list1):
    list1[1] = list1[1] + 5
def change2(number):
    number = [x+2 for x in number]
def main():
    numbers = [4, 8, 12]
    change1(numbers)
    variable = [15]
    change2(variable)
    i = 0
    while i < 3:
        print(numbers[i])
        i += 1
    print(variable[0])
main()
test3.py: pass-by-reference example, changing a mutable data type list/table outside the main function
def change1(list1):
    list1[1] = list1[1] + 5
def change2(number):
    number[0] += 2
def main():
    numbers = [4, 8, 12]
    change1(numbers)
    variable = [15]
    change2(variable)
    i = 0
    while i < 3:
        print(numbers[i])
        i += 1
    print(variable[0])
main()
pass-by-value examples
test4.py: trying to find an example with pass-by-value; why does it not work?
$ cat test4.py
# Not yet a pass-by-value example!
global variable
variable = [15]
def change1(list1):
    list1[1] = list1[1] + 5
def change2(number):
    number = [x+2 for x in number]
def main():
    numbers = [4, 8, 12]
    change1(numbers)
    #variable = 15
    change2(variable)
    i = 0
    while i < 3:
        print(numbers[i])
        i += 1
    print(variable[0])
main()
$ python test4.py
4
13
12
15 # I expected 17! Why no 17?
def change1(list1):
    # `list1[1] =` means you are changing the object passed in
    list1[1] = list1[1] + 5
def change2(number):
    # `number = ` means you create a **new local variable**, number,
    # based on the `number` you passed in
    number = [x+2 for x in number]
So if you want to change existing objects, you have to reference them in some way, for example in
def change3(number):
    # `number[:]` is the whole existing list and you overwrite it
    number[:] = [x+2 for x in number]
Note the [ .. ] when changing a list.
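A quick way to convince yourself, as a sketch: compare the caller's list before and after each call. change2 and change3 are the functions from above; before_id and after_change2 are illustrative names.

```python
def change2(number):
    number = [x + 2 for x in number]     # rebinds the local name only

def change3(number):
    number[:] = [x + 2 for x in number]  # overwrites the contents in place

variable = [15]
before_id = id(variable)

change2(variable)
after_change2 = list(variable)  # snapshot: still the original contents

change3(variable)
print(after_change2, variable, id(variable) == before_id)
```

The list object is the same one throughout; only its contents change, and only in change3.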
Python parameters are passed by reference. You are mutating an object only in change1.
However, numerical values and Strings are all immutable. You cannot change the value of a passed in immutable and see that value change in the caller. Dictionaries and Lists on the other hand are mutable, and changes made to them by a called function will be preserved when the function returns.
More: http://www.penzilla.net/tutorials/python/functions/
The definitive answer is that Python is actually "call by sharing", also known as "call by object" or "call by object reference".
This has been extensively discussed before. From that article:
From time to time, people who’ve read a little CS but not a lot CS (or too much of just one kind of CS) pop up on comp.lang.python and waste a lot of energy trying to tell everyone that Python’s using some calling model that it doesn’t really use. It always turns out that they don’t really understand Python’s model, and quite often, they don’t understand their favourite model either.
But nevermind, the only thing you need to know is that Python’s model is neither “call by value” nor “call by reference” (because any attempt to use those terms for Python requires you to use non-standard definitions of the words “-value” and “-reference”). The most accurate description is CLU’s “call by object” or “call by sharing“. Or, if you prefer, “call by object reference“.
You should also read this, if you haven’t done so already.
Python's semantics are most similar to the semantics of the language CLU. The CLU Reference Manual by Liskov et al describes the semantics like this:
"We call the argument passing technique call by sharing,
because the argument objects are shared between the
caller and the called routine. This technique does not
correspond to most traditional argument passing techniques
(it is similar to argument passing in LISP). In particular it
is not call by value because mutations of arguments per-
formed by the called routine will be visible to the caller.
And it is not call by reference because access is not given
to the variables of the caller, but merely to certain objects."
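In Python terms, this can be sketched as follows (mutate and rebind are hypothetical helper names): the callee receives the very same object the caller holds, so a mutation is visible outside, but rebinding the parameter name only changes the callee's local variable.

```python
def mutate(shared):
    shared.append(99)        # same object as the caller's list
    return id(shared)

def rebind(shared):
    shared = shared + [100]  # builds a new list and rebinds the local name
    return id(shared)

numbers = [4, 8, 12]
same_object = mutate(numbers) == id(numbers)
new_object = rebind(numbers) != id(numbers)
print(numbers, same_object, new_object)
```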
In change1 you replace the value in the list with value + 5.
In change2 you add 2 to number. The result is a new object; it is not applied to the variable that was passed in.
If you come from C++: no, there is no int& var in Python.
You get the expected result when doing this:
def change2(number):
    return number + 2
variable = 15
variable = change2(variable)
If you still don't want to return a value, you could create a MutableInt class.
class MutableInt(object):
    def __init__(self, value = 0):
        self._value = int(value)
    def __add__(self, other):
        self._value += int(other)
        return self
    def __sub__(self, other):
        self._value -= int(other)
        return self
    ...
All the examples show call-by-value. Python only has call-by-value. There is no call-by-reference. All values in Python are references (it is not possible to have an "object" as the value). Hence it is references that are copied when passed to the function. Lists are mutable, so it is possible to mutate their contents through a shared reference. In change2 you are reassigning a local variable to point to another object, which, like all assignments to local variables, has no effect on any calling scope, since it is call-by-value.
