Python dereference closure variable at define-time - python

The answer in this post details nicely how python inner functions don't use the value of closure variables until the inner function actually executes, finding the variable name in the proper scope.
For example:
funcs = [(lambda: x) for x in range(3)]
Calling any of the generated lambdas returns 2:
>>> funcs[0]()
2
>>> funcs[1]()
2
>>> funcs[2]()
2
Is there a way to force the value for x to be determined when the function is defined instead of when it is executed later? In the above example, my desired output is 0, 1, 2, respectively.
More specifically, my use-case is to provide a way for API users to conveniently turn a custom function into a thread using a decorator. For example:
for idx in range(3):
#thread_this(name=f'thread_{idx}')
def custom():
do_something()
print(f'thread_{idx} complete.')
When the final print statement executes, it picks up whatever the current value of idx is in the global scope. With appropriate sleep statements, all 3 threads will print 'thread_2 complete.'

You can use functools.partial, first problem can be solved with,
funcs = [functools.partial(lambda x: x, x) for x in xrange(3)]
It will give you desired result.
However, I could not understand the second usecase.

Related

Why do Python yield statements form a closure?

I have two functions that return a list of functions. The functions take in a number x and add i to it. i is an integer increasing from 0-9.
def test_without_closure():
return [lambda x: x+i for i in range(10)]
def test_with_yield():
for i in range(10):
yield lambda x: x+i
I would expect test_without_closure to return a list of 10 functions that each add 9 to x since i's value is 9.
print sum(t(1) for t in test_without_closure()) # prints 100
I expected that test_with_yield would also have the same behavior, but it correctly creates the 10 functions.
print sum(t(1) for t in test_with_yield()) # print 55
My question is, does yielding form a closure in Python?
Yielding does not create a closure in Python, lambdas create a closure. The reason that you get all 9s in "test_without_closure" isn't that there's no closure. If there weren't, you wouldn't be able to access i at all. The problem is that all closures contain a reference¹ to the same i variable, which will be 9 at the end of the function.
This situation isn't much different in test_with_yield. Why, then, do you get different results? Because yield suspends the run of the function, so it's possible to use the yielded lambdas before the end of the function is reached, i.e. before i is 9. To see what this means, consider the following two examples of using test_with_yield:
[f(0) for f in test_with_yield()]
# Result: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
[f(0) for f in list(test_with_yield())]
# Result: [9, 9, 9, 9, 9, 9, 9, 9, 9, 9]
What's happening here is that the first example yields a lambda (while i is 0), calls it (i is still 0), then advances the function until another lambda is yielded (i is now 1), calls the lambda, and so on. The important thing is that each lambda is called before the control flow returns to test_with_yield (i.e. before the value of i changes).
In the second example, we first create a list. So the first lambda is yielded (i is 0) and put into the list, the second lambda is created (i is now 1) and put into the list ... until the last lambda is yielded (i is now 9) and put into the list. And then we start calling the lambdas. So since i is now 9, all lambdas return 9.
¹ The important bit here is that closures hold references to variables, not copies of the value they held when the closure was created. This way, if you assign to the variable inside a lambda (or inner function, which create closures the same way that lambdas do), this will also change the variable outside of the lambda and if you change the value outside, that change will be visible inside the lambda.
No, yielding has nothing to do with closures.
Here is how to recognize closures in Python: a closure is
a function
in which an unqualified name lookup is performed
no binding of the name exists in the function itself
but a binding of the name exists in the local scope of a function whose definition surrounds the definition of the function in which the name is looked up.
The reason for the difference in behaviour you observe is laziness, rather than anything to do with closures. Compare and contrast the following
def lazy():
return ( lambda x: x+i for i in range(10) )
def immediate():
return [ lambda x: x+i for i in range(10) ]
def also_lazy():
for i in range(10):
yield lambda x:x+i
not_lazy_any_more = list(also_lazy())
print( [ f(10) for f in lazy() ] ) # 10 -> 19
print( [ f(10) for f in immediate() ] ) # all 19
print( [ f(10) for f in also_lazy() ] ) # 10 -> 19
print( [ f(10) for f in not_lazy_any_more ] ) # all 19
Notice that the first and third examples give identical results, as do the second and the fourth. The first and third are lazy, the second and fourth are not.
Note that all four examples provide a bunch of closures over the most recent binding of i, it's just that in the first an third case you evaluate the closures before rebinding i (even before you've created the next closure in the sequence), while in the second and fourth case, you first wait until i has been rebound to 9 (after you've created and collected all the closures you are going to make), and only then evaluate the closures.
Adding to #sepp2k's answer you're seeing these two different behaviours because the lambda functions being created don't know from where they have to get i's value. At the time this function is created all it knows is that it has to either fetch i's value from either local scope, enclosed scope, global scope or builtins.
In this particular case it is a closure variable(enclosed scope). And its value is changing with each iteration.
Check out LEGB in Python.
Now to why second one works as expected but not the first one?
It's because each time you're yielding a lambda function the execution of the generator function stops at that moment and when you're invoking it and it will use the value of i at that moment. But in the first case we have already advanced i's value to 9 before we invoked any of the functions.
To prove it you can fetch current value of i from the __closure__'s cell contents:
>>> for func in test_with_yield():
print "Current value of i is {}".format(func.__closure__[0].cell_contents)
print func(9)
...
Current value of i is 0
Current value of i is 1
Current value of i is 2
Current value of i is 3
Current value of i is 4
Current value of i is 5
Current value of i is 6
...
But instead if you store the functions somewhere and call them later then you will see the same behaviour as the first time:
from itertools import islice
funcs = []
for func in islice(test_with_yield(), 4):
print "Current value of i is {}".format(func.__closure__[0].cell_contents)
funcs.append(func)
print '-' * 20
for func in funcs:
print "Now value of i is {}".format(func.__closure__[0].cell_contents)
Output:
Current value of i is 0
Current value of i is 1
Current value of i is 2
Current value of i is 3
--------------------
Now value of i is 3
Now value of i is 3
Now value of i is 3
Now value of i is 3
Example used by Patrick Haugh in comments also shows the same thing: sum(t(1) for t in list(test_with_yield()))
Correct way:
Assign i as a default value to lambda, default values are calculated when function is created and they won't change(unless it's a mutable object). i is now a local variable to the lambda functions.
>>> def test_without_closure():
return [lambda x, i=i: x+i for i in range(10)]
...
>>> sum(t(1) for t in test_without_closure())
55

When is a new name introduced in Python?

I am asking because of the classic problem where somebody creates a list of lambdas:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
and unexpectedly gets only twos as output.
The commonly proposed solution is to make i a named argument like this:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
for l in foo:
print(l())
Which produces the desired output of 0, 1, 2 but now something magical has happened. It sort of did what is expected because Python is pass-by-reference and you didn't want a reference.
Still, just adding a new name to something, shouldn't that just create another reference?
So the question becomes what are the exact rules for when something is not a reference?
Considering that ints are immutable and the following works:
x = 3
y = x
x = 5
print(x, y) // outputs 5 3
probably explains why adding that named parameter works. A local i with the same value was created and captured.
Now why, in the case of our lambdas was the same i referenced? I pass an int to function and it is refenced and if I store it in a variable it is copied. Hm.
Basically I am looking for the most concise and abstract way possible to remember exactly how this works. When is the same value referenced, when do I get a copy. If it has any common names and there are programming languages were it works the same that would be interesting as well.
Here is my current assumption:
Arguments are always passed to functions by reference.
Assigning to a variable of immutable type creates a copy.
I am asking anyway, just to make sure and hopefully get some background.
The issue here is how you think of names.
In your first example, i is a variable that is assigned to every time the loop iterates. When you use lambda to make a function, you make a function that accesses the name i and returns it's value. This means as the name i changes, the value returned by the functions also changes.
The reason the default argument trick works is that the name is evaluated when the function is defined. This means the default value is the value the i name points to at that time, not the name itself.
i is a label. 0, 1 and 2 are the objects. In the first case, the program assigns 0 to i, then makes a function that returns i - it then does this with 1 and 2. When the function is called, it looks up i (which is now 2) and then returns it.
In the second example, you assign 0 to i, then you make a function with a default argument. That default argument is the value that is gotten by evaluating i - that is the object 0. This is repeated for 1 and 2. When the function is called, it assigns that default value to a new variable i, local to the function and unrelated to the outer i.
Python doesn't exactly pass by reference or by value (at least, not the way you'd think of it, coming from a language like C++).
In many other languages (such as C++), variables can be thought of as synonymous with the values they hold.
However, in Python, variables are names that point to the objects in memory.
(This is a good explanation (with pictures!))
Because of this, you can get multiple names attached to one object, which can lead to interesting effects.
Consider these equivalent program snippets:
// C++:
int x;
x = 10; // line A
x = 20; // line B
and
# Python:
x = 10 # line C
x = 20 # line D
After line A, the int 10 is stored in memory, say, at the memory address 0x1111.
After line B, the memory at 0x1111 is overwritten, so 0x1111 now holds the int 20
However, the way this program works in python is quite different:
After line C, x points to some memory, say, 0x2222, and the value stored at 0x2222 is 10
After line D, x points to some different memory, say, 0x3333, and the value stored at 0x3333 is 20
Eventually, the orphaned memory at 0x2222 is garbage collected by Python.
Hopefully this helps you get a grasp of the subtle differences between variables in Python and most other languages.
(I know I didn't directly answer your question about lambdas, but I think this is good background knowledge to have before reading one of the good explanations here, such as #Lattyware's)
See this question for some more background info.
Here's some final background info, in the form of oft-quoted but instructive examples:
print 'Example 1: Expected:'
x = 3
y = x
x = 2
print 'x =', x
print 'y =', y
print 'Example 2: Surprising:'
x = [3]
y = x
x[0] = 2
print 'x =', x
print 'y =', y
print 'Example 3: Same logic as in Example 1:'
x = [3]
y = x
x = [2]
print 'x =', x
print 'y =', y
The output is:
Example 1: Expected:
x = 2
y = 3
Example 2: Surprising:
x = [2]
y = [2]
Example 3: Same logic as in Example 1:
x = [2]
y = [3]
foo = []
for i in range(3):
foo.append((lambda: i))
Here since all the lambda's were created in the same scope so all of them point to the same global variable variable i. so, whatever value i points to will be returned when they are actually called.
foo = []
for i in range(3):
foo.append((lambda z = i: id(z)))
print id(i) #165618436
print(foo[-1]()) #165618436
Here in each loop we assign the value of i to a local variable z, as default arguments are calculated when the function is parsed so the value z simply points to the values stored by i during the iteration.
Arguments are always passed to functions by reference?
In fact the z in foo[-1] still points to the same object as i of the last iteration, so yes values are passed by reference but as integers are immutable so changing i won't affect z of the foo[-1] at all.
In the example below all lambda's point to some mutable object, so modifying items in lis will also affect the functions in foo:
foo = []
lis = ([], [], [])
for i in lis:
foo.append((lambda z = i: z))
lis[0].append("bar")
print foo[0]() #prints ['bar']
i.append("foo") # `i` still points to lis[-1]
print foo[-1]() #prints ['foo']
Assigning to a variable of immutable type creates a copy?
No values are never copied.
>>> x = 1000
>>> y = x # x and y point to the same object, but an immutable object.
>>> x += 1 # so modifying x won't affect y at all, in fact after this step
# x now points to some different object and y still points to
# the same object 1000
>>> x #x now points to an new object, new id()
1001
>>> y #still points to the same object, same id()
1000
>>> x = []
>>> y = x
>>> x.append("foo") #modify an mutable object
>>> x,y #changes can be seen in all references to the object
(['foo'], ['foo'])
The list of lambdas problem arises because the i referred to in both snippets is the same variable.
Two distinct variables with the same name exist only if they exist in two separate scopes. See the following link for when that happens, but basically any new function (including a lambda) or class establishes its own scope, as do modules, and pretty much nothing else does. See: http://docs.python.org/2/reference/executionmodel.html#naming-and-binding
HOWEVER, when reading the value of a variable, if it is not defined in the current local scope, the enclosing local scopes are searched*. Your first example is of exactly this behaviour:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
Each lambda creates no variables at all, so its own local scope is empty. When execution hits the locally undefined i, it is located in the enclosing scope.
In your second example, each lambda creates its own i variable in the parameter list:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
This is in fact equivalent to lambda a=i: a, because the i inside the body is the same as the i on the left hand side of the assignment, and not the i on the right hand side. The consequence is that i is not missing from the local scope, and so the value of the local i is used by each lambda.
Update: Both of your assumptions are incorrect.
Function arguments are passed by value. The value passed is the reference to the object. Pass-by-reference would allow the original variable to be altered.
No implicit copying ever occurs on function call or assignment, of any language-level object. Under the hood, because this is pass-by-value, the references to the parameter objects are copied when the function is called, as is usual in any language which passes references by value.
Update 2: The details of function evaluation are here: http://docs.python.org/2/reference/expressions.html#calls . See the link above for the details regarding name binding.
* No actual linear search occurs in CPython, because the correct variable to use can be determined at compile time.
The answer is that the references created in a closure (where a function is inside a function, and the inner function accesses variables from the outer one) are special. This is an implementation detail, but in CPython the value is a particular kind of object called a cell and it allows the variable's value to be changed without rebinding it to a new object. More info here.
The way variables work in Python is actually rather simple.
All variables contain references to objects.
Reassigning a variable points it to a different object.
All arguments are passed by value when calling functions (though the values being passed are references).
Some types of objects are mutable, which means they can be changed without changing what any of their variable names point to. Only these types can be changed when passed, since this does not require changing any references to the object.
Values are never copied implicitly. Never.
The behaviour really has very little to do with how parameters are passed (which is always the same way; there is no distinction in Python where things are sometimes passed by reference and sometimes passed by value). Rather the problem is to do with how names themselves are found.
lambda: i
creates a function that is of course equivalent to:
def anonymous():
return i
That i is a name, within the scope of anonymous. But it's never bound within that scope (not even as a parameter). So for that to mean anything i must be a name from some outer scope. To find a suitable name i, Python will look at the scope in which anonymous was defined in the source code (and then similarly out from there), until it finds a definition for i.1
So this loop:
foo = []
for i in range(3):
foo.append((lambda: i))
for l in foo:
print(l())
Is almost exactly as if you had written this:
foo = []
for i in range(3):
def anonymous():
return i
foo.append(anonymous)
for l in foo:
print(l())
So that i in return i (or lambda: i) ends up being the same i from the outer scope, which is the loop variable. Not that they are all references to the same object, but that they are all the same name. So it's simply not possible for the functions stored in foo to return different values; they're all returning the object referred to by a single name.
To prove it, watch what happens when I remove the variable i after the loop:
>>> foo = []
>>> for i in range(3):
foo.append((lambda: i))
>>> del i
>>> for l in foo:
print(l())
Traceback (most recent call last):
File "<pyshell#7>", line 2, in <module>
print(l())
File "<pyshell#3>", line 2, in <lambda>
foo.append((lambda: i))
NameError: global name 'i' is not defined
You can see that the problem isn't that each function has a local i bound to the wrong thing, but rather than each function is returning the value of the same global variable, which I've now removed.
OTOH, when your loop looks like this:
foo = []
for i in range(3):
foo.append((lambda i=i: i))
for l in foo:
print(l())
That is quite like this:
foo = []
for i in range(3):
def anonymous(i=i):
return i
foo.append(anonymous)
for l in foo:
print(l())
Now the i in return i is not the same i as in the outer scope; it's a local variable of the function anonymous. A new function is created in each iteration of the loop (stored temporarily in the outer scope variable anonymous, and then permanently in a slot of foo), so each one has it's own local variables.
As each function is created, the default value of its parameter is set to the value of i (in the scope defining the functions). Like any other "read" of a variable, that pulls out whatever object is referenced by the variable at that time, and thereafter has no connection to the variable.2
So each function gets the default value of i as it is in the outer scope at the time it is created, and then when the function is called without an argument that default value becomes the value of the i in that function's local scope. Each function has no non-local references, so is completely unaffected by what happens outside it.
1 This is done at "compile time" (when the Python file is converted to bytecode), with no regard for what the system is like at runtime; it is almost literally looking for an outer def block with i = ... in the source code. So local variables are actually statically resolved! If that lookup chain falls all the way out to the module global scope, then Python assumes that i will be defined in the global scope at the point that the code will be run, and just treats i as a global variable whether or not there is a statically visible binding for i at module scope, hence why you can dynamically create global variables but not local ones.
2 Confusingly, this means that in lambda i=i: i, the three is refer to three completely different "variables" in two different scopes on the one line.
The leftmost i is the "name" holding the value that will be used for the default value of i, which exists independently of any particular call of the function; it's almost exactly "member data" stored in the function object.
The second i is an expression evaluated as the function is created, to get the default value. So the i=i bit acts very like an independent statement the_function.default_i = i, evaluated in the same scope containing the lambda expression.
And finally the third i is actually the local variable inside the function, which only exists within a call to the anonymous function.

Have Python 2.7 functions remember value and not reference? Closure Weirdness

I'm trying to return from a function a list of functions, each of which uses variables from the outside scope. This isn't working. Here's an example which demonstrates what's happening:
a = []
for i in range(10):
a.append(lambda x: x+i)
a[1](1) # returns 10, where it seems it should return 2
Why is this happening, and how can I get around it in python 2.7 ?
The i refers to the same variable each time, so i is 9 in all of the lambdas because that's the value of i at the end of the loop. Simplest workaround involves a default argument:
lambda x, i=i: x+i
This binds the value of the loop's i to a local variable i at the lambda's definition time.
Another workaround is to define a lambda that defines another lambda, and call the first lambda:
(lambda i: lambda x: x+i)(i)
This behavior makes a little more sense if you consider this:
def outerfunc():
def innerfunc():
return x+i
a = []
for i in range(10):
a.append(innerfunc)
return a
Here, innerfunc is defined once, so it makes intuitive sense that you are only working with a single function object, and you would not expect the loop to create ten different closures. With a lambda it doesn't look like the function is defined only once, it looks like you're defining it fresh each time through the loop, but in fact it is functionally the same as the the long version.
Because i isn't getting evaluated when you define the anonymous function (lambda expression) but when it's called. You can see this by adding del i before a[1](1): you'll get NameError: global name 'i' is not defined on the a[1](1) line.
You need to fix the value of i into the lambda expression every time, like so:
a = [lambda x, i=i: x+i for i in range(10)]
a[1](1) # returns 2
Another, more general solution - also without lambdas:
import operator
from functools import partial
a = []
for i in range(10):
a.append(partial(operator.add, i))
a[1][(1) # returns 2
The key aspect here is functools.partial.

Can someone please explain this bit of Python code?

I started working in Python just recently and haven't fully learned all the nuts and bolts of it, but recently I came across this post that explains why python has closures, in there, there is a sample code that goes like this:
y = 0
def foo():
x = [0]
def bar():
print x[0], y
def change(z):
global y
x[0] = y = z
change(1)
bar()
change(2)
bar()
change(3)
bar()
change(4)
bar()
foo()
1 1
2 2
3 3
and basically I don't understand how it actually works, and what construct like x[0] does in this case, or actually I understand what it's doing, I just don't get how is it this :)
Before the nonlocal keyword was added in Python 3 (and still today, if you're stuck on 2.* for whatever reason), a nested function just couldn't rebind a local barename of its outer function -- because, normally, an assignment statement to a barename, such as x = 23, means that x is a local name for the function containing that statement. global exists (and has existed for a long time) to allow assignments to bind or rebind module-level barenames -- but nothing (except for nonlocal in Python 3, as I said) to allow assignments to bind or rebind names in the outer function.
The solution is of course very simple: since you cannot bind or rebind such a barename, use instead a name that is not bare -- an indexing or an attribute of some object named in the outer function. Of course, said object must be of a type that lets you rebind an indexing (e.g., a list), or one that lets you bind or rebind an attribute (e.g., a function), and a list is normally the simplest and most direct approach for this. x is exactly that list in this code sample -- it exists only in order to let nested function change rebind x[0].
It might be simpler to understand if you look at this simplified code where I have removed the global variable:
def foo():
x = [0]
def bar():
print x[0]
def change(z):
x[0] = z
change(1)
bar()
foo()
The first line in foo creates a list with one element. Then bar is defined to be a function which prints the first element in x and the function change modifies the first element of the list. When change(1) is called the value of x becomes [1].
This code is trying to explain when python creates a new variable, and when python reuses an existing variable. I rewrote the above code slightly to make the point more clear.
y = "lion"
def foo():
x = ["tiger"]
w = "bear"
def bar():
print y, x[0], w
def change(z):
global y
x[0] = z
y = z
w = z
bar()
change("zap")
bar()
foo()
This will produce this output:
lion tiger bear
zap zap bear
The point is that the inner function change is able to affect the variable y, and the elements of array x, but it does not change w (because it gets its own local variable w that is not shared).
x = [0] creates a new list with the value 0 in it. x[0] references the zero-eth element in the list, which also happens to be zero.
The example is referencing closures, or passable blocks of code within code.

Is there a map without result in python? [duplicate]

This question already has answers here:
Is it Pythonic to use list comprehensions for just side effects?
(7 answers)
Closed 4 months ago.
Sometimes, I just want to execute a function for a list of entries -- eg.:
for x in wowList:
installWow(x, 'installed by me')
Sometimes I need this stuff for module initialization, so I don't want to have a footprint like x in global namespace. One solution would be to just use map together with lambda:
map(lambda x: installWow(x, 'installed by me'), wowList)
But this of course creates a nice list [None, None, ...] so my question is, if there is a similar function without a return-list -- since I just don't need it.
(off course I can also use _x and thus not leaving visible footprint -- but the map-solution looks so neat ...)
You could make your own "each" function:
def each(fn, items):
for item in items:
fn(item)
# called thus
each(lambda x: installWow(x, 'installed by me'), wowList)
Basically it's just map, but without the results being returned. By using a function you'll ensure that the "item" variable doesn't leak into the current scope.
You can use the built-in any function to apply a function without return statement to any item returned by a generator without creating a list. This can be achieved like this:
any(installWow(x, 'installed by me') for x in wowList)
I found this the most concise idom for what you want to achieve.
Internally, the installWow function does return None which evaluates to False in logical operations. any basically applies an or reduction operation to all items returned by the generator, which are all None of course, so it has to iterate over all items returned by the generator. In the end it does return False, but that doesn't need to bother you. The good thing is: no list is created as a side-effect.
Note that this only works as long as your function returns something that evaluates to False, e.g., None or 0. If it does return something that evaluates to True at some point, e.g., 1, it will not be applied to any of the remaining elements in your iterator. To be safe, use this idiom mainly for functions without return statement.
How about this?
for x in wowList:
installWow(x, 'installed by me')
del x
Every expression evaluates to something, so you always get a result, whichever way you do it. And any such returned object (just like your list) will get thrown away afterwards because there's no reference to it anymore.
To clarify: Very few things in python are statements that don't return anything. Even a function call like
doSomething()
still returns a value, even if it gets discarded right away. There is no such thing as Pascal's function / procedure distinction in python.
You might try this:
filter(lambda x: installWow(x, 'installed by me') and False, wowList)
That way, the return result is an empty list no matter what.
Or you could just drop the and False if you can force installWow() to always return False (or 0 or None or another expression that evaluates false).
You could use a filter and a function that doesn't return a True value. You'd get an empty return list since filter only adds the values which evaluates to true, which I suppose would save you some memory. Something like this:
#!/usr/bin/env python
y = 0
def myfunction(x):
global y
y += x
input = (1, 2, 3, 4)
print "Filter output: %s" % repr(filter(myfunction, input))
print "Side effect result: %d" % y
Running it produces this output:
Filter output: ()
Side effect result: 10
I can not resist myself to post it as separate answer
reduce(lambda x,y: x(y, 'installed by me') , wowList, installWow)
only twist is installWow should return itself e.g.
def installWow(*args):
print args
return installWow
if it is ok to distruct wowList
while wowList: installWow(wowList.pop(), 'installed by me')
if you do want to maintain wowList
wowListR = wowList[:]
while wowListR: installWow(wowListR.pop(), 'installed by me')
and if order matters
wowListR = wowList[:]; wowListR.reverse()
while wowListR: installWow(wowListR.pop(), 'installed by me')
Though as the solution of the puzzle I like the first :)
I tested several different variants, and here are the results I got.
Python 2:
>>> timeit.timeit('for x in xrange(100): L.append(x)', 'L = []')
14.9432640076
>>> timeit.timeit('[x for x in xrange(100) if L.append(x) and False]', 'L = []')
16.7011508942
>>> timeit.timeit('next((x for x in xrange(100) if L.append(x) and False), None)', 'L = []')
15.5235641003
>>> timeit.timeit('any(L.append(x) and False for x in xrange(100))', 'L = []')
20.9048290253
>>> timeit.timeit('filter(lambda x: L.append(x) and False, xrange(100))', 'L = []')
27.8524758816
Python 3:
>>> timeit.timeit('for x in range(100): L.append(x)', 'L = []')
13.719769178002025
>>> timeit.timeit('[x for x in range(100) if L.append(x) and False]', 'L = []')
15.041426660001889
>>> timeit.timeit('next((x for x in range(100) if L.append(x) and False), None)', 'L = []')
15.448063717998593
>>> timeit.timeit('any(L.append(x) and False for x in range(100))', 'L = []')
22.087335471998813
>>> timeit.timeit('next(filter(lambda x: L.append(x) and False, range(100)), None)', 'L = []')
36.72446593800123
Note that the time values are not that precise (for example, the relative performance of the first three options varied from run to run). My conclusion is that you should just use a loop, it's more readable and performs at least as well as the alternatives. If you want to avoid polluting the namespace, just del the variable after using it.
first rewrite the for loop as a generator expression, which does not allocate any memory.
(installWow(x, 'installed by me') for x in wowList )
But this expression doesn't actually do anything without finding some way to consume it. So we can rewrite this to yield something determinate, rather than rely on the possibly None result of installWow.
( [1, installWow(x, 'installed by me')][0] for x in wowList )
which creates a list, but returns only the constant 1. this can be consumed conveniently with reduce
reduce(sum, ( [1, installWow(x, 'installed by me')][0] for x in wowList ))
Which conveniently returns the number of items in wowList that were affected.
Just make installWow return None or make the last statement be pass like so:
def installWow(item, phrase='installed by me'):
print phrase
pass
and use this:
list(x for x in wowList if installWow(x))
x won't be set in the global name space and the list returned is [] a singleton
If you're worried about the need to control the return value (which you need to do to use filter) and prefer a simpler solution than the reduce example above, then consider using reduce directly. Your function will need to take an additional first parameter, but you can ignore it, or use a lambda to discard it:
reduce(lambda _x: installWow(_x, 'installed by me'), wowList, None)
Let me preface this by saying that it seems the original poster was more concerned about namespace clutter than anything else. In that case, you can wrap your working variables in separate function namespace and call it after declaring it, or you can simply remove them from the namespace after you've used them with the "del" builtin command. Or, if you have multiple variables to clean up, def the function with all the temp variables in it, run it, then del it.
Read on if the main concern is optimization:
Three more ways, potentially faster than others described here:
For Python >= 2.7, use collections.deque((installWow(x, 'installed by me') for x in wowList),0) # saves 0 entries while iterating the entire generator, but yes, still has a byproduct of a final object along with a per-item length check internally
If worried about this kind of overhead, install cytoolz. You can use count which still has a byproduct of incrementing a counter but it may be a smaller number of cycles than deque's per-item check, not sure. You can use it instead of any() in the next way:
Replace the generator expression with itertools.imap (when installWow never returns True. Otherwise you may consider itertools.ifilter and itertools.ifilterfalse with None for the predicate): any(itertools.imap(installWow,wowList,itertools.repeat('installed by me')))
But the real problem here is the fact that a function returns something and you do not want it to return anything.. So to resolve this, you have 2 options. One is to refactor your code so installWow takes in the wowList and iterates it internally. Another is rather mindblowing, but you can load the installWow() function into a compiled ast like so:
lines,lineno=inspect.getsourcelines(func) # func here is installWow without the parens
return ast.parse(join(l[4:] for l in lines if l)) # assumes the installWow function is part of a class in a module file.. For a module-level function you would not need the l[4:]
You can then do the same for the outer function, and traverse the ast to find the for loop. Then in the body of the for loop, insert the instalWow() function ast's function definition body, matching up the variable names. You can then simply call exec on the ast itself, and provide a namespace dictionary with the right variables filled in. To make sure your tree modifications are correct, you can check what the final source code would look like by running astunparse.
And if that isn't enough you can go to cython and write a .pyx file which will generate and compile a .c file into a library with python bindings. Then, at least the lost cycles won't be spent converting to and from python objects and type-checking everything repeatedly.
A simple DIY whose sole purpose is to loop through a generator expression:
def do(genexpr):
for _ in genexpr:
pass
Then use:
do(installWow(x, 'installed by me') for x in wowList)
In python 3 there are some ways to use a function with no return(just use a semicolon in jupyter ot ommit the output from the cell):
[*map(print, MY_LIST)]; # form 1 - unpack the map generator to a list
any(map(print, MY_LIST)); # form 2 - force execution with any
list(map(print, MY_LIST)); # form 3 - collect list from generator
Someone needs to answer --
The more pythonic way here is to not worry about polluting the namespace, and using __all__ to define the public variables.
myModule/__init__.py:
__all__ = ['func1', 'func2']
for x in range(10):
print 'init {}'.format(x)
def privateHelper1(x):
return '{}:{}'.format(x,x)
def func1():
print privateHelper1('func1')
def func2():
print privateHelper1('func1')
Then
python -c "import myModule; help(myModule);"
init 0
init 1
init 2
init 3
init 4
init 5
init 6
init 7
init 8
init 9
Help on package mm:
NAME
myModule
FILE
h:\myModule\__init__.py
PACKAGE CONTENTS
FUNCTIONS
func1()
func2()
DATA
__all__ = ['func1', 'func2']

Categories

Resources