Python generators and coroutines

I am studying coroutines and generators in various programming languages.
I was wondering if there is a cleaner way to combine two coroutines implemented via generators than yielding back to the caller whatever the callee yields.
Let's say that we are using the following convention: all yields apart from the last one return None, while the last one returns the result of the coroutine. So, for example, we could have a coroutine that invokes another:
def A():
    # yield until a certain condition is met
    yield result

def B():
    # do something that may or may not yield
    x = bind(A())
    # ...
    return result
In this case I wish that, through bind (which may or may not be implementable; that's the question), the coroutine B yields whenever A yields, until A returns its final result, which is then assigned to x, allowing B to continue.
I suspect that the actual code has to iterate A explicitly, like so:
def B():
    # do something that may or may not yield
    for x in A():
        pass
    # ...
    return result
which is a tad ugly and error prone...
PS: it's for a game where the users of the language will be the designers who write scripts (script = coroutine). Each character has an associated script, and there are many sub-scripts which are invoked by the main script; consider that, for example, run_ship invokes reach_closest_enemy, fight_with_closest_enemy, flee_to_allies, and so on, many times over. All these sub-scripts need to be invoked in the way described above; for a developer this is not a problem, but for designers the less code they have to write the better!

Edit: I recommend using Greenlet. But if you're interested in a pure Python approach, read on.
This is addressed in PEP 342, but it's somewhat tough to understand at first. I'll try to explain simply how it works.
First, let me sum up what I think is the problem you're really trying to solve.
Problem
You have a callstack of generator functions calling other generator functions. What you really want is to be able to yield from the generator at the top, and have the yield propagate all the way down the stack.
The problem is that Python does not (at a language level) support real coroutines, only generators. (But, they can be implemented.) Real coroutines allow you to halt an entire stack of function calls and switch to a different stack. Generators only allow you to halt a single function. If a generator f() wants to yield, the yield statement has to be in f(), not in another function that f() calls.
The solution that I think you're using now is to do something like in Simon Stelling's answer (i.e. have f() call g() by yielding all of g()'s results). This is very verbose and ugly, and you're looking for syntax sugar to wrap up that pattern. Note that this essentially unwinds the stack every time you yield, and then winds it back up again afterwards.
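For concreteness, that verbose pattern looks something like this (a sketch with illustrative names):

def g():
    yield 1
    yield 2

def f():
    # manually re-yield everything g produces
    for value in g():
        yield value
    yield 3

print(list(f()))  # [1, 2, 3]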
Solution
There is a better way to solve this problem. You basically implement coroutines by running your generators on top of a "trampoline" system.
To make this work, you need to follow a couple patterns:
1. When you want to call another coroutine, yield it.
2. Instead of returning a value, yield it.
so
def f():
    result = g()
    # …
    return return_value
becomes
def f():
    result = yield g()
    # …
    yield return_value
Say you're in f(). The trampoline system called f(). When you yield a generator (say g()), the trampoline system calls g() on your behalf. Then when g() has finished yielding values, the trampoline system restarts f(). This means that you're not actually using the Python stack; the trampoline system manages a callstack instead.
When you yield something other than a generator, the trampoline system treats it as a return value. It passes that value back to the calling generator through the yield expression (using the generator's .send() method).
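To make this concrete, here is a minimal sketch of such a trampoline (the names trampoline, A, and B are illustrative, not from any library; a production scheduler would also need error propagation and a way to hand back the final result):

import types

def trampoline(root):
    # Convention: yielding a generator is a "call", yielding None is a
    # pause that propagates to the top, and yielding any other value is
    # a "return" delivered to the calling generator.
    stack = [root]
    value = None
    while stack:
        try:
            result = stack[-1].send(value)
        except StopIteration:
            stack.pop()              # finished without an explicit result
            value = None
            continue
        if isinstance(result, types.GeneratorType):
            stack.append(result)     # push the callee; run it next
            value = None
        elif result is None:
            yield                    # suspend the whole stack for one tick
            value = None
        else:
            stack.pop()              # pop the callee...
            value = result           # ...and resume the caller with its result

def A():
    yield                            # wait one tick for some condition
    yield 42                         # A's "return value"

def B():
    x = yield A()                    # "call" A; B pauses whenever A pauses
    yield x * 2                      # B's "return value"

for _ in trampoline(B()):
    pass                             # each iteration is one tick of the script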
Comments
This kind of system is extremely important and useful in asynchronous applications, like those using Tornado or Twisted. You can halt an entire callstack when it's blocked, go do something else, and then come back and continue execution of the first callstack where it left off.
The drawback of the above solution is that it requires you to write essentially all your functions as generators. It may be better to use an implementation of true coroutines for Python - see below.
Alternatives
There are several implementations of coroutines for Python, see: http://en.wikipedia.org/wiki/Coroutine#Implementations_for_Python
Greenlet is an excellent choice. It is a C extension module for CPython that implements true coroutines by swapping out the call stack.
Python 3.3 added the yield from syntax for delegating to a subgenerator; see PEP 380.
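With that PEP 380 syntax, the bind from the original question collapses into a single expression; a sketch (Python 3.3+, illustrative values):

def A():
    yield               # yield until a certain condition is met
    return 42           # since 3.3, return in a generator sets its result

def B():
    x = yield from A()  # B yields whenever A yields; x receives A's result
    # ... continue with x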

Are you looking for something like this?
def B():
    for x in A():
        if x is None:
            yield
        else:
            break
    # continue, x contains the value A yielded


Do different types of recursion have different memory complexities?

I have the following piece of code which fails with the following error:
RuntimeError: maximum recursion depth exceeded
I attempted to rewrite this to allow for tail recursion optimization (TCO). I believe this code would have been successful if TCO had taken place.
def trisum(n, csum):
    if n == 0:
        return csum
    else:
        return trisum(n - 1, csum + n)

print(trisum(1000, 0))
Should I conclude that Python does not do any type of TCO, or do I just need to define it differently?
No, and it never will, since Guido van Rossum prefers to be able to have proper tracebacks:
Tail Recursion Elimination (2009-04-22)
Final Words on Tail Calls (2009-04-27)
You can manually eliminate the recursion with a transformation like this:
>>> def trisum(n, csum):
...     while True:                     # Change recursion to a while loop
...         if n == 0:
...             return csum
...         n, csum = n - 1, csum + n   # Update parameters instead of tail recursion
...
>>> trisum(1000, 0)
500500
I published a module performing tail-call optimization (handling both tail-recursion and continuation-passing style): https://github.com/baruchel/tco
Optimizing tail-recursion in Python
It has often been claimed that tail recursion doesn't suit the Pythonic way of coding and that one shouldn't care about how to embed it in a loop. I don't want to argue with this point of view; sometimes, however, I like trying or implementing new ideas as tail-recursive functions rather than with loops for various reasons (focusing on the idea rather than on the process, having twenty short functions on my screen at the same time rather than only three "Pythonic" functions, working in an interactive session rather than editing my code, etc.).
Optimizing tail recursion in Python is in fact quite easy. While it is said to be impossible or very tricky, I think it can be achieved with elegant, short and general solutions; I even think that most of these solutions don't use Python features otherwise than they should. Clean lambda expressions working along with very standard loops lead to quick, efficient and fully usable tools for implementing tail-recursion optimization.
As a personal convenience, I wrote a small module implementing such an optimization in two different ways. I would like to discuss my two main functions here.
The clean way: modifying the Y combinator
The Y combinator is well known; it allows one to use lambda functions recursively, but it doesn't by itself allow recursive calls to be embedded in a loop. Lambda calculus alone can't do such a thing. A slight change in the Y combinator, however, can protect the recursive call from being actually evaluated. Evaluation can thus be delayed.
Here is the famous expression for the Y combinator:
lambda f: (lambda x: x(x))(lambda y: f(lambda *args: y(y)(*args)))
With a very slight change, I could get:
lambda f: (lambda x: x(x))(lambda y: f(lambda *args: lambda: y(y)(*args)))
Instead of calling itself, the function f now returns a function performing the
very same call, but since it returns it, the evaluation can be done later from outside.
My code is:
def bet(func):
    b = (lambda f: (lambda x: x(x))(lambda y:
            f(lambda *args: lambda: y(y)(*args))))(func)
    def wrapper(*args):
        out = b(*args)
        while callable(out):
            out = out()
        return out
    return wrapper
The function can be used in the following way; here are two examples with tail-recursive
versions of factorial and Fibonacci:
>>> from recursion import *
>>> fac = bet( lambda f: lambda n, a: a if not n else f(n-1,a*n) )
>>> fac(5,1)
120
>>> fibo = bet( lambda f: lambda n,p,q: p if not n else f(n-1,q,p+q) )
>>> fibo(10,0,1)
55
Obviously recursion depth isn't an issue any longer:
>>> bet( lambda f: lambda n: 42 if not n else f(n-1) )(50000)
42
This is of course the single real purpose of the function.
Only one thing can't be done with this optimization: it can't be used with a
tail-recursive function evaluating to another function (this comes from the fact
that callable returned objects are all handled as further recursive calls with
no distinction). Since I usually don't need such a feature, I am very happy
with the code above. However, in order to provide a more general module, I thought
a little more in order to find some workaround for this issue (see next section).
Concerning the speed of this process (which isn't the real issue however), it happens
to be quite good; tail-recursive functions are even evaluated much quicker than with
the following code using simpler expressions:
def bet1(func):
    def wrapper(*args):
        out = func(lambda *x: lambda: x)(*args)
        while callable(out):
            out = func(lambda *x: lambda: x)(*out())
        return out
    return wrapper
I think that evaluating one expression, even complicated, is much quicker than
evaluating several simple expressions, which is the case in this second version.
I didn't keep this new function in my module, and I see no circumstances where it
could be used rather than the "official" one.
Continuation passing style with exceptions
Here is a more general function; it is able to handle all tail-recursive functions, including those returning other functions. Recursive calls are distinguished from other return values by the use of exceptions. This solution is slower than the previous one; quicker code could probably be written by using some special values as "flags" to be detected in the main loop, but I don't like the idea of using special values or internal keywords. There is a funny interpretation of using exceptions: if Python doesn't like tail-recursive calls, an exception should be raised when a tail-recursive call does occur, and the Pythonic way will be to catch the exception in order to find some clean solution, which is actually what happens here...
class _RecursiveCall(Exception):
    def __init__(self, *args):
        self.args = args

def _recursiveCallback(*args):
    raise _RecursiveCall(*args)

def bet0(func):
    def wrapper(*args):
        while True:
            try:
                return func(_recursiveCallback)(*args)
            except _RecursiveCall as e:
                args = e.args
    return wrapper
Now all functions can be used. In the following example, f(n) is evaluated to the
identity function for any positive value of n:
>>> f = bet0( lambda f: lambda n: (lambda x: x) if not n else f(n-1) )
>>> f(5)(42)
42
Of course, it could be argued that exceptions are not intended to be used for intentionally redirecting the interpreter (as a kind of goto statement, or probably rather a kind of continuation passing style), which I have to admit. But, again, I find the idea of using try with a single line being a return statement funny: we try to return something (normal behaviour) but we can't do it because of a recursive call occurring (exception).
Initial answer (2013-08-29).
I wrote a very small plugin for handling tail recursion. You may find it with my explanations there: https://groups.google.com/forum/?hl=fr#!topic/comp.lang.python/dIsnJ2BoBKs
It can embed a lambda function written with a tail recursion style in another function which will evaluate it as a loop.
The most interesting feature in this small function, in my humble opinion, is that the function doesn't rely on some dirty programming hack but on mere lambda calculus: the behaviour of the function is changed to another one when inserted in another lambda function which looks very like the Y combinator.
Guido's own word on this is at http://neopythonic.blogspot.co.uk/2009/04/tail-recursion-elimination.html
I recently posted an entry in my Python History blog on the origins of
Python's functional features. A side remark about not supporting tail
recursion elimination (TRE) immediately sparked several comments about
what a pity it is that Python doesn't do this, including links to
recent blog entries by others trying to "prove" that TRE can be added
to Python easily. So let me defend my position (which is that I don't
want TRE in the language). If you want a short answer, it's simply
unpythonic. Here's the long answer:
CPython does not and will probably never support tail call optimization based on Guido van Rossum's statements on the subject.
I've heard arguments that it makes debugging more difficult because of how it modifies the stack trace.
Try the experimental macropy TCO implementation for size.
Besides optimizing tail recursion, you can raise the recursion limit manually:

import sys
sys.setrecursionlimit(5500000)
print("recursion limit: %d" % sys.getrecursionlimit())

(Note that raising the limit this far can crash the interpreter with a C-level stack overflow before Python's own limit is ever reached.)
There is no built-in tail recursion optimization in Python. However, we can "rebuild" the function through the Abstract Syntax Tree (AST), eliminating the recursion there and replacing it with a loop. Guido was absolutely right: this approach has some limitations, so I can't recommend it for use.
However, I still wrote (rather as a training example) my own version of the optimizer, and you can even try it out.
Install this package via pip:
pip install astrologic
Now you can run this sample code:
from astrologic import no_recursion

counter = 0

@no_recursion
def recursion():
    global counter
    counter += 1
    if counter != 10000000:
        return recursion()
    return counter

print(recursion())
This solution is not stable, and you should never use it in production. You can read about some significant restrictions on the project's GitHub page (in Russian, sorry). However, this solution is quite "real", without interrupting the code and other similar tricks.
A tail call can never be optimized to a jump in Python. An optimization is a program transformation that preserves the program's meaning. Tail-call elimination doesn't preserve the meaning of Python programs.
One problem, often mentioned, is that tail-call elimination changes the call stack, and Python allows for runtime introspection of the stack. But there is another problem that is rarely mentioned. There is probably a lot of code like this in the wild:
import mmap

def map_file(path):
    f = open(path, 'rb')
    return mmap.mmap(f.fileno(), 0)
The call to mmap.mmap is in tail position. If it were replaced by a jump, then the current stack frame would be discarded before control was passed to mmap. The current stack frame contains the only reference to the file object, so the file object could (and in CPython would) be freed before mmap is called, which would close the file descriptor, invalidating it before mmap sees it.
At best, the code would fail with an exception. At worst, the file descriptor could be reused in another thread, causing mmap to map the wrong file. So this "optimization" would be a potentially disastrous thing to unleash on the huge body of existing Python code.
The Python spec guarantees that such problems won't occur, so you can be sure that no conformant implementation will ever convert return f(args) into a jump—unless, perhaps, it has a sophisticated static analysis engine that can prove that discarding an object early will have no observable consequences in this case.
None of that would prevent Python from adding a syntax for explicit tail calls with jump semantics, such as
return from f(args)
That wouldn't break code that didn't use it, and it would probably be useful for autogenerated code and some algorithms. GvR is no longer BDFL, so it might happen, but I wouldn't hold my breath.

Advantages of Higher order functions in Python

To implement prettified XML, I have written the following code:
import xml.etree.ElementTree as ET
from xml.dom import minidom

def prettify_by_response(response, prettify_func):
    root = ET.fromstring(response.content)
    return prettify_func(root)

def prettify_by_str(xml_str, prettify_func):
    root = ET.fromstring(xml_str)
    return prettify_func(root)

def make_pretty_xml(root):
    rough_string = ET.tostring(root, "utf-8")
    reparsed = minidom.parseString(rough_string)
    xml = reparsed.toprettyxml(indent="\t")
    return xml

def prettify(response):
    if isinstance(response, (str, bytes)):
        return prettify_by_str(response, make_pretty_xml)
    else:
        return prettify_by_response(response, make_pretty_xml)
In the prettify_by_response and prettify_by_str functions, I pass the function make_pretty_xml as an argument.
Instead of passing the function as an argument, I could simply call it directly, e.g.:
def prettify_by_str(xml_str, prettify_func):
    root = ET.fromstring(xml_str)
    return make_pretty_xml(root)
One advantage of passing the function as an argument over calling it directly is that these functions are not tightly coupled to make_pretty_xml.
What would the other advantages be, or am I just adding additional complexity?
This seems very open to biased answers; I'll try to be impartial, but I can't make any promises.
First, higher-order functions are functions that receive and/or return functions. The advantages are debatable, so I'll try to enumerate the uses of HoFs and point out the good and the bad of each one.
Callbacks
Callbacks came about as a solution to blocking calls. I need B to happen after A, so I call something that blocks on A and then calls B. This naturally leads to observations like: hmm, my system wastes a lot of time waiting for things to happen; what if, instead of waiting, I pass what I need done as an argument? As with anything new in technology, it seems a good idea until it has to scale.
Callbacks are very common in event systems. If you've ever written JavaScript, you know what I'm talking about.
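A tiny sketch of the pattern (the names are illustrative only):

def fetch_data(on_done):
    data = {"id": 1}               # pretend this blocked on I/O
    on_done(data)                  # hand the result to the callback

fetch_data(lambda d: print("got", d))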
Algorithm abstraction
Some designs, mostly behavioral ones, can make use of HoFs to choose an algorithm at runtime. You can have a high-level algorithm that receives functions to deal with the low-level stuff. This leads to more abstraction, code reuse, and portable code. Here, portable means that you can write code to deal with new low-level details without changing the high-level parts. This idea is not tied to HoFs, but they can be of great help here.
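The built-in sorted is a familiar instance of this: the key function you pass in swaps out the low-level comparison while the high-level sorting algorithm stays untouched:

words = ["pear", "fig", "banana"]
print(sorted(words))              # ['banana', 'fig', 'pear']
print(sorted(words, key=len))     # ['fig', 'pear', 'banana']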
Attaching behavior to another function
The idea here is taking a function as an argument and returning a function that does exactly what the argument function does, plus, some attached behavior. And this is where (I think) HoF really shines.
Python decorators are a perfect example. They take a function as an argument and return another function, which is then bound to the same identifier as the original function.
@foo
def bar(*args):
    ...

is the same as

def bar(*args):
    ...
bar = foo(bar)
Now, reflect on this code
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    if n < 2:
        return n
    return fib(n-1) + fib(n-2)
fib is just a Fibonacci function; it calculates the Fibonacci number for n. Now lru_cache attaches a new behavior: caching results for previously calculated values. The logic inside the fib function is not tainted by the LRU-cache logic. What a beautiful piece of abstraction we have here.
Applicative style programming or point-free programming
The idea here is to remove variables, or "points", and combine function applications to express algorithms. I'm sure there are lots of people better at this subject than me wandering around SO.
As a side note, this is not a very common style in Python.
for i in it:
    func(i)

versus

mapped_it = map(func, it)

In the second example, we removed the i variable. This is common in the parsing world. As another side note, the map function is lazy in Python, so the second example has no effect until you iterate over mapped_it.
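For a slightly fuller taste of the point-free flavor, functools.partial can bind arguments without introducing a lambda or a named variable (a small sketch):

from functools import partial
from operator import add

add3 = partial(add, 3)             # a new function; no variable for the operand
print(list(map(add3, [1, 2, 3])))  # [4, 5, 6]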
Your case
In your case, you are returning the value of the callback call. In fact, you don't need the callback: you can simply line up the calls as you did, and for this case you don't need HoFs.
I hope this helps, and that somebody can show better examples of applicative style :)
Regards

Python infinite loop and if statement

The question requires me to determine the output of the following code.
def new_if(pred, then_clause, else_clause):
    if pred:
        then_clause
    else:
        else_clause

def p(x):
    new_if(x > 5, print(x), p(2*x))

p(1)
I don't understand why it results in an infinite loop, with output 1, 2, 4, 8, 16, ... and so on.
From what I understand, passing print(x) as a parameter will immediately print x; that is why the output has 1, 2, 4 even though the predicate is not True.
What I don't understand is: after x > 5, when pred is True, why does the function not end at the if pred: branch?
Is it because there is no return value? Even after I put return then_clause or else_clause, it is still an infinite loop.
I am unable to test this on pythontutor as it is infinite recursion.
Thank you for your time.
Python doesn't let you pass expressions like x > 5 as code to other functions (at least, not directly as the code is trying to do). If you call a function like foo(x > 5), the x > 5 expression is evaluated immediately in the caller's scope and only the result of the evaluation is passed to the function being called. The same happens for function calls within other function calls. When Python sees foo(bar()), it calls bar first, then calls foo with bar's return value.
In the p(x) function, the code is trying to pass p(2*x) to the new_if function, but the interpreter never gets to new_if since the p calls keep recursing forever (or rather until an exception is raised for exceeding the maximum recursion depth).
One way to make the code work would be to put the expressions into lambda functions, and changing new_if to call them. Bundling the code up into a function lets you delay the evaluation of the expression until the function is called, and there's no infinite recursion since pred_func is generally going to return True at some point (though it will still recurse forever if you call something like p(0) or p(-1)):
def new_if(pred_func, then_func, else_func):
    if pred_func():
        then_func()
    else:
        else_func()

def p(x):
    new_if(lambda: x > 5, lambda: print(x), lambda: p(2*x))
Note that lambdas feel a little bit odd to me for then_func or else_func, since we don't care about or use the return values from them at all. A lambda function always returns the result of its expression. That's actually pretty harmless in this case, since both print and p return None anyway, which is the same as what Python would return for us if we didn't explicitly return from a regular (non-lambda) function. But for me at least, it seems more natural to use a lambda when the return value means something. (Perhaps new_if should return the value returned from whichever function it calls?)
If you don't like writing closures (i.e. functions that have to look up x in the enclosing scope), you could instead use functools.partial to bind pre-calculated arguments to functions like print and p without calling those functions immediately. For instance:
from functools import partial

def p(x):
    return new_if(partial((5).__lt__, x), partial(print, x), partial(p, 2*x))
This only works if each of the expressions can be turned into a single call to an existing function. It can be done in this case (with a little creativity and careful syntax for pred_func), but probably won't be possible in more complicated cases.
It's also worth noting that the evaluation of 2*x happens immediately in the p function's scope, before new_if is called. If that multiplication were the expensive part of the else_func logic, that could be problematic (you'd want to defer the work until else_func was actually called).
You are calling the function from itself, which causes the infinite loop, and you have nothing to stop it.
def new_if(pred, then_clause, else_clause):
    if pred:
        then_clause
    else:
        else_clause

def p(x):
    if x < 5:
        new_if(x > 5, print(x), p(2*x))

p(1)
this will solve it

How to write Python generator function that never yields anything

I want to write a Python generator function that never actually yields anything. Basically it's a "do-nothing" drop-in that can be used by other code which expects to call a generator (but doesn't always need results from it). So far I have this:
def empty_generator():
    # ... do some stuff, but don't yield anything
    if False:
        yield
Now, this works OK, but I'm wondering if there's a more expressive way to say the same thing, that is, declare a function to be a generator even if it never yields any value. The trick I've employed above is to show Python a yield statement inside my function, even though it is unreachable.
Another way is
def empty_generator():
    return
    yield
Not really "more expressive", but shorter. :)
Note that iter([]) or simply [] will do as well.
An even shorter solution:
def empty_generator():
    yield from []
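All three spellings behave identically from the caller's side; a quick sanity check (a sketch, with eg1/eg2/eg3 standing in for the variants above):

import inspect

def eg1():
    if False:
        yield

def eg2():
    return
    yield

def eg3():
    yield from []

for f in (eg1, eg2, eg3):
    assert inspect.isgeneratorfunction(f)
    assert list(f()) == []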
For maximum readability and maintainability, I would prioritize a construct which goes at the top of the function. So either
your original if False: yield construct, but hoisted to the very first line, or
a separate decorator which adds generator behavior to a non-generator callable.
(That's assuming you didn't just need a callable which did something and then returned an empty iterable/iterator. If so then you could just use a regular function and return ()/return iter(()) at the end.)
Imagine the reader of your code sees:
def name_fitting_what_the_function_does():
    # We need this function to be an empty generator:
    if False: yield
    # that crucial stuff that this function exists to do
Having this at the top immediately cues in every reader of this function to this detail, which affects the whole function - affects the expectations and interpretations of this function's behavior and usage.
How long is your function body? More than a couple of lines? Then, as a reader, I will feel righteous fury and condemnation towards the author if I don't get a cue that this function is a generator until the very end, because I will probably have spent significant mental effort weaving a model in my head based on the assumption that this is a regular function. The first yield in a generator should ideally be immediately visible, when you don't even know to look for it.
Also, in a function longer than a few lines, a construct at the very beginning of the function is more trustworthy - I can trust that anyone who has looked at a function has probably seen its first line every time they looked at it. That means a higher chance that if that line was mistaken or broken, someone would have spotted it. That means I can be less vigilant for the possibility that this whole thing is actually broken but being used in a way that makes the breakage non-obvious.
If you're working with people who are sufficiently fluently familiar with the workings of Python, you could even leave off that comment, because to someone who immediately remembers that yield is what makes Python turn a function into a generator, it is obvious that this is the effect, and probably the intent since there is no other reason for correct code to have a non-executed yield.
Alternatively, you could go the decorator route:
@generator_that_yields_nothing
def name_fitting_what_the_function_does():
    ...  # that crucial stuff for which this exists

import functools

def generator_that_yields_nothing(wrapped):
    @functools.wraps(wrapped)
    def wrapper_generator():
        if False: yield
        wrapped()
    return wrapper_generator

Is it safe to yield from within a "with" block in Python (and why)?

The combination of coroutines and resource acquisition seems like it could have some unintended (or unintuitive) consequences.
The basic question is whether or not something like this works:
def coroutine():
    with open(path, 'r') as fh:
        for line in fh:
            yield line
Which it does. (You can test it!)
The deeper concern is that with is supposed to be an alternative to finally, where you ensure that a resource is released at the end of the block. Coroutines can suspend and resume execution from within the with block, so how is the conflict resolved?
For example, if you open a file with read/write both inside and outside a coroutine while the coroutine hasn't yet returned:
def coroutine():
    with open('test.txt', 'rw+') as fh:
        for line in fh:
            yield line

a = coroutine()
assert a.next() # Open the filehandle inside the coroutine first.
with open('test.txt', 'rw+') as fh: # Then open it outside.
    for line in fh:
        print 'Outside coroutine: %r' % repr(line)
assert a.next() # Can we still use it?
Update
I was going for write-locked file handle contention in the previous example, but since most OSes allocate file handles per-process there will be no contention there. (Kudos to @Miles for pointing out that the example didn't make too much sense.) Here's my revised example, which shows a real deadlock condition:
import threading

lock = threading.Lock()

def coroutine():
    with lock:
        yield 'spam'
        yield 'eggs'

generator = coroutine()
assert generator.next()
with lock: # Deadlock!
    print 'Outside the coroutine got the lock'
assert generator.next()
I don't really understand what conflict you're asking about, nor the problem with the example: it's fine to have two coexisting, independent handles to the same file.
One thing I didn't know that I learned in response to your question is that there is a new close() method on generators:
close() raises a new GeneratorExit exception inside the generator to terminate the iteration. On receiving this exception, the generator’s code must either raise GeneratorExit or StopIteration.
close() is called when a generator is garbage-collected, so this means the generator’s code gets one last chance to run before the generator is destroyed. This last chance means that try...finally statements in generators can now be guaranteed to work; the finally clause will now always get a chance to run. This seems like a minor bit of language trivia, but using generators and try...finally is actually necessary in order to implement the with statement described by PEP 343.
http://docs.python.org/whatsnew/2.5.html#pep-342-new-generator-features
So that handles the situation where a with statement is used in a generator, but it yields in the middle but never returns—the context manager's __exit__ method will be called when the generator is garbage-collected.
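A small sketch illustrating that cleanup (Python 3 spelling; Ctx is an illustrative context manager, not from any library):

class Ctx(object):
    def __enter__(self):
        print("entered")
        return self
    def __exit__(self, *exc_info):
        print("exited")

def gen():
    with Ctx():
        yield 1
        yield 2

g = gen()
next(g)    # prints "entered", then suspends at the first yield
g.close()  # raises GeneratorExit at the yield; the with block prints "exited"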
Edit:
With regards to the file handle issue: I sometimes forget that there exist platforms that aren't POSIX-like. :)
As far as locks go, I think Rafał Dowgird hits the nail on the head when he says "You just have to be aware that the generator is just like any other object that holds resources." I don't think the with statement is really that relevant here, since this function suffers from the same deadlock issues:
def coroutine():
    lock.acquire()
    yield 'spam'
    yield 'eggs'
    lock.release()

generator = coroutine()
generator.next()
lock.acquire() # whoops!
I don't think there is a real conflict. You just have to be aware that the generator is just like any other object that holds resources, so it is the creator's responsibility to make sure it is properly finalized (and to avoid conflicts/deadlock with the resources held by the object). The only (minor) problem I see here is that generators don't implement the context management protocol (at least as of Python 2.5), so you cannot just:
with coroutine() as cr:
    doSomething(cr)
but instead have to:
cr = coroutine()
try:
    doSomething(cr)
finally:
    cr.close()
The garbage collector does the close() anyway, but it's bad practice to rely on that for freeing resources.
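The standard library's contextlib.closing (available since Python 2.5) packages up exactly that try/finally pattern, so the above can also be written as:

from contextlib import closing

with closing(coroutine()) as cr:
    doSomething(cr)  # cr.close() is guaranteed to run on exiting the block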
Because yield can execute arbitrary code, I'd be very wary of holding a lock over a yield statement. You can get a similar effect in lots of other ways, though, including calling a method or function which might have been overridden or otherwise modified.
Generators, however, are always (nearly always) "closed", either with an explicit close() call, or just by being garbage-collected. Closing a generator throws a GeneratorExit exception inside the generator and hence runs finally clauses, with statement cleanup, etc. You can catch the exception, but you must re-raise it or exit the function (i.e. let a StopIteration exception be raised), rather than yield. It's probably poor practice to rely on the garbage collector to close the generator in cases like you've written, because that could happen later than you might want, and if someone calls os._exit(), then your cleanup might not happen at all.
For a TLDR, look at it this way:
with Context():
    yield 1
    pass # explicitly do nothing *after* yield
# exit context after explicitly doing nothing
The Context ends after pass is done (i.e. nothing), pass executes after yield is done (i.e. execution resumes). So, the with ends after control is resumed at yield.
TLDR: A with context remains held when yield releases control.
There are actually just two rules that are relevant here:
When does with release its resource?
It does so once and directly after its block is done. The former means it does not release during a yield, as that could happen several times. The latter means it does release after yield has completed.
When does yield complete?
Think of yield as a reverse call: control is passed up to a caller, not down to a called one. Similarly, yield completes when control is passed back to it, just like when a call returns control.
Note that both with and yield are working as intended here! The point of a with lock is to protect a resource and it stays protected during a yield. You can always explicitly release this protection:
def safe_generator():
    while True:
        with lock():
            # keep lock for critical operation
            result = protected_operation()
        # release lock before releasing control
        yield result
That would be how I expected things to work. Yes, the block will not release its resources until it completes, so in that sense the resource has escaped its lexical nesting. However, this is no different from making a function call that tries to use the same resource within a with block: nothing helps you in the case where the block has not yet terminated, for whatever reason. It's not really anything specific to generators.
One thing that might be worth worrying about though is the behaviour if the generator is never resumed. I would have expected the with block to act like a finally block and call the __exit__ part on termination, but that doesn't seem to be the case.
