Python twisted: iterators and yields/inlineCallbacks

Folks,
Am thoroughly confused, so it's possible I am not even asking things correctly, but here goes:
I have a twisted application using inlineCallbacks. Now I need to define an iterator, which means a generator is returned to the caller. However, the iterator cannot be inlineCallbacks-decorated, can it? If not, how do I code something like this?
Just to clarify: the goal is that process_loop needs to be called every, say, 5 seconds, it can process only ONE chunk of, say, 10 items, and then it has to let go. However, to know that chunk of 10 (stored in cached, which is a dict of dicts), it needs to call a function that returns a Deferred.
@inlineCallbacks    ### can't have inlineCallbacks here, right?
def cacheiter(cached):
    for cachename, cachevalue in cached.items():
        result = yield (call func here which returns deferred)
        if result is True:
            for k, v in cachevalue.items():
                yield cachename, k, v

@inlineCallbacks
def process_chunk(myiter, num):
    try:
        for i in xrange(num):
            nextval = myiter.next()
            yield some_processing(nextval)
        returnValue(False)
    except StopIteration:
        returnValue(True)

@inlineCallbacks
def process_loop(cached):
    myiter = cacheiter(cached)
    result = yield process_chunk(myiter, 10)
    if not result:
        print 'More left'
        reactor.callLater(5, process_loop, cached)
    else:
        print 'All done'

You're right that you can't express what you want to express in cacheiter. The inlineCallbacks decorator won't let you have a function that returns an iterator. If you decorate a function with it, then the result is a function that always returns a Deferred. That's what it is for.
Part of what makes this difficult is that iterators don't work well with asynchronous code. If there's a Deferred involved in producing the elements of your iterator, then the elements that come out of your iterator are going to be Deferreds first.
You might do something like this to account for that:
@inlineCallbacks
def process_work():
    for element_deferred in some_jobs:
        element = yield element_deferred
        work_on(element)
This can work, but it looks particularly weird. Since generators can only yield to their caller (not, for example, to their caller's caller), the some_jobs iterator can't do anything about this; only code lexically within process_work can yield a Deferred to the inlineCallbacks-provided trampoline to wait on.
If you don't mind this pattern, then we could imagine your code being written something like this:
from twisted.internet.task import deferLater
from twisted.internet.defer import inlineCallbacks, returnValue
from twisted.internet import reactor

class cacheiter(object):
    def __init__(self, cached):
        self._cached = iter(cached.items())
        self._remaining = []

    def __iter__(self):
        return self

    @inlineCallbacks
    def next(self):
        # First re-fill the list of synchronously-producible values if it is empty.
        if not self._remaining:
            for name, value in self._cached:
                # Wait on this Deferred to determine if this cache item should be included.
                if (yield check_condition(name, value)):
                    # If so, put all of its values into the value cache so the next one
                    # can be returned immediately next time this method is called.
                    self._remaining.extend([(name, k, v) for (k, v) in value.items()])
        # Now actually give out a value, if there is one.
        if self._remaining:
            returnValue(self._remaining.pop())
        # Otherwise the entire cache has been visited and the iterator is complete.
        # Sadly we cannot signal completion with StopIteration, because the iterator
        # protocol isn't going to add an errback to this Deferred and check for
        # StopIteration.  So signal completion with a simple None value.
        returnValue(None)

@inlineCallbacks
def process_chunk(myiter, num):
    for i in xrange(num):
        nextval = yield myiter.next()
        if nextval is None:
            # The iterator signaled completion via the special None value.
            # Processing is complete.
            returnValue(True)
        # Otherwise process the value.
        yield some_processing(nextval)
    # Indicate there is more processing to be done.
    returnValue(False)

def sleep(sec):
    # Simple helper to delay asynchronously for some number of seconds.
    return deferLater(reactor, sec, lambda: None)

@inlineCallbacks
def process_loop(cached):
    myiter = cacheiter(cached)
    while True:
        # Loop processing 10 items from myiter at a time, until process_chunk signals
        # there are no values left.
        result = yield process_chunk(myiter, 10)
        if result:
            print 'All done'
            break
        print 'More left'
        # Insert the 5 second delay before starting on the next chunk.
        yield sleep(5)

d = process_loop(cached)
Another approach you might be able to take, though, is to use twisted.internet.task.cooperate. cooperate takes an iterator and consumes it, assuming that consuming it is potentially costly, and splitting up the job over multiple reactor iterations. Taking the definition of cacheiter from above:
from twisted.internet.task import cooperate

def process_loop(cached):
    finished = []
    def process_one(value):
        if value is None:
            finished.append(True)
        else:
            return some_processing(value)
    myiter = cacheiter(cached)
    while not finished:
        value_deferred = myiter.next()
        value_deferred.addCallback(process_one)
        yield value_deferred

task = cooperate(process_loop(cached))
d = task.whenDone()

I think you're trying to do this:
@inlineCallbacks
def cacheiter(cached):
    for cachename, cachevalue in cached.items():
        result = yield some_deferred()  # some deferred you'd like evaluated
        if result is True:
            # here you want to return something, so you have to use returnValue;
            # the generator you want to return can be written as a generator expression
            gen = ((cachename, k, v) for k, v in cachevalue.items())
            returnValue(gen)
When a genexp can't express what you're trying to return you can write a closure:
@inlineCallbacks
def cacheiter(cached):
    for cachename, cachevalue in cached.items():
        result = yield some_deferred()
        if result is True:
            # define the generator, saving the current values of the cache
            def gen(cachevalue=cachevalue, cachename=cachename):
                for k, v in cachevalue.items():
                    yield cachename, k, v
            returnValue(gen())  # return it

Try writing your iterator as a DeferredGenerator.
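For reference, the deferredGenerator style is the older Twisted API that inlineCallbacks replaced. A minimal sketch of what it looks like is below, reusing the hypothetical Deferred-returning check_condition() from the answer above; note that, exactly like inlineCallbacks, the decorated function still returns a Deferred rather than an iterator, so it runs into the same limitation already discussed.
from twisted.internet.defer import deferredGenerator, waitForDeferred

@deferredGenerator
def cache_check(cached):
    for cachename, cachevalue in cached.items():
        # deferredGenerator-style waiting: wrap the Deferred, yield it, then getResult()
        d = waitForDeferred(check_condition(cachename, cachevalue))
        yield d
        if d.getResult():
            # Yielding a plain (non-waitForDeferred) value becomes the result of
            # the Deferred returned by the decorated function.
            yield [(cachename, k, v) for k, v in cachevalue.items()]
            return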


Mixing yield and return. `yield [cand]; return` vs `return [[cand]]`. Why do they lead to different output? [duplicate]

Why does
yield [cand]
return
lead to different output/behavior than
return [[cand]]
Minimal viable example (uses recursion): the output of the version using yield [1]; return is different from the output of the version using return [[1]].
def foo(i):
    if i != 1:
        yield [1]
        return
    yield from foo(i-1)

def bar(i):
    if i != 1:
        return [[1]]
    yield from bar(i-1)

print(list(foo(1)))  # [[1]]
print(list(bar(1)))  # []
Minimal viable counter example (does not use recursion): the output of the version using yield [1]; return is the same as the output of the version using return [[1]].
def foo():
    yield [1]
    return

def foofoo():
    yield from foo()

def bar():
    return [[1]]

def barbar():
    yield from bar()

print(list(foofoo()))  # [[1]]
print(list(barbar()))  # [[1]]
Full context
I'm solving Leetcode #39: Combination Sum and was wondering why one solution works, but not the other:
Working solution
from functools import cache  # requires Python 3.9+

class Solution:
    def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
        @cache
        def helper(targ, i=0):
            if i == N or targ < (cand := candidates[i]):
                return
            if targ == cand:
                yield [cand]
                return
            for comb in helper(targ - cand, i):
                yield comb + [cand]
            yield from helper(targ, i+1)
        N = len(candidates)
        candidates.sort()
        yield from helper(target)
Non-working solution
from functools import cache # requires Python 3.9+
class Solution:
def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
#cache
def helper(targ, i=0):
if i == N or targ < (cand := candidates[i]):
return
if targ == cand:
return [[cand]]
for comb in helper(targ - cand, i):
yield comb + [cand]
yield from helper(targ, i+1)
N = len(candidates)
candidates.sort()
yield from helper(target)
Output
On the following input
candidates = [2,3,6,7]
target = 7
print(Solution().combinationSum(candidates, target))
the working solution correctly prints
[[3,2,2],[7]]
while the non-working solution prints
[]
I'm wondering why yield [cand]; return works, but return [[cand]] doesn't.
In a generator function, return just defines the value associated with the StopIteration exception implicitly raised to indicate an iterator is exhausted. It's not produced during iteration, and most iterating constructs (e.g. for loops) intentionally ignore the StopIteration exception (it means the loop is over, you don't care if someone attached random garbage to a message that just means "we're done").
For example, try:
>>> def foo():
... yield 'onlyvalue' # Existence of yield keyword makes this a generator
... return 'returnvalue'
...
>>> f = foo() # Makes a generator object, stores it in f
>>> next(f) # Pull one value from generator
'onlyvalue'
>>> next(f) # There is no other yielded value, so this hits the return; iteration over
--------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
...
StopIteration: 'returnvalue'
As you can see, your return value does get "returned" in a sense (it's not completely discarded), but it's never seen by anything iterating normally, so it's largely useless. Outside of rare cases involving using generators as coroutines (where you're using .send() and .throw() on instances of the generator and manually advancing it with next(genobj)), the return value of a generator won't be seen.
In short, you have to pick one:
Use yield anywhere in a function, and it's a generator (whether or not the code path of a particular call ever reaches a yield) and return just ends generation (while maybe hiding some data in the StopIteration exception). No matter what you do, calling the generator function "returns" a new generator object (which you can loop over until exhausted), it can never return a raw value computed inside the generator function (which doesn't even begin running until you loop over it at least once).
Don't use yield, and return works as expected (because it's not a generator function).
As an example to explain what happens to the return value in normal looping constructs, this is what for x in gen(): effectively expands to a C optimized version of:
__unnamed_iterator = iter(gen())
while True:
    try:
        x = next(__unnamed_iterator)
    except StopIteration:  # StopIteration caught here without inspecting it
        break  # Loop ends; the StopIteration is cleaned even from sys.exc_info() to avoid reference cycles
    # body of loop goes here
# Outside of loop, there is no StopIteration object left
As you can see, the expanded form of the for loop has to look for a StopIteration to know the loop is over, but it doesn't use it. For anything that's not a generator, the StopIteration never carries an associated value anyway, and the for loop has no way to report one even if it did: it has to end the loop when it's told iteration is over, and the arguments to StopIteration are explicitly not part of the values iterated. Anything else that consumes the generator (e.g. calling list on it) does roughly the same thing as the for loop, ignoring the StopIteration in the same way; nothing except code that specifically expects generators (as opposed to more general iterables and iterators) will ever bother to inspect the StopIteration object. (At the C layer there are optimizations such that most iterators never even produce a StopIteration object; they return NULL and leave no exception set, which everything using the iterator protocol treats as equivalent to raising StopIteration, so for anything but a generator there often isn't even an exception to inspect.)
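For completeness, the two places where a generator's return value is visible are delegation with yield from and catching StopIteration by hand. A minimal illustration:
def inner():
    yield 1
    return 'done'                     # stored on StopIteration.value

def outer():
    result = yield from inner()       # forwards the 1, then binds 'done' to result
    print('inner returned:', result)

print(list(outer()))                  # prints "inner returned: done", then [1]

g = inner()
next(g)                               # 1
try:
    next(g)
except StopIteration as stop:
    print(stop.value)                 # 'done'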

Using generator send() within a for loop

I implemented graph traversal as a generator function which yields the node being visited.
Sometimes the user needs to tell the traversal function that the edges outgoing from a particular node shouldn't be followed; in order to support that, the traversal checks the value sent back to it (using generator send() method), and if it's True, regards the node as a leaf for traversal purposes.
The problem is that the simplest user loop is kinda long:
# simplified thanks to @tobias_k
# bfs is the traversal generator function
traversal = bfs(g, start_node)
try:
    n = next(traversal)
    while True:
        # process(n) returns True if don't want to follow edges out of n
        n = traversal.send(process(n))
except StopIteration:
    pass
Is there any way to improve this?
I thought something like this should work:
for n in bfs(g, start_node):
    ???.send(process(n))
but I feel I'm missing the knowledge of some python syntax.
I don't see a way to do this in a regular for loop. However, you could create another generator, that iterates another generator, using some "follow-function" to determine whether to follow the current element, thus encapsulating the tricky parts of your code into a separate function.
def checking_generator(generator, follow_function):
    try:
        x = next(generator)
        while True:
            yield x
            x = generator.send(follow_function(x))
    except StopIteration:
        pass

for n in checking_generator(bfs(g, start_node), process):
    print(n)
I discovered that my question would have had a one-line answer, using the extended "continue" statement proposed in the earlier version of PEP 342:
for n in bfs(g, start_node):
    continue process(n)
However, while PEP 342 was accepted, that particular feature was withdrawn after this June 2005 discussion between Raymond and Guido:
Raymond Hettinger said:
Let me go on record as a strong -1 for "continue EXPR". The
for-loop is our most basic construct and is easily understood in its
present form. The same can be said for "continue" and "break" which
have the added advantage of a near zero learning curve for people
migrating from other languages.
Any urge to complicate these basic statements should be seriously
scrutinized and held to high standards of clarity, explainability,
obviousness, usefulness, and necessity. IMO, it fails most of those
tests.
I would not look forward to explaining "continue EXPR" in the tutorial
and think it would stand out as an anti-feature.
[...] The correct argument against "continue EXPR" is that there
are no use cases yet; if there were a good use case, the explanation
would follow easily.
Guido
If Python core developers have since changed their mind about the usefulness of extended "continue", perhaps this could be reintroduced in a future PEP. But given that a nearly identical use case to the one in this question was already discussed in the quoted thread and wasn't found persuasive, it seems unlikely.
To simplify the client code, you could use an ordinary bfs() generator and check a node.isleaf attribute in it:
for node in bfs(g, start_node):
    node.isleaf = process(node)  # don't follow if `process()` returns True
The disadvantage is that node is mutable. Or you have to pass a shared data structure that tracks leaf nodes: leaf[node] = process(node) where leaf dictionary is passed into bfs() earlier.
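A minimal sketch of that shared-dictionary variant (the bfs body here is illustrative, not the OP's actual traversal): the caller fills leaf inside the loop body, and the generator consults it after resuming from the yield, so no send() is needed.
from collections import deque

def bfs(g, start_node, leaf):
    seen = {start_node}
    queue = deque([start_node])
    while queue:
        node = queue.popleft()
        yield node
        # By the time we resume here, the caller's loop body has already
        # recorded whether this node should be treated as a leaf.
        if leaf.get(node):
            continue
        for neighbour in g[node]:      # assumes an adjacency-dict graph
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

leaf = {}
for node in bfs(g, start_node, leaf):
    leaf[node] = process(node)         # True means: don't follow edges out of node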
If you want to use the .send() method explicitly, you have to handle StopIteration. See PEP 479 -- Change StopIteration handling inside generators. You could hide it in a helper function:
def traverse(tree_generator, visitor):
    try:
        node = next(tree_generator)
        while True:
            node = tree_generator.send(visitor(node))
    except StopIteration:
        pass
Example:
traverse(bfs(g, start_node), process)
I don't see this as a frequent use case. Consider this as the original generator:
def original_gen():
    for x in range(10):
        should_break = yield x
        if should_break:
            break
If the value of should_break is always calculated based on some function call with x then why not just write the generator like this:
def processing_gen(check_f):
    for x in range(10):
        yield x
        should_break = check_f(x)
        if should_break:
            break
However I usually think of the code that processes the generated values as being written inside the loop (otherwise what is the point of having a loop at all?)
What it really seems you want to do is create a generator where calling the __next__ method really implies send(process(LAST_VALUE)) which can be implemented with a class:
class Followup_generator(): #feel free to use a better name
def __init__(self,generator,following_function):
self.gen = generator
self.process_f = following_function
def __iter__(self):
return self
def __next__(self):
if hasattr(self,"last_value"):
return self.send(self.process_f(self.last_value))
else:
self.last_value = next(self.gen)
return self.last_value
def send(self,arg):
self.last_value = self.gen.send(arg)
return self.last_value
def __getattr__(self,attr):
"forward other lookups to the generator (.throw etc.)"
return getattr(self.gen, attr)
# call signature is the exact same as #tobias_k's checking_generator
traversal = Followup_generator(bfs(g, start_node), process)
for n in traversal:
print(n)
n = traversal.send(DATA) #you'd be able to send extra values to it
However, I still don't see this as something that would be frequently used; I'd be perfectly fine with a while loop, although I'd put the .send call at the top:
traversal = bfs(g, start_node)
send_value = None
while True:
    n = traversal.send(send_value)
    # code for loop, ending in calculating the next send_value
    send_value = process(n)
And you might wrap that in a try: ... except StopIteration: pass, although I find that simply waiting for an error to be raised is better expressed with a context manager:
class Catch:
    def __init__(self, exc_type):
        if issubclass(exc_type, BaseException):
            self.catch_type = exc_type
        else:
            raise TypeError("can only catch Exceptions")
    def __enter__(self):
        return self
    def __exit__(self, exc_type, err, tb):
        # exc_type is None when the block exits without an exception
        if exc_type is not None and issubclass(exc_type, self.catch_type):
            self.err = err
            return True

with Catch(StopIteration):
    traversal = bfs(g, start_node)
    send_value = None
    while True:
        n = traversal.send(send_value)
        # code for loop, ending in calculating the next send_value
        send_value = process(n)
This is probably the answer to the question in the thread's title. Take a look at the additional empty yield statements inside the traversal function and the custom send function, which do the magic.
# tested with Python 3.7
def traversal(n):
    for i in range(n):
        yield i, '%s[%s] %s' % (' ' * (4 - n), n, i)
        stop = yield
        if stop:
            yield  # here's the first part of the magic
        else:
            yield  # the same as above
            yield from traversal(int(n / 2))

def send(generator, value):
    next(generator)  # here's the second part of the magic
    generator.send(value)

g = traversal(4)
for i, (num, msg) in enumerate(g):
    print('>', i, msg)
    stop = num % 2 == 0
    send(g, stop)
I've written a small class SettableGenerator which uses a method to receive the value to be sent and then forwards it to the actual generator when __next__ is invoked.
With this you can write:
gen = SettableGenerator(bfs(g, start_node))
for n in gen:
    gen.set(process(n))
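The class itself isn't shown in that answer; one possible implementation (my sketch, not the author's actual code) could look like this:
class SettableGenerator:
    def __init__(self, gen):
        self._gen = gen
        self._to_send = None
    def set(self, value):
        # Remember the value to pass to the wrapped generator on the next advance.
        self._to_send = value
    def __iter__(self):
        return self
    def __next__(self):
        value, self._to_send = self._to_send, None
        if value is None:
            return next(self._gen)     # also covers the very first advance
        return self._gen.send(value)
    next = __next__                    # Python 2 compatibility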
Let's consider the following generator. It generates numbers from 0 to 9. For every generated number, it gets an input and stores it into ret:
def count_to_nine():
    # Output: numbers from 0 to 9
    # Input: converted numbers
    ret = []
    for i in range(10):
        # Yield a number, get something back
        val = (yield i)
        # Remember that "something"
        ret.append(val)
    return ret
You can, indeed, iterate it using next() + send(),
but the best way is to iterate using send() alone:
g = count_to_nine()
value = None  # to make sure that the first send() gives a None
while True:
    value = g.send(value)  # send the previously generated value, get a new one
    value = f'#{value}'
Here's the result:
StopIteration: ['#0', '#1', '#2', '#3', '#4', '#5', '#6', '#7', '#8', '#9']
If you want that output, catch the StopIteration and get the result from it.
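That last step looks roughly like this (my sketch, using the same count_to_nine() as above):
g = count_to_nine()
value = None
try:
    while True:
        value = g.send(value)      # send the previous converted value, get a new number
        value = f'#{value}'
except StopIteration as stop:
    result = stop.value            # the list built up inside count_to_nine()
print(result)                      # ['#0', '#1', ..., '#9']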
Cheers!

Discarding a tee element

I want to "fork" a stream of a large amount of data, in order to look-ahead just a couple of elements.
I was hoping to write something like this:
from itertools import tee

stream = # a generator of a very large data stream
while True:
    try:
        element = stream.next()
        process_element(element)
        if some_condition(element):
            stream, fork = tee(stream)
            process_fork(fork)
    except StopIteration:
        break
Reading the documentation for tee, though, I'm left with the impression that the deque of fork will keep growing, even after fork has gone out of scope.
Is this the case? If so, is there a way to tell tee to "discard" the fork? Or is there another more obvious way of doing this?
You could avoid the implementation-dependent behavior @goncalopp mentions by making a Tee class and giving it a discard() method:
import collections

class Tee(object):
    def __init__(self, iterable, n=2):
        it = iter(iterable)
        self.deques = [collections.deque() for _ in range(n)]
        def gen(mydeque):
            while True:
                if not mydeque:             # when the local deque is empty
                    newval = next(it)       # fetch a new value and
                    for d in self.deques:   # load it to all the active deques
                        d.append(newval)
                yield mydeque.popleft()
        self.generators = [gen(d) for d in self.deques]
    def __call__(self):
        return self.generators
    def discard(self, gen):
        index = self.generators.index(gen)
        del self.deques[index]
        del self.generators[index]
Note that since it would now be a class, utilizing it would be slightly different. However, when you're done with fork, you could get rid of it by calling tee.discard(fork). Here's an example:
tee = None
while True:
    try:
        element = stream.next()
        process_element(element)
        if some_condition(element):
            if not tee:
                tee = Tee(stream)
                stream, fork = tee()
            process_fork(fork)
    except StopIteration:
        break
if tee:
    tee.discard(fork)
    fork = None
Here's a simple test script:
from itertools import tee

def natural_numbers():
    i = 0
    while True:
        yield i
        i += 1

stream = natural_numbers()  # Don't use xrange, cpython optimizes it away
stream, fork = tee(stream)
del fork
for e in stream:
    pass
It seems that, at least in CPython, the process' memory doesn't keep growing. There seems to be a mechanism that detects this situation.
If, however, you replace tee with the pure-Python code the documentation states is equivalent...
def tee(iterable, n=2):
    it = iter(iterable)
    deques = [collections.deque() for i in range(n)]
    def gen(mydeque):
        while True:
            if not mydeque:          # when the local deque is empty
                newval = next(it)    # fetch a new value and
                for d in deques:     # load it to all the deques
                    d.append(newval)
            yield mydeque.popleft()
    return tuple(gen(d) for d in deques)
...memory does keep growing, as expected.
So, my guess is that this will be implementation-dependent behaviour

How can I write a Python decorator to increase stackdepth?

BACKGROUND
When playing around, I often write simple recursive functions looking something like:
def f(a, b):
    if a >= 0 and b >= 0:
        return min(f(a-1, b), f(b, a-1))  # + some cost that depends on a,b
    else:
        return 0
(For example, when computing weighted edit distances, or evaluating recursively defined mathematical formulas.)
I then use a memoizing decorator to cache the results automatically.
PROBLEM
When I try something like f(200,10) I get:
RuntimeError: maximum recursion depth exceeded
This is as expected because the recursive implementation exhausts Python's stack space/ recursion limits.
WORKAROUNDS
I usually work around this problem by one of:
Increasing recursion limit with sys.setrecursionlimit (only works up to about 1000 depth)
Using a for loop to fill up the cache for smaller values
Changing the function to use a list as a manual stack (via append and pop calls) (in other words, moving from a recursive implementation to an iterative one)
Using an alternative programming language
but I find all of these quite error prone.
QUESTION
Is there a way to write an @Bigstack decorator that would simulate the effect of having a really big stack?
Note that my functions normally make several recursive function calls so this is not the same as tail recursion - I really do want to save all the internal state of each function on the stack.
WHAT I'VE TRIED
I've been thinking about using a list of generator expressions as my stack. By probing the stackframe I could work out when the function has been called recursively and then trigger an exception to return to the decorator code. However, I can't work out a way of gluing these ideas together to make anything that actually works.
Alternatively, I could try accessing the abstract syntax tree for the function and try transforming calls to recursive functions to yield statements, but this seems like it's heading in the wrong direction.
Any suggestions?
EDIT
It certainly looks like I am misusing Python, but another approach I have been considering is to use a different thread for each block of, say, 500 stack frames and then insert queues between each consecutive pair of threads - one queue for arguments, and another queue for return values. (Each queue will have at most one entry in it.) I think this probably doesn't work for some reason - but I'll probably only work out why after I've tried to implement it.
To get around the recursion limit, you can catch the RuntimeError exception to detect when you've run out of stack space, and then return a continuation-ish function that, when called, restarts the recursion at the level where you ran out of space. Call this (and its return value, and so on) until you get a value, then try again from the top. Once you've memoized the lower levels, the higher levels won't run into a recursion limit, so eventually this will work. Put the repeated-calling-until-it-works in a wrapper function. Basically it's a lazy version of your warming-up-the-cache idea.
Here's an example with a simple recursive "add numbers from 1 to n inclusive" function.
import functools

def memoize(func):
    cache = {}
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        key = args, tuple(sorted(kwargs.items()))
        if key in cache:
            return cache[key]
        else:
            result = func(*args, **kwargs)
            if not callable(result):
                cache[key] = result
            return result
    return wrapper

@memoize
def _addup(n):
    if n < 2:
        return n
    else:
        try:
            result = _addup(n - 1)
        except RuntimeError:
            return lambda: _addup(n)
        else:
            return result if callable(result) else result + n

def addup(n):
    result = _addup(n)
    while callable(result):
        while callable(result):
            result = result()
        result = _addup(n)
    return result

assert addup(5000) == sum(xrange(5001))
Rather than returning the lambda function all the way back up the call chain, we can raise an exception to short-circuit that, which both improves performance and simplifies the code:
# memoize function as above, or you can probably use functools.lru_cache

class UnwindStack(Exception):
    pass

@memoize
def _addup(n):
    if n < 2:
        return n
    else:
        try:
            return _addup(n - 1) + n
        except RuntimeError:
            raise UnwindStack(lambda: _addup(n))

def _try(func, *args, **kwargs):
    try:
        return func(*args, **kwargs)
    except UnwindStack as e:
        return e[0]

def addup(n):
    result = _try(_addup, n)
    while callable(result):
        while callable(result):
            result = _try(result)
        result = _try(_addup, n)
    return result
This remains pretty inelegant, though, and still has a fair amount of overhead, and I can't imagine how you'd make a decorator out of it. Python isn't really suited to this kind of thing, I guess.
Here's an implementation that uses a list of generator expressions as the stack:
def run_stackless(frame):
    stack, return_stack = [(False, frame)], []
    while stack:
        active, frame = stack.pop()
        action, res = frame.send(return_stack.pop() if active else None)
        if action == 'call':
            stack.extend([(True, frame), (False, res)])
        elif action == 'tail':
            stack.append((False, res))
        elif action == 'return':
            return_stack.append(res)
        else:
            raise ValueError('Unknown action', action)
    return return_stack.pop()
To use it you need to transform the recursive function according to the following rules:
return expr -> yield 'return', expr
recursive_call(args...) -> (yield 'call', recursive_call(args...))
return recursive_call(args...) -> yield 'tail', recursive_call(args...)
For example, with the cost function as a * b, your function becomes:
def f(a, b):
    if a >= 0 and b >= 0:
        yield 'return', min((yield 'call', f(a-1, b)),
                            (yield 'call', f(b, a-1))) + (a * b)
    else:
        yield 'return', 0
Testing:
In [140]: run_stackless(f(30, 4))
Out[140]: 410
In Python 2.6.2 it appears to offer a ~8-10x performance hit compared to direct calls.
The tail action is for tail recursion:
def factorial(n):
    acc = [1]
    def fact(n):
        if n == 0:
            yield 'return', 0
        else:
            acc[0] *= n
            yield 'tail', fact(n - 1)
    run_stackless(fact(n))
    return acc[0]
The transformation to generator-recursive style is fairly easy, and could probably be done as a bytecode hack.
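As a quick sanity check (my own example, not from the original answer), a trivially deep recursion written in this style runs far past the default recursion limit, since the "frames" are suspended generators held in a plain list rather than on the interpreter stack:
def count(n):
    # count(n) == n, computed with n nested 'call' frames
    if n == 0:
        yield 'return', 0
    else:
        yield 'return', (yield 'call', count(n - 1)) + 1

print(run_stackless(count(10000)))   # 10000, well beyond the default limit of ~1000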
This approach combines memoisation and increased stack depth into a single decorator.
I generate a pool of threads with each thread responsible for 64 levels of the stack.
Threads are only created once and reused (but currently never deleted).
Queues are used to pass information between threads, although note that only the thread corresponding to the current stack depth will actually have work to do.
My experiments suggest this adds around 10% overhead for a simple recursive function (and should be less for more complicated functions).
import threading, Queue

class BigstackThread(threading.Thread):
    def __init__(self, send, recv, func):
        threading.Thread.__init__(self)
        self.daemon = True
        self.send = send
        self.recv = recv
        self.func = func
    def run(self):
        while 1:
            args = self.send.get()
            v = self.func(*args)
            self.recv.put(v)

class Bigstack(object):
    def __init__(self, func):
        self.func = func
        self.cache = {}
        self.depth = 0
        self.threadpool = {}
    def __call__(self, *args):
        if args in self.cache:
            return self.cache[args]
        self.depth += 1
        if self.depth & 63:
            v = self.func(*args)
        else:
            T = self.threadpool
            if self.depth not in T:
                send = Queue.Queue(1)
                recv = Queue.Queue(1)
                t = BigstackThread(send, recv, self)
                T[self.depth] = send, recv, t
                t.start()
            else:
                send, recv, _ = T[self.depth]
            send.put(args)
            v = recv.get()
        self.depth -= 1
        self.cache[args] = v
        return v

@Bigstack
def f(a, b):
    if a >= 0 and b >= 0:
        return min(f(a-1, b), f(b-1, a)) + 1
    return 0

invoking yield for a generator in another function

suppose I have some manager object. This object's API has a main_hook function, which gets another function f as its argument and runs the given f in a loop, doing some stuff in between each iteration:
def main_hook(self, f):
    while (self.shouldContinue()):
        #do some preparations
        f(self)
        #do some tear down
Now, I also have (more accurately, would like to have) a function stop_and_do_stuff, that once called, stops main_hook dead in its tracks, returns control to whichever func called main_hook, and after that func finishes what it's doing, control goes back to main_hook and it continues. Basically the result will be the same as doing
def main_hook(self, f):
    while (self.shouldContinue()):
        #do some preparations
        yield
        #do some tear down
Except that instead of yield I want to have a call to f(), while giving f the option to call self.stop_and_do_stuff().
I can't work around this by making f also a generator, for 2 reasons:
1. f isn't part of my API - it's given to me by a user who uses my lib.
2. Even if I could ask him to use yield, the place in the code where he will need to call stop_and_do_stuff won't be directly inside f, but rather somewhere in the call stack below f(), not directly in it, e.g.
def h(manager):
    #do stuff
    if should stop:
        manager.stop_and_do_stuff()
    #do more stuff

def g(manager):
    #some stuff
    if should stop:
        manager.stop_and_do_stuff()
    #more stuff
    if should stop again:
        manager.stop_and_do_stuff()
    if should call h:
        h()

def f(manager):
    g(manager)
so if I choose to make f a generator, I also need to make g a generator and also h, otherwise this trick won't work.
Is there any solution to all of this? maybe I'm trying to solve it the wrong way?
(I know this question is long and ugly - it's the best I could do. If something isn't clear please tell me and I'll clarify it)
EDIT
Maybe pep 342 is the solution?
My previous answer describes how to do this in Python2, which is very ugly. But now I ran across PEP 380: Syntax for Delegating to a Subgenerator. That does exactly what you ask. The only problem is that it requires Python3. But that shouldn't really be a problem.
Here's how it works:
def worker():
    yield 1
    yield 2
    return 3

def main():
    yield 0
    value = yield from worker()
    print('returned %d' % value)
    yield 4

for m in main():
    print('generator yields %d' % m)
The result of this is:
generator yields 0
generator yields 1
generator yields 2
returned 3
generator yields 4
Exceptions are passed through the way you would expect.
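For instance (my example, not from the original answer), an exception thrown into the delegating generator is forwarded into the subgenerator:
def worker():
    try:
        yield 1
        yield 2
    except ValueError:
        yield 'worker handled ValueError'

def main():
    yield from worker()

g = main()
print(next(g))              # 1
print(g.throw(ValueError))  # 'worker handled ValueError'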
I believe I should also add an answer from the other point of view, i.e. not trying to explain how you could achieve what we understand of what you are trying to do, but why yield definitely can't work the way you want.
When a function contains the yield keyword it is deeply modified. It is still a callable, but not a normal function any more: it becomes a factory that returns an iterator.
From the caller's point of view there is no difference between the three implementations below (except that the yield one is so much simpler).
##########################################
print "Function iterator using yield",

def gen():
    for x in range(0, 10):
        yield x

f = gen()
try:
    while True:
        print f.next(),
except StopIteration:
    pass

for x in gen():
    print x,
print

#########################################
print "Class iterator defining iter and next",

class gen2(object):
    def __init__(self):
        self.index = 0
        self.limit = 10
    def __iter__(self):
        return self
    def next(self):
        if self.index >= self.limit:
            raise StopIteration
        self.index += 1
        return self.index - 1

f = gen2()
try:
    while True:
        print f.next(),
except StopIteration:
    pass

for x in gen2():
    print x,
print

#########################################
print "Function iterator using iter() and sentinel",

def gen3():
    def g3():
        if g3.index is None:
            g3.index = 0
        g3.index += 1
        return g3.index - 1
    g3.index = None
    return iter(g3, 10)

f = gen3()
try:
    while True:
        print f.next(),
except StopIteration:
    pass

for x in gen3():
    print x,
print
Then you should understand that yield is not so much about control flow as about keeping the call context inside variables. Once that is understood, you have to decide if the API of main_hook really wants to provide an iterator to its caller. If so, and if f may loop, then f must also be an iterator (and there should be a loop around the calls to f(), like below).
def main_hook(self, f):
    while (self.shouldContinue()):
        #do some preparations
        for v in f(self):
            yield v
        #do some tear down
But you should not care whether f() has to call inner functions g(), etc. That is completely irrelevant. You provide a lib, and it is your user's problem to call it with an appropriate iterable. If you believe your lib's users won't be able to, you will have to change the overall design.
Hope it helps.
I don't understand the whole thing either (what does the main_hook caller look like?), but I would say: throw a StopNow exception when you should stop, just like you would raise StopIteration when your generator is finished.
Here is how I understood the thing, as well as what I would do.
class StopNow(Exception):
    pass

def main_hook(self, f):
    got_stop_now_exc = False
    while (not got_stop_now_exc and self.shouldContinue()):
        #do some preparations
        try:
            f(self)
        except StopNow:
            got_stop_now_exc = True
        #do some compulsory tear down, exception or not

def stop_and_do_stuff():
    raise StopNow()

def my_f():
    if needed:
        stop_and_do_stuff()

def the_main_hook_caller():
    while i_should:
        managerthingie.main_hook(my_f)
        do_stuff()
The behavior you describe looks exactly like a simple function call. Like below.
def f(manager):
    print("Entering f")
    manager.stop_and_do_stuff()
    print("Exiting f")

class Manager(object):
    def shouldContinue(self):
        return True
    def stop_and_do_stuff(self):
        print("Manager stop and do stuff")
    def main_hook(self, f):
        while self.shouldContinue():
            print("Manager Setup")
            f(self)
            print("Manager Tear Down")
No problem if f() is provided by another user or if stop_and_do_stuff is called from some inner function. If you also want the manager to be able to unwind the stack from stop_and_do_stuff and really exit in some cases, no problem: just raise some exception from it and catch it in main_hook or in upper code.
You should be able to do from inside stop_and_do_stuff() whatever you want to do from the caller of main_hook. If not, you should explain why.
What is unclear in the question is what's happening on the caller side of main_hook() and why you would want to be able to exit the main_hook loop, but not really. Either the main_hook caller expects a generator or it does not. You need to explain that part if you want a sensible answer (some context information would also be nice, if you really explain what you are trying to do and your real restrictions - you said f is provided by some other user and main_hook is in a lib, but what about the main_hook caller? - there are probably well-known standard solutions).
I am not quite sure what exactly you are trying to achieve, so it might be better if you explained the problem more instead of proposing a solution.
From my partial understanding, why don't you do something like this:
def main_hook(self, f):
    while (self.shouldContinue()):
        #do some preparations
        stop_and_do_stuff = f(self)
        if stop_and_do_stuff:
            yield
        #do some tear down
So basically f returns a flag saying whether to stop or not, and if it says stop, we yield to the function which called main_hook; that function can then continue after doing some stuff.
e.g.
class A(object):
    def main_hook(self, f):
        while (self.shouldContinue()):
            #do some preparations
            stop = f(self)
            if stop:
                yield
            #do some tear down
    def shouldContinue(self):
        return True

def f(a):
    return True

a = A()
for x in a.main_hook(f):
    print x
