Cleanup in a generator when it goes out of scope - python

I have a generator that must perform a clean up step even if it was never iterated through:
def gen(data):
    while True:
        item = data.get()
        if item is None:
            break
        # ...
        try:
            yield transformed_item
        except GeneratorExit:
            break
    # clean up; must happen if gen was called
    # ...
Everything works fine (i.e., clean up happens) when I call it like this:
for x in gen(data):
    # ...
or like this:
g = gen(data)
r = next(g)
# ...
But when the generator goes out of scope without anyone calling next on it, then of course it never executes any code at all, so GeneratorExit isn't raised inside it, and the clean-up doesn't happen:
g = gen(data)
# g was never used before going out of scope
del g
How can I refactor the code to guarantee the cleanup step occurs even if the generator goes out of scope before it ever had a chance to yield anything?

You could use a context manager for this. It depends how long you need to persist the generator.
class Gen(object):
    def __init__(self, data):
        self.data = data

    def __enter__(self):
        return self._gen(self.data)

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Cleanup
        print('Cleaning up')

    def _gen(self, data):
        for i in data:
            yield i
Then it would look like:
with Gen(data) as g:
    r = next(g)
EDIT:
Given the limitation that you can't force end users to use context managers, could you just wrap the generator creation in another function and "seed" the generator?
def gen(data):
    g = _gen(data)
    next(g)
    return g

def _gen(data):
    yield None
    while True:
        ...  # Rest of generator
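For illustration, here is a minimal, self-contained sketch of that seeding idea (the list data, the print, and the del are assumptions added purely for the demonstration; in CPython the del finalizes the suspended generator immediately, other implementations may defer it):

def _gen(data):
    try:
        yield None          # dummy first yield; the wrapper consumes it
        for item in data:
            yield item
    finally:
        # runs whether the generator is exhausted, closed explicitly,
        # or garbage collected after it has been started
        print("cleaning up")

def gen(data):
    g = _gen(data)
    next(g)                 # "seed": advance to the dummy yield so the frame is live
    return g

g = gen([1, 2, 3])
del g                       # prints "cleaning up" even though we never iterated

Once the generator has been started, discarding it triggers close(), which raises GeneratorExit at the suspended yield and runs the cleanup, which is exactly what the question needs.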

Related

Is it possible to write arbitrary depth delegated generators in Python?

I would like to write a class with the following interface.
class Automaton:
    """ A simple automaton class """
    def iterate(self, something):
        """ yield something and expect some result in return """
        print("Yielding", something)
        result = yield something
        print("Got \"" + result + "\" in return")
        return result

    def start(self, somefunction):
        """ start the iteration process """
        yield from somefunction(self.iterate)
        raise StopIteration("I'm done!")

def first(iterate):
    while iterate("what do I do?") != "over":
        continue

def second(iterate):
    value = yield from iterate("what do I do?")
    while value != "over":
        value = yield from iterate("what do I do?")

# A simple driving process
automaton = Automaton()
# generator = automaton.start(first)   # This one hangs
generator = automaton.start(second)    # This one runs smoothly
next_yield = generator.__next__()
for step in range(4):
    next_yield = generator.send("Continue...({})".format(step))
try:
    end = generator.send("over")
except StopIteration as excp:
    print(excp)
The idea is that Automaton will regularly yield values to the caller which will in turn send results/commands back to the Automaton.
The catch is that the decision process "somefunction" will be some user-defined function I have no control over. Which means that I can't really expect it to call the iterate method with a yield from in front. Worse, it could be that the user wants to plug some third-party function he has no control over into this Automaton class, meaning that the user might not be able to rewrite his somefunction to include yield from in front of iterate calls.
To be clear: I completely understand why using the first function hangs the automaton. I am just wondering if there is a way to alter the definition of iterate or start that would make the first function work.

Is there a Python structure that combines `while` and `with`?

I have some code that looks like this.
condition = <expression>
while condition:
    <some code>
I would like to be able to write that without having to write a separate statement to create the condition. E.g.,
while <create_condition(<expression>)>:
    <some code>
Here are two possibilities that don't work, but that would scratch my itch.
with <expression> as condition:
    <some code>
The problem with that is that it doesn't loop. If I embed a while inside the with I'm back where I started.
Define my own function to do this.
def looping_with(<expression>, <some code>):
    <define looping_with>
The problem with this is that if <some code> is passed as a lambda expression it is limited to a single expression. None of the workarounds I've seen are attractive.
If <some code> is passed as an actual def one gets a syntax error. You can't pass a function definition as an argument to another function.
One could define the function elsewhere and then pass the function. But the point of with, while, and lambda is that the code itself, not a reference to the code, is embedded in context. (The original version of my code, which is not terrible, is better than that.)
Any suggestions would be appreciated.
UPDATE: (As Dave Beazley likes to say: You're going to hate this.)
I hesitate to offer this example, but this is something like what I'm trying to do.
class Container:
    def __init__(self):
        self.value = None

class Get_Next:
    def __init__(self, gen):
        self.gen = gen

    def __call__(self, limit, container):
        self.runnable_gen = self.gen(limit, container)
        return self

    def get_next(self):
        try:
            next(self.runnable_gen)
            return True
        except StopIteration:
            return False

@Get_Next
def a_generator(limit, container):
    i = 0
    while i < limit:
        container.value = i
        yield
        i += 1

container = Container()
gen = a_generator(5, container)
while gen.get_next():
    print(container.value)
print('Done')
When run, the output is:
0
1
2
3
4
Done
P.S. Lest you think this is too far out, there is a very easy way to produce the same result. Remove the decorator from a_generator and then run:
for _ in a_generator(5, container):
    print(container.value)
print('Done')
The result is the same.
The problem is that for _ in <something> is too ugly for me.
So, what I'm really looking for is a way to get the functionality of for _ in <something> with nicer syntax. The syntax should (a) indicate that we are establishing a context and (b) looping within that context. Hence the request for a combination of with and while.
You could use a context manager class to help with something like that:
class Condition:
    def __init__(self, cond):
        self.cond = cond

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

    def __call__(self, *args, **kwargs):
        return self.cond(*args, **kwargs)

with Condition(lambda x: x != 3) as condition:
    a = 0
    while condition(a):
        print('a:', a)
        a += 1
Output:
a: 0
a: 1
a: 2
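If the goal from the update is really to hide the for _ in ... while keeping an explicit context, one hedged variation on the question's own Get_Next idea is a small wrapper that is both a context manager and the loop driver (StepContext, counter, and the dict used as a container are made-up names for this sketch):

class StepContext:
    """Context manager around a generator; get_next() advances it one step."""
    def __init__(self, gen):
        self._gen = gen

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._gen.close()   # make sure the generator is finalized on exit

    def get_next(self):
        try:
            next(self._gen)
            return True
        except StopIteration:
            return False

def counter(limit, container):
    i = 0
    while i < limit:
        container['value'] = i
        yield
        i += 1

state = {}
with StepContext(counter(3, state)) as stepper:
    while stepper.get_next():
        print('value:', state['value'])

This keeps the "establish a context, then loop within it" shape the update asks for, at the cost of one extra wrapper class.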

Decorator over a nose test case that yields

I have the following decorator that is supposed to wrap the implementation of test case functions within a try/except block and print the log if an exception occurs.
from functools import wraps

def print_log(test_case):
    @wraps(test_case)
    def run_test(self):
        try:
            test_case(self)
        except:
            Test_Loop.failure_teardown(self)
            raise
    return run_test
This, however, does not seem to work on one of my test cases, which is a yield-based generator test.
Please bear with me as this is a basic example:
class Test_Loop:
    # ton of implementation here (e.g. initialization, etc)

    def runIt(self, name, ip, port):
        # code here

    @print_log
    def test_log_looper(self):
        for l in self.links:
            # initialize variables seen below and other stuff
            for n in names:
                # do stuff
                for i in ips:
                    # do stuff
                    for p in ports:
                        yield self.runIt, l, n, i, p
From debugging, when the decorator is applied, it seems that it does not even enter the first loop. What am I doing wrong?
You need to iterate over your generator. Calling a generator function only creates the generator object; none of its body runs until something iterates it, which is why the decorated test never enters the first loop. Modify your decorator like this:
def print_log(test_case):
    @wraps(test_case)
    def run_test(self):
        try:
            for _ in test_case(self): pass
        except:
            Test_Loop.failure_teardown(self)
            raise
    return run_test

Python: Using the same function as both a generator and a regular function

Is it possible to use a function as both a regular function (that does stuff and returns) and a generator function?
Suppose I have the following code:
def __init__(self):
    self.make_generator = False

def foo(self):
    gen = (x for x in my_list)
    for x in gen:
        # Do some stuff
        if self.make_generator:
            yield
    # This return will raise a StopIteration
    return
The idea here was that if self.make_generator is False, then I don't want foo to yield; I want it to not be a generator so that I can call it like:
foo()
And if I wanted it to yield between iterations, I'd make it a generator like so:
def __init__(self):
    self.make_generator = True
    self.foo_gen = self.foo()

def run(self):
    while True:
        self.foo_gen.next()
But it seems that this is not possible. I want to keep my code dry but am not sure how I can control when foo is a generator and when it executes like a regular function.
Thanks!
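For what it's worth, one common way to keep this DRY (a sketch only, not from the thread; Worker and items are made-up names) is to always write the body as a generator and expose a thin non-generator wrapper that simply exhausts it when ordinary call semantics are wanted:

class Worker:
    def __init__(self, items):
        self.items = items

    def foo_gen(self):
        # single implementation, always a generator; yields control after each item
        for x in self.items:
            # ... do some stuff with x ...
            yield

    def foo(self):
        # plain-function flavour: just drain the generator to completion
        for _ in self.foo_gen():
            pass

w = Worker([1, 2, 3])
w.foo()                  # runs straight through like a regular function
stepper = w.foo_gen()    # or drive it step by step
next(stepper)

The logic lives in one place, and the make_generator flag is no longer needed.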

How to put variables on the stack/context in Python

In essence, I want to put a variable on the stack that will be reachable by all calls below that point on the stack until the block exits. In Java I would solve this using a static thread local with support methods, which could then be accessed from methods.
Typical example: you get a request, and open a database connection. Until the request is complete, you want all code to use this database connection. After finishing and closing the request, you close the database connection.
What I need this for is a report generator. Each report consists of multiple parts, each part can rely on different calculations, and sometimes different parts rely in part on the same calculation. Since I don't want to repeat heavy calculations, I need to cache them. My idea is to decorate methods with a cache decorator. The cache creates an id based on the method's name, module, and arguments, checks whether this has already been calculated in a stack variable, and executes the method only if not.
I will try to clarify by showing my current implementation. What I want to do is simplify the code for those implementing calculations.
First, I have the central cache access object, which I call MathContext:
class MathContext(object):
    def __init__(self, fn):
        self.fn = fn
        self.cache = dict()

    def get(self, calc_config):
        id = create_id(calc_config)
        if id not in self.cache:
            self.cache[id] = calc_config.exec(self)
        return self.cache[id]
The fn argument is the filename the context is created in relation to; data can be read from it for the calculations.
Then we have the Calculation class:
class CalcBase(object):
    def exec(self, math_context):
        raise NotImplementedError
And here is a stupid Fibonacci example. None of the real methods are actually recursive; they work on large sets of data instead, but it serves to demonstrate how you would depend on other calculations:
class Fibonacci(CalcBase):
    def __init__(self, n):
        self.n = n

    def exec(self, math_context):
        if self.n < 2:
            return 1
        a = math_context.get(Fibonacci(self.n - 1))
        b = math_context.get(Fibonacci(self.n - 2))
        return a + b
What I want Fibonacci to be instead, is just a decorated method:
@cache
def fib(n):
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)
With the math_context example, when math_context goes out of scope, so do all its cached values. I want the same thing for the decorator, i.e. at point X, everything cached by @cache is dereferenced so it can be garbage collected.
I went ahead and made something that might just do what you want. It can be used as both a decorator and a context manager:
from __future__ import with_statement

try:
    import cPickle as pickle
except ImportError:
    import pickle

class cached(object):
    """Decorator/context manager for caching function call results.
    All results are cached in one dictionary that is shared by all cached
    functions.

    To use this as a decorator:
        @cached
        def function(...):
            ...
    The results returned by a decorated function are not cleared from the
    cache until decorated_function.clear_my_cache() or cached.clear_cache()
    is called.

    To use this as a context manager:
        with cached(function) as function:
            ...
            function(...)
            ...
    The function's return values will be cleared from the cache when the
    with block ends.

    To clear all cached results, call the cached.clear_cache() class method.
    """
    _CACHE = {}

    def __init__(self, fn):
        self._fn = fn

    def __call__(self, *args, **kwds):
        key = self._cache_key(*args, **kwds)
        function_cache = self._CACHE.setdefault(self._fn, {})
        try:
            return function_cache[key]
        except KeyError:
            function_cache[key] = result = self._fn(*args, **kwds)
            return result

    def clear_my_cache(self):
        """Clear the cache for a decorated function"""
        try:
            del self._CACHE[self._fn]
        except KeyError:
            pass  # no cached results

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.clear_my_cache()

    def _cache_key(self, *args, **kwds):
        """Create a cache key for the given positional and keyword
        arguments. pickle.dumps() is used because there could be
        unhashable objects in the arguments, but passing them to
        pickle.dumps() will result in a string, which is always hashable.
        I used this to make the cached class as generic as possible. Depending
        on your requirements, other key generating techniques may be more
        efficient.
        """
        return pickle.dumps((args, sorted(kwds.items())), pickle.HIGHEST_PROTOCOL)

    @classmethod
    def clear_cache(cls):
        """Clear everything from all functions from the cache"""
        cls._CACHE = {}

if __name__ == '__main__':
    # used as decorator
    @cached
    def fibonacci(n):
        print "calculating fibonacci(%d)" % n
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fibonacci(n - 1) + fibonacci(n - 2)

    for n in xrange(10):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))

    def lucas(n):
        print "calculating lucas(%d)" % n
        if n == 0:
            return 2
        if n == 1:
            return 1
        return lucas(n - 1) + lucas(n - 2)

    # used as context manager
    with cached(lucas) as lucas:
        for i in xrange(10):
            print 'lucas(%d) = %d' % (i, lucas(i))

    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))

    cached.clear_cache()

    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))
This question seems to be two questions:
a) sharing a db connection
b) caching/memoizing

b) you have answered yourself.

a) I don't quite understand why you need to put it on the stack. You can do one of these:
- you can use a class, and the connection could be an attribute of it
- you can decorate all your functions so that they get a connection from a central location
- each function can explicitly use a global connection method
- you can create a connection and pass it around, or create a context object and pass that around; the connection can be part of the context (a rough sketch of this option follows below)
- etc., etc.
You could use a global variable wrapped in a getter function:
connection = None

def getConnection():
    global connection
    if connection:
        return connection
    connection = createConnection()
    return connection
"you get a request, and open a database connection.... you close the database connection."
This is what objects are for. Create the connection object, pass it to other objects, and then close it when you're done. Globals are not appropriate. Simply pass the value around as a parameter to the other objects that are doing the work.
"Each report consist of multiple parts, each part can rely on different calculations, sometimes different parts relies in part on the same calculation.... I need to cache them"
This is what objects are for. Create a dictionary with useful calculation results and pass that around from report part to report part.
You don't need to mess with "stack variables", "static thread local" or anything like that.
Just pass ordinary variable arguments to ordinary method functions. You'll be a lot happier.
class MemoizedCalculation(object):
    pass

class Fibonacci(MemoizedCalculation):
    def __init__(self):
        self.cache = {0: 1, 1: 1}

    def __call__(self, arg):
        if arg not in self.cache:
            self.cache[arg] = self(arg - 1) + self(arg - 2)
        return self.cache[arg]

class MathContext(object):
    def __init__(self):
        self.fibonacci = Fibonacci()
You can use it like this:
>>> mc = MathContext()
>>> mc.fibonacci(4)
5
You can define any number of calculations and fold them all into a single container object.
If you want, you can make the MathContext into a formal context manager so that it works with the with statement. Add these two methods to MathContext:
def __enter__(self):
    print "Initialize"
    return self

def __exit__(self, type_, value, traceback):
    print "Release"
Then you can do this.
with MathContext() as mc:
    print mc.fibonacci(4)
At the end of the with statement, you are guaranteed that the __exit__ method was called.
