Python infinite recursion with generators

I'm trying to wrap my head around Python generators and as a result, I'm trying to print infinitely nested objects using yield, but I find that I still get problems with blowing out the stack. Ideally, I'd like to be able to yield and print each item as it comes along, but I can't figure out what I'm doing wrong:
class Parent:
    def __init__(self, name, child=None):
        self._name = name
        self._child = child

    def get_name(self):
        return self._name

    def get_child(self):
        return self._child

    def set_child(self, child):
        self._child = child

    def __iter__(self):
        next_child = self._child.get_child()
        if not next_child:
            raise StopIteration
        else:
            self._child = next_child
            yield next_child

    def __str__(self):
        return "%s has %s" % (self._name, self._child)

if __name__ == '__main__':
    p1 = Parent("child")
    p2 = Parent("child", p1)
    p1.set_child(p2)

    for t in p1:
        print t

The error in your code, as noted by jonrsharpe, is due to the __str__ function, which tries to return:
child has child has child has child has child has ...
You probably mean:
def __str__(self):
    return "%s has %s" % (self._name, self._child.get_name())
    # return 'child has child'
Also, __iter__ should be a generator function. Generator functions need to contain a loop in order to continually produce values. So it should be something like:
def __iter__(self):
    next_child = self._child.get_child()
    while next_child:
        yield next_child
        next_child = next_child.get_child()
    # When the function ends, it will automatically raise StopIteration
With the modifications, your code prints endless lines of child has child.
See also What does the yield keyword do in Python? for more information about generator functions.
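Putting both fixes together gives a version you can run end to end. A sketch of the corrected class (itertools.islice caps the otherwise endless iterator at a few items):

import itertools

class Parent:
    def __init__(self, name, child=None):
        self._name = name
        self._child = child

    def get_name(self):
        return self._name

    def get_child(self):
        return self._child

    def set_child(self, child):
        self._child = child

    def __iter__(self):
        # generator function: loop and yield instead of raising StopIteration
        next_child = self._child.get_child()
        while next_child:
            yield next_child
            next_child = next_child.get_child()

    def __str__(self):
        # go only one level deep, so printing terminates
        return "%s has %s" % (self._name, self._child.get_name())

p1 = Parent("child")
p2 = Parent("child", p1)
p1.set_child(p2)

for t in itertools.islice(p1, 3):  # take 3 items instead of looping forever
    print(t)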

The infinite recursion is happening in the __str__ method; it has nothing to do with the __iter__ method.
When you do print t, it executes t._child.__str__, which in turn executes t._child._child.__str__, and so forth.
Try changing the __str__ definition to something simple like return self._name and you won't get a maximum recursion depth exceeded error.
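The cycle is easy to reproduce in isolation; here is a minimal sketch (with a hypothetical Node class) of the same mutual __str__ recursion:

class Node:
    def __init__(self, other=None):
        self.other = other

    def __str__(self):
        # "%s" implicitly calls str(self.other), which calls str() on its
        # own other, and so on around the cycle
        return "Node(%s)" % self.other

a = Node()
b = Node(a)
a.other = b
# str(a) -> str(b) -> str(a) -> ... maximum recursion depth exceeded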

Related

How to eliminate recursion in Python function containing control flow

I have a function of the form:
def my_func(my_list):
    for i, thing in enumerate(my_list):
        my_val = another_func(thing)
        if i == 0:
            # do some stuff
        else:
            if my_val == something:
                return my_func(my_list[:-1])
            # do some other stuff
The recursive part is getting called enough that I am getting a RecursionError, so I am trying to replace it with a while loop as explained here, but I can't work out how to reconcile this with the control flow statements in the function. Any help would be gratefully received!
There may be a good exact answer, but the most general (or maybe quick-and-dirty) way to switch from recursion to iteration is to manage the stack yourself, as sketched below: just do manually what the programming language does implicitly, and have your own unlimited stack.
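As a hedged illustration of that idea (a hypothetical tree-sum, not the asker's function), push the pending work onto a plain list instead of letting each call push a stack frame:

def sum_tree(root):
    total = 0
    stack = [root]                   # each entry plays the role of a call frame
    while stack:
        node = stack.pop()
        total += node.value          # assumes nodes carry a .value
        stack.extend(node.children)  # and a .children list
    return total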
In this particular case there is tail recursion. You see, the result of the recursive my_func call is not used by the caller in any way; it is immediately returned. What happens in the end is that the deepest recursive call's result bubbles up and is returned as it is. This is what makes @outoftime's solution possible. We are only interested in the into-recursion pass, as the return-from-recursion pass is trivial, so the into-recursion pass is replaced with iteration.
def my_func(my_list):
    while True:
        for i, thing in enumerate(my_list):
            my_val = another_func(thing)
            if i == 0:
                # do some stuff
            else:
                if my_val == something:
                    my_list = my_list[:-1]
                    break  # restart the scan on the shortened list
                # do some other stuff
        else:
            # the for loop ran to completion without "recursing",
            # so there is nothing left to do
            break
This is an iterative method.
Decorator
class TailCall(object):
    def __init__(self, __function__):
        self.__function__ = __function__
        self.args = None
        self.kwargs = None
        self.has_params = False

    def __call__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs
        self.has_params = True
        return self

    def __handle__(self):
        if not self.has_params:
            raise TypeError
        if type(self.__function__) is TailCaller:
            return self.__function__.call(*self.args, **self.kwargs)
        return self.__function__(*self.args, **self.kwargs)

class TailCaller(object):
    def __init__(self, call):
        self.call = call

    def __call__(self, *args, **kwargs):
        ret = self.call(*args, **kwargs)
        while type(ret) is TailCall:
            ret = ret.__handle__()
        return ret

@TailCaller
def factorial(n, prev=1):
    if n < 2:
        return prev
    return TailCall(factorial)(n - 1, n * prev)
To use this decorator, simply wrap your function with the @TailCaller decorator and return a TailCall instance initialized with the required params.
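As a quick sanity check (an assumed test, not from the original answer), the plain recursive version would exceed CPython's default recursion limit, while the trampolined one runs at constant stack depth:

import sys

print(sys.getrecursionlimit())  # typically 1000; CPython does no tail-call optimization
print(factorial(5000) % 10)     # works with @TailCaller: the trampoline loops instead of recursing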
I'd like to say thank you for the inspiration to @o2genum and to Kyle Miller, who wrote an excellent article about this problem.
However desirable it is to remove this limitation, you should probably be aware of why this feature is not officially supported in Python.

How to use the iterator class?

I'm trying to create an iterator class that will give me a path through a tree graph, where every iteration returns the next step according to certain conditions.
So I looked up how to do this here: Build a Basic Python Iterator
and this is what I wrote so far:
def travel_path_iterator(self, article_name):
    return Path_Iter(article_name)

class Path_Iter:
    def __init__(self, article):
        self.article = article

    def __iter__(self):
        return next(self)

    def __next__(self):
        answer = self.article.get_max_out_nb()
        if answer != self.article.get_name():
            return answer
        else:
            raise StopIteration
But I have a problem calling this class.
My output is always:
<__main__.Path_Iter object at 0x7fe94049fc50>
Any guesses what I'm doing wrong?
While Path_Iter is already an iterator, the __iter__ method should return self:
def __iter__(self):
    return self
Next, to iterate an iterator, you need some kind of loop. E.g. to print the contents, you could convert the iterator to a list:
print list(xyz.travel_path_iterator(article_name))
Using a generator function:
def travel_path_generator(article):
    while True:
        answer = article.get_max_out_nb()
        if answer == article.get_name():
            break
        else:
            yield answer
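A hedged usage sketch, assuming a start_article object that has the get_max_out_nb() and get_name() methods from the question:

for step in travel_path_generator(start_article):
    print(step)  # each step along the path, until the stop condition hits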

Returning an Object (class) in Parallel Python

I have created a function that takes a value, does some calculations and returns the different answers as an object. However, when I try to parallelize the code using pp, I get the following error.
File "trmm.py", line 8, in getattr
return self.header_array[name]
RuntimeError: maximum recursion depth exceeded while calling a Python object
Here is a simple version of what I am trying to do.
class DataObject(object):
    """
    Class to handle data objects with several arrays.
    """
    def __getattr__(self, name):
        try:
            return self.header_array[name]
        except KeyError:
            try:
                return self.line[name]
            except KeyError:
                raise AttributeError("%s instance has no attribute '%s'" % (self.__class__.__name__, name))

    def __setattr__(self, name, value):
        if name in ('header_array', 'line'):
            object.__setattr__(self, name, value)
        elif name in self.line:
            self.line[name] = value
        else:
            self.header_array[name] = value

class TrmmObject(DataObject):
    def __init__(self):
        DataObject.__init__(self)
        self.header_array = {
            'header': None
        }
        self.line = {
            'longitude': None,
            'latitude': None
        }

if __name__ == '__main__':
    import pp

    ppservers = ()
    job_server = pp.Server(2, ppservers=ppservers)

    def get_monthly_values(value):
        tplObj = TrmmObject()
        tplObj.longitude = value
        tplObj.latitude = value * 2
        return tplObj

    job1 = job_server.submit(get_monthly_values, (5,), (DataObject, TrmmObject,), ("numpy",))
    result = job1()
If I change return tplObj to return [tplObj.longitude, tplObj.latitude] there is no problem. However, as I said before, this is a simplified version; in reality this change would complicate the program a lot.
I am very grateful for any help.
You almost never need to use __getattr__ and __setattr__, and it almost always ends up with something blowing up; infinite recursion is a typical effect of that. I can't really see any reason for using them here either. (The recursion here most likely happens when pp unpickles the returned object: the instance dictionary is empty at that point, so looking up self.header_array triggers __getattr__, which itself looks up self.header_array, and so on.) Be explicit and use the line and header_array dictionaries directly.
If you want a function that looks up a value over all arrays, create a function for that and call it explicitly. Calling the function __getitem__ and using [] is explicit. :-)
(And please don't call a dictionary "header_array"; it's confusing.)
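A minimal sketch of the explicit style this answer suggests (an assumed rewrite, not the original trmm.py code): one __getitem__ that searches both dictionaries, and plain dictionary assignment instead of __setattr__ tricks. Because there is no custom __getattr__, such an object also pickles and unpickles cleanly when pp ships it between processes.

class DataObject(object):
    def __init__(self):
        self.header_array = {'header': None}
        self.line = {'longitude': None, 'latitude': None}

    def __getitem__(self, name):
        # explicit lookup across both dictionaries
        if name in self.header_array:
            return self.header_array[name]
        if name in self.line:
            return self.line[name]
        raise KeyError(name)

obj = DataObject()
obj.line['longitude'] = 5   # explicit assignment, no __setattr__ magic
print(obj['longitude'])     # explicit lookup via []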

getting test class instance in nose test object to sync up with instance in decorated generator target

I've been working on a way to get tests produced from a generator in nose to have descriptions that are customized for the specific iteration being tested. I have something that works, as long as my generator target method never tries to access self from my generator class. I'm seeing that all my generator target instances share a common test class instance, while nose generates a one-offed instance of the test class for each test run from the generator. This results in setUp being run on each test instance nose creates, but never on the instance the generator target is bound to (of course, the real problem is that I can't see how to bind the nose-created instance to the generator target).

Here's the code I'm using to try to figure this all out (yes, I know the decorator would probably be better as a callable class, but nose, at least version 1.2.1 that I have, explicitly checks that tests are either functions or methods, so a callable class won't run at all):
import inspect

def labelable_yielded_case(case):
    argspec = inspect.getargspec(case)
    if argspec.defaults is not None:
        defaults_list = [''] * (len(argspec.args) - len(argspec.defaults)) + list(argspec.defaults)
    else:
        defaults_list = [''] * len(argspec.args)
    argument_defaults_list = zip(argspec.args, defaults_list)
    case_wrappers = []

    def add_description(wrapper_id, argument_dict):
        case_wrappers[wrapper_id].description = case.__doc__.format(**argument_dict)

    def case_factory(*factory_args, **factory_kwargs):
        def case_wrapper_wrapper():
            wrapper_id = len(case_wrappers)

            def case_wrapper(*args, **kwargs):
                args = factory_args + args
                argument_list = []
                for argument in argument_defaults_list:
                    argument_list.append(list(argument))
                for index, value in enumerate(args):
                    argument_list[index][1] = value
                argument_dict = dict(argument_list)
                argument_dict.update(factory_kwargs)
                argument_dict.update(kwargs)
                add_description(wrapper_id, argument_dict)
                return case(*args, **kwargs)

            case_wrappers.append(case_wrapper)
            case_wrapper.__name__ = case.__name__
            return case_wrapper
        return case_wrapper_wrapper()
    return case_factory

class TestTest(object):
    def __init__(self):
        self.data = None

    def setUp(self):
        print 'setup', self
        self.data = (1, 2, 3)

    def test_all(self):
        for index, value in enumerate((1, 2, 3)):
            yield self.validate_equality(), index, value

    def test_all_again(self):
        for index, value in enumerate((1, 2, 3)):
            yield self.validate_equality_again, index, value

    @labelable_yielded_case
    def validate_equality(self, index, value):
        '''element {index} equals {value}'''
        print 'test', self
        assert self.data[index] == value, 'expected %d got %d' % (value, self.data[index])

    def validate_equality_again(self, index, value):
        print 'test', self
        assert self.data[index] == value, 'expected %d got %d' % (value, self.data[index])
    validate_equality_again.description = 'again'
When run through nose, the again tests work just fine, but the set of tests using the decorated generator target all fail because self.data is None (because setUp is never run because the instance of TestTest stored in the closures is not the instances run by nose). I tried making the decorator an instance member of a base class for TestTest, but then nose threw errors about having too few arguments (no self) passed to the unbound labelable_yielded_case. Is there any way I can make this work (short of hacking nose), or am I stuck choosing between either not being able to have the yield target be an instance member or not having per-test labeling for each yielded test?
Fixed it (at least for the case here, though I think I got it for all cases). I had to fiddle with case_wrapper_wrapper and case_wrapper to get the factory to return the wrapped cases attached to the correct class, but not bound to any given instance in any way. I also had another code issue because I was building the argument dict in wrapper wrapper, but then not passing it to the case. Working code:
import inspect

def labelable_yielded_case(case):
    argspec = inspect.getargspec(case)
    if argspec.defaults is not None:
        defaults_list = [''] * (len(argspec.args) - len(argspec.defaults)) + list(argspec.defaults)
    else:
        defaults_list = [''] * len(argspec.args)
    argument_defaults_list = zip(argspec.args, defaults_list)
    case_wrappers = []

    def add_description(wrapper_id, argument_dict):
        case_wrappers[wrapper_id].description = case.__doc__.format(**argument_dict)

    def case_factory(*factory_args, **factory_kwargs):
        def case_wrapper_wrapper():
            wrapper_id = len(case_wrappers)

            def case_wrapper(*args, **kwargs):
                argument_list = []
                for argument in argument_defaults_list:
                    argument_list.append(list(argument))
                for index, value in enumerate(args):
                    argument_list[index][1] = value
                argument_dict = dict(argument_list)
                argument_dict.update(kwargs)
                add_description(wrapper_id, argument_dict)
                return case(**argument_dict)

            case_wrappers.append(case_wrapper)
            case_name = case.__name__ + str(wrapper_id)
            case_wrapper.__name__ = case_name
            if factory_args:
                setattr(factory_args[0].__class__, case_name, case_wrapper)
                return getattr(factory_args[0].__class__, case_name)
            else:
                return case_wrapper
        return case_wrapper_wrapper()
    return case_factory

class TestTest(object):
    def __init__(self):
        self.data = None

    def setUp(self):
        self.data = (1, 2, 3)

    def test_all(self):
        for index, value in enumerate((1, 2, 3)):
            yield self.validate_equality(), index, value

    @labelable_yielded_case
    def validate_equality(self, index, value):
        '''element {index} equals {value}'''
        assert self.data[index] == value, 'expected %d got %d' % (value, self.data[index])

How to put variables on the stack/context in Python

In essence, I want to put a variable on the stack that will be reachable by all calls below that point on the stack until the block exits. In Java I would solve this using a static thread local with support methods, which could then be accessed from methods.
Typical example: you get a request, and open a database connection. Until the request is complete, you want all code to use this database connection. After finishing and closing the request, you close the database connection.
What I need this for is a report generator. Each report consists of multiple parts, each part can rely on different calculations, and sometimes different parts rely in part on the same calculation. As I don't want to repeat heavy calculations, I need to cache them. My idea is to decorate methods with a cache decorator. The cache creates an id based on the method's name and module and its arguments, checks whether it has already calculated this in a stack variable, and executes the method if not.
I will try to clarify by showing my current implementation. What I want to do is to simplify the code for those implementing calculations.
First, I have the central cache access object, which I call MathContext:
class MathContext(object):
    def __init__(self, fn):
        self.fn = fn
        self.cache = dict()

    def get(self, calc_config):
        id = create_id(calc_config)
        if id not in self.cache:
            self.cache[id] = calc_config.execute(self)
        return self.cache[id]
The fn argument is the filename the context is created in relation to, from which the data to be calculated can be read.
Then we have the Calculation class:
class CalcBase(object):
    # named execute rather than exec, which is a reserved word in Python 2
    def execute(self, math_context):
        raise NotImplementedError
And here is a stupid Fibonacci example. None of the methods are actually recursive; they work on large sets of data instead, but it works to demonstrate how you would depend on other calculations:
class Fibonacci(CalcBase):
    def __init__(self, n):
        self.n = n

    def execute(self, math_context):
        if self.n < 2:
            return 1
        a = math_context.get(Fibonacci(self.n - 1))
        b = math_context.get(Fibonacci(self.n - 2))
        return a + b
What I want Fibonacci to be instead, is just a decorated method:
@cache
def fib(n):
    if n < 2:
        return 1
    return fib(n - 1) + fib(n - 2)
With the math_context example, when math_context goes out of scope, so do all its cached values. I want the same thing for the decorator, i.e. at point X, everything cached by @cache is dereferenced so it can be garbage-collected.
I went ahead and made something that might just do what you want. It can be used as both a decorator and a context manager:
from __future__ import with_statement

try:
    import cPickle as pickle
except ImportError:
    import pickle

class cached(object):
    """Decorator/context manager for caching function call results.

    All results are cached in one dictionary that is shared by all cached
    functions.

    To use this as a decorator:

        @cached
        def function(...):
            ...

    The results returned by a decorated function are not cleared from the
    cache until decorated_function.clear_my_cache() or cached.clear_cache()
    is called.

    To use this as a context manager:

        with cached(function) as function:
            ...
            function(...)
            ...

    The function's return values will be cleared from the cache when the
    with block ends.

    To clear all cached results, call the cached.clear_cache() class method.
    """
    _CACHE = {}

    def __init__(self, fn):
        self._fn = fn

    def __call__(self, *args, **kwds):
        key = self._cache_key(*args, **kwds)
        function_cache = self._CACHE.setdefault(self._fn, {})
        try:
            return function_cache[key]
        except KeyError:
            function_cache[key] = result = self._fn(*args, **kwds)
            return result

    def clear_my_cache(self):
        """Clear the cache for a decorated function"""
        try:
            del self._CACHE[self._fn]
        except KeyError:
            pass  # no cached results

    def __enter__(self):
        return self

    def __exit__(self, type, value, traceback):
        self.clear_my_cache()

    def _cache_key(self, *args, **kwds):
        """Create a cache key for the given positional and keyword
        arguments. pickle.dumps() is used because there could be
        unhashable objects in the arguments, but passing them to
        pickle.dumps() will result in a string, which is always hashable.

        I used this to make the cached class as generic as possible.
        Depending on your requirements, other key generating techniques may
        be more efficient.
        """
        return pickle.dumps((args, sorted(kwds.items())), pickle.HIGHEST_PROTOCOL)

    @classmethod
    def clear_cache(cls):
        """Clear everything from all functions from the cache"""
        cls._CACHE = {}

if __name__ == '__main__':
    # used as decorator
    @cached
    def fibonacci(n):
        print "calculating fibonacci(%d)" % n
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fibonacci(n - 1) + fibonacci(n - 2)

    for n in xrange(10):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))

    def lucas(n):
        print "calculating lucas(%d)" % n
        if n == 0:
            return 2
        if n == 1:
            return 1
        return lucas(n - 1) + lucas(n - 2)

    # used as context manager
    with cached(lucas) as lucas:
        for i in xrange(10):
            print 'lucas(%d) = %d' % (i, lucas(i))

    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))

    cached.clear_cache()

    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))
This question seems to be two questions:

a) sharing a db connection
b) caching/memoizing

(b) you have answered yourself.

(a) I don't quite understand why you need to put it on the stack. You can do one of these:

- use a class; the connection can be an attribute of it
- decorate all your functions so that they get a connection from a central location
- have each function explicitly use a global connection method
- create a connection and pass it around, or create a context object and pass the context around; the connection can be part of the context
- etc., etc.
You could use a global variable wrapped in a getter function:
connection = None  # module-level default so the first call doesn't raise NameError

def getConnection():
    global connection
    if connection:
        return connection
    connection = createConnection()
    return connection
"you get a request, and open a database connection.... you close the database connection."
This is what objects are for. Create the connection object, pass it to other objects, and then close it when you're done. Globals are not appropriate. Simply pass the value around as a parameter to the other objects that are doing the work.
"Each report consist of multiple parts, each part can rely on different calculations, sometimes different parts relies in part on the same calculation.... I need to cache them"
This is what objects are for. Create a dictionary with useful calculation results and pass that around from report part to report part.
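For instance, a minimal sketch of that advice (hypothetical report-part functions, not from the original answer):

def part_one(data, results):
    # store the heavy calculation once
    results['total'] = sum(data)

def part_two(data, results):
    # reuse the cached value instead of recomputing it
    return float(results['total']) / len(data)

results = {}
data = [1, 2, 3]
part_one(data, results)
print(part_two(data, results))  # 2.0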
You don't need to mess with "stack variables", "static thread local" or anything like that.
Just pass ordinary variable arguments to ordinary method functions. You'll be a lot happier.
class MemoizedCalculation(object):
    pass

class Fibonacci(MemoizedCalculation):
    def __init__(self):
        self.cache = {0: 1, 1: 1}

    def __call__(self, arg):
        if arg not in self.cache:
            self.cache[arg] = self(arg - 1) + self(arg - 2)
        return self.cache[arg]

class MathContext(object):
    def __init__(self):
        self.fibonacci = Fibonacci()
You can use it like this:
>>> mc = MathContext()
>>> mc.fibonacci(4)
5
You can define any number of calculations and fold them all into a single container object.
If you want, you can make MathContext into a formal context manager so that it works with the with statement. Add these two methods to MathContext.
def __enter__(self):
    print "Initialize"
    return self

def __exit__(self, type_, value, traceback):
    print "Release"
Then you can do this.
with MathContext() as mc:
    print mc.fibonacci(4)
At the end of the with statement, you are guaranteed that the __exit__ method was called.
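If you also want the cached values dropped when the block ends (the behaviour described in the question), __exit__ is the natural place for it. A hedged sketch of that extension (assumed, not part of the original answer):

def __exit__(self, type_, value, traceback):
    print "Release"
    self.fibonacci = Fibonacci()  # discard the cache along with the old instance
    return False                  # do not swallow exceptions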

Categories

Resources