How can I serialise a recursive function? - python

Suppose I have a function that is recursive through its closure:
def outer():
    def fact(n):
        return 1 if n == 0 else n * fact(n - 1)
    return fact
I now want to serialize the function and reconstruct it using types.FunctionType:
import pickle, marshal, copyreg, types
def make_cell(value):
    return (lambda: value).__closure__[0]
def make_function(*args):
    return types.FunctionType(*args)
copyreg.pickle(types.CodeType,
    lambda code: (marshal.loads, (marshal.dumps(code),)))
copyreg.pickle(type((lambda i=0: lambda: i)().__closure__[0]),
    lambda cell: (make_cell, (cell.cell_contents,)))
copyreg.pickle(types.FunctionType,
    lambda fn: (make_function, (fn.__code__, {}, fn.__name__, fn.__defaults__, fn.__closure__)))
buf = pickle.dumps(outer())
fn = pickle.loads(buf)
This works fine for ordinary closures, but with fact it results in infinite recursion as pickle attempts to serialise fact within its closure. The usual way to handle recursive data structures in pickle is to memoise the object between construction and initialisation, but function objects are immutable, as are fn.__closure__ (a tuple), and cell objects:
>>> cell = (lambda i=0: lambda: i)().__closure__[0]
>>> cell.cell_contents = 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: attribute 'cell_contents' of 'cell' objects is not writable
Presumably the language has to do something similar when constructing recursive functions within normal code, as again the function object isn't available to place in its closure until it's been constructed. Is there some magic to building recursive functions that I'm missing?

A closure binds to a free variable, not its value. For a self-referencing closure, all Python needs to do is create a cell for the free name fact first (not yet bound to anything), create the function object with that closure, and then bind fact to that object.
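The self-reference can be observed on a live function. This is a small sketch for Python 3 that reuses outer and fact from the question and checks that the single closure cell of fact ends up holding fact itself:

```python
def outer():
    def fact(n):
        return 1 if n == 0 else n * fact(n - 1)
    return fact

fact = outer()
cell = fact.__closure__[0]

# The free variable is recorded by name on the code object ...
print(fact.__code__.co_freevars)
# ... and its cell points back at the function that owns the closure.
print(cell.cell_contents is fact)
```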
As such, you need to combine creating a closure and a function into the same outer function, such that you create a closure to the name the function is going to be bound to:
def create_closure_and_function(*args):
    func = None
    def create_function_closure():
        return func
    closure = create_function_closure.__closure__
    # args is a tuple, so append the closure as a one-element tuple
    func = types.FunctionType(*(args[:-1] + (closure,)))
    return func
To make this work with unpickling you'd have to loop over the closure argument (args[-1]) and detect where there is a recursion, and replace that one item with create_function_closure.__closure__[0], I suppose.

This is how I ended up doing it, in Python 3 using nonlocal:
def settable_cell():
    if False:
        x = None
    def set_cell(y):
        nonlocal x
        x = y
    return (lambda: x).__closure__[0], set_cell
And in Python 2 using a generator:
def settable_cell():
    def g():
        while True:
            x = (yield (lambda: x).__closure__[0])
    set_cell = iter(g()).send
    return set_cell(None), set_cell
This allows separating creating a closure cell from setting the value of its free variable; the rest of the solution just requires some fiddling with the pickle memoisation facility.
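As a rough illustration of how the settable cell breaks the cycle, the sketch below (Python 3 only) rebuilds the recursive fact from its code object without going through pickle at all. The names original and rebuilt are made up for this example, and the pickle memo handling mentioned above is deliberately left out:

```python
import types

def settable_cell():
    if False:
        x = None          # never runs; only makes x a local of settable_cell
    def set_cell(y):
        nonlocal x
        x = y
    return (lambda: x).__closure__[0], set_cell

def outer():
    def fact(n):
        return 1 if n == 0 else n * fact(n - 1)
    return fact

original = outer()

# Step 1: create an empty cell before the function exists.
cell, set_cell = settable_cell()
# Step 2: build the new function around that still-empty cell.
rebuilt = types.FunctionType(original.__code__, original.__globals__,
                             original.__name__, original.__defaults__, (cell,))
# Step 3: close the loop by pointing the cell at the new function.
set_cell(rebuilt)

print(rebuilt(5))
```

On Python 3.7 and later, cell_contents can be assigned directly, and Python 3.8 adds types.CellType, so a helper like settable_cell is mainly needed on older interpreters.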

Related

How to apply class decorator at base of all decorators on methods

I am using this way of decorating all methods
import inspect
def decallmethods(decorator, prefix='test_'):
    def dectheclass(cls):
        for name, m in inspect.getmembers(cls, inspect.ismethod):
            if name.startswith(prefix):
                setattr(cls, name, decorator(m))
        return cls
    return dectheclass
@decallmethods(login_testuser)
class TestCase(object):
    def setUp(self):
        pass
    def test_1(self):
        print "test_1()"
    def test_2(self):
        print "test_2()"
This works, but the decorator is applied on top of any other decorators the methods already have.
That is, the result is currently
@login_testuser
@other
def test_2(self):
    print "test_2()"
But I want
@other
@login_testuser
def test_2(self):
    print "test_2()"
This is most certainly a bad idea, but what you want can be done to some extent, and it is going to take a lot of time to explain. First off, rather than thinking of decorators as syntactic sugar, think of them as what they really are: a function (that is a closure) with a function that exists inside it. With that out of the way, suppose we have a function:
def operation(a, b):
    print('doing operation')
    return a + b
Simply it will do this
>>> hi = operation('hello', 'world')
doing operation
>>> print(hi)
helloworld
Now define a decorator that prints something before and after calling its inner function (equivalent to the other decorator that you want to apply later):
def other(f):
    def other_inner(*a, **kw):
        print('other start')
        result = f(*a, **kw)
        print('other finish')
        return result
    return other_inner
With that, build a new function and decorate it:
@other
def o_operation(a, b):
    print('doing operation')
    return a + b
Remember, this is basically equivalent to o_operation = other(operation).
Run this to ensure it works:
>>> r2 = o_operation('some', 'inner')
other start
doing operation
other finish
>>> print(r2)
someinner
Finally, the decorator you want should be called immediately before operation, not before o_operation, but with your existing code the result is this:
def inject(f):
    def injected(*a, **kw):
        print('inject start')
        result = f(*a, **kw)
        print('inject finish')
        return result
    return injected
@inject
@other
def i_o_operation(a, b):
    print('doing operation')
    return a + b
Run the above:
>>> i_o_operation('hello', 'foo')
inject start
other start
doing operation
other finish
inject finish
'hellofoo'
As mentioned, decorators are really closures, which is why the wrapped function is effectively stored inside them. You can reach it through the __closure__ attribute:
>>> i_o_operation.__closure__
(<cell at 0x7fc0eabd1fd8: function object at 0x7fc0eabce7d0>,)
>>> i_o_operation.__closure__[0].cell_contents
<function other_inner at 0x7fc0eabce7d0>
>>> print(i_o_operation.__closure__[0].cell_contents('a', 'b'))
other start
doing operation
other finish
ab
See how this effectively calls the function inside the injected closure directly, as if it had been unwrapped. What if that closure could be replaced with the one that did the injection? For our protection, __closure__ and cell.cell_contents are read-only. What needs to be done is to construct completely new functions with the intended closures by making use of the FunctionType function constructor (found in the types module).
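To make the idea concrete before it is applied to the decorators, here is a minimal sketch (Python 3) of building a new function from an existing code object and a hand-made closure. The helpers make_cell, wrap, one and two are illustrative names only and are not part of the walkthrough:

```python
from types import FunctionType

def make_cell(value):
    # Capture `value` in a throwaway lambda and return the resulting cell.
    return (lambda: value).__closure__[0]

def wrap(f):
    def inner(*a, **kw):
        return f(*a, **kw)
    return inner

def one():
    return 1

def two():
    return 2

w1 = wrap(one)
# Same code object as w1, but its single free variable f now refers to `two`.
w2 = FunctionType(w1.__code__, w1.__globals__, w1.__name__,
                  w1.__defaults__, (make_cell(two),))

print(w1(), w2())
```

The original w1 is left untouched; only the newly constructed function sees the replaced cell.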
Back to the problem. Since what we have now is:
i_o_operation = inject(other(operation))
And what we want is
o_i_operation = other(inject(operation))
We effectively have to somehow strip the call to other from i_o_operation and somehow wrap it around inject to produce o_i_operation. (Dragons follow after the break.)
First, construct a function that effectively calls inject(operation) by taking the closure one level deep (so that f will contain just the original operation call) and mixing it with the code produced by inject(f):
i_operation = FunctionType(
    i_o_operation.__code__,
    globals=globals(),
    closure=i_o_operation.__closure__[0].cell_contents.__closure__,
)
Since i_o_operation is the result of inject(f), we can take its code to produce a new function. The globals argument is a required formality; combine it with the closure from the nested level and the first part of the function is produced. Verify that other is not called:
>>> i_operation('test', 'strip')
inject start
doing operation
inject finish
'teststrip'
Neat. However, we still want other to be wrapped outside of this to finally produce o_i_operation. We need to somehow put this new function in a closure, and one way to do this is to create a surrogate function that produces one:
def closure(f):
    def surrogate(*a, **kw):
        return f(*a, **kw)
    return surrogate
And simply use it to construct and extract our closure:
o_i_operation = FunctionType(
    i_o_operation.__closure__[0].cell_contents.__code__,
    globals=globals(),
    closure=closure(i_operation).__closure__,
)
Call this:
>>> o_i_operation('job', 'complete')
other start
inject start
doing operation
inject finish
other finish
'jobcomplete'
Looks like we finally got what we need. This doesn't exactly solve your problem yet; it starts down the right track, but is already pretty hairy.
Now for the actual problem: a function that will ensure a decorator function is the innermost (final) callable before a given original, undecorated function, i.e. for a given target and f(g(...(callable))), we want to emulate a result that gives f(g(...(target(callable)))). This is the code:
from types import FunctionType
def strip_decorators(f):
    """
    Strip all decorators from f. Assumes each are functions with a
    closure with a first cell being the target function.
    """
    # list of not the actual decorator, but the returned functions
    decorators = []
    while f.__closure__:
        # Assume first item is the target method
        decorators.append(f)
        f = f.__closure__[0].cell_contents
    return decorators, f
def inject_decorator(decorator, f):
    """
    Inject a decorator to the most inner function within the stack of
    closures in `f`.
    """
    def closure(f):
        def surrogate(*a, **kw):
            return f(*a, **kw)
        return surrogate
    decorators, target_f = strip_decorators(f)
    result = decorator(target_f)
    while decorators:
        # pop out the last one in
        decorator = decorators.pop()
        result = FunctionType(
            decorator.__code__,
            globals=globals(),
            closure=closure(result).__closure__,
        )
    return result
To test this, we use a typical example use-case - html tags.
def italics(f):
    def i(s):
        return '<i>' + f(s) + '</i>'
    return i
def bold(f):
    def b(s):
        return '<b>' + f(s) + '</b>'
    return b
def underline(f):
    def u(s):
        return '<u>' + f(s) + '</u>'
    return u
@italics
@bold
def hi(s):
    return s
Running the test.
>>> hi('hello')
'<i><b>hello</b></i>'
Our target is to inject the underline decorator (specifically the u(hi) callable) into the most inner closure. This can be done like so, with the function we have defined above:
>>> hi_u = inject_decorator(underline, hi)
>>> hi_u('hello')
'<i><b><u>hello</u></b></i>'
Works with undecorated functions:
>>> def pp(s):
...     return s
...
>>> pp_b = inject_decorator(bold, pp)
>>> pp_b('hello')
'<b>hello</b>'
A major assumption was made for this first-cut version of the rewriter, which is that all decorators in the chain only have a closure length of one, that one element being the function being decorated with. Take this decorator for instance:
def prefix(p):
    def decorator(f):
        def inner(*args, **kwargs):
            new_args = [p + a for a in args]
            return f(*new_args, **kwargs)
        return inner
    return decorator
Example usage:
>>> @prefix('++')
... def prefix_hi(s):
...     return s
...
>>> prefix_hi('test')
'++test'
Now try to inject a bold decorator like so:
>>> prefix_hi_bold = inject_decorator(bold, prefix_hi)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 18, in inject_decorator
ValueError: inner requires closure of length 2, not 1
This is simply because the closure formed by decorator within prefix has two elements, one being the prefix string p and the second being the actual function, and inner being nested inside that expects both those to be present inside its closure. Resolving that will require more code to analyse and reconstruct the details.
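One possible direction for that extra analysis, sketched for Python 3: copy the whole closure of the wrapper and swap out only the cell that holds the wrapped function. The helpers make_cell and replace_wrapped are hypothetical names, and the rule "the first callable cell is the wrapped function" is a heuristic assumed for this example; it would pick the wrong cell for a decorator that captures other callables.

```python
from types import FunctionType

def make_cell(value):
    return (lambda: value).__closure__[0]

def replace_wrapped(wrapper, transform):
    """Copy `wrapper`, applying `transform` to the first callable in its closure."""
    cells = list(wrapper.__closure__)
    for i, cell in enumerate(cells):
        if callable(cell.cell_contents):
            # Keep every other cell (such as the prefix string) as it is.
            cells[i] = make_cell(transform(cell.cell_contents))
            break
    else:
        raise ValueError("no callable found in closure")
    return FunctionType(wrapper.__code__, wrapper.__globals__, wrapper.__name__,
                        wrapper.__defaults__, tuple(cells))

def prefix(p):
    def decorator(f):
        def inner(*args, **kwargs):
            new_args = [p + a for a in args]
            return f(*new_args, **kwargs)
        return inner
    return decorator

def bold(f):
    def b(s):
        return '<b>' + f(s) + '</b>'
    return b

@prefix('++')
def prefix_hi(s):
    return s

prefix_hi_bold = replace_wrapped(prefix_hi, bold)
print(prefix_hi_bold('test'))
```

This only handles a single wrapper level; a full rewriter would still need to walk the chain the way strip_decorators does.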
Anyway, this explanation took quite a bit of time and words, so I hope it makes sense and gets you started on the right track.
If you want to turn inject_decorator into a decorator, and/or mix it into your class decorator, best of luck, most of the hard work is already done.

How to get function object inside a function (Python)

I want to have something like
def x():
    print get_def_name()
but not necessarily know the name of x.
Ideally it would return 'x' where x would be the name of the function.
You can do this by using Python's built-in inspect library.
You can read more of its documentation if you want to handle more complicated cases, but this snippet will work for you:
from inspect import getframeinfo, currentframe
def test_func_name():
    return getframeinfo(currentframe()).function
print(test_func_name())
Functions in Python are objects, and as it happens those objects do have an attribute containing the name they were defined with:
>>> def x():
...     pass
...
>>> print x.__name__
x
So, a naïve approach might be this:
>>> def x():
...     print x.__name__
...
>>> x()
x
That seems to work. However, since you had to know the name of x inside the function in order to do that, you haven't really gained anything; you might as well have done this:
def x():
    print "x"
In fact, though, it's worse than that, because the __name__ attribute only refers to the name the function was defined with. If it gets bound to another name, it won't behave as you expect:
>>> y = x
>>> y()
x
Even worse, if the original name is no longer around, it won't work at all:
>>> del x
>>> y()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in x
NameError: global name 'x' is not defined
This second problem is one you can actually get around, although it's not pretty. The trick is to write a decorator that can pass the function's name into it as an argument:
>>> from functools import wraps
>>> def introspective(func):
...     __name__ = func.__name__
...     @wraps(func)
...     def wrapper(*args, **kwargs):
...         return func(__name__=__name__, *args, **kwargs)
...     return wrapper
...
>>> @introspective
... def x(__name__):
...     print __name__
...
>>> x()
x
>>> y = x
>>> y()
x
>>> del x
>>> y()
x
... although as you can see, you're still only getting back the name the function was defined with, not the one it happens to be bound to right now.
In practice, the short (and correct) answer is "don't do that". It's a fundamental fact of Python that objects don't know what name (or names) they're bound to - if you think your function needs that information, you're doing something wrong.
This sounds like you want to declare an anonymous function and it would return a reference to the new function object.
In Python, you can get a trivial anonymous function object with lambda but for a complex function it must have a name. But any function object is in fact an object and you can pass references around to it, so the name doesn't matter.
# lambda
sqr = lambda n: n**2
assert sqr(2) == 4
assert sqr(3) == 9
# named function
def f(n):
    return n**2
sqr = f
assert sqr(2) == 4
assert sqr(3) == 9
Note that this function does have a name, f, but the name doesn't really matter here. We set the name sqr to the function object reference and use that name. We could put the function reference into a list or other data structure if we wanted to.
You could re-use the name of the function:
def f(n):
    return n**2
sqr = f
def f(n):
    return n**3
cube = f
So, while Python doesn't really support full anonymous functions, you can get the same effect. It's not really a problem that you have to give functions a name.
If you really don't want the function to have a name, you can unbind the name:
def f(n):
    return n**2
lst = [f] # save function reference in a list
del(f) # unbind the name
Now the only way to access this function is through the list; the name of the function is gone.
I found a solution similar to Vazirani's, but went a step further to get the function object based on the name. Here is my solution:
import inspect
def named_func():
    func_name = inspect.stack()[0].function
    func_obj = inspect.stack()[1].frame.f_locals[func_name]
    print(func_name, func_obj, func_obj.xxx)
named_func.xxx = 15
named_func()
Output is
named_func <function named_func at 0x7f3bc84622f0> 15
Unfortunately I cannot do this with lambda function. I keep trying.

Can someone explain to me the difference between a Function Object and a Closure

By "Function Object", I mean an object of a class that is in some sense callable and can be treated in the language as a function. For example, in python:
class FunctionFactory:
    def __init__ (self, function_state):
        self.function_state = function_state
    def __call__ (self):
        self.function_state += 1
        return self.function_state
>>> function = FunctionFactory (5)
>>> function ()
6
>>> function ()
7
My question is - would this use of FunctionFactory and function be considered a closure?
A closure is a function that remembers the environment in which it was defined and has access to variables from the surrounding scope. A function object is an object that can be called like a function, but which may not actually be a function. Function objects are not closures:
class FunctionObject(object):
    def __call__(self):
        return foo
def f():
    foo = 3
    FunctionObject()() # raises NameError when f() is called
A FunctionObject does not have access to the scope in which it was created. However, a function object's __call__ method may be a closure:
def f():
    foo = 3
    class FunctionObject(object):
        def __call__(self):
            return foo
    return FunctionObject()
print f()() # prints 3, since __call__ has access to the scope where it was defined,
            # though it doesn't have access to the scope where the FunctionObject
            # was created
... would this use of FunctionFactory and function be considered a closure?
Not per se, since it doesn't involve scopes. Although it does mimic what a closure is capable of.
def ClosureFactory(val):
    value = val
    def closure():
        nonlocal value # 3.x only; use a mutable object in 2.x instead
        value += 1
        return value
    return closure
3>> closure = ClosureFactory(5)
3>> closure()
6
3>> closure()
7
A closure is a piece of code that closes over the environment it is defined in, capturing its variables. In most of the modern languages, functions are closures, but it is not necessarily so, and you can imagine closures that are not function objects (such as Ruby blocks, for example, which are not objects at all).
This is an essential test for a closure:
def bar():
    x = 1
    def foo():
        print x
    return foo
x = 2
bar()()
If it prints 1, foo is a closure. If it prints 2, it is not.
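For Python 3 the same test can be written with return values instead of print, and the captured variable can be inspected directly. This is only a restatement of the snippet above; the name inner is introduced here for the returned function:

```python
def bar():
    x = 1
    def foo():
        return x
    return foo

x = 2            # a global with the same name; a closure must ignore it
inner = bar()

print(inner())                             # the value seen by foo
print(inner.__code__.co_freevars)          # names foo closes over
print(inner.__closure__[0].cell_contents)  # what the cell for x holds
```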

Is there a way to refer to the current function in python?

I want a function to refer to itself. e.g. to be recursive.
So I do something like that:
def fib(n):
    return n if n <= 1 else fib(n-1)+fib(n-2)
This is fine most of the time, but fib does not actually refer to itself; it refers to the binding of fib in the enclosing block. So if for some reason fib is reassigned, it will break:
>>> foo = fib
>>> fib = foo(10)
>>> x = foo(8)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in fib
TypeError: 'int' object is not callable
How can I prevent this from happening (from inside fib), if at all possible? As far as I know, the name of fib does not exist before the function-definition is fully executed; Are there any workarounds?
I don't have a real use case where it may actually happen; I am asking out of sheer curiosity.
I'd make a decorator for this
from functools import wraps
def selfcaller(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(wrapper, *args, **kwargs)
    return wrapper
And use it like
@selfcaller
def fib(self, n):
    return n if n <= 1 else self(n-1)+self(n-2)
This is actually a readable way to define a Fixed Point Combinator (or Y Combinator):
fix = lambda g: (lambda f: g(lambda arg: f(f)(arg))) (lambda f: g(lambda arg: f(f)(arg)))
usage:
fib = fix(lambda self: lambda n: n if n <= 1 else self(n-1)+self(n-2))
or:
@fix
def fib(self):
    return lambda n: n if n <= 1 else self(n-1)+self(n-2)
The binding here happens in the formal parameter, so the problem does not arise.
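A quick way to convince yourself is to replay the rebinding scenario from the question against both variants. The sketch below restates selfcaller and fix as given above; fib_y, foo and bar are names chosen only for this check:

```python
from functools import wraps

def selfcaller(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(wrapper, *args, **kwargs)
    return wrapper

fix = lambda g: (lambda f: g(lambda arg: f(f)(arg)))(lambda f: g(lambda arg: f(f)(arg)))

@selfcaller
def fib(self, n):
    return n if n <= 1 else self(n - 1) + self(n - 2)

fib_y = fix(lambda self: lambda n: n if n <= 1 else self(n - 1) + self(n - 2))

foo, bar = fib, fib_y
fib = foo(10)      # rebind the original names to integers, as in the question
fib_y = bar(10)

print(fib, fib_y)
print(foo(8), bar(8))   # still works: recursion goes through the parameter
```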
There's no way to do what you're trying to do. You're right that fib does not exist before the function definition is executed (or, worse, it exists but refers to something completely different…), which means there is no workaround from inside fib that can possibly work.*
However, if you're willing to drop that requirement, there are workarounds that do work. For example:
def _fibmaker():
    def fib(n):
        return n if n <= 1 else fib(n-1)+fib(n-2)
    return fib
fib = _fibmaker()
del _fibmaker
Now fib is referring to the binding in the closure from the local environment of a call to _fibmaker. Of course even that can be replaced if you really want to, but it's not easy (the fib.__closure__ attribute is not writable; it's a tuple, so you can't replace any of its cells; each cell's cell_contents is a readonly attribute, …), and there's no way you're going to do it by accident.
There are other ways to do this (e.g., use a special placeholder inside fib, and a decorator that replaces the placeholder with the decorated function), and they're all about equally unobvious and ugly, which may seem to violate TOOWTDI. But in this case, the "it" is something you probably don't want to do, so it doesn't really matter.
Here's one way you can write a general, pure-python decorator for a function that uses self instead of its own name, without needing an extra self parameter to the function:
import types
def selfcaller(func):
    env = {}
    newfunc = types.FunctionType(func.__code__, globals=env)
    env['self'] = newfunc
    return newfunc
@selfcaller
def fib(n):
    return n if n <= 1 else self(n-1)+self(n-2)
Of course this won't work on a function that has any free variables that are bound from globals, but you can fix that with a bit of introspection. And, while we're at it, we can also remove the need to use self inside the function's definition:
def selfcaller(func):
    env = dict(func.__globals__)
    newfunc = types.FunctionType(func.__code__, globals=env)
    env[func.__code__.co_name] = newfunc
    return newfunc
This is Python 3.x-specific; some of the attribute names are different in 2.x, but otherwise it's the same.
This still isn't 100% fully general. For example, if you want to be able to use it on methods so they can still call themselves even if the class or object redefines their name, you need slightly different tricks. And there are some pathological cases that might require building a new CodeType out of func.__code__.co_code. But the basic idea is the same.
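One of the gaps is easy to close: the version above drops default arguments because only the code object and globals are passed along. A slightly extended sketch for Python 3 is shown below; countdown and alias are made-up names used to exercise it, and the copied globals remain a snapshot, so module-level names defined after decoration are not visible to the new function.

```python
import types

def selfcaller(func):
    env = dict(func.__globals__)            # private snapshot of the globals
    newfunc = types.FunctionType(
        func.__code__, env, func.__name__, func.__defaults__, func.__closure__)
    newfunc.__kwdefaults__ = func.__kwdefaults__
    env[func.__code__.co_name] = newfunc    # the name now resolves to newfunc
    return newfunc

@selfcaller
def countdown(n, step=1):
    return [n] if n <= 0 else [n] + countdown(n - step, step)

alias = countdown
countdown = None        # rebinding the module-level name no longer matters

print(alias(3))
print(alias(4, step=2))
```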
* As far as Python is concerned, until the name is bound, it doesn't exist… but obviously, under the covers, the interpreter has to know the name of the function you're defining. And at least some interpreters offer non-portable ways to get at that information.
For example, in CPython 3.x, you can very easily get the name of the function currently being defined—it's just sys._getframe().f_code.co_name.
Of course this won't directly do you any good, because nothing (or the wrong thing) is bound to that name. But notice that f_code in there. That's the current frame's code object. Of course you can't call a code object directly, but you can do so indirectly, either by generating a new function out of it, or by using bytecodehacks.
For example:
import sys, types
def fib2(n):
    f = sys._getframe()
    fib2 = types.FunctionType(f.f_code, globals=globals())
    return n if n<=1 else fib2(n-1)+fib2(n-2)
Again, this won't handle every pathological case… but the only way I can think of to do so is to actually keep a circular reference to the frame, or at least its globals (e.g., by passing globals=f.f_globals), which seems like a very bad idea.
See Frame Hacks for more clever things you can do.
Finally, if you're willing to step out of Python entirely, you can create an import hook that preprocesses or compiles your code from a Python custom-extended with, say, defrec into pure Python and/or bytecode.
And if you're thinking "But that sounds like it would be a lot nicer as a macro than as a preprocessor hack, if only Python had macros"… then you'll probably prefer to use a preprocessor hack that gives Python macros, like MacroPy, and then write your extensions as macros.
Like abamert said "..there is no way around the problem from inside ..".
Here's my approach:
def fib(n):
    def fib(n):
        return n if n <= 1 else fib(n-1)+fib(n-2)
    return fib(n)
Someone asked me for a macro based solution for this, so here it is:
# macropy/my_macro.py
from macropy.core.macros import *
macros = Macros()
@macros.decorator()
def recursive(tree, **kw):
    tree.decorator_list = []
    wrapper = FunctionDef(
        name=tree.name,
        args=tree.args,
        body=[],
        decorator_list=tree.decorator_list
    )
    return_call = Return(
        Call(
            func = Name(id=tree.name),
            args = tree.args.args,
            keywords = [],
            starargs = tree.args.vararg,
            kwargs = tree.args.kwarg
        )
    )
    return_call = parse_stmt(unparse_ast(return_call))[0]
    wrapper.body = [tree, return_call]
    return wrapper
This can be used as follows:
>>> import macropy.core.console
0=[]=====> MacroPy Enabled <=====[]=0
>>> from macropy.my_macro import macros, recursive
>>> @recursive
... def fib(n):
...     return n if n <= 1 else fib(n-1)+fib(n-2)
...
>>> foo = fib
>>> fib = foo(10)
>>> x = foo(8)
>>> x
21
It basically does exactly the wrapping that hus787 gave:
Create a new statement which does return fib(...), which uses the argument list of the original function as the ...
Create a new def, with the same name, same args, same decorator_list as the old one
Place the old function, followed by the return statement, in the body of the new function def
Strip the original function of its decorators (I assume you'd want to decorate the wrapper instead)
The parse_stmt(unparse_ast(return_call))[0] rubbish is a quick hack to get stuff to work (you actually can't just copy the argument AST from the param list of the function and use them in a Call AST) but that's just detail.
To show that it's actually doing that, you can add a print unparse_ast statement to see what the transformed function looks like:
@macros.decorator()
def recursive(tree, **kw):
    ...
    print unparse_ast(wrapper)
    return wrapper
which, when run as above, prints
def fib(n):
    def fib(n):
        return (n if (n <= 1) else (fib((n - 1)) + fib((n - 2))))
    return fib(n)
Looks like exactly what you want! It should work for any function, with multiple args, kwargs, defaults, etc., but I'm too lazy to test. Working with the AST is a bit verbose, and MacroPy is still super-experimental, but I think it's pretty neat.

Is it possible to assign return value from two higher order functions to the same variable in Python?[see details]

There is a tutorial at http://pythonprogramming.jottit.com/functional_programming and it gives an example how to use higher order functions to return functions:
def trace(f):
    f.indent = 0
    def g(x):
        print '| ' * f.indent + '|--', f.__name__, x
        f.indent += 1
        value = f(x)
        print '| ' * f.indent + '|--', 'return', repr(value)
        f.indent -= 1
        return value
    return g
and
def memoize(f):
    cache = {}
    def g(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]
    return g
but I don't get how it's able to assign two functions on the same variable on the statements:
fib = trace(fib)
fib = memoize(fib)
print fib(4)
both trace and memoize seem to have effect on the last call. Why is that?
What you've written is very similar to
fib2 = memoize(trace(fib))
print fib2(4)
because you have changed which function the variable fib points to after the call to trace, so memoize is applied to the tracing version (and then fib is "overwritten" again).
If you want to have a tracing version and a memoized version separately, you need to assign their results to different variables, e.g.:
fib_trace = trace(fib)
fib_memo = memoize(fib)
print fib_trace(4), fib_memo(4)
Both trace() and memoize() create a new function object and return it to you.
In each case, the new function object "wraps" the old function object, so the original function is not lost.
Using my amazing ASCII art skills, here is a diagram:
f(x) # this is your original function
trace(f(x)) # trace "wraps" it and returns a wrapped object
memoize(trace(f(x))) # memoize "wraps" it and returns a new wrapped function object
We start out with a function object bound to the name fib.
Then we call trace(fib) which creates a new function object. When it is first created, its name is g but we then bind it to the name fib. Try printing fib.__name__.
Then we call memoize(fib) which creates a new function object. Again it's first created with the name of g but then bound to the name fib.
Remember, in Python everything is an object, and objects can exist with no name, with one name, or with many names. In this case, we keep re-using the name fib but we keep re-binding it with different function objects.
It's no different than:
a = a + 2
a = a + 5
print a
Just as a will have increased by 7, fib will have had both decorators applied to it.
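The stacking can also be checked mechanically. The sketch below is a Python 3 variant in which trace records its argument in a list instead of printing (the calls list is an addition made for this check). Because the recursive calls inside fib look up the rebound global name, they pass through the memoizing wrapper too, so each argument reaches the tracing layer only once:

```python
calls = []

def trace(f):
    def g(x):
        calls.append(x)      # stand-in for the printing done by the tutorial's trace
        return f(x)
    return g

def memoize(f):
    cache = {}
    def g(x):
        if x not in cache:
            cache[x] = f(x)
        return cache[x]
    return g

def fib(n):
    return n if n <= 1 else fib(n - 1) + fib(n - 2)

fib = trace(fib)     # fib now names the tracing wrapper
fib = memoize(fib)   # fib now names the memoizing wrapper around the tracer

print(fib(10))
print(sorted(calls))
```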
