Python decorate function preserving part of signature - python

I have written the following decorator:
def partializable(fn):
def arg_partializer(*fixable_parameters):
def partialized_fn(dynamic_arg):
return fn(dynamic_arg, *fixable_parameters)
return partialized_fn
return arg_partializer
The purpose of this decorator is to break the function call into two calls. If I decorate the following:
#partializable
def my_fn(dyn, fix1, fix2):
return dyn + fix1 + fix2
I then can do:
core_accepting_dynamic_argument = my_fn(my_fix_1, my_fix_2)
final_result = core_accepting_dynamic_argument(my_dyn)
My problem is that the now decorated my_fn exhibits the following signature: my_fn(*fixable_parameters)
I want it to be: my_fn(fix1, fix2)
How can I accomplish this? I probably have to use wraps or the decorator module, but I need to preserve only part of the original signature and I don't know if that's possible.

Taking inspiration from https://stackoverflow.com/a/33112180/9204395, it's possible to accomplish this by manually altering the signature of arg_partializer, since only the signature of fn is known in the relevant scope and can be handled with inspect.
from inspect import signature
def partializable(fn):
def arg_partializer(*fixable_parameters):
def partialized_fn(dynamic_arg):
return fn(dynamic_arg, *fixable_parameters)
return partialized_fn
# Override signature
sig = signature(fn)
sig = sig.replace(parameters=tuple(sig.parameters.values())[1:])
arg_partializer.__signature__ = sig
return arg_partializer
This is not particularly elegant, but as I think about the problem I'm starting to suspect that this (or a conceptual equivalent) is the only possible way to pull this stunt. Feel free to contradict me.

Related

Make callers use functools.partial or call a factory function?

I've read several arguments comparing partial vs lambda, but most of them talked about how partial is more flexible (not limited to expressions) and gives info about the wrapped function. But I want to consider this from the caller's perspective. Here's my situation.
I have a function that takes a 1-argument modifier function. A request is passed into that modifier function to be modified:
def my_func(request, modifier):
modifier(request)
I'm also building some utilities that makes it easier to create parameterized modifier functions, e.g. adding/modifying URL params to the request. I thought of two ways of doing it, but not sure which one is better.
Option 1
def add_params(request, params):
for param in params:
# Manipulate the request with param.
This way, callers can use functools.partial to bind the params, like this:
modifier = functools.partial(add_params, params={'abc':'123'})
Option 2
def add_params(params):
def func(request):
for param in params:
# Modify request with param.
return func
Then callers use it like this:
modifier = add_params({'abc':'123'})
Question
If I don't care about function introspection, are there any downsides to using option 2? Would option 2 run into late binding issues? (Although my use case doesn't run into that). I really like how option 2 is easier for callers to use.
The two functions are completely isomorphic to each other from a mathematical perspective (though their efficiency may vary):
# Option 1
(Request, Params) -> None
# Option 2
Params -> (Request -> None)
For your purpose, I would say option 2 offers the most convenience since the function is already curried, so you can not only avoid partial but can also compose them easily:
import functools
def compose(*fs):
return functools.reduce(lambda f, g: lambda x: f(g(x)), fs)
modifier = compose(add_params({'abc':'123'}),
add_params({'def':'456'}))
If you ever want to call the function directly you can always do:
add_params({'abc':'123'})(request)
which is not really all that involved compared to option 1:
add_params(request, {'abc':'123'})
Late binding shouldn't pose an issue unless you use variables from outside the function, and if you do there's always a way to work around it.
Unfortunately option 2 has the disadvantage of being annoying to define, but this can be simplified using decorators:
def curry_request(f):
def wrapper(*args, **kwargs):
def inner(request):
f(request, *args, **kwargs)
return inner
return wrapper
#curry_request
def add_params(request, params):
# do something
def partial(func, *args, **keywords):
def newfunc(*fargs, **fkeywords):
newkeywords = keywords.copy()
newkeywords.update(fkeywords)
return func(*(args + fargs), **newkeywords)
newfunc.func = func
newfunc.args = args
newfunc.keywords = keywords
return newfunc
Compared with partial function implementation code,your option 2 is also good,i don't think it has any downsides in your situation.But functools.partial is a common way to result in a simplified signature,
if you want to ruturn a new partial function for another function,you can still invoke partial func.if you want to use option 2 model,you may need to implement a new function

cleaning up nested function calls

I have written several functions that run sequentially, each one taking as its input the output of the previous function so in order to run it, I have to run this line of code
make_list(cleanup(get_text(get_page(URL))))
and I just find that ugly and inefficient, is there a better way to do sequential function calls?
Really, this is the same as any case where you want to refactor commonly-used complex expressions or statements: just turn the expression or statement into a function. The fact that your expression happens to be a composition of function calls doesn't make any difference (but see below).
So, the obvious thing to do is to write a wrapper function that composes the functions together in one place, so everywhere else you can make a simple call to the wrapper:
def get_page_list(url):
return make_list(cleanup(get_text(get_page(url))))
things = get_page_list(url)
stuff = get_page_list(another_url)
spam = get_page_list(eggs)
If you don't always call the exact same chain of functions, you can always factor out into the pieces that you frequently call. For example:
def get_clean_text(page):
return cleanup(get_text(page))
def get_clean_page(url):
return get_clean_text(get_page(url))
This refactoring also opens the door to making the code a bit more verbose but a lot easier to debug, since it only appears once instead of multiple times:
def get_page_list(url):
page = get_page(url)
text = get_text(page)
cleantext = cleanup(text)
return make_list(cleantext)
If you find yourself needing to do exactly this kind of refactoring of composed functions very often, you can always write a helper that generates the refactored functions. For example:
def compose1(*funcs):
#wraps(funcs[0])
def composed(arg):
for func in reversed(funcs):
arg = func(arg)
return arg
return composed
get_page_list = compose1(make_list, cleanup, get_text, get_page)
If you want a more complicated compose function (that, e.g., allows passing multiple args/return values around), it can get a bit complicated to design, so you might want to look around on PyPI and ActiveState for the various existing implementations.
You could try something like this. I always like separating train wrecks(the book "Clean Code" calls those nested functions train wrecks). This is easier to read and debug. Remember you probably spend twice as long reading your code than writing it so make it easier to read. You will thank yourself later.
url = get_page(URL)
url_text = get_text(url)
make_list(cleanup(url_text))
# you can also encapsulate that into its own function
def build_page_list_from_url(url):
url = get_page(URL)
url_text = get_text(url)
return make_list(cleanup(url_text))
Options:
Refactor: implement this series of function calls as one, aptly-named method.
Look into decorators. They're syntactic sugar for 'chaining' functions in this way. E.g. implement cleanup and make_list as a decorators, then decorate get_text with them.
Compose the functions. See code in this answer.
You could shorten constructs like that with something like the following:
class ChainCalls(object):
def __init__(self, *funcs):
self.funcs = funcs
def __call__(self, *args, **kwargs):
result = self.funcs[-1](*args, **kwargs)
for func in self.funcs[-2::-1]:
result = func(result)
return result
def make_list(arg): return 'make_list(%s)' % arg
def cleanup(arg): return 'cleanup(%s)' % arg
def get_text(arg): return 'get_text(%s)' % arg
def get_page(arg): return 'get_page(%r)' % arg
mychain = ChainCalls(make_list, cleanup, get_text, get_page)
print( mychain('http://is.gd') )
Output:
make_list(cleanup(get_text(get_page('http://is.gd'))))

Replacing parts of the function code on-the-fly

Here I came up with the solution to the other question asked by me on how to remove all costly calling to debug output function scattered over the function code (slowdown was 25 times with using empty function lambda *p: None).
The solution is to edit function code dynamically and prepend all function calls with comment sign #.
from __future__ import print_function
DEBUG = False
def dprint(*args,**kwargs):
'''Debug print'''
print(*args,**kwargs)
def debug(on=False,string='dprint'):
'''Decorator to comment all the lines of the function code starting with string'''
def helper(f):
if not on:
import inspect
source = inspect.getsource(f)
source = source.replace(string, '#'+string) #Beware! Swithces off the whole line after dprint statement
with open('temp_f.py','w') as file:
file.write(source)
from temp_f import f as f_new
return f_new
else:
return f #return f intact
return helper
def f():
dprint('f() started')
print('Important output')
dprint('f() ended')
f = debug(DEBUG,'dprint')(f) #If decorator #debug(True) is used above f(), inspect.getsource somehow includes #debug(True) inside the code.
f()
The problems I see now are these:
# commets all line to the end; but there may be other statements separated by ;. This may be addressed by deleting all pprint calls in f, not commenting, still it may be not that trivial, as there may be nested parantheses.
temp_f.py is created, and then new f code is loaded from it. There should be a better way to do this without writing to hard drive. I found this recipe, but haven't managed to make it work.
if decorator is applied with special syntax used #debug, then inspect.getsource includes the line with decorator to the function code. This line can be manually removed from string, but it may lead to bugs if there are more than one decorator applied to f. I solved it with resorting to old-style decorator application f=decorator(f).
What other problems do you see here?
How can all these problems be solved?
What are upsides and downsides of this approach?
What can be improved here?
Is there any better way to do what I try to achieve with this code?
I think it's a very interesting and contentious technique to preprocess function code before compilation to byte-code. Strange though that nobody got interested in it. I think the code I gave may have a lot of shaky points.
A decorator can return either a wrapper, or the decorated function unaltered. Use it to create a better debugger:
from functools import wraps
def debug(enabled=False):
if not enabled:
return lambda x: x # Noop, returns decorated function unaltered
def debug_decorator(f):
#wraps(f)
def print_start(*args, **kw):
print('{0}() started'.format(f.__name__))
try:
return f(*args, **kw)
finally:
print('{0}() completed'.format(f.__name__))
return print_start
return debug_decorator
The debug function is a decorator factory, when called it produces a decorator function. If debugging is disabled, it simply returns a lambda that returns it argument unchanged, a no-op decorator. When debugging is enabled, it returns a debugging decorator that prints when a decorated function has started and prints again when it returns.
The returned decorator is then applied to the decorated function.
Usage:
DEBUG = True
#debug(DEBUG)
def my_function_to_be_tested():
print('Hello world!')
To reiterate: when DEBUG is set to false, the my_function_to_be_tested remains unaltered, so runtime performance is not affected at all.
Here is the solution I came up with after composing answers from another questions asked by me here on StackOverflow.
This solution don't comment anything and just deletes standalone dprint statements. It uses ast module and works with Abstract Syntax Tree, it lets us avoid parsing source code. This idea was written in the comment here.
Writing to temp_f.py is replaced with execution f in necessary environment. This solution was offered here.
Also, the last solution addresses the problem of decorator recursive application. It's solved by using _blocked global variable.
This code solves the problem asked to be solved in the question. But still, it's suggested not to be used in real projects:
You are correct, you should never resort to this, there are so many
ways it can go wrong. First, Python is not a language designed for
source-level transformations, and it's hard to write it a transformer
such as comment_1 without gratuitously breaking valid code. Second,
this hack would break in all kinds of circumstances - for example,
when defining methods, when defining nested functions, when used in
Cython, when inspect.getsource fails for whatever reason. Python is
dynamic enough that you really don't need this kind of hack to
customize its behavior.
from __future__ import print_function
DEBUG = False
def dprint(*args,**kwargs):
'''Debug print'''
print(*args,**kwargs)
_blocked = False
def nodebug(name='dprint'):
'''Decorator to remove all functions with name 'name' being a separate expressions'''
def helper(f):
global _blocked
if _blocked:
return f
import inspect, ast, sys
source = inspect.getsource(f)
a = ast.parse(source) #get ast tree of f
class Transformer(ast.NodeTransformer):
'''Will delete all expressions containing 'name' functions at the top level'''
def visit_Expr(self, node): #visit all expressions
try:
if node.value.func.id == name: #if expression consists of function with name a
return None #delete it
except(ValueError):
pass
return node #return node unchanged
transformer = Transformer()
a_new = transformer.visit(a)
f_new_compiled = compile(a_new,'<string>','exec')
env = sys.modules[f.__module__].__dict__
_blocked = True
try:
exec(f_new_compiled,env)
finally:
_blocked = False
return env[f.__name__]
return helper
#nodebug('dprint')
def f():
dprint('f() started')
print('Important output')
dprint('f() ended')
print('Important output2')
f()
Other relevant links:
Switching off debug prints

How to get a functions arguments and values from outside the function?

I have searched a little bit to try to figure this one out but didn't get a solution that I was exactly looking for.
This is my use case:
I would like to evaluate expressions from a functions/methods doc-string against the f/m's parameters and values, but from outside the function (when being called but outside execution of the function
I can't statically change the source code I am evaluating (cant write in new functionality) but dynamically changing (i.e. wrapping the function or adding attributes at run-time) is acceptable
I would prefer to stick with tools in the standard library but am willing to try external libraries if it will make the task a breeze
Here is a simple example of what I am trying to do:
def f1(a,b):
"""a==b"""
pass
def f2(f):
f_locals = "get f's args and values before f is executed"
return eval(f.__doc__,None,f_locals)
>>> f2(f1(2,2))
While I have no clue why you would want to do this, what you've described can be achieved with the inspect module. This example is as close to your original example that I can come up with.
from inspect import getcallargs
def f1(a,b):
"""a==b"""
pass
def f2(f, *f_args, **f_kwargs):
f_callargs = getcallargs(f, *f_args, **f_kwargs)
return eval(f.__doc__, None, f_callargs)
f2(f1, 2, 2)
This should output True.
Keep in mind that this assumes a great many things about the arguments and docstrings of any such functions passed to f2, not the least of which is that none of the examined functions are malicious or malformed. Why don't you want to call functions normally, and why don't you want to change functions?
Edit: As Pajton pointed out, getcallargs is more appropriate here, and removes the calls to both dict and zip. The above code has been updated to reflect this.
I'm not sure if this is what you are looking for, but here's an alternative without inspect module.
#!/usr/bin/python
# -*- coding: utf-8-unix -*-
"""
This is a sample implementation of Inline.pm (Perl) in Python.
Using #inline decorator, it is now possible to write any code
in any language in docstring, and let it compile down to executable
Python code at runtime.
For this specific example, it simply evals input docstring, so
code in docstring must be in Python as well.
"""
# Language compiler for MyLang
class MyLang:
#classmethod
def compile(self, docstring):
# For this example, this simply generates code that
# evals docstring.
def testfunc(*arg, **kw):
return eval(docstring, None, kw)
return testfunc
# #inline decorator
def inline(lang):
def decorate(func):
parm = func.__code__.co_varnames[0:func.__code__.co_argcount]
fgen = lang.compile(func.__doc__)
def wrap(*arg, **kw):
# turn all args into keyword-args
kw.update(dict(zip(parm, arg)))
return fgen(**kw)
return wrap
return decorate
#inline(MyLang)
def myadd(a, b):
"""a + b"""
print(myadd(1, 9))
print(myadd(b = 8, a = 2))
print(myadd(a = 3, b = 7))

Hashing a python function to regenerate output when the function is modified

I have a python function that has a deterministic result. It takes a long time to run and generates a large output:
def time_consuming_function():
# lots_of_computing_time to come up with the_result
return the_result
I modify time_consuming_function from time to time, but I would like to avoid having it run again while it's unchanged. [time_consuming_function only depends on functions that are immutable for the purposes considered here; i.e. it might have functions from Python libraries but not from other pieces of my code that I'd change.] The solution that suggests itself to me is to cache the output and also cache some "hash" of the function. If the hash changes, the function will have been modified, and we have to re-generate the output.
Is this possible or ridiculous?
Updated: based on the answers, it looks like what I want to do is to "memoize" time_consuming_function, except instead of (or in addition to) arguments passed into an invariant function, I want to account for a function that itself will change.
If I understand your problem, I think I'd tackle it like this. It's a touch evil, but I think it's more reliable and on-point than the other solutions I see here.
import inspect
import functools
import json
def memoize_zeroadic_function_to_disk(memo_filename):
def decorator(f):
try:
with open(memo_filename, 'r') as fp:
cache = json.load(fp)
except IOError:
# file doesn't exist yet
cache = {}
source = inspect.getsource(f)
#functools.wraps(f)
def wrapper():
if source not in cache:
cache[source] = f()
with open(memo_filename, 'w') as fp:
json.dump(cache, fp)
return cache[source]
return wrapper
return decorator
#memoize_zeroadic_function_to_disk(...SOME PATH HERE...)
def time_consuming_function():
# lots_of_computing_time to come up with the_result
return the_result
Rather than putting the function in a string, I would put the function in its own file. Call it time_consuming.py, for example. It would look something like this:
def time_consuming_method():
# your existing method here
# Is the cached data older than this file?
if (not os.path.exists(data_file_name)
or os.stat(data_file_name).st_mtime < os.stat(__file__).st_mtime):
data = time_consuming_method()
save_data(data_file_name, data)
else:
data = load_data(data_file_name)
# redefine method
def time_consuming_method():
return data
While testing the infrastructure for this to work, I'd comment out the slow parts. Make a simple function that just returns 0, get all of the save/load stuff working to your satisfaction, then put the slow bits back in.
The first part is memoization and serialization of your lookup table. That should be straightforward enough based on some python serialization library. The second part is that you want to delete your serialized lookup table when the source code changes. Perhaps this is being overthought into some fancy solution. Presumably when you change the code you check it in somewhere? Why not add a hook to your checkin routine that deletes your serialized table? Or if this is not research data and is in production, make it part of your release process that if the revision number of your file (put this function in it's own file) has changed, your release script deletes the serialzed lookup table.
So, here is a really neat trick using decorators:
def memoize(f):
cache={};
def result(*args):
if args not in cache:
cache[args]=f(*args);
return cache[args];
return result;
With the above, you can then use:
#memoize
def myfunc(x,y,z):
# Some really long running computation
When you invoke myfunc, you will actually be invoking the memoized version of it. Pretty neat, huh? Whenever you want to redefine your function, simply use "#memoize" again, or explicitly write:
myfunc = memoize(new_definition_for_myfunc);
Edit
I didn't realize that you wanted to cache between multiple runs. In that case, you can do the following:
import os;
import os.path;
import cPickle;
class MemoizedFunction(object):
def __init__(self,f):
self.function=f;
self.filename=str(hash(f))+".cache";
self.cache={};
if os.path.exists(self.filename):
with open(filename,'rb') as file:
self.cache=cPickle.load(file);
def __call__(self,*args):
if args not in self.cache:
self.cache[args]=self.function(*args);
return self.cache[args];
def __del__(self):
with open(self.filename,'wb') as file:
cPickle.dump(self.cache,file,cPickle.HIGHEST_PROTOCOL);
def memoize(f):
return MemoizedFunction(f);
What you describe is effectively memoization. Most common functions can be memoized by defining a decorator.
A (overly simplified) example:
def memoized(f):
cache={}
def memo(*args):
if args in cache:
return cache[args]
else:
ret=f(*args)
cache[args]=ret
return ret
return memo
#memoized
def time_consuming_method():
# lots_of_computing_time to come up with the_result
return the_result
Edit:
From Mike Graham's comment and the OP's update, it is now clear that values need to be cached over different runs of the program. This can be done by using some of of persistent storage for the cache (e.g. something as simple as using Pickle or a simple text file, or maybe using a full blown database, or anything in between). The choice of which method to use depends on what the OP needs. Several other answers already give some solutions to this, so I'm not going to repeat that here.

Categories

Resources