Function closure vs. callable class - python

In many cases, there are two implementation choices: a closure and a callable class. For example,
class InvalidOp(Exception):
    pass
class F:
    def __init__(self, op):
        self.op = op
    def __call__(self, arg1, arg2):
        if self.op == 'mult':
            return arg1 * arg2
        if self.op == 'add':
            return arg1 + arg2
        raise InvalidOp(self.op)
f = F('add')
or
def F(op):
    if op == 'or':
        def f_(arg1, arg2):
            return arg1 | arg2
        return f_
    if op == 'and':
        def g_(arg1, arg2):
            return arg1 & arg2
        return g_
    raise InvalidOp(op)
f = F('or')
What factors should one consider in making the choice, in either direction?
I can think of two:
1. It seems a closure would always have better performance (I can't think of a counterexample).
2. I think there are cases when a closure cannot do the job (e.g., if its state changes over time).
Am I correct in these? What else could be added?

Closures are faster. Classes are more flexible (i.e. more methods available than just __call__).
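To make the "more methods than just __call__" point concrete, here is a minimal sketch. The class name RunningTotal and its reset method are made up for illustration; the idea is only that an instance can expose extra operations next to being callable, which is awkward to offer from a plain closure.

```python
class RunningTotal:
    """Hypothetical callable that accumulates values (name is illustrative)."""

    def __init__(self):
        self.total = 0

    def __call__(self, value):
        # Calling the instance updates and returns the running total.
        self.total += value
        return self.total

    def reset(self):
        # An extra method alongside __call__.
        self.total = 0


acc = RunningTotal()
acc(5)
acc(10)
total_before_reset = acc.total
acc.reset()
total_after_reset = acc.total
```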

I realize this is an older posting, but one factor I didn't see listed is that in Python (pre-nonlocal) you cannot modify a local variable contained in the referencing environment. (In your example such modification is not important, but technically speaking the lack of being able to modify such a variable means it's not a true closure.)
For example, the following code doesn't work:
def counter():
    i = 0
    def f():
        i += 1
        return i
    return f
c = counter()
c()
The call to c above will raise an UnboundLocalError exception.
This is easy to get around by using a mutable, such as a dictionary:
def counter():
    d = {'i': 0}
    def f():
        d['i'] += 1
        return d['i']
    return f
c = counter()
c() # 1
c() # 2
but of course that's just a workaround.
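For completeness, on Python 3 the nonlocal statement removes the need for the mutable container. This is a minimal sketch of the same counter, assuming Python 3 only (nonlocal does not exist in Python 2):

```python
def counter():
    i = 0

    def f():
        nonlocal i  # rebind the variable in the enclosing scope
        i += 1
        return i

    return f


c = counter()
first = c()
second = c()
```

Each call to counter() creates its own independent i, so separate counters do not interfere with each other.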

Please note that because of an error previously found in my testing code, my original answer was incorrect. The revised version follows.
I made a small program to measure running time and memory consumption. I created the following callable class and a closure:
class CallMe:
    def __init__(self, context):
        self.context = context
    def __call__(self, *args, **kwargs):
        return self.context(*args, **kwargs)
def call_me(func):
    return lambda *args, **kwargs: func(*args, **kwargs)
I timed calls to simple functions accepting different numbers of arguments (math.sqrt() with 1 argument, math.pow() with 2 and max() with 12).
I used CPython 2.7.10 and 3.4.3+ on Linux x64. I was only able to do memory profiling on Python 2. The source code I used is available here.
My conclusions are:
Closures run faster than equivalent callable classes: about 3 times faster on Python 2, but only 1.5 times faster on Python 3. The narrowing is both because closure became slower and callable classes slower.
Closures take less memory than equivalent callable classes: roughly 2/3 of the memory (only tested on Python 2).
While not part of the original question, it's interesting to note that the run time overhead for calls made via a closure is roughly the same as a call to math.pow(), while via a callable class it is roughly double that.
These are very rough estimates, and they may vary with hardware, operating system and the function you're comparing against. However, they give you an idea of the impact of using each kind of callable.
Therefore, this supports (contrary to what I wrote before) that the accepted answer given by @RaymondHettinger is correct, and closures should be preferred for indirect calls, at least as long as they don't impede readability. Also, thanks to @AXO for pointing out the mistake in my original code.

I consider the class approach to be easier to understand at one glance, and therefore, more maintainable. As this is one of the premises of good Python code, I think that all things being equal, one is better off using a class rather than a nested function. This is one of the cases where the flexible nature of Python makes the language violate the "there should be one, and preferably only one, obvious way of doing something" predicate for coding in Python.
The performance difference for either side should be negligible - and if you have code where performance matters at this level, you certainly should profile it and optimize the relevant parts, possibly rewriting some of your code as native code.
But yes, if there were a tight loop using the state variables, accessing the closure variables should be slightly faster than accessing the instance attributes. Of course, this can be overcome by simply inserting a line like op = self.op inside the method before entering the loop, so that accesses inside the loop go to a local variable; this avoids an attribute lookup on each access. Again, performance differences should be negligible, and you have a more serious problem if you need this small amount of extra performance and are coding in Python.
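A minimal sketch of the "copy the attribute into a local before the loop" idea. The Scaler class and its factor attribute are made up for illustration and are not from the question; any real speed gain would have to be measured.

```python
class Scaler:
    """Hypothetical callable that multiplies each item by a stored factor."""

    def __init__(self, factor):
        self.factor = factor

    def __call__(self, items):
        factor = self.factor  # one attribute lookup, hoisted out of the loop
        result = []
        for item in items:
            result.append(item * factor)  # local variable access in the loop
        return result


double = Scaler(2)
doubled = double([1, 2, 3])
```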

Mr. Hettinger's answer is still true ten years later in Python 3.10. For anyone wondering:
from timeit import timeit
class A: # Naive class
    def __init__(self, op):
        if op == "mut":
            self.exc = lambda x, y: x * y
        elif op == "add":
            self.exc = lambda x, y: x + y
    def __call__(self, x, y):
        return self.exc(x,y)
class B: # More optimized class
    __slots__ = ('__call__',)
    def __init__(self, op):
        if op == "mut":
            self.__call__ = lambda x, y: x * y
        elif op == "add":
            self.__call__ = lambda x, y: x + y
def C(op): # Closure
    if op == "mut":
        def _f(x,y):
            return x * y
    elif op == "add":
        def _f(x,y):
            return x + y
    return _f
a = A("mut")
b = B("mut")
c = C("mut")
print(timeit("[a(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 26.47s naive class
print(timeit("[b(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 18.00s optimized class
print(timeit("[c(x,y) for x in range(100) for y in range(100)]", globals=globals(), number=10000))
# 12.12s closure
Using a closure seems to offer significant speed gains when the number of calls is high. However, classes offer extensive customization and are the better choice at times.

I'd rewrite the class example as something like:
class F(object):
    __slots__ = ('__call__',)
    def __init__(self, op):
        if op == 'mult':
            self.__call__ = lambda a, b: a * b
        elif op == 'add':
            self.__call__ = lambda a, b: a + b
        else:
            raise InvalidOp(op)
That gives 0.40 usec/pass on my machine with Python 3.2.2 (the function version takes 0.31, so the class is 29% slower). Without object as a base class it gives 0.65 usec/pass (i.e. 55% slower than the object-based version). And for some reason, the code that checks op in __call__ gives almost the same results as when the check is done in __init__. With object as a base and the check inside __call__, it gives 0.61 usec/pass.
The reason to use classes might be polymorphism.
class UserFunctions(object):
    __slots__ = ('__call__',)
    def __init__(self, name):
        f = getattr(self, '_func_' + name, None)
        if f is None:
            raise InvalidOp(name)
        self.__call__ = f
class MyOps(UserFunctions):
    @classmethod
    def _func_mult(cls, a, b): return a * b
    @classmethod
    def _func_add(cls, a, b): return a + b

Ways to define and use partially bound functions

The two ways I'm aware of to have a partially-bound function that can be called later are:
apply_twice = lambda f: lambda x: f(f(x))
square2x = apply_twice(lambda x: x*x)
square2x(2)
# 16
And
def apply_twice(f):
    def apply(x):
        return f(f(x))
    return apply
square_2x = apply_twice(lambda x: x*x)
square_2x(4)
# 256
Are there any other common ways to pass around or use partially-bound functions?
functools.partial can be used to partially apply an ordinary Python function. This is especially useful if you already have a regular function and want to apply only some of the arguments.
from functools import partial
def apply_twice(f, x):
    return f(f(x))
square2x = partial(apply_twice, lambda x: x*x)
print(square2x(4))
It's also important to remember that functions are only one type of callable in Python, and we're free to define callables ourselves as ordinary user-defined classes. So if you have some complex operation that you want to behave like a function, you can always write a class, which lets you document in more detail what it is and what the different parts mean.
class MyApplyTwice:
    def __init__(self, f):
        self.f = f
    def __call__(self, x):
        return self.f(self.f(x))
square2x = MyApplyTwice(lambda x: x*x)
print(square2x(4))
While overly verbose in this example, it can be helpful to write your function out as a class if it's going to be storing state long-term or might be doing confusing mutable things with its state. It's also useful to keep in mind for learning purposes, as it's a healthy reminder that closures and objects are two sides of the same coin. They're really the same thing, viewed in a different light.
You can also do this with functools.partial():
import functools
def apply_twice(f, x):
    return f(f(x))
square_2x = functools.partial(apply_twice, lambda x: x*x)
This isn't really partial binding, assuming you mean partial application.
Partial application is when you create a function that does the same thing as another function by fixing some number of its arguments, producing a function of smaller arity (the arity of a function is the number of arguments it takes).
So, for example,
def foo(a, b, c):
    return a + b + c
A partially applied version of foo would be something like:
def partial_foo(a, b):
    return foo(a, b, 42)
Or, with a lambda expression:
partial_foo = lambda a, b: foo(a, b, 42)
However, note that the above goes against the official style guidelines: according to PEP 8, you shouldn't assign the result of a lambda expression to a name. If you are going to do that, just use a full function definition.
The module, functools, has a helper for partial application:
import functools
partial_foo = functools.partial(foo, c=42)
Note, you may have heard about "currying", which sometimes gets confused with partial application. Currying is when you decompose an n-arity function into n 1-arity functions. So, more concretely, for foo:
curried_foo = lambda a: lambda b: lambda c: a + b + c
Or in long form:
def curried_foo(a):
    def _curr0(b):
        def _curr1(c):
            return a + b + c
        return _curr1
    return _curr0
And the important part, curried_foo(1)(2)(3) == foo(1, 2, 3)
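To check the distinction above, here is a small self-contained sketch that puts the partially applied and the curried version of foo side by side. The inner helper names take_b and take_c are just illustrative.

```python
import functools


def foo(a, b, c):
    return a + b + c


# Partial application: fix c, leaving a function of two arguments.
partial_foo = functools.partial(foo, c=42)


# Currying: a chain of one-argument functions.
def curried_foo(a):
    def take_b(b):
        def take_c(c):
            return foo(a, b, c)
        return take_c
    return take_b


partial_result = partial_foo(1, 2)      # same call as foo(1, 2, 42)
curried_result = curried_foo(1)(2)(42)  # same call as foo(1, 2, 42)
```

Both results equal foo(1, 2, 42); the difference is only in how the arguments are supplied.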

Is there a better way than this to write Python functions that "depend on parameters"?

Consider the Python function line defined as follows:
def line(m, b):
    def inner_function(x):
        return m * x + b
    return inner_function
This function has the property that for any floats m and b, the object line(m, b) is a Python function, and when line(m, b) is called on a float x, it returns a float line(m, b)(x). The float line(m, b)(x) can be interpreted as the value of the line with slope m and y-intercept b at the point x. This is one method for writing a Python function that "depends on parameters" m and b.
Is there a special name for this method of writing a Python function that depends on some parameters?
Is there a more Pythonic and/or computationally efficient way to write a function that does the same thing as line above?
This is called a closure, and it's a perfectly reasonable way to write one, as well as one of the most efficient means of doing so (in the CPython reference interpreter anyway).
The only other common pattern I know of is the equivalent of C++'s functors, where a class has the state as attributes, and the additional parameters are passed to __call__, e.g. to match your case:
class Line:
    def __init__(self, m, b):
        self.m = m
        self.b = b
    def __call__(self, x):
        return self.m * x + self.b
It's used identically, either creating/storing an instance and reusing it, or as in your example, creating it, using it once, and throwing it away (Line(m, b)(x)). Functors are slower than closures though (as attribute access is more expensive than reading from nested scope, at least in the CPython reference interpreter), and as you can see, they're more verbose as well, so I'd generally recommend the closure unless your needs require the greater flexibility/power of class instances.
I support @ShaddowRanger's answer. But using partial is another nice approach.
import functools
def f(m, b, x):
    return m * x + b
line = functools.partial(f, 2, 3)
line(5)
=> 13
One thing worth pointing out is that lambda objects and the OP's inner_function aren't pickleable, whereas line here, as well as @ShaddowRanger's Line objects, are, which makes them a bit more useful.
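A rough sketch of the pickling difference, under a couple of assumptions: operator.mul is used only as a stand-in for an importable module-level function like f, and the exact exception type raised for the nested function varies between Python versions, so several are caught.

```python
import functools
import operator
import pickle

# A partial over an importable function pickles fine.
# operator.mul is only a stand-in for a module-level function such as f.
triple = functools.partial(operator.mul, 3)
restored = pickle.loads(pickle.dumps(triple))
restored_value = restored(5)


def line(m, b):
    def inner_function(x):
        return m * x + b
    return inner_function


# A nested function cannot be looked up by name when unpickling,
# so pickle refuses to serialize it.
try:
    pickle.dumps(line(2, 3))
    closure_pickled = True
except (pickle.PicklingError, AttributeError, TypeError):
    closure_pickled = False
```

This matters mostly when callables have to cross process boundaries, for example with multiprocessing.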
This is a little shorter:
def line(m,b):
    return lambda x: m*x+b

[Python]Function that runs once then remembers result when called again [duplicate]

I just started Python and I've got no idea what memoization is and how to use it. Also, may I have a simplified example?
Memoization effectively refers to remembering ("memoization" → "memorandum" → to be remembered) results of method calls based on the method inputs and then returning the remembered result rather than computing the result again. You can think of it as a cache for method results. For further details, see page 387 for the definition in Introduction To Algorithms (3e), Cormen et al.
A simple example for computing factorials using memoization in Python would be something like this:
factorial_memo = {}
def factorial(k):
    if k < 2: return 1
    if k not in factorial_memo:
        factorial_memo[k] = k * factorial(k-1)
    return factorial_memo[k]
You can get more complicated and encapsulate the memoization process into a class:
class Memoize:
    def __init__(self, f):
        self.f = f
        self.memo = {}
    def __call__(self, *args):
        if args not in self.memo:
            self.memo[args] = self.f(*args)
        #Warning: You may wish to do a deepcopy here if returning objects
        return self.memo[args]
Then:
def factorial(k):
    if k < 2: return 1
    return k * factorial(k - 1)
factorial = Memoize(factorial)
A feature known as "decorators" was added in Python 2.4, which allows you to simply write the following to accomplish the same thing:
@Memoize
def factorial(k):
    if k < 2: return 1
    return k * factorial(k - 1)
The Python Decorator Library has a similar decorator called memoized that is slightly more robust than the Memoize class shown here.
functools.cache decorator:
Python 3.9 added a new function, functools.cache. It caches in memory the result of a function called with a particular set of arguments, which is memoization. It's easy to use:
import functools
import time
@functools.cache
def calculate_double(num):
    time.sleep(1) # sleep for 1 second to simulate a slow calculation
    return num * 2
The first time you call calculate_double(5), it will take a second and return 10. The second time you call the function with the same argument calculate_double(5), it will return 10 instantly.
Adding the cache decorator ensures that if the function has been called recently for a particular value, it will not recompute that value, but use a cached previous result. In this case, it leads to a tremendous speed improvement, while the code is not cluttered with the details of caching.
(Edit: the previous example calculated a fibonacci number using recursion, but I changed the example to prevent confusion, hence the old comments.)
functools.lru_cache decorator:
If you need to support older versions of Python, functools.lru_cache works in Python 3.2+. By default, it only caches the 128 most recently used calls, but you can set the maxsize to None to indicate that the cache should never expire:
@functools.lru_cache(maxsize=None)
def calculate_double(num):
    # etc
The other answers cover what it is quite well. I'm not repeating that. Just some points that might be useful to you.
Usually, memoisation is an operation you can apply to any function that computes something (expensive) and returns a value. Because of this, it's often implemented as a decorator. The implementation is straightforward and would look something like this
memoised_function = memoise(actual_function)
or expressed as a decorator
@memoise
def actual_function(arg1, arg2):
    #body
I've found this extremely useful
from functools import wraps
def memoize(function):
    memo = {}
    @wraps(function)
    def wrapper(*args):
        # add the new key to dict if it doesn't exist already
        if args not in memo:
            memo[args] = function(*args)
        return memo[args]
    return wrapper
@memoize
def fibonacci(n):
    if n < 2: return n
    return fibonacci(n - 1) + fibonacci(n - 2)
fibonacci(25)
Memoization is keeping the results of expensive calculations and returning the cached result rather than continuously recalculating it.
Here's an example:
def doSomeExpensiveCalculation(self, input):
    if input not in self.cache:
        <do expensive calculation>
        self.cache[input] = result
    return self.cache[input]
A more complete description can be found in the wikipedia entry on memoization.
Let's not forget the built-in hasattr function, for those who want to hand-craft. That way you can keep the mem cache inside the function definition (as opposed to a global).
def fact(n):
    if not hasattr(fact, 'mem'):
        fact.mem = {1: 1}
    if n not in fact.mem:
        fact.mem[n] = n * fact(n - 1)
    return fact.mem[n]
Memoization is basically saving the results of past operations done with recursive algorithms in order to reduce the need to traverse the recursion tree if the same calculation is required at a later stage.
see http://scriptbucket.wordpress.com/2012/12/11/introduction-to-memoization/
Fibonacci Memoization example in Python:
fibcache = {}
def fib(num):
    if num in fibcache:
        return fibcache[num]
    else:
        fibcache[num] = num if num < 2 else fib(num-1) + fib(num-2)
        return fibcache[num]
Memoization is the conversion of functions into data structures. Usually one wants the conversion to occur incrementally and lazily (on demand of a given domain element--or "key"). In lazy functional languages, this lazy conversion can happen automatically, and thus memoization can be implemented without (explicit) side-effects.
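In Python, one way to read "a function converted lazily into a data structure" is a dict subclass that uses the __missing__ hook to compute entries on first access. This is only a sketch of that reading; the names LazyTable and square are made up for illustration.

```python
class LazyTable(dict):
    """Hypothetical dict that fills itself in from a function on demand."""

    def __init__(self, func):
        super().__init__()
        self.func = func

    def __missing__(self, key):
        # Called by dict.__getitem__ only when the key is absent.
        value = self.func(key)
        self[key] = value
        return value


calls = []


def square(n):
    calls.append(n)  # record each real computation
    return n * n


squares = LazyTable(square)
first = squares[12]   # computed and stored
second = squares[12]  # read back from the table, no second computation
```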
Well I should answer the first part first: what's memoization?
It's just a method to trade memory for time. Think of a multiplication table.
Using a mutable object as a default value in Python is usually considered bad. But if you use it wisely, it can actually be useful for implementing memoization.
Here's an example adapted from http://docs.python.org/2/faq/design.html#why-are-default-values-shared-between-objects
Using a mutable dict in the function definition, the intermediate computed results can be cached (e.g. when calculating factorial(10) after calculating factorial(9), we can reuse all the intermediate results)
def factorial(n, _cache={1:1}):
    try:
        return _cache[n]
    except KeyError:
        _cache[n] = factorial(n-1)*n
        return _cache[n]
Here is a solution that will work with list or dict type arguments without whining:
def memoize(fn):
    """returns a memoized version of any function that can be called
    with the same list of arguments.
    Usage: foo = memoize(foo)"""
    def handle_item(x):
        if isinstance(x, dict):
            return make_tuple(sorted(x.items()))
        elif isinstance(x, (str, bytes)):
            return x  # strings are iterable in Python 3 but already hashable
        elif hasattr(x, '__iter__'):
            return make_tuple(x)
        else:
            return x
    def make_tuple(L):
        return tuple(handle_item(x) for x in L)
    def foo(*args, **kwargs):
        items_cache = make_tuple(sorted(kwargs.items()))
        args_cache = make_tuple(args)
        if (args_cache, items_cache) not in foo.past_calls:
            foo.past_calls[(args_cache, items_cache)] = fn(*args,**kwargs)
        return foo.past_calls[(args_cache, items_cache)]
    foo.past_calls = {}
    foo.__name__ = 'memoized_' + fn.__name__
    return foo
Note that this approach can be naturally extended to any object by implementing your own hash function as a special case in handle_item. For example, to make this approach work for a function that takes a set as an input argument, you could add to handle_item:
if isinstance(x, set):
    return make_tuple(sorted(x))
Solution that works with both positional and keyword arguments independently of the order in which keyword args were passed (using inspect.getfullargspec; the older inspect.getargspec was removed in Python 3.11):
import inspect
import functools
def memoize(fn):
    cache = fn.cache = {}
    arg_names = inspect.getfullargspec(fn).args
    @functools.wraps(fn)
    def memoizer(*args, **kwargs):
        kwargs.update(dict(zip(arg_names, args)))
        key = tuple(kwargs.get(k, None) for k in arg_names)
        if key not in cache:
            cache[key] = fn(**kwargs)
        return cache[key]
    return memoizer
Similar question: Identifying equivalent varargs function calls for memoization in Python
Just wanted to add to the answers already provided, the Python decorator library has some simple yet useful implementations that can also memoize "unhashable types", unlike functools.lru_cache.
cache = {}
def fib(n):
    if n <= 1:
        return n
    else:
        if n not in cache:
            cache[n] = fib(n-1) + fib(n-2)
        return cache[n]
If speed is a consideration:
@functools.cache and @functools.lru_cache(maxsize=None) are equally fast, taking 0.122 seconds (best of 15 runs) to loop a million times on my system
a global cache variable is quite a lot slower, taking 0.180 seconds (best of 15 runs) to loop a million times on my system
a self.cache class variable is a bit slower still, taking 0.214 seconds (best of 15 runs) to loop a million times on my system
The latter two are implemented similarly to how it is described in the currently top-voted answer.
This is without memory exhaustion prevention, i.e. I did not add code in the class or global methods to limit the cache's size; this is really the barebones implementation. The lru_cache method has that for free, if you need it.
One open question for me would be how to unit test something that has a functools decorator. Is it possible to empty the cache somehow? Unit tests seem like they would be cleanest using the class method (where you can instantiate a new class for each test) or, secondarily, the global variable method (since you can do yourimportedmodule.cachevariable = {} to empty it).
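On the unit-testing question: functions wrapped by functools.lru_cache (and functools.cache) expose cache_clear() and cache_info(), so the cache can be emptied between tests. A minimal sketch follows; the calls list is only there to make the recomputation visible and is not part of any answer above.

```python
import functools

calls = []


@functools.lru_cache(maxsize=None)
def calculate_double(num):
    calls.append(num)  # record each real computation
    return num * 2


calculate_double(5)
calculate_double(5)  # served from the cache
hits_before = calculate_double.cache_info().hits

calculate_double.cache_clear()  # e.g. in a test's setUp or a fixture
size_after = calculate_double.cache_info().currsize
calculate_double(5)  # recomputed after the clear
```

The undecorated function is also reachable as calculate_double.__wrapped__, which can be handy when a test should bypass the cache entirely.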

Calling function with unknown number of parameters

I am trying to create a set of functions in python that will all do a similar operation on a set of inputs. All of the functions have one input parameter fixed and half of them also need a second parameter. For the sake of simplicity, below is a toy example with only two functions.
Now, I want, in my script, to run the appropriate function, depending on what the user input as a number. Here, the user is the random function (so the minimum example works). What I want to do is something like this:
import random
def function_1(*args):
    return args[0]
def function_2(*args):
    return args[0] * args[1]
x = 10
y = 20
i = random.randint(1,2)
f = function_1 if i==1 else function_2
return_value = f(x,y)
And it works, but it seems messy to me. I would rather have function_1 defined as
def function_1(x):
    return x
but that will not work as easily. Another way would be to define
def function_1(x,y):
    return x
But that leaves me with a dangling y parameter. Is my way the "proper" way of solving my problem or does there exist a better way?
There are a couple of approaches here, all of them adding more boilerplate code.
There is also this PEP which may be interesting to you.
But the 'pythonic' way of doing it is not as elegant as the usual function overloading, due to the fact that functions are just class attributes.
So you can either go with a function like this:
def foo(*args):
and then count how many args you've got, which is very broad but very flexible as well.
Another approach is default arguments:
def foo(first, second=None, third=None):
which is less flexible but easier to predict. Lastly, you can also use:
def foo(anything):
and detect the type of anything in your function, acting accordingly.
Your monkey-patching example can work too, but it becomes more complex if you use it with class methods, and does make introspection tricky.
EDIT: Also, for your case you may want to keep the functions separate and write a single 'dispatcher' function that will call the appropriate function for you depending on the arguments, which is probably the best solution considering the above.
EDIT2: based on your comments, I believe the following approach may work for you
def weigh_dispatcher(*args, **kwargs):
    # decide which function to call based on args
    if 'somethingspecial' in kwargs:
        return weight2(*args, **kwargs)
def weight_prep(arg):
    # common part here
    pass
def weight1(arg1, arg2):
    weight_prep(arg1)
    # rest of the func
def weight2(arg1, arg2, arg3):
    weight_prep(arg1)
    # rest of the func
Alternatively, you can move the common part into the dispatcher.
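Applied to the question's toy example, a dispatcher could be sketched with a lookup table that records how many arguments each function takes. This is one possible shape under that assumption, not the only one; DISPATCH and run are names made up here.

```python
def function_1(x):
    return x


def function_2(x, y):
    return x * y


# Hypothetical dispatch table: user choice -> (function, number of arguments).
DISPATCH = {
    1: (function_1, 1),
    2: (function_2, 2),
}


def run(choice, *args):
    func, arity = DISPATCH[choice]
    # Pass along only as many arguments as the chosen function accepts.
    return func(*args[:arity])


result_1 = run(1, 10, 20)
result_2 = run(2, 10, 20)
```

This keeps each function's signature honest (no dangling y) while the caller can still pass the same arguments regardless of which function is selected.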
You may also have a function with an optional second argument:
def function_1(x, y=None):
    if y is not None:
        return x + y
    else:
        return x
Here's the sample run:
>>> function_1(3)
3
>>> function_1(3, 4)
7
Or even optional multiple arguments! Check this out:
def function_2(x, *args):
    return x + sum(args)
And the sample run:
>>> function_2(3)
3
>>> function_2(3, 4)
7
>>> function_2(3, 4, 5, 6, 7)
25
Inside the function you can treat args like a list (it is actually a tuple):
def function_3(x, *args):
    if len(args) < 1:
        return x
    else:
        return x + sum(args)
And the sample run:
>>> function_3(1,2,3,4,5)
15
