I have a class with a method that I want to cache properly, i.e. so that the cached results are released when the object is no longer in use. Example:
import functools
import numpy as np

class Foo:
    def __init__(self, dev):
        self.dev = dev

    @functools.cache
    def bar(self, len):
        return np.random.normal(scale=self.dev, size=len)

if __name__ == '__main__':
    for i in range(100000):
        foo = Foo(i)
        _ = foo.bar(1000000)
This creates a memory leak which is hard to discover. How to do this properly? For properties, there is a cached_property, but this does not work for functions with arguments.
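One possible way out, sketched under the assumption that a separate cache per instance is acceptable: wrap the bound method inside `__init__`, so the cache is stored on the instance rather than on the class-level function. The name `_bar` and the list-based stand-in for the NumPy call are illustrative only.

```python
import functools
import gc
import weakref


class Foo:
    def __init__(self, dev):
        self.dev = dev
        # Wrap the bound method so the cache is stored on this instance
        # instead of on the class-level function object.
        self.bar = functools.lru_cache(maxsize=None)(self._bar)

    def _bar(self, n):
        # Stand-in for the expensive NumPy call in the question.
        return [self.dev * i for i in range(n)]


foo = Foo(2)
assert foo.bar(3) is foo.bar(3)   # second call is served from the cache

ref = weakref.ref(foo)
del foo
gc.collect()                      # the instance <-> cache cycle needs the collector
released = ref() is None          # True once the instance and its cache are gone
```

The instance and its cache reference each other, so the memory is reclaimed by the cyclic garbage collector rather than immediately by reference counting.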
With just two changes, you can considerably improve your algorithm’s runtime:
Import the @lru_cache decorator from the functools module.
Use @lru_cache to decorate model().
Here’s what the top of the script will look like with the two updates:
import functools
from abc import abstractmethod, ABC

"""
Module: base_dao.py
Author: Imam Hossain Roni
Created: April 01, 2020
Description: 'Its a base dao used to separate the data persistence logic
in a separate layer.'
"""


class Dao(ABC):
    # must override these
    MODEL_CLASS = None
    SAVE_BATCH_SIZE = 1000
    VALIDATOR_CLASS = None

    @property
    @abstractmethod
    def model_cls(self):
        pass

    @property
    @functools.lru_cache(maxsize=None)
    def model(self):
        return self.model_cls()
I tried to decorate a classmethod with functools.lru_cache. My attempt failed:
import functools

class K:
    @functools.lru_cache(maxsize=32)
    @classmethod
    def mthd(i, stryng: str):
        return stryng

obj = K()
The error message comes from functools.lru_cache:
TypeError: the first argument must be callable
A class method is, itself, not callable. (What is callable is the object returned by the class method's __get__ method.)
As such, you want the function decorated by lru_cache to be turned into a class method instead.
@classmethod
@functools.lru_cache(maxsize=32)
def mthd(cls, stryng: str):
    return stryng
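A complete, minimal version of the corrected ordering, for checking that the cache is actually hit. The `upper()` call is only there to make the result visibly computed; it is not part of the original question.

```python
import functools


class K:
    @classmethod                       # outermost: turns the cached function into a class method
    @functools.lru_cache(maxsize=32)   # innermost: wraps the plain function, which is callable
    def mthd(cls, stryng: str) -> str:
        return stryng.upper()


first = K.mthd("abc")
second = K.mthd("abc")                 # served from the cache
info = K.mthd.cache_info()             # attribute access is forwarded to the cached function
```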
The selected answer is correct, but here is another option. If you want each class to have its own cache storage, instead of sharing a single storage with all its subclasses, you can use methodtools:
import functools
import methodtools

class K:
    @classmethod
    @functools.lru_cache(maxsize=1)
    def mthd(cls, s: str):
        print('functools', s)
        return s

    @methodtools.lru_cache(maxsize=1)  # note that methodtools wraps classmethod
    @classmethod
    def mthd2(cls, s: str):
        print('methodtools', s)
        return s

class L(K):
    pass

K.mthd('1')
L.mthd('2')
K.mthd2('1')
L.mthd2('2')

K.mthd('1')  # functools share the storage
L.mthd('2')
K.mthd2('1')  # methodtools doesn't share the storage
L.mthd2('2')
Then the result is
$ python example.py
functools 1
functools 2
methodtools 1
methodtools 2
functools 1
functools 2
As part of parallelizing some existing code (with multiprocessing), I ran into a situation where something similar to the class below needs to be pickled.
Starting from:
import pickle
from functools import lru_cache

class Test:
    def __init__(self):
        self.func = lru_cache(maxsize=None)(self._inner_func)

    def _inner_func(self, x):
        # In reality this will be slow-running
        return x
calling
t = Test()
pickle.dumps(t)
returns
_pickle.PicklingError: Can't pickle <functools._lru_cache_wrapper object at 0x00000190454A7AC8>: it's not the same object as __main__.Test._inner_func
which I don't really understand. By the way, I also tried a variation where the name of _inner_func was func as well, that didn't change things.
If anybody is interested, this can be solved by using __getstate__ and __setstate__ like this:
from functools import lru_cache
from copy import copy

class Test:
    def __init__(self):
        self.func = lru_cache(maxsize=None)(self._inner_func)

    def _inner_func(self, x):
        # In reality this will be slow-running
        return x

    def __getstate__(self):
        # Drop the unpicklable cache wrapper from the pickled state
        result = copy(self.__dict__)
        result["func"] = None
        return result

    def __setstate__(self, state):
        # Restore the attributes and build a fresh (empty) cache
        self.__dict__ = state
        self.func = lru_cache(maxsize=None)(self._inner_func)
As detailed in the comments, the pickle module has issues when dealing with decorators. See this question for more details:
Pickle and decorated classes (PicklingError: not the same object)
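A quick round trip to check the __getstate__ / __setstate__ approach. This is a sketch that assumes the class is defined at the top level of an importable module or script (pickle needs to find it by name). The `x * 2` body is a placeholder for the slow computation.

```python
import pickle
from functools import lru_cache


class Test:
    def __init__(self):
        self.func = lru_cache(maxsize=None)(self._inner_func)

    def _inner_func(self, x):
        return x * 2  # placeholder for the slow-running computation

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["func"]          # the cache wrapper cannot be pickled
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        # Rebuild the cache; cached results are not carried across the pickle.
        self.func = lru_cache(maxsize=None)(self._inner_func)


t = Test()
t.func(3)
restored = pickle.loads(pickle.dumps(t))
fresh_cache_size = restored.func.cache_info().currsize  # the restored cache starts empty
```

Note the trade-off: the unpickled object works, but it starts with a cold cache, which matters if the point of multiprocessing was to reuse expensive results.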
Use methodtools.lru_cache so that you don't have to create a new cache function in __init__:
import pickle
from methodtools import lru_cache

class Test:
    @lru_cache(maxsize=None)
    def func(self, x):
        # In reality this will be slow-running
        return x

if __name__ == '__main__':
    t = Test()
    print(pickle.dumps(t))
This requires installing methodtools from PyPI:
pip install methodtools
I'm attempting to test a method that is memoized through lru_cache (since it's an expensive database call) with pytest-mock.
A simplified version of the code is:
class User:
    def __init__(self, file):
        # load a file
        ...

    @lru_cache
    def get(self, user_id):
        # do expensive call
        ...
Then I'm testing:
class TestUser:
    def test_get_is_called(self, mocker):
        data = mocker.ANY
        user = User(data)
        repository.get(user_id)
        open_mock = mocker.patch('builtins.open', mocker.mock_open())
        open_mock.assert_called_with('/foo')
But I'm getting the following error:
TypeError: unhashable type: '_ANY'
This happens because functools.lru_cache uses the call arguments as cache keys, so they must be hashable, i.e. implement __hash__ (together with __eq__).
How can I mock such methods in a mocker to make it work?
I've tried
user.__hash__.return_value = 'foo'
with no luck.
For people arriving here trying to work out how to test functions decorated with lru_cache or alru_cache, the answer is to clear the cache before each test.
This can be done as follows:
def setup_function():
    """
    Avoid the `(a)lru_cache` causing tests with identical parameters to interfere
    with one another.
    """
    my_cached_function.cache_clear()
How to switch off @lru_cache when running pytest
In case you ended up here because you want to test an @lru_cache-decorated function with different mocking (but the lru_cache prevents your mocking) ...
Just set the maxsize of the @lru_cache to 0 if you run pytest!
@lru_cache(maxsize=0 if "pytest" in sys.modules else 256)
Minimal working example
with @lru_cache active (maxsize=256) when code runs and deactivated (maxsize=0) if pytest runs:
import sys
from functools import lru_cache

@lru_cache(maxsize=0 if "pytest" in sys.modules else 256)
def fct_parent():
    return fct_child()

def fct_child():
    return "unmocked"

def test_mock_lru_cache_internal(monkeypatch):
    """This test fails if @lru_cache of fct_parent is active and succeeds otherwise"""
    print(f"{fct_parent.cache_info().maxsize=}")
    for ii in range(2):
        ret_val = f"mocked {ii}"
        with monkeypatch.context() as mpc:
            mpc.setattr(f"{__name__}.fct_child", lambda: ret_val)  # mocks fct_child to return ret_val
            assert fct_parent() == ret_val

if __name__ == "__main__":
    """
    This module is designed to fail, if called by python
    $ python test_lru_cache_mocking.py
    and to work if executed by pytest
    $ pytest -s test_lru_cache_mocking.py
    The reason is, that the size of the lru_cache is 256 / 0 respectively
    and hence test_mock_lru_cache_internal fails / succeeds.
    """
    #
    from _pytest.monkeypatch import MonkeyPatch
    test_mock_lru_cache_internal(MonkeyPatch())
Instead of using mocker.ANY (an object which is intended to be used in assertions as a placeholder that's equal to any object) I believe you instead want to use a sentinel object (such as mocker.sentinel.DATA).
This appears to work from a quick test:
from functools import lru_cache

@lru_cache(maxsize=None)
def f(x):
    return (x, x)

def test(mocker):
    ret = f(mocker.sentinel.DATA)
    assert ret == (mocker.sentinel.DATA, mocker.sentinel.DATA)
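The difference between the two placeholders can be checked without pytest. This sketch uses `unittest.mock` directly, on the assumption that pytest-mock's `mocker.ANY` and `mocker.sentinel` are the same objects as `mock.ANY` and `mock.sentinel`.

```python
from functools import lru_cache
from unittest import mock


@lru_cache(maxsize=None)
def f(x):
    return (x, x)


# A sentinel is an ordinary hashable object, so it can serve as a cache key.
result = f(mock.sentinel.DATA)

# ANY defines __eq__ without __hash__, so lru_cache cannot build a key from it.
try:
    f(mock.ANY)
    any_rejected = False
except TypeError:
    any_rejected = True
```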
What's the best way to toggle decorators on and off, without actually going to each decoration and commenting it out? Say you have a benchmarking decorator:
# deco.py
def benchmark(func):
    def decorator():
        # fancy benchmarking
        return func()
    return decorator
and in your module something like:
# mymodule.py
from deco import benchmark

class foo(object):
    @benchmark
    def f():
        # code
        pass

    @benchmark
    def g():
        # more code
        pass
That's fine, but sometimes you don't care about the benchmarks and don't want the overhead. I have been doing the following. Add another decorator:
# anothermodule.py
def noop(func):
    # do nothing, just return the original function
    return func
And then comment out the import line and add another:
# mymodule.py
#from deco import benchmark
from anothermodule import noop as benchmark
Now benchmarks are toggled on a per-file basis, having only to change the import statement in the module in question. Individual decorators can be controlled independently.
Is there a better way to do this? It would be nice to not have to edit the source file at all, and to specify which decorators to use in which files elsewhere.
You could add the conditional to the decorator itself:
def use_benchmark(modname):
    return modname == "mymodule"

def benchmark(func):
    if not use_benchmark(func.__module__):
        return func
    def decorator():
        # fancy benchmarking
        return func()
    return decorator
If you apply this decorator in mymodule.py, it will be enabled; if you apply it in othermodule.py, it will not be enabled.
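A runnable sketch of the same idea follows. `ENABLED_MODULES`, `timings` and the use of `time.perf_counter` are illustrative assumptions, not part of the answer. To keep everything in one file, the second function's `__module__` is overwritten to simulate a function defined in a module where benchmarking is switched off.

```python
import functools
import time

ENABLED_MODULES = set()   # hypothetical switch: module names to benchmark
timings = []              # (function name, elapsed seconds), for illustration


def benchmark(func):
    # Decided once, at decoration time: disabled modules get the raw function back.
    if func.__module__ not in ENABLED_MODULES:
        return func

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings.append((func.__name__, time.perf_counter() - start))

    return wrapper


def add(a, b):
    return a + b


def multiply(a, b):
    return a * b


multiply.__module__ = "othermodule"    # pretend it lives in a switched-off module
ENABLED_MODULES.add(add.__module__)    # switch benchmarking on for this module only

timed_add = benchmark(add)
same_multiply = benchmark(multiply)    # returned unchanged, so no overhead at call time
```

Because the check happens when the decorator is applied, a disabled function carries no wrapper at all.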
I've been using the following approach. It's almost identical to the one suggested by CaptainMurphy, but it has the advantage that you don't need to call the decorator like a function.
import functools

class SwitchedDecorator:
    def __init__(self, enabled_func):
        self._enabled = False
        self._enabled_func = enabled_func

    @property
    def enabled(self):
        return self._enabled

    @enabled.setter
    def enabled(self, new_value):
        if not isinstance(new_value, bool):
            raise ValueError("enabled can only be set to a boolean value")
        self._enabled = new_value

    def __call__(self, target):
        if self._enabled:
            return self._enabled_func(target)
        return target

def deco_func(target):
    """This is the actual decorator function. It's written just like any other decorator."""
    def g(*args,**kwargs):
        print("your function has been wrapped")
        return target(*args,**kwargs)
    functools.update_wrapper(g, target)
    return g

# This is where we wrap our decorator in the SwitchedDecorator class.
my_decorator = SwitchedDecorator(deco_func)

# Now my_decorator functions just like the deco_func decorator,
# EXCEPT that we can turn it on and off.
my_decorator.enabled=True

@my_decorator
def example1():
    print("example1 function")

# we'll now disable my_decorator. Any subsequent uses will not
# actually decorate the target function.
my_decorator.enabled=False

@my_decorator
def example2():
    print("example2 function")
In the above, example1 will be decorated, and example2 will NOT be decorated. When I have to enable or disable decorators by module, I just have a function that makes a new SwitchedDecorator whenever I need a different copy.
I think you should use a decorator a to decorate the decorator b, which lets you switch decorator b on or off with the help of a decision function.
This sounds complex, but the idea is rather simple.
So let's say you have a decorator logger:
from functools import wraps

def logger(f):
    @wraps(f)
    def innerdecorator(*args, **kwargs):
        print(args, kwargs)
        res = f(*args, **kwargs)
        print(res)
        return res
    return innerdecorator
This is a very boring decorator and I have a dozen or so of these, cachers, loggers, things which inject stuff, benchmarking etc. I could easily extend it with an if statement, but this seems to be a bad choice; because then I have to change a dozen of decorators, which is not fun at all.
So what to do? Let's step one level higher. Say we have a decorator, which can decorate a decorator? This decorator would look like this:
@point_cut_decorator(logger)
def my_oddly_behaving_function():
    ...
This decorator accepts logger, which is not a very interesting fact. But it also has enough power to choose if the logger should be applied or not to my_oddly_behaving_function. I called it point_cut_decorator, because it has some aspects of aspect oriented programming. A point cut is a set of locations, where some code (advice) has to be interwoven with the execution flow. The definitions of point cuts is usually in one place. This technique seems to be very similar.
How can we implement the decision logic? I have chosen to write a function which accepts the decoratee, the decorator, the file and the module name, and which only says whether a decorator should be applied or not. These coordinates are good enough to pinpoint the location very precisely.
This is the implementation of point_cut_decorator. I have chosen to implement the decision function as a simple function; you could extend it to decide based on your settings or configuration. If you use regexes for all 4 coordinates, you will end up with something very powerful:
from functools import wraps
myselector is the decision function: if it returns True the decorator is applied, if it returns False it is not. Its parameters are the file name, the module name, the decorated object and finally the decorator. This allows us to switch off behaviour in a fine-grained manner.
def myselector(fname, name, decoratee, decorator):
    print(fname)
    if decoratee.__name__ == "test" and fname == "decorated.py" and decorator.__name__ == "logger":
        return True
    return False
This decorates a function, checks myselector and if myselector says go on, it will apply the decorator to the function.
def point_cut_decorator(d):
    def innerdecorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            if myselector(__file__, __name__, f, d):
                ps = d(f)
                return ps(*args, **kwargs)
            else:
                return f(*args, **kwargs)
        return wrapper
    return innerdecorator

def logger(f):
    @wraps(f)
    def innerdecorator(*args, **kwargs):
        print(args, kwargs)
        res = f(*args, **kwargs)
        print(res)
        return res
    return innerdecorator
And this is how you use it:
@point_cut_decorator(logger)
def test(a):
    print("hello")
    return "world"

test(1)
EDIT:
This is the regular expression approach I talked about:
from functools import wraps
import re
As you can see, I can specify a couple of rules somewhere, which decide whether a decorator should be applied or not:
rules = [{
    "file": "decorated.py",
    "module": ".*",
    "decoratee": ".*test.*",
    "decorator": "logger"
}]
Then I loop over all rules and return True if a rule matches, or False if no rule matches. By keeping rules empty in production, this will not slow down your application too much:
def myselector(fname, name, decoratee, decorator):
    for rule in rules:
        file_rule, module_rule, decoratee_rule, decorator_rule = rule["file"], rule["module"], rule["decoratee"], rule["decorator"]
        if (
            re.match(file_rule, fname)
            and re.match(module_rule, name)
            and re.match(decoratee_rule, decoratee.__name__)
            and re.match(decorator_rule, decorator.__name__)
        ):
            return True
    return False
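For reference, a condensed Python 3 sketch of the rule-based selector. It deliberately differs from the answer in two ways, both of them assumptions of this sketch: the rules match only on the decoratee and decorator names (no file or module), and the decision is taken once at decoration time instead of on every call. The `log` list replaces printing so the effect can be inspected.

```python
import re
from functools import wraps

# Hypothetical rule table: regexes for the decorated function and the decorator.
rules = [{"decoratee": r".*test.*", "decorator": r"logger"}]
log = []


def selector(decoratee, decorator):
    # True as soon as one rule matches both names.
    return any(
        re.match(rule["decoratee"], decoratee.__name__)
        and re.match(rule["decorator"], decorator.__name__)
        for rule in rules
    )


def point_cut_decorator(decorator):
    def apply(func):
        # Decide once, at decoration time, whether to wrap at all.
        return decorator(func) if selector(func, decorator) else func
    return apply


def logger(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        log.append((func.__name__, args, result))
        return result
    return wrapper


@point_cut_decorator(logger)
def run_test(a):
    return "world"


@point_cut_decorator(logger)
def other(a):
    return "ignored"
```

Deciding at decoration time removes the per-call regex cost, at the price of not being able to change the rules after the module has been imported.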
Here is what I finally came up with for per-module toggling. It uses @nneonneo's suggestion as a starting point.
Random modules use decorators as normal, no knowledge of toggling.
foopkg.py:
from toggledeco import benchmark

@benchmark
def foo():
    print("function in foopkg")
barpkg.py:
from toggledeco import benchmark

@benchmark
def bar():
    print("function in barpkg")
The decorator module itself maintains a set of function references for all decorators that have been disabled, and each decorator checks for its existence in this set. If so, it just returns the raw function (no decorator). By default the set is empty (everything enabled).
toggledeco.py:
import functools

_disabled = set()

def disable(func):
    _disabled.add(func)

def enable(func):
    _disabled.discard(func)

def benchmark(func):
    if benchmark in _disabled:
        return func
    @functools.wraps(func)
    def deco(*args,**kwargs):
        print("--> benchmarking %s(%s,%s)" % (func.__name__,args,kwargs))
        ret = func(*args,**kwargs)
        print("<-- done")
        return ret  # pass the wrapped function's result through
    return deco
The main program can toggle individual decorators on and off during imports:
from importlib import reload  # reload is no longer a builtin in Python 3
from toggledeco import benchmark, disable, enable

disable(benchmark) # no benchmarks...
import foopkg

enable(benchmark) # until they are enabled again
import barpkg

foopkg.foo() # no benchmarking
barpkg.bar() # yes benchmarking

reload(foopkg)
foopkg.foo() # now with benchmarking
Output:
function in foopkg
--> benchmarking bar((),{})
function in barpkg
<-- done
--> benchmarking foo((),{})
function in foopkg
<-- done
This has the added bug/feature that enabling/disabling will trickle down to any submodules imported from modules imported in the main function.
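The key property of this design is that the switch is read when the decorator is applied, not when the function is called. A single-file sketch makes that visible. The `calls` list is an illustrative replacement for the print statements, and the functions are defined inline rather than in separate packages.

```python
import functools

_disabled = set()
calls = []  # names of functions that were actually benchmarked


def disable(deco):
    _disabled.add(deco)


def enable(deco):
    _disabled.discard(deco)


def benchmark(func):
    if benchmark in _disabled:
        return func                      # raw function, no wrapper
    @functools.wraps(func)
    def deco(*args, **kwargs):
        calls.append(func.__name__)
        return func(*args, **kwargs)
    return deco


disable(benchmark)

@benchmark
def foo():
    return "foo"

enable(benchmark)

@benchmark
def bar():
    return "bar"

disable(benchmark)    # too late for bar: it was already wrapped
foo()
bar()
```

This is why the original needs `reload(foopkg)` to pick up a changed setting: the module body, and with it the decoration, has to run again.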
EDIT:
Here's the class suggested by @nneonneo. In order to use it, the decorator must be called as a function (@benchmark(), not @benchmark).
class benchmark:
    disabled = False

    @classmethod
    def enable(cls):
        cls.disabled = False

    @classmethod
    def disable(cls):
        cls.disabled = True

    def __call__(self,func):
        if self.disabled:
            return func
        @functools.wraps(func)
        def deco(*args,**kwargs):
            print("--> benchmarking %s(%s,%s)" % (func.__name__,args,kwargs))
            ret = func(*args,**kwargs)
            print("<-- done")
            return ret  # pass the wrapped function's result through
        return deco
I would implement a check for a config file inside the decorator's body. If benchmark has to be used according to the config file, then I would go to your current decorator's body. If not, I would return the function and do nothing more. Something in this flavor:
# deco.py
def benchmark(func):
    if config == 'dontUseDecorators': # no use of decorator
        # do nothing
        return func
    def decorator(): # else call decorator
        # fancy benchmarking
        return func()
    return decorator
What happens when calling a decorated function? The @ in
@benchmark
def f():
    # body comes here
    pass
is syntactic sugar for this
f = benchmark(f)
so if the config tells you to skip the decorator, you are just doing f = f, which is what you expect.
I don't think anyone has suggested this yet:
benchmark_modules = {'mod1', 'mod2'} # Load this from a config file

def benchmark(func):
    if func.__module__ not in benchmark_modules:
        return func
    def decorator():
        # fancy benchmarking
        return func()
    return decorator
Each function or method has a __module__ attribute that is the name of the module where the function is defined. Create a whitelist (or blacklist if you prefer) of modules where benchmarking is to occur, and if you don't want to benchmark that module just return the original undecorated function.
Another straightforward way:
# mymodule.py
from deco import benchmark

class foo(object):
    def f():
        # code
        pass
    if <config.use_benchmark>:
        f = benchmark(f)

    def g():
        # more code
        pass
    if <config.use_benchmark>:
        g = benchmark(g)
Here's a workaround to automatically toggle a decorator (here: @profile used by line_profiler):
if type(__builtins__) is not dict or 'profile' not in __builtins__: profile = lambda x: x
More info
This conditional instantiation of the profile variable (as a no-op lambda that returns the function unchanged, created only if needed) prevents a NameError when importing our module, in which the @profile decorator is applied to every profiled user function. If I ever want to use the decorator for profiling, it will still work, because it is not overwritten (it already exists when the module is run through the external kernprof script, which provides this decorator).
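A variant that avoids inspecting `__builtins__` directly, sketched on the assumption that kernprof makes `profile` available through the `builtins` module. The `square` function is only a placeholder for a profiled user function.

```python
import builtins

# Use the injected profiler if one is present; otherwise fall back to a
# decorator that returns the function unchanged.
profile = getattr(builtins, "profile", lambda func: func)


@profile
def square(x):
    return x * x


value = square(4)
```

The fallback must return the function itself: a fallback that returns None would replace every decorated function with None.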