Pythonic way to create closures - python

I have code that exists in a file outside of a GUI, but generates methods to be called by the GUI. For example, this file contains functions that look like:
# old code
def fixDictionary(dictionary, key, new_value):
    def fix():
        dictionary[key] = new_value
    return fix
The general approach of wrapping dictionary in a closure works fine, but this approach leads to a lot of boilerplate code for creating parameter-less functions. I made a simple decorator that does this for me, shown below.
# new code
from functools import wraps
def strip_args(function):
    def outer(*args, **kwargs):
        @wraps(function)
        def inner():
            function(*args, **kwargs)
        return inner
    return outer
@strip_args
def fixDictionary(dictionary, key, new_value):
    dictionary[key] = new_value
For reference, the usage of this function looks something like:
dictionary = {'key': 'old_value'}
fixer = fixDictionary(dictionary, 'key', 'new_value')
fixer()
print(dictionary) # {'key': 'new_value'}
Then, I also have a bunch of methods in my code that look like:
# old code
def checkDictionary(dictionary):
    errors = []
    for key, value in dictionary.items():
        if value == 'old_value':
            errors.append(fixDictionary(dictionary, key, 'new_value'))
    return errors
If not clear, these methods check objects for errors and then return anonymous functions that the GUI can call in order to correct those errors. However, all of these methods initialize a blank container, add items to it, and then return it. In order to remove the repeated code in all of these functions, I wrote another decorator:
# new code
def init_and_return(**init_dict):
    if len(init_dict) != 1:
        raise ValueError('Exactly one "name=type" pair should be supplied')
    _name, _type = init_dict.items()[0]
    def outer(function):
        @wraps(function)
        def inner(*args, **kwargs):
            _value = _type()
            function.func_globals[_name] = _value
            function(*args, **kwargs)
            return _value
        return inner
    return outer
@init_and_return(errors=list)
def checkDictionary(dictionary):
    for key, value in dictionary.items():
        if value == 'old_value':
            errors.append(fixDictionary(dictionary, key, 'new_value'))
Now, the final usage looks something like:
dictionary = {'key': 'old_value'}
errors = checkDictionary(dictionary) # [<function fixDictionary at 0x01806C30>]
errors[0]()
print(dictionary) # {'key': 'new_value'}
This also works great and allows me to avoid writing more boilerplate for these functions as well. I have two questions about the above implementation:
1. Is there a more Pythonic way to implement this functionality? The purpose is to eliminate all of the boilerplate code from each function, but writing the functions strip_args and init_and_return definitely strained the brain. While functions like this shouldn't have to be written often, it seems like they are far separated from their actual behavior.
2. The line function.func_globals[_name] = _value has undesired behavior: it allows errors to be accessed from the global scope. This isn't the end of the world, because the variable is reset every time a function is called, but is there any way for me to set a local variable instead? I have tried changing this line to locals()[_name] = _value, but the scope doesn't carry through to the function. Is this level of meta-programming beyond the scope of what is intended in Python?

I figured out a way to address my second question by implementing some book-keeping code into the init_and_return function that checks whether it's overwriting a global variable, and then restoring it if so (or deleting it if not).
def init_and_return(**init_dict):
    # this could be extended to allow for more than one k-v argument
    if len(init_dict) != 1:
        raise ValueError('Exactly one "name=type" pair should be supplied')
    _name, _type = init_dict.items()[0]
    def outer(function):
        @wraps(function)
        def inner(*args, **kwargs):
            # instantiate a new container
            _value = _type()
            # used to roll-back the original global variable
            _backup, _check = None, False
            # store original global variable (if it exists)
            if _name in function.func_globals:
                _backup = function.func_globals[_name]
                _check = True
            # add container to global scope
            function.func_globals[_name] = _value
            function(*args, **kwargs)
            # roll-back if it existed beforehand, delete otherwise
            if _check:
                function.func_globals[_name] = _backup
            else:
                del function.func_globals[_name]
            return _value
        return inner
    return outer
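For comparison, here is a minimal sketch of one way to get the same two effects with less machinery. It is not the code from the question: it swaps in functools.partial for strip_args, and a generator plus a small collecting decorator for init_and_return, so nothing has to be injected into the function's globals. The names set_item, collect and check_dictionary are made up for illustration.

```python
from functools import partial, wraps


def set_item(dictionary, key, new_value):
    """Plain function; partial() below turns it into a zero-argument fixer."""
    dictionary[key] = new_value


def collect(container_type):
    """Hypothetical helper: gather everything a generator yields into a container."""
    def outer(function):
        @wraps(function)
        def inner(*args, **kwargs):
            # The generator is consumed completely before the container is returned.
            return container_type(function(*args, **kwargs))
        return inner
    return outer


@collect(list)
def check_dictionary(dictionary):
    for key, value in dictionary.items():
        if value == 'old_value':
            # partial() binds the arguments now; the call happens later.
            yield partial(set_item, dictionary, key, 'new_value')


dictionary = {'key': 'old_value'}
errors = check_dictionary(dictionary)
errors[0]()
print(dictionary)  # {'key': 'new_value'}
```

Because the container is built from yielded values, the name errors never has to exist inside the checked function, which sidesteps the global-scope question entirely.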

Related

The class doesn't take values as numbers when creating an object of a class, but takes them as default values

The task is to write a class decorator that reads a JSON file and turns its key/value pairs into attributes of the class.
But one of the conditions is that it must also be possible to pass values manually (when creating an object of the class).
I have almost done it; there is just one small problem. The program reads the data from the JSON file and passes it to the class successfully. But when I pass values manually while creating an object of the class, the values don't change and are still taken from the JSON file.
The problem only disappears when the values are passed as keyword arguments.
room = Room(1, 1) # Doesn't work
room = Room(tables=1, chairs=1) # Does work
Since the tests pass the arguments positionally, as plain numbers, I have to make it work with positional arguments, not only keyword arguments.
Here's the code.
from json import load
def json_read_data(file):
    def decorate(cls):
        def decorated(*args, **kwargs):
            if kwargs == {}:
                with open(file) as f:
                    params = {}
                    for key, value in load(f).items():
                        params[key] = value
                return cls(**params)
            else:
                return cls(*args, **kwargs)
        return decorated
    return decorate
@json_read_data('furniture.json')
class Room:
    def __init__(self, tables=None, chairs=None):
        self.tables = tables
        self.chairs = chairs
    def is_it_enough(self):
        return self.chairs * 0.5 - self.tables > 0.4
kitchen = Room() # This is passing values from JSON file
print(kitchen.__dict__) # Prints {'tables': 2, 'chairs': 5}
room = Room(tables=1, chairs=1) # This is passing values manually
print(room.__dict__) # Prints {'tables': 1, 'chairs': 1}
'''
JSON file:
{
    "tables": 2,
    "chairs": 5
}
'''
But if we change this to room = Room(1, 1), print(room.__dict__) prints {'tables': 2, 'chairs': 5} again. Please help me solve this problem!
You need to add your arguments to the decorator. Remember that your decorator is called first and then it calls the decorated function.
You could declare your decorator as: def json_read_data(file, *args): then the subsequent calls to cls() would have to be adapted to accept them. The second one already does, the first one needs modification.
It seems, this edit really worked:
        def decorated(*args, **kwargs):
            if not args and not kwargs:
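Put together, the edit above might look like the following sketch. To keep it self-contained it writes a throwaway JSON file with tempfile instead of relying on furniture.json; that temporary file and the variable names at the bottom are assumptions made for the example only.

```python
import json
import os
import tempfile


def json_read_data(file):
    def decorate(cls):
        def decorated(*args, **kwargs):
            # Fall back to the JSON file only when nothing at all was passed,
            # positionally or by keyword.
            if not args and not kwargs:
                with open(file) as f:
                    return cls(**json.load(f))
            return cls(*args, **kwargs)
        return decorated
    return decorate


# Stand-in for furniture.json, created only for this example.
tmp = tempfile.NamedTemporaryFile('w', suffix='.json', delete=False)
json.dump({'tables': 2, 'chairs': 5}, tmp)
tmp.close()


@json_read_data(tmp.name)
class Room:
    def __init__(self, tables=None, chairs=None):
        self.tables = tables
        self.chairs = chairs


from_json = Room()                   # values come from the JSON file
positional = Room(1, 1)              # values passed positionally
keyword = Room(tables=1, chairs=1)   # values passed by keyword
os.remove(tmp.name)

print(from_json.__dict__)   # {'tables': 2, 'chairs': 5}
print(positional.__dict__)  # {'tables': 1, 'chairs': 1}
```

Note that the decorated name Room is now a plain function rather than a class, so isinstance(room, Room) would no longer work; whether that matters depends on the tests.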

Getting the value of a mutable keyword argument of a decorator

I have the following code, in which I simply have a decorator for caching a function's results, and as a concrete implementation, I used the Fibonacci function.
After playing around with the code, I wanted to print the cache variable that is initialized in the cache decorator.
(It's not because I suspect the cache might be faulty; I simply want to know how to access it without going into debug mode and putting a breakpoint inside the decorator.)
I tried to explore the fib_w_cache function in debug mode, which is supposed to actually be the wrapped fib_w_cache, but with no success.
import timeit
def cache(f, cache = dict()):
    def args_to_str(*args, **kwargs):
        return str(args) + str(kwargs)
    def wrapper(*args, **kwargs):
        args_str = args_to_str(*args, **kwargs)
        if args_str in cache:
            #print("cache used for: %s" % args_str)
            return cache[args_str]
        else:
            val = f(*args, **kwargs)
            cache[args_str] = val
            return val
    return wrapper
@cache
def fib_w_cache(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_w_cache(n-2) + fib_w_cache(n-1)
def fib_wo_cache(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_wo_cache(n-1) + fib_wo_cache(n-2)
print(timeit.timeit('[fib_wo_cache(i) for i in range(0,30)]', globals=globals(), number=1))
print(timeit.timeit('[fib_w_cache(i) for i in range(0,30)]', globals=globals(), number=1))
I admit this is not an "elegant" solution in a sense, but keep in mind that python functions are also objects. So with some slight modification to your code, I managed to inject the cache as an attribute of a decorated function:
import timeit
def cache(f):
    def args_to_str(*args, **kwargs):
        return str(args) + str(kwargs)
    def wrapper(*args, **kwargs):
        args_str = args_to_str(*args, **kwargs)
        if args_str in wrapper._cache:
            #print("cache used for: %s" % args_str)
            return wrapper._cache[args_str]
        else:
            val = f(*args, **kwargs)
            wrapper._cache[args_str] = val
            return val
    wrapper._cache = {}
    return wrapper
@cache
def fib_w_cache(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_w_cache(n-2) + fib_w_cache(n-1)
@cache
def fib_w_cache_1(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_w_cache_1(n-2) + fib_w_cache_1(n-1)
def fib_wo_cache(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_wo_cache(n-1) + fib_wo_cache(n-2)
print(timeit.timeit('[fib_wo_cache(i) for i in range(0,30)]', globals=globals(), number=1))
print(timeit.timeit('[fib_w_cache(i) for i in range(0,30)]', globals=globals(), number=1))
print(fib_w_cache._cache)
print(fib_w_cache_1._cache) # never called, so it stays empty: the caches are different instances for different functions
cache is of course a perfectly normal local variable in scope within the cache function, and a perfectly normal nonlocal cellvar in scope within the wrapper function, so if you want to access the value from there, you just do it—as you already are.
But what if you wanted to access it from somewhere else? Then there are two options.
First, cache happens to be defined at the global level, meaning any code anywhere (that hasn't hidden it with a local variable named cache) can access the function object.
And if you're trying to access the values of a function's default parameters from outside the function, they're available in the attributes of the function object. The inspect module docs explain the inspection-oriented attributes of each builtin type:
__defaults__ is a sequence of the values for all positional-or-keyword parameters, in order.
__kwdefaults__ is a mapping from keywords to values for all keyword-only parameters.
So:
>>> def f(a, b=0, c=1, *, d=2, e=3): pass
>>> f.__defaults__
(0, 1)
>>> f.__kwdefaults__
{'e': 3, 'd': 2}
So, for a simple case where you know there's exactly one default value and know which argument it belongs to, all you need is:
>>> cache.__defaults__[0]
{}
If you need to do something more complicated or dynamic, like get the default value for c in the f function above, you need to dig into other information—the only way to know that c's default value will be the second one in __defaults__ is to look at the attributes of the function's code object, like f.__code__.co_varnames, and figure it out from there. But usually, it's better to just use the inspect module's helpers. For example:
>>> inspect.signature(f).parameters['c'].default
1
>>> inspect.signature(cache).parameters['cache'].default
{}
Alternatively, if you're trying to access the cache from inside fib_w_cache, while there's no variable in lexical scope in that function body you can look at, you do know that the function body is only called by the decorator wrapper, and it is available there.
So, you can get your stack frame
frame = inspect.currentframe()
… follow it back to your caller:
back = frame.f_back
… and grab it from that frame's locals:
back.f_locals['cache']
It's worth noting that f_locals works like the locals function: it's actually a copy of the internal locals storage, so modifying it may have no effect, and that copy flattens nonlocal cell variables to regular local variables. If you wanted to access the actual cell variable, you'd have to grub around in things like back.f_code.co_freevars to get the index and then dig it out of the function object's __closure__. But usually, you don't care about that.
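As a rough illustration of that last route, the sketch below pairs the names in co_freevars with the cells in __closure__ and reads the cell contents. The decorator memo and the function square are made-up stand-ins, not the code from the question; the only assumption relied on is that the two tuples line up index by index.

```python
def memo(f):
    cache = {}  # becomes a closure cell of wrapper

    def wrapper(*args):
        if args not in cache:
            cache[args] = f(*args)
        return cache[args]
    return wrapper


@memo
def square(n):
    return n * n


square(3)

# co_freevars lists the closed-over names; __closure__ holds one cell per name,
# in the same order.
free_names = square.__code__.co_freevars
cells = dict(zip(free_names, (cell.cell_contents for cell in square.__closure__)))
print(cells['cache'])  # {(3,): 9}
```

The dictionary obtained this way is the live object, not a copy, so mutating it changes what the wrapper sees.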
For the sake of completeness: Python has a built-in caching decorator, functools.lru_cache, which comes with some inspection mechanisms:
from functools import lru_cache
@lru_cache(maxsize=None)
def fib_w_cache(n):
    if n == 0: return 0
    elif n == 1: return 1
    else:
        return fib_w_cache(n-2) + fib_w_cache(n-1)
print('fib_w_cache(10) = ', fib_w_cache(10))
print(fib_w_cache.cache_info())
Prints:
fib_w_cache(10) = 55
CacheInfo(hits=8, misses=11, maxsize=None, currsize=11)
I managed to find a solution (in some sense following @Patrick Haugh's advice).
I simply accessed cache.__defaults__[0], which holds the cache's dict.
The insights about the shared cache and how to avoid it were also quite useful.
Just as a note, the cache dictionary can only be accessed through the cache function object. It cannot be accessed through the decorated functions (at least as far as I understand). It logically aligns well with the fact that the cache is shared in my implementation, where on the other hand, in the alternative implementation that was proposed, it is local per decorated function.
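The sharing described here can be seen in a small sketch. It keeps the mutable default argument and the string key from the question, but the decorated functions double and triple are hypothetical, chosen so that the collision is visible.

```python
def cache(f, cache={}):  # the default dict is created once and shared
    def wrapper(*args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in cache:
            cache[key] = f(*args, **kwargs)
        return cache[key]
    return wrapper


@cache
def double(n):
    return 2 * n


@cache
def triple(n):
    return 3 * n


first = double(5)   # computed and stored under the key "(5,){}"
second = triple(5)  # same key, so the value stored by double() comes back
shared = cache.__defaults__[0]

print(first, second)  # 10 10
print(shared)         # {'(5,){}': 10}
```

Since the key does not include the function, two decorated functions called with the same arguments silently return each other's results, which is one more reason to prefer a per-function cache.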
You can make a class into a wrapper.
def args_to_str(*args, **kwargs):
    return str(args) + str(kwargs)
class Cache(object):
    def __init__(self, func):
        self.func = func
        self.cache = {}
    def __call__(self, *args, **kwargs):
        args_str = args_to_str(*args, **kwargs)
        if args_str in self.cache:
            return self.cache[args_str]
        else:
            val = self.func(*args, **kwargs)
            self.cache[args_str] = val
            return val
Each function has its own cache. You can access it through the cache attribute of the decorated function (function.cache). This also lets you attach any other methods you wish to your function.
If you wanted all decorated functions to share the same cache, you could use a class variable instead of an instance variable:
class SharedCache(object):
    cache = {}
    def __init__(self, func):
        self.func = func
    # the rest of the code is the same
@SharedCache
def function_1(stuff):
    things
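A short usage sketch of the per-function variant follows. The call to functools.update_wrapper is an addition that is not in the answer above; it copies the wrapped function's name and docstring onto the instance. The function fib is only an example.

```python
from functools import update_wrapper


class Cache(object):
    def __init__(self, func):
        self.func = func
        self.cache = {}
        # Addition for illustration: keep __name__ and __doc__ of the wrapped function.
        update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        key = str(args) + str(kwargs)
        if key not in self.cache:
            self.cache[key] = self.func(*args, **kwargs)
        return self.cache[key]


@Cache
def fib(n):
    """Return the n-th Fibonacci number."""
    return n if n < 2 else fib(n - 2) + fib(n - 1)


result = fib(10)
print(result)          # 55
print(len(fib.cache))  # 11, one entry for each n from 0 to 10
print(fib.__name__)    # fib
```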

Getting list of undefined functions in Python code

Given Python code,
def foo():
    def bar():
        pass
    bar()
foo()
bar()
I'd like to get a list of functions which, if I execute the Python code, will result in a NameError.
In this example, the list should be ['bar'], because it is not defined in the global scope and will cause an error when executed.
Executing the code in a loop, each time defining new functions, is not performant enough.
Currently I walk the AST tree, record all function definitions and all function calls, and subtract one from the other. This gives the wrong result in this case.
It seems you are trying to write a static analyzer for Python. Maybe you are working in C, but I think it is faster for me to show the idea in Python only:
list_token = []  # assume the program has already been tokenized into this list
class Env:
    def __init__(self, super_env=None):
        self.env = set()
        self.super_env = super_env  # enclosing Env instance, or None at the top level
    def __contains__(self, key):
        if key in self.env:
            return True
        if self.super_env is not None:
            return key in self.super_env
        return False
    def add(self, key):
        self.env.add(key)
topenv = Env()
currentenv = topenv
ret = []  # return list
# is_colon, is_exiting, refing_symbol and new_symbol are placeholders for your own token tests
for tok in list_token:
    if is_colon(tok):  # is ':', i.e. enter a new scope
        currentenv = Env(super_env=currentenv)
    elif is_exiting(tok):  # exit a scope
        currentenv = currentenv.super_env
    elif refing_symbol(tok):
        if tok not in currentenv:
            ret.append(tok)
    elif new_symbol(tok):
        currentenv.add(tok)
If you think this code is not enough, please point out why. And if you want to catch every case by static analysis, I don't think that is quite possible.
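The same scope-stack idea can be sketched on top of the standard ast module instead of a token list. This is a deliberately simplified sketch, not a complete analyzer: it only tracks names introduced by def and by plain positional parameters, treats builtins as always defined, and ignores assignments, imports, classes and definition order. The class name UndefinedCallFinder is made up.

```python
import ast
import builtins


class UndefinedCallFinder(ast.NodeVisitor):
    """Collect names that are called but not defined in any enclosing scope."""

    def __init__(self):
        self.scopes = [set()]  # innermost scope is the last element
        self.undefined = []

    def visit_FunctionDef(self, node):
        # The function name belongs to the scope that contains the def.
        self.scopes[-1].add(node.name)
        # The body gets a fresh scope, seeded with the positional parameters.
        self.scopes.append({arg.arg for arg in node.args.args})
        self.generic_visit(node)
        self.scopes.pop()

    def visit_Call(self, node):
        if isinstance(node.func, ast.Name):
            name = node.func.id
            known = any(name in scope for scope in self.scopes)
            if not known and not hasattr(builtins, name):
                self.undefined.append(name)
        self.generic_visit(node)


source = """
def foo():
    def bar():
        pass
    bar()
foo()
bar()
"""

finder = UndefinedCallFinder()
finder.visit(ast.parse(source))
print(finder.undefined)  # ['bar']
```

Because a function body is visited where it is defined rather than where it is called, a helper defined later in the same scope would be reported wrongly; a two-pass variant that first collects every def per scope would be one way around that.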

Returning an Object (class) in Parallel Python

I have created a function that takes a value, does some calculations and returns the different answers as an object. However, when I try to parallelize the code using pp, I get the following error.
File "trmm.py", line 8, in __getattr__
return self.header_array[name]
RuntimeError: maximum recursion depth exceeded while calling a Python object
Here is a simple version of what I am trying to do.
class DataObject(object):
    """
    Class to handle data objects with several arrays.
    """
    def __getattr__(self, name):
        try:
            return self.header_array[name]
        except KeyError:
            try:
                return self.line[name]
            except KeyError:
                raise AttributeError("%s instance has no attribute '%s'" %(self.__class__.__name__, name))
    def __setattr__(self, name, value):
        if name in ('header_array', 'line'):
            object.__setattr__(self, name, value)
        elif name in self.line:
            self.line[name] = value
        else:
            self.header_array[name] = value
class TrmmObject(DataObject):
    def __init__(self):
        DataObject.__init__(self)
        self.header_array = {
            'header': None
        }
        self.line = {
            'longitude': None,
            'latitude': None
        }
if __name__ == '__main__':
    import pp
    ppservers = ()
    job_server = pp.Server(2, ppservers=ppservers)
    def get_monthly_values(value):
        tplObj = TrmmObject()
        tplObj.longitude = value
        tplObj.latitude = value * 2
        return tplObj
    job1 = job_server.submit(get_monthly_values, (5,), (DataObject,TrmmObject,),("numpy",))
    result = job1()
If I change return tplObj to return [tplObj.longitude, tplObj.latitude] there is no problem. However, as I said before this is a simple version, in reality this change would complicate the program a lot.
I am very grateful for any help.
You almost never need to use __getattr__ and __setattr__; doing so almost always ends with something blowing up, and infinite recursion is a typical effect. I can't really see any reason for using them here either. Be explicit and use the line and header_array dictionaries directly.
If you want a function that looks up a value over all arrays, create a function for that and call it explicitly. Naming the function __getitem__ and using [] is explicit. :-)
(And please don't call a dictionary "header_array", it's confusing).
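One plausible explanation for the recursion, offered as an assumption since the traceback is truncated: the result is presumably pickled to be sent back to the caller, and both pickle and copy rebuild an instance without calling __init__ and then probe it for __setstate__. At that moment header_array does not exist yet, so __getattr__ ends up calling itself. The sketch below keeps the attribute hooks but guards the two storage names; it uses copy.copy as a stand-in for the transfer step, and the _STORES name is made up.

```python
import copy


class DataObject(object):
    _STORES = ('header_array', 'line')

    def __init__(self):
        self.header_array = {'header': None}
        self.line = {'longitude': None, 'latitude': None}

    def __getattr__(self, name):
        # Only reached when normal lookup fails. If a storage dict itself is
        # missing (instance rebuilt without __init__), fail fast instead of
        # recursing through self.header_array again.
        if name in self._STORES:
            raise AttributeError(name)
        for store in (self.header_array, self.line):
            if name in store:
                return store[name]
        raise AttributeError("%s instance has no attribute %r"
                             % (type(self).__name__, name))

    def __setattr__(self, name, value):
        if name in self._STORES:
            object.__setattr__(self, name, value)
        elif name in self.line:
            self.line[name] = value
        else:
            self.header_array[name] = value


obj = DataObject()
obj.longitude = 5
obj.latitude = 10

clone = copy.copy(obj)  # rebuilds the instance without calling __init__
print(clone.longitude, clone.latitude)  # 5 10
```

Whether this is enough for pp depends on how it serializes results, so using the dictionaries explicitly, as suggested above, remains the simpler route.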

How to put variables on the stack/context in Python

In essence, I want to put a variable on the stack, that will be reachable by all calls below that part on the stack until the block exits. In Java I would solve this using a static thread local with support methods, that then could be accessed from methods.
Typical example: you get a request, and open a database connection. Until the request is complete, you want all code to use this database connection. After finishing and closing the request, you close the database connection.
What I need this for is a report generator. Each report consists of multiple parts, each part can rely on different calculations, and sometimes different parts rely in part on the same calculation. As I don't want to repeat heavy calculations, I need to cache them. My idea is to decorate methods with a cache decorator. The cache creates an id based on the method name and module, and its arguments, checks whether this has already been calculated in a stack variable, and executes the method if not.
I will try to clarify by showing my current implementation. What I want to do is simplify the code for those implementing calculations.
First, I have the central cache access object, which I call MathContext:
class MathContext(object):
    def __init__(self, fn):
        self.fn = fn
        self.cache = dict()
    def get(self, calc_config):
        id = create_id(calc_config)
        if id not in self.cache:
            self.cache[id] = calc_config.exec(self)
        return self.cache[id]
The fn argument is the filename the context is created in relation to, from where data can be read to be calculated.
Then we have the Calculation class:
class CalcBase(object):
    def exec(self, math_context):
        raise NotImplementedError
And here is a silly Fibonacci example. None of the real methods are actually recursive (they work on large sets of data instead), but it demonstrates how one calculation can depend on others:
class Fibonacci(CalcBase):
    def __init__(self, n): self.n = n
    def exec(self, math_context):
        if self.n < 2: return 1
        a = math_context.get(Fibonacci(self.n-1))
        b = math_context.get(Fibonacci(self.n-2))
        return a+b
What I want Fibonacci to be instead is just a decorated method:
@cache
def fib(n):
    if n<2: return 1
    return fib(n-1)+fib(n-2)
With the math_context example, when math_context goes out of scope, so do all its cached values. I want the same thing for the decorator, i.e. at point X, everything cached by @cache is dereferenced so that it can be garbage-collected.
I went ahead and made something that might just do what you want. It can be used as both a decorator and a context manager:
from __future__ import with_statement
try:
    import cPickle as pickle
except ImportError:
    import pickle
class cached(object):
    """Decorator/context manager for caching function call results.
    All results are cached in one dictionary that is shared by all cached
    functions.
    To use this as a decorator:
    @cached
    def function(...):
        ...
    The results returned by a decorated function are not cleared from the
    cache until decorated_function.clear_my_cache() or cached.clear_cache()
    is called
    To use this as a context manager:
    with cached(function) as function:
        ...
        function(...)
        ...
    The function's return values will be cleared from the cache when the
    with block ends
    To clear all cached results, call the cached.clear_cache() class method
    """
    _CACHE = {}
    def __init__(self, fn):
        self._fn = fn
    def __call__(self, *args, **kwds):
        key = self._cache_key(*args, **kwds)
        function_cache = self._CACHE.setdefault(self._fn, {})
        try:
            return function_cache[key]
        except KeyError:
            function_cache[key] = result = self._fn(*args, **kwds)
            return result
    def clear_my_cache(self):
        """Clear the cache for a decorated function
        """
        try:
            del self._CACHE[self._fn]
        except KeyError:
            pass # no cached results
    def __enter__(self):
        return self
    def __exit__(self, type, value, traceback):
        self.clear_my_cache()
    def _cache_key(self, *args, **kwds):
        """Create a cache key for the given positional and keyword
        arguments. pickle.dumps() is used because there could be
        unhashable objects in the arguments, but passing them to
        pickle.dumps() will result in a string, which is always hashable.
        I used this to make the cached class as generic as possible. Depending
        on your requirements, other key generating techniques may be more
        efficient
        """
        return pickle.dumps((args, sorted(kwds.items())), pickle.HIGHEST_PROTOCOL)
    @classmethod
    def clear_cache(cls):
        """Clear everything from all functions from the cache
        """
        cls._CACHE = {}
if __name__ == '__main__':
    # used as decorator
    @cached
    def fibonacci(n):
        print "calculating fibonacci(%d)" % n
        if n == 0:
            return 0
        if n == 1:
            return 1
        return fibonacci(n - 1) + fibonacci(n - 2)
    for n in xrange(10):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))
    def lucas(n):
        print "calculating lucas(%d)" % n
        if n == 0:
            return 2
        if n == 1:
            return 1
        return lucas(n - 1) + lucas(n - 2)
    # used as context manager
    with cached(lucas) as lucas:
        for i in xrange(10):
            print 'lucas(%d) = %d' % (i, lucas(i))
    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))
    cached.clear_cache()
    for n in xrange(9, -1, -1):
        print 'fibonacci(%d) = %d' % (n, fibonacci(n))
This question seems to be two questions:
a) sharing a DB connection
b) caching/memoizing
You have answered (b) yourself.
As for (a), I don't understand why you need to put it on the stack. You can do one of these:
- use a class, with the connection as an attribute of it
- decorate all your functions so that they get a connection from a central location
- have each function explicitly use a global connection method
- create a connection and pass it around, or create a context object and pass that around, with the connection as part of the context
- etc.
You could use a global variable wrapped in a getter function:
connection = None  # module-level placeholder; set on the first call
def getConnection():
    global connection
    if connection:
        return connection
    connection = createConnection()
    return connection
"you get a request, and open a database connection.... you close the database connection."
This is what objects are for. Create the connection object, pass it to other objects, and then close it when you're done. Globals are not appropriate. Simply pass the value around as a parameter to the other objects that are doing the work.
"Each report consist of multiple parts, each part can rely on different calculations, sometimes different parts relies in part on the same calculation.... I need to cache them"
This is what objects are for. Create a dictionary with useful calculation results and pass that around from report part to report part.
You don't need to mess with "stack variables", "static thread local" or anything like that.
Just pass ordinary variable arguments to ordinary method functions. You'll be a lot happier.
class MemoizedCalculation( object ):
    pass
class Fibonacci( MemoizedCalculation ):
    def __init__( self ):
        self.cache= { 0: 1, 1: 1 }
    def __call__( self, arg ):
        if arg not in self.cache:
            self.cache[arg]= self(arg-1) + self(arg-2)
        return self.cache[arg]
class MathContext( object ):
    def __init__( self ):
        self.fibonacci = Fibonacci()
You can use it like this
>>> mc= MathContext()
>>> mc.fibonacci( 4 )
5
You can define any number of calculations and fold them all into a single container object.
If you want, you can make MathContext into a formal context manager so that it works with the with statement. Add these two methods to MathContext.
    def __enter__( self ):
        print "Initialize"
        return self
    def __exit__( self, type_, value, traceback ):
        print "Release"
Then you can do this.
with MathContext() as mc:
    print mc.fibonacci( 4 )
At the end of the with statement, you are guaranteed that the __exit__ method has been called.
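Assembled into one piece and written for Python 3, the idea might look like the sketch below. Replacing the Fibonacci object in __exit__ is an assumption about what "release" should mean here; the answer above only prints a message at that point.

```python
class Fibonacci(object):
    def __init__(self):
        self.cache = {0: 1, 1: 1}  # seed values, as in the answer above

    def __call__(self, arg):
        if arg not in self.cache:
            self.cache[arg] = self(arg - 1) + self(arg - 2)
        return self.cache[arg]


class MathContext(object):
    def __init__(self):
        self.fibonacci = Fibonacci()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Assumed release step: swap in a fresh calculation object so the
        # cached results of this block become garbage.
        self.fibonacci = Fibonacci()
        return False  # do not swallow exceptions


with MathContext() as mc:
    value = mc.fibonacci(4)
    cached_inside = len(mc.fibonacci.cache)

cached_after = len(mc.fibonacci.cache)
print(value)          # 5
print(cached_inside)  # 5
print(cached_after)   # 2, only the seed values remain
```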
