Python model inheritance and order of model declaration

The following code:

class ParentModel(models.Model):
    pass

class ChildA(ChildB):
    pass

class ChildB(ParentModel):
    pass

obviously fails with:

NameError: name 'ChildB' is not defined

Is there any way to get around this issue without actually reordering the class definitions? (The code is auto-generated, about 45K lines, and the order of the classes is random.)

Perfectionists look away!!
This is a workaround (a hack); the real solution would be to fix the incorrect declaration order.
WARNING: This is extremely daft.
Concept:
Imagine a namespace where anything can exist. Literally anything that is asked of it. Not the smartest thing usually, but out-of-order declaration isn't smart either, so why not?
The key problem with out-of-sequence classes is that dependent classes are defined before their dependencies, the base classes. At that point of evaluation the base classes are undefined, resulting in a NameError.
Wrapping each class in try/except statements would take as much effort as rewriting the module anyway, so that can be dismissed out of hand.
A more efficient (in terms of programmer time) means of suppressing NameError must be used. This can be achieved by making the namespace totally permissive: if a looked-up object doesn't exist, it is created, thereby avoiding a NameError. This is also the obvious danger of this approach, as a lookup becomes a creation.
Implementation:
Namespaces in Python are dictionaries, I believe, and dictionary methods can be overloaded, including the lookup method __getitem__. So mr_agreeable is a dictionary subclass with an overloaded __getitem__ that automatically creates a blank class when a lookup key doesn't exist. An instance of mr_agreeable is passed to execfile as the namespace for the classes.py script. The objects (aside from the builtins) created by the execfile call are then merged into the globals() dict of the calling script, hack.py.
This works because Python doesn't care if a class's base classes are changed after the fact.
This may be implementation dependent, I don't know. Tested on Python 2.7.3 (64-bit) on Win7 64-bit.
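As a quick illustration of that point (my own minimal example, separate from the hack itself), reassigning a class's __bases__ after the fact retargets attribute lookup:

class OldBase(object):
    greeting = "hello"

class NewBase(OldBase):
    greeting = "bonjour"

class Child(OldBase):
    pass

print(Child.greeting)          # -> hello
Child.__bases__ = (NewBase,)   # rebase after the classes already exist
print(Child.greeting)          # -> bonjour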
Assuming your out-of-order classes are defined in classes.py:

class ParentModel(object):
    name = "parentmodel"

class ChildC(ChildA):
    name = "childc"

class ChildA(ChildB):
    name = "childa"

class ChildB(ParentModel):
    name = "childb"

The loader script, let's call it hack.py:
from random import randint
from codecs import encode

class mr_agreeable(dict):
    sin_counter = 0
    nutty_factor = 0
    rdict = {0 : (0, 9), 200 : (10, 14), 500 : (15, 16), 550 : (17, 22)}

    def __getitem__(self, key):
        class tmp(object):
            pass
        tmp.__name__ = key
        if(not key in self.keys()):
            self.prognosis()
            print self.insanity()
        return self.setdefault(key, tmp)

    def prognosis(self):
        self.sin_counter += 1
        self.nutty_factor = max(filter(lambda x: x < self.sin_counter, self.rdict.keys()))

    def insanity(self):
        insane_strs = \
        [
            "Nofbyhgryl", "Fher, jul abg?", "Sbe fher", "Fbhaqf terng", "Qrsvangryl", "Pbhyqa'g nterr zber",
            "Jung pbhyq tb jebat?", "Bxl Qbnxl", "Lrc", "V srry gur fnzr jnl", "Zneel zl qnhtugre",
            "Znlor lbh fubhyq svk gung", "1 AnzrReebe vf bar gbb znal naq n 1000'f abg rabhtu", "V'ir qbar qvegvre guvatf",
            "Gur ebbz vf fgnegvat gb fcva", "Cebonoyl abg", "Npghnyyl, ab ..... nyevtug gura", "ZNXR VG FGBC",
            "BU TBQ AB", "CYRNFR AB", "LBH'ER OERNXVAT CLGUBA", "GUVF VF ABG PBAFRAGHNY", "V'Z GRYYVAT THVQB!!"
        ]
        return encode("ze_nterrnoyr: " + insane_strs[randint(*self.rdict[self.nutty_factor])], "rot13")

def the_act():
    ns = mr_agreeable()
    execfile("classes.py", ns)
    hostages = list(set(ns.keys()) - set(["__builtins__", "object"]))
    globals().update([(key, ns[key]) for key in hostages])

the_act()
mr_agreeable acts as the permissive namespace for the compiled classes.py. He reminds you this is bad form.

My previous answer showed a loader script that executed the out-of-order script via execfile but provided a dynamic namespace that created placeholder classes (these are typically base classes referenced before they are defined). It then loaded the changes from this namespace into the loader's global namespace.
This approach has two problems:
1) It's a hack.
2) The assumed base of the placeholders is the object class. So when:

class ChildC(ChildA):
    name = "childc"

is evaluated, the namespace detects that ChildA is undefined and creates a placeholder class (an object subclass). When ChildA is actually defined later in the out-of-order script, it might have a different base class than object, and so rebasing ChildC to the new ChildA will fail if ChildA's layout is incompatible with the object-based placeholder ChildC was originally created with. See this for more info.
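A minimal reproduction of that failure mode (my own example; the placeholder has a plain object layout, while the real base turns out to extend a C-level type):

class Placeholder(object):   # stand-in created before the real base exists
    pass

class ChildC(Placeholder):
    pass

class ChildA(list):          # the real base, with a different C-level layout
    pass

try:
    ChildC.__bases__ = (ChildA,)
except TypeError as e:
    print(e)   # __bases__ assignment fails: layout differs, so the class must be remade instead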
So I created a new script, which actually rewrites the input out-of-order script using a similar concept to the previous hack and this script. The new script is used by calling:
python mr_agreeable.py -i out_of_order.py -o ordered.py
mr_agreeable.py:
import os
import sys
from codecs import encode
from random import randint
import getopt
import inspect
import types

__doc__ = \
'''
A python script that re-orders out-of-sequence class definitions
'''

class rebase_meta(type):
    '''
    Rebase metaclass
    Automatically rebases classes created with this metaclass upon
    modification of a class's base classes
    '''
    org_base_classes = {}
    org_base_classes_subs = {}
    base_classes = {}
    base_classes_subs = {}
    mod_loaded = False
    mod_name = ""
    mod_name_space = {}

    def __init__(cls, cls_name, cls_bases, cls_dct):
        #print "Making class: %s" % cls_name
        super(rebase_meta, cls).__init__(cls_name, cls_bases, cls_dct)
        # Remove the old base sub class listings
        bases = rebase_meta.base_classes_subs.items()
        for (base_cls_name, sub_dict) in bases:
            sub_dict.pop(cls_name, None)
        # Add the class to its bases' sub class listings
        for cls_base in cls_bases:
            if(not rebase_meta.base_classes_subs.has_key(cls_base.__name__)):
                rebase_meta.base_classes_subs[cls_base.__name__] = {}
                rebase_meta.base_classes[cls_base.__name__] = cls_base
            rebase_meta.base_classes_subs[cls_base.__name__][cls_name] = cls
        # Rebase the sub classes to the new base
        if(rebase_meta.base_classes.has_key(cls_name)): # Is this class a base class?
            subs = rebase_meta.base_classes_subs[cls_name]
            rebase_meta.base_classes[cls_name] = cls # Update the base class dictionary to the new class
            for (sub_cls_name, sub_cls) in subs.items():
                if(cls_name == sub_cls_name):
                    continue
                sub_bases_names = [x.__name__ for x in sub_cls.__bases__]
                sub_bases = tuple([rebase_meta.base_classes[x] for x in sub_bases_names])
                try:
                    # Attempt to rebase the sub class
                    sub_cls.__bases__ = sub_bases
                    #print "Rebased class: %s" % sub_cls_name
                except TypeError:
                    # The old sub class is incompatible with the new base class, so remake the sub
                    if(rebase_meta.mod_loaded):
                        new_sub_cls = rebase_meta(sub_cls_name, sub_bases, dict(sub_cls.__dict__.items() + [("__module__", rebase_meta.mod_name)]))
                        rebase_meta.mod_name_space[sub_cls_name] = new_sub_cls
                    else:
                        new_sub_cls = rebase_meta(sub_cls_name, sub_bases, dict(sub_cls.__dict__.items()))
                    subs[sub_cls_name] = new_sub_cls

    @classmethod
    def register_mod(self, imod_name, imod_name_space):
        if(not self.mod_loaded):
            self.org_base_classes = self.base_classes.copy()
            self.org_base_classes_subs = self.base_classes_subs.copy()
            self.mod_loaded = True
        else:
            self.base_classes = self.org_base_classes
            self.base_classes_subs = self.org_base_classes_subs
        self.mod_name = imod_name
        self.mod_name_space = imod_name_space

# Can't subclass these classes
forbidden_subs = \
[
    "bool",
    "buffer",
    "memoryview",
    "slice",
    "type",
    "xrange",
]

# Builtin, sub-classable classes
org_class_types = filter(lambda x: isinstance(x, type) and (not x.__name__ in forbidden_subs) and x.__module__ == "__builtin__", types.__builtins__.values())

# Builtin classes recreated with the rebasing metaclass
class_types = [(cls.__name__, rebase_meta(cls.__name__, (cls,), {})) for cls in org_class_types]

# Overwrite the builtin classes
globals().update(class_types)
class mr_quiet(dict):
    '''
    A namespace class that creates placeholder classes upon
    a non-existent lookup. mr_quiet doesn't say much.
    '''
    def __getitem__(self, key):
        if(not key in self.keys()):
            if(hasattr(__builtins__, key)):
                return getattr(__builtins__, key)
            else:
                if(not key in self.keys()):
                    self.sanity_check()
                return self.setdefault(key, rebase_meta(key, (object,), {}))
        else:
            return dict.__getitem__(self, key)

    def sanity_check(self):
        pass

class mr_agreeable(mr_quiet):
    '''
    A talkative cousin of mr_quiet.
    '''
    sin_counter = 0
    nutty_factor = 0
    rdict = {0 : (0, 9), 200 : (10, 14), 500 : (15, 16), 550 : (17, 22)}

    def sanity_check(self):
        self.prognosis()
        print self.insanity()

    def prognosis(self):
        self.sin_counter += 1
        self.nutty_factor = max(filter(lambda x: x < self.sin_counter, self.rdict.keys()))

    def insanity(self):
        insane_strs = \
        [
            "Nofbyhgryl", "Fher, jul abg?", "Sbe fher", "Fbhaqf terng", "Qrsvangryl", "Pbhyqa'g nterr zber",
            "Jung pbhyq tb jebat?", "Bxl Qbnxl", "Lrc", "V srry gur fnzr jnl", "Zneel zl qnhtugre",
            "Znlor lbh fubhyq svk gung", "1 AnzrReebe vf bar gbb znal naq n 1000'f abg rabhtu", "V'ir qbar qvegvre guvatf",
            "Gur ebbz vf fgnegvat gb fcva", "Cebonoyl abg", "Npghnyyl, ab ..... nyevtug gura", "ZNXR VG FGBC",
            "BU TBQ AB", "CYRNFR AB", "LBH'ER OERNXVAT CLGUBA", "GUVF VF ABG PBAFRAGHNY", "V'Z GRYYVAT THVQB!!"
        ]
        return encode("ze_nterrnoyr: " + insane_strs[randint(*self.rdict[self.nutty_factor])], "rot13")
def coll_up(ilist, base = 0, count = 0):
    '''
    Recursively collapse nested lists at depth base and above
    '''
    tlist = []
    if(isinstance(ilist, __builtins__.list) or isinstance(ilist, __builtins__.tuple)):
        for q in ilist:
            tlist += coll_up(q, base, count + 1)
    else:
        if(base > count):
            tlist = ilist
        else:
            tlist = [ilist]
    return [tlist] if((count != 0) and (base > count)) else tlist

def build_base_dict(ilist):
    '''
    Creates a dictionary of class : class bases pairs
    '''
    base_dict = {}
    def build_base_dict_helper(iclass, idict):
        idict[iclass] = list(iclass.__bases__)
        for x in iclass.__bases__:
            build_base_dict_helper(x, idict)
    for cur_class in ilist:
        build_base_dict_helper(cur_class, base_dict)
    return base_dict

def transform_base_to_sub(idict):
    '''
    Transforms a base dict into a dictionary of class : sub classes pairs
    '''
    sub_dict = {}
    classes = idict.keys()
    for cur_class in idict:
        sub_dict[cur_class] = filter(lambda cls: cur_class in idict[cls], classes)
    return sub_dict

recur_class_helper = lambda idict, ilist = []: [[key, recur_class_helper(idict, idict[key])] for key in ilist]
recur_class = lambda idict: recur_class_helper(idict, idict.keys())
class proc_func(list):
    '''
    Cmdline processing class
    '''
    def __init__(self, name = "", *args, **kwargs):
        self.name = name
        super(proc_func, self).__init__(*args, **kwargs)

    def get_args(self, *args):
        self.extend(filter(lambda x: x, args))

    def __call__(self, *args):
        print self.name
        print self

class proc_inputs(proc_func):
    def get_args(self, *args):
        self.extend(filter(os.path.isfile, args))

class proc_outputs(proc_func):
    pass

class proc_helper(proc_func):
    '''
    Help function
    Print help information
    '''
    def get_args(self, *args):
        self()

    def __call__(self, *args):
        print __file__
        print __doc__
        print "Help:\n\t%s -h -i inputfile -o outputfile" % sys.argv[0]
        print "\t\t-h or --help\tPrint this help message"
        print "\t\t-i or --input\tSpecifies the input script"
        print "\t\t-o or --output\tSpecifies the output script"
        sys.exit()
if __name__ == "__main__":
    proc_input = proc_inputs("input")
    proc_output = proc_outputs("output")
    proc_help = proc_helper("help")
    cmd_line_map = \
    {
        "-i" : proc_input,
        "--input" : proc_input,
        "-o" : proc_output,
        "--output" : proc_output,
        "-h" : proc_help,
        "--help" : proc_help
    }
    try:
        optlist, args = getopt.getopt(sys.argv[1:], "hi:o:", ["help", "input=", "output="])
        for (key, value) in optlist:
            cmd_line_map[key].get_args(value)
    except getopt.GetoptError:
        proc_help()
    if(len(proc_input) != len(proc_output)):
        print "Input files must have a matching output file"
        proc_help()
    elif(not proc_input):
        proc_help()
    else:
        in_out_pairs = zip(proc_input, proc_output)
        for (in_file, out_file) in in_out_pairs:
            dodgy_module_name = os.path.splitext(in_file)[0]
            sys.modules[dodgy_module_name] = types.ModuleType(dodgy_module_name)
            sys.modules[dodgy_module_name].__file__ = in_file
            # Make a fake space post haste
            name_space = mr_agreeable\
            (
                [
                    ("__name__", dodgy_module_name), # Needed for the created classes to identify with the fake module
                    ("__module__", dodgy_module_name), # Needed to fool the inspect module
                ] + \
                class_types
            )
            # Exclude these from returning
            exclusions = name_space.keys()
            # Associate the fake name space with the rebasing metaclass
            rebase_meta.register_mod(dodgy_module_name, name_space)
            # Run dodgy code
            execfile(in_file, name_space)
            # Bring back dodgy classes
            import_classes = [cls if(isinstance(cls, type) and not cls_name in exclusions) else None for (cls_name, cls) in name_space.items()]
            dodgy_import_classes = filter(lambda x: x, import_classes)
            # Create base and sub class dictionaries
            base_dict = build_base_dict(dodgy_import_classes)
            sub_dict = transform_base_to_sub(base_dict)
            # Create sets of base and sub classes
            base_set = reduce(lambda x, y: x | y, map(set, base_dict.values()), set([]))
            sub_set = reduce(lambda x, y: x | y, map(set, sub_dict.values()), set([]))
            kings = list(base_set - sub_set) # A list of bases which are not subs
            kingdoms = recur_class_helper(sub_dict, kings) # A subclass tree of lists
            lineages = coll_up(kingdoms, 2) # Flatten the tree branches at and below the 2nd level
            # Filter only for the classes created in the dodgy module
            inbred_lines = [filter(lambda x: x.__module__ == dodgy_module_name, lineage) for lineage in lineages]
            # Load source
            for lineage in inbred_lines:
                for cls in lineage:
                    setattr(cls, "_source", inspect.getsource(cls))
            # Write source
            with open(out_file, "w") as file_h:
                for lineage in inbred_lines:
                    for cls in lineage:
                        file_h.write(cls._source + "\n")
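As an aside, the tree-building helpers above can be sanity-checked in isolation; a toy run with strings standing in for classes (my example, not part of the script):

tree = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
print(recur_class_helper(tree, ["A"]))
# -> [['A', [['B', [['D', []]]], ['C', []]]]]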

Related

Store and restore exec's "execution state" / global namespace

I'm trying to run code snippets with Python's exec function, and I want to be able to rerun (single) snippets so that they start their execution from the "execution state" the previous snippet (its "parent") stored.
Probably easier to explain as code:
class Snippet:
    def __init__(self, source, parent):
        self.source = source
        self.parent = parent
        self.namespace = None

    def execute(self):
        self.namespace = cloneNamespace(self.parent)
        executeInNamespace(self.source, self.namespace)

snip1 = Snippet('a = 1', None)
snip2 = Snippet('print(a)', snip1)
snip3 = Snippet('a = 2', snip2)

snip1.execute() # a = 1
snip2.execute() # print(a) --> 1
snip3.execute() # a = 2
snip2.execute() # print(a) --> 1 (!)
The second call to snip2 should run again from the namespace / execution state after snip1: parent.namespace is still snip1.namespace, so a is 1 and snip2.execute() should print 1.
The question is what cloneNamespace and executeInNamespace should look like.
The best solution I could come up with so far is:
import dill
import types

def cloneNamespace(parent):
    if parent is None:
        return {}
    newNamespace = {}
    for k, v in parent.namespace.items():
        # functions and classes as refs, because they store a __globals__ ref
        if isinstance(v, types.FunctionType) or isinstance(v, type):
            newNamespace[k] = v
        else:
            newNamespace[k] = dill.copy(v)
    return newNamespace

# global "wrapper namespace" so that __globals__ of funcs and classes don't become invalid
globalNamespace = {}

def executeInNamespace(source, namespace):
    globalNamespace.clear()
    globalNamespace.update(namespace)
    exec(source, globalNamespace)
    namespace.update(globalNamespace)
That works for basic stuff but fails, for example, with these snippets:

snip1 = Snippet('''class A:
    def bla(self):
        print(x)''', None)
snip2 = Snippet('aInst = A()', snip1)
snip3 = Snippet('x = 1', snip2)
snip4 = Snippet('aInst.bla()', snip3)
I also tried dill.[dump,load]_module but couldn't get it working in the exec environment.
I know there are going to be issues with file descriptors and similar, but I'd like to handle them the same straightforward way everything else is handled (rerun from the previous state).
Any ideas?
Are there other options -- without exec -- to achieve this?
Thx!

Optimizing modifiable named list based on namedtuple

My goal is to optimize a framework based on a stack of modifiers for CSV-sourced lists. Each modifier uses a header list to work on a named basis.
CSV example (including header):
date;place
13/02/2013;New York
15/04/2012;Buenos Aires
29/10/2010;Singapour
I have written some code based on namedtuple in order to be able to use lists generated by the csv module without reorganizing data every time. The generated code is below:

class MyNamedList(object):
    __slots__ = ("__values")
    _fields = ['date', 'ignore', 'place']

    def __init__(self, values):
        self.__values = values
        if len(self.__values) <= 151:
            for i in range(len(self.__values), 151):
                self.__values += [None,]

    @property
    def date(self):
        return self.__values[0]

    @date.setter
    def date(self, val):
        self.__values[0] = val

    @property
    def ignore(self):
        return self.__values[150]

    @ignore.setter
    def ignore(self, val):
        self.__values[150] = val

    @property
    def place(self):
        return self.__values[1]

    @place.setter
    def place(self, val):
        self.__values[1] = val
I must say I am very disappointed with the performance of this class. Calling a simple modifier function (which sets "ignore" to True 100 times; yes, I know it is useless) for each line of a 70000-line CSV file takes 9 seconds with PyPy (5.5 with the original Python), whereas equivalent code using a plain list named foo takes 1.1 seconds (the same with PyPy and the original Python).
Is there anything I could do to get comparable performance between the two approaches? To me, record.ignore = True could be directly inlined (or so) and therefore translated into record[150] = True. Is there any blocking point I don't see that prevents this from happening?
Note that the record I am modifying is actually (for now) not created for each line in the CSV file, meaning that adding more items into the list happens only once, before the iteration.
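To make the comparison concrete, a micro-benchmark along these lines isolates the property-call overhead from the CSV work (a sketch; the Rec class is a stand-in for the generated MyNamedList):

import timeit

setup = """
class Rec(object):
    __slots__ = ("_values",)
    def __init__(self, values):
        self._values = values
    @property
    def ignore(self):
        return self._values[150]
    @ignore.setter
    def ignore(self, val):
        self._values[150] = val

rec = Rec([None] * 151)
raw = [None] * 151
"""

# property setter vs. direct list indexing
print(timeit.timeit("rec.ignore = True", setup=setup, number=1000000))
print(timeit.timeit("raw[150] = True", setup=setup, number=1000000))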
Update: sample code
--> Using namedlist

import namedlist

MyNamedList = namedlist.namedlist("MyNamedList", {"a": 1, "b": 2, "ignore": 150})
test = MyNamedList([0, 1])

def foo(a):
    test.ignore = True # x100 times

import csv
stream = csv.reader(open("66666.csv", "rb"))
for i in stream:
    foo(i)
--> Not using namedlist

import namedlist
import csv

MyNamedList = namedlist.namedlist("MyNamedList", {"a": 1, "b": 2, "ignore": 150})
test = MyNamedList([0, 1])

sample_data = []
for i in range(len(sample_data), 151):
    sample_data += [None,]

def foo(a):
    sample_data[150] = True # x100 times

stream = csv.reader(open("66666.csv", "rb"))
for i in stream:
    foo(i)
Update #2: code for namedlist.py (heavily based on namedtuple.py)

# Retrieved from http://code.activestate.com/recipes/500261/
# Licensed under the PSF license
from keyword import iskeyword as _iskeyword
import sys as _sys

def namedlist(typename, field_indices, verbose=False, rename=False):
    # Parse and validate the field names. Validation serves two purposes,
    # generating informative error messages and preventing template injection attacks.
    field_names = field_indices.keys()
    for name in [typename,] + field_names:
        if not min(c.isalnum() or c == '_' for c in name):
            raise ValueError('Type names and field names can only contain alphanumeric characters and underscores: %r' % name)
        if _iskeyword(name):
            raise ValueError('Type names and field names cannot be a keyword: %r' % name)
        if name[0].isdigit():
            raise ValueError('Type names and field names cannot start with a number: %r' % name)
    seen_names = set()
    for name in field_names:
        if name.startswith('_') and not rename:
            raise ValueError('Field names cannot start with an underscore: %r' % name)
        if name in seen_names:
            raise ValueError('Encountered duplicate field name: %r' % name)
        seen_names.add(name)
    # Create and fill in the class template
    numfields = len(field_names)
    argtxt = repr(field_names).replace("'", "")[1:-1] # tuple repr without parens or quotes
    reprtxt = ', '.join('%s=%%r' % name for name in field_names)
    max_index = -1
    for name in field_names:
        index = field_indices[name]
        if max_index < index:
            max_index = index
    max_index += 1
    template = '''class %(typename)s(object):
    __slots__ = ("__values")

    _fields = %(field_names)r

    def __init__(self, values):
        self.__values = values
        if len(self.__values) <= %(max_index)s:
            for i in range(len(self.__values), %(max_index)s):
                self.__values += [None,]''' % locals()
    for name in field_names:
        index = field_indices[name]
        template += '''

    @property
    def %s(self):
        return self.__values[%d]

    @%s.setter
    def %s(self, val):
        self.__values[%d] = val''' % (name, index, name, name, index)
    if verbose:
        print template
    # Execute the template string in a temporary namespace
    namespace = {'__name__': 'namedtuple_%s' % typename,
                 '_property': property, '_tuple': tuple}
    try:
        exec template in namespace
    except SyntaxError, e:
        raise SyntaxError(e.message + ':\n' + template)
    result = namespace[typename]
    # For pickling to work, the __module__ variable needs to be set to the frame
    # where the named tuple is created. Bypass this step in environments where
    # sys._getframe is not defined (Jython for example) or sys._getframe is not
    # defined for arguments greater than 0 (IronPython).
    try:
        result.__module__ = _sys._getframe(1).f_globals.get('__name__', '__main__')
    except (AttributeError, ValueError):
        pass
    return result

Avoiding magic numbers in Python Flask and probably most other languages

I am defining models for my app, and I need a column named 'status' for various verification procedures. Here is a simplified user model:

class User:
    id (int)
    name (str)
    status (int) # 0-New, 1-Active, 2-Inactive, 3-Reported, 4-Deleted
I asked a fellow Python developer to review my code, and he suggested that I avoid 'magic numbers'. His solution is this:

class Choices:
    @classmethod
    def get_value(cls, key):
        # get the display string if we need to show it
        for k, v in cls.CHOICES:
            if k == key:
                return v
        return ""

class UserStatusChoices(Choices):
    NEW = 0
    ACTIVE = 1
    INACTIVE = 2
    REPORTED = 3
    DELETED = 4
    CHOICES = (
        (NEW, "NEW"),
        (ACTIVE, "ACTIVE"),
        (INACTIVE, "INACTIVE"),
        (REPORTED, "REPORTED"),
        (DELETED, "DELETED"),
    )
Couldn't I use simple dictionaries instead? Does anyone see a good reason for the 'class'y solution?
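For example, something like this (an illustrative sketch of the dictionary version):

USER_STATUS = {
    "NEW": 0,
    "ACTIVE": 1,
    "INACTIVE": 2,
    "REPORTED": 3,
    "DELETED": 4,
}
# reverse mapping for display
USER_STATUS_NAMES = dict((v, k) for k, v in USER_STATUS.items())

status = USER_STATUS["ACTIVE"]   # store the int in the column
print(USER_STATUS_NAMES[status]) # -> ACTIVE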
Building on Python Enum class (with tostring/fromstring):

class Enum(object):
    @classmethod
    def tostring(cls, val):
        for k, v in vars(cls).iteritems():
            if v == val:
                return k

    @classmethod
    def fromstring(cls, str):
        return getattr(cls, str.upper(), None)

    @classmethod
    def build(cls, str):
        for val, name in enumerate(str.split()):
            setattr(cls, name, val)

class MyEnum(Enum):
    VAL1, VAL2, VAL3 = range(3)

class YourEnum(Enum):
    CAR, BOAT, TRUCK = range(3)

class MoreEnum(Enum):
    pass

print MyEnum.fromstring('Val1')
print MyEnum.tostring(2)
print MyEnum.VAL1
print YourEnum.BOAT
print YourEnum.fromstring('TRUCK')

# Dodgy semantics for creating enums.
# Should really be
# MoreEnum = Enum.build("CIRCLE SQUARE")
MoreEnum.build("CIRCLE SQUARE")
print MoreEnum.CIRCLE
print MoreEnum.tostring(1)
print MoreEnum.tostring(MoreEnum.CIRCLE)
EDIT: Added the build class method so that a string can be used to build the enums.
Although there are probably better solutions out there.
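For what it's worth, on Python 3.4+ (or Python 2 with the enum34 backport), the standard library's enum module covers the tostring/fromstring use cases directly:

from enum import Enum

class UserStatus(Enum):
    NEW = 0
    ACTIVE = 1
    INACTIVE = 2
    REPORTED = 3
    DELETED = 4

print(UserStatus.ACTIVE.name)   # 'ACTIVE' -- the tostring direction
print(UserStatus["REPORTED"])   # UserStatus.REPORTED -- the fromstring direction
print(UserStatus(4))            # UserStatus.DELETED -- lookup by stored value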

Dynamically creating classes with eval in python

I want to take an argument and create a class whose name is the argument itself.
For example, I take 'Int' as an argument and create a class whose name is 'Int'; that is, my class would be like this:

class Int:
    def __init__(self, param):
        self.value = 3

I am doing this:

def makeClass(x):
    return eval('class %s :\n def __init__(self,param) :\n self.type = 3' % (x,))

and then calling

myClass = makeClass('Int')
myInt = myClass(3)

I am getting a syntax error for this. Please help.
eval is used for evaluating expressions; class is not an expression, it's a statement. Perhaps you want something like exec?
As a side note, what you're doing here could probably be done pretty easily with type, and then you sidestep all of the performance and security implications of using eval/exec:
def cls_init(self, param):
    self.type = 3

Int = type(
    "Int",                   # class name
    (object,),               # class bases -- inherit from object. It's a good idea :-)
    {'__init__': cls_init},  # class dictionary. This is where you add methods or class attributes.
)
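Using the generated class is then no different from using a hand-written one (note that cls_init above ignores param and always stores 3, as in the question):

myInt = Int(3)               # runs cls_init
print(myInt.type)            # -> 3
print(type(myInt).__name__)  # -> Int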
As requested, this works in Python 2.7 and Python 3.4, printing 3:

def makeClass(x):
    exec('class %s:\n\tdef __init__(self,v):\n\t\tself.value = v' % x)
    return eval('%s' % x)

myClass = makeClass('Int')
myInt = myClass(3)
print(myInt.value)
If you wish to pick up the methods of the existing built-in classes (Int subclasses int, Str subclasses str):

def makeClass(name):
    parent = name.lower()
    exec('class %s(%s):\n\tdef __init__(self,v):\n\t\tself.value = v' % (name, parent))
    return eval('%s' % name)

Int = makeClass('Int')
myInt = Int(3)
Str = makeClass('Str')
myStr = Str(3)
print(myInt.value, myInt == 3, myInt == 5)
print(myStr.value, myStr == '3', myStr == 3)
Output:
3 True False
3 True False
Less typing, with side effects:

def makeClass(name):
    parent = name.lower()
    exec('global %s\nclass %s(%s):\n\tdef __init__(self,v):\n\t\tself.value = v' % (name, name, parent))

makeClass('Int')
myInt = Int(3)
makeClass('Str')
myStr = Str(3)
Mgilson's type answer is probably preferred, though.

Python: Idiomatic properties for structured data?

I've got a bad smell in my code. Perhaps I just need to let it air out for a bit, but right now it's bugging me.
I need to create three different input files to run three Radiative Transfer Modeling (RTM) applications, so that I can compare their outputs. This process will be repeated for thousands of sets of inputs, so I'm automating it with a python script.
I'd like to store the input parameters as a generic python object that I can pass to three other functions, which will each translate that general object into the specific parameters needed to run the RTM software they are responsible for. I think this makes sense, but feel free to criticize my approach.
There are many possible input parameters for each piece of RTM software. Many of them overlap. Most of them are kept at sensible defaults, but should be easy to change.
I started with a simple dict:

config = {
    day_of_year: 138,
    time_of_day: 36000, # seconds
    solar_azimuth_angle: 73, # degrees
    solar_zenith_angle: 17, # degrees
    ...
}
There are a lot of parameters, and they can be cleanly categorized into groups, so I thought of using dicts within the dict:

config = {
    day_of_year: 138,
    time_of_day: 36000, # seconds
    solar: {
        azimuth_angle: 73, # degrees
        zenith_angle: 17, # degrees
        ...
    },
    ...
}
I like that. But there are a lot of redundant properties. The solar azimuth and zenith angles, for example, can each be found if the other is known, so why hard-code both? So I started looking into python's builtin property. That lets me do nifty things with the data if I store it as object attributes:

class Configuration(object):
    day_of_year = 138
    time_of_day = 36000 # seconds
    solar_azimuth_angle = 73 # degrees

    @property
    def solar_zenith_angle(self):
        return 90 - self.solar_azimuth_angle

    ...

config = Configuration()

But now I've lost the structure I had in the second dict example.
Note that some of the properties are less trivial than my solar_zenith_angle example, and might require access to attributes outside of the group the property belongs to. For example, I can calculate solar_azimuth_angle if I know the day of year, time of day, latitude, and longitude.
What I'm looking for:
A simple way to store configuration data whose values can all be accessed in a uniform way, are nicely structured, and may exist either as attributes (real values) or properties (calculated from other attributes).
A possibility that is kind of boring:
Store everything in the dict of dicts I outlined earlier, and have other functions run over the object and calculate the calculable values? This doesn't sound fun. Or clean. To me it sounds messy and frustrating.
An ugly one that works:
After a long time trying different strategies and mostly getting nowhere, I came up with one possible solution that seems to work:
My classes: (smells a bit func-y, er, funky. def-initely.)

class SubConfig(object):
    """
    Store logical groupings of object attributes and properties.

    The parent object must be passed to the constructor so that we can still
    access the parent object's other attributes and properties. Useful if we
    want to use them to compute a property in here.
    """
    def __init__(self, parent, *args, **kwargs):
        super(SubConfig, self).__init__(*args, **kwargs)
        self.parent = parent

class Configuration(object):
    """
    Some object which holds many attributes and properties.

    Related configuration settings are grouped in SubConfig objects.
    """
    def __init__(self, *args, **kwargs):
        super(Configuration, self).__init__(*args, **kwargs)
        self.root_config = 2

        class _AConfigGroup(SubConfig):
            sub_config = 3

            @property
            def sub_property(self):
                return self.sub_config * self.parent.root_config

        self.group = _AConfigGroup(self) # Stinky?!
How I can use them: (works as I would like)

config = Configuration()

# Inspect the state of the attributes and properties.
print("\nInitial configuration state:")
print("config.root_config: %s" % config.root_config)
print("config.group.sub_config: %s" % config.group.sub_config)
print("config.group.sub_property: %s (calculated)" % config.group.sub_property)

# Inspect whether the properties compute the correct value after we alter
# some attributes.
config.root_config = 4
config.group.sub_config = 5

print("\nState after modifications:")
print("config.root_config: %s" % config.root_config)
print("config.group.sub_config: %s" % config.group.sub_config)
print("config.group.sub_property: %s (calculated)" % config.group.sub_property)
The behavior: (output of all of the above code, as expected)

Initial configuration state:
config.root_config: 2
config.group.sub_config: 3
config.group.sub_property: 6 (calculated)

State after modifications:
config.root_config: 4
config.group.sub_config: 5
config.group.sub_property: 20 (calculated)
Why I don't like it:
Storing configuration data in class definitions inside of the main object's __init__() doesn't feel elegant. Especially having to instantiate them immediately after definition like that. Ugh. I can deal with that for the parent class, sure, but doing it in a constructor...
Storing the same classes outside the main Configuration object doesn't feel elegant either, since properties in the inner classes may depend on the attributes of Configuration (or their siblings inside it).
I could deal with defining the functions outside of everything, so inside having things like

@property
def solar_zenith_angle(self):
    return calculate_zenith(self.solar_azimuth_angle)

but I can't figure out how to do something like

@property
def solar.zenith_angle(self):
    return calculate_zenith(self.solar.azimuth_angle)

(when I try to be clever about it I always run into <property object at 0xXXXXX>)
So what is the right way to go about this? Am I missing something basic or taking a very wrong approach? Does anyone know a clever solution?
Help! My python code isn't beautiful! I must be doing something wrong!
Phil,
Your hesitation about func-y config is very familiar to me :)
I suggest you store your config not as a python file but as a structured data file. I personally prefer YAML because it looks clean, just as you designed it in the very beginning. Of course, you will need to provide formulas for the auto-calculated properties, but it is not too bad unless you put in too much code. Here is my implementation using the PyYAML lib.
The config file (config.yml):

day_of_year: 138
time_of_day: 36000 # seconds
solar:
  azimuth_angle: 73 # degrees
  zenith_angle: !property 90 - self.azimuth_angle
The code:

import yaml

yaml.add_constructor("tag:yaml.org,2002:map", lambda loader, node:
    type("Config", (object,), loader.construct_mapping(node))())
yaml.add_constructor("!property", lambda loader, node:
    property(eval("lambda self: " + loader.construct_scalar(node))))

config = yaml.load(open("config.yml"))

print "LOADED config.yml"
print "config.day_of_year:", config.day_of_year
print "config.time_of_day:", config.time_of_day
print "config.solar.azimuth_angle:", config.solar.azimuth_angle
print "config.solar.zenith_angle:", config.solar.zenith_angle, "(calculated)"
print

config.solar.azimuth_angle = 65
print "CHANGED config.solar.azimuth_angle = 65"
print "config.solar.zenith_angle:", config.solar.zenith_angle, "(calculated)"
The output:
LOADED config.yml
config.day_of_year: 138
config.time_of_day: 36000
config.solar.azimuth_angle: 73
config.solar.zenith_angle: 17 (calculated)
CHANGED config.solar.azimuth_angle = 65
config.solar.zenith_angle: 25 (calculated)
The config can be of any depth, and properties can use values from any subgroup. Try this for example:

a: 1
b:
  c: 3
  d: some text
  e: true
  f:
    g: 7.01
x: !property self.a + self.b.c + self.b.f.g
Assuming you already loaded this config:
>>> config
<__main__.Config object at 0xbd0d50>
>>> config.a
1
>>> config.b
<__main__.Config object at 0xbd3bd0>
>>> config.b.c
3
>>> config.b.d
'some text'
>>> config.b.e
True
>>> config.b.f
<__main__.Config object at 0xbd3c90>
>>> config.b.f.g
7.01
>>> config.x
11.01
>>> config.b.f.g = 1000
>>> config.x
1004
UPDATE
Let us have a property config.b.x which uses self, parent, and subgroup attributes in its formula:

a: 1
b:
  x: !property self.parent.a + self.c + self.d.e
  c: 3
  d:
    e: 5
Then we just need to add a reference to the parent in subgroups:

import yaml

def construct_config(loader, node):
    attrs = loader.construct_mapping(node)
    config = type("Config", (object,), attrs)()
    for k, v in attrs.iteritems():
        if v.__class__.__name__ == "Config":
            setattr(v, "parent", config)
    return config

yaml.add_constructor("tag:yaml.org,2002:map", construct_config)
yaml.add_constructor("!property", lambda loader, node:
    property(eval("lambda self: " + loader.construct_scalar(node))))

config = yaml.load(open("config.yml"))
config = yaml.load(open("config.yml"))
And let's see how it works:
>>> config.a
1
>>> config.b.c
3
>>> config.b.d.e
5
>>> config.b.parent == config
True
>>> config.b.d.parent == config.b
True
>>> config.b.x
9
>>> config.a = 1000
>>> config.b.x
1008
Well, here's an ugly way to at least make sure your properties get called:

class ConfigGroup(object):
    def __init__(self, config):
        self.config = config

    def __getattribute__(self, name):
        v = object.__getattribute__(self, name)
        if hasattr(v, '__get__'):
            return v.__get__(self, ConfigGroup)
        return v

class Config(object):
    def __init__(self):
        self.a = 10
        self.group = ConfigGroup(self)
        self.group.a = property(lambda group: group.config.a * 2)
Of course, at this point you might as well forego property entirely and just check if the attribute is callable in __getattribute__.
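That variant might look something like this (a sketch using the same names as above; any plain callable stored on the group is treated as a computed value):

class ConfigGroup(object):
    def __init__(self, config):
        self.config = config

    def __getattribute__(self, name):
        v = object.__getattribute__(self, name)
        # call any plain callable stored on the group, passing the group itself
        if callable(v) and not name.startswith('__') and name != 'config':
            return v(self)
        return v

class Config(object):
    def __init__(self):
        self.a = 10
        self.group = ConfigGroup(self)
        self.group.b = lambda group: group.config.a * 2

c = Config()
print(c.group.b)   # -> 20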
Or you could go all out and have fun with metaclasses:
def config_meta(classname, parents, attrs):
    defaults = {}
    groups = {}
    newattrs = {'defaults': defaults, 'groups': groups}
    for name, value in attrs.items():
        if name.startswith('__'):
            newattrs[name] = value
        elif isinstance(value, type):
            groups[name] = value
        else:
            defaults[name] = value

    def init(self):
        for name, value in defaults.items():
            self.__dict__[name] = value
        for name, value in groups.items():
            group = value()
            group.config = self
            self.__dict__[name] = group

    newattrs['__init__'] = init
    return type(classname, parents, newattrs)

class Config2(object):
    __metaclass__ = config_meta

    a = 10
    b = 2

    class group(object):
        c = 5

        @property
        def d(self):
            return self.c * self.config.a
Use it like this:

>>> c2 = Config2()
>>> c2.a
10
>>> c2.group.d
50
>>> c2.a = 6
>>> c2.group.d
30
Final edit (?): if you don't want to have to "backtrack" using self.config in subgroup property definitions, you can use the following instead:

class group_property(property):
    def __get__(self, obj, objtype=None):
        return super(group_property, self).__get__(obj.config, objtype)

    def __set__(self, obj, value):
        super(group_property, self).__set__(obj.config, value)

    def __delete__(self, obj):
        return super(group_property, self).__delete__(obj.config)

class Config2(object):
    ...
    class group(object):
        ...
        @group_property
        def e(config):
            return config.group.c * config.a

group_property receives the base config object instead of the group object, so paths always start from the root. Therefore, e is equivalent to the previously defined d.
BTW, supporting nested groups is left as an exercise for the reader.
Wow, I just read an article about descriptors on r/python today, but I don't think hacking descriptors is going to give you what you want.
The only thing I know of that handles sub-configurations like that is flatland. Here's how it would work in flatland, anyhow; you could do:
class Configuration(Form):
    day_of_year = Integer
    time_of_day = Integer

    class solar(Form):
        azimuth_angle = Integer
        solar_angle = Integer
Then load the dictionary in:

config = Configuration({
    day_of_year: 138,
    time_of_day: 36000, # seconds
    solar: {
        azimuth_angle: 73, # degrees
        zenith_angle: 17, # degrees
        ...
    },
    ...
})
I love flatland, but I'm not sure you gain much by using it.
You could add a metaclass or decorator to your class definition. Something like:

def instantiate(klass):
    return klass()

class Configuration(object):
    @instantiate
    class solar(object):
        @property
        def azimuth_angle(self):
            # placeholder getter; the real thing would compute or look up the angle
            return 73

That might be better. Then create a nice __init__ on Configuration that can load all the data from a dictionary. I dunno, maybe someone else has a better idea.
Here's something a little more complete (without as much magic as LaC's answer, but slightly less generic).

def instantiate(clazz): return clazz()

# dummy functions for testing
calc_zenith_angle = calc_azimuth_angle = lambda x: 3

class Solar(object):
    def __init__(self):
        if getattr(self, 'azimuth_angle', None) is None and getattr(self, 'zenith_angle', None) is None:
            raise AttributeError("must have either azimuth_angle or zenith_angle provided")
        if getattr(self, 'zenith_angle', None) is None:
            self.zenith_angle = calc_zenith_angle(self.azimuth_angle)
        elif getattr(self, 'azimuth_angle', None) is None:
            self.azimuth_angle = calc_azimuth_angle(self.zenith_angle)

class Configuration(object):
    day_of_year = 138
    time_of_day = 3600

    @instantiate
    class solar(Solar):
        azimuth_angle = 73
        # zenith_angle = 17   # not defined, so it gets auto-calculated

# if you don't want auto-calculation to be done automagically
class ConfigurationNoAuto(object):
    day_of_year = 138
    time_of_day = 3600

    @instantiate
    class solar(Solar):
        azimuth_angle = 73

        @property
        def zenith_angle(self):
            return calc_zenith_angle(self.azimuth_angle)

config = Configuration()
config_no_auto = ConfigurationNoAuto()

>>> config.day_of_year
138
>>> config_no_auto.day_of_year
138
>>> config_no_auto.solar.azimuth_angle
73
>>> config_no_auto.solar.zenith_angle
3
>>> config.solar.zenith_angle
3
>>> config.solar.azimuth_angle
73
I think I would rather subclass dict so that it falls back to a default if no data is available. Something like this:

class fallbackdict(dict):
    ...

defaults = { 'pi': 3.14 }
x_config = fallbackdict(defaults)
x_config.update({
    'planck': 6.62606957e-34
})
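The elided lookup might be filled in with dict.__missing__, which plain dict consults on failed key lookups (my sketch, not part of the original answer):

class fallbackdict(dict):
    def __init__(self, defaults, *args, **kwargs):
        dict.__init__(self, *args, **kwargs)
        self.defaults = defaults

    def __missing__(self, key):
        # invoked by dict.__getitem__ for absent keys
        return self.defaults[key]

defaults = {'pi': 3.14}
x_config = fallbackdict(defaults)
x_config.update({'planck': 6.62606957e-34})
print(x_config['planck'])  # -> 6.62606957e-34
print(x_config['pi'])      # -> 3.14, from the defaults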
The other aspect can be addressed with callables. Whether this is elegant or ugly depends on whether datatype declarations are useful:

pi: (float, 3.14)

calc = lambda v: v[0](v[1])
x_config.update({
    'planck': (float, 6.62606957e-34),
    'calculated': (lambda x: 1.0 - calc(x_config['planck']), None)
})

Depending on the circumstances, the lambda might be broken out if it is used many times.
Don't know if it is better, but it mostly preserves the dictionary style.
