Applying Class to Items in a List - python

What I'm trying to do is allow any number of attributes to be supplied to a function. This function will handle creating a class based on those attributes. Then, I've got another function that will handle importing data from a text file, applying the generated class to each item, and adding it to a list. Below is what I have.
def create_class(attributes):
    class classObject:
        def __init__(self, **attributes):
            for attr in attributes.keys():
                self.__dict__[attr] = attributes[attr]
    return classObject
def file_to_list(file, attributes):
    classObject = create_class(attributes)
    with open(file, "r") as f:
        var = []
        for line in f.readlines():
            var.append(classObject(line))
    return var
data = file_to_list("file.txt", ["propA", "propB"])
The issue is with how I'm trying to add the item to the list. Normally, I wouldn't have any issue, but I believe the way in which I'm creating the class is causing issues with how I usually do it.
File "file.py", line 17, in file_to_list
var.append(classObject(line))
TypeError: init() takes 1 positional argument but 2 were given
How do I loop through each of the attributes of the class, so that I can set the value for each and add it to the list?
UPDATE:
Below is an example of what file.txt looks like.
1A,1B
2A,2B
3A,3B

It looks like your class generation is wrong. You appear to want to be able to do:
Cls = create_class(["some", "attributes", "go", "here"])
and end up with a class object that looks like:
class Cls(object):
    def __init__(self, some, attributes, go, here):
        self.some = some
        self.attributes = attributes
        self.go = go
        self.here = here
but what you're actually doing is creating a class that takes a dictionary, and gives that dictionary dot-syntax.
>>> obj = Cls({"different": "attributes", "go": "here"})
>>> obj.different
"attributes"
>>> obj.go
"here"
You can implement the former with:
import typing

def create_class(attributes: typing.List[str]):
    class gen_class(object):
        def __init__(self, *args):
            if len(args) != len(attributes):
                # how do you handle the case where the caller specifies fewer or more
                # arguments than the generated class expects? I would throw a...
                raise ValueError(f"Wrong number of arguments "
                                 f"(expected {len(attributes)}, got {len(args)}).")
            for attr, value in zip(attributes, args):
                setattr(self, attr, value)
    return gen_class
Then you should be able to use csv.reader to read in your file and instantiate those classes.
import csv

CSV_Cls = create_class(["propA", "propB"])

with open(file) as f:
    reader = csv.reader(f)
    data = [CSV_Cls(*row) for row in reader]
However, it does seem that writing your own code generator to make that class is the wrong choice here. Why not use a collections.namedtuple instead?
import csv
from collections import namedtuple

CSV_Cls = namedtuple("CSV_Cls", "propA propB")

with open(file) as f:
    reader = csv.reader(f)
    data = [CSV_Cls(*row) for row in reader]
This stdlib codegen is already written, known to work (and heavily tested), and won't accidentally introduce errors. The only reason to prefer a class is if you need to tightly couple some behavior to the data, or if you need a mutable data structure.
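If you do need mutability, the stdlib ships another ready-made code generator, dataclasses.make_dataclass. A minimal sketch of the same pattern (the field names follow the question's update):

import csv
from dataclasses import make_dataclass

# Generates a mutable class with the given fields (Python 3.7+)
CSV_Cls = make_dataclass("CSV_Cls", ["propA", "propB"])

with open("file.txt") as f:
    data = [CSV_Cls(*row) for row in csv.reader(f)]

data[0].propA = "patched"  # mutable, unlike a namedtuple
print(data[0])             # CSV_Cls(propA='patched', propB='1B')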

First, why not use type for this instead? It's the default metaclass, i.e. a callable that creates class objects. The class dict will be the third argument, which makes it easy to create programmatically.
type(name, (), attributes)
(You probably don't need any base classes, but that's what the second argument is for.)
Second, your __init__ doesn't appear to accept a str, which is the only thing you can get from readlines(). It takes only self (implied) and keyword arguments.
You could perhaps convert the line str to a dict (but that depends on what's in it), and then use the dict as your kwargs, like classObject(**kwargs), but then there's probably no point in declaring it with stars in the __init__ method in the first place.
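A minimal sketch of that approach, assuming the comma-separated file from the update (the names here are illustrative, not from the original code):

import csv

def create_class(name, attributes):
    def __init__(self, **kwargs):
        for attr in attributes:
            setattr(self, attr, kwargs[attr])
    # type(name, bases, dict) builds the class programmatically
    return type(name, (), {"__init__": __init__})

Row = create_class("Row", ["propA", "propB"])

with open("file.txt") as f:
    data = [Row(**dict(zip(["propA", "propB"], row)))
            for row in csv.reader(f)]

print(data[0].propA)  # 1A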

Related

Python classes, mappings, pprint, KeysView vs. dict_keys; to keys() or not to keys()?

I have a problem with my base class. I started writing it after finding an answer on this site about more informative __repr__() methods. I added to it after finding a different answer on this site about using pprint() with my own classes. I tinkered with it a little more after finding a third answer on this site about making my classes unpackable with a ** operator.
I modified it again after seeing in yet another answer on this site that there was a distinction between merely giving it __getitem__(), __iter__(), and __len__() methods on the one hand, and actually making it a fully-qualified mapping by subclassing collections.abc.Mapping on the other. Further, I saw that doing so would remove the need for writing my own keys() method, as the Mapping would take care of that.
So I got rid of keys(), and a class method broke.
The problem
I have a method that iterates through my class' keys and values to produce one big string formatted as I'd like it. That class looks like this.
class MyObj():
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def the_problem_method(self):
        """Method I'm getting divergent output for."""
        longest = len(max((key for key in self.keys()), key=len))
        key_width = longest + TAB_WIDTH - longest % TAB_WIDTH
        return '\n'.join((f'{key:<{key_width}}{value}' for key, value in self))
Yes, that doesn't have the base class in it, but the MWE later on will account for that. The nut of it is that (key for key in self.keys()) part. When I have a keys() method written, I get the output I want.
def keys(self):
    """Get object attribute names."""
    return self.__dict__.keys()
When I remove that to go with the keys() method supplied by collections.abc.Mapping, I get no space between key and value.
The question
I can get the output I want by restoring the keys() method (and maybe adding values() and items() while I'm at it), but is that the best approach? Would it be better to go with the Mapping one and modify my class method to suit it? If so, how? Should I leave Mapping well enough alone until I know I need it?
This is my base class, to be copied all over creation and subclassed out the wazoo. I want to Get. It. Right.
There are already several considerations I can think of and many more of which I am wholly ignorant.
I use Python 3.9 and greater. I'll abandon 3.9 when conda does.
I want to keep my more-informative __repr__() methods.
I want pprint() to work, via the _dispatch table method with _format_dict_items().
I want to allow for duck typing my classes reliably.
I have not yet used type hinting, but I want to allow for using best practices there if I start.
Everything else I know nothing about.
The MWE
This has my problem class at the top and output stuff at the bottom. There are two series of classes building upon the previous ones.
The first are ever-more-inclusive base classes, and it is here that the difference between the instance with the keys() method and that without is shown. The first class, BaseMap, subclasses Mapping and has the __getitem__(), __iter__(), and __len__() methods. The next class up the chain, BaseMapKeys, subclasses that and adds the keys() method.
The second group, MapObj and MapKeysObj, are subclasses of the problem class that also subclass those different base classes respectively.
OK, maybe the WE isn't so M, but lots of things got me to this point and I don't want to neglect any.
import collections.abc
from pprint import pprint, PrettyPrinter

TAB_WIDTH = 3


class MyObj():
    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar

    def the_problem_method(self):
        """Method I'm getting divergent output for."""
        longest = len(max((key for key in self.keys()), key=len))
        key_width = longest + TAB_WIDTH - longest % TAB_WIDTH
        return '\n'.join((f'{key:<{key_width}}{value}' for key, value in self))


class Base(object):
    """Base class with more informative __repr__."""
    def __repr__(self):
        """Object representation."""
        params = (f'{key}={repr(value)}'
                  for key, value in self.__dict__.items())
        return f'{repr(self.__class__)}({", ".join(params)})'


class BaseMap(Base, collections.abc.Mapping):
    """Enable class to be pprint-able, unpacked with **."""
    def __getitem__(self, attr):
        """Get object attribute values."""
        return getattr(self.__dict__, attr)

    def __iter__(self):
        """Make object iterable."""
        for attr in self.__dict__.keys():
            yield attr, getattr(self, attr)

    def __len__(self):
        """Get length of object."""
        return len(self.__dict__)


class BaseMapKeys(BaseMap):
    """Overwrite KeysView output with what I thought it would be."""
    def keys(self):
        """Get object attribute names."""
        return self.__dict__.keys()


class MapObj(BaseMap, MyObj):
    """Problem class with collections.abc.Mapping."""
    def __init__(self, foo, bar):
        super().__init__(foo, bar)


class MapKeysObj(BaseMapKeys, MyObj):
    """Problem class with collections.abc.Mapping and keys method."""
    def __init__(self, foo, bar):
        super().__init__(foo, bar)


if isinstance(getattr(PrettyPrinter, '_dispatch'), dict):
    # assume the dispatch table method still works
    def pprint_basemap(printer, object, stream, indent, allowance, context,
                       level):
        """Implement pprint for subclasses of BaseMap class."""
        write = stream.write
        write(f'{object.__class__}(\n {indent * " "}')
        printer._format_dict_items(object, stream, indent, allowance + 1,
                                   context, level)
        write(f'\n{indent * " "})')

    map_classes = [MapObj, MapKeysObj]
    for map_class in map_classes:
        PrettyPrinter._dispatch[map_class.__repr__] = pprint_basemap


def print_stuff(map_obj):
    print('pprint object:')
    pprint(map_obj)
    print()
    print('print keys():')
    print(map_obj.keys())
    print()
    print('print list(keys()):')
    print(list(map_obj.keys()))
    print()
    print('print the problem method:')
    print(map_obj.the_problem_method())
    print('\n\n')


params = ['This is a really long line to force new line in pprint output', 2]

baz = MapObj(*params)
print_stuff(baz)

scoggs = MapKeysObj(*params)
print_stuff(scoggs)

Create Nothing from falsey values using Returns library

Using the Returns library, I have a function that filters a list. I want it to return Nothing if the list is empty (i.e. falsey) or Some([...]) if the list has values.
Maybe seems to be mostly focused on "true" nothing, being None. But I'm wondering if there's a way to get Nothing from a falsey value without doing something like
data = []
result = Some(data) if len(data) > 0 else Nothing
It looks like you have at least a few options: (1) create a new class that inherits from Maybe and override any methods you like, (2) create a simple function that returns Nothing if data is falsey and otherwise returns Maybe.from_optional(data) (or whatever other method of Maybe you prefer), or (3) create your own container as per the returns documentation at https://returns.readthedocs.io/en/latest/pages/create-your-own-container.html.
Here is a class called Possibly, that inherits from Maybe and overrides the from_optional class method. You can add similar overrides for other methods following this pattern.
from typing import Optional
from returns.maybe import Maybe, _NewValueType, _Nothing, Some

class Possibly(Maybe):
    def __init__(self):
        super().__init__()

    @classmethod
    def from_optional(
        cls, inner_value: Optional[_NewValueType],
    ) -> 'Maybe[_NewValueType]':
        """
        Creates new instance of ``Maybe`` container based on an optional value.
        """
        if not inner_value or inner_value is None:
            return _Nothing(inner_value)
        return Some(inner_value)

data = [1, 2, 3]
empty_data = []

print(Possibly.from_optional(data))
print(Possibly.from_optional(empty_data))
Here are two equivalent functions:
from returns.maybe import Maybe, _Nothing

data = [1, 2, 3]
empty_data = []

def my_from_optional(anything):
    if not anything:
        return _Nothing(anything)
    else:
        return Maybe.from_optional(anything)

def my_from_optional(anything):
    return Maybe.from_optional(anything) if anything else _Nothing(anything)

print(my_from_optional(data))
print(my_from_optional(empty_data))

How to initialize an object that requires __new__ and __init__

I'm creating a class sequence, which inherits from the builtin list and will hold an ordered collection of a second class, d0, which inherits from int. In addition to its int value, d0 must contain a secondary value, i, which denotes where it exists in the sequence, and a reference to the sequence itself.
My understanding is because int is an immutable type, I have to use the __new__ method, and because it will have other attributes, I need to use __init__.
I've been trying for a while to get this to work and I've explored a few options.
Attempt 1:
class sequence(list):
    def __init__(self, data):
        for i, elem in enumerate(data): self.append( d0(elem, i, self) )

class d0(int):
    def __new__(self, val, i, parent):
        self.i = i
        self.parent = parent
        return int.__new__(d0, val)

x = sequence([1,2,3])
print([val.i for val in x])
This was the most intuitive to me, but every time self.i is assigned, it overwrites the i attribute for all other instances of d0 in the sequence. Though I'm not entirely clear why this happens, I understand that __new__ is not the place to instantiate an object.
Attempt 2:
class sequence(list):
    def __init__(self, data):
        for i, val in enumerate(data): self.append( d0(val, i, self) )

class d0(int):
    def __new__(cls, *args):
        return super().__new__(cls, *args)

    def __init__(self, *args):
        self = args[0]
        self.i = args[1]
        self.parent = args[2]

x = sequence([1,2,3])
print([val.i for val in x])
This raises TypeError: int() takes at most 2 arguments (3 given), though I'm not sure why.
Attempt 3:
class sequence(list):
def __init__(self, data):
for i, val in enumerate(data):
temp = d0.__new__(d0, val)
temp.__init__(i, self)
self.append(temp)
class d0(int):
def __new__(cls, val):
return int.__new__(d0, val)
def __init__(self, i, parent):
self.i = i
self.parent = parent
x = sequence([1,2,3])
print([val.i for val in x])
This accomplishes the task, but is cumbersome and otherwise just feels strange to have to explicitly call __new__ and __init__ to instantiate an object.
What is the proper way to accomplish this? I would also appreciate any explanation for the undesired behavior in attempts 1 and 2.
First, your sequence isn’t much of a type so far: calling append on it won’t preserve its indexed nature (let alone sort or slice assignment!). If you just want to make lists that look like this, just write a function that returns a list. Note that list itself behaves like such a function (it was one back in the Python 1 days!), so you can often still use it like a type.
So let’s talk just about d0. Leaving aside the question of whether deriving from int is a good idea (it’s at least less work than deriving from list properly!), you have the basic idea correct: you need __new__ for an immutable (base) type, because at __init__ time it’s too late to choose its value. So do so:
class d0(int):
    def __new__(cls, val, i, parent):
        return super().__new__(cls, val)
Note that this is a class method: there’s no instance yet, but we do need to know what class we’re instantiating (what if someone inherits from d0?). This is what attempt #1 got wrong: it thought the first argument was an instance to which to assign attributes.
Note also that we pass only one (other) argument up: int can’t use our ancillary data. (Nor can it ignore it: consider int('f',16).) Thus failed #2: it sent all the arguments up.
We can install our other attributes now, but the right thing to do is use __init__ to separate manufacturing an object from initializing it:
    # d0 continued
    def __init__(self, val, i, parent):
        # super().__init__(val)
        self.i = i
        self.parent = parent
Note that all the arguments appear again, even val which we ignore. This is because calling a class involves only one argument list (cf. d0(elem,i,self)), so __new__ and __init__ have to share it. (It would therefore be formally correct to pass val to int.__init__, but what would it do with it? There’s no use in calling it at all since we know int is already completely set up.) Using #3 was painful because it didn’t follow this rule.
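Putting the two pieces together, a minimal sketch of the working whole (keeping the questioner's sequence for demonstration, despite the caveat above):

class d0(int):
    def __new__(cls, val, i, parent):
        # only val goes up to int; the ancillary data is handled in __init__
        return super().__new__(cls, val)

    def __init__(self, val, i, parent):
        self.i = i
        self.parent = parent

class sequence(list):
    def __init__(self, data):
        for i, elem in enumerate(data):
            self.append(d0(elem, i, self))

x = sequence([1, 2, 3])
print([val.i for val in x])      # [0, 1, 2]
print([int(val) for val in x])   # [1, 2, 3]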

Can I "detect" a slicing expression in a python class method?

I am developing an application where I have defined a "variable" object containing data in the form of a numpy array. These variables are linked to (netcdf) data files, and I would like to dynamically load the variable values when needed instead of loading all data from the sometimes huge files at the start.
The following snippet demonstrates the principle and works well, including access to data portions with slices. For example, you can write:
a = var() # empty variable
print a.values[7] # values have been automatically "loaded"
or even:
a = var()
a[7] = 0
However, this code still forces me to load the entire variable data at once. Netcdf (with the netCDF4 library) would allow me to directly access data slices from the file. Example:
f = netCDF4.Dataset(filename, "r")
print f.variables["a"][7]
I cannot use the netcdf variable objects directly, because my application is tied to a web service which cannot remember the netcdf file handler, and also because the variable data don't always come from netcdf files, but may originate from other sources such as OGC web services.
Is there a way to "capture" the slicing expression in the property or setter methods and use them? The idea would be to write something like:
@property
def values(self):
    if self._values is None:
        self._values = np.arange(10.)[slice]  # load from file ...
    return self._values
instead of the code below.
Working demo:
import numpy as np

class var(object):
    def __init__(self, values=None, metadata=None):
        if values is None:
            self._values = None
        else:
            self._values = np.array(values)
        self.metadata = metadata  # just to demonstrate that var has more than just values

    @property
    def values(self):
        if self._values is None:
            self._values = np.arange(10.)  # load from file ...
        return self._values

    @values.setter
    def values(self, values):
        self._values = values
First thought: Should I perhaps create values as a separate class and then use __getitem__? See In python, how do I create two index slicing for my own matrix class?
No, you cannot detect what will be done to the object after returning from .values. The result could be stored in a variable and only (much later on) be sliced, or sliced in different places, or used in its entirety, etc.
You indeed should instead return a wrapper object and hook into object.__getitem__; it would let you detect slicing and load data as needed. When slicing, Python passes in a slice() object.
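A minimal sketch of that idea (the names are hypothetical, and the loader is a stand-in for real file access):

class LazyValues(object):
    """Wrapper that defers loading until the data is actually indexed."""
    def __init__(self, loader):
        self._loader = loader  # callable accepting an index or slice

    def __getitem__(self, index):
        # For obj[2:5] Python passes slice(2, 5, None); for obj[7] it
        # passes the plain int 7; for obj[2:5, 0] a tuple of both.
        if isinstance(index, slice):
            print("slice detected:", index.start, index.stop, index.step)
        return self._loader(index)

vals = LazyValues(lambda idx: list(range(10))[idx])
print(vals[7])     # single element, loaded on demand
print(vals[2:5])   # slice detected, only this portion is "read"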
Thanks to the guidance of Martijn Pieters and with a bit more reading, I came up with the following code as demonstration. Note that the Reader class uses a netcdf file and the netCDF4 library. If you want to try out this code yourself you will either need a netcdf file with variables "a" and "b", or replace Reader with something else that will return a data array or a slice from a data array.
This solution defines three classes: Reader does the actual file I/O handling, Values manages the data access part and invokes a Reader instance if no data have been stored in memory, and var is the final "variable" which in real life will contain a lot more metadata. The code contains a couple of extra print statements for educational purposes.
"""Implementation of a dynamic variable class which can read data from file when needed or
return the data values from memory if they were read already. This concepts supports
slicing for both memory and file access."""
import numpy as np
import netCDF4 as nc
FILENAME = r"C:\Users\m.schultz\Downloads\data\tmp\MACC_20141224_0001.nc"
VARNAME = "a"
class Reader(object):
"""Implements the actual data access to variable values. Here reading a
slice from a netcdf file.
"""
def __init__(self, filename, varname):
"""Final implementation will also have to take groups into account...
"""
self.filename = filename
self.varname = varname
def read(self, args=slice(None, None, None)):
"""Read a data slice. Args is a tuple of slice objects (e.g.
numpy.index_exp). The default corresponds to [:], i.e. all data
will be read.
"""
with nc.Dataset(self.filename, "r") as f:
values = f.variables[self.varname][args]
return values
class Values(object):
def __init__(self, values=None, reader=None):
"""Initialize Values. You can either pass numerical (or other) values,
preferrably as numpy array, or a reader instance which will read the
values on demand. The reader must have a read(args) method, where
args is a tuple of slices. If no args are given, all data should be
returned.
"""
if values is not None:
self._values = np.array(values)
self.reader = reader
def __getattr__(self, name):
"""This is only be called if attribute name is not present.
Here, the only attribute we care about is _values.
Self.reader should always be defined.
This method is necessary to allow access to variable.values without
a slicing index. If only __getitem__ were defined, one would always
have to write variable.values[:] in order to make sure that something
is returned.
"""
print ">>> in __getattr__, trying to access ", name
if name == "_values":
print ">>> calling reader and reading all values..."
self._values = self.reader.read()
return self._values
def __getitem__(self, args):
print "in __getitem__"
if not "_values" in self.__dict__:
values = self.reader.read(args)
print ">>> read from file. Shape = ", values.shape
if args == slice(None, None, None):
self._values = values # all data read, store in memory
return values
else:
print ">>> read from memory. Shape = ", self._values[args].shape
return self._values[args]
def __repr__(self):
return self._values.__repr__()
def __str__(self):
return self._values.__str__()
class var(object):
def __init__(self, name=VARNAME, filename=FILENAME, values=None):
self.name = name
self.values = Values(values, Reader(filename, name))
if __name__ == "__main__":
# define a variable and access all data first
# this will read the entire array and save it in memory, so that
# subsequent access with or without index returns data from memory
a = var("a", filename=FILENAME)
print "1: a.values = ", a.values
print "2: a.values[-1] = ", a.values[-1]
print "3: a.values = ", a.values
# define a second variable, where we access a data slice first
# In this case the Reader only reads the slice and no data are stored
# in memory. The second access indexes the complete array, so Reader
# will read everything and the data will be stored in memory.
# The last access will then use the data from memory.
b = var("b", filename=FILENAME)
print "4: b.values[0:3] = ", b.values[0:3]
print "5: b.values[:] = ", b.values[:]
print "6: b.values[5:8] = ",b.values[5:8]

How can I apply a prefix to dictionary access?

I'm imitating the behavior of the ConfigParser module to write a highly specialized parser that exploits some well-defined structure in the configuration files for a particular application I work with. Several sections of the config file contain hundreds of variable and routine mappings prefixed with either Variable_ or Routine_, like this:
[Map.PRD]
Variable_FOO=LOC1
Variable_BAR=LOC2
Routine_FOO=LOC3
Routine_BAR=LOC4
...
[Map.SHD]
Variable_FOO=LOC1
Variable_BAR=LOC2
Routine_FOO=LOC3
Routine_BAR=LOC4
...
I'd like to maintain the basic structure of ConfigParser where each section is stored as a single dictionary, so users would still have access to the classic syntax:
config.content['Mappings']['Variable_FOO'] = 'LOC1'
but also be able to use a simplified API that drills down to this section:
config.vmapping('PRD')['FOO'] = 'LOC1'
config.vmapping('PRD')['BAR'] = 'LOC2'
config.rmapping('PRD')['FOO'] = 'LOC3'
config.rmapping('PRD')['BAR'] = 'LOC4'
Currently I'm implementing this by storing each section in a special subclass of dict to which I've added a prefix attribute. The variable and routine properties of the parser set the prefix attribute of the dict-like object to 'Variable_' or 'Routine_', and then modified __getitem__ and __setitem__ methods on the dict glue the prefix together with the key to access the appropriate item. It's working, but it involves a lot of boilerplate to implement all the associated niceties, like supporting iteration.
I suppose my ideal solution would be to dispense with the subclassed dict and have the variable and routine properties somehow present a "view" of the plain dict object underneath without the prefixes.
Update
Here's the solution I implemented, largely based on @abarnert's answer:
import re

class MappingDict(object):
    def __init__(self, prefix, d):
        self.prefix, self.d = prefix, d

    def prefixify(self, name):
        return '{}_{}'.format(self.prefix, name)

    def __getitem__(self, name):
        name = self.prefixify(name)
        return self.d.__getitem__(name)

    def __setitem__(self, name, value):
        name = self.prefixify(name)
        return self.d.__setitem__(name, value)

    def __delitem__(self, name):
        name = self.prefixify(name)
        return self.d.__delitem__(name)

    def __iter__(self):
        return (key.partition('_')[-1] for key in self.d
                if key.startswith(self.prefix))

    def __repr__(self):
        # repr the underlying dict; self is not a dict subclass
        return 'MappingDict({!r})'.format(self.d)

class MyParser(object):
    SECTCRE = re.compile(r'\[(?P<header>[^]]+)\]')

    def __init__(self, filename):
        self.filename = filename
        self.content = {}
        lines = [x.strip() for x in open(filename).read().splitlines()
                 if x.strip()]
        for line in lines:
            match = re.match(self.SECTCRE, line)
            if match:
                section = match.group('header')
                self.content[section] = {}
            else:
                key, sep, value = line.partition('=')
                self.content[section][key] = value

    def write(self, filename):
        fp = open(filename, 'w')
        for section in sorted(self.content, key=sectionsort):
            fp.write("[%s]\n" % section)
            for key in sorted(self.content[section], key=cpfsort):
                value = str(self.content[section][key])
                fp.write("%s\n" % '='.join([key, value]))
            fp.write("\n")
        fp.close()

    def vmapping(self, nsp):
        section = 'Map.{}'.format(nsp)
        return MappingDict('Variable', self.content[section])

    def rmapping(self, nsp):
        section = 'Map.{}'.format(nsp)
        return MappingDict('Routine', self.content[section])
It's used like this:
config = MyParser('myfile.cfg')
vmap = config.vmapping('PRD')
vmap['FOO'] = 'LOC5'
vmap['BAR'] = 'LOC6'
config.write('newfile.cfg')
The resulting newfile.cfg reflects the LOC5 and LOC6 changes.
I don't think you want inheritance here. You end up with two separate dict objects which you have to create on load and then paste back together on save…
If that's acceptable, you don't even need to bother with the prefixing during normal operations; just do the prefixing while saving, like this:
class Config(object):
    def save(self):
        merged = {'variable_{}'.format(key): value for key, value
                  in self.variable_dict.items()}
        merged.update({'routine_{}'.format(key): value for key, value
                       in self.routine_dict.items()})
        # now save merged
If you want that merged object to be visible at all times, but don't expect to use it very often, make it a @property.
If you want to access the merged dictionary regularly, at the same time you're accessing the two sub-dictionaries, then yes, you want a view:
I suppose my ideal solution would be to dispense with the subclassed dict and have the global and routine properties somehow present a "view" of the plain dict object underneath without the prefixes.
This is going to be very hard to do with inheritance. Certainly not with inheritance from dict; inheritance from builtins.dict_items might work if you're using Python 3, but it still seems like a stretch.
But with delegation, it's easy. Each sub-dictionary just holds a reference to the parent dict:
class PrefixedDict(object):
    def __init__(self, prefix, d):
        self.prefix, self.d = prefix, d

    def prefixify(self, key):
        return '{}_{}'.format(self.prefix, key)

    def __getitem__(self, key):
        return self.d.__getitem__(self.prefixify(key))

    def __setitem__(self, key, value):
        return self.d.__setitem__(self.prefixify(key), value)

    def __delitem__(self, key):
        return self.d.__delitem__(self.prefixify(key))

    def __iter__(self):
        # skip past the prefix plus the underscore added by prefixify
        return (key[len(self.prefix) + 1:] for key in self.d
                if key.startswith(self.prefix))
You don't get any of the dict methods for free that way—but that's a good thing, because they were mostly incorrect anyway, right? Explicitly delegate the ones you want. (If you do have some you want to pass through as-is, use __getattr__ for that.)
Besides being conceptually simpler and harder to screw up through accidentally forgetting to override something, this also means that PrefixedDict can work with any type of mapping, not just a dict.
So, no matter which way you go, where and how do these objects get created?
The easy answer is that they're attributes that you create when you construct a Config:
def __init__(self):
    self.d = {}
    self.variable = PrefixedDict('Variable', self.d)
    self.routine = PrefixedDict('Routine', self.d)
If this needs to be dynamic (e.g., there can be an arbitrary set of prefixes), create them at load time:
def load(self):
    # load up self.d
    prefixes = set(key.split('_')[0] for key in self.d)
    for prefix in prefixes:
        setattr(self, prefix, PrefixedDict(prefix, self.d))
If you want to be able to create them on the fly (so config.newprefix['foo'] = 3 adds 'Newprefix_foo'), you can do this instead:
def __getattr__(self, name):
    return PrefixedDict(name.title(), self.d)
But once you're using dynamic attributes, you really have to question whether it isn't cleaner to use dictionary (item) syntax instead, like config['newprefix']['foo']. For one thing, that would actually let you call one of the sub-dictionaries 'global', as in your original question…
Or you can first build the dictionary syntax, use what's usually referred to as an attrdict (search ActiveState recipes and PyPI for 3000 implementations…), which lets you automatically make config.newprefix mean config['newprefix'], so you can use attribute syntax when you have valid identifiers, but fall back to dictionary syntax when you don't.
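For reference, the smallest of those classic recipes looks something like this (a bare sketch; published implementations add error handling):

class AttrDict(dict):
    """Dict whose items are also reachable as attributes."""
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__

config = AttrDict()
config.newprefix = AttrDict()    # attribute syntax...
config["newprefix"]["foo"] = 3   # ...and item syntax reach the same data
print(config.newprefix.foo)      # 3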
There are a couple of options for how to proceed.
The simplest might be to use nested dictionaries, so Variable_FOO becomes config["variable"]["FOO"]. You might want to use a defaultdict(dict) for the outer dictionary so you don't need to worry about initializing the inner ones when you add the first value to them.
Another option would be to use tuple keys in a single dictionary. That is, Variable_FOO would become config[("variable", "FOO")]. This is easy to do with code, since you can simply assign to config[tuple(some_string.split("_"))]. Though, I suppose you could also just use the unsplit string as your key in this case.
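For illustration, a quick sketch of both of those options (the keys shown follow the question's config format):

from collections import defaultdict

# Option 1: nested dictionaries
config = defaultdict(dict)
config["Variable"]["FOO"] = "LOC1"
print(config["Variable"]["FOO"])   # LOC1

# Option 2: tuple keys in a single flat dictionary
flat = {}
name = "Variable_FOO"
flat[tuple(name.split("_"))] = "LOC1"
print(flat[("Variable", "FOO")])   # LOC1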
A final approach allows you to use the syntax you want (where Variable_FOO is accessed as config.Variable["FOO"]), by using __getattr__ and a defaultdict behind the scenes:
from collections import defaultdict

class Config(object):
    def __init__(self):
        self._attrdicts = defaultdict(dict)

    def __getattr__(self, name):
        return self._attrdicts[name]
You could extend this with behavior for __setattr__ and __delattr__ but it's probably not necessary. The only serious limitation to this approach (given the original version of the question), is that the attributes names (like Variable) must be legal Python identifiers. You can't use strings with leading numbers, Python keywords (like global) or strings containing whitespace characters.
A downside to this approach is that it's a bit more difficult to use programmatically (by, for instance, your config-file parser). To read a value of Variable_FOO and save it to config.Variable["FOO"], you'll probably need to use the global getattr function, like this:
name, value = line.split("=")
prefix, suffix = name.split("_")
getattr(config, prefix)[suffix] = value
