What is the use of inheriting from 'dict' in a class? - python

I saw one of my colleagues write code like this:
class a(dict):
    # something
    pass
Is this a common technique? What is it for?

This can be done when you want a class with the default behaviour of a dictionary (getting and setting keys), but the instances are going to be used in highly specific circumstances, and you anticipate the need to provide custom methods or constructors specific to those.
For example, you may want to have a dynamic KeyStorage that starts as an in-memory store, and later adapt the class to keep the data on disk.
You can also mangle the keys and values as needed, for example to store unicode data in a database with a specific encoding.

In some cases it makes sense. For example you could create a dict that allows case insensitive lookup:
class case_insensitive_dict(dict):
    def __getitem__(self, key):
        return super(case_insensitive_dict, self).__getitem__(key.lower())
    def __setitem__(self, key, value):
        return super(case_insensitive_dict, self).__setitem__(key.lower(), value)
d = case_insensitive_dict()
d["AbCd"] = 1
print(d["abcd"])
(this might require additional error handling)
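One reason extra handling is needed: the built-in dict constructor, update(), get() and the in operator do not go through an overridden __getitem__ or __setitem__. The following is a minimal sketch of how those paths could be covered on Python 3. The class name CaseInsensitiveDict and the helper _norm are made up for this illustration, and methods such as pop() and setdefault() are deliberately left out.

```python
class CaseInsensitiveDict(dict):
    """Sketch: a dict that lower-cases string keys on the common access paths."""

    @staticmethod
    def _norm(key):
        # Hypothetical helper: non-string keys (ints, tuples, ...) pass through.
        return key.lower() if isinstance(key, str) else key

    def __init__(self, *args, **kwargs):
        super().__init__()
        # Route initial data through update() so the keys get normalised.
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return super().__getitem__(self._norm(key))

    def __setitem__(self, key, value):
        super().__setitem__(self._norm(key), value)

    def __delitem__(self, key):
        super().__delitem__(self._norm(key))

    def __contains__(self, key):
        return super().__contains__(self._norm(key))

    def get(self, key, default=None):
        return super().get(self._norm(key), default)

    def update(self, *args, **kwargs):
        # dict.update() bypasses __setitem__, so assign item by item instead.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value


d = CaseInsensitiveDict({"AbCd": 1}, Other=2)
d[42] = "non-string keys still work"
```

With this sketch, d["abcd"], "ABCD" in d and d.get("OTHER") all find their entries regardless of case.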

Extending the built-in dict class can be useful to create dict "supersets" (e.g. "bunch" class where keys can be accessed object-style, as in javascript) without having to reimplement MutableMapping's 5 methods by hand.
But if your colleague literally writes
class MyDict(dict):
    pass
without any customisation, I can only see evil uses for it, such as adding attributes to the dict:
>>> a = {}
>>> a.foo = 3
AttributeError: 'dict' object has no attribute 'foo'
>>> b = MyDict()
>>> b.foo = 3
>>>

Related

How can I implement virtual attribute hierarchies?

Is it possible to create a Python object which supports the following?
obj = ...
obj.abc.def = 123
print(obj.abc.def)
x = obj.abc.ghi(1,2,3)
The point is that obj should not contain all the attributes (which could be numerous, and possibly not known in advance), but instead end with a call to some sort of handler, i.e. handler(obj,'get','abc','def'), etc. to perform the requested action and return its result.
What I'm after is a convenient notation for interactive use, with the dot notation to access a specific detail of the object, without using objects of objects of objects.
I understand __getattr__ and __setattr__, and I've read about descriptors in Python 3, and maybe I should look into making __dict__ a mapping object, but I'm running afoul of the open-ended nesting involved when faced with a.b.c style sub-attributes.
A tree structure can be defined using defaultdict as follows:
from collections import defaultdict
def tree():
    return defaultdict(tree)
This can be used to create arbitrarily nested keys on the fly:
t = tree()
t['a']['b']['c'] = 123
Working with attribute access, we can use __getattr__ and __setattr__ to delegate calls to an underlying dict of a similar recursive form:
class Tree:
    def __init__(self):
        super().__setattr__('t', defaultdict(Tree))
    def __getattr__(self, name):
        return self.t[name]
    def __setattr__(self, name, value):
        self.t[name] = value
Now attributes can be generated on the fly by accessing them:
t = Tree()
t.a.b.c = 123
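The question also asked for attribute chains that end in a call to some handler rather than in stored data. Below is a rough sketch of one way that could look: each attribute access returns a new proxy that remembers the path so far, and only assignment or a call reaches the handler. The names PathProxy and record_handler and the handler signature (action, path, payload) are assumptions made for this example, not something from the question. Note that def is a Python keyword, so obj.abc.def is not valid syntax and other names are used here.

```python
class PathProxy:
    """Sketch: accumulate an attribute path and hand it to a handler."""

    def __init__(self, handler, path=()):
        # Use object.__setattr__ so our own __setattr__ is not triggered.
        object.__setattr__(self, "_handler", handler)
        object.__setattr__(self, "_path", path)

    def __getattr__(self, name):
        # Leave dunder lookups (copy, pickle, ...) alone.
        if name.startswith("__"):
            raise AttributeError(name)
        return PathProxy(self._handler, self._path + (name,))

    def __setattr__(self, name, value):
        self._handler("set", self._path + (name,), value)

    def __call__(self, *args):
        return self._handler("call", self._path, args)


calls = []

def record_handler(action, path, payload):
    # Hypothetical handler: just records what it was asked to do.
    calls.append((action, path, payload))
    return len(calls)

obj = PathProxy(record_handler)
obj.abc.ghi = 123
result = obj.abc.jkl(1, 2, 3)
```

A plain read such as obj.abc.ghi only builds another proxy in this sketch, because there is no way to tell where a chain of reads ends. A real design would need an explicit step (for example a call) to trigger the 'get'.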

What is the best way to create struct-like object of C/C++ in python?

I've been switching from Matlab to NumPy/SciPy, and I think NumPy is great in many respects.
But one thing I'm not comfortable with is that I cannot find a data structure similar to a struct in C/C++.
For example, I may want to do the following:
struct Parameters {
    double frame_size_sec;
    double frame_step_sec;
};
The simplest way is to use a dictionary, as follows.
parameters = {"frame_size_sec": 0.0, "frame_step_sec": 0.0}
But with a dictionary, unlike a struct, any key may be added. I'd like to restrict the keys.
The other option might be to use a class, as follows. But it has the same problem.
class Parameters:
    frame_size_sec = 0.0
    frame_step_sec = 0.0
From a thread, I saw that there is a data structure called named tuple, which looks great, but the biggest problem with it is that its fields are immutable. So it is still different from what I want.
In sum, what would be the best way to use a struct-like object in Python?
If you don't need actual memory layout guarantees, user-defined classes can restrict their set of instance members to a fixed list using __slots__. So for example:
class Parameters:  # On Python 2, use class Parameters(object):, as __slots__ only applies to new-style classes
    __slots__ = 'frame_size_sec', 'frame_step_sec'
    def __init__(self, frame_size_sec=0., frame_step_sec=0.):
        self.frame_size_sec = float(frame_size_sec)
        self.frame_step_sec = float(frame_step_sec)
gets you a class where on initialization, it's guaranteed to assign two float members, and no one can add new instance attributes (accidentally or on purpose) to any instance of the class.
Please read the caveats at the __slots__ documentation; in inheritance cases for instance, if a superclass doesn't define __slots__, then the subclass will still have __dict__ and therefore can have arbitrary attributes defined on it.
If you need memory layout guarantees and stricter (C) types for variables, you'll want to look at ctypes Structures, but from what you're saying, it sounds like you're just trying to enforce a fixed, limited set of attributes, not specific types or memory layouts.
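For the ctypes route mentioned above, a minimal sketch could look like the following. It mirrors the two double fields from the question's C struct. The sample values are arbitrary.

```python
import ctypes


class Parameters(ctypes.Structure):
    # Field names and C types, in memory order, as in the C struct.
    _fields_ = [
        ("frame_size_sec", ctypes.c_double),
        ("frame_step_sec", ctypes.c_double),
    ]


p = Parameters(0.025, 0.010)   # positional arguments follow _fields_ order
p.frame_step_sec = 1           # an int is converted to a C double
size = ctypes.sizeof(Parameters)
```

Assigning a value that cannot be converted, such as a string, to one of these fields raises TypeError, which is the stricter typing the answer refers to.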
While taking the risk of not being very Pythonic, you can create an immutable dictionary by subclassing the dict class and overriding some of its methods:
def not_supported(*args, **kwargs):
    raise NotImplementedError('ImmutableDict is immutable')
class ImmutableDict(dict):
    __delitem__ = not_supported
    __setattr__ = not_supported
    update = not_supported
    clear = not_supported
    pop = not_supported
    popitem = not_supported
    setdefault = not_supported  # setdefault could otherwise add a new key
    def __getattr__(self, item):
        return self[item]
    def __setitem__(self, key, value):
        if key in self:
            dict.__setitem__(self, key, value)
        else:
            raise NotImplementedError('ImmutableDict is immutable')
Some usage examples:
my_dict = ImmutableDict(a=1, b=2)
print(my_dict['a'])  # 1
my_dict['a'] = 3  # works: an existing key can be modified
print(my_dict.a)  # 3; works because __getattr__ is overridden
my_dict['c'] = 1  # raises NotImplementedError: a new key can't be added

Are there any 'gotchas' with this Python pattern?

Here's the pattern I'm thinking of using:
class Dicty(dict):
    def __init__(self):
        self.__dict__ = self
d = Dicty()
d.foo = 'bar'
print(d['foo'])  # bar
d['foo'] = 'baz'
print(d.foo)  # baz
Generally, I prefer the semantics of object attribute access over dict get/set access, but there are some circumstances where dict-like access is required (for example, d['foo-bar'] = 'baz'). I'd prefer not to have special getter and setter methods for these cases, hence the dual behavior of dict and object at the same time, with shared attributes.
Are there any gotchas with the above pattern?
Here's a less "hacky" way to achieve the same effect:
class Dicty(dict):
    def __getattr__(self, key):
        return self[key]
    def __setattr__(self, key, value):
        self[key] = value
I think that your way may work fine as well, but setting the __dict__ attribute like that seems a bit iffy style-wise, and is bound to raise some questions if anyone else ends up reading your code.
Don't set self.__dict__. Call __init__(self, *args, **kwargs) on the superclass. Also, dict inherits from object so you don't need to specify it.
A couple of things. One is if you try and use a dictionary method, such as keys, you won't be able to get it now. There were some other issues I ran into, such as being pickle-able and copy-able.
But I have implemented something that does this without those problems. You can see it here, it's the AttrDict class in the dictlib.py module. That module also contains a class that can wrap another mapping style object, in cases where it can't be subclassed.
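To make the clash with dictionary method names concrete, here is a small demonstration using the __getattr__/__setattr__ version of Dicty from the answer above. __getattr__ is only consulted when normal attribute lookup fails, so a key that shares its name with a dict method can be stored but cannot be read back with dot notation.

```python
class Dicty(dict):
    def __getattr__(self, key):
        return self[key]

    def __setattr__(self, key, value):
        self[key] = value


d = Dicty()
d.keys = ["a", "b"]          # stored as the item d["keys"]
stored = d["keys"]           # item access returns the stored list
attribute = d.keys           # attribute access still finds dict.keys
```

After this runs, stored is the list that was assigned, while attribute is the bound dict.keys method.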

Python introspection: How to get an 'unsorted' list of object attributes?

The following code
import types
class A:
    class D:
        pass
    class C:
        pass
for d in dir(A):
    if type(eval('A.'+d)) is types.ClassType:
        print d
outputs
C
D
How do I get it to output in the order in which these classes were defined in the code? I.e.
D
C
Is there any way other than using inspect.getsource(A) and parsing that?
Note that that parsing is already done for you in inspect - take a look at inspect.findsource, which searches the module for the class definition and returns the source and line number. Sorting on that line number (you may also need to split out classes defined in separate modules) should give the right order.
However, this function doesn't seem to be documented, and is just using a regular expression to find the line, so it may not be too reliable.
Another option is to use metaclasses, or some other way to attach ordering information to the object, either implicitly or explicitly. For example:
import itertools, operator
next_id = itertools.count().next
class OrderedMeta(type):
    def __init__(cls, name, bases, dct):
        super(OrderedMeta, cls).__init__(name, bases, dct)
        cls._order = next_id()
# Set the default metaclass
__metaclass__ = OrderedMeta
class A:
    class D:
        pass
    class C:
        pass
print sorted([cls for cls in [getattr(A, name) for name in dir(A)]
              if isinstance(cls, OrderedMeta)], key=operator.attrgetter("_order"))
However, this is a fairly intrusive change (it requires setting the metaclass of any classes you're interested in to OrderedMeta).
The inspect module also has the findsource function. It returns a tuple of source lines and line number where the object is defined.
>>> import inspect
>>> import StringIO
>>> inspect.findsource(StringIO.StringIO)[1]
41
>>>
The findsource function actually searches through the source file and looks for likely candidates if it is given a class object.
Given a method, function, traceback, frame, or code object, it simply looks at the co_firstlineno attribute of the (contained) code object.
No, you can't get those attributes in the order you're looking for, at least not on Python 2 or on Python 3 before 3.6. There, class attributes are stored in a dict (read: hashmap), which has no awareness of insertion order.
Also, I would avoid the use of eval by simply saying
if type(getattr(A, d)) is types.ClassType:
    print d
in your loop. Note that you can also just iterate through key/value pairs in A.__dict__.
AFAIK, no -- there isn't*. This is because all of a class's attributes are stored in a dictionary (which is, as you know, unordered).
*: it might actually be possible, but that would require either decorators or possibly metaclass hacking. Do either of those interest you?
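On Python 3.6 and later the situation is different: the class namespace preserves definition order, so iterating over the class's own __dict__ (rather than dir(), which sorts names alphabetically) gives the nested classes in the order they were written. A short sketch, assuming Python 3.6+:

```python
import inspect


class A:
    class D:
        pass

    class C:
        pass


# vars(A) is A.__dict__; its items come back in definition order on 3.6+.
nested = [name for name, value in vars(A).items() if inspect.isclass(value)]
print(nested)
```

This only covers names defined directly in the body of A, not inherited ones.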
import inspect
class ExampleObject:
    def example2():
        pass
    def example1():
        pass
context = ExampleObject
def sort_key(item):
    # findsource returns (source_lines, line_number); sort on the line number
    return inspect.findsource(item)[1]
properties = [
    getattr(context, attribute) for attribute in dir(context)
    if callable(getattr(context, attribute)) and
    not attribute.startswith('__')
]
properties.sort(key=sort_key)
print(properties)
Should print out:
[<function ExampleObject.example2 at 0x7fc2baf9e940>, <function ExampleObject.example1 at 0x7fc2bae5e790>]
I needed to use this as well for a compiler I'm building, and it proved very useful.
I'm not trying to be glib here, but would it be feasible for you to organize the classes in your source alphabetically? I find that when there are lots of classes in one file this can be useful in its own right.

Javascript style dot notation for dictionary keys unpythonic?

I've started to use constructs like these:
class DictObj(object):
    def __init__(self):
        self.d = {}
    def __getattr__(self, m):
        return self.d.get(m, None)
    def __setattr__(self, m, v):
        super(DictObj, self).__setattr__(m, v)
Update: based on this thread, I've revised the DictObj implementation to:
class dotdict(dict):
    def __getattr__(self, attr):
        return self.get(attr, None)
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
class AutoEnum(object):
    def __init__(self):
        self.counter = 0
        self.d = {}
    def __getattr__(self, c):
        if c not in self.d:
            self.d[c] = self.counter
            self.counter += 1
        return self.d[c]
where DictObj is a dictionary that can be accessed via dot notation:
d = DictObj()
d.something = 'one'
I find it more aesthetically pleasing than d['something']. Note that accessing an undefined key returns None instead of raising an exception, which is also nice.
Update: Smashery makes a good point, which mhawke expands on for an easier solution. I'm wondering if there are any undesirable side effects of using dict instead of defining a new dictionary; if not, I like mhawke's solution a lot.
AutoEnum is an auto-incrementing Enum, used like this:
CMD = AutoEnum()
cmds = {
    "peek": CMD.PEEK,
    "look": CMD.PEEK,
    "help": CMD.HELP,
    "poke": CMD.POKE,
    "modify": CMD.POKE,
}
Both are working well for me, but I'm feeling unpythonic about them.
Are these in fact bad constructs?
Your DictObj example is actually quite common. Object-style dot-notation access can be a win if you are dealing with ‘things that resemble objects’, i.e. they have fixed property names containing only characters valid in Python identifiers. Stuff like database rows or form submissions can be usefully stored in this kind of object, making code a little more readable without the excess of ['item access'].
The implementation is a bit limited - you don't get the nice constructor syntax of dict, len(), comparisons, 'in', iteration or nice reprs. You can of course implement those things yourself, but in the new-style-classes world you can get them for free by simply subclassing dict:
class AttrDict(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
To get the default-to-None behaviour, simply subclass Python 2.5's collections.defaultdict class instead of dict.
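One thing to be aware of with the three-line AttrDict: a missing attribute raises KeyError rather than AttributeError, and on Python 3 helpers such as hasattr() and getattr() with a default only catch AttributeError. The variant below is a sketch that translates the exception. The name SafeAttrDict is made up for this example.

```python
class SafeAttrDict(dict):
    """Sketch: dot access that raises AttributeError for missing keys."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # Translate so hasattr()/getattr(..., default) behave normally.
            raise AttributeError(name)

    __setattr__ = dict.__setitem__

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name)


row = SafeAttrDict(id=7, title="example")
row.status = "new"
has_missing = hasattr(row, "missing")
fallback = getattr(row, "missing", "n/a")
```

Here has_missing is False and fallback is "n/a", whereas the same two calls on the plain AttrDict would propagate a KeyError on Python 3.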
With regards to the DictObj, would the following work for you? A blank class will allow you to arbitrarily add to or replace stuff in a container object.
class Container(object):
    pass
>>> myContainer = Container()
>>> myContainer.spam = "in a can"
>>> myContainer.eggs = "in a shell"
If you want to not throw an AttributeError when there is no attribute, what do you think about the following? Personally, I'd prefer to use a dict for clarity, or to use a try/except clause.
class QuietContainer(object):
    def __getattr__(self, attribute):
        # __getattr__ is only called after normal lookup has failed,
        # so the attribute is known to be missing at this point.
        return None
>>> cont = QuietContainer()
>>> print cont.me
None
Right?
This is a simpler version of your DictObj class:
class DictObj(object):
    def __getattr__(self, attr):
        return self.__dict__.get(attr)
>>> d = DictObj()
>>> d.something = 'one'
>>> print d.something
one
>>> print d.somethingelse
None
>>>
As far as I know, Python classes use dictionaries to store their attributes anyway (that's hidden from the programmer), so it looks to me that what you've done there is effectively emulate a Python class... using a python class.
It's not "wrong" to do this, and it can be nicer if your dictionaries have a strong possibility of turning into objects at some point, but be wary of the reasons for having bracket access in the first place:
Dot access can't use keywords as keys.
Dot access has to use Python-identifier-valid characters in the keys.
Dictionaries can hold any hashable element -- not just strings.
Also keep in mind you can always make your objects access like dictionaries if you decide to switch to objects later on.
For a case like this I would default to the "readability counts" mantra: presumably other Python programmers will be reading your code and they probably won't be expecting dictionary/object hybrids everywhere. If it's a good design decision for a particular situation, use it, but I wouldn't use it without necessity to do so.
The one major disadvantage of using something like your DictObj is you either have to limit allowable keys or you can't have methods on your DictObj such as .keys(), .values(), .items(), etc.
There's a symmetry between this and this answer:
class dotdict(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
The same interface, just implemented the other way round...
class container(object):
    __getitem__ = object.__getattribute__
    __setitem__ = object.__setattr__
    __delitem__ = object.__delattr__
Don't overlook Bunch.
It is a child of dictionary and can import YAML or JSON, or convert any existing dictionary to a Bunch and vice-versa. Once "bunchify"'d, a dictionary gains dot notations without losing any other dictionary methods.
I like dot notation a lot better than dictionary fields personally. The reason being that it makes autocompletion work a lot better.
It's not bad if it serves your purpose. "Practicality beats purity".
I have seen this approach elsewhere (e.g. in Paver), so it can be considered a common need (or desire).
Because you ask for undesirable side-effects:
A disadvantage is that in visual editors like Eclipse + PyDev, you will see many undefined variable errors on lines using the dot notation. PyDev will not be able to find such runtime "object" definitions, whereas in the case of a normal dictionary, it knows that you are just getting a dictionary entry.
You would need to 1) ignore those errors and live with red crosses; 2) suppress those warnings on a line-by-line basis using #@UndefinedVariable; or 3) disable the undefined variable error entirely, causing you to miss real undefined variable errors.
If you're looking for an alternative that handles nested dicts:
Recursively transform a dict to instances of the desired class
import json
from collections import namedtuple
class DictTransformer():
    @classmethod
    def constantize(cls, d):
        return cls.transform(d, klass=namedtuple, klassname='namedtuple')
    @classmethod
    def transform(cls, d, klass, klassname):
        return cls._from_json(cls._to_json(d), klass=klass, klassname=klassname)
    @classmethod
    def _to_json(cls, d, access_method='__dict__'):
        return json.dumps(d, default=lambda o: getattr(o, access_method, str(o)))
    @classmethod
    def _from_json(cls, jsonstr, klass, klassname):
        return json.loads(jsonstr, object_hook=lambda d: klass(klassname, d.keys())(*d.values()))
Ex:
constants = {
    'A': {
        'B': {
            'C': 'D'
        }
    }
}
CONSTANTS = DictTransformer.transform(constants, klass=namedtuple, klassname='namedtuple')
CONSTANTS.A.B.C == 'D'
Pros:
handles nested dicts
can potentially generate other classes
namedtuples provide immutability for constants
Cons:
may not respond to .keys and .values if those are not provided on your klass (though you can sometimes mimic with ._fields and list(A.B.C))
Thoughts?
h/t to @hlzr for the original class idea
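If mutability is wanted instead of constants, the same JSON round-trip idea can target types.SimpleNamespace from the standard library. This is only a sketch under the assumption that the data is JSON-compatible (string keys, no custom objects). The function name to_namespace is made up for this example.

```python
import json
from types import SimpleNamespace


def to_namespace(data):
    """Recursively turn nested dicts into SimpleNamespace objects."""
    # object_hook is called for every decoded JSON object, innermost first.
    return json.loads(json.dumps(data),
                      object_hook=lambda d: SimpleNamespace(**d))


config = to_namespace({"A": {"B": {"C": "D"}}, "entries": [{"n": 1}]})
config.A.B.C = "E"   # attributes stay writable, unlike namedtuple fields
```

Dicts nested inside lists are converted as well, so config.entries[0].n works. Keys that are not valid identifiers are still stored but can only be reached through getattr() or vars().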
