Are there any 'gotchas' with this Python pattern? - python

Here's the pattern I'm thinking of using:
class Dicty(dict):
    def __init__(self):
        self.__dict__ = self
d = Dicty()
d.foo = 'bar'
print d['foo']
>>> bar
d['foo'] = 'baz'
print d.foo
>>> 'baz'
Generally, I prefer the semantics of object attribute access over dict get/set access, but in some circumstances dict-like access is required (for example, d['foo-bar'] = 'baz'), and I'd prefer not to write special getter/setter methods for those cases. Hence the dual behavior: dict and object at the same time, with shared attributes.
Are there any gotchas with the above pattern?

Here's a less "hacky" way to achieve the same effect:
class Dicty(dict):
    def __getattr__(self, key):
        return self[key]
    def __setattr__(self, key, value):
        self[key] = value
I think that your way may work fine as well, but setting the __dict__ attribute like that seems a bit iffy style-wise, and is bound to raise some questions if anyone else ends up reading your code.

Don't set self.__dict__. Call __init__(self, *args, **kwargs) on the superclass. Also, dict inherits from object so you don't need to specify it.

A couple of things. One is if you try and use a dictionary method, such as keys, you won't be able to get it now. There were some other issues I ran into, such as being pickle-able and copy-able.
But I have implemented something that does this without those problems. You can see it here, it's the AttrDict class in the dictlib.py module. That module also contains a class that can wrap another mapping style object, in cases where it can't be subclassed.
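The first gotcha mentioned above (losing access to dictionary methods) can be reproduced with the pattern from the question. This is a minimal sketch, assuming Python 3; the key name 'keys' is chosen only to force the collision.

```python
class Dicty(dict):
    def __init__(self):
        super().__init__()
        self.__dict__ = self  # the instance's attribute namespace IS the dict


d = Dicty()
d.foo = 'bar'
assert d['foo'] == 'bar'  # attribute writes land in the dict

# A key named like a dict method shadows that method on this instance,
# because instance attributes win over ordinary (non-data) class attributes.
d['keys'] = 'oops'
assert d.keys == 'oops'  # calling d.keys() would now raise TypeError

# The method is still reachable through the class.
assert list(dict.keys(d)) == ['foo', 'keys']
```

Whether this matters depends on whether keys can ever collide with method names in your data.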

Related

How to Elegantly Pass All Attributes of a Class as Arguments in a Function?

I have a somewhat complex class Thing, and an associated mixin IterMixin (to make the class iterable)...and a funky method elsewhere in the codebase which receives an instance of my class as an argument.
In fact, I'm attempting to bundle up a bunch of parameters as single object to be passed to multiple external functions beyond the funky function below. A parameter object design pattern of sorts...
class IterMixin():
    def __iter__(self):
        for attr, value in self.__dict__.items():
            yield attr, value
class Thing(IterMixin):
    def __iter__(self, foo=None, bar=None, baz=999):
        if foo is None:
            self.foo = {}
        else:
            self.foo = foo
        if bar is None:
            self.foo = {}
        else:
            self.bar = bar
        self.baz = baz
    @property
    def foo(self):
        return self._foo
    @foo.setter
    def foo(self, data):
        self._foo = self.parser(data)
    @property
    def bar(self):
        return self._bar
    @bar.setter
    def bar(self, more_data):
        self._bar, self.baz = self.another_parser(more_data)
    def parser(self, data):
        ...do stuff...
        return foo
    def another_parser(self, more_data):
        ...do add'l stuff...
        return bar, baz
With regard to the funky function, in a completely different module, via the Thing class, I want to pass Thing's attributes (foo, bar, and baz) to the funky function as one argument...like so:
thing_args = Thing()
def funky(*thing_args):
    ...do stuff...
    ...expecting to manipulate keys from things_arg
    ...
    return whatever
PROBLEM:
If I do not make the attributes behind foo and bar private (i.e., store them as self._foo and self._bar, with a leading underscore), then I trigger infinite recursion during class initialization, as __init__ and the setters for these attributes call themselves over and over. To avoid that, I used the @property decorator and "privatized" foo and bar while setting them.
However, when I pass an instance of the Thing class and unpack its attributes as args in the funky function via a splat (asterisk), and then introspect the resulting keys for those attributes, I still get _foo and _bar. I can't seem to get rid of the underscores. (In other words, I get the "privatized" attribute names of Thing.)
The biz logic of funky needs the unpacked values to not have any underscores.
Why is this happening (the underscores upon unpacking)? How can I fix this? Is there a more elegant way to either initialize the foo and bar attributes without privatizing anything? Or perhaps a more Pythonic way to pass all the attributes in the Thing class to my funky function?
First, you've got a major problem that will prevent you from even seeing the problem you've asked for help with: Your Thing class defines an __iter__ method that doesn't super, and doesn't yield or return anything. Hopefully that part is just some typo and you know how to fix it to do whatever you actually wanted there.
Now, onto the problem you're asking about:
class IterMixin():
    def __iter__(self):
        for attr, value in self.__dict__.items():
            yield attr, value
Try printing out the __dict__ of your instances. Or, better, instances of a minimal example like this:
class Thing:
    @property
    def foo(self):
        return self._foo
    @foo.setter
    def foo(self, data):
        self._foo = data
t = Thing()
t.foo = 2
print(t.__dict__)
The output is {'_foo': 2}.
You've tried to hide the attributes by giving them private names and putting them behind properties, but then you've gone around behind the properties' backs and looked directly into the __dict__ where the real attributes are.
And what else could be there? Your actual _foo has to be stored somewhere on each instance. That foo, on the other hand, isn't really a value, it's a getter/setter that uses that private attribute, so it isn't stored anywhere.
If you really want to use reflection to find all of the "public values" on an instance, you can do something like this:
for attr, value in inspect.getmembers(self):
    if not attr.startswith('_') and not callable(value):
        yield attr, value
However, I think it would be much better to not do this reflectively. Simpler and cleaner options include:
Add a _fields = 'foo', 'bar', 'baz' class attribute and have the base class iterate _fields.
Write a decorator that registers a property, and have the base class iterate that registry.
Build something that lets you specify the attributes more declaratively and writes the boilerplate for you. See namedtuple, dataclass, and attrs for some inspiration.
Just use attrs (or, if you're not the OP but someone reading this from the future who can rely on 3.7+, dataclass) to do that work for you.
Rethink your design. A class whose instances iterate name-value pairs of their public attributes is weird in the first place. A "parameter object" that acted like a mapping to be used for keyword-splatting could be useful; one that acted like a normal iterable could be useful; one that acts as an iterable of name-value pairs is useless for anything except for passing to a dict construct (at which point it's, again, simpler to be a mapping). Plus, a mixin is really not helping you with the hard part of doing it. Whatever you actually need to do, ask for help on how to do that, instead of how to make this code that shouldn't work work anyway.
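To make the first option in the list above concrete, here is a minimal sketch, assuming Python 3. The names FieldsMixin and the keyword-splatting funky(**kwargs) signature are illustrative assumptions, and the parser logic from the question is left out because it is elided there.

```python
class FieldsMixin:
    """Iterate (name, value) pairs for the public names listed in _fields."""
    _fields = ()

    def __iter__(self):
        for name in self._fields:
            # getattr goes through properties, so no underscores leak out
            yield name, getattr(self, name)


class Thing(FieldsMixin):
    _fields = ('foo', 'bar', 'baz')

    def __init__(self, foo=None, bar=None, baz=999):
        self.foo = {} if foo is None else foo
        self.bar = {} if bar is None else bar
        self.baz = baz

    @property
    def foo(self):
        return self._foo

    @foo.setter
    def foo(self, data):
        self._foo = data  # stand-in for the question's elided parser


def funky(**kwargs):
    # hypothetical consumer: just reports which keyword names it received
    return sorted(kwargs)


thing = Thing(foo={'a': 1})
assert dict(thing) == {'foo': {'a': 1}, 'bar': {}, 'baz': 999}
assert funky(**dict(thing)) == ['bar', 'baz', 'foo']
```

The private storage name (_foo) never appears in the output because iteration is driven by the declared public names rather than by __dict__.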

what's the usage of inherit 'dict' with a class?

I saw one of my colleagues write code like this:
class a(dict):
    # something
    pass
Is this a common technique? What purpose does it serve?
This can be done when you want a class with the default behaviour of a dictionary (getting and setting keys), but the instances are going to be used in highly specific circumstances, and you anticipate the need to provide custom methods or constructors specific to those.
For example, you may want to have a dynamic KeyStorage that starts as an in-memory store, but later adapt the class to keep the data on disk.
You can also mangle the keys and values as needed - for storage of unicode data on a database with a specific encoding, for example.
In some cases it makes sense. For example you could create a dict that allows case insensitive lookup:
class case_insensitive_dict(dict):
    def __getitem__(self, key):
        return super(case_insensitive_dict, self).__getitem__(key.lower())
    def __setitem__(self, key, value):
        return super(case_insensitive_dict, self).__setitem__(key.lower(), value)
d = case_insensitive_dict()
d["AbCd"] = 1
print d["abcd"]
(this might require additional error handling)
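One gap in the subclass above is that methods inherited from dict, such as get(), update(), the in operator and the constructor, do not go through the overridden __getitem__/__setitem__, so they still see the original casing. The sketch below (Python 3) avoids that by building on collections.abc.MutableMapping instead. The names CaseInsensitiveDict and _norm are illustrative, not part of the original answer.

```python
from collections.abc import MutableMapping


class CaseInsensitiveDict(MutableMapping):
    """Mapping whose string keys are compared case-insensitively."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        # MutableMapping.update() routes every item through __setitem__
        self.update(*args, **kwargs)

    @staticmethod
    def _norm(key):
        # Only strings are lowercased; other hashable keys pass through.
        return key.lower() if isinstance(key, str) else key

    def __getitem__(self, key):
        return self._data[self._norm(key)]

    def __setitem__(self, key, value):
        self._data[self._norm(key)] = value

    def __delitem__(self, key):
        del self._data[self._norm(key)]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


d = CaseInsensitiveDict(AbCd=1)
assert d["abcd"] == 1
assert "ABCD" in d           # __contains__ is derived from __getitem__
assert d.get("aBcD") == 1    # so is get()
```

The five abstract methods are all that need writing; the mixin supplies get, update, pop, setdefault and the rest in terms of them.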
Extending the built-in dict class can be useful to create dict "supersets" (e.g. "bunch" class where keys can be accessed object-style, as in javascript) without having to reimplement MutableMapping's 5 methods by hand.
But if your colleague literally writes
class MyDict(dict):
    pass
without any customisation, I can only see evil uses for it, such as adding attributes to the dict:
>>> a = {}
>>> a.foo = 3
AttributeError: 'dict' object has no attribute 'foo'
>>> b = MyDict()
>>> b.foo = 3
>>>

Python properties as instance attributes

I am trying to write a class with dynamic properties. Consider the following class with two read-only properties:
class Monster(object):
    def __init__(self,color,has_fur):
        self._color = color
        self._has_fur = has_fur
    @property
    def color(self): return self._color
    @property
    def has_fur(self): return self._has_fur
I want to generalize this so that __init__ can take an arbitrary dictionary and create read-only properties from each item in the dictionary. I could do that like this:
class Monster2(object):
    def __init__(self,traits):
        self._traits = traits
        for key,value in traits.iteritems():
            setattr(self.__class__,key,property(lambda self,key=key: self._traits[key]))
However, this has a serious drawback: every time I create a new instance of Monster, I am actually modifying the Monster class. Instead of creating properties for my new Monster instance, I am effectively adding properties to all instances of Monster. To see this:
>>> hasattr(Monster2,"height")
False
>>> hasattr(Monster2,"has_claws")
False
>>> blue_monster = Monster2({"height":4.3,"color":"blue"})
>>> hasattr(Monster2,"height")
True
>>> hasattr(Monster2,"has_claws")
False
>>> red_monster = Monster2({"color":"red","has_claws":True})
>>> hasattr(Monster2,"height")
True
>>> hasattr(Monster2,"has_claws")
True
This of course makes sense, since I explicitly added the properties as class attributes with setattr(self.__class__,key,property(lambda self,key=key: self._traits[key])). What I need here instead are properties that can be added to the instance. (i.e. "instance properties"). Unfortunately, according to everything I have read and tried, properties are always class attributes, not instance attributes. For example, this doesn't work:
class Monster3(object):
    def __init__(self,traits):
        self._traits = traits
        for key,value in traits.iteritems():
            self.__dict__[key] = property(lambda self,key=key: self._traits[key])
>>> green_monster = Monster3({"color":"green"})
>>> green_monster.color
<property object at 0x028FDAB0>
So my question is this: do "instance properties" exist? If not, what is the reason? I have been able to find lots about how properties are used in Python, but precious little about how they are implemented. If "instance properties" don't make sense, I would like to understand why.
No, there is no such thing as per-instance properties; like all descriptors, properties are always looked up on the class. See the descriptor HOWTO for exactly how that works.
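The difference between class-level and instance-level lookup can be seen directly. This is a minimal sketch in Python 3; the class name Demo and the attribute names are made up for illustration.

```python
class Demo:
    pass


prop = property(lambda self: 42)
obj = Demo()

# Stored in the instance dict: returned as-is, the descriptor protocol is skipped.
obj.__dict__['on_instance'] = prop
# Stored on the class: attribute access calls prop.__get__(obj, Demo).
Demo.on_class = prop

assert obj.on_instance is prop
assert obj.on_class == 42
```

The same property object behaves differently depending only on where it is found.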
You can implement dynamic attributes using a __getattr__ hook instead, which can check for instance attributes dynamically:
class Monster(object):
    def __init__(self, traits):
        self._traits = traits
    def __getattr__(self, name):
        if name in self._traits:
            return self._traits[name]
        raise AttributeError(name)
These attributes are not really dynamic though; you could just set these directly on the instance:
class Monster(object):
    def __init__(self, traits):
        self.__dict__.update(traits)
So my question is this: do "instance properties" exist?
No.
If not, what is the reason?
Because properties are implemented as descriptors. And the magic of descriptors is that they do different things when found in an object's type's dictionary than when found in the object's dictionary.
I have been able to find lots about how properties are used in Python, but precious little about how they are implemented.
Read the Descriptor HowTo Guide linked above.
So, is there a way you could do this?
Well, yes, if you're willing to rethink the design a little.
For your case, all you want to do is use _traits in place of __dict__, and you're generating useless getter functions dynamically, so you could replace the whole thing with a couple of lines of code, as in Martijn Pieters's answer.
Or, if you want to redirect .foo to ._foo iff foo is in a list (or, better, set) of _traits, that's just as easy:
def __getattr__(self, name):
    if name in self._traits:
        return getattr(self, '_' + name)
    raise AttributeError
But let's say you actually had some kind of use for getter functions—each attribute actually needs some code to generate the value, which you've wrapped up in a function, and stored in _traits. In that case:
def __getattr__(self, name):
    getter = self._traits.get(name)
    if getter:
        return getter()
    raise AttributeError
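Placed in a complete class, the getter-function variant might look like the sketch below (Python 3). It assumes that _traits maps attribute names to zero-argument callables. The guard for underscore-prefixed names is an addition not in the answer, included so that a missing _traits cannot send __getattr__ into infinite recursion.

```python
class Monster:
    def __init__(self, traits):
        # assumption: traits maps names to zero-argument callables
        self._traits = dict(traits)

    def __getattr__(self, name):
        # Only called when normal attribute lookup has already failed.
        if name.startswith('_'):
            raise AttributeError(name)
        getter = self._traits.get(name)
        if getter is None:
            raise AttributeError(name)
        return getter()


blue = Monster({'color': lambda: 'blue', 'height': lambda: 4.3})
red = Monster({'color': lambda: 'red'})

assert blue.color == 'blue'
assert red.color == 'red'
assert not hasattr(red, 'height')  # per-instance: red has no height
```

Because the lookup table lives on the instance, nothing is added to the class when a new Monster is created.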
What I need here instead are properties that can be added to the instance.
A property() is a descriptor and those only work when stored in classes, not when stored in instances.
An easy way to achieve the effect of an instance property is to define __getattr__. That will let you control the behavior for lookups.
In case you don't need to make the properties read-only, you can just update the object's __dict__ with kwargs:
class Monster(object):
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)
Then you can create instances of that class like this:
m0 = Monster(name='X')
m1 = Monster(name='godzilla', behaviour='godzilla behaviour')
Another way of doing what you want could be to dynamically create monster classes. e.g.
def make_monster_class(traits):
    class DynamicMonster(object):
        pass
    for key, val in traits.items():
        setattr(DynamicMonster, key, property(lambda self, val=val: val))
    return DynamicMonster()
blue_monster = make_monster_class({"height": 4.3, "color": "blue"})
red_monster = make_monster_class({"color": "red", "has_claws": True})
for check in ("height", "color", "has_claws"):
    print "blue", check, getattr(blue_monster, check, "N/A")
    print "red ", check, getattr(red_monster, check, "N/A")
Output:
blue height 4.3
red height N/A
blue color blue
red color red
blue has_claws N/A
red has_claws True
I don't necessarily recommend this (the __getattr__ solution is generally preferable) but you could write your class so that all instances made from it have their own class (well, a subclass of it). This is a quick hacky implementation of that idea:
class MyClass(object):
    def __new__(Class):
        Class = type(Class.__name__, (Class,), {})
        Class.__new__ = object.__new__ # to prevent infinite recursion
        return Class()
m1 = MyClass()
m2 = MyClass()
assert type(m1) is not type(m2)
Now you can set properties on type(self) with aplomb since each instance has its own class object.
@Claudiu's answer is the same kind of thing, just implemented with a function instead of integrated into the instance-making machinery.

How to use default property descriptors and successfully assign from __init__()?

What's the correct idiom for this please?
I want to define an object containing properties which can (optionally) be initialized from a dict (the dict comes from JSON; it may be incomplete). Later on I may modify the properties via setters.
There are actually 13+ properties, and I want to be able to use default getters and setters, but that doesn't seem to work for this case:
But I don't want to have to write explicit descriptors for all of prop1... propn
Also, I'd like to move the default assignments out of __init__() and into the accessors... but then I'd need explicit descriptors.
What's the most elegant solution? (other than move all the setter calls out of __init__() and into a method/classmethod _make()?)
[DELETED COMMENT The code for badprop using default descriptor was due to comment by a previous SO user, who gave the impression it gives you a default setter. But it doesn't - the setter is undefined and it necessarily throws AttributeError.]
class DubiousPropertyExample(object):
    def __init__(self,dct=None):
        self.prop1 = 'some default'
        self.prop2 = 'other default'
        #self.badprop = 'This throws AttributeError: can\'t set attribute'
        if dct is None: dct = dict() # or use defaultdict
        for prop,val in dct.items():
            self.__setattr__(prop,val)
    # How do I do default property descriptors? this is wrong
    #@property
    #def badprop(self): pass
    # Explicit descriptors for all properties - yukk
    @property
    def prop1(self): return self._prop1
    @prop1.setter
    def prop1(self,value): self._prop1 = value
    @property
    def prop2(self): return self._prop2
    @prop2.setter
    def prop2(self,value): self._prop2 = value
dub = DubiousPropertyExample({'prop2':'crashandburn'})
print dub.__dict__
# {'_prop2': 'crashandburn', '_prop1': 'some default'}
If you run this with line 5 self.badprop = ... uncommented, it fails:
self.badprop = 'This throws AttributeError: can\'t set attribute'
AttributeError: can't set attribute
[As ever, I read the SO posts on descriptors, implicit descriptors, calling them from init]
I think you're slightly misunderstanding how properties work. There is no "default setter". It throws an AttributeError on setting badprop not because it doesn't yet know that badprop is a property rather than a normal attribute (if that were the case it would just set the attribute with no error, because that's how normal attributes behave), but because you haven't provided a setter for badprop, only a getter.
Have a look at this:
>>> class Foo(object):
        @property
        def foo(self):
            return self._foo
        def __init__(self):
            self._foo = 1
>>> f = Foo()
>>> f.foo = 2
Traceback (most recent call last):
  File "<pyshell#12>", line 1, in <module>
    f.foo = 2
AttributeError: can't set attribute
You can't set such an attribute even from outside of __init__, after the instance is constructed. If you just use @property, then what you have is a read-only property (effectively a method call that looks like an attribute read).
If all you're doing in your getters and setters is redirecting read/write access to an attribute of the same name but with an underscore prepended, then by far the simplest thing to do is get rid of the properties altogether and just use normal attributes. Python isn't Java (and even in Java I'm not convinced of the virtue of private fields with the obvious public getter/setter anyway). An attribute that is directly accessible to the outside world is a perfectly reasonable part of your "public" interface. If you later discover that you need to run some code whenever an attribute is read/written you can make it a property then without changing your interface (this is actually what descriptors were originally intended for, not so that we could start writing Java style getters/setters for every single attribute).
If you're actually doing something in the properties other than changing the name of the attribute, and you do want your attributes to be readonly, then your best bet is probably to treat the initialisation in __init__ as directly setting the underlying data attributes with the underscore prepended. Then your class can be straightforwardly initialised without AttributeErrors, and thereafter the properties will do their thing as the attributes are read.
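As a sketch of the read-only case described in the paragraph above (Python 3): __init__ writes the underscore-prefixed storage directly, and the getter-only property exposes it. The class name Reading and the upper-casing in the getter are placeholders for whatever work the property really does.

```python
class Reading:
    def __init__(self, dct=None):
        dct = dct or {}
        # Write the underlying storage directly; no setter is needed.
        self._prop1 = dct.get('prop1', 'some default')

    @property
    def prop1(self):
        # placeholder for real work done on read
        return self._prop1.upper()


r = Reading({'prop1': 'from json'})
assert r.prop1 == 'FROM JSON'

try:
    r.prop1 = 'changed'
except AttributeError:
    pass  # read-only: a property without a setter rejects assignment
else:
    raise AssertionError('expected AttributeError')
```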
If you're actually doing something in the properties other than changing the name of the attribute, and you want your attributes to be readable and writable, then you'll need to actually specify what happens when you get/set them. If each attribute has independent custom behaviour, then there is no more efficient way to do this than explicitly providing a getter and a setter for each attribute.
If you're running exactly the same (or very similar) code in every single getter/setter (and it's not just adding an underscore to the real attribute name), and that's why you object to writing them all out (rightly so!), then you may be better served by implementing some of __getattr__, __getattribute__, and __setattr__. These allow you to redirect attribute reading/writing to the same code each time (with the name of the attribute as a parameter), rather than to two functions for each attribute (getting/setting).
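A minimal sketch of that last approach, assuming Python 3. Every managed attribute shares one read path and one write path. The class name Settings, the _managed table of defaults, and the choice to reject unknown names are all illustrative assumptions rather than anything from the question.

```python
class Settings:
    # name -> default value for every attribute handled by the shared code
    _managed = {'prop1': 'some default', 'prop2': 'other default'}

    def __init__(self, dct=None):
        # Bypass our own __setattr__ to create the storage dict itself.
        object.__setattr__(self, '_values', dict(self._managed))
        for name, value in (dct or {}).items():
            setattr(self, name, value)

    def __getattr__(self, name):
        # Only reached when normal lookup fails; read storage via __dict__
        # so that a missing _values cannot cause recursion.
        values = self.__dict__.get('_values', {})
        if name in values:
            return values[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        if name not in self._managed:
            raise AttributeError('unknown setting: %s' % name)
        # shared code for every managed attribute would go here
        self._values[name] = value


s = Settings({'prop2': 'crashandburn'})
assert s.prop1 == 'some default'
assert s.prop2 == 'crashandburn'
s.prop1 = 'changed'
assert s.prop1 == 'changed'
```

Rejecting unknown names also addresses the concern about an incomplete or unexpected input dict trampling other members of the instance.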
It seems like the easiest way to go about this is to just implement __getattr__ and __setattr__ such that they will access any key in your parsed JSON dict, which you should set as an instance member. Alternatively, you could call update() on self.__dict__ with your parsed JSON, but that's not really the best way to go about things, as it means your input dict could potentially trample members of your instance.
As to your setters and getters, you should only be creating them if they actually do something special other than directly set or retrieve the value in question. Python isn't Java (or C++ or anything else), you shouldn't try to mimic the private/set/get paradigm that is common in those languages.
I simply store the dict on the instance and get/set my properties there.
class test(object):
    def __init__(self,**kwargs):
        self.kwargs = kwargs
        #self.value = 20 assigning from __init__ is possible
    @property
    def value(self):
        if self.kwargs.get('value') is None:
            self.kwargs.update(value=0)#default
        return self.kwargs.get('value')
    @value.setter
    def value(self,v):
        print(v) #do something with v
        self.kwargs.update(value=v)
x = test()
print(x.value)
x.value = 10
x.value = 5
Output
0
10
5

Javascript style dot notation for dictionary keys unpythonic?

I've started to use constructs like these:
class DictObj(object):
    def __init__(self):
        self.d = {}
    def __getattr__(self, m):
        return self.d.get(m, None)
    def __setattr__(self, m, v):
        super.__setattr__(self, m, v)
Update: based on this thread, I've revised the DictObj implementation to:
class dotdict(dict):
    def __getattr__(self, attr):
        return self.get(attr, None)
    __setattr__= dict.__setitem__
    __delattr__= dict.__delitem__
class AutoEnum(object):
    def __init__(self):
        self.counter = 0
        self.d = {}
    def __getattr__(self, c):
        if c not in self.d:
            self.d[c] = self.counter
            self.counter += 1
        return self.d[c]
where DictObj is a dictionary that can be accessed via dot notation:
d = DictObj()
d.something = 'one'
I find it more aesthetically pleasing than d['something']. Note that accessing an undefined key returns None instead of raising an exception, which is also nice.
Update: Smashery makes a good point, which mhawke expands on for an easier solution. I'm wondering if there are any undesirable side effects of using dict instead of defining a new dictionary; if not, I like mhawke's solution a lot.
AutoEnum is an auto-incrementing Enum, used like this:
CMD = AutoEnum()
cmds = {
    "peek": CMD.PEEK,
    "look": CMD.PEEK,
    "help": CMD.HELP,
    "poke": CMD.POKE,
    "modify": CMD.POKE,
}
Both are working well for me, but I'm feeling unpythonic about them.
Are these in fact bad constructs?
Your DictObj example is actually quite common. Object-style dot-notation access can be a win if you are dealing with ‘things that resemble objects’, ie. they have fixed property names containing only characters valid in Python identifiers. Stuff like database rows or form submissions can be usefully stored in this kind of object, making code a little more readable without the excess of ['item access'].
The implementation is a bit limited - you don't get the nice constructor syntax of dict, len(), comparisons, 'in', iteration or nice reprs. You can of course implement those things yourself, but in the new-style-classes world you can get them for free by simply subclassing dict:
class AttrDict(dict):
    __getattr__ = dict.__getitem__
    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__
To get the default-to-None behaviour, simply subclass Python 2.5's collections.defaultdict class instead of dict.
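One detail to watch with the three-line AttrDict above: a missing key surfaces as KeyError rather than AttributeError. On Python 3, hasattr() only swallows AttributeError, so hasattr(d, 'missing') would raise instead of returning False. The sketch below is a variant that translates the exception. It is an illustration, not the answer's own code, and it keeps the raise-on-missing behaviour rather than defaulting to None.

```python
class AttrDict(dict):
    def __getattr__(self, name):
        # Only called when normal lookup fails, so dict methods still work.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    __setattr__ = dict.__setitem__

    def __delattr__(self, name):
        try:
            del self[name]
        except KeyError:
            raise AttributeError(name) from None


d = AttrDict(a=1)
d.b = 2
assert d['b'] == 2 and d.a == 1
assert not hasattr(d, 'missing')
assert getattr(d, 'missing', None) is None
assert callable(d.keys)  # methods are found before __getattr__ is consulted
```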
With regards to the DictObj, would the following work for you? A blank class will allow you to arbitrarily add to or replace stuff in a container object.
class Container(object):
    pass
>>> myContainer = Container()
>>> myContainer.spam = "in a can"
>>> myContainer.eggs = "in a shell"
If you want to not throw an AttributeError when there is no attribute, what do you think about the following? Personally, I'd prefer to use a dict for clarity, or to use a try/except clause.
class QuietContainer(object):
    def __getattr__(self, attribute):
        try:
            return object.__getattr__(self,attribute)
        except AttributeError:
            return None
>>> cont = QuietContainer()
>>> print cont.me
None
Right?
This is a simpler version of your DictObj class:
class DictObj(object):
    def __getattr__(self, attr):
        return self.__dict__.get(attr)
>>> d = DictObj()
>>> d.something = 'one'
>>> print d.something
one
>>> print d.somethingelse
None
>>>
As far as I know, Python classes use dictionaries to store their attributes anyway (that's hidden from the programmer), so it looks to me that what you've done there is effectively emulate a Python class... using a python class.
It's not "wrong" to do this, and it can be nicer if your dictionaries have a strong possibility of turning into objects at some point, but be wary of the reasons for having bracket access in the first place:
Dot access can't use keywords as keys.
Dot access has to use Python-identifier-valid characters in the keys.
Dictionaries can hold any hashable element -- not just strings.
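The three restrictions above can be checked mechanically. The helper below is a hypothetical illustration (dot_accessible is not a standard function) that reports whether a given key could ever be written after a dot in Python 3 source.

```python
import keyword


def dot_accessible(key):
    """Return True if obj.<key> would be valid Python syntax for this key."""
    return (
        isinstance(key, str)
        and key.isidentifier()
        and not keyword.iskeyword(key)
    )


assert dot_accessible('something')
assert not dot_accessible('class')    # reserved keyword
assert not dot_accessible('foo-bar')  # not a valid identifier
assert not dot_accessible((1, 2))     # hashable, but not a string
```

Keys that fail this check remain reachable only through bracket access (or getattr with a string, for string keys).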
Also keep in mind you can always make your objects access like dictionaries if you decide to switch to objects later on.
For a case like this I would default to the "readability counts" mantra: presumably other Python programmers will be reading your code and they probably won't be expecting dictionary/object hybrids everywhere. If it's a good design decision for a particular situation, use it, but I wouldn't use it without necessity to do so.
The one major disadvantage of using something like your DictObj is you either have to limit allowable keys or you can't have methods on your DictObj such as .keys(), .values(), .items(), etc.
There's a symmetry between this and this answer:
class dotdict(dict):
    __getattr__= dict.__getitem__
    __setattr__= dict.__setitem__
    __delattr__= dict.__delitem__
The same interface, just implemented the other way round...
class container(object):
    __getitem__ = object.__getattribute__
    __setitem__ = object.__setattr__
    __delitem__ = object.__delattr__
Don't overlook Bunch.
It is a child of dictionary and can import YAML or JSON, or convert any existing dictionary to a Bunch and vice-versa. Once "bunchify"'d, a dictionary gains dot notations without losing any other dictionary methods.
I like dot notation a lot better than dictionary fields personally. The reason being that it makes autocompletion work a lot better.
It's not bad if it serves your purpose. "Practicality beats purity".
I have seen this approach elsewhere (e.g. in Paver), so it can be considered a common need (or desire).
Because you ask for undesirable side-effects:
A disadvantage is that in visual editors like Eclipse+PyDev, you will see many undefined variable errors on lines using the dot notation. PyDev will not be able to find such runtime "object" definitions. Whereas in the case of a normal dictionary, it knows that you are just getting a dictionary entry.
You would need to 1) ignore those errors and live with red crosses; 2) suppress those warnings on a line-by-line basis using #@UndefinedVariable; or 3) disable the undefined variable error entirely, causing you to miss real undefined variables.
If you're looking for an alternative that handles nested dicts:
Recursively transform a dict to instances of the desired class
import json
from collections import namedtuple
class DictTransformer():
    @classmethod
    def constantize(self, d):
        return self.transform(d, klass=namedtuple, klassname='namedtuple')
    @classmethod
    def transform(self, d, klass, klassname):
        return self._from_json(self._to_json(d), klass=klass, klassname=klassname)
    @classmethod
    def _to_json(self, d, access_method='__dict__'):
        return json.dumps(d, default=lambda o: getattr(o, access_method, str(o)))
    @classmethod
    def _from_json(self, jsonstr, klass, klassname):
        return json.loads(jsonstr, object_hook=lambda d: klass(klassname, d.keys())(*d.values()))
Ex:
constants = {
    'A': {
        'B': {
            'C': 'D'
        }
    }
}
CONSTANTS = DictTransformer.transform(constants, klass=namedtuple, klassname='namedtuple')
CONSTANTS.A.B.C == 'D'
Pros:
handles nested dicts
can potentially generate other classes
namedtuples provide immutability for constants
Cons:
may not respond to .keys and .values if those are not provided on your klass (though you can sometimes mimic with ._fields and list(A.B.C))
Thoughts?
h/t to @hlzr for the original class idea
