Advantages of UserDict class? - python

What are the advantages of using the UserDict class?
I mean, what do I really get if, instead of
class MyClass(object):
    def __init__(self):
        self.a = 0
        self.b = 0
        ...

m = MyClass()
m.a = 5
m.b = 7
I write the following:
class MyClass(UserDict):
    def __init__(self):
        UserDict.__init__(self)
        self["a"] = 0
        self["b"] = 0
        ...

m = MyClass()
m["a"] = 5
m["b"] = 7
Edit: If I understand correctly, in both cases I can add new fields to an object at runtime?
m.c = "Cool"
and
m["c"] = "Cool"

UserDict.UserDict has had no substantial added value since Python 2.2 since, as @gs mentions, you can now subclass dict directly -- it exists only for backwards compatibility with Python 2.1 and earlier, when built-in types could not be subclassed. Still, it was kept in Python 3 (now in its proper place in the collections module) since, as the docs now mention,
The need for this class has been partially supplanted by the ability to subclass directly from dict; however, this class can be easier to work with because the underlying dictionary is accessible as an attribute.
UserDict.DictMixin, in Python 2, is quite handy -- as the docs say,
The module defines a mixin, DictMixin, defining all dictionary methods for classes that already have a minimum mapping interface. This greatly simplifies writing classes that need to be substitutable for dictionaries (such as the shelve module).
You subclass it and define some fundamental methods: __getitem__ alone gives you a read-only mapping without the ability to get keys or iterate; add keys if you need those abilities; add __setitem__ and you have a R/W mapping without the ability to remove items; add __delitem__ for full capability. You can also override other methods for performance. In exchange, you get a full-fledged implementation of dict's rich API (update, get, and so on). A great example of the Template Method design pattern.
In Python 3, DictMixin is gone; you can get almost the same functionality by relying on collections.abc.MutableMapping instead (or just collections.abc.Mapping for R/O mappings). It's a bit more elegant, though not QUITE as handy (see this issue, which was closed with "won't fix"; the short discussion is worth reading).
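To make that concrete, here is a minimal sketch of the Python 3 approach (the LowerDict class and its key-lowercasing behaviour are invented for this example): you implement the five abstract methods and the ABC derives get, update, setdefault, pop, and the rest of the dict API for you.

from collections.abc import MutableMapping

class LowerDict(MutableMapping):
    """Hypothetical example: a mapping that lower-cases its string keys."""
    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(*args, **kwargs)  # update() is derived by the ABC

    def __getitem__(self, key):
        return self._data[key.lower()]

    def __setitem__(self, key, value):
        self._data[key.lower()] = value

    def __delitem__(self, key):
        del self._data[key.lower()]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)

d = LowerDict(A=1)
print(d["a"])         # 1 -- __getitem__ lower-cases the key
print(d.get("B", 0))  # 0 -- get() comes free from the ABC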

Subclassing dict gives you all the features of a dict, like if key in d:. You normally do this if you want to extend the features of the dict, creating an ordered dict for example.
BTW: in more recent Python versions you can subclass dict directly; you don't need UserDict.

It's tricky to override dict correctly, while UserDict makes it easy. There was some discussion about removing it from Python 3, but I believe it was kept for this reason. Example:
class MyDict(dict):
    def __setitem__(self, key, value):
        super().__setitem__(key, value * 10)

d = MyDict(a=1, b=2)  # Oops, MyDict.__setitem__ not called
d.update(c=3)         # Oops, MyDict.__setitem__ not called
d['d'] = 4            # Good!
print(d)              # {'a': 1, 'b': 2, 'c': 3, 'd': 40}
UserDict inherits from collections.abc.MutableMapping, so it doesn't have those drawbacks:
import collections

class MyDict(collections.UserDict):
    def __setitem__(self, key, value):
        super().__setitem__(key, value * 10)

d = MyDict(a=1, b=2)  # Good: MyDict.__setitem__ correctly called
d.update(c=3)         # Good: MyDict.__setitem__ correctly called
d['d'] = 4            # Good
print(d)              # {'a': 10, 'b': 20, 'c': 30, 'd': 40}

Well, as of 3.6 there are certainly some disadvantages, as I just found out. Namely, isinstance(o, dict) returns False.
from collections import UserDict

class MyClass(UserDict):
    pass

data = MyClass(a=1, b=2)
print("a:", data.get("a"))
print("is it a dict?:", isinstance(data, dict))
Not a dict!
a: 1
is it a dict?: False
Change to class MyClass(dict): and isinstance returns True.
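If you need a check that accepts both dict subclasses and UserDict, one option (a sketch of my own, not from the original answer) is to test against the Mapping ABC, which UserDict implements:

from collections import UserDict
from collections.abc import Mapping

class MyClass(UserDict):
    pass

data = MyClass(a=1, b=2)
print(isinstance(data, Mapping))  # True -- UserDict is a MutableMapping
print(isinstance(data, dict))     # False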
However... with UserDict you can step into its implementation.
(pdb-ing into functions/methods is an easy way to see exactly how they work)
# assumes MyClass subclasses UserDict
di = MyClass()
import pdb
# pdb can step into the implementation if your ancestor is UserDict,
# but not with dict, since dict is implemented in C
pdb.set_trace()
di["a"] = 1

Related

Extend dataclass' __repr__ programmatically

Suppose I have a dataclass with a set method. How do I extend the __repr__ method so that it also updates whenever the set method is called:
from dataclasses import dataclass

@dataclass
class State:
    A: int = 1
    B: int = 2

    def set(self, var, val):
        setattr(self, var, val)
Ex:
In [2]: x = State()
In [3]: x
Out[3]: State(A=1, B=2)
In [4]: x.set("C", 3)
In [5]: x
Out[5]: State(A=1, B=2)
In [6]: x.C
Out[6]: 3
The outcome I would like
In [7]: x
Out[7]: State(A=1, B=2, C=3)
The dataclass decorator lets you quickly and easily build classes that have specific fields that are predetermined when you define the class. The way you're intending to use your class, however, doesn't match up very well with what dataclasses are good for. You want to be able to dynamically add new fields after the class already exists, and have them work with various methods (like __init__, __repr__ and presumably __eq__). That removes almost all of the benefits of using dataclass. You should instead just write your own class that does what you want it to do.
Here's a quick and dirty version:
class State:
    _defaults = {"A": 1, "B": 2}

    def __init__(self, **kwargs):
        self.__dict__.update(self._defaults)
        self.__dict__.update(kwargs)

    def __eq__(self, other):
        return self.__dict__ == other.__dict__  # you might want to add some type checking here

    def __repr__(self):
        kws = [f"{key}={value!r}" for key, value in self.__dict__.items()]
        return "{}({})".format(type(self).__name__, ", ".join(kws))
This is pretty similar to what you get from types.SimpleNamespace, so you might just be able to use that instead (it doesn't do default values though).
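For comparison, a quick look at SimpleNamespace (my own snippet, not from the original answer), which gives a similar repr and free-form attributes but no defaults:

from types import SimpleNamespace

s = SimpleNamespace(A=1, B=2)
s.C = 3       # attributes can be added on the fly
print(s)      # namespace(A=1, B=2, C=3)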
You could add your set method to this framework, though it seems to me like needless duplication of the builtin setattr function you're already using to implement it. If the caller needs to dynamically set an attribute, they can call setattr themselves. If the attribute name is constant, they can use normal attribute assignment syntax instead: s.foo = "bar".
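Continuing with the State class sketched above (my own quick check, not from the original answer), it reproduces the behaviour the question asks for, with plain setattr or attribute assignment standing in for set:

x = State()
print(x)                 # State(A=1, B=2)
setattr(x, "C", 3)       # or simply: x.C = 3
print(x)                 # State(A=1, B=2, C=3)
print(x == State(C=3))   # True -- defaults plus the extra field match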

Assigning dictionary to a class object

What is the difference between the two class definitions below,
class my_dict1(dict):
    def __init__(self, data):
        self = data.copy()
        self.N = sum(self.values())
The above code results in AttributeError: 'dict' object has no attribute 'N', while the code below runs without error:
class my_dict2(dict):
    def __init__(self, data):
        for k, v in data.items():
            self[k] = v
        self.N = sum(self.values())
For example,
d = {'a': 3, 'b': 5}
a = my_dict1(d) # results in attribute error
b = my_dict2(d) # works fine
By assigning to self itself you rebind the name self to a completely different instance than the one you were originally dealing with, so it is no longer the "self". That instance is of the broader type dict (because data.copy() returns a plain dict), not of the narrower type my_dict1. You would need to write self["N"] in the first example for it to run without error, but note that even with this, in something like:
abc = my_dict1({})
abc will still not have the key "N", because a completely different instance inside __init__ was given a value for the key "N". This shows you that there's no reasonable scenario where you want to assign self itself to something else.
As for my_dict2: prefer composition over inheritance if you want to use a particular dict as a representation of your domain, which means holding the data as an instance field. See the related C# question Why not inherit from List?; the core answer is still the same. It comes down to whether you want to extend the dict mechanism vs. having a business object based on it.
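For illustration, a minimal sketch of the composition approach (the Stats name and its interface are my own invention, not from the original answer):

class Stats:
    """Holds a dict rather than being one; exposes only what the domain needs."""
    def __init__(self, data):
        self._data = dict(data)  # keep our own copy of the mapping
        self.N = sum(self._data.values())

    def __getitem__(self, key):  # forward only the operations you actually want
        return self._data[key]

s = Stats({'a': 3, 'b': 5})
print(s.N)     # 8
print(s['a'])  # 3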

Delegate to a dict class in Python

In Python 3, I have a tree of lists and dicts that I get from another library. I'd like to instrument the dicts in that tree with objects containing more behavior (giving a richer model to the simple dict classes). I've tried replacing the class of these objects with a subclass of dict, but that is not allowed:
class MyClass(dict): pass
{}.__class__ = MyClass
That fails with TypeError: __class__ assignment: only for heap types.
So I'm instead trying to write a wrapper or adapter or delegate class:
class InstrumentedDict(object):
    """
    Instrument an existing dictionary with additional
    functionality, but always reference and mutate
    the original dictionary.

    >>> orig = {'a': 1, 'b': 2}
    >>> inst = InstrumentedDict(orig)
    >>> hasattr(inst, '__getitem__')
    True
    >>> inst.__getitem__('a')
    1
    >>> inst['a']
    1
    >>> inst['c'] = 3
    >>> orig['c']
    3
    >>> inst.keys() == orig.keys()
    True
    """
    def __init__(self, orig):
        self._orig = orig

    def __getattribute__(self, name):
        orig = super(InstrumentedDict, self).__getattribute__('_orig')
        return orig.__getattribute__(name)
However, the doctests fail at inst['a'] with TypeError: 'InstrumentedDict' object is not subscriptable, even though hasattr and the explicit inst.__getitem__('a') call succeed. (The reason: for the operator syntax inst['a'], Python looks up __getitem__ on the type, not on the instance, so instance-level __getattribute__ delegation is bypassed.)
I'm hoping to delegate all behavior to the underlying dictionary, and I'd like not to have to think about or explicitly delegate the whole signature of a dictionary.
It's important that whatever this class does should affect the underlying dict (rather than creating separate references to the values). Ideally, it should not impose or negate mutability on the underlying Mapping, but should mirror its behavior.
Is there a simple and elegant solution that meets the specified interface but doesn't require explicit mirroring of the signature (such as in this implementation)?
Edit: To clarify, I want to overlay behavior on existing dictionaries without creating new copies, such that if the instrumented copy is modified, so is the original.
At the risk of completely missing the point of your question...
Is there any reason to build a proxy instead of just subclassing dict? Something like:
class InstrumentedDict(dict):
    """ Walks like a dict, talks like a dict... """
Edit after comment:
Ah, I see :) Makes sense...
Seems like UserDict is the answer, check this out:
from collections import UserDict

class InstrumentedDict(UserDict):
    def __init__(self, data):
        super(InstrumentedDict, self).__init__()
        self.data = data

remote_dict = {"a": 1}
instr_dict = InstrumentedDict(remote_dict)

print(instr_dict)   # {'a': 1}
instr_dict["b"] = 2
print(instr_dict)   # {'a': 1, 'b': 2}
print(remote_dict)  # {'a': 1, 'b': 2}
UserDict seems to be a relic from olden days when we couldn't subclass dict directly. But it's useful here because it exposes the underlying dictionary as the data attribute. And that's pretty much all the docs say: UserDict
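The crucial point (my own check, continuing the snippet above) is that data is the very same object you passed in, not a copy, so mutations flow in both directions:

assert instr_dict.data is remote_dict  # same object, not a copy
remote_dict["c"] = 3                   # mutate the original...
print(instr_dict["c"])                 # 3 -- ...and the wrapper sees it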

How to make a dictionary that returns key for keys missing from the dictionary instead of raising KeyError?

I want to create a Python dictionary that returns the key itself for keys that are missing from the dictionary.
Usage example:
dic = smart_dict()
dic['a'] = 'one a'
print(dic['a'])
# >>> one a
print(dic['b'])
# >>> b
dicts have a __missing__ hook for this:
class smart_dict(dict):
    def __missing__(self, key):
        return key
Could simplify it as (since self is never used):
class smart_dict(dict):
    @staticmethod
    def __missing__(key):
        return key
Why don't you just use
dic.get('b', 'b')
Sure, you can subclass dict as others point out, but I find it handy to remind myself every once in a while that get can have a default value!
If you want to have a go at the defaultdict, try this:
from collections import defaultdict

dic = defaultdict()
dic.__missing__ = lambda key: key
dic['b'] # should set dic['b'] to 'b' and return 'b'
except... well: AttributeError: 'collections.defaultdict' object attribute '__missing__' is read-only, so you will have to subclass:
from collections import defaultdict

class KeyDict(defaultdict):
    def __missing__(self, key):
        return key

d = KeyDict()
print(d['b'])          # prints 'b'
print(list(d.keys()))  # prints [] -- the missing key was not stored
Congratulations. You too have discovered the uselessness of the standard collections.defaultdict type. If that execrable midden heap of code smell offends your delicate sensibilities as much as it did mine, this is your lucky StackOverflow day.
Thanks to the forbidden wonder of the 3-parameter variant of the type() builtin, crafting a non-useless default dictionary type is both fun and profitable.
What's Wrong with dict.__missing__()?
Absolutely nothing, assuming you like excess boilerplate and the shocking silliness of collections.defaultdict – which should behave as expected but really doesn't. To be fair, Jochen Ritzel's accepted solution of subclassing dict and implementing the optional __missing__() method is a fantastic workaround for small-scale use cases only requiring a single default dictionary.
But boilerplate of this sort scales poorly. If you find yourself instantiating multiple default dictionaries, each with their own slightly different logic for generating missing key-value pairs, an industrial-strength alternative automating the boilerplate is warranted.
Or at least nice. Because why not fix what's broken?
Introducing DefaultDict
In less than ten lines of pure Python (excluding docstrings, comments, and whitespace), we now define a DefaultDict type initialized with a user-defined callable generating default values for missing keys. Whereas the callable passed to the standard collections.defaultdict type uselessly accepts no parameters, the callable passed to our DefaultDict type usefully accepts the following two parameters:
1. The current instance of this dictionary.
2. The current missing key to generate a default value for.
Given this type, solving sorin's question reduces to a single line of Python:
>>> dic = DefaultDict(lambda self, missing_key: missing_key)
>>> dic['a'] = 'one a'
>>> print(dic['a'])
one a
>>> print(dic['b'])
b
Sanity. At last.
Code or It Didn't Happen
def DefaultDict(keygen):
    '''
    Sane **default dictionary** (i.e., dictionary implicitly mapping a missing
    key to the value returned by a caller-defined callable passed both this
    dictionary and that key).

    The standard :class:`collections.defaultdict` class is sadly insane,
    requiring the caller-defined callable accept *no* arguments. This
    non-standard alternative requires this callable accept two arguments:

    #. The current instance of this dictionary.
    #. The current missing key to generate a default value for.

    Parameters
    ----------
    keygen : CallableTypes
        Callable (e.g., function, lambda, method) called to generate the default
        value for a "missing" (i.e., undefined) key on the first attempt to
        access that key, passed first this dictionary and then this key and
        returning this value. This callable should have a signature resembling:
        ``def keygen(self: DefaultDict, missing_key: object) -> object``.
        Equivalently, this callable should have the exact same signature as that
        of the optional :meth:`dict.__missing__` method.

    Returns
    ----------
    MappingType
        Empty default dictionary creating missing keys via this callable.
    '''

    # Global variable modified below.
    global _DEFAULT_DICT_ID

    # Unique classname suffixed by this identifier.
    default_dict_class_name = 'DefaultDict' + str(_DEFAULT_DICT_ID)

    # Increment this identifier to preserve uniqueness.
    _DEFAULT_DICT_ID += 1

    # Dynamically generated default dictionary class specific to this callable.
    default_dict_class = type(
        default_dict_class_name, (dict,), {'__missing__': keygen,})

    # Instantiate and return the first and only instance of this class.
    return default_dict_class()


_DEFAULT_DICT_ID = 0
'''
Unique arbitrary identifier with which to uniquify the classname of the next
:func:`DefaultDict`-derived type.
'''
The key ...get it, key? to this arcane wizardry is the call to the 3-parameter variant of the type() builtin:
type(default_dict_class_name, (dict,), {'__missing__': keygen,})
This single line dynamically generates a new dict subclass aliasing the optional __missing__ method to the caller-defined callable. Note the distinct lack of boilerplate, reducing DefaultDict usage to a single line of Python.
Automation for the egregious win.
The first respondent mentioned defaultdict, but you can define __missing__ for any subclass of dict:
>>> class Dict(dict):
...     def __missing__(self, key):
...         return key
>>> d = Dict(a=1, b=2)
>>> d['a']
1
>>> d['z']
'z'
Also, I like the second respondent's approach:
>>> d = dict(a=1, b=2)
>>> d.get('z', 'z')
'z'
I agree this should be easy to do, and also easy to set up with different defaults or functions that transform a missing value somehow.
Inspired by Cecil Curry's answer, I asked myself: why not have the default-generator (either a constant or a callable) as a member of the class, instead of generating different classes all the time? Let me demonstrate:
# default behaviour: return missing keys unchanged
dic = FlexDict()
dic['a'] = 'one a'
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# regardless of default: easy initialisation with existing dictionary
existing_dic = {'a' : 'one a'}
dic = FlexDict(existing_dic)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'b'
# using constant as default for missing values
dic = FlexDict(existing_dic, default = 10)
print(dic['a'])
# 'one a'
print(dic['b'])
# 10
# use callable as default for missing values
dic = FlexDict(existing_dic, default = lambda missing_key: missing_key * 2)
print(dic['a'])
# 'one a'
print(dic['b'])
# 'bb'
print(dic[2])
# 4
How does it work? Not so difficult:
class FlexDict(dict):
    '''Subclass of dictionary which returns a default for missing keys.
    This default can either be a constant, or a callable accepting the missing key.
    If "default" is not given (or None), each missing key will be returned unchanged.'''
    def __init__(self, content=None, default=None):
        if content is None:
            super().__init__()
        else:
            super().__init__(content)
        if default is None:
            default = lambda missing_key: missing_key
        self.default = default  # sets self._default via the property setter

    @property
    def default(self):
        return self._default

    @default.setter
    def default(self, val):
        if callable(val):
            self._default = val
        else:  # constant value
            self._default = lambda missing_key: val

    def __missing__(self, x):
        return self.default(x)
Of course, one can debate whether one wants to allow changing the default function after initialisation, but that just means removing @default.setter and absorbing its logic into __init__.
Enabling introspection into the current (constant) default value could be added with two extra lines.
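For instance, a sketch of what those extra lines could look like (the FlexDict2 subclass and the default_raw name are my own invention, not from the original answer): remember the raw value in the setter and expose it with a read-only property.

class FlexDict2(FlexDict):
    '''Hypothetical variant of FlexDict whose raw default is inspectable.'''
    @FlexDict.default.setter
    def default(self, val):
        self._default_raw = val  # extra line 1: keep the raw value around
        self._default = val if callable(val) else (lambda missing_key: val)

    @property
    def default_raw(self):       # extra line 2: read it back
        return self._default_raw

dic = FlexDict2({'a': 'one a'}, default=10)
print(dic.default_raw)  # 10, even though the stored default is now a lambda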
Override dict's __getitem__ method. For example: How to properly subclass dict and override __getitem__ & __setitem__
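A minimal sketch of that approach (the class name is mine; it catches the KeyError instead of relying on __missing__):

class KeyFallbackDict(dict):
    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            return key  # fall back to the key itself

d = KeyFallbackDict(a='one a')
print(d['a'])  # one a
print(d['b'])  # b

Note that, like the __missing__ approach, this only affects the d[key] syntax; get() and friends bypass it.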

Odd behaviour using a custom dict class as the __dict__ attribute of Python classes

I have a class that inherits from a dictionary in order to add some custom behavior - in this case it passes each key and value to a function for validation. In the example below, the 'validation' simply prints a message.
Assignment to the dictionary works as expected, printing messages whenever items are added to the dict. But when I try to use the custom dictionary type as the __dict__ attribute of a class, attribute assignments (which in turn put keys/values into my custom dictionary) somehow insert values into the dictionary while completely bypassing __setitem__ (and the other methods I've defined that may add keys).
The custom dictionary:
from collections import MutableMapping

class ValidatedDict(dict):
    """A dictionary that passes each value it ends up storing through
    a given validator function.
    """
    def __init__(self, validator, *args, **kwargs):
        self.__validator = validator
        self.update(*args, **kwargs)

    def __setitem__(self, key, value):
        self.__validator(value)
        self.__validator(key)
        dict.__setitem__(self, key, value)

    def copy(self): pass  # snipped
    def fromkeys(validator, seq, v=None): pass  # snipped

    setdefault = MutableMapping.setdefault
    update = MutableMapping.update

def Validator(i): print "Validating:", i
Using it as the __dict__ attribute of a class yields behavior I don't understand.
>>> d = ValidatedDict(Validator)
>>> d["key"] = "value"
Validating: value
Validating: key
>>> class Foo(object): pass
...
>>> foo = Foo()
>>> foo.__dict__ = ValidatedDict(Validator)
>>> type(foo.__dict__)
<class '__main__.ValidatedDict'>
>>> foo.bar = 100 # Yields no message!
>>> foo.__dict__['odd'] = 99
Validating: 99
Validating: odd
>>> foo.__dict__
{'odd': 99, 'bar': 100}
Can someone explain why it doesn't behave the way I expect? Can it or can't it work the way I'm attempting?
This is an optimization. To support metamethods on __dict__, every single instance assignment would need to check for the existence of the metamethod. This is a fundamental operation -- every attribute lookup and assignment -- so the extra couple of branches needed to check this would become overhead for the whole language, for something that's more or less redundant with obj.__getattr__ and obj.__setattr__.
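For completeness, a minimal sketch of the __setattr__ route the answer alludes to (my own illustration, in Python 3 syntax, not from the original answer):

class Foo(object):
    def __setattr__(self, name, value):
        # runs on every attribute assignment, unlike metamethods on __dict__
        print("Validating:", name, value)
        super().__setattr__(name, value)

foo = Foo()
foo.bar = 100  # prints: Validating: bar 100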
