What would a "frozen dict" be? - python

A frozen set is a frozenset.
A frozen list could be a tuple.
What would a frozen dict be? An immutable, hashable dict.
I guess it could be something like collections.namedtuple, but that is more like a frozen-keys dict (a half-frozen dict). Isn't it?
A "frozendict" should be a frozen dictionary, it should have keys, values, get, etc., and support in, for, etc.
Update:
* there it is: https://www.python.org/dev/peps/pep-0603

Python doesn't have a builtin frozendict type. It turns out this wouldn't be useful too often (though it would still probably be useful more often than frozenset is).
The most common reason to want such a type is when memoizing function calls for functions with unknown arguments. The most common solution to store a hashable equivalent of a dict (where the values are hashable) is something like tuple(sorted(kwargs.items())).
This depends on the sorting not being a bit insane. Python cannot positively promise sorting will result in something reasonable here. (But it can't promise much else, so don't sweat it too much.)
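For illustration, here is a minimal sketch of that memoization pattern (the decorator name and cache layout are hypothetical; it only handles keyword arguments with hashable values):
import functools

def memoize_kwargs(func):
    """Cache results keyed on a hashable snapshot of kwargs."""
    cache = {}
    @functools.wraps(func)
    def wrapper(**kwargs):
        key = tuple(sorted(kwargs.items()))  # hashable stand-in for the dict
        if key not in cache:
            cache[key] = func(**kwargs)
        return cache[key]
    return wrapper

@memoize_kwargs
def expensive(**kwargs):
    return sum(kwargs.values())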
You could easily enough make some sort of wrapper that works much like a dict. It might look something like
import collections.abc  # on Python < 3.3, use "import collections" and collections.Mapping

class FrozenDict(collections.abc.Mapping):
    """Don't forget the docstrings!!"""

    def __init__(self, *args, **kwargs):
        self._d = dict(*args, **kwargs)
        self._hash = None

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def __getitem__(self, key):
        return self._d[key]

    def __hash__(self):
        # It would have been simpler and maybe more obvious to
        # use hash(tuple(sorted(self._d.items()))) from this discussion
        # so far, but this solution is O(n) rather than O(n log n). I don't
        # know what kind of n we are going to run into, but sometimes it's
        # hard to resist the urge to optimize when it will gain improved
        # algorithmic performance.
        if self._hash is None:
            hash_ = 0
            for pair in self.items():
                hash_ ^= hash(pair)
            self._hash = hash_
        return self._hash
It should work great:
>>> x = FrozenDict(a=1, b=2)
>>> y = FrozenDict(a=1, b=2)
>>> x is y
False
>>> x == y
True
>>> x == {'a': 1, 'b': 2}
True
>>> d = {x: 'foo'}
>>> d[y]
'foo'

Curiously, although we have the seldom useful frozenset, there's still no frozen mapping. The idea was rejected in PEP 416 -- Add a frozendict builtin type. This idea may be revisited in a later Python release, see PEP 603 -- Adding a frozenmap type to collections.
So the Python 2 solution to this:
def foo(config={'a': 1}):
    ...
Still seems to be the usual:
def foo(config=None):
    if config is None:
        config = {'a': 1}  # default config
    ...
In Python 3 you have the option of this:
from types import MappingProxyType

default_config = {'a': 1}
DEFAULTS = MappingProxyType(default_config)

def foo(config=DEFAULTS):
    ...
Now the default config can be updated dynamically, but remain immutable where you want it to be immutable by passing around the proxy instead.
So changes in the default_config will update DEFAULTS as expected, but you can't write to the mapping proxy object itself.
Admittedly it's not really the same thing as an "immutable, hashable dict", but it might be a decent substitute for some use cases of a frozendict.
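A quick demonstration of that behavior, reusing the names from the snippet above (a sketch):
from types import MappingProxyType

default_config = {'a': 1}
DEFAULTS = MappingProxyType(default_config)

default_config['b'] = 2  # writing through the underlying dict works
print(DEFAULTS['b'])     # 2 -- the proxy sees the change
try:
    DEFAULTS['c'] = 3    # writing through the proxy does not
except TypeError as e:
    print(e)             # 'mappingproxy' object does not support item assignment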

Assuming the keys and values of the dictionary are themselves immutable (e.g. strings) then:
>>> d
{'forever': 'atones', 'minks': 'cards', 'overhands': 'warranted',
'hardhearted': 'tartly', 'gradations': 'snorkeled'}
>>> t = tuple((k, d[k]) for k in sorted(d.keys()))
>>> hash(t)
1524953596

There is no frozendict, but you can use MappingProxyType, which was added to the standard library with Python 3.3:
>>> from types import MappingProxyType
>>> foo = MappingProxyType({'a': 1})
>>> foo
mappingproxy({'a': 1})
>>> foo['a'] = 2
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment
>>> foo
mappingproxy({'a': 1})

I think of frozendict every time I write a function like this:
def do_something(blah, optional_dict_parm=None):
    if optional_dict_parm is None:
        optional_dict_parm = {}

Install frozendict
pip install frozendict
Use it!
from frozendict import frozendict

def smth(param=frozendict({})):
    pass

Here is the code I've been using. I subclassed frozenset. The advantages of this are the following.
This is a truly immutable object. No relying on the good behavior of future users and developers.
It's easy to convert back and forth between a regular dictionary and a frozen dictionary. FrozenDict(orig_dict) --> frozen dictionary. dict(frozen_dict) --> regular dict.
Update Jan 21 2015: The original piece of code I posted in 2014 used a for-loop to find a key that matched. That was incredibly slow. Now I've put together an implementation which takes advantage of frozenset's hashing features. Key-value pairs are stored in special containers where the __hash__ and __eq__ functions are based on the key only. This code has also been formally unit-tested, unlike what I posted here in August 2014.
MIT-style license.
if 3 / 2 == 1:
    version = 2
elif 3 / 2 == 1.5:
    version = 3

def col(i):
    ''' For binding named attributes to spots inside subclasses of tuple.'''
    g = tuple.__getitem__
    @property
    def _col(self):
        return g(self, i)
    return _col

class Item(tuple):
    ''' Designed for storing key-value pairs inside
        a FrozenDict, which itself is a subclass of frozenset.

        The __hash__ is overloaded to return the hash of only the key.

        __eq__ is overloaded so that normally it only checks whether the Item's
        key is equal to the other object, HOWEVER, if the other object itself
        is an instance of Item, it checks BOTH the key and value for equality.

        WARNING: Do not use this class for any purpose other than to contain
        key-value pairs inside FrozenDict!!!!

        The __eq__ operator is overloaded in such a way that it violates a
        fundamental property of mathematics. That property, which says that
        a == b and b == c implies a == c, does not hold for this object.
        Here's a demonstration:
            [in]  >>> x = Item(('a',4))
            [in]  >>> y = Item(('a',5))
            [in]  >>> hash('a')
            [out] >>> 194817700
            [in]  >>> hash(x)
            [out] >>> 194817700
            [in]  >>> hash(y)
            [out] >>> 194817700
            [in]  >>> 'a' == x
            [out] >>> True
            [in]  >>> 'a' == y
            [out] >>> True
            [in]  >>> x == y
            [out] >>> False
    '''
    __slots__ = ()
    key, value = col(0), col(1)

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        if isinstance(other, Item):
            return tuple.__eq__(self, other)
        return self.key == other

    def __ne__(self, other):
        return not self.__eq__(other)

    def __str__(self):
        return '%r: %r' % self

    def __repr__(self):
        return 'Item((%r, %r))' % self
class FrozenDict(frozenset):
    ''' Behaves in most ways like a regular dictionary, except that it's immutable.
        It differs from other implementations because it doesn't subclass "dict".
        Instead it subclasses "frozenset", which guarantees immutability.

        FrozenDict instances are created with the same arguments used to initialize
        regular dictionaries, and it has all the same methods.
            [in]  >>> f = FrozenDict(x=3, y=4, z=5)
            [in]  >>> f['x']
            [out] >>> 3
            [in]  >>> f['a'] = 0
            [out] >>> TypeError: 'FrozenDict' object does not support item assignment

        FrozenDict can accept un-hashable values, but FrozenDict is only hashable if its values are hashable.
            [in]  >>> f = FrozenDict(x=3, y=4, z=5)
            [in]  >>> hash(f)
            [out] >>> 646626455
            [in]  >>> g = FrozenDict(x=3, y=4, z=[])
            [in]  >>> hash(g)
            [out] >>> TypeError: unhashable type: 'list'

        FrozenDict interacts with dictionary objects as though it were a dict itself.
            [in]  >>> original = dict(x=3, y=4, z=5)
            [in]  >>> frozen = FrozenDict(x=3, y=4, z=5)
            [in]  >>> original == frozen
            [out] >>> True

        FrozenDict supports bi-directional conversions with regular dictionaries.
            [in]  >>> original = {'x': 3, 'y': 4, 'z': 5}
            [in]  >>> FrozenDict(original)
            [out] >>> FrozenDict({'x': 3, 'y': 4, 'z': 5})
            [in]  >>> dict(FrozenDict(original))
            [out] >>> {'x': 3, 'y': 4, 'z': 5} '''
    __slots__ = ()

    def __new__(cls, orig={}, **kw):
        if kw:
            d = dict(orig, **kw)
            items = map(Item, d.items())
        else:
            try:
                items = map(Item, orig.items())
            except AttributeError:
                items = map(Item, orig)
        return frozenset.__new__(cls, items)

    def __repr__(self):
        cls = self.__class__.__name__
        items = frozenset.__iter__(self)
        _repr = ', '.join(map(str, items))
        return '%s({%s})' % (cls, _repr)

    def __getitem__(self, key):
        if key not in self:
            raise KeyError(key)
        diff = self.difference
        item = diff(diff({key}))
        key, value = set(item).pop()
        return value

    def get(self, key, default=None):
        if key not in self:
            return default
        return self[key]

    def __iter__(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.key, items)

    def keys(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.key, items)

    def values(self):
        items = frozenset.__iter__(self)
        return map(lambda i: i.value, items)

    def items(self):
        items = frozenset.__iter__(self)
        return map(tuple, items)

    def copy(self):
        cls = self.__class__
        items = frozenset.copy(self)
        dupl = frozenset.__new__(cls, items)
        return dupl

    @classmethod
    def fromkeys(cls, keys, value):
        d = dict.fromkeys(keys, value)
        return cls(d)

    def __hash__(self):
        kv = tuple.__hash__
        items = frozenset.__iter__(self)
        return hash(frozenset(map(kv, items)))

    def __eq__(self, other):
        if not isinstance(other, FrozenDict):
            try:
                other = FrozenDict(other)
            except Exception:
                return False
        return frozenset.__eq__(self, other)

    def __ne__(self, other):
        return not self.__eq__(other)
if version == 2:
    # Here are the Python 2 modifications
    class Python2(FrozenDict):
        def __iter__(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.key

        def iterkeys(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.key

        def itervalues(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield i.value

        def iteritems(self):
            items = frozenset.__iter__(self)
            for i in items:
                yield (i.key, i.value)

        def has_key(self, key):
            return key in self

        def viewkeys(self):
            return dict(self).viewkeys()

        def viewvalues(self):
            return dict(self).viewvalues()

        def viewitems(self):
            return dict(self).viewitems()

    # If this is Python 2, rebuild the class
    # from scratch rather than use a subclass
    py3 = FrozenDict.__dict__
    py3 = {k: py3[k] for k in py3}
    py2 = {}
    py2.update(py3)
    dct = Python2.__dict__
    py2.update({k: dct[k] for k in dct})
    FrozenDict = type('FrozenDict', (frozenset,), py2)

You may use frozendict from utilspie package as:
>>> from utilspie.collectionsutils import frozendict
>>> my_dict = frozendict({1: 3, 4: 5})
>>> my_dict # object of `frozendict` type
frozendict({1: 3, 4: 5})
# Hashable
>>> {my_dict: 4}
{frozendict({1: 3, 4: 5}): 4}
# Immutable
>>> my_dict[1] = 5
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mquadri/workspace/utilspie/utilspie/collectionsutils/collections_utils.py", line 44, in __setitem__
self.__setitem__.__name__, type(self).__name__))
AttributeError: You can not call '__setitem__()' for 'frozendict' object
As per the documentation:
frozendict(dict_obj): Accepts obj of dict type and returns a hashable and immutable dict

Subclassing dict
I've seen this pattern in the wild (on GitHub) and wanted to mention it:
class FrozenDict(dict):
    def __init__(self, *args, **kwargs):
        self._hash = None
        super(FrozenDict, self).__init__(*args, **kwargs)

    def __hash__(self):
        if self._hash is None:
            self._hash = hash(tuple(sorted(self.items())))  # iteritems() on py2
        return self._hash

    def _immutable(self, *args, **kws):
        raise TypeError('cannot change object - object is immutable')

    # makes (deep)copy a lot more efficient
    def __copy__(self):
        return self

    def __deepcopy__(self, memo=None):
        if memo is not None:
            memo[id(self)] = self
        return self

    __setitem__ = _immutable
    __delitem__ = _immutable
    pop = _immutable
    popitem = _immutable
    clear = _immutable
    update = _immutable
    setdefault = _immutable
example usage:
d1 = FrozenDict({'a': 1, 'b': 2})
d2 = FrozenDict({'a': 1, 'b': 2})
d1.keys()
assert isinstance(d1, dict)
assert len(set([d1, d2])) == 1 # hashable
Pros
support for get(), keys(), items() (iteritems() on py2) and all the goodies from dict, out of the box, without explicitly implementing them
internally uses dict, which means performance (dict is written in C in CPython)
elegant, simple, and no black magic
isinstance(my_frozen_dict, dict) returns True - although Python encourages duck-typing, many packages use isinstance(); this can save many tweaks and customizations
Cons
any subclass can override this or access it internally (you can't really 100% protect something in Python; you should trust your users and provide good documentation).
if you care about speed, you might want to make __hash__ a bit faster.
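For instance, a minimal sketch of a faster __hash__ (FasterFrozenDict is a hypothetical name; it assumes all values are hashable and skips the O(n log n) sort by hashing a frozenset of the items):
class FasterFrozenDict(FrozenDict):
    def __hash__(self):
        if self._hash is None:
            # frozenset hashing is order-insensitive, so no sort is needed
            self._hash = hash(frozenset(self.items()))
        return self._hash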

Yes, this is my second answer, but it is a completely different approach. The first implementation was in pure Python; this one is in Cython. If you know how to use and compile Cython modules, this is just as fast as a regular dictionary: roughly 0.04 to 0.06 microseconds to retrieve a single value.
This is the file "frozen_dict.pyx"
import cython
from collections.abc import Mapping  # "from collections import Mapping" on Python 2

cdef class dict_wrapper:
    cdef object d
    cdef int h

    def __init__(self, *args, **kw):
        self.d = dict(*args, **kw)
        self.h = -1

    def __len__(self):
        return len(self.d)

    def __iter__(self):
        return iter(self.d)

    def __getitem__(self, key):
        return self.d[key]

    def __hash__(self):
        if self.h == -1:
            self.h = hash(frozenset(self.d.items()))  # iteritems() on Python 2
        return self.h

class FrozenDict(dict_wrapper, Mapping):
    def __repr__(self):
        c = type(self).__name__
        r = ', '.join('%r: %r' % (k, self[k]) for k in self)
        return '%s({%s})' % (c, r)

__all__ = ['FrozenDict']
Here's the file "setup.py"
from distutils.core import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize('frozen_dict.pyx')
)
If you have Cython installed, save the two files above into the same directory. Move to that directory in the command line.
python setup.py build_ext --inplace
python setup.py install
And you should be done.

The main disadvantage of namedtuple is that it needs to be specified before it is used, so it's less convenient for single-use cases.
However, there is a practical workaround that can be used to handle many such cases. Let's say that you want to have an immutable equivalent of the following dict:
MY_CONSTANT = {
    'something': 123,
    'something_else': 456
}
This can be emulated like this:
from collections import namedtuple
MY_CONSTANT = namedtuple('MyConstant', 'something something_else')(123, 456)
It's even possible to write an auxiliary function to automate this:
def freeze_dict(data):
    from collections import namedtuple
    keys = sorted(data.keys())
    frozen_type = namedtuple(''.join(keys), keys)
    return frozen_type(**data)

a = {'foo': 'bar', 'x': 'y'}
fa = freeze_dict(a)
assert a['foo'] == fa.foo
Of course this works only for flat dicts, but it shouldn't be too difficult to implement a recursive version.
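For example, a recursive version might look like this (freeze_dict_recursive is a hypothetical helper; it only descends into plain dicts):
from collections import namedtuple

def freeze_dict_recursive(data):
    """Recursively convert nested dicts into namedtuple instances."""
    keys = sorted(data.keys())
    values = [freeze_dict_recursive(data[k]) if isinstance(data[k], dict) else data[k]
              for k in keys]
    frozen_type = namedtuple(''.join(keys), keys)
    return frozen_type(*values)

nested = {'outer': {'inner': 1}, 'x': 'y'}
frozen = freeze_dict_recursive(nested)
assert frozen.outer.inner == 1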

freeze implements frozen collections (dict, list and set) that are hashable, type-hinted and will recursively freeze the data you give them (when possible) for you.
pip install frz
Usage:
from freeze import FDict

a_mutable_dict = {
    "list": [1, 2],
    "set": {3, 4},
}

a_frozen_dict = FDict(a_mutable_dict)

print(repr(a_frozen_dict))
# FDict: {'list': FList: (1, 2), 'set': FSet: {3, 4}}

In the absence of native language support, you can either do it yourself or use an existing solution. Fortunately, Python makes it dead simple to extend its base implementations.
class FrozenDict(dict):
    def __setitem__(self, key, value):
        raise Exception('Frozen dictionaries cannot be mutated')

frozen = FrozenDict({'foo': 'FOO'})
print(frozen['foo'])  # FOO
frozen['foo'] = 'NEWFOO'  # Exception: Frozen dictionaries cannot be mutated

# OR

from types import MappingProxyType

frozen = MappingProxyType({'foo': 'FOO'})
print(frozen['foo'])  # FOO
frozen['foo'] = 'NEWFOO'  # TypeError: 'mappingproxy' object does not support item assignment

At one point I needed to access fixed keys for something that was a sort of globally-constant thing, and I settled on something like this:
class MyFrozenDict:
    def __getitem__(self, key):
        if key == 'mykey1':
            return 0
        if key == 'mykey2':
            return "another value"
        raise KeyError(key)
Use it like
a = MyFrozenDict()
print(a['mykey1'])
WARNING: I don't recommend this for most use cases as it makes some pretty severe tradeoffs.


python class behaves like dictionary-or-list-data-like

In a Python 3 console, enter the following:
>>> import sys
>>> sys.version_info
sys.version_info(major=3, minor=4, micro=3, releaselevel='final', serial=0)
>>> type(sys.version_info)  # this is a class type
<class 'sys.version_info'>
>>> sys.version_info[0:2]  # ?? but it acts like a list
(3, 4)
My questions are:
How can a class act like a dictionary or a list?
Can you give an example of how to construct a class like this?
Is there any documentation about this?
Python contains several methods for emulating container types such as dictionaries and lists.
In particular, consider the following class:
class MyDict(object):
    def __getitem__(self, key):
        ...  # called for getting obj[key]

    def __setitem__(self, key, value):
        ...  # called for setting obj[key] = value
If you write
obj = MyDict()
Then
obj[3]
will call the first method, and
obj[3] = 'foo'
will call the second method.
If you further want to support
len(obj)
then you just need to add the method
def __len__(self):
    ...  # return the logical length here
Here is an example of a (very inefficient) dictionary implemented by a list
class MyDict(object):
    def __init__(self, seq=None):
        self._vals = list(seq) if seq is not None else []

    def __getitem__(self, key):
        return [v[1] for v in self._vals if v[0] == key][0]

    def __setitem__(self, key, val):
        self._vals = [v for v in self._vals if v[0] != key]
        self._vals.append((key, val))

    def __len__(self):
        return len(self._vals)
You can use it pretty much like a regular dict:
obj = MyDict()
obj[2] = 'b'
>>> obj[2]
'b'
It's quite easy ... All you need to do is define a __getitem__ method that handles slicing or integer/string lookup. You can do pretty much whatever you want...
class Foo(object):
    def __init__(self, bar, baz):
        self.bar = bar
        self.baz = baz

    def __getitem__(self, ix):
        return (self.bar, self.baz).__getitem__(ix)
Here's a cheat sheet of what will be passed to __getitem__ as ix in the following situations:
f[1] # f.__getitem__(1)
f[1:] # f.__getitem__(slice(1, None, None))
f[1:, 2] # f.__getitem__( (slice(1, None, None), 2) )
f[1, 2] # f.__getitem__( (1, 2) )
f[(1, 2)] # f.__getitem__( (1, 2) )
The trick (which can be slightly non-trivial) is simply writing __getitem__ so that it looks at the type of the object that was passed and then does the right thing. For my answer, I cheated by creating a tuple in my __getitem__ and then calling __getitem__ on the tuple (since it already does the right thing in all of the cases I wanted to support).
Here's some example usage:
>>> f = Foo(1, 2)
>>> f[1]
2
>>> f[0]
1
>>> f[:]
(1, 2)
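For illustration, here is a minimal sketch of that explicit type dispatch, without delegating to a tuple (the Pair class is hypothetical):
class Pair(object):
    def __init__(self, bar, baz):
        self._fields = [bar, baz]

    def __getitem__(self, ix):
        if isinstance(ix, slice):
            return tuple(self._fields[ix])     # slice lookup: p[:]
        if isinstance(ix, tuple):
            return tuple(self[i] for i in ix)  # multi-index lookup: p[0, 1]
        return self._fields[ix]                # plain integer lookup: p[0]

p = Pair(1, 2)
assert p[0] == 1
assert p[:] == (1, 2)
assert p[0, 1] == (1, 2)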
note that you don't typically need to even do this yourself. You can create a named tuple to do the job for you:
from collections import namedtuple
Foo = namedtuple('Foo', 'bar, baz')
And usage is pretty much the same:
>>> f = Foo(1, 2)
>>> f[1]
2
>>> f[0]
1
>>> f[:]
(1, 2)
The main difference here is that our namedtuple is immutable. Once created, we can't change its members.
I think that in Python, as in ECMAScript (aka JavaScript), a class is like a dictionary or associative array, since you can add a property or method to your class at runtime:
class A(object):
    def __init__(self):
        self.x = 0

a = A()
a.y = 5
print a.y  # 5
If you want to write a class like that, you can use the __getitem__ and __setitem__ methods:
class A(object):
    class B(object):
        def __init__(self, x, y):
            self.vals = [x, y]  # a list rather than a tuple, so __setitem__ can mutate it

        def __getitem__(self, key):
            return self.vals[key]

        def __setitem__(self, key, val):
            self.vals[key] = val

        def __len__(self):
            return len(self.vals)

    def __init__(self, x, y):
        self.b = self.B(x, y)

a = A('foo', 'baz')
print type(a.b)  # <class '__main__.B'> (__main__ because we run the script directly)
print a.b[:]     # ['foo', 'baz']
You can achieve the same behaviour by overriding __getitem__() and __setitem__() in your class.
class Example:
    def __getitem__(self, index):
        return index ** 2

>>> X = Example()
>>> X[2]
4
You can override __setitem__() too in your class to achieve the setter kind of thing.

Making custom containers work with **kwargs (how does Python expand the args?)

I have a custom container class in Python 2.7, and everything works as expected except if I try to expand an instance as **kwargs for a function:
cm = ChainableMap({'a': 1})
cm['b'] = 2
assert cm == {'a': 1, 'b': 2}  # Is fine

def check_kwargs(**kwargs):
    assert kwargs == {'a': 1, 'b': 2}

check_kwargs(**cm)  # Raises AssertionError
I've overridden __getitem__, __iter__, iterkeys, keys, items, and iteritems, (and __eq__ and __repr__) yet none of them seem to be involved in the expansion as **kwargs, what am I doing wrong?
Edit - The working updated source that now inherits from MutableMapping and adds the missing methods:
from itertools import chain
from collections import MutableMapping  # collections.abc.MutableMapping on Python 3

class ChainableMap(MutableMapping):
    """
    A mapping object with a delegation chain similar to JS object prototypes::

        >>> parent = {'a': 1}
        >>> child = ChainableMap(parent)
        >>> child.parent is parent
        True

    Failed lookups delegate up the chain to self.parent::

        >>> 'a' in child
        True
        >>> child['a']
        1

    But modifications will only affect the child::

        >>> child['b'] = 2
        >>> child.keys()
        ['a', 'b']
        >>> parent.keys()
        ['a']
        >>> child['a'] = 10
        >>> parent['a']
        1

    Changes in the parent are also reflected in the child::

        >>> parent['c'] = 3
        >>> sorted(child.keys())
        ['a', 'b', 'c']
        >>> expect = {'a': 10, 'b': 2, 'c': 3}
        >>> assert child == expect, "%s != %s" % (child, expect)

    Unless the child is already masking out a certain key::

        >>> del parent['a']
        >>> parent.keys()
        ['c']
        >>> assert child == expect, "%s != %s" % (child, expect)

    This previously didn't work, but the updated source fixes it::

        >>> def print_sorted(**kwargs):
        ...     for k in sorted(kwargs.keys()):
        ...         print "%r=%r" % (k, kwargs[k])
        >>> child['c'] == 3
        True
        >>> print_sorted(**child)
        'a'=10
        'b'=2
        'c'=3
    """
    __slots__ = ('_', 'parent')

    def __init__(self, parent, **data):
        self.parent = parent
        self._ = data

    def __getitem__(self, key):
        try:
            return self._[key]
        except KeyError:
            return self.parent[key]

    def __iter__(self):
        return self.iterkeys()

    def __setitem__(self, key, val):
        self._[key] = val

    def __delitem__(self, key):
        del self._[key]

    def __len__(self):
        return len(self.keys())

    def keys(self, own=False):
        return list(self.iterkeys(own))

    def items(self, own=False):
        return list(self.iteritems(own))

    def iterkeys(self, own=False):
        if own:
            for k in self._.iterkeys():
                yield k
            return
        yielded = set([])
        for k in chain(self.parent.iterkeys(), self._.iterkeys()):
            if k in yielded:
                continue
            yield k
            yielded.add(k)

    def iteritems(self, own=False):
        for k in self.iterkeys(own):
            yield k, self[k]

    def __eq__(self, other):
        return sorted(self.iteritems()) == sorted(other.iteritems())

    def __repr__(self):
        return dict(self.iteritems()).__repr__()

    def __contains__(self, key):
        return key in self._ or key in self.parent
    def containing(self, key):
        """
        Return the ancestor that directly contains ``key``

        >>> p2 = {'a': 2}
        >>> p1 = ChainableMap(p2)
        >>> c = ChainableMap(p1)
        >>> c.containing('a') is p2
        True
        """
        if key in self._:
            return self
        elif hasattr(self.parent, 'containing'):
            return self.parent.containing(key)
        elif key in self.parent:
            return self.parent

    def get(self, key, default=None):
        """
        >>> c = ChainableMap({'a': 1})
        >>> c.get('a')
        1
        >>> c.get('b', 'default')
        'default'
        """
        if key in self:
            return self[key]
        else:
            return default
    def pushdown(self, top):
        """
        Pushes a new mapping onto the top of the delegation chain:

        >>> parent = {'a': 10}
        >>> child = ChainableMap(parent)
        >>> top = {'a': 'apple', 'b': 'beer', 'c': 'cheese'}
        >>> child.pushdown(top)
        >>> assert child == top

        This creates a new ChainableMap with the contents of ``child`` and makes it
        the new parent (the old parent becomes the grandparent):

        >>> child.parent.parent is parent
        True
        >>> del child['a']
        >>> child['a'] == 10
        True
        """
        old = ChainableMap(self.parent)
        for k, v in self.items(True):
            old[k] = v
            del self[k]
        self.parent = old
        for k, v in top.iteritems():
            self[k] = v
When creating a keyword argument dictionary, the behavior is the same as passing your object into the dict() initializer, which results in the dict {'b': 2} for your cm object:
>>> cm = ChainableMap({'a': 1})
>>> cm['b'] = 2
>>> dict(cm)
{'b': 2}
A more detailed explanation of why this is the case is below, but the summary is that your mapping is converted to a Python dictionary in C code which does some optimization if the argument is itself another dict, by bypassing the Python function calls and inspecting the underlying C object directly.
There are a few ways to approach the solution for this, either make sure that the underlying dict contains everything you want, or stop inheriting from dict (which will require other changes as well, at the very least a __setitem__ method).
edit: It sounds like BrenBarn's suggestion to inherit from collections.MutableMapping instead of dict did the trick.
You could accomplish the first method pretty simply by just adding self.update(parent) to ChainableMap.__init__(), but I'm not sure if that will cause other side effects to the behavior of your class.
Explanation of why dict(cm) gives {'b': 2}:
Check out the following CPython code for the dict object:
http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Objects/dictobject.c#l1522
When dict(cm) is called (and when keyword arguments are unpacked), the PyDict_Merge function is called with cm as the b parameter. Because ChainableMap inherits from dict, the if statement at line 1539 is entered:
if (PyDict_Check(b)) {
    other = (PyDictObject*)b;
    ...
From there on, items from other are added to the new dict that is being created by accessing the C object directly, which bypasses all of the methods that you overwrote.
This means that any items in a ChainableMap instance that are accessed through the parent attribute will not be added to the new dictionary created by dict() or keyword argument unpacking.
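To see the difference, here is a minimal sketch (the UpperMap class is hypothetical): a mapping that does not subclass dict is expanded by calling its keys() and __getitem__ methods, so overridden behavior is honored.
from collections import MutableMapping  # collections.abc.MutableMapping on Python 3

class UpperMap(MutableMapping):
    """A tiny mapping that is not a dict subclass."""
    def __init__(self, data):
        self._data = dict(data)
    def __getitem__(self, key):
        return self._data[key]
    def __setitem__(self, key, value):
        self._data[key] = value
    def __delitem__(self, key):
        del self._data[key]
    def __iter__(self):
        return iter(self._data)
    def __len__(self):
        return len(self._data)

def show(**kwargs):
    return kwargs

assert show(**UpperMap({'a': 1})) == {'a': 1}  # expansion goes through keys()/__getitem__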

Subclassing builtin types in Python 2 and Python 3

When subclassing builtin types, I noticed a rather important difference between Python 2 and Python 3 in the return type of the methods of the built-in types. The following code illustrates this for sets:
class MySet(set):
    pass

s1 = MySet([1, 2, 3, 4, 5])
s2 = MySet([1, 2, 3, 6, 7])

print(type(s1.union(s2)))
print(type(s1.intersection(s2)))
print(type(s1.difference(s2)))
With Python 2, all the return values are of type MySet. With Python 3, the return types are set. I could not find any documentation on what the result is supposed to be, nor any documentation about the change in Python 3.
Anyway, what I really care about is this: is there a simple way in Python 3 to get the behavior seen in Python 2, without redefining every single method of the built-in types?
This isn't a general change for built-in types when moving from Python 2.x to 3.x -- list and int, for example, have the same behaviour in 2.x and 3.x. Only the set type was changed to bring it in line with the other types, as discussed in this bug tracker issue.
I'm afraid there is no really nice way to make it behave the old way. Here is some code I was able to come up with:
class MySet(set):
    def copy(self):
        return MySet(self)

    def _make_binary_op(in_place_method):
        def bin_op(self, other):
            new = self.copy()
            in_place_method(new, other)
            return new
        return bin_op

    __rand__ = __and__ = _make_binary_op(set.__iand__)
    intersection = _make_binary_op(set.intersection_update)
    __ror__ = __or__ = _make_binary_op(set.__ior__)
    union = _make_binary_op(set.update)
    __sub__ = _make_binary_op(set.__isub__)
    difference = _make_binary_op(set.difference_update)
    __rxor__ = __xor__ = _make_binary_op(set.__ixor__)
    symmetric_difference = _make_binary_op(set.symmetric_difference_update)

    del _make_binary_op

    def __rsub__(self, other):
        new = MySet(other)
        new -= self
        return new
This will simply overwrite all methods with versions that return your own type. (There is a whole lot of methods!)
Maybe for your application, you can get away with overwriting copy() and stick to the in-place methods.
Perhaps a metaclass to do all that humdrum wrapping for you would make it easier:
class Perpetuate(type):
    def __new__(metacls, cls_name, cls_bases, cls_dict):
        if len(cls_bases) > 1:
            raise TypeError("multiple bases not allowed")
        result_class = type.__new__(metacls, cls_name, cls_bases, cls_dict)
        base_class = cls_bases[0]
        known_attr = set()
        for attr in cls_dict.keys():
            known_attr.add(attr)
        for attr in base_class.__dict__.keys():
            if attr in ('__new__',):
                continue
            code = getattr(base_class, attr)
            if callable(code) and attr not in known_attr:
                setattr(result_class, attr, metacls._wrap(base_class, code))
            elif attr not in known_attr:
                setattr(result_class, attr, code)
        return result_class

    @staticmethod
    def _wrap(base, code):
        def wrapper(*args, **kwargs):
            if args:
                cls = args[0]
            result = code(*args, **kwargs)
            if type(result) == base:
                return cls.__class__(result)
            elif isinstance(result, (tuple, list, set)):
                new_result = []
                for partial in result:
                    if type(partial) == base:
                        new_result.append(cls.__class__(partial))
                    else:
                        new_result.append(partial)
                result = result.__class__(new_result)
            elif isinstance(result, dict):
                for key in result:
                    value = result[key]
                    if type(value) == base:
                        result[key] = cls.__class__(value)
            return result
        wrapper.__name__ = code.__name__
        wrapper.__doc__ = code.__doc__
        return wrapper

class MySet(set, metaclass=Perpetuate):
    pass

s1 = MySet([1, 2, 3, 4, 5])
s2 = MySet([1, 2, 3, 6, 7])

print(s1.union(s2))
print(type(s1.union(s2)))
print(s1.intersection(s2))
print(type(s1.intersection(s2)))
print(s1.difference(s2))
print(type(s1.difference(s2)))
As a follow-up to Sven's answer, here is a universal wrapping solution that takes care of all non-special methods. The idea is to catch the first lookup coming from a method call, and install a wrapper method that does the type conversion. At subsequent lookups, the wrapper is returned directly.
Caveats:
1) This is more magic trickery than I like to have in my code.
2) I'd still need to wrap special methods (__and__ etc.) manually because their lookup bypasses __getattribute__
import types

class MySet(set):
    def __getattribute__(self, name):
        attr = super(MySet, self).__getattribute__(name)
        if isinstance(attr, types.BuiltinMethodType):
            def wrapper(self, *args, **kwargs):
                result = attr(self, *args, **kwargs)
                if isinstance(result, set):
                    return MySet(result)
                else:
                    return result
            setattr(MySet, name, wrapper)
            return wrapper
        return attr
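Usage then looks like this (a quick sketch):
s1 = MySet([1, 2, 3, 4, 5])
s2 = MySet([1, 2, 3, 6, 7])
print(type(s1.union(s2)))  # <class '__main__.MySet'>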

How to implement an ordered, default dict?

I would like to combine OrderedDict() and defaultdict() from collections in one object, which shall be an ordered, default dict.
Is this possible?
The following (using a modified version of this recipe) works for me:
from collections import OrderedDict, Callable  # collections.abc.Callable on Python 3.3+

class DefaultOrderedDict(OrderedDict):
    # Source: http://stackoverflow.com/a/6190500/562769
    def __init__(self, default_factory=None, *a, **kw):
        if (default_factory is not None and
                not isinstance(default_factory, Callable)):
            raise TypeError('first argument must be callable')
        OrderedDict.__init__(self, *a, **kw)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return OrderedDict.__getitem__(self, key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        if self.default_factory is None:
            raise KeyError(key)
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        if self.default_factory is None:
            args = tuple()
        else:
            args = self.default_factory,
        return type(self), args, None, None, self.items()

    def copy(self):
        return self.__copy__()

    def __copy__(self):
        return type(self)(self.default_factory, self)

    def __deepcopy__(self, memo):
        import copy
        return type(self)(self.default_factory,
                          copy.deepcopy(self.items()))

    def __repr__(self):
        return 'OrderedDefaultDict(%s, %s)' % (self.default_factory,
                                               OrderedDict.__repr__(self))
Here is another possibility, inspired by Raymond Hettinger's super() Considered Super, tested on Python 2.7.X and 3.4.X:
from collections import OrderedDict, defaultdict

class OrderedDefaultDict(OrderedDict, defaultdict):
    def __init__(self, default_factory=None, *args, **kwargs):
        # in Python 3 you can omit the args to super()
        super(OrderedDefaultDict, self).__init__(*args, **kwargs)
        self.default_factory = default_factory
If you check out the class's MRO (aka, help(OrderedDefaultDict)), you'll see this:
class OrderedDefaultDict(collections.OrderedDict, collections.defaultdict)
| Method resolution order:
| OrderedDefaultDict
| collections.OrderedDict
| collections.defaultdict
| __builtin__.dict
| __builtin__.object
meaning that when an instance of OrderedDefaultDict is initialized, it defers to the OrderedDict's init, but this one in turn will call the defaultdict's methods before calling __builtin__.dict, which is precisely what we want.
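A quick usage sketch (assuming the MRO cooperation described above):
d = OrderedDefaultDict(list)
d['a'].append(1)  # missing key -> default_factory() supplies a fresh list
d['b'].append(2)
print(list(d.items()))  # [('a', [1]), ('b', [2])] -- insertion order preserved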
If you want a simple solution that doesn't require a class, you can just use OrderedDict.setdefault(key, default=None) or OrderedDict.get(key, default=None). If you only get / set from a few places, say in a loop, you can easily just setdefault.
totals = collections.OrderedDict()
for i, x in some_generator():
    totals[i] = totals.get(i, 0) + x
It is even easier for lists with setdefault:
agglomerate = collections.OrderedDict()
for i, x in some_generator():
    agglomerate.setdefault(i, []).append(x)
But if you use it more than a few times, it is probably better to set up a class, like in the other answers.
Here's another solution to think about if your use case is simple like mine and you don't necessarily want to add the complexity of a DefaultOrderedDict class implementation to your code.
from collections import OrderedDict
keys = ['a', 'b', 'c']
items = [(key, None) for key in keys]
od = OrderedDict(items)
(None is my desired default value.)
Note that this solution won't work if one of your requirements is to dynamically insert new keys with the default value. A tradeoff of simplicity.
Update 3/13/17 - I learned of a convenience function for this use case. Same as above but you can omit the line items = ... and just:
od = OrderedDict.fromkeys(keys)
Output:
OrderedDict([('a', None), ('b', None), ('c', None)])
And if your keys are single characters, you can just pass one string:
OrderedDict.fromkeys('abc')
This has the same output as the two examples above.
You can also pass a default value as the second arg to OrderedDict.fromkeys(...).
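For example:
>>> OrderedDict.fromkeys('abc', 0)
OrderedDict([('a', 0), ('b', 0), ('c', 0)])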
Another simple approach would be to use the dictionary get method:
>>> from collections import OrderedDict
>>> d = OrderedDict()
>>> d['key'] = d.get('key', 0) + 1
>>> d['key'] = d.get('key', 0) + 1
>>> d
OrderedDict([('key', 2)])
>>>
A simpler version of @zeekay's answer is:
from collections import OrderedDict

class OrderedDefaultListDict(OrderedDict):  # name according to default
    def __missing__(self, key):
        self[key] = value = []  # change to whatever default you want
        return value
A simple and elegant solution building on @NickBread's.
It has a slightly different API to set the factory, but good defaults are always nice to have.
class OrderedDefaultDict(OrderedDict):
    factory = list

    def __missing__(self, key):
        self[key] = value = self.factory()
        return value
I created a slightly fixed and more simplified version of the accepted answer, up to date for Python 3.7.
from collections import OrderedDict
from copy import copy, deepcopy
import pickle
from typing import Any, Callable

class DefaultOrderedDict(OrderedDict):
    def __init__(
        self,
        default_factory: Callable[[], Any],
        *args,
        **kwargs,
    ):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __getitem__(self, key):
        try:
            return super().__getitem__(key)
        except KeyError:
            return self.__missing__(key)

    def __missing__(self, key):
        self[key] = value = self.default_factory()
        return value

    def __reduce__(self):
        return type(self), (self.default_factory, ), None, None, iter(self.items())

    def copy(self):
        return self.__copy__()

    def __copy__(self):
        return type(self)(self.default_factory, self)

    def __deepcopy__(self, memo):
        return type(self)(self.default_factory, deepcopy(tuple(self.items()), memo))

    def __repr__(self):
        return f'{self.__class__.__name__}({self.default_factory}, {OrderedDict(self).__repr__()})'
And, which may be even more important, I provided some tests.
a = DefaultOrderedDict(list)
# testing default
assert a['key'] == []
a['key'].append(1)
assert a['key'] == [1, ]
# testing repr
assert repr(a) == "DefaultOrderedDict(<class 'list'>, OrderedDict([('key', [1])]))"
# testing copy
b = a.copy()
assert b['key'] is a['key']
c = copy(a)
assert c['key'] is a['key']
d = deepcopy(a)
assert d['key'] is not a['key']
assert d['key'] == a['key']
# testing pickle
saved = pickle.dumps(a)
restored = pickle.loads(saved)
assert restored is not a
assert restored == a
# testing order
a['second_key'] = [2, ]
a['key'] = [3, ]
assert list(a.items()) == [('key', [3, ]), ('second_key', [2, ])]
Inspired by other answers on this thread, you can use something like,
from collections import OrderedDict

class OrderedDefaultDict(OrderedDict):
    def __missing__(self, key):
        value = OrderedDefaultDict()
        self[key] = value
        return value
I would like to know if there are any downsides to initializing another object of the same class in the __missing__ method.
I tested the default dict and discovered it's also sorted! Maybe it was just a coincidence, but anyway you can use the sorted function:
sorted(s.items())
I think it's simpler.

Why does Python not support record type? (i.e. mutable namedtuple)

Why does Python not support a record type natively? It's a matter of having a mutable version of namedtuple.
I could use namedtuple._replace. But I need to have these records in a collection and since namedtuple._replace creates another instance, I also need to modify the collection which becomes messy quickly.
Background:
I have a device whose attributes I need to get by polling it over TCP/IP. i.e. its representation is a mutable object.
Edit:
I have a set of devices for whom I need to poll.
Edit:
I need to iterate through the object displaying its attributes using PyQt. I know I can add special methods like __getitem__ and __iter__, but I want to know if there is an easier way.
Edit:
I would prefer a type whose attribute are fixed (just like they are in my device), but are mutable.
Python <3.3
You mean something like this?
class Record(object):
    __slots__ = "attribute1", "attribute2", "attribute3",

    def items(self):
        "dict style items"
        return [
            (field_name, getattr(self, field_name))
            for field_name in self.__slots__]

    def __iter__(self):
        "iterate over fields tuple/list style"
        for field_name in self.__slots__:
            yield getattr(self, field_name)

    def __getitem__(self, index):
        "tuple/list style getitem"
        return getattr(self, self.__slots__[index])
>>> r= Record()
>>> r.attribute1= "hello"
>>> r.attribute2= "there"
>>> r.attribute3= 3.14
>>> print r.items()
[('attribute1', 'hello'), ('attribute2', 'there'), ('attribute3', 3.1400000000000001)]
>>> print tuple(r)
('hello', 'there', 3.1400000000000001)
Note that the methods provided are just a sample of possible methods.
Python ≥3.3 update
You can use types.SimpleNamespace:
>>> import types
>>> r= types.SimpleNamespace()
>>> r.attribute1= "hello"
>>> r.attribute2= "there"
>>> r.attribute3= 3.14
dir(r) would provide you with the attribute names (filtering out all .startswith("__"), of course).
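For example, a quick sketch of that filtering, using the r from above:
>>> [name for name in dir(r) if not name.startswith("__")]
['attribute1', 'attribute2', 'attribute3']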
Is there any reason you can't use a regular dictionary? It seems like the attributes don't have a specific ordering in your particular situation.
Alternatively, you could also use a class instance (which has nice attribute access syntax). You could use __slots__ if you wish to avoid having a __dict__ created for each instance.
I've also just found a recipe for "records", which are described as mutable named-tuples. They are implemented using classes.
Update:
Since you say order is important for your scenario (and you want to iterate through all the attributes) an OrderedDict seems to be the way to go. This is part of the standard collections module as of Python 2.7; there are other implementations floating around the internet for Python < 2.7.
To add attribute-style access, you can subclass it like so:
from collections import OrderedDict

class MutableNamedTuple(OrderedDict):
    def __init__(self, *args, **kwargs):
        super(MutableNamedTuple, self).__init__(*args, **kwargs)
        self._initialized = True

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if hasattr(self, '_initialized'):
            super(MutableNamedTuple, self).__setitem__(name, value)
        else:
            super(MutableNamedTuple, self).__setattr__(name, value)
Then you can do:
>>> t = MutableNamedTuple()
>>> t.foo = u'Crazy camels!'
>>> t.bar = u'Yay, attribute access'
>>> t.foo
u'Crazy camels!'
>>> t.values()
[u'Crazy camels!', u'Yay, attribute access']
This can be done using an empty class and instances of it, like this:
>>> class a(): pass
...
>>> ainstance = a()
>>> ainstance.b = 'We want Moshiach Now'
>>> ainstance.b
'We want Moshiach Now'
>>>
There's a library similar to namedtuple, but mutable, called recordtype.
Package home: http://pypi.python.org/pypi/recordtype
Simple example:
from recordtype import recordtype

Person = recordtype('Person', 'first_name last_name phone_number')
person1 = Person('Trent', 'Steele', '637-3049')
person1.last_name = 'Terrence'

print person1
# Person(first_name=Trent, last_name=Terrence, phone_number=637-3049)
Simple default value example:
Basis = recordtype('Basis', [('x', 1), ('y', 0)])
Iterate through the fields of person1 in order:
map(person1.__getattribute__, Person._fields)
This question is old, but just for the sake of completeness, Python 3.7 has dataclasses which are pretty much records.
>>> from dataclasses import dataclass
>>>
>>> @dataclass
... class MyRecord:
...     name: str
...     age: int = -1
...
>>> rec = MyRecord('me')
>>> rec.age = 127
>>> print(rec)
MyRecord(name='me', age=127)
The attrs third-party library provides more functionality for both Python 2 and Python 3. There's nothing wrong with vendoring dependencies either, if the requirement is more about what you can't keep locally rather than specifically using only the stdlib; dephell has a nice helper for doing that.
There is a mutable alternative to collections.namedtuple: recordclass.
It has the same API and a minimal memory footprint (it's actually also faster). It supports assignment. For example:
from recordclass import recordclass
Point = recordclass('Point', 'x y')
>>> p = Point(1, 2)
>>> p
Point(x=1, y=2)
>>> print(p.x, p.y)
1 2
>>> p.x += 2; p.y += 3; print(p)
Point(x=3, y=5)
There is a more complete example (which also includes performance comparisons).
In the closely related Existence of mutable named tuple in Python? question 13 tests are used for comparing 6 mutable alternatives to namedtuple.
The latest namedlist 1.7 passes all of these tests with both Python 2.7 and Python 3.5 as of Jan 11, 2016. It is a pure python implementation.
The second best candidate according to these tests is the recordclass which is a C extension. Of course, it depends on your requirements whether a C extension is preferred or not.
For further details, especially for the tests, see Existence of mutable named tuple in Python?
Based on several useful tricks gathered over time, this "frozenclass" decorator does pretty much everything needed: http://pastebin.com/fsuVyM45
Since that code is over 70% documentation and tests, I won't say more here.
Here is a complete mutable namedtuple I made, which behaves like a list and is totally compatible with it.
class AbstractNamedArray():
    """a mutable collections.namedtuple"""
    def __new__(cls, *args, **kwargs):
        inst = object.__new__(cls)  # to rename the class
        inst._list = len(cls._fields)*[None]
        inst._mapping = {}
        for i, field in enumerate(cls._fields):
            inst._mapping[field] = i
        return inst

    def __init__(self, *args, **kwargs):
        if len(kwargs) == 0 and len(args) != 0:
            assert len(args) == len(self._fields), 'bad number of arguments'
            self._list = list(args)
        elif len(args) == 0 and len(kwargs) != 0:
            for field, value in kwargs.items():
                assert field in self._fields, "field {} doesn't exist".format(field)
                self._list[self._mapping[field]] = value
        else:
            raise ValueError("you can't mix args and kwargs")

    def __getattr__(self, x):
        return object.__getattribute__(self, '_list')[object.__getattribute__(self, '_mapping')[x]]

    def __setattr__(self, x, y):
        if x in self._fields:
            self._list[self._mapping[x]] = y
        else:
            object.__setattr__(self, x, y)

    def __repr__(self):
        fields = []
        for field, value in zip(self._fields, map(self.__getattr__, self._fields)):
            fields.append('{}={}'.format(field, repr(value)))
        return '{}({})'.format(self._name, ', '.join(fields))

    def __iter__(self):
        yield from self._list

    def __list__(self):
        return self._list[:]

    def __len__(self):
        return len(self._fields)

    def __getitem__(self, x):
        return self._list[x]

    def __setitem__(self, x, y):
        self._list[x] = y

    def __contains__(self, x):
        return x in self._list

    def reverse(self):
        self._list.reverse()

    def copy(self):
        return self._list.copy()

def namedarray(name, fields):
    """used to construct a named array (fixed-length list with named fields)"""
    return type(name, (AbstractNamedArray,), {'_name': name, '_fields': fields})
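Usage might look like this (a sketch built on the factory above):
Point = namedarray('Point', ('x', 'y'))
p = Point(1, 2)
p.x = 10   # mutable, attribute style
p[1] = 20  # mutable, index style
print(p)        # Point(x=10, y=20)
print(list(p))  # [10, 20]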
You could do something like this dict subclass, which is its own __dict__. The basic concept is the same as that of the ActiveState AttrDict recipe, but the implementation is simpler. The result is something more mutable than you need, since both an instance's attributes and their values are changeable. Although the attributes aren't ordered, you can iterate through the current ones and/or their values.
class Record(dict):
    def __init__(self, *args, **kwargs):
        super(Record, self).__init__(*args, **kwargs)
        self.__dict__ = self
As tzot stated, since Python ≥3.3, Python does have a mutable version of namedtuple: types.SimpleNamespace.
These things are very similar to the new C# 9 Records.
Here are some usage examples:
Positional constructor arguments
>>> import types
>>>
>>> class Location(types.SimpleNamespace):
...     def __init__(self, lat=0, long=0):
...         super().__init__(lat=lat, long=long)
...
>>> loc_1 = Location(49.4, 8.7)
Pretty repr
>>> loc_1
Location(lat=49.4, long=8.7)
Mutable
>>> loc_2 = Location()
>>> loc_2
Location(lat=0, long=0)
>>> loc_2.lat = 49.4
>>> loc_2
Location(lat=49.4, long=0)
Value semantics for equality
>>> loc_2 == loc_1
False
>>> loc_2.long = 8.7
>>> loc_2 == loc_1
True
Can add attributes at runtime
>>> loc_2.city = 'Heidelberg'
>>> loc_2
Location(lat=49.4, long=8.7, city='Heidelberg')
