I would like to write a custom list class in Python 3, as in the question How would I create a custom list class in python?, but unlike that question I would like to implement the __get__ and __set__ methods. My class is similar to a list, but there are some magic operations hidden behind these methods, so I would like to work with the variable as if it were a plain list, as in the main block of my program (see below). I would like to know how to move the __get__ and __set__ methods (fget and fset respectively) from the Foo class into the MyList class, so that there is only one class.
My current solution (also, I added output for each operation for clarity):
class MyList:
    def __init__(self, data=[]):
        print('MyList.__init__')
        self._mylist = data

    def __getitem__(self, key):
        print('MyList.__getitem__')
        return self._mylist[key]

    def __setitem__(self, key, item):
        print('MyList.__setitem__')
        self._mylist[key] = item

    def __str__(self):
        print('MyList.__str__')
        return str(self._mylist)

class Foo:
    def __init__(self, mylist=[]):
        self._mylist = MyList(mylist)

    def fget(self):
        print('Foo.fget')
        return self._mylist

    def fset(self, data):
        print('Foo.fset')
        self._mylist = MyList(data)

    mylist = property(fget, fset, None, 'MyList property')

if __name__ == '__main__':
    foo = Foo([1, 2, 3])
    # >>> MyList.__init__
    print(foo.mylist)
    # >>> Foo.fget
    # >>> MyList.__str__
    # >>> [1, 2, 3]
    foo.mylist = [1, 2, 3, 4]
    # >>> Foo.fset
    # >>> MyList.__init__
    print(foo.mylist)
    # >>> Foo.fget
    # >>> MyList.__str__
    # >>> [1, 2, 3, 4]
    foo.mylist[0] = 0
    # >>> Foo.fget
    # >>> MyList.__setitem__
    print(foo.mylist[0])
    # >>> Foo.fget
    # >>> MyList.__getitem__
    # >>> 0
Thank you in advance for any help.
How can I move the __get__ and __set__ methods (fget and fset respectively) from the Foo class to the MyList class, so that there is only one class?
UPD:
Thanks a lot to @Blckknght! I tried to understand his answer, and it works very well for me; it's exactly what I needed. As a result, I get the following code:
class MyList:
    def __init__(self, value=None):
        self.name = None
        if value is None:
            self.value = []
        else:
            self.value = value

    def __set_name__(self, owner, name):
        self.name = "_" + name

    def __get__(self, instance, owner):
        return getattr(instance, self.name)

    def __set__(self, instance, value):
        setattr(instance, self.name, MyList(value))

    def __getitem__(self, key):
        return self.value[key]

    def __setitem__(self, key, value):
        self.value[key] = value

    def append(self, value):
        self.value.append(value)

    def __str__(self):
        return str(self.value)

class Foo:
    my_list = MyList()

    def __init__(self):
        self.my_list = [1, 2, 3]
        print(type(self.my_list))  # <class '__main__.MyList'>
        self.my_list = [4, 5, 6, 7, 8]
        print(type(self.my_list))  # <class '__main__.MyList'>
        self.my_list[0] = 10
        print(type(self.my_list))  # <class '__main__.MyList'>
        self.my_list.append(7)
        print(type(self.my_list))  # <class '__main__.MyList'>
        print(self.my_list)  # [10, 5, 6, 7, 8, 7]
foo = Foo()
I don't know whether this is the Pythonic way or not, but it works as I expected.
In a comment, you explained what you actually want:
x = MyList([1])
x = [2]
# and have x be a MyList after that.
That is not possible. In Python, plain assignment to a bare name (e.g., x = ..., in contrast to x.blah = ... or x[0] = ...) is an operation on the name only, not the value, so there is no way for any object to hook into the name-binding process. An assignment like x = [2] works the same way no matter what the value of x is (and indeed works the same way regardless of whether x already has a value or whether this is the first value being assigned to x).
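You can see this directly with the MyList from your question; rebinding the bare name never consults the old object:

x = MyList([1])    # prints: MyList.__init__
x = [2]            # plain name rebinding; no MyList method is called
print(type(x))     # <class 'list'>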
While you can make your MyList class follow the descriptor protocol (which is what the __get__ and __set__ methods are for), you probably don't want to. That's because, to be useful, a descriptor must be placed as an attribute of a class, not as an attribute of an instance. The property in your Foo class creates a separate MyList instance for each Foo instance. That wouldn't work if the list were defined on the Foo class directly.
That's not to say that custom descriptors can't be useful. The property you're using in your Foo class is a descriptor. If you wanted to, you could write your own MyListAttr descriptor that does the same thing.
class MyListAttr(object):
    def __init__(self):
        self.name = None

    def __set_name__(self, owner, name):  # this is used in Python 3.6+
        self.name = "_" + name

    def find_name(self, cls):  # this is used on earlier versions that don't support __set_name__
        for name in dir(cls):
            if getattr(cls, name) is self:
                self.name = "_" + name
                return
        raise TypeError()

    def __get__(self, obj, owner):
        if obj is None:
            return self
        if self.name is None:
            self.find_name(owner)
        return getattr(obj, self.name)

    def __set__(self, obj, value):
        if self.name is None:
            self.find_name(type(obj))
        setattr(obj, self.name, MyList(value))

class Foo(object):
    mylist = MyListAttr()  # create the descriptor as a class variable

    def __init__(self, data=None):
        if data is None:
            data = []
        self.mylist = data  # this invokes the __set__ method of the descriptor!
The MyListAttr class is more complicated than it otherwise might be because I try to have the descriptor object find its own name. That's not easy to figure out in older versions of Python. Starting with Python 3.6, it's much easier (because the __set_name__ method will be called on the descriptor when it is assigned as a class variable). A lot of the code in the class could be removed if you only needed to support Python 3.6 and later (you wouldn't need find_name or any of the code that calls it in __get__ and __set__).
It might not seem worth writing a long descriptor class like MyListAttr to do what you were able to do with less code using a property. That's probably correct if you only have one place you want to use the descriptor. But if you may have many classes (or many attributes within a single class) where you want the same special behavior, you will benefit from packing the behavior into a descriptor rather than writing a lot of very similar property getter and setter methods.
You might not have noticed, but I also made a change to the Foo class that is not directly related to the descriptor use. The change is to the default value for data. Using a mutable object like a list as a default argument is usually a very bad idea, as that same object will be shared by all calls to the function without an argument (so all Foo instances not initialized with data would share the same list). It's better to use a sentinel value (like None) and replace the sentinel with what you really want (a new empty list in this case). You probably should fix this issue in your MyList.__init__ method too.
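To make that pitfall concrete, here is a minimal sketch (the Bad class is hypothetical, purely illustrative):

class Bad:
    def __init__(self, data=[]):  # this one list is created once, at def time
        self.data = data

x = Bad()
y = Bad()
x.data.append(1)
print(y.data)  # [1] -- both instances share the single default list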
I have a list of objects. Each object has two fields
obj1.status = 2
obj1.timestamp = 19211
obj2.status = 3
obj2.timestamp = 14211
obj_list = [obj1, obj2]
I will keep adding/deleting objects in the list and also changing attributes of the objects; for example, I may change obj1.status to 5.
Now I have two dicts
dict1 - <status, object>
dict2 - <timestamp, object>
How do I design a simple solution so that whenever I modify/delete/insert elements in the list, the maps get updated automatically? I am interested in a Pythonic solution that is elegant and extensible; for example, in the future I should be able to easily add another attribute and a dict for it as well.
Also, for simplicity, let us assume all attribute values are different; for example, no two objects will have the same status.
You could override the __setattr__ on the objects to update the indexes whenever you set the values. You can use a weakref dictionary for the indexes so that when you delete objects and are no longer using them, they are automatically removed from the indexes.
import weakref
from bunch import Bunch

class MyObject(object):
    indexes = Bunch()  # Could just use dict()

    def __init__(self, **kwargs):
        super(MyObject, self).__init__()
        for k, v in kwargs.items():
            setattr(self, k, v)

    def __setattr__(self, name, value):
        try:
            index = MyObject.indexes[name]
        except KeyError:
            index = weakref.WeakValueDictionary()
            MyObject.indexes[name] = index
        try:
            old_val = getattr(self, name)
            del index[old_val]
        except (KeyError, AttributeError):
            pass
        object.__setattr__(self, name, value)
        index[value] = self

obj1 = MyObject(status=1, timestamp=123123)
obj2 = MyObject(status=2, timestamp=2343)

print MyObject.indexes.status[1]
print obj1.indexes.timestamp[2343]

obj1.status = 5
print obj2.indexes['status'][5]
I used a Bunch here because it allows you to access the indexes using .name notation, but you could just use a dict instead and use the ['name'] syntax.
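For example (my sketch, assuming you'd rather avoid the bunch dependency), with a plain dict as the class-level container the same lookups use bracket syntax only:

MyObject.indexes = dict()  # instead of Bunch(); set before creating instances
obj1 = MyObject(status=1, timestamp=123123)
print MyObject.indexes['status'][1]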
One approach here would be to create class-level dicts for MyObj and define the updating behavior using the property decorator. Every time an object is changed or added, the change is reflected in the respective dictionaries associated with the class.
Edit: as @BrendanAbel points out, using weakref.WeakValueDictionary in place of dict handles object deletion from the class-level dicts.
from datetime import datetime
from weakref import WeakValueDictionary

DEFAULT_TIME = datetime.now()

class MyObj(object):
    """
    A sample clone of your object
    """
    timestamps = WeakValueDictionary()
    statuses = WeakValueDictionary()

    def __init__(self, status=0, timestamp=DEFAULT_TIME):
        self._status = status
        self._timestamp = timestamp
        self.status = status
        self.timestamp = timestamp

    def __update_class(self):
        MyObj.timestamps.update({self.timestamp: self})
        MyObj.statuses.update({self.status: self})

    def __delete_from_class(self):
        maybe_self = MyObj.statuses.get(self.status, None)
        if maybe_self is self is not None:
            del MyObj.statuses[self.status]
        maybe_self = MyObj.timestamps.get(self.timestamp, None)
        if maybe_self is self is not None:
            del MyObj.timestamps[self.timestamp]

    @property
    def status(self):
        return self._status

    @status.setter
    def status(self, val):
        self.__delete_from_class()
        self._status = val
        self.__update_class()

    @property
    def timestamp(self):
        return self._timestamp

    @timestamp.setter
    def timestamp(self, val):
        self.__delete_from_class()
        self._timestamp = val
        self.__update_class()

    def __repr__(self):
        return "MyObj: status={} timestamp={}".format(self.status, self.timestamp)
obj1 = MyObj(1)
obj2 = MyObj(2)
obj3 = MyObj(3)
lst = [obj1, obj2, obj3]
# In [87]: q.lst
# Out[87]:
# [MyObj: status=1 timestamp=2016-05-27 13:43:38.158363,
# MyObj: status=2 timestamp=2016-05-27 13:43:38.158363,
# MyObj: status=3 timestamp=2016-05-27 13:43:38.158363]
# In [88]: q.MyObj.statuses[1]
# Out[88]: MyObj: status=1 timestamp=2016-05-27 13:43:38.158363
# In [89]: q.MyObj.statuses[1].status = 42
# In [90]: q.MyObj.statuses[42]
# Out[90]: MyObj: status=42 timestamp=2016-05-27 13:43:38.158363
# In [91]: q.MyObj.statuses[1]
# ---------------------------------------------------------------------------
# KeyError Traceback (most recent call last)
# <ipython-input-91-508ab072bfc4> in <module>()
# ----> 1 q.MyObj.statuses[1]
# KeyError: 1
For a collection to be aware of mutations of its elements, there must be some connection between the elements and that collection which can communicate when changes happen. For this reason, we must either bind each instance to a collection or proxy the elements of the collection so that change-communication doesn't leak into the elements' code.
A note about the implementation I'm going to present: the proxying method only works if the attributes are changed by direct setting, not from inside a method of the element. A more complex book-keeping system would be necessary then.
Additionally, since you stated that exact duplicates of attribute values won't exist, the indices could map each value to a single object; here they are built out of set objects instead, which copes with duplicates as well.
from collections import defaultdict

class Proxy(object):
    def __init__(self, proxy, collection):
        self._proxy = proxy
        self._collection = collection

    def __getattribute__(self, name):
        if name in ("_proxy", "_collection"):
            return object.__getattribute__(self, name)
        else:
            proxied = self._proxy
            return getattr(proxied, name)

    def __setattr__(self, name, value):
        if name in ("_proxy", "_collection"):
            object.__setattr__(self, name, value)
        else:
            proxied = self._proxy
            collection = self._collection
            old = getattr(proxied, name)
            setattr(proxied, name, value)
            collection.signal_change(proxied, name, old, value)
class IndexedCollection(object):
    def __init__(self, items, index_names):
        self.items = list(items)
        self.index_names = set(index_names)
        self.indices = defaultdict(lambda: defaultdict(set))
        for obj in self.items:
            # index any items provided at construction time
            self._add_to_indices(obj)

    def __len__(self):
        return len(self.items)

    def __iter__(self):
        for i in range(len(self)):
            yield self[i]

    def remove(self, obj):
        self.items.remove(obj)
        self._remove_from_indices(obj)

    def __getitem__(self, i):
        # Ensure consumers get a proxy, not a raw object
        return Proxy(self.items[i], self)

    def append(self, obj):
        self.items.append(obj)
        self._add_to_indices(obj)

    def _add_to_indices(self, obj):
        for indx in self.index_names:
            key = getattr(obj, indx)
            self.indices[indx][key].add(obj)

    def _remove_from_indices(self, obj):
        for indx in self.index_names:
            key = getattr(obj, indx)
            self.indices[indx][key].remove(obj)

    def signal_change(self, obj, indx, old, new):
        # Tell the container to update its indices for a
        # particular attribute and object
        if indx not in self.index_names:
            return
        self.indices[indx][old].remove(obj)
        self.indices[indx][new].add(obj)
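A brief usage sketch (my addition; the Item class is hypothetical, just to exercise the code above):

class Item(object):
    def __init__(self, status, timestamp):
        self.status = status
        self.timestamp = timestamp

coll = IndexedCollection([], index_names=('status', 'timestamp'))
obj = Item(2, 19211)
coll.append(obj)
print(coll.indices['status'][2])   # set containing obj

proxied = coll[0]                  # a Proxy, not the raw Item
proxied.status = 5                 # notifies the collection via signal_change
print(coll.indices['status'][5])   # obj is now indexed under the new value
print(coll.indices['status'][2])   # and no longer under the old one (empty set)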
I am not sure if this is what you are asking for but ...
Objects:
import operator
class Foo(object):
    def __init__(self):
        self.one = 1
        self.two = 2
f = Foo()
f.name = 'f'
g = Foo()
g.name = 'g'
h = Foo()
h.name = 'h'
name = operator.attrgetter('name')
lists: a initially contains f and b initially contains h
a = [f]
b = [h]
dictionaries: each with one item whose value is one of the lists
d1 = {1:a}
d2 = {1:b}
d1[1] is list a, which contains f, and f.one is 1
>>> d1
{1: [<__main__.Foo object at 0x03F4CA50>]}
>>> name(d1[1][0])
'f'
>>> name(d1[1][0]), d1[1][0].one
('f', 1)
changing f.one is seen in the dictionary
>>> f.one = '?'
>>> name(d1[1][0]), d1[1][0].one
('f', '?')
>>>
d2[1] is list b, which contains h
>>> d2
{1: [<__main__.Foo object at 0x03F59070>]}
>>> name(d2[1][0]), d2[1][0].one
('h', 1)
Add an object to b and it is seen in the dictionary
>>> b.append(g)
>>> b
[<__main__.Foo object at 0x03F59070>, <__main__.Foo object at 0x03F4CAF0>]
>>> d2
{1: [<__main__.Foo object at 0x03F59070>, <__main__.Foo object at 0x03F4CAF0>]}
>>> name(d2[1][1]), d2[1][1].one
('g', 1)
Let's say I have this dictionary in python, defined at the module level (mysettings.py):
settings = {
    'expensive1' : expensive_to_compute(1),
    'expensive2' : expensive_to_compute(2),
    ...
}
I would like those values to be computed when the keys are accessed:
from mysettings import settings # settings is only "prepared"
print settings['expensive1'] # Now the value is really computed.
Is this possible? How?
Don't inherit from the built-in dict. Even if you overwrite the dict.__getitem__() method, dict.get() will not work as you expect.
The right way is to inherit from collections.abc.Mapping:
from collections.abc import Mapping

class LazyDict(Mapping):
    def __init__(self, *args, **kw):
        self._raw_dict = dict(*args, **kw)

    def __getitem__(self, key):
        func, arg = self._raw_dict.__getitem__(key)
        return func(arg)

    def __iter__(self):
        return iter(self._raw_dict)

    def __len__(self):
        return len(self._raw_dict)
Then you can do:
settings = LazyDict({
    'expensive1': (expensive_to_compute, 1),
    'expensive2': (expensive_to_compute, 2),
})
I also list sample code and examples here: https://gist.github.com/gyli/9b50bb8537069b4e154fec41a4b5995a
If you don't separate the arguments from the callable, I don't think it's possible. However, this should work:
class MySettingsDict(dict):
    def __getitem__(self, item):
        function, arg = dict.__getitem__(self, item)
        return function(arg)

def expensive_to_compute(arg):
    return arg * 3
And now:
>>> settings = MySettingsDict({
...     'expensive1': (expensive_to_compute, 1),
...     'expensive2': (expensive_to_compute, 2),
... })
>>> settings['expensive1']
3
>>> settings['expensive2']
6
Edit:
You may also want to cache the results of expensive_to_compute if they are to be accessed multiple times. Something like this:
class MySettingsDict(dict):
    def __getitem__(self, item):
        value = dict.__getitem__(self, item)
        if not isinstance(value, int):
            function, arg = value
            value = function(arg)
            dict.__setitem__(self, item, value)
        return value
And now:
>>> settings.values()
dict_values([(<function expensive_to_compute at 0x9b0a62c>, 2),
(<function expensive_to_compute at 0x9b0a62c>, 1)])
>>> settings['expensive1']
3
>>> settings.values()
dict_values([(<function expensive_to_compute at 0x9b0a62c>, 2), 3])
You may also want to override other dict methods depending of how you want to use the dict.
Store references to the functions as the values for the keys, i.e.:
def A():
    return "that took ages"

def B():
    return "that took for-ever"

settings = {
    "A": A,
    "B": B,
}

print(settings["A"]())
This way, you only evaluate the function associated with a key when you access it and invoke it. A suitable class which can handle having non-lazy values would be:
import types

class LazyDict(dict):
    def __getitem__(self, key):
        item = dict.__getitem__(self, key)
        if isinstance(item, types.FunctionType):
            return item()
        else:
            return item
Usage:

settings = LazyDict([("A", A), ("B", B)])
print(settings["A"])
# that took ages
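One caveat (my note, not part of the original answer): isinstance(item, types.FunctionType) matches plain functions and lambdas, but not other callables such as functools.partial objects or bound methods; if you never store callable "real" values, testing callable(item) instead would be a looser alternative.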
You can make expensive_to_compute a generator function, so the expensive work is deferred until a value is pulled from it, for example:

def expensive_to_compute(arg):
    yield arg * 3  # the body only runs once next() is called

settings = {
    'expensive1' : expensive_to_compute(1),
    'expensive2' : expensive_to_compute(2),
}
Then try:
from mysettings import settings
print next(settings['expensive1'])
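One caveat worth adding: a generator yields each value only once, so calling next(settings['expensive1']) a second time raises StopIteration; the caching approaches in the other answers don't have that limitation.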
I would populate the dictionary values with callables and change them to the result upon reading.
class LazyDict(dict):
    def __getitem__(self, k):
        v = super().__getitem__(k)
        if callable(v):
            v = v()
            super().__setitem__(k, v)
        return v

    def get(self, k, default=None):
        if k in self:
            return self.__getitem__(k)
        return default
Then with
def expensive_to_compute(arg):
    print('Doing heavy stuff')
    return arg * 3
you can do:
>>> settings = LazyDict({
...     'expensive1': lambda: expensive_to_compute(1),
...     'expensive2': lambda: expensive_to_compute(2),
... })
>>> settings.__repr__()
"{'expensive1': <function <lambda> at 0x000001A0BA2B8EA0>, 'expensive2': <function <lambda> at 0x000001A0BA2B8F28>}"
>>> settings['expensive1']
Doing heavy stuff
3
>>> settings.get('expensive2')
Doing heavy stuff
6
>>> settings.__repr__()
"{'expensive1': 3, 'expensive2': 6}"
I recently needed something similar. Mixing both strategies from Guangyang Li and michaelmeyer, here is how I did it:
from collections.abc import MutableMapping

class LazyDict(MutableMapping):
    """Lazily evaluated dictionary."""

    function = None

    def __init__(self, *args, **kargs):
        self._dict = dict(*args, **kargs)
        self._evaluated = set()  # keys whose values have already been computed

    def __getitem__(self, key):
        """Evaluate value on first access and cache the result."""
        value = self._dict[key]
        if key not in self._evaluated:
            value = self.function(value)
            self._dict[key] = value
            self._evaluated.add(key)
        return value

    def __setitem__(self, key, value):
        """Store value lazily."""
        self._dict[key] = value
        self._evaluated.discard(key)

    def __delitem__(self, key):
        """Delete value."""
        self._evaluated.discard(key)
        del self._dict[key]

    def __iter__(self):
        """Iterate over dictionary."""
        return iter(self._dict)

    def __len__(self):
        """Evaluate size of dictionary."""
        return len(self._dict)
Let's lazily evaluate the following function:
def expensive_to_compute(arg):
    return arg * 3
The advantage is that the function does not have to be known when the object is created, and the arguments are what actually get stored (which is what I needed):
>>> settings = LazyDict({'expensive1': 1, 'expensive2': 2})
>>> settings.function = expensive_to_compute # function unknown until now!
>>> settings['expensive1']
3
>>> settings['expensive2']
6
This approach works with a single function only.
I can point out the following advantages:

- it implements the complete MutableMapping API
- if your function is non-deterministic, you can reset a value to re-evaluate it
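For instance, with the LazyDict above, re-assigning a key stores a new raw argument and the value is recomputed from it on the next access:

>>> settings['expensive1'] = 10  # store a new argument
>>> settings['expensive1']       # re-evaluated: expensive_to_compute(10)
30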
Pass in a function to generate the values on the first attribute get:
class LazyDict(dict):
    """ Fill in the values of a dict at first access """
    def __init__(self, fn, *args, **kwargs):
        self._fn = fn
        self._fn_args = args or []
        self._fn_kwargs = kwargs or {}
        return super(LazyDict, self).__init__()

    def _fn_populate(self):
        if self._fn:
            self._fn(self, *self._fn_args, **self._fn_kwargs)
            self._fn = self._fn_args = self._fn_kwargs = None

    def __getattribute__(self, name):
        if not name.startswith('_fn'):
            self._fn_populate()
        return super(LazyDict, self).__getattribute__(name)

    def __getitem__(self, item):
        self._fn_populate()
        return super(LazyDict, self).__getitem__(item)
>>> def _fn(self, val):
...     print 'lazy loading'
...     self['foo'] = val
...
>>> d = LazyDict(_fn, 'bar')
>>> d
{}
>>> d['foo']
lazy loading
'bar'
Alternatively, one can use the LazyDictionary package that creates a thread-safe lazy dictionary.
Installation:
pip install lazydict
Usage:
from lazydict import LazyDictionary
import tempfile
lazy = LazyDictionary()
lazy['temp'] = lambda: tempfile.mkdtemp()
I have a custom container class in Python 2.7, and everything works as expected except if I try to expand an instance as **kwargs for a function:
cm = ChainableMap({'a': 1})
cm['b'] = 2
assert cm == {'a': 1, 'b': 2}  # Is fine

def check_kwargs(**kwargs):
    assert kwargs == {'a': 1, 'b': 2}

check_kwargs(**cm)  # Raises AssertionError
I've overridden __getitem__, __iter__, iterkeys, keys, items, and iteritems (and __eq__ and __repr__), yet none of them seem to be involved in the expansion as **kwargs. What am I doing wrong?
Edit - the working updated source, which now inherits from MutableMapping and adds the missing methods:
from itertools import chain
from collections import MutableMapping

class ChainableMap(MutableMapping):
    """
    A mapping object with a delegation chain similar to JS object prototypes::

        >>> parent = {'a': 1}
        >>> child = ChainableMap(parent)
        >>> child.parent is parent
        True

    Failed lookups delegate up the chain to self.parent::

        >>> 'a' in child
        True
        >>> child['a']
        1

    But modifications will only affect the child::

        >>> child['b'] = 2
        >>> child.keys()
        ['a', 'b']
        >>> parent.keys()
        ['a']
        >>> child['a'] = 10
        >>> parent['a']
        1

    Changes in the parent are also reflected in the child::

        >>> parent['c'] = 3
        >>> sorted(child.keys())
        ['a', 'b', 'c']
        >>> expect = {'a': 10, 'b': 2, 'c': 3}
        >>> assert child == expect, "%s != %s" % (child, expect)

    Unless the child is already masking out a certain key::

        >>> del parent['a']
        >>> parent.keys()
        ['c']
        >>> assert child == expect, "%s != %s" % (child, expect)

    And now this works as well::

        >>> def print_sorted(**kwargs):
        ...     for k in sorted(kwargs.keys()):
        ...         print "%r=%r" % (k, kwargs[k])
        >>> child['c'] == 3
        True
        >>> print_sorted(**child)
        'a'=10
        'b'=2
        'c'=3
    """
    __slots__ = ('_', 'parent')

    def __init__(self, parent, **data):
        self.parent = parent
        self._ = data

    def __getitem__(self, key):
        try:
            return self._[key]
        except KeyError:
            return self.parent[key]

    def __iter__(self):
        return self.iterkeys()

    def __setitem__(self, key, val):
        self._[key] = val

    def __delitem__(self, key):
        del self._[key]

    def __len__(self):
        return len(self.keys())

    def keys(self, own=False):
        return list(self.iterkeys(own))

    def items(self, own=False):
        return list(self.iteritems(own))

    def iterkeys(self, own=False):
        if own:
            for k in self._.iterkeys():
                yield k
            return
        yielded = set([])
        for k in chain(self.parent.iterkeys(), self._.iterkeys()):
            if k in yielded:
                continue
            yield k
            yielded.add(k)

    def iteritems(self, own=False):
        for k in self.iterkeys(own):
            yield k, self[k]

    def __eq__(self, other):
        return sorted(self.iteritems()) == sorted(other.iteritems())

    def __repr__(self):
        return dict(self.iteritems()).__repr__()

    def __contains__(self, key):
        return key in self._ or key in self.parent

    def containing(self, key):
        """
        Return the ancestor that directly contains ``key``

        >>> p2 = {'a': 2}
        >>> p1 = ChainableMap(p2)
        >>> c = ChainableMap(p1)
        >>> c.containing('a') is p2
        True
        """
        if key in self._:
            return self
        elif hasattr(self.parent, 'containing'):
            return self.parent.containing(key)
        elif key in self.parent:
            return self.parent

    def get(self, key, default=None):
        """
        >>> c = ChainableMap({'a': 1})
        >>> c.get('a')
        1
        >>> c.get('b', 'default')
        'default'
        """
        if key in self:
            return self[key]
        else:
            return default

    def pushdown(self, top):
        """
        Pushes a new mapping onto the top of the delegation chain:

        >>> parent = {'a': 10}
        >>> child = ChainableMap(parent)
        >>> top = {'a': 'apple', 'b': 'beer', 'c': 'cheese'}
        >>> child.pushdown(top)
        >>> assert child == top

        This creates a new ChainableMap with the contents of ``child`` and makes it
        the new parent (the old parent becomes the grandparent):

        >>> child.parent.parent is parent
        True
        >>> del child['a']
        >>> child['a'] == 10
        True
        """
        old = ChainableMap(self.parent)
        for k, v in self.items(True):
            old[k] = v
            del self[k]
        self.parent = old
        for k, v in top.iteritems():
            self[k] = v
When creating a keyword argument dictionary, the behavior is the same as passing your object into the dict() initializer, which results in the dict {'b': 2} for your cm object:
>>> cm = ChainableMap({'a': 1})
>>> cm['b'] = 2
>>> dict(cm)
{'b': 2}
A more detailed explanation of why this is the case is below, but the summary is that your mapping is converted to a Python dictionary in C code which does some optimization if the argument is itself another dict, by bypassing the Python function calls and inspecting the underlying C object directly.
There are a few ways to approach a solution: either make sure that the underlying dict contains everything you want, or stop inheriting from dict (which will require other changes as well, at the very least a __setitem__ method).
Edit: It sounds like BrenBarn's suggestion to inherit from collections.MutableMapping instead of dict did the trick.
You could accomplish the first method pretty simply by just adding self.update(parent) to ChainableMap.__init__(), but I'm not sure if that will cause other side effects to the behavior of your class.
Explanation of why dict(cm) gives {'b': 2}:
Check out the following CPython code for the dict object:
http://hg.python.org/releasing/2.7.3/file/7bb96963d067/Objects/dictobject.c#l1522
When dict(cm) is called (and when keyword arguments are unpacked), the PyDict_Merge function is called with cm as the b parameter. Because ChainableMap inherits from dict, the if statement at line 1539 is entered:
if (PyDict_Check(b)) {
    other = (PyDictObject*)b;
    ...
From there on, items from other are added to the new dict that is being created by accessing the C object directly, which bypasses all of the methods that you overwrote.
This means that any items in a ChainableMap instance that are accessed through the parent attribute will not be added to the new dictionary created by dict() or keyword argument unpacking.
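As a quick sanity check (my sketch, using the MutableMapping-based version from the question's edit): because the class no longer inherits from dict, the C fast path doesn't apply and the inherited keys are seen:

cm = ChainableMap({'a': 1})
cm['b'] = 2
print dict(cm)  # {'a': 1, 'b': 2}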
Is there any way to make a list of classes behave like a set in python?
Basically, I'm working on a piece of software that does some involved string comparison, and I have a custom class for handling the strings. Therefore, there is an instance of the class for each string.
As a result, I have a large list containing all these classes. I would like to be able to access them like list[key], where in this case, the key is a string the class is based off of (note: the string will never change once the class is instantiated, so it should be hashable).
It seems to me that I should be able to do this somewhat easily, by adding something like __cmp__ to the class, but either I'm being obtuse (likely), or I'm missing something in the docs.
Basically, I want to be able to do something like this (Python prompt example):
>>> class a:
...     def __init__(self, x):
...         self.var = x
...
>>> from test import a
>>> cl = set([a("Hello"), a("World"), a("Pie")])
>>> print cl
set([<test.a instance at 0x00C866C0>, <test.a instance at 0x00C866E8>, <test.a instance at 0x00C86710>])
>>> cl["World"]
<test.a instance at 0x00C866E8>
Thanks!
Edit: Some additional tweaks:

>>> class a:
...     def __init__(self, x):
...         self.var = x
...     def __hash__(self):
...         return hash(self.var)
...
>>> v = a("Hello")
>>> x = {}
>>> x[v] = v
>>> x["Hello"]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Hello'
Just write a class that behaves a bit like a mapping:
class ClassDict(object):
    def __init__(self):
        self.classes = {}

    def add(self, cls):
        self.classes[cls.__name__] = cls

    def remove(self, cls):
        if self.classes[cls.__name__] == cls:
            del self.classes[cls.__name__]
        else:
            raise KeyError('%r' % cls)

    def __getitem__(self, key):
        return self.classes[key]

    def __repr__(self):
        return 'ClassDict(%s)' % (', '.join(self.classes),)

class C(object):
    pass

class D(object):
    pass

cd = ClassDict()
cd.add(C)
cd.add(D)
print cd
print cd['C']
Why don't you just do:
>>> v = MyStr("Hello")
>>> x = {}
>>> x[v.val]=v
>>> x["Hello"]
MyStr("Hello")
Why go through all the trouble of trying to create a hand-rolled dict that uses different keys than the ones you pass in? (i.e. "Hello" instead of MyStr("Hello")).
ex.
class MyStr(object):
    def __init__(self, val):
        self.val = str(val)

    def __hash__(self):
        return hash(self.val)

    def __str__(self):
        return self.val

    def __repr__(self):
        return 'MyStr("%s")' % self.val
>>> v = MyStr("Hello")
>>> x = {}
>>> x[str(v)]=v
>>> x["Hello"]
MyStr("Hello")
Set and dict use the value returned by an object's __hash__ method to look up the object (equality via __eq__ is also checked to confirm a match), so this will do what you want:
>>> class a:
...     def __init__(self, x):
...         self.var = x
...
...     def __hash__(self):
...         return hash(self.var)
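Note, though, that dict lookup also compares candidate keys with __eq__, which is why x["Hello"] in the question's edit still raises KeyError. A minimal sketch (my addition) that defines equality against the underlying string as well:

class a(object):
    def __init__(self, x):
        self.var = x

    def __hash__(self):
        return hash(self.var)

    def __eq__(self, other):
        # match both other instances and the raw string
        if isinstance(other, a):
            return self.var == other.var
        return self.var == other

x = {}
v = a("Hello")
x[v] = v
print x["Hello"]  # now finds v: hash and equality both match the plain string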
As I remember, set and dict also use __hash__.
From Python 2.x doc:
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries or other mutable types (that are compared by value rather than by object identity) may not be used as keys.
Do you want something like this?
class A(object):
    ALL_INSTANCES = {}

    def __init__(self, text):
        self.text = text
        self.ALL_INSTANCES[self.text] = self

a1 = A("hello")
a2 = A("world")

print A.ALL_INSTANCES["hello"]
output:
<__main__.A object at 0x00B7EA50>