I extended dict in a simple way to directly access its values with the d.key notation instead of d['key']:
class ddict(dict):
    def __getattr__(self, item):
        return self[item]
    def __setattr__(self, key, value):
        self[key] = value
Now when I try to pickle it, it will call __getattr__ to find __getstate__, which is neither present nor necessary. The same will happen upon unpickling with __setstate__:
>>> import pickle
>>> class ddict(dict):
...     def __getattr__(self, item):
...         return self[item]
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in __getattr__
KeyError: '__getstate__'
How do I have to modify the class ddict so that it can be pickled properly?
The problem is not pickle but that your __getattr__ method breaks the expected contract by raising KeyError exceptions. You need to fix your __getattr__ method to raise AttributeError exceptions instead:
def __getattr__(self, item):
    try:
        return self[item]
    except KeyError:
        raise AttributeError(item)
Now pickle is given the expected signal for a missing __getstate__ customisation hook.
From the object.__getattr__ documentation:
This method should return the (computed) attribute value or raise an AttributeError exception.
(bold emphasis mine).
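To see why the exception type matters, here is a minimal sketch. The class names KeyErrorDict and AttrErrorDict, the probe() helper and the attribute name optional_hook are made up for illustration. It only assumes that hasattr() treats AttributeError as "not there" and lets any other exception propagate, which is how optional hooks are usually looked up:

```python
class KeyErrorDict(dict):
    """Breaks the contract: a missing attribute surfaces as KeyError."""
    def __getattr__(self, item):
        return self[item]


class AttrErrorDict(dict):
    """Honours the contract: a missing attribute raises AttributeError."""
    def __getattr__(self, item):
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item) from None


def probe(obj, name):
    """Hypothetical helper: report how an optional-attribute lookup ends."""
    try:
        return "present" if hasattr(obj, name) else "absent"
    except Exception as exc:  # anything other than AttributeError escapes hasattr()
        return type(exc).__name__


print(probe(AttrErrorDict(), "optional_hook"))                 # absent
print(probe(AttrErrorDict(optional_hook=1), "optional_hook"))  # present
print(probe(KeyErrorDict(), "optional_hook"))                  # KeyError
```

The last line is the situation from the question: the lookup of an optional hook blows up instead of quietly reporting that the hook is missing.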
If you insist on keeping the KeyError, then at the very least you need to skip names that start and end with double underscores and raise an AttributeError just for those:
def __getattr__(self, item):
    if isinstance(item, str) and item[:2] == item[-2:] == '__':
        # skip non-existing dunder method lookups
        raise AttributeError(item)
    return self[item]
Note that you probably want to give your ddict() subclass an empty __slots__ tuple; you don't need the extra __dict__ attribute mapping on your instances, since you are diverting attributes to key-value pairs instead. That saves you a nice chunk of memory per instance.
Demo:
>>> import pickle
>>> class ddict(dict):
...     __slots__ = ()
...     def __getattr__(self, item):
...         try:
...             return self[item]
...         except KeyError:
...             raise AttributeError(item)
...     def __setattr__(self, key, value):
...         self[key] = value
...
>>> pickle.dumps(ddict())
b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01.'
>>> type(pickle.loads(pickle.dumps(ddict())))
<class '__main__.ddict'>
>>> d = ddict()
>>> d.foo = 'bar'
>>> d.foo
'bar'
>>> pickle.loads(pickle.dumps(d))
{'foo': 'bar'}
Why pickle tests for the __getstate__ method on the instance rather than on the class, as is the norm for special methods, is a discussion for another day.
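The effect of the empty __slots__ tuple mentioned above can be checked with a small sketch. The class names PlainDDict and SlottedDDict are invented for the comparison; both use the same attribute methods and differ only in the __slots__ line:

```python
class PlainDDict(dict):
    """No __slots__: every instance also carries an (unused) __dict__."""
    def __getattr__(self, item):
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item) from None

    def __setattr__(self, key, value):
        self[key] = value


class SlottedDDict(dict):
    """Empty __slots__: no per-instance __dict__ is allocated."""
    __slots__ = ()

    def __getattr__(self, item):
        try:
            return self[item]
        except KeyError:
            raise AttributeError(item) from None

    def __setattr__(self, key, value):
        self[key] = value


plain, slotted = PlainDDict(), SlottedDDict()
plain.foo = slotted.foo = 'bar'

print(plain, slotted)                # {'foo': 'bar'} {'foo': 'bar'}
print(plain.__dict__)                # {} (attributes went into the dict items)
print(hasattr(slotted, '__dict__'))  # False
```

Since __setattr__ routes everything into the key-value pairs, the instance __dict__ of the plain version stays empty, which is why dropping it loses nothing.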
First of all, I think you need to distinguish between instance attributes and class attributes.
Section 11.1.4 of the official Python documentation, on pickling, says:
instances of such classes whose __dict__ or the result of calling __getstate__() is picklable (see section The pickle protocol for details).
Therefore, the error you're getting occurs when you try to pickle an instance of the class, not the class itself. In fact, your class definition pickles just fine.
Now, for pickling an object of your class, the problem is that you need to call the parent class's serialization implementation first to properly set things up. The correct code is:
In [1]: import pickle
In [2]: class ddict(dict):
   ...:
   ...:     def __getattr__(self, item):
   ...:         super.__getattr__(self, item)
   ...:         return self[item]
   ...:
   ...:     def __setattr__(self, key, value):
   ...:         super.__setattr__(self, key, value)
   ...:         self[key] = value
   ...:
In [3]: d = ddict()
In [4]: d.name = "Sam"
In [5]: d
Out[5]: {'name': 'Sam'}
In [6]: pickle.dumps(d)
Out[6]: b'\x80\x03c__main__\nddict\nq\x00)\x81q\x01X\x04\x00\x00\x00nameq\x02X\x03\x00\x00\x00Samq\x03s}q\x04h\x02h\x03sb.'
Related
I am referring to the question asked in How to force/ensure class attributes are a specific type? (shown below).
The type checking works as suggested. However, instances of the class have a problem: when I instantiate the class as follows and access __dict__ on the instance, an error comes up.
excel_parser.py:
one_foo = Foo()
one_foo.__dict__
results in:
Traceback (most recent call last):
File "C:/Users/fiona/PycharmProjects/data_processing/excel_parser.py", line 80, in <module>
Foo.__dict__
TypeError: descriptor '__dict__' for 'Foo' objects doesn't apply to a 'Foo' object
How can I prevent this from happening? Thx
def getter_setter_gen(name, type_):
    def getter(self):
        return getattr(self, "__" + name)
    def setter(self, value):
        if not isinstance(value, type_):
            raise TypeError(f"{name} attribute must be set to an instance of {type_}")
        setattr(self, "__" + name, value)
    return property(getter, setter)

def auto_attr_check(cls):
    new_dct = {}
    for key, value in cls.__dict__.items():
        if isinstance(value, type):
            value = getter_setter_gen(key, value)
        new_dct[key] = value
    # Creates a new class, using the modified dictionary as the class dict:
    return type(cls)(cls.__name__, cls.__bases__, new_dct)

@auto_attr_check
class Foo(object):
    bar = int
    baz = str
    bam = float
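No answer is recorded for this question, so the following is only a sketch of one likely explanation, offered as an assumption: the decorator copies every entry of cls.__dict__ into the new class, including the '__dict__' and '__weakref__' descriptors, and those descriptors belong to the original class rather than to the rebuilt one. Skipping the two keys lets type() create fresh descriptors for the new class:

```python
def getter_setter_gen(name, type_):
    def getter(self):
        return getattr(self, "__" + name)

    def setter(self, value):
        if not isinstance(value, type_):
            raise TypeError(f"{name} attribute must be set to an instance of {type_}")
        setattr(self, "__" + name, value)

    return property(getter, setter)


def auto_attr_check(cls):
    new_dct = {}
    for key, value in cls.__dict__.items():
        if key in ("__dict__", "__weakref__"):
            # Assumed cause of the TypeError: these descriptors are tied to the
            # original class, so do not carry them over to the new one.
            continue
        if isinstance(value, type):
            value = getter_setter_gen(key, value)
        new_dct[key] = value
    # Creates a new class, using the modified dictionary as the class dict:
    return type(cls)(cls.__name__, cls.__bases__, new_dct)


@auto_attr_check
class Foo(object):
    bar = int
    baz = str
    bam = float


one_foo = Foo()
one_foo.bar = 3
print(one_foo.__dict__)  # {'__bar': 3}
```

The type check itself is unchanged: assigning a string to one_foo.bar still raises TypeError.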
I was trying to build something and found this odd behavior. I'm trying to build a class that I can call dict on:
class Test1:
    def __iter__(self):
        return iter([(a, a) for a in range(10)])
obj = Test1()
dict(obj) # returns {0: 0, 1: 1, 2: 2 ...}
Now, in my use case the object has a __getattr__ overload, which is where the problem comes in:
class Test2:
    def __iter__(self):
        return iter([(a, a) for a in range(10)])
    def __getattr__(self, attr):
        raise Exception(f"why are you calling me {attr}")
obj = Test2()
dict(obj) # Exception: why are you calling me keys
The dict function is calling self.keys somewhere, but obviously Test1().keys throws an AttributeError, so it's being handled there somehow. How do I get __iter__ and __getattr__ to play nicely together? Or is there a better way of doing this?
Edit:
I guess raising an AttributeError works
class Test2:
    def __iter__(self):
        return iter([(a, a) for a in range(10)])
    def __getattr__(self, attr):
        raise AttributeError(f"why are you calling me {attr}")
obj = Test2()
dict(obj) # No Exception
You don't actually have to PROVIDE .keys (clearly, or your first example would have failed). You just need to provide the right exception. This works:
class Test2:
    def __iter__(self):
        return iter([(a, a) for a in range(10)])
    def __getattr__(self, attr):
        if attr == 'keys':
            raise AttributeError(f"{type(self).__name__!r} object has no attribute {attr!r}")
        raise Exception(f"why are you calling me {attr}")
obj = Test2()
dict(obj) # No exception: the lookup of keys raises AttributeError, so dict() falls back to __iter__
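The behaviour above follows from how dict() chooses between its two input styles: if the argument has a keys attribute it is read as a mapping (keys() plus obj[key]), otherwise it is iterated as key/value pairs. A minimal sketch, with class names invented for illustration:

```python
class Pairs:
    """No keys attribute: dict() iterates and expects (key, value) pairs."""
    def __iter__(self):
        return iter([(a, a * a) for a in range(3)])


class KeysAndGetitem:
    """Has keys(): dict() calls keys() and then obj[key] for each key."""
    def keys(self):
        return ["x", "y"]

    def __getitem__(self, key):
        return key.upper()


class Recording:
    """Pairs-style object whose __getattr__ records which names get probed."""
    def __init__(self):
        self.probed = []

    def __iter__(self):
        return iter([("a", 1)])

    def __getattr__(self, attr):
        self.probed.append(attr)
        raise AttributeError(attr)


print(dict(Pairs()))            # {0: 0, 1: 1, 2: 4}
print(dict(KeysAndGetitem()))   # {'x': 'X', 'y': 'Y'}

rec = Recording()
print(dict(rec))                # {'a': 1}
print("keys" in rec.probed)     # True
```

So a __getattr__ that raises AttributeError for names it does not handle is enough; nothing has to be special-cased for keys.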
I have a class with a __getitem__() method, which makes it subscriptable like a dictionary. However, when I try to pass it to str.format() I get a TypeError. How can I use a class in Python with the format() function?
>>> class C(object):
        id=int()
        name=str()
        def __init__(self, id, name):
            self.id=id
            self.name=name
        def __getitem__(self, key):
            return getattr(self, key)
>>> d=dict(id=1, name='xyz')
>>> c=C(id=1, name='xyz')
>>>
>>> #Subscription works for both objects
>>> print(d['id'])
1
>>> print(c['id'])
1
>>>
>>> s='{id} {name}'
>>> #format() only works on dict()
>>> print(s.format(**d))
1 xyz
>>> print(s.format(**c))
Traceback (most recent call last):
File "<pyshell#13>", line 1, in <module>
print(s.format(**c))
TypeError: format() argument after ** must be a mapping, not C
As some of the comments mention, you could inherit from dict. The reason it doesn't work as written is that:
If the syntax **expression appears in the function call, the expression must evaluate to a mapping, the contents of which are treated as additional keyword arguments. In the case of a keyword appearing in both expression and as an explicit keyword argument, a TypeError exception is raised.
For it to work you need to implement the Mapping ABC. Something along the lines of this:
from collections.abc import Mapping

class C(Mapping):
    id=int()
    name=str()
    def __init__(self, id, name):
        self.id = id
        self.name = name
    def __iter__(self):
        for x in self.__dict__.keys():
            yield x
    def __len__(self):
        return len(self.__dict__)
    def __getitem__(self, key):
        return self.__dict__[key]
This way you should be able to use s = '{id}{name}'.format(**c)
rather than s = '{id}{name}'.format(**c.__dict__).
You can also use MutableMapping from the collections.abc module if you want to be able to change your instance's attributes like dictionary items. MutableMapping additionally requires implementing __setitem__ and __delitem__.
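A minimal sketch of the MutableMapping variant, assuming (as in the Mapping example) that the instance __dict__ is used as the backing store:

```python
from collections.abc import MutableMapping


class C(MutableMapping):
    def __init__(self, id, name):
        self.id = id
        self.name = name

    # The five abstract methods MutableMapping requires:
    def __getitem__(self, key):
        return self.__dict__[key]

    def __setitem__(self, key, value):
        self.__dict__[key] = value

    def __delitem__(self, key):
        del self.__dict__[key]

    def __iter__(self):
        return iter(self.__dict__)

    def __len__(self):
        return len(self.__dict__)


c = C(id=1, name='xyz')
print('{id} {name}'.format(**c))   # 1 xyz

c['name'] = 'abc'                  # item assignment updates the attribute
print(c.name)                      # abc

del c['id']
print(list(c))                     # ['name']
```

The ABC supplies keys(), items(), get(), update() and the rest on top of these five methods, which is what makes ** unpacking work.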
I am using a bunch class to transform a dict to an object.
class Bunch(object):
    """ Transform a dict to an object """
    def __init__(self, kwargs):
        self.__dict__.update(kwargs)
The problem is, I have a key with a dot in its name ({'test.this':True}).
So when I call:
spam = Bunch({'test.this':True})
dir(spam)
I have the attribute:
['__class__',
'__delattr__',
...
'__weakref__',
'test.this']
But I can't access it:
print(spam.test.this)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-7-ea63f60f74ca> in <module>()
----> 1 print(spam.test.this)
AttributeError: 'Bunch' object has no attribute 'test'
I get an AttributeError.
How can I access this attribute?
You can use getattr:
>>> getattr(spam, 'test.this')
True
Alternatively, you can get the value from the object's __dict__. Use vars to get spam's dict:
>>> vars(spam)['test.this']
True
Implement __getitem__(self, key):
class D():
    def __init__(self, kwargs):
        self.__dict__.update(kwargs)
    def __getitem__(self, key):
        return self.__dict__.get(key)
d = D({"foo": 1, "bar.baz": 2})
print(d["foo"])
print(d["bar.baz"])
Edit:
I don't recommend accessing d.__dict__ directly from a client of a D instance. Client code like this
d = D({"foo": 1, "bar.baz": 2})
print(d.__dict__.get("bar.baz"))
is trying to reach into the underpants of d and requires knowledge about implementation details of D.
A better suggestion would be to avoid dots in attribute names.
And even if we end up with one somehow, it's better to read it using getattr:
getattr(spam, 'test.this')
If we insist on going against convention, this may help:
class Objectify(object):
    def __init__(self, obj):
        for key in obj:
            if isinstance(obj[key], dict):
                # store under the actual key name, not the literal name "key"
                self.__dict__[key] = Objectify(obj[key])
            else:
                self.__dict__[key] = obj[key]

class Bunch(object):
    """ Transform a dict to an object """
    def __init__(self, obj, loop=False):
        for key in obj:
            if isinstance(obj[key], dict):
                self.__dict__[key] = Objectify(obj[key])
            else:
                self.__dict__[key] = obj[key]
spam1 = Bunch({'test': {'this': True}})
print(spam1.test.this)
spam2 = Bunch({'test': {'this': {'nested_this': True}}})
print(spam2.test.this.nested_this)
Note that this does not use test.this as the key. You may want to build a nested dict by iterating through the keys that contain dots.
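Building that nested dict from dotted keys could look roughly like this. The helper name nest_dotted_keys is made up, and the sketch assumes no key is both a leaf and a prefix of another key (for example 'a' together with 'a.b'):

```python
def nest_dotted_keys(flat):
    """Hypothetical helper: turn {'a.b': 1} into {'a': {'b': 1}}."""
    nested = {}
    for dotted, value in flat.items():
        *parents, leaf = dotted.split('.')
        node = nested
        for part in parents:
            # create intermediate levels on demand
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


class Bunch(object):
    """ Transform a (possibly nested) dict to an object """
    def __init__(self, obj):
        for key, value in obj.items():
            self.__dict__[key] = Bunch(value) if isinstance(value, dict) else value


spam = Bunch(nest_dotted_keys({'test.this': True, 'name': 'x'}))
print(spam.test.this)  # True
print(spam.name)       # x
```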
Try spam["test.this"] or spam.get("test.this")
I have a class that converts a dictionary to an object like this
class Dict2obj(dict):
    __getattr__= dict.__getitem__
    def __init__(self, d):
        self.update(**dict((k, self.parse(v))
                           for k, v in d.iteritems()))
    @classmethod
    def parse(cls, v):
        if isinstance(v, dict):
            return cls(v)
        elif isinstance(v, list):
            return [cls.parse(i) for i in v]
        else:
            return v
When I try to make a deep copy of the object I get this error
import copy
my_object = Dict2obj(json_data)
copy_object = copy.deepcopy(my_object)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy.py", line 172, in deepcopy
copier = getattr(x, "__deepcopy__", None)
KeyError: '__deepcopy__'
But if I override the __getattr__ function in the Dict2obj class I was able to do deep copy operation. See example below
class Dict2obj(dict):
    __getattr__= dict.__getitem__
    def __init__(self, d):
        self.update(**dict((k, self.parse(v))
                           for k, v in d.iteritems()))
    def __getattr__(self, key):
        if key in self:
            return self[key]
        raise AttributeError
    @classmethod
    def parse(cls, v):
        if isinstance(v, dict):
            return cls(v)
        elif isinstance(v, list):
            return [cls.parse(i) for i in v]
        else:
            return v
Why do I need to override __getattr__ method in order to do a deepcopy of objects returned by this class?
The issue occurs for your first class because copy.deepcopy tries to call getattr(x, "__deepcopy__", None). The significance of the third argument is that, if the attribute does not exist on the object, getattr() returns the third argument.
This is given in the documentation for getattr() -
getattr(object, name[, default])
Return the value of the named attribute of object. name must be a string. If the string is the name of one of the object’s attributes, the result is the value of that attribute. For example, getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise AttributeError is raised.
This works as follows: if the underlying __getattr__ raises AttributeError and the default argument was provided in the getattr() call, the AttributeError is caught by getattr(), which returns the default argument; otherwise it lets the AttributeError bubble up. Example -
>>> class C:
...     def __getattr__(self,k):
...         raise AttributeError('asd')
...
>>>
>>> c = C()
>>> getattr(c,'a')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 3, in __getattr__
AttributeError: asd
>>> print(getattr(c,'a',None))
None
But in your case, since you directly assign dict.__getitem__ to __getattr__, a name that is not found in the dictionary raises a KeyError, not an AttributeError. Hence it does not get handled by getattr() and your copy.deepcopy() fails.
You should handle the KeyError in your __getattr__ and then raise AttributeError instead. Example -
class Dict2obj(dict):
    def __init__(self, d):
        self.update(**dict((k, self.parse(v))
                           for k, v in d.iteritems()))
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)
    ...
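For reference, a self-contained sketch of the same idea adapted to Python 3. The adaptation is my assumption, not part of the question: items() replaces iteritems(), and the sample data is made up. It shows that deepcopy succeeds once __getattr__ raises AttributeError:

```python
import copy


class Dict2obj(dict):
    def __init__(self, d):
        super().__init__()
        self.update({k: self.parse(v) for k, v in d.items()})

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            # Lets getattr(x, "__deepcopy__", None) fall back to its default
            raise AttributeError(name) from None

    @classmethod
    def parse(cls, v):
        if isinstance(v, dict):
            return cls(v)
        elif isinstance(v, list):
            return [cls.parse(i) for i in v]
        else:
            return v


original = Dict2obj({'user': {'name': 'Sam', 'tags': ['a', 'b']}})
clone = copy.deepcopy(original)
clone.user.tags.append('c')

print(original.user.tags)         # ['a', 'b']
print(clone.user.tags)            # ['a', 'b', 'c']
print(type(clone.user).__name__)  # Dict2obj
```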