pickling dict inherited class is missing internal values - python

I've played around with the code for a bit, and the reason for the failure is that when setting 'wham', the value is converted into another instance of the class TestDict. That works fine as long as I don't try to pickle and unpickle it.
Because if I do, self.test is missing.
Traceback:
Traceback (most recent call last):
  File "test.py", line 30, in <module>
    loads_a = loads(dumps_a)
  File "test.py", line 15, in __setitem__
    if self.test == False:
AttributeError: 'TestDict' object has no attribute 'test'
The code:
from pickle import dumps, loads
class TestDict(dict):
    def __init__(self, test=False, data={}):
        super().__init__(data)
        self.test = test
    def __getitem__(self, k):
        if self.test == False:
            pass
        return dict.__getitem__(self, k)
    def __setitem__(self, k, v):
        if self.test == False:
            pass
        if type(v) == dict:
            super().__setitem__(k, TestDict(False, v))
        else:
            super().__setitem__(k, v)
if __name__ == '__main__':
    a = TestDict()
    a['wham'] = {'bam' : 1}
    b = TestDict(True)
    b['wham'] = {'bam' : 2}
    dumps_a = dumps(a)
    dumps_b = dumps(b)
    loads_a = loads(dumps_a)
    loads_b = loads(dumps_b)
    print(loads_a)
    print(loads_b)
The code works if I don't override __setitem__ and __getitem__, but I want to add extra functionality to those two specific methods.
I've also tried:
class TestDict(dict):
    __module__ = os.path.splitext(os.path.basename(__file__))[0]
Which sort of worked, as long as I don't nest a TestDict within a TestDict, meaning I don't get to override __setitem__ and __getitem__ in sub-parts of the dictionary.

Define a __reduce__ method:
class TestDict(dict):
    ...
    def __reduce__(self):
        return type(self), (self.test, dict(self))
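For reference, here is a minimal, self-contained sketch of how the __reduce__ approach could fit into the class from the question. The `data=None` default and the trimmed-down `__setitem__` are simplifications made for this sketch, not part of the original code.

```python
from pickle import dumps, loads


class TestDict(dict):
    def __init__(self, test=False, data=None):
        # dict.__init__ copies the items without calling our __setitem__.
        super().__init__(data or {})
        self.test = test

    def __setitem__(self, k, v):
        if self.test == False:  # requires self.test to exist already
            pass
        if type(v) == dict:
            v = TestDict(False, v)  # wrap plain dicts, as in the question
        super().__setitem__(k, v)

    def __reduce__(self):
        # Ask pickle to rebuild the object by calling
        # TestDict(self.test, <plain dict of the items>), so __init__ runs
        # and self.test exists before anything else touches the instance.
        return type(self), (self.test, dict(self))


b = TestDict(True)
b['wham'] = {'bam': 2}
loads_b = loads(dumps(b))

print(loads_b, loads_b.test)           # {'wham': {'bam': 2}} True
print(type(loads_b['wham']).__name__)  # TestDict
```

Because the nested values are themselves TestDict instances, each one is pickled through the same __reduce__ and comes back with its own test value.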

The problem is that pickle doesn't call the object's __init__ method when unpickling, so the self.test attribute does not exist yet at the moment pickle sets the items of the dictionary.
Apparently, restoring the instance attributes is staged after setting the items of the dictionary.
One way to solve it is to add a class-level attribute that will be overridden in the instances:
class TestDict(dict):
    test = False
    ...
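The ordering described above can be observed directly. The following sketch records which hooks pickle calls while loading; the `calls` list and the `__setstate__` method are additions for illustration only, and the class is a cut-down version of the one in the question.

```python
from pickle import dumps, loads

calls = []  # records the order in which pickle touches the object


class TestDict(dict):
    test = False  # class-level default, visible before __dict__ is restored

    def __init__(self, test=False):
        super().__init__()
        self.test = test

    def __setitem__(self, k, v):
        # Log the value of self.test seen while the item is being set.
        calls.append(('setitem', k, self.test))
        super().__setitem__(k, v)

    def __setstate__(self, state):
        # Called by pickle to restore the instance attributes.
        calls.append(('setstate', state))
        self.__dict__.update(state)


b = TestDict(True)
b['wham'] = 1
calls.clear()  # ignore the call made while building b

loads_b = loads(dumps(b))
print(calls)         # [('setitem', 'wham', False), ('setstate', {'test': True})]
print(loads_b.test)  # True
```

Note the caveat this exposes: while the items are being restored, self.test still has the class-level default, not the value the instance was pickled with. If __setitem__ needs the real value at that point, a __reduce__ that goes through __init__ is the safer choice.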

Related

pickle and dill can't load objects with overridden __hash__ function (AttributeError)

In the next few lines of code I'll replicate on a smaller scale what's happening with my program.
Class A must store a dictionary with keys that have type A (values can be any type to replicate the error).
class A:
    def __init__(self, name):
        self.name = name
        self.dic = dict() # it'll be a mapping from A objects to <?> objects
    def __repr__(self): return self.name
    def __hash__(self): return hash(self.name)
The same is needed with class B. Besides, class B is a more complex object that takes a while to build, and thus I need to store it locally and load it when I need it.
import pickle
class B:
    def __init__(self, dic):
        self.dic = dic # it'll be a mapping from A objects to <?> objects
    def __repr__(self): return str(self.dic)
    # saving the model with pickle
    def save(self, filename):
        with open("objects/" + filename + ".fan", "wb+") as filehandler:
            pickle.dump(self, filehandler)
    # loading the model with pickle
    @staticmethod
    def load(filename):
        with open("objects/" + filename + ".fan", "rb") as filehandler:
            return pickle.load(filehandler)
Let's instantiate some objects:
# instantiate two A objects
obj1 = A("name")
obj2 = A("name2")
# fill their dic field
obj1.dic[obj2] = 0
obj2.dic[obj1] = 1
# create a dictionary object with type(key) = A
# and instantiate a B object with that
dic = {obj1: (0, 0), obj2: (1, 4)}
obj3 = B(dic)
Now if I try to dump and load B with pickle/dill:
obj3.save("try") # all goes well
B.load("try") # nothing goes well
I get the following error:
Traceback (most recent call last):
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 40, in <module>
    B.load("try")
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 29, in load
    return pickle.load(filehandler)
  File "C:\Users\SimoneZannini\Documents\Fantacalcio\try.py", line 11, in __hash__
    def __hash__(self): return hash(self.name)
AttributeError: 'A' object has no attribute 'name'
Process finished with exit code 1
I know there's a similar problem that was solved, but this isn't exactly my case and the __getstate__ and __setstate__ workaround doesn't seem to work. I think this is due to A class having a dict object inside of it, but it's just an assumption.
Thanks in advance for your time.
Two things:
1. I'm not sure exactly why the error occurs, but you can avoid it by declaring name as a class member variable, like so:
class A:
    name = ""
    def __init__(self, name):
        self.name = name
        self.dic = dict() # it'll be a mapping from A objects to <?> objects
    def __repr__(self): return self.name
    def __hash__(self): return hash(self.name)
2. Objects that keep a reference to other objects of the same class are often (though not always) an indicator of sub-optimal design. Why keep a dict inside each A when you could simply keep a dict (or dicts) outside the class?
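On the first point, a likely explanation is that obj1 and obj2 reference each other through their dic fields, so while pickle is rebuilding one of them it has to insert the other, still half-built, object as a dictionary key. Inserting a key calls __hash__, and at that moment name has not been restored yet. A class-level name = "" silences the error, but the key is then hashed with the placeholder rather than its real name. One possible alternative (a sketch, not taken from the original answer) is a __reduce__ method that passes name as a constructor argument, so it is set before the object can be used as a key:

```python
import pickle


class A:
    def __init__(self, name):
        self.name = name
        self.dic = dict()

    def __repr__(self):
        return self.name

    def __hash__(self):
        return hash(self.name)

    def __reduce__(self):
        # Rebuild as A(name) first; dic is restored afterwards as state.
        return A, (self.name,), {'dic': self.dic}


obj1 = A("name")
obj2 = A("name2")
obj1.dic[obj2] = 0
obj2.dic[obj1] = 1

loaded = pickle.loads(pickle.dumps({obj1: (0, 0), obj2: (1, 4)}))
by_name = {key.name: key for key in loaded}
new1, new2 = by_name["name"], by_name["name2"]
print(new1.dic[new2], new2.dic[new1])  # 0 1
```

The lookups work here because the restored objects are used as keys; looking up with the original obj1 or obj2 would additionally require an __eq__ method.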
To address the comment:
"Now I get KeyError with the dic field of A, something that doesn't happen without dumping and loading the object B"
Consider the following:
class C:
    def __hash__(self):
        return 1
c1 = C()
c2 = C()
mydict = {c1:1}
print(mydict[c1]) # 1
print(mydict[c2]) # key error
When you unpickle a B, its self.dic contains new A objects (not the original ones), so when you try to use the old A objects as keys in the new B's dic, the lookup fails. Again, you could work around this, but I think redesigning your app will be easier in the long run. For it to work, you will need to override __eq__() in A:
class D:
    def __hash__(self):
        return 1
    def __eq__(self, other):
        return True
d1 = D()
d2 = D()
mydict = {d1:1}
print(mydict[d1]) # 1
print(mydict[d2]) # 1
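Applied to a class like A, a more realistic __eq__ would compare names rather than always return True. The following is a sketch under that assumption; the dic field is left out to keep the example short.

```python
import pickle


class A:
    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return hash(self.name)

    def __eq__(self, other):
        # Two A objects count as the same key when their names match.
        if not isinstance(other, A):
            return NotImplemented
        return self.name == other.name


original = A("name")
restored = pickle.loads(pickle.dumps({original: (0, 0)}))

print(original in restored)  # True
print(restored[A("name")])   # (0, 0)
```

With both __hash__ and __eq__ based on name, the original object, the unpickled copy and any freshly built A("name") all address the same dictionary entry.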

How to add attribute to arbitrary object python?

For a project I'm working on, I want to be able to associate a name with an object. The way I would like to do it is to set the .name attribute of the object to the name I want. What I really need is a function that takes an instance of an object and returns something that is identical in every way but with a .name attribute. The problem is that I don't know what type of data the object will be ahead of time, so I can't use subclassing, for example.
Every method I've tried has hit a problem. Trying to give it a .name attribute directly doesn't work, for example:
>>> cats = ['tabby', 'siamese']
>>> cats.name = 'cats'
Traceback (most recent call last):
  File "<pyshell#197>", line 1, in <module>
    cats.name = 'cats'
AttributeError: 'list' object has no attribute 'name'
Using setattr has the same problem.
I've tried creating a new class that on init copies all attributes from the instance and also has a .name attribute, but this doesn't work either. If I try:
class NamedThing:
    def __init__(self, name, thing):
        thing_dict = {#not all types have a .__dict__ method
            name: getattr(thing, name) for name in dir(thing)
            }
        self.__dict__ = thing_dict
        self.name = name
It copies over the dict without a problem, but for some reason unless I directly call the new methods, python fails to find them, so the object loses all of its functionality. For example:
>>> cats = ['tabby', 'siamese']
>>> named_thing_cats = NamedThing('cats', cats)
>>> named_thing_cats.__repr__()#directly calling .__repr__()
"['tabby', 'siamese']"
>>> repr(named_thing_cats)#for some reason python does not call the new repr method
'<__main__.NamedThing object at 0x0000022814C1A670>'
>>> hasattr(named_thing_cats, '__iter__')
True
>>> for cat in named_thing_cats:
    print(cat)
Traceback (most recent call last):
  File "<pyshell#215>", line 1, in <module>
    for cat in named_thing_cats:
TypeError: 'NamedThing' object is not iterable
I've also tried setting the type and attributes by setting class directly:
class NamedThing:
    def __init__(self, name, thing):
        thing_dict = {#not all types have a .__dict__ method
            name: getattr(thing, name) for name in dir(thing)
            }
        self.__class__ = type('NamedThing', (type(thing),), thing_dict)
        self.name = name
But this runs into a problem depending on what type thing is:
>>> cats = ['tabby', 'siamese']
>>> named_thing_cats = NamedThing('cats', cats)
Traceback (most recent call last):
  File "<pyshell#217>", line 1, in <module>
    named_thing_cats = NamedThing('cats', cats)
  File "C:/Users/61490/Documents/Python/HeirachicalDict/moduleanalyser.py", line 12, in __init__
    self.__class__ = type('NamedThing', (type(thing),), thing_dict)
TypeError: __class__ assignment: 'NamedThing' object layout differs from 'NamedThing'
I'm really stuck, help would be great
What you want is called an object proxy. This is some pretty sophisticated stuff, as you're getting into the data model of Python and manipulating some pretty fundamental dunder (double underscore) methods in interesting ways.
class Proxy:
    def __init__(self, proxied):
        object.__setattr__(self, '_proxied', proxied)
    def __getattribute__(self, name):
        try:
            return object.__getattribute__(self, name)
        except AttributeError:
            p = object.__getattribute__(self, '_proxied')
            return getattr(p, name)
    def __setattr__(self, name, value):
        p = object.__getattribute__(self, '_proxied')
        if hasattr(p, name):
            setattr(p, name, value)
        else:
            # use the object implementation; setattr(self, ...) would call
            # this method again and recurse forever
            object.__setattr__(self, name, value)
    def __getitem__(self, key):
        p = object.__getattribute__(self, '_proxied')
        return p[key]
    def __setitem__(self, key, value):
        p = object.__getattribute__(self, '_proxied')
        p[key] = value
    def __delitem__(self, key):
        p = object.__getattribute__(self, '_proxied')
        del p[key]
The most obvious thing going on here is that internally this class has to use the object implementation of the dunders to avoid recursing infinitely. The class holds a reference to a proxied object. When you get an attribute, it looks on the proxy itself first and falls back to the proxied object. When you set an attribute, it checks whether the proxied object already has that attribute and, if so, sets it there; otherwise it sets the attribute on the proxy itself. For indexing, as with a list, it acts directly on the proxied object, since the Proxy itself doesn't support indexing.
If you need to use this in production, there's a package called wrapt you should probably look at instead.
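One detail worth spelling out: implicit calls such as repr(x), len(x) or a for loop look up special methods on the object's type, not on the instance, which is why copying methods into an instance's __dict__ has no effect on them. A proxy therefore has to define each special method it wants to forward on the class itself. The following is a simplified sketch of that idea; NamedProxy is a hypothetical name, and it uses __getattr__ instead of __getattribute__ to keep the code short.

```python
class NamedProxy:
    """Hypothetical helper: wraps any object and adds a .name attribute."""

    def __init__(self, name, proxied):
        self._proxied = proxied
        self.name = name

    def __getattr__(self, attr):
        # Only called when normal lookup on the proxy itself fails.
        return getattr(object.__getattribute__(self, '_proxied'), attr)

    # Special methods are looked up on the type, so each one that should
    # work implicitly must be forwarded explicitly here.
    def __repr__(self):
        return repr(self._proxied)

    def __iter__(self):
        return iter(self._proxied)

    def __len__(self):
        return len(self._proxied)

    def __getitem__(self, key):
        return self._proxied[key]


cats = NamedProxy('cats', ['tabby', 'siamese'])
cats.append('persian')     # forwarded to the list via __getattr__
print(cats.name)           # cats
print(repr(cats))          # ['tabby', 'siamese', 'persian']
print([c for c in cats])   # ['tabby', 'siamese', 'persian']
```

Only the special methods listed in the class are forwarded; anything else (comparison, arithmetic, context managers and so on) would need its own method.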
Why not just create a __iter__ magic method with yield from:
class NamedThing():
    def __init__(self, name, thing):
        self.thing = thing
        self.name = name
    def __iter__(self):
        yield from self.thing
cats = ['tabby', 'siamese']
named_thing_cats = NamedThing('cats', cats)
for cat in named_thing_cats:
    print(cat)
Output:
tabby
siamese
Does this work?
class Thingy(list):
    def __init__(self, name, thing):
        list.__init__(self, thing)
        self.name = name
cats = Thingy('cats', ['tabby', 'siamese'])
print(cats.name) # shows 'cats'
for cat in cats:
    print(cat) # shows tabby, siamese
Or you could do:
class Thingy:
    def __init__(self, name, thing):
        self.thing = thing
        self.name = name

Way to mask functions on Python Object

I have a class that inherits from OrderedDict, but I don't know if this is the right way to accomplish what I need.
I would like the class to support both access styles: the JavaScript-like '.' notation, as in obj.<property>, and item access, as in obj['myproperty']. However, I want to hide all the keys() and get() functions. The inheritance model provides good functionality, but it is cluttering up the object with additional methods that are not really needed.
Is it possible to get the dictionary behavior without all the other functions coming along?
For this discussion, let's assume my class is this:
from collections import OrderedDict
from six.moves.urllib import request
import json
class MyClass(OrderedDict):
    def __init__(self, url):
        super(MyClass, self).__init__(url=url)
        self._url = url
        self.init()
    def init(self):
        # call the url and load the json
        req = request.Request(self._url)
        res = json.loads(request.urlopen(req).read())
        for k,v in res.items():
            setattr(self, k, v)
        self.update(res)
        self.__dict__.update(res)
if __name__ == "__main__":
    url = "https://sampleserver5.arcgisonline.com/ArcGIS/rest/services?f=json"
    props = MyClass(url=url)
    props.currentVersion
Is there another way to approach this dilemma?
Thanks
If all you want is for x['a'] to work the same way as x.a, without any other dictionary functionality, then don't inherit from dict or OrderedDict. Instead, just forward the key/index operations (__getitem__, __setitem__ and __delitem__) to attribute operations:
class MyClass(object):
    def __getitem__(self,key):
        try: #change the error to a KeyError if the attribute doesn't exist
            return getattr(self,key)
        except AttributeError:
            pass
        raise KeyError(key)
    def __setitem__(self,key,value):
        setattr(self,key,value)
    def __delitem__(self,key):
        delattr(self,key)
As an added bonus, because Python looks up these special methods on the class rather than among the instance variables, item access doesn't break if you use the same names:
x = MyClass()
x['__getitem__'] = 1
print(x.__getitem__) #still works
print(x["__getitem__"]) #still works
The only time it will break is when trying to use __dict__ since that is where the instance variables are actually stored:
>>> x = MyClass()
>>> x.a = 4
>>> x.__dict__ = 1 #stops you right away
Traceback (most recent call last):
  File "<pyshell#36>", line 1, in <module>
    x.__dict__ = 1
TypeError: __dict__ must be set to a dictionary, not a 'int'
>>> x.__dict__ = {} #this is legal but removes all the previously stored values!
>>> x.a
Traceback (most recent call last):
  File "<pyshell#38>", line 1, in <module>
    x.a
AttributeError: 'MyClass' object has no attribute 'a'
In addition you can still use the normal dictionary methods by using vars():
x = MyClass()
x.a = 4
x['b'] = 6
for k,v in vars(x).items():
    print((k,v))
#output
('b', 6)
('a', 4)
>>> vars(x)
{'b': 6, 'a': 4}
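Putting the pieces together, a class along these lines could replace the OrderedDict subclass from the question. This is only a sketch: Record is a hypothetical name, and the data is passed in as a made-up dict rather than fetched from the URL.

```python
class Record(object):
    """Hypothetical stand-in for MyClass, filled from a dict instead of a URL."""

    def __init__(self, data):
        # Store every key as an instance attribute.
        for key, value in data.items():
            setattr(self, key, value)

    def __getitem__(self, key):
        try:
            return getattr(self, key)
        except AttributeError:
            raise KeyError(key) from None

    def __setitem__(self, key, value):
        setattr(self, key, value)

    def __delitem__(self, key):
        try:
            delattr(self, key)
        except AttributeError:
            raise KeyError(key) from None


# Made-up sample data, standing in for the decoded JSON response.
props = Record({'currentVersion': 1, 'folders': []})
print(props.currentVersion)     # 1
print(props['currentVersion'])  # 1
props['owner'] = 'me'
print(props.owner)              # me
print(hasattr(props, 'keys'))   # False
```

The object exposes both access styles but none of the dictionary methods, which is the behavior the question asks for.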

Python: Lambda function as a namedtuple object?

I've written a program in which I have a fairly typical class. In this class I create multiple namedtuple objects. The namedtuple objects hold many items, which all work fine, except for lambda functions that I try to bind to it. Below is a stripped down example and the error message that I am receiving. Hope someone knows why this is going wrong. Thanks in advance!
FILE: test.py
from equations import *
from collections import namedtuple
class Test:
    def __init__(self, nr):
        self.obj = self.create(nr)
        print self.obj.name
        print self.obj.f1(2)
    def create(self, nr):
        obj = namedtuple("struct", "name f1 f2")
        obj.name = str(nr)
        (obj.f1, obj.f2) = get_func(nr)
        return obj
test = Test(1)
FILE: equations.py
def get_func(nr):
    return (lambda x: test1(x), lambda x: test2(x))
def test1(x):
    return (x/1)
def test2(x):
    return (x/2)
ERROR:
Traceback (most recent call last):
  File "test.py", line 17, in <module>
    test = Test(1)
  File "test.py", line 8, in __init__
    print self.obj.f1(2)
TypeError: unbound method <lambda>() must be called with struct instance as first argument (got int instance instead)
The namedtuple() constructor returns a class, not an instance. You are adding methods to that class. As such, your lambdas must accept a self argument.
In any case, you should create instances of the named tuple class you created. If you don't want to give your lambdas a self first argument, adding them to the instance you then created would work fine:
from equations import *
from collections import namedtuple
Struct = namedtuple("struct", "name f1 f2")
class Test:
    def __init__(self, nr):
        self.obj = self.create(nr)
        print self.obj.name
        print self.obj.f1(2)
    def create(self, nr):
        obj = Struct(str(nr), *get_func(nr))
        return obj
test = Test(1)
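The code above is Python 2. For readers on Python 3, the same idea can be sketched in a single file as follows; get_func here is a local stand-in for the one in equations.py, with the lambdas inlined.

```python
from collections import namedtuple

# namedtuple() returns a class; calling that class creates an instance.
Struct = namedtuple("Struct", "name f1 f2")


def get_func(nr):
    # Stand-in for equations.get_func from the question.
    return (lambda x: x / 1, lambda x: x / 2)


# The lambdas are stored as ordinary field values on the instance,
# so they are called without any implicit self argument.
obj = Struct(str(1), *get_func(1))
print(obj.name)   # 1
print(obj.f1(2))  # 2.0
print(obj.f2(2))  # 1.0
```

In Python 3 the / operator performs true division, which is why the results are floats.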

Setting Property via a String

I'm trying to set a Python class property outside of the class via the setattr(self, item, value) function.
class MyClass:
    def getMyProperty(self):
        return self.__my_property
    def setMyProperty(self, value):
        if value is None:
            value = ''
        self.__my_property = value
    my_property = property( getMyProperty, setMyProperty )
And in another script, I create an instance and want to specify the property and let the property mutator handle the simple validation.
myClass = MyClass()
new_value = None
# notice the property in quotes
setattr(myClass, 'my_property', new_value)
The problem is that it doesn't appear to be calling the setMyProperty(self, value) mutator. For a quick test to verify that it doesn't get called, I change the mutator to:
    def setMyProperty(self, value):
        raise ValueError('WTF! Why are you not being called?')
        if value is None:
            value = ''
        self.__my_property = value
I'm fairly new to Python, and perhaps there's another way to do what I'm trying to do, but can someone explain why the mutator isn't being called when setattr(self, item, value) is called?
Is there another way to set a property via a string? I need the validation inside the mutator to be executed when setting the property value.
Works for me:
>>> class MyClass(object):
...     def get(self): return 10
...     def setprop(self, val): raise ValueError("hax%s"%str(val))
...     prop = property(get, setprop)
...
>>> i = MyClass()
>>> i.prop =4
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in setprop
ValueError: hax4
>>> i.prop
10
>>> setattr(i, 'prop', 12)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in setprop
ValueError: hax12
The code you pasted seems to do the same as mine, except that my class inherits from object. In Python 2 (2.7 included) a class is only new-style if it inherits from object explicitly; it is in Python 3 that every class does so automatically. Try that and see if it helps.
To make it even clearer: try just doing myClass.my_property = 4. Does that raise an exception? If not then it's an issue with inheriting from object - properties only work for new-style classes, i.e. classes that inherit from object.
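For comparison, here is a sketch of the same class written for Python 3, where every class is new-style and setattr with a string name goes through the property setter. The decorator syntax and the single-underscore `_my_property` attribute are choices made for this sketch, not part of the original code.

```python
class MyClass:
    def __init__(self):
        self._my_property = ''

    @property
    def my_property(self):
        return self._my_property

    @my_property.setter
    def my_property(self, value):
        # Simple validation, as in the question: None becomes ''.
        if value is None:
            value = ''
        self._my_property = value


obj = MyClass()
setattr(obj, 'my_property', None)  # goes through the setter
print(repr(obj.my_property))       # ''
setattr(obj, 'my_property', 'abc')
print(obj.my_property)             # abc
```

In Python 2 the same code needs `class MyClass(object):` for the setter to be invoked.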
