Cannot unpickle Exception subclass - python

Simplified version of "Why is my custom exception unpickling failing?".
I am trying to pickle a 'simple' exception subclass. It pickles OK, but when unpickling it falls over:
import pickle

class ABError(Exception):
    def __init__(self, a, b):
        self.a = a
        self.b = b

ab_err = ABError("aaaa", "bbbb")
pickled = pickle.dumps(ab_err)
original = pickle.loads(pickled)  # Fails
Error:
Traceback (most recent call last):
  File "p.py", line 12, in <module>
    original = pickle.loads(pickled)  # Fails
  File "/usr/lib/python2.7/pickle.py", line 1388, in loads
    return Unpickler(file).load()
  File "/usr/lib/python2.7/pickle.py", line 864, in load
    dispatch[key](self)
  File "/usr/lib/python2.7/pickle.py", line 1139, in load_reduce
    value = func(*args)
TypeError: __init__() takes exactly 3 arguments (1 given)
An earlier comment suggested the issue is because the built-in Exception class supplies a __setstate__() method. However, it's not clear to me whether this is expected behaviour; it certainly seems surprising, since doing the same thing with a subclass of object works fine.

The BaseException class defines a custom __reduce__ method in exceptions.c, which returns the list of arguments to pass to __init__. The exact code is:

if (self->args && self->dict)
    return PyTuple_Pack(3, Py_TYPE(self), self->args, self->dict);
else
    return PyTuple_Pack(2, Py_TYPE(self), self->args);
According to the __reduce__ documentation:
- the first item of the tuple is the callable to invoke; here, that will be the exception class.
- the second item is the tuple of arguments to pass to the callable; here, that will be self.args.
- the third item is a dict to merge into self.__dict__.
So, from this, BaseException.__reduce__ makes unpickling invoke the exception's constructor with the given args.
You have two options: either override __reduce__, or put the required arguments in self.args, either directly or by letting the parent class do it:
import pickle

class ABError(Exception):
    def __init__(self, a, b):
        self.a = a
        self.b = b
        # self.args = (a, b)
        # maybe better, let the base class's __init__ do it =>
        super(ABError, self).__init__(a, b)

ab_err = ABError("aaaa", "bbbb")
pickled = pickle.dumps(ab_err)
original = pickle.loads(pickled)  # no longer fails
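Alternatively, a minimal sketch of the first option, overriding __reduce__ yourself (same semantics, just spelled out by hand):

import pickle

class ABError(Exception):
    def __init__(self, a, b):
        self.a = a
        self.b = b

    def __reduce__(self):
        # Tell pickle to call ABError(a, b) when unpickling.
        return (self.__class__, (self.a, self.b))

ab_err = ABError("aaaa", "bbbb")
original = pickle.loads(pickle.dumps(ab_err))  # works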
Note that the original issue comes from the rather naive way BaseException's pickle handling works. It is fixed in recent Python 3 releases; your question's original code works fine on Python 3.5, for instance.


Python: make subclass of generic class generic by default

I want to be able to define what the contents of a subclass of a subclass of typing.Iterable have to be. Type hints are critical for this project, so I have to find a working solution.
Here is a code snippet showing what I've already tried and what I want:
# The code I'm writing:
from typing import TypeVar, Iterable

T = TypeVar('T')

class Data:
    pass

class GeneralPaginatedCursor(Iterable[T]):
    """
    If this type of cursor is used by an EDR, a specific implementation has to be created.
    Handles paginated lists; exposes hooks to simplify retrieval and parsing of paginated data.
    """
    # Implementation
    pass

###########
# This part is supposed to be written by different developers in a different place:
class PaginatedCursor(GeneralPaginatedCursor):
    pass

def foo() -> GeneralPaginatedCursor[Data]:
    """
    Works great
    """
    pass

def bar() -> PaginatedCursor[Data]:
    """
    Raises
    Traceback (most recent call last):
      ...
      def bar(self) -> PaginatedCursor[Data]:
      File "****\Python\Python38-32\lib\typing.py", line 261, in inner
        return func(*args, **kwds)
      File "****\Python\Python38-32\lib\typing.py", line 894, in __class_getitem__
        _check_generic(cls, params)
      File "****\Python\Python38-32\lib\typing.py", line 211, in _check_generic
        raise TypeError(f"{cls} is not a generic class")
    """
    pass
I don't want to leave it to other developers in the future to remember to inherit from Iterable, because if someone misses it, everything will break.
I found the exact same issue here:
https://github.com/python/cpython/issues/82640
But there is no answer.
The only requirement is that GeneralPaginatedCursor define __iter__ to return an iterator (namely, something with a __next__ method).
The error you see occurs because, since GeneralPaginatedCursor is generic in T, PaginatedCursor should be declared generic as well:
class PaginatedCursor(GeneralPaginatedCursor[T]):
    pass
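For completeness, a quick self-contained check of the fix under the question's names (the __iter__ stub is only there so the class is concrete):

from typing import Iterable, TypeVar

T = TypeVar('T')

class Data:
    pass

class GeneralPaginatedCursor(Iterable[T]):
    def __iter__(self):
        return iter([])  # stub implementation

class PaginatedCursor(GeneralPaginatedCursor[T]):  # parameterize the base
    pass

def bar() -> PaginatedCursor[Data]:  # no longer raises TypeError
    pass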

multiprocessing and modules

I am attempting to use multiprocessing to call a derived-class member function defined in a different module. There seem to be several questions dealing with calling class methods from the same module, but none from different modules. For example, suppose I have the following structure:
main.py
multi/
    __init__.py (empty)
    base.py
    derived.py
main.py
from multi.derived import derived
from multi.base import base

if __name__ == '__main__':
    base().multiFunction()
    derived().multiFunction()
base.py
import multiprocessing

# The following two functions wrap calling a class method
def wrapPoolMapArgs(classInstance, functionName, argumentLists):
    className = classInstance.__class__.__name__
    return zip([className] * len(argumentLists), [functionName] * len(argumentLists), [classInstance] * len(argumentLists), argumentLists)

def executeWrappedPoolMap(args, **kwargs):
    classType = eval(args[0])
    funcType = getattr(classType, args[1])
    funcType(args[2], args[3:], **kwargs)

class base:
    def multiFunction(self):
        mppool = multiprocessing.Pool()
        mppool.map(executeWrappedPoolMap, wrapPoolMapArgs(self, 'method', range(3)))

    def method(self, args):
        print "base.method: " + args.__str__()
derived.py
from base import base

class derived(base):
    def method(self, args):
        print "derived.method: " + args.__str__()
Output
base.method: (0,)
base.method: (1,)
base.method: (2,)
Traceback (most recent call last):
  File "e:\temp\main.py", line 6, in <module>
    derived().multiFunction()
  File "e:\temp\multi\base.py", line 15, in multiFunction
    mppool.map(executeWrappedPoolMap, wrapPoolMapArgs(self, 'method', range(3)))
  File "C:\Program Files\Python27\lib\multiprocessing\pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "C:\Program Files\Python27\lib\multiprocessing\pool.py", line 567, in get
    raise self._value
NameError: name 'derived' is not defined
I have tried fully qualifying the class name in the wrapPoolMapArgs function, but that just gives the same error, saying multi is not defined.
Is there someway to achieve this, or must I restructure to have all classes in the same package if I want to use multiprocessing with inheritance?
This is almost certainly caused by the ridiculous eval-based approach to dynamically invoking specific code.
In executeWrappedPoolMap (in base.py), you convert the str name of a class to the class itself with classType = eval(args[0]). But eval is executed in the scope of executeWrappedPoolMap, which is in base.py, and it can't find derived (because that name doesn't exist in base.py).
Stop passing the name and pass the class object itself, i.e. classInstance.__class__ instead of classInstance.__class__.__name__; multiprocessing will pickle it for you, and you can use it directly on the other end instead of using eval (which is nearly always wrong; it's a code smell of the strongest sort).
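A minimal sketch of that change, keeping the question's helper names (note that multiprocessing pickles module-level classes by reference, so derived just needs to be importable in the worker, which it is):

# base.py (sketch): pass the class object instead of its name
def wrapPoolMapArgs(classInstance, functionName, argumentLists):
    cls = classInstance.__class__  # the class itself, not a string
    return [(cls, functionName, classInstance, args) for args in argumentLists]

def executeWrappedPoolMap(args, **kwargs):
    funcType = getattr(args[0], args[1])  # args[0] is already the class; no eval
    funcType(args[2], args[3:], **kwargs)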
BTW, the reason the traceback isn't super helpful is that the exception is raised in the worker, caught, pickled, sent back to the main process, and re-raised there. The traceback you see is from that re-raise, not from where the NameError actually occurred (the eval line).

What's the exact usage of __reduce__ in Pickler

I know that in order to be picklable, a class has to override the __reduce__ method, and that it has to return a string or a tuple.
How does this function work? What is the exact usage of __reduce__? When will it be used?
When you try to pickle an object, there might be some properties that don't serialize well. One example of this is an open file handle; pickle won't know how to handle such an object and will throw an error.
You can tell the pickle module how to handle these types of objects natively within the class itself. Let's see an example of an object with a single property, an open file handle:
import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # An open file in write mode
        self.some_file_i_have_opened = open(file_path, 'wb')

my_test = Test()

# Now, watch what happens when we try to pickle this object:
pickle.dumps(my_test)
It should fail and give a traceback:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  --- snip snip a lot of lines ---
  File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/copy_reg.py", line 70, in _reduce_ex
    raise TypeError, "can't pickle %s objects" % base.__name__
TypeError: can't pickle file objects
However, had we defined a __reduce__ method in our Test class, pickle would have known how to serialize this object:
import pickle

class Test(object):
    def __init__(self, file_path="test1234567890.txt"):
        # Used later in __reduce__
        self._file_name_we_opened = file_path
        # An open file in write mode
        self.some_file_i_have_opened = open(self._file_name_we_opened, 'wb')

    def __reduce__(self):
        # We return a tuple of the class to call
        # and the arguments to pass when re-creating the object.
        return (self.__class__, (self._file_name_we_opened,))

my_test = Test()
saved_object = pickle.dumps(my_test)

# Just print the representation of the string of the object,
# because it contains newlines.
print(repr(saved_object))
This should give you something like: "c__main__\nTest\np0\n(S'test1234567890.txt'\np1\ntp2\nRp3\n.", which can be used to recreate the object with open file handles:
print(vars(pickle.loads(saved_object)))
In general, the __reduce__ method needs to return a tuple with at least two elements:
- A callable to invoke; in this case, self.__class__.
- A tuple of arguments to pass to that callable. In the example it's a single string, the path of the file to open.
Consult the docs for a detailed explanation of what else the __reduce__ method can return.
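For instance, a third element (an illustrative sketch with a hypothetical Counter class, not from the question): a state dict that is merged into the new object's __dict__ (or passed to __setstate__ if one is defined), which lets attributes that __init__ doesn't restore survive the round trip:

import pickle

class Counter(object):
    def __init__(self, file_path="test1234567890.txt"):
        self._file_name_we_opened = file_path
        self.some_file_i_have_opened = open(file_path, 'wb')
        self.count = 0  # mutable state that __init__ always resets

    def __reduce__(self):
        # (callable, args, state): the state dict restores `count`
        # even though __init__ re-creates the object with count = 0.
        return (self.__class__,
                (self._file_name_we_opened,),
                {'count': self.count})

c = Counter()
c.count = 5
restored = pickle.loads(pickle.dumps(c))
print(restored.count)  # -> 5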

Python: dynamically add attributes to new-style class/obj

Can I dynamically add attributes to instances of a new-style class (one that derives from object)?
Details:
I'm working with an instance of sqlite3.Connection. Simply extending the class isn't an option because I don't get the instance by calling a constructor; I get it by calling sqlite3.connect().
Building a wrapper wouldn't save me much of the bulk of the code I'm writing.
Python 2.7.1
Edit
Right answers, all. But I'm still not reaching my goal: instances of sqlite3.Connection bar my attempts to set attributes in the following ways (as do instances of object itself); I always get an AttributeError:
>>> conn = sqlite3.connect([filepath])
>>> conn.a = 'foo'
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    conn.a = 'foo'
AttributeError: 'object' object has no attribute 'a'
>>> conn.__setattr__('a', 'foo')
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    conn.__setattr__('a', 'foo')
AttributeError: 'object' object has no attribute 'a'
Help?
Yes, unless the class uses __slots__, prevents attribute writing by overriding __setattr__, is an internal Python class, or is a Python class implemented natively (usually in C).
You can always try setting an attribute: except for seriously weird __setattr__ implementations, assigning an attribute to an instance of one of the problematic types mentioned above will raise an AttributeError.
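For example, a class that uses __slots__ rejects new attribute names outright (a quick illustration with a hypothetical Slotted class):

class Slotted(object):
    __slots__ = ('x',)  # only 'x' may ever be assigned

s = Slotted()
s.x = 1  # fine
s.y = 2  # AttributeError: 'Slotted' object has no attribute 'y'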
In these cases, you'll have to use a wrapper, like this:
class AttrWrapper(object):
    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __getattr__(self, n):
        return getattr(self._wrapped, n)

conn = AttrWrapper(sqlite3.connect(filepath))
Simple experimentation:
In []: class Tst(object): pass
   ..:
In []: t = Tst()
In []: t.attr = 'is this valid?'
In []: t.attr
Out[]: 'is this valid?'
So, indeed it seems to be possible to do that.
Update:
But from the documentation: "SQLite is a C library that ...", so it seems that you really need to wrap it.
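As a quick check of that approach (using the AttrWrapper class from the first answer; ':memory:' is just a throwaway database):

import sqlite3

conn = AttrWrapper(sqlite3.connect(':memory:'))
conn.a = 'foo'  # lands on the wrapper's own __dict__, so it works
print(conn.a)   # -> 'foo'
conn.execute('CREATE TABLE t (x)')  # still proxies to the real connection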
conn.a = 'foo', or any dynamic assignment, is valid if conn is of <type 'classobj'> (an old-style class instance).
Things like:

c = object()
c.e = 1

will raise an AttributeError. On the other hand, Python allows you to do fantastic metaclass programming:
>>> from new import classobj
>>> Foo2 = classobj('Foo2', (Foo,), {'bar': lambda self: 'bar'})
>>> Foo2().bar()
'bar'
>>> Foo2().say_foo()
foo

How to wrap built-in methods in Python? (or 'how to pass them by reference')

I want to wrap the default open method with a wrapper that should also catch exceptions. Here's a test example that works:
import sys

truemethod = open

def fn(*args, **kwargs):
    try:
        return truemethod(*args, **kwargs)
    except (IOError, OSError):
        sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))

open = fn
I want to make a generic method of it:
def wrap(method, exceptions=(OSError, IOError)):
    truemethod = method
    def fn(*args, **kwargs):
        try:
            return truemethod(*args, **kwargs)
        except exceptions:
            sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
    method = fn
But it doesn't work:
>>> wrap(open)
>>> open
<built-in function open>
Apparently, method is a copy of the parameter, not a reference as I expected. Any pythonic workaround?
The problem with your code is that inside wrap, the statement method = fn simply rebinds the local name method; it doesn't change the outer binding of open. You'll have to assign to those names yourself:
def wrap(method, exceptions=(OSError, IOError)):
    def fn(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except exceptions:
            sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
    return fn

open = wrap(open)
foo = wrap(foo)
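Since wrap now returns the new function, it also works as a decorator; a small sketch using functools.wraps so the wrapper keeps the wrapped function's name and docstring (my_open is a hypothetical example function):

import functools
import sys

def wrap(method, exceptions=(OSError, IOError)):
    @functools.wraps(method)
    def fn(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except exceptions:
            sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(
                args[0], sys.exc_info()[1].args))
    return fn

@wrap  # equivalent to my_open = wrap(my_open)
def my_open(path):
    return open(path)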
Try adding global open. In the general case, you might want to look at this section of the manual:
This module provides direct access to all ‘built-in’ identifiers of Python; for example, __builtin__.open is the full name for the built-in function open(). See chapter Built-in Objects.
This module is not normally accessed explicitly by most applications, but can be useful in modules that provide objects with the same name as a built-in value, but in which the built-in of that name is also needed. For example, in a module that wants to implement an open() function that wraps the built-in open(), this module can be used directly:
import __builtin__

def open(path):
    f = __builtin__.open(path, 'r')
    return UpperCaser(f)

class UpperCaser:
    '''Wrapper around a file that converts output to upper-case.'''
    def __init__(self, f):
        self._f = f

    def read(self, count=-1):
        return self._f.read(count).upper()

# ...
CPython implementation detail: Most modules have the name __builtins__ (note the 's') made available as part of their globals. The value of __builtins__ is normally either this module or the value of this module's __dict__ attribute. Since this is an implementation detail, it may not be used by alternate implementations of Python.
You can just add return fn at the end of your wrap function and then do:
>>> open = wrap(open)
>>> open('bhla')
Traceback (most recent call last):
  File "<pyshell#24>", line 1, in <module>
    open('bhla')
  File "<pyshell#18>", line 7, in fn
    sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
SystemExit: Can't open 'bhla'. Error #2: No such file or directory
