Question
I am observing behavior in Python 3.3.4 that I would like help understanding: Why are my exceptions properly raised when a function is executed normally, but not when the function is executed in a pool of workers?
Code
import multiprocessing

class AllModuleExceptions(Exception):
    """Base class for library exceptions"""
    pass

class ModuleException_1(AllModuleExceptions):
    def __init__(self, message1):
        super(ModuleException_1, self).__init__()
        self.e_string = "Message: {}".format(message1)
        return

class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        super(ModuleException_2, self).__init__()
        self.e_string = "Message: {}".format(message2)
        return

def func_that_raises_exception(arg1, arg2):
    result = arg1 + arg2
    raise ModuleException_1("Something bad happened")

def func(arg1, arg2):
    try:
        result = func_that_raises_exception(arg1, arg2)
    except ModuleException_1:
        raise ModuleException_2("We need to halt main") from None
    return result

pool = multiprocessing.Pool(2)
results = pool.starmap(func, [(1,2), (3,4)])
pool.close()
pool.join()
print(results)
This code produces this error:
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/threading.py", line 921, in _bootstrap_inner
    self.run()
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/threading.py", line 869, in run
    self._target(*self._args, **self._kwargs)
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/multiprocessing/pool.py", line 420, in _handle_results
    task = get()
  File "/user/peteoss/encap/Python-3.4.2/lib/python3.4/multiprocessing/connection.py", line 251, in recv
    return ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() missing 1 required positional argument: 'message2'
Conversely, if I simply call the function, it seems to handle the exception properly:
print(func(1, 2))
Produces:
Traceback (most recent call last):
  File "exceptions.py", line 40, in <module>
    print(func(1, 2))
  File "exceptions.py", line 30, in func
    raise ModuleException_2("We need to halt main") from None
__main__.ModuleException_2
Why does ModuleException_2 behave differently when it is run in a process pool?
The issue is that your exception classes have non-optional arguments in their __init__ methods, but when you call the superclass __init__ method you don't pass those arguments along. This causes a new exception to be raised when your exception instances are unpickled by the multiprocessing code.
This has been a long-standing issue with Python exceptions, and you can read quite a bit of the history of the issue in this bug report (in which a part of the underlying issue with pickling exceptions was fixed, but not the part you're hitting).
To summarize the issue: Python's base Exception class puts all the arguments its __init__ method receives into an attribute named args. Those arguments are put into the pickle data, and when the stream is unpickled, they're passed to the __init__ method of the newly created object. If the number of arguments received by Exception.__init__ is not the same as what a child class expects, you'll get an error at unpickling time.
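You can reproduce the failure with pickle alone, without any multiprocessing; a minimal sketch, assuming the ModuleException_1 class from the question is already defined:

import pickle

# ModuleException_1.__init__ calls super().__init__() with no arguments,
# so .args ends up as () and the pickle data records no constructor arguments
e = ModuleException_1("Something bad happened")
print(e.args)                 # ()
pickle.loads(pickle.dumps(e))
# TypeError: __init__() missing 1 required positional argument: 'message1'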
A workaround for the issue is to pass all the arguments your custom exception classes require in their __init__ methods along to the superclass __init__:
class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        super(ModuleException_2, self).__init__(message2)  # the change is here!
        self.e_string = "Message: {}".format(message2)
Another possible fix would be to not call the superclass __init__ method at all (this is what the fix in the bug linked above allows), but since that's usually poor behavior for a subclass, I can't really recommend it.
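For completeness, a sketch of that variant; it works only because BaseException.__new__ already stores the constructor arguments in .args, and skipping the superclass __init__ leaves them intact for pickling:

class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        # no super().__init__() call: .args keeps the value set by __new__
        self.e_string = "Message: {}".format(message2)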
Your ModuleException_2.__init__ fails while being unpickled.
I was able to fix the problem by changing the signature to
class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2=None):
        super(ModuleException_2, self).__init__()
        self.e_string = "Message: {}".format(message2)
        return
but you should have a look at Pickling Class Instances to ensure a clean implementation.
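Following those docs, a cleaner sketch would keep the required argument and define __reduce__ explicitly, so that unpickling recreates the exception from its original argument:

class ModuleException_2(AllModuleExceptions):
    def __init__(self, message2):
        super(ModuleException_2, self).__init__()
        self.message2 = message2  # keep the original argument for __reduce__
        self.e_string = "Message: {}".format(message2)

    def __reduce__(self):
        # rebuild the instance as ModuleException_2(self.message2) on unpickle
        return (ModuleException_2, (self.message2,))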
Related
I have a fairly simple but long I/O operation that could be improved with threading. I've built a DearPyGui GUI interface (not explicitly related to the problem, just background info). A user can load a file via the package's file loader. Some of these files can be quite large (3 GB). Therefore, I'm adding a pop-up window to lock the interface (modal) whilst the file is loading. The above is just context; the problem is not DearPyGui.
I'm starting a thread inside a method of a class instance, which in turn calls (as the thread's target) another method of the same object and then updates an attribute of that object, which is to be read later. For example:
import threading

class IOClass:
    def __init__(self):
        self.fileObj = None

    def loadFile(self, fileName):
        thread = threading.Thread(target=self.threadMethod, args=fileName)
        thread.start()
        # Load GUI wait-screen
        thread.join()
        # anything else... EXCEPTION THROWN HERE
        print(" ".join(["Version:", self.fileObj.getVersion()]))

    def threadMethod(self, fileName):
        print(" ".join(["Loading filename", fileName]))
        # expensive-basic Python IO operation here
        self.fileObj = ...  # python IO operation here

class GUIClass:
    def __init__(self):
        pass

    def startMethod(self):
        # this is called by __main__
        ioClass = IOClass()
        ioClass.loadFile("filename.txt")
Unfortunately, I get this error:
Exception in thread Thread-1 (loadFile):
Traceback (most recent call last):
  File "/home/anthony/anaconda3/envs/CPRD-software/lib/python3.10/threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "/home/anthony/anaconda3/envs/CPRD-software/lib/python3.10/threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
TypeError: AnalysisController.loadFile() takes 2 positional arguments but 25 were given

Traceback (most recent call last):
  File "/home/anthony/CPRD-software/GUI/Controllers/AnalysisController.py", line 117, in loadStudySpace
    print(" ".join(["Version:", self.fileObj.getVersion()]))
AttributeError: 'NoneType' object has no attribute 'getVersion'
I'm not sure what's going on. The machine should sit there for at least 3 minutes as the data is loaded. But instead, it appears to perform the join without the main thread actually waiting for the I/O thread to load the file, and then attempts to call a method on what should have been loaded in.
I solved it. In the threading.Thread() call, do not reference the method through self. Instead, pass self in as an argument to the thread method, e.g.:
thread = threading.Thread(target=threadMethod, args=(self, fileName))
The target function doesn't change, i.e. it remains:
def threadMethod(self, fileName):
    # expensive-basic Python IO operation here
    self.fileObj = ...  # python IO operation here
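A minimal runnable sketch of this pattern, with a hypothetical stand-in for the real I/O. Note that args must be a tuple, so args=(self, fileName) passes exactly two arguments, whereas args=fileName would be unpacked element by element:

import threading

class IOClass:
    def __init__(self):
        self.fileObj = None

    def loadFile(self, fileName):
        # args is a tuple: (self, fileName) maps onto threadMethod's
        # two parameters
        thread = threading.Thread(target=IOClass.threadMethod,
                                  args=(self, fileName))
        thread.start()
        thread.join()  # the main thread now genuinely waits for the load
        print(" ".join(["Version:", self.fileObj]))

    def threadMethod(self, fileName):
        self.fileObj = fileName.upper()  # hypothetical stand-in for the real I/O

IOClass().loadFile("filename.txt")  # prints: Version: FILENAME.TXT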
I am trying to make a class Sprite that inherits a generic class T, bounded to be a subclass of class Object. Class Text is a subclass of class Object. This is class Text, provided by an outer library:
# text.py
class Text(Object):
    # ignored properties...
    def __init__(self, text="placeholder text", **kwargs):
        super().__init__(object_type=Text.object_type, text=text, **kwargs)
This is my self-written class Sprite:
# sprite.py
from typing import TypeVar, Generic

T = TypeVar('T', bound=Object)

class Sprite(Generic[T]):
    def __init__(self, **kwargs):
        super(Sprite, self).__init__(self, clickable=True, evt_handler=self.click_wrapper, **kwargs)
And such a Sprite instance is initialized by:
sprite = Sprite[Text](
    text="This is a sprite!",
    object_id="spriteTest",
    # other similar arguments...
)
And this is the error thrown:
Exception thrown in main()! Terminating main...
Traceback (most recent call last):
File "sprite.py", line 79, in main
sprite = Sprite[Text](
File "C:\ProgramData\Anaconda3\lib\typing.py", line 687, in __call__
result = self.__origin__(*args, **kwargs)
File "sprite.py", line 47, in __init__
super(Sprite, self).__init__(self, clickable=True, evt_handler=self.click_wrapper, **kwargs)
TypeError: object.__init__() takes exactly one argument (the instance to initialize)
Why is this not working?
I believe you are misunderstanding generic type variables here. Let me first try to reduce your bug down to its most minimal variant:
class Sprite:
    def __init__(self, **kwargs):
        super().__init__(**kwargs)

Sprite(text="My text")
This very simple program throws the exact same exception as yours:
Traceback (most recent call last):
  File ".../72690805.py", line 57, in <module>
    Sprite(text="My text")
  File ".../72690805.py", line 55, in __init__
    super().__init__(**kwargs)
TypeError: object.__init__() takes exactly one argument (the instance to initialize)
The key here is that it is object that does not accept anything other than one argument. In other words, the superclass of Sprite is object in both of our cases (i.e. the built-in Python object, not your Object class). Your Sprite class simply does not have a non-default superclass.
You seem to be of the understanding that your super(...).__init__(...) call will initialize Text whenever you use Sprite[Text](...), but that is not the case. Let me give a common example of a Generic type variable in use:
from typing import List, TypeVar, Generic

T = TypeVar('T')

class Queue(Generic[T]):
    def __init__(self) -> None:
        super().__init__()
        self.queue: List[T] = []

    def put(self, task: T) -> None:
        self.queue.append(task)

    def get(self) -> T:
        return self.queue.pop(-1)

queue = Queue()
queue.put(12)
queue.put(24)
queue.put(36)
# queue.put('a')  # <- A type checker should disallow this
print(queue.get())
print(queue.get())
print(queue.get())
Here, we have a simple Queue class, with put and get methods. These methods are supplemented with type hints via T, and now type checkers know that e.g. Queue[int]().get returns an int.
However, the super().__init__() is still just the standard initialization of the Python object. We're not suddenly initializing an int, which is the equivalent of what you seem to be trying.
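You can verify this at runtime; a small sketch, assuming Python 3.7+:

from typing import TypeVar, Generic

T = TypeVar('T')

class Sprite(Generic[T]):
    pass

# Generic contributes no runtime base class other than object
print(Sprite.__mro__)  # (<class 'Sprite'>, <class 'typing.Generic'>, <class 'object'>)

# Subscripting only records the type argument; it still constructs a plain Sprite
s = Sprite[int]()
print(type(s))         # <class '__main__.Sprite'>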
To wrap up: whenever you find yourself using functionality from the typing module to try to get something working at runtime, you're likely making a mistake. As far as I'm aware, all the functionality from typing is merely "cosmetic" and is ignored by Python at runtime. It exists to allow developers to use type checkers to ensure that they are not making mistakes, e.g. calling queue.put('a') when queue was initialized with Queue[int](). To reiterate, this put of a character will still execute in Python, and it will place the character in the queue, even though a type checker would tell you that it's wrong.
I'm trying to catch an Exception raised by a child process, and have been running into some issues. The gist of my code is:
import os
from multiprocessing import Pool

class CustomException(Exception):
    def __init__(self, msg):
        self.msg = msg

    def __str__(self):
        return self.msg

def update(partition):
    if os.getpid() % 2 == 0:
        raise CustomException('PID was divisible by 2!')
    else:
        # Do something fancy
        pass

if __name__ == '__main__':
    try:
        some_response = get_response_from_another_method()
        partition_size = 100
        p = Pool(config.NUMBER_OF_PROCESSES)
        for i in range(0, NUMBER_OF_PROCESSES):
            partition = get_partition(some_response, partition_size)
            x = p.apply_async(update, args=(partition,))
            x.get()
        p.close()
        p.join()
    except CustomException as e:
        log.error('There was an error')
        if email_notifier.send_notification(e.msg):
            log.debug('Email notification sent')
        else:
            log.error('An error occurred while sending an email.')
When I run this, I am seeing:
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "/System/Library/Frameworks/Python.framework/Versions/2.6/lib/python2.6/multiprocessing/pool.py", line 259, in _handle_results
task = get()
TypeError: ('__init__() takes exactly 2 arguments (1 given)', <class 'CustomException'>, ())
Is there some facility to do this? Thanks!!
In short, this is something of a quirk in Python 2, and a related issue is referenced in this bug report. It has to do with how exceptions are pickled. The simplest solution is perhaps to alter CustomException so that it calls its parent class initializer. Alternatively, if you're able, I'd suggest moving to Python 3.
For example, this code works fine in both Python 2 and Python 3:
from multiprocessing import Pool

class CustomException(Exception):
    pass

def foo():
    raise CustomException('PID was divisible by 2!')

pool = Pool()
result = pool.apply_async(foo, [])
But if we alter CustomException so that it has a required argument:
class CustomException(Exception):
    def __init__(self, required):
        self.required = required
The above example results in a TypeError being raised under Python 2. It works under Python 3.
The problem is that CustomException inherits Exception's __reduce__ method, which tells Python how to pickle an instance. The inherited __reduce__ knows nothing about CustomException's call signature, so unpickling isn't done correctly.
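To illustrate the mechanism, a sketch of a subclass that supplies its own __reduce__, telling pickle to recreate the instance with the constructor arguments it actually needs:

class CustomException(Exception):
    def __init__(self, required):
        self.required = required

    def __reduce__(self):
        # rebuild the instance as CustomException(self.required) on unpickle
        return (CustomException, (self.required,))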
A quick fix is to simply call the parent class's __init__:
class CustomException(Exception):
    def __init__(self, msg):
        super(CustomException, self).__init__(msg)
        self.msg = msg
But since you really aren't doing anything special with the message, why not just define:
class CustomException(Exception):
    pass
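A quick check that this round-trips cleanly (a sketch that works on either Python version):

import pickle

class CustomException(Exception):
    pass

e = pickle.loads(pickle.dumps(CustomException('PID was divisible by 2!')))
print(e)  # PID was divisible by 2!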
I'm not sure why, but yesterday I was testing some multiprocessing code that I wrote and it was working fine. Then today when I checked the code again, it would give me this error:
Exception in thread Thread-5:
Traceback (most recent call last):
  File "C:\Python32\lib\threading.py", line 740, in _bootstrap_inner
    self.run()
  File "C:\Python32\lib\threading.py", line 693, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Python32\lib\multiprocessing\pool.py", line 342, in _handle_tasks
    put(task)
  File "C:\Python32\lib\multiprocessing\pool.py", line 439, in __reduce__
    'pool objects cannot be passed between processes or pickled'
NotImplementedError: pool objects cannot be passed between processes or pickled
The structure of my code goes as follows:
* I have 2 modules, say A.py, and B.py.
* A.py has class defined in it called A.
* B.py similarly has class B.
* In class A I have a multiprocessing pool as one of the attributes.
* The pool is defined in A.__init__(), but used in another method - run()
* In A.run() I set some attributes of some objects of class B (which are collected in a list called objBList), and then I use pool.map(processB, objBList)
* processB() is a module-level function (in A.py) that receives an instance of B as its only parameter and calls B.runInputs()
* the error happens at the pool.map() line.
basically in A.py:
class A:
    def __init__(self):
        self.pool = multiprocessing.Pool(7)

    def run(self):
        for b in objBList:
            b.inputs = something
        result = self.pool.map(processB, objBList)
        return list(result)

def processB(objB):
    objB.runInputs()
and in B.py:
class B:
    def runInputs(self):
        do_something()
BTW, I'm forced to use the processB() module function because of the way multiprocessing works on Windows.
Also I would like to point out that the error I am getting - that pool can't be pickled - shouldn't be referring to any part of my code, as I'm not trying to send the child processes any Pool objects.
Any ideas?
(PS: I should also mention that in between the two days that I was testing this function, the computer restarted unexpectedly, possibly after installing Windows updates.)
Perhaps your class B objects contain a reference to your A instance.
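For instance, a hypothetical sketch (the question doesn't show how the B objects are created): if each B keeps a back-reference to the A that owns it, pickling a B for pool.map() also pickles that A, including its pool attribute, which raises exactly this NotImplementedError. One fix is to drop the back-reference from the pickled state:

import multiprocessing

class B:
    def __init__(self, parent):
        self.parent = parent  # back-reference to A, which holds the Pool

    def runInputs(self):
        pass

    def __getstate__(self):
        # exclude the unpicklable back-reference when sending B to a worker
        state = self.__dict__.copy()
        state.pop('parent', None)
        return state

def processB(objB):
    objB.runInputs()

class A:
    def __init__(self):
        self.pool = multiprocessing.Pool(7)
        self.objBList = [B(self) for _ in range(3)]

    def run(self):
        # without B.__getstate__, pickling each B would drag in self.pool
        # and fail with "pool objects cannot be passed between processes"
        return list(self.pool.map(processB, self.objBList))

if __name__ == '__main__':
    print(A().run())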
I'm trying to combine DBUS' asynchronous method calls with Twisted's Deferreds, but I'm encountering trouble in tweaking the usual DBUS service method decorator to do this.
To use the DBUS async callbacks approach, you'd do:
class Service(dbus.service.Object):
    @dbus.service.method(INTERFACE, async_callbacks=('callback', 'errback'))
    def Resources(self, callback, errback):
        callback({'Magic' : 42})
There are a few places where I simply wrap those two callbacks in a Deferred, so I thought I'd create a decorator to do that for me:
def twisted_dbus(*args, **kargs):
    def decorator(real_func):
        @dbus.service.method(*args, async_callbacks=('callback', 'errback'), **kargs)
        def wrapped_func(callback, errback, *inner_args, **inner_kargs):
            d = defer.Deferred()
            d.addCallbacks(callback, errback)
            return real_func(d, *inner_args, **inner_kargs)
        return wrapped_func
    return decorator

class Service(dbus.service.Object):
    @twisted_dbus(INTERFACE)
    def Resources(self, deferred):
        deferred.callback({'Magic' : 42})
This, however, doesn't work, since dbus.service.method assumes it is decorating a bound method and discards the first argument name (expecting it to be self), resulting in this traceback:
$ python service.py
Traceback (most recent call last):
  File "service.py", line 25, in <module>
    class StatusCache(dbus.service.Object):
  File "service.py", line 32, in StatusCache
    @twisted_dbus(INTERFACE)
  File "service.py", line 15, in decorator
    @dbus.service.method(*args, async_callbacks=('callback', 'errback'), **kargs)
  File "/usr/lib/pymodules/python2.6/dbus/decorators.py", line 165, in decorator
    args.remove(async_callbacks[0])
ValueError: list.remove(x): x not in list
I could add an extra argument to the inner function there, like so:
def twisted_dbus(*args, **kargs):
    def decorator(real_func):
        @dbus.service.method(*args, async_callbacks=('callback', 'errback'), **kargs)
        def wrapped_func(possibly_self, callback, errback, *inner_args, **inner_kargs):
            d = defer.Deferred()
            d.addCallbacks(callback, errback)
            return real_func(possibly_self, d, *inner_args, **inner_kargs)
        return wrapped_func
    return decorator
But that seems... well, dumb. Especially if, for some reason, I want to export a non-bound method.
So is it possible to make this decorator work?
Why is it dumb? You're already assuming you know that the first positional argument (after self) is a Deferred. Why is it any more dumb to assume that the real first positional argument is self?
If you also want to support free functions, then write another decorator and use that when you know there is no self argument coming.
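A sketch of what that companion decorator could look like for free functions; it is essentially the question's first version of twisted_dbus under a distinct, hypothetical name, and it assumes dbus.service.method will accept a function defined outside a dbus.service.Object subclass:

def twisted_dbus_function(*args, **kargs):
    def decorator(real_func):
        @dbus.service.method(*args, async_callbacks=('callback', 'errback'), **kargs)
        def wrapped_func(callback, errback, *inner_args, **inner_kargs):
            # no self slot: real_func is a free function taking the Deferred first
            d = defer.Deferred()
            d.addCallbacks(callback, errback)
            return real_func(d, *inner_args, **inner_kargs)
        return wrapped_func
    return decorator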