Behavior of multiprocessing.Pool on exception? - python

Suppose I have a program that looks like this:
jobs = [list_of_values_to_consume_and_act]
with multiprocessing.Pool(8) as pool:
    results = pool.map(func, jobs)
And whatever is done in func can raise an exception due to external circumstances, so I can't prevent an exception from happening.
How will the pool behave on exception?
Will it only terminate the process that raised an exception and let other processes run and consume the jobs?
If yes, will it start another process to pick up the slack?
What about the job being handled by the dead process, will it be 'resubmitted' to the pool?
In any case, how do I 'retrieve' the exception?

No processes will be terminated at all. All calls to the target
function from within the pool's processes are wrapped in a
try...except block. If an exception is caught, the worker process
informs the appropriate handler thread in the main process, which
passes the exception forward so it can be re-raised. Whether or not other jobs will execute depends on whether the pool is still open. If you do not catch this re-raised exception, the main process (or the process that started the pool) will exit, automatically cleaning up open resources including the pool (so no further tasks can be executed once the pool is closed). But if you catch the exception and let the main process continue running, the pool will not shut down and other jobs will execute as scheduled.
N/A (no process is terminated, so there is no slack to pick up).
The outcome of a job is irrelevant: once it has been run by any process,
that job is marked completed and is not resubmitted to the pool.
Wrap your call to pool.map in a try...except block. Do note that
if one of your jobs raises an error, the results of the other
successful jobs become inaccessible as well (because results are
stored after the call to pool.map completes, but that call never
completed successfully). In such cases, where you need to catch
exceptions for individual jobs, it is better to use pool.imap
or pool.apply_async.
Example of catching exception for individual tasks using imap:
import multiprocessing
import time

def prt(value):
    if value == 3:
        raise ValueError(f"Error for value {value}")
    time.sleep(1)
    return value

if __name__ == "__main__":
    with multiprocessing.Pool(3) as pool:
        jobs = pool.imap(prt, range(1, 10))
        results = []
        for i in range(10):
            try:
                result = next(jobs)
            except ValueError as e:
                print(e)
                results.append("N/A")  # This means that this individual task was unsuccessful
            except StopIteration:
                break
            else:
                results.append(result)
        print(results)
Example of catching exception for individual tasks using apply_async
import multiprocessing
import time

def prt(value):
    if value == 3:
        raise ValueError(f"Error for value {value}")
    time.sleep(1)
    return value

if __name__ == "__main__":
    pool = multiprocessing.Pool(3)
    job = [pool.apply_async(prt, (i,)) for i in range(1, 10)]
    results = []
    for j in job:
        try:
            results.append(j.get())
        except ValueError as e:
            print(e)
            results.append("N/A")
    print(results)
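To illustrate the first point above (no worker process is terminated, and the pool keeps working if you catch the re-raised exception), here is a small sketch that is not part of the original answer; it returns each worker's PID so you can see that the same processes keep serving jobs after a failed map:

import multiprocessing
import os

def work(value):
    if value == 3:
        raise ValueError("boom on %s" % value)
    return (value, os.getpid())

if __name__ == "__main__":
    with multiprocessing.Pool(2) as pool:
        try:
            pool.map(work, range(1, 6))  # the task for value 3 raises
        except ValueError as e:
            print("map raised:", e)
        # The pool is still usable: the same worker processes keep serving jobs.
        print(pool.map(work, [10, 20, 30]))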

Related

Python: if I wrap pool.apply_async commands in try...except, are they still executed in parallel?

I've taken over some code a former colleague wrote, which was frequently getting stuck when one or more parallelised functions threw a NameError exception, which wasn't caught. (The parallelisation is handled by multiprocessing.Pool.) Because the exception is due to certain arguments not being defined, the only way I've been able to catch this exception is to put the pool.apply_async commands into try...except blocks, like so:
from multiprocessing import Pool

# Define worker functions
def workerfn1(args1):
    # commands
def workerfn2(args2):
    # more commands
def workerfn3(args3):
    # even more commands

# Execute worker functions in parallel
with Pool(processes=os.cpu_count()-1) as pool:
    try:
        r1 = pool.apply_async(workerfn1, args1)
    except NameError as e:
        print("Worker function r1 failed")
        print(e)
    try:
        r2 = pool.apply_async(workerfn2, args2)
    except NameError as e:
        print("Worker function r2 failed")
        print(e)
    try:
        r3 = pool.apply_async(workerfn3, args3)
    except NameError as e:
        print("Worker function r3 failed")
        print(e)
Obviously, the try...except blocks are not parallelised, but the interpreter has to read the apply_async commands sequentially anyway while it assigns them to different CPUs...so will these three functions still be executed in parallel (if they don't throw the NameError exception), or does the use of try...except prevent this from happening?
First, you need to be more careful to post code that is not full of spelling and other errors.
Method multiprocessing.pool.Pool.apply_async (not apply_sync) returns a multiprocessing.pool.AsyncResult instance. It is only when you call method get on this instance that you either receive the return value from your worker function or have any exception that occurred in your worker function re-raised. So:
from multiprocessing import Pool

# Define worker functions
def workerfn1(args1):
    ...

def workerfn2(args2):
    ...

def workerfn3(args3):
    raise NameError('Some name goes here.')

# Required for Windows:
if __name__ == '__main__':
    # Execute worker functions in parallel
    with Pool(processes=3) as pool:
        result1 = pool.apply_async(workerfn1, args=(1,))
        result2 = pool.apply_async(workerfn2, args=(1,))
        result3 = pool.apply_async(workerfn3, args=(1,))
        try:
            return_value1 = result1.get()
        except NameError as e:
            print("Worker function workerfn1 failed:", e)
        try:
            return_value2 = result2.get()
        except NameError as e:
            print("Worker function workerfn2 failed:", e)
        try:
            return_value3 = result3.get()
        except NameError as e:
            print("Worker function workerfn3 failed:", e)
Prints:
Worker function workerfn3 failed: Some name goes here.
Note
Without calling get on the AsyncResult returned by apply_async, you are not waiting for the completion of the submitted task, and there is no point in surrounding the apply_async call with try/except. When you then fall through the with block, an implicit call to terminate is made on the pool instance, which immediately kills all the pool's processes: any running tasks are halted and any tasks waiting to run are purged. You can instead call pool.close() followed by pool.join() within the block, and that sequence will wait for all submitted tasks to complete. But without explicitly calling get on the AsyncResult instances you will not be able to retrieve return values or exceptions.
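A minimal sketch of that close()/join() sequence (not taken from the original answer; workerfn1 here is just a stand-in):

from multiprocessing import Pool

def workerfn1(x):
    return x * 2

if __name__ == '__main__':
    pool = Pool(processes=3)
    result1 = pool.apply_async(workerfn1, args=(1,))
    pool.close()  # no new tasks may be submitted
    pool.join()   # wait for all submitted tasks to complete
    try:
        print("workerfn1 returned:", result1.get())  # get() is still required
    except Exception as e:
        print("workerfn1 failed:", e)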

Python Multiprocessing Doesn't Terminate On Base Exception

When running with a multiprocessing pool, I find that the worker process keeps running past the point where an exception is thrown.
Consider the following code:
import multiprocessing

def worker(x):
    print("input: " + x)
    y = x + "_output"
    raise Exception("foobar")
    print("output: " + y)
    return(y)

def main():
    data = [str(x) for x in range(4)]
    pool = multiprocessing.Pool(1)
    chunksize = 1
    results = pool.map(worker, data, chunksize)
    pool.close()
    pool.join()
    print("Printing results:")
    print(results)

if __name__ == "__main__":
    main()
The output is:
$ python multiprocessing_fail.py
input: 0
input: 1
input: 2
Traceback (most recent call last):
input: 3
  File "multiprocessing_fail.py", line 25, in <module>
    main()
  File "multiprocessing_fail.py", line 16, in main
    results = pool.map(worker, data, 1)
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 251, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib/python2.7/multiprocessing/pool.py", line 558, in get
    raise self._value
Exception: foobar
As you can see, the worker process never proceeds beyond raise Exception("foobar") to the second print statement. However, it resumes work at the beginning of function worker() again and again.
I looked for an explanation in the documentation, but couldn't find any. Here is a potentially related SO question:
Keyboard Interrupts with python's multiprocessing Pool
But that is different (about keyboard interrupts not being picked by the master process).
Another SO question:
How to catch exceptions in workers in Multiprocessing
This question is also different, since in it the master process doesn't catch any exception, whereas here the master did catch the exception (line 16). More importantly, in that question the worker did not run past an exception (there is only one executable line in the worker).
I am running Python 2.7.
Comment: Pool should start one worker since the code has pool = multiprocessing.Pool(1).
From the Documentation:
A process pool object which controls a pool of worker processes to which jobs can be submitted
Comment: That one worker is running the worker() function multiple times
From the Documentation:
map(func, iterable[, chunksize])
This method chops the iterable into a number of chunks which it submits to the process pool as separate tasks.
Your worker() is the separate task. Renaming your worker() to task() could help to clarify what is what.
Comment: What I expect is that the worker process crashes at the Exception
It does: the separate task, your worker(), dies and Pool starts the next task.
What you want is Pool.terminate()
From the Documentation:
terminate()
Stops the worker processes immediately without completing outstanding work.
Question: ... I find that the worker process keeps running past a point where an exception is thrown.
You give iterable data to Pool, therefore Pool does what it has to do: start len(data) tasks.
data = [str(x) for x in range(4)]
The main question is: what do you expect to happen with
raise Exception("foobar")
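To tie this back to the question: if the goal is to stop handing out further tasks as soon as any task raises, one option is to consume the results with imap and call terminate() when the exception surfaces in the parent. A small sketch, not from the original answer, using a simplified version of the question's worker and data:

import multiprocessing

def worker(x):
    if x == "2":
        raise Exception("foobar")
    return x + "_output"

if __name__ == "__main__":
    data = [str(x) for x in range(4)]
    pool = multiprocessing.Pool(1)
    results = []
    try:
        for value in pool.imap(worker, data, 1):
            results.append(value)
    except Exception as e:
        print("a task failed (%s); stopping the pool" % e)
        pool.terminate()  # kill the workers without completing outstanding work
    else:
        pool.close()
    pool.join()
    print(results)  # only the results gathered before the failure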

How to detect exceptions in concurrent.futures in Python3?

I have just moved on to Python 3 as a result of its concurrent.futures module. I was wondering if I could get it to detect errors. I want to use concurrent futures for parallel programming; if there are more efficient modules please let me know.
I do not like multiprocessing as it is too complicated and there is not much documentation available. It would be great, however, if someone could write a Hello World using multiprocessing, with only functions and no classes, to compute in parallel so that it is easy to understand.
Here is a simple script:
from concurrent.futures import ThreadPoolExecutor

def pri():
    print("Hello World!!!")

def start():
    try:
        while True:
            pri()
    except KeyboardInterrupt:
        print("YOU PRESSED CTRL+C")

with ThreadPoolExecutor(max_workers=3) as exe:
    exe.submit(start)
The above code was just a demo, of how CTRL+C will not work to print the statement.
What I want is to be able to call a function if an error is present. This error detection must come from the function itself.
Another example
import socket
import time
from concurrent.futures import ThreadPoolExecutor

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def con():
    try:
        s.connect((x, y))
        main()
    except socket.gaierror:
        err()

def err():
    time.sleep(1)
    con()

def main():
    s.send("[+] Hello")

with ThreadPoolExecutor() as exe:
    exe.submit(con)
Way too late to the party, but maybe it'll help someone else...
I'm pretty sure the original question was not really answered. Folks got hung up on the fact that user5327424 was using a keyboard interrupt to raise an exception when the point was that the exception (however it was caused) was not raised. For example:
import concurrent.futures

def main():
    numbers = range(10)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = {executor.submit(raise_my_exception, number): number for number in numbers}

def raise_my_exception(number):
    print('Proof that this function is getting called. %s' % number)
    raise Exception('This never sees the light of day...')

main()
When the example code above is executed, you will see the text inside the print statement displayed on the screen, but you will never see the exception. This is because the results of each thread are held in the results object. You need to iterate that object to get to your exceptions. The following example shows how to access the results.
import concurrent.futures

def main():
    numbers = range(10)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        results = {executor.submit(raise_my_exception, number): number for number in numbers}
        for result in results:
            # This will cause the exception to be raised (but only the first one)
            print(result.result())

def raise_my_exception(number):
    print('Proof that this function is getting called. %s' % number)
    raise Exception('This will be raised once the results are iterated.')

main()
I'm not sure whether I like this behavior or not, but it does allow the threads to fully execute, regardless of the exceptions encountered inside the individual threads.
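If you want to see every task's exception rather than just the first, concurrent.futures.as_completed together with Future.exception() will surface each one. A small sketch, not part of the original answer:

import concurrent.futures

def raise_my_exception(number):
    print('Proof that this function is getting called. %s' % number)
    raise Exception('Failure for number %s' % number)

def main():
    numbers = range(10)
    with concurrent.futures.ThreadPoolExecutor() as executor:
        futures = {executor.submit(raise_my_exception, number): number for number in numbers}
        for future in concurrent.futures.as_completed(futures):
            error = future.exception()  # returns None if the call succeeded
            if error is not None:
                print('Task %s failed: %s' % (futures[future], error))

main()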
Here's a solution. I'm not sure you'll like it, but I can't think of any other. I've modified your code to make it work.
from concurrent.futures import ThreadPoolExecutor
import time

quit = False

def pri():
    print("Hello World!!!")

def start():
    while quit is not True:
        time.sleep(1)
        pri()

try:
    pool = ThreadPoolExecutor(max_workers=3)
    pool.submit(start)
    while quit is not True:
        print("hei")
        time.sleep(1)
except KeyboardInterrupt:
    quit = True
Here are the points:
When you use with ThreadPoolExecutor(max_workers=3) as exe, it waits until all tasks have been done. Have a look at the docs:
If wait is True then this method will not return until all the pending futures are done executing and the resources associated with the executor have been freed. If wait is False then this method will return immediately and the resources associated with the executor will be freed when all pending futures are done executing. Regardless of the value of wait, the entire Python program will not exit until all pending futures are done executing.
You can avoid having to call this method explicitly if you use the with statement, which will shutdown the Executor (waiting as if Executor.shutdown() were called with wait set to True)
It's like calling join() on a thread.
That's why I replaced it with:
pool = ThreadPoolExecutor(max_workers=3)
pool.submit(start)
The main thread must be doing "work" to be able to catch a Ctrl+C, so you can't just leave the main thread sitting there and exit; the simplest way is to run an infinite loop.
Now that you have a loop running in the main thread, when you hit CTRL+C, the program will enter the except KeyboardInterrupt block and set quit=True. Then your worker thread can exit.
Strictly speaking, this is only a workaround. It seems to me there is no other way to do this.
Edit
I'm not sure what's bothering you, but you can catch an exception in another thread without a problem:
import socket
import time
from concurrent.futures import ThreadPoolExecutor

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)

def con():
    try:
        raise socket.gaierror
        main()
    except socket.gaierror:
        print("gaierror occurred")
        err()

def err():
    print("err invoked")
    time.sleep(1)
    con()

def main():
    s.send("[+] Hello")

with ThreadPoolExecutor(3) as exe:
    exe.submit(con)
Output
gaierror occurred
err invoked
gaierror occurred
err invoked
gaierror occurred
err invoked
gaierror occurred
...

Python multiprocessing. Handle exception in parent process and make all children die gracefully

I have the following code.
It uses a Python module called decorator.
from multiprocessing import Pool
from random import randint
import traceback
import decorator
import time

def test_retry(number_of_retry_attempts=1, **kwargs):
    timeout = kwargs.get('timeout', 2.0)  # seconds

    @decorator.decorator
    def tryIt(func, *fargs, **fkwargs):
        for _ in xrange(number_of_retry_attempts):
            try:
                return func(*fargs, **fkwargs)
            except:
                tb = traceback.format_exc()
                if timeout is not None:
                    time.sleep(timeout)
                print 'Catching exception %s. Attempting retry: ' % (tb)
        raise
    return tryIt
The decorator module helps me decorate my data-warehouse call functions, so I don't need to take care of connection dropping and various connection-based issues, and it allows me to reset the connection and try again after some timeout. I decorate all my functions which do data-warehouse reads with this method, so I get retry for free.
I have the following methods.
def process_generator(data):
    # Process the generated data

def generator():
    data = data_warhouse_fetch_method()  # This is the actual method which needs retry
    yield data

@test_retry(number_of_retry_attempts=2, timeout=1.0)
def data_warhouse_fetch_method():
    # Fetch the data from data-warehouse
    pass
I try to multiprocess my code using the multiprocessing module like this.
try:
    pool = Pool(processes=2)
    result = pool.imap_unordered(process_generator, generator())
except Exception as exception:
    print 'Do some post processing stuff'
    tb = traceback.format_exc()
    print tb
Things are normal when everything is successful, and also when the call fixes itself within the number of retries. But once the number of retries is exceeded, I raise the exception in the test_retry method, and it is not caught in the main process. The process dies and the processes forked by the main process are left as orphans. Maybe I am doing something wrong here. I am looking for some help to fix the following problem: propagate the exception to the parent process so that I can handle the exception and make my children die gracefully. I also want to know how to inform the child processes to die gracefully. Thanks in advance for the help.
Edit: added more code to explain.
def test_retry(number_of_retry_attempts=1, **kwargs):
    timeout = kwargs.get('timeout', 2.0)  # seconds

    @decorator.decorator
    def tryIt(func, *fargs, **fkwargs):
        for _ in xrange(number_of_retry_attempts):
            try:
                return func(*fargs, **fkwargs)
            except:
                tb = traceback.format_exc()
                if timeout is not None:
                    time.sleep(timeout)
                print 'Catching exception %s. Attempting retry: ' % (tb)
        raise
    return tryIt

@test_retry(number_of_retry_attempts=2, timeout=1.0)
def bad_method():
    sample_list = []
    return sample_list[0]  # This will result in an exception

def process_generator(number):
    if isinstance(number, int):
        return number + 1
    else:
        raise

def generator():
    for i in range(20):
        if i % 10 == 0:
            yield bad_method()
        else:
            yield i

try:
    pool = Pool(processes=2)
    result = pool.imap_unordered(process_generator, generator())
    pool.close()
    #pool.join()
    for r in result:
        print r
except Exception, e:  # Hoping the generator will catch the exception. But not.
    print 'got exception: %r, terminating the pool' % (e,)
    pool.terminate()
    print 'pool is terminated'
finally:
    print 'joining pool processes'
    pool.join()
    print 'join complete'
print 'the end'
The actual problem boils down to this: if the generator throws an exception, I am unable to catch that exception in the except clause wrapped around the pool.imap_unordered() call. After the exception is thrown, the main process is stuck and the child processes wait forever. I'm not sure what I am doing wrong here.
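For what it's worth, the generator passed to imap_unordered is consumed by one of the pool's internal threads, not inside your try block, which is why the except clause never sees the error and the iteration then hangs. One defensive sketch, assuming it is acceptable to skip the failing items, is to catch the exception inside the generator itself (bad_method reproduced from the question):

import traceback

def bad_method():
    sample_list = []
    return sample_list[0]  # IndexError, as in the question

def generator():
    for i in range(20):
        if i % 10 == 0:
            try:
                yield bad_method()
            except Exception:
                # log and keep producing so the pool's task-feeding thread survives
                print('generator item %d failed, skipping: %s' % (i, traceback.format_exc()))
        else:
            yield i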
I don't fully understand the code that was shared here, as I am not an expert. Also, the question is nearly a year old. But I had the same requirement as explained in the topic, and I managed to find a solution:
import multiprocessing
import time

def dummy(flag):
    try:
        if flag:
            print('Sleeping for 2 secs')
            time.sleep(2)  # So that it can be terminated
        else:
            raise Exception('Exception from ', flag)  # To simulate termination
        return flag  # To check that the sleeping thread never returns this
    except Exception as e:
        print('Exception inside dummy', e)
        raise e
    finally:
        print('Entered finally', flag)

if __name__ == '__main__':
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    args_list = [(1,), (0,)]
    # Call dummy for each tuple inside args_list.
    # Use error_callback to terminate the pool
    results = pool.starmap_async(dummy, args_list,
                                 error_callback=lambda e, mp_pool=pool: mp_pool.terminate())
    pool.close()
    pool.join()
    try:
        # Try to see the results.
        # If there was an exception in any process, results.get() throws exception
        for result in results.get():
            # Never executed cause of the exception
            print('Printing result ', result)
    except Exception as e:
        print('Exception inside main', e)
    print('Reached the end')
This produces the following output:
Sleeping for 2 secs
Exception inside dummy ('Exception from ', 0)
Entered finally 0
Exception inside main ('Exception from ', 0)
Reached the end
This is pretty much the first time I am answering a question so I apologise in advance if I have violated any rules or made any mistakes.
I had tried to do the following without success:
Use apply_async. But that just hung the main process after the exception was thrown
Try killing the processes and children using the pid in error_callback
Use a multiprocessing.Event to track exceptions and check it in all processes after each step before proceeding. Not a good approach, but that didn't work either: "Condition objects should only be shared between processes through inheritance"
I honestly wish it wasn't so hard to terminate all processes in same pool if one of the processes threw an exception.

How to handle exception in threading with queue in Python?

These are never printed:
"Exception in threadfuncqueue handled by threadfuncqueue",
"Exception in threadfuncqueue handled by main thread" and
"thread test with queue passed". Never quitting!
from threading import Thread
from Queue import Queue
import time

class ImRaiseError():
    def __init__(self):
        time.sleep(1)
        raise Exception(self.__class__.__name__)

# place for paste worked code example from below

print "begin thread test with queue"

def threadfuncqueue(q):
    print "\n" + str(q.get())
    while not q.empty():
        try:
            testthread = ImRaiseError()
        finally:
            print "Exception in threadfuncqueue handled by threadfuncqueue"

q = Queue()
items = [1, 2]
for i in range(len(items)):
    t = Thread(target=threadfuncqueue, args=(q,))
    if(1 == i):
        t.daemon = False
    else:
        t.daemon = True
    t.start()

for item in items:
    q.put("threadfuncqueue" + str(item))

try:
    q.join()  # block until all tasks are done
finally:
    print "Exception in threadfuncqueue handled by main thread"
print "thread test with queue passed"
quit()
How do I handle this exception?
Example of working code, but without a queue:
print "=========== procedure style test"

def threadfunc(q):
    print "\n" + str(q)
    while True:
        try:
            testthread = ImRaiseError()
        finally:
            print str(q) + " handled by process"

try:
    threadfunc('testproc')
except Exception as e:
    print "error!", e
print "procedure style test ==========="

print "=========== simple thread tests"
testthread = Thread(target=threadfunc, args=('testthread',))
testthread.start()
try:
    testthread.join()
finally:
    print "Exception in testthread handled by main thread"

testthread1 = Thread(target=threadfunc, args=('testthread1',))
testthread1.start()
try:
    testthread1.join()
finally:
    print "Exception in testthread1 handled by main thread"
print "simple thread tests ==========="
Short Answer
You're putting things in a queue and retrieving them, but if you're going to join a queue, you need to mark tasks as done as you pull them out of the queue and process them. According to the docs, every time you enqueue an item, a counter is incremented, and you need to call q.task_done() to decrement that counter. q.join() will block until that counter reaches zero. Add this immediately after your q.get() call to prevent main from being blocked:
q.task_done()
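For context, here is a minimal producer/consumer sketch showing the full pattern, using the same Python 2 Queue module as the question (the names here are illustrative, not taken from your code):

from threading import Thread
from Queue import Queue  # Python 2, as in the question

def worker(q):
    while True:
        item = q.get()
        try:
            print("processing %s" % item)
        finally:
            q.task_done()  # pair every get() with task_done()

q = Queue()
t = Thread(target=worker, args=(q,))
t.daemon = True
t.start()

for item in ("threadfuncqueue1", "threadfuncqueue2"):
    q.put(item)  # each put() increments the unfinished-task counter

q.join()  # returns once task_done() has been called for every put()
print("all tasks done")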
Also, I find it odd that you're checking q for emptiness after you've retrieved something from it. I'm not sure exactly what you're trying to achieve with that so I don't have any recommendations for you, but I would suggest reconsidering your design in that area.
Other Thoughts
Once you get this code working you should take it over to Code Review because it is a bit of a mess. Here are a few thoughts for you:
Exception Handling
You're not actually "handling" the exception in threadfuncqueue(q). All the finally statement does is allow you to execute cleanup code in the event of an exception. It does not actually catch and handle the exception. The exception will still travel up the call stack. Consider this example, test.py:
try:
    raise Exception
finally:
    print("Yup!")
print("Nope!")
Output:
Yup!
Traceback (most recent call last):
  File "test.py", line 2, in <module>
    raise Exception
Exception
Notice that "Yup!" got printed while "Nope!" didn't. The code in the finally block was executed, but that didn't stop the exception from propagating up the stack and halting the interpreter. You need the except statement for that:
try:
    raise Exception
except Exception:  # only catch the exceptions you expect
    print("Yup!")
print("Nope!")
Output:
Yup!
Nope!
This time both are printed, because we caught and handled the exception.
Exception Raising
Your current method of raising the exception in your thread is needlessly complicated. Instead of creating the whole ImRaiseError class, just raise the exception you want with a string:
raise Exception('Whatever error message I want')
If you find yourself manually manipulating mangled names (like self.__class__.__name__), you're usually doing something wrong.
Extra Parentheses
Using parentheses around conditional expressions is generally frowned upon in Python:
if(1 == i): # unnecessary extra characters
Try to break the C/C++/Java habit and get rid of them:
if 1 == i:
Other
I've already gone beyond the scope of this question, so I'm going to cut this off now, but there are a few other things you could clean up and make more idiomatic. Head over to Code Review when you're done here and see what else can be improved.
