It is possible to retrieve the outputs of workers with Pool.map, but when one worker fails, an exception is raised and it's not possible to retrieve the outputs anymore. So, my idea was to log the outputs in a process-synchronized queue so as to retrieve the outputs of all successful workers.
The following snippet seems to work:
from multiprocessing import Pool, Manager
from functools import partial

def f(x, queue):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Manager().Queue()
    pool = Pool(2)
    try:
        pool.map(partial(f, queue=queue), range(6))
        pool.close()
        pool.join()
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
But I was wondering whether a race condition could occur during the queue polling phase. I'm not sure whether the queue process will necessarily be alive when all workers have completed. Do you think my code is correct from that point of view?
As far as "how to correctly handle exceptions", which is your main question:
First, in your case, you will never get to execute pool.close and pool.join. But pool.map will not return until all the submitted tasks have returned their results or generated an exception, so you really don't need to call these to be sure that all of your submitted tasks have been completed. If it weren't for worker function f writing the results to a queue, you would never be able to get any results back using map as long as any of your tasks resulted in an exception. You would instead have to submit individual tasks with apply_async and get an AsyncResult instance for each one.
So I would say that a better way of handling exceptions in your worker functions without having to resort to using a queue would be as follows. But note that when you use apply_async, tasks are being submitted one task at a time, which can result in many shared memory accesses. This becomes a performance issue really only when the number of tasks being submitted is very large. In that case, it would be better for the worker functions to handle the exceptions themselves and somehow pass back an error indication, to allow the use of map or imap, where you could specify a chunksize (a sketch of that idea follows the first example below).
When using a queue, be aware that writing to a managed queue has a fair bit of overhead. The second piece of code shows how you can reduce that overhead a bit by using a multiprocessing.Queue instance, which, unlike the managed queue, does not use a proxy. Note the output order, which is not the order in which the tasks were submitted but rather the order in which the tasks completed -- another potential downside or upside to using a queue (you can use a callback function with apply_async if you want the results in the order completed). Even with your original code you should not depend on the order of results in the queue.
from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

if __name__ == '__main__':
    pool = Pool(2)
    results = [pool.apply_async(f, args=(x,)) for x in range(6)]
    for x, result in enumerate(results):  # result is an AsyncResult instance
        try:
            return_value = result.get()
        except:
            print(f'An error occurred for x = {x}')
        else:
            print(f'For x = {x} the return value is {return_value}')
Prints:
For x = 0 the return value is 0
For x = 1 the return value is 1
For x = 2 the return value is 2
For x = 3 the return value is 3
An error occurred for x = 4
For x = 5 the return value is 5
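As a side note, here is a minimal sketch (not part of the original answer) of the map/imap idea mentioned above: the worker catches its own exception and returns it as an ordinary value, so map can still be used with a chunksize.

from multiprocessing import Pool

def f(x):
    try:
        if x == 4:
            raise Exception("Error")
        return x
    except Exception as e:
        # pass the error indication back as a normal return value
        return e

if __name__ == '__main__':
    with Pool(2) as pool:
        for x, result in enumerate(pool.map(f, range(6), chunksize=2)):
            if isinstance(result, Exception):
                print(f'An error occurred for x = {x}')
            else:
                print(f'For x = {x} the return value is {result}')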
OP's Original Code Modified to Use multiprocessing.Queue
from multiprocessing import Pool, Queue

def init_pool(q):
    global queue
    queue = q

def f(x):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Queue()
    pool = Pool(2, initializer=init_pool, initargs=(queue,))
    try:
        pool.map(f, range(6))
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
Prints:
An error occurred
Output => 0
Output => 2
Output => 3
Output => 1
Output => 5
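Also as an aside (again not part of the original answer), a sketch of the callback point mentioned earlier: with apply_async you can pass a callback (and an error_callback) and collect results in the order the tasks complete, assuming the same worker f as above.

from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

if __name__ == '__main__':
    completed = []   # filled in the order tasks complete
    errors = []
    with Pool(2) as pool:
        for x in range(6):
            pool.apply_async(f, args=(x,),
                             callback=completed.append,
                             error_callback=errors.append)
        pool.close()
        pool.join()   # wait for every task, successful or not
    print('Results in completion order:', completed)
    print('Errors:', errors)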
Related
I'm attempting to use multiprocessing to run many simulations across multiple processes; however, the code I have written only uses 1 of the processes as far as I can tell.
Updated
I've gotten all the processes to work (I think) thanks to @PaulBecotte; however, the multiprocessing version seems to run significantly slower than its non-multiprocessing counterpart.
For instance, not including the function and class declarations/implementations and imports, I have:
def monty_hall_sim(num_trial, player_type='AlwaysSwitchPlayer'):
    if player_type == 'NeverSwitchPlayer':
        player = NeverSwitchPlayer('Never Switch Player')
    else:
        player = AlwaysSwitchPlayer('Always Switch Player')
    return (MontyHallGame().play_game(player) for trial in xrange(num_trial))

def do_work(in_queue, out_queue):
    while True:
        try:
            f, args = in_queue.get()
            ret = f(*args)
            for result in ret:
                out_queue.put(result)
        except:
            break

def main():
    logging.getLogger().setLevel(logging.ERROR)

    always_switch_input_queue = multiprocessing.Queue()
    always_switch_output_queue = multiprocessing.Queue()

    total_sims = 20
    num_processes = 5
    process_sims = total_sims/num_processes

    with Timer(timer_name='Always Switch Timer'):
        for i in xrange(num_processes):
            always_switch_input_queue.put((monty_hall_sim, (process_sims, 'AlwaysSwitchPlayer')))

        procs = [multiprocessing.Process(target=do_work,
                                         args=(always_switch_input_queue, always_switch_output_queue))
                 for i in range(num_processes)]
        for proc in procs:
            proc.start()

        always_switch_res = []
        while len(always_switch_res) != total_sims:
            always_switch_res.append(always_switch_output_queue.get())

        always_switch_success = float(always_switch_res.count(True))/float(len(always_switch_res))

        print '\tLength of Always Switch Result List: {alw_sw_len}'.format(alw_sw_len=len(always_switch_res))
        print '\tThe success average of switching doors was: {alw_sw_prob}'.format(alw_sw_prob=always_switch_success)
which yields:
Time Elapsed: 1.32399988174 seconds
Length: 20
The success average: 0.6
However, I am attempting to use this for total_sims = 10,000,000 over num_processes = 5, and doing so has taken significantly longer than using 1 process (1 process returned in ~3 minutes). The non-multiprocessing counterpart I'm comparing it to is:
def main():
    logging.getLogger().setLevel(logging.ERROR)

    with Timer(timer_name='Always Switch Monty Hall Timer'):
        always_switch_res = [MontyHallGame().play_game(AlwaysSwitchPlayer('Monty Hall')) for x in xrange(10000000)]

        always_switch_success = float(always_switch_res.count(True))/float(len(always_switch_res))

        print '\n\tThe success average of not switching doors was: {not_switching}' \
              '\n\tThe success average of switching doors was: {switching}'.format(not_switching=never_switch_success,
                                                                                   switching=always_switch_success)
You could try import "process" under some if statements.
EDIT- you changed some stuff, let me try and explain a bit better.
Each message you put into the input queue will cause the monty_hall_sim function to get called and send num_trial messages to the output queue.
So your original implementation was right: to get 20 output messages, send in 5 input messages.
However, your function is slightly wrong.
for trial in xrange(num_trial):
    res = MontyHallGame().play_game(player)
    yield res
This will turn the function into a generator that will provide a new value on each next() call -- great! The problem is here:
while True:
    try:
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        out_queue.put(ret.next())
    except:
        break
Here, on each pass through the loop you create a NEW generator with a NEW message. The old one is thrown away. So here, each input message only adds a single output message to the queue before you throw it away and get another one. The correct way to write this is:
while True:
    try:
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        for result in ret:
            out_queue.put(result)
    except:
        break
Doing it this way will continue to yield output messages from the generator until it finishes (after yielding 4 messages in this case)
I was able to get my code to run significantly faster by changing monty_hall_sim's return to a list comprehension, having do_work add the lists to the output queue, and then extend the results list of main with the lists returned by the output queue. Made it run in ~13 seconds.
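For reference, a rough sketch of those changes (my reconstruction, not the poster's actual code; MontyHallGame, the player classes, total_sims and the queues are assumed from the snippets above):

def monty_hall_sim(num_trial, player_type='AlwaysSwitchPlayer'):
    if player_type == 'NeverSwitchPlayer':
        player = NeverSwitchPlayer('Never Switch Player')
    else:
        player = AlwaysSwitchPlayer('Always Switch Player')
    # return a list instead of a generator
    return [MontyHallGame().play_game(player) for trial in xrange(num_trial)]

def do_work(in_queue, out_queue):
    while True:
        try:
            f, args = in_queue.get(timeout=1)
            # put the whole list of results on the output queue in one go
            out_queue.put(f(*args))
        except:
            break

# in main(): extend the flat result list with each list pulled off the queue
always_switch_res = []
while len(always_switch_res) != total_sims:
    always_switch_res.extend(always_switch_output_queue.get())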
I have a piece of code like the one below:
pool = multiprocessing.Pool(10)

for i in range(300):
    for m in range(500):
        data = do_some_calculation(resource)
        pool.apply_async(paralized_func, data, callback=update_resource)

    # need to wait for all processes to finish
    # {...}

    # Summarize resource
    do_something_with_resource(resource)
So basically I have 2 loops. I init the process pool once, outside the loops, to avoid the overhead of recreating it. At the end of the 2nd loop, I want to summarize the results of all the processes.
The problem is that I can't use pool.map() to wait because of the variation in the data input. I can't use pool.join() and pool.close() either, because I still need to use the pool in the next iteration of the 1st loop.
What is a good way to wait for the processes to finish in this case?
I tried checking pool._cache at the end of the 2nd loop:
while len(process_pool._cache) > 0:
    sleep(0.001)
This way works but looks weird. Is there a better way to do this?
apply_async will return an AsyncResult object. This object has a wait([timeout]) method that you can use.
Example:
pool = multiprocessing.Pool(10)

for i in range(300):
    results = []
    for m in range(500):
        data = do_some_calculation(resource)
        result = pool.apply_async(paralized_func, data, callback=update_resource)
        results.append(result)

    [result.wait() for result in results]
    # need to wait for all processes to finish
    # {...}

    # Summarize resource
    do_something_with_resource(resource)
I haven't checked this code as it is not executable, but it should work.
There's an issue with the most upvoted answer:
[result.wait() for result in results]
will not act as a barrier if some of the workers raise an exception: wait() considers an exception sufficient reason to return. Here's a possible way to check whether all workers have finished processing.
while True:
    time.sleep(1)
    # catch exception if results are not ready yet
    try:
        ready = [result.ready() for result in results]
        successful = [result.successful() for result in results]
    except Exception:
        continue
    # exit loop if all tasks returned success
    if all(successful):
        break
    # raise exception reporting exceptions received from workers
    if all(ready) and not all(successful):
        raise Exception(f'Workers raised following exceptions {[result._value for result in results if not result.successful()]}')
Or you can use a callback to record how many results have come back.
pool = multiprocessing.Pool(10)

for i in range(300):
    results = []                      # filled by the callback as tasks complete

    def on_done(return_value):
        results.append(return_value)

    for m in range(500):
        data = do_some_calculation(resource)
        pool.apply_async(paralized_func, data, callback=on_done)

    # need to wait for all processes to finish
    while len(results) < 500:
        pass

    # Summarize resource
    do_something_with_resource(resource)
My problem is, whenever I use thr.result() the program acts like it's running on one thread. But when I don't use thr.result() it will use x threads.
So if I remove my if statement, it will run on 10 threads; if I have it in there, it will act like it's on 1 thread.
from concurrent.futures import ThreadPoolExecutor
import requests

def search(query):
    r = requests.get("https://www.google.com/search?q=" + query)
    return r.status_code

pool = ThreadPoolExecutor(max_workers=10)

for i in range(50):
    thr = pool.submit(search, "stocks")
    print(i)
    if thr.result() != 404:
        print("Ran")

pool.shutdown(wait=True)
That's because result will wait for the future to complete:
Return the value returned by the call. If the call hasn’t yet completed then this method will wait up to timeout seconds. If the call hasn’t completed in timeout seconds, then a concurrent.futures.TimeoutError will be raised. timeout can be an int or float. If timeout is not specified or None, there is no limit to the wait time.
When you call result within the loop you submit a task, then wait for it to complete, and only then submit another one, so there can be only one task running at a time.
Update: You can either store the returned futures in a list and iterate over them once you have submitted all the tasks (a sketch of that approach follows the output below), or use map:
from concurrent.futures import ThreadPoolExecutor
import time

def square(x):
    time.sleep(0.3)
    return x * x

print(time.time())
with ThreadPoolExecutor(max_workers=3) as pool:
    for res in pool.map(square, range(10)):
        print(res)
print(time.time())
Output:
1485845609.983702
0
1
4
9
16
25
36
49
64
81
1485845611.1942203
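And here is a minimal sketch of the first option (not from the original answer), reusing the question's search function: submit everything first, keep the futures in a list, and only call result() once all 50 tasks are queued.

from concurrent.futures import ThreadPoolExecutor
import requests

def search(query):
    r = requests.get("https://www.google.com/search?q=" + query)
    return r.status_code

pool = ThreadPoolExecutor(max_workers=10)

# submit all tasks first; nothing blocks here
futures = [pool.submit(search, "stocks") for i in range(50)]

# now iterate and wait; the remaining tasks keep running in parallel
for i, thr in enumerate(futures):
    print(i)
    if thr.result() != 404:
        print("Ran")

pool.shutdown(wait=True)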
The following code starts three processes; they are in a pool to handle 20 worker calls:
import multiprocessing

def worker(nr):
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Is there a way to start the processes in a sequence (as opposed to having them starting all at the same time), with a delay inserted between each process start?
If not using a Pool I would have used multiprocessing.Process(target=worker, args=(nr,)).start() in a loop, starting them one after the other and inserting the delay as needed. I find Pool to be extremely useful, though (together with the map call) so I would be glad to keep it if possible.
According to the documentation, no such control over pooled processes exists. You could, however, simulate it with a lock:
import multiprocessing
import time

lock = multiprocessing.Lock()

def worker(nr):
    lock.acquire()
    time.sleep(0.100)
    lock.release()
    print(nr)

numbers = [i for i in range(20)]

if __name__ == '__main__':
    multiprocessing.freeze_support()
    pool = multiprocessing.Pool(processes=3)
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Your 3 processes will still start simultaneously. Well, what I mean is that you don't have control over which process starts executing the callback first. But at least you get your delay. This effectively has each worker "starting" (but really, continuing) at designated intervals.
Amendment resulting from discussion below:
Note that on Windows it's not possible to inherit a lock from a parent process. Instead, you can use multiprocessing.Manager().Lock() to communicate a global lock object between processes (with additional IPC overhead, of course). The global lock object needs to be initialized in each worker process as well. This would look like:
from multiprocessing import Process, freeze_support
import multiprocessing
import time
from datetime import datetime as dt

def worker(nr):
    glock.acquire()
    print('started job: {} at {}'.format(nr, dt.now()))
    time.sleep(1)
    glock.release()
    print('ended job: {} at {}'.format(nr, dt.now()))

numbers = [i for i in range(6)]

def init(lock):
    global glock
    glock = lock

if __name__ == '__main__':
    multiprocessing.freeze_support()
    lock = multiprocessing.Manager().Lock()
    pool = multiprocessing.Pool(processes=3, initializer=init, initargs=(lock,))
    results = pool.map(worker, numbers)
    pool.close()
    pool.join()
Couldn't you do something simple like this:
from multiprocessing import Process
from time import sleep

def f(n):
    print 'started job: '+str(n)
    sleep(3)
    print 'ended job: '+str(n)

if __name__ == '__main__':
    for i in range(0,100):
        p = Process(target=f, args=(i,))
        p.start()
        sleep(1)
Result
started job: 0
started job: 1
started job: 2
ended job: 0
started job: 3
ended job: 1
started job: 4
ended job: 2
started job: 5
Could you try defining a function that yields your values slowly?
def get_numbers_on_delay(numbers, delay):
    for i in numbers:
        yield i
        time.sleep(delay)
and then:
results = pool.map(worker, get_numbers_on_delay(numbers, 5))
I haven't tested it, so I'm not sure, but give it a shot.
I couldn't get the locking answer to work for some reason, so I implemented it this way.
I realize the question is old, but maybe someone else has the same problem.
It spawns all the processes similar to the locking solution, but sleeps before the work based on the number in the process name.
from multiprocessing import current_process
from re import search
from time import sleep

def worker():
    process_number = search(r'\d+', current_process().name).group()
    time_between_workers = 5
    sleep(time_between_workers * int(process_number))
    # do your work here
Since the names given to the processes seem to be unique and incremental, this grabs the number of the process and sleeps based on that.
SpawnPoolWorker-1 sleeps 1 * 5 seconds, SpawnPoolWorker-2 sleeps 2 * 5 seconds etc.
This is a followup question to this. User Will suggested using a queue, I tried to implement that solution below. The solution works just fine with j=1000, however, it hangs as I try to scale to larger numbers. I am stuck here and cannot determine why it hangs. Any suggestions would be appreciated. Also, the code is starting to get ugly as I keep messing with it, I apologize for all the nested functions.
def run4(j):
    """
    a multicore approach using queues
    """
    from multiprocessing import Process, Queue, cpu_count
    import os

    def bazinga(uncrunched_queue, crunched_queue):
        """
        Pulls the next item off the queue, generates its collatz
        length and puts the result on the crunched queue
        """
        num = uncrunched_queue.get()
        while num != 'STOP':  # Signal that there are no more numbers
            length = len(generateChain(num, []))
            crunched_queue.put([num, length])
            num = uncrunched_queue.get()

    def consumer(crunched_queue):
        """
        A process to pull data off the queue and evaluate it
        """
        maxChain = 0
        biggest = 0
        while not crunched_queue.empty():
            a, b = crunched_queue.get()
            if b > maxChain:
                biggest = a
                maxChain = b
        print('%d has a chain of length %d' % (biggest, maxChain))

    uncrunched_queue = Queue()
    crunched_queue = Queue()
    numProcs = cpu_count()

    for i in range(1, j):  # Load up the queue with our numbers
        uncrunched_queue.put(i)

    for i in range(numProcs):  # put sufficient stops at the end of the queue
        uncrunched_queue.put('STOP')

    ps = []
    for i in range(numProcs):
        p = Process(target=bazinga, args=(uncrunched_queue, crunched_queue))
        p.start()
        ps.append(p)

    p = Process(target=consumer, args=(crunched_queue,))
    p.start()
    ps.append(p)

    for p in ps: p.join()
You're putting 'STOP' poison pills into your uncrunched_queue (as you should), and having your producers shut down accordingly; on the other hand your consumer only checks for emptiness of the crunched queue:
while not crunched_queue.empty():
(this working at all depends on a race condition, btw, which is not good)
When you start throwing non-trivial work units at your bazinga producers, they take longer. If all of them take long enough, your crunched_queue dries up, and your consumer dies. I think you may be misidentifying what's happening - your program doesn't "hang", it just stops outputting stuff because your consumer is dead.
You need to implement a smarter methodology for shutting down your consumer. Either look for n poison pills, where n is the number of producers (who accordingly each toss one in the crunched_queue when they shut down), or use something like a Semaphore that counts up for each live producer and down when one shuts down.
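A minimal sketch of the poison-pill approach (my own illustration, assuming the generateChain function and the queues from the question): each producer puts one 'STOP' on the crunched queue when it exits, and the consumer keeps reading until it has seen one 'STOP' per producer.

def bazinga(uncrunched_queue, crunched_queue):
    num = uncrunched_queue.get()
    while num != 'STOP':
        length = len(generateChain(num, []))
        crunched_queue.put([num, length])
        num = uncrunched_queue.get()
    crunched_queue.put('STOP')   # tell the consumer this producer is done

def consumer(crunched_queue, num_producers):
    maxChain = 0
    biggest = 0
    stops_seen = 0
    while stops_seen < num_producers:
        item = crunched_queue.get()   # blocks, so no race on empty()
        if item == 'STOP':
            stops_seen += 1
            continue
        a, b = item
        if b > maxChain:
            biggest = a
            maxChain = b
    print('%d has a chain of length %d' % (biggest, maxChain))

# started as: Process(target=consumer, args=(crunched_queue, numProcs))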