What I want to do is run the same function multiple times simultaneously, getting a result back from each call and storing the results in an array or list. It goes like this:
def base_func(matrix, arg1, arg2):
    result = []
    for row in range(matrix.shape[0]):
        # perform the necessary operation on the row and store the returned value in result
        x = func(matrix[row], arg1, arg2)
        result.append(x)
    return np.array(result)
I tried using threading in Python. My implementation goes:
def base_func(matrix, arg1, arg2):
    result = []
    threads = []
    for row in range(matrix.shape[0]):
        t = threading.Thread(target=func, args=(matrix[row], arg1, arg2))
        threads.append(t)
        t.start()
    for t in threads:
        res = t.join()
        result.append(res)
    return np.array(result)
This doesn't seem to work and just returns None from the threads.
From what I read in the documentation for Thread.join(), it says:
As join() always returns None, you must call is_alive() after join() to decide whether a timeout happened – if the thread is still alive, the join() call timed out.
You will always get None from these lines of your code:
res = t.join()
result.append(res)
This post mentions a similar problem, so please follow it for a solution. You might also want to use the concurrent.futures module, as explained in this answer.
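For reference, here is a minimal sketch of what that could look like with concurrent.futures, assuming func, matrix, arg1 and arg2 are as in your question:

from concurrent.futures import ThreadPoolExecutor
import numpy as np

def base_func(matrix, arg1, arg2):
    with ThreadPoolExecutor() as executor:
        # one task per row; each future holds func's return value
        futures = [executor.submit(func, matrix[row], arg1, arg2)
                   for row in range(matrix.shape[0])]
    # the with block waits for all tasks; result() yields each value in submission order
    return np.array([f.result() for f in futures])

Note that for CPU-bound row operations the GIL may keep threads from giving any speedup; ProcessPoolExecutor offers the same interface if that turns out to be the case.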
It is possible to retrieve the outputs of workers with Pool.map, but when one worker fails, an exception is raised and the outputs can no longer be retrieved. So my idea was to log the outputs in a process-synchronized queue, so as to retrieve the outputs of all the successful workers.
The following snippet seems to work:
from multiprocessing import Pool, Manager
from functools import partial

def f(x, queue):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Manager().Queue()
    pool = Pool(2)
    try:
        pool.map(partial(f, queue=queue), range(6))
        pool.close()
        pool.join()
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
But I was wondering whether a race condition could occur during the queue polling phase. I'm not sure whether the queue process will necessarily be alive when all workers have completed. Do you think my code is correct from that point of view?
As far as "how to correctly handle exceptions", which is your main question:
First, in your case you will never get to execute pool.close and pool.join, since the exception raised by pool.map jumps straight to the except block. But pool.map does not return until all the submitted tasks have either returned their results or raised an exception, so you don't actually need those calls to be sure that all of your submitted tasks have completed. If it weren't for worker function f writing the results to a queue, you would never be able to get any results back using map as long as any of your tasks raised an exception. You would instead have to submit individual tasks with apply_async and get an AsyncResult instance for each one.
So I would say that a better way of handling exceptions in your worker functions, without having to resort to a queue, is as follows. But note that apply_async submits tasks one at a time, which can result in many shared-memory accesses. This only becomes a real performance issue when the number of submitted tasks is very large. In that case it is better for the worker functions to handle exceptions themselves and pass back some error indication, which allows the use of map or imap, where you can specify a chunksize.
When using a queue, be aware that writing to a managed queue carries a fair bit of overhead. The second piece of code shows how to reduce that overhead by using a multiprocessing.Queue instance, which, unlike the managed queue, does not go through a proxy. Note the output order: it is not the order in which the tasks were submitted but the order in which they completed -- another potential downside (or upside) of using a queue. (You can use a callback function with apply_async if you want the results in completion order.) Even with your original code you should not depend on the order of results in the queue.
from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

if __name__ == '__main__':
    pool = Pool(2)
    results = [pool.apply_async(f, args=(x,)) for x in range(6)]
    for x, result in enumerate(results):  # result is an AsyncResult instance
        try:
            return_value = result.get()
        except:
            print(f'An error occurred for x = {x}')
        else:
            print(f'For x = {x} the return value is {return_value}')
Prints:
For x = 0 the return value is 0
For x = 1 the return value is 1
For x = 2 the return value is 2
For x = 3 the return value is 3
An error occurred for x = 4
For x = 5 the return value is 5
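As an aside, here is a minimal sketch of the callback approach mentioned above, which collects return values in completion order rather than submission order (callback and error_callback are standard Pool.apply_async parameters):

from multiprocessing import Pool

def f(x):
    if x == 4:
        raise Exception("Error")
    return x

if __name__ == '__main__':
    completed, errors = [], []
    pool = Pool(2)
    for x in range(6):
        # the callbacks run in the main process as each task finishes
        pool.apply_async(f, args=(x,), callback=completed.append,
                         error_callback=errors.append)
    pool.close()
    pool.join()  # wait for all tasks, successful or not
    print('Results in completion order:', completed)
    print('Errors:', errors)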
OP's Original Code Modified to Use multiprocessing.Queue
from multiprocessing import Pool, Queue

def init_pool(q):
    global queue
    queue = q

def f(x):
    if x == 4:
        raise Exception("Error")
    queue.put_nowait(x)

if __name__ == '__main__':
    queue = Queue()
    pool = Pool(2, initializer=init_pool, initargs=(queue,))
    try:
        pool.map(f, range(6))
    except:
        print("An error occurred")
    while not queue.empty():
        print("Output => " + str(queue.get()))
Prints:
An error occurred
Output => 0
Output => 2
Output => 3
Output => 1
Output => 5
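And here is a hedged sketch of the other idea mentioned above: letting the workers catch their own exceptions and pass back an error indication, so that map or imap can be used with a chunksize. The Failure wrapper class is my own illustration, not part of any library:

from multiprocessing import Pool

class Failure:
    # hypothetical wrapper marking a task that raised an exception
    def __init__(self, exc):
        self.exc = exc

def f(x):
    try:
        if x == 4:
            raise Exception("Error")
        return x
    except Exception as e:
        return Failure(e)  # pass the error back instead of raising

if __name__ == '__main__':
    with Pool(2) as pool:
        for x, result in enumerate(pool.imap(f, range(6), chunksize=2)):
            if isinstance(result, Failure):
                print(f'An error occurred for x = {x}: {result.exc}')
            else:
                print(f'For x = {x} the return value is {result}')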
I'm attempting to use multiprocessing to run many simulations across multiple processes; however, the code I have written only uses 1 of the processes as far as I can tell.
Updated
I've gotten all the processes to work (I think) thanks to @PaulBecotte; however, the multiprocessing version seems to run significantly slower than its non-multiprocessing counterpart.
For instance, not including the function and class declarations/implementations and imports, I have:
def monty_hall_sim(num_trial, player_type='AlwaysSwitchPlayer'):
    if player_type == 'NeverSwitchPlayer':
        player = NeverSwitchPlayer('Never Switch Player')
    else:
        player = AlwaysSwitchPlayer('Always Switch Player')
    return (MontyHallGame().play_game(player) for trial in xrange(num_trial))
def do_work(in_queue, out_queue):
    while True:
        try:
            f, args = in_queue.get()
            ret = f(*args)
            for result in ret:
                out_queue.put(result)
        except:
            break
def main():
    logging.getLogger().setLevel(logging.ERROR)
    always_switch_input_queue = multiprocessing.Queue()
    always_switch_output_queue = multiprocessing.Queue()
    total_sims = 20
    num_processes = 5
    process_sims = total_sims / num_processes

    with Timer(timer_name='Always Switch Timer'):
        for i in xrange(num_processes):
            always_switch_input_queue.put((monty_hall_sim, (process_sims, 'AlwaysSwitchPlayer')))

        procs = [multiprocessing.Process(target=do_work,
                                         args=(always_switch_input_queue, always_switch_output_queue))
                 for i in range(num_processes)]

        for proc in procs:
            proc.start()

        always_switch_res = []
        while len(always_switch_res) != total_sims:
            always_switch_res.append(always_switch_output_queue.get())

        always_switch_success = float(always_switch_res.count(True)) / float(len(always_switch_res))

    print '\tLength of Always Switch Result List: {alw_sw_len}'.format(alw_sw_len=len(always_switch_res))
    print '\tThe success average of switching doors was: {alw_sw_prob}'.format(alw_sw_prob=always_switch_success)
which yields:
Time Elapsed: 1.32399988174 seconds
Length: 20
The success average: 0.6
However, I am attempting to use this for total_sims = 10,000,000 over num_processes = 5, and doing so has taken significantly longer than using 1 process (1 process returned in ~3 minutes). The non-multiprocessing counterpart I'm comparing it to is:
def main():
    logging.getLogger().setLevel(logging.ERROR)
    with Timer(timer_name='Always Switch Monty Hall Timer'):
        always_switch_res = [MontyHallGame().play_game(AlwaysSwitchPlayer('Monty Hall')) for x in xrange(10000000)]
        always_switch_success = float(always_switch_res.count(True)) / float(len(always_switch_res))
    print '\n\tThe success average of not switching doors was: {not_switching}' \
          '\n\tThe success average of switching doors was: {switching}'.format(not_switching=never_switch_success,
                                                                               switching=always_switch_success)
You could try importing Process under some if statements.
EDIT: you changed some stuff, so let me try to explain a bit better.
Each message you put into the input queue causes the monty_hall_sim function to be called and send num_trial messages to the output queue.
So your original implementation was right: to get 20 output messages, send in 5 input messages.
However, your function is slightly wrong.
for trial in xrange(num_trial):
    res = MontyHallGame().play_game(player)
    yield res
This will turn the function into a generator that provides a new value on each next() call. Great! The problem is here:
while True:
    try:
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        out_queue.put(ret.next())
    except:
        break
Here, on each pass through the loop you create a NEW generator from a NEW message; the old one is thrown away. So each input message only adds a single output message to the queue before you discard the generator and get another one. The correct way to write this is:
while True:
    try:
        f, args = in_queue.get(timeout=1)
        ret = f(*args)
        for result in ret:
            out_queue.put(result)
    except:
        break
Doing it this way will continue to yield output messages from the generator until it finishes (after yielding 4 messages in this case).
I was able to get my code to run significantly faster by changing monty_hall_sim's return to a list comprehension, having do_work add the lists to the output queue, and then extend the results list of main with the lists returned by the output queue. Made it run in ~13 seconds.
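For reference, a hedged sketch of that batched arrangement; the function and class names are from the question, and my final version may differ in detail:

def monty_hall_sim(num_trial, player_type='AlwaysSwitchPlayer'):
    if player_type == 'NeverSwitchPlayer':
        player = NeverSwitchPlayer('Never Switch Player')
    else:
        player = AlwaysSwitchPlayer('Always Switch Player')
    # return one list per task instead of a generator, so each input
    # message produces a single batched output message
    return [MontyHallGame().play_game(player) for trial in xrange(num_trial)]

def do_work(in_queue, out_queue):
    while True:
        try:
            f, args = in_queue.get(timeout=1)
            out_queue.put(f(*args))  # one list of num_trial results
        except:
            break

# in main(): one get() per input message, extending instead of appending
# always_switch_res = []
# for i in xrange(num_processes):
#     always_switch_res.extend(always_switch_output_queue.get())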
I have a code piece like below
pool = multiprocessing.Pool(10)

for i in range(300):
    for m in range(500):
        data = do_some_calculation(resource)
        pool.apply_async(paralized_func, data, callback=update_resource)

    # need to wait for all processes to finish
    # {...}

    # Summarize resource
    do_something_with_resource(resource)
So basically I have 2 loops. I initialize the process pool once, outside the loops, to avoid the overhead of recreating it. At the end of the 2nd loop, I want to summarize the results of all the processes.
The problem is that I can't use pool.map() to wait, because of the variation in data input. I can't use pool.join() and pool.close() either, because I still need to use the pool in the next iteration of the 1st loop.
What is the good way to wait for processes to finish in this case?
I tried checking pool._cache at the end of the 2nd loop:

while len(pool._cache) > 0:
    sleep(0.001)

This works, but it looks weird. Is there a better way to do it?
apply_async returns an AsyncResult object. This object has a wait([timeout]) method, which you can use.
Example:
pool = multiprocessing.Pool(10)

for i in range(300):
    results = []
    for m in range(500):
        data = do_some_calculation(resource)
        result = pool.apply_async(paralized_func, data, callback=update_resource)
        results.append(result)

    # wait for all submitted tasks to finish
    [result.wait() for result in results]

    # Summarize resource
    do_something_with_resource(resource)
I haven't checked this code as it is not executable, but it should work.
There's an issue with the most upvoted answer:
[result.wait() for result in results]
will not work as a roadblock if some of the workers raised an exception: wait() considers a task that failed with an exception just as finished as one that succeeded. Here's a possible check that all workers finished processing.
import time

while True:
    time.sleep(1)
    # catch the exception raised if the results are not ready yet
    try:
        ready = [result.ready() for result in results]
        successful = [result.successful() for result in results]
    except Exception:
        continue
    # exit the loop if all tasks returned successfully
    if all(successful):
        break
    # raise an exception reporting the exceptions received from the workers
    if all(ready) and not all(successful):
        raise Exception(f'Workers raised the following exceptions: '
                        f'{[result._value for result in results if not result.successful()]}')
Or you can use a callback to count how many results you have received.
pool = multiprocessing.Pool(10)

for i in range(300):
    results = []

    # a lambda cannot contain a statement like "results += 1", so use a
    # named callback that collects the returned values instead
    def on_result(value):
        results.append(value)

    for m in range(500):
        data = do_some_calculation(resource)
        pool.apply_async(paralized_func, data, callback=on_result)

    # wait for all tasks to finish (note: the callback only fires for
    # tasks that return successfully)
    while len(results) < 500:
        pass

    # Summarize resource
    do_something_with_resource(resource)
I am using a coroutine pipeline for an event-driven data pipeline. Everything is working great so far. I wanted to try processing some of the input in batches, but I need a way to ensure that the final batch is processed once the upstream producer is exhausted. In the contrived example below, this would be a way to print(res) in print_data_cp once produce_data_from was done. A more direct analog would be to print and reset res each time its length == 3, and guarantee that the remaining values in res are printed once the producer is done. I know there are several ways to solve this, but is there an idiomatic approach to this problem (e.g. sentinel value, return remainder, while/finally, wrap in a class)?
For now, I have the coprocess function as part of a class and let res be an instance variable, so I can access it after the coprocess function is complete. This works, but something like a while/finally would be more general.
def produce_data_from(data, consumer):
    next(consumer)
    for x in data:
        consumer.send(x)

def print_data_cp():
    res = []
    while True:
        x = (yield)
        res.append(x)
        print(x)

cons = print_data_cp()
produce_data_from(range(10), cons)
This modification uses try/finally and changes the producer to close the consumer coroutine, which triggers the finally block. Here the consumer relies on the producer sending the close() signal, so modifying the consumer function to batch-process requires modifying the upstream producer function as well. Not ideal, but it works and feels pythonic enough. I would be delighted to see other approaches.
def produce_data_from(data, consumer):
    next(consumer)
    for x in data:
        consumer.send(x)
    consumer.close()

def print_data_cp():
    res = []
    try:
        while True:
            x = (yield)
            res.append(x)
            if len(res) >= 3:
                print(res)
                res = []
    finally:
        print(res)

cons = print_data_cp()
produce_data_from(range(10), cons)
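For comparison, here is a hedged sketch of the sentinel-value approach mentioned in the question, which keeps the flush logic in the consumer without relying on close(); the DONE marker object is my own illustration:

DONE = object()  # unique sentinel marking the end of the stream

def produce_data_from(data, consumer):
    next(consumer)
    for x in data:
        consumer.send(x)
    try:
        consumer.send(DONE)  # the consumer returns after flushing
    except StopIteration:
        pass

def print_data_cp():
    res = []
    while True:
        x = (yield)
        if x is DONE:
            if res:
                print(res)  # flush the final partial batch
            return
        res.append(x)
        if len(res) >= 3:
            print(res)
            res = []

cons = print_data_cp()
produce_data_from(range(10), cons)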
I'm trying to run three functions (each can take up to 1 second to execute) every second. I'd then like to store the output from each function, and write them to separate files.
At the moment I'm using Timers for my delay handling. (I could subclass Thread, but that's getting a bit complicated for this simple script)
def main():
    for i in range(3):
        set_up_function(i)
        t = Timer(1, run_function, [i])
        t.start()
    time.sleep(100)  # Without this, main thread exits

def run_function(i):
    t = Timer(1, run_function, [i])
    t.start()
    print function_with_delay(i)
What's the best way to handle the output from function_with_delay? Append the result to a global list for each function?
Then I could put something like this at the end of my main function:
...
while True:
    time.sleep(30)  # or in a try/except with a loop of 1-second sleeps so I can interrupt
    for i in range(3):
        save_to_disk(data[i])
Thoughts?
Edit: Added my own answer as a possibility
I believe the python Queue module is designed for precisely this sort of scenario. You could do something like this, for example:
def main():
    q = Queue.Queue()
    for i in range(3):
        t = threading.Timer(1, run_function, [q, i])
        t.start()
    while True:
        item = q.get()
        save_to_disk(item)
        q.task_done()

def run_function(q, i):
    t = threading.Timer(1, run_function, [q, i])
    t.start()
    q.put(function_with_delay(i))
I would say store a list of (bool, str) pairs, where the bool records whether the function has finished running and the str holds its output. Each function locks the list with a mutex to append its output (or, if you don't care about thread safety, omit this). Then have a simple polling loop that checks whether all the bool values are True, and if so, makes your save_to_disk calls.
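A minimal sketch of what that could look like; save_to_disk and function_with_delay are from the question, the rest is illustrative:

import threading
import time

lock = threading.Lock()
states = [(False, '')] * 3  # one (finished, output) slot per function

def worker(i):
    output = function_with_delay(i)
    with lock:  # guard the shared list
        states[i] = (True, output)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(3)]
for t in threads:
    t.start()

# simple polling loop
while True:
    with lock:
        if all(done for done, _ in states):
            break
    time.sleep(0.1)

for _, output in states:
    save_to_disk(output)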
Another alternative would be to implement a class (taken from this answer) that uses threading.Lock(). This has the advantage of being able to wait on the ItemStore, and save_to_disk can use getAll, rather than polling the queue. (More efficient for large data sets?)
This is particularly suited to writing at a set time interval (ie every 30 seconds), rather than once per second.
class ItemStore(object):
    def __init__(self):
        self.lock = threading.Lock()
        self.items = []

    def add(self, item):
        with self.lock:
            self.items.append(item)

    def getAll(self):
        with self.lock:
            items, self.items = self.items, []
        return items
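A possible usage sketch for the 30-second interval case; run_function, function_with_delay and save_to_disk are as in the question:

import threading
import time

store = ItemStore()

def run_function(i):
    t = threading.Timer(1, run_function, [i])
    t.start()
    store.add(function_with_delay(i))

for i in range(3):
    run_function(i)

while True:
    time.sleep(30)
    for item in store.getAll():  # drains the store atomically
        save_to_disk(item)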