I have been struggling with multiprocessing in Python for a while. I would like to run two independent functions simultaneously, wait until both calculations are finished, and then continue with the output of both functions. Something like this:
# Function A:
def jobA(num):
    result = num * 2
    return result
# Function B:
def jobB(num):
    result = num ^ 3  # note: in Python ^ is bitwise XOR; exponentiation is **
    return result
# Parallel process function (pseudocode):
{resultA,resultB}=runInParallel(jobA(num),jobB(num))
I found other examples of multiprocessing, but they used only one function or didn't return an output. Does anyone know how to do this? Many thanks!
I'd recommend creating processes manually (rather than as part of a pool), and sending the return values to the main process through a multiprocessing.Queue. These queues can share almost any Python object in a safe and relatively efficient way.
Here's an example, using the jobs you've posted.
import multiprocessing as mp
def jobA(num, q):
    q.put(num * 2)
def jobB(num, q):
    q.put(num ^ 3)  # ^ is bitwise XOR in Python, so 2 ^ 3 == 1
if __name__ == '__main__':
    q = mp.Queue()
    jobs = (jobA, jobB)
    args = ((10, q), (2, q))
    for job, arg in zip(jobs, args):
        mp.Process(target=job, args=arg).start()
    for i in range(len(jobs)):
        print('Result of job {} is: {}'.format(i, q.get()))
This prints out (results are read in the order they arrive, so the two values may be swapped between runs):
Result of job 0 is: 20
Result of job 1 is: 1
But you can of course do whatever further processing you'd like using these values.
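Because q.get() returns results in completion order, the loop index in the example above does not necessarily identify the job. A minimal sketch of one way around this is to tag each result with a job name. The names run_job, run_in_parallel, "A" and "B" are hypothetical, and job_b uses ** on the assumption that exponentiation was intended in the question.

```python
import multiprocessing as mp

def job_a(num):
    return num * 2

def job_b(num):
    return num ** 3   # assumption: exponentiation was intended, not XOR (^)

def run_job(name, func, arg, queue):
    """Run func(arg) in the child process and send back (name, result)."""
    queue.put((name, func(arg)))

def run_in_parallel(jobs):
    """jobs maps a name to (function, argument); returns {name: result}."""
    queue = mp.Queue()
    procs = [mp.Process(target=run_job, args=(name, func, arg, queue))
             for name, (func, arg) in jobs.items()]
    for p in procs:
        p.start()
    # Read one result per job before joining, so no child blocks on a full pipe.
    results = dict(queue.get() for _ in procs)
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    out = run_in_parallel({"A": (job_a, 10), "B": (job_b, 2)})
    print(out["A"], out["B"])
```

The dictionary lookup replaces the positional assumption, so further processing can refer to each result by name regardless of which process finished first.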
I want to use Pool to split a task among n workers. When I use map with a task function that takes one argument, all the cores are used and all tasks are launched simultaneously.
On the other hand, when I use starmap, the tasks are launched one by one and I never reach 100% CPU load.
I want to use starmap for my case because I want to pass a second argument, but there's no use if it doesn't take advantage of multiprocessing.
This is the code that works
import numpy as np
from multiprocessing import Pool
# df_a = just a pandas dataframe which I split in n parts and I
#        feed each part to a task. Each one may have a few
#        thousand rows
n_jobs = 16
def run_parallel(df_a):
    dfs_a = np.array_split(df_a, n_jobs)
    print("done split")
    pool = Pool(n_jobs)
    result = pool.map(task_function, dfs_a)
    return result
def task_function(left_df):
    print("in task function")
    # execute task...
    return result
result = run_parallel(df_a)
In this case, "in task function" is printed 16 times at the same time.
This is the code that doesn't work
from itertools import repeat
n_jobs = 16
# df_b: a big pandas dataframe (~1.7M rows, ~20 columns) which I
#       want to send to each task as is
def run_parallel(df_a, df_b):
    dfs_a = np.array_split(df_a, n_jobs)
    print("done split")
    pool = Pool(n_jobs)
    result = pool.starmap(task_function, zip(dfs_a, repeat(df_b)))
    return result
def task_function(left_df, right_df):
    print("in task function")
    # execute task
    return result
result = run_parallel(df_a, df_b)
Here, "in task function" is printed sequentially and the processors never reach 100% capacity. I also tried workarounds based on this answer:
https://stackoverflow.com/a/5443941/6941970
but no luck. Even when I used map in this way:
from functools import partial
pool.map(partial(task_function, right_df=df_b), dfs_a)
thinking that repeat(*very big df*) might introduce memory issues, there still wasn't any real parallelization.
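A possible explanation (an assumption, the question does not confirm it) is that both starmap with repeat(df_b) and map with partial pickle the large frame and send it again with every task, so the workers mostly wait for data. One commonly used workaround is to hand the large object to each worker once through the Pool initializer. The sketch below uses only the standard library: a set stands in for df_b, lists stand in for the parts of df_a, and init_worker, split and the body of task_function are hypothetical.

```python
from multiprocessing import Pool

# Module-level slot filled once per worker; stands in for df_b.
_shared = None

def init_worker(big_obj):
    """Runs once in each worker process and stores the shared object."""
    global _shared
    _shared = big_obj

def task_function(chunk):
    """Hypothetical task: count the chunk items found in the shared object."""
    return sum(1 for item in chunk if item in _shared)

def split(items, n_parts):
    """Split a list into n_parts contiguous pieces of near-equal size."""
    size, extra = divmod(len(items), n_parts)
    parts, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        parts.append(items[start:end])
        start = end
    return parts

if __name__ == "__main__":
    big = set(range(0, 1_000_000, 2))      # stands in for df_b
    chunks = split(list(range(1000)), 4)   # stands in for the parts of df_a
    with Pool(4, initializer=init_worker, initargs=(big,)) as pool:
        counts = pool.map(task_function, chunks)
    print(sum(counts))
```

With this layout only the small chunks travel with each task, while the large object is transferred at most once per worker.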
I'm diving into the multiprocessing world in Python.
After watching some videos I came up with a question due to the nature of my function.
This function takes 4 arguments:
The first argument is a file to be read; I have a list of such files.
The following 2 arguments are two different dictionaries.
The last argument is an optional argument "debug_mode", which needs to be set to True.
import concurrent.futures
import time
# process_data(file, signals_dict, parameter_dict, debug_mode=False)
file_list = [...]
t1 = time.time()
with concurrent.futures.ProcessPoolExecutor() as executor:
    executor.map(process_data, file_list)
t2 = time.time()
The question is:
How can I specify the remaining parameters to the function?
Thanks in advance
The ProcessPoolExecutor.map documentation is weak. The worker is called with one item from each iterable you pass, so with a single iterable it receives a single parameter. If your target has a different call signature, you need to write an intermediate worker that is passed a container and knows how to expand it into the parameter list. The documentation also does not make the pool's lifetime obvious: leaving the with block shuts the pool down, and with ProcessPoolExecutor that shutdown waits for the submitted jobs to finish, so it is simplest to consume the results inside the block.
import concurrent.futures
import os
def process_data(a, b, c, d):
    print(os.getpid(), a, b, c, d)
    return a
def _process_data_worker(p):
    return process_data(*p)
if __name__ == "__main__":
    file_list = [["fooa", "foob", "fooc", "food"],
                 ["bara", "barb", "barc", "bard"]]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(_process_data_worker, file_list)
        for result in results:
            print('result', result)
You need to create a list of lists containing parameters for each process:
params_list = [[file1, dict1_1, dict2_1, True],
               [file2, dict1_2, dict2_2, True],
               [file3, dict1_3, dict2_3]]
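Then you can create the processes as shown below. Because executor.map passes each inner list to the target as a single argument, the target has to unpack it, for example with a wrapper like _process_data_worker from the answer above:
executor.map(_process_data_worker, params_list)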
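When the dictionaries and the debug flag are the same for every file, another option is to bind them once with functools.partial and map only over the file list. This is a minimal sketch under that assumption: the body of process_data, the file names and the dictionary contents are placeholders, not the asker's real data.

```python
import concurrent.futures
from functools import partial

def process_data(file, signals_dict, parameter_dict, debug_mode=False):
    """Hypothetical stand-in for the real function; returns a short summary."""
    return (file, len(signals_dict), len(parameter_dict), debug_mode)

if __name__ == "__main__":
    file_list = ["a.txt", "b.txt", "c.txt"]   # placeholder file names
    signals = {"s1": 1}                       # placeholder dictionaries
    parameters = {"p1": 2, "p2": 3}
    # Bind the arguments that are identical for every file.
    worker = partial(process_data, signals_dict=signals,
                     parameter_dict=parameters, debug_mode=True)
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for result in executor.map(worker, file_list):
            print(result)
```

A partial of a module-level function can be pickled, which is what the process pool needs; a lambda in the same position could not be sent to the workers.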
Using multiprocessing.Pool I can split an input list for a single function so that it is processed in parallel across multiple CPUs. Like this:
from multiprocessing import Pool
def f(x):
    return x*x
if __name__ == '__main__':
    pool = Pool(processes=4)
    results = pool.map(f, range(100))
    pool.close()
    pool.join()
However, this does not allow running different functions on different processors. I would like to do something like this, in parallel / simultaneously:
foo1(args1) --> Processor1
foo2(args2) --> Processor2
How can this be done?
Edit: After Darkonaut's remarks: I do not care about specifically assigning foo1 to processor number 1. It can be any processor chosen by the OS. I am just interested in running independent functions in different, parallel processes. So rather:
foo1(args1) --> process1
foo2(args2) --> process2
I usually find it easiest to use the concurrent.futures module for concurrency. You can achieve the same with multiprocessing, but concurrent.futures has (IMO) a much nicer interface.
Your example would then be:
from concurrent.futures import ProcessPoolExecutor
def foo1(x):
    return x * x
def foo2(x):
    return x * x * x
if __name__ == '__main__':
    with ProcessPoolExecutor(2) as executor:
        # these return immediately and are executed in parallel, on separate processes
        future_1 = executor.submit(foo1, 1)
        future_2 = executor.submit(foo2, 2)
    # get results / re-raise exceptions that were thrown in workers
    result_1 = future_1.result()  # contains foo1(1)
    result_2 = future_2.result()  # contains foo2(2)
If you have many inputs, it is better to use executor.map with the chunksize argument instead:
from concurrent.futures import ProcessPoolExecutor
def foo1(x):
    return x * x
def foo2(x):
    return x * x * x
if __name__ == '__main__':
    with ProcessPoolExecutor(4) as executor:
        # these return immediately and are executed in parallel, on separate processes
        future_1 = executor.map(foo1, range(10000), chunksize=100)
        future_2 = executor.map(foo2, range(10000), chunksize=100)
    # executor.map returns an iterator which we have to consume to get the results
    result_1 = list(future_1)  # contains [foo1(x) for x in range(10000)]
    result_2 = list(future_2)  # contains [foo2(x) for x in range(10000)]
Note that the optimal values for chunksize and the number of processes, and whether process-based concurrency actually improves performance at all, depend on many factors:
The runtime of foo1 / foo2. If they are extremely cheap (as in this example), the communication overhead between processes might dominate the total runtime.
Spawning a process takes time, so the code inside with ProcessPoolExecutor needs to run long enough for this to amortize.
The actual number of physical processors in the machine you are running on.
Whether your application is IO bound or compute bound.
Whether the functions you use in foo are already parallelized (such as some np.linalg solvers, or scikit-learn estimators).
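Since several of these factors are hard to predict, one pragmatic check is to time both variants on your own workload. The sketch below is only an illustration: cheap, run_serial, run_pool and timed are hypothetical names, and the measured times depend entirely on the machine and the task.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def cheap(x):
    """Deliberately tiny task, so process overhead is likely to dominate."""
    return x * x

def run_serial(values):
    return [cheap(v) for v in values]

def run_pool(values, workers=4, chunksize=100):
    with ProcessPoolExecutor(workers) as executor:
        return list(executor.map(cheap, values, chunksize=chunksize))

def timed(func, *args, **kwargs):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start

if __name__ == "__main__":
    values = range(10000)
    serial, t_serial = timed(run_serial, values)
    pooled, t_pool = timed(run_pool, values)
    assert serial == pooled   # both variants must agree before comparing times
    print(f"serial: {t_serial:.4f}s  pool: {t_pool:.4f}s")
```

Varying workers and chunksize in run_pool gives a quick feel for where the overhead stops dominating for a given function.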
I'm performing analyses of time series from simulations. Basically, it's doing the same tasks for every time step. As there is a very high number of time steps, and as the analysis of each of them is independent, I wanted to create a function that can multiprocess another function. The latter will have arguments and return a result.
Using a shared dictionary and the concurrent.futures library, I managed to write this:
import concurrent.futures as Cfut
import multiprocessing as mlp
def multiprocess_loop_grouped(function, param_list, group_size, Nworkers, *args):
    # function : function that is running in parallel
    # param_list : list of items
    # group_size : size of the groups
    # Nworkers : number of groups/items running at the same time
    # *args : extra parameters passed on to function
    manager = mlp.Manager()
    dic = manager.dict()
    executor = Cfut.ProcessPoolExecutor(Nworkers)
    # grouper (not shown) splits param_list into groups of group_size items
    futures = [executor.submit(function, param, dic, *args)
               for param in grouper(param_list, group_size)]
    Cfut.wait(futures)
    return [dic[i] for i in sorted(dic.keys())]
Typically, I can use it like this:
def read_file(files, dictionnary):
    for file in files:
        i = int(file[4:9])
        #print(str(i))
        if 'bz2' in file:
            os.system('bunzip2 ' + file)
            file = file[:-4]
        dictionnary[i] = np.loadtxt(file)
        os.system('bzip2 ' + file)
Map = np.array(multiprocess_loop_grouped(read_file, list_alti, Group_size, N_thread))
or like this:
def autocorr(x):
    result = np.correlate(x, x, mode='full')
    return result[result.size//2:]
def find_lambda_finger(indexes, dic, Deviation):
    for i in indexes:
        #print(str(i))
        # Beach = Deviation[i,:] - np.mean(Deviation[i,:])
        dic[i] = Anls.find_first_max(autocorr(Deviation[i,:]), valmax = True)
args = [Deviation]
Temp = Rescal.multiprocess_loop_grouped(find_lambda_finger, range(Nalti), Group_size, N_thread, *args)
Basically, it is working. But it is not working well. Sometimes it crashes. Sometimes it actually launches a number of Python processes equal to Nworkers, and sometimes there are only 2 or 3 of them running at a time although I specified Nworkers = 15.
For example, a classic error I obtain is described in the following topic I raised: Calling matplotlib AFTER multiprocessing sometimes results in error : main thread not in main loop
What is the more Pythonic way to achieve what I want? How can I improve the control over this function? How can I better control the number of running Python processes?
One of the basic concepts of Python multiprocessing is using queues. It works quite well when you have an input list that can be iterated and that does not need to be altered by the sub-processes. It also gives you good control over all the processes, because you spawn the number you want, and you can let them idle or stop them.
It is also a lot easier to debug. Sharing data explicitly is usually an approach that is much more difficult to set up correctly.
Queues can hold any picklable object. So you can fill them with file path strings for reading files, plain numbers for doing calculations, or even images for drawing.
In your case a layout could look like this:
import multiprocessing as mp
import numpy as np
import itertools as it
def worker1(in_queue, out_queue):
    # blocks while nothing is available, stops when 'STOP' is seen
    for a in iter(in_queue.get, 'STOP'):
        result = a  # placeholder: do something with a
        out_queue.put({a: result})  # return your result linked to the input
def worker2(in_queue, out_queue):
    for a in iter(in_queue.get, 'STOP'):
        result = a  # placeholder: do something differently with a
        out_queue.put({a: result})  # return your result linked to the input
def multiprocess_loop_grouped(function, param_list, group_size, Nworkers, *args):
    # your final result
    result = {}
    in_queue = mp.Queue()
    out_queue = mp.Queue()
    # fill your input
    for a in param_list:
        in_queue.put(a)
    # stop command at end of input
    for n in range(Nworkers):
        in_queue.put('STOP')
    # setup your worker process doing task as specified
    process = [mp.Process(target=function,
                          args=(in_queue, out_queue), daemon=True) for x in range(Nworkers)]
    # run processes
    for p in process:
        p.start()
    # collect your results from the calculations; drain the output queue
    # before joining, otherwise a worker with unflushed items can block join()
    for a in param_list:
        result.update(out_queue.get())
    # wait for processes to finish
    for p in process:
        p.join()
    return result
temp = multiprocess_loop_grouped(worker1, param_list, group_size, Nworkers, *args)
Map = multiprocess_loop_grouped(worker2, param_list, group_size, Nworkers, *args)
It can be made a bit more dynamic if you are afraid that your queues will run out of memory. Then you need to fill and empty the queues while the processes are running. See this example here.
Final words: it is not more Pythonic as you requested. But it is easier to understand for a newbie ;-)
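For comparison with the queue layout, here is a hedged sketch of a variant built on concurrent.futures in which each call returns its result instead of writing to shared state. It is not the answer's method: analyse_step and multiprocess_loop are hypothetical names, and using chunksize as a stand-in for group_size is an assumption about what the grouping was meant to achieve.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import repeat

def analyse_step(index, scale):
    """Hypothetical per-time-step analysis that returns its result."""
    return index * scale

def multiprocess_loop(function, param_list, group_size, n_workers, *args):
    """Call function(param, *args) for each param; results keep input order."""
    fixed = [repeat(arg) for arg in args]   # same extra arguments for every call
    with ProcessPoolExecutor(max_workers=n_workers) as executor:
        # chunksize plays the role of group_size: items are sent in batches
        return list(executor.map(function, param_list, *fixed,
                                 chunksize=group_size))

if __name__ == "__main__":
    results = multiprocess_loop(analyse_step, range(20), 5, 4, 10)
    print(results[:3])
```

Because executor.map yields results in input order, no sorting by key is needed afterwards, and max_workers caps the number of worker processes.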
I want to execute f1 and f2 at the same time, but the following code doesn't work!
from multiprocessing import Pool
def f1(x):
    return x*x
def f2(x):
    return x^2
if __name__ == '__main__':
    x1=10
    x2=20
    p= Pool(2)
    out=(p.map([f1, f2], [x1, x2]))
    y1=out[0]
    y2=out[1]
I believe you'd like to use threading.Thread and a shared queue in your code.
from queue import Queue
from threading import Thread
import time
def f1(q, x):
    # Sleep function added to compare execution times.
    time.sleep(5)
    # Instead of returning the result we put it in shared queue.
    q.put(x * 2)
def f2(q, x):
    time.sleep(5)
    q.put(x ^ 2)
if __name__ == '__main__':
    x1 = 10
    x2 = 20
    result_queue = Queue()
    # We create two threads and pass shared queue to both of them.
    t1 = Thread(target=f1, args=(result_queue, x1))
    t2 = Thread(target=f2, args=(result_queue, x2))
    # Starting threads...
    print("Start: %s" % time.ctime())
    t1.start()
    t2.start()
    # Waiting for threads to finish execution...
    t1.join()
    t2.join()
    print("End: %s" % time.ctime())
    # After threads are done, we can read results from the queue.
    while not result_queue.empty():
        result = result_queue.get()
        print(result)
The code above should print output similar to:
Start: Sat Jul 2 20:50:50 2016
End: Sat Jul 2 20:50:55 2016
20
22
As you can see, even though both functions wait 5 seconds to yield their results, they wait concurrently, so the overall execution time is 5 seconds.
If you care about which function put which result in your queue, I can see two solutions that will allow you to determine that. You can either create multiple queues or wrap your results in a tuple.
def f1(q, x):
    time.sleep(5)
    # Tuple containing function information.
    q.put((f1, x * 2))
And for further simplification (especially when you have many functions to deal with) you can decorate your functions (to avoid repeated code and to allow function calls without queue):
def wrap_result(func):
    def wrapper(*args):
        # Assuming that shared queue is always the last argument.
        q = args[len(args) - 1]
        # We use it to store the results only if it was provided.
        if isinstance(q, Queue):
            function_result = func(*args[:-1])
            q.put((func, function_result))
        else:
            function_result = func(*args)
        return function_result
    return wrapper
@wrap_result
def f1(x):
    time.sleep(5)
    return x * 2
Note that my decorator was written in a rush and its implementation might need improvements (in case your functions accept kwargs, for instance). If you decide to use it, you'll have to pass your arguments in reverse order: t1 = threading.Thread(target=f1, args=(x1, result_queue)).
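One possible refinement along the lines mentioned above is to forward keyword arguments and preserve the wrapped function's metadata with functools.wraps. This is a sketch, not the answer's own code: it keeps the convention that the queue is the last positional argument, stores the function's name rather than the function object (a choice made here for easier printing), and uses a hypothetical example function scale.

```python
import functools
import time
from queue import Queue
from threading import Thread

def wrap_result(func):
    """Put (function name, result) on a queue passed as the last positional argument."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if args and isinstance(args[-1], Queue):
            *call_args, q = args            # split off the trailing queue
            result = func(*call_args, **kwargs)
            q.put((func.__name__, result))
        else:
            result = func(*args, **kwargs)  # plain call without a queue
        return result
    return wrapper

@wrap_result
def scale(x, factor=2):
    """Hypothetical example function with a keyword argument."""
    time.sleep(0.1)
    return x * factor

if __name__ == "__main__":
    results = Queue()
    threads = [Thread(target=scale, args=(10, results)),
               Thread(target=scale, args=(20, results), kwargs={"factor": 3})]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    while not results.empty():
        print(results.get())
```

The decorated function can still be called directly, for example scale(5), in which case nothing is put on any queue.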
A little friendly advice.
"Following code doesn't work" says nothing about the problem. Is it raising an exception? Is it giving unexpected results?
It's important to read error messages and, even more importantly, to study their meaning. The code you provided raises a TypeError with a pretty obvious message:
File ".../stack.py", line 16, in <module>
    out = (p.map([f1, f2], [x1, x2]))
TypeError: 'list' object is not callable
That means the first argument of Pool().map() has to be a callable object, a function for instance. Let's see the docs of that method.
Apply func to each element in iterable, collecting the results in a
list that is returned.
It clearly doesn't allow a list of functions to be passed as its argument.
Here you can read more about Pool().map() method.
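If you still want a single map call, one workaround is to map one callable over (function, argument) pairs. This is a minimal sketch, assuming the f1 and f2 from the question; the dispatcher name call is hypothetical.

```python
from multiprocessing import Pool

def f1(x):
    return x * x

def f2(x):
    return x ^ 2   # bitwise XOR, exactly as written in the question

def call(task):
    """Unpack a (function, argument) pair and call it; this is what map runs."""
    func, arg = task
    return func(arg)

if __name__ == "__main__":
    tasks = [(f1, 10), (f2, 20)]
    with Pool(2) as pool:
        y1, y2 = pool.map(call, tasks)
    print(y1, y2)
```

The first argument of map is now a single function, and the functions to run travel as data; they must be defined at module level so that they can be pickled.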
I want to execute f1 and f2 at the same time, but the following code doesn't work! ...
out=(p.map([f1, f2], [x1, x2]))
The minimal change to your code is to replace the p.map() call with:
r1 = p.apply_async(f1, [x1])
out2 = f2(x2)
out1 = r1.get()
Though if all you want is to run two function calls concurrently, you don't need the Pool() here; you could just start a Thread/Process manually and use a Pipe/Queue to get the result:
#!/usr/bin/env python
from multiprocessing import Process, Pipe
def another_process(f, args, conn):
    conn.send(f(*args))
    conn.close()
if __name__ == '__main__':
    parent_conn, child_conn = Pipe(duplex=False)
    p = Process(target=another_process, args=(f1, [x1], child_conn))
    p.start()
    out2 = f2(x2)
    out1 = parent_conn.recv()
    p.join()