I have been banging my head against multiprocessing in Python for the better part of a day now, and I've made very little progress. I apologize if my question is a duplicate or my ignorance is apparent; I couldn't find it asked anywhere else in this way.
I'm looking for a way to run functions in parallel, and return some arbitrary thing they've produced back to the main script.
The question is: can a Process() started via multiprocessing return a list or some other arbitrary variable type?
For example, I would like to:
def thirty_second_function():
    #pretend this takes 30 seconds to run
    return ["mango", "habanero", "salsa"]
#End thirty_second_function()

def five_second_function():
    #pretend this takes 5 seconds to run
    return {"beans": "8 oz", "tomato paste": "16 oz"}
#End five_second_function()

p1 = multiprocessing.Process(target=thirty_second_function)
p1.start()

p2 = multiprocessing.Process(target=five_second_function)
p2.start()

#Somehow retrieve the list and the dictionary here. p1.returned??

And then somehow access the list from thirty_second_function and the dictionary from five_second_function. Is this possible? Am I going about this the wrong way?
Process itself does not provide a way to get the return value. To exchange data between processes, you need to use a queue, a pipe, shared memory, etc.:
import multiprocessing

def thirty_second_function(q):
    q.put(["mango", "habanero", "salsa"])

def five_second_function(q):
    q.put({"beans": "8 oz", "tomato paste": "16 oz"})

if __name__ == '__main__':
    q1 = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=thirty_second_function, args=(q1,))
    p1.start()

    q2 = multiprocessing.Queue()
    p2 = multiprocessing.Process(target=five_second_function, args=(q2,))
    p2.start()

    print(q1.get())
    print(q2.get())
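Note that q1.get() blocks until the worker has put its result, so by the time both print calls run you have both values; explicit join() calls are not required here.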
An alternative using multiprocessing.pool.Pool:
import multiprocessing.pool

def thirty_second_function():
    return ["mango", "habanero", "salsa"]

def five_second_function():
    return {"beans": "8 oz", "tomato paste": "16 oz"}

if __name__ == '__main__':
    p = multiprocessing.pool.Pool()
    p1 = p.apply_async(thirty_second_function)
    p2 = p.apply_async(five_second_function)
    print(p1.get())
    print(p2.get())
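apply_async() returns an AsyncResult immediately; its get() method blocks until the worker finishes and re-raises any exception the function raised in the worker process.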
Or using the concurrent.futures module (available in the standard library since Python 3.2):
from concurrent.futures import ProcessPoolExecutor

def thirty_second_function():
    return ["mango", "habanero", "salsa"]

def five_second_function():
    return {"beans": "8 oz", "tomato paste": "16 oz"}

if __name__ == '__main__':
    with ProcessPoolExecutor() as e:
        p1 = e.submit(thirty_second_function)
        p2 = e.submit(five_second_function)
        print(p1.result())
        print(p2.result())
I'm just studying multiprocessing in Python. I have code that updates the value of a variable in one process, while other processes read the value of that variable. This works as I expected.
Now I just want to know whether there is some way to do the same using the Ray library, to improve execution speed when I need to run lots of processes reading it.
from multiprocessing import Process, Manager

def write_to_dict(d, value):
    while True:
        value = value + 1
        d['key'] = value

def read_from_dict(d):
    while True:
        read = d['key']
        print(read)

if __name__ == '__main__':
    manager = Manager()
    shared_dict = manager.dict()
    p1 = Process(target=write_to_dict, args=(shared_dict, 0))
    p2 = Process(target=read_from_dict, args=(shared_dict,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
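For the Ray part of the question, the usual pattern for shared, frequently-read state is an actor that owns the value and serves reads. A minimal sketch of the same idea (assuming ray is installed; SharedCounter and its methods are illustrative names I made up, not a Ray API):

import ray

@ray.remote
class SharedCounter:
    # The actor runs in its own worker process and owns the state.
    def __init__(self):
        self.value = 0

    def increment(self):
        self.value += 1
        return self.value

    def read(self):
        return self.value

if __name__ == '__main__':
    ray.init()
    counter = SharedCounter.remote()  # starts the actor process
    # Writers and readers talk to the same actor; many readers can
    # call read.remote() concurrently without copying the state around.
    ray.get([counter.increment.remote() for _ in range(5)])
    print(ray.get(counter.read.remote()))  # -> 5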
So I want to start a nested while loop in one multiprocessing function from another multiprocessing function. In one function, I'm changing a variable (action) to "fn2", and in the other function there is a nested while loop whose condition is while action == "fn2":.
See code:
from multiprocessing import Process

running = True
action = None

def func1():
    global action
    if 1+1 == 2:
        action = "fn2"
        print(action)

def func2():
    while running:
        while action == "fn2":
            print("fn2")

if __name__ == '__main__':
    p1 = Process(target=func1)
    p1.start()
    p2 = Process(target=func2)
    p2.start()
    p1.join()
    p2.join()
However, when I run it, the code just prints "fn2" once (confirming that action is equal to "fn2"). But the nested loop inside func2() does not execute. Sorry if the answer is obvious, I'm new to multiprocessing.
I added two comments (with print statements) to highlight the error.
Basically, action is still None inside func2(), so the inner loop never starts: each Process runs with its own copy of the module globals, and the assignment made in func1()'s process is never seen by func2()'s process...
from multiprocessing import Process

running = True
action = None

def func1():
    global action
    if 1+1 == 2:
        action = "fn2"
        print(action)

def func2():
    while running:
        print('got here')  # <--- loops infinitely here
        print(action)      # <--- this is None
        while action == "fn2":
            print("fn2")

if __name__ == '__main__':
    p1 = Process(target=func1)
    p1.start()
    p2 = Process(target=func2)
    p2.start()
    p1.join()
    p2.join()
In order to share values when multiprocessing (the docs call this "sharing state between processes"), you need to use Value or Array for shared memory on a single machine, or alternatively a Manager, which can also serve processes across a network of machines.
Here is a link:
https://docs.python.org/3/library/multiprocessing.html
The basic format looks like this:
from multiprocessing import Process, Value, Array

def f(n, a):
    n.value = 3.1415927
    for i in range(len(a)):
        a[i] = -a[i]

if __name__ == '__main__':
    num = Value('d', 0.0)
    arr = Array('i', range(10))

    p = Process(target=f, args=(num, arr))
    p.start()
    p.join()

    print(num.value)
    print(arr[:])
So in the case of the question, the variable action plays the role of n (a single value) or a (an array) above, and that is how it can be shared across processes, as shown in the sketch below.
Also note that you pass arguments into the target functions with the args keyword: args=(num, arr).
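Applied to the question's code, a minimal sketch using Value (one assumption on my part: the string flag "fn2" is replaced by a shared int, since Value holds ctypes types rather than Python strings, so 1 stands in for "fn2"):

from multiprocessing import Process, Value

def func1(action):
    if 1+1 == 2:
        action.value = 1  # the equivalent of action = "fn2"

def func2(action):
    # Spin until func1 flips the shared flag.
    while action.value != 1:
        pass
    print("fn2")  # the condition now holds in this process too

if __name__ == '__main__':
    action = Value('i', 0)  # a shared int instead of a module-level global
    p2 = Process(target=func2, args=(action,))
    p2.start()
    p1 = Process(target=func1, args=(action,))
    p1.start()
    p1.join()
    p2.join()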
I have two functions and need the return values to proceed with the further part of the script, but currently my code gives only the output of the first function...
import multiprocessing

def gm(name):
    h = "Good Morning"+str(name)
    qout.put(h)

def sal(name):
    k = "Hi "+str(name)
    qout.put(k)

if __name__ == '__main__':
    qout = multiprocessing.Queue()
    p1 = multiprocessing.Process(target=gm, args=("ashin",))
    p2 = multiprocessing.Process(target=sal, args=("ashin",))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    result = qout.get()

#output - "Good Morning ashin"
#required output - "Good Morning ashin" & "Hi ashin"
Appreciate your help!
qout.get() gets you only the first element from the queue. I do not know the bigger picture of what you are trying to achieve, but you can drain everything from the queue like in the following (the queue is also passed to the workers explicitly here, so the code keeps working under the spawn start method, where children do not inherit the parent's globals):
from multiprocessing import Process, Queue

def gm(qout, name):
    h = "Good Morning"+str(name)
    qout.put(h)

def sal(qout, name):
    k = "Hi "+str(name)
    qout.put(k)

if __name__ == '__main__':
    qout = Queue()
    p1 = Process(target=gm, args=(qout, "ashin"))
    p2 = Process(target=sal, args=(qout, "ashin"))

    p1.start()
    p2.start()

    p1.join()
    p2.join()

    # Drain everything the workers put on the queue.
    list1 = []
    while not qout.empty():
        list1.append(qout.get())

    print(" & ".join(map(str, list1)))
output
Hi ashin & Good Morningashin
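Which string comes first depends on which worker happened to put its result first, so the order can vary from run to run. (The missing space in "Good Morningashin" is just the concatenation in gm(); add a space there if you want "Good Morning ashin".)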
Instead of managing your own output queue, just use the latest Python 3 concurrency features:
from concurrent.futures import as_completed, ProcessPoolExecutor

def gm(name):
    return f'Good Morning {name}'

def sal(name):
    return f'Hi {name}'

if __name__ == '__main__':
    with ProcessPoolExecutor() as exe:
        futures = [exe.submit(x, 'ashin') for x in (gm, sal)]
        for future in as_completed(futures):
            print(future.result())
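as_completed() yields each future as soon as it finishes, so the results print in completion order; if you want them in submission order instead, iterate over the futures list directly and call result() on each.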
Based on this pretty useful tutorial, I have tried to make a simple implementation of Python multiprocessing to measure its effectiveness. The modules multi1, multi2, and multi3 each contain an ODE integration and export the calculated values to a csv (what they compute does not matter; they are just there to give the script something to do).
import multiprocessing
import multi1
import multi2
import multi3
import time

t0 = time.time()

if __name__ == '__main__':
    p1 = multiprocessing.Process(target = multi1.main(), args=())
    p2 = multiprocessing.Process(target = multi2.main(), args=())
    p3 = multiprocessing.Process(target = multi3.main(), args=())
    p1.start()
    p2.start()
    p3.start()
    p1.join()
    p2.join()
    p3.join()

    t1 = time.time()

    multi1.main()
    multi2.main()
    multi3.main()

    t2 = time.time()

    print(t1 - t0)
    print(t2 - t1)
The problem is that the printed times are equal, so the multiprocessing didn't speed up the process. Why?
You called main in the parent process and passed its return value (probably None) as the target, so no actual work is done in your worker processes. Remove the call parentheses so that you pass the function itself without calling it, e.g.:
p1 = multiprocessing.Process(target=multi1.main, args=())
p2 = multiprocessing.Process(target=multi2.main, args=())
p3 = multiprocessing.Process(target=multi3.main, args=())
This is the same basic problem seen in the threaded case.
I am new to multiprocessing in Python, and so far all the examples I've seen are of this kind (one or more functions in the file and then a 'main' guard):
from multiprocessing import Process

def f1(a):
    pass  # do something

def f2(b):
    pass  # do something

if __name__ == '__main__':
    f1(a1)
    p = Process(target=f2, args=(b2,))
    p.start()
    p.join()
If instead I have a method that calls two functions in another file which should run concurrently, like in the following lines,
def function():
    #do something
    file2.f1(a)  #first concurrent method
    file2.f2(b)  #second concurrent method
how should I do it?
Can anyone give a simple example? I tried it this way, but it starts the whole program again after the first loop:
def function():
    #do something
    for i in range(3):
        p1 = Process(target=file2.f1, args=(a))  #first concurrent method
        p2 = Process(target=file2.f2, args=(b))  #second concurrent method
        p1.start()
        p2.start()
        p1.join()
        p2.join()
The issue seems to be that the args variable is incorrectly defined: it should be a tuple, not a single variable:
def function():
    #do something
    for i in range(3):
        p1 = Process(target=file2.f1, args=(a, ))  #first concurrent method
        p2 = Process(target=file2.f2, args=(b, ))  #second concurrent method
        p1.start()
        p2.start()
        p1.join()
        p2.join()
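As for the program appearing to start over after the first loop: that usually means the processes are being created outside an if __name__ == '__main__': guard. On Windows (and with the spawn start method in general) every child re-imports the main module, so any unguarded top-level code runs again in each child. Make sure function() is only reached from under that guard.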
If the order of execution is flexible, you can use the Pool class to trigger multiple calls:
from multiprocessing.pool import Pool

if __name__ == '__main__':
    pool = Pool()
    # map_async passes each element of the iterable as the single
    # argument of the function, so use plain values, not 1-tuples.
    r1 = pool.map_async(f1, [a] * 3)
    r2 = pool.map_async(f2, [b] * 3)
    pool.close()
    pool.join()
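The AsyncResult objects returned by map_async (r1 and r2 above) are also where failures surface: calling r1.get() after the join returns the list of results and re-raises any exception a worker hit.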