I've gone through the SO thread "Synchronization issue using Python's multiprocessing module", but it doesn't provide the answer.
The following code:
from multiprocessing import Process, Lock

def f(l, i):
    l.acquire()
    print 'hello world', i
    l.release()
    # do something else

if __name__ == '__main__':
    lock = Lock()
    for num in range(10):
        Process(target=f, args=(lock, num)).start()
How do I get the processes to execute in order? I want each process to hold the lock for a few seconds and then release it, so that P1 finishes first, then P2 acquires the lock and moves forward, then P3, and so on.
It sounds like you just want to delay the start of each successive process. If that's the case, you can use a multiprocessing.Event to delay starting the next child from the main process. Just pass the event to the child, and have the child set the Event when it's done doing whatever should run before the next child starts. The main process can wait on that Event, and once it's signalled, clear it and start the next child.
from multiprocessing import Process, Event

def f(e, i):
    print 'hello world', i
    e.set()
    # do something else

if __name__ == '__main__':
    event = Event()
    for num in range(10):
        p = Process(target=f, args=(event, num))
        p.start()
        event.wait()
        event.clear()
This is not the purpose of locks, and your code architecture is a poor fit for your use case. I think you should refactor your code to this:
from multiprocessing import Process

def f(i):
    # do something here
    pass

if __name__ == '__main__':
    for num in range(10):
        print 'hello world', num
        Process(target=f, args=(num,)).start()
In this case it will print in order in the parent, and the remaining work will be done asynchronously in the children.
I have a Python script which calls a series of sub-processes. They need to run "forever", but they occasionally die or get killed. When this happens I need to restart the process using the same arguments as the one which died.
This is a very simplified version:
[edit: this is the less simplified version, which includes "restart" code]
import multiprocessing
import time
import random

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

if __name__ == '__main__':
    children = []  # list
    args = {}      # dictionary
    for processNumber in range(10, 15):
        p = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        children.append(p)
        p.start()
        args[p.pid] = processNumber

    while True:
        time.sleep(1)
        for n, p in enumerate(children):
            if not p.is_alive():
                # get the parameters the dead child was started with
                pidArgs = args[p.pid]
                del args[p.pid]
                print("n,args,p: ", n, pidArgs, p)
                children.pop(n)
                # start a new process with the same args
                p = multiprocessing.Process(
                    target=printNumber,
                    args=(pidArgs,)
                )
                children.append(p)
                p.start()
                args[p.pid] = pidArgs
I have updated the example to illustrate how I want the processes to be restarted if one crashes, is killed, etc., keeping track of which pid was started with which args.
Is this the "best" way to do this, or is there a more "pythonic" way of doing it?
I think I would create a separate thread for each Process and use a ProcessPoolExecutor. Executors have a useful method, submit, which returns a Future. You can wait on each Future and re-launch the Executor when the Future is done. Arguments to the function are tracked as instance attributes, so restarting is just a simple loop.
import threading
from concurrent.futures import ProcessPoolExecutor
import time
import random
import traceback

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

class KeepRunning(threading.Thread):
    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds
        super().__init__()

    def run(self):
        while True:
            with ProcessPoolExecutor(max_workers=1) as pool:
                future = pool.submit(self.func, *self.args, **self.kwds)
                try:
                    future.result()
                except Exception:
                    traceback.print_exc()

if __name__ == '__main__':
    for process_number in range(10, 15):
        keep = KeepRunning(printNumber, process_number)
        keep.start()

    while True:
        time.sleep(1)
At the end of the program is a loop to keep the main thread running. Without that, the program will attempt to exit while your Processes are still running.
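If you'd rather not spin in a sleep loop, joining the threads blocks the main thread just as well; a minimal sketch of that alternative main block (the threads never return, so join() waits forever):

if __name__ == '__main__':
    # keep references so we can join; join() blocks forever here
    # because KeepRunning.run() loops endlessly
    threads = [KeepRunning(printNumber, n) for n in range(10, 15)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()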
For the example you provided, I would just remove the exit condition from the while loop and change it to True.
As you said, though, the actual code is more complicated (why didn't you post that?). So if the process gets terminated by, let's say, an exception, just put the code inside a try/except block. You can then put that block in an infinite loop.
I hope this is what you are looking for; it seems to be the right way to do it given the goal and the information you provided.
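As a minimal sketch of that idea, with a hypothetical do_work standing in for your real workload:

import time

def do_work(number):
    print(number)
    time.sleep(2)

def run_forever(number):
    while True:              # exit condition removed: run forever
        try:
            do_work(number)  # the real job goes here
        except Exception:
            pass             # a crashed iteration just loops around and retries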
Instead of just starting the process immediately, you can save the list of processes and their arguments, and have a loop that checks whether they are alive.
For example:
import multiprocessing

# printNumber as defined in the question above
if __name__ == '__main__':
    process_list = []
    for processNumber in range(5):
        process = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        process_list.append((process, processNumber))
        process.start()

    while True:
        # iterate over a copy so removing/appending inside the loop is safe
        for running_process, process_args in list(process_list):
            if not running_process.is_alive():
                new_process = multiprocessing.Process(target=printNumber, args=(process_args,))
                process_list.remove((running_process, process_args))  # remove terminated process
                process_list.append((new_process, process_args))
                new_process.start()
I must say that I'm not sure the best way to do this is in Python; you may want to look at scheduler services like Jenkins or something like that.
I am learning about Python multiprocessing and trying to understand how I can make my code wait for all processes to finish and then continue with the rest of the code. I thought the join() method should do the job, but the output of my code is not what I expected from using it.
Here is the code:
from multiprocessing import Process
import time

def fun():
    print('starting fun')
    time.sleep(2)
    print('finishing fun')

def fun2():
    print('starting fun2')
    time.sleep(5)
    print('finishing fun2')

def fun3():
    print('starting fun3')
    print('finishing fun3')

if __name__ == '__main__':
    processes = []
    print('starting main')
    for i in [fun, fun2, fun3]:
        p = Process(target=i)
        p.start()
        processes.append(p)
    for p in processes:
        p.join()
    print('finishing main')

g = 0
print("g", g)
I expected all processes under if __name__ == '__main__': to finish before the lines g = 0 and print("g", g) were executed, so I expected something like this:
starting main
starting fun2
starting fun
starting fun3
finishing fun3
finishing fun
finishing fun2
finishing main
g 0
But the actual output indicates that there's something I don't understand about join() (or multiprocessing in general):
starting main
g 0
g 0
starting fun2
g 0
starting fun
starting fun3
finishing fun3
finishing fun
finishing fun2
finishing main
g 0
The question is: how do I write the code so that all processes finish first and only then the non-multiprocessing code continues, giving the former output? I run the code from the command prompt on Windows, in case it matters.
On waiting for the Processes to finish:
You can just Process.join each process in your list, something like:
import multiprocessing
import time

def func1():
    time.sleep(1)
    print('func1')

def func2():
    time.sleep(2)
    print('func2')

def func3():
    time.sleep(3)
    print('func3')

def main():
    processes = [
        multiprocessing.Process(target=func1),
        multiprocessing.Process(target=func2),
        multiprocessing.Process(target=func3),
    ]
    for p in processes:
        p.start()
    for p in processes:
        p.join()

if __name__ == '__main__':
    main()
But if you're thinking about giving your processes more complexity, try using a Pool:
import multiprocessing
import time

def func1():
    time.sleep(1)
    print('func1')

def func2():
    time.sleep(2)
    print('func2')

def func3():
    time.sleep(3)
    print('func3')

def main():
    result = []
    with multiprocessing.Pool() as pool:
        result.append(pool.apply_async(func1))
        result.append(pool.apply_async(func2))
        result.append(pool.apply_async(func3))
        for r in result:
            r.wait()

if __name__ == '__main__':
    main()
More info on Pool can be found in the multiprocessing docs.
On why g 0 prints multiple times:
This is happening because you're using spawn or forkserver as the start method for your Processes, and the g = 0 and print declarations are outside a function or the __main__ if block.
From the docs:
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
(...)
This allows the newly spawned Python interpreter to safely import the module and then run the module’s foo() function.
Similar restrictions apply if a pool or manager is created in the main module.
The child is basically running your top-level code again, because it imports your .py file as a module.
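A minimal sketch of the fix that explanation implies: keep module-level statements inside the __main__ guard, so a spawned child that re-imports the file does not execute them again.

from multiprocessing import Process

def fun():
    print('hello from child')

if __name__ == '__main__':
    p = Process(target=fun)
    p.start()
    p.join()
    g = 0            # now runs only in the parent process
    print("g", g)    # printed once, after the child has finished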
Below is code which demonstrates the problem. Please note that this is only an example; I am using the same logic in a more complicated application, where I can't use sleep, since the time it takes for process1 to modify the variable depends on the speed of the internet connection.
from multiprocessing import Process

code = False

def func():
    global code
    code = True

pro = Process(target=func)
pro.start()
while code == False:
    pass
pro.terminate()
pro.join()
print('Done!')
On running this nothing appears on the screen. When I terminate the program, by pressing CTRL-C, the stack trace shows that the while loop was being executed.
Python has a few concurrency libraries: threading, multiprocessing and asyncio (and more).
multiprocessing is a library which uses subprocesses to bypass Python's inability to run CPU-intensive tasks concurrently. To share variables between different multiprocessing.Processes, create them via a multiprocessing.Manager() instance. For example:
import multiprocessing
import time

def func(event):
    print("> func()")
    time.sleep(1)
    print("setting event")
    event.set()
    time.sleep(1)
    print("< func()")

def main():
    print("In main()")
    manager = multiprocessing.Manager()
    event = manager.Event()
    p = multiprocessing.Process(target=func, args=(event,))
    p.start()
    while not event.is_set():
        print("waiting...")
        time.sleep(0.2)
    print("OK! joining func()...")
    p.join()
    print('Done!')

if __name__ == "__main__":
    main()
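If you want to keep the boolean-flag shape of the original code, a shared value works too. A minimal sketch using multiprocessing.Value, with the 'b' (signed char) typecode standing in for a bool and the sleep a stand-in for your real work:

import multiprocessing
import time

def func(code):
    time.sleep(1)        # stand-in for the real (network-bound) work
    code.value = True    # visible to the parent, unlike a plain global

if __name__ == '__main__':
    code = multiprocessing.Value('b', False)
    pro = multiprocessing.Process(target=func, args=(code,))
    pro.start()
    while not code.value:
        time.sleep(0.1)  # poll without burning a full core
    pro.join()
    print('Done!')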
I'm trying to understand how processes send messages to one another; see the example below.
I use the second function to do my main job, and the queue occasionally feeds the first function so it can do its own job, regardless of when it finishes. I have looked at many examples and tried different ways, but with no success. Can anyone explain how I can do this with my example?
from multiprocessing import Process, Queue
import time

def first(q):
    a, b = q.get()
    print a + b
    time.sleep(3)

def second(q):
    k = 0
    for i in xrange(10):
        print "second func"
        k += 1
        q.put((i, k))

if __name__ == "__main__":
    processes = []
    q = Queue()
    p = Process(target=first, args=(q,))
    p.start()
    processes.append(p)
    p2 = Process(target=second, args=(q,))
    p2.start()
    processes.append(p2)
    try:
        for process in processes:
            process.join()
    except KeyboardInterrupt:
        print "Interrupt"
This problem seems to have been eluding me; all the solutions are more like workarounds and add quite a bit of complexity to the code.
Since it's been a good while since any posts regarding this were made: are there any simple solutions to the following? Upon detecting a keyboard interrupt, cleanly exit all the child processes and terminate the program.
The code below is a snippet of my multiprocess structure. I'd like to preserve as much of it as possible while adding the needed functionality:
from multiprocessing import Pool, Lock
import time

def multiprocess_init(l):
    global lock
    lock = l

def synchronous_print(i):
    with lock:
        print i
        time.sleep(1)

if __name__ == '__main__':
    lock = Lock()
    pool = Pool(processes=5, initializer=multiprocess_init, initargs=(lock,))
    for i in range(1, 20):
        pool.map_async(synchronous_print, [i])
    pool.close()  # necessary to prevent zombies
    pool.join()   # wait for all processes to finish
The short answer is to move to Python 3. Python 2 has multiple problems with thread/process synchronization that have been fixed in Python 3.
In your case, multiprocessing will doggedly recreate your child processes every time you send a keyboard interrupt, and pool.close will get stuck and never exit. You can reduce the problem by explicitly exiting the child process with os._exit and by waiting for the individual results from map_async so that you don't get stuck in pool.close prison.
from multiprocessing import Pool, Lock
import time
import os

def multiprocess_init(l):
    global lock
    lock = l
    print("initialized child")

def synchronous_print(i):
    try:
        with lock:
            print i
            time.sleep(1)
    except KeyboardInterrupt:
        print("exit child")
        os._exit(2)

if __name__ == '__main__':
    lock = Lock()
    pool = Pool(processes=5, initializer=multiprocess_init, initargs=(lock,))
    results = []
    for i in range(1, 20):
        results.append(pool.map_async(synchronous_print, [i]))
    for result in results:
        print('wait result')
        result.wait()
    pool.close()  # necessary to prevent zombies
    pool.join()   # wait for all processes to finish
    print("Join completes")