I'm spawning 5 different processes from a Python script, like this:
p = multiprocessing.Process(target=some_method, args=(arg,))
p.start()
My problem is that when the parent process (the main script) somehow gets killed, the child processes keep on running.
Is there a way to kill child processes spawned like this when the parent gets killed?
EDIT:
I'm trying this:
p = multiprocessing.Process(target=client.start,args=(self.query_interval,))
p.start()
atexit.register(p.terminate)
But this doesn't seem to work.
I've encountered the same problem myself, and I've got the following solution:
before calling p.start(), set p.daemon = True. Then, as mentioned in the python.org multiprocessing docs:
When a process exits, it attempts to terminate all of its daemonic child processes.
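For illustration, a minimal sketch of this (the worker function and its argument here are made-up placeholders):

```python
import multiprocessing
import time

def some_method(arg):
    # stand-in for the real work
    time.sleep(30)

if __name__ == '__main__':
    p = multiprocessing.Process(target=some_method, args=(1,))
    p.daemon = True   # must be set before p.start()
    p.start()
    # when this parent process exits normally, multiprocessing's
    # atexit handler terminates the daemonic child automatically
```

Note that this relies on the parent exiting normally; a parent killed with SIGKILL never runs its exit handlers, which is the limitation the other answers address.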
The child is not notified of the death of its parent; it only works the other way around.
However, when a process dies, all its file descriptors are closed. And the other end of a pipe is notified about this, if it selects the pipe for reading.
So your parent can create a pipe before spawning the process (or in fact, you can just set up stdin to be a pipe), and the child can select that for reading. It will report ready for reading when the parent end is closed. This requires your child to run a main loop, or at least make regular calls to select. If you don't want that, you'll need some manager process to do it, but then when that one is killed, things break again.
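A minimal sketch of that idea on POSIX (the helper name is made up, and the parent's death is simulated here by closing the write end within the same process):

```python
import os
import select

def parent_gone(read_fd, timeout=0.0):
    """Return True once every copy of the pipe's write end is closed."""
    ready, _, _ = select.select([read_fd], [], [], timeout)
    return bool(ready)

r, w = os.pipe()           # the parent would keep w; the child inherits r
assert not parent_gone(r)  # writer still open and silent: not readable
os.close(w)                # simulate the parent dying: all write ends closed
assert parent_gone(r)      # EOF pending: select reports the fd readable
os.close(r)
```

In a real setup the child would call parent_gone(r) once per iteration of its main loop and exit when it returns True.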
If you have access to the parent PID, you can use something like this:
import os
import sys
import psutil

def kill_child_proc(ppid):
    for process in psutil.process_iter():
        _ppid = process.ppid()
        if _ppid == ppid:
            _pid = process.pid
            if sys.platform == 'win32':
                process.terminate()
            else:
                os.system('kill -9 {0}'.format(_pid))

kill_child_proc(<parent_pid>)
My case was using a Queue object to communicate with the child processes. For whatever reason, the daemon flag as suggested in the accepted answer does not work. Here's a minimal example illustrating how to get the children to die gracefully in this case.
The main idea is to pause child work execution every second or so and check if the parent process is still alive. If it is not alive, we close the Queue and exit.
Note that this also works if the main process is killed with SIGKILL.
import ctypes
import queue
import sys
import multiprocessing as mp

worker_queue = mp.Queue(maxsize=10)

# flag to communicate the parent's death to all children
alive = mp.Value(ctypes.c_bool, lock=False)
alive.value = True

def worker():
    while True:
        # fake work
        data = 99.99

        # submit finished work to parent, while checking if parent has died
        queued = False
        while not queued:
            # note here we do not block indefinitely, so we can check if parent died
            try:
                worker_queue.put(data, block=True, timeout=1.0)
                queued = True
            except queue.Full:
                pass

            # check if the parent process is still alive (parent_process() needs Python 3.8+)
            par_alive = mp.parent_process().is_alive()
            if not (par_alive and alive.value):
                # for some reason par_alive is only False for one of the children;
                # notify the others that the parent has died
                alive.value = False
                # appears we need to close the queue before sys.exit will work
                worker_queue.close()
                # for more dramatic shutdown, could try killing the child process;
                # mp.current_process().kill() does not work, though you could try
                # calling os.kill directly with the child PID
                sys.exit(1)

# launch worker processes
for i in range(4):
    child = mp.Process(target=worker)
    child.start()
I am using the multiprocessing module of Python. I am testing the following code:
from multiprocessing import *
from time import sleep

def f():
    print('in child#1 proc')
    sleep(2)
    print('ch#1 ends')

def f1():
    print('in child#2 proc')
    sleep(10)
    print('ch#2 ends')

if __name__ == '__main__':
    p = Process(target=f)
    p1 = Process(target=f1, daemon=True)
    p.start()
    p1.start()
    sleep(1)
    print('child procs started')
I have the following observations:
The first child process p runs for 2 secs.
After 1 sec, the second child process p1 becomes a zombie.
The parent (main) process stays active while child#1 (the non-daemon process) is running, that is, for 2 secs.
Now I have the following queries:
Why should the parent (main) process be active after it finishes execution? Note that the parent does not perform a join on p.
Why should the daemon child p1 become a zombie after 1 sec? Note that the parent (main) process actually stays alive as long as p is running.
I have executed the above program on Ubuntu.
My observations are based on the output of the ps command on Ubuntu.
To sum up and persist the discussion in the comments of the other answer:
Why should the parent (main) process be active after it finishes
execution? Note that the parent does not perform a join on p.
multiprocessing tries to make sure that programs using it behave well. That is, it attempts to clean up after itself. To do so, it utilizes the atexit module, which lets you register exit handlers that are executed when the interpreter process prepares to terminate normally.
multiprocessing defines and registers the function _exit_function, which first calls terminate() on all still-running daemonic children and then calls join() on all remaining non-daemonic children. Since join() blocks, the parent waits until the non-daemonic children have terminated. terminate(), on the other hand, does not block; it simply sends a SIGTERM signal (on Unix) to the children and returns.
That brings us to:
Why should the daemon child p1 become a zombie after 1 sec? Note that
the parent (main) process actually stays alive till the time p is
running.
That is because the parent has reached the end of its instructions and the interpreter prepares to terminate, i.e. it executes the registered exit handlers. The daemonic child p1 receives a SIGTERM signal. Since SIGTERM may be caught and handled inside a process, the child is not forced to shut down immediately but is instead given the chance to do some cleanup of its own. That's what makes p1 show up as <defunct>: the kernel knows that the process has been instructed to terminate, but the process has not done so yet.
In the given case, p1 has not yet had the chance to honor the SIGTERM signal, presumably because it is still executing sleep(). At least as of Python 3.5:
The function now sleeps at least secs even if the sleep is interrupted
by a signal, except if the signal handler raises an exception (see PEP
475 for the rationale).
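To illustrate that a process may catch SIGTERM and clean up on its own terms, here is a minimal sketch (not from the question; the handler body is a placeholder):

```python
import signal
import sys

def on_sigterm(signum, frame):
    # do any cleanup here (close files, flush queues, ...), then exit
    # explicitly so the process does not keep running after the signal
    sys.exit(0)

# after this call, SIGTERM no longer kills the process outright; the
# handler runs as soon as the interpreter gets a chance to deliver it
signal.signal(signal.SIGTERM, on_sigterm)
```
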
The parent stays alive because it is the root of the app; it stays in memory while the children are processing. Note that join waits for the child to exit and then gives control back to the parent. If you don't join, the parent will exit but remain in memory.
p1 becomes a zombie because the parent exits after the sleep(1). It stays alive with p because you don't daemonize p. If you don't daemonize a process and you call start on it, control passes to the child, and when the child is complete it passes control back to the parent. If you do daemonize it, the parent keeps control and runs the child in the background.
I'm using python to benchmark something. This can take a large amount of time, and I want to set a (global) timeout. I use the following script (summarized):
import signal
import subprocess

class TimeoutException(Exception):
    pass

def timeout_handler(signum, frame):
    raise TimeoutException()

# install the handler for SIGALRM
signal.signal(signal.SIGALRM, timeout_handler)

# Halt problem after half an hour
signal.alarm(1800)

try:
    while solution is None:
        guess = guess()
        try:
            with open(solutionfname, 'wb') as solutionf:
                solverprocess = subprocess.Popen(["solver", problemfname],
                                                 stdout=solutionf)
                solverprocess.wait()
        finally:
            # `solverprocess.poll() == None` instead of try didn't work either
            try:
                solverprocess.kill()
            except:
                # Solver process was already dead
                pass
except TimeoutException:
    pass

# Cancel alarm if it's still active
signal.alarm(0)
However it keeps spawning orphan processes sometimes, but I can't reliably recreate the circumstances. Does anyone know what the correct way to prevent this is?
You simply have to wait after killing the process.
The documentation for the kill() method states:
Kills the child. On Posix OSs the function sends SIGKILL to the child.
On Windows kill() is an alias for terminate().
In other words, if you aren't on Windows, you are only sending a signal to the subprocess.
This will create a zombie process because the parent process didn't read the return value of the subprocess.
The kill() and terminate() methods are just shortcuts to send_signal(SIGKILL) and send_signal(SIGTERM).
Try adding a call to wait() after the kill(). This is even shown in the example under the documentation for communicate():
proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
Note the call to communicate() after the kill(). It is equivalent to calling wait() while also reading the outputs of the subprocess.
I want to clarify one thing: it seems like you don't understand exactly what a zombie process is. A zombie process is a terminated process. The kernel keeps the process in the process table until the parent process reads its exit status. I believe all memory used by the subprocess is actually reused; the kernel only has to keep track of the exit status of such a process.
So, the zombie processes you see aren't running. They are already completely dead, and that's why they are called zombie. They are "alive" in the process table, but aren't really running at all.
Calling wait() does exactly this: wait till the subprocess ends and read the exit status. This allows the kernel to remove the subprocess from the process table.
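A self-contained demonstration of the kill-then-wait pattern on POSIX (the child command is just a stand-in for a real solver):

```python
import signal
import subprocess
import sys

# spawn a child that would otherwise run for a minute
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
proc.kill()                    # sends SIGKILL on POSIX; the child is now a zombie
ret = proc.wait()              # reaping: read the exit status, the zombie disappears
assert ret == -signal.SIGKILL  # Popen reports death-by-signal as a negative code
```
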
On Linux, you can use python-prctl.
Define a preexec function such as:
def pre_exec():
    import signal
    import prctl
    prctl.set_pdeathsig(signal.SIGTERM)
and pass it to your Popen call:
subprocess.Popen(..., preexec_fn=pre_exec)
It's as simple as that. Now the child process will die rather than become an orphan if the parent dies.
If you don't like the external dependency of python-prctl you can also use the older prctl. Instead of
prctl.set_pdeathsig(signal.SIGTERM)
you would have
prctl.prctl(prctl.PDEATHSIG, signal.SIGTERM)
I want to use the multiprocessing module to accomplish this.
When I do this, like:
$ python my_process.py
I start a parent process and then let the parent process spawn a child process;
then I want the parent process to exit by itself while the child process continues to work.
Allow me to write some WRONG code to explain myself:
from multiprocessing import Process

def f(x):
    with open('out.dat', 'w') as f:
        f.write(x)

if __name__ == '__main__':
    p = Process(target=f, args=('bbb',))
    p.daemon = True  # This is key: set the daemon flag, then the parent exits itself
    p.start()
    #p.join()  # This is WRONG code, just to explain what I mean:
               # the child process will be killed when the parent exits
So, how do I start a process that will not be killed when the parent process finishes?
20140714
Hi, you guys.
My friend just told me a solution...
I just think...
Anyway, just let you see:
import os
os.system('python your_app.py &')  # SEE!? The & !!
This does work!!
A trick: call os._exit to make the parent process exit; this way, daemonic child processes will not be killed.
But there are some other side effects, described in the docs:
Exit the process with status n, without calling cleanup handlers,
flushing stdio buffers, etc.
If you do not care about this, you can use it.
Here's one way to achieve an independent child process that does not exit when __main__ exits. It uses the os._exit() tip mentioned above by @WKPlus.
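A minimal sketch of that approach (the function names and the child task are made up): on a normal exit, multiprocessing's atexit handler would terminate the daemonic child, but os._exit skips that cleanup, so the child lives on.

```python
import os
from multiprocessing import Process

def child_task():
    # placeholder for long-running work that should outlive the parent
    pass

def spawn_and_exit():
    p = Process(target=child_task)
    p.daemon = True   # would normally be terminated when the parent exits...
    p.start()
    # ...but os._exit bypasses the atexit handler with which multiprocessing
    # kills daemonic children, so the child keeps running on its own
    os._exit(0)
```

Calling spawn_and_exit() ends the parent immediately (no cleanup handlers, no flushed buffers) while the child continues independently.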
I have some python multiprocessing code with the parent process starting a bunch of child worker processes and then terminating them after awhile:
from multiprocessing import Process

nWorkers = 10
curWorkers = []
for iw in range(nWorkers):
    pq = Process(target=worker, args=(worker's_args_here))
    pq.start()
    curWorkers.append(pq)

# Do work here...

for pw in curWorkers:
    pw.terminate()
However, the child processes all are showing as defunct long after termination. Are they zombie processes? More importantly, how should I terminate them so that they really go away?
Try adding:
for pw in curWorkers:
    pw.join()
at the end. .terminate() just kills the process. The parent process still needs to reap it (at least on Linux-y systems) before the child process goes away entirely.
It seems to me that in Python there is no need to reap zombie processes.
For example, in the following code
import multiprocessing
import time

def func(msg):
    time.sleep(2)
    print("done " + str(msg))

if __name__ == "__main__":
    for i in range(10):
        p = multiprocessing.Process(target=func, args=('3',))
        p.start()
        print("child" + str(i))
    print("parent")
    time.sleep(100)
When all the child processes exit, the parent process is still running,
and at that time I checked the processes using ps -ef
and noticed there is no defunct process.
Does this mean that in Python there is no need to reap zombie processes?
After having a look at the library - especially at multiprocessing/process.py - I see that
in Process.start() there is a _current_process._children.add(self), which adds the started process to a list/set/whatever,
and a few lines above there is a _cleanup(), which polls and discards terminated processes, removing zombies.
But that doesn't explain why your code doesn't produce zombies, as the children wait a while before terminating, so the parent's start() calls don't notice that yet.
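One way to see that cleanup in action (my own sketch, not from the answer): multiprocessing.active_children() triggers the same internal _cleanup(), so calling it after children have finished reaps any would-be zombies.

```python
import multiprocessing as mp
import time

def quick():
    pass  # exits immediately

if __name__ == '__main__':
    p = mp.Process(target=quick)
    p.start()
    time.sleep(0.5)   # the child has exited; until it is joined, it is a zombie
    # active_children() joins finished children as a side effect,
    # removing them from the kernel's process table
    mp.active_children()
    assert p.exitcode == 0
```
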
Those processes are not actually zombies, since they terminate successfully.
You could set the child processes to be daemonic so that they'll terminate if the main process terminates.