Is there no need to reap zombie processes in Python?

It seems to me in Python, there is no need to reap zombie processes.
For example, in the following code
import multiprocessing
import time

def func(msg):
    time.sleep(2)
    print "done " + str(msg)

if __name__ == "__main__":
    for i in range(10):
        p = multiprocessing.Process(target=func, args=('3',))
        p.start()
        print "child" + str(i)
    print "parent"
    time.sleep(100)
When all the child processes have exited, the parent process is still running,
and at that point I checked the processes using ps -ef
and noticed there were no defunct processes.
Does this mean that in Python there is no need to reap zombie processes?

After having a look at the library - especially at multiprocessing/process.py - I see that
in Process.start() there is a _current_process._children.add(self), which adds the started process to a set,
and a few lines above that there is a _cleanup() call, which polls and discards terminated processes, removing zombies.
But that doesn't explain why your code doesn't produce zombies, since the children wait a while before terminating, so the parent's later start() calls shouldn't have noticed them yet.
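For what it's worth, here is a minimal sketch (adapted from the question's code; the timings are illustrative) showing how to trigger that reaping explicitly: multiprocessing.active_children() joins any already-finished children as a side effect, so defunct entries disappear from ps.
import multiprocessing
import time

def func(msg):
    time.sleep(2)
    print "done " + str(msg)

if __name__ == "__main__":
    for i in range(10):
        p = multiprocessing.Process(target=func, args=('3',))
        p.start()
    time.sleep(5)                       # children have exited by now
    multiprocessing.active_children()   # side effect: joins (reaps) the finished children
    time.sleep(100)                     # inspect with ps -ef during this window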

Those processes are not actually zombies, since they should terminate successfully.
You could set the child processes to be daemonic so they'll terminate if the main process terminates.
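A minimal sketch of that suggestion, reusing func from the question's code (the daemon flag must be set before start()):
if __name__ == "__main__":
    p = multiprocessing.Process(target=func, args=('3',))
    p.daemon = True   # must be set before start()
    p.start()
    # when the parent exits here, the daemonic child is terminated along with it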

Related

not able to terminate the process in multiprocessing python (linux)

I am new to Python and to multiprocessing. I am starting one process and calling a shell script from it. After terminating the process, the shell script keeps running in the background; how do I kill it?
Python script (test.py)
#!/usr/bin/python
import time
import os
import sys
import multiprocessing

# test process
def test_py_process():
    os.system("./test.sh")
    return

p = multiprocessing.Process(target=test_py_process)
p.start()
print 'STARTED:', p, p.is_alive()
time.sleep(10)
p.terminate()
print 'TERMINATED:', p, p.is_alive()
shell script (test.sh)
#!/bin/bash
for i in {1..100}
do
    sleep 1
    echo "Welcome $i times"
done
The reason is that the child process spawned by the os.system call spawns a child process itself. As explained in the multiprocessing docs, descendant processes of the terminated process will not be terminated; they simply become orphaned. So p.terminate() kills the process you created, but the OS process (/bin/bash ./test.sh) is merely reparented (ultimately to init) and continues executing.
You could use subprocess.Popen instead:
import time
from subprocess import Popen

if __name__ == '__main__':
    p = Popen("./test.sh")
    print 'STARTED:', p, p.poll()
    time.sleep(10)
    p.kill()
    print 'TERMINATED:', p, p.poll()
Edit: @Florian Brucker beat me to it. He deserves the credit for answering the question first. I'm still keeping this answer for the alternative approach using subprocess, which is recommended over os.system() in the documentation for os.system() itself.
os.system runs the given command in a separate process. Therefore, you have three processes:
1. The main process in which your script runs
2. The process in which test_py_process runs
3. The process in which the bash script runs
Process 2 is a child process of process 1, and process 3 is a child process of process 2.
When you call Process.terminate from within process 1, the SIGTERM signal is sent to process 2, which then terminates. However, SIGTERM is not automatically propagated to the child processes of process 2! This means that process 3 is not notified when process 2 exits and hence keeps on running as a child of the init process.
The best way to terminate process 3 depends on your actual problem setting, see this SO thread for some suggestions.
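One common suggestion from that thread is to run the script in its own process group and signal the whole group. A hedged, POSIX-only sketch (not taken verbatim from any of the answers above):
import os
import signal
import time
from subprocess import Popen

if __name__ == '__main__':
    # start test.sh in its own session / process group
    p = Popen("./test.sh", preexec_fn=os.setsid)
    time.sleep(10)
    # signal the whole group, so bash *and* any children it spawned receive it
    os.killpg(os.getpgid(p.pid), signal.SIGTERM)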

What exactly is Python multiprocessing Module's .join() Method Doing?

Learning about Python Multiprocessing (from a PMOTW article) and would love some clarification on what exactly the join() method is doing.
In an old tutorial from 2008 it states that without the p.join() call in the code below, "the child process will sit idle and not terminate, becoming a zombie you must manually kill".
from multiprocessing import Process

def say_hello(name='world'):
    print "Hello, %s" % name

p = Process(target=say_hello)
p.start()
p.join()
I added a printout of the PID as well as a time.sleep to test and as far as I can tell, the process terminates on its own:
from multiprocessing import Process
import sys
import time

def say_hello(name='world'):
    print "Hello, %s" % name
    print 'Starting:', p.name, p.pid
    sys.stdout.flush()
    print 'Exiting :', p.name, p.pid
    sys.stdout.flush()
    time.sleep(20)

p = Process(target=say_hello)
p.start()
# no p.join()
within 20 seconds:
936 ttys000 0:00.05 /Library/Frameworks/Python.framework/Versions/2.7/Reso
938 ttys000 0:00.00 /Library/Frameworks/Python.framework/Versions/2.7/Reso
947 ttys001 0:00.13 -bash
after 20 seconds:
947 ttys001 0:00.13 -bash
The behavior is the same with p.join() added back at the end of the file. Python Module of the Week offers a very readable explanation of the module: "To wait until a process has completed its work and exited, use the join() method." But it seems that, at least on OS X, that was happening anyway.
I'm also wondering about the name of the method. Is the .join() method concatenating anything here? Is it concatenating a process with its end? Or does it just share a name with Python's native .join() method?
The join() method, when used with threading or multiprocessing, is not related to str.join() - it's not actually concatenating anything together. Rather, it just means "wait for this [thread/process] to complete". The name join is used because the multiprocessing module's API is meant to look as similar to the threading module's API, and the threading module uses join for its Thread object. Using the term join to mean "wait for a thread to complete" is common across many programming languages, so Python just adopted it as well.
Now, the reason you see the 20 second delay both with and without the call to join() is because by default, when the main process is ready to exit, it will implicitly call join() on all running multiprocessing.Process instances. This isn't as clearly stated in the multiprocessing docs as it should be, but it is mentioned in the Programming Guidelines section:
Remember also that non-daemonic processes will be joined automatically.
You can override this behavior by setting the daemon flag on the Process to True prior to starting the process:
p = Process(target=say_hello)
p.daemon = True
p.start()
# Both parent and child will exit here, since the main process has completed.
If you do that, the child process will be terminated as soon as the main process completes:
daemon
The process’s daemon flag, a Boolean value. This must be set before
start() is called.
The initial value is inherited from the creating process.
When a process exits, it attempts to terminate all of its daemonic
child processes.
Without the join(), the main process can complete before the child process does. I'm not sure under what circumstances that leads to zombieism.
The main purpose of join() is to ensure that a child process has completed before the main process does anything that depends on the work of the child process.
The etymology of join() is that it's the opposite of fork, which is the common term in Unix-family operating systems for creating child processes. A single process "forks" into several, then "joins" back into one.
I'm not going to explain in detail what join does, but here's the etymology and the intuition behind it, which should help you remember its meaning more easily.
The idea is that execution "forks" into multiple processes of which one is the main/primary process, the rest workers (or minor/secondary). When the workers are done, they "join" the main process so that serial execution may be resumed.
The join() call causes the main process to wait for a worker to join it. The method might better have been called "wait", since that's the actual behavior it causes in the main process (and that's what the analogous call is named in POSIX, although POSIX threads call it "join" as well). The joining only occurs as an effect of the workers cooperating properly; it's not something the main process does to them.
The names "fork" and "join" have been used with this meaning in multiprocessing since 1963.
The join() call ensures that subsequent lines of your code are not called before all the multiprocessing processes are completed.
For example, without the join(), the following code would call restart_program() before the processes have finished, which is effectively asynchronous and not what we want (you can try it):
processes = []
num_processes = 5

for i in range(num_processes):
    # calculate_stuff and restart_program are the answer's placeholder functions
    p = multiprocessing.Process(target=calculate_stuff, args=(i,))
    p.start()
    processes.append(p)

for p in processes:
    p.join()  # ensures the subsequent line (restart_program) is not reached
              # until all processes have finished

restart_program()
join() is also used with a multiprocessing.Pool to wait for the worker processes to exit; in that case one must call close() or terminate() before using join().
Like @Russell mentioned, join is like the opposite of fork (which spawns sub-processes).
For Pool.join() to run you have to call close() first, which prevents any more tasks from being submitted to the pool so that the workers exit once all pending tasks complete. Alternatively, calling terminate() will just stop all worker processes immediately.
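A minimal sketch of that Pool usage (square is a hypothetical worker function):
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == '__main__':
    pool = Pool(processes=4)
    results = pool.map(square, range(10))
    pool.close()   # no more tasks may be submitted to the pool
    pool.join()    # wait for the worker processes to exit
    print(results)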
"the child process will sit idle and not terminate, becoming a zombie you must manually kill" this is possible when the main (parent) process exits but the child process is still running and once completed it has no parent process to return its exit status to.
To wait until a process has completed its work and exited, use the join() method.
and
Note It is important to join() the process after terminating it in order to give the background machinery time to update the status of the object to reflect the termination.
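A small sketch of what that note means in practice (the worker function is illustrative):
from multiprocessing import Process
import time

def work():
    time.sleep(60)

if __name__ == '__main__':
    p = Process(target=work)
    p.start()
    p.terminate()
    p.join()            # lets multiprocessing reap the child and record its status
    print(p.exitcode)   # typically -15 (the negative SIGTERM number) after terminate()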
This is a good example that helped me understand it: here
One thing I noticed personally was that, with the join() method, my main process paused until the child had finished its work, which defeated the point of my using multiprocessing.Process() in the first place.

Kill Child Process if Parent is killed in Python

I'm spawning 5 different processes from a python script, like this:
p = multiprocessing.Process(target=some_method,args=(arg,))
p.start()
My problem is that when the parent process (the main script) somehow gets killed, the child processes keep on running.
Is there a way to kill child processes, which are spawned like this, when the parent gets killed ?
EDIT:
I'm trying this:
p = multiprocessing.Process(target=client.start,args=(self.query_interval,))
p.start()
atexit.register(p.terminate)
But this doesn't seem to be working.
I've encountered the same problem myself and have the following solution:
before calling p.start(), you may set p.daemon = True. Then, as mentioned in the python.org multiprocessing docs:
When a process exits, it attempts to terminate all of its daemonic child processes.
The child is not notified of the death of its parent; it only works the other way around.
However, when a process dies, all of its file descriptors are closed. And the other end of a pipe is notified about this, if it selects the pipe for reading.
So your parent can create a pipe before spawning the process (or in fact, you can just set up stdin to be a pipe), and the child can select that for reading. It will report ready for reading when the parent end is closed. This requires your child to run a main loop, or at least make regular calls to select. If you don't want that, you'll need some manager process to do it, but then when that one is killed, things break again.
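A hedged sketch of that pipe trick (assuming the default fork start method on Linux; the worker loop and timings are illustrative):
import os
import select
import sys
import time
from multiprocessing import Process

def worker(r, w):
    os.close(w)  # the child must drop the write end so EOF can be detected
    while True:
        # ... do a chunk of work here, then poll the pipe ...
        ready, _, _ = select.select([r], [], [], 1.0)
        if ready and not os.read(r, 1):
            # EOF: the parent's write end is gone, i.e. the parent died
            sys.exit(1)

if __name__ == '__main__':
    r, w = os.pipe()
    p = Process(target=worker, args=(r, w))
    p.start()
    os.close(r)    # the parent keeps only the write end
    time.sleep(30) # if the parent is killed during this sleep, the OS closes w
    os.close(w)    # closing w here simulates the parent's death for a normal run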
If you have access to the parent pid you can use something like this
import os
import sys
import psutil

def kill_child_proc(ppid):
    for process in psutil.process_iter():
        _ppid = process.ppid()
        if _ppid == ppid:
            _pid = process.pid
            if sys.platform == 'win32':
                process.terminate()
            else:
                os.system('kill -9 {0}'.format(_pid))

kill_child_proc(<parent_pid>)
In my case I was using a Queue object to communicate with the child processes. For whatever reason, the daemon flag suggested in the accepted answer did not work for me. Here's a minimal example illustrating how to get the children to die gracefully in this case.
The main idea is to pause child work execution every second or so and check whether the parent process is still alive. If it is not, we close the Queue and exit.
Note that this also works if the main process is killed using SIGKILL.
import ctypes, sys
import multiprocessing as mp

worker_queue = mp.Queue(maxsize=10)

# flag to communicate the parent's death to all children
alive = mp.Value(ctypes.c_bool, lock=False)
alive.value = True

def worker():
    while True:
        # fake work
        data = 99.99
        # submit finished work to parent, while checking if parent has died
        queued = False
        while not queued:
            # note here we do not block indefinitely, so we can check if parent died
            try:
                worker_queue.put(data, block=True, timeout=1.0)
                queued = True
            except:
                pass
            # check if parent process is still alive (mp.parent_process() needs Python 3.8+)
            par_alive = mp.parent_process().is_alive()
            if not (par_alive and alive.value):
                # for some reason par_alive is only False for one of the children;
                # notify the others that the parent has died
                alive.value = False
                # appears we need to close the queue before sys.exit will work
                worker_queue.close()
                # for more dramatic shutdown, could try killing the child process;
                # mp.current_process().kill() does not work, though you could try
                # calling os.kill directly with the child PID
                sys.exit(1)

# launch worker processes
for i in range(4):
    child = mp.Process(target=worker)
    child.start()

How to let the child process live when parent process exited?

I want to use the multiprocessing module to accomplish this.
When I run my script, like:
$ python my_process.py
I start a parent process, and then let the parent process spawn a child process;
then I want the parent process to exit by itself while the child process continues to work.
Allow me to write some WRONG code to explain what I mean:
from multiprocessing import Process

def f(x):
    with open('out.dat', 'w') as f:
        f.write(x)

if __name__ == '__main__':
    p = Process(target=f, args=('bbb',))
    p.daemon = True  # This is key: set the daemon flag, so the parent can exit by itself
    p.start()
    #p.join()  # This is WRONG code, just to explain what I mean:
               # the child process will be killed when the parent exits
So, how do I start a process that will not be killed when the parent process finishes?
EDIT 20140714:
Hi, you guys.
My friend just told me a solution...
Anyway, just look:
import os
os.system('python your_app.py&') # SEE!? the & !!
This does work!
A trick: call os._exit to make the parent process exit; that way, daemonic child processes will not be killed.
But there are some other side effects, described in the docs:
Exit the process with status n, without calling cleanup handlers,
flushing stdio buffers, etc.
If you do not care about those, you can use it.
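A minimal sketch of that trick, reusing the question's f (the sleep is illustrative, just to show the child outliving the parent):
import os
import time
from multiprocessing import Process

def f(x):
    time.sleep(5)  # keep working after the parent is gone
    with open('out.dat', 'w') as f:
        f.write(x)

if __name__ == '__main__':
    p = Process(target=f, args=('bbb',))
    p.daemon = True   # the parent will not block waiting for this child
    p.start()
    os._exit(0)       # skips multiprocessing's atexit cleanup, so the
                      # daemonic child is NOT terminated and keeps running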
Here's one way to achieve an independent child process that does not exit when __main__ exits. It uses the os._exit() tip mentioned above by @WKPlus; see the related question "Is there a way to detach matplotlib plots so that the computation can continue?"

python multiprocessing: why is process defunct after terminate?

I have some Python multiprocessing code where the parent process starts a bunch of child worker processes and then terminates them after a while:
from multiprocessing import Process

nWorkers = 10
curWorkers = []
for iw in range(nWorkers):
    pq = Process(target=worker, args=(worker's_args_here))
    pq.start()
    curWorkers.append(pq)

# Do work here...

for pw in curWorkers:
    pw.terminate()
However, the child processes all are showing as defunct long after termination. Are they zombie processes? More importantly, how should I terminate them so that they really go away?
Try adding:
for pw in curWorkers:
    pw.join()
at the end. .terminate() just kills the process. The parent process still needs to reap it (at least on Linux-y systems) before the child process goes away entirely.
