I am trying to use multiprocessing from inside another process that was spawned with Popen. I want to be able to communicate between this process and a new child process, but this "middle" process has a polling read on the pipe with its parent, which seems to block execution of its child process.
Here is my file structure:
entry.py
import subprocess, threading, time, sys

def start():
    # Create process 2
    worker = subprocess.Popen([sys.executable, "-u", "mproc.py"],
        # When creating the subprocess with an open pipe to stdin and
        # subsequently polling that pipe, it blocks further communication
        # between subprocesses
        stdin=subprocess.PIPE,
        close_fds=False,)
    t = threading.Thread(args=(worker))
    t.start()
    time.sleep(4)

if __name__ == '__main__':
    start()
mproc.py
import multiprocessing as mp
import time, sys, threading

def exit_on_stdin_close():
    try:
        while sys.stdin.read():
            pass
    except:
        pass

def put_hello(q):
    # We never reach this line if exit_poll.start() is uncommented
    q.put("hello")
    time.sleep(2.4)

def start():
    exit_poll = threading.Thread(target=exit_on_stdin_close, name="exit-poll")
    exit_poll.daemon = True
    # This daemon thread polling stdin blocks execution of subprocesses
    # But ONLY if running in another process with stdin connected
    # to its parent by PIPE
    exit_poll.start()
    ctx = mp.get_context('spawn')
    q = ctx.Queue()
    p = ctx.Process(target=put_hello, args=(q,))
    # Create process 3
    p.start()
    p.join()
    print(f"result: {q.get()}")

if __name__ == '__main__':
    start()
My desired behavior is that when running entry.py, mproc.py should run on a subprocess and be able to communicate with its own subprocess to get the Queue output, and this does happen if I don't start the exit-poll daemon thread:
$ python -u entry.py
result: hello
but if exit-poll is running, then process 3 blocks as soon as it's started. The put_hello method isn't even entered until the exit-poll thread ends.
Is there a way to create a process 3 from process 2 and communicate between the two, even while the pipe between processes 1 and 2 is being used?
Edit: I can only consistently reproduce this problem on Windows. On Linux (Ubuntu 20.04 WSL) the Queues are able to communicate even with exit-poll running, but only if I'm using the spawn multiprocessing context. If I change it to fork then I get the same behavior that I see on Windows.
I am using Python 2.7, and the process started by my Python thread is not killed after the main program exits (I checked this with the ps -ax command on an Ubuntu machine).
I have the below thread class,
import os
import threading

class captureLogs(threading.Thread):
    '''
    initialize the constructor
    '''
    def __init__(self, deviceIp, fileTag):
        threading.Thread.__init__(self)
        super(captureLogs, self).__init__()
        self._stop = threading.Event()
        self.deviceIp = deviceIp
        self.fileTag = fileTag

    def stop(self):
        self._stop.set()

    def stopped(self):
        return self._stop.isSet()

    '''
    define the run method
    '''
    def run(self):
        '''
        Make the thread capture logs
        '''
        cmdTorun = "adb logcat > " + self.deviceIp +'_'+self.fileTag+'.log'
        os.system(cmdTorun)
And I am creating a thread in another file sample.py,
import logCapture
import os
import time
c = logCapture.captureLogs('100.21.143.168','somefile')
c.setDaemon(True)
c.start()
print "Started the log capture. now sleeping. is this a dameon?", c.isDaemon()
time.sleep(5)
print "Sleep tiime is over"
c.stop()
print "Calling stop was successful:", c.stopped()
print "Thread is now completed and main program exiting"
I get the below output from the command line:
Started the log capture. now sleeping. is this a dameon? True
Sleep tiime is over
Calling stop was successful: True
Thread is now completed and main program exiting
And the sample.py exits.
But when I use below command on a terminal,
ps -ax | grep "adb"
I still see the process running. (I am killing them manually now using the kill -9 17681 17682)
Not sure what I am missing here.
My question is,
1) why is the process still alive when I already killed it in my program?
2) Will it create any problem if I don't bother about it?
3) is there any other better way to capture logs using a thread and monitor the logs?
EDIT: As suggested by #bug Killer, I added the below method in my thread class,
def getProcessID(self):
    return os.getpid()
and used os.kill(c.getProcessID(), SIGTERM) in my sample.py . The program doesn't exit at all.
It is likely because you are using os.system in your thread. The spawned process from os.system will stay alive even after the thread is killed. Actually, it will stay alive forever unless you explicitly terminate it in your code or by hand (which it sounds like you are doing ultimately) or the spawned process exits on its own. You can do this instead:
import atexit
import subprocess
deviceIp = '100.21.143.168'
fileTag = 'somefile'
# this is spawned in the background, so no threading code is needed
cmdTorun = "adb logcat > " + deviceIp +'_'+fileTag+'.log'
proc = subprocess.Popen(cmdTorun, shell=True)
# or register proc.kill if you feel like living on the edge
atexit.register(proc.terminate)
# Here is where all the other awesome code goes
Since all you are doing is spawning a process, creating a thread to do it is overkill and only complicates your program logic. Just spawn the process in the background as shown above and then let atexit terminate it when your program exits. And/or call proc.terminate explicitly; it should be fine to call repeatedly (much like close on a file object) so having atexit call it again later shouldn't hurt anything.
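As a variation on the snippet above, the shell redirect can be replaced by handing Popen an already opened file for stdout. That avoids shell=True, so proc.pid refers to the command itself rather than possibly to an intermediate shell. This is only a minimal sketch, assuming Python 3; the helper name start_logged is hypothetical, and in the question the command would be ["adb", "logcat"].

```python
import atexit
import subprocess
import sys


def start_logged(cmd, log_path):
    """Start cmd in the background with its stdout redirected to log_path."""
    log_file = open(log_path, "wb")
    proc = subprocess.Popen(cmd, stdout=log_file)
    # The child has its own copy of the descriptor, so ours can be closed.
    log_file.close()
    # Ask the child to terminate when this interpreter exits normally.
    atexit.register(proc.terminate)
    return proc
```

A call such as start_logged(["adb", "logcat"], "somefile.log") returns immediately; calling proc.terminate() and then proc.wait() stops and reaps the child earlier if needed.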
My question is hopefully particular enough to not relate to any of the other ones that I've read. I'm wanting to use subprocess and multiprocessing to spawn a bunch of jobs serially and return the return code to me. The problem is that I don't want to wait() so I can spawn the jobs all at once, but I do want to know when it finishes so I can get the return code. I'm having this weird problem where if I poll() the process it won't run. It just hangs out in the activity monitor without running (I'm on a Mac). I thought I could use a watcher thread, but I'm hanging on the q_out.get() which is leading me to believe that maybe I'm filling up the buffer and deadlocking. I'm not sure how to get around this. This is basically what my code looks like. If anyone has any better ideas on how to do this I would be happy to completely change my approach.
def watchJob(p1,out_q):
    while p1.poll() == None:
        pass
    print "Job is done"
    out_q.put(p1.returncode)

def runJob(out_q):
    LOGFILE = open('job_to_run.log','w')
    p1 = Popen(['../../bin/jobexe','job_to_run'], stdout = LOGFILE)
    t = threading.Thread(target=watchJob, args=(p1,out_q))
    t.start()

out_q= Queue()
outlst=[]
for i in range(len(nprocs)):
    proc = Process(target=runJob, args=(out_q,))
    proc.start()
    outlst.append(out_q.get()) # This hangs indefinitely
    proc.join()
You need neither multiprocessing nor threading here. You could run multiple child processes in parallel and collect their statuses all in a single thread:
#!/usr/bin/env python3
from subprocess import Popen
def run(cmd, log_filename):
    with open(log_filename, 'wb', 0) as logfile:
        return Popen(cmd, stdout=logfile)

# start several subprocesses
processes = {run(['echo', c], 'subprocess.%s.log' % c) for c in 'abc'}

# now they all run in parallel
# report as soon as a child process exits
while processes:
    for p in processes:
        if p.poll() is not None:
            processes.remove(p)
            print('{} done, status {}'.format(p.args, p.returncode))
            break
p.args stores cmd in Python 3.3+; on earlier Python versions, keep track of cmd yourself.
See also:
Python threading multiple bash subprocesses?
Python subprocess in parallel
Python: execute cat subprocess in parallel
Using Python's Multiprocessing module to execute simultaneous and separate SEAWAT/MODFLOW model runs
To limit number of parallel jobs a ThreadPool could be used (as shown in the first link):
#!/usr/bin/env python3
from multiprocessing.dummy import Pool # use threads
from subprocess import Popen
def run_until_done(args):
    cmd, log_filename = args
    try:
        with open(log_filename, 'wb', 0) as logfile:
            p = Popen(cmd, stdout=logfile)
        return cmd, p.wait(), None
    except Exception as e:
        return cmd, None, str(e)

commands = ((('echo', str(d)), 'subprocess.%03d.log' % d) for d in range(500))
pool = Pool(128) # 128 concurrent commands at a time
for cmd, status, error in pool.imap_unordered(run_until_done, commands):
    if error is None:
        fmt = '{cmd} done, status {status}'
    else:
        fmt = 'failed to run {cmd}, reason: {error}'
    print(fmt.format_map(vars())) # or fmt.format(**vars()) on older versions
The thread pool in the example has 128 threads (no more, no less). It can't execute more than 128 jobs concurrently. As soon as any of the threads frees (done with a job), it takes another, etc. Total number of jobs that is executed concurrently is limited by the number of threads. New job doesn't wait for all 128 previous jobs to finish. It is started when any of the old jobs is done.
If you're going to run watchJob in a thread, there's no reason to busy-loop with p1.poll; just call p1.wait() to block until the process finishes. Using the busy loop requires the GIL to constantly be released/re-acquired, which slows down the main thread, and also pegs the CPU, which hurts performance even more.
Also, if you're not using the stdout of the child process, you shouldn't send it to PIPE, because that could cause a deadlock if the process writes enough data to the stdout buffer to fill it up (which may actually be what's happening in your case). There's also no need to use multiprocessing here; just call Popen in the main thread, and then have the watchJob thread wait on the process to finish.
import threading
from subprocess import Popen
from Queue import Queue
def watchJob(p1, out_q):
    p1.wait()
    out_q.put(p1.returncode)

out_q = Queue()
outlst=[]
p1 = Popen(['../../bin/jobexe','job_to_run'])
t = threading.Thread(target=watchJob, args=(p1,out_q))
t.start()
outlst.append(out_q.get())
t.join()
Edit:
Here's how to run multiple jobs concurrently this way:
out_q = Queue()
outlst = []
threads = []
num_jobs = 3
for _ in range(num_jobs):
    p = Popen(['../../bin/jobexe','job_to_run'])
    # Pass this iteration's process (p) to the watcher thread.
    t = threading.Thread(target=watchJob, args=(p, out_q))
    t.start()
    threads.append(t)
    # Don't consume from the queue yet.

# All jobs are running, so now we can start
# consuming results from the queue.
for _ in range(num_jobs):
    outlst.append(out_q.get())
for t in threads:
    t.join()
When you execute a python script, does the process/interpreter exit because it reads an EOF character from the script? [i.e. is that the exit signal?]
The follow up to this is how/when a python child process knows to exit, namely, when you start a child process by overriding the run() method, as here:
class Example(multiprocessing.Process):
    def __init__(self, task_queue, result_queue):
        multiprocessing.Process.__init__(self)
        self.task_queue = task_queue
        self.result_queue = result_queue

    def run(self):
        while True:
            next_task = self.task_queue.get()
            if next_task is None:
                print '%s: Exiting' % self.name
                break
            #more stuff...[assume there's some task_done stuff, etc]

if __name__ == '__main__':
    tasks = multiprocessing.JoinableQueue()
    results = multiprocessing.Queue()
    processes = [ Example(tasks, results)
                  for i in range(5) ]
    for i in processes:
        i.start()
    #more stuff...like populating the queue, etc.
Now, what I'm curious about is: Do the child processes automatically exit upon completion of the run() method? And if I kill the main thread during execution, will the child processes end immediately? Will they end if their run() calls can complete independently of the status of the parent process?
Yes, each child process terminates automatically after completion of the run method, even though I think you should avoid subclassing Process and use the target argument instead.
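To illustrate the target-based style suggested above, here is a minimal sketch, assuming Python 3. The names worker and run_workers and the doubling "work" are hypothetical placeholders, not taken from the question. The child process ends as soon as worker() returns, which happens when it receives the None sentinel.

```python
import multiprocessing


def worker(task_queue, result_queue):
    """Handle tasks until a None sentinel arrives, then return."""
    while True:
        task = task_queue.get()
        if task is None:
            break                       # returning ends the child process
        result_queue.put(task * 2)      # placeholder for real work


def run_workers(items, n_workers=2):
    """Process a list of items with n_workers child processes."""
    tasks = multiprocessing.Queue()
    results = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for item in items:
        tasks.put(item)
    for _ in procs:
        tasks.put(None)                 # one sentinel per worker
    out = [results.get() for _ in items]  # drain results before joining
    for p in procs:
        p.join()                        # reap each finished child
    return sorted(out)
```

run_workers should be called from under an `if __name__ == '__main__':` guard so that it also works with the spawn start method.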
Note that in linux the child process may remain in zombie state if you do not read the exit status:
>>> from multiprocessing import Process
>>> def target():
...     print("Something")
...
>>> Process(target=target).start()
>>> Something
>>>
If we look at the processes after this:
While if we read the exit status of the process (with Process.exitcode), this does not happen.
Each Process instance launches a new process in the background; how and when this subprocess is terminated is OS-dependent. Every OS provides some means of communication between processes. Child processes are usually not terminated if you kill the "parent" process.
For example doing this:
>>> from multiprocessing import Process
>>> import time
>>> def target():
...     while True:
...         time.sleep(0.5)
...
>>> L = [Process(target=target) for i in range(10)]
>>> for p in L: p.start()
...
The main python process will have 10 children:
Now if we kill that process we obtain this:
Note how the child processes were inherited by init and are still running.
But, as I said, this is OS specific. On some OSes killing the parent process will kill all child processes.
Is there a way to ensure all created subprocess are dead at exit time of a Python program? By subprocess I mean those created with subprocess.Popen().
If not, should I iterate over all of them, issuing kill and then kill -9? Is there anything cleaner?
You can use atexit for this, and register any clean up tasks to be run when your program exits.
atexit.register(func[, *args[, **kargs]])
In your cleanup function, you can also implement your own wait, and kill the process when your desired timeout occurs.
>>> import atexit
>>> import sys
>>> import time
>>>
>>>
>>>
>>> def cleanup():
...     timeout_sec = 5
...     for p in all_processes: # list of your processes
...         p_sec = 0
...         for second in range(timeout_sec):
...             if p.poll() == None:
...                 time.sleep(1)
...                 p_sec += 1
...         if p_sec >= timeout_sec:
...             p.kill() # supported from python 2.6
...     print 'cleaned up!'
...
>>>
>>> atexit.register(cleanup)
>>>
>>> sys.exit()
cleaned up!
Note -- Registered functions won't be run if this process (parent process) is killed.
The following windows method is no longer needed for python >= 2.6
Here's a way to kill a process in windows. Your Popen object has a pid attribute, so you can just call it by success = win_kill(p.pid) (Needs pywin32 installed):
def win_kill(pid):
    '''kill a process by specified PID in windows'''
    import win32api
    import win32con
    hProc = None
    try:
        hProc = win32api.OpenProcess(win32con.PROCESS_TERMINATE, 0, pid)
        win32api.TerminateProcess(hProc, 0)
    except Exception:
        return False
    finally:
        if hProc != None:
            hProc.Close()
    return True
On *nix's, maybe using process groups can help you out - you can catch subprocesses spawned by your subprocesses as well.
if __name__ == "__main__":
    os.setpgrp() # create new process group, become its leader
    try:
        # some code
    finally:
        os.killpg(0, signal.SIGKILL) # kill all processes in my group
Another consideration is to escalate the signals: from SIGTERM (default signal for kill) to SIGKILL (a.k.a kill -9). Wait a short while between the signals to give the process a chance to exit cleanly before you kill -9 it.
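The escalation described above could look roughly like this on Python 3.3+, where Popen.wait() accepts a timeout. The helper name stop_process and the grace period are assumptions made for illustration only.

```python
import subprocess
import sys


def stop_process(proc, grace_seconds=5.0):
    """Ask proc to exit politely, then force-kill it if it is still alive."""
    if proc.poll() is not None:
        return proc.returncode          # already finished
    proc.terminate()                    # SIGTERM on POSIX
    try:
        return proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                     # SIGKILL on POSIX
        return proc.wait()              # reap the child after the hard kill


if __name__ == "__main__":
    # Stand-in child that would otherwise run for a minute.
    sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(stop_process(sleeper, grace_seconds=2.0))
```

Waiting after kill() also reaps the child, so it does not linger as a zombie.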
The subprocess.Popen.wait() is the only way to assure that they're dead. Indeed, POSIX OS's require that you wait on your children. Many *nix's will create a "zombie" process: a dead child for which the parent didn't wait.
If the child is reasonably well-written, it terminates. Often, children read from PIPE's. Closing the input is a big hint to the child that it should close up shop and exit.
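A small sketch of that "closing the input" hint, assuming Python 3 and a cooperative child that reads stdin until EOF. The child source and the helper names start_child and stop_by_closing_stdin are hypothetical stand-ins.

```python
import subprocess
import sys

# Stand-in child: consume stdin line by line and exit normally at EOF.
CHILD_SOURCE = "import sys\nfor _ in sys.stdin:\n    pass\n"


def start_child():
    """Start the stand-in child with a pipe connected to its stdin."""
    return subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                            stdin=subprocess.PIPE)


def stop_by_closing_stdin(proc, timeout=10.0):
    """Close the child's stdin so it sees EOF, then wait for it to exit."""
    proc.stdin.close()
    return proc.wait(timeout=timeout)
```

If the child ignores EOF, wait() raises subprocess.TimeoutExpired, at which point killing it is the remaining option.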
If the child has bugs and doesn't terminate, you may have to kill it. You should fix this bug.
If the child is a "serve-forever" loop, and is not designed to terminate, you should either kill it or provide some input or message which will force it to terminate.
Edit.
In standard OS's, you have os.kill( PID, 9 ). Kill -9 is harsh, BTW. If you can kill them with SIGABRT (6?) or SIGTERM (15) that's more polite.
In Windows OS, you don't have an os.kill that works. Look at this ActiveState Recipe for terminating a process in Windows.
We have child processes that are WSGI servers. To terminate them we do a GET on a special URL; this causes the child to clean up and exit.
Found a solution for Linux (without installing prctl):
import ctypes
import signal
import subprocess

def _set_pdeathsig(sig=signal.SIGTERM):
    """helper function to ensure that once the parent process exits, its child processes will automatically die
    """
    def callable():
        libc = ctypes.CDLL("libc.so.6")
        return libc.prctl(1, sig)  # 1 == PR_SET_PDEATHSIG
    return callable

subprocess.Popen(your_command, preexec_fn=_set_pdeathsig(signal.SIGTERM))
Warning: Linux-only! You can make your child receive a signal when its parent dies.
First install python-prctl==1.5.0 then change your parent code to launch your child processes as follows
subprocess.Popen(["sleep", "100"], preexec_fn=lambda: prctl.set_pdeathsig(signal.SIGKILL))
What this says is:
launch subprocess: sleep 100
after forking and before exec of the subprocess, the child registers for "send me a SIGKILL
when my parent terminates".
orip's answer is helpful but has the downside that it kills your process and returns an error code to your parent. I avoided that like this:
class CleanChildProcesses:
    def __enter__(self):
        os.setpgrp() # create new process group, become its leader
    def __exit__(self, type, value, traceback):
        try:
            os.killpg(0, signal.SIGINT) # kill all processes in my group
        except KeyboardInterrupt:
            # SIGINT is delivered to this process as well as the child processes.
            # Ignore it so that the existing exception, if any, is returned. This
            # leaves us with a clean exit code if there was no exception.
            pass
And then:
with CleanChildProcesses():
    # Do your work here
Of course you can do this with try/except/finally but you have to handle the exceptional and non-exceptional cases separately.
I needed a small variation of this problem (cleaning up subprocesses, but without exiting the Python program itself), and since it's not mentioned here among the other answers:
p=subprocess.Popen(your_command, preexec_fn=os.setsid)
os.killpg(os.getpgid(p.pid), 15)
setsid will run the program in a new session, thus assigning a new process group to it and its children. calling os.killpg on it thus won't bring down your own python process also.
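On Python 3.2+ the same effect is available without preexec_fn by passing start_new_session=True, which makes the child call setsid() before exec. A minimal POSIX-only sketch; the helper names start_in_new_group and kill_group are made up for this example.

```python
import os
import signal
import subprocess
import sys


def start_in_new_group(cmd):
    """Start cmd as leader of a new session and process group (POSIX only)."""
    return subprocess.Popen(cmd, start_new_session=True)


def kill_group(proc, sig=signal.SIGTERM):
    """Signal every process in proc's group, then reap proc itself."""
    os.killpg(os.getpgid(proc.pid), sig)
    return proc.wait()
```

Because the child leads its own group, the signal reaches it and any grandchildren it spawned, but not the calling Python process.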
poll( )
Check if child process has terminated.
Returns returncode attribute.
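In practice, poll() returns None while the child is still running and the exit status once it has finished. A hedged sketch of a polling loop that sleeps between checks instead of spinning (the helper name wait_by_polling and the interval are assumptions):

```python
import subprocess
import sys
import time


def wait_by_polling(proc, interval=0.1):
    """Poll proc until it exits, sleeping between checks; return its status."""
    while proc.poll() is None:      # None means "still running"
        time.sleep(interval)        # avoid pegging a CPU core
    return proc.returncode


if __name__ == "__main__":
    child = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(wait_by_polling(child))
```

When nothing else needs to happen in the meantime, proc.wait() is simpler than a polling loop.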
A solution for windows may be to use the win32 job api e.g. How do I automatically destroy child processes in Windows?
Here's an existing python implementation
https://gist.github.com/ubershmekel/119697afba2eaecc6330
Is there a way to ensure all created subprocess are dead at exit time of a Python program? By subprocess I mean those created with subprocess.Popen().
You could violate encapsulation and test that all Popen processes have terminated by doing
subprocess._cleanup()
print subprocess._active == []
If not, should I iterate over all of the issuing kills and then kills -9? anything cleaner?
You cannot ensure that all subprocesses are dead without going out and killing every survivor. But if you have this problem, it is probably because you have a deeper design problem.
I actually needed to do this, but it involved running remote commands. We wanted to be able to stop the processes by closing the connection to the server. Also, if, for example, you are running in the python repl, you can select to run as foreground if you want to be able to use Ctrl-C to exit.
import os, signal, time

class CleanChildProcesses:
    """
    with CleanChildProcesses():
        Do work here
    """
    def __init__(self, time_to_die=5, foreground=False):
        self.time_to_die = time_to_die  # how long to give children to die before SIGKILL
        self.foreground = foreground  # If user wants to receive Ctrl-C
        self.is_foreground = False
        self.SIGNALS = (signal.SIGHUP, signal.SIGTERM, signal.SIGABRT, signal.SIGALRM, signal.SIGPIPE)
        self.is_stopped = True  # only call stop once (catch signal xor exiting 'with')

    def _run_as_foreground(self):
        if not self.foreground:
            return False
        try:
            fd = os.open(os.ctermid(), os.O_RDWR)
        except OSError:
            # Happens if process not run from terminal (tty, pty)
            return False
        os.close(fd)
        return True

    def _signal_hdlr(self, sig, frame):
        self.__exit__(None, None, None)

    def start(self):
        self.is_stopped = False
        """
        When running out of remote shell, SIGHUP is only sent to the session
        leader normally, the remote shell, so we need to make sure we are sent
        SIGHUP. This also allows us not to kill ourselves with SIGKILL.
        - A process group is called orphaned when the parent of every member is
          either in the process group or outside the session. In particular,
          the process group of the session leader is always orphaned.
        - If termination of a process causes a process group to become orphaned,
          and some member is stopped, then all are sent first SIGHUP and then
          SIGCONT.
        consider: prctl.set_pdeathsig(signal.SIGTERM)
        """
        self.childpid = os.fork()  # return 0 in the child branch, and the childpid in the parent branch
        if self.childpid == 0:
            try:
                os.setpgrp()  # create new process group, become its leader
                os.kill(os.getpid(), signal.SIGSTOP)  # child fork stops itself
            finally:
                os._exit(0)  # shut down without going to __exit__
        os.waitpid(self.childpid, os.WUNTRACED)  # wait until child stopped after it created the process group
        os.setpgid(0, self.childpid)  # join child's group
        if self._run_as_foreground():
            hdlr = signal.signal(signal.SIGTTOU, signal.SIG_IGN)  # ignore since would cause this process to stop
            self.controlling_terminal = os.open(os.ctermid(), os.O_RDWR)
            self.orig_fore_pg = os.tcgetpgrp(self.controlling_terminal)  # sends SIGTTOU to this process
            os.tcsetpgrp(self.controlling_terminal, self.childpid)
            signal.signal(signal.SIGTTOU, hdlr)
            self.is_foreground = True
        self.exit_signals = dict((s, signal.signal(s, self._signal_hdlr))
                                 for s in self.SIGNALS)

    def stop(self):
        try:
            for s in self.SIGNALS:
                #don't get interrupted while cleaning everything up
                signal.signal(s, signal.SIG_IGN)
            self.is_stopped = True
            if self.is_foreground:
                os.tcsetpgrp(self.controlling_terminal, self.orig_fore_pg)
                os.close(self.controlling_terminal)
                self.is_foreground = False
            try:
                os.kill(self.childpid, signal.SIGCONT)
            except OSError:
                """
                can occur if process finished and one of:
                - was reaped by another process
                - if parent explicitly ignored SIGCHLD
                  signal.signal(signal.SIGCHLD, signal.SIG_IGN)
                - parent has the SA_NOCLDWAIT flag set
                """
                pass
            os.setpgrp()  # leave the child's process group so I won't get signals
            try:
                os.killpg(self.childpid, signal.SIGINT)
                time.sleep(self.time_to_die)  # let processes end gracefully
                os.killpg(self.childpid, signal.SIGKILL)  # In case process gets stuck while dying
                os.waitpid(self.childpid, 0)  # reap Zombie child process
            except OSError as e:
                pass
        finally:
            for s, hdlr in self.exit_signals.iteritems():
                signal.signal(s, hdlr)  # reset default handlers

    def __enter__(self):
        if self.is_stopped:
            self.start()

    def __exit__(self, exit_type, value, traceback):
        if not self.is_stopped:
            self.stop()
Thanks to Malcolm Handley for the initial design. Done with python2.7 on linux.
You can try subalive, a package I wrote for similar problem. It uses periodic alive ping via RPC, and the slave process automatically terminates when the master stops alive pings for some reason.
https://github.com/waszil/subalive
Example for master:
from subalive import SubAliveMaster
# start subprocess with alive keeping
SubAliveMaster(<path to your slave script>)
# do your stuff
# ...
Example for slave subprocess:
from subalive import SubAliveSlave
# start alive checking
SubAliveSlave()
# do your stuff
# ...
It's possible to get some more guarantees on windows by spawning a separate process to oversee the destruction.
import subprocess
import sys
import os
def terminate_process_on_exit(process):
    if sys.platform == "win32":
        try:
            # Or provide this script normally.
            # Here just to make it somewhat self-contained.
            # see https://stackoverflow.com/a/22559493/3763139
            # see https://superuser.com/a/1299350/388191
            with open('.process_watchdog_helper.bat', 'x') as file:
                file.write(""":waitforpid
tasklist /nh /fi "pid eq %1" 2>nul | find "%1" >nul
if %ERRORLEVEL%==0 (
timeout /t 5 /nobreak >nul
goto :waitforpid
) else (
wmic process where processid="%2" call terminate >nul
)""")
        except:
            pass
        # After this spawns we're pretty safe. There is a race, but we do what we can.
        subprocess.Popen(
            ['.process_watchdog_helper.bat', str(os.getpid()), str(process.pid)],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL
        )

# example
class DummyProcess:
    def __init__(self, pid):
        self.pid = pid

terminate_process_on_exit(DummyProcess(7516))
This is what I did for my posix app:
When your app exits, call the kill() method of this class:
http://www.pixelbeat.org/libs/subProcess.py
Example use here:
http://code.google.com/p/fslint/source/browse/trunk/fslint-gui#608
help for python code:
http://docs.python.org/dev/library/subprocess.html#subprocess.Popen.wait