Python: How to determine subprocess children have all finished running


I am trying to detect when an installation program finishes executing from within a Python script. Specifically, the application is the Oracle 10gR2 Database. Currently I am using the subprocess module with Popen. Ideally, I would simply use the wait() method to wait for the installation to finish; however, the documented command spawns child processes to handle the actual installation. Here is a sample of the failing code:
import os
import subprocess
OUI_DATABASE_10GR2_SUBPROCESS = ['sudo',
                                 '-u',
                                 'oracle',
                                 os.path.join(DATABASE_10GR2_TMP_PATH,
                                              'database',
                                              'runInstaller'),
                                 '-ignoreSysPrereqs',
                                 '-silent',
                                 '-noconfig',
                                 '-responseFile '+ORACLE_DATABASE_10GR2_SILENT_RESPONSE]
oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
oracle_subprocess.wait()
There is a similar question here: Killing a subprocess including its children from python, but the selected answer does not address the children issue; instead, it instructs the user to directly call the application to wait for. I am looking for a specific solution that will wait for all children of the subprocess, even when there is an unknown number of them. I will select the answer that addresses the issue of waiting for all child subprocesses to finish.
More clarity on failure: The child processes continue executing after the wait() command since that command only waits for the top level process (in this case it is 'sudo'). Here is a simple diagram of the known child processes in this problem:
Python subprocess module -> Sudo -> runInstaller -> java -> (unknown)

Ok, here is a trick that will work only under Unix. It is similar to one of the answers to this question: Ensuring subprocesses are dead on exiting Python program. The idea is to create a new process group. You can then wait for all processes in the group to terminate.
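import os
import subprocess
pid = os.fork()
if pid == 0:
    # Child: become the leader of a new process group, then run the installer
    os.setpgrp()
    oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
    oracle_subprocess.wait()
    os._exit(0)
else:
    # Parent: a negative pid waits for a child in process group abs(pid)
    os.waitpid(-pid, 0)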
I have not tested this. It creates an extra subprocess to be the leader of the process group, but avoiding that is (I think) quite a bit more complicated.
I found this web page to be helpful as well. http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
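The process-group idea above can also be sketched without the manual fork. This is only a sketch under assumptions: it relies on Python 3.2+ (for start_new_session), on Unix, and on the installer's descendants staying in the process group they inherit. The helper name wait_for_process_group and the polling interval are made up for illustration. It reaps the direct child first, then polls the group with signal 0, which only checks for existence.

```python
import os
import subprocess
import time


def wait_for_process_group(cmd, poll_interval=0.5):
    """Run cmd in its own process group and block until the group is empty.

    Hypothetical helper: returns the exit code of the direct child.
    """
    # start_new_session=True makes the child a session and group leader,
    # so the process group ID equals the child's pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    pgid = proc.pid
    returncode = proc.wait()  # reap the direct child first
    while True:
        try:
            os.killpg(pgid, 0)  # signal 0: existence check only
        except ProcessLookupError:
            return returncode  # nothing is left in the group
        except PermissionError:
            pass  # group still exists but belongs to another user (e.g. via sudo)
        time.sleep(poll_interval)
```

A descendant that moves itself into a different process group or session would escape this check, so it is not a general solution.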

You can just use os.waitpid with the pid set to -1; this blocks until a child of the current process finishes:
import os
import sys
import subprocess
proc = subprocess.Popen([sys.executable,
                         '-c',
                         'import subprocess;'
                         'subprocess.Popen("sleep 5", shell=True).wait()'])
pid, status = os.waitpid(-1, 0)
print(pid, status)
This is the output of pstree <pid>, showing the subprocesses that were forked:
python───python───sh───sleep
Hope this can help :)
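Note that a single os.waitpid(-1, 0) call returns as soon as one child has exited. If the current process has several direct children, a loop along these lines might be needed. This is a minimal sketch: the function name wait_for_all_children is made up, and it only covers direct children, not grandchildren.

```python
import os
import subprocess
import sys


def wait_for_all_children():
    """Reap every direct child of this process.

    Returns a dict mapping each child's pid to its raw wait status.
    """
    statuses = {}
    while True:
        try:
            pid, status = os.waitpid(-1, 0)
        except ChildProcessError:
            # Raised when there are no children left to wait for
            return statuses
        statuses[pid] = status


if __name__ == "__main__":
    children = [subprocess.Popen([sys.executable, "-c", "pass"]) for _ in range(3)]
    print(wait_for_all_children())
```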

Check out the following link http://www.oracle-wiki.net/startdocsruninstaller which details a flag you can use for the runInstaller command.
This flag is definitely available for 11gR2, but I have not got a 10g database to try out this flag for the runInstaller packaged with that version.
Regards

Everywhere I look seems to say it's not possible to solve this in the general case. I've whipped up a library called 'pidmon' that combines some answers for Windows and Linux and might do what you need.
I'm planning to clean this up and put it on github, possibly called 'pidmon' or something like that. I'll post a link if/when I get it up.
EDIT: It's available at http://github.com/dbarnett/python-pidmon.
I made a special waitpid function that accepts a graft_func argument so that you can loosely define what sort of processes you want to wait for when they're not direct children:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, recursive=True,
               graft_func=(lambda p: p.name == '???' and p.parent.pid == ???))
or, as a shotgun approach, to just wait for any processes started since the call to waitpid to stop again, do:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, graft_func=(lambda p: True))
Note that this is still barely tested on Windows and seems very slow on Windows (but did I mention it's on github where it's easy to fork?). This should at least get you started, and if it works at all for you, I have plenty of ideas on how to optimize it.

Related

How to run multiple servers with a python script? [duplicate]

I'm trying to port a shell script to the much more readable python version. The original shell script starts several processes (utilities, monitors, etc.) in the background with "&". How can I achieve the same effect in python? I'd like these processes not to die when the python scripts complete. I am sure it's related to the concept of a daemon somehow, but I couldn't find how to do this easily.

While jkp's solution works, the newer way of doing things (and the way the documentation recommends) is to use the subprocess module. For simple commands it's equivalent, but it offers more options if you want to do something complicated.
Example for your case:
import subprocess
subprocess.Popen(["rm","-r","some.file"])
This will run rm -r some.file in the background. Note that calling .communicate() on the object returned from Popen will block until it completes, so don't do that if you want it to run in the background:
import subprocess
ls_output=subprocess.Popen(["sleep", "30"])
ls_output.communicate() # Will block for 30 seconds
See the documentation here.
Also, a point of clarification: "Background" as you use it here is purely a shell concept; technically, what you mean is that you want to spawn a process without blocking while you wait for it to complete. However, I've used "background" here to refer to shell-background-like behavior.
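For a script that starts several commands with "&", the same pattern might be wrapped as below. This is a sketch only: start_background and still_running are hypothetical helper names, and the commands are placeholders.

```python
import subprocess
import sys


def start_background(commands):
    """Start every command without waiting; return the Popen objects."""
    return [subprocess.Popen(cmd) for cmd in commands]


def still_running(procs):
    """Return the processes that have not exited yet (poll() never blocks)."""
    return [p for p in procs if p.poll() is None]


if __name__ == "__main__":
    # Placeholder commands: two short-lived Python children
    procs = start_background([[sys.executable, "-c", "pass"],
                              [sys.executable, "-c", "pass"]])
    print(len(still_running(procs)), "still running")
```

Using poll() instead of wait() or communicate() keeps the parent free to do other work while the children run.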

Note: This answer is less current than it was when posted in 2009. Using the subprocess module shown in other answers is now recommended in the docs:
(Note that the subprocess module provides more powerful facilities for spawning new processes and retrieving their results; using that module is preferable to using these functions.)
If you want your process to start in the background you can either use system() and call it in the same way your shell script did, or you can spawn it:
import os
os.spawnl(os.P_DETACH, 'some_long_running_command')
(os.P_DETACH is only available on Windows; alternatively, you may try the os.P_NOWAIT flag, which is available on both Unix and Windows).
See the documentation here.

You probably want the answer to "How to call an external command in Python".
The simplest approach is to use the os.system function, e.g.:
import os
os.system("some_command &")
Basically, whatever you pass to the system function will be executed the same as if you'd passed it to the shell in a script.

I found this here:
On windows (win xp), the parent process will not finish until the longtask.py has finished its work. It is not what you want in CGI-script. The problem is not specific to Python, in PHP community the problems are the same.
The solution is to pass DETACHED_PROCESS Process Creation Flag to the underlying CreateProcess function in win API. If you happen to have installed pywin32 you can import the flag from the win32process module, otherwise you should define it yourself:
import subprocess
import sys
DETACHED_PROCESS = 0x00000008
pid = subprocess.Popen([sys.executable, "longtask.py"],
                       creationflags=DETACHED_PROCESS).pid

Use subprocess.Popen() with the close_fds=True parameter, which will allow the spawned subprocess to be detached from the Python process itself and continue running even after Python exits.
https://gist.github.com/yinjimmy/d6ad0742d03d54518e9f
import os, time, sys, subprocess
if len(sys.argv) == 2:
    time.sleep(5)
    print('track end')
    if sys.platform == 'darwin':
        subprocess.Popen(['say', 'hello'])
else:
    print('main begin')
    subprocess.Popen(['python', os.path.realpath(__file__), '0'], close_fds=True)
    print('main end')

Both capture output and run on background with threading
As mentioned on this answer, if you capture the output with stdout= and then try to read(), then the process blocks.
However, there are cases where you need this. For example, I wanted to launch two processes that talk over a port between them, and save their stdout to a log file and stdout.
The threading module allows us to do that.
First, have a look at how to do the output redirection part alone in this question: Python Popen: Write to stdout AND log file simultaneously
Then:
main.py
#!/usr/bin/env python3
import os
import subprocess
import sys
import threading
def output_reader(proc, file):
    while True:
        byte = proc.stdout.read(1)
        if byte:
            sys.stdout.buffer.write(byte)
            sys.stdout.flush()
            file.buffer.write(byte)
        else:
            break
with subprocess.Popen(['./sleep.py', '0'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc1, \
     subprocess.Popen(['./sleep.py', '10'], stdout=subprocess.PIPE, stderr=subprocess.PIPE) as proc2, \
     open('log1.log', 'w') as file1, \
     open('log2.log', 'w') as file2:
    t1 = threading.Thread(target=output_reader, args=(proc1, file1))
    t2 = threading.Thread(target=output_reader, args=(proc2, file2))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
sleep.py
#!/usr/bin/env python3
import sys
import time
for i in range(4):
    print(i + int(sys.argv[1]))
    sys.stdout.flush()
    time.sleep(0.5)
After running:
./main.py
stdout gets two new lines every 0.5 seconds and ends up containing:
0
10
1
11
2
12
3
13
and each log file contains the respective log for a given process.
Inspired by: https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Tested on Ubuntu 18.04, Python 3.6.7.

You probably want to start investigating the os module for forking processes (by opening an interactive session and issuing help(os)). The relevant functions are fork and any of the exec ones. To give you an idea of how to start, put something like this in a function that performs the fork (the function needs to take a list or tuple 'args' as an argument that contains the program's name and its parameters; you may also want to define stdin, stdout and stderr for the new process):
import os
import sys
try:
    pid = os.fork()
except OSError as e:
    ## some debug output
    sys.exit(1)
if pid == 0:
    ## eventually use os.putenv(..) to set environment variables
    ## args[0] is the program path; execv passes the whole list as argv
    os.execv(args[0], args)
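Wrapped into a function, the fork/exec idea above might look like this. It is a Unix-only sketch: the name spawn and the exit code 127 for a failed exec are illustrative choices, not part of the answer.

```python
import os
import sys


def spawn(args):
    """Fork, then replace the child with the program at args[0].

    Returns the child's pid to the parent without waiting for it.
    """
    pid = os.fork()
    if pid == 0:
        try:
            # args[0] must be a full path; the whole list becomes argv
            os.execv(args[0], args)
        finally:
            # Only reached if execv failed; never fall back into parent code
            os._exit(127)
    return pid


if __name__ == "__main__":
    child = spawn([sys.executable, "-c", "print('hello from the child')"])
    os.waitpid(child, 0)  # optional: reap the child when it is done
```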

You can use
import os
pid = os.fork()
if pid == 0:
    pass  # child process: continue to other code ...
This runs the child branch as a separate process in the background.

I haven't tried this yet, but using .pyw files instead of .py files should help. .pyw files don't have a console, so in theory the script should not appear and should work like a background process.

Stop a bash script in python [duplicate]

I am currently trying to write (Python 2.7.3) kind of a wrapper for GDB, which will allow me to dynamically switch from scripted input to interactive communication with GDB.
So far I use
self.process = subprocess.Popen(["gdb vuln"], stdin = subprocess.PIPE, shell = True)
to start gdb within my script. (vuln is the binary I want to examine)
Since a key feature of gdb is to pause the execution of the attached process and allow the user to inspect registers and memory on receiving SIGINT (Ctrl+C), I need some way to pass a SIGINT signal to it.
Neither
self.process.send_signal(signal.SIGINT)
nor
os.kill(self.process.pid, signal.SIGINT)
or
os.killpg(self.process.pid, signal.SIGINT)
work for me.
When I use one of these functions there is no response. I suppose this problem arises from the use of shell=True. However, at this point I am really out of ideas.
Even my old friend Google couldn't really help me out this time, so maybe you can help me. Thanks in advance.
Cheers, Mike

Here is what worked for me:
import signal
import subprocess
try:
    p = subprocess.Popen(...)
    p.wait()
except KeyboardInterrupt:
    p.send_signal(signal.SIGINT)
    p.wait()

I looked deeper into the problem and found some interesting things. Maybe these findings will help someone in the future.
When calling gdb vuln using subprocess.Popen(), it does in fact create three processes, where the pid returned is the one of sh (5180).
ps -a
5180 pts/0 00:00:00 sh
5181 pts/0 00:00:00 gdb
5183 pts/0 00:00:00 vuln
Consequently sending a SIGINT to the process will in fact send SIGINT to sh.
Besides, I continued looking for an answer and stumbled upon this post
https://bugzilla.kernel.org/show_bug.cgi?id=9039
To keep it short, what is mentioned there is the following:
When pressing Ctrl+C while using gdb regularly, SIGINT is in fact sent to the examined program (in this case vuln); ptrace then intercepts it and passes it to gdb.
What this means is that if I use self.process.send_signal(signal.SIGINT), it will in fact never reach gdb this way.
Temporary Workaround:
I managed to work around this problem by simply calling subprocess.Popen() as follows:
subprocess.Popen("killall -s INT " + self.binary, shell = True)
This is nothing more than a first workaround. When multiple applications with the same name are running, it might do some serious damage. Besides, it fails if shell=True is not set, because the command is then passed as a single string rather than a list of arguments.
If someone has a better fix (e.g. how to get the pid of the process started by gdb), please let me know.
Cheers, Mike
EDIT:
Thanks to Mark for pointing out to look at the ppid of the process.
I managed to narrow down the processes to which SIGINT is sent using the following approach:
import os
import signal
import subprocess
out = subprocess.check_output(['ps', '-Aefj'])
# Get sid and pgid of child process (/bin/sh)
sid = os.getsid(self.process.pid)
pgid = os.getpgid(self.process.pid)
for line in out.splitlines():
    if self.binary in line:
        # Columns: UID PID PPID PGID SID C STIME TTY TIME CMD
        l = line.split()
        # only true for target process
        if l[4] == str(sid) and l[3] != str(pgid):
            os.kill(int(l[1]), signal.SIGINT)

I have done something like the following in the past, and if I recall correctly it worked for me:
import os
import subprocess
def detach_processGroup():
    os.setpgrp()
subprocess.Popen(command,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 preexec_fn=detach_processGroup)
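Once the child leads its own process group, a signal can be delivered to the shell and everything it started in one call. The following is a sketch under assumptions: it uses start_new_session=True (Python 3.2+, Unix), which calls setsid() and therefore also creates a new process group, and the helper names are made up. A process started in a new session no longer shares the controlling terminal, so this may not suit an interactive gdb session.

```python
import os
import signal
import subprocess


def start_in_new_group(cmd):
    """Start cmd as the leader of a new session and process group."""
    return subprocess.Popen(cmd, start_new_session=True)


def signal_group(proc, sig=signal.SIGTERM):
    """Send sig to every process in proc's process group."""
    os.killpg(os.getpgid(proc.pid), sig)


if __name__ == "__main__":
    # Placeholder command: a shell with one background child
    p = start_in_new_group(["sh", "-c", "sleep 30 & wait"])
    signal_group(p, signal.SIGTERM)
    p.wait()
```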

Detect hanging python shell in OS X

I've got a program that uses a buggy library that occasionally hangs due to improperly implemented parallelisation.
I don't have the time to fix the core issue, so I'm looking for a hack to figure out when the process is hanging and not doing its job.
Are there any OS X or Python-specific APIs to do this? Is it possible to use another thread or even the main thread to repeatedly parse stdout so that when the last few lines haven't changed in a certain duration, the other thread is notified and can kill the misbehaving thread (and then restart it)?

Basically you are looking for a monitor process. It will run a command (or set of commands) and watch their execution looking for specific things (in your case, silence on stdout). Referencing the 2 SO questions below (and a brief look at some docs), you can quickly build a super simple monitor.
https://stackoverflow.com/questions/2804543/read-subprocess-stdout-line-by-line
https://stackoverflow.com/questions/3471461/raw-input-and-timeout
# monitor.py
import os
import subprocess
import sys
from select import select
TIMEOUT = 10
while True:
    # start a new process to monitor
    # you could also run sys.argv[1:] for a more generic monitor
    child = subprocess.Popen(['python','other.py','arg'], stdout=subprocess.PIPE)
    while True:
        rlist,_,_ = select([child.stdout], [], [], TIMEOUT)
        if rlist:
            # read only what is available; a plain read() would block until EOF
            data = os.read(child.stdout.fileno(), 4096) # do you need to save the output?
            if not data:
                # EOF: the child closed stdout, so it finished; we are done
                child.wait()
                sys.exit()
        else:
            # timeout occurred, did the process finish?
            if child.poll() is not None:
                # child process completed (or was killed, but didn't hang), we are done
                sys.exit()
            else:
                # otherwise, kill the child and start a new one
                child.kill()
                child.wait()
                break
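The same silence check can be packaged as a reusable function, which makes it easier to try out on small commands. This is a minimal sketch, assuming a Unix system where select works on pipes; run_until_silent and its return shape are hypothetical.

```python
import os
import select
import subprocess
import sys


def run_until_silent(cmd, silence_timeout):
    """Run cmd and kill it if stdout stays silent for silence_timeout seconds.

    Returns (hung, output): hung is True if the child had to be killed.
    """
    child = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    fd = child.stdout.fileno()
    chunks = []
    try:
        while True:
            ready, _, _ = select.select([fd], [], [], silence_timeout)
            if not ready:
                # No output within the timeout: treat the child as hung
                child.kill()
                return True, b"".join(chunks)
            data = os.read(fd, 4096)  # reads only what is available
            if not data:
                # EOF: the child closed stdout, so it finished normally
                return False, b"".join(chunks)
            chunks.append(data)
    finally:
        child.stdout.close()
        child.wait()  # always reap the child


if __name__ == "__main__":
    print(run_until_silent([sys.executable, "-c", "print('ok')"], 5))
```

A restart loop like the one in monitor.py can then simply call this function again whenever hung is True.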

Control a subprocess (specifically gdb) in multiple ways

I am developing a wrapper around gdb using python. Basically, I just want to be able to detect a few setup annoyances up-front and be able to run a single command to invoke gdb, rather than a huge string I have to remember each time.
That said, there are two cases that I am using. The first, which works fine, is invoking gdb by creating a new process and attaching to it. Here's the code that I have for this one:
def spawnNewProcessInGDB():
    global gObjDir, gGDBProcess
    from subprocess import Popen
    from os.path import join
    import subprocess
    binLoc = join(gObjDir, 'dist')
    binLoc = join(binLoc, 'bin')
    binLoc = join(binLoc, 'mycommand')
    profileDir = join(gObjDir, '..')
    profileDir = join(profileDir, 'trash-profile')
    try:
        gGDBProcess = Popen(['gdb', '--args', binLoc, '-profile', profileDir], cwd=gObjDir)
        gGDBProcess.wait()
    except KeyboardInterrupt:
        # Send a termination signal to the GDB process, if it's running
        promptAndTerminate(gGDBProcess)
Now, if the user presses CTRL-C while this is running, it breaks (i.e. it forwards the CTRL-C to GDB). This is the behavior I want.
The second case is a bit more complicated. It might be the case that I already had this program running on my system and it crashed, but was caught. In this case, I want to be able to connect to it using gdb to get a stack trace (or perhaps I was already running it, and I simply now want to connect to the process that's already in memory).
As a convenience feature, I've created a mirror function, which will connect to a running process using gdb:
def connectToProcess(procNum):
    global gObjDir, gGDBProcess
    from subprocess import Popen
    import subprocess
    import signal
    print("Connecting to mycommand process number " + str(procNum) + "...")
    try:
        gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir)
        gGDBProcess.wait()
    except KeyboardInterrupt:
        promptAndTerminate(gGDBProcess)
Again, this seems to work as expected. It starts gdb, I can set breakpoints, run the program, etc. The only catch is that it doesn't forward CTRL-C to gdb if I press it while the program is running. Instead, it jumps immediately to promptAndTerminate().
I'm wondering if anyone can see why this is happening - the two calls to subprocess.Popen() seem identical to me, albeit that one is running gdb in a different mode.
I have also tried replacing the call to subprocess.Popen() with the following:
gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir, stdin=subprocess.PIPE)
but this leads to undesirable results as well, because it doesn't actually communicate anything to the child gdb process (e.g. if I type in c to start the program running again after it is broken upon connection from gdb, it doesn't do anything). Again, it terminates the running python process when I type CTRL-C.
Any help would be appreciated!
