Is there a way to check if a subprocess is still running?

I'm launching a number of subprocesses with subprocess.Popen in Python.
I'd like to check whether one such process has completed. I've found two ways of checking the status of a subprocess, but both seem to force the process to complete.
One is using process.communicate() and printing the returncode, as explained here: checking status of process with subprocess.Popen in Python.
Another is simply calling process.wait() and checking that it returns 0.
Is there a way to check if a process is still running without waiting for it to complete if it is?

Question: ... a way to check if a process is still running ...
You can do it like this, for instance:
p = subprocess.Popen(...)
# A None value indicates that the process hasn't terminated yet.
poll = p.poll()
if poll is None:
    pass  # the subprocess is still alive
See: Python 3.6.1 documentation, Popen objects
Tested with Python:3.4.2
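For anyone who wants something to paste and run, here is a minimal sketch of the same check. The helper name is_running and the sleeping child started through sys.executable are my own illustrative assumptions, not part of the answer above:

```python
import subprocess
import sys


def is_running(proc):
    """Return True while the child has not terminated (poll() returns None)."""
    return proc.poll() is None


if __name__ == "__main__":
    # Hypothetical child: a Python interpreter that sleeps for a moment.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])
    print("running:", is_running(child))  # poll() returns immediately
    child.wait()                          # block only for the sake of the demo
    print("running:", is_running(child))
```

Note that poll() also reaps the child once it has exited, so the return code stays available in proc.returncode afterwards.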

Doing
myProcessIsRunning = process.poll() is None
as suggested by the main answer is the recommended and simplest way to check whether a process is running (and it works in Jython as well).
If you do not have the process instance at hand, use the operating system's process listing instead (tasklist on Windows, ps on Linux).
On Windows, my command looks as follows:
filterByPid = "PID eq %s" % pid
pidStr = str(pid)
commandArguments = ['cmd', '/c', "tasklist", "/FI", filterByPid, "|", "findstr", pidStr ]
This is essentially doing the same thing as the following command line:
cmd /c "tasklist /FI "PID eq 55588" | findstr 55588"
And on Linux, I do exactly the same using ps:
pidStr = str(pid)
commandArguments = ['ps', '-p', pidStr ]
The ps command already returns exit code 0 or 1 depending on whether the process is found, while on Windows you need the findstr command to get the same effect.
This is the same approach that is discussed on the following stack overflow thread:
Verify if a process is running using its PID in JAVA
NOTE:
If you use this approach, remember to wrap your command call in a try/except:
try:
    foundRunningProcess = subprocess.check_output(argumentsArray, **kwargs)
    return True
except Exception as err:
    return False
Note, be careful if you are developing with VS Code and using pure Python and Jython.
In my environment, I was under the illusion that the poll() method did not work, because a process that I suspected must have ended was in fact still running.
This process had launched Wildfly. After I had asked Wildfly to stop, the shell was still waiting for the user to "Press any key to continue . . .".
In order to finish off this process, in pure python the following code was working:
process.stdin.write(os.linesep)
On jython, I had to fix this code to look as follows:
print >>process.stdin, os.linesep
With this difference the process did indeed finish,
and poll() under Jython started reporting that the process had finished.

As suggested by the other answers None is the designed placeholder for the "return code" when no code has been returned yet by the subprocess.
The documentation for the returncode attribute backs this up (emphasis mine):
The child return code, set by poll() and wait() (and indirectly by communicate()). A None value indicates that the process hasn’t terminated yet.
A negative value -N indicates that the child was terminated by signal N (POSIX only).
An interesting place where this None value occurs is when using the timeout parameter for wait or communicate.
If the process does not terminate after timeout seconds, a TimeoutExpired exception will be raised.
If you catch that exception and check the returncode attribute, it will indeed be None:
import subprocess
with subprocess.Popen(['ping','127.0.0.1']) as p:
    try:
        p.wait(timeout=3)
    except subprocess.TimeoutExpired:
        assert p.returncode is None
If you look at the source for subprocess you can see the exception being raised.
https://github.com/python/cpython/blob/47be7d0108b4021ede111dbd15a095c725be46b7/Lib/subprocess.py#L1930-L1931
If you search that source for "self.returncode is" you'll find many places where the library authors lean on that None return code design to infer whether an app is running or not. The returncode attribute is initialized to None and only ever changes in a few spots, mainly in calls to _handle_exitstatus, which pass on the actual return code.
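To make the timeout behaviour concrete, here is a small sketch that waits for a bounded time and kills the child if it is still running. The helper name wait_or_kill and the sleeping child are assumptions made up for illustration:

```python
import subprocess
import sys


def wait_or_kill(proc, timeout):
    """Wait up to `timeout` seconds; kill the child if it is still running."""
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child has not exited yet, so returncode is still None here.
        assert proc.returncode is None
        proc.kill()
        return proc.wait()  # reap the child; returncode is now set


if __name__ == "__main__":
    # Hypothetical long-running child.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(30)"])
    print("return code:", wait_or_kill(child, timeout=0.5))
```

On POSIX the killed child should report a negative return code (the signal number), in line with the documentation quoted above.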

You could use subprocess.check_output to have a look at your output.
Try this code:
import subprocess
subprocess.check_output(['your command here'], shell=True, stderr=subprocess.STDOUT)
Hope this helped!

Related

Stop a bash script in python [duplicate]

I am currently trying to write (Python 2.7.3) kind of a wrapper for GDB, which will allow me to dynamically switch from scripted input to interactive communication with GDB.
So far I use
self.process = subprocess.Popen(["gdb vuln"], stdin = subprocess.PIPE, shell = True)
to start gdb within my script. (vuln is the binary I want to examine)
Since a key feature of gdb is to pause the execution of the attached process and allow the user to inspect registers and memory on receiving SIGINT (Ctrl+C), I do need some way to pass a SIGINT signal to it.
Neither
self.process.send_signal(signal.SIGINT)
nor
os.kill(self.process.pid, signal.SIGINT)
or
os.killpg(self.process.pid, signal.SIGINT)
work for me.
When I use one of these functions there is no response. I suppose this problem arises from the use of shell=True. However, at this point I am really out of ideas.
Even my old friend Google couldn't really help me out this time, so maybe you can help me. Thanks in advance.
Cheers, Mike

Here is what worked for me:
import signal
import subprocess
try:
    p = subprocess.Popen(...)
    p.wait()
except KeyboardInterrupt:
    p.send_signal(signal.SIGINT)
    p.wait()

I looked deeper into the problem and found some interesting things. Maybe these findings will help someone in the future.
When calling gdb vuln using subprocess.Popen() with shell=True, it does in fact create three processes, where the pid returned is the one of sh (5180).
ps -a
5180 pts/0 00:00:00 sh
5181 pts/0 00:00:00 gdb
5183 pts/0 00:00:00 vuln
Consequently sending a SIGINT to the process will in fact send SIGINT to sh.
Besides, I continued looking for an answer and stumbled upon this post
https://bugzilla.kernel.org/show_bug.cgi?id=9039
To keep it short, what is mentioned there is the following:
When pressing Ctrl+C while using gdb regularly, SIGINT is in fact sent to the examined program (in this case vuln); ptrace then intercepts it and passes it to gdb.
This means that if I use self.process.send_signal(signal.SIGINT), the signal will in fact never reach gdb this way.
Temporary Workaround:
I managed to work around this problem by simply calling subprocess.Popen() as follows:
subprocess.Popen("killall -s INT " + self.binary, shell = True)
This is nothing more than a first workaround. When multiple applications with the same name are running, it might do some serious damage. Besides, it somehow fails if shell=True is not set.
If someone has a better fix (e.g. how to get the pid of the process started by gdb), please let me know.
Cheers, Mike
EDIT:
Thanks to Mark for pointing out to look at the ppid of the process.
I managed to narrow down the processes to which SIGINT is sent using the following approach:
out = subprocess.check_output(['ps', '-Aefj'])
for line in out.splitlines():
    if self.binary in line:
        l = line.split(" ")
        while "" in l:
            l.remove("")
        # Get sid and pgid of child process (/bin/sh)
        sid = os.getsid(self.process.pid)
        pgid = os.getpgid(self.process.pid)
        # only true for target process
        if l[4] == str(sid) and l[3] != str(pgid):
            # pid was undefined in the original snippet; assuming the usual
            # `ps -Aefj` layout (UID PID PPID PGID SID ...), it is column 1
            os.kill(int(l[1]), signal.SIGINT)

I have done something like the following in the past, and as far as I recollect it worked for me:
def detach_processGroup():
    os.setpgrp()
subprocess.Popen(command,
                 stdout=subprocess.PIPE,
                 stderr=subprocess.PIPE,
                 preexec_fn=detach_processGroup)
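A possible variation on the same idea, sketched for Python 3.2+ on POSIX: start_new_session=True makes the child call setsid() before exec, so it leads its own process group, and os.killpg can then signal the whole group. The helper names launch_in_own_group and signal_group are hypothetical, and whether gdb reacts as hoped still depends on the ptrace behaviour described above:

```python
import os
import signal
import subprocess
import sys


def launch_in_own_group(cmd):
    """Start cmd as the leader of a new session and process group (POSIX only)."""
    return subprocess.Popen(cmd, start_new_session=True)


def signal_group(proc, sig=signal.SIGINT):
    """Send sig to every process in the child's process group."""
    os.killpg(os.getpgid(proc.pid), sig)


if __name__ == "__main__":
    # Hypothetical stand-in for the real command.
    child = launch_in_own_group([sys.executable, "-c", "import time; time.sleep(30)"])
    signal_group(child, signal.SIGTERM)
    child.wait()
    print("return code:", child.returncode)
```

Because the signal goes to the group rather than to a single pid, it also reaches processes that a shell wrapper started on the child's behalf, as long as they stayed in that group.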

Polling subprocess object without blocking

I'm writing a python script that launches programs in the background and then monitors to see if they encounter an error. I am using the subprocess module to start the process and keep a list of running programs.
processes.append((subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE), command))
I have found that when I try to monitor the programs by calling communicate on the subprocess object, the main program waits for the program to finish. I have tried to use poll(), but that doesn't give me access to the error code that caused the crash and I would like to address the issue and retry opening the process.
runningProcesses is a list of tuples containing the subprocess object and the command associated with it.
def monitorPrograms(runningProcesses):
    for program in runningProcesses:
        temp = program[0].communicate()
        if program[0].returncode:
            if program[0].returncode == 1:
                print "Program exited successfully."
            else:
                print "Whoops, something went wrong. Program %s crashed." % program[0].pid
When I tried to get the return code without using communicate, the crash of the program didn't register.
Do I have to use threads to run the communication in parallel or is there a simpler way that I am missing?

There is no need to use threads to monitor multiple processes, especially if you don't use their output (use DEVNULL instead of PIPE to hide the output); see Python threading multiple bash subprocesses?
Your main issue is incorrect Popen.poll() usage. If it returns None, the process is still running, and you should keep calling it until you get a non-None value. Here's a code example similar to your case that prints the statuses of ping processes.
If you do want to get the subprocess's stdout/stderr as a string, then you could use threads or asyncio.
If you are on Unix and you control all the code that may spawn subprocesses, then you could avoid polling and handle SIGCHLD yourself. The asyncio stdlib library may handle SIGCHLD. You could also implement it manually, though it might be complicated.
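A rough sketch of that polling loop, assuming Python 3.3+ for DEVNULL. The function names poll_finished and monitor are made up for illustration, and the sleep stands in for whatever other work the parent has to do:

```python
import subprocess
import sys
import time


def poll_finished(running):
    """Split (process, command) pairs into still-running and finished ones.

    Returns (still_running, finished), where finished holds
    (command, returncode) tuples. Never blocks.
    """
    still_running, finished = [], []
    for proc, command in running:
        code = proc.poll()
        if code is None:
            still_running.append((proc, command))
        else:
            finished.append((command, code))
    return still_running, finished


def monitor(commands, interval=0.1):
    """Launch all commands, then poll until each one has exited."""
    running = [(subprocess.Popen(c, stdout=subprocess.DEVNULL,
                                 stderr=subprocess.DEVNULL), c) for c in commands]
    results = []
    while running:
        running, finished = poll_finished(running)
        results.extend(finished)
        if running:
            time.sleep(interval)  # the parent is free to do other work here
    return results
```

Each finished entry carries the real return code, so a crashed command can be detected and relaunched from inside the loop.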

Based on my research, the best way to do this is with threads. Here's an article that I referenced when creating my own package to solve this problem.
The basic method used here is to spin off threads that constantly request log output (and finally the exit status) of the subprocess call.
Here's an example of my own "receiver" which listens for logs:
class Receiver(threading.Thread):
    def __init__(self, stream, stream_type=None, callback=None):
        super(Receiver, self).__init__()
        self.stream = stream
        self.stream_type = stream_type
        self.callback = callback
        self.complete = False
        self.text = ''
    def run(self):
        for line in iter(self.stream.readline, ''):
            line = line.rstrip()
            if self.callback:
                line = self.callback(line, msg_type=self.stream_type)
            self.text += line + "\n"
        self.complete = True
And now the code that spins the receiver off:
def _execute(self, command):
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               shell=True, preexec_fn=os.setsid)
    out = Receiver(process.stdout, stream_type='out', callback=self.handle_log)
    err = Receiver(process.stderr, stream_type='err', callback=self.handle_log)
    out.start()
    err.start()
    try:
        self.wait_for_complete(out)
    except CommandTimeout:
        os.killpg(process.pid, signal.SIGTERM)
        raise
    else:
        status = process.poll()
        output = CommandOutput(status=status, stdout=out.text, stderr=err.text)
        return output
    finally:
        out.join(timeout=1)
        err.join(timeout=1)
CommandOutput is simply a named tuple that makes it easy to reference the data I care about.
You'll notice I have a method 'wait_for_complete' which waits for the receiver to set complete = True. Once complete, the execute method calls process.poll() to get the exit code. We now have all stdout/stderr and the status code of the process.
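For comparison, here is a stripped-down Python 3 sketch of the same reader-thread idea, without the callback and timeout handling. The names _drain and run_and_capture are hypothetical and not taken from the package described above:

```python
import subprocess
import sys
import threading


def _drain(stream, sink):
    """Append every line from stream to sink until EOF."""
    for line in stream:
        sink.append(line.rstrip("\n"))
    stream.close()


def run_and_capture(cmd):
    """Run cmd, collecting stdout and stderr in two reader threads."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                            universal_newlines=True)
    out_lines, err_lines = [], []
    readers = [
        threading.Thread(target=_drain, args=(proc.stdout, out_lines)),
        threading.Thread(target=_drain, args=(proc.stderr, err_lines)),
    ]
    for t in readers:
        t.start()
    for t in readers:
        t.join()          # both pipes hit EOF once the child closes them
    status = proc.wait()  # reap the child and fetch its exit status
    return status, out_lines, err_lines
```

Reading both pipes concurrently keeps the child from blocking on a full pipe buffer while the parent is busy with the other stream.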

Performance of subprocess.check_output vs subprocess.call

I've been using subprocess.check_output() for some time to capture output from subprocesses, but ran into some performance problems under certain circumstances. I'm running this on a RHEL6 machine.
The calling Python environment is linux-compiled and 64-bit. The subprocess I'm executing is a shell script which eventually fires off a Windows python.exe process via Wine (why this foolishness is required is another story). As input to the shell script, I'm piping in a small bit of Python code that gets passed off to python.exe.
While the system is under moderate/heavy load (40 to 70% CPU utilization), I've noticed that using subprocess.check_output(cmd, shell=True) can result in a significant delay (up to ~45 seconds) after the subprocess has finished execution before the check_output command returns. Looking at output from ps -efH during this time shows the called subprocess as sh <defunct>, until it finally returns with a normal zero exit status.
Conversely, using subprocess.call(cmd, shell=True) to run the same command under the same moderate/heavy load will cause the subprocess to return immediately with no delay, all output printed to STDOUT/STDERR (rather than returned from the function call).
Why is there such a significant delay only when check_output() is redirecting the STDOUT/STDERR output into its return value, and not when the call() simply prints it back to the parent's STDOUT/STDERR?

Reading the docs, both subprocess.call and subprocess.check_output are use-cases of subprocess.Popen. One minor difference is that check_output will raise a Python error if the subprocess returns a non-zero exit status. The greater difference is emphasized in the bit about check_output (my emphasis):
The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.
So how is stdout "used internally"? Let's compare call and check_output:
call
def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()
check_output
def check_output(*popenargs, **kwargs):
    if 'stdout' in kwargs:
        raise ValueError('stdout argument not allowed, it will be overridden.')
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
    output, unused_err = process.communicate()
    retcode = process.poll()
    if retcode:
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        raise CalledProcessError(retcode, cmd, output=output)
    return output
communicate
Now we have to look at Popen.communicate as well. Doing this, we notice that for one pipe, communicate does several things which simply take more time than simply returning Popen().wait(), as call does.
For one thing, communicate processes stdout=PIPE whether you set shell=True or not. Clearly, call does not. It just lets your shell spout whatever... making it a security risk, as Python describes here.
Secondly, in the case of check_output(cmd, shell=True) (just one pipe)... whatever your subprocess sends to stdout is processed by a thread in the _communicate method. And Popen must join the thread (wait on it) before additionally waiting on the subprocess itself to terminate!
Plus, more trivially, it processes stdout as a list which must then be joined into a string.
In short, even with minimal arguments, check_output spends a lot more time in Python processes than call does.
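To see the behavioural difference side by side, here is a small sketch. The child commands are hypothetical stand-ins launched via sys.executable, and it says nothing about timing, only about where the output goes and how failures surface:

```python
import subprocess
import sys

CHILD = [sys.executable, "-c", "print('hello')"]  # hypothetical stand-in command

# call(): the child writes straight to the parent's stdout; only the exit
# status comes back.
status = subprocess.call(CHILD)

# check_output(): stdout is captured through a pipe and returned as bytes.
captured = subprocess.check_output(CHILD)

# check_output() raises on a non-zero exit status; call() just returns it.
FAILING = [sys.executable, "-c", "import sys; sys.exit(2)"]
try:
    subprocess.check_output(FAILING)
    failed_code = None
except subprocess.CalledProcessError as exc:
    failed_code = exc.returncode
```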

Let's look at the code. The .check_output has the following wait:
def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
        _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
    """Check if child process has terminated. Returns returncode
    attribute.
    This method is called by __del__, so it cannot reference anything
    outside of the local scope (nor can any methods it calls).
    """
    if self.returncode is None:
        try:
            pid, sts = _waitpid(self.pid, _WNOHANG)
            if pid == self.pid:
                self._handle_exitstatus(sts)
        except _os_error as e:
            if _deadstate is not None:
                self.returncode = _deadstate
            if e.errno == _ECHILD:
                # This happens if SIGCLD is set to be ignored or
                # waiting for child processes has otherwise been
                # disabled for our process. This child is dead, we
                # can't get the status.
                # http://bugs.python.org/issue15756
                self.returncode = 0
    return self.returncode
The .call waits using the following code:
def wait(self):
    """Wait for child process to terminate. Returns returncode
    attribute."""
    while self.returncode is None:
        try:
            pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
        except OSError as e:
            if e.errno != errno.ECHILD:
                raise
            # This happens if SIGCLD is set to be ignored or waiting
            # for child processes has otherwise been disabled for our
            # process. This child is dead, we can't get the status.
            pid = self.pid
            sts = 0
        # Check the pid and loop as waitpid has been known to return
        # 0 even without WNOHANG in odd situations. issue14396.
        if pid == self.pid:
            self._handle_exitstatus(sts)
    return self.returncode
Notice the bug referenced in _internal_poll. It is viewable at http://bugs.python.org/issue15756. Pretty much exactly the issue you are running into.
Edit: The other potential issue between .call and .check_output is that .check_output actually cares about stdin and stdout and will try to perform IO against both pipes. If you are running into a process that gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you are experiencing.
In most cases zombie states get cleaned up pretty quickly, but, they will not if for instance they are interrupted while in a system call (like read or write). Of course the read/write system call should itself be interrupted as soon as the IO can no longer be performed, but, it is possible that you are hitting some sort of race condition where things are getting killed in a bad order.
The only way that I can think of to determine which is the cause in this case is for you to either add debugging code to the subprocess file or to invoke the python debugger and initiate a backtrace when you run into the condition you are experiencing.

How to check if a shell command is over in Python

Let's say that I have this simple line in python:
os.system("sudo apt-get update")
Of course, apt-get will take some time until it's finished. How can I check in Python whether the command has finished yet?
Edit: this is the code with Popen:
os.environ['packagename'] = entry.get_text()
process = Popen(['dpkg-repack', '$packagename'])
if process.poll() is None:
    print "It still working.."
else:
    print "It finished"
Now the problem is that it never prints "It finished", even when the command really has finished.

As the documentation states:
This is implemented by calling the Standard C function system(), and
has the same limitations
The C call to system simply runs the program until it exits. Calling os.system blocks your Python code until the command has finished, so you'll know that it is finished when os.system returns. If you'd like to do other stuff while waiting for the call to finish, there are several possibilities. The preferred way is to use the subprocess module.
from subprocess import Popen
...
# Runs the command in another process. Doesn't block
process = Popen(['ls', '-l'])
# Later
# Returns the return code of the command. None if it hasn't finished
if process.poll() is None:
    pass  # Still running
else:
    pass  # Has finished
Check the link above for more things you can do with Popen
For a more general approach at running code concurrently, you can run that in another thread or process. Here's example code:
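import os
from threading import Thread
...
thread = Thread(target=lambda: os.system("ls -l"))
# start() runs the target in a new thread; calling run() directly would
# execute it in the current thread and block
thread.start()
# Later
if thread.is_alive():
    pass  # Still running
else:
    pass  # Has finished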
Another option would be to use the concurrent.futures module.
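As a sketch of what that could look like (the helper name run_in_background is made up, and the command is a stand-in), a thread pool can run the blocking call while the main thread checks the returned Future:

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_background(executor, cmd):
    """Submit a blocking subprocess.call to the pool; returns a Future."""
    return executor.submit(subprocess.call, cmd)


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = run_in_background(pool, [sys.executable, "-c", "pass"])
        print("finished yet?", future.done())   # non-blocking check
        print("exit status:", future.result())  # blocks until the command ends
```

future.done() plays the same role here as process.poll() is not None does with Popen.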

os.system will actually wait for the command to finish and return the exit status (in a platform-dependent format).

os.system is blocking; it calls the command, waits for its completion, and returns its return code.
So, it'll be finished once os.system returns.
If your code isn't working, I think that could be caused by one of sudo's quirks: it refuses to grant rights in certain environments (I don't know the details, though).

Python: How to determine subprocess children have all finished running

I am trying to detect when an installation program finishes executing from within a Python script. Specifically, the application is the Oracle 10gR2 Database. Currently I am using the subprocess module with Popen. Ideally, I would simply use the wait() method to wait for the installation to finish executing; however, the documented command actually spawns child processes to handle the actual installation. Here is a sample of the failing code:
import os
import subprocess
OUI_DATABASE_10GR2_SUBPROCESS = ['sudo',
                                 '-u',
                                 'oracle',
                                 os.path.join(DATABASE_10GR2_TMP_PATH,
                                              'database',
                                              'runInstaller'),
                                 '-ignoreSysPrereqs',
                                 '-silent',
                                 '-noconfig',
                                 '-responseFile '+ORACLE_DATABASE_10GR2_SILENT_RESPONSE]
oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
oracle_subprocess.wait()
There is a similar question here: Killing a subprocess including its children from python, but the selected answer does not address the children issue, instead it instructs the user to call directly the application to wait for. I am looking for a specific solution that will wait for all children of the subprocess. What if there are an unknown number of subprocesses? I will select the answer that addresses the issue of waiting for all children subprocesses to finish.
More clarity on failure: The child processes continue executing after the wait() command since that command only waits for the top level process (in this case it is 'sudo'). Here is a simple diagram of the known child processes in this problem:
Python subprocess module -> Sudo -> runInstaller -> java -> (unknown)

Ok, here is a trick that will work only under Unix. It is similar to one of the answers to this question: Ensuring subprocesses are dead on exiting Python program. The idea is to create a new process group. You can then wait for all processes in the group to terminate.
pid = os.fork()
if pid == 0:
    os.setpgrp()
    oracle_subprocess = subprocess.Popen(OUI_DATABASE_10GR2_SUBPROCESS)
    oracle_subprocess.wait()
    os._exit(0)
else:
    # os.waitpid requires an options argument; 0 means a blocking wait
    os.waitpid(-pid, 0)
I have not tested this. It creates an extra subprocess to be the leader of the process group, but avoiding that is (I think) quite a bit more complicated.
I found this web page to be helpful as well. http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/

You can just use os.waitpid with the pid set to -1, which waits for a child process of the current process to finish:
import os
import sys
import subprocess
proc = subprocess.Popen([sys.executable,
                         '-c',
                         'import subprocess;'
                         'subprocess.Popen("sleep 5", shell=True).wait()'])
pid, status = os.waitpid(-1, 0)
print pid, status
This is the result of pstree <pid> of different subprocess forked:
python───python───sh───sleep
Hope this can help :)
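One caveat worth spelling out: a single os.waitpid(-1, 0) call returns as soon as one child exits. To wait for every direct child, it can be called in a loop until the OS reports that none are left. The sketch below is POSIX only, the function name wait_for_all_children is made up, and it still covers only direct children, not grandchildren whose parents do not wait for them:

```python
import os
import subprocess
import sys


def wait_for_all_children():
    """Block until every direct child has exited (POSIX only).

    Returns a dict mapping pid -> raw wait status.
    """
    statuses = {}
    while True:
        try:
            pid, status = os.waitpid(-1, 0)
        except ChildProcessError:
            return statuses  # ECHILD: no children left to wait for
        statuses[pid] = status
```

Since this reaps the children itself, the matching Popen objects can no longer retrieve the real exit status afterwards, so read the statuses from the returned dict instead.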

Check out the following link http://www.oracle-wiki.net/startdocsruninstaller which details a flag you can use for the runInstaller command.
This flag is definitely available for 11gR2, but I have not got a 10g database to try out this flag for the runInstaller packaged with that version.
Regards

Everywhere I look seems to say it's not possible to solve this in the general case. I've whipped up a library called 'pidmon' that combines some answers for Windows and Linux and might do what you need.
I'm planning to clean this up and put it on github, possibly called 'pidmon' or something like that. I'll post a link if/when I get it up.
EDIT: It's available at http://github.com/dbarnett/python-pidmon.
I made a special waitpid function that accepts a graft_func argument so that you can loosely define what sort of processes you want to wait for when they're not direct children:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, recursive=True,
graft_func=(lambda p: p.name == '???' and p.parent.pid == ???))
or, as a shotgun approach, to just wait for any processes started since the call to waitpid to stop again, do:
import pidmon
pidmon.waitpid(oracle_subprocess.pid, graft_func=(lambda p: True))
Note that this is still barely tested on Windows and seems very slow on Windows (but did I mention it's on github where it's easy to fork?). This should at least get you started, and if it works at all for you, I have plenty of ideas on how to optimize it.
