Performance of subprocess.check_output vs subprocess.call - python

I've been using subprocess.check_output() for some time to capture output from subprocesses, but ran into some performance problems under certain circumstances. I'm running this on a RHEL6 machine.
The calling Python environment is linux-compiled and 64-bit. The subprocess I'm executing is a shell script which eventually fires off a Windows python.exe process via Wine (why this foolishness is required is another story). As input to the shell script, I'm piping in a small bit of Python code that gets passed off to python.exe.
While the system is under moderate/heavy load (40 to 70% CPU utilization), I've noticed that using subprocess.check_output(cmd, shell=True) can result in a significant delay (up to ~45 seconds) after the subprocess has finished execution before the check_output command returns. Looking at output from ps -efH during this time shows the called subprocess as sh <defunct>, until it finally returns with a normal zero exit status.
Conversely, using subprocess.call(cmd, shell=True) to run the same command under the same moderate/heavy load will cause the subprocess to return immediately with no delay, with all output printed to STDOUT/STDERR (rather than returned from the function call).
Why is there such a significant delay only when check_output() is redirecting the STDOUT/STDERR output into its return value, and not when the call() simply prints it back to the parent's STDOUT/STDERR?

Reading the docs, both subprocess.call and subprocess.check_output are use-cases of subprocess.Popen. One minor difference is that check_output will raise a Python error if the subprocess returns a non-zero exit status. The greater difference is emphasized in the bit about check_output (my emphasis):
The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.
So how is stdout "used internally"? Let's compare call and check_output:
def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()

def check_output(*popenargs, **kwargs):
    if 'stdout' in kwargs:
        raise ValueError('stdout argument not allowed, it will be overridden.')
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
    output, unused_err = process.communicate()
    retcode = process.poll()
    if retcode:
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        raise CalledProcessError(retcode, cmd, output=output)
    return output
Now we have to look at Popen.communicate as well. Doing this, we notice that, for one pipe, communicate does several things that take more time than simply returning Popen().wait(), as call does.
For one thing, communicate processes stdout=PIPE whether you set shell=True or not. Clearly, call does not. It just lets your shell spout whatever... making it a security risk, as the Python documentation describes.
Secondly, in the case of check_output(cmd, shell=True) (just one pipe)... whatever your subprocess sends to stdout is processed by a thread in the _communicate method. And Popen must join the thread (wait on it) before additionally waiting on the subprocess itself to terminate!
Plus, more trivially, it processes stdout as a list which must then be joined into a string.
In short, even with minimal arguments, check_output spends a lot more time in Python processes than call does.
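To make the difference concrete, here is a minimal sketch of the two code paths written out with Popen directly. The names call_like, check_output_like and CMD are made up for this illustration, and the child is a hypothetical Python one-liner rather than the shell script from the question.

```python
import subprocess
import sys

# Hypothetical child command: a tiny Python one-liner that prints one line.
CMD = [sys.executable, "-c", "print('hello from child')"]

def call_like(cmd):
    """Roughly what subprocess.call does: start the child and wait for it."""
    # DEVNULL keeps the child's output out of our own stdout in this demo;
    # subprocess.call itself lets the child inherit the parent's stdout.
    return subprocess.Popen(cmd, stdout=subprocess.DEVNULL).wait()

def check_output_like(cmd):
    """Roughly what check_output does: read the pipe to EOF, then reap."""
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    output, _ = process.communicate()  # returns only after stdout hits EOF
    retcode = process.poll()
    if retcode:
        raise subprocess.CalledProcessError(retcode, cmd, output=output)
    return output

status = call_like(CMD)
captured = check_output_like(CMD)
```

The first function only waits on the child's exit status, while the second also has to drain the pipe before it can return, which is the extra work described above.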

Let's look at the code. The .check_output has the following wait:
def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
        _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
    """Check if child process has terminated. Returns returncode
    attribute.

    This method is called by __del__, so it cannot reference anything
    outside of the local scope (nor can any methods it calls).
    """
    if self.returncode is None:
        try:
            pid, sts = _waitpid(self.pid, _WNOHANG)
            if pid == self.pid:
                self._handle_exitstatus(sts)
        except _os_error as e:
            if _deadstate is not None:
                self.returncode = _deadstate
            if e.errno == _ECHILD:
                # This happens if SIGCLD is set to be ignored or
                # waiting for child processes has otherwise been
                # disabled for our process. This child is dead, we
                # can't get the status.
                self.returncode = 0
    return self.returncode
The .call waits using the following code:
def wait(self):
    """Wait for child process to terminate. Returns returncode
    attribute."""
    while self.returncode is None:
        try:
            pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
        except OSError as e:
            if e.errno != errno.ECHILD:
                raise
            # This happens if SIGCLD is set to be ignored or waiting
            # for child processes has otherwise been disabled for our
            # process. This child is dead, we can't get the status.
            pid = self.pid
            sts = 0
        # Check the pid and loop as waitpid has been known to return
        # 0 even without WNOHANG in odd situations. issue14396.
        if pid == self.pid:
            self._handle_exitstatus(sts)
    return self.returncode
Notice the bug referenced in that last comment, issue14396, which is related to _internal_poll. It is pretty much exactly the issue you are running into.
Edit: The other potential issue between .call and .check_output is that .check_output actually cares about stdin and stdout and will try to perform IO against both pipes. If you are running a process that gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you are experiencing.
In most cases zombie states get cleaned up pretty quickly, but they will not be if, for instance, the process is interrupted while in a system call (like read or write). Of course, the read/write system call should itself be interrupted as soon as the IO can no longer be performed, but it is possible that you are hitting some sort of race condition where things are getting killed in a bad order.
The only way that I can think of to determine which is the cause in this case is for you to either add debugging code to the subprocess file or to invoke the python debugger and initiate a backtrace when you run into the condition you are experiencing.


Python subprocess: does it auto-close on completion?

I apologize if this is a dumb question, however, I am not very fluent in Python yet.
In regards to the Python Subprocess function...
I've seen that when you use sp = subprocess.Popen(...) people close/terminate it when it's finished running the command. Example:
sp = subprocess.Popen(['powershell.exe', '-ExecutionPolicy', 'Unrestricted', 'cp', '-r', 'ui', f'..\\{name}'], stdout=subprocess.PIPE, stderr=subprocess.PIPE, cwd='UI Boiler')
However, my question is, do you need to close any functions? Or do those processes close automatically once they are finished running their commands?
The project I am working on requires a lot of those to be run and I do not wish to have 10+ shells/powershells/processes open because I didn't close them.
Yes. On both Windows and POSIX implementations, subprocess.call as well as subprocess.run will block until completion, e.g. via Popen.wait() internally. Since this is a blocking call, it waits for the process to complete before returning, so you should not need to do anything special to close processes.
To wit, here are the relevant snippets from the subprocess source in cpython-3.10 (amended for brevity):
def call(*popenargs, timeout=None, **kwargs):
    with Popen(*popenargs, **kwargs) as p:
        try:
            return p.wait(timeout=timeout)
        except:  # Including KeyboardInterrupt, wait handled that.
            p.kill()
            # ...
            raise

def run(*popenargs,
        input=None, capture_output=False, timeout=None, check=False, **kwargs):
    # ...
    with Popen(*popenargs, **kwargs) as process:
        try:
            # communicate() (as well as the with statement) will both wait() internally
            stdout, stderr = process.communicate(input, timeout=timeout)
        except TimeoutExpired as exc:
            process.kill()
            # ... additional handling here
            raise
        except:  # Including KeyboardInterrupt, communicate handled that.
            process.kill()
            # We don't call process.wait() as .__exit__ does that for us.
            raise
        retcode = process.poll()
        if check and retcode:
            raise CalledProcessError(retcode, process.args,
                                     output=stdout, stderr=stderr)
    return CompletedProcess(process.args, retcode, stdout, stderr)
If, however, you want more control over if and when the subprocess blocks, e.g. so that you can run other code on the same thread while the other process is running, then you should use sp = subprocess.Popen() directly.
As to the call to terminate(): note that this would always be a no-op in your example of waiting first and then terminating without a catch. The reason is that terminate(), as implemented, will never even bother sending a TERM signal to your subprocess, because wait() is a blocking call that will not return until the child process completes or an exception is thrown (e.g. on timeout). Again, if you are calling subprocesses that might hang and you want to run them in the background, e.g. so you can terminate them yourself if they haven't completed after a certain amount of time, you will need to manage the subprocess yourself, and a blocking helper is probably not suitable for your needs.
A note on terminate(): subprocess.call and subprocess.run both automatically send a kill to an erroring or timed-out process, so if that's all you need, you can stick with one of those. In fact, on Windows, kill() and terminate() are identical. On POSIX, a SIGKILL will be sent if the subprocess throws or times out.
If, on POSIX, you would rather send a SIGTERM so that the subprocess gets the opportunity to terminate gracefully or clean up, then again it's best to interact with the process object directly via Popen.
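As a rough sketch of that last point, the helper below waits with a timeout, asks the child to stop with terminate(), and only falls back to kill() if the child ignores the request. The function name run_with_graceful_stop, the grace parameter and the sleeping child are all assumptions made for this illustration.

```python
import subprocess
import sys

def run_with_graceful_stop(cmd, timeout, grace=5.0):
    """Run cmd; on timeout send terminate(), then kill() if it lingers."""
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.terminate()  # SIGTERM on POSIX, TerminateProcess on Windows
        try:
            return proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()  # SIGKILL on POSIX
            return proc.wait()

# Hypothetical long-running child: sleeps for 30 seconds.
sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
returncode = run_with_graceful_stop(sleeper, timeout=0.5)
```

On POSIX a child stopped by a signal reports a negative return code, so the caller can tell a graceful stop from a normal exit.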

How to asynchronously call a shell script from Python?

I have a shell script which does some processing over the string passed and then writes it to a file. However I don't want my function foo() to wait for it to complete the operation. How do I call process(msg) and then move on with the execution of {code block 2} without waiting for process(msg) to complete execution?
def process(msg):
    subprocess.call(['sh', './', msg])

def foo():
    # {code block 1}
    process(msg)
    # {code block 2}
foo() will be called from another function, almost once or twice per second.
Just for completeness: Python's asyncio offers a high-level interface for doing just that.
Example from the documentation:
import asyncio

async def run(cmd):
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    stdout, stderr = await proc.communicate()

    print(f'[{cmd!r} exited with {proc.returncode}]')
    if stdout:
        print(f'[stdout]\n{stdout.decode()}')
    if stderr:
        print(f'[stderr]\n{stderr.decode()}')

asyncio.run(run('ls /zzz'))
subprocess.call() and subprocess.run() create a process and wait for it to finish; run() returns a CompletedProcess object.
subprocess.Popen() creates a process and returns it. It is used under the hood of the previous functions. You can then wait for the process to finish, send it messages, or whatever else you want to do with it. The arguments are mostly the same as to call or run.
As a bit of elaboration, Popen is the Python implementation of using the OS to start a new process. os.fork() is a lower-level call that doesn't actually do what we want here: it would spawn another instance of the Python interpreter with the same memory state as the current one. If you wanted to use a lower-level call, the os.spawn* family is closer to what we want than os.fork.
To verify that Popen is doing what you want, this test program will print "p returncode = None", then wait 5 seconds, and print "p returncode = 0".
from subprocess import Popen
p = Popen(["sleep", "5"])
print("started the proc") # this will print immediately
p.poll() # this checks if the process is done but does not block
print(f"p returncode = {p.returncode}")
p.wait() # this blocks until the process exits
print(f"p returncode = {p.returncode}")
What you need is os.fork(); that way you can spawn a child that can outlive the parent process and can later be claimed by systemd on Linux. I have no clue about Windows.
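For the case in the question, a fire-and-forget launch can also be sketched with Popen alone. This is only an illustration under stated assumptions: process_async and pending are made-up names, the child is a hypothetical Python one-liner standing in for the shell script, and start_new_session is a POSIX-only option.

```python
import subprocess
import sys

def process_async(msg):
    """Start the worker and return at once without waiting for it."""
    # Hypothetical stand-in for the shell script: a Python one-liner that
    # receives msg as its first argument.
    cmd = [sys.executable, "-c",
           "import sys; sys.exit(0 if sys.argv[1] else 1)", msg]
    return subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # POSIX only: run the child in its own session
    )

pending = []

def foo(msg):
    # {code block 1}
    pending.append(process_async(msg))
    # {code block 2} runs immediately, without waiting for the child
    # Drop finished children; poll() also reaps them so no zombies linger.
    pending[:] = [p for p in pending if p.poll() is None]

foo("hello")
handle = process_async("hello")
exit_status = handle.wait()  # only for this demo; foo() never blocks like this
```

Since foo() is called once or twice per second, pruning the list on every call keeps the number of unreaped children small.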

Is there a way to check if a subprocess is still running?

I'm launching a number of subprocesses with subprocess.Popen in Python.
I'd like to check whether one such process has completed. I've found two ways of checking the status of a subprocess, but both seem to force the process to complete.
One is using process.communicate() and printing the returncode, as explained here: checking status of process with subprocess.Popen in Python.
Another is simply calling process.wait() and checking that it returns 0.
Is there a way to check if a process is still running without waiting for it to complete if it is?
Question: ... a way to check if a process is still running ...
You can do it, for instance, like this:
p = subprocess.Popen(...
# A None value indicates that the process hasn't terminated yet.
poll = p.poll()
if poll is None:
    # p.subprocess is alive
Source: Python 3.6.1 documentation, Popen Objects
Tested with Python:3.4.2
Doing
myProcessIsRunning = poll() is None
as suggested by the main answer is the recommended and simplest way to check whether a process is running (and it works in Jython as well).
If you do not have the process instance in hand to check it, then use the operating system's tasklist / ps commands.
On windows, my command will look as follows:
filterByPid = "PID eq %s" % pid
pidStr = str(pid)
commandArguments = ['cmd', '/c', "tasklist", "/FI", filterByPid, "|", "findstr", pidStr ]
This is essentially doing the same thing as the following command line:
cmd /c "tasklist /FI "PID eq 55588" | findstr 55588"
And on Linux, I do exactly the same using ps:
pidStr = str(pid)
commandArguments = ['ps', '-p', pidStr ]
The ps command already returns exit code 0 or 1 depending on whether the process is found, while on Windows you need the findstr command.
This is the same approach that is discussed on the following stack overflow thread:
Verify if a process is running using its PID in JAVA
If you use this approach, remember to wrap your command call in a try/except:
try:
    foundRunningProcess = subprocess.check_output(argumentsArray, **kwargs)
    return True
except Exception as err:
    return False
Note: be careful if you are developing with VS Code and using both pure Python and Jython.
In my environment, I was under the illusion that the poll() method did not work, because a process that I suspected must have ended was in fact still running.
This process had launched Wildfly. After I had asked Wildfly to stop, the shell was still waiting for the user to "Press any key to continue . . .".
In order to finish off this process, in pure python the following code was working:
On jython, I had to fix this code to look as follows:
print >>process.stdin, os.linesep
With this difference the process did indeed finish, and poll() on Jython started telling me that the process had indeed finished.
As suggested by the other answers None is the designed placeholder for the "return code" when no code has been returned yet by the subprocess.
The documentation for the returncode attribute backs this up (emphasis mine):
The child return code, set by poll() and wait() (and indirectly by communicate()). A None value indicates that the process hasn’t terminated yet.
A negative value -N indicates that the child was terminated by signal N (POSIX only).
An interesting place where this None value occurs is when using the timeout parameter for wait or communicate.
If the process does not terminate after timeout seconds, a TimeoutExpired exception will be raised.
If you catch that exception and check the returncode attribute it will indeed be None
import subprocess
with subprocess.Popen(['ping','']) as p:
    try:
        p.wait(timeout=1)  # assumed: any short timeout works here
    except subprocess.TimeoutExpired:
        assert p.returncode is None
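A self-contained variant of the same check, as a sketch: the child here is a hypothetical Python one-liner that sleeps, chosen so the example does not depend on ping, and the variable names are made up for the illustration.

```python
import subprocess
import sys

# Hypothetical child that outlives the timeout: sleeps for 30 seconds.
child = [sys.executable, "-c", "import time; time.sleep(30)"]

with subprocess.Popen(child) as p:
    try:
        p.wait(timeout=0.5)
        code_at_timeout = "finished"
    except subprocess.TimeoutExpired:
        code_at_timeout = p.returncode  # still None: the child is running
        p.kill()                        # stop it so the with block can exit
final_code = p.returncode
```

Leaving the with block waits for the child, so final_code holds a real exit status once the block has ended.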
If you look at the source for subprocess you can see the exception being raised.
If you search that source for self.returncode is, you'll find many places where the library authors lean on that None return code design to infer whether an app is running or not. The returncode attribute is initialized to None and only ever changes in a few spots, mainly in calls to _handle_exitstatus, which pass on the actual return code.
You could use subprocess.check_output to have a look at your output.
Try this code:
import subprocess
subprocess.check_output(['your command here'], shell=True, stderr=subprocess.STDOUT)
Hope this helped!

Polling subprocess object without blocking

I'm writing a python script that launches programs in the background and then monitors to see if they encounter an error. I am using the subprocess module to start the process and keep a list of running programs.
processes.append((subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE), command))
I have found that when I try to monitor the programs by calling communicate on the subprocess object, the main program waits for the program to finish. I have tried to use poll(), but that doesn't give me access to the error code that caused the crash and I would like to address the issue and retry opening the process.
runningProcesses is a list of tuples containing the subprocess object and the command associated with it.
def monitorPrograms(runningProcesses):
    for program in runningProcesses:
        temp = program[0].communicate()
        if program[0].returncode:
            if program[0].returncode == 1:
                print "Program exited successfully."
            else:
                print "Whoops, something went wrong. Program %s crashed." % program[0].pid
When I tried to get the return code without using communicate, the crash of the program didn't register.
Do I have to use threads to run the communication in parallel or is there a simpler way that I am missing?
There is no need to use threads to monitor multiple processes, especially if you don't use their output (use DEVNULL instead of PIPE to hide the output); see Python threading multiple bash subprocesses?
Your main issue is incorrect Popen.poll() usage. If it returns None, the process is still running, and you should keep calling it until you get a non-None value. Here's a code example similar to your case that prints the statuses of ping processes.
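A minimal sketch of that polling pattern, with hypothetical short-lived Python children standing in for the ping processes so that it runs anywhere. The names commands, running and results are made up for the illustration.

```python
import subprocess
import sys
import time

# Hypothetical workload: three children that exit with codes 0, 1 and 2.
commands = [[sys.executable, "-c", f"import sys; sys.exit({code})"]
            for code in (0, 1, 2)]
running = [(subprocess.Popen(cmd), cmd) for cmd in commands]
results = {}

while running:
    still_running = []
    for proc, cmd in running:
        status = proc.poll()          # None while the child is alive
        if status is None:
            still_running.append((proc, cmd))
        else:
            results[proc.pid] = status
            if status != 0:
                print(f"pid {proc.pid} failed with exit code {status}")
    running = still_running
    if running:
        time.sleep(0.1)               # avoid a busy loop
```

The exit code is available as soon as poll() stops returning None, so a failed command can be retried at that point without calling communicate().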
If you do want to get the subprocess's stdout/stderr as a string, then you could use threads.
If you are on Unix and you control all the code that may spawn subprocesses then you could avoid polling and handle SIGCHLD yourself. asyncio stdlib library may handle SIGCHLD. You could also implement it manually, though it might be complicated.
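As a sketch of the asyncio route, each child below is awaited in its own coroutine, so nothing polls in a loop. The function names and the exit codes are assumptions for the illustration, and the children are hypothetical Python one-liners.

```python
import asyncio
import sys

async def run_and_report(code):
    # Hypothetical child: a Python one-liner that exits with the given code.
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", f"import sys; sys.exit({code})")
    returncode = await proc.wait()    # suspends this coroutine, not the loop
    return code, returncode

async def main():
    return await asyncio.gather(*(run_and_report(c) for c in (0, 3)))

statuses = asyncio.run(main())
```

gather() returns the results in the order the coroutines were passed in, so each status can be matched back to its command.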
Based on my research, the best way to do this is with threads. Here's an article that I referenced when creating my own package to solve this problem.
The basic method used here is to spin off threads that constantly request log output (and finally the exit status) of the subprocess call.
Here's an example of my own "receiver" which listens for logs:
class Receiver(threading.Thread):
    def __init__(self, stream, stream_type=None, callback=None):
        super(Receiver, self).__init__()
        self.stream = stream
        self.stream_type = stream_type
        self.callback = callback
        self.complete = False
        self.text = ''

    def run(self):
        # Read lines until the stream reaches EOF.
        for line in iter(self.stream.readline, ''):
            line = line.rstrip()
            if self.callback:
                line = self.callback(line, msg_type=self.stream_type)
            self.text += line + "\n"
        self.complete = True
And now the code that spins the receiver off:
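def _execute(self, command):
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                               shell=True, preexec_fn=os.setsid)
    out = Receiver(process.stdout, stream_type='out', callback=self.handle_log)
    err = Receiver(process.stderr, stream_type='err', callback=self.handle_log)
    out.start()
    err.start()
    try:
        # wait_for_complete is not shown here; its arguments are assumed.
        self.wait_for_complete(out)
        self.wait_for_complete(err)
    except CommandTimeout:
        # os.setsid made the child a group leader, so its pid is the group id.
        os.killpg(process.pid, signal.SIGTERM)
        raise
    status = process.poll()
    output = CommandOutput(status=status, stdout=out.text, stderr=err.text)
    return output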
CommandOutput is simply a named tuple that makes it easy to reference the data I care about.
You'll notice I have a method 'wait_for_complete' which waits for the receiver to set complete = True. Once complete, the execute method calls process.poll() to get the exit code. We now have all stdout/stderr and the status code of the process.

Is it necessary to call Popen.wait() to "clean up" after the Popen object?

I am using Popen to maintain a pool of subprocesses in a Python program. There are natural points in my program to perform "cleanup" - at these points I call Popen.poll() to determine whether a particular process is still running, and if not, I remove its Popen object from the pool and reclaim whatever resources it was using.
Is there any need to call Popen.wait() in order to perform some kind of language or OS level cleanup? The call to Popen.poll() has already determined that the process has terminated, and it even sets the returncode attribute. Is there any additional reason to call Popen.wait() as well?
No, you don't have to call wait if you are calling poll. They basically do the same thing, except that wait blocks until the process has exited.
# poll():
if self.returncode is None:
    if _WaitForSingleObject(self._handle, 0) == _WAIT_OBJECT_0:
        self.returncode = _GetExitCodeProcess(self._handle)
return self.returncode

# wait():
if self.returncode is None:
    _subprocess.WaitForSingleObject(self._handle, _subprocess.INFINITE)
    self.returncode = _subprocess.GetExitCodeProcess(self._handle)
return self.returncode
This is the code for the Windows implementation of the subprocess module, but all others should follow the same rules.
On Mac OS X, and I assume the implementation for Linux is the same, both call os.waitpid.
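A quick way to see this for yourself, as a sketch: the child below is a hypothetical Python one-liner that exits with status 7, and the variable names are made up for the illustration.

```python
import subprocess
import sys
import time

proc = subprocess.Popen([sys.executable, "-c", "import sys; sys.exit(7)"])

# Poll until the child has exited; no explicit wait() is involved.
while proc.poll() is None:
    time.sleep(0.05)

code_from_poll = proc.returncode
code_from_wait = proc.wait()   # returns at once: the status is already cached
```

Once poll() has returned a non-None value the exit status has been collected, and a later wait() just hands back the cached returncode.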

