Using subprocess wait() and poll() - python

I am trying to write a small app that uses the subprocess module.
My program calls an external Bash command that takes some time to process. During this time, I would like to show the user a series of messages like this:
Processing. Please wait...
The output is foo()
How can I do this using Popen.wait() or Popen.poll()? I have read that I need to use Popen.returncode, but I don't know how to get it to actively check the state.

poll() returns None if the process has not yet finished, and something different once it has (an integer, the exit code, hopefully 0). wait() returns that same exit code; with a timeout specified, it raises TimeoutExpired rather than returning None if the process is still running.
Edit:
wait() and poll() have different behaviors:
wait (without the timeout argument) will block and wait for the process to complete.
wait with the timeout argument will wait timeout seconds for the process to complete. If it doesn't complete, it will throw the TimeoutExpired exception. If you catch the exception, you're then welcome to go on, or to wait again.
poll always returns immediately. It effectively does a wait with a timeout of 0, catches any exception, and returns None if the process hasn't completed.
With either wait or poll, if the process has completed, the popen object's returncode will be set (otherwise it's None - you can check for that as easily as calling wait or poll), and the return value from the function will also be the process's return code.
</Edit>
So I think you should do something like:
while myprocess.poll() is None:
    print("Still working...")
    time.sleep(1)  # sleep a while (requires "import time")
Be aware that if the bash script produces a lot of output, you must use communicate() or something similar to keep the stdout or stderr pipe from filling up and blocking the script.
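To tie the polling loop and the warning about output together, here is a minimal sketch. It assumes the output can be sent to a temporary file rather than a pipe, so the child can never block on a full pipe while the parent is busy polling. The function name run_with_messages and its parameters are made up for illustration.

```python
import subprocess
import sys
import tempfile
import time


def run_with_messages(cmd, interval=0.5, notify=print):
    """Run cmd, call notify() while it works, return (exit_code, output_text)."""
    # Assumption: output goes to a temporary file instead of a pipe,
    # so the child cannot stall on a full pipe while we are polling.
    with tempfile.TemporaryFile() as out:
        proc = subprocess.Popen(cmd, stdout=out, stderr=subprocess.STDOUT)
        while proc.poll() is None:  # None means "still running"
            notify("Processing. Please wait...")
            time.sleep(interval)
        out.seek(0)  # rewind to read what the child wrote
        return proc.returncode, out.read().decode(errors="replace")


if __name__ == "__main__":
    code, text = run_with_messages(
        [sys.executable, "-c", "import time; time.sleep(1); print('foo()')"])
    print("The output is", text.strip(), "(exit code", code, ")")
```

If the output has to be processed while the command is still running, a pipe plus communicate() (or a reader thread) would be needed instead of the temporary file.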

@extraneon's answer is a little backwards. Both wait() and poll() return the process's exit code if the process has finished. The poll() method will return None if the process is still running and the wait() method will block until the process exits.
Check out the following page: https://docs.python.org/3.4/library/subprocess.html#popen-objects
Popen.poll()
Check if child process has terminated. Set and return returncode attribute.
Popen.wait()
Wait for child process to terminate. Set and return returncode attribute.
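For the use case in the question, the blocking wait() can still be used to show progress if it is given a timeout and called in a loop. A small sketch, with a hypothetical helper name and parameters of my own choosing:

```python
import subprocess
import sys


def wait_with_progress(proc, interval=1.0, on_tick=print):
    """Block in short slices until proc exits, then return its exit code."""
    while True:
        try:
            # Returns the exit code as soon as the child has finished.
            return proc.wait(timeout=interval)
        except subprocess.TimeoutExpired:
            # The child is still running after `interval` seconds.
            on_tick("Processing. Please wait...")


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(2)"])
    print("exit code:", wait_with_progress(child, interval=0.5))
```

Compared with a poll() plus sleep loop, this returns as soon as the child exits instead of finishing the current sleep first.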

Related

How to asynchronously call a shell script from Python?

I have a shell script which does some processing over the string passed and then writes it to a file. However, I don't want my function foo() to wait for it to complete the operation. How do I call process(msg) and then move on with the execution of {code block 2} without waiting for process(msg) to complete execution?
def process(msg):
    subprocess.call(['sh', './process.sh', msg])
def foo():
    # {code block 1}
    process(msg)
    # {code block 2}
foo() will be called from another function, almost once or twice per second.
Just for completeness: Python's asyncio offers a high level interface for doing just that:
https://docs.python.org/3.9/library/asyncio-subprocess.html#subprocesses
Example from documentation:
import asyncio
async def run(cmd):
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)
    stdout, stderr = await proc.communicate()
    print(f'[{cmd!r} exited with {proc.returncode}]')
    if stdout:
        print(f'[stdout]\n{stdout.decode()}')
    if stderr:
        print(f'[stderr]\n{stderr.decode()}')
asyncio.run(run('ls /zzz'))
subprocess.call() and subprocess.run() both create a process and wait for it to finish. call() returns the exit code as an integer, while run() returns a CompletedProcess object.
subprocess.Popen() creates a process and returns it. It is used under the hood of the previous functions. You can then wait for the process to finish, send it messages, or whatever else you want to do with it. The arguments are mostly the same as to call or run.
https://docs.python.org/3/library/subprocess.html
As a bit of elaboration, Popen is Python's interface for asking the OS to start a new process. os.fork() is a lower-level call that doesn't do what we want here: it would spawn another instance of the Python interpreter with the same memory state as the current one. If you wanted to use a lower-level call, the os.spawn* functions are closer to subprocess.run than os.fork is.
To verify that Popen is doing what you want, this test program will print "p returncode = None", then wait 5 seconds, and print "p returncode = 0".
from subprocess import Popen
p = Popen(["sleep", "5"])
print("started the proc") # this will print immediately
p.poll() # this checks if the process is done but does not block
print(f"p returncode = {p.returncode}")
p.wait() # this blocks until the process exits
print(f"p returncode = {p.returncode}")
What you need is os.fork() (https://docs.python.org/3/library/os.html#os.fork). That way you can spawn a child that can outlive the parent process and later be adopted by systemd on Linux. I have no clue about Windows.

Python subprocess polling not giving return code when used with Java process

I'm having a problem with subprocess poll not returning the return code when the process has finished.
I found out how to set a timeout on subprocess.Popen and used that as the basis for my code. However, I have a call that uses Java that doesn't correctly report the return code so each call "times out" even though it is actually finished. I know the process has finished because when removing the poll timeout check, the call runs without issue returning a good exit code and within the time limit.
Here is the code I am testing with.
import subprocess
import time
def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    print('wait')
    wait = 10
    while process.poll() is None and wait > 0:
        time.sleep(1)
        wait -= 1
    print('done')
    if wait == 0:
        print('terminate')
        process.terminate()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    if exit_code != 0:
        print('got bad rc')
if __name__ == '__main__':
    execute(['ping','-n','15','127.0.0.1']) # correctly times out
    execute(['ping','-n','5','127.0.0.1']) # correctly runs within the time limit
    # incorrectly times out
    execute(['C:\\dev\\jdk8\\bin\\java.exe', '-jar', 'JMXQuery-0.1.8.jar', '-url', 'service:jmx:rmi:///jndi/rmi://localhost:18080/jmxrmi', '-json', '-q', 'java.lang:type=Runtime;java.lang:type=OperatingSystem'])
You can see that two examples are designed to time out and two are not to time out and they all work correctly. However, the final one (using jmxquery to get tomcat metrics) doesn't return the exit code and therefore "times out" and has to be terminated, which then causes it to return an error code of 1.
Is there something I am missing in the way subprocess poll is interacting with this Java process that is causing it to not return an exit code? Is there a way to get a timeout option to work with this?
This has the same cause as a number of existing questions, but the desire to impose a timeout requires a different answer.
The OS deliberately gives only a small amount of buffer space to each pipe. When a process writes to one that is full (because the reader has not yet consumed the previous output), it blocks. (The reason is that a producer that is faster than its consumer would otherwise be able to quickly use a great deal of memory for no gain.) Therefore, if you want to do more than one of the following with a subprocess, you have to interleave them rather than doing each in turn:
Read from standard output
Read from standard error (unless it’s merged via subprocess.STDOUT)
Wait for the process to exit, or for a timeout to elapse
Of course, the subprocess might close its streams before it exits, write useful output after you notice the timeout and before you kill it, and/or start additional processes that keep the pipe open indefinitely, so you might want to have multiple timeouts. Probably what’s most informative is the EOF on the pipe, so repeatedly use something like select to wait for (however much is left of) the timeout, issue single reads on the streams that are ready, and wait (with another timeout if you’re concerned about hangs after an early stream closure) on EOF. If the timeout occurs instead, (try to) kill the subprocess, and consider issuing non-blocking reads (or another timeout loop) to get any last available output before closing the pipes.
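A rough sketch of that loop, using the selectors module as the "something like select". Note the assumptions: this only works where pipes can be selected on (POSIX, so not the Windows setup from the question), it uses a single overall timeout, and the function name and return shape are my own.

```python
import os
import selectors
import subprocess
import sys
import time


def run_with_deadline(cmd, timeout):
    """Return (exit_code, stdout, stderr, timed_out); POSIX-only sketch."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    chunks = {proc.stdout: [], proc.stderr: []}
    deadline = time.monotonic() + timeout
    timed_out = False
    with selectors.DefaultSelector() as sel:
        for stream in chunks:
            sel.register(stream, selectors.EVENT_READ)
        # Keep reading until both pipes hit EOF or the deadline passes.
        while sel.get_map():
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                timed_out = True
                break
            for key, _ in sel.select(remaining):
                data = os.read(key.fd, 65536)  # single read on a ready stream
                if data:
                    chunks[key.fileobj].append(data)
                else:
                    sel.unregister(key.fileobj)  # EOF on this pipe
    if not timed_out:
        try:
            # Both pipes are closed; let the child use the rest of the budget.
            proc.wait(timeout=max(0.0, deadline - time.monotonic()))
        except subprocess.TimeoutExpired:
            timed_out = True
    if timed_out:
        proc.kill()
    exit_code = proc.wait()  # reap the child
    proc.stdout.close()
    proc.stderr.close()
    return (exit_code, b"".join(chunks[proc.stdout]),
            b"".join(chunks[proc.stderr]), timed_out)


if __name__ == "__main__":
    print(run_with_deadline([sys.executable, "-c", "print('hello')"], 10))
```

This sketch does not try to collect last-moment output after the kill; that would need the extra non-blocking reads mentioned above.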
Using the other answer by @DavisHerring as the basis for more research, I came across a concept that worked for my original case. Here is the code that came out of that.
import subprocess
import threading
import time
def execute(command):
    print('start command: {}'.format(command))
    process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    timer = threading.Timer(10, terminate_process, [process])
    timer.start()
    print('communicate')
    stdout, stderr = process.communicate()
    print('rc')
    exit_code = process.returncode
    timer.cancel()
    if exit_code != 0:
        print('got bad rc')
def terminate_process(p):
    try:
        p.terminate()
    except OSError:
        pass # ignore error
It uses the threading.Timer to make sure that the process doesn't go over the time limit and terminates the process if it does. It otherwise waits for a response back and cancels the timer once it finishes.

Send input to python subprocess without waiting for result

I'm trying to write some basic tests for a piece of code that normally accepts input endlessly through stdin until given a specific exit command.
I want to check if the program crashes on being given some input string (after some amount of time to account for processing), but can't seem to figure out how to send data and not be stuck waiting for output which I don't care about.
My current code looks like this (using cat as an example of the program):
myproc = subprocess.Popen(['cat'], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
myproc.communicate(input=inputdata.encode("utf-8"))
time.sleep(0.1)
if myproc.poll() is not None:
    print("not running")
else:
    print("still running")
How can I modify this to allow the program to proceed to the polling instead of hanging after the communicate() call?
You are using the wrong tool here with communicate which waits for the end of the program. You should simply feed the standard input of the subprocess:
myproc = subprocess.Popen(['cat'], stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
myproc.stdin.write(inputdata.encode("utf-8"))
myproc.stdin.flush()  # stdin is buffered; without a flush the data may not reach the child yet
time.sleep(0.1)
if myproc.poll() is not None:
    print("not running")
else:
    print("still running")
But beware: you cannot be sure that the output pipes will contain anything before the end of the subprocess...
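Here is a self-contained sketch of the same idea for a test. Everything named here is an assumption for illustration: CHILD is a stand-in for the program under test (it reads lines until it sees "quit"), and its output is discarded because the question says it is not needed, which also avoids the full-pipe problem.

```python
import subprocess
import sys

# Hypothetical stand-in for the program under test: reads lines until "quit".
CHILD = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    if line.strip() == 'quit':\n"
    "        break\n"
)


def survives_input(text, settle=0.5):
    """Send text to the child and report whether it is still running afterwards."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        proc.stdin.write(text.encode("utf-8"))
        proc.stdin.flush()  # push the bytes out of Python's buffer
        try:
            proc.wait(timeout=settle)  # returns early if the child exits
            return False
        except subprocess.TimeoutExpired:
            return True  # still alive after the settle time
    finally:
        # Clean up so the test does not leave the child behind.
        if proc.poll() is None:
            proc.kill()
        proc.stdin.close()
        proc.wait()
```

Using wait(timeout=...) instead of sleep() followed by poll() gives the same answer but returns as soon as the child exits.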
You could set a timeout in the Popen.communicate(input=None, timeout=None) function. After the timeout expires the process is still running, but note that a retried communicate() call cannot be given new input: it only continues collecting the output.
From the docs:
If the process does not terminate after timeout seconds, a TimeoutExpired exception will be raised. Catching this exception and retrying communication will not lose any output.
The child process is not killed if the timeout expires, so in order to cleanup properly a well-behaved application should kill the child process and finish communication.
I think I understand what you want here. If you know an existing command that will crash your program, you can use subprocess.Popen.communicate(). It will still block, but it returns a tuple of the standard output and the standard error, if any.
Then you can note the error and handle it in a try/except statement.
This was really helpful when I was working with sub processes:
https://docs.python.org/3/library/asyncio-subprocess.html

Python subprocess.communicate hangs when parent leaves zombies

I'm trying to use Popen to create a subprocess A along with a thread that communicates with it using Popen.communicate. The main process will wait on the thread using Thread.join with a specified timeout, and kills A after that timeout expires, which should cause the thread to die as well.
However, this doesn't seem to work when A itself spawns more subprocesses B, C and D, with different process groups than A, that refuse to die. Even after A is dead and labelled defunct, and even after the main process reaps A using os.waitpid() so that it no longer exists, the thread refuses to join with the main thread.
Only after all the children, B, C, D are killed, does Popen.communicate finally return.
Is this behavior actually expected from the module? A recursive wait might be useful in some cases, but it's certainly not appropriate as the default behavior for Popen.communicate. And if this is the intended behavior, is there any way to override it?
Here's a very simple example:
from subprocess import PIPE, Popen
from threading import Thread
import os
import time
import signal
DEVNULL = open(os.devnull, 'w')
proc = Popen(["/bin/bash"], stdin=PIPE, stdout=PIPE,
             stderr=DEVNULL, start_new_session=True)
def thread_function():
    print("Entering thread")
    return proc.communicate(input=b"nohup sleep 100 &\nexit\n")
thread = Thread(target=thread_function)
thread.start()
time.sleep(1)
proc.kill()
while True:
    thread.join(timeout=5)
    if not thread.is_alive():
        break
    print("Thread still alive")
This is on Linux.
I think this comes from a fairly natural way to write the Popen.communicate method on Linux. proc.communicate() appears to read the child's stdout file descriptor, which will return an EOF when the process dies. Then it does the wait to get the exit code of the process.
In your example, the sleep process inherits the stdout file descriptor from the bash process. So when the bash process dies, communicate doesn't get an EOF on the stdout pipe, as the sleep still has it open. The simplest way to fix this is to change the communicate line to:
return proc.communicate(input=b"nohup sleep 100 >/dev/null&\nexit\n")
This causes your thread to end as soon as bash dies (due to the exit, not your proc.kill, in this case). However, the sleep is still running after bash dies, whether you use the exit statement or the proc.kill call. If you want to kill the sleep as well, I would use
os.killpg(proc.pid,15)
instead of the proc.kill(). The more general problem of killing B, C and D if they change the group is a more complex problem.
Additional data:
I couldn't find any official documentation for this method of proc.communicate, but I forgot the most obvious place :-) I found it with the help of this answer. The docs for communicate say:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate.
You are getting stuck at step 2: Read until end-of-file, because the sleep is keeping the pipe open.
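To see both the hang and the process-group kill in one place, here is a small POSIX-only sketch. Assumptions: /bin/sh is used in place of /bin/bash, the background job stays in the shell's process group (it does not change groups the way B, C and D might), and the function name is invented for the example.

```python
import os
import signal
import subprocess


def run_and_kill_group(script, timeout):
    """Feed script to a shell; return (stdout, group_was_killed)."""
    proc = subprocess.Popen(["/bin/sh"],
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL,
                            start_new_session=True)  # shell leads its own group
    try:
        out, _ = proc.communicate(input=script, timeout=timeout)
        return out, False
    except subprocess.TimeoutExpired:
        # A background job still holds the stdout pipe open, so no EOF arrived.
        # With start_new_session=True the group id equals the shell's pid.
        os.killpg(proc.pid, signal.SIGTERM)
        out, _ = proc.communicate()  # now reaches EOF and reaps the shell
        return out, True
```

For example, run_and_kill_group(b"echo hi\nsleep 30 &\nexit\n", timeout=1) should time out on the first communicate() even though the shell itself has exited, and only return once the whole group has been signalled.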

Python avoid orphan processes

I'm using python to benchmark something. This can take a large amount of time, and I want to set a (global) timeout. I use the following script (summarized):
class TimeoutException(Exception):
    pass
def timeout_handler(signum, frame):
    raise TimeoutException()
# Halt problem after half an hour
signal.alarm(1800)
try:
    while solution is None:
        guess = guess()
        try:
            with open(solutionfname, 'wb') as solutionf:
                solverprocess = subprocess.Popen(["solver", problemfname], stdout=solutionf)
                solverprocess.wait()
        finally:
            # `solverprocess.poll() == None` instead of try didn't work either
            try:
                solverprocess.kill()
            except:
                # Solver process was already dead
                pass
except TimeoutException:
    pass
# Cancel alarm if it's still active
signal.alarm(0)
However it keeps spawning orphan processes sometimes, but I can't reliably recreate the circumstances. Does anyone know what the correct way to prevent this is?
You simply have to wait after killing the process.
The documentation for the kill() method states:
Kills the child. On Posix OSs the function sends SIGKILL to the child.
On Windows kill() is an alias for terminate().
In other words, if you aren't on Windows, you are only sending a signal to the subprocess.
This will create a zombie process because the parent process didn't read the return value of the subprocess.
The kill() and terminate() methods are just shortcuts to send_signal(SIGKILL) and send_signal(SIGTERM).
Try adding a call to wait() after the kill(). This is even shown in the example under the documentation for communicate():
proc = subprocess.Popen(...)
try:
    outs, errs = proc.communicate(timeout=15)
except TimeoutExpired:
    proc.kill()
    outs, errs = proc.communicate()
Note the call to communicate() after the kill(). (It is equivalent to calling wait() and also reading the outputs of the subprocess.)
I want to clarify one thing: it seems like you don't understand exactly what a zombie process is. A zombie process is a terminated process. The kernel keeps the process in the process table until the parent process reads its exit status. I believe all memory used by the subprocess is actually reused; the kernel only has to keep track of the exit status of such a process.
So, the zombie processes you see aren't running. They are already completely dead, and that's why they are called zombie. They are "alive" in the process table, but aren't really running at all.
Calling wait() does exactly this: wait till the subprocess ends and read the exit status. This allows the kernel to remove the subprocess from the process table.
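As a minimal sketch of "kill, then wait", assuming the output is not piped (as in the question, where stdout goes to a file), a helper could look like this. The name kill_and_reap and the grace parameter are my own.

```python
import subprocess
import sys


def kill_and_reap(proc, grace=5.0):
    """Kill proc if it is still running, then wait so it cannot stay a zombie."""
    if proc.poll() is None:
        proc.kill()
    # wait() collects the exit status, which lets the kernel drop the
    # process table entry for the child.
    return proc.wait(timeout=grace)


if __name__ == "__main__":
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"])
    print("exit status after kill:", kill_and_reap(child))
```

In the question's script this would replace the bare solverprocess.kill() in the finally block.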
On Linux, you can use python-prctl.
Define a preexec function such as:
def pre_exec():
    import signal
    import prctl  # provided by the python-prctl package
    prctl.set_pdeathsig(signal.SIGTERM)
And have your Popen call pass it.
subprocess.Popen(..., preexec_fn=pre_exec)
That's as simple as that. Now the child process will die rather than become orphan if the parent dies.
If you don't like the external dependency of python-prctl you can also use the older prctl. Instead of
prctl.set_pdeathsig(signal.SIGTERM)
you would have
prctl.prctl(prctl.PDEATHSIG, signal.SIGTERM)
