I have a shell script which does some processing on the string passed to it and then writes it to a file. However, I don't want my function foo() to wait for it to complete the operation. How do I call process(msg) and then move on with the execution of {code block 2} without waiting for process(msg) to complete?
import subprocess

def process(msg):
    subprocess.call(['sh', './process.sh', msg])

def foo():
    # {code block 1}
    process(msg)
    # {code block 2}
foo() will be called from another function, roughly once or twice per second.
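For illustration, here is the most direct approach (a sketch, not taken verbatim from the answers below): subprocess.Popen starts the child and returns immediately, which is exactly the fire-and-forget behavior asked for:

import subprocess

def process(msg):
    # Popen spawns the script and returns at once; nothing waits for it
    subprocess.Popen(['sh', './process.sh', msg])

def foo():
    # {code block 1}
    process(msg)
    # {code block 2} runs while process.sh is still working

One hedge: children that finish but are never wait()ed on linger as zombies for a while; CPython generally reaps them opportunistically when new Popen objects are created, but keeping references and calling poll() occasionally is the tidy option at once-or-twice per second.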
Just for completeness: Python's asyncio offers a high level interface for doing just that:
https://docs.python.org/3.9/library/asyncio-subprocess.html#subprocesses
Example from documentation:
import asyncio

async def run(cmd):
    proc = await asyncio.create_subprocess_shell(
        cmd,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE)

    stdout, stderr = await proc.communicate()

    print(f'[{cmd!r} exited with {proc.returncode}]')
    if stdout:
        print(f'[stdout]\n{stdout.decode()}')
    if stderr:
        print(f'[stderr]\n{stderr.decode()}')

asyncio.run(run('ls /zzz'))
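Adapting that to the question (a sketch; it assumes foo can be made a coroutine): the await below only covers spawning the process, not its completion, so {code block 2} proceeds while the script is still running:

import asyncio

async def foo(msg):
    # {code block 1}
    proc = await asyncio.create_subprocess_exec('sh', './process.sh', msg)
    # {code block 2} runs immediately; the script is still executing.
    # Only `await proc.wait()` later if the exit status is needed.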
subprocess.call() and subprocess.run() create a process and wait for it to finish; run() returns a CompletedProcess object, while call() returns just the exit code.
subprocess.Popen() creates a process and returns it; it is used under the hood by the previous functions. You can then wait for the process to finish, send it data, or do whatever else you want with it. The arguments are mostly the same as for call or run.
https://docs.python.org/3/library/subprocess.html
As a bit of elaboration, Popen is Python's interface for asking the OS to start a new process. os.fork() is lower level and doesn't actually do what we want here: it spawns another instance of the Python interpreter with the same memory state as the current one. If you wanted the lower-level route, the os.spawn* family is closer to subprocess.run than os.fork is.
To verify that Popen is doing what you want, this test program will print "p returncode = None", then wait 5 seconds, and print "p returncode = 0":
from subprocess import Popen
p = Popen(["sleep", "5"])
print("started the proc") # this will print immediately
p.poll() # this checks if the process is done but does not block
print(f"p returncode = {p.returncode}")
p.wait() # this blocks until the process exits
print(f"p returncode = {p.returncode}")
What you need is https://docs.python.org/3/library/os.html#os.fork i.e. os.fork(). That way you can spawn a child that outlives the parent process and is later reparented to (and reaped by) systemd on Linux. I have no clue about Windows.
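For illustration, a minimal double-fork sketch in that spirit (POSIX only; the helper name spawn_detached is mine): the grandchild is reparented to init/systemd and outlives the caller:

import os
import subprocess

def spawn_detached(msg):
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)  # reap the short-lived intermediate child
        return              # parent carries on without waiting
    # intermediate child: start a new session, fork again, exit at once
    os.setsid()
    if os.fork() > 0:
        os._exit(0)
    # grandchild: now owned by init/systemd; do the work and exit
    subprocess.call(['sh', './process.sh', msg])
    os._exit(0)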
Related
There's this external python script that I would like to call.
It provides an async mode so that it returns the task id before it completes the whole process.
The mechanism works well when I execute it on the command line: the task id is printed to stdout immediately. But the main process actually forks a subprocess to do the backend job. So when I try to capture the task id from a bash script, it hangs until the subprocess finishes. It's not async at all.
So my question is, how can I get the main process output immediately instead of waiting for the subprocess complete?
e.g.
$ ./cmd args
{"task": 1}
$ x=`./cmd args`
<< it hangs until the entire process completes and returns all the output at once:
{"task": 1} {"task": 1} {"actual_result": "xxx"}
# The same thing happens using Python
import subprocess
p = subprocess.Popen(["cmd", "args"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()
<< stuck here as well
I would not call this a fork; read this for the difference: https://stackoverflow.com/questions/49627957/what-is-the-difference-between-subprocess-popen-and-os-fork. So you want to let the subprocess run while the main process prints something.
communicate() blocks on IO, which is your main problem. You can get rid of the PIPE and just let the subprocess print to stdout, or to any file object. More aggressively, adding 'nohup' to the front of the child command frees the parent process to exit without worrying about the child, though it has the side effect of detaching the child from the current shell.
If you insist that the parent program should manage all printing, use poll() to check the status of the child process before you communicate() with it.
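A sketch of that poll-before-communicate idea, assuming the tool prints the task id as its first line ("./cmd" and "args" are the question's placeholders):

import subprocess

p = subprocess.Popen(["./cmd", "args"], stdout=subprocess.PIPE,
                     universal_newlines=True)
task_line = p.stdout.readline()  # returns as soon as one line is available
print("task id:", task_line.strip())
if p.poll() is None:
    # the backend job is still running; carry on without blocking,
    # or call p.wait() later if the final result is wanted
    pass

One caveat: some programs switch to block buffering when stdout is a pipe, in which case the first line may not arrive until the buffer flushes (tools like stdbuf can help there).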
I'm trying to use Popen to create a subprocess A along with a thread that communicates with it using Popen.communicate. The main process will wait on the thread using Thread.join with a specified timeout, and kills A after that timeout expires, which should cause the thread to die as well.
However, this doesn't seem to work when A itself spawns more subprocesses B, C and D with different process groups than A that refuse to die. Even after A is dead and labelled defunct, and even after the main process reaps A using os.waitpid() so that it no longer exists, the thread refuses to join with the main thread.
Only after all the children, B, C, D are killed, does Popen.communicate finally return.
Is this behavior actually expected from the module? A recursive wait might be useful in some cases, but it's certainly not appropriate as the default behavior for Popen.communicate. And if this is the intended behavior, is there any way to override it?
Here's a very simple example:
from subprocess import PIPE, Popen
from threading import Thread
import os
import time
import signal

DEVNULL = open(os.devnull, 'w')

proc = Popen(["/bin/bash"], stdin=PIPE, stdout=PIPE,
             stderr=DEVNULL, start_new_session=True)

def thread_function():
    print("Entering thread")
    return proc.communicate(input=b"nohup sleep 100 &\nexit\n")

thread = Thread(target=thread_function)
thread.start()

time.sleep(1)
proc.kill()

while True:
    thread.join(timeout=5)
    if not thread.is_alive():
        break
    print("Thread still alive")
This is on Linux.
I think this comes from a fairly natural way to write the Popen.communicate method on Linux. proc.communicate() reads the stdout file descriptor, which only returns EOF once every process holding the write end of the pipe has closed it, and then waits to collect the exit code of the process.
In your example, the sleep process inherits the stdout file descriptor from the bash process. So when the bash process dies, Popen.communicate doesn't get an EOF on the stdout pipe, as the sleep still has it open. The simplest way to fix this is to change the communicate line to:
return proc.communicate(input=b"nohup sleep 100 >/dev/null&\nexit\n")
This causes your thread to end as soon as bash dies (due to the exit, not your proc.kill, in this case). However, the sleep is still running after bash dies, whether bash ends via the exit statement or the proc.kill call. If you want to kill the sleep as well, I would use
os.killpg(proc.pid,15)
instead of proc.kill(). The more general problem of killing B, C and D if they change their process group is more complex.
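To make that concrete, a small sketch reusing the question's setup: because start_new_session=True made bash a session and process-group leader, proc.pid doubles as the group id, so a single killpg signals bash and the nohup'd sleep together (assuming, per the caveat above, that the children have not moved to another group):

import os
import signal
from subprocess import PIPE, Popen

proc = Popen(["/bin/bash"], stdin=PIPE, stdout=PIPE, start_new_session=True)
# ... later, instead of proc.kill():
os.killpg(proc.pid, signal.SIGTERM)  # signals the whole process group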
Additional data:
I couldn't find any official documentation for this behavior of proc.communicate at first, but I had forgotten the most obvious place :-) I found it with the help of this answer. The docs for communicate say:
Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate.
You are getting stuck at step 2: Read until end-of-file, because the sleep is keeping the pipe open.
I have a Python class which starts a subprocess that runs another Python program; the first thing that program does is spawn a threading.Thread to do some other work. However, regardless of what the thread's target is, the Thread.start call blocks, and the rest of the program is not executed.
What could possibly be causing this problem? Is it some general problem regarding the Python global interpreter lock?
EDIT: For some more background, it's a subprocess that runs a single PyTest unit test; inside the unit test, a thread is being started to create a server (the server creation isn't the problem; the threading.Thread.start problem occurs regardless).
The subprocess call is
result = subprocess.Popen(['/usr/local/bin/py.test', '-v', test_path, '-k',
                           test_name],
                          cwd=grading_path,
                          env=env,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.PIPE)
stdout, stderr = result.communicate()
and the threading call is
def test_case_1(self):
    t = Thread(target=time.sleep, args=(1,))  # fake target, doesn't work regardless
    t.start()  # it blocks here
    # ... other logic ...
There is communication between the subprocess and the parent process; the subprocess basically just sends back some result data.
I've been using subprocess.check_output() for some time to capture output from subprocesses, but ran into some performance problems under certain circumstances. I'm running this on a RHEL6 machine.
The calling Python environment is linux-compiled and 64-bit. The subprocess I'm executing is a shell script which eventually fires off a Windows python.exe process via Wine (why this foolishness is required is another story). As input to the shell script, I'm piping in a small bit of Python code that gets passed off to python.exe.
While the system is under moderate/heavy load (40 to 70% CPU utilization), I've noticed that using subprocess.check_output(cmd, shell=True) can result in a significant delay (up to ~45 seconds) after the subprocess has finished execution before the check_output command returns. Looking at output from ps -efH during this time shows the called subprocess as sh <defunct>, until it finally returns with a normal zero exit status.
Conversely, using subprocess.call(cmd, shell=True) to run the same command under the same moderate/heavy load will cause the subprocess to return immediately with no delay, all output printed to STDOUT/STDERR (rather than returned from the function call).
Why is there such a significant delay only when check_output() is redirecting the STDOUT/STDERR output into its return value, and not when the call() simply prints it back to the parent's STDOUT/STDERR?
Reading the docs, both subprocess.call and subprocess.check_output are convenience wrappers around subprocess.Popen. One minor difference is that check_output will raise a Python error if the subprocess returns a non-zero exit status. The greater difference is emphasized in the bit about check_output (my emphasis):
The full function signature is largely the same as that of the Popen constructor, except that stdout is not permitted as it is used internally. All other supplied arguments are passed directly through to the Popen constructor.
So how is stdout "used internally"? Let's compare call and check_output:
call
def call(*popenargs, **kwargs):
    return Popen(*popenargs, **kwargs).wait()
check_output
def check_output(*popenargs, **kwargs):
    if 'stdout' in kwargs:
        raise ValueError('stdout argument not allowed, it will be overridden.')
    process = Popen(stdout=PIPE, *popenargs, **kwargs)
    output, unused_err = process.communicate()
    retcode = process.poll()
    if retcode:
        cmd = kwargs.get("args")
        if cmd is None:
            cmd = popenargs[0]
        raise CalledProcessError(retcode, cmd, output=output)
    return output
communicate
Now we have to look at Popen.communicate as well. Doing this, we notice that for one pipe, communicate does several things that take more time than simply returning Popen().wait(), as call does.
For one thing, communicate processes stdout=PIPE whether you set shell=True or not. Clearly, call does not; it just lets your shell spout whatever, which makes it a security risk, as the Python docs describe.
Secondly, in the case of check_output(cmd, shell=True) (just one pipe), whatever your subprocess sends to stdout is processed by a thread in the _communicate method, and Popen must join that thread (wait on it) before additionally waiting on the subprocess itself to terminate!
Plus, more trivially, it collects stdout as a list which must then be joined into a string.
In short, even with minimal arguments, check_output spends a lot more time in Python processes than call does.
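If you want to observe the gap yourself, here is a rough micro-benchmark sketch (timings will vary with machine and load; "echo hello" is just a stand-in for the real command):

import subprocess
import time

start = time.time()
subprocess.call("echo hello", shell=True)
print("call:         %.4fs" % (time.time() - start))

start = time.time()
subprocess.check_output("echo hello", shell=True)
print("check_output: %.4fs" % (time.time() - start))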
Let's look at the code. The .check_output has the following wait:
def _internal_poll(self, _deadstate=None, _waitpid=os.waitpid,
        _WNOHANG=os.WNOHANG, _os_error=os.error, _ECHILD=errno.ECHILD):
    """Check if child process has terminated.  Returns returncode
    attribute.

    This method is called by __del__, so it cannot reference anything
    outside of the local scope (nor can any methods it calls).
    """
    if self.returncode is None:
        try:
            pid, sts = _waitpid(self.pid, _WNOHANG)
            if pid == self.pid:
                self._handle_exitstatus(sts)
        except _os_error as e:
            if _deadstate is not None:
                self.returncode = _deadstate
            if e.errno == _ECHILD:
                # This happens if SIGCLD is set to be ignored or
                # waiting for child processes has otherwise been
                # disabled for our process.  This child is dead, we
                # can't get the status.
                # http://bugs.python.org/issue15756
                self.returncode = 0
    return self.returncode
The .call waits using the following code:
def wait(self):
    """Wait for child process to terminate.  Returns returncode
    attribute."""
    while self.returncode is None:
        try:
            pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
        except OSError as e:
            if e.errno != errno.ECHILD:
                raise
            # This happens if SIGCLD is set to be ignored or waiting
            # for child processes has otherwise been disabled for our
            # process.  This child is dead, we can't get the status.
            pid = self.pid
            sts = 0
        # Check the pid and loop as waitpid has been known to return
        # 0 even without WNOHANG in odd situations.  issue14396.
        if pid == self.pid:
            self._handle_exitstatus(sts)
    return self.returncode
Notice the bug related to _internal_poll, viewable at http://bugs.python.org/issue15756. It is pretty much exactly the issue you are running into.
Edit: The other potential difference between .call and .check_output is that .check_output actually cares about stdin and stdout and will try to perform IO against both pipes. If you are running into a process that gets itself into a zombie state, it is possible that a read against a pipe in a defunct state is causing the hang you are experiencing.
In most cases zombie states get cleaned up pretty quickly, but they will not if, for instance, the process is interrupted while in a system call (like read or write). Of course the read/write system call should itself be interrupted as soon as the IO can no longer be performed, but it is possible that you are hitting some sort of race condition where things are getting killed in a bad order.
The only way I can think of to determine the cause is to add debugging code to the subprocess module, or to invoke the Python debugger and initiate a backtrace when you run into the condition you are experiencing.
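One low-effort way to get such a backtrace (my suggestion, not something the question mentions: faulthandler ships with Python 3.3+ and exists as a PyPI backport for older interpreters) is to register a signal handler that dumps every thread's Python stack, then send the signal while the parent is hanging:

import faulthandler
import signal

# in the parent process, near startup:
faulthandler.register(signal.SIGUSR1, all_threads=True)
# then, during the hang, from another shell:  kill -USR1 <parent pid>
# the dump shows whether it is stuck in communicate(), wait(), etc.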
I am trying to write a small app that uses the subprocess module.
My program calls an external Bash command that takes some time to process. During this time, I would like to show the user a series of messages like this:
Processing. Please wait...
The output is foo()
How can I do this using Popen.wait() or Popen.poll()? I have read that I need to use Popen.returncode, but I don't know how to get it to actively check the state.
poll() returns None if the process has not yet finished, and something different if the process has finished (an integer, the exit code, hopefully 0).
Edit:
wait() and poll() have different behaviors:
wait (without the timeout argument) will block and wait for the process to complete.
wait with the timeout argument will wait timeout seconds for the process to complete. If it doesn't complete, it will raise the TimeoutExpired exception. If you catch the exception, you're then welcome to go on, or to wait again.
poll always returns immediately. It effectively does a wait with a timeout of 0, catches any exception, and returns None if the process hasn't completed.
With either wait or poll, if the process has completed, the popen object's returncode will be set (otherwise it's None - you can check for that as easily as calling wait or poll), and the return value from the function will also be the process's return code.
</Edit>
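Put together, a minimal sketch of that wait-with-timeout pattern (Python 3.3+; "your_command" is a placeholder):

import subprocess

myprocess = subprocess.Popen(["your_command"])
while True:
    try:
        returncode = myprocess.wait(timeout=2)
        break  # finished
    except subprocess.TimeoutExpired:
        print("Processing. Please wait...")
print("Done, exit code:", returncode)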
So I think you should do something like:
while myprocess.poll() is None:
    print("Still working...")
    time.sleep(0.5)  # sleep a while (needs `import time`)
Be aware that if the bash script creates a lot of output, you must use communicate() or something similar to prevent the stdout or stderr pipe buffers from filling up and deadlocking the child.
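For example, a sketch that polls while draining stdout line by line, so a chatty script cannot fill the pipe buffer and deadlock ("./script.sh" is a placeholder):

import subprocess

myprocess = subprocess.Popen(["./script.sh"], stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT,
                             universal_newlines=True)
for line in myprocess.stdout:  # iterates as lines arrive
    print("Still working...", line.rstrip())
myprocess.wait()  # stdout hit EOF, so this returns promptly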
@extraneon's answer is a little backwards. Both wait() and poll() return the process's exit code if the process has finished. The poll() method will return None if the process is still running, and the wait() method will block until the process exits:
Check out the following page: https://docs.python.org/3.4/library/subprocess.html#popen-objects
Popen.poll()
    Check if child process has terminated. Set and return returncode attribute.

Popen.wait()
    Wait for child process to terminate. Set and return returncode attribute.