Getting latest lines of streaming stdout from a Python subprocess - python

My goal : To read the latest "chunk" (N lines) of streaming stdout every M seconds from a subprocess.
Current code:
start the subprocess
reads stdout
once I have a chunk of N lines, print it out (or save as current chunk)
wait M seconds
repeat
I have also put code for the moment to terminate the subprocess (which is an endless stream until you hit Ctrl-C)
What I want to achieve is after I wait for M seconds, if for it to always read the latest N lines and not the subsequent N lines in stdout (they can be discarded as I'm only interested in the latest)
My end goal would be to spawn a thread to run the process and keep saving the latest lines and then call from the main process whenever I need the latest results of the stream.
Any help would be greatly appreciated!
#!/usr/bin/env python3
import signal
import time
from subprocess import Popen, PIPE
sig = signal.SIGTERM
N=9
M=5
countlines=0
p = Popen(["myprogram"], stdout=PIPE, bufsize=1, universal_newlines=True)
chunk=[]
for line in p.stdout:
countlines+=1
chunk.append(line)
if len(chunk)==N:
print(chunk)
chunk=[]
time.sleep(M)
if countlines>100:
p.send_signal(sig)
break
print("done")

After much searching, I stumbled upon a solution here:
https://eli.thegreenplace.net/2017/interacting-with-a-long-running-child-process-in-python/
Eli's "Launch, interact, get output in real time, terminate" code section worked for me.
So far its the most elegant solution I've found.
Adapted to my problem above, and written within a class (not shown here):
def output_reader(self,proc):
chunk=[]
countlines=0
for line in iter(proc.stdout.readline, b''):
countlines+=1
chunk.append(line.decode("utf-8"))
if countlines==N:
self.current_chunk = chunk
chunk=[]
countlines=0
def main():
proc = subprocess.Popen(['myprocess'],
stdout=subprocess.PIPE,
stderr=subprocess.STDOUT)
t = threading.Thread(target=output_reader, args=(proc,))
t.start()
try:
time.sleep(0.2)
for i in range(10):
time.sleep(1) # waits a while before getting latest lines
print(self.current_chunk)
finally:
proc.terminate()
try:
proc.wait(timeout=0.2)
print('== subprocess exited with rc =', proc.returncode)
except subprocess.TimeoutExpired:
print('subprocess did not terminate in time')
t.join()

Here is another possible solution. It is a program that you would run as a separate process in the pipeline, which presents a REST API that when queried will return the last N lines that it read on stdin (where N and the port number are supplied on stdin). It is using run in flask so should not be used in situations where the outside world has access to the local server port to make requests, though this could be adapted.
import sys
import time
import threading
import argparse
from flask import Flask, request
from flask_restful import Resource, Api
class Server:
def __init__(self):
self.data = {'at_eof': False,
'lines_read': 0,
'latest_lines': []}
self.thread = None
self.args = None
self.stop = False
def parse_args(self):
parser = argparse.ArgumentParser()
parser.add_argument("num_lines", type=int,
help="number of lines to cache")
parser.add_argument("port", type=int,
help="port to serve on")
self.args = parser.parse_args()
def start_updater(self):
def updater():
lines = self.data['latest_lines']
while True:
if self.stop:
return
line = sys.stdin.readline()
if not line:
break
self.data['lines_read'] += 1
lines.append(line)
while len(lines) > self.args.num_lines:
lines.pop(0)
self.data['at_eof'] = True
self.thread = threading.Thread(target=updater)
self.thread.start()
def get_data(self):
return self.data
def shutdown(self):
self.stop = True
func = request.environ.get('werkzeug.server.shutdown')
if func:
func()
return 'Shutting down'
else:
return 'shutdown failed'
def add_apis(self, app):
class GetData(Resource):
get = self.get_data
class Shutdown(Resource):
get = self.shutdown
api = Api(app)
api.add_resource(GetData, "/getdata")
api.add_resource(Shutdown, "/shutdown")
def run(self):
self.parse_args()
self.start_updater()
app = Flask(__name__)
self.add_apis(app)
app.run(port=self.args.port)
server = Server()
server.run()
Example usage: here is a test program whose output we want to serve:
import sys
import time
for i in range(100):
print("this is line {}".format(i))
sys.stdout.flush()
time.sleep(.1)
And a simple pipeline to launch it (here from the linux shell prompt but could be done via subprocess.Popen), serving the last 5 lines, on port 8001:
python ./writer.py | python ./server.py 5 8001
An example query, here using curl as the client but it could be done via Python requests:
$ curl -s http://localhost:8001/getdata
{"at_eof": false, "lines_read": 30, "latest_lines": ["this is line 25\n", "this is line 26\n", "this is line 27\n", "this is line 28\n", "this is line 29\n"]}
The server also provides an http://localhost:<port>/shutdown URL to terminate it, though if you call it before you first see "at_eof": true, then expect the writer to die with a broken pipe.

Related

Python subprocess: How can I skip the process.stdout.readline() request if there is no new line? [duplicate]

I'm using the subprocess module to start a subprocess and connect to its output stream (standard output). I want to be able to execute non-blocking reads on its standard output. Is there a way to make .readline non-blocking or to check if there is data on the stream before I invoke .readline? I'd like this to be portable or at least work under Windows and Linux.
Here is how I do it for now (it's blocking on the .readline if no data is available):
p = subprocess.Popen('myprogram.exe', stdout = subprocess.PIPE)
output_str = p.stdout.readline()
fcntl, select, asyncproc won't help in this case.
A reliable way to read a stream without blocking regardless of operating system is to use Queue.get_nowait():
import sys
from subprocess import PIPE, Popen
from threading import Thread
try:
from queue import Queue, Empty
except ImportError:
from Queue import Queue, Empty # python 2.x
ON_POSIX = 'posix' in sys.builtin_module_names
def enqueue_output(out, queue):
for line in iter(out.readline, b''):
queue.put(line)
out.close()
p = Popen(['myprogram.exe'], stdout=PIPE, bufsize=1, close_fds=ON_POSIX)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True # thread dies with the program
t.start()
# ... do other things here
# read line without blocking
try: line = q.get_nowait() # or q.get(timeout=.1)
except Empty:
print('no output yet')
else: # got line
# ... do something with line
I have often had a similar problem; Python programs I write frequently need to have the ability to execute some primary functionality while simultaneously accepting user input from the command line (stdin). Simply putting the user input handling functionality in another thread doesn't solve the problem because readline() blocks and has no timeout. If the primary functionality is complete and there is no longer any need to wait for further user input I typically want my program to exit, but it can't because readline() is still blocking in the other thread waiting for a line. A solution I have found to this problem is to make stdin a non-blocking file using the fcntl module:
import fcntl
import os
import sys
# make stdin a non-blocking file
fd = sys.stdin.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
# user input handling thread
while mainThreadIsRunning:
try: input = sys.stdin.readline()
except: continue
handleInput(input)
In my opinion this is a bit cleaner than using the select or signal modules to solve this problem but then again it only works on UNIX...
Python 3.4 introduces new provisional API for asynchronous IO -- asyncio module.
The approach is similar to twisted-based answer by #Bryan Ward -- define a protocol and its methods are called as soon as data is ready:
#!/usr/bin/env python3
import asyncio
import os
class SubprocessProtocol(asyncio.SubprocessProtocol):
def pipe_data_received(self, fd, data):
if fd == 1: # got stdout data (bytes)
print(data)
def connection_lost(self, exc):
loop.stop() # end loop.run_forever()
if os.name == 'nt':
loop = asyncio.ProactorEventLoop() # for subprocess' pipes on Windows
asyncio.set_event_loop(loop)
else:
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(loop.subprocess_exec(SubprocessProtocol,
"myprogram.exe", "arg1", "arg2"))
loop.run_forever()
finally:
loop.close()
See "Subprocess" in the docs.
There is a high-level interface asyncio.create_subprocess_exec() that returns Process objects that allows to read a line asynchroniosly using StreamReader.readline() coroutine
(with async/await Python 3.5+ syntax):
#!/usr/bin/env python3.5
import asyncio
import locale
import sys
from asyncio.subprocess import PIPE
from contextlib import closing
async def readline_and_kill(*args):
# start child process
process = await asyncio.create_subprocess_exec(*args, stdout=PIPE)
# read line (sequence of bytes ending with b'\n') asynchronously
async for line in process.stdout:
print("got line:", line.decode(locale.getpreferredencoding(False)))
break
process.kill()
return await process.wait() # wait for the child process to exit
if sys.platform == "win32":
loop = asyncio.ProactorEventLoop()
asyncio.set_event_loop(loop)
else:
loop = asyncio.get_event_loop()
with closing(loop):
sys.exit(loop.run_until_complete(readline_and_kill(
"myprogram.exe", "arg1", "arg2")))
readline_and_kill() performs the following tasks:
start subprocess, redirect its stdout to a pipe
read a line from subprocess' stdout asynchronously
kill subprocess
wait for it to exit
Each step could be limited by timeout seconds if necessary.
On Unix-like systems and Python 3.5+ there's os.set_blocking which does exactly what it says.
import os
import time
import subprocess
cmd = 'python3', '-c', 'import time; [(print(i), time.sleep(1)) for i in range(5)]'
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
os.set_blocking(p.stdout.fileno(), False)
start = time.time()
while True:
# first iteration always produces empty byte string in non-blocking mode
for i in range(2):
line = p.stdout.readline()
print(i, line)
time.sleep(0.5)
if time.time() > start + 5:
break
p.terminate()
This outputs:
1 b''
2 b'0\n'
1 b''
2 b'1\n'
1 b''
2 b'2\n'
1 b''
2 b'3\n'
1 b''
2 b'4\n'
With os.set_blocking commented it's:
0 b'0\n'
1 b'1\n'
0 b'2\n'
1 b'3\n'
0 b'4\n'
1 b''
Try the asyncproc module. For example:
import os
from asyncproc import Process
myProc = Process("myprogram.app")
while True:
# check to see if process has ended
poll = myProc.wait(os.WNOHANG)
if poll != None:
break
# print any new output
out = myProc.read()
if out != "":
print out
The module takes care of all the threading as suggested by S.Lott.
You can do this really easily in Twisted. Depending upon your existing code base, this might not be that easy to use, but if you are building a twisted application, then things like this become almost trivial. You create a ProcessProtocol class, and override the outReceived() method. Twisted (depending upon the reactor used) is usually just a big select() loop with callbacks installed to handle data from different file descriptors (often network sockets). So the outReceived() method is simply installing a callback for handling data coming from STDOUT. A simple example demonstrating this behavior is as follows:
from twisted.internet import protocol, reactor
class MyProcessProtocol(protocol.ProcessProtocol):
def outReceived(self, data):
print data
proc = MyProcessProtocol()
reactor.spawnProcess(proc, './myprogram', ['./myprogram', 'arg1', 'arg2', 'arg3'])
reactor.run()
The Twisted documentation has some good information on this.
If you build your entire application around Twisted, it makes asynchronous communication with other processes, local or remote, really elegant like this. On the other hand, if your program isn't built on top of Twisted, this isn't really going to be that helpful. Hopefully this can be helpful to other readers, even if it isn't applicable for your particular application.
Use select & read(1).
import subprocess #no new requirements
def readAllSoFar(proc, retVal=''):
while (select.select([proc.stdout],[],[],0)[0]!=[]):
retVal+=proc.stdout.read(1)
return retVal
p = subprocess.Popen(['/bin/ls'], stdout=subprocess.PIPE)
while not p.poll():
print (readAllSoFar(p))
For readline()-like:
lines = ['']
while not p.poll():
lines = readAllSoFar(p, lines[-1]).split('\n')
for a in range(len(lines)-1):
print a
lines = readAllSoFar(p, lines[-1]).split('\n')
for a in range(len(lines)-1):
print a
Things are a lot better in modern Python.
Here's a simple child program, "hello.py":
#!/usr/bin/env python3
while True:
i = input()
if i == "quit":
break
print(f"hello {i}")
And a program to interact with it:
import asyncio
async def main():
proc = await asyncio.subprocess.create_subprocess_exec(
"./hello.py", stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE
)
proc.stdin.write(b"bob\n")
print(await proc.stdout.read(1024))
proc.stdin.write(b"alice\n")
print(await proc.stdout.read(1024))
proc.stdin.write(b"quit\n")
await proc.wait()
asyncio.run(main())
That prints out:
b'hello bob\n'
b'hello alice\n'
Note that the actual pattern, which is also by almost all of the previous answers, both here and in related questions, is to set the child's stdout file descriptor to non-blocking and then poll it in some sort of select loop. These days, of course, that loop is provided by asyncio.
One solution is to make another process to perform your read of the process, or make a thread of the process with a timeout.
Here's the threaded version of a timeout function:
http://code.activestate.com/recipes/473878/
However, do you need to read the stdout as it's coming in?
Another solution may be to dump the output to a file and wait for the process to finish using p.wait().
f = open('myprogram_output.txt','w')
p = subprocess.Popen('myprogram.exe', stdout=f)
p.wait()
f.close()
str = open('myprogram_output.txt','r').read()
Here is my code, used to catch every output from subprocess ASAP, including partial lines. It pumps at same time and stdout and stderr in almost correct order.
Tested and correctly worked on Python 2.7 linux & windows.
#!/usr/bin/python
#
# Runner with stdout/stderr catcher
#
from sys import argv
from subprocess import Popen, PIPE
import os, io
from threading import Thread
import Queue
def __main__():
if (len(argv) > 1) and (argv[-1] == "-sub-"):
import time, sys
print "Application runned!"
time.sleep(2)
print "Slept 2 second"
time.sleep(1)
print "Slept 1 additional second",
time.sleep(2)
sys.stderr.write("Stderr output after 5 seconds")
print "Eol on stdin"
sys.stderr.write("Eol on stderr\n")
time.sleep(1)
print "Wow, we have end of work!",
else:
os.environ["PYTHONUNBUFFERED"]="1"
try:
p = Popen( argv + ["-sub-"],
bufsize=0, # line-buffered
stdin=PIPE, stdout=PIPE, stderr=PIPE )
except WindowsError, W:
if W.winerror==193:
p = Popen( argv + ["-sub-"],
shell=True, # Try to run via shell
bufsize=0, # line-buffered
stdin=PIPE, stdout=PIPE, stderr=PIPE )
else:
raise
inp = Queue.Queue()
sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
serr = io.open(p.stderr.fileno(), 'rb', closefd=False)
def Pump(stream, category):
queue = Queue.Queue()
def rdr():
while True:
buf = stream.read1(8192)
if len(buf)>0:
queue.put( buf )
else:
queue.put( None )
return
def clct():
active = True
while active:
r = queue.get()
try:
while True:
r1 = queue.get(timeout=0.005)
if r1 is None:
active = False
break
else:
r += r1
except Queue.Empty:
pass
inp.put( (category, r) )
for tgt in [rdr, clct]:
th = Thread(target=tgt)
th.setDaemon(True)
th.start()
Pump(sout, 'stdout')
Pump(serr, 'stderr')
while p.poll() is None:
# App still working
try:
chan,line = inp.get(timeout = 1.0)
if chan=='stdout':
print "STDOUT>>", line, "<?<"
elif chan=='stderr':
print " ERROR==", line, "=?="
except Queue.Empty:
pass
print "Finish"
if __name__ == '__main__':
__main__()
Disclaimer: this works only for tornado
You can do this by setting the fd to be nonblocking and then use ioloop to register callbacks. I have packaged this in an egg called tornado_subprocess and you can install it via PyPI:
easy_install tornado_subprocess
now you can do something like this:
import tornado_subprocess
import tornado.ioloop
def print_res( status, stdout, stderr ) :
print status, stdout, stderr
if status == 0:
print "OK:"
print stdout
else:
print "ERROR:"
print stderr
t = tornado_subprocess.Subprocess( print_res, timeout=30, args=[ "cat", "/etc/passwd" ] )
t.start()
tornado.ioloop.IOLoop.instance().start()
you can also use it with a RequestHandler
class MyHandler(tornado.web.RequestHandler):
def on_done(self, status, stdout, stderr):
self.write( stdout )
self.finish()
#tornado.web.asynchronous
def get(self):
t = tornado_subprocess.Subprocess( self.on_done, timeout=30, args=[ "cat", "/etc/passwd" ] )
t.start()
Existing solutions did not work for me (details below). What finally worked was to implement readline using read(1) (based on this answer). The latter does not block:
from subprocess import Popen, PIPE
from threading import Thread
def process_output(myprocess): #output-consuming thread
nextline = None
buf = ''
while True:
#--- extract line using read(1)
out = myprocess.stdout.read(1)
if out == '' and myprocess.poll() != None: break
if out != '':
buf += out
if out == '\n':
nextline = buf
buf = ''
if not nextline: continue
line = nextline
nextline = None
#--- do whatever you want with line here
print 'Line is:', line
myprocess.stdout.close()
myprocess = Popen('myprogram.exe', stdout=PIPE) #output-producing process
p1 = Thread(target=process_output, args=(myprocess,)) #output-consuming thread
p1.daemon = True
p1.start()
#--- do whatever here and then kill process and thread if needed
if myprocess.poll() == None: #kill process; will automatically stop thread
myprocess.kill()
myprocess.wait()
if p1 and p1.is_alive(): #wait for thread to finish
p1.join()
Why existing solutions did not work:
Solutions that require readline (including the Queue based ones) always block. It is difficult (impossible?) to kill the thread that executes readline. It only gets killed when the process that created it finishes, but not when the output-producing process is killed.
Mixing low-level fcntl with high-level readline calls may not work properly as anonnn has pointed out.
Using select.poll() is neat, but doesn't work on Windows according to python docs.
Using third-party libraries seems overkill for this task and adds additional dependencies.
I add this problem to read some subprocess.Popen stdout.
Here is my non blocking read solution:
import fcntl
def non_block_read(output):
fd = output.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
try:
return output.read()
except:
return ""
# Use example
from subprocess import *
sb = Popen("echo test && sleep 1000", shell=True, stdout=PIPE)
sb.kill()
# sb.stdout.read() # <-- This will block
non_block_read(sb.stdout)
'test\n'
Here is a simple solution based on threads which:
works on both Linux and Windows (not relying on select).
reads both stdout and stderr asynchronouly.
doesn't rely on active polling with arbitrary waiting time (CPU friendly).
doesn't use asyncio (which may conflict with other libraries).
runs until the child process terminates.
printer.py
import time
import sys
sys.stdout.write("Hello\n")
sys.stdout.flush()
time.sleep(1)
sys.stdout.write("World!\n")
sys.stdout.flush()
time.sleep(1)
sys.stderr.write("That's an error\n")
sys.stderr.flush()
time.sleep(2)
sys.stdout.write("Actually, I'm fine\n")
sys.stdout.flush()
time.sleep(1)
reader.py
import queue
import subprocess
import sys
import threading
def enqueue_stream(stream, queue, type):
for line in iter(stream.readline, b''):
queue.put(str(type) + line.decode('utf-8'))
stream.close()
def enqueue_process(process, queue):
process.wait()
queue.put('x')
p = subprocess.Popen('python printer.py', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
q = queue.Queue()
to = threading.Thread(target=enqueue_stream, args=(p.stdout, q, 1))
te = threading.Thread(target=enqueue_stream, args=(p.stderr, q, 2))
tp = threading.Thread(target=enqueue_process, args=(p, q))
te.start()
to.start()
tp.start()
while True:
line = q.get()
if line[0] == 'x':
break
if line[0] == '2': # stderr
sys.stdout.write("\033[0;31m") # ANSI red color
sys.stdout.write(line[1:])
if line[0] == '2':
sys.stdout.write("\033[0m") # reset ANSI code
sys.stdout.flush()
tp.join()
to.join()
te.join()
This version of non-blocking read doesn't require special modules and will work out-of-the-box on majority of Linux distros.
import os
import sys
import time
import fcntl
import subprocess
def async_read(fd):
# set non-blocking flag while preserving old flags
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
# read char until EOF hit
while True:
try:
ch = os.read(fd.fileno(), 1)
# EOF
if not ch: break
sys.stdout.write(ch)
except OSError:
# waiting for data be available on fd
pass
def shell(args, async=True):
# merge stderr and stdout
proc = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
if async: async_read(proc.stdout)
sout, serr = proc.communicate()
return (sout, serr)
if __name__ == '__main__':
cmd = 'ping 8.8.8.8'
sout, serr = shell(cmd.split())
I have the original questioner's problem, but did not wish to invoke threads. I mixed Jesse's solution with a direct read() from the pipe, and my own buffer-handler for line reads (however, my sub-process - ping - always wrote full lines < a system page size). I avoid busy-waiting by only reading in a gobject-registered io watch. These days I usually run code within a gobject MainLoop to avoid threads.
def set_up_ping(ip, w):
# run the sub-process
# watch the resultant pipe
p = subprocess.Popen(['/bin/ping', ip], stdout=subprocess.PIPE)
# make stdout a non-blocking file
fl = fcntl.fcntl(p.stdout, fcntl.F_GETFL)
fcntl.fcntl(p.stdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)
stdout_gid = gobject.io_add_watch(p.stdout, gobject.IO_IN, w)
return stdout_gid # for shutting down
The watcher is
def watch(f, *other):
print 'reading',f.read()
return True
And the main program sets up a ping and then calls gobject mail loop.
def main():
set_up_ping('192.168.1.8', watch)
# discard gid as unused here
gobject.MainLoop().run()
Any other work is attached to callbacks in gobject.
Adding this answer here since it provides ability to set non-blocking pipes on Windows and Unix.
All the ctypes details are thanks to #techtonik's answer.
There is a slightly modified version to be used both on Unix and Windows systems.
Python3 compatible (only minor change needed).
Includes posix version, and defines exception to use for either.
This way you can use the same function and exception for Unix and Windows code.
# pipe_non_blocking.py (module)
"""
Example use:
p = subprocess.Popen(
command,
stdout=subprocess.PIPE,
)
pipe_non_blocking_set(p.stdout.fileno())
try:
data = os.read(p.stdout.fileno(), 1)
except PortableBlockingIOError as ex:
if not pipe_non_blocking_is_error_blocking(ex):
raise ex
"""
__all__ = (
"pipe_non_blocking_set",
"pipe_non_blocking_is_error_blocking",
"PortableBlockingIOError",
)
import os
if os.name == "nt":
def pipe_non_blocking_set(fd):
# Constant could define globally but avoid polluting the name-space
# thanks to: https://stackoverflow.com/questions/34504970
import msvcrt
from ctypes import windll, byref, wintypes, WinError, POINTER
from ctypes.wintypes import HANDLE, DWORD, BOOL
LPDWORD = POINTER(DWORD)
PIPE_NOWAIT = wintypes.DWORD(0x00000001)
def pipe_no_wait(pipefd):
SetNamedPipeHandleState = windll.kernel32.SetNamedPipeHandleState
SetNamedPipeHandleState.argtypes = [HANDLE, LPDWORD, LPDWORD, LPDWORD]
SetNamedPipeHandleState.restype = BOOL
h = msvcrt.get_osfhandle(pipefd)
res = windll.kernel32.SetNamedPipeHandleState(h, byref(PIPE_NOWAIT), None, None)
if res == 0:
print(WinError())
return False
return True
return pipe_no_wait(fd)
def pipe_non_blocking_is_error_blocking(ex):
if not isinstance(ex, PortableBlockingIOError):
return False
from ctypes import GetLastError
ERROR_NO_DATA = 232
return (GetLastError() == ERROR_NO_DATA)
PortableBlockingIOError = OSError
else:
def pipe_non_blocking_set(fd):
import fcntl
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
return True
def pipe_non_blocking_is_error_blocking(ex):
if not isinstance(ex, PortableBlockingIOError):
return False
return True
PortableBlockingIOError = BlockingIOError
To avoid reading incomplete data, I ended up writing my own readline generator (which returns the byte string for each line).
Its a generator so you can for example...
def non_blocking_readlines(f, chunk=1024):
"""
Iterate over lines, yielding b'' when nothings left
or when new data is not yet available.
stdout_iter = iter(non_blocking_readlines(process.stdout))
line = next(stdout_iter) # will be a line or b''.
"""
import os
from .pipe_non_blocking import (
pipe_non_blocking_set,
pipe_non_blocking_is_error_blocking,
PortableBlockingIOError,
)
fd = f.fileno()
pipe_non_blocking_set(fd)
blocks = []
while True:
try:
data = os.read(fd, chunk)
if not data:
# case were reading finishes with no trailing newline
yield b''.join(blocks)
blocks.clear()
except PortableBlockingIOError as ex:
if not pipe_non_blocking_is_error_blocking(ex):
raise ex
yield b''
continue
while True:
n = data.find(b'\n')
if n == -1:
break
yield b''.join(blocks) + data[:n + 1]
data = data[n + 1:]
blocks.clear()
blocks.append(data)
Not the first and probably not the last, I have built a package that does non blocking stdout PIPE reads with two different methods, one being based on the work of J.F. Sebastian (#jfs)'s answer, the other being a simple communicate() loop with a thread to check for timeouts.
Both stdout capture methods are tested to work both under Linux and Windows, with Python versions from 2.7 to 3.9 as of the time of writing
Being non blocking, it guarantees timeout enforcement, even with multiple child and grandchild processes, and even under Python 2.7.
The package also handles both bytes and text stdout encodings, being a nightmare when trying to catch EOF.
You'll find the package at https://github.com/netinvent/command_runner
If you need some well tested non blocking read implementations, try it out (or hack the code):
pip install command_runner
from command_runner import command_runner
exit_code, output = command_runner('ping 127.0.0.1', timeout=3)
exit_code, output = command_runner('echo hello world, shell=True)
exit_code, output = command_runner('some command', stdout='some_file')
You can find the core non blocking read code in _poll_process() or _monitor_process() depending on the capture method employed.
From there, you can hack your way to what you want, or simply use the whole package to execute your commands as a subprocess replacement.
The select module helps you determine where the next useful input is.
However, you're almost always happier with separate threads. One does a blocking read the stdin, another does wherever it is you don't want blocked.
why bothering thread&queue?
unlike readline(), BufferedReader.read1() wont block waiting for \r\n, it returns ASAP if there is any output coming in.
#!/usr/bin/python
from subprocess import Popen, PIPE, STDOUT
import io
def __main__():
try:
p = Popen( ["ping", "-n", "3", "127.0.0.1"], stdin=PIPE, stdout=PIPE, stderr=STDOUT )
except: print("Popen failed"); quit()
sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
while True:
buf = sout.read1(1024)
if len(buf) == 0: break
print buf,
if __name__ == '__main__':
__main__()
In my case I needed a logging module that catches the output from the background applications and augments it(adding time-stamps, colors, etc.).
I ended up with a background thread that does the actual I/O. Following code is only for POSIX platforms. I stripped non-essential parts.
If someone is going to use this beast for long runs consider managing open descriptors. In my case it was not a big problem.
# -*- python -*-
import fcntl
import threading
import sys, os, errno
import subprocess
class Logger(threading.Thread):
def __init__(self, *modules):
threading.Thread.__init__(self)
try:
from select import epoll, EPOLLIN
self.__poll = epoll()
self.__evt = EPOLLIN
self.__to = -1
except:
from select import poll, POLLIN
print 'epoll is not available'
self.__poll = poll()
self.__evt = POLLIN
self.__to = 100
self.__fds = {}
self.daemon = True
self.start()
def run(self):
while True:
events = self.__poll.poll(self.__to)
for fd, ev in events:
if (ev&self.__evt) != self.__evt:
continue
try:
self.__fds[fd].run()
except Exception, e:
print e
def add(self, fd, log):
assert not self.__fds.has_key(fd)
self.__fds[fd] = log
self.__poll.register(fd, self.__evt)
class log:
logger = Logger()
def __init__(self, name):
self.__name = name
self.__piped = False
def fileno(self):
if self.__piped:
return self.write
self.read, self.write = os.pipe()
fl = fcntl.fcntl(self.read, fcntl.F_GETFL)
fcntl.fcntl(self.read, fcntl.F_SETFL, fl | os.O_NONBLOCK)
self.fdRead = os.fdopen(self.read)
self.logger.add(self.read, self)
self.__piped = True
return self.write
def __run(self, line):
self.chat(line, nl=False)
def run(self):
while True:
try: line = self.fdRead.readline()
except IOError, exc:
if exc.errno == errno.EAGAIN:
return
raise
self.__run(line)
def chat(self, line, nl=True):
if nl: nl = '\n'
else: nl = ''
sys.stdout.write('[%s] %s%s' % (self.__name, line, nl))
def system(command, param=[], cwd=None, env=None, input=None, output=None):
args = [command] + param
p = subprocess.Popen(args, cwd=cwd, stdout=output, stderr=output, stdin=input, env=env, bufsize=0)
p.wait()
ls = log('ls')
ls.chat('go')
system("ls", ['-l', '/'], output=ls)
date = log('date')
date.chat('go')
system("date", output=date)
This is a example to run interactive command in subprocess, and the stdout is interactive by using pseudo terminal. You can refer to: https://stackoverflow.com/a/43012138/3555925
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen
command = 'bash'
# command = 'docker run -it --rm centos /bin/bash'.split()
# save original tty setting then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())
# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()
# use os.setsid() make it run in a new process group, or bash job control will not be enabled
p = Popen(command,
preexec_fn=os.setsid,
stdin=slave_fd,
stdout=slave_fd,
stderr=slave_fd,
universal_newlines=True)
while p.poll() is None:
r, w, e = select.select([sys.stdin, master_fd], [], [])
if sys.stdin in r:
d = os.read(sys.stdin.fileno(), 10240)
os.write(master_fd, d)
elif master_fd in r:
o = os.read(master_fd, 10240)
if o:
os.write(sys.stdout.fileno(), o)
# restore tty settings back
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)
My problem is a bit different as I wanted to collect both stdout and stderr from a running process, but ultimately the same since I wanted to render the output in a widget as its generated.
I did not want to resort to many of the proposed workarounds using Queues or additional Threads as they should not be necessary to perform such a common task as running another script and collecting its output.
After reading the proposed solutions and python docs I resolved my issue with the implementation below. Yes it only works for POSIX as I'm using the select function call.
I agree that the docs are confusing and the implementation is awkward for such a common scripting task. I believe that older versions of python have different defaults for Popen and different explanations so that created a lot of confusion. This seems to work well for both Python 2.7.12 and 3.5.2.
The key was to set bufsize=1 for line buffering and then universal_newlines=True to process as a text file instead of a binary which seems to become the default when setting bufsize=1.
class workerThread(QThread):
def __init__(self, cmd):
QThread.__init__(self)
self.cmd = cmd
self.result = None ## return code
self.error = None ## flag indicates an error
self.errorstr = "" ## info message about the error
def __del__(self):
self.wait()
DEBUG("Thread removed")
def run(self):
cmd_list = self.cmd.split(" ")
try:
cmd = subprocess.Popen(cmd_list, bufsize=1, stdin=None
, universal_newlines=True
, stderr=subprocess.PIPE
, stdout=subprocess.PIPE)
except OSError:
self.error = 1
self.errorstr = "Failed to execute " + self.cmd
ERROR(self.errorstr)
finally:
VERBOSE("task started...")
import select
while True:
try:
r,w,x = select.select([cmd.stdout, cmd.stderr],[],[])
if cmd.stderr in r:
line = cmd.stderr.readline()
if line != "":
line = line.strip()
self.emit(SIGNAL("update_error(QString)"), line)
if cmd.stdout in r:
line = cmd.stdout.readline()
if line == "":
break
line = line.strip()
self.emit(SIGNAL("update_output(QString)"), line)
except IOError:
pass
cmd.wait()
self.result = cmd.returncode
if self.result < 0:
self.error = 1
self.errorstr = "Task terminated by signal " + str(self.result)
ERROR(self.errorstr)
return
if self.result:
self.error = 1
self.errorstr = "exit code " + str(self.result)
ERROR(self.errorstr)
return
return
ERROR, DEBUG and VERBOSE are simply macros that print output to the terminal.
This solution is IMHO 99.99% effective as it still uses the blocking readline function, so we assume the sub process is nice and outputs complete lines.
I welcome feedback to improve the solution as I am still new to Python.
I have created a library based on J. F. Sebastian's solution. You can use it.
https://github.com/cenkalti/what
Working from J.F. Sebastian's answer, and several other sources, I've put together a simple subprocess manager. It provides the request non-blocking reading, as well as running several processes in parallel. It doesn't use any OS-specific call (that I'm aware) and thus should work anywhere.
It's available from pypi, so just pip install shelljob. Refer to the project page for examples and full docs.
EDIT: This implementation still blocks. Use J.F.Sebastian's answer instead.
I tried the top answer, but the additional risk and maintenance of thread code was worrisome.
Looking through the io module (and being limited to 2.6), I found BufferedReader. This is my threadless, non-blocking solution.
import io
from subprocess import PIPE, Popen
p = Popen(['myprogram.exe'], stdout=PIPE)
SLEEP_DELAY = 0.001
# Create an io.BufferedReader on the file descriptor for stdout
with io.open(p.stdout.fileno(), 'rb', closefd=False) as buffer:
while p.poll() == None:
time.sleep(SLEEP_DELAY)
while '\n' in bufferedStdout.peek(bufferedStdout.buffer_size):
line = buffer.readline()
# do stuff with the line
# Handle any remaining output after the process has ended
while buffer.peek():
line = buffer.readline()
# do stuff with the line
This solution uses the select module to "read any available data" from an IO stream. This function blocks initially until data is available, but then reads only the data that is available and doesn't block further.
Given the fact that it uses the select module, this only works on Unix.
The code is fully PEP8-compliant.
import select
def read_available(input_stream, max_bytes=None):
"""
Blocks until any data is available, then all available data is then read and returned.
This function returns an empty string when end of stream is reached.
Args:
input_stream: The stream to read from.
max_bytes (int|None): The maximum number of bytes to read. This function may return fewer bytes than this.
Returns:
str
"""
# Prepare local variables
input_streams = [input_stream]
empty_list = []
read_buffer = ""
# Initially block for input using 'select'
if len(select.select(input_streams, empty_list, empty_list)[0]) > 0:
# Poll read-readiness using 'select'
def select_func():
return len(select.select(input_streams, empty_list, empty_list, 0)[0]) > 0
# Create while function based on parameters
if max_bytes is not None:
def while_func():
return (len(read_buffer) < max_bytes) and select_func()
else:
while_func = select_func
while True:
# Read single byte at a time
read_data = input_stream.read(1)
if len(read_data) == 0:
# End of stream
break
# Append byte to string buffer
read_buffer += read_data
# Check if more data is available
if not while_func():
break
# Return read buffer
return read_buffer
I also faced the problem described by Jesse and solved it by using "select" as Bradley, Andy and others did but in a blocking mode to avoid a busy loop. It uses a dummy Pipe as a fake stdin. The select blocks and wait for either stdin or the pipe to be ready. When a key is pressed stdin unblocks the select and the key value can be retrieved with read(1). When a different thread writes to the pipe then the pipe unblocks the select and it can be taken as an indication that the need for stdin is over. Here is some reference code:
import sys
import os
from select import select
# -------------------------------------------------------------------------
# Set the pipe (fake stdin) to simulate a final key stroke
# which will unblock the select statement
readEnd, writeEnd = os.pipe()
readFile = os.fdopen(readEnd)
writeFile = os.fdopen(writeEnd, "w")
# -------------------------------------------------------------------------
def getKey():
# Wait for stdin or pipe (fake stdin) to be ready
dr,dw,de = select([sys.__stdin__, readFile], [], [])
# If stdin is the one ready then read it and return value
if sys.__stdin__ in dr:
return sys.__stdin__.read(1) # For Windows use ----> getch() from module msvcrt
# Must finish
else:
return None
# -------------------------------------------------------------------------
def breakStdinRead():
writeFile.write(' ')
writeFile.flush()
# -------------------------------------------------------------------------
# MAIN CODE
# Get key stroke
key = getKey()
# Keyboard input
if key:
# ... do your stuff with the key value
# Faked keystroke
else:
# ... use of stdin finished
# -------------------------------------------------------------------------
# OTHER THREAD CODE
breakStdinRead()
Try wexpect, which is the windows alternative of pexpect.
import wexpect
p = wexpect.spawn('myprogram.exe')
p.stdout.readline('.') // regex pattern of any character
output_str = p.after()
Here is a module that supports non-blocking reads and background writes in python:
https://pypi.python.org/pypi/python-nonblock
Provides a function,
nonblock_read which will read data from the stream, if available, otherwise return an empty string (or None if the stream is closed on the other side and all possible data has been read)
You may also consider the python-subprocess2 module,
https://pypi.python.org/pypi/python-subprocess2
which adds to the subprocess module. So on the object returned from "subprocess.Popen" is added an additional method, runInBackground. This starts a thread and returns an object which will automatically be populated as stuff is written to stdout/stderr, without blocking your main thread.
Enjoy!

Tried to make a non-blocking command execution function, what's causing this unexpected behavior?

What I wanted to happen:
So my goal was to write a function that leverages subprocess to run a command and read the stdout, whether it be immediate or delayed, line by line as it comes. And to do that in a non-blocking, asynchronous way.
I also wanted to be able to pass a function to be called each time a new stdout line is read.
What happened instead:
Until the process being run is completely finished / killed, the output isn't handled / printed as expected. All the correct output happens, but I expected it to print in real-time as the output is polled. Rather, it waits until the entire process finishes running, then prints all the expected output.
What I tried:
So I wrote a simple test script lab_temp.py to provide some output:
from time import sleep
for i in range(10):
print('i:', i)
sleep(1)
And a function set_interval.py which I mostly copied from some SO answer (although I'm sorry I don't recall which answer to give credit):
import threading
def set_interval(func, sec):
def func_wrapper():
t = set_interval(func, sec)
result = func()
if result == False:
t.cancel()
t = threading.Timer(sec, func_wrapper)
t.start()
return t
And then a function call_command.py to run the command and asynchronously poll the process at some interval for output, until it's done. I'm only barely experienced with asynchronous code, and that's probably related to my mistake, but I think the async part is being handled behind the scenes by threading.Timer (in set_interval.py).
call_command.py:
import subprocess
from set_interval import set_interval
def call_command(cmd, update_func=None):
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE, encoding='utf-8')
def polling(): # Replaces "while True:" to convert to non-blocking
for line in iter(p.stdout.readline, ''):
if update_func:
update_func(line.rstrip())
if p.poll() == 0:
print('False')
return False # cancel interval
else:
print('True')
return True # continue interval
set_interval(polling, 1)
And each of these functions have basic tests:
set_interval.test.py (seems to run as expected):
from set_interval import set_interval
i = 0
def func():
global i
i += 1
print(f"Interval: {i}...")
if i > 5:
return False
else:
return True
set_interval(func, 2)
print('non blocking')
call_command.test.py (results in the wrong behavior, as described initially):
from call_command import call_command
def func(out):
print(out) # <- This will print in one big batch once
# the entire process is complete.
call_command('python3 lab_temp.py', update_func=func)
print('non-blocking') # <- This will print right away, so I
# know it's not blocked / frozen.
What have I gotten wrong here causing the deviation from expectation?
Edit: Continued efforts...
import subprocess
from set_interval import set_interval
def call_command(cmd):
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE, encoding='utf-8')
def polling():
line = p.stdout.readline().strip()
if not line and p.poll() is not None:
return False
else:
print(line)
return True
set_interval(polling, 1)
Doesn't work. Nearly identical issues.
The problem is located in your command. The lab_temp.py script uses the print function that prints to sys.stdout by default. sys.stdout is block-buffered by default. Since the buffer is large enough to accept the whole script's output, it gets flushed no sooner than at the end.
To fix this, you can either use the sys.stdout's flush method:
from time import sleep
import sys
for i in range(10):
print('i:', i)
sys.stdout.flush()
sleep(1)
or use print's flush parameter:
from time import sleep
for i in range(10):
print('i:', i, flush=True)
sleep(1)
or run Python interpreter with the -u option:
call_command(['python3', '-u', 'lab_temp.py'])
or set the PYTHONUNBUFFERED environment variable to a non-empty string
import subprocess
from set_interval import set_interval
def call_command(cmd):
p = subprocess.Popen(cmd, stderr=subprocess.PIPE, stdout=subprocess.PIPE, encoding='utf-8', env={'PYTHONUNBUFFERED':'1'})
def polling():
line = p.stdout.readline().strip()
if not line and p.poll() is not None:
return False
else:
print(line)
return True
set_interval(polling, 1)
BTW, in order to avoid threads, you might want to use asyncio:
import asyncio
async def call_command(cmd):
p = await asyncio.create_subprocess_exec(cmd[0], *cmd[1:], stderr=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE)
async for line in p.stdout:
line = line.strip().decode('utf-8')
print(line)

asynchronous subprocess Popen python 3.5

I am trying to asynchronously run the Popen command from subprocess, so that I can run other stuff in the background.
import subprocess
import requests
import asyncio
import asyncio.subprocess
async def x(message):
if len(message.content.split()) > 1:
#output = asyncio.create_subprocess_shell(message.content[3:], shell=True, stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
output = subprocess.Popen(message.content[3:], shell=True, stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
return output.communicate()[0].decode('utf-8')
I have tried to understand https://docs.python.org/3/library/asyncio-subprocess.html but i am not sure what a protocol factory is.
When I came to this question, I expected the answer to really use asyncio for interprocess communication.
I have found the following resource useful:
https://github.com/python/asyncio/blob/master/examples/child_process.py
and below is my simplified example (using 3.5+ async/await syntax), which reads lines and outputs them sorted:
import asyncio
from subprocess import Popen, PIPE
async def connect_write_pipe(file):
"""Return a write-only transport wrapping a writable pipe"""
loop = asyncio.get_event_loop()
transport, _ = await loop.connect_write_pipe(asyncio.Protocol, file)
return transport
async def connect_read_pipe(file):
"""Wrap a readable pipe in a stream"""
loop = asyncio.get_event_loop()
stream_reader = asyncio.StreamReader(loop=loop)
def factory():
return asyncio.StreamReaderProtocol(stream_reader)
transport, _ = await loop.connect_read_pipe(factory, file)
return stream_reader, transport
async def main(loop):
# start subprocess and wrap stdin, stdout, stderr
p = Popen(['/usr/bin/sort'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
stdin = await connect_write_pipe(p.stdin)
stdout, stdout_transport = await connect_read_pipe(p.stdout)
stderr, stderr_transport = await connect_read_pipe(p.stderr)
# interact with subprocess
name = {stdout: 'OUT', stderr: 'ERR'}
registered = {
asyncio.Task(stderr.read()): stderr,
asyncio.Task(stdout.read()): stdout
}
to_sort = b"one\ntwo\nthree\n"
stdin.write(to_sort)
stdin.close() # this way we tell we do not have anything else
# get and print lines from stdout, stderr
timeout = None
while registered:
done, pending = await asyncio.wait(
registered, timeout=timeout,
return_when=asyncio.FIRST_COMPLETED)
if not done:
break
for f in done:
stream = registered.pop(f)
res = f.result()
if res != b'':
print(name[stream], res.decode('ascii').rstrip())
registered[asyncio.Task(stream.read())] = stream
timeout = 0.0
stdout_transport.close()
stderr_transport.close()
if __name__ == '__main__':
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(main(loop))
finally:
loop.close()
NB: without taking special measures, the amount of data to be written into the pipe is limited. In my system it was possible to write just over 700000 bytes before using up pipe buffers.
There are also other examples there, using create_subprocess_shell.
I have not yet used asyncio in real projects, so improvements' suggestions in the comments are welcome.
It's the right way to go...! Use
async/await
Tested it on Python - 3.X [Windows, MacOS]
import asyncio
from asyncio.subprocess import PIPE, STDOUT
import subprocess
import signal
def signal_handler(signal, frame):
loop.stop()
client.close()
sys.exit(0)
async def run_async(loop = ''):
cmd = 'sudo long_running_cmd --opt1=AAAA --opt2=BBBB'
print ("[INFO] Starting script...")
await asyncio.create_subprocess_shell(cmd1, stdin = PIPE, stdout = PIPE, stderr = STDOUT)
print("[INFO] Script is complete.")
loop = asyncio.get_event_loop()
signal.signal(signal.SIGINT, signal_handler)
tasks = [loop.create_task(run_async())]
wait_tasks = asyncio.wait(tasks)
loop.run_until_complete(wait_tasks)
loop.close()
Core logic:
process = await asyncio.create_subprocess_shell(cmd1, stdin = PIPE, stdout PIPE, stderr = STDOUT)
await process.wait()
I eventually found the answer to my question, which utilizes async.
http://pastebin.com/Zj8SK1CG
Tested with python 3.5. Just ask if you have questions.
import threading
import time
import subprocess
import shlex
from sys import stdout
# Only data wihtin a class are actually shared by the threads.
# Let's use a class as communicator (there could be problems if you have more than
# a single thread)
class Communicator(object):
counter = 0
stop = False
arg = None
result = None
# Here we can define what you want to do. There are other methods to do that
# but this is the one I prefer.
class ThreadedFunction(threading.Thread):
def run(self, *args, **kwargs):
super().run()
command = c.arg
# Here what you want to do...
command = shlex.split(command)
print(time.time()) # this is just to check that the command (sleep 5) is executed
output = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
print('\n',time.time())
c.result = output
if c.stop: return None # This is useful only within loops within threads
# Create a class instance
c = Communicator()
c.arg = 'time sleep 5' # Here I used the 'time' only to have some output
# Create the thread and start it
t = ThreadedFunction()
t.start() # Start the thread and do something else...
# ...for example count the seconds in the mean time..
try:
for j in range(100):
c.counter += 1
stdout.write('\r{:}'.format(c.counter))
stdout.flush()
time.sleep(1)
if c.result != None:
print(c.result)
break
except:
c.stop = True
This one is much simpler, I found it after the other reply that could, anyway, be interesting... so I left it.
import time
import subprocess
import shlex
from sys import stdout
command = 'time sleep 5' # Here I used the 'time' only to have some output
def x(command):
cmd = shlex.split(command)
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
return p
# Start the subprocess and do something else...
p = x(command)
# ...for example count the seconds in the mean time..
try: # This take care of killing the subprocess if problems occur
for j in range(100):
stdout.write('\r{:}'.format(j))
stdout.flush()
time.sleep(1)
if p.poll() != None:
print(p.communicate())
break
except:
p.terminate() # or p.kill()
The asynchronism is evident from the fact that the python script prints the counter value on the stdout while the background process runs the sleep command. The fact that the python script exit after ~5sec printing the output of the bash time command printing the counter in the meanwhile is an evidence that the script works.

Partial read from "stdout" on Python using Popen [duplicate]

This question already has answers here:
Stop reading process output in Python without hang?
(5 answers)
Closed 8 years ago.
i'm trying to build a python script, which opens a subprocess (bash script) and reads "stdout" into a variable during 10 seconds. After 10 sec i need to transfer data to server via POST request.
How to make POST request i know, but how to collect "stdout" during 10 seconds?
i find a lot of examples how to use "Popen", launch bash script and read stderr instantly without biffering, but how to collect output during some time and release is partially?
I think this solution with two threads with simple responsibilities is clean and elegant.
import os
import subprocess
import threading
import functools
from time import sleep
class OutputMonitor(threading.Thread):
""" Start the subprocess in separate thread and append it's output to a buffer. """
def __init__(self, cmd):
super(OutputMonitor, self).__init__()
self.daemon = True
self.cmd = cmd
self.buffer = ''
self.buflock = threading.Lock()
def run(self):
popen = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while popen.poll() is None:
data = popen.stdout.read(4)
if data != "":
with self.buflock:
self.buffer += data
def get_current_output(self):
with self.buflock:
buf = self.buffer
self.buffer = ""
return buf
class OutputHandler(threading.Thread):
"""
Start a thread responsible for tracking subprocess output, and periodically
check if it has produced new output. If so, call handler to process this data.
"""
def __init__(self, cmd, interval, filepath):
super(OutputHandler, self).__init__()
self.om = OutputMonitor(cmd)
self.interval = interval
# Replace it with your handler init...
self.filepath = filepath
if os.path.exists(self.filepath):
os.unlink(self.filepath)
def run(self):
self.om.start()
while self.om.is_alive():
sleep(self.interval)
data = self.om.get_current_output()
self._handle_data_chunk(data)
def _handle_data_chunk(self, data):
# Replace it with you handling.
with open(self.filepath, 'a') as f:
f.write(data)
if __name__ == '__main__':
logfile_path = "C:\\log.txt"
interval = 5
cmd = ['ping', 'n', '10', '127.0.0.1']
oh = OutputHandler(cmd, interval, logfile_path)
oh.start()
oh.join()
You could do something similar to what is below:
point the subprocess to output to console output
catch the output in a variable which is common to posting and capturing function
setup the thread to post the logs every 10 seconds
import threading, sys, subprocess
out = ""
def postLogs():
print out
#do your posting here
threading.Timer(10.0, postLogs).start() #execute this function every 10 seconds
proc = subprocess.Popen("ping google.com", shell=True,stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while proc.poll() is None:
out = proc.stdout.readline()
sys.stdout.flush
if out != "":
postLogs()
okay lets continue with mrad's script
i edit it just a little. Added writing to file function and it works perfect. with
ping google.com
BUT it do not work with command which i need... i need to launch ffmpeg. Command which i need is
ffmpeg -i "my rtsp link" -vcodec copy -loglevel verbose -an -f flv "my RTMP link"
when i put my command inside this code 1- i see output instantly. 2- nothing saved in the file (
import subprocess
import threading
from datetime import datetime
from time import sleep
from Queue import Queue
class Monitor(threading.Thread):
def __init__(self, queue, cmd):
super(Monitor, self).__init__()
self.queue = queue
self.cmd = cmd
def run(self):
popen = subprocess.Popen(self.cmd, stdout=subprocess.PIPE, shell=True)
while popen.poll() is None:
line = popen.stdout.readline()
self.queue.put(line)
def foo(cmd, interval):
q = Queue()
m = Monitor(q, cmd)
m.start()
while m.is_alive():
sleep(interval)
current_queue_length = q.qsize()
chunk = ''
for i in xrange(current_queue_length):
chunk += q.get()
print chunk
f=open("/home/pi/raf/log.txt","a") #trying to write to log
f.write(chunk)
f.close()
if __name__ == '__main__':
cmd = 'ping google.com'
interval = 3
foo(cmd, interval)

Python readline blocking the while loop [duplicate]

I'm using the subprocess module to start a subprocess and connect to its output stream (standard output). I want to be able to execute non-blocking reads on its standard output. Is there a way to make .readline non-blocking or to check if there is data on the stream before I invoke .readline? I'd like this to be portable or at least work under Windows and Linux.
Here is how I do it for now (it's blocking on the .readline if no data is available):
p = subprocess.Popen('myprogram.exe', stdout = subprocess.PIPE)
output_str = p.stdout.readline()
fcntl, select, asyncproc won't help in this case.
A reliable way to read a stream without blocking regardless of operating system is to use Queue.get_nowait():
import sys
from subprocess import PIPE, Popen
from threading import Thread
try:
from queue import Queue, Empty
except ImportError:
from Queue import Queue, Empty # python 2.x
ON_POSIX = 'posix' in sys.builtin_module_names
def enqueue_output(out, queue):
for line in iter(out.readline, b''):
queue.put(line)
out.close()
p = Popen(['myprogram.exe'], stdout=PIPE, bufsize=1, close_fds=ON_POSIX)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True # thread dies with the program
t.start()
# ... do other things here
# read line without blocking
try: line = q.get_nowait() # or q.get(timeout=.1)
except Empty:
print('no output yet')
else: # got line
# ... do something with line
I have often had a similar problem; Python programs I write frequently need to have the ability to execute some primary functionality while simultaneously accepting user input from the command line (stdin). Simply putting the user input handling functionality in another thread doesn't solve the problem because readline() blocks and has no timeout. If the primary functionality is complete and there is no longer any need to wait for further user input I typically want my program to exit, but it can't because readline() is still blocking in the other thread waiting for a line. A solution I have found to this problem is to make stdin a non-blocking file using the fcntl module:
import fcntl
import os
import sys
# make stdin a non-blocking file
fd = sys.stdin.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
# user input handling thread
while mainThreadIsRunning:
try: input = sys.stdin.readline()
except: continue
handleInput(input)
In my opinion this is a bit cleaner than using the select or signal modules to solve this problem but then again it only works on UNIX...
Python 3.4 introduces new provisional API for asynchronous IO -- asyncio module.
The approach is similar to twisted-based answer by #Bryan Ward -- define a protocol and its methods are called as soon as data is ready:
#!/usr/bin/env python3
import asyncio
import os
class SubprocessProtocol(asyncio.SubprocessProtocol):
def pipe_data_received(self, fd, data):
if fd == 1: # got stdout data (bytes)
print(data)
def connection_lost(self, exc):
loop.stop() # end loop.run_forever()
if os.name == 'nt':
loop = asyncio.ProactorEventLoop() # for subprocess' pipes on Windows
asyncio.set_event_loop(loop)
else:
loop = asyncio.get_event_loop()
try:
loop.run_until_complete(loop.subprocess_exec(SubprocessProtocol,
"myprogram.exe", "arg1", "arg2"))
loop.run_forever()
finally:
loop.close()
See "Subprocess" in the docs.
There is a high-level interface asyncio.create_subprocess_exec() that returns Process objects that allows to read a line asynchroniosly using StreamReader.readline() coroutine
(with async/await Python 3.5+ syntax):
#!/usr/bin/env python3.5
import asyncio
import locale
import sys
from asyncio.subprocess import PIPE
from contextlib import closing
async def readline_and_kill(*args):
# start child process
process = await asyncio.create_subprocess_exec(*args, stdout=PIPE)
# read line (sequence of bytes ending with b'\n') asynchronously
async for line in process.stdout:
print("got line:", line.decode(locale.getpreferredencoding(False)))
break
process.kill()
return await process.wait() # wait for the child process to exit
if sys.platform == "win32":
loop = asyncio.ProactorEventLoop()
asyncio.set_event_loop(loop)
else:
loop = asyncio.get_event_loop()
with closing(loop):
sys.exit(loop.run_until_complete(readline_and_kill(
"myprogram.exe", "arg1", "arg2")))
readline_and_kill() performs the following tasks:
start subprocess, redirect its stdout to a pipe
read a line from subprocess' stdout asynchronously
kill subprocess
wait for it to exit
Each step could be limited by timeout seconds if necessary.
On Unix-like systems and Python 3.5+ there's os.set_blocking which does exactly what it says.
import os
import time
import subprocess
cmd = 'python3', '-c', 'import time; [(print(i), time.sleep(1)) for i in range(5)]'
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
os.set_blocking(p.stdout.fileno(), False)
start = time.time()
while True:
# first iteration always produces empty byte string in non-blocking mode
for i in range(2):
line = p.stdout.readline()
print(i, line)
time.sleep(0.5)
if time.time() > start + 5:
break
p.terminate()
This outputs:
1 b''
2 b'0\n'
1 b''
2 b'1\n'
1 b''
2 b'2\n'
1 b''
2 b'3\n'
1 b''
2 b'4\n'
With os.set_blocking commented it's:
0 b'0\n'
1 b'1\n'
0 b'2\n'
1 b'3\n'
0 b'4\n'
1 b''
Try the asyncproc module. For example:
import os
from asyncproc import Process
myProc = Process("myprogram.app")
while True:
# check to see if process has ended
poll = myProc.wait(os.WNOHANG)
if poll != None:
break
# print any new output
out = myProc.read()
if out != "":
print out
The module takes care of all the threading as suggested by S.Lott.
You can do this really easily in Twisted. Depending upon your existing code base, this might not be that easy to use, but if you are building a twisted application, then things like this become almost trivial. You create a ProcessProtocol class, and override the outReceived() method. Twisted (depending upon the reactor used) is usually just a big select() loop with callbacks installed to handle data from different file descriptors (often network sockets). So the outReceived() method is simply installing a callback for handling data coming from STDOUT. A simple example demonstrating this behavior is as follows:
from twisted.internet import protocol, reactor
class MyProcessProtocol(protocol.ProcessProtocol):
def outReceived(self, data):
print data
proc = MyProcessProtocol()
reactor.spawnProcess(proc, './myprogram', ['./myprogram', 'arg1', 'arg2', 'arg3'])
reactor.run()
The Twisted documentation has some good information on this.
If you build your entire application around Twisted, it makes asynchronous communication with other processes, local or remote, really elegant like this. On the other hand, if your program isn't built on top of Twisted, this isn't really going to be that helpful. Hopefully this can be helpful to other readers, even if it isn't applicable for your particular application.
Use select & read(1).
import subprocess #no new requirements
def readAllSoFar(proc, retVal=''):
while (select.select([proc.stdout],[],[],0)[0]!=[]):
retVal+=proc.stdout.read(1)
return retVal
p = subprocess.Popen(['/bin/ls'], stdout=subprocess.PIPE)
while not p.poll():
print (readAllSoFar(p))
For readline()-like:
lines = ['']
while not p.poll():
lines = readAllSoFar(p, lines[-1]).split('\n')
for a in range(len(lines)-1):
print a
lines = readAllSoFar(p, lines[-1]).split('\n')
for a in range(len(lines)-1):
print a
Things are a lot better in modern Python.
Here's a simple child program, "hello.py":
#!/usr/bin/env python3
while True:
i = input()
if i == "quit":
break
print(f"hello {i}")
And a program to interact with it:
import asyncio
async def main():
proc = await asyncio.subprocess.create_subprocess_exec(
"./hello.py", stdin=asyncio.subprocess.PIPE, stdout=asyncio.subprocess.PIPE
)
proc.stdin.write(b"bob\n")
print(await proc.stdout.read(1024))
proc.stdin.write(b"alice\n")
print(await proc.stdout.read(1024))
proc.stdin.write(b"quit\n")
await proc.wait()
asyncio.run(main())
That prints out:
b'hello bob\n'
b'hello alice\n'
Note that the actual pattern, which is also by almost all of the previous answers, both here and in related questions, is to set the child's stdout file descriptor to non-blocking and then poll it in some sort of select loop. These days, of course, that loop is provided by asyncio.
One solution is to make another process to perform your read of the process, or make a thread of the process with a timeout.
Here's the threaded version of a timeout function:
http://code.activestate.com/recipes/473878/
However, do you need to read the stdout as it's coming in?
Another solution may be to dump the output to a file and wait for the process to finish using p.wait().
f = open('myprogram_output.txt','w')
p = subprocess.Popen('myprogram.exe', stdout=f)
p.wait()
f.close()
str = open('myprogram_output.txt','r').read()
Here is my code, used to catch every output from subprocess ASAP, including partial lines. It pumps at same time and stdout and stderr in almost correct order.
Tested and correctly worked on Python 2.7 linux & windows.
#!/usr/bin/python
#
# Runner with stdout/stderr catcher
#
from sys import argv
from subprocess import Popen, PIPE
import os, io
from threading import Thread
import Queue
def __main__():
if (len(argv) > 1) and (argv[-1] == "-sub-"):
import time, sys
print "Application runned!"
time.sleep(2)
print "Slept 2 second"
time.sleep(1)
print "Slept 1 additional second",
time.sleep(2)
sys.stderr.write("Stderr output after 5 seconds")
print "Eol on stdin"
sys.stderr.write("Eol on stderr\n")
time.sleep(1)
print "Wow, we have end of work!",
else:
os.environ["PYTHONUNBUFFERED"]="1"
try:
p = Popen( argv + ["-sub-"],
bufsize=0, # line-buffered
stdin=PIPE, stdout=PIPE, stderr=PIPE )
except WindowsError, W:
if W.winerror==193:
p = Popen( argv + ["-sub-"],
shell=True, # Try to run via shell
bufsize=0, # line-buffered
stdin=PIPE, stdout=PIPE, stderr=PIPE )
else:
raise
inp = Queue.Queue()
sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
serr = io.open(p.stderr.fileno(), 'rb', closefd=False)
def Pump(stream, category):
queue = Queue.Queue()
def rdr():
while True:
buf = stream.read1(8192)
if len(buf)>0:
queue.put( buf )
else:
queue.put( None )
return
def clct():
active = True
while active:
r = queue.get()
try:
while True:
r1 = queue.get(timeout=0.005)
if r1 is None:
active = False
break
else:
r += r1
except Queue.Empty:
pass
inp.put( (category, r) )
for tgt in [rdr, clct]:
th = Thread(target=tgt)
th.setDaemon(True)
th.start()
Pump(sout, 'stdout')
Pump(serr, 'stderr')
while p.poll() is None:
# App still working
try:
chan,line = inp.get(timeout = 1.0)
if chan=='stdout':
print "STDOUT>>", line, "<?<"
elif chan=='stderr':
print " ERROR==", line, "=?="
except Queue.Empty:
pass
print "Finish"
if __name__ == '__main__':
__main__()
Disclaimer: this works only for tornado
You can do this by setting the fd to be nonblocking and then use ioloop to register callbacks. I have packaged this in an egg called tornado_subprocess and you can install it via PyPI:
easy_install tornado_subprocess
now you can do something like this:
import tornado_subprocess
import tornado.ioloop
def print_res( status, stdout, stderr ) :
print status, stdout, stderr
if status == 0:
print "OK:"
print stdout
else:
print "ERROR:"
print stderr
t = tornado_subprocess.Subprocess( print_res, timeout=30, args=[ "cat", "/etc/passwd" ] )
t.start()
tornado.ioloop.IOLoop.instance().start()
you can also use it with a RequestHandler
class MyHandler(tornado.web.RequestHandler):
def on_done(self, status, stdout, stderr):
self.write( stdout )
self.finish()
#tornado.web.asynchronous
def get(self):
t = tornado_subprocess.Subprocess( self.on_done, timeout=30, args=[ "cat", "/etc/passwd" ] )
t.start()
Existing solutions did not work for me (details below). What finally worked was to implement readline using read(1) (based on this answer). The latter does not block:
from subprocess import Popen, PIPE
from threading import Thread
def process_output(myprocess): #output-consuming thread
nextline = None
buf = ''
while True:
#--- extract line using read(1)
out = myprocess.stdout.read(1)
if out == '' and myprocess.poll() != None: break
if out != '':
buf += out
if out == '\n':
nextline = buf
buf = ''
if not nextline: continue
line = nextline
nextline = None
#--- do whatever you want with line here
print 'Line is:', line
myprocess.stdout.close()
myprocess = Popen('myprogram.exe', stdout=PIPE) #output-producing process
p1 = Thread(target=process_output, args=(myprocess,)) #output-consuming thread
p1.daemon = True
p1.start()
#--- do whatever here and then kill process and thread if needed
if myprocess.poll() == None: #kill process; will automatically stop thread
myprocess.kill()
myprocess.wait()
if p1 and p1.is_alive(): #wait for thread to finish
p1.join()
Why existing solutions did not work:
Solutions that require readline (including the Queue based ones) always block. It is difficult (impossible?) to kill the thread that executes readline. It only gets killed when the process that created it finishes, but not when the output-producing process is killed.
Mixing low-level fcntl with high-level readline calls may not work properly as anonnn has pointed out.
Using select.poll() is neat, but doesn't work on Windows according to python docs.
Using third-party libraries seems overkill for this task and adds additional dependencies.
I add this problem to read some subprocess.Popen stdout.
Here is my non blocking read solution:
import fcntl
def non_block_read(output):
fd = output.fileno()
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
try:
return output.read()
except:
return ""
# Use example
from subprocess import *
sb = Popen("echo test && sleep 1000", shell=True, stdout=PIPE)
sb.kill()
# sb.stdout.read() # <-- This will block
non_block_read(sb.stdout)
'test\n'
Here is a simple solution based on threads which:
works on both Linux and Windows (not relying on select).
reads both stdout and stderr asynchronouly.
doesn't rely on active polling with arbitrary waiting time (CPU friendly).
doesn't use asyncio (which may conflict with other libraries).
runs until the child process terminates.
printer.py
import time
import sys
sys.stdout.write("Hello\n")
sys.stdout.flush()
time.sleep(1)
sys.stdout.write("World!\n")
sys.stdout.flush()
time.sleep(1)
sys.stderr.write("That's an error\n")
sys.stderr.flush()
time.sleep(2)
sys.stdout.write("Actually, I'm fine\n")
sys.stdout.flush()
time.sleep(1)
reader.py
import queue
import subprocess
import sys
import threading
def enqueue_stream(stream, queue, type):
for line in iter(stream.readline, b''):
queue.put(str(type) + line.decode('utf-8'))
stream.close()
def enqueue_process(process, queue):
process.wait()
queue.put('x')
p = subprocess.Popen('python printer.py', stdout=subprocess.PIPE, stderr=subprocess.PIPE)
q = queue.Queue()
to = threading.Thread(target=enqueue_stream, args=(p.stdout, q, 1))
te = threading.Thread(target=enqueue_stream, args=(p.stderr, q, 2))
tp = threading.Thread(target=enqueue_process, args=(p, q))
te.start()
to.start()
tp.start()
while True:
line = q.get()
if line[0] == 'x':
break
if line[0] == '2': # stderr
sys.stdout.write("\033[0;31m") # ANSI red color
sys.stdout.write(line[1:])
if line[0] == '2':
sys.stdout.write("\033[0m") # reset ANSI code
sys.stdout.flush()
tp.join()
to.join()
te.join()
This version of non-blocking read doesn't require special modules and will work out-of-the-box on majority of Linux distros.
import os
import sys
import time
import fcntl
import subprocess
def async_read(fd):
# set non-blocking flag while preserving old flags
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
# read char until EOF hit
while True:
try:
ch = os.read(fd.fileno(), 1)
# EOF
if not ch: break
sys.stdout.write(ch)
except OSError:
# waiting for data be available on fd
pass
def shell(args, async=True):
# merge stderr and stdout
proc = subprocess.Popen(args, shell=False, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
if async: async_read(proc.stdout)
sout, serr = proc.communicate()
return (sout, serr)
if __name__ == '__main__':
cmd = 'ping 8.8.8.8'
sout, serr = shell(cmd.split())
I have the original questioner's problem, but did not wish to invoke threads. I mixed Jesse's solution with a direct read() from the pipe, and my own buffer-handler for line reads (however, my sub-process - ping - always wrote full lines < a system page size). I avoid busy-waiting by only reading in a gobject-registered io watch. These days I usually run code within a gobject MainLoop to avoid threads.
def set_up_ping(ip, w):
# run the sub-process
# watch the resultant pipe
p = subprocess.Popen(['/bin/ping', ip], stdout=subprocess.PIPE)
# make stdout a non-blocking file
fl = fcntl.fcntl(p.stdout, fcntl.F_GETFL)
fcntl.fcntl(p.stdout, fcntl.F_SETFL, fl | os.O_NONBLOCK)
stdout_gid = gobject.io_add_watch(p.stdout, gobject.IO_IN, w)
return stdout_gid # for shutting down
The watcher is
def watch(f, *other):
print 'reading',f.read()
return True
And the main program sets up a ping and then calls gobject mail loop.
def main():
set_up_ping('192.168.1.8', watch)
# discard gid as unused here
gobject.MainLoop().run()
Any other work is attached to callbacks in gobject.
Adding this answer here since it provides ability to set non-blocking pipes on Windows and Unix.
All the ctypes details are thanks to #techtonik's answer.
There is a slightly modified version to be used both on Unix and Windows systems.
Python3 compatible (only minor change needed).
Includes posix version, and defines exception to use for either.
This way you can use the same function and exception for Unix and Windows code.
# pipe_non_blocking.py (module)
"""
Example use:
p = subprocess.Popen(
command,
stdout=subprocess.PIPE,
)
pipe_non_blocking_set(p.stdout.fileno())
try:
data = os.read(p.stdout.fileno(), 1)
except PortableBlockingIOError as ex:
if not pipe_non_blocking_is_error_blocking(ex):
raise ex
"""
__all__ = (
"pipe_non_blocking_set",
"pipe_non_blocking_is_error_blocking",
"PortableBlockingIOError",
)
import os
if os.name == "nt":
def pipe_non_blocking_set(fd):
# Constant could define globally but avoid polluting the name-space
# thanks to: https://stackoverflow.com/questions/34504970
import msvcrt
from ctypes import windll, byref, wintypes, WinError, POINTER
from ctypes.wintypes import HANDLE, DWORD, BOOL
LPDWORD = POINTER(DWORD)
PIPE_NOWAIT = wintypes.DWORD(0x00000001)
def pipe_no_wait(pipefd):
SetNamedPipeHandleState = windll.kernel32.SetNamedPipeHandleState
SetNamedPipeHandleState.argtypes = [HANDLE, LPDWORD, LPDWORD, LPDWORD]
SetNamedPipeHandleState.restype = BOOL
h = msvcrt.get_osfhandle(pipefd)
res = windll.kernel32.SetNamedPipeHandleState(h, byref(PIPE_NOWAIT), None, None)
if res == 0:
print(WinError())
return False
return True
return pipe_no_wait(fd)
def pipe_non_blocking_is_error_blocking(ex):
if not isinstance(ex, PortableBlockingIOError):
return False
from ctypes import GetLastError
ERROR_NO_DATA = 232
return (GetLastError() == ERROR_NO_DATA)
PortableBlockingIOError = OSError
else:
def pipe_non_blocking_set(fd):
import fcntl
fl = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
return True
def pipe_non_blocking_is_error_blocking(ex):
if not isinstance(ex, PortableBlockingIOError):
return False
return True
PortableBlockingIOError = BlockingIOError
To avoid reading incomplete data, I ended up writing my own readline generator (which returns the byte string for each line).
Its a generator so you can for example...
def non_blocking_readlines(f, chunk=1024):
"""
Iterate over lines, yielding b'' when nothings left
or when new data is not yet available.
stdout_iter = iter(non_blocking_readlines(process.stdout))
line = next(stdout_iter) # will be a line or b''.
"""
import os
from .pipe_non_blocking import (
pipe_non_blocking_set,
pipe_non_blocking_is_error_blocking,
PortableBlockingIOError,
)
fd = f.fileno()
pipe_non_blocking_set(fd)
blocks = []
while True:
try:
data = os.read(fd, chunk)
if not data:
# case were reading finishes with no trailing newline
yield b''.join(blocks)
blocks.clear()
except PortableBlockingIOError as ex:
if not pipe_non_blocking_is_error_blocking(ex):
raise ex
yield b''
continue
while True:
n = data.find(b'\n')
if n == -1:
break
yield b''.join(blocks) + data[:n + 1]
data = data[n + 1:]
blocks.clear()
blocks.append(data)
Not the first and probably not the last, I have built a package that does non blocking stdout PIPE reads with two different methods, one being based on the work of J.F. Sebastian (#jfs)'s answer, the other being a simple communicate() loop with a thread to check for timeouts.
Both stdout capture methods are tested to work both under Linux and Windows, with Python versions from 2.7 to 3.9 as of the time of writing
Being non blocking, it guarantees timeout enforcement, even with multiple child and grandchild processes, and even under Python 2.7.
The package also handles both bytes and text stdout encodings, being a nightmare when trying to catch EOF.
You'll find the package at https://github.com/netinvent/command_runner
If you need some well tested non blocking read implementations, try it out (or hack the code):
pip install command_runner
from command_runner import command_runner
exit_code, output = command_runner('ping 127.0.0.1', timeout=3)
exit_code, output = command_runner('echo hello world, shell=True)
exit_code, output = command_runner('some command', stdout='some_file')
You can find the core non blocking read code in _poll_process() or _monitor_process() depending on the capture method employed.
From there, you can hack your way to what you want, or simply use the whole package to execute your commands as a subprocess replacement.
The select module helps you determine where the next useful input is.
However, you're almost always happier with separate threads. One does a blocking read the stdin, another does wherever it is you don't want blocked.
why bothering thread&queue?
unlike readline(), BufferedReader.read1() wont block waiting for \r\n, it returns ASAP if there is any output coming in.
#!/usr/bin/python
from subprocess import Popen, PIPE, STDOUT
import io
def __main__():
try:
p = Popen( ["ping", "-n", "3", "127.0.0.1"], stdin=PIPE, stdout=PIPE, stderr=STDOUT )
except: print("Popen failed"); quit()
sout = io.open(p.stdout.fileno(), 'rb', closefd=False)
while True:
buf = sout.read1(1024)
if len(buf) == 0: break
print buf,
if __name__ == '__main__':
__main__()
In my case I needed a logging module that catches the output from the background applications and augments it(adding time-stamps, colors, etc.).
I ended up with a background thread that does the actual I/O. Following code is only for POSIX platforms. I stripped non-essential parts.
If someone is going to use this beast for long runs consider managing open descriptors. In my case it was not a big problem.
# -*- python -*-
import fcntl
import threading
import sys, os, errno
import subprocess
class Logger(threading.Thread):
def __init__(self, *modules):
threading.Thread.__init__(self)
try:
from select import epoll, EPOLLIN
self.__poll = epoll()
self.__evt = EPOLLIN
self.__to = -1
except:
from select import poll, POLLIN
print 'epoll is not available'
self.__poll = poll()
self.__evt = POLLIN
self.__to = 100
self.__fds = {}
self.daemon = True
self.start()
def run(self):
while True:
events = self.__poll.poll(self.__to)
for fd, ev in events:
if (ev&self.__evt) != self.__evt:
continue
try:
self.__fds[fd].run()
except Exception, e:
print e
def add(self, fd, log):
assert not self.__fds.has_key(fd)
self.__fds[fd] = log
self.__poll.register(fd, self.__evt)
class log:
logger = Logger()
def __init__(self, name):
self.__name = name
self.__piped = False
def fileno(self):
if self.__piped:
return self.write
self.read, self.write = os.pipe()
fl = fcntl.fcntl(self.read, fcntl.F_GETFL)
fcntl.fcntl(self.read, fcntl.F_SETFL, fl | os.O_NONBLOCK)
self.fdRead = os.fdopen(self.read)
self.logger.add(self.read, self)
self.__piped = True
return self.write
def __run(self, line):
self.chat(line, nl=False)
def run(self):
while True:
try: line = self.fdRead.readline()
except IOError, exc:
if exc.errno == errno.EAGAIN:
return
raise
self.__run(line)
def chat(self, line, nl=True):
if nl: nl = '\n'
else: nl = ''
sys.stdout.write('[%s] %s%s' % (self.__name, line, nl))
def system(command, param=[], cwd=None, env=None, input=None, output=None):
args = [command] + param
p = subprocess.Popen(args, cwd=cwd, stdout=output, stderr=output, stdin=input, env=env, bufsize=0)
p.wait()
ls = log('ls')
ls.chat('go')
system("ls", ['-l', '/'], output=ls)
date = log('date')
date.chat('go')
system("date", output=date)
This is a example to run interactive command in subprocess, and the stdout is interactive by using pseudo terminal. You can refer to: https://stackoverflow.com/a/43012138/3555925
#!/usr/bin/env python
# -*- coding: utf-8 -*-
import os
import sys
import select
import termios
import tty
import pty
from subprocess import Popen
command = 'bash'
# command = 'docker run -it --rm centos /bin/bash'.split()
# save original tty setting then set it to raw mode
old_tty = termios.tcgetattr(sys.stdin)
tty.setraw(sys.stdin.fileno())
# open pseudo-terminal to interact with subprocess
master_fd, slave_fd = pty.openpty()
# use os.setsid() make it run in a new process group, or bash job control will not be enabled
p = Popen(command,
preexec_fn=os.setsid,
stdin=slave_fd,
stdout=slave_fd,
stderr=slave_fd,
universal_newlines=True)
while p.poll() is None:
r, w, e = select.select([sys.stdin, master_fd], [], [])
if sys.stdin in r:
d = os.read(sys.stdin.fileno(), 10240)
os.write(master_fd, d)
elif master_fd in r:
o = os.read(master_fd, 10240)
if o:
os.write(sys.stdout.fileno(), o)
# restore tty settings back
termios.tcsetattr(sys.stdin, termios.TCSADRAIN, old_tty)
My problem is a bit different as I wanted to collect both stdout and stderr from a running process, but ultimately the same since I wanted to render the output in a widget as its generated.
I did not want to resort to many of the proposed workarounds using Queues or additional Threads as they should not be necessary to perform such a common task as running another script and collecting its output.
After reading the proposed solutions and python docs I resolved my issue with the implementation below. Yes it only works for POSIX as I'm using the select function call.
I agree that the docs are confusing and the implementation is awkward for such a common scripting task. I believe that older versions of python have different defaults for Popen and different explanations so that created a lot of confusion. This seems to work well for both Python 2.7.12 and 3.5.2.
The key was to set bufsize=1 for line buffering and then universal_newlines=True to process as a text file instead of a binary which seems to become the default when setting bufsize=1.
class workerThread(QThread):
def __init__(self, cmd):
QThread.__init__(self)
self.cmd = cmd
self.result = None ## return code
self.error = None ## flag indicates an error
self.errorstr = "" ## info message about the error
def __del__(self):
self.wait()
DEBUG("Thread removed")
def run(self):
cmd_list = self.cmd.split(" ")
try:
cmd = subprocess.Popen(cmd_list, bufsize=1, stdin=None
, universal_newlines=True
, stderr=subprocess.PIPE
, stdout=subprocess.PIPE)
except OSError:
self.error = 1
self.errorstr = "Failed to execute " + self.cmd
ERROR(self.errorstr)
finally:
VERBOSE("task started...")
import select
while True:
try:
r,w,x = select.select([cmd.stdout, cmd.stderr],[],[])
if cmd.stderr in r:
line = cmd.stderr.readline()
if line != "":
line = line.strip()
self.emit(SIGNAL("update_error(QString)"), line)
if cmd.stdout in r:
line = cmd.stdout.readline()
if line == "":
break
line = line.strip()
self.emit(SIGNAL("update_output(QString)"), line)
except IOError:
pass
cmd.wait()
self.result = cmd.returncode
if self.result < 0:
self.error = 1
self.errorstr = "Task terminated by signal " + str(self.result)
ERROR(self.errorstr)
return
if self.result:
self.error = 1
self.errorstr = "exit code " + str(self.result)
ERROR(self.errorstr)
return
return
ERROR, DEBUG and VERBOSE are simply macros that print output to the terminal.
This solution is IMHO 99.99% effective as it still uses the blocking readline function, so we assume the sub process is nice and outputs complete lines.
I welcome feedback to improve the solution as I am still new to Python.
I have created a library based on J. F. Sebastian's solution. You can use it.
https://github.com/cenkalti/what
Working from J.F. Sebastian's answer, and several other sources, I've put together a simple subprocess manager. It provides the request non-blocking reading, as well as running several processes in parallel. It doesn't use any OS-specific call (that I'm aware) and thus should work anywhere.
It's available from pypi, so just pip install shelljob. Refer to the project page for examples and full docs.
EDIT: This implementation still blocks. Use J.F.Sebastian's answer instead.
I tried the top answer, but the additional risk and maintenance of thread code was worrisome.
Looking through the io module (and being limited to 2.6), I found BufferedReader. This is my threadless, non-blocking solution.
import io
from subprocess import PIPE, Popen
p = Popen(['myprogram.exe'], stdout=PIPE)
SLEEP_DELAY = 0.001
# Create an io.BufferedReader on the file descriptor for stdout
with io.open(p.stdout.fileno(), 'rb', closefd=False) as buffer:
while p.poll() == None:
time.sleep(SLEEP_DELAY)
while '\n' in bufferedStdout.peek(bufferedStdout.buffer_size):
line = buffer.readline()
# do stuff with the line
# Handle any remaining output after the process has ended
while buffer.peek():
line = buffer.readline()
# do stuff with the line
This solution uses the select module to "read any available data" from an IO stream. This function blocks initially until data is available, but then reads only the data that is available and doesn't block further.
Given the fact that it uses the select module, this only works on Unix.
The code is fully PEP8-compliant.
import select
def read_available(input_stream, max_bytes=None):
"""
Blocks until any data is available, then all available data is then read and returned.
This function returns an empty string when end of stream is reached.
Args:
input_stream: The stream to read from.
max_bytes (int|None): The maximum number of bytes to read. This function may return fewer bytes than this.
Returns:
str
"""
# Prepare local variables
input_streams = [input_stream]
empty_list = []
read_buffer = ""
# Initially block for input using 'select'
if len(select.select(input_streams, empty_list, empty_list)[0]) > 0:
# Poll read-readiness using 'select'
def select_func():
return len(select.select(input_streams, empty_list, empty_list, 0)[0]) > 0
# Create while function based on parameters
if max_bytes is not None:
def while_func():
return (len(read_buffer) < max_bytes) and select_func()
else:
while_func = select_func
while True:
# Read single byte at a time
read_data = input_stream.read(1)
if len(read_data) == 0:
# End of stream
break
# Append byte to string buffer
read_buffer += read_data
# Check if more data is available
if not while_func():
break
# Return read buffer
return read_buffer
I also faced the problem described by Jesse and solved it by using "select" as Bradley, Andy and others did but in a blocking mode to avoid a busy loop. It uses a dummy Pipe as a fake stdin. The select blocks and wait for either stdin or the pipe to be ready. When a key is pressed stdin unblocks the select and the key value can be retrieved with read(1). When a different thread writes to the pipe then the pipe unblocks the select and it can be taken as an indication that the need for stdin is over. Here is some reference code:
import sys
import os
from select import select
# -------------------------------------------------------------------------
# Set the pipe (fake stdin) to simulate a final key stroke
# which will unblock the select statement
readEnd, writeEnd = os.pipe()
readFile = os.fdopen(readEnd)
writeFile = os.fdopen(writeEnd, "w")
# -------------------------------------------------------------------------
def getKey():
# Wait for stdin or pipe (fake stdin) to be ready
dr,dw,de = select([sys.__stdin__, readFile], [], [])
# If stdin is the one ready then read it and return value
if sys.__stdin__ in dr:
return sys.__stdin__.read(1) # For Windows use ----> getch() from module msvcrt
# Must finish
else:
return None
# -------------------------------------------------------------------------
def breakStdinRead():
writeFile.write(' ')
writeFile.flush()
# -------------------------------------------------------------------------
# MAIN CODE
# Get key stroke
key = getKey()
# Keyboard input
if key:
# ... do your stuff with the key value
# Faked keystroke
else:
# ... use of stdin finished
# -------------------------------------------------------------------------
# OTHER THREAD CODE
breakStdinRead()
Try wexpect, which is the windows alternative of pexpect.
import wexpect
p = wexpect.spawn('myprogram.exe')
p.stdout.readline('.') // regex pattern of any character
output_str = p.after()
Here is a module that supports non-blocking reads and background writes in python:
https://pypi.python.org/pypi/python-nonblock
Provides a function,
nonblock_read which will read data from the stream, if available, otherwise return an empty string (or None if the stream is closed on the other side and all possible data has been read)
You may also consider the python-subprocess2 module,
https://pypi.python.org/pypi/python-subprocess2
which adds to the subprocess module. So on the object returned from "subprocess.Popen" is added an additional method, runInBackground. This starts a thread and returns an object which will automatically be populated as stuff is written to stdout/stderr, without blocking your main thread.
Enjoy!

Categories

Resources