python, subprocess: reading output from subprocess

I have the following script:

#!/usr/bin/python
while True:
    x = raw_input()
    print x[::-1]
I am calling it from IPython:
In [5]: p = Popen('./script.py', stdin=PIPE)
In [6]: p.stdin.write('abc\n')
cba
and it works fine.
However, when I do this:
In [7]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [8]: p.stdin.write('abc\n')
In [9]: p.stdout.read()
the interpreter hangs. What am I doing wrong? I would like to be able to both write and read from another process multiple times, to pass some tasks to this process. What do I need to do differently?
EDIT 1
If I use communicate, I get this:
In [7]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [8]: p.communicate('abc\n')
Traceback (most recent call last):
  File "./script.py", line 4, in <module>
    x = raw_input()
EOFError: EOF when reading a line
Out[8]: ('cba\n', None)
EDIT 2
I tried flushing:
#!/usr/bin/python
import sys
while True:
    x = raw_input()
    print x[::-1]
    sys.stdout.flush()
and here:
In [5]: from subprocess import PIPE, Popen
In [6]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [7]: p.stdin.write('abc')
In [8]: p.stdin.flush()
In [9]: p.stdout.read()
but it hangs again.

I believe there are two problems at work here:
1) Your parent script calls p.stdout.read(), which will read all data until end-of-file. However, your child script runs in an infinite loop, so end-of-file will never happen. You probably want p.stdout.readline() instead.
2) In interactive mode, most programs buffer only one line at a time. When run from another program, they buffer much more. Buffering improves efficiency in many cases, but causes problems when two programs need to communicate interactively.
After p.stdin.write('abc\n') add:
p.stdin.flush()
In your subprocess script, after print x[::-1] add the following within the loop:
sys.stdout.flush()
(and import sys at the top)
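Putting the two fixes together, a minimal sketch of the parent side might look like this (assuming ./script.py is the flushing version of the child script):

from subprocess import Popen, PIPE

p = Popen(['./script.py'], stdin=PIPE, stdout=PIPE)
p.stdin.write(b'abc\n')     # include the newline so raw_input() returns
p.stdin.flush()             # push the line through the pipe buffer
print(p.stdout.readline())  # readline() reads one line; read() would wait for EOF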

The subprocess method check_output can be useful for this:
output = subprocess.check_output('./script.py')
And output will be the stdout from the process. If you need stderr, too:
output = subprocess.check_output('./script.py', stderr=subprocess.STDOUT)
Because you avoid managing pipes directly, it may circumvent your issue.
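If the child also needs data on stdin, newer Pythons (3.4+) let check_output feed it via the input argument; a small sketch, keeping in mind that check_output waits for the child to exit, so the child must stop at end-of-file (see the EOFError-handling script variants below):

import subprocess

# the child must exit at EOF for check_output to return
output = subprocess.check_output(['./script.py'], input=b'abc\n')
print(output)  # b'cba\n' for the reversal script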

If you'd like to pass several lines to script.py then you need to read/write simultaneously:
#!/usr/bin/env python
import sys
from subprocess import PIPE, Popen
from threading import Thread

def print_output(out, ntrim=80):
    for line in out:
        print len(line)
        if len(line) > ntrim: # truncate long output
            line = line[:ntrim-2]+'..'
        print line.rstrip()

if __name__=="__main__":
    p = Popen(['python', 'script.py'], stdin=PIPE, stdout=PIPE)
    Thread(target=print_output, args=(p.stdout,)).start()
    for s in ['abc', 'def', 'ab'*10**7, 'ghi']:
        print >>p.stdin, s
    p.stdin.close()
    sys.exit(p.wait()) # NOTE: read http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
Output:
4
cba
4
fed
20000001
bababababababababababababababababababababababababababababababababababababababa..
4
ihg
Where script.py:
#!/usr/bin/env python
"""Print reversed lines."""
while True:
    try:
        x = raw_input()
    except EOFError:
        break # no more input
    else:
        print x[::-1]
Or
#!/usr/bin/env python
"""Print reversed lines."""
import sys
for line in sys.stdin:
    print line.rstrip()[::-1]
Or
#!/usr/bin/env python
"""Print reversed lines."""
import fileinput
for line in fileinput.input(): # accept files specified as command line arguments
    print line.rstrip()[::-1]

You're probably tripping over Python's output buffering. Here's what python --help has to say about it.
-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
see man page for details on internal buffering relating to '-u'
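For example, when the child is itself a Python script, you can pass -u at spawn time; a minimal sketch (script.py being the child from the question):

import subprocess
import sys

# -u makes the child's stdout unbuffered, so no flush() calls are needed in it
p = subprocess.Popen([sys.executable, '-u', 'script.py'],
                     stdin=subprocess.PIPE, stdout=subprocess.PIPE)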

When you are through writing to p.stdin, close it: p.stdin.close()
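For the question's reversal script, closing stdin is what lets the child's raw_input() hit end-of-file so the process can exit; a minimal sketch (assuming the child handles EOF, as in the EOFError-catching variant above):

from subprocess import Popen, PIPE

p = Popen(['./script.py'], stdin=PIPE, stdout=PIPE)
p.stdin.write(b'abc\n')
p.stdin.close()         # the child sees EOF and exits
print(p.stdout.read())  # read() is safe now: it returns once the child exits
p.wait()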

Use communicate() instead of .stdout.read().
Example:
from subprocess import Popen, PIPE
p = Popen('./script.py', stdin=PIPE, stdout=PIPE, stderr=PIPE)
input = 'abc\n'
stdout, stderr = p.communicate(input)
This recommendation comes from the Popen objects section in the subprocess documentation:
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read
to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the
child process.

Related

Using POpen to send a variable to Stdin and to send Stdout to a variable

In a shell script, we have the following command:
/script1.pl < input_file | /script2.pl > output_file
I would like to replicate the above pipeline in Python using the subprocess module. input_file is a large file, and I can't read the whole file at once, so I would like to pass each line (an input_string) into the pipe stream and get back an output_string, repeating until the whole file has been streamed through.
The following is a first attempt:
process = subprocess.Popen(["/script1.pl | /script2.pl"], stdin = subprocess.PIPE, stdout = subprocess.PIPE, shell = True)
process.stdin.write(input_string)
output_string = process.communicate()[0]
However, using process.communicate()[0] closes the stream. I would like to keep the stream open for further input. I have tried using process.stdout.readline() instead, but the program hangs.
To emulate the /script1.pl < input_file | /script2.pl > output_file shell command using the subprocess module in Python:

#!/usr/bin/env python
from subprocess import check_call

with open('input_file', 'rb') as input_file:
    with open('output_file', 'wb') as output_file:
        check_call("/script1.pl | /script2.pl", shell=True,
                   stdin=input_file, stdout=output_file)
You could write it without shell=True (though I don't see a reason here), based on the "17.1.4.2. Replacing shell pipeline" example from the docs:

#!/usr/bin/env python
from subprocess import Popen, PIPE

with open('input_file', 'rb') as input_file:
    script1 = Popen("/script1.pl", stdin=input_file, stdout=PIPE)
with open("output_file", "wb") as output_file:
    script2 = Popen("/script2.pl", stdin=script1.stdout, stdout=output_file)
script1.stdout.close() # allow script1 to receive SIGPIPE if script2 exits
script2.wait()
script1.wait()
You could also use the plumbum module to get shell-like syntax in Python:

#!/usr/bin/env python
from plumbum import local

script1, script2 = local["/script1.pl"], local["/script2.pl"]
(script1 < "input_file" | script2 > "output_file")()
See also How do I use subprocess.Popen to connect multiple processes by pipes?
If you want to read/write line by line then the answer depends on the concrete scripts that you want to run. In general it is easy to deadlock sending/receiving input/output if you are not careful, e.g. due to buffering issues.
If input doesn't depend on output in your case then a reliable cross-platform approach is to use a separate thread for each stream:
#!/usr/bin/env python
from subprocess import Popen, PIPE
from threading import Thread

def pump_input(pipe):
    try:
        for i in xrange(1000000000): # generate large input
            print >>pipe, i
    finally:
        pipe.close()

p = Popen("/script1.pl | /script2.pl", shell=True, stdin=PIPE, stdout=PIPE,
          bufsize=1)
Thread(target=pump_input, args=[p.stdin]).start()
try: # read output line by line as soon as the child flushes its stdout buffer
    for line in iter(p.stdout.readline, b''):
        print line.strip()[::-1] # print reversed lines
finally:
    p.stdout.close()
    p.wait()

live output from subprocess command

I'm using a python script as a driver for a hydrodynamics code. When it comes time to run the simulation, I use subprocess.Popen to run the code, collect the output from stdout and stderr into a subprocess.PIPE --- then I can print (and save to a log-file) the output information, and check for any errors. The problem is, I have no idea how the code is progressing. If I run it directly from the command line, it gives me output about what iteration it's at, what time, what the next time-step is, etc.
Is there a way to both store the output (for logging and error checking), and also produce a live-streaming output?
The relevant section of my code:
ret_val = subprocess.Popen(run_command, stdout=subprocess.PIPE,
                           stderr=subprocess.PIPE, shell=True)
output, errors = ret_val.communicate()
log_file.write(output)
print output
if ret_val.returncode:
    print "RUN failed\n\n%s\n\n" % (errors)
    success = False
if errors:
    log_file.write("\n\n%s\n\n" % errors)
Originally I was piping the run_command through tee so that a copy went directly to the log-file while the stream still printed to the terminal -- but that way I can't store any errors (to my knowledge).
My temporary solution so far:
ret_val = subprocess.Popen(run_command, stdout=log_file, stderr=subprocess.PIPE, shell=True)
while not ret_val.poll():
    log_file.flush()
then, in another terminal, run tail -f log.txt (where log_file = 'log.txt').
TLDR for Python 3:
import subprocess
import sys

with open("test.log", "wb") as f:
    process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
    for c in iter(lambda: process.stdout.read(1), b""):
        sys.stdout.buffer.write(c)
        f.write(c)
You have two ways of doing this: either create an iterator from the read or readline functions and do:
import subprocess
import sys

# replace "w" with "wb" for Python 3
with open("test.log", "w") as f:
    process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
    # replace "" with b"" for Python 3
    for c in iter(lambda: process.stdout.read(1), ""):
        sys.stdout.write(c)
        f.write(c)
or
import subprocess
import sys

# replace "w" with "wb" for Python 3
with open("test.log", "w") as f:
    process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
    # replace "" with b"" for Python 3
    for line in iter(process.stdout.readline, ""):
        sys.stdout.write(line)
        f.write(line)
Or you can create a reader and a writer file. Pass the writer to Popen and read from the reader:
import io
import time
import subprocess
import sys

filename = "test.log"
with io.open(filename, "wb") as writer, io.open(filename, "rb", 1) as reader:
    process = subprocess.Popen(command, stdout=writer)
    while process.poll() is None:
        sys.stdout.write(reader.read())
        time.sleep(0.5)
    # Read the remaining
    sys.stdout.write(reader.read())
This way you will have the data written in the test.log as well as on the standard output.
The only advantage of the file approach is that your code doesn't block. So you can do whatever you want in the meantime and read whenever you want from the reader in a non-blocking way. When you use PIPE, read and readline functions will block until either one character is written to the pipe or a line is written to the pipe respectively.
Executive Summary (or "tl;dr" version): it's easy when there's at most one subprocess.PIPE, otherwise it's hard.
It may be time to explain a bit about how subprocess.Popen does its thing.
(Caveat: this is for Python 2.x, although 3.x is similar; and I'm quite fuzzy on the Windows variant. I understand the POSIX stuff much better.)
The Popen function needs to deal with zero-to-three I/O streams, somewhat simultaneously. These are denoted stdin, stdout, and stderr as usual.
You can provide:
None, indicating that you don't want to redirect the stream. The child will inherit the stream as usual instead. Note that on POSIX systems, at least, this does not mean it will use Python's sys.stdout, just Python's actual stdout; see demo at end.
An int value. This is a "raw" file descriptor (in POSIX at least). (Side note: PIPE and STDOUT are actually ints internally, but are "impossible" descriptors, -1 and -2.)
A stream—really, any object with a fileno method. Popen will find the descriptor for that stream, using stream.fileno(), and then proceed as for an int value.
subprocess.PIPE, indicating that Python should create a pipe.
subprocess.STDOUT (for stderr only): tell Python to use the same descriptor as for stdout. This only makes sense if you provided a (non-None) value for stdout, and even then, it is only needed if you set stdout=subprocess.PIPE. (Otherwise you can just provide the same argument you provided for stdout, e.g., Popen(..., stdout=stream, stderr=stream).)
The easiest cases (no pipes)
If you redirect nothing (leave all three as the default None value or supply explicit None), Popen has it quite easy. It just needs to spin off the subprocess and let it run. Or, if you redirect to a non-PIPE—an int or a stream's fileno()—it's still easy, as the OS does all the work. Python just needs to spin off the subprocess, connecting its stdin, stdout, and/or stderr to the provided file descriptors.
The still-easy case: one pipe
If you redirect only one stream, Popen still has things pretty easy. Let's pick one stream at a time and watch.
Suppose you want to supply some stdin, but let stdout and stderr go un-redirected, or go to a file descriptor. As the parent process, your Python program simply needs to use write() to send data down the pipe. You can do this yourself, e.g.:
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
proc.stdin.write('here, have some data\n') # etc
or you can pass the stdin data to proc.communicate(), which then does the stdin.write shown above. There is no output coming back so communicate() has only one other real job: it also closes the pipe for you. (If you don't call proc.communicate() you must call proc.stdin.close() to close the pipe, so that the subprocess knows there is no more data coming through.)
Suppose you want to capture stdout but leave stdin and stderr alone. Again, it's easy: just call proc.stdout.read() (or equivalent) until there is no more output. Since proc.stdout is a normal Python I/O stream you can use all the normal constructs on it, like:
for line in proc.stdout:
or, again, you can use proc.communicate(), which simply does the read() for you.
If you want to capture only stderr, it works the same as with stdout.
There's one more trick before things get hard. Suppose you want to capture stdout, and also capture stderr but on the same pipe as stdout:
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
In this case, subprocess "cheats"! Well, it has to do this, so it's not really cheating: it starts the subprocess with both its stdout and its stderr directed into the (single) pipe-descriptor that feeds back to its parent (Python) process. On the parent side, there's again only a single pipe-descriptor for reading the output. All the "stderr" output shows up in proc.stdout, and if you call proc.communicate(), the stderr result (second value in the tuple) will be None, not a string.
The hard cases: two or more pipes
The problems all come about when you want to use at least two pipes. In fact, the subprocess code itself has this bit:
def communicate(self, input=None):
    ...
    # Optimization: If we are only using one pipe, or no pipe at
    # all, using select() or threads is unnecessary.
    if [self.stdin, self.stdout, self.stderr].count(None) >= 2:
But, alas, here we've made at least two, and maybe three, different pipes, so the count(None) returns either 1 or 0. We must do things the hard way.
On Windows, this uses threading.Thread to accumulate results for self.stdout and self.stderr, and has the parent thread deliver self.stdin input data (and then close the pipe).
On POSIX, this uses poll if available, otherwise select, to accumulate output and deliver stdin input. All this runs in the (single) parent process/thread.
Threads or poll/select are needed here to avoid deadlock. Suppose, for instance, that we've redirected all three streams to three separate pipes. Suppose further that there's a small limit on how much data can be stuffed into a pipe before the writing process is suspended, waiting for the reading process to "clean out" the pipe from the other end. Let's set that small limit to a single byte, just for illustration. (This is in fact how things work, except that the limit is much bigger than one byte.)
If the parent (Python) process tries to write several bytes—say, 'go\n'—to proc.stdin, the first byte goes in and then the second causes the Python process to suspend, waiting for the subprocess to read the first byte, emptying the pipe.
Meanwhile, suppose the subprocess decides to print a friendly "Hello! Don't Panic!" greeting. The H goes into its stdout pipe, but the e causes it to suspend, waiting for its parent to read that H, emptying the stdout pipe.
Now we're stuck: the Python process is asleep, waiting to finish saying "go", and the subprocess is also asleep, waiting to finish saying "Hello! Don't Panic!".
The subprocess.Popen code avoids this problem with threading-or-select/poll. When bytes can go over the pipes, they go. When they can't, only a thread (not the whole process) has to sleep—or, in the case of select/poll, the Python process waits simultaneously for "can write" or "data available", writes to the process's stdin only when there is room, and reads its stdout and/or stderr only when data are ready. The proc.communicate() code (actually _communicate where the hairy cases are handled) returns once all stdin data (if any) have been sent and all stdout and/or stderr data have been accumulated.
If you want to read both stdout and stderr on two different pipes (regardless of any stdin redirection), you will need to avoid deadlock too. The deadlock scenario here is different—it occurs when the subprocess writes something long to stderr while you're pulling data from stdout, or vice versa—but it's still there.
The Demo
I promised to demonstrate that, un-redirected, Python subprocesses write to the underlying stdout, not sys.stdout. So, here is some code:
from cStringIO import StringIO
import os
import subprocess
import sys

def show1():
    print 'start show1'
    save = sys.stdout
    sys.stdout = StringIO()
    print 'sys.stdout being buffered'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    in_stdout = sys.stdout.getvalue()
    sys.stdout = save
    print 'in buffer:', in_stdout

def show2():
    print 'start show2'
    save = sys.stdout
    sys.stdout = open(os.devnull, 'w')
    print 'after redirect sys.stdout'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    sys.stdout = save

show1()
show2()
When run:
$ python out.py
start show1
hello
in buffer: sys.stdout being buffered
start show2
hello
Note that the first routine will fail if you add stdout=sys.stdout, as a StringIO object has no fileno. The second will omit the hello if you add stdout=sys.stdout since sys.stdout has been redirected to os.devnull.
(If you redirect Python's file-descriptor-1, the subprocess will follow that redirection. The open(os.devnull, 'w') call produces a stream whose fileno() is greater than 2.)
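To see that file-descriptor-level redirection in action, here is a small sketch of my own (not part of the original demo) that redirects descriptor 1 with os.dup2 before spawning:

import os
import subprocess

saved = os.dup(1)                             # keep a copy of the real stdout
devnull = os.open(os.devnull, os.O_WRONLY)
os.dup2(devnull, 1)                           # fd 1 now points at /dev/null
subprocess.Popen(['echo', 'hidden']).wait()   # child inherits fd 1: no output
os.dup2(saved, 1)                             # restore the original stdout
os.close(devnull)
os.close(saved)
subprocess.Popen(['echo', 'visible']).wait()  # this one prints "visible"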
We can also use the default file iterator for reading stdout, instead of using the iter construct with readline().
import subprocess
import sys

process = subprocess.Popen(
    your_command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
)
for line in process.stdout:
    sys.stdout.write(line)
In addition to all these answers, one simple approach could also be as follows:
process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
while process.stdout.readable():
    line = process.stdout.readline()
    if not line:
        break
    print(line.strip())
Loop through the readable stream as long as it's readable, and stop when it gets an empty result.
The key here is that readline() returns a line (with \n at the end) as long as there is output, and an empty string if it's really at the end.
Hope this helps someone.
If you're able to use third-party libraries, you might be able to use something like sarge (disclosure: I'm its maintainer). This library allows non-blocking access to output streams from subprocesses - it's layered over the subprocess module.
If all you need is for the output to be visible on the console, the easiest solution for me was to pass the following arguments to Popen:
with Popen(cmd, stdout=sys.stdout, stderr=sys.stderr) as proc:
which will use your Python script's stdio file handles
Solution 1: Log stdout AND stderr concurrently in realtime
A simple solution which logs both stdout AND stderr concurrently, line-by-line in realtime into a log file.
import subprocess as sp
from concurrent.futures import ThreadPoolExecutor

def log_popen_pipe(p, stdfile):
    with open("mylog.txt", "w") as f:
        while p.poll() is None:
            f.write(stdfile.readline())
            f.flush()
        # Write the rest from the buffer
        f.write(stdfile.read())

with sp.Popen(["ls"], stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    with ThreadPoolExecutor(2) as pool:
        r1 = pool.submit(log_popen_pipe, p, p.stdout)
        r2 = pool.submit(log_popen_pipe, p, p.stderr)
        r1.result()
        r2.result()
Solution 2: A function read_popen_pipes() that allows you to iterate over both pipes (stdout/stderr), concurrently in realtime
import subprocess as sp
from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor

def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()

def read_popen_pipes(p):
    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()
        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)
        while True:
            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break
            out_line = err_line = ''
            try:
                out_line = q_stdout.get_nowait()
            except Empty:
                pass
            try:
                err_line = q_stderr.get_nowait()
            except Empty:
                pass
            yield (out_line, err_line)

# The function in use:
with sp.Popen(["ls"], stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    for out_line, err_line in read_popen_pipes(p):
        print(out_line, end='')
        print(err_line, end='')
    p.poll()
Similar to previous answers, but the following solution worked for me on Windows, using Python 3, to provide a common method to print and log in realtime (source):
def print_and_log(command, logFile):
    with open(logFile, 'wb') as f:
        command = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
        while True:
            output = command.stdout.readline()
            if not output and command.poll() is not None:
                f.close()
                break
            if output:
                f.write(output)
                print(str(output.strip(), 'utf-8'), flush=True)
    return command.poll()
A good but "heavyweight" solution is to use Twisted - see the bottom.
If you're willing to live with only stdout, something along those lines should work:
import subprocess
import sys

popenobj = subprocess.Popen(["ls", "-Rl"], stdout=subprocess.PIPE)
while not popenobj.poll():
    stdoutdata = popenobj.stdout.readline()
    if stdoutdata:
        sys.stdout.write(stdoutdata)
    else:
        break
print "Return code", popenobj.returncode
(If you use read() it tries to read the entire "file", which isn't useful; what we really could use here is something that reads all the data that's in the pipe right now.)
One might also try to approach this with threading, e.g.:
import subprocess
import sys
import threading

popenobj = subprocess.Popen("ls", stdout=subprocess.PIPE, shell=True)

def stdoutprocess(o):
    while True:
        stdoutdata = o.stdout.readline()
        if stdoutdata:
            sys.stdout.write(stdoutdata)
        else:
            break

t = threading.Thread(target=stdoutprocess, args=(popenobj,))
t.start()
popenobj.wait()
t.join()
print "Return code", popenobj.returncode
Now we could potentially add stderr as well by having two threads.
Note however that the subprocess docs discourage using these files directly and recommend using communicate() (mostly concerned with deadlocks, which I think isn't an issue above), and the solutions are a little clunky, so it really seems like the subprocess module isn't quite up to the job (also see: http://www.python.org/dev/peps/pep-3145/) and we need to look at something else.
A more involved solution is to use Twisted as shown here: https://twistedmatrix.com/documents/11.1.0/core/howto/process.html
The way you do this with Twisted is to create your process using reactor.spawnprocess() and providing a ProcessProtocol that then processes output asynchronously. The Twisted sample Python code is here: https://twistedmatrix.com/documents/11.1.0/core/howto/listings/process/process.py
Based on all the above I suggest a slightly modified version (Python 3):
while loop calling readline (the iter solution suggested seemed to block forever for me - Python 3, Windows 7)
structured so handling of read data does not need to be duplicated after poll returns not-None
stderr piped into stdout so both outputs are read
added code to get the exit value of cmd
Code:
import subprocess
import time

proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, universal_newlines=True)
while True:
    rd = proc.stdout.readline()
    print(rd, end='') # and whatever you want to do...
    if not rd: # EOF
        returncode = proc.poll()
        if returncode is not None:
            break
        time.sleep(0.1) # cmd closed stdout, but not exited yet
# You may want to check on returncode here
I found a simple solution to a much more complicated problem.
Both stdout and stderr need to be streamed.
Both of them need to be non-blocking: when there is no output and when there is too much output.
I did not want to use threading or multiprocessing, and was not willing to use pexpect.
This solution uses a gist I found here
import subprocess as sbp
import fcntl
import os

def non_block_read(output):
    fd = output.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
    try:
        return output.readline()
    except:
        return ""

with sbp.Popen('find / -name fdsfjdlsjf',
               shell=True,
               universal_newlines=True,
               encoding='utf-8',
               bufsize=1,
               stdout=sbp.PIPE,
               stderr=sbp.PIPE) as p:
    while True:
        out = non_block_read(p.stdout)
        err = non_block_read(p.stderr)
        if out:
            print(out, end='')
        if err:
            print('E: ' + err, end='')
        if p.poll() is not None:
            break
It looks like line-buffered output will work for you, in which case something like the following might suit. (Caveat: it's untested.) This will only give the subprocess's stdout in real time. If you want to have both stderr and stdout in real time, you'll have to do something more complex with select.
proc = subprocess.Popen(run_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
while proc.poll() is None:
    line = proc.stdout.readline()
    print line
    log_file.write(line + '\n')
# Might still be data on stdout at this point. Grab any
# remainder.
for line in proc.stdout.read().split('\n'):
    print line
    log_file.write(line + '\n')
# Do whatever you want with proc.stderr here...
Why not set stdout directly to sys.stdout? And if you need to output to a log as well, then you can simply override the write method of f.
import sys
import subprocess

class SuperFile(open.__class__):
    def write(self, data):
        sys.stdout.write(data)
        super(SuperFile, self).write(data)

f = SuperFile("log.txt", "w+")
process = subprocess.Popen(command, stdout=f, stderr=f)
All of the above solutions I tried failed either to separate stderr and stdout output (multiple pipes), or blocked forever when the OS pipe buffer was full, which happens when the command you are running outputs too fast (there is a warning about this in the Python subprocess manual under poll()). The only reliable way I found was through select, but this is a POSIX-only solution:
import subprocess
import sys
import os
import select
from errno import EINTR

# returns command exit status, stdout text, stderr text
# rtoutput: show realtime output while running
def run_script(cmd, rtoutput=0):
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    poller = select.poll()
    poller.register(p.stdout, select.POLLIN)
    poller.register(p.stderr, select.POLLIN)

    coutput = ''
    cerror = ''
    fdhup = {}
    fdhup[p.stdout.fileno()] = 0
    fdhup[p.stderr.fileno()] = 0
    while sum(fdhup.values()) < len(fdhup):
        try:
            r = poller.poll(1)
        except select.error, err:
            if err.args[0] != EINTR:
                raise
            r = []
        for fd, flags in r:
            if flags & (select.POLLIN | select.POLLPRI):
                c = os.read(fd, 1024)
                if rtoutput:
                    sys.stdout.write(c)
                    sys.stdout.flush()
                if fd == p.stderr.fileno():
                    cerror += c
                else:
                    coutput += c
            else:
                fdhup[fd] = 1
    return p.poll(), coutput.strip(), cerror.strip()
None of the Pythonic solutions worked for me.
It turned out that proc.stdout.read() or similar may block forever.
Therefore, I use tee like this:
subprocess.run('./my_long_running_binary 2>&1 | tee -a my_log_file.txt && exit ${PIPESTATUS}', shell=True, check=True, executable='/bin/bash')
This solution is convenient if you are already using shell=True.
${PIPESTATUS} captures the success status of the entire command chain (only available in Bash).
If I omitted the && exit ${PIPESTATUS}, then this would always return zero since tee never fails.
unbuffer might be necessary for printing each line immediately into the terminal, instead of waiting way too long until the "pipe buffer" gets filled.
However, unbuffer swallows the exit status of assert (SIG Abort)...
2>&1 also logs stderr to the file.
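If you do want to try unbuffer despite the exit-status caveat, wrapping the command might look like the following sketch (unbuffer comes from the expect package and may not be installed; my_long_running_binary is the placeholder name from above):

import subprocess

# unbuffer forces line-buffered output even when writing into a pipe
p = subprocess.Popen(['unbuffer', './my_long_running_binary'],
                     stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b''):
    print(line.decode(), end='')
p.wait()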
I think that the subprocess.communicate method is a bit misleading: it actually fills the stdout and stderr that you specify in the subprocess.Popen.
Yet, reading from the subprocess.PIPE that you can provide to the subprocess.Popen's stdout and stderr parameters will eventually fill up OS pipe buffers and deadlock your app (especially if you've multiple processes/threads that must use subprocess).
My proposed solution is to provide the stdout and stderr with files - and read the files' content instead of reading from the deadlocking PIPE. These files can be tempfile.NamedTemporaryFile() - which can also be accessed for reading while they're being written into by subprocess.communicate.
Below is a sample usage:
try:
    with ProcessRunner(
        ("python", "task.py"), env=os.environ.copy(), seconds_to_wait=0.01
    ) as process_runner:
        for out in process_runner:
            print(out)
except ProcessError as e:
    print(e.error_message)
    raise
And this is the source code which is ready to be used with as many comments as I could provide to explain what it does:
If you're using Python 2, please make sure to first install the latest version of the subprocess32 package from PyPI.
import os
import sys
import threading
import time
import tempfile
import logging

if os.name == 'posix' and sys.version_info[0] < 3:
    # Support python 2
    import subprocess32 as subprocess
else:
    # Get latest and greatest from python 3
    import subprocess

logger = logging.getLogger(__name__)


class ProcessError(Exception):
    """Base exception for errors related to running the process"""


class ProcessTimeout(ProcessError):
    """Error that will be raised when the process execution will exceed a timeout"""


class ProcessRunner(object):
    def __init__(self, args, env=None, timeout=None, bufsize=-1, seconds_to_wait=0.25, **kwargs):
        """
        Constructor facade to subprocess.Popen that receives parameters which are more specifically required for the
        Process Runner. This is a class that should be used as a context manager - and that provides an iterator
        for reading captured output from subprocess.communicate in near realtime.

        Example usage:
            try:
                with ProcessRunner(('python', task_file_path), env=os.environ.copy(), seconds_to_wait=0.01) as process_runner:
                    for out in process_runner:
                        print(out)
            except ProcessError as e:
                print(e.error_message)
                raise

        :param args: same as subprocess.Popen
        :param env: same as subprocess.Popen
        :param timeout: same as subprocess.communicate
        :param bufsize: same as subprocess.Popen
        :param seconds_to_wait: time to wait between each readline from the temporary file
        :param kwargs: same as subprocess.Popen
        """
        self._seconds_to_wait = seconds_to_wait
        self._process_has_timed_out = False
        self._timeout = timeout
        self._process_done = False
        self._std_file_handle = tempfile.NamedTemporaryFile()
        self._process = subprocess.Popen(args, env=env, bufsize=bufsize,
                                         stdout=self._std_file_handle, stderr=self._std_file_handle, **kwargs)
        self._thread = threading.Thread(target=self._run_process)
        self._thread.daemon = True

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self._thread.join()
        self._std_file_handle.close()

    def __iter__(self):
        # read all output from stdout file that subprocess.communicate fills
        with open(self._std_file_handle.name, 'r') as stdout:
            # while process is alive, keep reading data
            while not self._process_done:
                out = stdout.readline()
                out_without_trailing_whitespaces = out.rstrip()
                if out_without_trailing_whitespaces:
                    # yield stdout data without trailing \n
                    yield out_without_trailing_whitespaces
                else:
                    # if there is nothing to read, then please wait a tiny little bit
                    time.sleep(self._seconds_to_wait)

            # this is a hack: terraform seems to write to buffer after process has finished
            out = stdout.read()
            if out:
                yield out

        if self._process_has_timed_out:
            raise ProcessTimeout('Process has timed out')

        if self._process.returncode != 0:
            raise ProcessError('Process has failed')

    def _run_process(self):
        try:
            # Start gathering information (stdout and stderr) from the opened process
            self._process.communicate(timeout=self._timeout)
            # Graceful termination of the opened process
            self._process.terminate()
        except subprocess.TimeoutExpired:
            self._process_has_timed_out = True
            # Force termination of the opened process
            self._process.kill()

        self._process_done = True

    @property
    def return_code(self):
        return self._process.returncode
Here is a class which I'm using in one of my projects. It redirects output of a subprocess to the log. At first I tried simply overwriting the write-method but that doesn't work as the subprocess will never call it (redirection happens on filedescriptor level). So I'm using my own pipe, similar to how it's done in the subprocess-module. This has the advantage of encapsulating all logging/printing logic in the adapter and you can simply pass instances of the logger to Popen: subprocess.Popen("/path/to/binary", stderr = LogAdapter("foo"))
class LogAdapter(threading.Thread):
    def __init__(self, logname, level=logging.INFO):
        super().__init__()
        self.log = logging.getLogger(logname)
        self.readpipe, self.writepipe = os.pipe()

        logFunctions = {
            logging.DEBUG: self.log.debug,
            logging.INFO: self.log.info,
            logging.WARN: self.log.warn,
            logging.ERROR: self.log.error,
        }

        try:
            self.logFunction = logFunctions[level]
        except KeyError:
            self.logFunction = self.log.info

    def fileno(self):
        # when fileno is called this indicates the subprocess is about to fork => start the thread
        self.start()
        return self.writepipe

    def finished(self):
        """If the write-filedescriptor is not closed this thread will
        prevent the whole program from exiting. You can use this method
        to clean up after the subprocess has terminated."""
        os.close(self.writepipe)

    def run(self):
        inputFile = os.fdopen(self.readpipe)
        while True:
            line = inputFile.readline()
            if len(line) == 0:
                # no new data was added
                break
            self.logFunction(line.strip())
If you don't need logging but simply want to use print() you can obviously remove large portions of the code and keep the class shorter. You could also expand it with __enter__ and __exit__ methods, calling finished in __exit__, so that you could easily use it as a context manager; a sketch follows.
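A minimal sketch of that context-manager extension (an untested illustration; the subclass name is mine):

class ManagedLogAdapter(LogAdapter):
    """LogAdapter usable in a with-statement."""
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.finished()  # close the write end so run() sees EOF
        return False

# Usage:
# with ManagedLogAdapter("foo") as log:
#     subprocess.Popen("/path/to/binary", stderr=log).wait()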
import os

def execute(cmd, callback):
    for line in iter(os.popen(cmd).readline, ''):
        callback(line[:-1])

execute('ls -a', print)
Had the same problem and worked out a simple and clean solution using process.stdout.read1(), which works perfectly for my needs in Python 3.
Here is a demo using the ping command (requires internet connection):
from subprocess import Popen, PIPE

cmd = "ping 8.8.8.8"
proc = Popen([cmd], shell=True, stdout=PIPE)
while True:
    print(proc.stdout.read1())
Every second or so a new line is printed in the python console as the ping command reports its data in real time.
In my view "live output from subprocess command" means that both stdout and stderr should be live. And stdin should also be delivered to the subprocess.
The fragment below produces live output on stdout and stderr and also captures them as bytes in outcome.{stdout,stderr}.
The trick involves proper use of select and poll.
Works well for me on Python 3.9.
if self.log == 1:
    print(f"** cmnd= {fullCmndStr}")
self.outcome.stdcmnd = fullCmndStr
try:
    process = subprocess.Popen(
        fullCmndStr,
        shell=True,
        encoding='utf8',
        executable="/bin/bash",
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
except OSError:
    self.outcome.error = OSError
else:
    process.stdin.write(stdin)
    process.stdin.close()  # type: ignore

    stdoutStrFile = io.StringIO("")
    stderrStrFile = io.StringIO("")

    pollStdout = select.poll()
    pollStderr = select.poll()
    pollStdout.register(process.stdout, select.POLLIN)
    pollStderr.register(process.stderr, select.POLLIN)

    stdoutEOF = False
    stderrEOF = False
    while True:
        stdoutActivity = pollStdout.poll(0)
        if stdoutActivity:
            c = process.stdout.read(1)
            if c:
                stdoutStrFile.write(c)
                if self.log == 1:
                    sys.stdout.write(c)
            else:
                stdoutEOF = True
        stderrActivity = pollStderr.poll(0)
        if stderrActivity:
            c = process.stderr.read(1)
            if c:
                stderrStrFile.write(c)
                if self.log == 1:
                    sys.stderr.write(c)
            else:
                stderrEOF = True
        if stdoutEOF and stderrEOF:
            break

    if self.log == 1:
        print(f"** cmnd={fullCmndStr}")
    process.wait()  # type: ignore
    self.outcome.stdout = stdoutStrFile.getvalue()
    self.outcome.stderr = stderrStrFile.getvalue()
    self.outcome.error = process.returncode  # type: ignore
The only way I've found to read a subprocess' output in a streaming fashion (while also capturing it in a variable) in Python, for multiple output streams (i.e. both stdout and stderr), is by passing the subprocess a named temporary file to write to and then opening the same temporary file in a separate reading handle.
Note: this is for Python 3
import io
import subprocess
import sys
import tempfile
import time

stdout_write = tempfile.NamedTemporaryFile()
stdout_read = io.open(stdout_write.name, "r")
stderr_write = tempfile.NamedTemporaryFile()
stderr_read = io.open(stderr_write.name, "r")
stdout_captured = ""
stderr_captured = ""

proc = subprocess.Popen(["command"], stdout=stdout_write, stderr=stderr_write)
while True:
    proc_done: bool = proc.poll() is not None

    while True:
        content = stdout_read.read(1024)
        sys.stdout.write(content)
        stdout_captured += content
        if len(content) < 1024:
            break

    while True:
        content = stderr_read.read(1024)
        sys.stderr.write(content)
        stderr_captured += content
        if len(content) < 1024:
            break

    if proc_done:
        break
    time.sleep(0.1)

stdout_write.close()
stdout_read.close()
stderr_write.close()
stderr_read.close()
However, if you don't need to capture the output, then you can simply pass the sys.stdout and sys.stderr streams from your Python script to the called subprocess, as xaav suggested in his answer:
subprocess.Popen(["command"], stdout=sys.stdout, stderr=sys.stderr)

printing stdout in realtime from a subprocess that requires stdin

This is a follow-up to this question, but if I want to pass an argument to stdin to subprocess, how can I get the output in real time? This is what I currently have; I also tried replacing Popen with call from the subprocess module and this just led to the script hanging.
from subprocess import Popen, PIPE, STDOUT
cmd = 'rsync --rsh=ssh -rv --files-from=- thisdir/ servername:folder/'
p = Popen(cmd.split(), stdout=PIPE, stdin=PIPE, stderr=STDOUT)
subfolders = '\n'.join(['subfolder1','subfolder2'])
output = p.communicate(input=subfolders)[0]
print output
In the former question, where I did not have to pass anything to stdin, I was advised to use p.stdout.readline; but there is no room there to pipe anything to stdin.
Addendum: This works for the transfer, but I see the output only at the end and I would like to see the details of the transfer while it's happening.
In order to grab stdout from the subprocess in real time you need to decide exactly what behavior you want; specifically, you need to decide whether you want to deal with the output line-by-line or character-by-character, and whether you want to block while waiting for output or be able to do something else while waiting.
It looks like it will probably suffice for your case to read the output in line-buffered fashion, blocking until each complete line comes in, which means the convenience functions provided by subprocess are good enough:
p = subprocess.Popen(some_cmd, stdout=subprocess.PIPE)
# Grab stdout line by line as it becomes available. This will loop until
# p terminates.
while p.poll() is None:
    l = p.stdout.readline() # This blocks until it receives a newline.
    print l
# When the subprocess terminates there might be unconsumed output
# that still needs to be processed.
print p.stdout.read()
If you need to write to the stdin of the process, just use another pipe:
p = subprocess.Popen(some_cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
# Send input to p.
p.stdin.write("some input\n")
p.stdin.flush()
# Now start grabbing output.
while p.poll() is None:
    l = p.stdout.readline()
    print l
print p.stdout.read()
Pace the other answer, there's no need to indirect through a file in order to pass input to the subprocess.
Something like this, I think:
from subprocess import Popen, PIPE, STDOUT

p = Popen('c:/python26/python printingTest.py', stdout=PIPE,
          stderr=PIPE)
for line in iter(p.stdout.readline, ''):
    print line
p.stdout.close()
Using an iterator will return live results, basically.
In order to send input to stdin you would need something like:
other_input = "some extra input stuff"
with open("to_input.txt", "w") as f:
    f.write(other_input)

p = Popen('c:/python26/python printingTest.py < some_input_redirection_thing',
          stdin=open("to_input.txt"),
          stdout=PIPE,
          stderr=PIPE)
this would be similar to the Linux shell command of
%prompt%> some_file.o < to_input.txt
See alp's answer for a better way of passing to stdin.
If you pass all your input before you start reading the output, and if by "real-time" you mean whenever the subprocess flushes its stdout buffer:
from subprocess import Popen, PIPE, STDOUT

cmd = 'rsync --rsh=ssh -rv --files-from=- thisdir/ servername:folder/'
p = Popen(cmd.split(), stdout=PIPE, stdin=PIPE, stderr=STDOUT, bufsize=1)
subfolders = '\n'.join(['subfolder1', 'subfolder2'])
p.stdin.write(subfolders)
p.stdin.close() # eof
for line in iter(p.stdout.readline, ''):
    print line, # do something with the output here
p.stdout.close()
rc = p.wait()

How to reuse intermediate results of Popen in Python?

The code is like this:
from subprocess import Popen, PIPE

p1 = Popen("command1", stdout=PIPE)
p2 = Popen("command2", stdin=p1.stdout, stdout=PIPE)
result_a = p2.communicate()[0]

p1_again = Popen("command1", stdout=PIPE)
p3 = Popen("command3", stdin=p1_again.stdout, stdout=PIPE)
result_b = p3.communicate()[0]

with open("test") as tf:
    p1_again_again = Popen("command1", stdout=tf)
    p1_again_again.communicate()
The bad part is:
command1 is executed three times, because once I use communicate, the stdout of that Popen object can't be used again. I was just wondering whether there's a way to reuse the intermediate results of the PIPE.
Does anyone have ideas about how to make this code better (better performance as well as fewer lines of code)? Thanks!
Here is a working solution. I have put in example commands for cmd1, cmd2, and cmd3 so that you can run it. It just takes the output from the first command, uppercases it in one command, and lowercases it in the other.
code
from subprocess import Popen, PIPE
from tempfile import TemporaryFile

cmd1 = ['echo', 'Hi']
cmd2 = ['tr', '[:lower:]', '[:upper:]']
cmd3 = ['tr', '[:upper:]', '[:lower:]']

with TemporaryFile() as f:
    p = Popen(cmd1, stdout=f)
    ret_code = p.wait()
    f.flush()
    f.seek(0)
    out2 = Popen(cmd2, stdin=f, stdout=PIPE).stdout.read()
    f.seek(0)
    out3 = Popen(cmd3, stdin=f, stdout=PIPE).stdout.read()
print out2, out3
output
HI
hi
Some things to make note of in the solution: the tempfile module is always a great way to go when needing to work with temp files; it will automatically delete the temporary file as a cleanup once the with statement exits, even if an IO exception was thrown somewhere in the with block. cmd1 is run once and its output written to the temp file; wait() is called to make sure all execution has completed; then we do seek(0) each time so that when we call the read() method on f it is back at the start of the file. As a reference, the question Saving stdout from subprocess.Popen to file helped me in getting the first part of the solution.
If you can read all output of command1 in memory and then run command2, command3 one after another:
#!/usr/bin/env python
from subprocess import Popen, PIPE, check_output as qx

cmd1_output = qx(['ls']) # get all output
# run commands in sequence
results = [Popen(cmd, stdin=PIPE, stdout=PIPE).communicate(cmd1_output)[0]
           for cmd in [['cat'], ['tr', 'a-z', 'A-Z']]]
Or you can write to a temporary file first, if command1 generates a gigantic output that can't fit in memory, as @Marwan Alsabbagh suggested:
#!/usr/bin/env python
import tempfile
from subprocess import check_call, check_output as qx

with tempfile.TemporaryFile() as file: # deleted automatically on closing
    # run command1, wait for completion
    check_call(['ls'], stdout=file)
    # run commands in sequence
    results = []
    for cmd in [['cat'], ['tr', 'a-z', 'A-Z']]:
        file.seek(0)
        results.append(qx(cmd, stdin=file))
To handle input/output to/from subprocesses in parallel you could use threading:
#!/usr/bin/env python3
from contextlib import ExitStack # pip install contextlib2 (stdlib since 3.3)
from subprocess import Popen, PIPE
from threading import Thread

def tee(fin, *files):
    try:
        for chunk in iter(lambda: fin.read(1 << 10), b''):
            for f in files: # fan out
                f.write(chunk)
    finally:
        for f in (fin,) + files:
            try:
                f.close()
            except OSError:
                pass

with ExitStack() as stack:
    # run commands asynchronously
    source_proc = Popen(["command1", "arg1"], stdout=PIPE)
    stack.callback(source_proc.wait)
    stack.callback(source_proc.stdout.close)

    processes = []
    for command in [["tr", "a-z", "A-Z"], ["cat"]]:
        processes.append(Popen(command, stdin=PIPE, stdout=PIPE))
        stack.callback(processes[-1].wait)
        stack.callback(processes[-1].stdout.close) # use .terminate()
        stack.callback(processes[-1].stdin.close)  # if it doesn't kill it

    fout = open("test.txt", "wb")
    stack.callback(fout.close)

    # fan out source_proc's output
    Thread(target=tee, args=([source_proc.stdout, fout] +
                             [p.stdin for p in processes])).start()

    # collect results in parallel
    results = [[] for _ in range(len(processes))]
    threads = [Thread(target=r.extend, args=[iter(p.stdout.readline, b'')])
               for p, r in zip(processes, results)]
    for t in threads: t.start()
    for t in threads: t.join() # wait for completion
I've used ExitStack here for a proper clean up in case of exceptions.

catching stdout in realtime from subprocess

I want to subprocess.Popen() rsync.exe in Windows, and print the stdout in Python.
My code works, but it doesn't catch the progress until a file transfer is done! I want to print the progress for each file in real time.
Using Python 3.1 now since I heard it should be better at handling IO.
import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
Some rules of thumb for subprocess.
Never use shell=True. It needlessly invokes an extra shell process to call your program.
When calling processes, arguments are passed around as lists. sys.argv in python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
Don't redirect stderr to a PIPE when you're not reading it.
Don't redirect stdin when you're not writing to it.
Example:
import subprocess, time, os, sys

cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

for line in iter(p.stdout.readline, b''):
    print(">>> " + line.rstrip())
That said, it is probable that rsync buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior - when connected to a pipe, programs must explicitly flush stdout for realtime results, otherwise the standard C library will buffer.
To test for that, try running this instead:
cmd = [sys.executable, 'test_out.py']
and create a test_out.py file with the contents:
import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")
Executing that subprocess should give you "Hello" and wait 10 seconds before giving "World". If that happens with the python code above and not with rsync, that means rsync itself is buffering output, so you are out of luck.
A solution would be to connect direct to a pty, using something like pexpect.
I know this is an old topic, but there is a solution now. Call rsync with the option --outbuf=L. Example:
cmd = ['rsync', '-arzv', '--backup', '--outbuf=L', 'source/', 'dest']
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b''):
    print '>>> {}'.format(line.rstrip())
Depending on the use case, you might also want to disable the buffering in the subprocess itself.
If the subprocess will be a Python process, you could do this before the call:
os.environ["PYTHONUNBUFFERED"] = "1"
Or alternatively pass this in the env argument to Popen.
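Passing it via env might look like this (a sketch; your_command stands in for the real command):

import os
import subprocess

env = os.environ.copy()
env["PYTHONUNBUFFERED"] = "1"  # only affects Python child processes
process = subprocess.Popen(your_command, stdout=subprocess.PIPE, env=env)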
Otherwise, if you are on Linux/Unix, you can use the stdbuf tool. E.g. like:
cmd = ["stdbuf", "-oL"] + cmd
See also here about stdbuf or other options.
On Linux, I had the same problem of getting rid of the buffering. I finally used "stdbuf -o0" (or, unbuffer from expect) to get rid of the PIPE buffering.
proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)
stdout = proc.stdout
I could then use select.select on stdout.
See also https://unix.stackexchange.com/questions/25372/
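The select.select loop over those pipes might look roughly like this; a sketch of my own, using os.read so a partial line cannot block the loop (cmd as in the snippet above):

import os
import select
from subprocess import Popen, PIPE

proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)
while proc.poll() is None:
    ready, _, _ = select.select([proc.stdout, proc.stderr], [], [], 1.0)
    for stream in ready:
        data = os.read(stream.fileno(), 1024)  # whatever is available right now
        if data:
            print(data.decode(), end='')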
for line in p.stdout:
...
always blocks until the next line-feed.
For "real-time" behaviour you have to do something like this:
while True:
    inchar = p.stdout.read(1)
    if inchar: # neither empty string nor None
        print(str(inchar), end='') # or end=None to flush immediately
    else:
        print('') # flush for implicit line-buffering
        break
The while-loop is left when the child process closes its stdout or exits.
read()/read(-1) would block until the child process closed its stdout or exited.
Your problem is:
for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
the iterator itself has extra buffering.
Try doing it like this:
while True:
    line = p.stdout.readline()
    if not line:
        break
    print line
You cannot get stdout to print unbuffered to a pipe (unless you can rewrite the program that prints to stdout), so here is my solution:
Redirect stdout to stderr, which is not buffered. '<cmd> 1>&2' should do it. Open the process as follows: myproc = subprocess.Popen('<cmd> 1>&2', stderr=subprocess.PIPE)
You cannot distinguish stdout from stderr, but you get all output immediately.
Hope this helps anyone tackling this problem.
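Spelled out, the redirect trick might look like this (a sketch; shell=True is needed for the 1>&2 redirection to happen, and mycmd is a placeholder):

import subprocess

myproc = subprocess.Popen('mycmd 1>&2', shell=True, stderr=subprocess.PIPE)
for line in iter(myproc.stderr.readline, b''):
    print(line.decode(), end='')
myproc.wait()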
To avoid caching of output you might want to try pexpect:

import pexpect

child = pexpect.spawn(launchcmd, args, timeout=None)
while True:
    try:
        child.expect('\n')
        print(child.before)
    except pexpect.EOF:
        break
PS : I know this question is pretty old, still providing the solution which worked for me.
PPS: got this answer from another question
p = subprocess.Popen(command,
                     bufsize=0,
                     universal_newlines=True)
I am writing a GUI for rsync in Python, and have the same problems. This problem troubled me for several days until I found this in pyDoc:
If universal_newlines is True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.
It seems that rsync will output '\r' when a transfer is going on.
If you run something like this in a thread and save the ffmpeg_time value in a property so you can access it, it works very nicely:
input = 'path/input_file.mp4'
output = 'path/input_file.mp4'
command = "ffmpeg -y -v quiet -stats -i \"" + str(input) + "\" -metadata title=\"#alaa_sanatisharif\" -preset ultrafast -vcodec copy -r 50 -vsync 1 -async 1 \"" + output + "\""
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True, shell=True)
for line in process.stdout:
    reg = re.search('\d\d:\d\d:\d\d', line)
    ffmpeg_time = reg.group(0) if reg else ''
    print(ffmpeg_time)
Change the stdout from the rsync process to be unbuffered.
p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0, # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)
I've noticed that there is no mention of using a temporary file as an intermediate. The following gets around the buffering issues by outputting to a temporary file and allows you to parse the data coming from rsync without connecting to a pty. I tested the following on a linux box, and the output of rsync tends to differ across platforms, so the regular expressions to parse the output may vary:

import subprocess, time, tempfile, re

pipe_output, file_name = tempfile.mkstemp()
cmd = ["rsync", "-vaz", "-P", "/src/", "/dest"]
p = subprocess.Popen(cmd, stdout=pipe_output,
                     stderr=subprocess.STDOUT)
while p.poll() is None:
    # p.poll() returns None while the program is still running
    # sleep for 1 second
    time.sleep(1)
    last_line = open(file_name).readlines()
    # it's possible that it hasn't output yet, so continue
    if len(last_line) == 0: continue
    last_line = last_line[-1]
    # Matching to "[bytes downloaded] number% [speed] number:number:number"
    match_it = re.match(".* ([0-9]*)%.* ([0-9]*:[0-9]*:[0-9]*).*", last_line)
    if not match_it: continue
    # in this case, the percentage is stored in match_it.group(1),
    # time in match_it.group(2). We could do something with it here...
In Python 3, here's a solution, which takes a command off the command line and delivers real-time nicely decoded strings as they are received.
Receiver (receiver.py):
import subprocess
import sys

cmd = sys.argv[1:]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print("received: {}".format(line.rstrip().decode("utf-8")))
Example simple program that could generate real-time output (dummy_out.py):
import time
import sys

for i in range(5):
    print("hello {}".format(i))
    sys.stdout.flush()
    time.sleep(1)
Output:
$ python receiver.py python dummy_out.py
received: hello 0
received: hello 1
received: hello 2
received: hello 3
received: hello 4
