Subprocess popen: Why do all writes happen at once in child process? - python

I have two scripts, one controlling the other and communicating to it via stdin. The parent script:
import subprocess
import time

p = subprocess.Popen(['python', 'read_from_stdin.py'],
                     stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.PIPE)
for i in range(0, 10):
    p.stdin.write(str(i))
    p.stdin.write('\r\n')  # \n is not sufficient on Windows
    p.stdin.flush()
    print i
    time.sleep(1)
p.stdin.close()
The child script (called 'read_from_stdin.py'):
import sys
import datetime
with open(datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S') + '.txt', 'w') as f:
    for line in sys.stdin:
        f.write(datetime.datetime.now().isoformat() + ' ' + line)
In the file created by the child script, all the inputs have the same timestamp, despite being written a second apart by the parent script and despite the use of flush().

This is the read-ahead bug in Python 2: for line in sys.stdin: doesn't yield anything until its internal buffer is full. Use for line in iter(sys.stdin.readline, '') to work around it.

EDIT: As per Karoly Horvath's comment below, it's not that it waits for EOF; there's buffering. The different child script below does work as expected.
I found this question on the subject: How do you read from stdin in Python?
A fair way down in the answers is:
The answer proposed by others:
for line in sys.stdin:
    print line
is very simple and pythonic, but it must be noted that the script will
wait until EOF before starting to iterate on the lines of input.
This child script behaves as expected:
import sys
import datetime
with open(datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S') + '.txt', 'w') as f:
    line = sys.stdin.readline()
    while line:
        f.write(datetime.datetime.now().isoformat() + ' ' + line)
        line = sys.stdin.readline()
    f.write('Finished')
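The same child script written with the iter() workaround mentioned above; a sketch equivalent to the readline() loop:
import sys
import datetime

with open(datetime.datetime.now().strftime('%Y-%m-%d-%H-%M-%S') + '.txt', 'w') as f:
    # iter() calls readline() until it returns the sentinel '' (EOF),
    # so each line is processed as soon as it arrives
    for line in iter(sys.stdin.readline, ''):
        f.write(datetime.datetime.now().isoformat() + ' ' + line)
    f.write('Finished')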

Related

real time logging to file with python subprocess

I expect this is really simple but I can't work this out.
I am trying to write the output from a dd imaging subprocess to a log file in real time. I'm using dd v8.25, which can give regular progress updates via the 'status=progress' option, which writes to stderr.
I can get it to log the full output in real time by passing the file object to stderr, i.e.
log_file = open('mylog.log', 'a')
p = subprocess.Popen(['dd command...'], stdout=None, stderr=log_file)
...but I would prefer to intercept the string from stderr first so I can parse it before writing to file.
I have tried threading but I can't seem to get it to write, or if it does, it only does it at the end of the process and not during.
I'm a python noob so example code would be appreciated. Thanks!
UPDATE - NOW WORKING (ISH)
I had a look at the link J.F. Sebastian suggested and found posts about using threads, so after that I used the "kill -USR1" trick to get DD to post progress to stderr which I could then pick up:
#! /usr/bin/env python
from subprocess import PIPE, Popen
from threading import Thread
from queue import Queue, Empty
import time

q = Queue()

def parsestring(mystr):
    newstring = mystr[0:mystr.find('bytes')]
    return newstring

def enqueue(out, q):
    for line in out:  # out is proc1.stderr
        q.put(line)
    out.close()

def getstatus():
    while proc1.poll() == None:
        proc2 = Popen(["kill -USR1 $(pgrep ^dd)"], bufsize=1, shell=True)
        time.sleep(2)

with open("log_file.log", mode="a") as log_fh:
    start_time = time.time()
    # start the imaging
    proc1 = Popen(["dd if=/dev/sda1 of=image.dd bs=524288 count=3000"], bufsize=1, stderr=PIPE, shell=True)
    # define and start the queue function thread
    t = Thread(target=enqueue, args=(proc1.stderr, q))
    t.daemon = True
    t.start()
    # define and start the getstatus function thread
    t_getstatus = Thread(target=getstatus, args=())
    t_getstatus.daemon = True
    t_getstatus.start()
    # get the string from the queue
    while proc1.poll() == None:
        try:
            nline = q.get_nowait()
        except Empty:
            continue
        else:
            mystr = nline.decode('utf-8')
            if mystr.find('bytes') > 0:
                log_fh.write(str(time.time()) + ' - ' + parsestring(mystr))
                log_fh.flush()
        # put in a delay
        # time.sleep(2)
    # print duration
    end_time = time.time()
    duration = end_time - start_time
    print('Took ' + str(duration) + ' seconds')
The only issue is I can't work out how to improve performance. I only need it to report status every 2 seconds or so but increasing the time delay increases the time of the imaging, which I don't want. That's a question for another post though...
Thanks to both J.F. Sebastian and Ali.
With this example it's possible (with Python 3) to stream from stderr to the console:
#! /usr/bin/env python
from subprocess import Popen, PIPE

# emulate a program that writes on stderr
proc = Popen(["/usr/bin/yes 1>&2 "], bufsize=512, stdout=PIPE, stderr=PIPE, shell=True)
r = b""
for line in proc.stderr:
    r += line
    print("current line", line, flush=True)
To stream to a file:
#! /usr/bin/env python
from subprocess import Popen, PIPE

with open("log_file.log", mode="ab") as log_fh:  # append, binary mode
    proc = Popen(["/usr/bin/yes 1>&2 "], bufsize=512, stdout=PIPE, stderr=PIPE, shell=True)
    r = b""
    # proc.stderr is a binary file-like obj; iterating yields bytes lines
    for line in proc.stderr:
        r += line
        # print("current line", line, flush=True)
        log_fh.write(line)  # file open in binary mode
        # log_fh.write(line.decode("utf8"))  # for text mode
        log_fh.flush()  # flush the content
To display dd's progress report in a terminal and to save (parsed) output to a log file:
#!/usr/bin/env python3
import io
from subprocess import PIPE, Popen
from time import monotonic as timer

cmd = "dd if=/dev/sda1 of=image.dd bs=524288 count=3000 status=progress".split()
with Popen(cmd, stderr=PIPE) as process, \
        open("log_file.log", "a") as log_file:
    start_time = timer()
    for line in io.TextIOWrapper(process.stderr, newline=''):
        print(line, flush=True, end='')  # no newline ('\n')
        if 'bytes' in line:
            # XXX parse line here, add flush=True if necessary
            print(line, file=log_file)
# print duration
print('Took {duration} seconds'.format(duration=timer() - start_time))
Note:
no shell=True: you don't need the shell here; Popen() can run dd directly.
no threads, queues: you don't need them here.
please, please DO NOT USE while proc1.poll() == None. You don't need it here (you'll see EOF on proc1.stderr if proc1.poll() is not None), and you may lose data (there could be buffered content even if the process has exited already). Unrelated: if you need to compare with None, use is None instead of == None.
io.TextIOWrapper(newline='') enables text mode (it uses locale.getpreferredencoding(False)) and treats '\r' as a newline too.
use the default bufsize=-1 (see io.DEFAULT_BUFFER_SIZE).

Running an interactive command from within Python

I have a script that I want to run from within Python (2.6.5) that follows the logic below:
Prompts the user for a password. It looks like "Enter password: " (note: input does not echo to the screen)
Outputs irrelevant information
Prompts the user for a response ("Blah Blah filename.txt blah blah (Y/N)?: ")
The last prompt line contains text which I need to parse (filename.txt). The response provided doesn't matter (the program could actually exit here without providing one, as long as I can parse the line).
My requirements are somewhat similar to Wrapping an interactive command line application in a Python script, but the responses there seem a bit confusing, and mine still hangs even when the OP mentions that it doesn't for him.
Through looking around, I've come to the conclusion that subprocess is the best way of doing this, but I'm having a few issues. Here is my Popen line:
p = subprocess.Popen("cmd", shell=True, stdout=subprocess.PIPE,
stderr=subprocess.STDOUT, stdin=subprocess.PIPE)
When I call a read() or readline() on stdout, the prompt is printed to the screen and it hangs.
If I call write("password\n") on stdin, the prompt is written to the screen and it hangs. The text in write() is not written (I don't see the cursor move to a new line).
If I call p.communicate("password\n"), same behavior as write().
I was looking for a few ideas here on the best way to input to stdin and possibly how to parse the last line in the output, if you're feeling generous, though I could probably figure that out eventually.
If you are communicating with a program that subprocess spawns, you should check out A non-blocking read on a subprocess.PIPE in Python. I had a similar problem with my application and found using queues to be the best way to do ongoing communication with a subprocess.
As for getting values from the user, you can always use the raw_input() builtin to get responses, and for passwords, try using the getpass module to get non-echoing passwords from your user. You can then parse those responses and write them to your subprocess' stdin.
I ended up doing something akin to the following:
import sys
import subprocess
from threading import Thread

try:
    from Queue import Queue, Empty
except ImportError:
    from queue import Queue, Empty  # Python 3.x

def enqueue_output(out, queue):
    # with universal_newlines=True the pipe yields str, so the EOF
    # sentinel is '' rather than b''
    for line in iter(out.readline, ''):
        queue.put(line)
    out.close()

def getOutput(outQueue):
    outStr = ''
    try:
        while True:  # adds output from the Queue until it is empty
            outStr += outQueue.get_nowait()
    except Empty:
        return outStr

p = subprocess.Popen("cmd", stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE, shell=False, universal_newlines=True)

outQueue = Queue()
errQueue = Queue()

outThread = Thread(target=enqueue_output, args=(p.stdout, outQueue))
errThread = Thread(target=enqueue_output, args=(p.stderr, errQueue))

outThread.daemon = True
errThread.daemon = True

outThread.start()
errThread.start()

try:
    someInput = raw_input("Input: ")
except NameError:
    someInput = input("Input: ")

p.stdin.write(someInput)
errors = getOutput(errQueue)
output = getOutput(outQueue)
Once you have the queues made and the threads started, you can loop through getting input from the user, getting errors and output from the process, and processing and displaying them to the user.
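For example, a minimal sketch of such a loop, building on the snippet above (the exit condition and sleep interval are my own assumptions, not part of the original answer):
import time

while p.poll() is None:  # loop until the subprocess exits
    try:
        someInput = raw_input("Input: ")
    except NameError:
        someInput = input("Input: ")
    p.stdin.write(someInput + "\n")
    p.stdin.flush()
    time.sleep(0.1)  # give the subprocess a moment to respond
    errors = getOutput(errQueue)
    output = getOutput(outQueue)
    if errors:
        sys.stderr.write(errors)
    if output:
        sys.stdout.write(output)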
Using threading might be slightly overkill for simple tasks.
Instead, os.spawnvpe can be used. It will spawn your shell script as a process, and you will be able to communicate with it interactively.
In this example I passed password as an argument, obviously that is not a good idea.
import os
import sys
from getpass import unix_getpass

def cmd(cmd):
    cmd = cmd.split()
    code = os.spawnvpe(os.P_WAIT, cmd[0], cmd, os.environ)
    if code == 127:
        sys.stderr.write('{0}: command not found\n'.format(cmd[0]))
    return code

password = unix_getpass('Password: ')
cmd_run = './run.sh --password {0}'.format(password)
cmd(cmd_run)

pattern = raw_input('Pattern: ')
lines = []
with open('filename.txt', 'r') as fd:
    for line in fd:
        if pattern in line:
            lines.append(line)
# manipulate lines
If you just want a user to enter a password without it being echoed to the screen just use the standard library's getpass module:
import getpass
print("You entered:", getpass.getpass())
NOTE: The prompt for this function defaults to "Password: ". Also, this only works on command lines where echoing can be controlled, so if it doesn't work, try running it from a terminal.

python: reading subprocess output in threads

I have an executable that I call using subprocess.Popen. Then, I intend to feed it some data via stdin using a thread that reads its value from a Queue which will later be populated in another thread. The output should be read using the stdout pipe in another thread and again be stored in a Queue.
As far as I understand from my previous research, using threads with Queue is good practice.
The external executable, unfortunately, will not quickly give me an answer for every line that is piped in, so that simple write, readline cycles are not an option. The executable implements some internal multithreading and I want the output as soon as it becomes available, therefore the additional reader thread.
As an example for testing the executable will just shuffle each line (shuffleline.py):
#!/usr/bin/python -u
import sys
from random import shuffle

for line in sys.stdin:
    line = line.strip()
    # shuffle line
    line = list(line)
    shuffle(line)
    line = "".join(line)
    sys.stdout.write("%s\n" % (line))
    sys.stdout.flush()  # avoid buffers
Please note that this is already as unbuffered as possible. Or isn't it? This is my stripped down test program:
#!/usr/bin/python -u
import sys
import Queue
import threading
import subprocess

class WriteThread(threading.Thread):
    def __init__(self, p_in, source_queue):
        threading.Thread.__init__(self)
        self.pipe = p_in
        self.source_queue = source_queue

    def run(self):
        while True:
            source = self.source_queue.get()
            print "writing to process: ", repr(source)
            self.pipe.write(source)
            self.pipe.flush()
            self.source_queue.task_done()

class ReadThread(threading.Thread):
    def __init__(self, p_out, target_queue):
        threading.Thread.__init__(self)
        self.pipe = p_out
        self.target_queue = target_queue

    def run(self):
        while True:
            line = self.pipe.readline()  # blocking read
            if line == '':
                break
            print "reader read: ", line.rstrip()
            self.target_queue.put(line)

if __name__ == "__main__":
    cmd = ["python", "-u", "./shuffleline.py"]  # unbuffered
    proc = subprocess.Popen(cmd, bufsize=0, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    source_queue = Queue.Queue()
    target_queue = Queue.Queue()

    writer = WriteThread(proc.stdin, source_queue)
    writer.setDaemon(True)
    writer.start()

    reader = ReadThread(proc.stdout, target_queue)
    reader.setDaemon(True)
    reader.start()

    # populate queue
    for i in range(10):
        source_queue.put("string %s\n" % i)
    source_queue.put("")

    print "source_queue empty: ", source_queue.empty()
    print "target_queue empty: ", target_queue.empty()

    import time
    time.sleep(2)  # expect some output from reader thread

    source_queue.join()  # wait until all items in source_queue are processed
    proc.stdin.close()   # should end the subprocess
    proc.wait()
This gives the following output (Python 2.7):
writing to process: 'string 0\n'
writing to process: 'string 1\n'
writing to process: 'string 2\n'
writing to process: 'string 3\n'
writing to process: 'string 4\n'
writing to process: 'string 5\n'
writing to process: 'string 6\n'
source_queue empty: writing to process: 'string 7\n'
writing to process: 'string 8\n'
writing to process: 'string 9\n'
writing to process: ''
True
target_queue empty: True
then nothing for 2 seconds ...
reader read: rgsn0i t
reader read: nrg1sti
reader read: tis n2rg
reader read: snt gri3
reader read: nsri4 tg
reader read: stir5 gn
reader read: gnri6ts
reader read: ngrits7
reader read: 8nsrt ig
reader read: sg9 nitr
The interleaving at the beginning is expected. However the output of the subprocess does not appear until after the subprocess ends. With more lines piped in I get some output, thus I assume a caching problem in the stdout pipe. According to other questions posted here flushing stdout (in the subprocess) should work, at least on Linux.
Your problem has nothing to do with the subprocess module, or threads (problematic as they are), or even mixing subprocesses and threads (a very bad idea, even worse than using threads to start with, unless you're using the backport of Python 3.2's subprocess module that you can get from code.google.com/p/python-subprocess32), or accessing the same things from multiple threads (as your print statements do).
What happens is that your shuffleline.py program buffers. Not on output, but on input. Although it isn't very obvious, when you iterate over a file object, Python reads in blocks, usually 8k bytes. Since sys.stdin is a file object, your for loop will buffer until EOF or a full block:
for line in sys.stdin:
    line = line.strip()
    ....
If you want to avoid this, either use a while loop calling sys.stdin.readline() (which returns '' at EOF):
while True:
    line = sys.stdin.readline()
    if not line:
        break
    line = line.strip()
    ...
or use the two-argument form of iter(), which creates an iterator that calls the first argument until the second argument (the "sentinel") is returned:
for line in iter(sys.stdin.readline, ''):
    line = line.strip()
    ...
I would also be remiss if I did not suggest not using threads for this, but non-blocking I/O on the subprocess's pipes instead, or even something like twisted.reactor.spawnProcess which has lots of ways of hooking processes and other things together as consumers and producers.
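As an illustration of the non-blocking alternative, here is a minimal sketch that puts the subprocess's stdout pipe into non-blocking mode with fcntl (Unix only); this is my own substitution for the twisted approach and is not from the original answer:
import errno
import fcntl
import os
import subprocess
import time

proc = subprocess.Popen(["python", "-u", "./shuffleline.py"],
                        bufsize=0, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

# put the stdout pipe into non-blocking mode (Unix only)
fd = proc.stdout.fileno()
fcntl.fcntl(fd, fcntl.F_SETFL, fcntl.fcntl(fd, fcntl.F_GETFL) | os.O_NONBLOCK)

proc.stdin.write("hello world\n")
proc.stdin.flush()
proc.stdin.close()

while True:
    try:
        chunk = os.read(fd, 4096)  # returns immediately instead of blocking
        if not chunk:
            break  # EOF: the subprocess closed its stdout
        print "read:", chunk.rstrip()
    except OSError as e:
        if e.errno != errno.EAGAIN:
            raise
        time.sleep(0.1)  # nothing available yet
proc.wait()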

python, subprocess: reading output from subprocess

I have following script:
#!/usr/bin/python
while True:
    x = raw_input()
    print x[::-1]
I am calling it from ipython:
In [5]: p = Popen('./script.py', stdin=PIPE)
In [6]: p.stdin.write('abc\n')
cba
and it works fine.
However, when I do this:
In [7]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [8]: p.stdin.write('abc\n')
In [9]: p.stdout.read()
the interpreter hangs. What am I doing wrong? I would like to be able to both write and read from another process multiple times, to pass some tasks to this process. What do I need to do differently?
EDIT 1
If I use communicate, I get this:
In [7]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [8]: p.communicate('abc\n')
Traceback (most recent call last):
File "./script.py", line 4, in <module>
x = raw_input()
EOFError: EOF when reading a line
Out[8]: ('cba\n', None)
EDIT 2
I tried flushing:
#!/usr/bin/python
import sys
while True:
    x = raw_input()
    print x[::-1]
    sys.stdout.flush()
and here:
In [5]: from subprocess import PIPE, Popen
In [6]: p = Popen('./script.py', stdin=PIPE, stdout=PIPE)
In [7]: p.stdin.write('abc')
In [8]: p.stdin.flush()
In [9]: p.stdout.read()
but it hangs again.
I believe there are two problems at work here:
1) Your parent script calls p.stdout.read(), which will read all data until end-of-file. However, your child script runs in an infinite loop so end-of-file will never happen. Probably you want p.stdout.readline()?
2) In interactive mode, most programs do buffer only one line at a time. When run from another program, they buffer much more. The buffering improves efficiency in many cases, but causes problems when two programs need to communicate interactively.
After p.stdin.write('abc\n') add:
p.stdin.flush()
In your subprocess script, after print x[::-1] add the following within the loop:
sys.stdout.flush()
(and import sys at the top)
The subprocess method check_output can be useful for this:
output = subprocess.check_output('./script.py')
And output will be the stdout from the process. If you need stderr, too:
output = subprocess.check_output('./script.py', stderr=subprocess.STDOUT)
Because you avoid managing pipes directly, it may circumvent your issue.
If you'd like to pass several lines to script.py then you need to read/write simultaneously:
#!/usr/bin/env python
import sys
from subprocess import PIPE, Popen
from threading import Thread

def print_output(out, ntrim=80):
    for line in out:
        print len(line)
        if len(line) > ntrim:  # truncate long output
            line = line[:ntrim-2] + '..'
        print line.rstrip()

if __name__ == "__main__":
    p = Popen(['python', 'script.py'], stdin=PIPE, stdout=PIPE)
    Thread(target=print_output, args=(p.stdout,)).start()
    for s in ['abc', 'def', 'ab'*10**7, 'ghi']:
        print >>p.stdin, s
    p.stdin.close()
    sys.exit(p.wait())  # NOTE: read http://docs.python.org/library/subprocess.html#subprocess.Popen.wait
Output:
4
cba
4
fed
20000001
bababababababababababababababababababababababababababababababababababababababa..
4
ihg
Where script.py:
#!/usr/bin/env python
"""Print reverse lines."""
while True:
    try:
        x = raw_input()
    except EOFError:
        break  # no more input
    else:
        print x[::-1]
Or
#!/usr/bin/env python
"""Print reverse lines."""
import sys

for line in sys.stdin:
    print line.rstrip()[::-1]
Or
#!/usr/bin/env python
"""Print reverse lines."""
import fileinput

for line in fileinput.input():  # accept files specified as command line arguments
    print line.rstrip()[::-1]
You're probably tripping over Python's output buffering. Here's what python --help has to say about it.
-u : unbuffered binary stdout and stderr; also PYTHONUNBUFFERED=x
see man page for details on internal buffering relating to '-u'
When you are through writing to p.stdin, close it: p.stdin.close()
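For instance, a minimal sketch combining the -u flag with closing stdin, assuming the child is the EOF-aware script.py variant shown above:
import sys
from subprocess import Popen, PIPE

# -u disables the child interpreter's output buffering
p = Popen([sys.executable, '-u', './script.py'], stdin=PIPE, stdout=PIPE)
p.stdin.write('abc\n')
p.stdin.flush()
print p.stdout.readline().rstrip()  # prints 'cba'
p.stdin.close()  # signals EOF to the child
p.wait()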
Use communicate() instead of .stdout.read().
Example:
from subprocess import Popen, PIPE
p = Popen('./script.py', stdin=PIPE, stdout=PIPE, stderr=PIPE)
input = 'abc\n'
stdout, stderr = p.communicate(input)
This recommendation comes from the Popen objects section in the subprocess documentation:
Warning: Use communicate() rather than .stdin.write, .stdout.read or .stderr.read
to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the
child process.

catching stdout in realtime from subprocess

I want to subprocess.Popen() rsync.exe in Windows, and print the stdout in Python.
My code works, but it doesn't catch the progress until a file transfer is done! I want to print the progress for each file in real time.
Using Python 3.1 now since I heard it should be better at handling IO.
import subprocess, time, os, sys
cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'
p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
Some rules of thumb for subprocess.
Never use shell=True. It needlessly invokes an extra shell process to call your program.
When calling processes, arguments are passed around as lists. sys.argv in python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
Don't redirect stderr to a PIPE when you're not reading it.
Don't redirect stdin when you're not writing to it.
Example:
import subprocess, time, os, sys

cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

for line in iter(p.stdout.readline, b''):
    print(">>> " + line.rstrip().decode())  # decode: the pipe yields bytes in Python 3
That said, it is probable that rsync buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior: when connected to a pipe, programs must explicitly flush stdout for realtime results, otherwise the standard C library will buffer.
To test for that, try running this instead:
cmd = [sys.executable, 'test_out.py']
and create a test_out.py file with the contents:
import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")
Executing that subprocess should give you "Hello" and wait 10 seconds before giving "World". If that happens with the python code above and not with rsync, that means rsync itself is buffering output, so you are out of luck.
A solution would be to connect directly to a pty, using something like pexpect.
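For completeness, a minimal sketch of the pty approach using the standard pty module (Unix only; pexpect wraps the same mechanism), so the child believes it is writing to a terminal and line-buffers its output:
import os
import pty
import subprocess

# give the child a pseudo-terminal instead of a pipe (Unix only)
master_fd, slave_fd = pty.openpty()
p = subprocess.Popen(["rsync", "-vaz", "-P", "source/", "dest/"],
                     stdout=slave_fd, stderr=slave_fd, close_fds=True)
os.close(slave_fd)  # keep only the parent's end

try:
    while True:
        data = os.read(master_fd, 1024)
        if not data:
            break
        print(data.decode(errors="replace"), end="")
except OSError:
    pass  # on Linux, reading the master raises OSError once the child exits
finally:
    os.close(master_fd)
    p.wait()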
I know this is an old topic, but there is a solution now. Call rsync with the option --outbuf=L. Example:
cmd = ['rsync', '-arzv', '--backup', '--outbuf=L', 'source/', 'dest']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b''):
    print '>>> {}'.format(line.rstrip())
Depending on the use case, you might also want to disable the buffering in the subprocess itself.
If the subprocess will be a Python process, you could do this before the call:
os.environ["PYTHONUNBUFFERED"] = "1"
Or alternatively pass this in the env argument to Popen.
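For example, a sketch of the env-argument variant (cmd stands for whatever command list you are running):
import os
import subprocess

# pass a modified environment instead of mutating os.environ globally
env = dict(os.environ, PYTHONUNBUFFERED="1")
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, env=env)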
Otherwise, if you are on Linux/Unix, you can use the stdbuf tool. E.g. like:
cmd = ["stdbuf", "-oL"] + cmd
See also here about stdbuf or other options.
On Linux, I had the same problem of getting rid of the buffering. I finally used "stdbuf -o0" (or, unbuffer from expect) to get rid of the PIPE buffering.
proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)
stdout = proc.stdout
I could then use select.select on stdout.
See also https://unix.stackexchange.com/questions/25372/
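A sketch of what that select.select loop might look like, using the proc and stdout names from the snippet above (not part of the original answer; Unix only, since select works on pipes there):
import os
import select

# poll stdout with a 1-second timeout so the loop can do other work
while proc.poll() is None:
    ready, _, _ = select.select([stdout], [], [], 1.0)
    if ready:
        chunk = os.read(stdout.fileno(), 4096)
        if chunk:
            print(chunk.decode(errors="replace"), end="")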
for line in p.stdout:
    ...
always blocks until the next line-feed.
For "real-time" behaviour you have to do something like this:
while True:
    inchar = p.stdout.read(1)
    if inchar:  # neither empty string nor None
        print(str(inchar), end='')  # or end=None to flush immediately
    else:
        print('')  # flush for implicit line-buffering
        break
The while-loop is left when the child process closes its stdout or exits.
read()/read(-1) would block until the child process closed its stdout or exited.
Your problem is:
for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
The iterator itself has extra buffering.
Try this instead:
while True:
    line = p.stdout.readline()
    if not line:
        break
    print line
You cannot get stdout to print unbuffered to a pipe (unless you can rewrite the program that prints to stdout), so here is my solution:
Redirect stdout to stderr, which is not buffered. '<cmd> 1>&2' should do it. Open the process as follows: myproc = subprocess.Popen('<cmd> 1>&2', shell=True, stderr=subprocess.PIPE) (shell=True is needed for the redirection to work).
You cannot distinguish stdout from stderr, but you get all output immediately.
Hope this helps anyone tackling this problem.
To avoid caching of output you might want to try pexpect:
import pexpect

child = pexpect.spawn(launchcmd, args, timeout=None)
while True:
    try:
        child.expect('\n')
        print(child.before)
    except pexpect.EOF:
        break
PS: I know this question is pretty old; I'm still providing the solution that worked for me.
PPS: I got this answer from another question.
p = subprocess.Popen(command,
                     bufsize=0,
                     universal_newlines=True)
I am writing a GUI for rsync in Python, and had the same problems. This problem troubled me for several days until I found this in the Python docs:
If universal_newlines is True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.
It seems that rsync outputs '\r' while a transfer is in progress.
If you run something like this in a thread and save the ffmpeg_time value somewhere accessible (for example in an attribute of your object), it works very nicely.
import re
import subprocess

input = 'path/input_file.mp4'
output = 'path/input_file.mp4'
command = "ffmpeg -y -v quiet -stats -i \"" + str(input) + "\" -metadata title=\"#alaa_sanatisharif\" -preset ultrafast -vcodec copy -r 50 -vsync 1 -async 1 \"" + output + "\""
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True, shell=True)
for line in process.stdout:
    reg = re.search('\d\d:\d\d:\d\d', line)
    ffmpeg_time = reg.group(0) if reg else ''
    print(ffmpeg_time)
Change the stdout from the rsync process to be unbuffered.
p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0,  # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)
I've noticed that there is no mention of using a temporary file as an intermediate. The following gets around the buffering issues by outputting to a temporary file, and allows you to parse the data coming from rsync without connecting to a pty. I tested the following on a Linux box, and the output of rsync tends to differ across platforms, so the regular expressions to parse the output may vary:
import subprocess, time, tempfile, re

# mkstemp() returns an OS-level file descriptor plus the file's name
pipe_output, file_name = tempfile.mkstemp()
cmd = ["rsync", "-vaz", "-P", "/src/", "/dest"]
p = subprocess.Popen(cmd, stdout=pipe_output,
                     stderr=subprocess.STDOUT)

while p.poll() is None:
    # p.poll() returns None while the program is still running
    # sleep for 1 second
    time.sleep(1)
    last_line = open(file_name).readlines()
    # it's possible that it hasn't output yet, so continue
    if len(last_line) == 0:
        continue
    last_line = last_line[-1]
    # Matching to "[bytes downloaded] number% [speed] number:number:number"
    match_it = re.match(".* ([0-9]*)%.* ([0-9]*:[0-9]*:[0-9]*).*", last_line)
    if not match_it:
        continue
    # in this case, the percentage is stored in match_it.group(1),
    # time in match_it.group(2). We could do something with it here...
In Python 3, here's a solution, which takes a command off the command line and delivers real-time nicely decoded strings as they are received.
Receiver (receiver.py):
import subprocess
import sys

cmd = sys.argv[1:]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print("received: {}".format(line.rstrip().decode("utf-8")))
Example simple program that could generate real-time output (dummy_out.py):
import time
import sys

for i in range(5):
    print("hello {}".format(i))
    sys.stdout.flush()
    time.sleep(1)
Output:
$ python receiver.py python dummy_out.py
received: hello 0
received: hello 1
received: hello 2
received: hello 3
received: hello 4
