printing stdout in realtime from a subprocess that requires stdin - python

This is a follow-up to this question, but if I want to pass input to the subprocess's stdin, how can I get the output in real time? This is what I currently have; I also tried replacing Popen with call from the subprocess module, and that just leads to the script hanging.
from subprocess import Popen, PIPE, STDOUT
cmd = 'rsync --rsh=ssh -rv --files-from=- thisdir/ servername:folder/'
p = Popen(cmd.split(), stdout=PIPE, stdin=PIPE, stderr=STDOUT)
subfolders = '\n'.join(['subfolder1','subfolder2'])
output = p.communicate(input=subfolders)[0]
print output
In the former question, where I did not have to pass anything to stdin, it was suggested that I use p.stdout.readline; but there is no room there to pipe anything to stdin.
Addendum: This works for the transfer, but I see the output only at the end and I would like to see the details of the transfer while it's happening.

In order to grab stdout from the subprocess in real time you need to decide exactly what behavior you want; specifically, you need to decide whether you want to deal with the output line-by-line or character-by-character, and whether you want to block while waiting for output or be able to do something else while waiting.
It looks like it will probably suffice for your case to read the output in line-buffered fashion, blocking until each complete line comes in, which means the convenience functions provided by subprocess are good enough:
p = subprocess.Popen(some_cmd, stdout=subprocess.PIPE)
# Grab stdout line by line as it becomes available. This will loop until
# p terminates.
while p.poll() is None:
    l = p.stdout.readline()  # This blocks until it receives a newline.
    print l
# When the subprocess terminates there might be unconsumed output
# that still needs to be processed.
print p.stdout.read()
If you need to write to the stdin of the process, just use another pipe:
p = subprocess.Popen(some_cmd, stdout=subprocess.PIPE, stdin=subprocess.PIPE)
# Send input to p.
p.stdin.write("some input\n")
p.stdin.flush()
# Now start grabbing output.
while p.poll() is None:
    l = p.stdout.readline()
    print l
print p.stdout.read()
Pace the other answer, there's no need to indirect through a file in order to pass input to the subprocess.
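And if you had instead decided on character-by-character output (useful when the child emits progress updates ending in '\r' rather than newlines), here is a minimal sketch of the blocking variant, reusing p from above:
import sys

while p.poll() is None:
    ch = p.stdout.read(1)  # blocks until one byte arrives (or EOF)
    if ch:
        sys.stdout.write(ch)
        sys.stdout.flush()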

something like this I think
from subprocess import Popen, PIPE, STDOUT

p = Popen('c:/python26/python printingTest.py', stdout=PIPE,
          stderr=PIPE)
for line in iter(p.stdout.readline, ''):
    print line
p.stdout.close()
using an iterator will return live results, basically.
in order to send input to stdin you would need something like
other_input = "some extra input stuff"
with open("to_input.txt", "w") as f:
    f.write(other_input)
p = Popen('c:/python26/python printingTest.py',
          stdin=open("to_input.txt"),
          stdout=PIPE,
          stderr=PIPE)
this would be similar to the Linux shell command
%prompt%> some_file.o < to_input.txt
see alp's answer for a better way of passing input to stdin

If you pass all your input before starting reading the output and if by "real-time" you mean whenever the subprocess flushes its stdout buffer:
from subprocess import Popen, PIPE, STDOUT
cmd = 'rsync --rsh=ssh -rv --files-from=- thisdir/ servername:folder/'
p = Popen(cmd.split(), stdout=PIPE, stdin=PIPE, stderr=STDOUT, bufsize=1)
subfolders = '\n'.join(['subfolder1','subfolder2'])
p.stdin.write(subfolders)
p.stdin.close() # eof
for line in iter(p.stdout.readline, ''):
    print line,  # do something with the output here
p.stdout.close()
rc = p.wait()

Related

Interacting with subprocess while reading stdout and stderr in real time

I need to write a wrapper around a shell command that prints its stdout and stderr in real time back to stdout and stderr respectively, while also allowing the user to interact and send stdin to it, if the process prompts for input.
Note that pexpect.interact() almost solves the problem, except that it combines all stdout and stderr and sends it back to stdout. And there appears no way to stop it doing that.
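For reference, the interact() usage described above is simply this ("some_command" is a placeholder):
import pexpect

child = pexpect.spawn("some_command")
child.interact()  # stdin is forwarded to the child, but its stdout and stderr come back merged on our stdout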
What I have so far, is a method to read stdout and stderr from a process via subprocess.Popen:
import selectors
import subprocess
import sys
from typing import List

def _popen_command(
    command: List[str]
) -> None:
    """
    Run a shell command with Popen line by line
    in real time without redirecting stdout or stderr.
    """
    with subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    ) as proc:
        selector = selectors.DefaultSelector()
        selector.register(proc.stdout, selectors.EVENT_READ)
        selector.register(proc.stderr, selectors.EVENT_READ)
        eof = False
        while not eof:
            for key, _ in selector.select():
                data = key.fileobj.read1(1).decode()
                if not data:
                    eof = True
                if key.fileobj is proc.stdout:
                    print(data, end="")
                else:
                    print(data, end="", file=sys.stderr)
I don't believe an answer with all of these features exists on Stack Overflow: handles stdout and stderr in real time, line by line, prints them back to stdout and stderr respectively, and also allows arbitrary interaction with stdin.
Is it possible to do this?
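One possible direction, given here only as a minimal sketch (Unix only, and assuming the child does not block-buffer its output; "some_interactive_command" is a placeholder): the selector loop above can also watch our own stdin and forward whatever the user types to the child.
import selectors
import subprocess
import sys

command = ["some_interactive_command"]
with subprocess.Popen(
    command,
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE
) as proc:
    selector = selectors.DefaultSelector()
    selector.register(proc.stdout, selectors.EVENT_READ)
    selector.register(proc.stderr, selectors.EVENT_READ)
    selector.register(sys.stdin, selectors.EVENT_READ)
    eof = False
    while not eof:
        for key, _ in selector.select():
            if key.fileobj is sys.stdin:
                line = sys.stdin.readline()
                if line:
                    proc.stdin.write(line.encode())
                    proc.stdin.flush()
                continue
            data = key.fileobj.read1(1).decode()
            if not data:
                eof = True
            elif key.fileobj is proc.stdout:
                print(data, end="", flush=True)
            else:
                print(data, end="", file=sys.stderr, flush=True)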

printing stdout pipe of sub processes ignore last lines

I'm trying to run a sub processes and watching his stdout until I find desirable string.
this is my code:
def waitForAppOutput(proc, word):
    for stdout_line in iter(proc.stdout.readline, b''):
        print stdout_line
        if word in stdout_line.rstrip():
            return

p = Popen(["./app.sh"], shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)
waitForAppOutput(p, "done!")
the issue here is that for some reason the function waitForAppOutput stops printing stdout a few lines before "done!", which is the last line that should appear on stdout. I assume iter(proc.stdout.readline, b'') is blocking and readline is not able to read the last lines of stdout.
any idea what is the issue here?
You have a misspelling: it should be waitForAppOutput instead of waitForAppOutout. How does this even run at all? And when you are invoking a command using a shell, you should not be passing an array of strings but rather one single string.
Normally one should use the communicate method on the subprocess object returned by the Popen call to prevent potential deadlocks (which seems to be what you are experiencing). This returns a tuple (stdout, stderr) containing the stdout and stderr output strings:
from subprocess import Popen, PIPE

def waitForAppOutput(stdout_lines, word):
    for stdout_line in stdout_lines:
        print stdout_line
        if word in stdout_line.rstrip():
            return

p = Popen("./app.sh", shell=True, stdout=PIPE, stderr=PIPE, stdin=PIPE, universal_newlines=True)
expected_input = "command line 1\ncommand line 2\n"
stdout, stderr = p.communicate(expected_input)
stdout_lines = stdout.splitlines()
waitForAppOutput(stdout_lines, "done!")
The only issue is if the output strings are large (whatever your definition of large might be), for it might be memory-inefficient or even prohibitive to read the entire output into memory. If this is your situation, then I would try to avoid the deadlock by piping only stdout.
from subprocess import Popen, PIPE

def waitForAppOutput(proc, word):
    for stdout_line in iter(proc.stdout.readline, ''):
        print stdout_line
        if word in stdout_line.rstrip():
            return

p = Popen("./app.sh", shell=True, stdout=PIPE, stdin=PIPE, universal_newlines=True)
expected_input = "command line 1\ncommand line 2\n"
p.stdin.write(expected_input)
p.stdin.close()
waitForAppOutput(p, "done!")
for stdout_line in iter(p.stdout.readline, ''):
    pass  # read rest of output
p.wait()  # wait for termination
Update
Here is an example using both techniques that runs the Windows sort command to sort a bunch of input lines. This works particularly well both ways because the sort command does not start output until all the input has been read, so it's a very simple protocol in which deadlocking is easy to avoid. Try running this with USE_COMMUNICATE set alternately to True and False:
from subprocess import Popen, PIPE

USE_COMMUNICATE = False

p = Popen("sort", shell=True, stdout=PIPE, stdin=PIPE, universal_newlines=True)
expected_input = """q
w
e
r
t
y
u
i
o
p
"""
if USE_COMMUNICATE:
    stdout_lines, stderr_lines = p.communicate(expected_input)
    output = stdout_lines
else:
    p.stdin.write(expected_input)
    p.stdin.close()
    output = iter(p.stdout.readline, '')
for stdout_line in output:
    print stdout_line,
p.wait()  # wait for termination
Prints:
e
i
o
p
q
r
t
u
w
y

Python subprocess Popen.communicate() equivalent to Popen.stdout.read()?

Very specific question (I hope): What are the differences between the following three codes?
(I expect it to be only that the first does not wait for the child process to be finished, while the second and third ones do. But I need to be sure this is the only difference...)
I also welcome other remarks/suggestions (though I'm already well aware of the shell=True dangers and cross-platform limitations)
Note that I already read Python subprocess interaction, why does my process work with Popen.communicate, but not Popen.stdout.read()? and that I do not want/need to interact with the program after.
Also note that I already read Alternatives to Python Popen.communicate() memory limitations? but that I didn't really get it...
Finally, note that I am aware there is a risk of deadlock when one of the buffers fills up with output while using one of these methods, but I got lost while looking for clear explanations on the Internet...
First code:
from subprocess import Popen, PIPE

def exe_f(command='ls -l', shell=True):
    """Function to execute a command and return stuff"""
    process = Popen(command, shell=shell, stdout=PIPE, stderr=PIPE)
    stdout = process.stdout.read()
    stderr = process.stderr.read()
    return process, stderr, stdout
Second code:
from subprocess import Popen, PIPE
from subprocess import communicate

def exe_f(command='ls -l', shell=True):
    """Function to execute a command and return stuff"""
    process = Popen(command, shell=shell, stdout=PIPE, stderr=PIPE)
    (stdout, stderr) = process.communicate()
    return process, stderr, stdout
Third code:
from subprocess import Popen, PIPE
from subprocess import wait

def exe_f(command='ls -l', shell=True):
    """Function to execute a command and return stuff"""
    process = Popen(command, shell=shell, stdout=PIPE, stderr=PIPE)
    code = process.wait()
    stdout = process.stdout.read()
    stderr = process.stderr.read()
    return process, stderr, stdout
Thanks.
If you look at the source for subprocess.communicate(), it shows a perfect example of the difference:
def communicate(self, input=None):
    ...
    # Optimization: If we are only using one pipe, or no pipe at
    # all, using select() or threads is unnecessary.
    if [self.stdin, self.stdout, self.stderr].count(None) >= 2:
        stdout = None
        stderr = None
        if self.stdin:
            if input:
                self.stdin.write(input)
            self.stdin.close()
        elif self.stdout:
            stdout = self.stdout.read()
            self.stdout.close()
        elif self.stderr:
            stderr = self.stderr.read()
            self.stderr.close()
        self.wait()
        return (stdout, stderr)
    return self._communicate(input)
You can see that communicate does make use of the read calls to stdout and stderr, and also calls wait(). It is just a matter of order of operations. In your case because you are using PIPE for both stdout and stderr, it goes into _communicate():
def _communicate(self, input):
    stdout = None  # Return
    stderr = None  # Return
    if self.stdout:
        stdout = []
        stdout_thread = threading.Thread(target=self._readerthread,
                                         args=(self.stdout, stdout))
        stdout_thread.setDaemon(True)
        stdout_thread.start()
    if self.stderr:
        stderr = []
        stderr_thread = threading.Thread(target=self._readerthread,
                                         args=(self.stderr, stderr))
        stderr_thread.setDaemon(True)
        stderr_thread.start()
    if self.stdin:
        if input is not None:
            self.stdin.write(input)
        self.stdin.close()
    if self.stdout:
        stdout_thread.join()
    if self.stderr:
        stderr_thread.join()
    # All data exchanged. Translate lists into strings.
    if stdout is not None:
        stdout = stdout[0]
    if stderr is not None:
        stderr = stderr[0]
    # Translate newlines, if requested. We cannot let the file
    # object do the translation: It is based on stdio, which is
    # impossible to combine with select (unless forcing no
    # buffering).
    if self.universal_newlines and hasattr(file, 'newlines'):
        if stdout:
            stdout = self._translate_newlines(stdout)
        if stderr:
            stderr = self._translate_newlines(stderr)
    self.wait()
    return (stdout, stderr)
This uses threads to read from multiple streams at once. Then it calls wait() at the end.
So to sum it up:
The first example reads from one stream at a time, and does not wait for the process to finish.
The second example reads from both streams at the same time via internal threads, and waits for the process to finish.
The third example waits for the process to finish, and then reads one stream at a time. As you mentioned, it has the potential to deadlock if too much is written to the streams.
Also, you don't need these two import statements in your 2nd and 3rd examples:
from subprocess import communicate
from subprocess import wait
They are both methods of the Popen object.
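That is, only Popen and PIPE need importing; the methods are called on the process object:
from subprocess import Popen, PIPE

process = Popen('ls -l', shell=True, stdout=PIPE, stderr=PIPE)
stdout, stderr = process.communicate()  # a method of Popen, nothing extra to import
code = process.returncode               # populated once communicate() has waited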

How to get output from subprocess.Popen(). proc.stdout.readline() blocks, no data prints out

I want output from execute Test_Pipe.py, I tried following code on Linux but it did not work.
Test_Pipe.py
import time

while True:
    print "Something ..."
    time.sleep(.1)
Caller.py
import subprocess as subp
import time

proc = subp.Popen(["python", "Test_Pipe.py"], stdout=subp.PIPE, stdin=subp.PIPE)
while True:
    data = proc.stdout.readline()  # block / wait
    print data
    time.sleep(.1)
The line proc.stdout.readline() was blocked, so no data prints out.
You obviously can use subprocess.communicate but I think you are looking for real time input and output.
readline was blocked because the process is probably waiting on your input. You can read character by character to overcome this like the following:
import subprocess
import sys

process = subprocess.Popen(
    cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE
)
while True:
    out = process.stdout.read(1)
    if out == '' and process.poll() is not None:
        break
    if out != '':
        sys.stdout.write(out)
        sys.stdout.flush()
Nadia's snippet does work, but calling read with a 1-byte buffer is strongly discouraged. The better way to do this would be to set the stdout file descriptor to non-blocking using fcntl
import fcntl
import os

fcntl.fcntl(
    proc.stdout.fileno(),
    fcntl.F_SETFL,
    fcntl.fcntl(proc.stdout.fileno(), fcntl.F_GETFL) | os.O_NONBLOCK,
)
and then using select to test if the data is ready
import select

while proc.poll() is None:
    readx = select.select([proc.stdout.fileno()], [], [])[0]
    if readx:
        chunk = proc.stdout.read()
        print chunk
She was correct in that your problem must be different from what you posted, as Caller.py and Test_Pipe.py do work as provided.
https://derrickpetzold.com/p/capturing-output-from-ffmpeg-python/
Test_Pipe.py buffers its stdout by default so proc in Caller.py doesn't see any output until the child's buffer is full (if the buffer size is 8KB then it takes around a minute to fill Test_Pipe.py's stdout buffer).
To make the output unbuffered (line-buffered for text streams), you could pass the -u flag to the child Python script. This lets you read the subprocess's output line by line in "real time":
import sys
from subprocess import Popen, PIPE

proc = Popen([sys.executable, "-u", "Test_Pipe.py"], stdout=PIPE, bufsize=1)
for line in iter(proc.stdout.readline, b''):
    print line,
proc.communicate()
See links in Python: read streaming input from subprocess.communicate() on how to solve the block-buffering issue for non-Python child processes.
To avoid the many problems that can arise with buffering for tasks such as "getting the subprocess's output to the main process in real time", I always recommend using pexpect on all non-Windows platforms (and wexpect on Windows) instead of subprocess when such tasks are desired.
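For example, a minimal sketch with pexpect (the rsync command line is only an illustration): because the child sees a pseudo-terminal rather than a pipe, it stays line-buffered and its output arrives in real time.
import pexpect

child = pexpect.spawn('rsync --rsh=ssh -rv thisdir/ servername:folder/',
                      encoding='utf-8', timeout=None)
for line in child:  # spawn objects are iterable line by line
    print(line, end='')
child.close()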

catching stdout in realtime from subprocess

I want to subprocess.Popen() rsync.exe in Windows, and print the stdout in Python.
My code works, but it doesn't catch the progress until a file transfer is done! I want to print the progress for each file in real time.
Using Python 3.1 now since I heard it should be better at handling IO.
import subprocess, time, os, sys

cmd = "rsync.exe -vaz -P source/ dest/"
p, line = True, 'start'

p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=64,
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)

for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
Some rules of thumb for subprocess.
Never use shell=True. It needlessly invokes an extra shell process to call your program.
When calling processes, arguments are passed around as lists. sys.argv in python is a list, and so is argv in C. So you pass a list to Popen to call subprocesses, not a string.
Don't redirect stderr to a PIPE when you're not reading it.
Don't redirect stdin when you're not writing to it.
Example:
import subprocess, time, os, sys

cmd = ["rsync.exe", "-vaz", "-P", "source/", "dest/"]

p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.STDOUT)

for line in iter(p.stdout.readline, b''):
    print(">>> " + line.rstrip().decode())
That said, it is probable that rsync buffers its output when it detects that it is connected to a pipe instead of a terminal. This is the default behavior: when connected to a pipe, programs must explicitly flush stdout for real-time results; otherwise the standard C library will buffer.
To test for that, try running this instead:
cmd = [sys.executable, 'test_out.py']
and create a test_out.py file with the contents:
import sys
import time
print ("Hello")
sys.stdout.flush()
time.sleep(10)
print ("World")
Executing that subprocess should give you "Hello" and wait 10 seconds before giving "World". If that happens with the python code above and not with rsync, that means rsync itself is buffering output, so you are out of luck.
A solution would be to connect direct to a pty, using something like pexpect.
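For completeness, a minimal sketch of the pty route without pexpect (Unix only; the rsync command line is illustrative): because the child sees a terminal on its stdout, the C library line-buffers instead of block-buffering.
import os
import pty
import subprocess
import sys

master_fd, slave_fd = pty.openpty()
p = subprocess.Popen(["rsync", "-vaz", "-P", "source/", "dest/"],
                     stdout=slave_fd, stderr=slave_fd)
os.close(slave_fd)  # only the child needs its end now
try:
    while True:
        chunk = os.read(master_fd, 1024)
        if not chunk:
            break
        sys.stdout.write(chunk.decode())
        sys.stdout.flush()
except OSError:  # on Linux, reading a pty whose child has exited raises OSError
    pass
os.close(master_fd)
p.wait()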
I know this is an old topic, but there is a solution now. Call rsync with the option --outbuf=L. Example:
cmd = ['rsync', '-arzv', '--backup', '--outbuf=L', 'source/', 'dest']
p = subprocess.Popen(cmd,
                     stdout=subprocess.PIPE)
for line in iter(p.stdout.readline, b''):
    print '>>> {}'.format(line.rstrip())
Depending on the use case, you might also want to disable the buffering in the subprocess itself.
If the subprocess will be a Python process, you could do this before the call:
os.environ["PYTHONUNBUFFERED"] = "1"
Or alternatively pass this in the env argument to Popen.
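For instance, a minimal sketch ("child.py" is a hypothetical script):
import os
import subprocess
import sys

# Copy the parent environment and disable buffering only for this child.
env = dict(os.environ, PYTHONUNBUFFERED="1")
p = subprocess.Popen([sys.executable, "child.py"],
                     stdout=subprocess.PIPE, env=env)
for line in p.stdout:
    print(line.decode(), end="")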
Otherwise, if you are on Linux/Unix, you can use the stdbuf tool. E.g. like:
cmd = ["stdbuf", "-oL"] + cmd
See also here about stdbuf or other options.
On Linux, I had the same problem of getting rid of the buffering. I finally used "stdbuf -o0" (or, unbuffer from expect) to get rid of the PIPE buffering.
proc = Popen(['stdbuf', '-o0'] + cmd, stdout=PIPE, stderr=PIPE)
stdout = proc.stdout
I could then use select.select on stdout.
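For example, a minimal sketch of such a select loop (continuing from proc and stdout above): select blocks until the pipe has data, and os.read then returns whatever is available rather than waiting for a full buffer.
import os
import select
import sys

while proc.poll() is None:
    readable, _, _ = select.select([stdout], [], [], 1.0)
    if readable:
        chunk = os.read(stdout.fileno(), 4096)
        if chunk:
            sys.stdout.write(chunk.decode())
            sys.stdout.flush()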
See also https://unix.stackexchange.com/questions/25372/
for line in p.stdout:
    ...
always blocks until the next line-feed.
For "real-time" behaviour you have to do something like this:
while True:
    inchar = p.stdout.read(1)
    if inchar:  # neither empty string nor None
        print(str(inchar), end='')  # or end=None to flush immediately
    else:
        print('')  # flush for implicit line-buffering
        break
The while-loop is left when the child process closes its stdout or exits.
read()/read(-1) would block until the child process closed its stdout or exited.
Your problem is:
for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
the iterator itself has extra buffering.
Try something like this instead:
while True:
    line = p.stdout.readline()
    if not line:
        break
    print line
You cannot get stdout to print unbuffered to a pipe (unless you can rewrite the program that prints to stdout), so here is my solution:
Redirect stdout to stderr, which is not buffered. '<cmd> 1>&2' should do it. Open the process as follows: myproc = subprocess.Popen('<cmd> 1>&2', shell=True, stderr=subprocess.PIPE)
You cannot distinguish stdout from stderr, but you get all output immediately.
Hope this helps anyone tackling this problem.
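A minimal sketch of that idea ("some_cmd" is a placeholder; shell=True is what makes the 1>&2 redirection work):
import subprocess

myproc = subprocess.Popen("some_cmd 1>&2", shell=True, stderr=subprocess.PIPE)
for line in iter(myproc.stderr.readline, b""):
    print(line.rstrip().decode())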
To avoid caching of output, you might want to try pexpect:
import pexpect

child = pexpect.spawn(launchcmd, args, timeout=None)
while True:
    try:
        child.expect('\n')
        print(child.before)
    except pexpect.EOF:
        break
PS: I know this question is pretty old, but I'm still providing the solution that worked for me.
PPS: got this answer from another question
p = subprocess.Popen(command,
                     bufsize=0,
                     universal_newlines=True)
I am writing a GUI for rsync in Python and had the same problems. This troubled me for several days until I found this in the docs.
If universal_newlines is True, the file objects stdout and stderr are opened as text files in universal newlines mode. Lines may be terminated by any of '\n', the Unix end-of-line convention, '\r', the old Macintosh convention or '\r\n', the Windows convention. All of these external representations are seen as '\n' by the Python program.
It seems that rsync outputs '\r' while a transfer is in progress.
If you run something like this in a thread and save the ffmpeg_time value in an attribute so that you can access it, it works very nicely; I get real-time progress output even when using threading in tkinter:
import re
import subprocess

input = 'path/input_file.mp4'
output = 'path/output_file.mp4'
command = "ffmpeg -y -v quiet -stats -i \"" + str(input) + "\" -metadata title=\"#alaa_sanatisharif\" -preset ultrafast -vcodec copy -r 50 -vsync 1 -async 1 \"" + output + "\""
process = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, universal_newlines=True, shell=True)
for line in process.stdout:
    reg = re.search('\d\d:\d\d:\d\d', line)
    ffmpeg_time = reg.group(0) if reg else ''
    print(ffmpeg_time)
Change the stdout from the rsync process to be unbuffered.
p = subprocess.Popen(cmd,
                     shell=True,
                     bufsize=0,  # 0=unbuffered, 1=line-buffered, else buffer-size
                     stdin=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     stdout=subprocess.PIPE)
I've noticed that there is no mention of using a temporary file as an intermediate. The following gets around the buffering issues by outputting to a temporary file, and allows you to parse the data coming from rsync without connecting to a pty. I tested the following on a Linux box; the output of rsync tends to differ across platforms, so the regular expressions to parse the output may vary:
import subprocess, time, tempfile, re

pipe_output, file_name = tempfile.mkstemp()
cmd = ["rsync", "-vaz", "-P", "/src/", "/dest"]
p = subprocess.Popen(cmd, stdout=pipe_output,
                     stderr=subprocess.STDOUT)
while p.poll() is None:
    # p.poll() returns None while the program is still running
    # sleep for 1 second
    time.sleep(1)
    last_line = open(file_name).readlines()
    # it's possible that it hasn't output yet, so continue
    if len(last_line) == 0:
        continue
    last_line = last_line[-1]
    # Matching to "[bytes downloaded] number% [speed] number:number:number"
    match_it = re.match(".* ([0-9]*)%.* ([0-9]*:[0-9]*:[0-9]*).*", last_line)
    if not match_it:
        continue
    # in this case, the percentage is stored in match_it.group(1),
    # time in match_it.group(2). We could do something with it here...
In Python 3, here's a solution, which takes a command off the command line and delivers real-time nicely decoded strings as they are received.
Receiver (receiver.py):
import subprocess
import sys

cmd = sys.argv[1:]
p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
for line in p.stdout:
    print("received: {}".format(line.rstrip().decode("utf-8")))
Example simple program that could generate real-time output (dummy_out.py):
import time
import sys

for i in range(5):
    print("hello {}".format(i))
    sys.stdout.flush()
    time.sleep(1)
Output:
$ python receiver.py python dummy_out.py
received: hello 0
received: hello 1
received: hello 2
received: hello 3
received: hello 4
