I am trying to interact with an application using subprocess.
I've created the process using Popen but I am not able to access the output stream without blocking the whole thread.
Writing to the input stream, however, seems to work fine (I tested it using communicate, though I may not be able to use that later since I need real-time data).
I have already tried setting bufsize to 1, but it doesn't seem to work.
I have noticed that sometimes if the process terminates, the output is flushed.
I believe this issue may be caused by the fact that no flushing occurs on the child's side (and that on closing, all the data gets received at once), but I am not sure.
C code:
#include <stdio.h>

int main()
{
    int testInteger;
    printf("Enter an integer: \n");
    scanf("%d", &testInteger);
    printf("Number = %d", testInteger);
    return 0;
}
Python code:
import subprocess
p = subprocess.Popen("./a.out", stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=subprocess.PIPE, universal_newlines=True, bufsize=1, close_fds=True)
print(p.stdout.read(1))  # This should read 'E' but instead blocks the whole thread!
"I have already tried setting bufsize to 1 but it doesn't seem to work."
The bufsize parameter specifies the buffering of the pipe on the Python side, but the binary you're calling has its own stream buffering, which is generally full buffering when the binary is not outputting to a terminal (it's line buffering when stdout is a terminal).
You can observe this if you change the communication channel to stderr (using fprintf). Or if you fflush(stdout) explicitly after the printf. Or if you explicitly change the buffering configuration using setbuf(3)/setvbuf(3) (warning: this is undefined behaviour unless it is done before any I/O on the stream, i.e. right at program start).
If you do not wish to modify the C program, you can also use stdbuf (very much GNU-specific) to customise the buffering of the wrapped binary: just replace "./a.out" by ['stdbuf', '-o0', './a.out'] to run a.out with an unbuffered stdout.
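For example, a minimal sketch of the stdbuf variant, reusing the Popen call from the question (GNU coreutils assumed to be installed):

import subprocess

# stdbuf -o0 turns off a.out's stdout buffering before it starts
p = subprocess.Popen(
    ["stdbuf", "-o0", "./a.out"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    universal_newlines=True, bufsize=1,
)
print(p.stdout.read(1))  # now returns 'E' immediately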
Incidentally, this sort of mess is why you probably don't want to hand-roll the scripting of interactive programs; it is why pexpect exists.
Oh, and stdin generally has the same buffering as stdout, by default (so line-buffered when hooked to a terminal and fully buffered otherwise).
A read() on the pipe with no size argument waits for EOF, i.e. for the subprocess to close its end of the pipe (typically by terminating), before returning the entire output; hence it blocks.
You can try using readline() as described in read subprocess stdout line by line.
Edit:
Your C program might need to fflush(stdout) after printf. If stdout is connected to a pipe, the C library may choose not to flush even when a \n is output.
See more at Does printf always flush the buffer on encountering a newline?.
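For example, once the C side flushes (or is run unbuffered), reading line by line works; a minimal sketch against the a.out from the question:

import subprocess

p = subprocess.Popen(["./a.out"], stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, universal_newlines=True)
p.stdin.write("42\n")
p.stdin.flush()
print(p.stdout.readline())  # returns as soon as the child flushes a full line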
Related
I am trying to run a Python file that prints something, waits 2 seconds, and then prints again. I want to catch these outputs live from my Python script to then process them. I tried different things but nothing worked.
import subprocess

process = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
while True:
    output = process.stdout.readline()
    if process.poll() is not None and output == '':
        break
    if output:
        print(output.strip())
I'm at this point but it doesn't work. It waits until the code finishes and then prints all the outputs.
I just need to run a Python file and get live output from it. If you have other ideas for doing this without using the print function, let me know; just know that I have to run the file separately. I went with what seemed the easiest way possible, but from what I'm seeing it can't be done this way.
There are three layers of buffering here, and you need to limit all three of them to guarantee you get live data:
1. The sub-process's own stdio buffering. Use the stdbuf command (on Linux) to wrap the subprocess execution (e.g. run ['stdbuf', '-oL'] + cmd instead of just cmd), or, if you have the ability to do so, alter the program itself: either explicitly switch its stdout to line-buffered mode (e.g. using setvbuf in C/C++ code, rather than the default block buffering it uses when outputting to a non-tty), or insert flush statements after critical output (fflush(stdout); for C/C++, fileobj.flush() for Python, etc.). Without that, everything is stuck in user-mode buffers of the sub-process.
2. Python's buffering of the pipe. Add bufsize=0 to the Popen arguments (probably not needed since you don't send anything to stdin, but harmless) so it unbuffers all piped handles. If the Popen is in text=True mode, switch to bufsize=1 (which is line-buffered, rather than unbuffered).
3. Your own output buffering. Add flush=True to the print arguments (if you're connected to a terminal, the line-buffering will flush it for you, so it only matters if your stdout is piped to a file), or explicitly call sys.stdout.flush().
Between the three of these, you should be able to guarantee no data is stuck waiting in user-mode buffers; if at least one line has been output by the sub-process, it will reach you immediately, and any output triggered by it will also appear immediately. Item #1 is the hardest in most cases (when you can't use stdbuf, or the process reconfigures its own buffering internally and undoes the effect of stdbuf, and you can't modify the process executable to fix it); you have complete control over #2 and #3, but #1 may be outside your control. A combined sketch is shown below.
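Putting the three layers together, a minimal sketch (assuming a Python child script, here a hypothetical other_script.py; for a C child you would wrap the command with ['stdbuf', '-oL'] instead of using -u):

import subprocess

# Layer 1: -u unbuffers the (hypothetical) Python child's stdout.
cmd = ["python3", "-u", "other_script.py"]

process = subprocess.Popen(
    cmd,
    stdout=subprocess.PIPE,
    text=True,
    bufsize=1,                         # layer 2: line-buffered pipe in text mode
)
for line in process.stdout:            # yields lines as they arrive
    print(line, end="", flush=True)    # layer 3: flush our own stdout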
This is the code I use for that same purpose:
import subprocess

def run_command(command, **kwargs):
    """Run a command while printing the live output"""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        **kwargs,
    )
    while True:  # Could be more pythonic with := in Python 3.8+
        line = process.stdout.readline()
        if not line and process.poll() is not None:
            break
        print(line.decode(), end='')
An example of usage would be:
from pathlib import Path

run_command(['git', 'status'], cwd=Path(__file__).parent.absolute())
I use Python subprocess to fork a C application and open a pipe to stdout.
app = subprocess.Popen(args, stdout=subprocess.PIPE)
The application writes several lines to stdout like so:
printf("Line 0\n");
printf("Line 1\n");
printf("Line 2\n");
I try to read these lines in my Python script before the application exits:
line = app.stdout.readline()
However, readline blocks indefinitely without returning any content, even though I expect to read Line 0, Line 1, and Line 2, in three separate calls to readline. I notice that when the application finally exits, readline returns the expected contents. However, I want readline to return the expected contents as soon as they are passed to printf. What is happening?
According to this StackOverflow question, printf only flushes its output on a newline when the output device is interactive (e.g., a terminal). When the output device is non-interactive (e.g., a Python subprocess pipe, a file, etc.), stdout is fully buffered, so the output will only be flushed when the buffer is full, when fflush(stdout) is manually called, or when the buffer is otherwise forced to flush (e.g., on process exit).
Thus, readline did not return because the application's stdout was never flushed, so readline had nothing to read. Changing the application code to this fixes the issue:
printf("Line 0\n");
printf("Line 1\n");
printf("Line 2\n");
fflush(stdout); // New line
Others may wish to set the buffer mode of stdout with setvbuf rather than call fflush.
I was surprised by this because the common understanding of printf is that it flushes on a newline character, but this does not happen in all cases.
process = subprocess.check_output(BACKEND + "mainbgw setup " + str(NUM_USERS),
                                  shell=True, stderr=subprocess.STDOUT)

I am using the above statement to run a C program from a Django-based Python server for some computations. There are some printf() statements whose output I would like to see on stdout while the server is running and executing the subprocess. How can that be done?
If you actually don't need the output to be available to your Python code as a string, you can just use os.system, or subprocess.call without redirecting stdout elsewhere. Then the stdout of your C program will go directly to the stdout of your Python program.
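For example (a sketch reusing the question's BACKEND and NUM_USERS names):

import subprocess

# No stdout redirection: the C program's printf output goes straight
# to this process's stdout as the child flushes it.
subprocess.call(BACKEND + "mainbgw setup " + str(NUM_USERS), shell=True)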
If you need both streaming stdout and access to the output as a string, you should use subprocess.Popen (or the old popen2.popen4) to obtain a file object for the output stream, then repeatedly read lines from the stream until you have exhausted it, keeping a concatenated copy of all the data you grabbed. An example of such a loop follows.
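A minimal sketch under the same assumptions:

import subprocess

process = subprocess.Popen(BACKEND + "mainbgw setup " + str(NUM_USERS),
                           shell=True, stdout=subprocess.PIPE,
                           stderr=subprocess.STDOUT, text=True)
captured = []
for line in process.stdout:   # yields lines as the child flushes them
    print(line, end="")       # stream to our own stdout live
    captured.append(line)     # ...and keep a copy for later use
output = "".join(captured)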
Again, the same question.
The reason is that I still can't make it work after reading the following:
Real-time intercepting of stdout from another process in Python
Intercepting stdout of a subprocess while it is running
How do I get 'real-time' information back from a subprocess.Popen in python (2.5)
catching stdout in realtime from subprocess
My case is that I have a console app written in C; let's take for example this code in a loop:
tmp = 0.0;
printf("\ninput>>");
scanf_s("%f",&tmp);
printf ("\ninput was: %f",tmp);
It continuously reads some input and writes some output.
My python code to interact with it is the following:
p = subprocess.Popen([path], stdout=subprocess.PIPE, stdin=subprocess.PIPE)
p.stdin.write('12345\n')
for line in p.stdout:
    print(">>> " + str(line.rstrip()))
    p.stdout.flush()
So far, whenever I read from p.stdout, it always waits until the process is terminated and then outputs an empty string. I've tried lots of stuff, but still the same result.
I tried Python 2.6 and 3.1, but the version doesn't matter - I just need to make it work somewhere.
Trying to write to and read from pipes to a sub-process is tricky because of the default buffering going on in both directions. It's extremely easy to get a deadlock where one or the other process (parent or child) is reading from an empty buffer, writing into a full buffer or doing a blocking read on a buffer that's awaiting data before the system libraries flush it.
For more modest amounts of data the Popen.communicate() method might be sufficient. However, for data that exceeds its buffering you'd probably get stalled processes (similar to what you're already seeing?)
You might want to look for details on using the fcntl module and making one or the other (or both) of your file descriptors non-blocking. In that case, of course, you'll have to wrap all reads and/or writes to those file descriptors in the appropriate exception handling for the "EWOULDBLOCK" events (in Python 3 this surfaces as BlockingIOError; older versions raised OSError/IOError with errno EAGAIN).
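A minimal sketch of that approach (POSIX-only, Python 3; ./a.out stands in for your program):

import fcntl
import os
import subprocess

p = subprocess.Popen(["./a.out"], stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)

# Switch the read end of the child's stdout pipe to non-blocking mode.
fd = p.stdout.fileno()
flags = fcntl.fcntl(fd, fcntl.F_GETFL)
fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)

try:
    chunk = os.read(fd, 4096)   # returns whatever is available right now
except BlockingIOError:         # nothing available yet
    chunk = b""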
A completely different approach would be for your parent to use the select module and os.fork() ... and for the child process to execve() the target program after directly handling any file dup()ing. (Basically you'd be re-implementing parts of Popen(), but with different parent file descriptor (PIPE) handling.)
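You don't need the full fork()/execve() re-implementation to get something out of select, though; here is a sketch of polling an existing Popen pipe with it (a simplification of that idea, not the full approach described above):

import os
import select
import subprocess

p = subprocess.Popen(["./a.out"], stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)

# Wait up to one second for the child's stdout pipe to become readable.
readable, _, _ = select.select([p.stdout], [], [], 1.0)
if readable:
    data = os.read(p.stdout.fileno(), 4096)  # won't block: data is ready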
Incidentally, .communicate, at least in the Python 2.5 and 2.6 standard libraries, will only handle about 64K of remote data (on Linux and FreeBSD). This number may vary based on various factors (possibly including the build options used to compile your Python interpreter, or the version of libc linked to it). It is NOT simply limited by available memory (despite J.F. Sebastian's assertion to the contrary) but is limited to a much smaller value.
Push reading from the pipe into a separate thread that signals when a chunk of output is available:
How can I read all availably data from subprocess.Popen.stdout (non blocking)?
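A condensed sketch of that recipe (Python 3; the queue plays the role of the signal, and ./a.out is a placeholder for the program):

import subprocess
from queue import Queue, Empty
from threading import Thread

def enqueue_output(stream, queue):
    # Runs in a worker thread: block on the pipe and forward each line.
    for line in iter(stream.readline, b""):
        queue.put(line)
    stream.close()

p = subprocess.Popen(["./a.out"], stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE)
q = Queue()
Thread(target=enqueue_output, args=(p.stdout, q), daemon=True).start()

try:
    line = q.get(timeout=0.1)   # the main thread never blocks for long
except Empty:
    line = None                 # no output available yet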
The bufsize=256 argument prevents 12345\n from being sent to the child process in a chunk smaller than 256 bytes; it will be sent immediately if you omit bufsize or insert p.stdin.flush() after p.stdin.write(). Default behaviour is line-buffering.
In either case you should see at least one empty line before blocking, emitted by the leading \n of the first printf in your example.
Your particular example doesn't require "real-time" interaction. The following works:
from subprocess import Popen, PIPE
p = Popen(["./a.out"], stdin=PIPE, stdout=PIPE)
output = p.communicate(b"12345")[0] # send input/read all output
print output,
where a.out is your example C program.
In general, for a dialog-based interaction with a subprocess you could use pexpect module (or its analogs on Windows):
import pexpect
child = pexpect.spawn("./a.out")
child.expect("input>>")
child.sendline("12345.67890") # send a number
child.expect(r"\d+\.\d+") # expect the number at the end
print float(child.after) # assert that we can parse it
child.close()
I had the same problem, and "proc.communicate()" does not solve it because it waits for the process to terminate.
So here is what is working for me, on Windows with Python 3.5.1:

import subprocess as sp
import time

myProcess = sp.Popen(cmd, creationflags=sp.CREATE_NEW_PROCESS_GROUP,
                     stdout=sp.PIPE, stderr=sp.STDOUT)
i = 0
while i < 40:
    i += 1
    time.sleep(.5)
    out = myProcess.stdout.readline().decode("utf-8").rstrip()
I guess creationflags and the other arguments are not mandatory (but I don't have time to test), so this would be the minimal syntax:

myProcess = sp.Popen(cmd, stdout=sp.PIPE)
for i in range(40):
    time.sleep(.5)
    out = myProcess.stdout.readline()
I have some data that I would like to gzip, uuencode and then print to standard out. What I basically have is:
import subprocess
from subprocess import Popen

compressor = Popen("gzip", stdin=subprocess.PIPE, stdout=subprocess.PIPE)
encoder = Popen(["uuencode", "dummy"], stdin=compressor.stdout)
The way I feed data to the compressor is through compressor.stdin.write(stuff).
What I really need to do is to send an EOF to the compressor, and I have no idea how to do it.
At some point, I tried compressor.stdin.close() but that doesn't work -- it works well when the compressor writes to a file directly, but in the case above, the process doesn't terminate and stalls on compressor.wait().
Suggestions? In this case, gzip is an example and I really need to do something with piping the output of one process to another.
Note: The data I need to compress won't fit in memory, so communicate isn't really a good option here. Also, if I just run
compressor.communicate("Testing")
after the 2 lines above, it still hangs with the error
File "/usr/lib/python2.4/subprocess.py", line 1041, in communicate
rlist, wlist, xlist = select.select(read_set, write_set, [])
I suspect the issue is with the order in which you open the pipes. uuencode is funny in that it will whine when you launch it if the incoming pipe isn't set up in just the right way (try launching the darn thing on its own in a Popen call, with just PIPE as the stdin and stdout, to see the explosion).
Try this:
from subprocess import Popen, PIPE

encoder = Popen(["uuencode", "dummy"], stdin=PIPE, stdout=PIPE)
compressor = Popen("gzip", stdin=PIPE, stdout=encoder.stdin)
compressor.communicate("UUencode me please")
encoded_text = encoder.communicate()[0]
print encoded_text
begin 644 dummy
F'XL(`%]^L$D``PL-3<U+SD])5<A-52C(24TL3#4`;2O+"!(`````
`
end
You are right, btw... there is no way to send a generic EOF down a pipe. After all, each program really defines its own EOF. The way to do it is to close the pipe, as you were trying to do.
EDIT: I should be clearer about uuencode. As a shell program, its default behaviour is to expect console input. If you run it without a "live" incoming pipe, it will block waiting for console input. By opening the encoder second, before you had sent material down the compressor pipe, the encoder was blocking, waiting for you to start typing. Jerub was right in that there was something blocking.
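For reference, the pipeline pattern from the subprocess documentation makes the close-as-EOF behaviour explicit at both ends (a Python 3 sketch, hence the byte strings):

from subprocess import Popen, PIPE

compressor = Popen(["gzip"], stdin=PIPE, stdout=PIPE)
encoder = Popen(["uuencode", "dummy"], stdin=compressor.stdout, stdout=PIPE)
compressor.stdout.close()   # so the encoder sees EOF once gzip exits

compressor.stdin.write(b"Testing")
compressor.stdin.close()    # this close is gzip's "EOF"
print(encoder.communicate()[0].decode())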
This is not the sort of thing you should be doing directly in Python; there are eccentricities in how things work that make it a much better idea to do this with a shell. If you can just use subprocess.Popen("foo | bar", shell=True), then all the better.
What might be happening is that gzip has not been able to output all of its input yet, and the process will not exit until its stdout writes have finished.
You can look at what system call a process is blocking on if you use strace. Use ps auxwf to discover which process is the gzip process, then use strace -p $pidnum to see what system call it is performing. Note that stdin is FD 0 and stdout is FD 1, you will probably see it reading or writing on those file descriptors.
If you just want to compress and don't need the file wrappers, consider using the zlib module:
import zlib
compressed = zlib.compress("text")
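If you do need the gzip file wrapper but still want to stay in-process, the gzip module covers that too (a Python 3 sketch; note it takes bytes):

import gzip

compressed = gzip.compress(b"text")   # includes the gzip container format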
Any reason why the shell=True and Unix pipes suggestions won't work?

from subprocess import *

pipes = Popen("gzip | uuencode dummy", stdin=PIPE, stdout=PIPE, shell=True)
for i in range(1, 100):
    pipes.stdin.write("some data")
pipes.stdin.close()
print pipes.stdout.read()

Seems to work.