I'm writing a Python program that uses the subprocess module to communicate with the admin interface of an appliance over ssh. Sometimes the appliance prompts for input with a line that's not newline-terminated. How do I get subprocess.communicate() to return those lines to me? Is there a way to read unbuffered and character by character? The amount of I/O generated is pretty small, so I'm not concerned about high overhead here.
Opening the process with bufsize=0 turns off buffering on the Python side of the pipes, according to the subprocess docs. You'll still have to pass stdout=subprocess.PIPE (and/or stderr=subprocess.PIPE) to Popen and read from those pipes yourself; communicate() waits for the process to terminate before it returns any of the command's output. Note that Popen needs real file descriptors for stdout and stderr, so an in-memory object like a StringIO won't work there.
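A minimal sketch of reading a prompt byte by byte with bufsize=0, using printf as a stand-in for the ssh session (the real command would be your ssh invocation):

```python
import subprocess

# Stand-in child that emits a prompt with no trailing newline.
proc = subprocess.Popen(
    ["printf", "password: "],
    stdout=subprocess.PIPE,
    bufsize=0,  # unbuffered pipe objects on the Python side
)

# Read one byte at a time so a prompt without a newline is still seen.
chars = []
while True:
    c = proc.stdout.read(1)
    if not c:  # EOF
        break
    chars.append(c)
prompt = b"".join(chars).decode()
proc.wait()
print(prompt)  # "password: "
```

Reading single bytes is slow in general, but as you say the I/O volume here is small, so the overhead is irrelevant.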
Related
I've created a subprocess using subprocess.Popen(['myapp'], stdin=PIPE, stdout=PIPE, stderr=PIPE, bufsize=0) that executes a C-program that writes to stdout using e.g. puts().
The problem is that the Python program blocks in p.stdout.read(1024), even though the subprocess starts by calling puts("HelloWorld"). Only after the subprocess terminates is output available on p.stdout. I thought that bufsize=0 would mean that the pipes become unbuffered, so that output would be immediately available on the pipe.
I have read the question below, which states that newlines should cause output to be flushed. However, puts() prints a newline, so is the pipe not recognized as an interactive device?
Difference between puts() and printf() in C while using sleep()
It's because puts is also outputting a newline character which, on devices that can be determined to be interactive, causes flushing by default (for standard output).
Any ideas?
This is application behavior. Even if the pipe itself is unbuffered at the OS level, applications normally buffer data they are going to write to a file (any type of file) in their own stdio buffers for some time before actually writing it. As Jon's comment above indicates, a library call like fflush() can be used by programs to ensure that they actually have posted the data to the OS; fsync() goes further and, where applicable, waits until a physical I/O operation has actually completed.
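The same behavior is easy to reproduce from Python without any C code. In this sketch the child buffers its stdout (because it is a pipe, not a terminal) and the parent's readline only returns because the child flushes explicitly:

```python
import subprocess
import sys
import textwrap

# Child program: writes a line, flushes, then stays alive. Without the
# explicit flush the line would sit in the child's stdio buffer and the
# parent's readline() below would hang until the child exited.
child_src = textwrap.dedent("""
    import sys, time
    print("hello")        # lands in the child's stdio buffer, not the pipe
    sys.stdout.flush()    # hands the data to the OS pipe right now
    time.sleep(10)        # keep the child alive so nothing is flushed at exit
""")

proc = subprocess.Popen([sys.executable, "-c", child_src],
                        stdout=subprocess.PIPE)
line = proc.stdout.readline()   # returns promptly only because of the flush
proc.terminate()
proc.wait()
```

Commenting out the flush line in the child makes the parent block for the full sleep, which is exactly the symptom described in the question.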
I have a simulation program which is piloted through stdin and provides output on stdout.
Doing a C++/Qt program for running it in a QProcess works well.
Doing a Python program for running it under linux works well, using:
p = subprocess.Popen(cmd,stdin=subprocess.PIPE,stdout=subprocess.PIPE)
And using p.stdin.write, p.stdout.readline, and p.wait
However, under Windows, the program runs and receives the commands through stdin as it should (this has been verified by debugging the subprocess), but the Python program deadlocks at any p.stdout.readline and p.wait. If the stdout=subprocess.PIPE parameter is removed, the program works: the output is displayed on the console and no deadlock occurs.
This sounds similar to a warning from the Python documentation:
Warning: This will deadlock when using stdout=PIPE and/or stderr=PIPE and the child process generates enough output to a pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.
However, I can't use communicate(), as the program protocol is not a single command and a single output, rather several commands and replies are required.
Is there any solution?
Unsure of it, but it looks like a buffering problem. On Linux (as on most Unix or Unix-like systems), output going to a pipe is buffered inside the writing program by the C standard library: when stdout is a pipe rather than a terminal, stdio switches from line buffering to full (block) buffering. That means that after a write through stdio, the data sits in the program's buffer and nothing is available at the other end of the pipe until either that buffer is full, the data is explicitly flushed, or the stream is closed. That's one of the reasons why ptys were invented and are not implemented with a pipe pair.
Said differently, it is not possible to drive a program where you need to use previous output to know what you should give as input over pipes, unless the program has been specially tailored for it by consistently flushing its output before reading anything. It works on a true terminal (tty or pty) because the C library line-buffers output to interactive devices, so every newline forces a flush.
But it is not the same deadlock that is described in the documentation you cited in your question.
I am looking to interface with an interactive command line application using Python 3.5. The idea is that I start the process at the beginning of the Python script and leave it open. In a loop, I print a file path, followed by a line return, to stdin, wait for a quarter second or so as it processes, and read from stdout until it reaches a newline.
This is quite similar to the communicate feature of subprocess, but I am waiting for a line return instead of waiting for the process to terminate. Anyone aware of a relatively simple way to do this?
Edit: it would be preferable to use the standard library to do this, rather than third-party libraries such as pexpect, if possible.
You can use subprocess.Popen for this.
Something like this:
proc = subprocess.Popen(['my-command'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
Now proc.stdin and proc.stdout are your ends of pipes that send data to the subprocess stdin and read from the subprocess stdout.
Since you're only interested in reading newline-terminated lines, you can probably get around any problems caused by buffering. Buffering is one of the big gotchas when using subprocess to communicate with interactive processes. Output to a pipe is typically block-buffered (line-buffered at best), so if the subprocess doesn't flush and terminate a line with a newline, you might never see any data on proc.stdout; the same applies in the other direction when you write to proc.stdin without ending with a newline and flushing. You can turn buffering off, but that's not so simple, and not platform independent.
Another problem you might have to solve is that you can't determine whether the subprocess is waiting for input or has sent you output except by writing and reading from the pipes. So you might need to start a second thread so you can wait for output on proc.stdout and write to proc.stdin at the same time without running into a deadlock because both processes are blocking on pipe I/O (or, if you're on a Unix which supports select with file handles, use select to determine which pipes are ready to receive or ready to be read from).
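A minimal sketch of the second-thread approach, using a hypothetical line-echoing child in place of the real interactive program (the real child would be your command-line application):

```python
import queue
import subprocess
import sys
import threading

# Hypothetical interactive child: echoes each input line back, uppercased.
child_src = ("import sys\n"
             "for line in sys.stdin:\n"
             "    sys.stdout.write(line.upper())\n"
             "    sys.stdout.flush()\n")

proc = subprocess.Popen(
    [sys.executable, "-c", child_src],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE,
    text=True, bufsize=1,  # line-buffered text pipes on the Python side
)

lines = queue.Queue()

def reader():
    # Runs in the background so the main thread can keep writing to
    # proc.stdin without deadlocking on a full pipe.
    for line in proc.stdout:
        lines.put(line)

threading.Thread(target=reader, daemon=True).start()

proc.stdin.write("first path\n")
proc.stdin.flush()
reply = lines.get(timeout=5)   # "FIRST PATH\n"

proc.stdin.close()
proc.wait()
```

The queue decouples the two directions: the main thread writes requests and blocks on the queue, while the reader thread blocks on the pipe.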
This sounds like a job for an event loop. The subprocess module starts to show its strain under complex tasks.
I've done this with Twisted, by subclassing the following:
twisted.internet.endpoints.ProcessEndpoint
twisted.protocols.basic.LineOnlyReceiver
Most documentation for Twisted uses sockets as endpoints, but it's not hard to adjust the code for processes.
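If third-party dependencies like Twisted are off the table (as the edit to the question suggests), the standard library's asyncio provides a comparable event-loop approach: awaitable readline() on the child's stdout, without manual thread management. A sketch, again with a hypothetical echo child standing in for the real program:

```python
import asyncio
import sys

# Hypothetical child: prefixes each input line with "got " and flushes.
child_src = ("import sys\n"
             "for line in sys.stdin:\n"
             "    sys.stdout.write('got ' + line)\n"
             "    sys.stdout.flush()\n")

async def main():
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-c", child_src,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
    )
    proc.stdin.write(b"ping\n")
    await proc.stdin.drain()                # make sure the request is sent
    line = await proc.stdout.readline()     # waits for one newline-terminated reply
    proc.stdin.close()
    await proc.wait()
    return line

reply = asyncio.run(main())
```

Multiple request/reply rounds are just repeated write/drain/readline awaits inside the coroutine.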
I want to make a Python wrapper for another command-line program.
I want to read Python's stdin as quickly as possible, filter and translate it, and then write it promptly to the child program's stdin.
At the same time, I want to be reading as quickly as possible from the child program's stdout and, after a bit of massaging, writing it promptly to Python's stdout.
The Python subprocess module is full of warnings to use communicate() to avoid deadlocks. However, communicate() doesn't give me access to the child program's stdout until the child has terminated.
I think you'll be fine (carefully) ignoring the warnings and using Popen.stdin etc. yourself. Just be sure to process the streams line by line and iterate through them on a fair schedule so as not to fill up any pipe buffers. A relatively simple (though somewhat inefficient) way of doing this in Python is to use a separate thread for each of the three streams. That's how Popen.communicate does it internally. Check out its source code to see how.
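A sketch of the one-thread-per-direction idea for the wrapper described in the question, with a hypothetical echo child standing in for the wrapped program and trivial upper-casing/prefixing standing in for the real filtering:

```python
import subprocess
import sys
import threading

# Hypothetical child: echoes its input lines unchanged.
child_src = ("import sys\n"
             "for line in sys.stdin:\n"
             "    sys.stdout.write(line)\n"
             "    sys.stdout.flush()\n")

proc = subprocess.Popen([sys.executable, "-c", child_src],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)

inputs = ["alpha\n", "beta\n"]   # stands in for the wrapper's own stdin
outputs = []                     # stands in for the wrapper's own stdout

def feed():
    for line in inputs:
        proc.stdin.write(line.upper())   # "filter and translate" on the way in
        proc.stdin.flush()
    proc.stdin.close()                   # EOF lets the child terminate

def drain():
    for line in proc.stdout:
        outputs.append("> " + line)      # "a bit of massaging" on the way out

t_in = threading.Thread(target=feed)
t_out = threading.Thread(target=drain)
t_in.start(); t_out.start()
t_in.join(); t_out.join()
proc.wait()
```

Because each direction has its own thread, neither side can deadlock waiting for the other to empty a full pipe buffer, which is exactly the hazard the warnings are about.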
Disclaimer: this solution likely requires that you have access to the source code of the process you are trying to call, but may be worth trying anyway. It depends on the called process periodically flushing its stdout buffer, which is not standard behaviour.
Say you have a process proc created by subprocess.Popen. proc has attributes stdin and stdout. These attributes are simply file-like objects. So, in order to send information through stdin you would call proc.stdin.write(). To retrieve information from proc.stdout you would call proc.stdout.readline() to read an individual line.
A couple of caveats:
When writing to proc.stdin via write() you will need to end the input with a newline character. Without a newline character, your subprocess will hang until a newline is passed.
In order to read information from proc.stdout you will need to make sure that the command called by subprocess appropriately flushes its stdout buffer after each print statement and that each line ends with a newline. If the stdout buffer does not flush at appropriate times, your call to proc.stdout.readline() will hang.
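Both caveats together, sketched with a cooperative child that flushes after every reply (in Python, print(..., flush=True) is the easy way when you control the child's source):

```python
import subprocess
import sys

# Cooperative child: replies to each request line and flushes immediately,
# as the caveats above require.
child_src = ("import sys\n"
             "for line in sys.stdin:\n"
             "    print('seen', line.strip(), flush=True)\n")

proc = subprocess.Popen([sys.executable, "-c", child_src],
                        stdin=subprocess.PIPE, stdout=subprocess.PIPE,
                        text=True)

proc.stdin.write("one\n")       # the trailing newline is required (caveat 1)
proc.stdin.flush()
reply = proc.stdout.readline()  # returns only because the child flushed (caveat 2)

proc.stdin.close()
proc.wait()
```

Remove the flush=True in the child and the readline above blocks, which is the hang described in the second caveat.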
I am trying to employ a Subprocess in Python for keeping an external script open in a Server-like fashion. The external script first loads a model. Once this is done, it accepts requests via STDIN and returns processed strings to STDOUT.
So far, I've tried
tokenizer = subprocess.Popen([tokenizer_path, '-l', lang_prefix], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
However, I cannot use
tokenizer.stdin.write(input_string + '\n')
out = tokenizer.stdout.readline()
in order to repeatedly process input_strings by means of the subprocess – out will just be empty, no matter if I use stdout.read() or stdout.readline(). However, it works when I close the stdin with tokenizer.stdin.close() before reading STDOUT, but this closes the subprocess, which is not what I want as I would have to reload the whole external script again before sending another request.
Is there any way to use a subprocess in a server-like fashion in python without closing and re-opening it?
Thanks to this answer, I found out that a pty slave handle must be used as the subprocess's stdout in order to properly communicate with the subprocess:
import os
import pty
import subprocess

master, slave = pty.openpty()
tokenizer = subprocess.Popen(script, shell=True, stdin=subprocess.PIPE, stdout=slave)
stdin_handle = tokenizer.stdin
stdout_handle = os.fdopen(master)
Now, I can communicate with the subprocess without closing it via
stdin_handle.write(input_string + '\n')
stdin_handle.flush()
out = stdout_handle.readline()  # gets the processed input
Your external script probably buffers its output, so you only see it in the parent when the child's buffer is flushed (which the child must do itself). One way to make it flush its buffers is to close its input: the script then terminates in a proper fashion and flushes its buffers in the process.
If you have control over the external program (i.e. if you can patch it), insert a flush after the output is produced.
Otherwise, programs can sometimes be made not to buffer their output by attaching them to a pseudo-TTY: the C standard library line-buffers standard output when it detects an interactive device, on the assumption that full buffering is unwanted there. But this is a bit tricky.
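A minimal sketch of the pseudo-TTY trick, using a child that never flushes explicitly. On a plain pipe its output would sit in the stdio buffer until exit; connected to a pty slave, stdio line-buffers and the line arrives immediately:

```python
import os
import pty
import subprocess
import sys

# Child that prints once and then lingers, without any explicit flush.
child_src = ("import time\n"
             "print('ready')\n"
             "time.sleep(10)\n")

master, slave = pty.openpty()
proc = subprocess.Popen([sys.executable, "-c", child_src],
                        stdout=slave, close_fds=True)
os.close(slave)                 # the parent keeps only the master end

line = os.read(master, 1024)    # arrives immediately despite no explicit flush
proc.terminate()
proc.wait()
os.close(master)
```

Note that the terminal driver translates the child's "\n" to "\r\n" on the way through, so reads from the master end see carriage returns; strip them (or disable ONLCR via termios) before comparing lines.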