Ok so I'm trying to run a C program from a python script. Currently I'm using a test C program:
#include <stdio.h>
#include <unistd.h>   /* for sleep() */

int main() {
    while (1) {
        printf("2000\n");
        sleep(1);
    }
    return 0;
}
This simulates the program I will actually be using, which constantly takes readings from a sensor.
Then I'm trying to read the output (in this case "2000") from the C program with subprocess in Python:
#!/usr/bin/python
import subprocess

process = subprocess.Popen("./main", stdout=subprocess.PIPE)
while True:
    for line in iter(process.stdout.readline, ''):
        print line,
but this is not working. From adding print statements, I can see that it runs the .Popen line and then waits at for line in iter(process.stdout.readline, ''): until I press Ctrl-C.
Why is this? This is exactly the code that most examples I've seen use, and yet it does not read the output.
Is there a way of making it run only when there is something to be read?
It is a block buffering issue.
What follows is a version of my answer to the question Python: read streaming input from subprocess.communicate(), extended for your case.
Fix stdout buffer in C program directly
As a rule, stdio-based programs are line buffered when running interactively in a terminal and block buffered when their stdout is redirected to a pipe. In the latter case, you won't see new lines until the buffer overflows or is flushed.
To avoid calling fflush() after each printf() call, you could force line-buffered output by making this call at the very beginning of the C program:
setvbuf(stdout, (char *) NULL, _IOLBF, 0); /* make line buffered stdout */
As soon as a newline is printed the buffer is flushed in this case.
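Once the C program is line buffered (or flushes after each printf()), a plain pipe on the Python side is enough. A minimal Python 2 sketch, assuming the compiled binary is ./main as in the question:
from subprocess import Popen, PIPE

process = Popen(["./main"], stdout=PIPE, bufsize=1)
for line in iter(process.stdout.readline, b''):
    print line,  # each "2000" arrives roughly once per second
process.stdout.close()
process.wait()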
Or fix it without modifying the source of C program
There is the stdbuf utility that allows you to change the buffering type without modifying the source code, e.g.:
from subprocess import Popen, PIPE

process = Popen(["stdbuf", "-oL", "./main"], stdout=PIPE, bufsize=1)
for line in iter(process.stdout.readline, b''):
    print line,
process.communicate()  # close process' stream, wait for it to exit
There are also other utilities available, see Turn off buffering in pipe.
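For example, a sketch using the unbuffer utility (part of the expect package, assumed to be installed) instead of stdbuf:
from subprocess import Popen, PIPE

process = Popen(["unbuffer", "./main"], stdout=PIPE, bufsize=1)
for line in iter(process.stdout.readline, b''):
    print line,
process.communicate()  # close the stream, wait for the process to exit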
Or use pseudo-TTY
To trick the subprocess into thinking that it is running interactively, you could use the pexpect module or its analogs; for code examples that use the pexpect and pty modules, see Python subprocess readlines() hangs. Here's a variation on the pty example provided there (it should work on Linux):
#!/usr/bin/env python
import os
import pty
import sys
from select import select
from subprocess import Popen, STDOUT
master_fd, slave_fd = pty.openpty()  # provide tty to enable line buffering
process = Popen("./main", stdin=slave_fd, stdout=slave_fd, stderr=STDOUT,
                bufsize=0, close_fds=True)
timeout = .1  # ugly but otherwise `select` blocks on process' exit
# code is similar to _copy() from pty.py
with os.fdopen(master_fd, 'r+b', 0) as master:
    input_fds = [master, sys.stdin]
    while True:
        fds = select(input_fds, [], [], timeout)[0]
        if master in fds:  # subprocess' output is ready
            data = os.read(master_fd, 512)  # <-- doesn't block, may return less
            if not data:  # EOF
                input_fds.remove(master)
            else:
                os.write(sys.stdout.fileno(), data)  # copy to our stdout
        if sys.stdin in fds:  # got user input
            data = os.read(sys.stdin.fileno(), 512)
            if not data:
                input_fds.remove(sys.stdin)
            else:
                master.write(data)  # copy it to subprocess' stdin
        if not fds:  # timeout in select()
            if process.poll() is not None:  # subprocess ended
                # and no output is buffered <-- timeout + dead subprocess
                assert not select([master], [], [], 0)[0]  # race is possible
                os.close(slave_fd)  # subprocess doesn't need it anymore
                break
rc = process.wait()
print("subprocess exited with status %d" % rc)
Or use pty via pexpect
pexpect wraps pty handling into a higher-level interface:
#!/usr/bin/env python
import pexpect
child = pexpect.spawn("./main")
for line in child:
    print line,
child.close()
Q: Why not just use a pipe (popen())? explains why pseudo-TTY is useful.
Your program isn't hung, it just runs very slowly. Your program is using buffered output; the "2000\n" data is not being written to stdout immediately, but will eventually make it there. In your case, it might take BUFSIZ/strlen("2000\n") seconds to complete: with a typical BUFSIZ of 8192 bytes, that's 8192 / 5 ≈ 1638 seconds.
After this line:
printf("2000\n");
add
fflush(stdout);
See readline docs.
Your code:
process.stdout.readline
is waiting for EOF or a newline.
I cannot tell what you are ultimately trying to do, but making sure your printf ends with a newline, e.g. printf("2000\n");, should at least get you started.
Related
I have the following C application
#include <stdio.h>

int main(void)
{
    printf("hello world\n");
    /* Go into an infinite loop here. */
    while(1);
    return 0;
}
And I have the following python code.
import subprocess
import time
import pprint

def run():
    command = ["./myapplication"]
    process = subprocess.Popen(command, stdout=subprocess.PIPE)
    try:
        while process.poll() is None:
            # HELP: This call blocks...
            for i in process.stdout.readline():
                print(i)
    finally:
        if process.poll() is None:
            process.kill()

if __name__ == "__main__":
    run()
When I run the python code, the stdout.readline or even stdout.read blocks.
If I run the application using subprocess.call(program) then I can see "hello world" in stdout.
How can I read input from stdout with the example I have provided?
Note: I would prefer not to modify my C program. I have tried this on both Python 2.7.17 and Python 3.7.5 under Ubuntu 19.10 and I get the same behaviour. Adding bufsize=0 did not help me.
The easiest way is to flush buffers in the C program
...
printf("hello world\n");
fflush(stdout);
while(1);
...
If you don't want to change the C program, you can manipulate the libc buffering behaviour from outside. This can be done by using stdbuf to call your program (on Linux). The syntax is stdbuf -o0 yourapplication for unbuffered output and stdbuf -oL yourapplication for line-buffered output. Therefore, in your Python code use
...
command = ["/usr/bin/stdbuf","-oL","pathtomyapplication"]
process = subprocess.Popen(command, stdout=subprocess.PIPE)
...
or
...
command = ["/usr/bin/stdbuf","-o0","pathtomyapplication"]
process = subprocess.Popen(command, stdout=subprocess.PIPE)
...
Applications built using the C Standard IO Library (built with #include <stdio.h>) buffer input and output (see here for why). The stdio library can tell, much as isatty does, that it is writing to a pipe rather than a TTY, so it chooses block buffering instead of line buffering. Data is flushed only when the buffer is full, and "hello world\n" is not enough to fill the buffer, so it is never flushed.
One way around this is shown in Timo Hartmann's answer, using the stdbuf utility. This uses an LD_PRELOAD trick to swap in its own libstdbuf.so. In many cases that is a fine solution, but LD_PRELOAD is kind of a hack and does not work in some cases, so it may not be a general solution.
Maybe you want to do this directly in Python, and there are stdlib options to help here: you can use a pseudo-tty (docs py2, docs py3) connected to stdout instead of a pipe. The program myapplication will then enable line buffering, meaning that any newline character flushes the buffer.
from __future__ import print_function
from subprocess import Popen, PIPE
import errno
import os
import pty
import sys
mfd, sfd = pty.openpty()
proc = Popen(["/tmp/myapplication"], stdout=sfd)
os.close(sfd) # don't wait for input
while True:
    try:
        output = os.read(mfd, 1000)
    except OSError as e:
        if e.errno != errno.EIO:
            raise
    else:
        print(output)
Note that we are reading bytes from the output now, so we can not necessarily decode them right away!
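If you need text rather than raw bytes, here is a sketch (assuming the child writes UTF-8) that decodes the chunks incrementally, so a multi-byte character split across two os.read() calls is still decoded correctly:
import codecs

decoder = codecs.getincrementaldecoder('utf-8')()
# inside the while loop above, instead of print(output):
text = decoder.decode(output)
sys.stdout.write(text)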
See Processing the output of a subprocess with Python in realtime for a blog post cleaning up this idea. There are also existing third-party libraries to do this stuff, see ptyprocess.
Here is my demo code. It contains two scripts.
The first is main.py; it calls print_line.py with the subprocess module.
The second is print_line.py; it prints something to stdout.
main.py
import subprocess

p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)

while True:
    line = p.stdout.readline()
    if line:
        print(line)
    else:
        break
print_line.py
from multiprocessing import Process, JoinableQueue, current_process

if __name__ == '__main__':
    task_q = JoinableQueue()

    def do_task():
        while True:
            task = task_q.get()
            pid = current_process().pid
            print 'pid: {}, task: {}'.format(pid, task)
            task_q.task_done()

    for _ in range(10):
        p = Process(target=do_task)
        p.daemon = True
        p.start()

    for i in range(100):
        task_q.put(i)

    task_q.join()
Before, print_line.py was written with the threading and Queue modules and everything was fine. But now, after changing to the multiprocessing module, main.py cannot get any output from print_line.py. I tried using Popen.communicate() to get the output and setting preexec_fn=os.setsid in Popen(). Neither of them works.
So, here is my question:
Why can't subprocess get the output when print_line.py uses multiprocessing, and why is it OK with threading?
If I comment out stdout=subprocess.PIPE and stderr=subprocess.PIPE, the output is printed to my console. Why? How does this happen?
Is there any chance to get the output from print_line.py?
Curious.
In theory this should work as it is, but it does not. The reason lies somewhere in the deep, murky waters of buffered IO: it seems that the output of a subprocess of a subprocess can get lost if it is not flushed.
You have two workarounds:
One is to use flush() in your print_line.py:
import sys

def do_task():
    while True:
        task = task_q.get()
        pid = current_process().pid
        print 'pid: {}, task: {}'.format(pid, task)
        sys.stdout.flush()
        task_q.task_done()
This will fix the issue as you will flush your stdout as soon as you have written something to it.
Another option is to pass the -u flag to Python in your main.py:
p = subprocess.Popen('python2 -u print_line.py',
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
close_fds=True,
shell=True,
universal_newlines=True)
-u will force stdin and stdout to be completely unbuffered in print_line.py, and children of print_line.py will then inherit this behaviour.
These are workarounds to the problem. If you are interested in the theory of why this happens, it definitely has something to do with unflushed stdout being lost when the subprocess terminates, but I am not an expert in this.
It's not a multiprocessing issue, but it is a subprocess issue—or more precisely, it has to do with standard I/O and buffering, as in Hannu's answer. The trick is that by default, the output of any process, whether in Python or not, is line buffered if the output device is a "terminal device" as determined by os.isatty(stream.fileno()):
>>> import sys
>>> sys.stdout.fileno()
1
>>> import os
>>> os.isatty(1)
True
There is a shortcut available to you once the stream is open:
>>> sys.stdout.isatty()
True
but the os.isatty() operation is the more fundamental one. That is, internally, Python inspects the file descriptor first using os.isatty(fd), then chooses the stream's buffering based on the result (and/or arguments and/or the function used to open the stream). The sys.stdout stream is opened early on during Python's startup, before you generally have much control.1
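To see the other side of that check, here is a small sketch that runs a child Python process through a pipe; the child's stdout is not a terminal, which is why block buffering gets chosen:
import subprocess
import sys

# check_output() connects the child's stdout to a pipe, not a TTY
out = subprocess.check_output(
    [sys.executable, '-c', 'import sys; print(sys.stdout.isatty())'])
print(out)  # the child reports False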
When you call open or codecs.open or otherwise do your own operation to open a file, you can specify the buffering via one of the optional arguments. The default for open is the system default, which is line buffering if isatty(), otherwise fully buffered. Curiously, the default for codecs.open is line buffered.
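For illustration, a minimal sketch of requesting line buffering explicitly when opening a file (the file name is just an example):
# buffering=1 requests line buffering for a text-mode file;
# buffering=0 (unbuffered) is only allowed for binary-mode files in Python 3.
with open('example.log', 'w', buffering=1) as f:
    f.write('flushed as soon as the newline is written\n')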
A line buffered stream gets an automatic flush() applied when you write a newline to it.
An unbuffered stream writes each byte to its output immediately. This is very inefficient in general. A fully buffered stream writes its output when the buffer gets sufficiently full—the definition of "sufficient" here tends to be pretty variable, anything from 1024 (1k) to 1048576 (1 MB)—or when explicitly directed.
When you run something as a process, it's the process itself that decides how to do any buffering. Your own Python code, reading from the process, cannot control it. But if you know something—or a lot—about the processes that you will run, you can set up their environment so that they run line-buffered, or even unbuffered. (Or, as in your case, since you write that code, you can write it to do what you want.)
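For example, since the child here is itself a Python script, a sketch of setting up its environment so that it runs unbuffered (PYTHONUNBUFFERED has the same effect as the -u switch):
import os
import subprocess

env = dict(os.environ, PYTHONUNBUFFERED='1')
p = subprocess.Popen(['python2', 'print_line.py'],
                     stdout=subprocess.PIPE, env=env)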
1There are hooks that fire up very early, where you can fuss with this sort of thing. They are tricky to work with, though.
I have a problem forwarding the stdout of a subprocess to stdout of the current process.
This is my MWE caller code (runner.py):
import sys
import subprocess
import time
p = subprocess.Popen([sys.executable, "test.py"], stdout=sys.stdout)
time.sleep(10)
p.terminate()
and here is the content of the callee test.py:
import time

while True:
    time.sleep(1)
    print "Heartbeat"
The following will work and print all the heartbeats to the console:
python runner.py
However, the following does not work, the output text file remains empty (using Python 2.7):
python runner.py > test.txt
What do I have to do?
When the standard output is a TTY (a terminal), sys.stdout is line-buffered by default: each line you print is immediately written to the TTY.
But when the standard output is a file, then sys.stdout is block-buffered: data is written to the file only when a certain amount of data gets printed. By using p.terminate(), you are killing the process before the buffer is flushed.
Use sys.stdout.flush() after your print and you'll be fine:
import sys
import time

while True:
    time.sleep(1)
    print "Heartbeat"
    sys.stdout.flush()
If you were using Python 3, you could also use the flush argument of the print function, like this:
import time

while True:
    time.sleep(1)
    print("Heartbeat", flush=True)
Alternatively, you can also set up a handler for SIGTERM to ensure that the buffer is flushed when p.terminate() gets called. The handler must accept the (signum, frame) arguments, and it should exit so that terminate() still ends the process:
import signal, sys
# flush the buffer, then exit so that terminate() still stops the process
signal.signal(signal.SIGTERM, lambda signum, frame: (sys.stdout.flush(), sys.exit(0)))
It is possible to force flushes by doing sys.stdout.flush() after each print, but this would quickly become cumbersome. Since you know you're running Python, you can force Python into unbuffered mode, either with the -u switch or with the PYTHONUNBUFFERED environment variable:
p = subprocess.Popen([sys.executable, '-u', 'test.py'], stdout=sys.stdout)
or
import os
# force all future python processes to be unbuffered
os.environ['PYTHONUNBUFFERED'] = '1'
p = subprocess.Popen([sys.executable, 'test.py'])
You don't need to pass stdout=sys.stdout unless sys.stdout uses a different file descriptor than the one the python executable was started with. The child inherits the parent's stdout file descriptor by default: you don't need to do anything for the child process to inherit it.
As @Andrea Corbellini said, if the output is redirected to a file then python uses a block-buffering mode, and "Heartbeat"*10 (usually) is too small to overflow the buffer.
I would expect python to flush its internal stdout buffers on exit, but it doesn't do so for the SIGTERM signal (generated by the .terminate() call).
To allow the child process to exit gracefully, use SIGINT (Ctrl+C) instead of p.terminate():
import signal
p.send_signal(signal.SIGINT)
In that case test.py will flush its buffers and you'll see the output in the test.txt file. Either discard stderr or catch the KeyboardInterrupt exception in the child, as sketched below.
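A sketch of the second option: a test.py that catches KeyboardInterrupt, so SIGINT leads to a normal exit and Python flushes stdout on the way out:
import time

try:
    while True:
        time.sleep(1)
        print "Heartbeat"
except KeyboardInterrupt:
    pass  # exit normally; stdout is flushed on interpreter shutdown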
If you want to see the output while the child process is still running, then run python -u to disable buffering, or set the PYTHONUNBUFFERED envvar to change the behaviour of all affected python processes, as @Antti Haapala suggested.
Note: your parent process may also buffer the output. If you don't flush its buffer in time, then output printed before test.py is even started may appear after test.py's output in the test.txt file. The buffers in the parent and the child processes are independent. Either flush the buffers manually or make sure that an appropriate buffering mode is used in each process. See Disable output buffering.
I'm trying to parse, in real time, the output of a block-buffered program, which means that the output is not available until the process ends. What I need is to parse it line by line, filtering and managing data from the output, as it could run for hours.
I've tried to capture the output with subprocess.Popen(), but, as you may guess, Popen can't manage this kind of behaviour; it keeps buffering until the end of the process.
from subprocess import Popen, PIPE
p = Popen("my noisy stuff ", shell=True, stdout=PIPE, stderr=PIPE)
for line in p.stdout.readlines():
    # parsing text and getting data
So I found pexpect, which prints the output in real time, as it treats stdout as a file. Or I could even do a dirty trick, printing out to a file and parsing it outside the function, but OK, that is too dirty, even for me ;)
import pexpect
import sys
pexpect.run("my noisy stuff", logfile=sys.stdout)
But I guess there should be a better, more Pythonic way to do this: just manage the stdout like subprocess.Popen does. How can I do this?
EDIT:
Running J.F. proposal:
This is a deliberately wrong audit; it takes about 25 secs to stop.
from subprocess import Popen, PIPE
command = "bully mon0 -e ESSID -c 8 -b aa:bb:cc:dd:ee:00 -v 2"
p = Popen(command, shell=True, stdout=PIPE, stderr=PIPE)
for line in iter(p.stdout.readline, b''):
    print "inside loop"
    print line
print "outside loop"
p.stdout.close()
p.wait()
#$ sudo python SCRIPT.py
### <= 25 secs later......
# inside loop
#[!] Bully v1.0-21 - WPS vulnerability assessment utility
#inside loop
#[!] Using 'ee:cc:bb:aa:bb:ee' for the source MAC address
#inside loop
#[X] Unable to get a beacon from the AP, possible causes are
#inside loop
#[.] an invalid --bssid or -essid was provided,
#inside loop
#[.] the access point isn't on channel '8',
#inside loop
#[.] you aren't close enough to the access point.
#outside loop
Using this method instead:
EDIT: Due to large delays and timeouts in the output, I had to tweak the child and add some hacks, so the final code looks like this:
import pexpect
child = pexpect.spawn(command)
child.maxsize = 1 #Turns off buffering
child.timeout = 50 # default is 30, insufficient for me. Crashes were due to this param.
for line in child:
    print line,
child.close()
It gives back the same output, but it prints lines in real time. So... SOLVED. Thanks @J.F. Sebastian.
.readlines() reads all lines. No wonder you don't see any output until the subprocess ends. You could use .readline() instead to read line by line as soon as the subprocess flushes its stdout buffer:
from subprocess import Popen, PIPE

p = Popen("my noisy stuff", shell=True, stdout=PIPE, bufsize=1)
for line in iter(p.stdout.readline, b''):
    # process line
    ..
p.stdout.close()
p.wait()
If you already have pexpect then you could use it to work around the block-buffering issue:
import pexpect
child = pexpect.spawn("my noisy stuff", timeout=None)
for line in child:
    # process line
    ..
child.close()
See also the stdbuf- and pty-based solutions from the question I've linked in the comments.
From Python on Linux, I want to start a sub-process, wait until it prints one line on its standard output, then continue with the rest of my Python script. If I do:
from subprocess import *

proc = Popen(my_process, stdout=PIPE)
proc.stdout.readline()
# Now continue with the rest of my script
Will my process eventually block if it writes a lot to its stdout, because the pipe fills up?
Ideally, I'd like the rest of the output to go to the standard output of my script. Is there a way to change the stdout of the subprocess from PIPE to my standard output after it starts?
I'm guessing I'll have to spawn a separate thread just to read from my process's stdout and print to my own, but I'd like to avoid that if there's a simpler solution.
Stop the process after the readline:
proc.terminate()
The readline method should not block, even if the line is particularly large: it pulls data directly out of the pipe buffer and into userspace. If the data were left sitting in the pipe buffer, there's a good chance it would block the spawned process, but I'm pretty sure Python must take the data out of the pipe buffer before it can examine it for the end of line.
Or you could just read characters off the pipe directly; this would prevent any possible buffer issues:
from subprocess import *

proc = Popen(my_process, stdout=PIPE)
c = ' '
while c != '\n':
    c = proc.stdout.read(1)
# Now complete the rest of the program....