python os.mkfifo() for Windows

Short version (if you can answer the short version it does the job for me, the rest is mainly for the benefit of other people with a similar task):
In Python on Windows, I want to create two file objects attached to the same file (it doesn't have to be an actual file on the hard drive), one for reading and one for writing, such that if the reading end tries to read it will never get EOF (it will just block until something is written). I think on Linux os.mkfifo() would do the job, but on Windows it doesn't exist. What can be done? (I must use file objects.)
Some extra details:
I have a python module (not written by me) that plays a certain game through stdin and stdout (using raw_input() and print). I also have a Windows executable playing the same game, through stdin and stdout as well. I want to make them play one against the other, and log all their communication.
Here's the code I can write (the get_fifo() function is not implemented, because that's the part I don't know how to do on Windows):
class Pusher(Thread):
    def __init__(self, source, dest, p1, name):
        Thread.__init__(self)
        self.source = source
        self.dest = dest
        self.name = name
        self.p1 = p1

    def run(self):
        while (self.p1.poll() is None) and \
              (not self.source.closed) and (not self.dest.closed):
            line = self.source.readline()
            logging.info('%s: %s' % (self.name, line[:-1]))
            self.dest.write(line)
            self.dest.flush()

exe_to_pythonmodule_reader, exe_to_pythonmodule_writer = get_fifo()
pythonmodule_to_exe_reader, pythonmodule_to_exe_writer = get_fifo()

p1 = subprocess.Popen(exe, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

old_stdin = sys.stdin
old_stdout = sys.stdout

sys.stdin = exe_to_pythonmodule_reader
sys.stdout = pythonmodule_to_exe_writer

push1 = Pusher(p1.stdout, exe_to_pythonmodule_writer, p1, '1')
push2 = Pusher(pythonmodule_to_exe_reader, p1.stdin, p1, '2')
push1.start()
push2.start()

ret = pythonmodule.play()

sys.stdin = old_stdin
sys.stdout = old_stdout

Following the two answers above, I accidentally bumped into the answer. os.pipe() does the job. Thank you for your answers.
I'm posting the complete code in case someone else is looking for this:
import subprocess
from threading import Thread
import time
import sys
import logging
import tempfile
import os

import game_playing_module

class Pusher(Thread):
    def __init__(self, source, dest, proc, name):
        Thread.__init__(self)
        self.source = source
        self.dest = dest
        self.name = name
        self.proc = proc

    def run(self):
        while (self.proc.poll() is None) and \
              (not self.source.closed) and (not self.dest.closed):
            line = self.source.readline()
            logging.info('%s: %s' % (self.name, line[:-1]))
            self.dest.write(line)
            self.dest.flush()

def get_reader_writer():
    fd_read, fd_write = os.pipe()
    return os.fdopen(fd_read, 'r'), os.fdopen(fd_write, 'w')

def connect(exe):
    logging.basicConfig(level=logging.DEBUG,
                        format='%(message)s',
                        filename=LOG_FILE_NAME,
                        filemode='w')

    program_to_grader_reader, program_to_grader_writer = get_reader_writer()
    grader_to_program_reader, grader_to_program_writer = get_reader_writer()

    p1 = subprocess.Popen(exe, shell=False, stdin=subprocess.PIPE, stdout=subprocess.PIPE)

    old_stdin = sys.stdin
    old_stdout = sys.stdout

    sys.stdin = program_to_grader_reader
    sys.stdout = grader_to_program_writer

    push1 = Pusher(p1.stdout, program_to_grader_writer, p1, '1')
    push2 = Pusher(grader_to_program_reader, p1.stdin, p1, '2')
    push1.start()
    push2.start()

    game_playing_module.play()

    sys.stdin = old_stdin
    sys.stdout = old_stdout

    fil = file(LOG_FILE_NAME, 'r')
    data = fil.read()
    fil.close()
    return data

if __name__ == '__main__':
    if len(sys.argv) != 2:
        print 'Usage: connect.py exe'
        print sys.argv
        exit()
    print sys.argv
    print connect(sys.argv[1])

On Windows, you are looking at (Named or Anonymous) Pipes.
A pipe is a section of shared memory that processes use for communication. The process that creates a pipe is the pipe server. A process that connects to a pipe is a pipe client. One process writes information to the pipe, then the other process reads the information from the pipe.
To work with Windows pipes, you can use the Python for Windows extensions (pywin32) or the ctypes module. A special utility module, win32pipe, provides an interface to the win32 pipe APIs. It includes implementations of the popen[234]() convenience functions.
See how-to-use-win32-apis-with-python and similar SO questions (not specific to Pipes, but points to useful info).
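For illustration only, here is a minimal named-pipe sketch built on pywin32. The pipe name, the buffer sizes, and the serve()/connect_and_read() helpers are all invented for this example; the server normally runs in one process and the client in another:
import win32file
import win32pipe

PIPE_NAME = r'\\.\pipe\my_game_pipe'  # arbitrary example name

def serve():
    # Server side: create the pipe, wait for a client, then write a line.
    handle = win32pipe.CreateNamedPipe(
        PIPE_NAME,
        win32pipe.PIPE_ACCESS_DUPLEX,
        win32pipe.PIPE_TYPE_MESSAGE | win32pipe.PIPE_READMODE_MESSAGE | win32pipe.PIPE_WAIT,
        1, 65536, 65536, 0, None)
    win32pipe.ConnectNamedPipe(handle, None)  # blocks until a client connects
    win32file.WriteFile(handle, b'hello from the pipe server\n')
    win32file.CloseHandle(handle)

def connect_and_read():
    # Client side: open the same pipe by name and read from it.
    handle = win32file.CreateFile(
        PIPE_NAME,
        win32file.GENERIC_READ | win32file.GENERIC_WRITE,
        0, None, win32file.OPEN_EXISTING, 0, None)
    err, data = win32file.ReadFile(handle, 65536)
    win32file.CloseHandle(handle)
    return data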

For a cross-platform solution, I'd recommend building the file-like object on top of a socket on localhost (127.0.0.1) -- that's what IDLE does by default to solve a problem that's quite similar to yours.
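A minimal sketch of that idea (my illustration, not from the answer itself): a listening socket on the loopback interface whose two ends are wrapped into file objects with makefile(), so the reading end blocks instead of seeing EOF:
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(('127.0.0.1', 0))  # port 0: let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]

client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
client.connect(('127.0.0.1', port))
server_conn, _ = listener.accept()

reader = server_conn.makefile('r')  # file object: readline() blocks until data arrives
writer = client.makefile('w')

writer.write('hello\n')
writer.flush()
print(reader.readline())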

os.pipe() returns an anonymous pipe, or a named pipe on Windows, which is very lightweight and efficient.
TCP sockets (as suggested by user1495323) are more heavyweight: you can see them with netstat for example, and each one requires a port number, and the number of available ports is limited to 64k per peer (e.g. 64k from localhost to localhost).
On the other hand, named pipes (on Windows) are limited because:
You can't use select() for nonblocking I/O on Windows, because they're not sockets.
There's no apparent way to read() with a timeout, and
Even making them non-blocking is difficult.
And sockets can be wrapped in Python-compatible filehandles using makefile(), which allows them to be used to redirect stdout or stderr. This makes this an attractive option for some use cases, such as sending stdout from one thread to another.
A socket can be constructed with an automatically-assigned port number like this (based on the excellent Python socket HOWTO):
from contextlib import closing
import socket

with closing(socket.socket(socket.AF_INET, socket.SOCK_STREAM)) as input_socket:
    # Avoid socket exhaustion by setting SO_REUSEADDR <https://stackoverflow.com/a/12362623/648162>:
    input_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)

    # localhost doesn't work if the definition is missing from the hosts file,
    # and 127.0.0.1 only works with IPv4 loopback, but socket.gethostname()
    # should always work:
    input_socket.bind((socket.gethostname(), 0))
    random_port_number = input_socket.getsockname()[1]
    input_socket.listen(1)

    # Do something with input_socket, for example pass it to another thread.

    output_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # close() should not strictly be necessary here, but since connect() could fail,
    # it avoids leaking fds in that case. "If a file descriptor is given, it is
    # closed when the returned I/O object is closed".
    with output_socket:
        output_socket.connect((socket.gethostname(), random_port_number))
The user of input_socket (e.g. another thread) can then do:
with input_socket:
    while True:
        readables, _, _ = select.select([input_socket], [], [input_socket], 1.0)
        if len(readables) > 0:
            input_conn, addr = input_socket.accept()
            break

with input_conn:
    while True:
        readables, _, errored = select.select([input_conn], [], [input_conn], 1.0)
        if len(errored) > 0:
            print("connection errored, stopping")
            break
        if len(readables) > 0:
            read_data = input_conn.recv(1024)
            if len(read_data) == 0:
                print("connection closed, stopping")
                break
            else:
                print(f"read data: {read_data!r}")

How to check output of a sub process but also hide it? [duplicate]

NB. I have seen Log output of multiprocessing.Process - unfortunately, it doesn't answer this question.
I am creating a child process (on windows) via multiprocessing. I want all of the child process's stdout and stderr output to be redirected to a log file, rather than appearing at the console. The only suggestion I have seen is for the child process to set sys.stdout to a file. However, this does not effectively redirect all stdout output, due to the behaviour of stdout redirection on Windows.
To illustrate the problem, build a Windows DLL with the following code
#include <iostream>

extern "C"
{
    __declspec(dllexport) void writeToStdOut()
    {
        std::cout << "Writing to STDOUT from test DLL" << std::endl;
    }
}
Then create and run a python script like the following, which imports this DLL and calls the function:
from ctypes import *
import sys
print
print "Writing to STDOUT from python, before redirect"
print
sys.stdout = open("stdout_redirect_log.txt", "w")
print "Writing to STDOUT from python, after redirect"
testdll = CDLL("Release/stdout_test.dll")
testdll.writeToStdOut()
In order to see the same behaviour as me, it is probably necessary for the DLL to be built against a different C runtime than the one Python uses. In my case, Python is built with Visual Studio 2010, but my DLL is built with VS 2005.
The behaviour I see is that the console shows:
> stdout_test.py
Writing to STDOUT from python, before redirect
Writing to STDOUT from test DLL
While the file stdout_redirect_log.txt ends up containing:
Writing to STDOUT from python, after redirect
In other words, setting sys.stdout failed to redirect the stdout output generated by the DLL. This is unsurprising given the nature of the underlying APIs for stdout redirection in Windows. I have encountered this problem at the native/C++ level before and never found a way to reliably redirect stdout from within a process. It has to be done externally.
This is actually the very reason I am launching a child process - it's so that I can connect externally to its pipes and thus guarantee that I am intercepting all of its output. I can definitely do this by launching the process manually with pywin32, but I would very much like to be able to use the facilities of multiprocessing, in particular the ability to communicate with the child process via a multiprocessing Pipe object, in order to get progress updates. The question is whether there is any way to both use multiprocessing for its IPC facilities and to reliably redirect all of the child's stdout and stderr output to a file.
UPDATE: Looking at the source code for multiprocessing.Process, it has a static member, _Popen, which looks like it can be used to override the class used to create the process. If it's set to None (default), it uses a multiprocessing.forking._Popen, but it looks like by saying
multiprocessing.Process._Popen = MyPopenClass
I could override the process creation. However, although I could derive this from multiprocessing.forking._Popen, it looks like I would have to copy a bunch of internal stuff into my implementation, which sounds flaky and not very future-proof. If that's the only choice I think I'd probably plump for doing the whole thing manually with pywin32 instead.
The solution you suggest is a good one: create your processes manually such that you have explicit access to their stdout/stderr file handles. You can then create a socket to communicate with the sub-process and use multiprocessing.connection over that socket (multiprocessing.Pipe creates the same type of connection object, so this should give you all the same IPC functionality).
Here's a two-file example.
master.py:
import multiprocessing.connection
import subprocess
import socket
import sys, os

## Listen for connection from remote process (and find free port number)
port = 10000
while True:
    try:
        l = multiprocessing.connection.Listener(('localhost', int(port)), authkey="secret")
        break
    except socket.error as ex:
        if ex.errno != 98:
            raise
        port += 1  ## if errno==98, then port is not available.

proc = subprocess.Popen((sys.executable, "subproc.py", str(port)), stdout=subprocess.PIPE, stderr=subprocess.PIPE)

## open connection for remote process
conn = l.accept()
conn.send([1, "asd", None])
print(proc.stdout.readline())
subproc.py:
import multiprocessing.connection
import subprocess
import sys, os, time

port = int(sys.argv[1])
conn = multiprocessing.connection.Client(('localhost', port), authkey="secret")
while True:
    try:
        obj = conn.recv()
        print("received: %s\n" % str(obj))
        sys.stdout.flush()
    except EOFError:  ## connection closed
        break
You may also want to see the first answer to this question to get non-blocking reads from the subprocess.
I don't think you have a better option than redirecting a subprocess to a file as you mentioned in your comment.
The way console stdin/out/err work on Windows is that each process, when it is created, has its std handles defined. You can change them with SetStdHandle. When you modify Python's sys.stdout you only change where Python prints its output, not where other DLLs print theirs. Part of the CRT in your DLL uses GetStdHandle to find out where to print to. If you want, you can do whatever piping you need via the Windows API, either in your DLL or in your Python script with pywin32. Though I do think it'll be simpler with subprocess.
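As a rough illustration of the SetStdHandle route mentioned above (a sketch only: a DLL's CRT may have cached the original handle when it was loaded, so this is not guaranteed to capture its output):
import ctypes
import msvcrt
import os

STD_OUTPUT_HANDLE = -11  # Win32 constant identifying the stdout handle

log_fd = os.open("dll_stdout.log", os.O_WRONLY | os.O_CREAT | os.O_TRUNC)
# Swap the process-wide stdout handle for the log file's handle.
ctypes.windll.kernel32.SetStdHandle(STD_OUTPUT_HANDLE, msvcrt.get_osfhandle(log_fd))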
Alternatively, and I know this might be slightly off-topic but it helped in my case with the same problem, this can be resolved with screen on Linux:
screen -L -Logfile './logfile_%Y-%m-%d.log' python my_multiproc_script.py
This way there is no need to implement all the master-child communication.
I assume I'm off base and missing something, but for what it's worth here is what came to mind when I read your question.
If you can intercept all of the stdout and stderr (I got that impression from your question), then why not add or wrap that capture functionality around each of your processes? Then send what is captured through a queue to a consumer that can do whatever you want with all of the outputs?
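A rough sketch of that wrap-and-queue idea (my own illustration, with the same caveat the question raises: it only catches Python-level writes, not output from native DLLs):
import multiprocessing as mp
import sys

def captured(target, queue, *args, **kwargs):
    # Wrapper that sends everything the child prints to a queue.
    class QueueWriter:
        def write(self, data):
            if data:
                queue.put(data)
        def flush(self):
            pass
    sys.stdout = sys.stderr = QueueWriter()
    target(*args, **kwargs)

def work():
    print("hello from the child")

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=captured, args=(work, q))
    p.start()
    p.join()
    while not q.empty():  # consumer: do whatever you want with the captured output
        sys.stdout.write(q.get())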
In my situation I changed sys.stdout.write to write to a PySide QTextEdit. I couldn't read from sys.stdout and I didn't know how to change sys.stdout to be readable. I created two Pipes. One for stdout and the other for stderr. In the separate process I redirect sys.stdout and sys.stderr to the child connection of the multiprocessing pipe. On the main process I created two threads to read the stdout and stderr parent pipe and redirect the pipe data to sys.stdout and sys.stderr.
import sys
import contextlib
import threading
import multiprocessing as mp
import multiprocessing.queues
from queue import Empty
import time
class PipeProcess(mp.Process):
"""Process to pipe the output of the sub process and redirect it to this sys.stdout and sys.stderr.
Note:
The use_queue = True argument will pass data between processes using Queues instead of Pipes. Queues will
give you the full output and read all of the data from the Queue. A pipe is more efficient, but may not
redirect all of the output back to the main process.
"""
def __init__(self, group=None, target=None, name=None, args=tuple(), kwargs={}, *_, daemon=None,
use_pipe=None, use_queue=None):
self.read_out_th = None
self.read_err_th = None
self.pipe_target = target
self.pipe_alive = mp.Event()
if use_pipe or (use_pipe is None and not use_queue): # Default
self.parent_stdout, self.child_stdout = mp.Pipe(False)
self.parent_stderr, self.child_stderr = mp.Pipe(False)
else:
self.parent_stdout = self.child_stdout = mp.Queue()
self.parent_stderr = self.child_stderr = mp.Queue()
args = (self.child_stdout, self.child_stderr, target) + tuple(args)
target = self.run_pipe_out_target
super(PipeProcess, self).__init__(group=group, target=target, name=name, args=args, kwargs=kwargs,
daemon=daemon)
def start(self):
"""Start the multiprocess and reading thread."""
self.pipe_alive.set()
super(PipeProcess, self).start()
self.read_out_th = threading.Thread(target=self.read_pipe_out,
args=(self.pipe_alive, self.parent_stdout, sys.stdout))
self.read_err_th = threading.Thread(target=self.read_pipe_out,
args=(self.pipe_alive, self.parent_stderr, sys.stderr))
self.read_out_th.daemon = True
self.read_err_th.daemon = True
self.read_out_th.start()
self.read_err_th.start()
@classmethod
def run_pipe_out_target(cls, pipe_stdout, pipe_stderr, pipe_target, *args, **kwargs):
"""The real multiprocessing target to redirect stdout and stderr to a pipe or queue."""
sys.stdout.write = cls.redirect_write(pipe_stdout) # , sys.__stdout__) # Is redirected in main process
sys.stderr.write = cls.redirect_write(pipe_stderr) # , sys.__stderr__) # Is redirected in main process
pipe_target(*args, **kwargs)
@staticmethod
def redirect_write(child, out=None):
"""Create a function to write out a pipe and write out an additional out."""
if isinstance(child, mp.queues.Queue):
send = child.put
else:
send = child.send_bytes # No need to pickle with child_conn.send(data)
def write(data, *args):
try:
if isinstance(data, str):
data = data.encode('utf-8')
send(data)
if out is not None:
out.write(data)
except:
pass
return write
@classmethod
def read_pipe_out(cls, pipe_alive, pipe_out, out):
if isinstance(pipe_out, mp.queues.Queue):
# Queue has better functionality to get all of the data
def recv():
return pipe_out.get(timeout=0.5)
def is_alive():
return pipe_alive.is_set() or pipe_out.qsize() > 0
else:
# Pipe is more efficient
recv = pipe_out.recv_bytes # No need to unpickle with data = pipe_out.recv()
is_alive = pipe_alive.is_set
# Loop through reading and redirecting data
while is_alive():
try:
data = recv()
if isinstance(data, bytes):
data = data.decode('utf-8')
out.write(data)
except EOFError:
break
except Empty:
pass
except:
pass
def join(self, *args):
# Wait for process to finish (unless a timeout was given)
super(PipeProcess, self).join(*args)
# Trigger to stop the threads
self.pipe_alive.clear()
# Pipe must close to prevent blocking and waiting on recv forever
if not isinstance(self.parent_stdout, mp.queues.Queue):
with contextlib.suppress():
self.parent_stdout.close()
with contextlib.suppress():
self.parent_stderr.close()
# Close the pipes and threads
with contextlib.suppress():
self.read_out_th.join()
with contextlib.suppress():
self.read_err_th.join()
def run_long_print():
for i in range(1000):
print(i)
print(i, file=sys.stderr)
print('finished')
if __name__ == '__main__':
# Example test write (My case was a QTextEdit)
out = open('stdout.log', 'w')
err = open('stderr.log', 'w')
# Overwrite the write function and not the actual stdout object to prove this works
sys.stdout.write = out.write
sys.stderr.write = err.write
# Create a process that uses pipes to read multiprocess output back into sys.stdout.write
proc = PipeProcess(target=run_long_print, use_queue=True) # If use_pipe=True Pipe may not write out all values
# proc.daemon = True # If daemon and use_queue Not all output may be redirected to stdout
proc.start()
# time.sleep(5) # Not needed unless use_pipe or daemon and all of stdout/stderr is desired
# Close the process
proc.join() # For some odd reason this blocks forever when use_queue=False
# Close the output files for this test
out.close()
err.close()
Here is the simple and straightforward way for capturing stdout for multiprocessing.Process:
import app
import io
import sys
from multiprocessing import Process

def run_app(some_param):
    sys.stdout = io.TextIOWrapper(open(sys.stdout.fileno(), 'wb', 0), write_through=True)
    app.run()

app_process = Process(target=run_app, args=('some_param',))
app_process.start()
# Use app_process.terminate() for python <= 3.7.
app_process.kill()

Python Subprocess - filter out logging

Python 3.6
I want to capture all output from a subprocess which I run with the subprocess module. I can easily pipe this output to a log file, and it works great.
But, I want to filter out a lot of the lines (lots of noisy output from modules I do not control).
Attempt 1
def run_command(command, log_file):
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               stderr=subprocess.STDOUT, bufsize=1,
                               universal_newlines=True)
    while True:
        output = process.stdout.readline()
        if output == '' and process.poll() is not None:
            break
        if output and not_noisy_line(output):
            log_file.write(output)
            log_file.flush()
    return process.poll()
But this introduced a race condition between my subprocess and the output.
Attempt 2
I created a new method and a class to wrap the logging.
def run_command(command, log_file):
    process = subprocess.run(command, stdout=QuiteLogger(log_file), stderr=QuiteLogger(log_file), timeout=120)
    return process.returncode

class QuiteLogger(io.TextIOWrapper):
    def write(self, data, encoding=sys.getdefaultencoding()):
        data = filter(data)
        super().write(data)
This does however just completely skip my filter function, my write method is not called at all by the subprocess. (If I call QuietLogger().write('asdasdsa') it goes through the filters)
Any clues?
This is an interesting situation in which the file object abstraction partially breaks down. The reason your solution does not work, is because subprocess is not actually using your QuietLogger but is getting the raw file number out of it (then repackaging it as a io.TextIOWrapper object).
I don't know if this is an intrinsic limitation in how the subprocess is handled, relying on OS support, or if this is just a mistake in the Python design, but in order to achieve what you want, you need to use the standard subprocess.PIPE and then roll your own file writer.
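To make that concrete, here is a small illustrative experiment (my own, not from the answer): a stand-in logger whose fileno() is the only method subprocess ever calls, which is why the write() override is bypassed:
import subprocess
import sys
import tempfile

class Spy:
    """File-like object that reports which of its methods subprocess touches."""
    def __init__(self):
        self._tmp = tempfile.TemporaryFile()
    def fileno(self):
        print("fileno() was called")
        return self._tmp.fileno()
    def write(self, data):
        print("write() was called")  # never happens for the child's output

subprocess.run([sys.executable, "-c", "print('hi')"], stdout=Spy())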
If you can wait for the subprocess to finish, then it can be trivially done, using the subprocess.run and then picking the stream out of the CompletedProcess (p) object:
p = subprocess.run(command, stdout=subprocess.PIPE, universal_newlines=True)
data = filter(p.stdout)
with open(logfile, 'w') as f:
    f.write(data)
If you must work with the output while it is being generated (thus, you cannot wait for the subprocess to end), the simplest way is to resort to subprocess.Popen and threads:
import subprocess
import threading

logfile = 'tmp.txt'
filter_passed = lambda line: line[:3] != 'Bad'
command = ['my_cmd', 'arg']

def writer(p, logfile):
    with open(logfile, 'w') as f:
        for line in p.stdout:
            if filter_passed(line):
                f.write(line)

p = subprocess.Popen(command, stdout=subprocess.PIPE, universal_newlines=True)
t = threading.Thread(target=writer, args=(p, logfile))
t.start()
t.join()
[Edit: My brain got derailed along the way, and I ended up answering another question than was actually asked. The following solution is useful for concurrently writing to a file, not for using the logging module in any way. However, since at least it's useful for that, I'll leave the answer in place for now.]
If you were just using threads, not separate processes, you'd just have to have a standard lock. So you could try something similar.
There's always the option of locking the output file. I don't know if your operating system supports anything like that, but the usual Unix way of doing it is to create a lock file. Basically, if the file exists, then wait; otherwise create the file before writing to your log file, and after you're done, remove the lock file again. You could use a context manager like this:
import os
import os.path
from time import sleep

class LockedFile():
    def __init__(self, filename, mode):
        self.filename = filename
        self.lockfile = filename + '.lock'
        self.mode = mode

    def __enter__(self):
        while True:
            if os.path.isfile(self.lockfile):
                sleep(0.1)
            else:
                break
        with open(self.lockfile, 'a'):
            os.utime(self.lockfile)
        self.f = open(self.filename, self.mode)
        return self.f

    def __exit__(self, *args):
        self.f.close()
        os.remove(self.lockfile)

# And here's how to use it:
with LockedFile('blorg', 'a') as f:
    f.write('foo\n')

live output from subprocess command

I'm using a python script as a driver for a hydrodynamics code. When it comes time to run the simulation, I use subprocess.Popen to run the code, collect the output from stdout and stderr into a subprocess.PIPE --- then I can print (and save to a log-file) the output information, and check for any errors. The problem is, I have no idea how the code is progressing. If I run it directly from the command line, it gives me output about what iteration it's at, what time, what the next time-step is, etc.
Is there a way to both store the output (for logging and error checking), and also produce a live-streaming output?
The relevant section of my code:
ret_val = subprocess.Popen( run_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True )
output, errors = ret_val.communicate()
log_file.write(output)
print output
if( ret_val.returncode ):
    print "RUN failed\n\n%s\n\n" % (errors)
    success = False
if( errors ): log_file.write("\n\n%s\n\n" % errors)
Originally I was piping the run_command through tee so that a copy went directly to the log-file, and the stream still output directly to the terminal -- but that way I can't store any errors (to my knowledge).
My temporary solution so far:
ret_val = subprocess.Popen( run_command, stdout=log_file, stderr=subprocess.PIPE, shell=True )
while not ret_val.poll():
    log_file.flush()
then, in another terminal, run tail -f log.txt (s.t. log_file = 'log.txt').
TLDR for Python 3:
import subprocess
import sys
with open("test.log", "wb") as f:
process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
for c in iter(lambda: process.stdout.read(1), b""):
sys.stdout.buffer.write(c)
f.buffer.write(c)
You have two ways of doing this: either create an iterator from the read or readline functions, as in:
import subprocess
import sys
# replace "w" with "wb" for Python 3
with open("test.log", "w") as f:
process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
# replace "" with b'' for Python 3
for c in iter(lambda: process.stdout.read(1), ""):
sys.stdout.write(c)
f.write(c)
or
import subprocess
import sys
# replace "w" with "wb" for Python 3
with open("test.log", "w") as f:
process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
# replace "" with b"" for Python 3
for line in iter(process.stdout.readline, ""):
sys.stdout.write(line)
f.write(line)
Or you can create a reader and a writer file. Pass the writer to the Popen and read from the reader
import io
import time
import subprocess
import sys
filename = "test.log"
with io.open(filename, "wb") as writer, io.open(filename, "rb", 1) as reader:
    process = subprocess.Popen(command, stdout=writer)
    while process.poll() is None:
        sys.stdout.write(reader.read())
        time.sleep(0.5)
    # Read the remaining
    sys.stdout.write(reader.read())
This way you will have the data written in the test.log as well as on the standard output.
The only advantage of the file approach is that your code doesn't block. So you can do whatever you want in the meantime and read whenever you want from the reader in a non-blocking way. When you use PIPE, read and readline functions will block until either one character is written to the pipe or a line is written to the pipe respectively.
Executive Summary (or "tl;dr" version): it's easy when there's at most one subprocess.PIPE, otherwise it's hard.
It may be time to explain a bit about how subprocess.Popen does its thing.
(Caveat: this is for Python 2.x, although 3.x is similar; and I'm quite fuzzy on the Windows variant. I understand the POSIX stuff much better.)
The Popen function needs to deal with zero-to-three I/O streams, somewhat simultaneously. These are denoted stdin, stdout, and stderr as usual.
You can provide:
None, indicating that you don't want to redirect the stream. It will inherit these as usual instead. Note that on POSIX systems, at least, this does not mean it will use Python's sys.stdout, just Python's actual stdout; see demo at end.
An int value. This is a "raw" file descriptor (in POSIX at least). (Side note: PIPE and STDOUT are actually ints internally, but are "impossible" descriptors, -1 and -2.)
A stream—really, any object with a fileno method. Popen will find the descriptor for that stream, using stream.fileno(), and then proceed as for an int value.
subprocess.PIPE, indicating that Python should create a pipe.
subprocess.STDOUT (for stderr only): tell Python to use the same descriptor as for stdout. This only makes sense if you provided a (non-None) value for stdout, and even then, it is only needed if you set stdout=subprocess.PIPE. (Otherwise you can just provide the same argument you provided for stdout, e.g., Popen(..., stdout=stream, stderr=stream).)
The easiest cases (no pipes)
If you redirect nothing (leave all three as the default None value or supply explicit None), Popen has it quite easy. It just needs to spin off the subprocess and let it run. Or, if you redirect to a non-PIPE—an int or a stream's fileno()—it's still easy, as the OS does all the work. Python just needs to spin off the subprocess, connecting its stdin, stdout, and/or stderr to the provided file descriptors.
The still-easy case: one pipe
If you redirect only one stream, Popen still has things pretty easy. Let's pick one stream at a time and watch.
Suppose you want to supply some stdin, but let stdout and stderr go un-redirected, or go to a file descriptor. As the parent process, your Python program simply needs to use write() to send data down the pipe. You can do this yourself, e.g.:
proc = subprocess.Popen(cmd, stdin=subprocess.PIPE)
proc.stdin.write('here, have some data\n') # etc
or you can pass the stdin data to proc.communicate(), which then does the stdin.write shown above. There is no output coming back so communicate() has only one other real job: it also closes the pipe for you. (If you don't call proc.communicate() you must call proc.stdin.close() to close the pipe, so that the subprocess knows there is no more data coming through.)
Suppose you want to capture stdout but leave stdin and stderr alone. Again, it's easy: just call proc.stdout.read() (or equivalent) until there is no more output. Since proc.stdout() is a normal Python I/O stream you can use all the normal constructs on it, like:
for line in proc.stdout:
or, again, you can use proc.communicate(), which simply does the read() for you.
If you want to capture only stderr, it works the same as with stdout.
There's one more trick before things get hard. Suppose you want to capture stdout, and also capture stderr but on the same pipe as stdout:
proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
In this case, subprocess "cheats"! Well, it has to do this, so it's not really cheating: it starts the subprocess with both its stdout and its stderr directed into the (single) pipe-descriptor that feeds back to its parent (Python) process. On the parent side, there's again only a single pipe-descriptor for reading the output. All the "stderr" output shows up in proc.stdout, and if you call proc.communicate(), the stderr result (second value in the tuple) will be None, not a string.
The hard cases: two or more pipes
The problems all come about when you want to use at least two pipes. In fact, the subprocess code itself has this bit:
def communicate(self, input=None):
    ...
    # Optimization: If we are only using one pipe, or no pipe at
    # all, using select() or threads is unnecessary.
    if [self.stdin, self.stdout, self.stderr].count(None) >= 2:
But, alas, here we've made at least two, and maybe three, different pipes, so the count(None) returns either 1 or 0. We must do things the hard way.
On Windows, this uses threading.Thread to accumulate results for self.stdout and self.stderr, and has the parent thread deliver self.stdin input data (and then close the pipe).
On POSIX, this uses poll if available, otherwise select, to accumulate output and deliver stdin input. All this runs in the (single) parent process/thread.
Threads or poll/select are needed here to avoid deadlock. Suppose, for instance, that we've redirected all three streams to three separate pipes. Suppose further that there's a small limit on how much data can be stuffed into a pipe before the writing process is suspended, waiting for the reading process to "clean out" the pipe from the other end. Let's set that small limit to a single byte, just for illustration. (This is in fact how things work, except that the limit is much bigger than one byte.)
If the parent (Python) process tries to write several bytes—say, 'go\n'—to proc.stdin, the first byte goes in and then the second causes the Python process to suspend, waiting for the subprocess to read the first byte, emptying the pipe.
Meanwhile, suppose the subprocess decides to print a friendly "Hello! Don't Panic!" greeting. The H goes into its stdout pipe, but the e causes it to suspend, waiting for its parent to read that H, emptying the stdout pipe.
Now we're stuck: the Python process is asleep, waiting to finish saying "go", and the subprocess is also asleep, waiting to finish saying "Hello! Don't Panic!".
The subprocess.Popen code avoids this problem with threading-or-select/poll. When bytes can go over the pipes, they go. When they can't, only a thread (not the whole process) has to sleep—or, in the case of select/poll, the Python process waits simultaneously for "can write" or "data available", writes to the process's stdin only when there is room, and reads its stdout and/or stderr only when data are ready. The proc.communicate() code (actually _communicate where the hairy cases are handled) returns once all stdin data (if any) have been sent and all stdout and/or stderr data have been accumulated.
If you want to read both stdout and stderr on two different pipes (regardless of any stdin redirection), you will need to avoid deadlock too. The deadlock scenario here is different—it occurs when the subprocess writes something long to stderr while you're pulling data from stdout, or vice versa—but it's still there.
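For illustration (my sketch, not part of the original answer; cmd stands for whatever command you run), the thread-per-pipe approach looks roughly like this:
import subprocess
import threading

def drain(stream, chunks):
    # Read until EOF so the child never stalls on a full pipe buffer.
    for line in iter(stream.readline, b''):
        chunks.append(line)
    stream.close()

proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out_lines, err_lines = [], []
t_out = threading.Thread(target=drain, args=(proc.stdout, out_lines))
t_err = threading.Thread(target=drain, args=(proc.stderr, err_lines))
t_out.start()
t_err.start()
proc.wait()
t_out.join()
t_err.join()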
The Demo
I promised to demonstrate that, un-redirected, Python subprocesses write to the underlying stdout, not sys.stdout. So, here is some code:
from cStringIO import StringIO
import os
import subprocess
import sys
def show1():
    print 'start show1'
    save = sys.stdout
    sys.stdout = StringIO()
    print 'sys.stdout being buffered'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    in_stdout = sys.stdout.getvalue()
    sys.stdout = save
    print 'in buffer:', in_stdout

def show2():
    print 'start show2'
    save = sys.stdout
    sys.stdout = open(os.devnull, 'w')
    print 'after redirect sys.stdout'
    proc = subprocess.Popen(['echo', 'hello'])
    proc.wait()
    sys.stdout = save

show1()
show2()
When run:
$ python out.py
start show1
hello
in buffer: sys.stdout being buffered
start show2
hello
Note that the first routine will fail if you add stdout=sys.stdout, as a StringIO object has no fileno. The second will omit the hello if you add stdout=sys.stdout since sys.stdout has been redirected to os.devnull.
(If you redirect Python's file-descriptor-1, the subprocess will follow that redirection. The open(os.devnull, 'w') call produces a stream whose fileno() is greater than 2.)
We can also use the default file iterator for reading stdout instead of using iter construct with readline().
import subprocess
import sys
process = subprocess.Popen(
    your_command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT
)
for line in process.stdout:
    sys.stdout.write(line)
In addition to all these answer, one simple approach could also be as follows:
process = subprocess.Popen(your_command, stdout=subprocess.PIPE)
while process.stdout.readable():
    line = process.stdout.readline()
    if not line:
        break
    print(line.strip())
Loop through the readable stream as long as it's readable and if it gets an empty result, stop.
The key here is that readline() returns a line (with \n at the end) as long as there's an output and empty if it's really at the end.
Hope this helps someone.
If you're able to use third-party libraries, You might be able to use something like sarge (disclosure: I'm its maintainer). This library allows non-blocking access to output streams from subprocesses - it's layered over the subprocess module.
If all you need is for the output to be visible on the console, the easiest solution for me was to pass the following arguments to Popen:
with Popen(cmd, stdout=sys.stdout, stderr=sys.stderr) as proc:
which will use your Python script's stdio file handles.
Solution 1: Log stdout AND stderr concurrently in realtime
A simple solution which logs both stdout AND stderr concurrently, line-by-line in realtime into a log file.
import subprocess as sp
from concurrent.futures import ThreadPoolExecutor
def log_popen_pipe(p, stdfile):
    with open("mylog.txt", "w") as f:
        while p.poll() is None:
            f.write(stdfile.readline())
            f.flush()
        # Write the rest from the buffer
        f.write(stdfile.read())

with sp.Popen(["ls"], stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    with ThreadPoolExecutor(2) as pool:
        r1 = pool.submit(log_popen_pipe, p, p.stdout)
        r2 = pool.submit(log_popen_pipe, p, p.stderr)
        r1.result()
        r2.result()
Solution 2: A function read_popen_pipes() that allows you to iterate over both pipes (stdout/stderr), concurrently in realtime
import subprocess as sp
from queue import Queue, Empty
from concurrent.futures import ThreadPoolExecutor
def enqueue_output(file, queue):
    for line in iter(file.readline, ''):
        queue.put(line)
    file.close()

def read_popen_pipes(p):
    with ThreadPoolExecutor(2) as pool:
        q_stdout, q_stderr = Queue(), Queue()

        pool.submit(enqueue_output, p.stdout, q_stdout)
        pool.submit(enqueue_output, p.stderr, q_stderr)

        while True:
            if p.poll() is not None and q_stdout.empty() and q_stderr.empty():
                break

            out_line = err_line = ''
            try:
                out_line = q_stdout.get_nowait()
                err_line = q_stderr.get_nowait()
            except Empty:
                pass

            yield (out_line, err_line)

# The function in use:
with sp.Popen(["ls"], stdout=sp.PIPE, stderr=sp.PIPE, text=True) as p:
    for out_line, err_line in read_popen_pipes(p):
        print(out_line, end='')
        print(err_line, end='')
    p.poll()
Similar to previous answers but the following solution worked for me on windows using Python3 to provide a common method to print and log in realtime (source)
def print_and_log(command, logFile):
    with open(logFile, 'wb') as f:
        command = subprocess.Popen(command, stdout=subprocess.PIPE, shell=True)
        while True:
            output = command.stdout.readline()
            if not output and command.poll() is not None:
                f.close()
                break
            if output:
                f.write(output)
                print(str(output.strip(), 'utf-8'), flush=True)
    return command.poll()
A good but "heavyweight" solution is to use Twisted - see the bottom.
If you're willing to live with only stdout something along those lines should work:
import subprocess
import sys
popenobj = subprocess.Popen(["ls", "-Rl"], stdout=subprocess.PIPE)
while not popenobj.poll():
    stdoutdata = popenobj.stdout.readline()
    if stdoutdata:
        sys.stdout.write(stdoutdata)
    else:
        break
print "Return code", popenobj.returncode
(If you use read() it tries to read the entire "file" which isn't useful, what we really could use here is something that reads all the data that's in the pipe right now)
One might also try to approach this with threading, e.g.:
import subprocess
import sys
import threading
popenobj = subprocess.Popen("ls", stdout=subprocess.PIPE, shell=True)
def stdoutprocess(o):
    while True:
        stdoutdata = o.stdout.readline()
        if stdoutdata:
            sys.stdout.write(stdoutdata)
        else:
            break
t = threading.Thread(target=stdoutprocess, args=(popenobj,))
t.start()
popenobj.wait()
t.join()
print "Return code", popenobj.returncode
Now we could potentially add stderr as well by having two threads.
Note however the subprocess docs discourage using these files directly and recommends to use communicate() (mostly concerned with deadlocks which I think isn't an issue above) and the solutions are a little klunky so it really seems like the subprocess module isn't quite up to the job (also see: http://www.python.org/dev/peps/pep-3145/ ) and we need to look at something else.
A more involved solution is to use Twisted as shown here: https://twistedmatrix.com/documents/11.1.0/core/howto/process.html
The way you do this with Twisted is to create your process using reactor.spawnprocess() and providing a ProcessProtocol that then processes output asynchronously. The Twisted sample Python code is here: https://twistedmatrix.com/documents/11.1.0/core/howto/listings/process/process.py
Based on all the above I suggest a slightly modified version (python3):
while loop calling readline (The iter solution suggested seemed to block forever for me - Python 3, Windows 7)
structured so handling of read data does not need to be duplicated after poll returns not-None
stderr piped into stdout so both outputs are read
Added code to get exit value of cmd.
Code:
import subprocess
import time

proc = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT, universal_newlines=True)
while True:
    rd = proc.stdout.readline()
    print(rd, end='')  # and whatever you want to do...
    if not rd:  # EOF
        returncode = proc.poll()
        if returncode is not None:
            break
        time.sleep(0.1)  # cmd closed stdout, but not exited yet

# You may want to check on ReturnCode here
I found a simple solution to a much more complicated problem.
Both stdout and stderr need to be streamed.
Both of them need to be non-blocking: both when there is no output and when there is too much output.
Do not want to use Threading or multiprocessing, also not willing to use pexpect.
This solution uses a gist I found here
import subprocess as sbp
import fcntl
import os

def non_block_read(output):
    fd = output.fileno()
    fl = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, fl | os.O_NONBLOCK)
    try:
        return output.readline()
    except:
        return ""

with sbp.Popen('find / -name fdsfjdlsjf',
               shell=True,
               universal_newlines=True,
               encoding='utf-8',
               bufsize=1,
               stdout=sbp.PIPE,
               stderr=sbp.PIPE) as p:
    while True:
        out = non_block_read(p.stdout)
        err = non_block_read(p.stderr)
        if out:
            print(out, end='')
        if err:
            print('E: ' + err, end='')
        if p.poll() is not None:
            break
It looks like line-buffered output will work for you, in which case something like the following might suit. (Caveat: it's untested.) This will only give the subprocess's stdout in real time. If you want to have both stderr and stdout in real time, you'll have to do something more complex with select.
proc = subprocess.Popen(run_command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True)
while proc.poll() is None:
    line = proc.stdout.readline()
    print line
    log_file.write(line + '\n')

# Might still be data on stdout at this point. Grab any remainder.
for line in proc.stdout.read().split('\n'):
    print line
    log_file.write(line + '\n')

# Do whatever you want with proc.stderr here...
Why not set stdout directly to sys.stdout? And if you need to output to a log as well, then you can simply override the write method of f.
import sys
import subprocess

class SuperFile(open.__class__):

    def write(self, data):
        sys.stdout.write(data)
        super(SuperFile, self).write(data)

f = SuperFile("log.txt", "w+")
process = subprocess.Popen(command, stdout=f, stderr=f)
All of the above solutions I tried failed either to separate stderr and stdout output (multiple pipes), or blocked forever when the OS pipe buffer was full, which happens when the command you are running produces output too fast (there is a warning about this in the Python manual for subprocess poll()). The only reliable way I found was through select, but this is a POSIX-only solution:
import subprocess
import sys
import os
import select
from errno import EINTR

# returns command exit status, stdout text, stderr text
# rtoutput: show realtime output while running
def run_script(cmd, rtoutput=0):
    p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    poller = select.poll()
    poller.register(p.stdout, select.POLLIN)
    poller.register(p.stderr, select.POLLIN)

    coutput = ''
    cerror = ''
    fdhup = {}
    fdhup[p.stdout.fileno()] = 0
    fdhup[p.stderr.fileno()] = 0
    while sum(fdhup.values()) < len(fdhup):
        try:
            r = poller.poll(1)
        except select.error, err:
            if err.args[0] != EINTR:
                raise
            r = []
        for fd, flags in r:
            if flags & (select.POLLIN | select.POLLPRI):
                c = os.read(fd, 1024)
                if rtoutput:
                    sys.stdout.write(c)
                    sys.stdout.flush()
                if fd == p.stderr.fileno():
                    cerror += c
                else:
                    coutput += c
            else:
                fdhup[fd] = 1
    return p.poll(), coutput.strip(), cerror.strip()
None of the Pythonic solutions worked for me.
It turned out that proc.stdout.read() or similar may block forever.
Therefore, I use tee like this:
subprocess.run('./my_long_running_binary 2>&1 | tee -a my_log_file.txt && exit ${PIPESTATUS}', shell=True, check=True, executable='/bin/bash')
This solution is convenient if you are already using shell=True.
${PIPESTATUS} captures the success status of the entire command chain (only available in Bash).
If I omitted the && exit ${PIPESTATUS}, then this would always return zero since tee never fails.
unbuffer might be necessary for printing each line immediately into the terminal, instead of waiting way too long until the "pipe buffer" gets filled.
However, unbuffer swallows the exit status of assert (SIG Abort)...
2>&1 also logs stderr to the file.
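For reference, prepending unbuffer to the same command chain would look like this (a sketch only; the caveat above about unbuffer swallowing exit statuses still applies):
subprocess.run('unbuffer ./my_long_running_binary 2>&1 | tee -a my_log_file.txt && exit ${PIPESTATUS}',
               shell=True, check=True, executable='/bin/bash')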
I think that the subprocess.communicate method is a bit misleading: it actually fills the stdout and stderr that you specify in the subprocess.Popen.
Yet, reading from the subprocess.PIPE that you can provide to the subprocess.Popen's stdout and stderr parameters will eventually fill up OS pipe buffers and deadlock your app (especially if you've multiple processes/threads that must use subprocess).
My proposed solution is to provide the stdout and stderr with files - and read the files' content instead of reading from the deadlocking PIPE. These files can be tempfile.NamedTemporaryFile() - which can also be accessed for reading while they're being written into by subprocess.communicate.
Below is a sample usage:
try:
    with ProcessRunner(
        ("python", "task.py"), env=os.environ.copy(), seconds_to_wait=0.01
    ) as process_runner:
        for out in process_runner:
            print(out)
except ProcessError as e:
    print(e.error_message)
    raise
And this is the source code which is ready to be used with as many comments as I could provide to explain what it does:
If you're using python 2, please make sure to first install the latest version of the subprocess32 package from pypi.
import os
import sys
import threading
import time
import tempfile
import logging
if os.name == 'posix' and sys.version_info[0] < 3:
# Support python 2
import subprocess32 as subprocess
else:
# Get latest and greatest from python 3
import subprocess
logger = logging.getLogger(__name__)
class ProcessError(Exception):
"""Base exception for errors related to running the process"""
class ProcessTimeout(ProcessError):
"""Error that will be raised when the process execution will exceed a timeout"""
class ProcessRunner(object):
def __init__(self, args, env=None, timeout=None, bufsize=-1, seconds_to_wait=0.25, **kwargs):
"""
Constructor facade to subprocess.Popen that receives parameters which are more specifically required for the
Process Runner. This is a class that should be used as a context manager - and that provides an iterator
for reading captured output from subprocess.communicate in near realtime.
Example usage:
try:
with ProcessRunner(('python', task_file_path), env=os.environ.copy(), seconds_to_wait=0.01) as process_runner:
for out in process_runner:
print(out)
except ProcessError as e:
print(e.error_message)
raise
:param args: same as subprocess.Popen
:param env: same as subprocess.Popen
:param timeout: same as subprocess.communicate
:param bufsize: same as subprocess.Popen
:param seconds_to_wait: time to wait between each readline from the temporary file
:param kwargs: same as subprocess.Popen
"""
self._seconds_to_wait = seconds_to_wait
self._process_has_timed_out = False
self._timeout = timeout
self._process_done = False
self._std_file_handle = tempfile.NamedTemporaryFile()
self._process = subprocess.Popen(args, env=env, bufsize=bufsize,
stdout=self._std_file_handle, stderr=self._std_file_handle, **kwargs)
self._thread = threading.Thread(target=self._run_process)
self._thread.daemon = True
def __enter__(self):
self._thread.start()
return self
def __exit__(self, exc_type, exc_val, exc_tb):
self._thread.join()
self._std_file_handle.close()
def __iter__(self):
# read all output from stdout file that subprocess.communicate fills
with open(self._std_file_handle.name, 'r') as stdout:
# while process is alive, keep reading data
while not self._process_done:
out = stdout.readline()
out_without_trailing_whitespaces = out.rstrip()
if out_without_trailing_whitespaces:
# yield stdout data without trailing \n
yield out_without_trailing_whitespaces
else:
# if there is nothing to read, then please wait a tiny little bit
time.sleep(self._seconds_to_wait)
# this is a hack: terraform seems to write to buffer after process has finished
out = stdout.read()
if out:
yield out
if self._process_has_timed_out:
raise ProcessTimeout('Process has timed out')
if self._process.returncode != 0:
raise ProcessError('Process has failed')
def _run_process(self):
try:
# Start gathering information (stdout and stderr) from the opened process
self._process.communicate(timeout=self._timeout)
# Graceful termination of the opened process
self._process.terminate()
except subprocess.TimeoutExpired:
self._process_has_timed_out = True
# Force termination of the opened process
self._process.kill()
self._process_done = True
@property
def return_code(self):
return self._process.returncode
Here is a class which I'm using in one of my projects. It redirects output of a subprocess to the log. At first I tried simply overwriting the write-method but that doesn't work as the subprocess will never call it (redirection happens on filedescriptor level). So I'm using my own pipe, similar to how it's done in the subprocess-module. This has the advantage of encapsulating all logging/printing logic in the adapter and you can simply pass instances of the logger to Popen: subprocess.Popen("/path/to/binary", stderr = LogAdapter("foo"))
import logging
import os
import threading

class LogAdapter(threading.Thread):

    def __init__(self, logname, level=logging.INFO):
        super().__init__()
        self.log = logging.getLogger(logname)
        self.readpipe, self.writepipe = os.pipe()

        logFunctions = {
            logging.DEBUG: self.log.debug,
            logging.INFO: self.log.info,
            logging.WARN: self.log.warn,
            logging.ERROR: self.log.warn,
        }

        try:
            self.logFunction = logFunctions[level]
        except KeyError:
            self.logFunction = self.log.info

    def fileno(self):
        # when fileno is called this indicates the subprocess is about to fork => start thread
        self.start()
        return self.writepipe

    def finished(self):
        """If the write-filedescriptor is not closed this thread will
        prevent the whole program from exiting. You can use this method
        to clean up after the subprocess has terminated."""
        os.close(self.writepipe)

    def run(self):
        inputFile = os.fdopen(self.readpipe)

        while True:
            line = inputFile.readline()

            if len(line) == 0:
                # no new data was added
                break

            self.logFunction(line.strip())
If you don't need logging but simply want to use print() you can obviously remove large portions of the code and keep the class shorter. You could also expand it by an __enter__ and __exit__ method and call finished in __exit__ so that you could easily use it as context.
import os

def execute(cmd, callback):
    for line in iter(os.popen(cmd).readline, ''):
        callback(line[:-1])

execute('ls -a', print)
Had the same problem and worked out a simple and clean solution using process.stdout.read1() which works perfectly for my needs in python3.
Here is a demo using the ping command (requires internet connection):
from subprocess import Popen, PIPE
cmd = "ping 8.8.8.8"
proc = Popen([cmd], shell=True, stdout=PIPE)
while True:
    print(proc.stdout.read1())
Every second or so a new line is printed in the python console as the ping command reports its data in real time.
In my view "live output from subprocess command" means that both stdout and stderr should be live. And stdin should also be delivered to the subprocess.
The fragment below produces live output on stdout and stderr and also captures them as bytes in outcome.{stdout,stderr}.
The trick involves proper use of select and poll.
Works well for me on Python 3.9.
if self.log == 1:
    print(f"** cmnd= {fullCmndStr}")

self.outcome.stdcmnd = fullCmndStr
try:
    process = subprocess.Popen(
        fullCmndStr,
        shell=True,
        encoding='utf8',
        executable="/bin/bash",
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
except OSError:
    self.outcome.error = OSError
else:
    process.stdin.write(stdin)
    process.stdin.close()  # type: ignore

    stdoutStrFile = io.StringIO("")
    stderrStrFile = io.StringIO("")

    pollStdout = select.poll()
    pollStderr = select.poll()

    pollStdout.register(process.stdout, select.POLLIN)
    pollStderr.register(process.stderr, select.POLLIN)

    stdoutEOF = False
    stderrEOF = False

    while True:
        stdoutActivity = pollStdout.poll(0)
        if stdoutActivity:
            c = process.stdout.read(1)
            if c:
                stdoutStrFile.write(c)
                if self.log == 1:
                    sys.stdout.write(c)
            else:
                stdoutEOF = True

        stderrActivity = pollStderr.poll(0)
        if stderrActivity:
            c = process.stderr.read(1)
            if c:
                stderrStrFile.write(c)
                if self.log == 1:
                    sys.stderr.write(c)
            else:
                stderrEOF = True

        if stdoutEOF and stderrEOF:
            break

    if self.log == 1:
        print(f"** cmnd={fullCmndStr}")

    process.wait()  # type: ignore
    self.outcome.stdout = stdoutStrFile.getvalue()
    self.outcome.stderr = stderrStrFile.getvalue()
    self.outcome.error = process.returncode  # type: ignore
The only way I've found to read a subprocess' output in a streaming fashion (while also capturing it in a variable) in Python, for multiple output streams (i.e. both stdout and stderr), is by passing the subprocess a named temporary file to write to and then opening the same file with a separate reading handle.
Note: this is for Python 3
import io
import subprocess
import sys
import tempfile
import time

stdout_write = tempfile.NamedTemporaryFile()
stdout_read = io.open(stdout_write.name, "r")
stderr_write = tempfile.NamedTemporaryFile()
stderr_read = io.open(stderr_write.name, "r")

stdout_captured = ""
stderr_captured = ""

proc = subprocess.Popen(["command"], stdout=stdout_write, stderr=stderr_write)
while True:
    proc_done: bool = proc.poll() is not None

    while True:
        content = stdout_read.read(1024)
        sys.stdout.write(content)
        stdout_captured += content
        if len(content) < 1024:
            break

    while True:
        content = stderr_read.read(1024)
        sys.stderr.write(content)
        stderr_captured += content
        if len(content) < 1024:
            break

    if proc_done:
        break

    time.sleep(0.1)

stdout_write.close()
stdout_read.close()
stderr_write.close()
stderr_read.close()
However, if you don't need to capture the output, then you can simply pass sys.stdout and sys.stderr streams from your Python script to the called subprocess, as xaav suggested in his answer :
subprocess.Popen(["command"], stdout=sys.stdout, stderr=sys.stderr)

Python: select() doesn't signal all input from pipe

I am trying to load an external command line program with Python and communicate with it via pipes. The program takes text input via stdin and produces text output in lines to stdout. Communication should be asynchronous using select().
The problem is, that not all output of the program is signalled in select(). Usually the last one or two lines are not signalled. If select() returns with a timeout and I am trying to read from the pipe anyway readline() returns immediately with the line sent from the program. See code below.
The program doesn't buffer the output and sends all output in text lines. Connecting to the program via pipes in many other languages and environments has worked fine so far.
I have tried Python 3.1 and 3.2 on Mac OSX 10.6.
import subprocess
import select

engine = subprocess.Popen("Engine", bufsize=0, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
engine.stdin.write(b"go\n")
engine.stdin.flush()

while True:
    inputready, outputready, exceptready = select.select([engine.stdout.fileno()], [], [], 10.0)

    if (inputready, outputready, exceptready) == ([], [], []):
        print("trying to read from engine anyway...")
        line = engine.stdout.readline()
        print(line)

    for s in inputready:
        line = engine.stdout.readline()
        print(line)
Note that internally file.readlines([size]) loops and invokes the read() syscall more than once, attempting to fill an internal buffer of size. The first call to read() will immediately return, since select() indicated the fd was readable. However the 2nd call will block until data is available, which defeats the purpose of using select. In any case it is tricky to use file.readlines([size]) in an asynchronous app.
You should call os.read(fd, size) once on each fd for every pass through select. This performs a non-blocking read, and lets you buffer partial lines until data is available and detects EOF unambiguously.
I modified your code to illustrate using os.read. It also reads from the process' stderr:
import os
import select
import subprocess
from cStringIO import StringIO

target = 'Engine'
PIPE = subprocess.PIPE
engine = subprocess.Popen(target, bufsize=0, stdin=PIPE, stdout=PIPE, stderr=PIPE)
engine.stdin.write(b"go\n")
engine.stdin.flush()

class LineReader(object):

    def __init__(self, fd):
        self._fd = fd
        self._buf = ''

    def fileno(self):
        return self._fd

    def readlines(self):
        data = os.read(self._fd, 4096)
        if not data:
            # EOF
            return None
        self._buf += data
        if '\n' not in data:
            return []
        tmp = self._buf.split('\n')
        lines, self._buf = tmp[:-1], tmp[-1]
        return lines

proc_stdout = LineReader(engine.stdout.fileno())
proc_stderr = LineReader(engine.stderr.fileno())
readable = [proc_stdout, proc_stderr]

while readable:
    ready = select.select(readable, [], [], 10.0)[0]
    if not ready:
        continue
    for stream in ready:
        lines = stream.readlines()
        if lines is None:
            # got EOF on this stream
            readable.remove(stream)
            continue
        for line in lines:
            print line

How do I get 'real-time' information back from a subprocess.Popen in python (2.5)

I'd like to use the subprocess module in the following way:
create a new process that potentially takes a long time to execute.
capture stdout (or stderr, or potentially both, either together or separately)
Process data from the subprocess as it comes in, perhaps firing events on every line received (in wxPython say) or simply printing them out for now.
I've created processes with Popen, but if I use communicate() the data comes at me all at once, once the process has terminated.
If I create a separate thread that does a blocking readline() of myprocess.stdout (using stdout = subprocess.PIPE) I don't get any lines with this method either, until the process terminates. (no matter what I set as bufsize)
Is there a way to deal with this that isn't horrendous, and works well on multiple platforms?
Update with code that appears not to work (on windows anyway)
import threading
import wx

class ThreadWorker(threading.Thread):
    def __init__(self, callable, *args, **kwargs):
        super(ThreadWorker, self).__init__()
        self.callable = callable
        self.args = args
        self.kwargs = kwargs
        self.setDaemon(True)

    def run(self):
        try:
            self.callable(*self.args, **self.kwargs)
        except wx.PyDeadObjectError:
            pass
        except Exception, e:
            print e

if __name__ == "__main__":
    import os
    from subprocess import Popen, PIPE

    def worker(pipe):
        while True:
            line = pipe.readline()
            if line == '': break
            else: print line

    proc = Popen("python subprocess_test.py", shell=True, stdin=PIPE, stdout=PIPE, stderr=PIPE)

    stdout_worker = ThreadWorker(worker, proc.stdout)
    stderr_worker = ThreadWorker(worker, proc.stderr)
    stdout_worker.start()
    stderr_worker.start()
    while True: pass
stdout will be buffered - so you won't get anything till that buffer is filled, or the subprocess exits.
You can try flushing stdout from the sub-process, or using stderr, or changing stdout on non-buffered mode.
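For example (my sketch, reusing the subprocess_test.py script from the question), a child Python interpreter can be forced into unbuffered mode with the -u flag so lines arrive as they are printed:
import sys
from subprocess import Popen, PIPE

proc = Popen([sys.executable, "-u", "subprocess_test.py"],
             stdin=PIPE, stdout=PIPE, stderr=PIPE)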
It sounds like the issue might be the use of buffered output by the subprocess - if a relatively small amount of output is created, it could be buffered until the subprocess exits. Some background can be found here:
Here's what worked for me:
cmd = ["./tester_script.bash"]
p = subprocess.Popen( cmd, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE )
while p.poll() is None:
out = p.stdout.readline()
do_something_with( out, err )
In your case you could try to pass a reference to the sub-process to your Worker Thread, and do the polling inside the thread. I don't know how it will behave when two threads poll (and interact with) the same subprocess, but it may work.
Also note that the while p.poll() is None: is intended as is. Do not replace it with while not p.poll(), since in Python 0 (the returncode for successful termination) is also considered False.
I've been running into this problem as well. The problem occurs because you are trying to read stderr as well. If there are no errors, then trying to read from stderr would block.
On Windows, there is no easy way to poll() file descriptors (only Winsock sockets).
So a solution is not to try and read from stderr.
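One way to follow that advice with the question's code (a sketch, untested on the asker's setup) is to merge stderr into stdout so only one pipe needs reading:
from subprocess import Popen, PIPE, STDOUT

proc = Popen("python subprocess_test.py", shell=True,
             stdin=PIPE, stdout=PIPE, stderr=STDOUT)
stdout_worker = ThreadWorker(worker, proc.stdout)  # a single worker now sees both streams
stdout_worker.start()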
Using pexpect [http://www.noah.org/wiki/Pexpect] with non-blocking readlines will resolve this problem. It stems from the fact that pipes are buffered, and so your app's output is getting buffered by the pipe, therefore you can't get to that output until the buffer fills or the process dies.
This seems to be a well-known Python limitation, see
PEP 3145 and maybe others.
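A minimal pexpect sketch of that approach (POSIX only; my illustration, reusing the script name from the question):
import pexpect

# pexpect runs the child under a pseudo-terminal, so the child's stdio is
# line-buffered instead of block-buffered as it would be with a pipe.
child = pexpect.spawn('python subprocess_test.py')
while True:
    line = child.readline()
    if not line:
        break
    print(line.rstrip())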
Read one character at a time: http://blog.thelinuxkid.com/2013/06/get-python-subprocess-output-without.html
import contextlib
import subprocess

# Unix, Windows and old Macintosh end-of-line
newlines = ['\n', '\r\n', '\r']

def unbuffered(proc, stream='stdout'):
    stream = getattr(proc, stream)
    with contextlib.closing(stream):
        while True:
            out = []
            last = stream.read(1)
            # Don't loop forever
            if last == '' and proc.poll() is not None:
                break
            while last not in newlines:
                # Don't loop forever
                if last == '' and proc.poll() is not None:
                    break
                out.append(last)
                last = stream.read(1)
            out = ''.join(out)
            yield out

def example():
    cmd = ['ls', '-l', '/']
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        # Make all end-of-lines '\n'
        universal_newlines=True,
    )
    for line in unbuffered(proc):
        print line

example()
Using subprocess.Popen, I can run the .exe of one of my C# projects and redirect the output to my Python file. I am able now to print() all the information being output to the C# console (using Console.WriteLine()) to the Python console.
Python code:
from subprocess import Popen, PIPE, STDOUT
p = Popen('ConsoleDataImporter.exe', stdout = PIPE, stderr = STDOUT, shell = True)
while True:
    line = p.stdout.readline()
    print(line)
    if not line:
        break
This gets the console output of my .NET project line by line as it is created and breaks out of the enclosing while loop upon the project's termination. I'd imagine this would work for two python files as well.
I've used the pexpect module for this, it seems to work ok. http://sourceforge.net/projects/pexpect/
