gobject io monitoring + nonblocking reads - python

I've got a problem with using the io_add_watch monitor in python (via gobject). I want to do a nonblocking read of the whole buffer after every notification. Here's the code (shortened a bit):
import os
import fcntl
import gobject

class SomeApp(object):
    def __init__(self):
        # some other init that does a lot of stderr debug writes
        fl = fcntl.fcntl(0, fcntl.F_GETFL, 0)
        fcntl.fcntl(0, fcntl.F_SETFL, fl | os.O_NONBLOCK)
        print "hooked", gobject.io_add_watch(0, gobject.IO_IN | gobject.IO_PRI, self.got_message, [""])
        self.app = gobject.MainLoop()

    def run(self):
        print "ready"
        self.app.run()

    def got_message(self, fd, condition, data):
        print "reading now"
        data[0] += os.read(0, 1024)
        print "got something", fd, condition, data
        return True

gobject.threads_init()
SomeApp().run()
Here's the trick: when I run the program without debug output activated, I don't get the got_message calls. When I write a lot of stuff to stderr first, the problem disappears. If I don't write anything apart from the prints visible in this code, I don't get the stdin message signals. Another interesting thing is that when I run the same app with stderr debug enabled but via strace (to check for any fcntl / ioctl calls I missed), the problem appears again.
So in short: if I write a lot to stderr first without strace, io_add_watch works. If I write a lot with strace, or don't write at all, io_add_watch doesn't work.
The "some other init" part takes some time, so if I type some text before I see the "hooked 2" output and then press Ctrl+C after "ready", the got_message callback is called, but the read call throws EAGAIN, so the buffer seems to be empty.
Strace log related to the stdin:
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost isig icanon echo ...}) = 0
fcntl(0, F_GETFL) = 0xa002 (flags O_RDWR|O_ASYNC|O_LARGEFILE)
fcntl(0, F_SETFL, O_RDWR|O_NONBLOCK|O_ASYNC|O_LARGEFILE) = 0
fcntl(0, F_GETFL) = 0xa802 (flags O_RDWR|O_NONBLOCK|O_ASYNC|O_LARGEFILE)
Does anyone have some ideas on what's going on here?
EDIT: Another clue. I tried to refactor the app to do the reading in a different thread and pass it back via a pipe. It "kind of" works:
...
rpipe, wpipe = os.pipe()
stopped = threading.Event()
self.stdreader = threading.Thread(name="reader", target=self.std_read_loop, args=(wpipe, stopped))
self.stdreader.start()
new_data = ""
print "hooked", gobject.io_add_watch(rpipe, gobject.IO_IN | gobject.IO_PRI, self.got_message, [new_data])

def std_read_loop(self, wpipe, stop_event):
    while True:
        try:
            new_data = os.read(0, 1024)
            while len(new_data) > 0:
                l = os.write(wpipe, new_data)
                new_data = new_data[l:]
        except OSError, e:
            if stop_event.isSet():
                break
            time.sleep(0.1)
...
It's surprising that if I just put the same text in a new pipe, everything starts to work. The problem is that:
- the first line is not "noticed" at all - I get only the second and following lines
- it's fugly
Maybe that will give someone else a clue on why that's happening?

This sounds like a race condition in which there is some delay to setting your callback, or else there is a change in the environment which affects whether or not you can set the callback.
I would look carefully at what happens before you call io_add_watch(). For instance the Python fcntl docs say:
All functions in this module take a file descriptor fd as their first argument. This can be an integer file descriptor, such as returned by sys.stdin.fileno(), or a file object, such as sys.stdin itself, which provides a fileno() which returns a genuine file descriptor.
Clearly that is not what you are doing when you assume that STDIN will have FD == 0. I would change that first and try again.
The other thing is that if the FD is already blocked, then your process could be waiting while other non-blocked processes are running, therefore there is a timing difference depending on what you do first. What happens if you refactor the fcntl stuff so that it is done soon after the program starts, even before importing the GTK modules?
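For instance (a hypothetical reordering, not taken from the question's full program), grabbing the descriptor from sys.stdin and making it non-blocking before any other init runs would look roughly like this:
import fcntl
import os
import sys

# Sketch: take the fd from sys.stdin rather than assuming it is 0,
# and set O_NONBLOCK as early as possible, before the rest of the init.
stdin_fd = sys.stdin.fileno()
flags = fcntl.fcntl(stdin_fd, fcntl.F_GETFL, 0)
fcntl.fcntl(stdin_fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)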
I'm not sure that I understand why a program using the GTK GUI would want to read from the standard input in the first place. If you are actually trying to capture the output of another process, you should use the subprocess module to set up a pipe, then io_add_watch() on the pipe like so:
proc = subprocess.Popen(command, stdout = subprocess.PIPE)
gobject.io_add_watch(proc.stdout, glib.IO_IN, self.write_to_buffer )
Again, in this example we make sure that we have a valid opened FD before calling io_add_watch().
Normally, when gobject.io_add_watch() is used, it is called just before gobject.MainLoop(). For example, here is some working code using io_add_watch to catch IO_IN.
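The working code referenced above was a link in the original; a minimal self-contained sketch along the same lines (assuming the static-binding gobject API used in the question) would be:
import os
import sys
import gobject

def got_data(fd, condition):
    # With a terminal in canonical mode this fires once per line.
    data = os.read(fd, 1024)
    print "got:", repr(data)
    return True  # keep the watch installed

gobject.io_add_watch(sys.stdin.fileno(), gobject.IO_IN, got_data)
loop = gobject.MainLoop()
loop.run()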

The documentation says you should return TRUE from the callback or it will be removed from the list of event sources.

What happens if you hook the callback first, prior to any stderr output? Does it still get called when you have debug output enabled?
Also, I suppose you should probably be repeatedly calling os.read() in your handler until it gives no data, in case >1024 bytes become ready between calls.
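A sketch of such a drain loop (written as a free function rather than the question's method, and assuming the fd has already been set non-blocking):
import errno
import os

def got_message(fd, condition, data):
    # Keep reading until EOF, a short read, or EAGAIN says the buffer is drained.
    while True:
        try:
            chunk = os.read(fd, 1024)
        except OSError, e:
            if e.errno == errno.EAGAIN:
                break          # nothing more to read right now
            raise
        if not chunk:
            break              # EOF
        data[0] += chunk
        if len(chunk) < 1024:
            break              # short read: nothing left for now
    return True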
Have you tried using the select module in a background thread to emulate gio functionality? Does that work? What platform is this and what kind of FD are you dealing with? (file? socket? pipe?)

Related

How to write zero bytes to stdin of subprocess?

There's a console program I want to run from a python script. Let me call it the child.
Once in a while, to continue processing, the child expects to read 0 bytes of data from stdin.
For simplicity, let's assume the child is the following python script:
child.py
import os
import sys
import time

stdin = sys.stdin.fileno()

def speak(message):
    print(message, flush=True, end="")

while True:
    speak("Please say nothing!")
    data = os.read(stdin, 1024)
    if data == b"":
        speak("Thank you for nothing.")
        time.sleep(5)
    else:
        speak("I won't continue unless you keep silent.")
To work successfully (i.e. seeing "Thank you for nothing." printed), you must typically hit Ctrl+D when running it in a UNIX terminal, while in cmd or PowerShell under Windows hitting Ctrl+Z followed by Enter will do the trick.
Here's an attempt to run the child from inside a python script, which I shall call the parent:
parent.py
import os
import subprocess

def speak_to_child(child_stdin_r, child_stdin_w, message):
    child = subprocess.Popen(
        ["python", "child.py"],
        stdin=child_stdin_r,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE
    )
    child_stdout_r = child.stdout.fileno()
    while True:
        data = os.read(child_stdout_r, 1024)
        print(f"child said: {data}")
        if data == b"Please say nothing!":
            os.write(child_stdin_w, message)

child_stdin_r, child_stdin_w = os.pipe()
speak_to_child(child_stdin_r, child_stdin_w, b"Not sure how to say nothing.")
This is of course an unsuccessful attempt as the child will clearly answer with "I won't continue unless you keep silent." after reading "Not sure how to say nothing." from its stdin.
Naively changing the message b"Not sure how to say nothing." in the parent to the empty message b"" doesn't get rid of the problem, since writing 0 bytes to a pipe won't cause a read of 0 bytes on the receiving end.
Now on UNIX we could easily solve the problem by replacing the pipe with a pseudoterminal and the empty message b"" with an EOT character b"\x04" like so:
import pty
child_stdin_w, child_stdin_r = pty.openpty()
speak_to_child(child_stdin_r, child_stdin_w, b"\x04")
This works ... but evidently not on Windows. So now my questions:
On UNIX, is using a pseudoterminal the best way to force the read of 0 bytes or is there a better way?
On Windows, given that pseudoterminals aren't available, how can I solve the problem?
A platform agnostic solution would of course be ideal.

How to get output from python2 subprocess which run a script using multiprocessing?

Here is my demo code. It contains two scripts.
The first is main.py; it calls print_line.py with the subprocess module.
The second is print_line.py; it prints something to stdout.
main.py
import subprocess

p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)

while True:
    line = p.stdout.readline()
    if line:
        print(line)
    else:
        break
print_line.py
from multiprocessing import Process, JoinableQueue, current_process

if __name__ == '__main__':
    task_q = JoinableQueue()

    def do_task():
        while True:
            task = task_q.get()
            pid = current_process().pid
            print 'pid: {}, task: {}'.format(pid, task)
            task_q.task_done()

    for _ in range(10):
        p = Process(target=do_task)
        p.daemon = True
        p.start()

    for i in range(100):
        task_q.put(i)

    task_q.join()
Previously, print_line.py was written with the threading and Queue modules, and everything was fine. But now, after changing to the multiprocessing module, main.py cannot get any output from print_line. I tried to use Popen.communicate() to get the output, and to set preexec_fn=os.setsid in Popen(). Neither of them works.
So, here is my question:
Why can't subprocess get the output when multiprocessing is used, and why is it OK with threading?
If I comment out stdout=subprocess.PIPE and stderr=subprocess.PIPE, the output is printed to my console. Why? How does this happen?
Is there any chance to get the output from print_line.py?
Curious.
In theory this should work as it is, but it does not. The reason lies somewhere in the deep, murky waters of buffered IO. It seems that the output of a subprocess of a subprocess can get lost if it is not flushed.
You have two workarounds:
One is to use flush() in your print_line.py:
import sys

def do_task():
    while True:
        task = task_q.get()
        pid = current_process().pid
        print 'pid: {}, task: {}'.format(pid, task)
        sys.stdout.flush()
        task_q.task_done()
This will fix the issue as you will flush your stdout as soon as you have written something to it.
Another option is to use -u flag to Python in your main.py:
p = subprocess.Popen('python2 -u print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)
-u will force stdin and stdout to be completely unbuffered in print_line.py, and children of print_line.py will then inherit this behaviour.
These are workarounds to the problem. If you are interested in the theory why this happens, it definitely has something to do with unflushed stdout being lost if subprocess terminates, but I am not the expert in this.
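A related approach, not from the original answer but standard CPython behaviour, is to set the PYTHONUNBUFFERED environment variable for the child instead of editing its command line; a non-empty value has the same effect as -u:
import os
import subprocess

env = dict(os.environ, PYTHONUNBUFFERED="1")  # equivalent to running the child with -u
p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True,
                     env=env)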
It's not a multiprocessing issue, but a subprocess issue - or more precisely, it has to do with standard I/O and buffering, as in Hannu's answer. The trick is that by default, the output of any process, whether in Python or not, is line buffered if the output device is a "terminal device" as determined by os.isatty(stream.fileno()):
>>> import sys
>>> sys.stdout.fileno()
1
>>> import os
>>> os.isatty(1)
True
There is a shortcut available to you once the stream is open:
>>> sys.stdout.isatty()
True
but the os.isatty() operation is the more fundamental one. That is, internally, Python inspects the file descriptor first using os.isatty(fd), then chooses the stream's buffering based on the result (and/or arguments and/or the function used to open the stream). The sys.stdout stream is opened early on during Python's startup, before you generally have much control.1
When you call open or codecs.open or otherwise do your own operation to open a file, you can specify the buffering via one of the optional arguments. The default for open is the system default, which is line buffering if isatty(), otherwise fully buffered. Curiously, the default for codecs.open is line buffered.
A line buffered stream gets an automatic flush() applied when you write a newline to it.
An unbuffered stream writes each byte to its output immediately. This is very inefficient in general. A fully buffered stream writes its output when the buffer gets sufficiently full—the definition of "sufficient" here tends to be pretty variable, anything from 1024 (1k) to 1048576 (1 MB)—or when explicitly directed.
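For example, the buffering argument to open makes the choice explicit (a small sketch; the file names are just placeholders, and unbuffered mode requires binary mode on Python 3):
line_buffered = open('lines.log', 'w', 1)    # flushed at every newline
big_buffer = open('bulk.log', 'w', 65536)    # flushed roughly every 64 kB, or on close
unbuffered = open('raw.bin', 'wb', 0)        # every write goes straight to the OS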
When you run something as a process, it's the process itself that decides how to do any buffering. Your own Python code, reading from the process, cannot control it. But if you know something—or a lot—about the processes that you will run, you can set up their environment so that they run line-buffered, or even unbuffered. (Or, as in your case, since you write that code, you can write it to do what you want.)
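One common way to do that on Linux for a child that simply inherits the C library defaults (an assumption here, since the answer does not name a tool; some_command is a placeholder) is coreutils' stdbuf:
import subprocess

# stdbuf -oL asks the child's C runtime to line-buffer stdout,
# so each line reaches the pipe as soon as it is printed.
proc = subprocess.Popen(['stdbuf', '-oL', 'some_command', '--some-arg'],
                        stdout=subprocess.PIPE)
for line in proc.stdout:
    print(line)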
1There are hooks that fire very early, where you can fuss with this sort of thing. They are tricky to work with, though.

Python subprocess: read returncode is sometimes different from returned code

I have a Python script that calls another Python script using subprocess.Popen. I know the called code always returns 10, which means it failed.
My problem is that the caller only reads 10 approximately 75% of the time. The other 25% it reads 0 and mistakes the called program's failure code for a success. Same command, same environment, apparently random occurrences.
Environment: Python 2.7.10, Linux Red Hat 6.4. The code presented here is a (very) simplified version, but I can still reproduce the problem using it.
This is the called script, constant_return.py:
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Simplified called code
"""
import sys

if __name__ == "__main__":
    sys.exit(10)
This is the caller code:
#!/usr/bin/env python2.7
# -*- coding: utf-8 -*-
"""
Simplified version of the calling code
"""
try:
    import sys
    import subprocess
    import threading
except Exception, eImp:
    print "Error while loading Python library : %s" % eImp
    sys.exit(100)

class BizarreProcessing(object):
    """
    Simplified caller class
    """
    def __init__(self):
        """
        Classic initialization
        """
        object.__init__(self)

    def logPipe(self, isStdOut_, process_):
        """
        Simplified log handler
        """
        try:
            if isStdOut_:
                output = process_.stdout
                logfile = open("./log_out.txt", "wb")
            else:
                output = process_.stderr
                logfile = open("./log_err.txt", "wb")
            #Read pipe content as long as the process is running
            while (process_.poll() == None):
                text = output.readline()
                if (text != '' and text.strip() != ''):
                    logfile.write(text)
            #When the process is finished, there might still be lines remaining in the pipe
            output.readlines()
            for oneline in output.readlines():
                if (oneline != None and oneline.strip() != ''):
                    logfile.write(text)
        finally:
            logfile.close()

    def startProcessing(self):
        """
        Launch process
        """
        # Simplified command line definition
        command = "/absolute/path/to/file/constant_return.py"
        # Execute command in a new process
        process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        #Launch a thread to gather called programm stdout and stderr
        #This to avoid a deadlock with pipe filled and such
        stdoutTread = threading.Thread(target=self.logPipe, args=(True, process))
        stdoutTread.start()
        stderrThread = threading.Thread(target=self.logPipe, args=(False, process))
        stderrThread.start()
        #Wait for the end of the process and get process result
        stdoutTread.join()
        stderrThread.join()
        result = process.wait()
        print("returned code: " + str(result))
        #Send it back to the caller
        return (result)

#
# Main
#
if __name__ == "__main__":
    # Execute caller code
    processingInstance = BizarreProcessing()
    aResult = processingInstance.startProcessing()
    #Return the code
    sys.exit(aResult)
Here is what I type in bash to execute the caller script:
for res in {1..100}
do
    /path/to/caller/script.py
    echo $? >> /tmp/returncodelist.txt
done
It seems to be somehow connected to the way I read the called program outputs, because when I create the subprocess with process = subprocess.Popen(command, shell=True, stdout=sys.stdout, stderr=sys.stderr) and remove all the Thread stuff it reads the correct return code (but doesn't log as I want anymore...)
Any idea what I did wrong?
Thanks a lot for your help
logPipe is also checking whether the process is alive to determine whether there's more data to read. This is not correct - you should be checking whether the pipe has reached EOF, by looking for a zero-length read, or by using output.readlines(). The I/O pipes may outlive the process.
This simplifies logPipe significantly: Change logPipe as below:
def logPipe(self, isStdOut_, process_):
    """
    Simplified log handler
    """
    try:
        if isStdOut_:
            output = process_.stdout
            logfile = open("./log_out.txt", "wb")
        else:
            output = process_.stderr
            logfile = open("./log_err.txt", "wb")
        # Read the pipe until EOF, regardless of whether the process is still running
        with output:
            for text in output:
                if text.strip():  # ... checks if it's not an empty string
                    logfile.write(text)
    finally:
        logfile.close()
Second, don't join your logging threads until after process.wait(), for the same reason - the I/O pipes may outlive the process.
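Put together, the ordering looks roughly like this (a sketch reusing the question's paths; the drain helper stands in for the rewritten logPipe):
import subprocess
import threading

def drain(pipe, path):
    # Copy everything from the pipe into a log file until EOF.
    with open(path, "wb") as logfile:
        for line in pipe:
            logfile.write(line)

process = subprocess.Popen("/absolute/path/to/file/constant_return.py",
                           shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
t_out = threading.Thread(target=drain, args=(process.stdout, "./log_out.txt"))
t_err = threading.Thread(target=drain, args=(process.stderr, "./log_err.txt"))
t_out.start()
t_err.start()
result = process.wait()   # reap the child and capture the exit status first
t_out.join()              # then wait for the readers to hit EOF
t_err.join()
print("returned code: " + str(result))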
What I think is happening under the covers is that there's a SIGPIPE being emitted and mishandled somewhere - possibly being misconstrued as the process termination condition. This is because the pipe is being closed on one end or the other without being flushed. SIGPIPE can sometimes be a nuisance in larger applications; it may be that the Python library swallows it or does something childish with it.
Edit: As @Blackjack points out, SIGPIPE is automatically blocked by Python. So, that rules out SIGPIPE malfeasance. A second theory, though: the documentation for Popen.poll() states:
Check if child process has terminated. Set and return returncode attribute.
If you strace this (e.g., strace -f -o strace.log ./caller.py), this appears to be done via wait4(WNOHANG). You've got two threads waiting with WNOHANG and one waiting normally, but only one call will return correctly with the process exit code. If there is no lock in the implementation of Popen.poll(), then there is quite likely a race to assign process.returncode, or a potential failure to do so correctly. Limiting your Popen.wait/poll calls to a single thread should be a good way to avoid this. See man waitpid.
Edit: As an aside, if you can hold all your stdout/stderr data in memory, subprocess.communicate() is much easier to use and does not require logPipe or background threads at all.
https://docs.python.org/2/library/subprocess.html#subprocess.Popen.communicate
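A minimal sketch of that variant, reusing the question's log file names:
import subprocess

process = subprocess.Popen("/absolute/path/to/file/constant_return.py",
                           shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = process.communicate()   # reads both pipes to EOF and waits for the child
with open("./log_out.txt", "wb") as f:
    f.write(out)
with open("./log_err.txt", "wb") as f:
    f.write(err)
print("returned code: " + str(process.returncode))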

SGE script: print to file during execution (not just at the end)?

I have an SGE script to execute some python code, submitted to the queue using qsub. In the python script, I have a few print statements (updating me on the progress of the program). When I run the python script from the command line, the print statements are sent to stdout. For the sge script, I use the -o option to redirect the output to a file. However, it seems that the script will only send these to the file after the python script has completed running. This is annoying because (a) I can no longer see real time updates on the program and (b) if my job does not terminate correctly (for example if my job gets kicked off the queue) none of the updates are printed. How can I make sure that the script is writing to the file each time I want to print something, as opposed to lumping it all together at the end?
I think you are running into an issue with buffered output. Python uses a library to handle its output, and the library knows that it's more efficient to write a block at a time when it's not talking to a tty.
There are a couple of ways to work around this. You can run python with the "-u" option (see the python man page for details), for example, with something like this as the first line of your script:
#! /usr/bin/python -u
but this doesn't work if you are using the "/usr/bin/env" trick because you don't know where python is installed.
Another way is to reopen the stdout with something like this:
import sys
import os
# reopen stdout file descriptor with write mode
# and 0 as the buffer size (unbuffered)
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
Note the bufsize parameter of os.fdopen being set to 0 to force it to be unbuffered. You can do something similar with sys.stderr.
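On Python 3 an unbuffered text-mode fdopen is not allowed, so a rough equivalent (an addition here, assuming Python 3.7+) is to ask the existing streams for line buffering:
import sys

# Python 3.7+: switch the already-open text streams to line buffering.
sys.stdout.reconfigure(line_buffering=True)
sys.stderr.reconfigure(line_buffering=True)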
As others mentioned, stdout is not always written immediately when it is not connected to a tty, for performance reasons.
If you have a specific point at which you want the stdout to be written, you can force that by using
import sys
sys.stdout.flush()
at that point.
I just encountered a similar issue with SGE, and no suggested method to "unbuffer" the file IO seemed to work for me. I had to wait until the end of program execution to see any output.
The workaround I found was to wrap sys.stdout into a custom object that re-implements the "write" method. Instead of actually writing to stdout, this new method instead opens the file where IO is redirected, appends with the desired data, and then closes the file. It's a bit ugly, but I found it solved the problem, since the actual opening/closing of the file forces IO to be interactive.
Here's a minimal example:
import os, sys, time

class RedirIOStream:
    def __init__(self, stream, REDIRPATH):
        self.stream = stream
        self.path = REDIRPATH

    def write(self, data):
        # instead of actually writing, just append to file directly!
        myfile = open(self.path, 'a')
        myfile.write(data)
        myfile.close()

    def __getattr__(self, attr):
        return getattr(self.stream, attr)

if not sys.stdout.isatty():
    # Detect redirected stdout and std error file locations!
    # Warning: this will only work on LINUX machines
    STDOUTPATH = os.readlink('/proc/%d/fd/1' % os.getpid())
    STDERRPATH = os.readlink('/proc/%d/fd/2' % os.getpid())
    sys.stdout = RedirIOStream(sys.stdout, STDOUTPATH)
    sys.stderr = RedirIOStream(sys.stderr, STDERRPATH)

# Simple program to print a message every 3 seconds
def main():
    nMsg = 10
    tstart = time.time()
    for x in xrange(nMsg):
        time.sleep(3)
        MSG = ' %d/%d after %.0f sec' % (x, nMsg, time.time() - tstart)
        print MSG

if __name__ == '__main__':
    main()
This is SGE buffering the output of your process; it happens whether it's a Python process or any other.
In general you can decrease or disable the buffering in SGE by changing it and recompiling. But it's not a great thing to do; all that data is going to be slowly written to disk, affecting your overall performance.
Why not print to a file instead of stdout?
outFileID = open('output.log', 'w')
print('INFO: still working!', file=outFileID, flush=True)
print('WARNING: blah blah!', file=outFileID, flush=True)
and use
tail -f output.log
This works for me:
import os
import sys

class ForceIOStream:
    def __init__(self, stream):
        self.stream = stream

    def write(self, data):
        self.stream.write(data)
        self.stream.flush()
        if not self.stream.isatty():
            os.fsync(self.stream.fileno())

    def __getattr__(self, attr):
        return getattr(self.stream, attr)

sys.stdout = ForceIOStream(sys.stdout)
sys.stderr = ForceIOStream(sys.stderr)
and the issue has to do with NFS not syncing data back to the master until a file is closed or fsync is called.
I hit this same problem today and solved it by just writing to disk instead of printing:
with open('log-file.txt', 'w') as out:
    out.write(status_report)
print() supports the argument flush since Python 3.3 (documentation). So, to force flush the stream:
print('Hello World!', flush=True)

Python `tee` stdout of child process

Is there a way in Python to do the equivalent of the UNIX command line tee? I'm doing a typical fork/exec pattern, and I'd like the stdout from the child to appear in both a log file and on the stdout of the parent simultaneously without requiring any buffering.
In this python code for instance, the stdout of the child ends up in the log file, but not in the stdout of the parent.
pid = os.fork()
logFile = open(path, "w")
if pid == 0:
    os.dup2(logFile.fileno(), 1)
    os.execv(cmd)
edit: I do not wish to use the subprocess module. I'm doing some complicated stuff with the child process that requires me call fork manually.
Here you have a working solution without using the subprocess module. Although, you could use it for the tee process while still using the exec* functions suite for your custom subprocess (just use stdin=subprocess.PIPE and then duplicate the descriptor to your stdout).
import os, time, sys

pr, pw = os.pipe()

pid = os.fork()
if pid == 0:
    os.close(pw)
    os.dup2(pr, sys.stdin.fileno())
    os.close(pr)
    os.execv('/usr/bin/tee', ['tee', 'log.txt'])
else:
    os.close(pr)
    os.dup2(pw, sys.stdout.fileno())
    os.close(pw)
    pid2 = os.fork()
    if pid2 == 0:
        # Replace with your custom process call
        os.execv('/usr/bin/yes', ['yes'])
    else:
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            pass
Note that the tee command, internally, does the same thing as Ben suggested in his answer: reading input and looping over output file descriptors while writing to them. It may be more efficient because of the optimized implementation and because it's written in C, but you have the overhead of the different pipes (don't know for sure which solution is more efficient, but in my opinion, reassigning a custom file-like object to stdout is a more elegant solution).
Some more resources:
How do I duplicate sys.stdout to a log file in python?
http://www.shallowsky.com/blog/programming/python-tee.html
In the following, SOMEPATH is the path to the child executable, in a format suitable for subprocess.Popen (see its docs).
import sys, subprocess

f = open('logfile.txt', 'w')
proc = subprocess.Popen(SOMEPATH, stdout=subprocess.PIPE)
while True:
    out = proc.stdout.read(1)
    if out == '' and proc.poll() != None:
        break
    if out != '':
        # CR workaround since chars are read one by one, and Windows interprets
        # both CR and LF as end of lines. Linux only has LF
        if out != '\r': f.write(out)
        sys.stdout.write(out)
        sys.stdout.flush()
Would an approach like this do what you want?
import sys

class Log(object):
    def __init__(self, filename, mode, buffering):
        self.filename = filename
        self.mode = mode
        self.handle = open(filename, mode, buffering)

    def write(self, thing):
        self.handle.write(thing)
        sys.stdout.write(thing)
You'd probably need to implement more of the file interface for this to be really useful (and I've left out properly defaulting mode and buffering, if you want it). You could then do all your writes in the child process to an instance of Log. Or, if you wanted to be really magic, and you're sure you implement enough of the file interface that things won't fall over and die, you could potentially assign sys.stdout to be an instance of this class. Then I think any means of writing to stdout, including print, will go via the log class.
Edit to add: Obviously if you assign to sys.stdout you will have to do something else in the write method to echo the output to stdout!! I think you could use sys.__stdout__ for that.
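A sketch of that variant (an addition on top of the answer above, echoing to sys.__stdout__ so the replacement object never calls itself):
import sys

class Log(object):
    def __init__(self, filename, mode="w", buffering=1):
        self.handle = open(filename, mode, buffering)

    def write(self, thing):
        self.handle.write(thing)
        sys.__stdout__.write(thing)   # the original stdout, untouched by the reassignment

    def flush(self):
        self.handle.flush()
        sys.__stdout__.flush()

sys.stdout = Log("child.log")
print "this line goes to both child.log and the real stdout"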
Oh, you. I had a decent answer all prettied-up before I saw the last line of your example: execv(). Well, poop. The original idea was replacing each child process' stdout with an instance of this blog post's tee class, and split the stream into the original stdout, and the log file:
http://www.shallowsky.com/blog/programming/python-tee.html
But, since you're using execv(), the child process' tee instance would just get clobbered, so that won't work.
Unfortunately for you, there is no "out of the box" solution to your problem that I can find. The closest thing would be to spawn the actual tee program in a subprocess; if you wanted to be more cross-platform, you could fork a simple Python substitute.
First thing to know when coding a tee substitute: tee really is a simple program. In all the true C implementations I've seen, it's not much more complicated than this:
while ((character = read()) != EOF) {
    /* Write to all of the output streams in here, then write to stdout. */
}
Unfortunately, you can't just join two streams together. That would be really useful (so that the input of one stream would automatically be forwarded out of another), but we've no such luxury without coding it ourselves. So, Eli and I are going to have very similar answers. The difference is that, in my answer, the Python 'tee' is going to run in a separate process, via a pipe; that way, the parent thread is still useful!
(Remember: copy the blog post's tee class, too.)
import os, sys

# Open it for writing in binary mode.
logFile = open("bar", "bw")

# Verbose names, but I wanted to get the point across.
# These are file descriptors, i.e. integers.
parentSideOfPipe, childSideOfPipe = os.pipe()

# 'Tee' subprocess.
pid = os.fork()
if pid == 0:
    while True:
        char = os.read(parentSideOfPipe, 1)
        logFile.write(char)
        os.write(1, char)

# Actual command
pid = os.fork()
if pid == 0:
    os.dup2(childSideOfPipe, 1)
    os.execv(cmd)
I'm sorry if that's not what you wanted, but it's the best solution I can find.
Good luck with the rest of your project!
The first obvious answer is to fork an actual tee process but that is probably not ideal.
The tee code (from coreutils) merely reads each line and writes to each file in turn (effectively buffering).
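A minimal Python rendering of that loop (a sketch for text streams; the usage line assumes a proc.stdout pipe like the ones above):
import sys

def tee(infile, *outfiles):
    # Read the input a line at a time and copy each line to every output,
    # flushing so nothing is held back in a buffer.
    for line in iter(infile.readline, ''):
        for f in outfiles:
            f.write(line)
            f.flush()

# usage sketch: tee(proc.stdout, sys.stdout, open('log.txt', 'w'))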
