Python: Printing file to stream as file changes


I am running a large suite of unittests in a subprocess through another application (Autodesk Maya). Maya runs a special Python interpreter with its own libraries that cannot be used outside of the application, thus the need to test within the application. I would like the parent process to print the results of the testing as it is happening. The subprocess is very 'noisy' though, so I do not want to simply redirect the subprocess's stdout to the parent process's stdout. Instead, I would like the test runner to somehow stream directly to the parent process's stdout.
I am currently using a TextTestRunner in the subprocess with its stream set to an open file. The parent process knows where this file exists, and writes the contents of the file to stdout once the subprocess is complete. Since the tests can take a long time to run though, I would prefer that the parent process somehow 'stream' the contents of this file as it is being created by the subprocess. But I am not sure how to do this or if there is a better approach.
Here's an example of how this is currently set up.
import subprocess
module_path = 'my.test.module'
suite_callable = 'suite'
stream_fpath = '/tmp/the_test_results.txt'
script_fpath = '/tmp/the_test_script.py'
script = '''
import sys
if sys.version_info[0] <= 2 and sys.version_info[1] <= 6:
    import unittest2 as unittest
else:
    import unittest
import {module_path}
suite = {module_path}.{suite_callable}()
with open("{stream_fpath}", "w") as output:
    runner = unittest.TextTestRunner(stream=output)
    runner.run(suite)
'''.format(**locals())
with open(script_fpath, 'w') as f:
    f.write(script)
subprocess.call(['maya', '-command', '\'python("execfile(\\"{script_fpath}\\")")\''.format(**locals())])
with open(stream_fpath, 'r') as f:
    print(f.read())
Thanks for any info!

Rather than writing to a file, you should be able to make a file-like object to replace stderr. The object's write method could do something with each input as it comes in; you could squirt it to something listening on TCP, or print stuff to a Tk window, or anything else in addition to logging to a file if you still want the results.
Implementing a stream replacement is pretty simple. In this case you probably only need to implement write, writelines, open and close (plus flush, if your test runner calls it).
class FakeStdErr(object):
    def __init__(self):
        self.lines = []
    def write(self, text):
        self.lines.append(text)
    def writelines(self, lines):
        # file objects take a single iterable of strings here
        self.lines.extend(lines)
    def flush(self):
        pass  # nothing is buffered
    def open(self):
        self.lines = []
    def close(self):
        pass
In your use case you might want to use a silencer class (which is a variant on the same trick) to replace the default stdout (to shut up your chatty process) and direct your test runner stream to this guy; after all the tests are done you could dump the contents to disk as a file or print them to the screen by restoring the default stdout (the link shows how to do that if you're not familiar).
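The silencer idea might look roughly like the following. This is a minimal sketch assuming Python 3; the names Collector and silenced are illustrative and do not come from the linked article. It swaps sys.stdout for a collecting object and restores the original stream even if the body raises.

```python
import sys
from contextlib import contextmanager


class Collector(object):
    """File-like object that stores everything written to it."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)
        return len(text)

    def flush(self):
        pass  # nothing is buffered, so there is nothing to flush

    def getvalue(self):
        return ''.join(self.chunks)


@contextmanager
def silenced():
    """Temporarily replace sys.stdout; restore it even if the body raises."""
    original, collector = sys.stdout, Collector()
    sys.stdout = collector
    try:
        yield collector
    finally:
        sys.stdout = original


with silenced() as noise:
    print("chatty output that should not reach the screen")

captured_text = noise.getvalue()
```

Note that this only intercepts writes made through Python's sys.stdout in the same process; output written by C code or by child processes bypasses it.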

(Edited - suggest using stderr, or parsing)
Alternative 1: Intercept the output rather than having it go to a file.
Have script write to sys.stderr instead of the open() of stream_fpath:
runner = unittest.TextTestRunner(stream=sys.stderr)
Replace subprocess.call with running = subprocess.Popen(<existing parameters>, stderr=PIPE). Then read running.stderr until EOF or until running.poll() returns other than None. You can do what you want with the data. For example, you can print it to the screen and also print it to stream_fpath.
This assumes that the noisy output comes from maya, which will still be dumping to stdout.
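Alternative 1 could be sketched as below. This is an illustration under assumptions, not a tested Maya recipe: the function name stream_stderr is made up here, and the command you pass would be your existing maya invocation. It echoes each stderr line as soon as the child emits it and also returns the lines so they can be saved afterwards.

```python
import subprocess
import sys


def stream_stderr(cmd, sink=sys.stdout):
    """Run cmd, echo its stderr line by line as it arrives, return (rc, lines)."""
    proc = subprocess.Popen(cmd, stderr=subprocess.PIPE,
                            universal_newlines=True, bufsize=1)
    lines = []
    # readline returns '' only at EOF, i.e. once the child has closed stderr
    for line in iter(proc.stderr.readline, ''):
        sink.write(line)
        sink.flush()
        lines.append(line)
    proc.stderr.close()
    return proc.wait(), lines
```

The child's stdout is still inherited from the parent here, so the noisy output keeps going to the screen; passing stdout=subprocess.DEVNULL (Python 3.3+) to Popen is one way to drop it.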
Alternative 2: parse the noisy output from stdout=PIPE. If you can differentiate the test runner's output by adding some tag to each line, you can search for that tag and only print the lines that match.
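A small sketch of the filtering half of Alternative 2. The tag value is hypothetical; the test runner's stream would have to be wrapped so that it prefixes every line with it, which is not shown here.

```python
TAG = '[unittest] '  # hypothetical marker; unittest does not add this itself


def tagged_lines(lines, tag=TAG):
    """Yield only the lines that start with tag, with the tag stripped."""
    for line in lines:
        if line.startswith(tag):
            yield line[len(tag):]
```

The lines argument can be any iterable of strings, for example the stdout pipe of a Popen object opened in text mode.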
Popen documentation (Python 2)

Related

IPython: redirecting output of a Python script to a file (like bash >)

I have a Python script that I want to run in IPython. I want to redirect (write) the output to a file, similar to:
python my_script.py > my_output.txt
How do I do this when I run the script in IPython, i.e. like execfile('my_script.py')
There is an older page describing a function that could be written to do this, but I believe that there is now a built-in way to do this that I just can't find.

IPython has its own context manager for capturing stdout/err, but it doesn't redirect to files, it redirects to an object:
from IPython.utils import io
with io.capture_output() as captured:
    %run my_script.py
print(captured.stdout)  # prints stdout from your script
And this functionality is exposed in a %%capture cell-magic, as illustrated in the Cell Magics example notebook.
It's a simple context manager, so you can write your own version that would redirect to files:
import sys
class redirect_output(object):
    """context manager for redirecting stdout/err to files"""
    def __init__(self, stdout='', stderr=''):
        self.stdout = stdout
        self.stderr = stderr
    def __enter__(self):
        self.sys_stdout = sys.stdout
        self.sys_stderr = sys.stderr
        if self.stdout:
            sys.stdout = open(self.stdout, 'w')
        if self.stderr:
            if self.stderr == self.stdout:
                sys.stderr = sys.stdout
            else:
                sys.stderr = open(self.stderr, 'w')
    def __exit__(self, exc_type, exc_value, traceback):
        # close any files we opened so buffered output reaches disk
        for stream in set([sys.stdout, sys.stderr]):
            if stream not in (self.sys_stdout, self.sys_stderr):
                stream.close()
        sys.stdout = self.sys_stdout
        sys.stderr = self.sys_stderr
which you would invoke with:
with redirect_output("my_output.txt"):
    %run my_script.py
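On Python 3.4 and later the standard library has contextlib.redirect_stdout, which covers the stdout half of the class above. A minimal sketch, using a temporary path chosen only for illustration and a print call standing in for the script being run:

```python
import contextlib
import os
import tempfile

# illustrative location; any writable path works
log_path = os.path.join(tempfile.gettempdir(), 'my_output.txt')

with open(log_path, 'w') as log_file:
    with contextlib.redirect_stdout(log_file):
        print("this line goes to the file, not the screen")

with open(log_path) as log_file:
    contents = log_file.read()
```

Like the hand-written class, this only redirects writes that go through sys.stdout in the current process.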

To quickly store text contained in a variable while working in IPython use %store with > or >>:
%store VARIABLE >>file.txt (appends)
%store VARIABLE >file.txt (overwrites)
(Make sure there is no space immediately following the > or >>)

For just one script to run I would do the redirection in bash
ipython -c "execfile('my_script.py')" > my_output.txt
On python 3, execfile does not exist any more, so use this instead
ipython -c "exec(open('my_script.py').read())" > my_output.txt
Be careful with the double vs single quotes.

While this is an old question, I found it and its answers when I was facing a similar problem.
The solution I found after sifting through IPython Cell magics documentation is actually fairly simple. At the most basic the solution is to assign the output of the command to a variable.
This simple two-cell example shows how to do that. In the first Notebook cell we define the Python script with some output to stdout making use of the %%writefile cell magic.
%%writefile output.py
print("This is the output that is supposed to go to a file")
Then we run that script like it was run from a shell using the ! operator.
output = !python output.py
print(output)
>>> ['This is the output that is supposed to go to a file']
Then you can easily make use of the %store magic to persist the output.
%store output >output.log
Notice however that the output of the command is persisted as a list of lines. You might want to call "\n".join(output) prior to storing the output.

Use this code to save the output to a file while it is still being produced. Define the helper in one cell:
import time
from threading import Thread
import sys
# write the captured stdout to a file once per second
def log():
    global run  # flag used to stop the thread
    while run:
        try:
            text = str(sys.stdout.getvalue())
            with open("out.txt", 'w') as f:
                f.write(text)
        finally:
            time.sleep(1)
Then do the work in a second cell (%%capture has to be the first line of its cell):
%%capture out
run = True
print("start")
process = Thread(target=log, args=[])
process.start()  # start() returns None, so keep the Thread object for join()
# do some work
for i in range(10, 1000):
    print(i)
    time.sleep(1)
run = False
process.join()
It is useful to view out.txt in a text editor that tracks changes to the file and offers to reload it, such as Notepad++.

I wonder why the verified solution doesn't entirely work in a loop, the following:
for i in range(N):
    with redirect_output("path_to_output_file"):
        %run <python_script> arg1 arg2 arg3
creates N files in the directory with only output from the first print statement of the <python_script>. Just confirming- when run separately for each iteration of the for loop, the script produces the right result.

There's the hacky way of overwriting sys.stdout and sys.stderr with a file object, but that's really not a good way to go about it. Really, if you want to control where the output goes from inside python, you need to implement some sort of logging and/or output handling system that you can configure via the command line or function arguments instead of using print statements.
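As a rough sketch of that suggestion, the standard logging module can send messages to a file chosen by the caller. The logger name, format string, and function name below are illustrative choices, not requirements.

```python
import logging


def build_logger(log_path, level=logging.INFO):
    """Return a logger that writes '<LEVEL>: <message>' lines to log_path."""
    logger = logging.getLogger('my_script')  # illustrative name
    logger.setLevel(level)
    logger.propagate = False  # keep messages out of the root logger's handlers
    handler = logging.FileHandler(log_path, mode='w')
    handler.setFormatter(logging.Formatter('%(levelname)s: %(message)s'))
    logger.addHandler(handler)
    return logger
```

The script would then call logger.info(...) where it previously used print, and the log path could come from a command-line argument.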

It seems a lot of code....
My solution.
redirect output of ipython script into a csv or text file like sqlplus spool
wonder there is an easy way like oracle sqlplus spool command..?

SGE script: print to file during execution (not just at the end)?

I have an SGE script to execute some python code, submitted to the queue using qsub. In the python script, I have a few print statements (updating me on the progress of the program). When I run the python script from the command line, the print statements are sent to stdout. For the sge script, I use the -o option to redirect the output to a file. However, it seems that the script will only send these to the file after the python script has completed running. This is annoying because (a) I can no longer see real time updates on the program and (b) if my job does not terminate correctly (for example if my job gets kicked off the queue) none of the updates are printed. How can I make sure that the script is writing to the file each time I want to print something, as opposed to lumping it all together at the end?

I think you are running into an issue with buffered output. Python uses a library to handle its output, and the library knows that it's more efficient to write a block at a time when it's not talking to a tty.
There are a couple of ways to work around this. You can run python with the "-u" option (see the python man page for details), for example, with something like this as the first line of your script:
#! /usr/bin/python -u
but this doesn't work if you are using the "/usr/bin/env" trick because you don't know where python is installed.
Another way is to reopen the stdout with something like this:
import sys
import os
# reopen stdout file descriptor with write mode
# and 0 as the buffer size (unbuffered)
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
Note the bufsize parameter of os.fdopen being set to 0 to force it to be unbuffered. You can do something similar with sys.stderr.
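On Python 3 the call above raises ValueError, because text-mode files cannot be fully unbuffered. A line-buffered wrapper is a close substitute; the helper name line_buffered is made up for this sketch.

```python
import os


def line_buffered(fd):
    """Return a text-mode writer on fd that flushes at every newline."""
    # buffering=1 selects line buffering in text mode; closefd=False leaves
    # the underlying descriptor open when the wrapper itself is closed.
    return open(fd, 'w', buffering=1, closefd=False)

# Typical use near the top of a batch script (left commented out so that
# running this sketch has no side effects):
#     import sys
#     sys.stdout = line_buffered(sys.stdout.fileno())
```

On Python 3.7+ sys.stdout.reconfigure(line_buffering=True) has a similar effect, and setting the PYTHONUNBUFFERED environment variable is equivalent to the -u option.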

As others mentioned, stdout is not written out immediately when it is not connected to a tty, for performance reasons.
If you have a specific point at which you want the stdout to be written, you can force that by using
import sys
sys.stdout.flush()
at that point.

I just encountered a similar issue with SGE, and no suggested method to "unbuffer" the file IO seemed to work for me. I had to wait until the end of program execution to see any output.
The workaround I found was to wrap sys.stdout into a custom object that re-implements the "write" method. Instead of actually writing to stdout, this new method instead opens the file where IO is redirected, appends with the desired data, and then closes the file. It's a bit ugly, but I found it solved the problem, since the actual opening/closing of the file forces IO to be interactive.
Here's a minimal example:
import os, sys, time
class RedirIOStream:
    def __init__(self, stream, REDIRPATH):
        self.stream = stream
        self.path = REDIRPATH
    def write(self, data):
        # instead of actually writing, just append to file directly!
        myfile = open(self.path, 'a')
        myfile.write(data)
        myfile.close()
    def __getattr__(self, attr):
        return getattr(self.stream, attr)
if not sys.stdout.isatty():
    # Detect redirected stdout and std error file locations!
    # Warning: this will only work on LINUX machines
    STDOUTPATH = os.readlink('/proc/%d/fd/1' % os.getpid())
    STDERRPATH = os.readlink('/proc/%d/fd/2' % os.getpid())
    sys.stdout = RedirIOStream(sys.stdout, STDOUTPATH)
    sys.stderr = RedirIOStream(sys.stderr, STDERRPATH)
# Simple program to print msg every 3 seconds
def main():
    n_msg = 10  # number of messages to print
    tstart = time.time()
    for x in range(n_msg):
        time.sleep(3)
        MSG = ' %d/%d after %.0f sec' % (x, n_msg, time.time() - tstart)
        print(MSG)
if __name__ == '__main__':
    main()

This is SGE buffering the output of your process; it happens whether it's a Python process or any other.
In general you can decrease or disable the buffering in SGE by changing it and recompiling. But it's not a great thing to do: all that data is going to be slowly written to disk, affecting your overall performance.

Why not print to a file instead of stdout?
outFileID = open('output.log', 'w')
print('INFO: still working!', file=outFileID)
outFileID.flush()  # push the line to disk so tail -f can see it
print('WARNING: blah blah!', file=outFileID)
outFileID.flush()
and use
tail -f output.log
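If the watching has to happen from Python rather than a shell, tail -f can be approximated by polling the file. This is a sketch only: the function name follow and its parameters are illustrative, and a line may be yielded in two pieces if the writer is caught mid-line.

```python
import time


def follow(path, poll_interval=0.5, is_done=lambda: False):
    """Yield lines appended to path until is_done() is true and no data is left."""
    with open(path, 'r') as f:
        while True:
            line = f.readline()
            if line:
                yield line
                continue
            if is_done():
                # drain anything written between the last read and completion
                for line in f:
                    yield line
                return
            time.sleep(poll_interval)
```

A parent process that started the writer with subprocess.Popen could pass is_done=lambda: proc.poll() is not None and print each yielded line as it arrives.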

This works for me:
import os, sys
class ForceIOStream:
    def __init__(self, stream):
        self.stream = stream
    def write(self, data):
        self.stream.write(data)
        self.stream.flush()
        if not self.stream.isatty():
            os.fsync(self.stream.fileno())
    def __getattr__(self, attr):
        return getattr(self.stream, attr)
sys.stdout = ForceIOStream(sys.stdout)
sys.stderr = ForceIOStream(sys.stderr)
and the issue has to do with NFS not syncing data back to the master until a file is closed or fsync is called.

I hit this same problem today and solved it by just writing to disk instead of printing:
with open('log-file.txt','w') as out:
    out.write(status_report)

print() supports the argument flush since Python 3.3 (documentation). So, to force flush the stream:
print('Hello World!', flush=True)

Python `tee` stdout of child process

Is there a way in Python to do the equivalent of the UNIX command line tee? I'm doing a typical fork/exec pattern, and I'd like the stdout from the child to appear in both a log file and on the stdout of the parent simultaneously without requiring any buffering.
In this python code for instance, the stdout of the child ends up in the log file, but not in the stdout of the parent.
pid = os.fork()
logFile = open(path,"w")
if pid == 0:
    os.dup2(logFile.fileno(),1)
    os.execv(cmd, [cmd])  # execv needs an argument list as well as the path
edit: I do not wish to use the subprocess module. I'm doing some complicated stuff with the child process that requires me to call fork manually.

Here you have a working solution without using the subprocess module. Although, you could use it for the tee process while still using the exec* functions suite for your custom subprocess (just use stdin=subprocess.PIPE and then duplicate the descriptor to your stdout).
import os, time, sys
pr, pw = os.pipe()
pid = os.fork()
if pid == 0:
    os.close(pw)
    os.dup2(pr, sys.stdin.fileno())
    os.close(pr)
    os.execv('/usr/bin/tee', ['tee', 'log.txt'])
else:
    os.close(pr)
    os.dup2(pw, sys.stdout.fileno())
    os.close(pw)
    pid2 = os.fork()
    if pid2 == 0:
        # Replace with your custom process call
        os.execv('/usr/bin/yes', ['yes'])
    else:
        try:
            while True:
                time.sleep(1)
        except KeyboardInterrupt:
            pass
Note that the tee command, internally, does the same thing as Ben suggested in his answer: reading input and looping over output file descriptors while writing to them. It may be more efficient because of the optimized implementation and because it's written in C, but you have the overhead of the different pipes (don't know for sure which solution is more efficient, but in my opinion, reassigning a custom file-like object to stdout is a more elegant solution).
Some more resources:
How do I duplicate sys.stdout to a log file in python?
http://www.shallowsky.com/blog/programming/python-tee.html

In the following, SOMEPATH is the path to the child executable, in a format suitable for subprocess.Popen (see its docs).
import sys, subprocess
f = open('logfile.txt', 'w')
proc = subprocess.Popen(SOMEPATH, stdout=subprocess.PIPE)
while True:
    out = proc.stdout.read(1)
    if out == '' and proc.poll() is not None:
        break
    if out != '':
        # CR workaround since chars are read one by one, and Windows interprets
        # both CR and LF as end of lines. Linux only has LF
        if out != '\r': f.write(out)
        sys.stdout.write(out)
        sys.stdout.flush()

Would an approach like this do what you want?
import sys
class Log(object):
    def __init__(self, filename, mode, buffering):
        self.filename = filename
        self.mode = mode
        self.handle = open(filename, mode, buffering)
    def write(self, thing):
        self.handle.write(thing)
        sys.stdout.write(thing)
You'd probably need to implement more of the file interface for this to be really useful (and I've left out properly defaulting mode and buffering, if you want it). You could then do all your writes in the child process to an instance of Log. Or, if you wanted to be really magic, and you're sure you implement enough of the file interface that things won't fall over and die, you could potentially assign sys.stdout to be an instance of this class. Then I think any means of writing to stdout, including print, will go via the log class.
Edit to add: Obviously if you assign to sys.stdout you will have to do something else in the write method to echo the output to stdout!! I think you could use sys.__stdout__ for that.
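A slightly fuller sketch of the same idea, assuming Python 3. The class name Tee is illustrative; it fans each write out to any number of streams, so it does not refer to sys.stdout itself and can safely be assigned to it.

```python
import io


class Tee(object):
    """Write-only file-like object that copies each write to several streams."""

    def __init__(self, *streams):
        self.streams = streams

    def write(self, text):
        for stream in self.streams:
            stream.write(text)
        return len(text)

    def flush(self):
        for stream in self.streams:
            stream.flush()


# Demonstration with in-memory streams; in real use these might be
# sys.__stdout__ and an open log file.
screen, log = io.StringIO(), io.StringIO()
tee = Tee(screen, log)
tee.write("test passed\n")
tee.flush()
```

Assigning sys.stdout = Tee(sys.__stdout__, log_file) affects only Python-level writes in the current process; it does not survive an exec call.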

Oh, you. I had a decent answer all prettied-up before I saw the last line of your example: execv(). Well, poop. The original idea was replacing each child process' stdout with an instance of this blog post's tee class, and split the stream into the original stdout, and the log file:
http://www.shallowsky.com/blog/programming/python-tee.html
But, since you're using execv(), the child process' tee instance would just get clobbered, so that won't work.
Unfortunately for you, there is no "out of the box" solution to your problem that I can find. The closest thing would be to spawn the actual tee program in a subprocess; if you wanted to be more cross-platform, you could fork a simple Python substitute.
First thing to know when coding a tee substitute: tee really is a simple program. In all the true C implementations I've seen, it's not much more complicated than this:
while((character = read()) != EOF) {
/* Write to all of the output streams in here, then write to stdout. */
}
Unfortunately, you can't just join two streams together. That would be really useful (so that the input of one stream would automatically be forwarded out of another), but we've no such luxury without coding it ourselves. So, Eli and I are going to have very similar answers. The difference is that, in my answer, the Python 'tee' is going to run in a separate process, via a pipe; that way, the parent thread is still useful!
(Remember: copy the blog post's tee class, too.)
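import os, sys
# Open it for writing in binary mode.
logFile = open("bar", "wb")
# Verbose names, but I wanted to get the point across.
# These are file descriptors, i.e. integers: os.pipe() returns (read end, write end).
parentSideOfPipe, childSideOfPipe = os.pipe()
# 'Tee' subprocess.
pid = os.fork()
if pid == 0:
    os.close(childSideOfPipe)  # otherwise this process never sees EOF
    while True:
        char = os.read(parentSideOfPipe, 1)
        if not char:  # EOF: every writer has closed the pipe
            break
        logFile.write(char)
        os.write(1, char)
    logFile.close()
    os._exit(0)
# Actual command
pid = os.fork()
if pid == 0:
    os.close(parentSideOfPipe)
    os.dup2(childSideOfPipe, 1)
    os.execv(cmd, [cmd])  # execv needs an argument list as well as the path
# The parent must drop its copies so the tee process can reach EOF.
os.close(parentSideOfPipe)
os.close(childSideOfPipe)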
I'm sorry if that's not what you wanted, but it's the best solution I can find.
Good luck with the rest of your project!

The first obvious answer is to fork an actual tee process but that is probably not ideal.
The tee code (from coreutils) merely reads each line and writes to each file in turn (effectively buffering).
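The read-then-write-to-each-output loop can be sketched in Python as follows. This is an illustration, not a port of coreutils: it reads fixed-size chunks rather than lines, and the function name tee_fd is made up here.

```python
import io
import os


def tee_fd(src_fd, sinks, chunk_size=4096):
    """Copy bytes from src_fd to every sink until EOF; return bytes copied."""
    total = 0
    while True:
        chunk = os.read(src_fd, chunk_size)
        if not chunk:  # an empty read means every writer closed the pipe
            break
        for sink in sinks:
            sink.write(chunk)
            sink.flush()
        total += len(chunk)
    return total


# Demonstration with a pipe and two in-memory sinks.
read_end, write_end = os.pipe()
os.write(write_end, b"line one\nline two\n")
os.close(write_end)  # without this, os.read would block waiting for more data
first, second = io.BytesIO(), io.BytesIO()
copied = tee_fd(read_end, [first, second])
os.close(read_end)
```

In a fork-based setup the sinks would typically be a log file opened in binary mode and the parent's original stdout.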

Silence loggers and printing to screen - Python

I'm having a problem with my python script.
It's printing massive amounts of data on the screen, and I would like to prevent all sorts of printing to screen.
Edit:
The library I'm using is mechanize, and it's printing a LOT of data on screen.
I have set these to false with no luck!
br.set_debug_redirects(False)
br.set_debug_responses(False)
br.set_debug_http(False)
Any ideas?
Help would be amazing and very much appreciated!

(Based on your 2nd edit)
If you don't want to disable all output, you can try to be specific to mechanize itself. http://wwwsearch.sourceforge.net/mechanize/ provides a snippet, which I've modified (though I'm not sure if it will work):
import logging
logger = logging.getLogger("mechanize")
# only log really bad events
logger.setLevel(logging.ERROR)
When you print something it goes to the screen through the sys.stdout file. You can change this file to any other file (eg, a log file you open) so that nothing is printed to the screen:
import sys
# save the old stdout so you can print later (do sys.stdout = OLD_STDOUT)
OLD_STDOUT = sys.stdout
sys.stdout = open("logfile.txt", 'w')
Of course, if you're talking about some library that you're calling, it may be printing to sys.stderr. Luckily, you can do the exact same thing for this one (continuing from above):
OLD_STDERR = sys.stderr
sys.stderr = open("errorLog.txt", 'w')
Now if, for some reason, you want to completely ignore stdout (or stderr) and never see it again, you can define your own file-like classes that simply discard the objects:
class Discarder(object):
    def write(self, text):
        pass # do nothing
# now discard everything coming out of stdout
sys.stdout = Discarder()
And, to add to the din of possible solutions, here is a solution that works in Unix shells:
# discards all output (change /dev/null to a file name to keep track of output)
python yourScript.py > /dev/null

You may redirect sys.stdout and sys.stderr to a file or any file-like object of yours, e.g.
class EatLog(object):
    def write(self, text):
        pass
sys.stdout = EatLog()
but I would not recommend that; a simpler option is to use OS-level redirection, e.g.
python myscript.py > out.log

You can use the StringIO module, too, instead of rolling your own stdout stream. Occasionally, callers need more of stdout than a write method (flush is another common one), which StringIO will handle.
import StringIO
import sys
sys.stdout = StringIO.StringIO()
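The StringIO module is Python 2 only; on Python 3 the equivalent class is io.StringIO. A minimal sketch that also restores the real stream afterwards, with a print call standing in for the noisy library:

```python
import io
import sys

original_stdout = sys.stdout
sys.stdout = buffer = io.StringIO()
try:
    print("noisy library output")
finally:
    sys.stdout = original_stdout  # always restore the real stream

captured = buffer.getvalue()
```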

How do you make Python wait so that you can read the output?

I've always been a heavy user of Notepad2, as it is fast, feature-rich, and supports syntax highlighting. Recently I've been using it for Python.
My problem: when I finish editing a certain Python source code, and try to launch it, the screen disappears before I can see the output pop up. Is there any way for me to make the results wait so that I can read it, short of using an input() or time-delay function? Otherwise I'd have to use IDLE, because of the output that stops for you to read.
(My apologies if this question is a silly one, but I'm very new at Python and programming in general.)

If you don't want to use raw_input() or input() you could log your output (stdout, stderr) to a file or files.
You could either use the logging module, or just redirect sys.stdout and sys.stderr.
I would suggest using a combination of the logging and traceback modules if you want to log errors with their stack traces.
Something like this maybe:
import logging, traceback
logging.basicConfig(filename=r'C:\Temp\log.txt', level=logging.DEBUG)
try:
    # do some stuff
    logging.debug('I did some stuff!')
except Exception:  # narrow this to the exceptions you expect
    logging.error(traceback.format_exc())
Here's an example of redirecting stdout and stderr:
import sys
if __name__ == '__main__':
    save_out = sys.stdout # save the original stdout so you can put it back later
    out_file = open(r'C:\Temp\out.txt', 'w')
    sys.stdout = out_file
    save_err = sys.stderr
    err_file = open(r'C:\Temp\err.txt', 'w')
    sys.stderr = err_file
    main() #call your main function
    sys.stdout = save_out # set stdout back to its original object
    sys.stderr = save_err
    out_file.close()
    err_file.close()
I should point out that this is not the easiest or most straightforward way to go.

This is a "problem" with Notepad2, not Python itself.
Unless you want to use input()/sleep (or any other blocking function) in your scripts, I think you have to turn to the settings in Notepad2 and see what that has to offer.

you could start in the command window. e.g.:
c:\tmp\python>main.py
Adding raw_input() (or input() in py3k) at the end of your script will freeze it until Enter is pressed, but it's not a good thing to do.

You can add a call to raw_input() to the end of your script in order to make it wait until you press Enter.
