Print output of "time" shell command without losing formatting - python

For some reason, when I run "time ./<binary>" from the terminal, I get this format:
real 0m0.090s
user 0m0.086s
sys 0m0.004s
But when I execute the same command in Python 2.7.6:
result = subprocess.Popen("time ./<binary>", shell = True, stdout = subprocess.PIPE)
...I get this format when I print(result.stderr):
0.09user 0.00system 0:00.09elapsed
Is there any way I can force the first (real, user, sys) format?

From the man time documentation:
After the utility finishes, time writes the total time elapsed, the time consumed by system overhead, and the time used to execute utility to the standard error stream.
Emphasis mine: time writes to the standard error stream. You are capturing the stdout stream, not the stderr stream, so whatever output you see must be the result of something else mangling your Python stderr stream.
Capture stderr:
proc = subprocess.Popen("time ./<binary>", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
The stderr variable then holds the time command output.
If this continues to produce the same output, that's because the real/user/sys format comes from the time builtin of /bin/bash, which overrides the external /usr/bin/time command (whose default format puts everything on one line). Popen with shell=True runs /bin/sh, so you can force the use of the bash builtin by telling Python which shell to run:
proc = subprocess.Popen("time ./<binary>", shell=True, executable='/bin/bash',
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
stdout, stderr = proc.communicate()
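If you'd rather not depend on bash at all, both the bash builtin and the external time binary accept a POSIX -p flag that prints real/user/sys on separate lines (formatted as plain seconds, e.g. "real 0.20", rather than bash's "0m0.200s" style). A hedged sketch, with 'sleep 0.2' standing in for the real binary:
import subprocess

# Sketch: call the external time binary directly with -p (POSIX format),
# which prints real/user/sys on separate lines. Assumes /usr/bin/time
# exists; 'sleep 0.2' is a placeholder for the command under test.
proc = subprocess.Popen(
    ['/usr/bin/time', '-p', 'sleep', '0.2'],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
)
stdout, stderr = proc.communicate()
print(stderr.decode())  # timing output arrives on stderr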

First: Martijn Pieters' answer is correct about needing to capture time's stderr instead of stdout.
Also, at least on older versions of Python like 3.1, a subprocess.Popen object contains several things that could be considered "output". Attempting to print one just results in:
<subprocess.Popen object at 0x2068fd0>
If later versions are print-able, they must do some processing of their contents, probably including mangling the output.
Reading (and Printing) from Popen Object
The Popen object has a stderr field, which is a readable, file-like object. You can read from it like any other file-like object, although it's not recommended. Quoting the big, pink security warning:
Warning
Use communicate() rather than .stdin.write, .stdout.read or .stderr.read to avoid deadlocks due to any of the other OS pipe buffers filling up and blocking the child process.
To print your Popen's contents, you must:
1. communicate() with the sub-process, assigning the 2-tuple it returns (stdout, stderr) to local variable(s).
2. Convert the variables' contents into strings --- by default, they are bytes objects, as if the "file" had been opened in binary mode.
Below is a tiny program that prints the stderr output of a shell command without mangling (not counting the conversion from ASCII to Unicode and back).
#!/usr/bin/env python3
import subprocess

def main():
    result = subprocess.Popen(
        'time sleep 0.2',
        shell=True,
        stderr=subprocess.PIPE,
    )
    stderr = result.communicate()[1]
    stderr_text = stderr.decode('us-ascii').rstrip('\n')
    #print(stderr_text)  # Prints all lines at once.
    # Or, if you want to process output line-by-line...
    lines = stderr_text.split('\n')
    for line in lines:
        print(line)
    return

if "__main__" == __name__:
    main()
This output is on an old Fedora Linux system, running bash with LC_ALL set to "C":
real 0m0.201s
user 0m0.000s
sys 0m0.001s
Note that you'll want to add some error-handling around my script's stderr_text = stderr.decode(...) line... For all I know, time emits non-ASCII characters depending on localization, environment variables, etc.
Alternative: universal_newlines
You can save some of the decoding boilerplate by using the universal_newlines option to Popen. It does the conversion from bytes to strings automatically:
If universal_newlines is True, these file objects will be opened as text streams in universal newlines mode using the encoding returned by locale.getpreferredencoding(False). [...]
def main_universal_newlines():
    result = subprocess.Popen(
        'time sleep 0.2',
        shell=True,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    stderr_text = result.communicate()[1].rstrip('\n')
    lines = stderr_text.split('\n')
    for line in lines:
        print(line)
    return
Note that I still have to strip the last '\n' manually to exactly match the shell's output.
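If you are on Python 3.5 or newer, subprocess.run can trim the boilerplate further still; a minimal sketch under the same assumptions (bash available at /bin/bash so its builtin time produces the real/user/sys format):
import subprocess

# Minimal sketch with subprocess.run (Python 3.5+). executable forces
# bash so its builtin time runs; universal_newlines decodes the output
# (text=True is the equivalent spelling on Python 3.7+).
result = subprocess.run(
    'time sleep 0.2',
    shell=True,
    executable='/bin/bash',
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
print(result.stderr.rstrip('\n'))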

Related

Capture the output of subprocess.run() but also print it in real time?

I would like to run a command using subprocess.run() and then get its stdout/stderr as a string, but I want the subprocess to also print its output to the console normally while running. If I do
result = subprocess.run(['ls', '-al'])
then I can see the output printed to my console but I can't access the output after the command runs. If I do
result = subprocess.run(['ls', '-al'], capture_output=True, text=True)
I can access result.stdout and result.stderr but nothing is printed to the console while the command is running. Can I have both printing to the console and saving to result.stdout?
From the documentation of subprocess.run:
Run the command described by args. Wait for command to complete, then return a CompletedProcess instance.
[...]
If capture_output is true, stdout and stderr will be captured. When used, the internal Popen object is automatically created with stdout=PIPE and stderr=PIPE.
The docs for subprocess.PIPE say:
Special value that can be used as the stdin, stdout or stderr argument to Popen and indicates that a pipe to the standard stream should be opened. Most useful with Popen.communicate().
The docs for the Popen constructor parameter stdout:
stdin, stdout and stderr specify the executed program’s standard input, standard output and standard error file handles, respectively. Valid values are PIPE, DEVNULL, an existing file descriptor (a positive integer), an existing file object, and None.
So using capture_output=True is a no-go, because the output will be stored in a pipe for you to read after the call finishes.
The simplest option is to use subprocess.Popen, as @MatBBastos suggested, with which you can communicate with the process (repeatedly sending content to its stdin and receiving content from its stdout/stderr). The solution linked is a bit dated (cf. its own comments; Python 2) but should work well. A related solution is this one.
To keep using subprocess.run, you would have to provide as the stdout parameter a file object that does what you want: one that writes to the standard stream while also keeping a copy in memory for later use. I don't know of a ready-made one.
There are docs in the io module, and a lot of questions on Stack Overflow about doing things like that, but it is notably more difficult than the other way.
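For the common case of just echoing and capturing output, here is a hedged sketch of the Popen approach: stderr is merged into stdout so the two streams stay in output order, and each line is printed as it arrives while a copy is kept for later.
import subprocess

# Sketch: echo each line to the console in real time while keeping a
# copy for later use. 'ls -al' is just the question's example command.
proc = subprocess.Popen(
    ['ls', '-al'],
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,     # merge stderr into stdout
    universal_newlines=True,
)
captured = []
for line in proc.stdout:          # yields lines as the process produces them
    print(line, end='')           # real-time console output
    captured.append(line)         # ...and saved for later
proc.wait()
output = ''.join(captured)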

Python executing cmd and storing whatever it returns

I am writing a script which executes CMD commands on Windows. I use it to pass commands to a different application. Those commands return values or errors. How do I force Python/CMD to store whatever the command returns (whether a returned value or an error) in a variable, and force it NOT to print it to the console? I tried subprocess and os.system(), and all of those allow storing the value, but when a command returns an error it is still printed to the console rather than stored in a variable.
That is a property of the shell / of cmd and of how you call the process. By default there's one input stream (stdin) and two output streams (stdout and stderr), the latter being the default stream for all errors.
You can redirect either or both output streams to one another or to a file by calling the script appropriately. See https://support.microsoft.com/en-us/help/110930/redirecting-error-messages-from-command-prompt-stderr-stdout
For example
myscript 1> output.msg 2>&1
will direct everything into output.msg, including errors. Now combine that output redirection with writing to a variable; that is explained in this answer.
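If you drive the process from Python instead, the same merging can be done without shell redirection syntax by pointing stderr at stdout; a hedged sketch ("mycommand" is a hypothetical placeholder):
import subprocess

# Sketch: the Python equivalent of "2>&1". Errors and normal output
# land in the same variable, in order, and nothing reaches the console.
p = subprocess.Popen(
    'mycommand',                  # hypothetical command
    shell=True,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,     # merge stderr into stdout
)
combined_output, _ = p.communicate()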
When executing a command in a shell there are two different output handles, stdout and stderr. Usually stdout is used for normal output and stderr for errors and warnings.
You can use subprocess.Popen.communicate() to read both stderr and stdout.
import subprocess

p = subprocess.Popen(
    "dir",
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=True,
)
(stdout, stderr) = p.communicate()
print(stdout.decode('utf-8'))  # Standard output
print(stderr.decode('utf-8'))  # Standard error

How to get output from a python2 subprocess which runs a script using multiprocessing?

Here is my demo code. It contains two scripts.
The first is main.py; it will call print_line.py with the subprocess module.
The second is print_line.py; it prints something to stdout.
main.py
import subprocess
p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)

while True:
    line = p.stdout.readline()
    if line:
        print(line)
    else:
        break
print_line.py
from multiprocessing import Process, JoinableQueue, current_process

if __name__ == '__main__':
    task_q = JoinableQueue()

    def do_task():
        while True:
            task = task_q.get()
            pid = current_process().pid
            print 'pid: {}, task: {}'.format(pid, task)
            task_q.task_done()

    for _ in range(10):
        p = Process(target=do_task)
        p.daemon = True
        p.start()

    for i in range(100):
        task_q.put(i)

    task_q.join()
Previously, print_line.py was written with the threading and Queue modules, and everything was fine. But now, after changing to the multiprocessing module, main.py cannot get any output from print_line.py. I tried using Popen.communicate() to get the output, and setting preexec_fn=os.setsid in Popen(). Neither of them works.
So, here is my question:
Why can't subprocess get the output when the script uses multiprocessing? Why is it OK with threading?
If I comment out stdout=subprocess.PIPE and stderr=subprocess.PIPE, the output is printed to my console. Why? How does this happen?
Is there any chance to get the output from print_line.py?
Curious.
In theory this should work as it is, but it does not. The reason lies somewhere in the deep, murky waters of buffered IO. It seems that the output of a subprocess of a subprocess can get lost if not flushed.
You have two workarounds:
One is to use flush() in your print_line.py (remember to import sys at the top):
import sys

def do_task():
    while True:
        task = task_q.get()
        pid = current_process().pid
        print 'pid: {}, task: {}'.format(pid, task)
        sys.stdout.flush()
        task_q.task_done()
This will fix the issue as you will flush your stdout as soon as you have written something to it.
Another option is to pass the -u flag to Python in your main.py:
p = subprocess.Popen('python2 -u print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)
-u will force stdin and stdout to be completely unbuffered in print_line.py, and children of print_line.py will then inherit this behaviour.
These are workarounds to the problem. If you are interested in the theory of why this happens, it definitely has something to do with unflushed stdout being lost when the subprocess terminates, but I am not an expert in this.
It's not a multiprocessing issue, but a subprocess issue; more precisely, it has to do with standard I/O and buffering, as in Hannu's answer. The trick is that, by default, the output of any process, whether Python or not, is line buffered if the output device is a "terminal device", as determined by os.isatty(stream.fileno()):
>>> import sys
>>> sys.stdout.fileno()
1
>>> import os
>>> os.isatty(1)
True
There is a shortcut available to you once the stream is open:
>>> sys.stdout.isatty()
True
but the os.isatty() operation is the more fundamental one. That is, internally, Python inspects the file descriptor first using os.isatty(fd), then chooses the stream's buffering based on the result (and/or arguments and/or the function used to open the stream). The sys.stdout stream is opened early on during Python's startup, before you generally have much control.[1]
When you call open or codecs.open or otherwise do your own operation to open a file, you can specify the buffering via one of the optional arguments. The default for open is the system default, which is line buffering if isatty(), otherwise fully buffered. Curiously, the default for codecs.open is line buffered.
A line buffered stream gets an automatic flush() applied when you write a newline to it.
An unbuffered stream writes each byte to its output immediately. This is very inefficient in general. A fully buffered stream writes its output when the buffer gets sufficiently full—the definition of "sufficient" here tends to be pretty variable, anything from 1024 (1k) to 1048576 (1 MB)—or when explicitly directed.
When you run something as a process, it's the process itself that decides how to do any buffering. Your own Python code, reading from the process, cannot control it. But if you know something—or a lot—about the processes that you will run, you can set up their environment so that they run line-buffered, or even unbuffered. (Or, as in your case, since you write that code, you can write it to do what you want.)
[1] There are hooks that fire up very early, where you can fuss with this sort of thing. They are tricky to work with, though.
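For a Python child specifically, one way to "set up their environment" as described above is the PYTHONUNBUFFERED environment variable, which has the same effect as the -u flag; a sketch based on the question's main.py:
import os
import subprocess

# Sketch: PYTHONUNBUFFERED=1 in the child's environment behaves like
# running python with -u, so nothing is left behind in stdio buffers
# when the child exits.
env = dict(os.environ, PYTHONUNBUFFERED='1')
p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     shell=True,
                     universal_newlines=True,
                     env=env)
print(p.communicate()[0])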

Printing to stdout in subprocess

I have a script which runs a subprocess as follows:
child_process = subprocess.Popen(["python", testset['dir'] + testname, \
output_spec_file, plugin_directory],\
stderr=subprocess.PIPE, stdout=subprocess.PIPE)
In that process, I am trying to insert print statements, but they are not appearing on stdout. I tried using sys.stdout.write() in that subprocess and then sys.stdout.read() right after child_process, but it is not capturing the output.
I am new to Python and I haven't gotten to that level of complexity in Python. I am actually working low level in C and there are some Python test scripts and I'm not sure how to print out from the subprocess.
Any suggestions?
sys.stdout.read (and write) are for standard input/output of the current process (not the subprocess). If you want to write to stdin of the child process, you need to use:
child_process.stdin.write("this goes to child") #Popen(..., stdin=subprocess.PIPE)
and similar for reading from the child's stdout stream:
child_process = subprocess.Popen( ... , stdout=subprocess.PIPE)
data_from_child = child_process.stdout.read()  # "This is the data that comes back"
Of course, it is generally more idiomatic to use:
stdoutdata, stderrdata = child_process.communicate(stdindata)
(taking care to pass subprocess.PIPE to the Popen constructor where appropriate) provided that your input data can be passed all at once.
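As a minimal, self-contained sketch of that communicate() pattern (the child command here is just a stand-in that upper-cases its input):
import subprocess

# Both pipes must be requested in the Popen call for communicate() to
# use them. The child is a stand-in that upper-cases its stdin.
child = subprocess.Popen(
    ['python', '-c', 'import sys; print(sys.stdin.read().upper())'],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    universal_newlines=True,
)
stdoutdata, stderrdata = child.communicate('hello from the parent\n')
print(stdoutdata)  # -> HELLO FROM THE PARENT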

Need a better way to execute console commands from python and log the results

I have a python script which needs to execute several command line utilities. The stdout output is sometimes used for further processing. In all cases, I want to log the results and raise an exception if an error is detected. I use the following function to achieve this:
def execute(cmd, logsink):
    logsink.log("executing: %s\n" % cmd)
    popen_obj = subprocess.Popen(
        cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    (stdout, stderr) = popen_obj.communicate()
    returncode = popen_obj.returncode
    if returncode != 0:
        logsink.log("  RETURN CODE: %s\n" % str(returncode))
    if len(stdout.strip()) > 0:
        logsink.log("  STDOUT:\n%s\n" % stdout)
    if len(stderr.strip()) > 0:
        logsink.log("  STDERR:\n%s\n" % stderr)
    if returncode != 0:
        raise Exception("execute failed with error output:\n%s" % stderr)
    return stdout
"logsink" can be any python object with a log method. I typically use this to forward the logging data to a specific file, or echo it to the console, or both, or something else...
This works pretty good, except for three problems where I need more fine-grained control than the communicate() method provides:
1. stdout and stderr output can be interleaved on the console, but the above function logs them separately. This can complicate the interpretation of the log. How do I log stdout and stderr lines interleaved, in the same order as they were output?
2. The above function will only log the command output once the command has completed. This complicates diagnosis of issues when commands get stuck in an infinite loop or take a very long time for some other reason. How do I get the log in real-time, while the command is still executing?
3. If the logs are large, it can get hard to interpret which command generated which output. Is there a way to prefix each line with something (e.g. the first word of the cmd string followed by ":")?
You can redirect to a file if you just want the output in a file for later evaluation.
You're already defining the stdout/stderr of the processes you're executing with the stdout=/stderr= arguments.
In your example code, you're just redirecting to the script's current out/err assignments.
subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sys.stdout and sys.stderr are just file-like objects.
As the documentation on sys.stdout mentions, "Any object is acceptable as long as it has a write() method that takes a string argument."
f = open('cmd_fileoutput.txt', 'w')
subprocess.Popen(cmd, shell=True, stdout=f, stderr=f)
So you only need to give it an object with a write method in order to redirect output.
If you want both console output and file output, you can make a class to manage the output.
General redirection:
# Redirecting stdout and stderr to a file
f = open('log.txt', 'w')
sys.stdout = f
sys.stderr = f
Making a redirection class:
# redirecting to both
class OutputManager:
    def __init__(self, filename, console):
        self.f = open(filename, 'w')
        self.con = console

    def write(self, data):
        self.con.write(data)
        self.f.write(data)

new_stdout = OutputManager("log.txt", sys.stdout)
Interleaving is dependent on buffering, so you may or may not get the output you expect. (You can probably turn off or reduce the buffering used, but I don't remember how at the moment.)
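Putting the pieces together, here is a hedged sketch that addresses all three numbered problems at once: merged streams for interleaving, line-by-line reads for real-time logging, and a per-line prefix. It keeps the question's execute()/logsink shape, but the details are an assumption, not a drop-in replacement.
import subprocess

# Sketch: interleave stdout/stderr in output order (stderr=STDOUT),
# log each line as it arrives, and prefix lines with the command's
# first word. logsink is assumed to be the question's logging object.
def execute_logged(cmd, logsink):
    prefix = cmd.split()[0]
    popen_obj = subprocess.Popen(
        cmd, shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    lines = []
    for line in popen_obj.stdout:   # available while the command runs
        logsink.log("%s: %s" % (prefix, line))
        lines.append(line)
    popen_obj.wait()
    if popen_obj.returncode != 0:
        raise Exception("execute failed with return code %d"
                        % popen_obj.returncode)
    return "".join(lines)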
You can look into pexpect (http://www.noah.org/wiki/Pexpect).
It solves 1) and 2) out of the box; prefixing the output might be a little trickier.
One other option:
import subprocess
import tempfile

def run_test(test_cmd):
    with tempfile.TemporaryFile() as cmd_out:
        proc = subprocess.Popen(test_cmd, stdout=cmd_out, stderr=cmd_out)
        proc.wait()
        cmd_out.seek(0)
        # Note: TemporaryFile defaults to binary mode; on Python 3,
        # pass mode='w+' (or join with b"") to get a str back.
        output = "".join(cmd_out.readlines())
    return (proc.returncode, output)
This will interleave stdout and stderr as desired, in a real file that is conveniently open for you.
This is by no means a complete or exhaustive answer, but perhaps you should look into the Fabric module.
http://docs.fabfile.org/0.9.1/
Makes parallel execution of shell commands and error handling rather easy.
