Executing a command and storing its output in a variable - python

I'm currently trying to write a python script that, among many things, calls an executable and stores what that executable sends to stdout in a variable. Here is what I have:
#!/usr/bin/python
import subprocess

subprocess.call("./pmm", shell=True)
How would I get the output of pmm to be stored in a variable?

In Python 2.7 (and 3.1 or above), you can use subprocess.check_output(). Example from the documentation:
>>> subprocess.check_output(["ls", "-l", "/dev/null"])
'crw-rw-rw- 1 root root 1, 3 Oct 18 2007 /dev/null\n'
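Applied to the question's ./pmm (a hypothetical executable, so this is only a sketch), that becomes:
output = subprocess.check_output(["./pmm"])  # bytes on Python 3, str on Python 2
print(output)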

p = subprocess.Popen(["./pmm"], shell=False, stdout=subprocess.PIPE)
output = p.stdout.read()

I wrote a post about this some time ago:
http://trifoliummedium.blogspot.com/2010/12/running-command-line-with-python-and.html
Use p.communicate() to get both stdout and stderr
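For example, a minimal sketch (again assuming the question's ./pmm):
p = subprocess.Popen(["./pmm"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
out, err = p.communicate()  # blocks until the process exits, then returns both streams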

First you have to save a reference to the subprocess (bind it to a name ... which, in other languages and more informally, is referred to as "assigning it to a variable"). So you should use something like proc = subprocess.Popen(...)
From there I recommend that you call proc.poll() to test whether the program has completed, and either sleep (using the time.sleep() function, for example) or perform other work (using select.select(), for example) before checking again later. Or you can call proc.wait() so that you're sure this ./pmm command has completed its work before your program continues. The poll() method on a subprocess instance returns None if the subprocess is still running; otherwise it returns the exit value of the command that was running in that subprocess. The wait() method on a subprocess will cause your program to block and then return the exit value.
After that you can call (output, errormsgs) = proc.communicate() to capture any output or error messages from your subprocess. If the output is too large it could cause problems; reading from the process instance's .stdout (a PIPE file descriptor) directly is tricky, and if you were going to attempt this you should use features in the fcntl (file descriptor control) module to switch it into non-blocking mode and be prepared to handle the exceptions raised when read() calls find the buffer empty.
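Putting those pieces together, a hedged sketch (assuming the question's ./pmm, and keeping in mind the caveat above about very large output) might look like:
import subprocess
import time

proc = subprocess.Popen(["./pmm"], stdout=subprocess.PIPE, stderr=subprocess.PIPE)

# poll() returns None while the child is still running.
while proc.poll() is None:
    time.sleep(0.1)  # or do other useful work here

# The child has exited; collect whatever it wrote.
# (If ./pmm produces a lot of output, skip the polling and call communicate() directly.)
output, errormsgs = proc.communicate()
print("exit code: %s" % proc.returncode)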

Related

Python subprocess package returns broken pipe

I am trying to do a very simple example of using the subprocess package. The python script should open a new process and run the read command, which should receive input from stdin via a PIPE. Every time I try to use write() and flush() it says:
Traceback (most recent call last):
File "recorder.py", line 68, in <module>
p.stdin.flush()
BrokenPipeError: [Errno 32] Broken pipe
My python code looks like:
import subprocess
import time

p = subprocess.Popen(
    [
        "read",
    ],
    stdout=subprocess.PIPE,
    stdin=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    shell=True,
    bufsize=1
)

for character in "This is the message!\n":
    p.stdin.write(character.encode("utf-8"))
    time.sleep(0.25)
    p.stdin.flush()

assert p.returncode == 0
Note: it's very important to send character after character (with sleeping timeout).
I actually could not replicate your result*; in my case your loop runs through and it then fails on the assert, since p has not finished yet and has no returncode (or rather, its value is still None at that point). Inserting p.wait() after the loop and before the assert would ensure we only check the result after p has terminated.
Now for the exception you're seeing: it most likely indicates that the pipe you're trying to flush() is closed, most likely because the process has already terminated. Perhaps in your case it already has a (non-zero) returncode at that point too, which could further help understand the problem?**
* On my system the /bin/sh used by subprocess.Popen() with shell=True is actually bash. Running ["/bin/dash", "-c", "read"], dash presumably being the shell invoked for /bin/sh on your system, I got a broken pipe as well.
** Running dash like this seems to fail with:
/bin/dash: 1: read: arg count
And return 2.
Which sort of makes it more of a dash question: why does calling /bin/dash -c "read" (from python) fail? It appears that dash's read (unlike its bash counterpart) always expects at least one variable name to read into as an argument (replace read with read foo).
I guess this python question just became a lesson about assumptions and shell scripts portability. :)
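Putting that together, a hedged sketch of a version that should work under both bash and dash might be (the whole line is written at once, so nothing is written to the pipe after read has exited):
import subprocess

# read needs a variable name under dash; give it one so it works whichever shell /bin/sh is.
p = subprocess.Popen(
    "read foo",
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT,
    shell=True,
)
p.stdin.write(b"This is the message!\n")  # one complete line, newline included
p.stdin.flush()
p.stdin.close()

p.wait()  # make sure the shell has finished before checking the exit status
assert p.returncode == 0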

Different methods of calling system commands

On searching for how to call system commands, I found there to be many methods in Python. How does one choose between different methods of calling system commands:
Method 1:
os.system('ls -l *.py')
Method 2:
os.popen("ls -l").read()
Method 3:
subprocess.check_output(["ls", "-l", "*.py"]);
Method 4:
p = subprocess.Popen("ls -l *.py", stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True, shell=True)
(out, err) = p.communicate()
Method 5:
from shell_command import shell_call
shell_call("ls -l *.py")
What are the relative advantages and disadvantages of each? Which ones are especially recommended for Python 3 versus Python 2? Are there any methods that work in both versions?
Short answer: go with subprocess module
Any flavor from the subprocess module can be used depending on what you need (input, output, pipes), but os.system and os.popen are definitely to be replaced and shouldn't be used anymore.
subprocess.check_output is a wrapper around subprocess.Popen, as are its sibling commands subprocess.call and check_call; it only returns the output of the command but doesn't manage any communication.
subprocess.Popen is the base method, and it is used for more sophisticated process communication such as one-way/bidirectional PIPE redirection.
shell_command is a package that eases shell interaction and is based on subprocess.Popen. If you are not doing a lot of system admin work, there is no need to use it.
So, how do you choose which subprocess call you need?
1) No need for shell interaction, just Fire and Forget?
subprocess.call is the direct replacement for os.system. You call it and don't care about the output. The command-line arguments are passed as a list of strings or as a single string (shell=True mode only), which frees you from the burden of escaping quotes or special characters.
Example:
subprocess.call(['ls', '-l'])
The return value is the exit code of the application itself; if you want to know whether the external command crashed or succeeded, or to react to any other exit code, you have to handle it yourself.
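For example, a minimal sketch of handling the exit code yourself (command chosen arbitrarily):
ret = subprocess.call(['ls', '-l'])
if ret != 0:
    # non-zero exit codes are not raised as exceptions by call()
    print('command failed with exit code %d' % ret)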
1a) You need automatic Error Handling
If you want python to deal with the error handling, a convenient function subprocess.check_call is available, which is the same as subprocess.call but it raises a CalledProcessError exception if the process returns any other error value than 0.
try:
    subprocess.check_call(['false'])
except subprocess.CalledProcessError as err:
    print 'Error:', err
2) Interested in the output of the external command?
subprocess.call and subprocess.check_call are bound to the parent program's output, so they cannot capture the output of the command. subprocess.check_output is the variant that captures the command's output.
Example:
output = subprocess.check_output(['ls', '-l'])
print output
total 941234
drwxr-xr-x 28 user staff 1972 Dec 9 11:24 test.cpp
-rw-r--r-- 1 user staff 799 Jan 1 09:12 out
3) Communicate with your process?
still writing...it's going to be about POPEN.. :)
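The answer stops there; as a hedged sketch of what the Popen case usually looks like, here is two-way communication through grep (roughly the pattern shown in the stdlib documentation, chosen only for illustration):
import subprocess

# Bidirectional communication: feed stdin, capture stdout and stderr.
p = subprocess.Popen(['grep', 'f'],
                     stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE)
out, err = p.communicate(input=b'one\nfour\nfive\n')
# out is now b'four\nfive\n' and p.returncode is 0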

How to get output from a python2 subprocess which runs a script using multiprocessing?

Here is my demo code. It contains two scripts.
The first is main.py; it calls print_line.py with the subprocess module.
The second is print_line.py; it prints something to stdout.
main.py
import subprocess

p = subprocess.Popen('python2 print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)

while True:
    line = p.stdout.readline()
    if line:
        print(line)
    else:
        break
print_line.py
from multiprocessing import Process, JoinableQueue, current_process

if __name__ == '__main__':
    task_q = JoinableQueue()

    def do_task():
        while True:
            task = task_q.get()
            pid = current_process().pid
            print 'pid: {}, task: {}'.format(pid, task)
            task_q.task_done()

    for _ in range(10):
        p = Process(target=do_task)
        p.daemon = True
        p.start()

    for i in range(100):
        task_q.put(i)

    task_q.join()
Before, print_line.py was written with the threading and Queue modules, and everything was fine. But now, after changing to the multiprocessing module, main.py cannot get any output from print_line. I tried using Popen.communicate() to get the output and setting preexec_fn=os.setsid in Popen(). Neither of them works.
So, here is my question:
Why can't subprocess get the output when multiprocessing is used? Why is it fine with threading?
If I comment out stdout=subprocess.PIPE and stderr=subprocess.PIPE, the output is printed in my console. Why? How does this happen?
Is there any chance to get the output from print_line.py?
Curious.
In theory this should work as it is, but it does not. The reason lies somewhere in the deep, murky waters of buffered IO. It seems that the output of a subprocess of a subprocess can get lost if it is not flushed.
You have two workarounds:
One is to use flush() in your print_line.py:
import sys  # needed for sys.stdout.flush() below

def do_task():
    while True:
        task = task_q.get()
        pid = current_process().pid
        print 'pid: {}, task: {}'.format(pid, task)
        sys.stdout.flush()  # push the line into the pipe right away
        task_q.task_done()
This will fix the issue as you will flush your stdout as soon as you have written something to it.
Another option is to pass the -u flag to Python in your main.py:
p = subprocess.Popen('python2 -u print_line.py',
                     stdout=subprocess.PIPE,
                     stderr=subprocess.PIPE,
                     close_fds=True,
                     shell=True,
                     universal_newlines=True)
-u will force stdin and stdout to be completely unbuffered in print_line.py, and children of print_line.py will then inherit this behaviour.
These are workarounds for the problem. If you are interested in the theory of why this happens, it definitely has something to do with unflushed stdout being lost when the subprocess terminates, but I am not an expert in this.
It's not a multiprocessing issue, but it is a subprocess issue; or more precisely, it has to do with standard I/O and buffering, as in Hannu's answer. The trick is that by default, the output of any process, whether in Python or not, is line buffered if the output device is a "terminal device" as determined by os.isatty(stream.fileno()):
>>> import sys
>>> sys.stdout.fileno()
1
>>> import os
>>> os.isatty(1)
True
There is a shortcut available to you once the stream is open:
>>> sys.stdout.isatty()
True
but the os.isatty() operation is the more fundamental one. That is, internally, Python inspects the file descriptor first using os.isatty(fd), then chooses the stream's buffering based on the result (and/or arguments and/or the function used to open the stream). The sys.stdout stream is opened early on during Python's startup, before you generally have much control.[1]
When you call open or codecs.open or otherwise do your own operation to open a file, you can specify the buffering via one of the optional arguments. The default for open is the system default, which is line buffering if isatty(), otherwise fully buffered. Curiously, the default for codecs.open is line buffered.
A line buffered stream gets an automatic flush() applied when you write a newline to it.
An unbuffered stream writes each byte to its output immediately. This is very inefficient in general. A fully buffered stream writes its output when the buffer gets sufficiently full—the definition of "sufficient" here tends to be pretty variable, anything from 1024 (1k) to 1048576 (1 MB)—or when explicitly directed.
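As an illustrative sketch of those modes (the file names are made up, and the size the platform actually picks for "fully buffered" varies):
# The third positional argument to open() is the buffering mode.
line_buffered = open('example.log', 'w', 1)        # flushed on every newline
fully_buffered = open('example.dat', 'w', 65536)   # flushed when the buffer fills
unbuffered = open('example.bin', 'wb', 0)          # every write() goes straight out (binary only on Python 3)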
When you run something as a process, it's the process itself that decides how to do any buffering. Your own Python code, reading from the process, cannot control it. But if you know something—or a lot—about the processes that you will run, you can set up their environment so that they run line-buffered, or even unbuffered. (Or, as in your case, since you write that code, you can write it to do what you want.)
[1] There are hooks that fire up very early, where you can fuss with this sort of thing, but they are tricky to work with.
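Coming back to the point about setting up a child's environment: one hedged sketch, using CPython's PYTHONUNBUFFERED environment variable to get the same effect as the -u flag mentioned earlier, might be:
import os
import subprocess

# Run the child with PYTHONUNBUFFERED set so its stdout is not buffered,
# regardless of whether it is attached to a terminal.
env = dict(os.environ, PYTHONUNBUFFERED='1')
p = subprocess.Popen(['python2', 'print_line.py'],
                     stdout=subprocess.PIPE,
                     universal_newlines=True,
                     env=env)
for line in p.stdout:
    print(line.rstrip())
p.wait()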

Calling 2 external applications at the same time

I have a python script in which I am trying to launch two external applications at the same time.
I have written it as:
os.system('externalize {0}'.format(result))
os.system('viewer {0} -b {1}'.format(img_list[0], img_list[1]))
However, by doing so, the second application will only open/appear once I quit/exit the first application.
I tried using subprocess as follows:
subprocess.call('externalize {0}'.format(result), shell=True)
subprocess.call('viewer {0} -b {1}'.format(img_list[0], img_list[1]))
But I am not getting much success. Am I doing it wrong somewhere?
Run them as subprocesses without waiting for finish:
p1=subprocess.Popen(<args1>)
p2=subprocess.Popen(<args2>)
If/when you then need to wait for their finish and/or check their exit code, call wait() (or whatever else applicable) on these objects.
(In general, you should never ignore the object that Popen() returns and its exit code if you need to do something as a result of the subprocess' work (e.g. clean up the files you fed them if they're temporary).)
Several subprocess functions such as call are just convenience wrappers around the Popen object, which executes programs asynchronously. You can use it instead:
import subprocess as subp
result = 'foo'
img_list = ['bar', 'baz']
proc1 = subp.Popen('externalize {0}'.format(result), shell=True)
proc2 = subp.Popen('viewer {0} -b {1}'.format(img_list[0], img_list[1]), shell=True)
proc1.wait()
proc2.wait()

execv multiple executables in single python script?

From what I can tell, execv replaces the current process, and once the called executable finishes, the program terminates. I want to call execv multiple times within the same script, but because of this, that cannot be done.
Is there an alternative to execv that runs within the current process (i.e. prints to same stdout) and won't terminate my program? If so, what is it?
Yes, use subprocess.
os.execv* is not appropriate for your task; from the docs:
These functions all execute a new program, replacing the current process; they do not return. On Unix, the new executable is loaded into the current process, and will have the same process id as the caller.
So, as you want the external exe to print to the same output, this is what you might do:
import subprocess
output = subprocess.check_output(['your_exe', 'arg1'])
By default, check_output() only returns output written to standard output. If you want both standard output and error collected, use the stderr argument.
output = subprocess.check_output(['your_exe', 'arg1'], stderr=subprocess.STDOUT)
The subprocess module in the stdlib is the best way to create processes.
