I need to start a process and read the output of that process while the process is running. I want to be able to print the output (optional) and to return the output when the process has finished. Here is what I have so far (merged from other answers in stackoverflow):
def call(command, print_output):
process = subprocess.Popen(command, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
out = ""
while True:
line = process.stdout.readline().rstrip().decode("utf-8")
if line == '':
break
if print_output:
print(line)
out += line + "\n"
process.wait()
return process.returncode, out
This code works great in windows (tested with windows 7, python 3.3) but fails in linux (Ubuntu 12.04, python 3.2). In linux, the script hangs at the line
line = process.stdout.readline().rstrip().decode("utf-8")
when the process has finished.
What's wrong with the code? I've tried to check whether the process has been finished with process.poll() as well, but that returns always None under Linux.
The docs say
Warning Use communicate() rather than
.stdin.write, .stdout.read or .stderr.read to
avoid deadlocks due to any of the other OS pipe buffers filling
up and blocking the child process.
I know I had issues before on Windows.
I presume the command is running in unbuffered mode somehow.
The docs have recipes for using subprocess your sounds like shell-backquote yet your use of subprocess is different.
Related
I am using Python and it's subprocess library to check output from calls using strace, something in the matter of:
subprocess.check_output(["strace", str(processname)])
However, this only gives me the output after the called subprocess already finished, which is very limiting for my use-case.
I need a kind of "stream" or live-output from the process, so I need to read the output while the process is still running instead of only after it finished.
Is there a convenient way to achieve this using the subprocess library?
I'm thinking of a kind of poll every x seconds, but did not find any hints regarding on how to implement this in the documentation.
Many thanks in advance.
As of Python 3.2 (when context manager support was added to Popen), I have found this to be the most straightforward way to continuously stream output from a subprocess:
import subprocess
def run(args):
with subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT) as process:
for line in process.stdout:
print(line.decode('utf8'))
Had some problems referencing the selected answer for streaming output from a test runner. The following worked better for me:
import subprocess
from time import sleep
def stream_process(process):
go = process.poll() is None
for line in process.stdout:
print(line)
return go
process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while stream_process(process):
sleep(0.1)
According to the documentation:
Popen.poll()
Check if child process has terminated. Set and return returncode attribute.
So based on this you can:
process = subprocess.Popen('your_command_here',stdout=subprocess.PIPE)
while True:
output = process.stdout.readline()
if process.poll() is not None and output == '':
break
if output:
print (output.strip())
retval = process.poll()
This will loop, reading the stdout, and display the output in real time.
This does not work in current versions of python. (At least) for Python 3.8.5 and newer you should replace output == '' with output == b''
Here is my demo code. It contains two scripts.
The first is main.py, it will call print_line.py with subprocess module.
The second is print_line.py, it prints something to the stdout.
main.py
import subprocess
p = subprocess.Popen('python2 print_line.py',
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
close_fds=True,
shell=True,
universal_newlines=True)
while True:
line = p.stdout.readline()
if line:
print(line)
else:
break
print_line.py
from multiprocessing import Process, JoinableQueue, current_process
if __name__ == '__main__':
task_q = JoinableQueue()
def do_task():
while True:
task = task_q.get()
pid = current_process().pid
print 'pid: {}, task: {}'.format(pid, task)
task_q.task_done()
for _ in range(10):
p = Process(target=do_task)
p.daemon = True
p.start()
for i in range(100):
task_q.put(i)
task_q.join()
Before, print_line.py is written with threading and Queue module, everything is fine. But now, after changing to multiprocessing module, the main.py cannot get any output from print_line. I tried to use Popen.communicate() to get the output or set preexec_fn=os.setsid inPopen(). Neither of them work.
So, here is my question:
Why subprocess cannot get the output with multiprocessing? why it is ok with threading?
If I comment out stdout=subprocess.PIPE and stderr=subprocess.PIPE, the output is printed in my console. Why? How does this happen?
Is there any chance to get the output from print_line.py?
Curious.
In theory this should work as it is, but it does not. The reason being somewhere in the deep, murky waters of buffered IO. It seems that the output of a subprocess of a subprocess can get lost if not flushed.
You have two workarounds:
One is to use flush() in your print_line.py:
def do_task():
while True:
task = task_q.get()
pid = current_process().pid
print 'pid: {}, task: {}'.format(pid, task)
sys.stdout.flush()
task_q.task_done()
This will fix the issue as you will flush your stdout as soon as you have written something to it.
Another option is to use -u flag to Python in your main.py:
p = subprocess.Popen('python2 -u print_line.py',
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
close_fds=True,
shell=True,
universal_newlines=True)
-u will force stdin and stdout to be completely unbuffered in print_line.py, and children of print_line.py will then inherit this behaviour.
These are workarounds to the problem. If you are interested in the theory why this happens, it definitely has something to do with unflushed stdout being lost if subprocess terminates, but I am not the expert in this.
It's not a multiprocessing issue, but it is a subprocess issue—or more precisely, it has to to with standard I/O and buffering, as in Hannu's answer. The trick is that by default, the output of any process, whether in Python or not, is line buffered if the output device is a "terminal device" as determined by os.isatty(stream.fileno()):
>>> import sys
>>> sys.stdout.fileno()
1
>>> import os
>>> os.isatty(1)
True
There is a shortcut available to you once the stream is open:
>>> sys.stdout.isatty()
True
but the os.isatty() operation is the more fundamental one. That is, internally, Python inspects the file descriptor first using os.isatty(fd), then chooses the stream's buffering based on the result (and/or arguments and/or the function used to open the stream). The sys.stdout stream is opened early on during Python's startup, before you generally have much control.1
When you call open or codecs.open or otherwise do your own operation to open a file, you can specify the buffering via one of the optional arguments. The default for open is the system default, which is line buffering if isatty(), otherwise fully buffered. Curiously, the default for codecs.open is line buffered.
A line buffered stream gets an automatic flush() applied when you write a newline to it.
An unbuffered stream writes each byte to its output immediately. This is very inefficient in general. A fully buffered stream writes its output when the buffer gets sufficiently full—the definition of "sufficient" here tends to be pretty variable, anything from 1024 (1k) to 1048576 (1 MB)—or when explicitly directed.
When you run something as a process, it's the process itself that decides how to do any buffering. Your own Python code, reading from the process, cannot control it. But if you know something—or a lot—about the processes that you will run, you can set up their environment so that they run line-buffered, or even unbuffered. (Or, as in your case, since you write that code, you can write it to do what you want.)
1There are hooks that fire up very early, where you can fuss with this sort of thing. They are tricky to work though.
I'm new to python and would like to open a windows cmd prompt, start a process, leave the process running and then issue commands to the same running process.
The commands will change so i cant just include these commands in the cmdline variable below. Also, the process takes 10-15 seconds to start so i dont want to waste time waiting for the process to start and run commands each time. just want to start process once. and run quick commands as needed in the same process
I was hoping to use subprocess.Popen to make this work, though i am open to better methods. Note that my process to run is not cmd, but im just using this as example
import subprocess
cmdline = ['cmd', '/k']
cmd = subprocess.Popen(cmdline, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
cmd.stdin.write("echo hi") #would like this to be written to the cmd prompt
print cmd.stdout.readline() #would like to see 'hi' readback
cmd.stdin.write("echo hi again") #would like this to be written to the cmd prompt
print cmd.stdout.readline() #would like to see 'hi again' readback
The results arent what i expect. Seems as though the stdin.write commands arent actually getting in and the readline freezes up with nothing to read.
I have tried the popen.communicate() instead of write/readline, but it kills the process. I have tried setting bufsize in the Popen line, but that didn't make too much difference
Your comments suggest that you are confusing command-line arguments with input via stdin. Namely, the fact that system-console.exe program accepts script=filename parameter does not imply that you can send it the same string as a command via stdin e.g., python executable accepts -c "print(1)" command-line arguments but it is a SyntaxError if you pass it as a command to Python shell.
Therefore, the first step is to use the correct syntax. Suppose the system-console.exe accepts a filename by itself:
#!/usr/bin/env python3
import time
from subprocess import Popen, PIPE
with Popen(r'C:\full\path\to\system-console.exe -cli -',
stdin=PIPE, bufsize=1, universal_newlines=True) as shell:
for _ in range(10):
print('capture.tcl', file=shell.stdin, flush=True)
time.sleep(5)
Note: if you've redirected more than one stream e.g., stdin, stdout then you should read/write both streams concurrently (e.g., using multiple threads) otherwise it is very easy to deadlock your program.
Related:
Q: Why not just use a pipe (popen())? -- mandatory reading for Unix environment but it might also be applicable for some programs on Windows
subprocess readline hangs waiting for EOF -- code example on how to pass multiple inputs, read multiple outputs using subprocess, pexpect modules.
The second and the following steps might have to deal with buffering issues on the side of the child process (out of your hands on Windows), whether system-console allows to redirect its stdin/stdout or whether it works with a console directly, and character encoding issues (how various commands in the pipeline encode text).
Here is some code that I tested and is working on Windows 10, Quartus Prime 15.1 and Python 3.5
import subprocess
class altera_system_console:
def __init__(self):
sc_path = r'C:\altera_lite\15.1\quartus\sopc_builder\bin\system-console.exe --cli --disable_readline'
self.console = subprocess.Popen(sc_path, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
def read_output(self):
rtn = ""
loop = True
i = 0
match = '% '
while loop:
out = self.console.stdout.read1(1)
if bytes(match[i],'utf-8') == out:
i = i+1
if i==len(match):
loop=False
else:
rtn = rtn + out.decode('utf-8')
return rtn
def cmd(self,cmd_string):
self.console.stdin.write(bytes(cmd_string+'\n','utf-8'))
self.console.stdin.flush()
c = altera_system_console()
print(c.read_output())
c.cmd('set jtag_master [lindex [get_service_paths master] 0]')
print(c.read_output())
c.cmd('open_service master $jtag_master')
print(c.read_output())
c.cmd('master_write_8 $jtag_master 0x00 0xFF')
print(c.read_output())
You need to use iter if you want to see the output in real time:
import subprocess
cmdline = ['cmd', '/k']
cmd = subprocess.Popen(cmdline, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
cmd.stdin.write("echo hi\n")#would like this to be written to the cmd prompt
for line in iter(cmd.stdout.readline,""):
print line
cmd.stdin.write("echo hi again\n")#would like this to be written to the cmd prompt
Not sure exactly what you are trying to do but if you want to input certain data when you get certain output then I would recommend using pexpect
I am working on some scripts (in the company I work in) that are loaded/unloaded into hypervisors to fire a piece of code when an event occurs. The only way to actually unload a script is to hit Ctrl-C. I am writing a function in Python that automates the process
As soon as it sees the string "done" in the output of the program, it should kill the vprobe.
I am using subprocess.Popen to execute the command:
lineList = buff.readlines()
cmd = "vprobe /vprobe/myhello.emt"
p = subprocess.Popen(args = cmd, shell=True,stdout = buff, universal_newlines = True,preexec_fn=os.setsid)
while not re.search("done",lineList[-1]):
print "waiting"
os.kill(p.pid,signal.CTRL_C_EVENT)
As you can see, I am writing the output in buff file descriptor opened in read+write mode. I check the last line; if it has 'done', I kill it. Unfortunately, the CTRL_C_EVENT is only valid for Windows.
What can I do for Linux?
I think you can just send the Linux equivalent, signal.SIGINT (the interrupt signal).
(Edit: I used to have something here discouraging the use of this strategy for controlling subprocesses, but on more careful reading it sounds like you've already decided you need control-C in this specific case... So, SIGINT should do it.)
In Linux, Ctrl-C keyboard interrupt can be sent programmatically to a process using Popen.send_signal(signal.SIGINT) function. For example
import subprocess
import signal
..
process = subprocess.Popen(..)
..
process.send_signal(signal.SIGINT)
..
Don't use Popen.communicate() for blocking commands..
Maybe I misunderstand something, but the way you do it it is difficult to get the desired result.
Whatever buff is, you query it first, then use it in the context of Popen() and then you hope that by maciv lineList fills itself up.
What you probably want is something like
logfile = open("mylogfile", "a")
p = subprocess.Popen(['vprobe', '/vprobe/myhello.emt'], stdout=subprocess.PIPE, buff, universal_newlines=True, preexec_fn=os.setsid)
for line in p.stdout:
logfile.write(line)
if re.search("done", line):
break
print "waiting"
os.kill(p.pid, signal.CTRL_C_EVENT)
This gives you a pipe end fed by your vprobe script which you can read out linewise and act appropriately upon the found output.
I'm running multiple commands which may take some time, in parallel, on a Linux machine running Python 2.6.
So, I used subprocess.Popen class and process.communicate() method to parallelize execution of mulitple command groups and capture the output at once after execution.
def run_commands(commands, print_lock):
# this part runs in parallel.
outputs = []
for command in commands:
proc = subprocess.Popen(shlex.split(command), stdout=subprocess.PIPE, stderr=subprocess.STDOUT, close_fds=True)
output, unused_err = proc.communicate() # buffers the output
retcode = proc.poll() # ensures subprocess termination
outputs.append(output)
with print_lock: # print them at once (synchronized)
for output in outputs:
for line in output.splitlines():
print(line)
At somewhere else it's called like this:
processes = []
print_lock = Lock()
for ...:
commands = ... # a group of commands is generated, which takes some time.
processes.append(Thread(target=run_commands, args=(commands, print_lock)))
processes[-1].start()
for p in processes: p.join()
print('done.')
The expected result is that each output of a group of commands is displayed at once while execution of them is done in parallel.
But from the second output group (of course, the thread that become the second is changed due to scheduling indeterminism), it begins to print without newlines and adding spaces as many as the number of characters printed in each previous line and input echo is turned off -- the terminal state is "garbled" or "crashed". (If I issue reset shell command, it restores normal.)
At first, I tried to find the reason from handling of '\r', but it was not the reason. As you see in my code, I handled it properly using splitlines(), and I confirmed that with repr() function applied to the output.
I think the reason is concurrent use of pipes in Popen and communicate() for stdout/stderr. I tried check_output shortcut method in Python 2.7, but no success. Of course, the problem described above does not occur if I serialize all command executions and prints.
Is there any better way to handle Popen and communicate() in parallel?
A final result inspired by the comment from J.F.Sebastian.
http://bitbucket.org/daybreaker/kaist-cs443/src/247f9ecf3cee/tools/manage.py
It seems to be a Python bug.
I am not sure it is clear what run_commands needs to be actually doing, but it seems to be simply doing a poll on a subprocess, ignoring the return-code and continuing in the loop. When you get to the part where you are printing output, how could you know the sub-processes have completed?
In your example code I noticed your use of:
for line in output.splitlines():
to address partially the issue of " /r " ; use of
for line in output.splitlines(True):
would have been helpful.