I want to run a process that runs an infinite loop (for example, starting a database server) from a Python script and capture stdout and stderr. I tried this, but p.communicate() never returns, apparently because the process needs to finish first.
from subprocess import Popen, PIPE, STDOUT
cmd = "python infinite_loop.py"
p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
print("the process is running")
stdout, stderr = p.communicate()
print(stdout)
I'd like to get the output in some kind of streaming form. For example, I might want to save every 100 characters to a new log file. How can I do it?
Edit: Something closer to what you already had, as asyncio seems like overkill for a single coroutine:
import sys
from subprocess import Popen, PIPE, STDOUT
args = (sys.executable, '-u', 'test4.py')
cmd = ' '.join(args)
p = Popen(cmd, shell=True, stdin=PIPE, stdout=PIPE, stderr=STDOUT, universal_newlines=True)
print("the process is running")
for line in iter(p.stdout.readline, ''):
    line = line.rstrip()
    print(line)
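If you'd rather have fixed-size chunks than lines, e.g. to save every 100 characters to a log file as the question asks, a minimal sketch along the same pattern ('stream.log' and the chunk size are just illustrations):

import sys
from subprocess import Popen, PIPE, STDOUT

args = (sys.executable, '-u', 'test4.py')
p = Popen(args, stdin=PIPE, stdout=PIPE, stderr=STDOUT, universal_newlines=True)
with open('stream.log', 'w') as log:  # hypothetical log file
    # read(100) blocks until 100 characters arrive (or EOF), so every
    # chunk except possibly the last is exactly 100 characters.
    for chunk in iter(lambda: p.stdout.read(100), ''):
        log.write(chunk)
        log.flush()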
Original:
I threw something together. The following uses asyncio.subprocess to read lines from a subprocess's output and then does something with them (in this case, just print()ing them).
The subprocess is specified by args, and in my case is just running another python instance in unbuffered mode with the following script (test4.py):
import time
for _ in range(10):
    print(time.time(), flush=True)
    time.sleep(1)
I'm sleeping in the for loop so it's clear whether the lines are coming in individually or all at once when the program has finished. (If you don't believe me, you can change the for loop to while True:, which will never finish).
The "supervisor" script is:
import asyncio.subprocess
import sys

async def get_lines(args):
    proc = await asyncio.create_subprocess_exec(*args, stdout=asyncio.subprocess.PIPE)
    while proc.returncode is None:
        data = await proc.stdout.readline()
        if not data:
            break
        line = data.decode('ascii').rstrip()
        # Handle line (somehow)
        print(line)

if sys.platform == "win32":
    loop = asyncio.ProactorEventLoop()
    asyncio.set_event_loop(loop)
else:
    loop = asyncio.get_event_loop()

args = (sys.executable, '-u', 'test4.py')
loop.run_until_complete(get_lines(args))
loop.close()
Note that async def is Python 3.5+, but you could use the @asyncio.coroutine decorator on 3.4.
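For reference, a sketch of what get_lines would look like with the 3.4-era decorator syntax (later deprecated and removed):

import asyncio
import asyncio.subprocess

@asyncio.coroutine
def get_lines(args):
    proc = yield from asyncio.create_subprocess_exec(
        *args, stdout=asyncio.subprocess.PIPE)
    while proc.returncode is None:
        data = yield from proc.stdout.readline()
        if not data:
            break
        print(data.decode('ascii').rstrip())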
Related
I followed the accepted answer to the question A non-blocking read on a subprocess.PIPE in Python to read from a subprocess without blocking. This generally works fine, except when the process I call terminates quickly.
This is on Windows.
To illustrate, I have a bat file that simply writes one line to stdout:
test.bat:
@ECHO OFF
ECHO Fast termination
And here is the Python code, adapted from the above-mentioned answer:
from subprocess import PIPE, Popen
from threading import Thread
from queue import Queue, Empty
def enqueue_output(out, queue):
    # Note: with text=True the sentinel must be '' (str), not b''.
    for line in iter(out.readline, ''):
        queue.put(line)
    out.close()

p = Popen(['test.bat'], stdout=PIPE, bufsize=-1, text=True)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True  # thread dies with the program
t.start()

output = str()
while True:
    try:
        line = q.get_nowait()
    except Empty:
        line = ""
    output += line
    if p.poll() is not None:
        break
print(output)
Sometimes the line from the bat file is correctly captured and printed; sometimes nothing is captured and printed. I suspect that the subprocess might finish before the thread connects the queue to the pipe, in which case nothing is read. If I add a small two-second wait in the bat file before echoing the line, it seems to always work. Likewise, the behavior can be forced by adding a short sleep after the Popen in the Python code. Is there a way to reliably capture the output of the subprocess, even if it finishes immediately, while still doing a non-blocking read?
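One way to close that race, as a sketch: keep the non-blocking reads while the process runs, but once poll() reports an exit, join the reader thread (its loop ends when readline() hits EOF) and then drain the queue, so the output of a fast-terminating process is still collected:

# ...after the Popen/Thread setup above...
p.wait()   # let the process finish (it may already have)
t.join()   # the reader thread exits once readline() reaches EOF
output = ""
while True:  # drain everything the reader managed to enqueue
    try:
        output += q.get_nowait()
    except Empty:
        break
print(output)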
I am trying to asynchronously run the Popen command from subprocess, so that I can run other stuff in the background.
import subprocess
import requests
import asyncio
import asyncio.subprocess
async def x(message):
    if len(message.content.split()) > 1:
        #output = asyncio.create_subprocess_shell(message.content[3:], shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        output = subprocess.Popen(message.content[3:], shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
        return output.communicate()[0].decode('utf-8')
I have tried to understand https://docs.python.org/3/library/asyncio-subprocess.html but I am not sure what a protocol factory is.
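You don't need to write a protocol factory for this; the high-level coroutine API wraps the pipes in streams for you. A sketch of the function reworked that way (keeping the message.content slicing from the question):

import asyncio

async def x(message):
    if len(message.content.split()) > 1:
        proc = await asyncio.create_subprocess_shell(
            message.content[3:],
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT)
        # communicate() is a coroutine, so awaiting it lets other
        # tasks run while the subprocess executes.
        stdout, _ = await proc.communicate()
        return stdout.decode('utf-8')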
When I came to this question, I expected the answer to really use asyncio for interprocess communication.
I have found the following resource useful:
https://github.com/python/asyncio/blob/master/examples/child_process.py
and below is my simplified example (using 3.5+ async/await syntax), which reads lines and outputs them sorted:
import asyncio
from subprocess import Popen, PIPE
async def connect_write_pipe(file):
    """Return a write-only transport wrapping a writable pipe."""
    loop = asyncio.get_event_loop()
    transport, _ = await loop.connect_write_pipe(asyncio.Protocol, file)
    return transport

async def connect_read_pipe(file):
    """Wrap a readable pipe in a stream."""
    loop = asyncio.get_event_loop()
    stream_reader = asyncio.StreamReader(loop=loop)
    def factory():
        return asyncio.StreamReaderProtocol(stream_reader)
    transport, _ = await loop.connect_read_pipe(factory, file)
    return stream_reader, transport

async def main(loop):
    # Start the subprocess and wrap stdin, stdout, stderr.
    p = Popen(['/usr/bin/sort'], stdin=PIPE, stdout=PIPE, stderr=PIPE)
    stdin = await connect_write_pipe(p.stdin)
    stdout, stdout_transport = await connect_read_pipe(p.stdout)
    stderr, stderr_transport = await connect_read_pipe(p.stderr)

    # Interact with the subprocess.
    name = {stdout: 'OUT', stderr: 'ERR'}
    registered = {
        asyncio.Task(stderr.read()): stderr,
        asyncio.Task(stdout.read()): stdout
    }

    to_sort = b"one\ntwo\nthree\n"
    stdin.write(to_sort)
    stdin.close()  # this is how we signal that we have nothing else to write

    # Get and print lines from stdout and stderr.
    timeout = None
    while registered:
        done, pending = await asyncio.wait(
            registered, timeout=timeout,
            return_when=asyncio.FIRST_COMPLETED)
        if not done:
            break
        for f in done:
            stream = registered.pop(f)
            res = f.result()
            if res != b'':
                print(name[stream], res.decode('ascii').rstrip())
                registered[asyncio.Task(stream.read())] = stream
        timeout = 0.0

    stdout_transport.close()
    stderr_transport.close()

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main(loop))
    finally:
        loop.close()
NB: without taking special measures, the amount of data that can be written into the pipe is limited. On my system it was possible to write just over 700000 bytes before the pipe buffers were used up.
There are also other examples there, using create_subprocess_shell.
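For comparison, the same sort interaction via asyncio's own subprocess support is much shorter, since its communicate() manages the pipes (and their buffer limits) for you; a sketch:

import asyncio

async def sort_lines(data):
    proc = await asyncio.create_subprocess_exec(
        '/usr/bin/sort',
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE)
    # communicate() writes stdin in a flow-controlled way and reads
    # stdout to EOF, so large inputs don't wedge on pipe buffers.
    stdout, _ = await proc.communicate(data)
    return stdout

loop = asyncio.get_event_loop()
print(loop.run_until_complete(sort_lines(b"one\ntwo\nthree\n")))
loop.close()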
I have not yet used asyncio in real projects, so suggestions for improvement in the comments are welcome.
It's the right way to go! Use async/await.
Tested on Python 3.x [Windows, macOS]
import asyncio
from asyncio.subprocess import PIPE, STDOUT
import signal
import sys

def signal_handler(signal, frame):
    loop.stop()
    sys.exit(0)

async def run_async():
    cmd = 'sudo long_running_cmd --opt1=AAAA --opt2=BBBB'
    print("[INFO] Starting script...")
    process = await asyncio.create_subprocess_shell(cmd, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    await process.wait()
    print("[INFO] Script is complete.")

loop = asyncio.get_event_loop()
signal.signal(signal.SIGINT, signal_handler)
tasks = [loop.create_task(run_async())]
wait_tasks = asyncio.wait(tasks)
loop.run_until_complete(wait_tasks)
loop.close()
Core logic:
process = await asyncio.create_subprocess_shell(cmd, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
await process.wait()
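To stream the output line by line instead of just waiting, a sketch of the same core logic wrapped in a coroutine (stream_cmd is a made-up name):

import asyncio
from asyncio.subprocess import PIPE, STDOUT

async def stream_cmd(cmd):
    process = await asyncio.create_subprocess_shell(
        cmd, stdin=PIPE, stdout=PIPE, stderr=STDOUT)
    while True:
        line = await process.stdout.readline()
        if not line:  # b'' signals EOF on the pipe
            break
        print(line.decode().rstrip())
    await process.wait()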
I eventually found the answer to my question, which utilizes async.
http://pastebin.com/Zj8SK1CG
Tested with Python 3.5. Just ask if you have questions.
import threading
import time
import subprocess
import shlex
from sys import stdout
# Only data within a class are actually shared by the threads.
# Let's use a class as communicator (there could be problems if you have
# more than a single thread).
class Communicator(object):
    counter = 0
    stop = False
    arg = None
    result = None

# Here we can define what you want to do. There are other methods to do
# that, but this is the one I prefer.
class ThreadedFunction(threading.Thread):
    def run(self, *args, **kwargs):
        super().run()
        command = c.arg
        # Here, whatever you want to do...
        command = shlex.split(command)
        print(time.time())  # just to check that the command (sleep 5) is executed
        output = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE).communicate()
        print('\n', time.time())
        c.result = output
        if c.stop:
            return None  # useful only within loops within threads

# Create a class instance
c = Communicator()
c.arg = 'time sleep 5'  # 'time' is used only to have some output

# Create the thread and start it
t = ThreadedFunction()
t.start()  # start the thread and do something else...

# ...for example count the seconds in the meantime...
try:
    for j in range(100):
        c.counter += 1
        stdout.write('\r{:}'.format(c.counter))
        stdout.flush()
        time.sleep(1)
        if c.result is not None:
            print(c.result)
            break
except:
    c.stop = True
This one is much simpler. I found it after writing the other reply, which could still be interesting, so I left it.
import time
import subprocess
import shlex
from sys import stdout
command = 'time sleep 5'  # 'time' is used only to have some output

def x(command):
    cmd = shlex.split(command)
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return p

# Start the subprocess and do something else...
p = x(command)

# ...for example count the seconds in the meantime...
try:  # this takes care of killing the subprocess if problems occur
    for j in range(100):
        stdout.write('\r{:}'.format(j))
        stdout.flush()
        time.sleep(1)
        if p.poll() is not None:
            print(p.communicate())
            break
except:
    p.terminate()  # or p.kill()
The asynchronism is evident from the fact that the Python script prints the counter value to stdout while the background process runs the sleep command. The script exits after about 5 seconds, printing the output of the bash time command after having printed the counter in the meantime, which is evidence that it works.
I've got a program on Windows that calls a bunch of subprocesses, and displays the results in a GUI. I'm using PyQt for the GUI, and the subprocess module to run the programs.
I've got the following WorkerThread, which spawns a subthread for each shell command, devoted to reading the process's stdout and printing the results (later I'll wire it up to the GUI).
This all works. Except proc.stdout.read(1) never returns until after the subprocess has completed. This is a big problem, since some of these subprocesses can take 15-20 minutes to run, and I need to display results as they're running.
What do I need to do to get the pipe working while the subprocess is running?
class WorkerThread(QtCore.QThread):
    def run(self):
        def sh(cmd, cwd=None):
            proc = subprocess.Popen(cmd,
                                    shell=True,
                                    stdout=subprocess.PIPE,
                                    stderr=subprocess.STDOUT,
                                    stdin=subprocess.PIPE,
                                    cwd=cwd,
                                    env=os.environ)
            proc.stdin.close()

            class ReadStdOutThread(QtCore.QThread):
                def run(_self):
                    s = ''
                    while True:
                        if self.request_exit:
                            return
                        b = proc.stdout.read(1)
                        if b == '\n':
                            print s
                            s = ''
                            continue
                        if b:
                            s += b
                            continue
                        if s:
                            print s
                        return

            thread = ReadStdOutThread()
            thread.start()

            retcode = proc.wait()
            if retcode:
                raise subprocess.CalledProcessError(retcode, cmd)
            return 0
FWIW: I rewrote the whole thing using QProcess, and I see the exact same problem. The stdout receives no data, until the underlying process has returned. Then I get everything all at once.
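A hedged guess at the root cause, since it bites subprocess and QProcess alike: many programs switch from line buffering to block buffering when stdout is a pipe rather than a console, so nothing reaches the pipe until the buffer fills or the process exits. If a given child is a Python script, one way to rule this out is to force unbuffered output with -u ('child.py' here is just a placeholder):

import subprocess
import sys

# Run the child unbuffered so its prints reach the pipe immediately
# instead of sitting in a stdio buffer ('child.py' is hypothetical).
proc = subprocess.Popen([sys.executable, '-u', 'child.py'],
                        stdout=subprocess.PIPE,
                        stderr=subprocess.STDOUT)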
If you know how long the lines of the command's output will be, you can poll on the process's stdout PIPE.
An example of what I mean:
import select
import subprocess
import threading
import os

# Some time-consuming command.
command = 'while [ 1 ]; do sleep 1; echo "Testing"; done'

# A worker thread, not as complex as yours, just to show my point.
class Worker(threading.Thread):
    def __init__(self):
        super(Worker, self).__init__()
        self.proc = subprocess.Popen(
            command, shell=True,
            stdout=subprocess.PIPE,
            stdin=subprocess.PIPE, stderr=subprocess.STDOUT
        )

    def run(self):
        self.proc.communicate()

    def get_proc(self):
        # The proc is needed to ask it for its
        # output file descriptor later.
        return self.proc

if __name__ == '__main__':
    w = Worker()
    w.start()

    proc = w.get_proc()
    pollin = select.poll()
    pollin.register(proc.stdout, select.POLLIN)

    while (1):
        events = pollin.poll()
        for fd, event in events:
            if event == select.POLLIN:
                # This is the main issue with my idea:
                # if you don't know the length of the lines
                # the process outputs, this is a problem.
                # I use 7 since I know the word "Testing" has
                # 7 characters.
                print os.read(fd, 7)
Maybe this is not exactly what you're looking for, but I think it gives you a pretty good idea of how to solve your problem.
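If you don't know the line lengths, one workaround, as a sketch: pass a generous buffer size. On a pipe, os.read returns as soon as any bytes are available (up to the limit) instead of waiting for the buffer to fill:

# Reads whatever is available, up to 4096 bytes, without needing
# to know how long each line is (fd as in the loop above).
print os.read(fd, 4096)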
EDIT: I think I've just found what you need: Streaming stdout from a Python subprocess in Python.
I am running a long process (actually another python script) in the background. I need to know when it has finished. I have found that Popen.poll() always returns 0 for a background process. Is there another way to do this?
p = subprocess.Popen("sleep 30 &", shell=True,
stdout=subprocess.PIPE, stderr=subprocess.PIPE)
a = p.poll()
print(a)
The above code never prints None.
You don't need the shell's & backgrounding syntax, as subprocess runs the process in the background by itself.
Just run the command normally, then wait until Popen.poll returns something other than None.
import time
import subprocess
p = subprocess.Popen("sleep 30", shell=True)
# Better: p = subprocess.Popen(["sleep", "30"])
# Wait until process terminates
while p.poll() is None:
    time.sleep(0.5)
# It's done
print("Process ended, ret code:", p.returncode)
I think you want either the popen.wait() or popen.communicate() methods. communicate() will grab the stdout and stderr data you've put into PIPE. If the other item is a Python script, I would avoid a shell=True call by doing something like:
p = subprocess.Popen([python.call, "my", params, (go, here)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
(stdout, stderr) = p.communicate()
print(stdout)
print(stderr)
Of course these hold the main thread and wait for the other process to complete, which might be bad. If you want to busy wait then you could simply wrap your original code in a loop. (Your original code did print "None" for me, btw)
Example of the wrapping in a loop solution:
p = subprocess.Popen([python.call, "my", params, (go, here)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while p.poll() is None:
    # We can do other things here while we wait
    time.sleep(.5)
(results, errors) = p.communicate()
if errors == '':
    return results
else:
    raise My_Exception(errors)
You shouldn't run your script with an ampersand at the end, because the shell then forks your process and itself returns a 0 exit code immediately.
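A quick sketch of the difference (the sleeps just give the shell a moment to fork):

import subprocess
import time

# With '&' the shell forks sleep and exits immediately, so poll()
# soon reports the shell's exit code (0) even though sleep runs on.
p = subprocess.Popen("sleep 30 &", shell=True)
time.sleep(0.5)
print(p.poll())  # 0

# Without '&' the shell waits for sleep, so poll() returns None
# until the 30 seconds are up.
p = subprocess.Popen("sleep 30", shell=True)
time.sleep(0.5)
print(p.poll())  # None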
I need to show some progress bar or something while spawning and running subprocess.
How can I do that with python?
import subprocess
cmd = ['python','wait.py']
p = subprocess.Popen(cmd, bufsize=1024, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p.stdin.close()
outputmessage = p.stdout.read()  # this reads the standard output from the spawned process
message = p.stderr.read()
I could spawn subprocess with this code, but I need to print out something when each second is passing.
Since the subprocess call is blocking, one way to print something while waiting is to use multithreading. Here's an example using threading._Timer (a private class in Python 2; on Python 3 you can subclass threading.Timer directly):
import threading
import subprocess
class RepeatingTimer(threading._Timer):
    def run(self):
        while True:
            self.finished.wait(self.interval)
            if self.finished.is_set():
                return
            else:
                self.function(*self.args, **self.kwargs)

def status():
    print "I'm alive"

timer = RepeatingTimer(1.0, status)
timer.daemon = True  # allows the program to exit if only the thread is alive
timer.start()

proc = subprocess.Popen(['/bin/sleep', "5"])
proc.wait()

timer.cancel()
On an unrelated note, calling stdout.read() while using multiple pipes can lead to deadlock. The subprocess.communicate() function should be used instead.
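To make that concrete, a sketch of the question's snippet reworked around communicate() (same hypothetical wait.py):

import subprocess

cmd = ['python', 'wait.py']
p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# communicate() services stdout and stderr together, so neither pipe
# can fill up and block the child; it also closes stdin for us.
outputmessage, message = p.communicate()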
As far as I can see, all you need to do is put those reads in a loop with a delay and a print. Does it have to be precisely a second, or just around a second?
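One reading of that suggestion, as a sketch: poll once a second, print a progress mark each time, and collect the output when the child exits (safe as long as the child's output fits in the pipe buffer; otherwise use a reader thread as in the answers above):

import subprocess
import time

p = subprocess.Popen(['python', 'wait.py'],
                     stdout=subprocess.PIPE, stderr=subprocess.PIPE)
while p.poll() is None:  # child still running
    print('.', end='', flush=True)  # the once-a-second progress mark
    time.sleep(1)
outputmessage, message = p.communicate()  # grab the remaining output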