Streaming stdin/stdout in Python

I'm trying to stream a bash shell to/from a simple WebSockets UI, but I'm having trouble redirecting the IO. I want to start an instance of bash and connect stdout and stdin to write_message() and on_message() functions that interact with my web UI. Here's a simplified version of what I'm trying to do:
class Handler(WebSocketHandler):
    def open(self):
        print "New connection opened."
        self.app = subprocess.Popen(["/bin/bash", "--norc", "-i"],
                                    stdout=subprocess.PIPE,
                                    stdin=subprocess.PIPE, shell=False)
        thread.start_new_thread(self.io_loop, ())

    def on_message(self, message):
        self.app.stdin.write(message)

    def on_close(self):
        self.app.terminate()

    def io_loop(self):
        while self.app.poll() is None:
            line = self.app.stdout.readline()
            if line:
                self.write_message(line)
While bash appears to start and on_message does get called, I don't get any output. readline() remains blocking. I've tried stdout.read(), stdout.read(1), and various buffer modifications, but still no output. I've also tried hardcoding commands with a trailing '\n' in on_message to isolate the issue, but I still don't get any output from readline().
Ideally I want to stream each byte written to stdout in realtime, without waiting for EOL or any other characters, but I'm having a hard time finding the right API. Any pointers would be appreciated.

It looks to me like the line:
line = self.app.stdout.readline()
will block the ioloop from running, because the program spends most of its time hung up in readline(), waiting for the child process to write some output. To get this to work, you are going to have to get the stdin and stdout of the process (and stderr as well; you need to capture that too), switch them into non-blocking mode, and add them to the set of file descriptors that the ioloop spends its time looping on.
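Here is a minimal sketch of that approach, assuming the Tornado IOLoop (which is what WebSocketHandler suggests); only stdout is shown, and the fcntl dance is what switches the pipe into non-blocking mode:

import fcntl
import os
import subprocess

from tornado.ioloop import IOLoop
from tornado.websocket import WebSocketHandler


def set_nonblocking(fd):
    # Switch the descriptor into non-blocking mode so reads return
    # whatever bytes are available instead of hanging.
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


class Handler(WebSocketHandler):
    def open(self):
        self.app = subprocess.Popen(
            ["/bin/bash", "--norc", "-i"],
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
            stdin=subprocess.PIPE)
        fd = self.app.stdout.fileno()
        set_nonblocking(fd)
        # Let the IOLoop call us back whenever the pipe is readable,
        # instead of blocking a thread in readline().
        IOLoop.current().add_handler(fd, self.on_stdout, IOLoop.READ)

    def on_stdout(self, fd, events):
        data = os.read(fd, 4096)  # raw bytes, no waiting for EOL
        if data:
            self.write_message(data)
        else:  # an empty read means the child closed its end
            IOLoop.current().remove_handler(fd)

This also gets you the byte-at-a-time streaming asked about, since os.read() returns whatever is in the pipe rather than waiting for a newline.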

Related

How do I properly loop through subprocess.stdout

I'm creating a program where I need to use a powershell session, and I found out how I could have a persistent session using the code below. However, I want to loop through the new lines of powershell's output after a command has been run. The for loop below is the only way I've found to do so, but it expects an EOF that never arrives, so it just lingers and the program never exits. How can I get the number of new lines in stdout so I can loop through them properly?
from subprocess import Popen, PIPE

process = Popen(["powershell"], stdin=PIPE, stdout=PIPE)

def ps(command):
    command = bytes("{}\n".format(command), encoding='utf-8')
    process.stdin.write(command)
    process.stdin.flush()
    process.stdout.readline()
    return process.stdout.readline().decode("utf-8")

ps("echo hello world")
for line in process.stdout:
    print(line.strip().decode("utf-8"))
process.stdin.close()
process.wait()
You need the Powershell command to know when to exit. Typically, the solution is to not just flush, but close the stdin for the child process; when it's done with its work and finds EOF on its input, it should exit on its own. Just change:
process.stdin.flush()
to:
process.stdin.close()
which implies a flush and also ensures the child process knows input is done. If that doesn't work on its own, you might explicitly add a quit or exit command (whatever Powershell uses to terminate the session manually) after the command you're actually running.
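Applied to the question's code, the single-command case would look something like this sketch:

from subprocess import Popen, PIPE

process = Popen(["powershell"], stdin=PIPE, stdout=PIPE)
process.stdin.write(b"echo hello world\n")
process.stdin.close()   # flushes, and the child sees EOF on its input
for line in process.stdout:   # this loop now ends when powershell exits
    print(line.strip().decode("utf-8"))
process.wait()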
If you must run multiple commands in a single subprocess, and each command's output must be fully consumed before the next one is sent, there are terrible heuristic solutions available. For example, send three commands at once: the command you actually want to run, a second that simply echoes a sentinel string, and a third that explicitly flushes stdout (so block buffering doesn't leave the sentinel stuck in the subprocess's internal buffers while you deadlock waiting for it); your read loop can then terminate once it sees the sentinel. Without a sentinel it's worse, because you basically can't tell when the command is done: you'd have to use the select/selectors module to poll the process's stdout with a timeout, reading lines whenever data is available, and assume the process is done if nothing new arrives within the expected timeout window.
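For illustration, a rough sketch of the sentinel heuristic; the marker string and the explicit flush command are assumptions made up here, not anything Powershell requires:

SENTINEL = "__COMMAND_DONE__"

def ps(command):
    process.stdin.write("{}\n".format(command).encode("utf-8"))
    # Echo a sentinel so the reader knows this command's output is over,
    # then force a flush in case the child block-buffers its stdout.
    process.stdin.write('echo "{}"\n'.format(SENTINEL).encode("utf-8"))
    process.stdin.write(b"[Console]::Out.Flush()\n")
    process.stdin.flush()
    lines = []
    for raw in process.stdout:
        line = raw.decode("utf-8").rstrip("\r\n")
        if line == SENTINEL:   # strict match, so the echoed command itself doesn't trigger it
            break
        lines.append(line)
    return lines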

Polling subprocess object without blocking

I'm writing a python script that launches programs in the background and then monitors to see if they encounter an error. I am using the subprocess module to start the process and keep a list of running programs.
processes.append((subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE), command))
I have found that when I try to monitor the programs by calling communicate on the subprocess object, the main program waits for the program to finish. I have tried to use poll(), but that doesn't give me access to the error code that caused the crash and I would like to address the issue and retry opening the process.
runningProcesses is a list of tuples containing the subprocess object and the command associated with it.
def monitorPrograms(runningProcesses):
    for program in runningProcesses:
        temp = program[0].communicate()
        if program[0].returncode:
            if program[0].returncode == 1:
                print "Program exited successfully."
            else:
                print "Whoops, something went wrong. Program %s crashed." % program[0].pid
When I tried to get the return code without using communicate, the crash of the program didn't register.
Do I have to use threads to run the communication in parallel or is there a simpler way that I am missing?
There's no need to use threads to monitor multiple processes, especially if you don't use their output (use DEVNULL instead of PIPE to hide it); see Python threading multiple bash subprocesses?
Your main issue is incorrect Popen.poll() usage: if it returns None, the process is still running, so you should keep calling it until you get a non-None value. Here's a sketch along those lines that prints each process's status as it exits:
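This is a minimal sketch of that polling loop, not the asker's exact setup; DEVNULL is emulated with os.devnull here because subprocess.DEVNULL only exists on Python 3.3+:

import os
import subprocess
import time

# Pass stdout=DEVNULL, stderr=DEVNULL to Popen to hide the output.
DEVNULL = open(os.devnull, 'wb')

# runningProcesses is the same list of (Popen, command) tuples as above.
def monitorPrograms(runningProcesses):
    pending = list(runningProcesses)
    while pending:
        still_running = []
        for process, command in pending:
            returncode = process.poll()
            if returncode is None:     # still running; check again later
                still_running.append((process, command))
            elif returncode == 0:
                print "Program %s exited successfully." % command
            else:
                print "Whoops, program %s crashed with code %d." % (
                    command, returncode)
        pending = still_running
        time.sleep(0.5)                # don't spin at 100% CPU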
If you do want to get the subprocess's stdout/stderr as a string, you could use threads or asyncio.
If you are on Unix and you control all the code that may spawn subprocesses, you could avoid polling entirely and handle SIGCHLD yourself. The asyncio stdlib library can handle SIGCHLD for you; you could also implement it manually, though it might be complicated.
Based on my research, the best way to do this is with threads. Here's an article that I referenced when creating my own package to solve this problem.
The basic method is to spin off threads that continually collect the log output (and finally the exit status) of the subprocess call.
Here's an example of my own "receiver" which listens for logs:
class Receiver(threading.Thread):
    def __init__(self, stream, stream_type=None, callback=None):
        super(Receiver, self).__init__()
        self.stream = stream
        self.stream_type = stream_type
        self.callback = callback
        self.complete = False
        self.text = ''

    def run(self):
        for line in iter(self.stream.readline, ''):
            line = line.rstrip()
            if self.callback:
                line = self.callback(line, msg_type=self.stream_type)
            self.text += line + "\n"
        self.complete = True
And now the code that spins the receiver off:
def _execute(self, command):
    process = subprocess.Popen(command, stdout=subprocess.PIPE,
                               stderr=subprocess.PIPE,
                               shell=True, preexec_fn=os.setsid)
    out = Receiver(process.stdout, stream_type='out', callback=self.handle_log)
    err = Receiver(process.stderr, stream_type='err', callback=self.handle_log)
    out.start()
    err.start()
    try:
        self.wait_for_complete(out)
    except CommandTimeout:
        os.killpg(process.pid, signal.SIGTERM)
        raise
    else:
        status = process.poll()
        output = CommandOutput(status=status, stdout=out.text, stderr=err.text)
        return output
    finally:
        out.join(timeout=1)
        err.join(timeout=1)
CommandOutput is simply a named tuple that makes it easy to reference the data I care about.
You'll notice I have a method wait_for_complete which waits for the receiver to set complete = True. Once complete, the execute method calls process.poll() to get the exit code. We now have all of stdout/stderr and the status code of the process.
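wait_for_complete isn't shown in the excerpt; going by the description above, it might look something like this sketch (the timeout and interval values, and the CommandTimeout exception, are assumptions):

import time

def wait_for_complete(self, receiver, timeout=60.0, interval=0.1):
    # Spin on the receiver's `complete` flag until it flips, or give up
    # once the deadline passes.
    deadline = time.time() + timeout
    while not receiver.complete:
        if time.time() > deadline:
            raise CommandTimeout("command did not complete in %ss" % timeout)
        time.sleep(interval)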

Simulate Ctrl-C keyboard interrupt in Python while working in Linux

I am working on some scripts (at the company I work for) that are loaded/unloaded into hypervisors to fire a piece of code when an event occurs. The only way to actually unload a script is to hit Ctrl-C. I am writing a Python function that automates the process.
As soon as it sees the string "done" in the output of the program, it should kill the vprobe.
I am using subprocess.Popen to execute the command:
lineList = buff.readlines()
cmd = "vprobe /vprobe/myhello.emt"
p = subprocess.Popen(args=cmd, shell=True, stdout=buff,
                     universal_newlines=True, preexec_fn=os.setsid)
while not re.search("done", lineList[-1]):
    print "waiting"
os.kill(p.pid, signal.CTRL_C_EVENT)
As you can see, I am writing the output to the buff file descriptor, which is opened in read+write mode. I check the last line; if it has 'done', I kill the process. Unfortunately, CTRL_C_EVENT is only valid on Windows.
What can I do for Linux?
I think you can just send the Linux equivalent, signal.SIGINT (the interrupt signal).
(Edit: I used to have something here discouraging the use of this strategy for controlling subprocesses, but on more careful reading it sounds like you've already decided you need control-C in this specific case... So, SIGINT should do it.)
In Linux, a Ctrl-C keyboard interrupt can be sent programmatically to a process using Popen.send_signal(signal.SIGINT). For example:
import subprocess
import signal
..
process = subprocess.Popen(..)
..
process.send_signal(signal.SIGINT)
..
Don't use Popen.communicate() for blocking commands.
Maybe I misunderstand something, but the way you're doing it makes it difficult to get the desired result.
Whatever buff is, you query it first, then use it in the context of Popen(), and then you hope that lineList magically fills itself up.
What you probably want is something like
logfile = open("mylogfile", "a")
p = subprocess.Popen(['vprobe', '/vprobe/myhello.emt'],
                     stdout=subprocess.PIPE,
                     universal_newlines=True, preexec_fn=os.setsid)
for line in p.stdout:
    logfile.write(line)
    if re.search("done", line):
        break
    print "waiting"
os.kill(p.pid, signal.SIGINT)  # SIGINT, since CTRL_C_EVENT is Windows-only
This gives you a pipe end fed by your vprobe script, which you can read out line by line and act upon appropriately.

How do I run a sub-process, display its output in a GUI and allow it to be terminated?

I have been trying to write an application that runs subprocesses and (among other things) displays their output in a GUI and allows the user to click a button to cancel them. I start the processes like this:
queue = Queue.Queue(500)
process = subprocess.Popen(
    command,
    stdout=subprocess.PIPE,
    stderr=subprocess.STDOUT)
iothread = threading.Thread(
    target=simple_io_thread,
    args=(process.stdout, queue))
iothread.daemon = True
iothread.start()
where simple_io_thread is defined as follows:
def simple_io_thread(pipe, queue):
    while True:
        line = pipe.readline()
        queue.put(line, block=True)
        if line == "":
            break
This works well enough. In my UI I periodically do non-blocking "get"s from the queue. However, my problems come when I want to terminate the subprocess. (The subprocess is an arbitrary process, not something I wrote myself.) I can use the terminate method to terminate the process, but I do not know how to guarantee that my I/O thread will terminate. It will normally be doing blocking I/O on the pipe. This may or may not end some time after I terminate the process. (If the subprocess has spawned another subprocess, I can kill the first subprocess, but the second one will still keep the pipe open. I'm not even sure how to get such grand-children to terminate cleanly.) After that the I/O thread will try to enqueue the output, but I don't want to commit to reading from the queue indefinitely.
Ideally I would like some way to request termination of the subprocess, block for a short (<0.5s) amount of time and after that be guaranteed that the I/O thread has exited (or will exit in a timely fashion without interfering with anything else) and that I can stop reading from the queue.
It's not critical to me that a solution uses an I/O thread. If there's another way to do this that works on Windows and Linux with Python 2.6 and a Tkinter GUI that would be fine.
EDIT - Will's answer and other things I've seen on the web about doing this in other languages suggest that the operating system expects you just to close the file handle on the main thread and then the I/O thread should come out of its blocking read. However, as I described in the comment, that doesn't seem to work for me. If I do this on the main thread:
process.stdout.close()
I get:
IOError: close() called during concurrent operation on the same file object.
...on the main thread. If I do this on the main thread:
os.close(process.stdout.fileno())
I get:
close failed in file object destructor: IOError: [Errno 9] Bad file descriptor
...later on in the main thread when it tries to close the file handle itself.
I know this is an old post, but in case it still helps anyone, I think your problem could be solved by passing the subprocess.Popen instance to io_thread, rather than its output stream.
If you do that, then you can replace your while True: line with while process.poll() is None:.
process.poll() checks the subprocess return code; if the process hasn't finished, there isn't one yet (i.e. process.poll() is None). You can then do away with if line == "": break, as sketched below.
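A sketch of that suggestion, reusing the question's names; the final drain loop is my own addition, since output can still sit buffered in the pipe after the process exits:

def simple_io_thread(process, queue):
    # Loop while the child is alive instead of waiting for an empty read.
    while process.poll() is None:
        line = process.stdout.readline()
        if line:
            queue.put(line, block=True)
    # The process has exited, but the pipe may still hold buffered output.
    for line in process.stdout:
        queue.put(line, block=True)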
The reason I'm here is that I wrote a very similar script today and got the same IOError: close() called during concurrent operation on the same file object. errors.
Again, in case it helps, I think my problem stemmed from (my) io_thread doing some overly efficient garbage collection and closing a file handle I gave it (I'm probably wrong, but it works now...). Mine's different, though, in that it's not daemonic, and it iterates through subprocess.stdout rather than using a while loop, i.e.:
def io_thread(subprocess, logfile, lock):
    for line in subprocess.stdout:
        lock.acquire()
        print line,
        lock.release()
        logfile.write(line)
I should also probably mention that I pass the bufsize argument to subprocess.Popen, so that it's line buffered.
This is probably old enough, but still useful to someone coming from a search engine...
The reason it shows that message is that after the subprocess has completed, it closes the file descriptors; the daemon thread (which is running concurrently) then tries to use those closed descriptors and raises the error.
Joining the thread before calling the subprocess's wait() or communicate() methods should be more than enough to suppress the error:
my_thread.join()
print my_thread.is_alive()
my_popen.communicate()
In the code that terminates the process, you could also explicitly os.close() the pipe that your thread is reading from?
You should close the write end of the pipe instead... but as the code is written you cannot access it. To do that you should:
create a pipe
pass the write end's file descriptor to Popen's stdout
use the read end in simple_io_thread to read lines.
Now you can close the write end and the read thread will close gracefully.
queue = Queue.Queue(500)
r, w = os.pipe()
process = subprocess.Popen(
    command,
    stdout=w,
    stderr=subprocess.STDOUT)
iothread = threading.Thread(
    target=simple_io_thread,
    args=(os.fdopen(r), queue))
iothread.daemon = True
iothread.start()
Now, calling
os.close(w)
closes the pipe, and iothread will shut down without any exception.

Python Popen, closing streams and multiple processes

I have some data that I would like to gzip, uuencode and then print to standard out. What I basically have is:
compressor = Popen("gzip", stdin = subprocess.PIPE, stdout = subprocess.PIPE)
encoder = Popen(["uuencode", "dummy"], stdin = compressor.stdout)
The way I feed data to the compressor is through compressor.stdin.write(stuff).
What I really need to do is to send an EOF to the compressor, and I have no idea how to do it.
At some point, I tried compressor.stdin.close() but that doesn't work -- it works well when the compressor writes to a file directly, but in the case above, the process doesn't terminate and stalls on compressor.wait().
Suggestions? In this case, gzip is an example and I really need to do something with piping the output of one process to another.
Note: The data I need to compress won't fit in memory, so communicate isn't really a good option here. Also, if I just run
compressor.communicate("Testing")
after the 2 lines above, it still hangs with the error
File "/usr/lib/python2.4/subprocess.py", line 1041, in communicate
rlist, wlist, xlist = select.select(read_set, write_set, [])
I suspect the issue is with the order in which you open the pipes. UUEncode is funny in that it will whine when you launch it if there's no incoming pipe set up in just the right way (try launching the darn thing on its own in a Popen call, with just PIPE as the stdin and stdout, to see the explosion).
Try this:
encoder = Popen(["uuencode", "dummy"], stdin=PIPE, stdout=PIPE)
compressor = Popen("gzip", stdin=PIPE, stdout=encoder.stdin)
compressor.communicate("UUencode me please")
encoded_text = encoder.communicate()[0]
print encoded_text
begin 644 dummy
F'XL(`%]^L$D``PL-3<U+SD])5<A-52C(24TL3#4`;2O+"!(`````
`
end
You are right, btw... there is no way to send a generic EOF down a pipe; after all, each program really defines its own EOF. The way to do it is to close the pipe, as you were trying to do.
EDIT: I should be clearer about uuencode. As a shell program, its default behaviour is to expect console input. If you run it without a "live" incoming pipe, it will block waiting for console input. By opening the encoder second, before you had sent material down the compressor pipe, the encoder was blocking, waiting for you to start typing. Jerub was right that something was blocking.
This is not the sort of thing you should be doing directly in Python; there are eccentricities regarding how things work that make it a much better idea to do this with a shell. If you can just use subprocess.Popen("foo | bar", shell=True), then all the better.
What might be happening is that gzip has not been able to output all of its input yet, and the process will not exit until its stdout writes have finished.
You can look at what system call a process is blocking on if you use strace. Use ps auxwf to discover which process is the gzip process, then use strace -p $pidnum to see what system call it is performing. Note that stdin is FD 0 and stdout is FD 1, you will probably see it reading or writing on those file descriptors.
If you just want to compress and don't need the file wrappers, consider using the zlib module:
import zlib
compressed = zlib.compress("text")
Any reason why the shell=True and Unix pipes suggestions won't work?
from subprocess import *

pipes = Popen("gzip | uuencode dummy", stdin=PIPE, stdout=PIPE, shell=True)
for i in range(1, 100):
    pipes.stdin.write("some data")
pipes.stdin.close()
print pipes.stdout.read()
Seems to work.
