Capture output from external C program with Python GUI in realtime

I have a threaded C program that I want to launch using a Python GUI in a Unix environment. I want to be able to change the state of the GUI by gathering output from the C program.
For instance, this is output from my C program using printf:
Thread on tile 1: On
Thread on tile 2: OFF
Thread on tile 3: Disable
...
Thread on tile 61: ON
I would update my GUI based on this output. What makes the problem difficult is that both my GUI and the C program need to run simultaneously, with updates happening in real time. I also need to be able to send commands to the C program from my GUI.
I'm new to Python, C, and Unix (I know, complete rookie status).
I've read up on subprocess, Popen, and pexpect, but I'm not sure how to put it all together, or if this is even possible at all.
Thanks in advance

The basic outline of an approach would be to have your python GUI create a new process with the C program and then have the python GUI reading from one end of a pipe while the C program is writing to the other end of the pipe. The python GUI will read the output from the C program, interpret the output, and then do something based on what it has read.
Multiprocessing with Python.
How to fork and return text with Python
Recipe to fork a daemon process on Unix
The recipe article has comments about doing redirection of standard in and standard out which you may need to do.
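A minimal sketch of that outline, assuming the C binary lives at ./c_program (a hypothetical path) and setting aside the buffering caveats discussed in the next answer:
from subprocess import Popen, PIPE

# Launch the C program with its stdout connected to a pipe we can read.
proc = Popen(['./c_program'], stdout=PIPE, universal_newlines=True)

# Read one line of output at a time and react to it.
for line in proc.stdout:
    if line.startswith('Thread on tile'):
        pass  # parse the tile number and state here, then update the GUI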

There's a toughie. I've run into this problem in the past (with no truly satisfactory solution):
https://groups.google.com/forum/?fromgroups#!topic/comp.lang.python/79uoHgAbg18
As suggested there, take a look at this custom module:
http://pypi.python.org/pypi/sarge/0.1
http://sarge.readthedocs.org/en/latest/
Edit @Richard (not enough rep to comment): The problem with pipes is that unless they are attached to an interactive terminal, they are fully buffered -- meaning that none of the output is passed through the pipe to the Python side until the C prog is done running, which certainly doesn't qualify as real time.
Edit 2: Based on Richard's link and some earlier thinking I had done, it occurred to me that it might be possible to manually loop over the pipe by treating it as a file object and only reading one line at a time:
from time import sleep

# Assume proc is the Popen object
wait_time = 1  # 1 second, same delay as `tail -f`
while True:  # Or whatever condition you need
    line = proc.stdout.readline()
    if line != '' and line != '\n':
        parse_line_do_stuff(line)
    sleep(wait_time)
This assumes that readline() is non-blocking, and further assumes that the pipe is at most line buffered, and even then it might not work. I've never tried it.

You have two processes, and you want them to communicate. This is known as "interprocess communication" or even IPC. If you Google for "interprocess communication" you can find some information, like this:
http://beej.us/guide/bgipc/output/html/singlepage/bgipc.html
You might want to try a "domain socket" for communicating between the C program and the Python GUI. The guide linked above explains how to talk through a domain socket in C; here is a tutorial on talking through a domain socket with Python.
http://www.doughellmann.com/PyMOTW/socket/uds.html
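For the Python side, a minimal sketch, assuming the C program listens on /tmp/c_prog.sock (a hypothetical path):
import socket

# Connect to the Unix domain socket the C program is listening on.
sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
sock.connect('/tmp/c_prog.sock')

sock.sendall(b'disable tile 3\n')  # send a command to the C program
status = sock.recv(4096)           # read a status update back
sock.close()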

Related

Subprocess, repeatedly write to STDIN while reading from STDOUT (Windows)

I want to call an external process from Python. The process I'm calling reads an input string and gives a tokenized result, then waits for another input (the binary is the MeCab tokenizer, if that helps).
I need to tokenize thousands of lines of strings by calling this process.
The problem is that Popen.communicate() works, but waits for the process to die before giving out the STDOUT result. I don't want to keep closing and opening new subprocesses thousands of times. (And I don't want to send the whole text; it may easily grow to tens of thousands of long lines in the future.)
from subprocess import PIPE, Popen

with Popen("mecab -O wakati".split(), stdin=PIPE,
           stdout=PIPE, stderr=PIPE, close_fds=False,
           universal_newlines=True, bufsize=1) as proc:
    output, errors = proc.communicate("foobarbaz")
    print(output)
I've tried reading proc.stdout.read() instead of using communicate, but it is blocked by stdin and doesn't return any results before proc.stdin.close() is called. Which, again, means I need to create a new process every time.
I've tried to implement queues and threads from a similar question as below, but it either doesn't return anything, so it's stuck on the while True loop, or, when I force the stdin buffer to fill by repeatedly sending strings, it outputs all the results at once.
from subprocess import PIPE, Popen
from threading import Thread
from queue import Queue, Empty

def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()

p = Popen('mecab -O wakati'.split(), stdout=PIPE, stdin=PIPE,
          universal_newlines=True, bufsize=1, close_fds=False)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True
t.start()

p.stdin.write("foobarbaz")
while True:
    try:
        line = q.get_nowait()
    except Empty:
        pass
    else:
        print(line)
        break
Also looked at the Pexpect route, but its Windows port doesn't support some important modules (the pty-based ones), so I couldn't apply that either.
I know there are a lot of similar answers, and I've tried most of them. But nothing I've tried seems to work on Windows.
EDIT: some info on the binary I'm using, when I use it via the command line. It runs and tokenizes the sentences I give it, until I'm done and forcibly close the program.
(...waits_for_input -> input_received -> output -> waits_for_input...)
Thanks.
If mecab uses C FILE streams with default buffering, then piped stdout has a 4 KiB buffer. The idea here is that a program can efficiently use small, arbitrary-sized reads and writes to the buffers, and the underlying standard I/O implementation handles automatically filling and flushing the much-larger buffers. This minimizes the number of required system calls and maximizes throughput. Obviously you don't want this behavior for interactive console or terminal I/O or writing to stderr. In these cases the C runtime uses line-buffering or no buffering.
A program can override this behavior, and some do have command-line options to set the buffer size. For example, Python has the "-u" (unbuffered) option and PYTHONUNBUFFERED environment variable. If mecab doesn't have a similar option, then there isn't a generic workaround on Windows. The C runtime situation is too complicated. A Windows process can link statically or dynamically to one or several CRTs. The situation on Linux is different since a Linux process generally loads a single system CRT (e.g. GNU libc.so.6) into the global symbol table, which allows an LD_PRELOAD library to configure the C FILE streams. Linux stdbuf uses this trick, e.g. stdbuf -o0 mecab -O wakati.
One option to experiment with is to call CreateConsoleScreenBuffer and get a file descriptor for the handle from msvcrt.open_osfhandle. Then pass this as stdout instead of using a pipe. The child process will see this as a TTY and use line buffering instead of full buffering. However managing this is non-trivial. It would involve reading (i.e. ReadConsoleOutputCharacter) a sliding buffer (call GetConsoleScreenBufferInfo to track the cursor position) that's actively written to by another process. This kind of interaction isn't something that I've ever needed or even experimented with. But I have used a console screen buffer non-interactively, i.e. reading the buffer after the child has exited. This allows reading up to 9,999 lines of output from programs that write directly to the console instead of stdout, e.g. programs that call WriteConsole or open "CON" or "CONOUT$".
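Roughly, the setup might look like this -- an untested sketch using ctypes, with the constants taken from the Win32 headers:
import ctypes
import msvcrt
import subprocess

kernel32 = ctypes.windll.kernel32

GENERIC_READ = 0x80000000
GENERIC_WRITE = 0x40000000
FILE_SHARE_READ = 0x00000001
FILE_SHARE_WRITE = 0x00000002
CONSOLE_TEXTMODE_BUFFER = 1

# Create a console screen buffer and wrap the handle in a CRT descriptor.
handle = kernel32.CreateConsoleScreenBuffer(
    GENERIC_READ | GENERIC_WRITE,
    FILE_SHARE_READ | FILE_SHARE_WRITE,
    None, CONSOLE_TEXTMODE_BUFFER, None)
fd = msvcrt.open_osfhandle(handle, 0)

# The child sees a console as its stdout, so its CRT should line-buffer.
p = subprocess.Popen('mecab -O wakati', stdout=fd,
                     stdin=subprocess.PIPE, universal_newlines=True)
Reading the screen buffer back (the sliding-window part) is the hard piece, and is left out here.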
Here is a workaround for Windows. This should also be adaptable to other operating systems.
Download a console emulator like ConEmu (https://conemu.github.io/)
Start it instead of mecab as your subprocess.
p = Popen(['conemu'], stdout=PIPE, stdin=PIPE,
          universal_newlines=True, bufsize=1, close_fds=False)
Then send the following as the first input:
mecab -O wakati & exit
You are letting the emulator handle the file output issues for you, the way it normally does when you interact with it manually.
I am still looking into this, but it already looks promising...
The only problem is that ConEmu is a GUI application, so if there is no other way to hook into its input and output, one might have to tweak and rebuild it from source (it's open source). I haven't found any other way, but this should work.
I have asked a question about running it in some sort of console mode here, so you can check that thread as well. The author, Maximus, is on SO...
The code
while True:
    try:
        line = q.get_nowait()
    except Empty:
        pass
    else:
        print(line)
        break
is essentially the same as
print(q.get())
except less efficient because it burns CPU time while waiting. The explicit loop won't make data from the subprocess arrive sooner; it arrives when it arrives.
For dealing with uncooperative binaries I have a few suggestions, from best to worst:
Find a Python library and use that instead. It appears that there's an official Python binding in the MeCab source tree, and I see some prebuilt packages on PyPI. You can also look for a DLL build that you can call with ctypes or another Python FFI (see the first sketch after this list). If that doesn't work...
Find a binary that flushes after each line of output. The most recent Win32 build I found online, v0.98, does flush after each line. Failing that...
Build your own binary that flushes after each line. It should be easy enough to find the main loop and insert a flush call in it. But MeCab seems to explicitly flush already, and git blame says that the flush statement was last changed in 2011, so I'm surprised you ever had this problem and I suspect that there may have just been a bug in your Python code. Failing that...
Process the output asynchronously. If your concern is that you want to deal with the output in parallel with the tokenization for performance reasons, you can mostly do that, after the first 4K. Just do the processing in the second thread instead of stuffing the lines in a queue. If you can't do that...
This is a terrible hack, but it may work in some cases: intersperse your inputs with dummy inputs that produce at least 4K of output. For example, you could output 2047 blank lines after every real input line (2047 CRLFs plus the CRLF from the real output = 4K), or a single line of b'A' * 4092 + b'\r\n', whichever is faster (see the second sketch after this list).
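For the first suggestion, the official binding reduces the whole problem to an in-process call; a minimal sketch, assuming the mecab-python package's Tagger API:
import MeCab

# Tokenize in-process: no pipes, so no buffering problem at all.
tagger = MeCab.Tagger('-Owakati')
print(tagger.parse('foobarbaz'))
And the padding hack from the last suggestion could be wrapped in a helper like this (send_padded is a hypothetical name; the 4 KiB figure comes from the buffering discussion above):
PAD = 'A' * 4092 + '\n'  # one dummy line of roughly 4 KiB

def send_padded(proc, line):
    # Follow each real input line with a dummy line large enough to push
    # the real output through the child's 4 KiB stdout buffer.
    proc.stdin.write(line + '\n')
    proc.stdin.write(PAD)
    proc.stdin.flush()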
Not on this list at all is an approach suggested by the two previous answers: directing the output to a Win32 console and scraping the console. This is a terrible idea because scraping gets you cooked output as a rectangular array of characters. The scraper has no way to know whether two lines were originally one overlong line that wrapped. If it guesses wrong, your outputs will get out of sync with your inputs. It's impossible to work around output buffering in this way if you care at all about the integrity of the output.
I guess the answer, if not the solution, can be found here
https://github.com/ikriv/ConsoleProxy/blob/master/src/Tools/Exec/readme.md
I guess, because I had a similar problem, which I worked around, and could not try this route because this tool is not available for Windows 2003, which is the OS I had to use (in a VM for a legacy application).
I'd like to know if I guessed right.

Python subprocess signal

I would like to establish a very simple communication between two Python scripts. I have decided that the best way to communicate is to have both scripts read from a text file. I would like the main program to wait while the child programs execute.
Normally I would make the main program wait x amount of time and continuously check the text file for an okay flag. However, I have seen people talk about using a signal.
Could someone please give an example of this.
There is the Popen.send_signal() method that allows you to send a signal to a child process.
Here's a sketch that sends SIGINT to a ping subprocess to get the summary in the output on exit:
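import signal
import subprocess
import time

# Run ping (any reachable host works here), let it collect a few replies,
# then interrupt it so it prints its summary statistics before exiting.
p = subprocess.Popen(['ping', 'example.com'], stdout=subprocess.PIPE,
                     universal_newlines=True)
time.sleep(5)
p.send_signal(signal.SIGINT)
output, _ = p.communicate()
print(output)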
You need one process to write and one to read; both processes reading leads to no communication. Signals are used only for special purposes, not for normal inter-process communication. Use something like pipes or sockets. It's not more complicated than files, but much more powerful.
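For example, a pipe between two scripts is only a few lines. A sketch, where child.py is a hypothetical script that reads a line from stdin and prints a reply:
# parent.py: talk to the child over its stdin/stdout pipes.
import subprocess

p = subprocess.Popen(['python', 'child.py'], stdin=subprocess.PIPE,
                     stdout=subprocess.PIPE, universal_newlines=True)
p.stdin.write('ok\n')       # tell the child it may proceed
p.stdin.flush()
print(p.stdout.readline())  # read the child's reply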

Repeatedly interacting with program using subprocess

I'm trying to run a program that requires successive interactions (I have to answer with the strings '0' or '1') from within my Python script.
My code:
from subprocess import Popen, PIPE
command = ['program', '-arg1', 'path/file_to_arg1']
p = Popen(command, stdin=PIPE, stdout=PIPE)
p.communicate('0'.encode())
The last two lines work for the first interaction, but after that the program prints all the following questions to the screen without waiting for their respective inputs. I basically need to answer the first question, wait until the program deals with it and prints the second question, then answer the second question, and so on.
Any ideas?
Thanks!
PS: I'm using Python 3.3.4
The subprocess module is designed for single-use interactions: send input to a process, read the result, and then stop. It is challenging to do an ongoing back-and-forth interaction with a Unix process, where you continue to take turns reading and writing. I recommend using a library built for the task instead of rewriting all the necessary logic from scratch.
There is a classic library named Expect, which works well for interacting with a child process. There is a python implementation named Pexpect (read the docs here). I recommend using Pexpect, or a similar library.
Pexpect works like this:
# spawn a subprocess.
# then wait for expected output from the child process,
# and send additional commands to the child.
child = pexpect.spawnu('ftp ftp.openbsd.org')
child.expect('(?i)name .*: ')
child.sendline('anonymous')
child.expect('(?i)password')
child.sendline('pexpect#sourceforge.net')
child.expect('ftp> ')
child.sendline('cd /pub/OpenBSD/3.7/packages/i386')
child.expect('ftp> ')
For catching stdout in real time from a subprocess, check this thread: catching stdout in realtime from subprocess.

Use python subprocess module like a command line simulator

I am writing a test framework in Python for a command line application. The application will create directories, call other shell scripts in the current directory, and write output to stdout.
I am trying to treat the {Python-subprocess, command line} combo as equivalent to {Selenium, browser}: the first component plays something on the second and checks that the output is expected. I am facing the following problems:
The Popen construct takes a command and returns after that command has completed. What I want is a live handle to the process so I can run further commands + verifications and finally close the shell once done.
I am okay with writing some infrastructure code for achieving this, since we have a lot of command line applications that need testing like this.
Here is some sample code that I am running:
p = subprocess.Popen("/bin/bash", cwd=test_dir)
p.communicate(input="hostname")  # I expect the hostname to be printed out
p.communicate(input="time")      # I expect the current time to be printed out
but the process hangs, or maybe I am doing something wrong. Also, how do I "grab" the output of that subprocess so I can assert that something exists?
subprocess.Popen allows you to continue execution after starting a process. Popen objects expose wait(), poll(), and many other methods for communicating with a child process while it is running. Isn't that what you need?
See Popen constructor and Popen objects description for details.
Here is a small example that runs Bash on Unix systems and executes a command:
from subprocess import Popen, PIPE

p = Popen(['/bin/sh'], stdout=PIPE, stderr=PIPE, stdin=PIPE)
sout, serr = p.communicate('ls\n')
print 'OUT:'
print sout
print 'ERR:'
print serr
UPD: communicate() waits for process termination. If you do not need that, you may use the appropriate pipes directly, though that usually gives you rather ugly code.
UPD2: You updated the question. Yes, you cannot call communicate twice for a single process. You may either give all commands you need to execute in a single call to communicate and check the whole output, or work with pipes (Popen.stdin, Popen.stdout, Popen.stderr). If possible, I strongly recommend the first solution (using communicate).
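For instance, the single-call variant of the example above would be (a sketch reusing the same /bin/sh setup):
p = Popen(['/bin/sh'], stdout=PIPE, stderr=PIPE, stdin=PIPE)
# All the commands go in up front; the whole output comes back at once.
sout, serr = p.communicate('hostname\ndate\n')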
Otherwise you will have to put a command on its input and wait for some time for the desired output. What you need is a non-blocking read to avoid hanging when there is nothing to read. Here is a recipe for emulating non-blocking mode on pipes using threads. The code is ugly and strangely complicated for such a trivial purpose, but that's how it's done.
Another option could be using p.stdout.fileno() in a select.select() call, but that won't work on Windows (on Windows, select operates only on objects originating from WinSock). You may consider it if you are not on Windows.
Instead of using plain subprocess you might find Python sh library very useful:
http://amoffat.github.com/sh/
Here is an example how to build in an asynchronous interaction loop with sh:
http://amoffat.github.com/sh/tutorials/2-interacting_with_processes.html
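For instance, sh can stream a command's output to a callback as it arrives; a sketch based on that tutorial (the _out and _bg keyword arguments are sh's):
import sh

def process_output(line):
    print(line)  # react to each line here

# Run the command in the background, passing each output line to the
# callback as soon as it is produced.
cmd = sh.tail('-f', '/var/log/messages', _out=process_output, _bg=True)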
Another (old) library for solving this problem is pexpect:
http://www.noah.org/wiki/pexpect

Python Printing StdOut As It Received

I'm trying to wrap a simple (Windows) command line tool in a PyQt GUI app that I am writing. The problem I have is that the command line tool writes its progress to stdout (it's a server reset command, so you get "Attempting to stop" and "Restarting" type output).
What I am trying to do is capture the output so I can display it as part of my app. I assumed it would be quite simple to do something like the following :
import os
import subprocess as sub
cmd = "COMMAND LINE APP NAME -ARGS"
proc = sub.Popen(cmd, shell=True, stdout=sub.PIPE).stdout
while 1:
    line = proc.readline()
    if not line:
        break
    print line
This partially works, in that I do get the contents of stdout; but instead of receiving the progress messages as they are sent, I get everything at once when the command line application exits and flushes stdout in one go.
Is there a simple answer?
Interactive communication through stdin/stdout is a common problem.
You're in luck though, with PyQt you can use QProcess, as described here:
http://diotavelli.net/PyQtWiki/Capturing_Output_from_a_Process
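The QProcess route looks roughly like this (a sketch assuming PyQt4, which is what that wiki page targets):
from PyQt4.QtCore import QProcess

proc = QProcess()

def on_output():
    # readAllStandardOutput() returns whatever the child has written so far.
    print(str(proc.readAllStandardOutput()))

# The readyReadStandardOutput signal fires as output arrives, so the GUI
# can update in real time instead of waiting for the process to exit.
proc.readyReadStandardOutput.connect(on_output)
proc.start("COMMAND LINE APP NAME -ARGS")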
Do I understand the question?
I believe you're running something like "echo first; sleep 60; echo second" and you want to see the "first" well ahead of the "second", but they're both spitting out at the same time.
The reason you're having issues is that the operating system stores the output of processes in its memory. It will only take the trouble of sending the output to your program if the buffer has filled, or the other program has ended. So, we need to dig into the O/S and figure out how to tell it "Hey, gimme that!" This is generally known as asynchronous or non-blocking mode.
Luckily someone has done the hard work for us.
This guy has added a send() and recv() method to the python built-in Popen class.
It also looks like he fixed the bugs that people found in the comments.
Try it out:
http://code.activestate.com/recipes/440554/
