I have a script (worker.py) that prints unbuffered output in the form...
1
2
3
.
.
.
n
where n is some constant number of iterations a loop in this script will make. In another script (service_controller.py) I start a number of threads, each of which starts a subprocess using subprocess.Popen(stdout=subprocess.PIPE, ...); Now, in my main thread (service_controller.py) I want to read the output of each thread's worker.py subprocess and use it to calculate an estimate for the time remaining till completion.
I have all of the logic working that reads the stdout from worker.py and determines the last printed number. The problem is that I cannot figure out how to do this in a non-blocking way. If I read a constant bufsize, each read ends up waiting for the same amount of data from each of the workers. I have tried numerous approaches, including fcntl and select + os.read. What is my best option here? I can post my source if needed, but I figured the explanation describes the problem well enough.
Thanks for any help here.
EDIT
Adding sample code
I have a worker that starts a subprocess.
class WorkerThread(threading.Thread):
    def __init__(self):
        self.completed = 0
        self.process = None
        self.lock = threading.RLock()
        threading.Thread.__init__(self)

    def run(self):
        cmd = ["/path/to/script", "arg1", "arg2"]
        self.process = subprocess.Popen(cmd, stdout=subprocess.PIPE, bufsize=1, shell=False)
        #flags = fcntl.fcntl(self.process.stdout, fcntl.F_GETFL)
        #fcntl.fcntl(self.process.stdout.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)

    def get_completed(self):
        self.lock.acquire();
        fd = select.select([self.process.stdout.fileno()], [], [], 5)[0]
        if fd:
            self.data += os.read(fd, 1)
            try:
                self.completed = int(self.data.split("\n")[-2])
            except IndexError:
                pass
        self.lock.release()
        return self.completed
I then have a ThreadManager.
class ThreadManager():
    def __init__(self):
        self.pool = []
        self.running = []
        self.lock = threading.Lock()

    def clean_pool(self, pool):
        for worker in [x for x in pool if not x.isAlive()]:
            worker.join()
            pool.remove(worker)
            del worker
        return pool

    def run(self, concurrent=5):
        while len(self.running) + len(self.pool) > 0:
            self.clean_pool(self.running)
            n = min(max(concurrent - len(self.running), 0), len(self.pool))
            if n > 0:
                for worker in self.pool[0:n]:
                    worker.start()
                self.running.extend(self.pool[0:n])
                del self.pool[0:n]
            time.sleep(.01)
        for worker in self.running + self.pool:
            worker.join()
and some code to run it.
threadManager = ThreadManager()
for i in xrange(0, 5):
    threadManager.pool.append(WorkerThread())
threadManager.run()
I have stripped out a lot of the other code in the hope of pinpointing the issue.
Instead of having your service_controller blocked by I/O access, only the thread's own loop should read the output of the process it controls.
Then you can have a method on the thread object controlling the process that returns the last polled output.
Of course, in that case don't forget to use some locking mechanism to protect the buffer, since it is filled by the thread and read by the method the controller calls.
Your method WorkerThread.run() launches the subprocess and then terminates immediately. run() needs to perform the polling and update WorkerThread.completed until the subprocess completes.
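A minimal sketch of what the two answers above describe, assuming the worker prints one integer per line. The class name ProgressWorker and the inline stand-in for worker.py are hypothetical and only serve the illustration; the blocking read stays inside the thread, and the controller only reads a lock-protected value.

```python
import subprocess
import sys
import threading


class ProgressWorker(threading.Thread):
    """Hypothetical thread that owns one subprocess and tracks its last printed number."""

    def __init__(self, cmd):
        super().__init__()
        self.cmd = cmd
        self._completed = 0
        self._lock = threading.Lock()

    def run(self):
        # The blocking reads happen here, in the worker thread, not in the controller.
        process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE)
        for raw_line in process.stdout:
            text = raw_line.strip()
            if text.isdigit():
                with self._lock:
                    self._completed = int(text)
        process.stdout.close()
        process.wait()

    def get_completed(self):
        # Never blocks on I/O; it only returns the last value the thread recorded.
        with self._lock:
            return self._completed


# Stand-in for worker.py (assumption): prints 1..5 with unbuffered output.
demo_cmd = [sys.executable, "-u", "-c", "for i in range(1, 6): print(i)"]
```

The controller can then start several ProgressWorker(demo_cmd) threads and call get_completed() on each of them as often as it likes to compute its estimate.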
To start, I'm aware this looks like a duplicate. I've been reading:
Python subprocess readlines() hangs
Python Subprocess readline hangs() after reading all input
subprocess readline hangs waiting for EOF
But these options either straight don't work or I can't use them.
The Problem
# Obviously, swap HOSTNAME1 and HOSTNAME2 with something real
cmd = "ssh -N -f -L 1111:<HOSTNAME1>:80 <HOSTNAME2>"
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, env=os.environ)
while True:
    out = p.stdout.readline()
    # Hangs here ^^^^^^^ forever
    out = out.decode('utf-8')
    if out:
        print(out)
    if p.poll() is not None:
        break
My dilemma is that the function calling the subprocess.Popen() is a library function for running bash commands, so it needs to be very generic and has the following restrictions:
Must display output as it comes in; not block and then spam the screen all at once
Can't use multiprocessing in case the parent caller is multiprocessing the library function (Python doesn't allow daemonic processes, such as pool workers, to have child processes)
Can't use signal.SIGALRM for the same reason as multiprocessing; the parent caller may be trying to set their own timeout
Can't use third party non-built-in modules
Threading straight up doesn't work. When the readline() call is in a thread, thread.join(timeout=1) lets the program continue, but ctrl+c doesn't work on it at all, and calling sys.exit() doesn't exit the program, since the thread is still open. And as you know, you can't kill a thread in Python by design.
No manner of bufsize or other subprocess args seems to make a difference; neither does putting readline() in an iterator.
I would have a workable solution if I could kill a thread, but that's super taboo, even though this is definitely a legitimate use case.
I'm open to any ideas.
One option is to use a thread to publish to a queue. Then you can block on the queue with a timeout. You can make the reader thread a daemon so it won't prevent system exit. Here's a sketch:
import subprocess
from threading import Thread
from queue import Queue

def reader(stream, queue):
    while True:
        line = stream.readline()
        queue.put(line)
        if not line:
            break

p = subprocess.Popen(cmd, stdout=subprocess.PIPE, ...)
queue = Queue()
thread = Thread(target=reader, args=(p.stdout, queue))
thread.daemon = True
thread.start()
while True:
    out = queue.get(timeout=1) # timeout is optional
    if not out: # Reached end of stream
        break
    ... # Do whatever with output

# Output stream was closed but process may still be running
p.wait()
Note that you should adapt this answer to your particular use case. For example, you may want to add a way to signal to the reader thread to stop running before reaching the end of stream.
Another option would be to poll the input stream, like in this question: timeout on subprocess readline in python
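A rough sketch of the polling option, assuming a POSIX system where select() works on pipe file descriptors. The helper name iter_lines is hypothetical. It reads the raw descriptor with os.read, so it never blocks inside readline(), and it yields None whenever the timeout passes without data so the caller can check p.poll() and decide whether to give up.

```python
import os
import select
import subprocess
import sys


def iter_lines(process, timeout=1.0):
    """Yield decoded lines from process.stdout; yield None when `timeout` passes with no data.

    Assumes a POSIX system, where select() works on pipe file descriptors.
    """
    fd = process.stdout.fileno()
    pending = b""
    while True:
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:
            yield None  # nothing arrived in time; the caller decides what to do
            continue
        chunk = os.read(fd, 4096)
        if not chunk:  # EOF: every writer has closed the pipe
            if pending:
                yield pending.decode("utf-8")
            return
        pending += chunk
        # Hand out every complete line collected so far.
        while b"\n" in pending:
            line, pending = pending.split(b"\n", 1)
            yield line.decode("utf-8")
```

A caller would loop over iter_lines(p), print every non-None line, and break when it receives None while p.poll() is not None. That covers the case where the command has exited but something else still holds the pipe open.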
I finally got a working solution; the key piece of information I was missing was thread.daemon = True, which @augurar pointed out in their answer.
Setting thread.daemon = True allows the thread to be terminated when the main process terminates, which unblocks my use of a sub-thread to monitor readline().
Here is a sample implementation of my solution; I used a Queue() object to pass strings to the main process, and I implemented a 3 second timer for cases like the original problem I was trying to solve where the subprocess has finished and terminated, but the readline() is hung for some reason.
This also helps avoid a race condition between which thing finishes first.
This works for both Python 2 and 3.
import sys
import threading
import subprocess
from datetime import datetime

try:
    import queue
except ImportError:
    import Queue as queue # Python 2 compatibility

def _monitor_readline(process, q):
    while True:
        bail = True
        if process.poll() is None:
            bail = False
        out = ""
        if sys.version_info[0] >= 3:
            out = process.stdout.readline().decode('utf-8')
        else:
            out = process.stdout.readline()
        q.put(out)
        if q.empty() and bail:
            break

def bash(cmd):
    # Kick off the command
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
    # Create the queue instance
    q = queue.Queue()
    # Kick off the monitoring thread
    thread = threading.Thread(target=_monitor_readline, args=(process, q))
    thread.daemon = True
    thread.start()
    start = datetime.now()
    while True:
        bail = True
        if process.poll() is None:
            bail = False
            # Re-set the thread timer
            start = datetime.now()
        out = ""
        while not q.empty():
            out += q.get()
        if out:
            print(out)
        # In the case where the thread is still alive and reading, and
        # the process has exited and finished, give it up to 3 seconds
        # to finish reading
        if bail and thread.is_alive() and (datetime.now() - start).total_seconds() < 3:
            bail = False
        if bail:
            break

# To demonstrate output in realtime, sleep is called in between these echos
bash("echo lol;sleep 2;echo bbq")
I'm creating a multiprocessing.Queue in Python and adding multiprocessing.Process instances to this Queue.
I would like to add a function call that is executed after every job, which checks if a specific task has succeeded. If so, I would like to empty the Queue and terminate execution.
My Process class is:
class Worker(multiprocessing.Process):
    def __init__(self, queue, check_success=None, directory=None, permit_nonzero=False):
        super(Worker, self).__init__()
        self.check_success = check_success
        self.directory = directory
        self.permit_nonzero = permit_nonzero
        self.queue = queue

    def run(self):
        for job in iter(self.queue.get, None):
            stdout = mbkit.dispatch.cexectools.cexec([job], directory=self.directory, permit_nonzero=self.permit_nonzero)
            with open(job.rsplit('.', 1)[0] + '.log', 'w') as f_out:
                f_out.write(stdout)
            if callable(self.check_success) and self.check_success(job):
                # Terminate all remaining jobs here
                pass
And my Queue is set up here:
class LocalJobServer(object):
    @staticmethod
    def sub(command, check_success=None, directory=None, nproc=1, permit_nonzero=False, time=None, *args, **kwargs):
        if check_success and not callable(check_success):
            msg = "check_success option requires a callable function/object: {0}".format(check_success)
            raise ValueError(msg)
        # Create a new queue
        queue = multiprocessing.Queue()
        # Create workers equivalent to the number of jobs
        workers = []
        for _ in range(nproc):
            wp = Worker(queue, check_success=check_success, directory=directory, permit_nonzero=permit_nonzero)
            wp.start()
            workers.append(wp)
        # Add each command to the queue
        for cmd in command:
            queue.put(cmd, timeout=time)
        # Stop workers from exiting without completion
        for _ in range(nproc):
            queue.put(None)
        for wp in workers:
            wp.join()
The function call mbkit.dispatch.cexectools.cexec() is a wrapper around subprocess.Popen and returns p.stdout.
In the Worker class, I've written the conditional to check if a job succeeded, and tried emptying the remaining jobs in the Queue using a while loop, i.e. my Worker.run() function looked like this:
def run(self):
    for job in iter(self.queue.get, None):
        stdout = mbkit.dispatch.cexectools.cexec([job], directory=self.directory, permit_nonzero=self.permit_nonzero)
        with open(job.rsplit('.', 1)[0] + '.log', 'w') as f_out:
            f_out.write(stdout)
        if callable(self.check_success) and self.check_success(job):
            break
    while not self.queue.empty():
        self.queue.get()
Although this works sometimes, it usually deadlocks and my only option is to Ctrl-C. I am aware that .empty() is unreliable, thus my question.
Any advice on how I can implement such an early termination functionality?
You do not have a deadlock here. This is simply how multiprocessing.Queue behaves: the get method blocks by default, so when you call get on an empty queue, the call stalls, waiting for the next element to become available. Some of your workers stall because, when you use the loop while not self.queue.empty() to empty the queue, you also remove the None sentinels, and the remaining workers then block on the empty Queue, as in this code:
from multiprocessing import Queue
q = Queue()
for e in iter(q.get, None):
    print(e)
To be notified when the queue is empty, you need to use a non-blocking call. You can, for instance, use q.get_nowait, or pass a timeout as in q.get(timeout=1). Both raise a multiprocessing.queues.Empty exception when the queue is empty. So you should replace the for job in iter(...): loop in your Worker with something like:
while not queue.empty():
    try:
        job = queue.get(timeout=.1)
    except multiprocessing.queues.Empty:
        continue
    # Do stuff with your job
If you do not want to be stuck at any point.
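As a small illustration of the timeout-based call, here is a sketch of a helper that collects items until nothing arrives within the timeout. The name drain and the 0.5 second default are assumptions made for this example. It relies only on the timeout and not on empty(), which the question already notes is unreliable.

```python
import multiprocessing
import queue


def drain(q, timeout=0.5):
    """Collect items from q until no item arrives within `timeout` seconds."""
    items = []
    while True:
        try:
            items.append(q.get(timeout=timeout))
        except queue.Empty:
            # multiprocessing queues raise the standard queue.Empty exception,
            # which is also reachable as multiprocessing.queues.Empty.
            return items
```

For example, after putting 0, 1 and 2 on a multiprocessing.Queue(), drain(q) returns them in order and then gives up once the timeout expires instead of blocking forever.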
For the synchronization part, I would recommend using a synchronization primitive such as multiprocessing.Condition or multiprocessing.Event. This is cleaner than a Value, as they are designed for this purpose. Something like this should help:
def run(self):
    while not self.queue.empty():
        try:
            job = self.queue.get(timeout=.1)
        except multiprocessing.queues.Empty:
            continue
        if self.event.is_set():
            continue
        stdout = mbkit.dispatch.cexectools.cexec([job], directory=self.directory, permit_nonzero=self.permit_nonzero)
        with open(job.rsplit('.', 1)[0] + '.log', 'w') as f_out:
            f_out.write(stdout)
        if callable(self.check_success) and self.check_success(job):
            self.event.set()
    print("Worker {} terminated cleanly".format(self.name))
with event = multiprocessing.Event().
Note that it is also possible to use a multiprocessing.Pool to avoid dealing with the queue and the workers yourself. But as you need some synchronization primitive, it might be a bit more complicated to set up. Something like this should work:
def worker(job, success, check_success=None, directory=None, permit_nonzero=False):
    if success.is_set():
        return False
    stdout = mbkit.dispatch.cexectools.cexec([job], directory=directory, permit_nonzero=permit_nonzero)
    with open(job.rsplit('.', 1)[0] + '.log', 'w') as f_out:
        f_out.write(stdout)
    if callable(check_success) and check_success(job):
        success.set()
    return True

# ......
# In the class LocalJobServer
# .....

def sub(command, check_success=None, directory=None, nproc=1, permit_nonzero=False):
    mgr = multiprocessing.Manager()
    success = mgr.Event()
    pool = multiprocessing.Pool(nproc)
    # One argument tuple per command
    run_args = [(cmd, success, check_success, directory, permit_nonzero) for cmd in command]
    result = pool.starmap(worker, run_args)
    pool.close()
    pool.join()
Note here that I use a Manager as you cannot pass multiprocessing.Event directly as arguments. You could also use the arguments initializer and initargs of the Pool to initiate global success event in each worker and avoid relying on the Manager but it is slightly more complicated.
This might not be the optimal solution, and any other suggestion is much appreciated, but I managed to solve the problem as such:
class Worker(multiprocessing.Process):
    """Simple manual worker class to execute jobs in the queue"""

    def __init__(self, queue, success, check_success=None, directory=None, permit_nonzero=False):
        super(Worker, self).__init__()
        self.check_success = check_success
        self.directory = directory
        self.permit_nonzero = permit_nonzero
        self.success = success
        self.queue = queue

    def run(self):
        """Method representing the process's activity"""
        for job in iter(self.queue.get, None):
            if self.success.value:
                continue
            stdout = mbkit.dispatch.cexectools.cexec([job], directory=self.directory, permit_nonzero=self.permit_nonzero)
            with open(job.rsplit('.', 1)[0] + '.log', 'w') as f_out:
                f_out.write(stdout)
            if callable(self.check_success) and self.check_success(job):
                self.success.value = int(True)
            time.sleep(1)

class LocalJobServer(object):
    """A local server to execute jobs via the multiprocessing module"""

    @staticmethod
    def sub(command, check_success=None, directory=None, nproc=1, permit_nonzero=False, time=None, *args, **kwargs):
        if check_success and not callable(check_success):
            msg = "check_success option requires a callable function/object: {0}".format(check_success)
            raise ValueError(msg)
        # Create a new queue
        queue = multiprocessing.Queue()
        success = multiprocessing.Value('i', int(False))
        # Create workers equivalent to the number of jobs
        workers = []
        for _ in range(nproc):
            wp = Worker(queue, success, check_success=check_success, directory=directory, permit_nonzero=permit_nonzero)
            wp.start()
            workers.append(wp)
        # Add each command to the queue
        for cmd in command:
            queue.put(cmd)
        # Stop workers from exiting without completion
        for _ in range(nproc):
            queue.put(None)
        # Start the workers
        for wp in workers:
            wp.join(time)
Basically I'm creating a Value and providing that to each Process. Once a job is marked as successful, this variable gets updated. Each Process checks in if self.success.value: continue whether we have a success and if so, just iterates over the remaining jobs in the Queue until empty.
The time.sleep(1) call is required to account for potential syncing delays amongst the processes. This is certainly not the most efficient approach but it works.
Does
import multiprocessing
import schedule

def worker():
    #do some stuff
    pass

def sched(argv):
    schedule.every(0.01).minutes.do(worker)
    while True:
        schedule.run_pending()

processs = []
..
..
p = multiprocessing.Process(target=sched, args=args)
..
..
processs.append(p)
for p in processs:
    p.terminate()
kill a list of processes gracefully?
If not, what is the simplest way to do it?
The goal is to reload the configuration file into memory, so I would like to kill all children processes and create others instead, those latter will read the new config file.
Edit : Added more code to explain that I am running a while True loop
Edit: This is the new code after @dano's suggestion
def get_config(self):
    from ConfigParser import SafeConfigParser
    ..
    return argv

def sched(self, args, event):
    #schedule instruction:
    schedule.every(0.01).minutes.do(self.worker,args)
    while not event.is_set():
        schedule.run_pending()

def dispatch_processs(self, conf):
    processs = []
    event = multiprocessing.Event()
    for conf in self.get_config():
        process = multiprocessing.Process(target=self.sched,args=( i for i in conf), kwargs={'event' : event})
        processs.append((process, event))
    return processs

def start_process(self, process):
    process.start()

def gracefull_process(self, process):
    process.join()

def main(self):
    while True:
        processs = self.dispatch_processs(self.get_config())
        print ("%s processes running " % len(processs) )
        for process, event in processs:
            self.start_process(process)
            time.sleep(1)
            event.set()
            self.gracefull_process(process)
The good thing about this code is that I can edit the config file and the process will reload its config as well.
The problem is that only the first process runs and the others are ignored.
Edit: This saved my life. Working with while True in sched() is not a good idea, so I set up refresh_time instead
def sched(self, args, event):
schedule.every(0.01).minutes.do(self.worker,args)
for i in range(refresh_time):
schedule.run_pending()
time.sleep(1)
def start_processs(self, processs):
for p,event in processs:
if not p.is_alive():
p.start()
time.sleep(1)
event.set()
self.gracefull_processs(processs)
def gracefull_processs(self, processs):
for p,event in processs:
p.join()
processs = self.dispatch_processs(self.get_config())
self.start_processs(processs)
def main(self):
while True:
processs = self.dispatch_processs(self.get_config())
self.start_processs(processs)
break
print ("Reloading function main")
self.main()
If you don't mind only aborting after worker has completed all of its work, it's very simple to add a multiprocessing.Event to handle exiting gracefully:
import multiprocessing
import schedule

def worker():
    #do some stuff
    pass

def sched(argv, event=None):
    schedule.every(0.01).minutes.do(worker)
    while not event.is_set(): # Run until we're told to shut down.
        schedule.run_pending()

processes = []
..
..
event = multiprocessing.Event()
p = multiprocessing.Process(target=sched, args=args, kwargs={'event' : event})
..
..
processes.append((p, event))

# Tell all processes to shut down
for _, event in processes:
    event.set()

# Now actually wait for them to shut down
for p, _ in processes:
    p.join()
A: No, both .terminate() and the SIG_* methods are rather brutal.
To arrange a graceful end for a process, as described in your post, there should instead be a "soft-signalling" layer that lets both ends send and receive application-level signals without depending on how the O/S interprets them (the O/S knows nothing about your application-level context or the state of the work unit currently being processed).
You may want to read about this soft-signalling approach in the links referred to from https://stackoverflow.com/a/25373416/3666197
No, it doesn't kill a process according to your own definition of gracefully - unless you take some additional steps.
Assuming you're using a unix system (since you mentioned scp), terminate sends a SIGTERM signal to the child process. You can catch this signal in the child process, and act accordingly (wait for scp to finish):
import signal

def on_terminate(signum, stack):
    wait_for_current_scp_operation()

signal.signal(signal.SIGTERM, on_terminate)
Here's a tutorial about handling and sending signals
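To make the SIGTERM idea concrete, here is a self-contained sketch, assuming a POSIX system. It uses subprocess with an inline child script rather than multiprocessing.Process so that it can run on its own. The names CHILD_SOURCE and run_and_terminate are hypothetical, and the sleep stands in for the real unit of work. The handler only sets a flag, and the child's main loop finishes its current step before exiting.

```python
import subprocess
import sys
import textwrap

# Illustrative child program: after SIGTERM it finishes the current step, then exits cleanly.
CHILD_SOURCE = textwrap.dedent("""
    import signal, sys, time

    stop_requested = False

    def on_terminate(signum, frame):
        global stop_requested
        stop_requested = True  # only set a flag; the main loop does the cleanup

    signal.signal(signal.SIGTERM, on_terminate)
    print("ready", flush=True)
    while not stop_requested:
        time.sleep(0.05)  # stands in for one unit of work
    print("cleanup done", flush=True)
    sys.exit(0)
""")


def run_and_terminate():
    child = subprocess.Popen([sys.executable, "-c", CHILD_SOURCE],
                             stdout=subprocess.PIPE, text=True)
    child.stdout.readline()   # wait for "ready", so the handler is installed
    child.terminate()         # sends SIGTERM on POSIX
    output, _ = child.communicate()
    return child.returncode, output
```

With the handler in place, terminate() no longer cuts the child off mid-step: run_and_terminate() returns exit status 0 and the remaining output, which includes the "cleanup done" line.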
My question is hopefully particular enough to not relate to any of the other ones that I've read. I'm wanting to use subprocess and multiprocessing to spawn a bunch of jobs serially and return the return code to me. The problem is that I don't want to wait() so I can spawn the jobs all at once, but I do want to know when it finishes so I can get the return code. I'm having this weird problem where if I poll() the process it won't run. It just hangs out in the activity monitor without running (I'm on a Mac). I thought I could use a watcher thread, but I'm hanging on the q_out.get() which is leading me to believe that maybe I'm filling up the buffer and deadlocking. I'm not sure how to get around this. This is basically what my code looks like. If anyone has any better ideas on how to do this I would be happy to completely change my approach.
def watchJob(p1,out_q):
    while p1.poll() == None:
        pass
    print "Job is done"
    out_q.put(p1.returncode)

def runJob(out_q):
    LOGFILE = open('job_to_run.log','w')
    p1 = Popen(['../../bin/jobexe','job_to_run'], stdout = LOGFILE)
    t = threading.Thread(target=watchJob, args=(p1,out_q))
    t.start()

out_q= Queue()
outlst=[]
for i in range(len(nprocs)):
    proc = Process(target=runJob, args=(out_q,))
    proc.start()
    outlst.append(out_q.get()) # This hangs indefinitely
    proc.join()
You need neither multiprocessing nor threading here. You could run multiple child processes in parallel and collect their statuses all in a single thread:
#!/usr/bin/env python3
from subprocess import Popen

def run(cmd, log_filename):
    with open(log_filename, 'wb', 0) as logfile:
        return Popen(cmd, stdout=logfile)

# start several subprocesses
processes = {run(['echo', c], 'subprocess.%s.log' % c) for c in 'abc'}
# now they all run in parallel
# report as soon as a child process exits
while processes:
    for p in processes:
        if p.poll() is not None:
            processes.remove(p)
            print('{} done, status {}'.format(p.args, p.returncode))
            break
p.args stores cmd in Python 3.3+, keep track of cmd yourself on earlier Python versions.
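One way to keep track of cmd yourself is to map each Popen object to its command. The sketch below is a variation on the loop above, not the answer's exact code: the name run_all and the poll_interval parameter are assumptions, and a short sleep is added so that the loop does not spin at full speed.

```python
import subprocess
import sys
import time


def run_all(commands, poll_interval=0.05):
    """Start every command at once; return {command: exit status} once all have finished."""
    # Remember which command each Popen object belongs to.
    running = {subprocess.Popen(list(cmd)): tuple(cmd) for cmd in commands}
    statuses = {}
    while running:
        for proc in list(running):          # iterate over a copy so entries can be removed
            if proc.poll() is not None:
                statuses[running.pop(proc)] = proc.returncode
        time.sleep(poll_interval)           # avoid a busy loop
    return statuses
```

Calling run_all with two commands such as (sys.executable, "-c", "import sys; sys.exit(3)") yields a dictionary keyed by the command tuples, so each exit status can be matched to the command that produced it on any Python version.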
See also:
Python threading multiple bash subprocesses?
Python subprocess in parallel
Python: execute cat subprocess in parallel
Using Python's Multiprocessing module to execute simultaneous and separate SEAWAT/MODFLOW model runs
To limit number of parallel jobs a ThreadPool could be used (as shown in the first link):
#!/usr/bin/env python3
from multiprocessing.dummy import Pool # use threads
from subprocess import Popen

def run_until_done(args):
    cmd, log_filename = args
    try:
        with open(log_filename, 'wb', 0) as logfile:
            p = Popen(cmd, stdout=logfile)
        return cmd, p.wait(), None
    except Exception as e:
        return cmd, None, str(e)

commands = ((('echo', str(d)), 'subprocess.%03d.log' % d) for d in range(500))
pool = Pool(128) # 128 concurrent commands at a time
for cmd, status, error in pool.imap_unordered(run_until_done, commands):
    if error is None:
        fmt = '{cmd} done, status {status}'
    else:
        fmt = 'failed to run {cmd}, reason: {error}'
    print(fmt.format_map(vars())) # or fmt.format(**vars()) on older versions
The thread pool in the example has 128 threads (no more, no less). It can't execute more than 128 jobs concurrently. As soon as any of the threads frees (done with a job), it takes another, etc. Total number of jobs that is executed concurrently is limited by the number of threads. New job doesn't wait for all 128 previous jobs to finish. It is started when any of the old jobs is done.
If you're going to run watchJob in a thread, there's no reason to busy-loop with p1.poll; just call p1.wait() to block until the process finishes. Using the busy loop requires the GIL to constantly be released/re-acquired, which slows down the main thread, and also pegs the CPU, which hurts performance even more.
Also, if you're not using the stdout of the child process, you shouldn't send it to PIPE, because that could cause a deadlock if the process writes enough data to the stdout buffer to fill it up (which may actually be what's happening in your case). There's also no need to use multiprocessing here; just call Popen in the main thread, and then have the watchJob thread wait on the process to finish.
import threading
from subprocess import Popen
from Queue import Queue

def watchJob(p1, out_q):
    p1.wait()
    out_q.put(p1.returncode)

out_q = Queue()
outlst=[]
p1 = Popen(['../../bin/jobexe','job_to_run'])
t = threading.Thread(target=watchJob, args=(p1,out_q))
t.start()
outlst.append(out_q.get())
t.join()
Edit:
Here's how to run multiple jobs concurrently this way:
out_q = Queue()
outlst = []
threads = []
num_jobs = 3
for _ in range(num_jobs):
    p = Popen(['../../bin/jobexe','job_to_run'])
    t = threading.Thread(target=watchJob, args=(p, out_q))
    t.start()
    threads.append(t)
    # Don't consume from the queue yet.

# All jobs are running, so now we can start
# consuming results from the queue.
for _ in range(num_jobs):
    outlst.append(out_q.get())
# Wait for every watcher thread, not only the last one.
for t in threads:
    t.join()
I've got a program on Windows that calls a bunch of subprocesses, and displays the results in a GUI. I'm using PyQt for the GUI, and the subprocess module to run the programs.
I've got the following WorkerThread, that spawns a subthread for each shell command devoted to reading the process stdout and printing the results (later I'll wire it up to the GUI).
This all works. Except proc.stdout.read(1) never returns until after the subprocess has completed. This is a big problem, since some of these subprocesses can take 15-20 minutes to run, and I need to display results as they're running.
What do I need to do to get the pipe working while the subprocess is running?
class WorkerThread(QtCore.QThread):
    def run(self):
        def sh(cmd, cwd = None):
            proc = subprocess.Popen(cmd,
                                    shell = True,
                                    stdout = subprocess.PIPE,
                                    stderr = subprocess.STDOUT,
                                    stdin = subprocess.PIPE,
                                    cwd = cwd,
                                    env = os.environ)
            proc.stdin.close()

            class ReadStdOutThread(QtCore.QThread):
                def run(_self):
                    s = ''
                    while True:
                        if self.request_exit: return
                        b = proc.stdout.read(1)
                        if b == '\n':
                            print s
                            s = ''
                            continue
                        if b:
                            s += b
                            continue
                        if s: print s
                        return

            thread = ReadStdOutThread()
            thread.start()
            retcode = proc.wait()
            if retcode:
                raise subprocess.CalledProcessError(retcode, cmd)
            return 0
FWIW: I rewrote the whole thing using QProcess, and I see the exact same problem. The stdout receives no data, until the underlying process has returned. Then I get everything all at once.
If you know how long the lines of the command's output will be, you can poll the stdout PIPE of the process.
An example of what I mean:
import select
import subprocess
import threading
import os

# Some time consuming command.
command = 'while [ 1 ]; do sleep 1; echo "Testing"; done'

# A worker thread, not as complex as yours, just to show my point.
class Worker(threading.Thread):
    def __init__(self):
        super(Worker, self).__init__()
        self.proc = subprocess.Popen(
            command, shell=True,
            stdout=subprocess.PIPE,
            stdin=subprocess.PIPE, stderr=subprocess.STDOUT
        )

    def run(self):
        self.proc.communicate()

    def get_proc(self):
        # The proc is needed later to ask it for its
        # output file descriptor.
        return self.proc

if __name__ == '__main__':
    w = Worker()
    w.start()
    proc = w.get_proc()
    pollin = select.poll()
    pollin.register(proc.stdout, select.POLLIN)
    while ( 1 ):
        events = pollin.poll()
        for fd, event in events:
            if event == select.POLLIN:
                # This is the main issue with my idea:
                # if you don't know the length of the lines
                # that the process outputs, this is a problem.
                # I put 7 since I know the word "Testing" has
                # 7 characters.
                print os.read(fd, 7)
Maybe this is not exactly what you're looking for, but I think it gives you a pretty good idea of what to do to solve your problem.
EDIT: I think I've just found what you need: Streaming stdout from a Python subprocess in Python.
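One frequent cause of this symptom, offered here as an assumption and not as something the thread confirms, is that the child program block-buffers its own stdout when it is connected to a pipe. If the child happens to be a Python script, starting it with -u (or flushing after each print) makes the lines arrive as they are written. A small sketch with an illustrative inline child; the names CHILD and first_line_arrives_early are hypothetical:

```python
import subprocess
import sys

# Illustrative child: prints a line, waits a second, then prints another.
CHILD = "import time; print('first'); time.sleep(1); print('second')"


def first_line_arrives_early():
    # -u asks the child interpreter not to buffer its stdout.
    proc = subprocess.Popen([sys.executable, "-u", "-c", CHILD],
                            stdout=subprocess.PIPE)
    first = proc.stdout.readline()
    still_running = proc.poll() is None  # True if the line arrived before the child exited
    rest = proc.stdout.read()
    proc.wait()
    return first.strip(), still_running, rest.strip()
```

For children that are not Python programs, the equivalent fix has to come from the child itself (flushing its output or an option that disables buffering), since the parent cannot force it through the pipe.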