Precarious Popen Piping - python

I want to use subprocess.Popen to run a process, with the following requirements.
I want to pipe the stdout and stderr back to the caller of Popen as the process runs.
I want to kill the process after timeout seconds if it is still running.
I have come to the conclusion that a flaw in the subprocess API means it cannot fulfill these two requirements at the same time. Consider the following toy programs:
chatty.py
while True:
    print 'Hi'
silence.py
while True:
    pass
caller.py
import subprocess
import time
def go(command, timeout=60):
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    start = time.time()
    while proc.poll() is None:
        print proc.stdout.read(1024) # <----- Line of interest
        if time.time() - start >= timeout:
            proc.kill()
            break
        else:
            time.sleep(1)
Consider the marked line above.
If it is included, go('python silence.py') will hang forever - not for just 60 seconds - because read is a blocking call until either 1024 bytes or end of stream, and neither ever comes.
If it is commented out, go('python chatty.py') will be printing out 'Hi' over and over, but how can that output be streamed back as it is generated? proc.communicate() blocks until end of stream.
I would be happy with a solution that replaces requirement (1) above with "In the case where a timeout did not occur, I want to get stdout and stderr once the algorithm finishes." Even this has been problematic. My implementation attempt is below.
speech.py
for i in xrange(0, 10000):
    print 'Hi'
caller2.py
import subprocess
import time
def go2(command, timeout=60):
    proc = subprocess.Popen(command, shell=True,
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    start = time.time()
    while True:
        if proc.poll() is not None:
            print proc.communicate()
            break
        elif time.time() - start >= timeout:
            proc.kill()
            break
        else:
            time.sleep(1)
But even this still has problems. Even though python speech.py runs in just a couple of seconds, go2('python speech.py') takes the full 60 seconds. This is because print 'Hi' in speech.py blocks once the pipe buffer fills up: nothing reads from the pipe until proc.communicate() is called, and go2 only calls it after the process has exited. Since proc.stdout.read had the problem demonstrated before with silence.py, I'm really at a loss for how to get this working.
How can I get both the stdout and stderr and the timeout behavior?

The trick is to set up a side-band timer to kill the process. I wrote a program (longtime.py) halfway between chatty and silent:
import time
import sys
for i in range(10,0,-1):
    print i
    time.sleep(1)
And then a program to kill it early:
import subprocess as subp
import threading
import signal
proc = subp.Popen(['python', 'longtime.py'], stdout=subp.PIPE,
                  stderr=subp.PIPE)
timer = threading.Timer(3, lambda proc: proc.send_signal(signal.SIGINT),
                        args=(proc,))
timer.start()
out, err = proc.communicate()
timer.cancel()
print proc.returncode
print out
print err
and it output:
$ python killer.py
1
10
9
8
Traceback (most recent call last):
  File "longtime.py", line 6, in <module>
    time.sleep(1)
KeyboardInterrupt
Your timer could be made fancier, for example by sending increasingly forceful signals until the process exits, but you get the idea.
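The answer above collects the output only at the end, via communicate(). A minimal sketch of how the same side-band timer might be combined with streaming, assuming Python 3.7+ and that merging stderr into stdout is acceptable (the function name run_streaming is made up for illustration):

```python
import subprocess
import sys
import threading

def run_streaming(cmd, timeout):
    """Echo the child's output line by line; kill it after `timeout` seconds."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so a single reader drains both
        text=True,
    )
    timer = threading.Timer(timeout, proc.kill)  # side-band timer
    timer.start()
    lines = []
    try:
        # The loop ends at end of stream, i.e. once the child exits or is killed.
        for line in proc.stdout:
            lines.append(line)
            sys.stdout.write(line)
    finally:
        timer.cancel()
        proc.stdout.close()
    return proc.wait(), lines
```

The blocking read is no longer a problem because the timer runs in its own thread: killing the child closes its end of the pipe, which ends the loop. Note that a Python child only delivers lines promptly if it flushes its output, for example when started with python -u.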

Related

Non-blocking read from subprocess.Popen fails if the process exits fast

I followed the accepted answer to the question "A non-blocking read on a subprocess.PIPE in Python" to read from a subprocess without blocking. This generally works fine, except when the process I call terminates quickly.
This is on Windows.
To illustrate, I have a bat file that simply writes one line to stdout:
test.bat:
@ECHO OFF
ECHO Fast termination
And here is the Python code, adapted from the above-mentioned answer:
from subprocess import PIPE, Popen
from threading import Thread
from queue import Queue, Empty
def enqueue_output(out, queue):
    for line in iter(out.readline, b''):
        queue.put(line)
    out.close()
p = Popen(['test.bat'], stdout=PIPE, bufsize=-1,
          text=True)
q = Queue()
t = Thread(target=enqueue_output, args=(p.stdout, q))
t.daemon = True # thread dies with the program
t.start()
output = str()
while True:
    try:
        line = q.get_nowait()
    except Empty:
        line = ""
    output += line
    if p.poll() is not None:
        break
print(output)
Sometimes the line from the bat file is correctly captured and printed; sometimes nothing is captured and printed. I suspect that the subprocess might finish before the thread connects the queue to the pipe, and then it doesn't read anything. If I add a wait of 2 seconds in the bat file before echoing the line, it seems to always work. Likewise, the failure can be forced by adding a short sleep after the Popen call in the Python code. Is there a way to reliably capture the output of the subprocess even if it finishes immediately, while still doing a non-blocking read?
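One likely cause is that the main loop stops as soon as poll() reports an exit code, which can happen before the reader thread has put the last line on the queue. A minimal sketch of one way to close that gap, assuming it is acceptable to join the reader thread once the child has exited (the helper name capture is made up; the sentinel is '' because text=True makes readline return str, not bytes):

```python
import subprocess
import sys
from queue import Empty, Queue
from threading import Thread

def enqueue_output(out, queue):
    # In text mode, readline() returns '' (not b'') at end of stream.
    for line in iter(out.readline, ""):
        queue.put(line)
    out.close()

def capture(cmd):
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    q = Queue()
    t = Thread(target=enqueue_output, args=(p.stdout, q), daemon=True)
    t.start()
    output = ""
    while p.poll() is None:
        try:
            output += q.get(timeout=0.1)
        except Empty:
            pass  # nothing yet; other work could be done here
    t.join()  # the reader hits end of stream once the child has exited
    while True:  # drain lines that arrived after the last get()
        try:
            output += q.get_nowait()
        except Empty:
            break
    return output
```

The main loop still never blocks on the pipe itself; only the final join() waits, and by then the child is gone so the reader finishes promptly.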

Python Subprocess readline() hangs; can't use normal options

To start, I'm aware this looks like a duplicate. I've been reading:
Python subprocess readlines() hangs
Python Subprocess readline hangs() after reading all input
subprocess readline hangs waiting for EOF
But these options either straight don't work or I can't use them.
The Problem
# Obviously, swap HOSTNAME1 and HOSTNAME2 with something real
cmd = "ssh -N -f -L 1111:<HOSTNAME1>:80 <HOSTNAME2>"
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, env=os.environ)
while True:
    out = p.stdout.readline()
    # Hangs here ^^^^^^^ forever
    out = out.decode('utf-8')
    if out:
        print(out)
    if p.poll() is not None:
        break
My dilemma is that the function calling the subprocess.Popen() is a library function for running bash commands, so it needs to be very generic and has the following restrictions:
Must display output as it comes in; not block and then spam the screen all at once
Can't use multiprocessing in case the parent caller is multiprocessing the library function (Python doesn't allow child processes to have child processes)
Can't use signal.SIGALRM for the same reason as multiprocessing; the parent caller may be trying to set their own timeout
Can't use third party non-built-in modules
Threading straight up doesn't work. When the readline() call is in a thread, thread.join(timeout=1) lets the program continue, but ctrl+c doesn't work on it at all, and calling sys.exit() doesn't exit the program, since the thread is still open. And as you know, you can't kill a thread in Python by design.
No manner of bufsize or other subprocess args seems to make a difference; neither does putting readline() in an iterator.
I would have a workable solution if I could kill a thread, but that's super taboo, even though this is definitely a legitimate use case.
I'm open to any ideas.
One option is to use a thread to publish to a queue. Then you can block on the queue with a timeout. You can make the reader thread a daemon so it won't prevent system exit. Here's a sketch:
import subprocess
from threading import Thread
from queue import Queue
def reader(stream, queue):
    while True:
        line = stream.readline()
        queue.put(line)
        if not line:
            break
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, ...)
queue = Queue()
thread = Thread(target=reader, args=(p.stdout, queue))
thread.daemon = True
thread.start()
while True:
    out = queue.get(timeout=1) # timeout is optional
    if not out: # Reached end of stream
        break
    ... # Do whatever with output
# Output stream was closed but process may still be running
p.wait()
Note that you should adapt this answer to your particular use case. For example, you may want to add a way to signal to the reader thread to stop running before reaching the end of stream.
Another option would be to poll the input stream, like in this question: timeout on subprocess readline in python
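For the polling option, a rough sketch using the standard selectors module might look like the following. It assumes a POSIX system (selecting on pipes does not work on Windows), and the function name read_until_idle and the idle-timeout policy are illustrative choices, not something taken from the linked question:

```python
import os
import selectors
import subprocess
import sys

def read_until_idle(cmd, idle_timeout):
    """Collect output; kill the child if it is silent for idle_timeout seconds."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    sel = selectors.DefaultSelector()
    sel.register(proc.stdout, selectors.EVENT_READ)
    chunks = []
    try:
        while True:
            if not sel.select(timeout=idle_timeout):
                proc.kill()  # nothing arrived in time
                break
            # os.read returns whatever is available instead of waiting for a full line.
            data = os.read(proc.stdout.fileno(), 4096)
            if not data:  # end of stream
                break
            chunks.append(data)
            sys.stdout.write(data.decode("utf-8", errors="replace"))
    finally:
        sel.close()
        proc.stdout.close()
    return proc.wait(), b"".join(chunks)
```

This needs no threads, signals, or third-party modules, and the output is shown as it arrives. Reading with os.read on the raw file descriptor avoids mixing select() with Python's buffered readline().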
I finally got a working solution; the key piece of information I was missing was thread.daemon = True, which @augurar pointed out in their answer.
Setting thread.daemon = True allows the thread to be terminated when the main process terminates; therefore unblocking my use of a sub-thread to monitor readline().
Here is a sample implementation of my solution; I used a Queue() object to pass strings to the main process, and I implemented a 3 second timer for cases like the original problem I was trying to solve where the subprocess has finished and terminated, but the readline() is hung for some reason.
This also helps avoid a race condition between which thing finishes first.
This works for both Python 2 and 3.
import sys
import threading
import subprocess
from datetime import datetime
try:
    import queue
except ImportError:
    import Queue as queue # Python 2 compatibility
def _monitor_readline(process, q):
    while True:
        bail = True
        if process.poll() is None:
            bail = False
        out = ""
        if sys.version_info[0] >= 3:
            out = process.stdout.readline().decode('utf-8')
        else:
            out = process.stdout.readline()
        q.put(out)
        if q.empty() and bail:
            break
def bash(cmd):
    # Kick off the command
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
    # Create the queue instance
    q = queue.Queue()
    # Kick off the monitoring thread
    thread = threading.Thread(target=_monitor_readline, args=(process, q))
    thread.daemon = True
    thread.start()
    start = datetime.now()
    while True:
        bail = True
        if process.poll() is None:
            bail = False
            # Re-set the thread timer
            start = datetime.now()
        out = ""
        while not q.empty():
            out += q.get()
        if out:
            print(out)
        # In the case where the thread is still alive and reading, and
        # the process has exited and finished, give it up to 3 seconds
        # to finish reading
        if bail and thread.is_alive() and (datetime.now() - start).total_seconds() < 3:
            bail = False
        if bail:
            break
# To demonstrate output in realtime, sleep is called in between these echos
bash("echo lol;sleep 2;echo bbq")

how to handle the commands that are hung indefinitely [duplicate]

Is there any argument or options to setup a timeout for Python's subprocess.Popen method?
Something like this:
subprocess.Popen(['..'], ..., timeout=20) ?
I would advise taking a look at the Timer class in the threading module. I used it to implement a timeout for a Popen.
First, create a callback:
def timeout( p ):
    if p.poll() is None:
        print 'Error: process taking too long to complete--terminating'
        p.kill()
Then open the process:
proc = Popen( ... )
Then create a timer that will call the callback, passing the process to it.
t = threading.Timer( 10.0, timeout, [proc] )
t.start()
t.join()
Somewhere later in the program, you may want to add the line:
t.cancel()
Otherwise, the python program will keep running until the timer has finished running.
EDIT: I was advised that there is a race condition that the subprocess p may terminate between the p.poll() and p.kill() calls. I believe the following code can fix that:
import errno
def timeout( p ):
    if p.poll() is None:
        try:
            p.kill()
            print 'Error: process taking too long to complete--terminating'
        except OSError as e:
            if e.errno != errno.ESRCH:
                raise
Though you may want to clean the exception handling to specifically handle just the particular exception that occurs when the subprocess has already terminated normally.
subprocess.Popen doesn't block so you can do something like this:
import time
p = subprocess.Popen(['...'])
time.sleep(20)
if p.poll() is None:
    p.kill()
    print 'timed out'
else:
    print p.communicate()
It has a drawback in that you must always wait at least 20 seconds for it to finish.
import subprocess, threading
class Command(object):
    def __init__(self, cmd):
        self.cmd = cmd
        self.process = None
    def run(self, timeout):
        def target():
            print 'Thread started'
            self.process = subprocess.Popen(self.cmd, shell=True)
            self.process.communicate()
            print 'Thread finished'
        thread = threading.Thread(target=target)
        thread.start()
        thread.join(timeout)
        if thread.is_alive():
            print 'Terminating process'
            self.process.terminate()
            thread.join()
        print self.process.returncode
command = Command("echo 'Process started'; sleep 2; echo 'Process finished'")
command.run(timeout=3)
command.run(timeout=1)
The output of this should be:
Thread started
Process started
Process finished
Thread finished
0
Thread started
Process started
Terminating process
Thread finished
-15
where it can be seen that, in the first execution, the process finished correctly (return code 0), while in the second one the process was terminated (return code -15).
I haven't tested this on Windows; but, aside from updating the example command, I think it should work, since I haven't found anything in the documentation that says thread.join or process.terminate is not supported.
You could do
from twisted.internet import reactor, protocol, error, defer
class DyingProcessProtocol(protocol.ProcessProtocol):
    def __init__(self, timeout):
        self.timeout = timeout
    def connectionMade(self):
        @defer.inlineCallbacks
        def killIfAlive():
            try:
                yield self.transport.signalProcess('KILL')
            except error.ProcessExitedAlready:
                pass
        d = reactor.callLater(self.timeout, killIfAlive)
reactor.spawnProcess(DyingProcessProtocol(20), ...)
using Twisted's asynchronous process API.
A python subprocess auto-timeout is not built in, so you're going to have to build your own.
This works for me on Ubuntu 12.10 running python 2.7.3
Put this in a file called test.py
#!/usr/bin/python
import subprocess
import threading
class RunMyCmd(threading.Thread):
    def __init__(self, cmd, timeout):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.timeout = timeout
    def run(self):
        self.p = subprocess.Popen(self.cmd)
        self.p.wait()
    def run_the_process(self):
        self.start()
        self.join(self.timeout)
        if self.is_alive():
            self.p.terminate() #if your process needs a kill -9 to make
                               #it go away, use self.p.kill() here instead.
            self.join()
RunMyCmd(["sleep", "20"], 3).run_the_process()
Save it, and run it:
python test.py
The sleep 20 command takes 20 seconds to complete. If it doesn't terminate in 3 seconds (it won't) then the process is terminated.
el@apollo:~$ python test.py
el@apollo:~$
There are three seconds between when the process is started and when it is terminated.
As of Python 3.3, there is also a timeout argument to the blocking helper functions in the subprocess module.
https://docs.python.org/3/library/subprocess.html
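As a small illustration of that timeout argument, here is a sketch using subprocess.run, which requires Python 3.5 or later. The wrapper name run_with_timeout and its return convention are made up for this example:

```python
import subprocess
import sys

def run_with_timeout(cmd, timeout):
    """Return (returncode, output), or (None, partial_output) on timeout."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            timeout=timeout,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        # run() has already killed and reaped the child at this point.
        # exc.output holds whatever was captured before the timeout; it may be None.
        return None, exc.output
```

On older 3.x versions, Popen.communicate(timeout=...) and Popen.wait(timeout=...) raise the same TimeoutExpired exception, but there the caller is responsible for killing the child.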
Unfortunately, there isn't such a solution. I managed to do this using a threaded timer that would launch along with the process that would kill it after the timeout but I did run into some stale file descriptor issues because of zombie processes or some such.
No there is no time out. I guess, what you are looking for is to kill the sub process after some time. Since you are able to signal the subprocess, you should be able to kill it too.
generic approach to sending a signal to subprocess:
proc = subprocess.Popen([command])
time.sleep(1)
print 'signaling child'
sys.stdout.flush()
os.kill(proc.pid, signal.SIGUSR1)
You could use this mechanism to terminate after a time out period.
Yes, https://pypi.python.org/pypi/python-subprocess2 will extend the Popen module with two additional functions,
Popen.waitUpTo(timeout=seconds)
This will wait up to a certain number of seconds for the process to complete, otherwise return None
also,
Popen.waitOrTerminate
This will wait up to a point, and then call .terminate(), then .kill(), one or the other or some combination of both; see the docs for full details:
http://htmlpreview.github.io/?https://github.com/kata198/python-subprocess2/blob/master/doc/subprocess2.html
For Linux, you can use a signal. This is platform dependent so another solution is required for Windows. It may work with Mac though.
def launch_cmd(cmd, timeout=0):
    '''Launch an external command
    It launches the program, redirecting the program's STDIO
    to a communication pipe, and appends those responses to
    a list. Waits for the program to exit, then returns the
    output lines.
    Args:
        cmd: command line of the external program to launch
        timeout: time to wait for the command to complete, 0 for indefinitely
    Returns:
        A list of the response lines from the program
    '''
    import subprocess
    import signal
    class Alarm(Exception):
        pass
    def alarm_handler(signum, frame):
        raise Alarm
    lines = []
    if not launch_cmd.init:
        launch_cmd.init = True
        signal.signal(signal.SIGALRM, alarm_handler)
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    signal.alarm(timeout) # timeout sec
    try:
        for line in p.stdout:
            lines.append(line.rstrip())
        p.wait()
        signal.alarm(0) # disable alarm
    except:
        print "launch_cmd taking too long!"
        p.kill()
    return lines
launch_cmd.init = False

showing progress while spawning and running subprocess

I need to show some progress bar or something while spawning and running subprocess.
How can I do that with python?
import subprocess
cmd = ['python','wait.py']
p = subprocess.Popen(cmd, bufsize=1024,stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
p.stdin.close()
outputmessage = p.stdout.read() #This will print the standard output from the spawned process
message = p.stderr.read()
I could spawn subprocess with this code, but I need to print out something when each second is passing.
Since the subprocess call is blocking, one way to print something out while waiting would be to use multithreading. Here's an example using threading._Timer (in Python 3 the class is named threading.Timer):
import threading
import subprocess
class RepeatingTimer(threading._Timer):
    def run(self):
        while True:
            self.finished.wait(self.interval)
            if self.finished.is_set():
                return
            else:
                self.function(*self.args, **self.kwargs)
def status():
    print "I'm alive"
timer = RepeatingTimer(1.0, status)
timer.daemon = True # Allows program to exit if only the thread is alive
timer.start()
proc = subprocess.Popen([ '/bin/sleep', "5" ])
proc.wait()
timer.cancel()
On an unrelated note, calling stdout.read() while using multiple pipes can lead to deadlock. The Popen.communicate() method should be used instead.
As far as I see it all you need to do is put those reads in a loop with a delay and a print - does it have to be precisely a second or around about a second?
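On Python 3.3 or later, a loop of that kind can be sketched without threads by waiting on the process in short intervals. The function name run_with_progress and the progress message below are illustrative only:

```python
import subprocess
import sys

def run_with_progress(cmd, interval=1.0):
    """Run cmd, printing a progress line every `interval` seconds until it exits."""
    proc = subprocess.Popen(cmd)
    ticks = 0
    while True:
        try:
            # wait() returns the exit code as soon as the child finishes.
            return proc.wait(timeout=interval), ticks
        except subprocess.TimeoutExpired:
            ticks += 1
            print("still running, tick {}".format(ticks))
```

This sketch leaves the child's stdout and stderr attached to the terminal. If they are redirected to pipes instead, something must keep reading them while the loop runs, otherwise a chatty child can fill the pipe and stall.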

How to limit program's execution time when using subprocess?

I want to use subprocess to run a program and I need to limit the execution time. For example, I want to kill it if it runs for more than 2 seconds.
For common programs, kill() works well. But if I try to run /usr/bin/time something, kill() can’t really kill the program.
My code below doesn't seem to work. The program is still running.
import subprocess
import time
exec_proc = subprocess.Popen("/usr/bin/time -f \"%e\\n%M\" ./son > /dev/null", stdout = subprocess.PIPE, stderr = subprocess.STDOUT, shell = True)
max_time = 1
cur_time = 0.0
return_code = 0
while cur_time <= max_time:
    if exec_proc.poll() != None:
        return_code = exec_proc.poll()
        break
    time.sleep(0.1)
    cur_time += 0.1
if cur_time > max_time:
    exec_proc.kill()
If you're using Python 2.6 or later, you can use the multiprocessing module.
from multiprocessing import Process
def f():
    # Stuff to run your process here
    pass
p = Process(target=f)
p.start()
p.join(timeout)
if p.is_alive():
    p.terminate()
Actually, multiprocessing is the wrong module for this task since it is just a way to control how long a thread runs. You have no control over any children the thread may run. As singularity suggests, using signal.alarm is the normal approach.
import signal
import subprocess
def handle_alarm(signum, frame):
    # If the alarm is triggered, we're still in the exec_proc.communicate()
    # call, so use exec_proc.kill() to end the process.
    frame.f_locals['self'].kill()
max_time = ...
stdout = stderr = None
signal.signal(signal.SIGALRM, handle_alarm)
exec_proc = subprocess.Popen(['time', 'ping', '-c', '5', 'google.com'],
                             stdin=None, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT)
signal.alarm(max_time)
try:
    (stdout, stderr) = exec_proc.communicate()
except IOError:
    # process was killed due to exceeding the alarm
    pass
finally:
    signal.alarm(0)
# do stuff with stdout/stderr if they're not None
Do it like so on your command line:
perl -e 'alarm shift @ARGV; exec @ARGV' <timeout> <your_command>
This will run the command <your_command> and terminate it after <timeout> seconds.
a dummy example :
# set the timeout to 5, so that the command will be killed after 5 seconds
# (no extra quotes around the script: there is no shell to strip them)
command = ['perl', '-e', 'alarm shift @ARGV; exec @ARGV', "5"]
command += ["ping", "www.google.com"]
exec_proc = subprocess.Popen(command)
or you can use the signal.alarm() if you want it with python but it's the same.
I use os.kill() but am not sure if it works on all OSes.
Pseudo code follows; see also Doug Hellmann's page.
proc = subprocess.Popen(['google-chrome'])
os.kill(proc.pid, signal.SIGUSR1)
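One probable reason kill() appears not to work in the question is that with shell=True the signal goes to the shell only, while /usr/bin/time and the program it launched keep running. A sketch of one way around that on POSIX systems with Python 3.3+, assuming it is fine to put the command in its own process group and kill the whole group (the function name run_shell_limited is hypothetical):

```python
import os
import signal
import subprocess

def run_shell_limited(command, max_time):
    """Run a shell command; kill it and all its children after max_time seconds."""
    proc = subprocess.Popen(
        command,
        shell=True,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        start_new_session=True,  # the shell becomes leader of a new process group
    )
    try:
        out, _ = proc.communicate(timeout=max_time)
        return proc.returncode, out
    except subprocess.TimeoutExpired:
        # For a session leader the process group id equals the pid.
        os.killpg(proc.pid, signal.SIGKILL)
        out, _ = proc.communicate()  # reap the shell and collect remaining output
        return None, out
```

Because every process started by the shell inherits the group, the timing wrapper and the timed program are killed together. This does not carry over to Windows, where process groups work differently.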
