Break loop when child process finishes

I am trying to make a simple program that starts a child process which writes a string to a pipe, while the parent process counts until it gets the string from the pipe. My problem, however, is that when the program runs it either does not count or never stops counting. I want to know how I can check whether the child process is still running and, depending on that, break out of the counting loop.
import os, time
pipein, pipeout = os.pipe()
def child(input, pipeout):
    time.sleep(2)
    msg = ('child got this %s' % input).encode()
    os.write(pipeout, msg)
input = input()
pid = os.fork()
if pid:
    i = 0
    while True:
        print(i)
        time.sleep(1)
        i += 1
        try:
            os.kill(pid, 0)
        except OSError:
            break
    line = os.read(pipein, 32)
    print(line)
else:
    child(input, pipeout)

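You should use the subprocess module; then you can call poll() on the Popen object. It returns None while the child is still running and the exit code once it has terminated:
if proc.poll() is not None:  # proc is the Popen object
    break  # child process has terminated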
[edit]:
"The only way to control the input and output streams and also retrieve the return codes is to use the subprocess module; these are only available on Unix."
Source
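To make the poll() suggestion concrete, here is a minimal sketch of the counting loop rebuilt on subprocess. The function name count_until_child_exits and the CHILD command are made up for illustration; the child is just a Python one-liner standing in for the question's child() function.

```python
import subprocess
import sys
import time

def count_until_child_exits(child_cmd, tick=0.1):
    """Count ticks until the child exits, then return (ticks, child_output)."""
    proc = subprocess.Popen(child_cmd, stdout=subprocess.PIPE)
    ticks = 0
    # poll() returns None while the child runs and its exit code afterwards.
    while proc.poll() is None:
        ticks += 1
        time.sleep(tick)
    # The child has exited, so this read returns what it wrote and then hits EOF.
    output = proc.stdout.read()
    proc.stdout.close()
    return ticks, output

# Hypothetical stand-in for the question's child(): sleep, then write one line.
CHILD = [sys.executable, "-c",
         "import time; time.sleep(0.5); print('child got this hello')"]

if __name__ == "__main__":
    print(count_until_child_exits(CHILD))
```

As far as I can tell, the original loop never stops because os.kill(pid, 0) keeps succeeding for a child that has exited but has not been waited for yet (a zombie). poll() reaps the child, so it does not have that problem. One caveat with this sketch: it only works if the child's output fits in the pipe buffer, since nothing reads the pipe until the child is gone.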

Related

Python Subprocess readline() hangs; can't use normal options

To start, I'm aware this looks like a duplicate. I've been reading:
Python subprocess readlines() hangs
Python Subprocess readline hangs() after reading all input
subprocess readline hangs waiting for EOF
But these options either straight don't work or I can't use them.
The Problem
# Obviously, swap HOSTNAME1 and HOSTNAME2 with something real
cmd = "ssh -N -f -L 1111:<HOSTNAME1>:80 <HOSTNAME2>"
p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, env=os.environ)
while True:
    out = p.stdout.readline()
    # Hangs here ^^^^^^^ forever
    out = out.decode('utf-8')
    if out:
        print(out)
    if p.poll() is not None:
        break
My dilemma is that the function calling the subprocess.Popen() is a library function for running bash commands, so it needs to be very generic and has the following restrictions:
Must display output as it comes in; not block and then spam the screen all at once
Can't use multiprocessing in case the parent caller is multiprocessing the library function (daemonic multiprocessing workers, such as those in a Pool, are not allowed to create child processes)
Can't use signal.SIGALRM for the same reason as multiprocessing; the parent caller may be trying to set their own timeout
Can't use third party non-built-in modules
Threading straight up doesn't work. When the readline() call is in a thread, thread.join(timeout=1) lets the program continue, but ctrl+c doesn't work on it at all, and calling sys.exit() doesn't exit the program, since the thread is still open. And as you know, you can't kill a thread in Python by design.
No manner of bufsize or other subprocess args seems to make a difference; neither does putting readline() in an iterator.
I would have a workable solution if I could kill a thread, but that's super taboo, even though this is definitely a legitimate use case.
I'm open to any ideas.

One option is to use a thread to publish to a queue. Then you can block on the queue with a timeout. You can make the reader thread a daemon so it won't prevent system exit. Here's a sketch:
import subprocess
from threading import Thread
from queue import Queue
def reader(stream, queue):
    while True:
        line = stream.readline()
        queue.put(line)
        if not line:
            break
p = subprocess.Popen(cmd, stdout=subprocess.PIPE, ...)
queue = Queue()
thread = Thread(target=reader, args=(p.stdout, queue))
thread.daemon = True
thread.start()
while True:
    # timeout is optional; with it, get() raises queue.Empty if nothing arrives in time
    out = queue.get(timeout=1)
    if not out: # Reached end of stream
        break
    ... # Do whatever with output
# Output stream was closed but process may still be running
p.wait()
Note that you should adapt this answer to your particular use case. For example, you may want to add a way to signal to the reader thread to stop running before reaching the end of stream.
Another option would be to poll the input stream, like in this question: timeout on subprocess readline in python
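For reference, a runnable variant of the reader-thread idea might look like the following. The names stream_lines and poll_timeout are invented for this sketch, and it differs from the code above in two small ways: it uses None as an explicit end-of-stream sentinel, and it handles the queue.Empty exception that get(timeout=...) raises.

```python
import queue
import subprocess
import sys
import threading

def stream_lines(cmd, poll_timeout=0.2):
    """Run cmd, collect its output line by line, return (lines, returncode)."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    q = queue.Queue()

    def reader():
        # readline() returns b"" at end of stream, which stops iter().
        for line in iter(proc.stdout.readline, b""):
            q.put(line)
        q.put(None)  # sentinel: no more output will arrive

    # Daemon thread, so a stuck readline() cannot keep the interpreter alive.
    threading.Thread(target=reader, daemon=True).start()

    lines = []
    while True:
        try:
            item = q.get(timeout=poll_timeout)
        except queue.Empty:
            continue  # nothing yet; a caller could check a deadline here
        if item is None:
            break
        lines.append(item.decode("utf-8").rstrip("\r\n"))
    proc.wait()
    return lines, proc.returncode

if __name__ == "__main__":
    print(stream_lines([sys.executable, "-c", "print('lol'); print('bbq')"]))
```

The except queue.Empty branch is where a real caller would decide whether to give up, for example when the process has exited but the stream never closes.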

I finally got a working solution; the key piece of information I was missing was thread.daemon = True, which @augurar pointed out in their answer.
Setting thread.daemon = True allows the thread to be terminated when the main process terminates, which removes the obstacle to using a sub-thread to monitor readline().
Here is a sample implementation of my solution; I used a Queue() object to pass strings to the main process, and I implemented a 3 second timer for cases like the original problem I was trying to solve where the subprocess has finished and terminated, but the readline() is hung for some reason.
This also helps avoid a race condition between which thing finishes first.
This works for both Python 2 and 3.
import sys
import threading
import subprocess
from datetime import datetime
try:
    import queue
except ImportError:
    import Queue as queue # Python 2 compatibility
def _monitor_readline(process, q):
    while True:
        bail = True
        if process.poll() is None:
            bail = False
        out = ""
        if sys.version_info[0] >= 3:
            out = process.stdout.readline().decode('utf-8')
        else:
            out = process.stdout.readline()
        q.put(out)
        if q.empty() and bail:
            break
def bash(cmd):
    # Kick off the command
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
    # Create the queue instance
    q = queue.Queue()
    # Kick off the monitoring thread
    thread = threading.Thread(target=_monitor_readline, args=(process, q))
    thread.daemon = True
    thread.start()
    start = datetime.now()
    while True:
        bail = True
        if process.poll() is None:
            bail = False
            # Re-set the thread timer
            start = datetime.now()
        out = ""
        while not q.empty():
            out += q.get()
        if out:
            print(out)
        # In the case where the thread is still alive and reading, and
        # the process has exited and finished, give it up to 3 seconds
        # to finish reading
        if bail and thread.is_alive() and (datetime.now() - start).total_seconds() < 3:
            bail = False
        if bail:
            break
# To demonstrate output in realtime, sleep is called in between these echos
bash("echo lol;sleep 2;echo bbq")

Python subprocess doesn't work without sleep

I'm working on a Python launcher which should execute a few programs in my list by calling subprocess. The code is correct, but it works very strangely.
In short, it doesn't work without some sleep or input command in main.
Here is the example:
import threading
import subprocess
import time
def executeFile(file_path):
    subprocess.call(file_path, shell=True)
def main():
    file = None
    try:
        file = open('./config.ini', 'r');
    except:
        # TODO: add alert widget
        print("cant find a file")
    pathes = [ path.strip() for path in file.readlines() ]
    try:
        for idx in range(len(pathes)):
            print(pathes[idx])
            file_path = pathes[idx];
            newThread = threading.Thread(target=executeFile, args=(file_path,))
            newThread.daemon = True
            newThread.start()
    except:
        print("cant start thread")
if __name__ == '__main__':
    main()
    # IT WORKS WHEN SLEEP EXISTS
    time.sleep(10)
    # OR
    # input("Press enter to exit ;)")
But without input or sleep it doesn't work:
if __name__ == '__main__':
    # Doesn't work
    main()
Could someone please explain why this happens?
I have an idea, but I'm not sure. Maybe it's because subprocess is asynchronous and the program finishes and exits BEFORE the subprocesses get to run.
With sleep or input, the program is suspended and the subprocesses have enough time to execute.
Thanks for any help!

As soon as the last thread is started, your main() returns. That in turn will exit your Python program. That stops all your threads.
From the documentation on daemon threads:
Note: Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
The simple fix would be to not use daemon threads.
As an aside, I would suggest some changes to your loop. First, iterate over pathes directly instead of using indices. Second, catch errors for each thread separately, so one error doesn't leave the remaining files unprocessed.
for path in pathes:
    try:
        print(path)
        newThread = threading.Thread(target=executeFile, args=(path,))
        newThread.start()
    except:
        print("cant start thread for", path)
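If you also want the main program to wait explicitly and collect exit codes, a small sketch along these lines should work. The run_all helper is hypothetical, not something from the question; the point is simply non-daemon threads plus join().

```python
import subprocess
import sys
import threading

def run_all(commands):
    """Run each command in its own non-daemon thread; return exit codes in order."""
    exit_codes = [None] * len(commands)

    def worker(index, cmd):
        # subprocess.call blocks this worker thread until cmd has finished.
        exit_codes[index] = subprocess.call(cmd)

    threads = [threading.Thread(target=worker, args=(i, cmd))
               for i, cmd in enumerate(commands)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for every launched program before returning
    return exit_codes

if __name__ == "__main__":
    demo = [[sys.executable, "-c", "import sys; sys.exit(%d)" % n] for n in (0, 3)]
    print(run_all(demo))
```

Because the threads are not daemonic, the interpreter would wait for them at exit even without the join() calls; joining just makes the waiting explicit and lets you use the results.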
Another option would be to skip threads entirely, and just maintain a list of running subprocesses:
import os
import subprocess
import time
def manageprocs(proclist):
    """Check a list of subprocesses for processes that have
    ended and remove them from the list.
    :param proclist: list of Popen objects
    """
    # Iterate over a copy: removing from a list while iterating it skips items.
    for pr in list(proclist):
        if pr.poll() is not None:
            proclist.remove(pr)
    # since manageprocs is called from a loop,
    # keep CPU usage down.
    time.sleep(0.5)
def main():
    # Read config file
    try:
        with open('./config.ini', 'r') as f:
            pathes = [path.strip() for path in f.readlines()]
    except FileNotFoundError:
        print("cant find config file")
        exit(1)
    # List of subprocesses
    procs = []
    # Do not launch more processes concurrently than your
    # CPU has cores. That will only lead to the processes
    # fighting over CPU resources.
    maxprocs = os.cpu_count()
    # Launch all subprocesses.
    for path in pathes:
        while len(procs) == maxprocs:
            manageprocs(procs)
        procs.append(subprocess.Popen(path, shell=True))
    # Wait for all subprocesses to finish.
    while len(procs) > 0:
        manageprocs(procs)
if __name__ == '__main__':
    main()

how to handle the commands that are hung indefinitely [duplicate]

Is there any argument or options to setup a timeout for Python's subprocess.Popen method?
Something like this:
subprocess.Popen(['..'], ..., timeout=20) ?

I would advise taking a look at the Timer class in the threading module. I used it to implement a timeout for a Popen.
First, create a callback:
def timeout( p ):
    if p.poll() is None:
        print('Error: process taking too long to complete--terminating')
        p.kill()
Then open the process:
proc = Popen( ... )
Then create a timer that will call the callback, passing the process to it.
t = threading.Timer( 10.0, timeout, [proc] )
t.start()
t.join()
Somewhere later in the program, you may want to add the line:
t.cancel()
Otherwise, the python program will keep running until the timer has finished running.
EDIT: I was advised that there is a race condition that the subprocess p may terminate between the p.poll() and p.kill() calls. I believe the following code can fix that:
import errno
def timeout( p ):
    if p.poll() is None:
        try:
            p.kill()
            print('Error: process taking too long to complete--terminating')
        except OSError as e:
            if e.errno != errno.ESRCH:
                raise
Though you may want to clean the exception handling to specifically handle just the particular exception that occurs when the subprocess has already terminated normally.
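Putting the Timer pieces together for Python 3, a self-contained sketch could look like this. The run_with_timer name and its (returncode, timed_out) return value are my own choices for illustration, not part of any library.

```python
import subprocess
import sys
import threading

def run_with_timer(cmd, timeout):
    """Run cmd; kill it after `timeout` seconds. Returns (returncode, timed_out)."""
    proc = subprocess.Popen(cmd)
    timed_out = threading.Event()

    def on_timeout():
        if proc.poll() is None:
            try:
                proc.kill()
                timed_out.set()
            except OSError:
                pass  # the process exited between poll() and kill()

    timer = threading.Timer(timeout, on_timeout)
    timer.start()
    try:
        proc.wait()
    finally:
        timer.cancel()  # harmless if the timer has already fired
    return proc.returncode, timed_out.is_set()

if __name__ == "__main__":
    print(run_with_timer([sys.executable, "-c", "import time; time.sleep(30)"], 0.5))
```

Cancelling the timer in a finally block means the program never has to sit out the rest of the timeout once the process has finished on its own.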

subprocess.Popen doesn't block so you can do something like this:
import subprocess
import time
p = subprocess.Popen(['...'])
time.sleep(20)
if p.poll() is None:
    p.kill()
    print('timed out')
else:
    print(p.communicate())
It has a drawback in that you must always wait at least 20 seconds for it to finish.

import subprocess, threading
class Command(object):
    def __init__(self, cmd):
        self.cmd = cmd
        self.process = None
    def run(self, timeout):
        def target():
            print('Thread started')
            self.process = subprocess.Popen(self.cmd, shell=True)
            self.process.communicate()
            print('Thread finished')
        thread = threading.Thread(target=target)
        thread.start()
        thread.join(timeout)
        if thread.is_alive():
            print('Terminating process')
            self.process.terminate()
            thread.join()
        print(self.process.returncode)
command = Command("echo 'Process started'; sleep 2; echo 'Process finished'")
command.run(timeout=3)
command.run(timeout=1)
The output of this should be:
Thread started
Process started
Process finished
Thread finished
0
Thread started
Process started
Terminating process
Thread finished
-15
where it can be seen that, in the first execution, the process finished correctly (return code 0), while in the second one the process was terminated (return code -15).
I haven't tested in windows; but, aside from updating the example command, I think it should work since I haven't found in the documentation anything that says that thread.join or process.terminate is not supported.

You could do
from twisted.internet import reactor, protocol, error, defer
class DyingProcessProtocol(protocol.ProcessProtocol):
    def __init__(self, timeout):
        self.timeout = timeout
    def connectionMade(self):
        @defer.inlineCallbacks
        def killIfAlive():
            try:
                yield self.transport.signalProcess('KILL')
            except error.ProcessExitedAlready:
                pass
        d = reactor.callLater(self.timeout, killIfAlive)
reactor.spawnProcess(DyingProcessProtocol(20), ...)
using Twisted's asynchronous process API.

A python subprocess auto-timeout is not built in, so you're going to have to build your own.
This works for me on Ubuntu 12.10 running python 2.7.3
Put this in a file called test.py
#!/usr/bin/python
import subprocess
import threading
class RunMyCmd(threading.Thread):
    def __init__(self, cmd, timeout):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.timeout = timeout
    def run(self):
        self.p = subprocess.Popen(self.cmd)
        self.p.wait()
    def run_the_process(self):
        self.start()
        self.join(self.timeout)
        if self.is_alive():
            # If your process needs a kill -9 to make it go away,
            # use self.p.kill() here instead.
            self.p.terminate()
            self.join()
RunMyCmd(["sleep", "20"], 3).run_the_process()
Save it, and run it:
python test.py
The sleep 20 command takes 20 seconds to complete. If it doesn't terminate in 3 seconds (it won't) then the process is terminated.
el@apollo:~$ python test.py
el@apollo:~$
There are three seconds between when the process is started and when it is terminated.

As of Python 3.3, there is also a timeout argument to the blocking helper functions in the subprocess module.
https://docs.python.org/3/library/subprocess.html
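For example, with subprocess.run (available since Python 3.5) the timeout handling shrinks to a try/except. This is only a sketch; run_limited is a made-up wrapper name.

```python
import subprocess
import sys

def run_limited(cmd, timeout):
    """Return (returncode, stdout), or (None, None) if cmd exceeded the timeout."""
    try:
        done = subprocess.run(cmd, stdout=subprocess.PIPE, timeout=timeout)
    except subprocess.TimeoutExpired:
        # run() has already killed and reaped the child at this point.
        return None, None
    return done.returncode, done.stdout

if __name__ == "__main__":
    print(run_limited([sys.executable, "-c", "print('ok')"], 10))
    print(run_limited([sys.executable, "-c", "import time; time.sleep(30)"], 0.5))
```

On Python 3.3 and 3.4, where run() does not exist yet, subprocess.call, check_call, check_output, Popen.wait and Popen.communicate accept the same timeout argument.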

Unfortunately, there isn't such a solution. I managed to do this using a threaded timer that would launch along with the process that would kill it after the timeout but I did run into some stale file descriptor issues because of zombie processes or some such.

No, there is no timeout. I guess what you are looking for is a way to kill the subprocess after some time. Since you are able to signal the subprocess, you should be able to kill it too.
A generic approach to sending a signal to a subprocess:
import os, signal, subprocess, sys, time
proc = subprocess.Popen([command])
time.sleep(1)
print('signaling child')
sys.stdout.flush()
os.kill(proc.pid, signal.SIGUSR1)
You could use this mechanism to terminate after a time out period.

Yes, https://pypi.python.org/pypi/python-subprocess2 will extend the Popen class with two additional functions,
Popen.waitUpTo(timeout=seconds)
This will wait up to a certain number of seconds for the process to complete, otherwise return None
also,
Popen.waitOrTerminate
This will wait up to a point, and then call .terminate(), then .kill(), one or the other or some combination of both, see docs for full details:
http://htmlpreview.github.io/?https://github.com/kata198/python-subprocess2/blob/master/doc/subprocess2.html

For Linux, you can use a signal. This is platform dependent so another solution is required for Windows. It may work with Mac though.
def launch_cmd(cmd, timeout=0):
    '''Launch an external command
    It launches the program redirecting the program's STDIO
    to a communication pipe, and appends those responses to
    a list. Waits for the program to exit, then returns the
    output lines.
    Args:
        cmd: command line of the external program to launch
        timeout: time to wait for the command to complete, 0 for indefinitely
    Returns:
        A list of the response lines from the program
    '''
    import subprocess
    import signal
    class Alarm(Exception):
        pass
    def alarm_handler(signum, frame):
        raise Alarm
    lines = []
    if not launch_cmd.init:
        launch_cmd.init = True
        signal.signal(signal.SIGALRM, alarm_handler)
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    signal.alarm(timeout) # timeout sec
    try:
        for line in p.stdout:
            lines.append(line.rstrip())
        p.wait()
        signal.alarm(0) # disable alarm
    except Alarm:
        print("launch_cmd taking too long!")
        p.kill()
    return lines
launch_cmd.init = False

Start process and reuse with PID in python

My main process/thread starts an executable that starts waiting for a signal after echoing Algorithm loaded. I am using the subprocess.Popen class for running the executable.
Later, a thread is started that is supposed to send a signal to the earlier started executable. But I have no clue of how to send a signal to that particular subprocess from that thread.
Is it possible to pass PIDs around and "recover" subprocesses using the PID? The purpose of reusing the process is to send something equivalent to stdin.
Here's my code for starting the executable:
def start_module():
    cmd = '%s/libraries/OpenBR' % settings.MODULES_DIR
    process = subprocess.Popen(cmd,stdout=subprocess.PIPE)
    while True:
        line = process.stdout.readline()
        if line.find('Algorithm loaded') > -1:
            break
    return 0

The process variable in your code refers to a Popen object, which supports a pid attribute. If you have your start_module function return the process, you can later send it a signal using os.kill. For example:
import os
import signal
def start_module():
    cmd = '%s/libraries/OpenBR' % settings.MODULES_DIR
    process = subprocess.Popen(cmd,stdout=subprocess.PIPE)
    while True:
        line = process.stdout.readline()
        # bytes literal: works on Python 2 and on Python 3, where readline() returns bytes
        if line.find(b'Algorithm loaded') > -1:
            break
    return process
p = start_module()
os.kill(p.pid, signal.SIGALRM)
As far as I can see, using a thread or not to send the signal should not make any difference. Notice that os.kill does not necessarily kill a process: it sends it a signal that the process can then handle appropriately (an ALARM signal, here).
If your intention was to pass some input to the process's stdin, then things are also easy. You just need to add stdin=subprocess.PIPE to the Popen call and write to the stdin attribute of the new process:
def start_module():
    cmd = '%s/libraries/OpenBR' % settings.MODULES_DIR
    process = subprocess.Popen(cmd,stdin=subprocess.PIPE,stdout=subprocess.PIPE)
    while True:
        line = process.stdout.readline()
        # bytes literal: works on Python 2 and on Python 3, where readline() returns bytes
        if line.find(b'Algorithm loaded') > -1:
            break
    return process
p = start_module()
p.stdin.write(b"Hello world!\n")
p.stdin.write(b"How are things there?\n")
p.stdin.flush()  # make sure the lines actually reach the child
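Here is a runnable Python 3 sketch of the stdin approach. Since I don't have the OpenBR executable, CHILD_SOURCE is a hypothetical stand-in that prints "Algorithm loaded" and then echoes each line it receives; send_line is likewise a helper invented for this example.

```python
import subprocess
import sys

# Hypothetical stand-in for the OpenBR executable: announce readiness,
# then echo every line received on stdin.
CHILD_SOURCE = (
    "import sys\n"
    "print('Algorithm loaded', flush=True)\n"
    "for line in sys.stdin:\n"
    "    print('got: ' + line.strip(), flush=True)\n"
)

def start_module(cmd):
    """Start cmd and return the Popen object once it reports it is ready."""
    proc = subprocess.Popen(cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    # iter(..., b"") stops at end of stream, so this cannot spin forever.
    for line in iter(proc.stdout.readline, b""):
        if b"Algorithm loaded" in line:
            return proc
    proc.wait()
    raise RuntimeError("child exited before printing 'Algorithm loaded'")

def send_line(proc, text):
    """Write one line to the child's stdin and return its one-line reply."""
    proc.stdin.write(text.encode("utf-8") + b"\n")
    proc.stdin.flush()  # without this the line may sit in the parent's buffer
    return proc.stdout.readline()

if __name__ == "__main__":
    p = start_module([sys.executable, "-c", CHILD_SOURCE])
    print(send_line(p, "Hello world!"))
    p.stdin.close()
    p.wait()
```

The Popen object can be handed to any thread, so there is no need to look the process up again by PID.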

Run a external program with specified max running time

I want to execute an external program in each thread of a multi-threaded Python program.
Let's say the max running time is set to 1 second. If the started process completes within 1 second, the main program captures its output for further processing. If it doesn't finish within 1 second, the main program just terminates it and starts another new process.
How can I implement this?

You could poll it periodically:
import subprocess, time
s = subprocess.Popen(['foo', 'args'])
timeout = 1
poll_period = 0.1
s.poll()
while s.returncode is None and timeout > 0:
    time.sleep(poll_period)
    timeout -= poll_period
    s.poll()
if s.returncode is None:
    s.kill() # timed out
else:
    pass # completed
You can then just put the above in a function and start it as a thread.

This is the helper function I use:
def run_with_timeout(command, timeout):
    import time
    import subprocess
    p = subprocess.Popen(command, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    while timeout > 0:
        if p.poll() is not None:
            return p.communicate()
        time.sleep(0.1)
        timeout -= 0.1
    else:
        try:
            p.kill()
        except OSError as e:
            # errno 3 is ESRCH: the process has already exited
            if e.errno != 3:
                raise
    return (None, None)
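On Python 3.3 or later, a variant of this helper can lean on Popen.communicate(timeout=...) and still keep whatever the program printed before it was killed. This is a sketch only; run_with_deadline and its four-value return are my own naming.

```python
import subprocess
import sys

def run_with_deadline(cmd, timeout):
    """Return (returncode, stdout, stderr, timed_out) for cmd."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
        return proc.returncode, out, err, False
    except subprocess.TimeoutExpired:
        proc.kill()
        # A second communicate() reaps the child and returns the output
        # collected so far, so partial output is not lost.
        out, err = proc.communicate()
        return proc.returncode, out, err, True

if __name__ == "__main__":
    slow = "import time; print('early', flush=True); time.sleep(30)"
    print(run_with_deadline([sys.executable, "-c", slow], 2))
```

Unlike the polling loop, communicate() reads both pipes while it waits, so a chatty program cannot stall by filling the pipe buffer.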

A nasty hack on linux is to use the timeout program to run the command. You may opt for a nicer all Python solution, however.

here is a solution using the pexpect module (I needed to capture the output of the program before it ran into the timeout, I did not manage to do this with subprocess.Popen):
import pexpect
timeout = ... # timeout in seconds
proc = pexpect.spawn('foo', ['args'], timeout = timeout)
result = proc.expect([ pexpect.EOF, pexpect.TIMEOUT])
if result == 0:
    # program terminated by itself
    ...
else:
    # result is 1 here, we ran into the timeout
    ...
print("program's output:", proc.before)
