Start and stop an external process from Python

Is there a way to start and stop a process from Python? I'm talking about a continuous process that I would normally stop with Ctrl+Z when running it by hand. I want to start the process, wait for some time and then kill it. I'm using Linux.
This question is not like mine: there, the user only needs to run the process. I need to run it and also stop it.

I want to start the process, wait for some time and then kill it.
#!/usr/bin/env python3
import subprocess

try:
    subprocess.check_call(['command', 'arg 1', 'arg 2'],
                          timeout=some_time_in_seconds)
except subprocess.TimeoutExpired:
    print('subprocess has been killed on timeout')
else:
    print('subprocess has exited before timeout')
See Using module 'subprocess' with timeout
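If you want the start, the wait and the kill as separate steps (closer to what the question describes), here is a minimal sketch on Python 3 using Popen.wait(timeout=...); 'command' and some_time_in_seconds are placeholders as above:

#!/usr/bin/env python3
import subprocess

p = subprocess.Popen(['command', 'arg 1', 'arg 2'])
try:
    p.wait(timeout=some_time_in_seconds)  # returns as soon as the child exits
except subprocess.TimeoutExpired:
    p.kill()   # or p.terminate() for a catchable SIGTERM
    p.wait()   # reap the child to avoid a zombie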

You can use the os.kill function to send SIGSTOP (19) and SIGCONT (18) to pause and resume a process.
Example (unverified):
import os
import signal
from subprocess import check_output

def get_pid(name):
    # pidof prints the PID followed by a newline, e.g. b'1234\n'
    return int(check_output(["pidof", name]))

def stop_process(name):
    pid = get_pid(name)
    os.kill(pid, signal.SIGSTOP)

def restart_process(name):
    pid = get_pid(name)
    os.kill(pid, signal.SIGCONT)

Maybe you can use the multiprocessing.Process class:
from multiprocessing import Process
import os
import time

def sleeper(name, seconds):
    print "Sub Process %s ID# %s" % (name, os.getpid())
    print "Parent Process ID# %s" % (os.getppid())
    print "%s will sleep for %s seconds" % (name, seconds)
    time.sleep(seconds)

if __name__ == "__main__":
    child_proc = Process(target=sleeper, args=('bob', 5))
    child_proc.start()
    time.sleep(2)
    child_proc.terminate()
    #child_proc.join()
    #time.sleep(2)
    #print "in parent process after child process join"
    #print "the parent's parent process: %s" % (os.getppid())

Related

how to handle the commands that are hung indefinitely [duplicate]

Is there any argument or option to set up a timeout for Python's subprocess.Popen method?
Something like this:
subprocess.Popen(['..'], ..., timeout=20) ?
I would advise taking a look at the Timer class in the threading module. I used it to implement a timeout for a Popen.
First, create a callback:
def timeout(p):
    if p.poll() is None:
        print 'Error: process taking too long to complete--terminating'
        p.kill()
Then open the process:
proc = Popen(...)
Then create a timer that will call the callback, passing the process to it.
t = threading.Timer(10.0, timeout, [proc])
t.start()
t.join()
Somewhere later in the program, you may want to add the line:
t.cancel()
Otherwise, the Python program will keep running until the timer has finished.
EDIT: I was advised that there is a race condition that the subprocess p may terminate between the p.poll() and p.kill() calls. I believe the following code can fix that:
import errno

def timeout(p):
    if p.poll() is None:
        try:
            p.kill()
            print 'Error: process taking too long to complete--terminating'
        except OSError as e:
            if e.errno != errno.ESRCH:
                raise
Though you may want to clean up the exception handling to handle only the particular exception that occurs when the subprocess has already terminated normally.
subprocess.Popen doesn't block so you can do something like this:
import time

p = subprocess.Popen(['...'])
time.sleep(20)
if p.poll() is None:
    p.kill()
    print 'timed out'
else:
    print p.communicate()
It has a drawback in that you must always wait at least 20 seconds for it to finish.
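If waiting the full 20 seconds is unacceptable, one workaround is to poll in small increments instead; a sketch of that idea:

import time
import subprocess

p = subprocess.Popen(['...'])
deadline = time.time() + 20
while p.poll() is None and time.time() < deadline:
    time.sleep(0.1)  # check every 100 ms instead of sleeping the full timeout
if p.poll() is None:
    p.kill()
    print 'timed out'
else:
    print p.communicate()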
import subprocess, threading

class Command(object):
    def __init__(self, cmd):
        self.cmd = cmd
        self.process = None

    def run(self, timeout):
        def target():
            print 'Thread started'
            self.process = subprocess.Popen(self.cmd, shell=True)
            self.process.communicate()
            print 'Thread finished'

        thread = threading.Thread(target=target)
        thread.start()
        thread.join(timeout)
        if thread.is_alive():
            print 'Terminating process'
            self.process.terminate()
            thread.join()
        print self.process.returncode

command = Command("echo 'Process started'; sleep 2; echo 'Process finished'")
command.run(timeout=3)
command.run(timeout=1)
The output of this should be:
Thread started
Process started
Process finished
Thread finished
0
Thread started
Process started
Terminating process
Thread finished
-15
where it can be seen that, in the first execution, the process finished correctly (return code 0), while in the second one the process was terminated (return code -15).
I haven't tested on Windows; but, aside from updating the example command, I think it should work, since I haven't found anything in the documentation saying that thread.join or process.terminate is not supported.
You could do
from twisted.internet import reactor, protocol, error, defer

class DyingProcessProtocol(protocol.ProcessProtocol):
    def __init__(self, timeout):
        self.timeout = timeout

    def connectionMade(self):
        @defer.inlineCallbacks
        def killIfAlive():
            try:
                yield self.transport.signalProcess('KILL')
            except error.ProcessExitedAlready:
                pass

        d = reactor.callLater(self.timeout, killIfAlive)

reactor.spawnProcess(DyingProcessProtocol(20), ...)
using Twisted's asynchronous process API.
A Python subprocess auto-timeout is not built in (prior to Python 3.3), so you're going to have to build your own.
This works for me on Ubuntu 12.10 running Python 2.7.3.
Put this in a file called test.py
#!/usr/bin/python
import subprocess
import threading

class RunMyCmd(threading.Thread):
    def __init__(self, cmd, timeout):
        threading.Thread.__init__(self)
        self.cmd = cmd
        self.timeout = timeout

    def run(self):
        self.p = subprocess.Popen(self.cmd)
        self.p.wait()

    def run_the_process(self):
        self.start()
        self.join(self.timeout)
        if self.is_alive():
            self.p.terminate()  # if your process needs a kill -9 to make
                                # it go away, use self.p.kill() here instead.
            self.join()

RunMyCmd(["sleep", "20"], 3).run_the_process()
Save it, and run it:
python test.py
The sleep 20 command takes 20 seconds to complete. If it doesn't terminate in 3 seconds (it won't) then the process is terminated.
el@apollo:~$ python test.py
el@apollo:~$
Three seconds pass between when the process is started and when it is terminated.
As of Python 3.3, there is also a timeout argument to the blocking helper functions in the subprocess module.
https://docs.python.org/3/library/subprocess.html
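For example, a sketch with check_output ('command' is a placeholder); on timeout the child is killed and subprocess.TimeoutExpired is raised:

import subprocess

try:
    out = subprocess.check_output(['command', 'arg'], timeout=20)
except subprocess.TimeoutExpired:
    print('killed after 20 seconds')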
Unfortunately, there isn't such a solution. I managed to do this using a threaded timer that would launch along with the process that would kill it after the timeout but I did run into some stale file descriptor issues because of zombie processes or some such.
No, there is no timeout. I guess what you are looking for is to kill the subprocess after some time. Since you are able to signal the subprocess, you should be able to kill it too.
A generic approach to sending a signal to a subprocess:

import os
import sys
import time
import signal
import subprocess

proc = subprocess.Popen([command])
time.sleep(1)
print 'signaling child'
sys.stdout.flush()
os.kill(proc.pid, signal.SIGUSR1)

You could use this mechanism to terminate the subprocess after a timeout period.
Yes, https://pypi.python.org/pypi/python-subprocess2 will extend the Popen module with two additional functions:
Popen.waitUpTo(timeout=seconds)
This will wait up to a certain number of seconds for the process to complete, otherwise return None.
Also:
Popen.waitOrTerminate
This will wait up to a point, and then call .terminate(), then .kill(), one or the other or some combination of both; see the docs for full details:
http://htmlpreview.github.io/?https://github.com/kata198/python-subprocess2/blob/master/doc/subprocess2.html
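A usage sketch based on the description above; the import path and the exact return value are assumptions, so check the linked documentation:

from subprocess2 import Popen  # assumed import path for python-subprocess2

p = Popen(['command', 'arg'])
ret = p.waitUpTo(timeout=20)  # per the description: None means it timed out
if ret is None:
    p.terminate()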
For Linux, you can use a signal. This is platform dependent so another solution is required for Windows. It may work with Mac though.
def launch_cmd(cmd, timeout=0):
    '''Launch an external command

    It launches the program redirecting the program's STDIO
    to a communication pipe, and appends those responses to
    a list. Waits for the program to exit, then returns the
    output lines.

    Args:
        cmd: command line of the external program to launch
        timeout: time to wait for the command to complete, 0 for indefinitely

    Returns:
        A list of the response lines from the program
    '''
    import subprocess
    import signal

    class Alarm(Exception):
        pass

    def alarm_handler(signum, frame):
        raise Alarm

    lines = []
    if not launch_cmd.init:
        launch_cmd.init = True
        signal.signal(signal.SIGALRM, alarm_handler)
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    signal.alarm(timeout)  # timeout sec
    try:
        for line in p.stdout:
            lines.append(line.rstrip())
        p.wait()
        signal.alarm(0)  # disable alarm
    except Alarm:
        print "launch_cmd taking too long!"
        p.kill()
    return lines

launch_cmd.init = False

Killing sub-processes by signal: does the handler's position matter?

I use the signal function to kill all sub-processes in the following multi-process program; the code is shown below, saved as a file named mul_process.py:
import time
import os
import signal
from multiprocessing import Process

processes = []

def fun(x):
    print 'current sub-process pid is %s' % os.getpid()
    while True:
        print 'args is %s' % x
        time.sleep(100)

def term(sig_num, frame):
    print 'terminate process %d' % os.getpid()
    for p in processes:
        print p.pid
    try:
        for p in processes:
            print 'process %d terminate' % p.pid
            p.terminate()
            p.join()
    except Exception as e:
        print str(e)

if __name__ == '__main__':
    print 'current main-process pid is %s' % os.getpid()
    for i in range(3):
        t = Process(target=fun, args=(str(i),))
        t.start()
        processes.append(t)
    signal.signal(signal.SIGTERM, term)
    try:
        for p in processes:
            p.join()
    except Exception as e:
        print str(e)
Using 'python mul_process.py' to launch the program on Ubuntu 10.04.4 with Python 2.6: when it is running, I use kill -15 in another tab with the main process pid to send SIGTERM. When the main process receives SIGTERM, it exits after terminating all sub-processes. But when I use kill -15 with a sub-process pid, it does not work: the program stays alive and runs as before, and never prints the message defined in the function term; the sub-process seems not to receive the SIGTERM. As far as I know, a sub-process should inherit the signal handler, but it doesn't work here, and that is the first question.
And then I move the line 'signal.signal(signal.SIGTERM, term)' to the position after the line 'if __name__ == '__main__':', like this:
if __name__ == '__main__':
    signal.signal(signal.SIGTERM, term)
    print 'current main-process pid is %s' % os.getpid()
    for i in range(3):
        t = Process(target=fun, args=(str(i),))
        t.start()
        processes.append(t)
    try:
        for p in processes:
            p.join()
    except Exception as e:
        print str(e)
Launch the program and use kill -15 with the main process pid to send SIGTERM: the program receives the signal and calls the function term, but still doesn't kill any sub-process or exit itself. This is the second question.
There are a few problems in your program. Agreed that the sub-processes inherit the signal handler in your second code snippet, but the global "processes" list won't be shared: the list of processes is only available in the main process, and "processes" is an empty list in every sub-process.
You can use a queue or pipe kind of mechanism to pass the list of processes to the sub-processes, but that brings another problem: you terminate process 1, and the handler of process 1 tries to terminate processes 2 to 4; but process 2 has the same handler, so the handler of process 2 again tries to terminate all the other processes, which pushes your program into an infinite loop. A sketch of a way out follows.
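One way around both issues: install the handler only after forking, so the children never inherit it, and have each child defensively reset SIGTERM to the default, so one child's termination cannot cascade into the others. A sketch (names as in the question):

import os
import time
import signal
from multiprocessing import Process

processes = []

def fun(x):
    # Child: drop any inherited handler, so SIGTERM simply kills this
    # child instead of re-running the parent's teardown logic.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    while True:
        time.sleep(100)

def term(sig_num, frame):
    # Main process only: terminate all children, then exit.
    for p in processes:
        p.terminate()
        p.join()
    os._exit(0)

if __name__ == '__main__':
    for i in range(3):
        t = Process(target=fun, args=(str(i),))
        t.start()
        processes.append(t)
    # Install the handler after the children are forked.
    signal.signal(signal.SIGTERM, term)
    for p in processes:
        p.join()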

python cannot kill process using process.terminate

I have a python code as following:
import threading
import time
import subprocess, os, sys, psutil, signal
from signal import SIGKILL

def processing():
    global p_2
    global subp_2
    ...
    if condition1:  # loop again
        threading.Timer(10, processing).start()
    if condition2:
        signal.signal(signal.SIGINT, signal_handler)
        #signal.signal(signal.SIGTERM, signal_handler)
        subp_2.terminate()
        #os.kill(subp_2.pid, 0)
        #subp_2.kill()
        print " Status p_2: ", p_2.status

def signal_handler(signal, frame):
    print('Exiting')
    sys.exit(0)

def function():
    global p_1
    global subp_1
    ...
    if condition1:  # loop again
        threading.Timer(5, function).start()
    if condition2:
        signal.signal(signal.SIGINT, signal_handler)
        #signal.signal(signal.SIGTERM, signal_handler)
        subp_1.terminate()
        #os.kill(subp_1.pid, 0)
        #subp_1.kill()
        print " Status p_1: ", p_1.status
    threading.Timer(10, processing).start()
    subp_2 = subprocess.Popen('./myScript2.sh %s %s' % (arg0, arg1), shell=True)
    p_2 = psutil.Process(subp_2.pid)

if __name__ == '__main__':
    global p_1
    global subp_1
    ...
    subp_1 = subprocess.Popen(["/.../myScript1.sh"], shell=True)
    p_1 = psutil.Process(subp_1.pid)
    threading.Timer(5, function).start()
I could not kill the processes subp_1 and subp_2. Whatever I tried (.terminate(), .kill(), os.kill()) I still get process status 'running'. Could anyone please tell me what I am missing? Any hint is appreciated.
When you use shell=True, first a subprocess is spawned which runs the shell. Then the shell spawns a subprocess which runs myScript2.sh. The subprocess running the shell can be terminated without terminating the myScript2.sh subprocess.
If you can avoid using shell=True, then that would be one way to avoid this problem. If using user input to form the command, shell=True should definitely be avoided, since it is a security risk.
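For instance, the Popen call from the question could drop the shell entirely (a sketch; arg0 and arg1 as in the question):

# Without shell=True, subp_2.pid is myScript2.sh itself,
# so subp_2.terminate() signals the right process.
subp_2 = subprocess.Popen(['./myScript2.sh', arg0, arg1])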
On Unix, by default, the subprocess spawned by subprocess.Popen is not a process group leader. When you send a signal to a process group with os.killpg, it is delivered to all processes in that group. So to have the SIGTERM reach both the shell and myScript2.sh, make the shell the leader of its own process group.
For Python versions < 3.2 on Unix, that can be done by having the shell process run os.setsid():

import os
import signal

subp_2 = subprocess.Popen('./myScript2.sh %s %s' % (arg0, arg1),
                          shell=True,
                          preexec_fn=os.setsid)
# To send SIGTERM to the whole process group:
os.killpg(subp_2.pid, signal.SIGTERM)
For Python versions >= 3.2 on Unix, pass start_new_session=True to Popen.
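A minimal sketch of that variant:

import os
import signal
import subprocess

# start_new_session=True makes the child a session (and process group)
# leader, so the whole group can be signaled at once.
subp_2 = subprocess.Popen(['./myScript2.sh', arg0, arg1],
                          start_new_session=True)
os.killpg(os.getpgid(subp_2.pid), signal.SIGTERM)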
For Windows, see J.F. Sebastian's solution.

python multiprocessing.Pool kill *specific* long running or hung process

I need to execute a pool of many parallel database connections and queries. I would like to use a multiprocessing.Pool or concurrent.futures ProcessPoolExecutor. Python 2.7.5
In some cases, query requests take too long or will never finish (hung/zombie process). I would like to kill the specific process from the multiprocessing.Pool or concurrent.futures ProcessPoolExecutor that has timed out.
Here is an example of how to kill/re-spawn the entire process pool, but ideally I would minimize that CPU thrashing since I only want to kill a specific long running process that has not returned data after timeout seconds.
For some reason the code below does not seem to be able to terminate/join the process Pool after all results are returned and completed. It may have to do with killing worker processes when a timeout occurs, however the Pool creates new workers when they are killed and results are as expected.
from multiprocessing import Pool
import time
import numpy as np
from threading import Timer
import thread, time, sys

def f(x):
    time.sleep(x)
    return x

if __name__ == '__main__':
    pool = Pool(processes=4, maxtasksperchild=4)
    results = [(x, pool.apply_async(f, (x,))) for x in np.random.randint(10, size=10).tolist()]
    while results:
        try:
            x, result = results.pop(0)
            start = time.time()
            print result.get(timeout=5), '%d done in %f Seconds!' % (x, time.time()-start)
        except Exception as e:
            print str(e)
            print '%d Timeout Exception! in %f' % (x, time.time()-start)
            for p in pool._pool:
                if p.exitcode is None:
                    p.terminate()
    pool.terminate()
    pool.join()
I don't fully understand your question. You say you want to stop one specific process, but then, in your exception handling phase, you are calling terminate on all jobs; I'm not sure why you are doing that. Also, I am pretty sure that using internal variables from multiprocessing.Pool is not quite safe. Having said all of that, I think your question is really why this program does not finish when a timeout happens. If that is the problem, then the following does the trick:
from multiprocessing import Pool
import time
import numpy as np
from threading import Timer
import thread, time, sys

def f(x):
    time.sleep(x)
    return x

if __name__ == '__main__':
    pool = Pool(processes=4, maxtasksperchild=4)
    results = [(x, pool.apply_async(f, (x,))) for x in np.random.randint(10, size=10).tolist()]
    result = None
    start = time.time()
    while results:
        try:
            x, result = results.pop(0)
            print result.get(timeout=5), '%d done in %f Seconds!' % (x, time.time()-start)
        except Exception as e:
            print str(e)
            print '%d Timeout Exception! in %f' % (x, time.time()-start)
            for i in reversed(range(len(pool._pool))):
                p = pool._pool[i]
                if p.exitcode is None:
                    p.terminate()
                del pool._pool[i]
    pool.terminate()
    pool.join()
The point is you need to remove items from the pool; just calling terminate on them is not enough.
In your solution you're tampering with the internal variables of the pool itself. The pool relies on three different threads in order to operate correctly; it is not safe to intervene in their internal variables without being really aware of what you're doing.
There's no clean way to stop timing-out processes in the standard Python pools, but there are alternative implementations which expose such a feature.
You can take a look at the following libraries (a usage sketch for pebble follows the list):
pebble
billiard
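For example, a per-task timeout with pebble might look roughly like this (a sketch; pebble is Python 3 only and its exact API may differ between versions):

import time
from concurrent.futures import TimeoutError
from pebble import ProcessPool

def f(x):
    time.sleep(x)
    return x

if __name__ == '__main__':
    with ProcessPool(max_workers=4) as pool:
        future = pool.schedule(f, args=(10,), timeout=5)
        try:
            print(future.result())
        except TimeoutError:
            # pebble terminates only the worker that exceeded its timeout
            print('task timed out; worker was killed')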
To avoid access to the internal variables, you can save multiprocessing.current_process().pid from the executing task into shared memory, then iterate over multiprocessing.active_children() from the main process and kill the target pid if it exists; a sketch of this follows.
However, after such external termination of the workers they are recreated, but the pool becomes non-joinable and also requires explicit termination before the join().
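A sketch of that idea, using a Manager dict as the shared memory (the names are illustrative):

import time
import multiprocessing

def f(pids, x):
    # Record which worker pid is handling which task.
    pids[x] = multiprocessing.current_process().pid
    time.sleep(x)
    return x

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    pids = manager.dict()
    pool = multiprocessing.Pool(processes=4)
    result = pool.apply_async(f, (pids, 10))
    try:
        result.get(timeout=5)
    except multiprocessing.TimeoutError:
        # Kill only the worker that ran the timed-out task.
        for child in multiprocessing.active_children():
            if child.pid == pids.get(10):
                child.terminate()
    pool.terminate()  # the pool is no longer joinable after external kills
    pool.join()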
I also came across this problem.
The original code and the edited version by @stacksia have the same issue: in both cases all currently running processes are killed when the timeout is reached for just one of them (i.e. when the loop over pool._pool is done).
Find my solution below. It involves creating a .pid file for each worker process, as suggested by @luart. It will work if there is a way to tag each worker process (in the code below, x does this job).
If someone has a more elegant solution (such as saving the PID in memory), please share it.
#!/usr/bin/env python
from multiprocessing import Pool
import time, os, sys
import subprocess

def f(x):
    PID = os.getpid()
    print 'Started:', x, 'PID=', PID
    pidfile = "/tmp/PoolWorker_"+str(x)+".pid"
    if os.path.isfile(pidfile):
        print "%s already exists, exiting" % pidfile
        sys.exit()
    file(pidfile, 'w').write(str(PID))
    # Do the work here
    time.sleep(x*x)
    # Delete the PID file
    os.remove(pidfile)
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=3, maxtasksperchild=4)
    results = [(x, pool.apply_async(f, (x,))) for x in [1, 2, 3, 4, 5, 6]]
    pool.close()
    while results:
        print results
        try:
            x, result = results.pop(0)
            start = time.time()
            print result.get(timeout=3), '%d done in %f Seconds!' % (x, time.time()-start)
        except Exception as e:
            print str(e)
            print '%d Timeout Exception! in %f' % (x, time.time()-start)
            # We know which process gave us an exception: it is "x", so let's kill it!
            # First, let's get the PID of that process:
            pidfile = '/tmp/PoolWorker_'+str(x)+'.pid'
            PID = None
            if os.path.isfile(pidfile):
                PID = str(open(pidfile).read())
                print x, 'pidfile=', pidfile, 'PID=', PID
            # Now, let's check if there is indeed such a process running:
            for p in pool._pool:
                print p, p.pid
                if str(p.pid) == PID:
                    print 'Found it still running!', p, p.pid, p.is_alive(), p.exitcode
                    # We can also double-check how long it's been running with the system 'ps' command:
                    tt = str(subprocess.check_output('ps -p "'+str(p.pid)+'" o etimes=', shell=True)).strip()
                    print 'Run time from OS (may be way off the real time..) = ', tt
                    # Now, kill it:
                    p.terminate()
                    pool._pool.remove(p)
                    pool._repopulate_pool()
                    # Let's not forget to remove the pidfile
                    os.remove(pidfile)
                    break
    pool.terminate()
    pool.join()
Many people suggest pebble. It looks nice, but it is only available for Python 3. If someone has a way to get pebble imported for Python 2.6, that would be great.

terminate a process and its subprocesses started with subprocess.popen the right way (windows and linux)

I'm struggling with some processes I started with Popen that themselves start subprocesses. When I start these processes manually in a terminal, every process terminates as expected when I send CTRL+C. But running inside a Python program using subprocess.Popen, any attempt to terminate the process only gets rid of the parent, not of its children.
I tried .terminate() and .kill(), as well as .send_signal() with signal.SIGBREAK and signal.SIGTERM, but in every case I just terminate the parent process.
With this parent process I can reproduce the misbehavior:
#!/usr/bin/python
import time
import sys
import os
import subprocess
import signal

if __name__ == "__main__":
    print os.getpid(), "MAIN: start a process.."
    p = subprocess.Popen([sys.executable, 'process_to_shutdown.py'])
    print os.getpid(), "MAIN: started process", p.pid
    time.sleep(2)
    print os.getpid(), "MAIN: kill the process"
    # these just terminate the parent:
    #p.terminate()
    #p.kill()
    #os.kill(p.pid, signal.SIGINT)
    #os.kill(p.pid, signal.SIGTERM)
    os.kill(p.pid, signal.SIGABRT)
    p.wait()
    print os.getpid(), "MAIN: job done - ciao"
The real-life child process is manage.py from Django, which spawns a few subprocesses and waits for CTRL+C. But the following example seems to work, too:
#!/usr/bin/python
import time
import sys
import os
import subprocess

if __name__ == "__main__":
    timeout = int(sys.argv[1]) if len(sys.argv) >= 2 else 0
    if timeout == 0:
        p = subprocess.Popen([sys.executable, '-u', __file__, '13'])
        print os.getpid(), "just waiting..."
        p.wait()
    else:
        for i in range(timeout):
            time.sleep(1)
            print os.getpid(), i, "alive!"
            sys.stdout.flush()
        print os.getpid(), "ciao"
So my question in short: how do I kill the process in the first example and get rid of the child processes as well? On Windows, os.kill(p.pid, signal.CTRL_C_EVENT) seems to work in some cases, but what's the right way to do it? And how does a terminal do it?
Like Henri Korhonen mentioned in a comment, grouping processes should help. Additionally, if you are on Windows and this is Cygwin Python that starts Windows applications, it appears Cygwin Python cannot kill the children. For those cases you would need to run TASKKILL; TASKKILL also takes a group parameter. A sketch follows.
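A hedged sketch of the TASKKILL route on Windows; the command being launched is hypothetical, but /PID, /T and /F are standard TASKKILL options (/T kills the whole tree under the PID, /F forces termination):

import subprocess

p = subprocess.Popen(['some_parent.exe'])  # hypothetical command
# /PID selects the process, /T includes its child processes, /F forces.
subprocess.call(['taskkill', '/F', '/T', '/PID', str(p.pid)])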
