I have python script called monitiq_install.py which calls other scripts (or modules) using the subprocess python module. However, if the user sends a keyboard interrupt (CTRL + C) it exits, but with an exception. I want it to exit, but nicely.
My Code:
import os
import sys
from os import listdir
from os.path import isfile, join
from subprocess import Popen, PIPE
import json
# Run a module and capture output and exit code
def runModule(module):
try:
# Run Module
process = Popen(os.path.dirname(os.path.realpath(__file__)) + "/modules/" + module, shell=True, stdout=PIPE, bufsize=1)
for line in iter(process.stdout.readline, b''):
print line,
process.communicate()
exit_code = process.wait();
return exit_code;
except KeyboardInterrupt:
print "Got keyboard interupt!";
sys.exit(0);
The error I'm getting is below:
python monitiq_install.py -a
Invalid module filename: create_db_user_v0_0_0.pyc
Not Running Module: '3parssh_install' as it is already installed
######################################
Running Module: 'create_db_user' Version: '0.0.3'
Choose username for Monitiq DB User [MONITIQ]
^CTraceback (most recent call last):
File "/opt/monitiq-universal/install/modules/create_db_user-v0_0_3.py", line 132, in <module>
inputVal = raw_input("");
Traceback (most recent call last):
File "monitiq_install.py", line 40, in <module>
KeyboardInterrupt
module_install.runModules();
File "/opt/monitiq-universal/install/module_install.py", line 86, in runModules
exit_code = runModule(module);
File "/opt/monitiq-universal/install/module_install.py", line 19, in runModule
for line in iter(process.stdout.readline, b''):
KeyboardInterrupt
A solution or some pointers would be helpful :)
--EDIT
With try catch
Running Module: 'create_db_user' Version: '0.0.0'
Choose username for Monitiq DB User [MONITIQ]
^CGot keyboard interupt!
Traceback (most recent call last):
File "monitiq_install.py", line 36, in <module>
module_install.runModules();
File "/opt/monitiq-universal/install/module_install.py", line 90, in runModules
exit_code = runModule(module);
File "/opt/monitiq-universal/install/module_install.py", line 29, in runModule
sys.exit(0);
NameError: global name 'sys' is not defined
Traceback (most recent call last):
File "/opt/monitiq-universal/install/modules/create_db_user-v0_0_0.py", line 132, in <module>
inputVal = raw_input("");
KeyboardInterrupt
If you press Ctrl + C in a terminal then SIGINT is sent to all processes within the process group. See child process receives parent's SIGINT.
That is why you see the traceback from the child process despite try/except KeyboardInterrupt in the parent.
You could suppress the stderr output from the child process: stderr=DEVNULL. Or start it in a new process group: start_new_session=True:
import sys
from subprocess import call
try:
call([sys.executable, 'child.py'], start_new_session=True)
except KeyboardInterrupt:
print('Ctrl C')
else:
print('no exception')
If you remove start_new_session=True in the above example then KeyboardInterrupt may be raised in the child too and you might get the traceback.
If subprocess.DEVNULL is not available; you could use DEVNULL = open(os.devnull, 'r+b', 0). If start_new_session parameter is not available; you could use preexec_fn=os.setsid on POSIX.
You can do this using try and except as below:
import subprocess
try:
proc = subprocess.Popen("dir /S", shell=True, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
while proc.poll() is None:
print proc.stdout.readline()
except KeyboardInterrupt:
print "Got Keyboard interrupt"
You could avoid shell=True in your execution as best security practice.
This code spawns a child process and hands signals like SIGINT, ... to them just like shells (bash, zsh, ...) do it.
This means KeyboardInterrupt is no longer seen by the Python process, but the child receives this and is killed correctly.
It works by running the process in a new foreground process group set by Python.
import os
import signal
import subprocess
import sys
import termios
def run_as_fg_process(*args, **kwargs):
"""
the "correct" way of spawning a new subprocess:
signals like C-c must only go
to the child process, and not to this python.
the args are the same as subprocess.Popen
returns Popen().wait() value
Some side-info about "how ctrl-c works":
https://unix.stackexchange.com/a/149756/1321
fun fact: this function took a whole night
to be figured out.
"""
old_pgrp = os.tcgetpgrp(sys.stdin.fileno())
old_attr = termios.tcgetattr(sys.stdin.fileno())
user_preexec_fn = kwargs.pop("preexec_fn", None)
def new_pgid():
if user_preexec_fn:
user_preexec_fn()
# set a new process group id
os.setpgid(os.getpid(), os.getpid())
# generally, the child process should stop itself
# before exec so the parent can set its new pgid.
# (setting pgid has to be done before the child execs).
# however, Python 'guarantee' that `preexec_fn`
# is run before `Popen` returns.
# this is because `Popen` waits for the closure of
# the error relay pipe '`errpipe_write`',
# which happens at child's exec.
# this is also the reason the child can't stop itself
# in Python's `Popen`, since the `Popen` call would never
# terminate then.
# `os.kill(os.getpid(), signal.SIGSTOP)`
try:
# fork the child
child = subprocess.Popen(*args, preexec_fn=new_pgid,
**kwargs)
# we can't set the process group id from the parent since the child
# will already have exec'd. and we can't SIGSTOP it before exec,
# see above.
# `os.setpgid(child.pid, child.pid)`
# set the child's process group as new foreground
os.tcsetpgrp(sys.stdin.fileno(), child.pid)
# revive the child,
# because it may have been stopped due to SIGTTOU or
# SIGTTIN when it tried using stdout/stdin
# after setpgid was called, and before we made it
# forward process by tcsetpgrp.
os.kill(child.pid, signal.SIGCONT)
# wait for the child to terminate
ret = child.wait()
finally:
# we have to mask SIGTTOU because tcsetpgrp
# raises SIGTTOU to all current background
# process group members (i.e. us) when switching tty's pgrp
# it we didn't do that, we'd get SIGSTOP'd
hdlr = signal.signal(signal.SIGTTOU, signal.SIG_IGN)
# make us tty's foreground again
os.tcsetpgrp(sys.stdin.fileno(), old_pgrp)
# now restore the handler
signal.signal(signal.SIGTTOU, hdlr)
# restore terminal attributes
termios.tcsetattr(sys.stdin.fileno(), termios.TCSADRAIN, old_attr)
return ret
# example:
run_as_fg_process(['openage', 'edit', '-f', 'random_map.rms'])
Related
Right now, I'm using subprocess to run a long-running job in the background. For multiple reasons (PyInstaller + AWS CLI) I can't use subprocess anymore.
Is there an easy way to achieve the same thing as below ? Running a long running python function in a multiprocess pool (or something else) and do real time processing of stdout/stderr ?
import subprocess
process = subprocess.Popen(
["python", "long-job.py"],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE,
shell=True,
)
while True:
out = process.stdout.read(2000).decode()
if not out:
err = process.stderr.read().decode()
else:
err = ""
if (out == "" or err == "") and process.poll() is not None:
break
live_stdout_process(out)
Thanks
getting it cross platform is messy .... first of all windows implementation of non-blocking pipe is not user friendly or portable.
one option is to just have your application read its command line arguments and conditionally execute a file, and you get to use subprocess since you will be launching yourself with different argument.
but to keep it to multiprocessing :
the output must be logged to queues instead of pipes.
you need the child to execute a python file, this can be done using runpy to execute the file as __main__.
this runpy function should run under a multiprocessing child, this child must first redirect its stdout and stderr in the initializer.
when an error happens, your main application must catch it .... but if it is too busy reading the output it won't be able to wait for the error, so a child thread has to start the multiprocess and wait for the error.
the main process has to create the queues and launch the child thread and read the output.
putting it all together:
import multiprocessing
from multiprocessing import Queue
import sys
import concurrent.futures
import threading
import traceback
import runpy
import time
class StdoutQueueWrapper:
def __init__(self,queue:Queue):
self._queue = queue
def write(self,text):
self._queue.put(text)
def flush(self):
pass
def function_to_run():
# runpy.run_path("long-job.py",run_name="__main__") # run long-job.py
print("hello") # print something
raise ValueError # error out
def initializer(stdout_queue: Queue,stderr_queue: Queue):
sys.stdout = StdoutQueueWrapper(stdout_queue)
sys.stderr = StdoutQueueWrapper(stderr_queue)
def thread_function(child_stdout_queue,child_stderr_queue):
with concurrent.futures.ProcessPoolExecutor(1, initializer=initializer,
initargs=(child_stdout_queue, child_stderr_queue)) as pool:
result = pool.submit(function_to_run)
try:
result.result()
except Exception as e:
child_stderr_queue.put(traceback.format_exc())
if __name__ == "__main__":
child_stdout_queue = multiprocessing.Queue()
child_stderr_queue = multiprocessing.Queue()
child_thread = threading.Thread(target=thread_function,args=(child_stdout_queue,child_stderr_queue),daemon=True)
child_thread.start()
while True:
while not child_stdout_queue.empty():
var = child_stdout_queue.get()
print(var,end='')
while not child_stderr_queue.empty():
var = child_stderr_queue.get()
print(var,end='')
if not child_thread.is_alive():
break
time.sleep(0.01) # check output every 0.01 seconds
Note that a direct consequence of running as a multiprocess is that if the child runs into a segmentation fault or some unrecoverable error the parent will also die, hencing running yourself under subprocess might seem a better option if segfaults are expected.
I tried using os.kill but it does not seem to be working.
import signal
import os
from time import sleep
def isr(signum, frame):
print "Hey, I'm the ISR!"
signal.signal(signal.SIGALRM, isr)
pid = os.fork()
if pid == 0:
def sisr(signum, frame):
os.kill(os.getppid(), signal.SIGALRM)
signal.signal(signal.SIGVTALRM, sisr)
signal.setitimer(signal.ITIMER_VIRTUAL, 1, 1)
while True:
print "2"
else:
sleep(2)
Your sisr handler never executes.
signal.setitimer(signal.ITIMER_VIRTUAL, 1, 1)
This line sets a virtual timer, which, according to documentation, “Decrements interval timer only when the process is executing, and delivers SIGVTALRM upon expiration”.
The thing is, your process is almost never executing. The prints take almost no time inside your process, all the work is done by the kernel delivering the output to your console application (xterm, konsole, …) and the application repainting the screen. Meanwhile, your child process is asleep, and the timer does not run.
Change it with a real timer, it works :)
import signal
import os
from time import sleep
def isr(signum, frame):
print "Hey, I'm the ISR!"
signal.signal(signal.SIGALRM, isr)
pid = os.fork()
if pid == 0:
def sisr(signum, frame):
print "Child running sisr"
os.kill(os.getppid(), signal.SIGALRM)
signal.signal(signal.SIGALRM, sisr)
signal.setitimer(signal.ITIMER_REAL, 1, 1)
while True:
print "2"
sleep(1)
else:
sleep(10)
print "Parent quitting"
Output:
spectras#etherbee:~/temp$ python test.py
2
Child running sisr
2
Hey, I'm the ISR!
Parent quitting
spectras#etherbee:~/temp$ Child running sisr
Traceback (most recent call last):
File "test.py", line 22, in <module>
sleep(1)
File "test.py", line 15, in sisr
os.kill(os.getppid(), signal.SIGALRM)
OSError: [Errno 1] Operation not permitted
Note: the child crashes the second time it runs sisr, because then the parent has exited, so os.getppid() return 0 and sending a signal to process 0 is forbidden.
I want to access the traceback of a python programm running in a subprocess.
The documentation says:
Exceptions raised in the child process, before the new program has started to execute, will be re-raised in the parent. Additionally, the exception object will have one extra attribute called child_traceback, which is a string containing traceback information from the child’s point of view.
Contents of my_sub_program.py:
raise Exception("I am raised!")
Contents of my_main_program.py:
import sys
import subprocess
try:
subprocess.check_output([sys.executable, "my_sub_program.py"])
except Exception as e:
print e.child_traceback
If I run my_main_program.py, I get the following error:
Traceback (most recent call last):
File "my_main_program.py", line 6, in <module>
print e.child_traceback
AttributeError: 'CalledProcessError' object has no attribute 'child_traceback'
How can I access the traceback of the subprocess without modifying the subprocess program code? This means, I want to avoid adding a large try/except clause around my whole sub-program code, but rather handle error logging from my main program.
Edit: sys.executable should be replaceable with an interpreter differing from the one running the main program.
As you're starting another Python process, you can also try to use the multiprocessing Python module ; by sub-classing the Process class it is quite easy to get exceptions from the target function:
from multiprocessing import Process, Pipe
import traceback
import functools
class MyProcess(Process):
def __init__(self, *args, **kwargs):
Process.__init__(self, *args, **kwargs)
self._pconn, self._cconn = Pipe()
self._exception = None
def run(self):
try:
Process.run(self)
self._cconn.send(None)
except Exception as e:
tb = traceback.format_exc()
self._cconn.send((e, tb))
# raise e # You can still rise this exception if you need to
#property
def exception(self):
if self._pconn.poll():
self._exception = self._pconn.recv()
return self._exception
p = MyProcess(target=functools.partial(execfile, "my_sub_program.py"))
p.start()
p.join() #wait for sub-process to end
if p.exception:
error, traceback = p.exception
print 'you got', traceback
The trick is to have the target function executing the Python sub-program, this is done by using functools.partial.
I let a server run, which should communicate with a serial device. I wrote a init.d script which should automatically restarts the server if it crashs for some reason. However to achieve this, the python script has to terminate properly. Unfortunately my thread just stucks if a exception is raised (e.g. if i unplug my serial device) and never terminates.
This is the error:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 505, in run
self.__target(*self.__args, **self.__kwargs)
File "RPiQuadroServer.py", line 87, in recv_thr
while pySerial.inWaiting() > 0:
File "/usr/lib/python2.7/dist-packages/serial/serialposix.py", line 431, in inWaiting
s = fcntl.ioctl(self.fd, TIOCINQ, TIOCM_zero_str)
IOError: [Errno 5] Input/output error
And this is the code. I removed some unimportant functions..
# Make this Python 2.7 script compatible to Python 3 standard
from __future__ import print_function
# For remote control
import socket
import json
import serial
# For sensor readout
import logging
import threading
# For system specific functions
import sys
from time import *
# Create a sensor log with date and time
layout = '%(asctime)s - %(levelname)s - %(message)s'
logging.basicConfig(filename='/tmp/RPiQuadrocopter.log', level=logging.INFO, format=layout)
# Socket for WiFi data transport
udp_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
udp_sock.bind(('0.0.0.0', 7000))
client_adr = ""
# Thread lock for multi threading
THR_LOCK = threading.Lock()
#pySerial
pySerial = 0
# These functions shall run in separate threads
# recv_thr() is used to catch sensor data
def recv_thr():
global client_adr
ser_line = ""
while True:
# Lock while data in queue to get red
THR_LOCK.acquire()
while pySerial.inWaiting() > 0:
try:
# Remove newline character '\n'
ser_line = pySerial.readline().strip()
except serial.SerialTimeoutException as e:
logging.error("Read timeout on serial port '{}': {}".format(com_port, e))
return # never ends..
else:
try:
p = json.loads(ser_line)
except (ValueError, KeyError, TypeError):
# Print everything what is not valid json string to console
#print ("JSON format error: %s" % ser_line)
logging.debug("JSON format error: " + ser_line)
else:
logging.info(ser_line)
if client_adr != "":
bytes = udp_sock.sendto(ser_line, client_adr)
THR_LOCK.release()
# Main program for sending and receiving
# Working with two separate threads
def main():
# Start threads for receiving and transmitting
recv=threading.Thread(target=recv_thr)
recv.start()
# Start Program
bInitialized, pySerial = init_serial()
if not bInitialized:
print ("Could not open any serial port. Exit script.")
sys.exit()
main()
Not your program is terminating, just a thread of yours is terminating with an exception.
You need to check yourself if that thread is still running and if so, terminate.
Besides the proposal of radu.ciorba of polling the thread you could also catch all exceptions in the thread and in case it is failing with an exception, send a SIGTERM to your process; this will terminate all threads and thus the process.
Use os.kill(os.getpid(), 15) for that and place it in a general except clause:
def recv_thr(...): # add your arguments here
try:
... # add your code here
except:
os.kill(os.getpid(), 15)
you can check if the thread is still alive and exit if it's not:
import time
import sys
import threading
def spam():
raise AssertionError("spam")
t = threading.Thread(target=spam)
r = t.start()
while 1:
time.sleep(.5)
if not t.is_alive():
sys.exit(1)
I'm getting the following error when using the multiprocessing module within a python daemon process (using python-daemon):
Traceback (most recent call last):
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 262, in _exit_function
for p in active_children():
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 43, in active_children
_cleanup()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
The daemon process (parent) spawns a number of processes (children) and then periodically polls the processes to see if they have completed. If the parent detects that one of the processes has completed, it then attempts to restart that process. It is at this point that the above exception is raised. It seems that once one of the processes completes, any operation involving the multiprocessing module will generate this exception. If I run the identical code in a non-daemon python script, it executes with no errors whatsoever.
EDIT:
Sample script
from daemon import runner
class DaemonApp(object):
def __init__(self, pidfile_path, run):
self.pidfile_path = pidfile_path
self.run = run
self.stdin_path = '/dev/null'
self.stdout_path = '/dev/tty'
self.stderr_path = '/dev/tty'
def run():
import multiprocessing as processing
import time
import os
import sys
import signal
def func():
print 'pid: ', os.getpid()
for i in range(5):
print i
time.sleep(1)
process = processing.Process(target=func)
process.start()
while True:
print 'checking process'
if not process.is_alive():
print 'process dead'
process = processing.Process(target=func)
process.start()
time.sleep(1)
# uncomment to run as daemon
app = DaemonApp('/root/bugtest.pid', run)
daemon_runner = runner.DaemonRunner(app)
daemon_runner.do_action()
#uncomment to run as regular script
#run()
Your problem is a conflict between the daemon and multiprocessing modules, in particular in its handling of the SIGCLD (child process terminated) signal. daemon sets SIGCLD to SIG_IGN when launching, which, at least on Linux, causes terminated children to immediately be reaped (rather than becoming a zombie until the parent invokes wait()). But multiprocessing's is_alive test invokes wait() to see if the process is alive, which fails if the process has already been reaped.
Simplest solution is just to set SIGCLD back to SIG_DFL (default behaviour -- ignore the signal and let the parent wait() for the terminated child process):
def run():
# ...
signal.signal(signal.SIGCLD, signal.SIG_DFL)
process = processing.Process(target=func)
process.start()
while True:
# ...
Ignoring SIGCLD also causes problems with the subprocess module, because of a bug in that module (issue 1731717, still open as of 2011-09-21).
This behaviour is addressed in version 1.4.8 of the python-daemon library; it now omits the default fiddling with SIGCLD, so no longer has this unpleasant interaction with other standard library modules.
I think there was a fix put into trunk and 2.6 maint a little while ago which should help with this can you try running your script in python-trunk or the latest 2.6-maint svn? I'm failing to pull up the bug information
Looks like your error is coming at the very end of your process -- your clue's at the very start of your traceback, and I quote...:
File "/usr/local/lib/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
if atexit._run_exitfuncs is running, this clearly shows that your own process is terminating. So, the error itself is a minor issue in a sense -- just from some function that the multiprocessing module registered to run "at-exit" from your process. The really interesting issue is, WHY is your main process exiting? I think this may be due to some uncaught exception: try setting the exception hook and showing rich diagnostic info before it gets lost by the OTHER exception caused by whatever it is that multiprocessing's registered for at-exit running...
I'm running into this also using the celery distributed task manager under RHEL 5.3 with Python 2.6. My traceback looks a little different but the error the same:
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 334, in terminate
self._terminate()
File "/usr/local/lib/python2.6/multiprocessing/util.py", line 174, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/local/lib/python2.6/multiprocessing/pool.py", line 373, in _terminate_pool
p.terminate()
File "/usr/local/lib/python2.6/multiprocessing/process.py", line 111, in terminate
self._popen.terminate()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 136, in terminate
if self.wait(timeout=0.1) is None:
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 121, in wait
res = self.poll()
File "/usr/local/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Quite frustrating.. I'm running the code through pdb now, but haven't spotted anything yet.
The original sample script has "import signal" but no use of signals. However, I had a script causing this error message and it was due to my signal handling, so I'll explain here in case its what is happening for others. Within a signal handler, I was doing stuff with processes (e.g. creating a new process). Apparently this doesn't work, so I stopped doing that within the handler and fixed the error. (Note: sleep() functions wake up after signal handling so that can be an alternative approach to acting upon signals if you need to do things with processes)