When running Python from a Linux shell (same behavior observed in both bash and ksh) and generating a SIGINT with a Ctrl-C keypress, I have discovered behavior that I am unable to understand, and which has frustrated me considerably.
When I press Ctrl-C, the Python process appropriately terminates, but the shell continues to the next command on the line, as shown in the following console capture:
$ python -c "import time; time.sleep(100)"; echo END
^CTraceback (most recent call last):
  File "<string>", line 1, in <module>
KeyboardInterrupt
END
In contrast, I had expected, and would like, that the shell processes the signal in such a way that execution does not continue to the next command on the line, as I see when I call the sleep function from a bash subshell instead of from Python.
For example, I would expect the above capture to appear more similar to the following:
$ bash -c "sleep 100"; echo END
^C
Python 2 and 3 are installed on my system, and while the above capture was generated running Python 2, both behave the same way.
My best explanation is that when I press Ctrl-C while the Python process is running, the signal somehow goes directly to the Python process, whereas normally it is handled by the calling shell and then propagated to the subprocess. However, I have no idea why or how Python is causing this difference.
The examples above are trivial tests but the behavior is also observed in real-world uses. Installing custom signal handlers does not resolve the issue.
After considerable digging I found a few loosely related questions on Stack Overflow that eventually led me to an article describing the proper handling of SIGINT. (The most relevant section is How to be a proper program.)
From this information, I was able to solve the problem. Without it, I would never have come close.
The solution is best illustrated by beginning with a Bash script that cannot be terminated by a keyboard interrupt, but which does hide the ugly stack trace from Python's KeyboardInterrupt exception.
A basic example might appear as follows:
#!/usr/bin/env bash
echo "Press Ctrl-C to stop... No sorry it won't work."
while true
do
    python -c '
import time, signal
signal.signal(signal.SIGINT, signal.SIG_IGN)
time.sleep(100)
'
done
For the outer script to process the interrupt, the following change is required:
echo "Press Ctrl-C to stop..."
while true
do
    python -c '
import time, signal
signal.signal(signal.SIGINT, signal.SIG_DFL)
time.sleep(100)
'
done
However, the solution makes it impossible to use a custom handler (for example, to perform cleanup). If doing so is required, then a more sophisticated approach is needed.
The required change is illustrated as follows:
#!/usr/bin/env bash
echo "Press [CTRL+C] to stop ..."
while true
do
    python -c '
import time, signal, os

def handle_int(signum, frame):
    # Cleanup code here
    signal.signal(signum, signal.SIG_DFL)
    os.kill(os.getpid(), signum)

signal.signal(signal.SIGINT, handle_int)
time.sleep(100)
'
done
The reason appears to be that unless the inner process terminates by running the system's default SIGINT handler, the parent bash process does not realize that the child terminated because of a keyboard interrupt, and so does not terminate itself.
I have not fully understood all the ancillary issues quite yet, such as whether the parent process is not receiving the SIGINT from the system, or is receiving a signal, but ignoring it. I also have no idea what the default handler does or how the parent detects that it was called. If I am able to learn more, I will offer an update.
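The detection mechanism can be made concrete with a small sketch of my own (using the POSIX wait-status macros rather than anything bash-specific): the parent learns about a signal death from the status returned by waitpid, which is exactly what the shell inspects after each child finishes.

```python
import os
import signal

pid = os.fork()
if pid == 0:
    # Child: restore the default SIGINT disposition and deliver the
    # signal to ourselves, mimicking Ctrl-C under the default handler.
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    os.kill(os.getpid(), signal.SIGINT)
    os._exit(1)  # never reached

# Parent: the wait status records *how* the child terminated.
_, status = os.waitpid(pid, 0)
if os.WIFSIGNALED(status):
    print("child killed by signal", os.WTERMSIG(status))
else:
    print("child exited with status", os.WEXITSTATUS(status))
```

Here the parent reports a death by signal 2 (SIGINT). Had the child caught KeyboardInterrupt and called sys.exit(1), WIFSIGNALED would be false, and the shell would treat it as an ordinary non-zero exit rather than an interrupt.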
I must advance the question of whether the current behavior of Python should be considered a design flaw. I have seen various manifestations of this issue over the years when calling Python from a shell script, but have not had the luxury of investigating until now. However, I have not found a single article on the topic through a web search. If the issue does represent a flaw, I am surprised that more developers do not appear to be affected.
The behavior of any program that gets a CTRL+C is up to that program. Usually the behavior is to exit, but some programs might just abort some internal procedure instead of stopping the whole program. It's even possible (though it may be considered bad manners) for a program to ignore the keystroke completely.
The behavior of the program is defined by the signal handlers it has set up. The system provides default dispositions (which do things like exit on SIGTERM and SIGINT), but a program can install its own handlers that will run instead. Not all signals allow arbitrary responses: SIGSEGV (a segfault) can technically be caught, but there is little a handler can safely do other than clean up and exit (the system decides whether to write a core dump), and SIGKILL can't be handled at all (the OS kernel takes care of it).
To customize signal handlers in Python, you'll want to use the signal module from the standard library. You can call signal.signal to set your own signal handler function for any of the signals defined by your system's C library. Typing CTRL+C is going to send SIGINT on any UNIX-based system, so that's probably what you'll want to handle if you want your own behavior.
Try something like this:
import signal
import sys
import time
def interrupt_handler(sig, frame):
    sys.exit(1)

signal.signal(signal.SIGINT, interrupt_handler)
time.sleep(100)
If you run this script and interrupt it with CTRL+C, it should exit silently, just like your bash script does.
You could explicitly handle it on the bash side in a script file like this:
if python -c "import time; time.sleep(100)"; then
    echo END
fi
or, more aggressively,
python -c "import time; time.sleep(100)"
[[ $? -ne 0 ]] && exit
echo END
$? is the return status code of the previous command: 0 means it exited fine, and anything else indicates an error. So we use the short-circuit nature of && to succinctly exit if the previous command fails.
(See https://unix.stackexchange.com/questions/186826/parent-script-continues-when-child-exits-with-non-zero-exit-code for more info on that)
Note: this will exit the bash script for any kind of Python failure, not just Ctrl-C, e.g. an IndexError, an AssertionError, etc.
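If you want to bail out only on an interrupt rather than on every failure, one option (my own extension of the answer, relying on the common 128 + signal-number convention, and assuming the Python child actually dies from the default SIGINT handler as in the solutions above, not from a KeyboardInterrupt traceback) is to test the status range. The child below simulates a Ctrl-C death by self-delivering SIGINT, and the script echoes rather than exits, so the sketch is easy to run:

```shell
python3 -c '
import os, signal
# Restore the default SIGINT disposition, then deliver SIGINT to ourselves
signal.signal(signal.SIGINT, signal.SIG_DFL)
os.kill(os.getpid(), signal.SIGINT)
'
status=$?
if [ "$status" -ge 128 ]; then
    # 130 = 128 + SIGINT(2): the child died from a signal, so stop here
    echo "killed by signal $((status - 128))"
else
    echo END
fi
```

In a real script you would `exit "$status"` in the first branch instead of echoing. Remember that Python's default KeyboardInterrupt behavior exits with status 1, which this check deliberately does not treat as a signal death.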
Related
index.js
const childProcess = require("child_process");
childProcess.spawn("python", ["main.py"]);
main.py
import time
while True:
    time.sleep(1)
When running the NodeJS process with node index.js, it runs forever since the Python child process it spawns runs forever.
When the NodeJS process is ended by x-ing out of the Command Prompt, the Python process ends as well, which is desired, but how can you run some cleanup code in the Python code before it exits?
Previous attempts
Looked in the documentation for child_process.spawn for how this termination is communicated from parent to child, perhaps by a signal. Didn't find it.
In Python, used signal.signal(signal.SIGTERM, handler) (and likewise signal.SIGINT). Didn't get the handler to run (though ending the NodeJS process with Ctrl-C instead of closing the window did get the SIGINT handler to run, even though I'm not explicitly forwarding input from the NodeJS process to the Python child process).
Lastly, though this reproducible example is valid and much simpler, my real-life use case involves Electron, so in case that introduces a complication, or a solution, I figured I'd mention it.
Windows 10, NodeJS 12.7.0, Python 3.8.3
Problem: When executing a Python script from the command line, it catches and handles SIGTERM signals as expected. However, if the script is called by a bash script, and the bash script then sends the signal to the Python script, the SIGTERM is not handled as expected.
The python script in question is extremely simple: it waits for a SIGTERM and then waits for a few seconds before exiting.
#!/usr/bin/env python3
import sys
import signal
import time
# signal handler
def sigterm_handler(signum, frame):
    time.sleep(5)
    print("dying")
    sys.exit()

# register the signal handler
signal.signal(signal.SIGTERM, sigterm_handler)

while True:
    time.sleep(1)
If this is called directly and then the signal sent from the command line
i.e.
> ./sigterm_tester.py &
> kill -15 <PID>
the signal handling performs normally (it waits 5 seconds, prints "dying" to stdout, and exits)
However, if it is instead called from a bash script, it no longer seems to catch the SIGTERM and instead exits immediately.
This simple bash script executes the python script and then kills its child (the python script). However, the termination occurs immediately instead of after a 5-second delay, and there is no printing of "dying" to stdout (or to a file when I attempted stdout redirection).
#!/bin/bash
./sigterm_tester.py &
child=$(pgrep -P $$)
kill -15 $child
while true;
do
    sleep 1
done
Some additional information: I have also tested this with sh as well as bash, and the same behavior occurs. Additionally, I have tested this and gotten the same behavior in a macOS environment as well as a Linux environment. I also tested it with both Python 2 and Python 3.
My question is why is the behavior different seemingly dependent on how the program is called, and is there a way to ensure that the python program appropriately handles signals even when called from a bash script?
Summing up @Matt Walck's comments: in the bash script you were killing the Python process right after invoking it, which might not have left enough time for it to register its SIGTERM handler. Adding a sleep command between the spawn and the kill command backs the theory up.
#!/bin/bash
./sigterm_tester.py &
child=$(pgrep -P $$)
#DEBUGONLY
sleep 2
kill -15 $child
while true;
do
    sleep 1
done
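A variant worth considering (my own sketch, not part of the answer above): take the child's PID from $! instead of a pgrep lookup, and wait for it so the script both gets the right PID and reaps the child. The Python payload is inlined here so the sketch is self-contained:

```shell
#!/bin/bash
python3 -c '
import signal, sys, time

def sigterm_handler(signum, frame):
    print("dying")
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
time.sleep(30)
' &
child=$!        # PID of the background job, no pgrep needed
sleep 2         # give the interpreter time to register the handler
kill -15 "$child"
wait "$child"   # reap the child; its handler prints "dying" first
```

Because the handler calls sys.exit(0), the wait reports a clean exit, confirming the SIGTERM was caught rather than terminating the process with the default action.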
I have two scripts: "autorun.py" and "main.py". I added "autorun.py" as a service to the autorun in my Linux system. Works perfectly!
Now my question is: when I launch "main.py" from my autorun script, and "main.py" runs forever, "autorun.py" never terminates either! So when I do
sudo service autorun-test start
the command also never finishes!
How can I run "main.py" and then exit, and to finish it up, how can I then stop "main.py" when "autorun.py" is launched with the parameter "stop" ? (this is how all other services work I think)
EDIT:
Solution:
import sys, os
import daemon

if sys.argv[1] == "start":
    print "Starting..."
    with daemon.DaemonContext(working_directory="/home/pi/python"):
        execfile("main.py")
else:
    pid = int(open("/home/pi/python/main.pid").read())
    try:
        os.kill(pid, 9)
        print "Stopped!"
    except OSError:
        print "No process with PID " + str(pid)
First, if you're trying to create a system daemon, you almost certainly want to follow PEP 3143, and you almost certainly want to use the daemon module to do that for you.
When I want to launch "main.py" from my autorun script, and "main.py" will run forever, "autorun.py" never terminates as well!
You didn't say how you're running it. If you're doing anything that launches main.py as a child and waits (or, worse, tries to import/execfile/etc. in the same process), you can't do that. Either autorun.py has to launch and detach main.py (or do so indirectly via some external tool), or main.py has to daemonize when launched.
how can I then stop "main.py" when "autorun.py" is launched with the parameter "stop" ?
You need some form of inter-process communication (IPC), and some way for autorun to find the right IPC channel to use.
If you're building a network server, the right answer might be to connect to it as a client. But otherwise, the simplest thing to do is kill the process with a signal.
If you're using the daemon module, it can easily map signals to callbacks. Or, if you don't need any cleanup, just use SIGTERM, which by default will abruptly terminate. If neither of those applies, you will have to set up a custom signal handler (and within that handler do something useful—e.g., set a flag that your main code checks periodically).
How do you know what process to send the signal to? The standard way to do this is to have main.py record its PID in a pidfile at startup. You read that pidfile, and signal whatever process is specified there. (If you get an error because there is no process with that PID, that just means the daemon already quit for some reason—possibly because of an unhandled exception, or even a segfault. You may want to log that, but treat the "stop" as successful otherwise.) Again, if you're using daemon, it does the pidfile stuff for you; if not, you have to do it yourself.
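A minimal sketch of the "stop" side described above (the pidfile path and function name are mine for illustration; the daemon module can manage the pidfile for you):

```python
import os
import signal

PIDFILE = "/tmp/main.pid"  # hypothetical location

def stop():
    """Signal the process recorded in the pidfile, tolerating a stale file."""
    try:
        with open(PIDFILE) as f:
            pid = int(f.read())
    except FileNotFoundError:
        print("no pidfile; daemon not running?")
        return
    try:
        os.kill(pid, signal.SIGTERM)
        print("Stopped!")
    except ProcessLookupError:
        # Stale pidfile: the daemon already died (crash, unhandled
        # exception, ...); treat the stop as successful but clean up.
        print("stale pidfile for PID", pid)
        os.remove(PIDFILE)
```

The ProcessLookupError branch is the "no process with that PID" case from the paragraph above: log it, remove the stale file, and report the stop as successful.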
You may want to take a look at the service scripts for daemons that came with your computer. They're probably all written in bash rather than Python, but it's not that hard to figure out what they're doing. Or… just use one of them as a skeleton, in which case you don't really need any bash knowledge; it's just search-and-replace on the name.
If your distro has LSB-style init functions, you can use something like this example. That one does a whole lot more than you need to, but it's a good example of all of the details. Or do it all from scratch with something like this example. This one is doing the pidfile management and the backgrounding from the service script (turning a non-daemon program into a daemon), which you don't need if you're using daemon properly, and it's using SIGHUP instead of SIGTERM. You can google yourself for other examples of init.d service scripts.
But again, if you're just trying to do this for your own system, the best thing to do is look inside the /etc/init.d on your distro. There will be dozens of examples there, and 90% of them will be exactly the same except for the name of the daemon.
In Python, I wrote the following code to see if I could get my program to not terminate upon Control+C like all those fancy terminal apps such as Vim or Dwarf Fortress.
def getinput():
    x = input('enter something: ')

while True:
    try:
        getinput()
    except KeyboardInterrupt:
        pass
Unfortunately, in the Windows console, this script terminates after a few seconds. If I run it in IDLE, it works as expected. Python version is 3.2.1, 3.2 acted the same. Am I doing something wrong?
EDIT: If I hold down Control+C, that is.
In order to not terminate on Control-C you need to set a signal handler.
From the Python doc here:
Python installs a small number of signal handlers by default: SIGPIPE is ignored (so write errors on pipes and sockets can be reported as ordinary Python exceptions) and SIGINT is translated into a KeyboardInterrupt exception. All of these can be overridden.
So you would need to install a signal handler to catch the SIGINT signal and do what you want on that.
IDLE probably installs its own handler that prevents the application from exiting.
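A concrete sketch of such a handler (names are mine, not from the answer; note this relies on POSIX signal delivery, and the Windows console where the question was observed delivers Ctrl-C through a different mechanism, which may be why holding it down still kills the script there):

```python
import os
import signal

interrupted = []

def ignore_interrupt(sig, frame):
    # Runs instead of Python raising KeyboardInterrupt
    interrupted.append(sig)
    print("interrupted, carrying on")

signal.signal(signal.SIGINT, ignore_interrupt)

# Simulate Ctrl-C by delivering SIGINT to ourselves: no exception escapes.
os.kill(os.getpid(), signal.SIGINT)
print("still running")
```

The handler is invoked in place of the default KeyboardInterrupt translation, so control returns to the main program afterwards.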
When using mpirun, is it possible to catch signals (for example, the SIGINT generated by ^C) in the code being run?
For example, I'm running a parallelized python code. I can except KeyboardInterrupt to catch those errors when running python blah.py by itself, but I can't when doing mpirun -np 1 python blah.py.
Does anyone have a suggestion? Even finding how to catch signals in a C or C++ compiled program would be a helpful start.
If I send a signal to the spawned Python processes, they can handle the signals properly; however, signals sent to the parent orterun process (i.e. from exceeding wall time on a cluster, or pressing control-C in a terminal) will kill everything immediately.
I think it is really implementation dependent.
In SLURM, I tried using sbatch --signal=USR1@30 to send SIGUSR1 (whose signal number is 30, 10, or 16, depending on the platform) to the program launched by srun commands, and the process received signal SIGUSR1 = 10.
For platform MPI of IBM, according to https://www.ibm.com/support/knowledgecenter/en/SSF4ZA_9.1.4/pmpi_guide/signal_propagation.html
SIGINT, SIGUSR1, SIGUSR2 will be bypassed to processes.
In MPICH, SIGUSR1 is used by the process manager for internal notification of abnormal failures.
ref: http://lists.mpich.org/pipermail/discuss/2014-October/003242.html
Open MPI, on the other hand, will forward SIGUSR1 and SIGUSR2 from mpiexec to the other processes.
ref: http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect14
For IntelMPI, according to https://software.intel.com/en-us/mpi-developer-reference-linux-hydra-environment-variables
I_MPI_JOB_SIGNAL_PROPAGATION and I_MPI_JOB_TIMEOUT_SIGNAL can be set to send signal.
Another thing worth noticing: many Python scripts invoke other libraries or code through Cython, and if SIGUSR1 is caught by the subprocess, something unwanted might happen.
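Inside the Python ranks themselves, catching a forwarded signal looks like any other handler registration. A minimal sketch (assuming the launcher actually forwards SIGUSR1, per the notes above; here the signal is self-delivered just to show the handler firing):

```python
import os
import signal

received = []

def on_usr1(signum, frame):
    # e.g. write a checkpoint and request a clean shutdown of this rank
    received.append(signum)
    print("rank received SIGUSR1")

signal.signal(signal.SIGUSR1, on_usr1)

# In a real job the launcher (srun, mpiexec, ...) delivers this from outside.
os.kill(os.getpid(), signal.SIGUSR1)
```

Whether this handler ever runs under a given launcher depends on the forwarding behavior documented per implementation above.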
If you use mpirun --nw, then mpirun itself should terminate as soon as it's started the subprocesses, instead of waiting for their termination; if that's acceptable then I believe your processes would be able to catch their own signals.
The signal module supports setting signal handlers using signal.signal:
Set the handler for signal signalnum to the function handler. handler can be a callable Python object taking two arguments (see below), or one of the special values signal.SIG_IGN or signal.SIG_DFL. The previous signal handler will be returned ...
import signal
def ignore(sig, stack):
print "I'm ignoring signal %d" % (sig, )
signal.signal(signal.SIGINT, ignore)
while True: pass
If you send a SIGINT to a Python interpreter running this script (via kill -INT <pid>), it will print a message and simply continue to run.