Gracefully handle the termination event in a Python script

I am writing an informational bot, and I need to handle the script termination event both on Windows (the system I use for development) and on Linux (the server).
I have tried the approach using the signal module, but it did not work:
signal.signal(signal.SIGTERM, sigterm_handler)
I need to close connections gracefully upon exit, shut down the database connection, and so on.
What is the correct way to handle the script termination event on both operating systems (Windows and Linux)?
Thanks.
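A minimal cross-platform sketch, assuming the cleanup amounts to closing network and database connections (cleanup and handle_exit are hypothetical names): register handlers for the signals each platform can actually deliver and funnel them into one exit path, letting atexit run the cleanup.
import atexit
import signal
import sys

def cleanup():
    # Hypothetical cleanup: close the bot's network and database connections.
    print("closing connections...")

def handle_exit(signum, frame):
    # sys.exit raises SystemExit, which lets atexit hooks and finally blocks run.
    sys.exit(0)

atexit.register(cleanup)
signal.signal(signal.SIGINT, handle_exit)    # Ctrl+C on both platforms
signal.signal(signal.SIGTERM, handle_exit)   # kill <pid> on Linux
if sys.platform == "win32":
    signal.signal(signal.SIGBREAK, handle_exit)  # Ctrl+Break, Windows only
As far as I know, Windows never actually delivers SIGTERM to a process (os.kill there falls back to hard termination), so on Windows the Ctrl+C and Ctrl+Break events are the realistic hooks; on Linux the SIGTERM handler covers a normal kill.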

Related

Python, How to handle signal in Windows

This is specifically for Windows; I don't have this issue on Linux-based systems.
So, I have a program that creates subprocesses when it runs.
These subprocesses terminate correctly if the program exits normally, or even on an exception or a Ctrl+C event, by using try/except KeyboardInterrupt/finally under if __name__ == '__main__':.
However, if I kill the program mid-run using the STOP button in PyCharm, those subprocesses do not terminate. I'm not exactly sure what signal this STOP button sends on Windows.
I tried signal handling using signal.signal(signal.SIGTERM, handler). It doesn't work. I have tried SIGTERM and SIGINT (SIGKILL, CTRL_C_EVENT, and CTRL_BREAK_EVENT don't work in a signal handler); none of them works. I have also read this post: How to handle the signal in python on windows machine.
How can I exit gracefully in this scenario, i.e. when the STOP button in PyCharm is used?
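One portable workaround (a sketch, not specific to PyCharm; worker.py and watch_parent are hypothetical names): give each subprocess a pipe from the parent and have the child treat EOF on that pipe as "the parent is gone". The OS closes the pipe however the parent is killed, so the child can clean up even after a hard stop.
# parent side: spawn the worker with a pipe attached to its stdin
import subprocess
import sys

worker = subprocess.Popen([sys.executable, "worker.py"],
                          stdin=subprocess.PIPE)

# worker.py: watch the inherited pipe and exit when the parent dies
import os
import sys
import threading

def watch_parent():
    sys.stdin.read()   # blocks until the parent dies and the pipe closes
    # ... run any cleanup here ...
    os._exit(0)        # end the whole process from this watcher thread

threading.Thread(target=watch_parent, daemon=True).start()
# ... the worker's real work continues here ...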

terminate Python multithreaded program with log output

Issues
I currently have a simple Python multithreaded server program that runs forever without manual interruption. I want to be able to terminate it gracefully at some point. Once it is terminated, I want the server to output some stats.
Solutions I have tried
Terminating the program with kill. The issue is that the server cannot output the stats because of the hard termination.
Creating a control thread in the program that listens for key input and, when a key is pressed, terminates the program and collects the stats. The issue with this approach is that I need to do every step manually, e.g. SSH to the device, start the program, and press a key at some point.
Question
Is there a way I can run some bash script or other program to stop the server gracefully, with the stats output?
Have you tried using signal.signal() to register a handler for, e.g., SIGTERM? In the handler you could run the part of the code that writes out the statistics and then terminate the program.
The standard approach is to either
make threads sufficiently short-lived
at the stop signal, stop spawning new ones and .join() the active ones.
or
make threads periodically (e.g. after serving each request) check some shared stop flag and quit when it's set
at the stop signal, set the stop flag, then .join() the threads
Some threads can be made daemonic (thread.daemon = True, formerly .setDaemon(True)), but only if they can be safely killed off (no exception or anything is raised in the thread; it is just stopped where it is).
If a thread is in a blocking call, it may be possible to unblock it by shutting down the facility that it is waiting on (close the socket or the stream).
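A minimal sketch of the stop-flag approach, assuming the stop signal is a SIGTERM sent by kill and a simple dict stands in for the real stats:
import signal
import threading
import time

stop = threading.Event()
stats = {"requests": 0}

def handle_sigterm(signum, frame):
    stop.set()                        # ask the worker threads to wind down

signal.signal(signal.SIGTERM, handle_sigterm)

def worker():
    while not stop.is_set():          # re-check the flag after each request
        time.sleep(0.1)               # stand-in for serving one request
        stats["requests"] += 1        # fine for a demo; use a lock in real code

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

while not stop.is_set():              # main thread just waits for the signal
    time.sleep(0.5)
for t in threads:
    t.join()                          # let in-flight requests finish
print("requests served:", stats["requests"])
With this in place, a plain kill <pid> from a bash script (no SIGKILL needed) triggers the handler, the threads drain, and the stats are printed on the way out.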

Python3 Non-blocking input or killing threads

Reading through posts on similar questions, I strongly suspect there is no way to do what I'm trying to do, but I figured I'd ask. I have a Python 3 program that is designed to run headless, receiving commands from remote users that have logged in. One of the commands, of course, is a shutdown so that the program can be ended cleanly. This part is working correctly.
However, while working on this I realized that an option to enter commands directly, without a remote connection, would be useful in case something unusual prevented remote access. I added a local_control function that runs in its own thread so that it doesn't interfere with the main loop. This works great for all commands except shutdown.
I have a variable that both loops monitor so that they can end when the shutdown command is sent. Sending the shutdown command from within local_control works fine because the loop ends before getting back to input(). However, when the shutdown command is sent remotely, the program doesn't end until someone presses the Enter key locally, because that loop remains stuck at input(). As soon as Enter is pressed the program continues, successfully breaks the loop, and proceeds with the shutdown as normal. Below is an example of my code.
import sys
import threading

runserver = True

def local_control():  # system to control the server without remote access
    global runserver
    while runserver:
        raw_input = input()
        if raw_input == "shutdown":
            runserver = False

mythread = threading.Thread(target=local_control)
mythread.start()

while runserver:
    some_input = get_remote_input()  # getting a command from a remote user
    if some_input == "shutdown":
        runserver = False

sys.exit(0)  # the server is shut down cleanly
Because the program runs primarily headless, GUI options such as pygame aren't available. Other solutions I've found online involve libraries that are not cross-platform, such as msvcrt, termios, and curses. Although it's not as clean an option, I'd settle for simply killing the thread to end it, but there is no way to do that either. So is there a cross-platform, non-GUI option for non-blocking input? Or is there another way to break a blocked loop from another thread?
Your network-IO thread is blocking the processing of commands while waiting for remote commands, so it will only evaluate the state of runserver after get_remote_input() returns (and its command is processed).
You will need three threads:
One which loops in local_control(), sending commands to the processing thread.
One which loops on get_remote_input(), also sending commands to the processing thread.
A processing thread (possibly the main thread).
A queue will probably be helpful here, since you need to avoid the race condition caused by the unsynchronized access to runserver that is currently present.
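A sketch of that layout, reusing get_remote_input() from the question; both input threads are daemonic, so a blocked input() no longer keeps the process alive once the processing thread decides to exit:
import queue
import threading

commands = queue.Queue()

def local_control():                       # local console commands
    while True:
        commands.put(input())

def remote_control():                      # remote commands, as in the question
    while True:
        commands.put(get_remote_input())

threading.Thread(target=local_control, daemon=True).start()
threading.Thread(target=remote_control, daemon=True).start()

# Processing thread (here, the main thread): the only place server
# state is read or written, so no flag needs to be shared.
while True:
    cmd = commands.get()                   # wakes for either input source
    if cmd == "shutdown":
        break                              # daemon threads die with the process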
Not a portable solution, but on *nix you might be able to send yourself an interrupt signal from the local_control function to break the blocking input(). You'll need the pthread ID of the network control thread (call pthread_self and save it somewhere readable from local_control) so that you can call pthread_kill.

proper way to stop a daemon process

I have a Jython script that I run as a daemon. It starts up, logs into a server and then goes into a loop that checks for things to process, processes them, then sleeps for 5 seconds.
I have a cron job that checks every 5 minutes to make sure that the process is running and starts it again if not.
I have another cron job that restarts the process once a day no matter what. We do this because the daemon's connection to the server sometimes gets screwed up and there is no way to tell when this happens.
The problem I have with this "solution" is the second cron job, which kills the process and starts another one. It's okay if the daemon gets killed while it is sleeping, but bad things might happen if it is killed in the middle of processing.
What is the proper way to stop a daemon process... instead of just killing it?
Is there a standard practice for this in general, in Python, or in Java?
In the future I may move to pure Python instead of Jython.
Thanks
You can send a SIGTERM before sending SIGKILL when terminating the process, and have the Jython script receive the signal.
For example, send a SIGTERM, which can be received and processed by your script; if nothing happens within a specified time period, send a SIGKILL to force-kill the process.
For more information on handling the events, please see the signal module documentation.
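The cron-job side of that handshake might look like this sketch (stop_daemon and the grace period are illustrative, not a standard API):
import os
import signal
import time

def stop_daemon(pid, grace_seconds=10):
    """Ask the process to exit with SIGTERM; escalate to SIGKILL."""
    os.kill(pid, signal.SIGTERM)
    deadline = time.time() + grace_seconds
    while time.time() < deadline:
        try:
            os.kill(pid, 0)          # signal 0 only checks for existence
        except ProcessLookupError:
            return True              # the daemon exited cleanly
        time.sleep(0.5)
    os.kill(pid, signal.SIGKILL)     # hard kill after the grace period
    return False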
Also, an example that may be handy (it uses an atexit hook):
#!/usr/bin/env python
from signal import signal, SIGTERM
from sys import exit
import atexit

def cleanup():
    print("Cleanup")

if __name__ == "__main__":
    from time import sleep
    atexit.register(cleanup)
    # Exit normally when killed, so the atexit hook runs
    signal(SIGTERM, lambda signum, stack_frame: exit(1))
    sleep(10)
Taken from here.
The normal Linux way to do this would be to send a signal to your long-running process that's hanging. You can handle this with Python's built-in signal library.
http://docs.python.org/library/signal.html
So you can send a SIGHUP to your first app from your second app, and handle it in the first based on whether you're in a state where it's OK to restart.
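As a sketch of that idea (the busy flag and cleanup_and_exit are placeholders for your daemon's own state and shutdown routine): note the request in the handler and act on it only between work items, so the daemon is never killed mid-processing.
import signal

busy = False                 # True while the daemon is mid-processing
restart_requested = False

def on_sighup(signum, frame):
    global restart_requested
    restart_requested = True           # don't restart mid-job; just note it

signal.signal(signal.SIGHUP, on_sighup)    # POSIX only; Windows has no SIGHUP

# In the daemon's main loop, between work items:
#   if restart_requested and not busy:
#       cleanup_and_exit()             # hypothetical shutdown routine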

MPI signal handling

When using mpirun, is it possible to catch signals (for example, the SIGINT generated by ^C) in the code being run?
For example, I'm running a parallelized Python code. I can catch KeyboardInterrupt with an except clause when running python blah.py by itself, but I can't when running mpirun -np 1 python blah.py.
Does anyone have a suggestion? Even finding how to catch signals in a C or C++ compiled program would be a helpful start.
If I send a signal to the spawned Python processes, they can handle the signals properly; however, signals sent to the parent orterun process (e.g. from exceeding wall time on a cluster, or from pressing Ctrl+C in a terminal) will kill everything immediately.
I think it is really implementation-dependent.
In SLURM, I tried using sbatch --signal USR1#30 to send SIGUSR1 (whose signal number is 30, 10, or 16, depending on the platform) to the program launched by srun commands, and the process received signal SIGUSR1 = 10.
For IBM Platform MPI, according to https://www.ibm.com/support/knowledgecenter/en/SSF4ZA_9.1.4/pmpi_guide/signal_propagation.html,
SIGINT, SIGUSR1, and SIGUSR2 will be passed through to the processes.
In MPICH, SIGUSR1 is used by the process manager for internal notification of abnormal failures.
ref: http://lists.mpich.org/pipermail/discuss/2014-October/003242.html
Open MPI, on the other hand, will forward SIGUSR1 and SIGUSR2 from mpiexec to the other processes.
ref: http://www.open-mpi.org/doc/v1.6/man1/mpirun.1.php#sect14
For Intel MPI, according to https://software.intel.com/en-us/mpi-developer-reference-linux-hydra-environment-variables,
I_MPI_JOB_SIGNAL_PROPAGATION and I_MPI_JOB_TIMEOUT_SIGNAL can be set to propagate signals.
Another thing worth noting: many Python scripts invoke other libraries or code through Cython, and if SIGUSR1 is caught by the subprocess, something unwanted might happen.
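Whichever launcher you use, the handler on the Python side looks the same; a minimal sketch (whether the signal actually reaches each rank depends on the MPI implementation, as listed above):
# run with e.g.: mpirun -np 4 python blah.py
import signal
import sys

def on_sigusr1(signum, frame):
    # e.g. checkpoint before the scheduler's wall-time kill arrives
    print("rank received SIGUSR1, shutting down cleanly")
    sys.exit(0)

signal.signal(signal.SIGUSR1, on_sigusr1)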
If you use mpirun --nw, then mpirun itself should terminate as soon as it has started the subprocesses, instead of waiting for their termination; if that's acceptable, then I believe your processes would be able to catch their own signals.
The signal module supports setting signal handlers using signal.signal:
Set the handler for signal signalnum to the function handler. handler can be a callable Python object taking two arguments (see below), or one of the special values signal.SIG_IGN or signal.SIG_DFL. The previous signal handler will be returned ...
import signal

def ignore(sig, stack):
    print("I'm ignoring signal %d" % (sig,))

signal.signal(signal.SIGINT, ignore)

while True:
    pass
If you send a SIGINT to a Python interpreter running this script (via kill -INT <pid>), it will print a message and simply continue to run.
