The problem being that SIGKILL cannot be caught, is there a way to ensure that a block of code gets executed on program termination? I have a process that starts five subprocesses and changes the state of platform hardware - it handles SIGTERM and SIGINT, however a SIGKILL will kill the main process while the subprocesses continue to run. Is there a way to always execute a cleanup or do I have to accept that SIGKILL will leave the computer in a 'corrupt' state?
SIGKILL cannot be handled. This signal used for critical cases where you dont want the program to be able to ignore the signal and you want to force it to be killed.
SIGTERM is the complement, this need to be use when you want to let the program cleanup before it closed
for more information you can read: http://en.wikipedia.org/wiki/Unix_signal#Handling_signals
Related
I want to terminate a process gracefully by sending it a SIGTERM signal. Therefore, I tried to find the process and call proc.terminate():
[proc] = [proc in psutil.process_iter() if proc.name().lower() == "somename"]
proc.terminate()
This works, but leaves a zombie process which ps aux shows as somename <defunct>. I googled a bit and thought it would maybe help to add proc.wait(1) after terminating it, because I need to still listen for the exit status or something, but that didn't help either. I know that the process exits properly because proc.wait(1) does not throw a TimeoutExpired exception.
How do I terminate a process properly so that it does not become a zombie process using psutil?
Issues
I currently have a simple Python multithreaded server program, which will run forever with out manual interruption. I want to achieve that it can be terminated gracefully at some point. Once it is terminated, I want the server to output some stats.
Solutions I have tried
Terminate the program by kill. The issue is that the server cannot output the stats because the HARD termination.
Create a control thread in the program, which listens the key input. And if key is pressed, then terminate the program and get stats. The issue with this approach is I need to do every step manually. E.g, SSH to the device, start the program, and press key at some point.
Question
Is there a way that I can run some bash/or other program to stop the program gracefully with stats output?
Have you tried to use signal.signal() to register a handler for e.g. SIGTERM? There you could implement this part of code that throws out the statistics and then just terminate the program.
The standard approach is to either
make threads sufficiently short-lived
at the stop signal, stop spawning new ones and .join() the active ones.
or
make threads periodically (e.g. after serving each request) check some shared stop flag and quit when it's set
at the stop signal, set the stop flag, then .join() the threads
Some threads can be .setDaemon(True), but only if they can be safely killed off (there's no exception or anything raised in the thread, it's just stopped where it is).
If a thread is in a blocking call, it may be possible to unblock it by shutting down the facility that it is waiting on (close the socket or the stream).
I have a Python 2.7 multiprocessing Process which will not exit on parent process exit. I've set the daemon flag which should force it to exit on parent death. The docs state that:
"When a process exits, it attempts to terminate all of its daemonic child processes."
p = Process(target=_serverLaunchHelper, args=args)
p.daemon = True
print p.daemon # prints True
p.start()
When I terminate the parent process via a kill command the daemon is left alive and running (which blocks the port on the next run). The child process is starting a SimpleHttpServer and calling serve_forever without doing anything else. My guess is that the "attempts" part of the docs means that the blocking server process is stopping process death and it's letting the process get orphaned as a result. I could have the child push the serving to another Thread and have the main thread check for parent process id changes, but this seems like a lot of code to just replicate the daemon functionality.
Does someone have insight into why the daemon flag isn't working as described? This is repeatable on windows8 64 bit and ubuntu12 32 bit vm.
A boiled down version of the process function is below:
def _serverLaunchHelper(port)
httpd = SocketServer.TCPServer(("", port), Handler)
httpd.serve_forever()
When a process exits, it attempts to terminate all of its daemonic child processes.
The key word here is "attempts". Also, "exits".
Depending on your platform and implementation, it may be that the only way to get daemonic child processes terminated is to do so explicitly. If the parent process exits normally, it gets a chance to do so explicitly, so everything is fine. But if the parent process is terminated abruptly, it doesn't.
For CPython in particular, if you look at the source, terminating daemonic processes is handled the same way as joining non-daemonic processes: by walking active_children() in an atexit function. So, your daemons will be killed if and only if your atexit handlers get to run. And, as that module's docs say:
Note: the functions registered via this module are not called when the program is killed by a signal not handled by Python, when a Python fatal internal error is detected, or when os._exit() is called.
Depending on how you're killing the parent, you might be able to work around this by adding a signal handler to intercept abrupt termination. But you might not—e.g., on POSIX, SIGKILL is not intercept able, so if you kill -9 $PARENTPID, this isn't an option.
Another option is to kill the process group, instead of just the parent process. For example, if your parent has PID 12345, kill -- -12345 on linux will kill it and all of its children (assuming you haven't done anything fancy).
I have a Jython script that I run as a daemon. It starts up, logs into a server and then goes into a loop that checks for things to process, processes them, then sleeps for 5 seconds.
I have a cron job that checks every 5 minutes to make sure that the process is running and starts it again if not.
I have another cron job that once a day restarts the process no matter what. We do this because sometimes the daemon's connection to the server sometimes gets screwed up and there is no way to tell when this happens.
The problem I have with this "solution" is the 2nd cron job that kills the process and starts another one. Its okay if it gets killed while it is sleeping but bad things might happen if the daemon is in the middle of processing things when it is killed.
What is the proper way to stop a daemon process... instead of just killing it?
Is there a standard practice for this in general, in Python, or in Java?
In the future I may move to pure Python instead of Jython.
Thanks
You can send a SIGTERM first before sending SIGKILL when terminating the process and receive the signal by the Jython script.
For example, send a SIGTERM, which can be received and processed by your script and if nothing happens within a specified time period, you can send SIGKILL and force kill the process.
For more information on handling the events, please see the signal module documentation.
Also, example that may be handy (uses atexit hook):
#!/usr/bin/env python
from signal import signal, SIGTERM
from sys import exit
import atexit
def cleanup():
print "Cleanup"
if __name__ == "__main__":
from time import sleep
atexit.register(cleanup)
# Normal exit when killed
signal(SIGTERM, lambda signum, stack_frame: exit(1))
sleep(10)
Taken from here.
The normal Linux type way to do this would be to send a signal to your long-running process that's hanging. You can handle this with Python's built in signal library.
http://docs.python.org/library/signal.html
So, you can send a SIGHUP to your 1st app from your 2nd app, and handle it in the first based on whether you're in a state where it's OK to reboot.
I have a simple example:
from twisted.internet import utils,reactor
def test:
utils.getProcessOutput(executable="/bin/sleep",args=["10000"])
reactor.callWhenRunning(test)
reactor.run()
when I send signal "TERM" to program, "sleep" continues to be carried out, when I press Ctrl-C on keyboard "sleep" stopping. ( Ctrl-C is not equivalent signal TERM ?) Why ? How to kill "sleep" after send signal "TERM" to this program ?
Ctrl-C sends SIGINT to the entire foreground process group. That means it gets send to your Twisted program and to the sleep child process.
If you want to kill the sleep process whenever the Python process is going to exit, then you may want a before shutdown trigger:
def killSleep():
# Do it, somehow
reactor.addSystemEventTrigger('before', 'shutdown', killSleep)
As your example code is written, killSleep is difficult to implement. getProcessOutput doesn't give you something that easily allows the child to be killed (for example, you don't know its pid). If you use reactor.spawnProcess and a custom ProcessProtocol, this problem is solved though - the ProcessProtocol will be connected to a process transport which has a signalProcess method which you can use to send a SIGTERM (or whatever you like) to the child process.
You could also ignore SIGINT and this point and then manually deliver it to the whole process group:
import os, signal
def killGroup():
signal.signal(signal.SIGINT, signal.SIG_IGN)
os.kill(-os.getpgid(os.getpid()), signal.SIGINT)
reactor.addSystemEventTrigger('before', 'shutdown', killGroup)
Ignore SIGINT because the Twisted process is already shutting down and another signal won't do any good (and will probably confuse it or at least lead to spurious errors being reported). Sending a signal to -os.getpgid(os.getpid()) is how to send it to your entire process group.