How to stop a multiprocess Python program cleanly after an exception?

I have a Python program with several processes (for now only 2) and threads (2 per process). I would like to catch every exception, and in particular shut my program down cleanly on Ctrl+C, but I can't get it to work. Every time an exception occurs the program stops but does not shut down correctly, leaving me with an unusable command line.
What I have tried so far, in pseudocode, is:
try:
    for process in processes:
        process.join()
except:
    pass  # Just to suppress error messages, will be removed later
finally:
    for process in processes:
        process.terminate()
But as I already said, no luck. Also note that I get the exception error message for both processes, so I believe they are both halted?
Maybe I should also mention that most of the threads are blocked listening on a pipe.
EDIT
So I nearly got it working. I needed to wrap every thread in a try: and make sure the threads are joined correctly. There is just one flaw left: Exception KeyboardInterrupt in <module 'threading' from '/usr/lib64/python2.7/threading.pyc'> ignored when shutting down. This is raised in the main thread of the main process, even though that thread has already finished, meaning it has passed the last line of code.

The problem (I expect) is that the exceptions are raised inside the processes, not in the join calls.
I suggest wrapping each process's main method in a try-except block. Then have a flag (e.g. an instance of multiprocessing.Value) that the except clause sets to False. Each process can check the value of the flag and stop (cleaning up after itself) if it is set to False.
Note that if you just terminate() a process it won't clean up after itself, as this is the same as sending it a SIGTERM.
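A minimal sketch of that approach, where worker_main and its loop body are hypothetical stand-ins for the real per-process code (and which glosses over the fact that Ctrl+C is also delivered to the child processes):
import multiprocessing
import time

def worker_main(keep_running):
    # keep_running is a shared multiprocessing.Value checked by every process
    try:
        while keep_running.value:
            time.sleep(0.1)        # placeholder for one unit of real work
    except Exception:
        keep_running.value = 0     # an error here tells the other workers to stop
    finally:
        pass                       # close pipes, files, etc. before the process exits

if __name__ == '__main__':
    keep_running = multiprocessing.Value('b', 1)
    processes = [multiprocessing.Process(target=worker_main, args=(keep_running,))
                 for _ in range(2)]
    for p in processes:
        p.start()
    try:
        for p in processes:
            p.join()
    except KeyboardInterrupt:
        keep_running.value = 0     # Ctrl+C in the parent: ask the workers to stop
        for p in processes:
            p.join()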

Related

Python: throw exception in main thread on Windows shutdown (similar to Ctrl+C)

I have a Python script for automating simple tasks. Its main loop looks like this:
while True:
    input = download_task_input()
    if input:
        output = process_task(input)
        upload_task_output(output)
    sleep(60)
Some local files are altered during task processing. They are modified when the task is started, and restored back to a proper state when the task is done, or if an exception is caught. Restoring these files on program exit is very important to me: leaving them in an altered state causes some trouble later that I'd like to avoid.
When I want to terminate the script, I hit Ctrl+C. It raises a KeyboardInterrupt exception which both stops task processing and triggers file restoration. However, if I hit Ctrl+Break, the program is simply terminated: if a task is being processed at that moment, the local files are left in an altered state (which is undesirable).
The question: I'm worried about the situation when Windows OS is shutdown by pressing the Power button. Is it possible to make Python handle it exactly like it handles Ctrl+C? I.e. I'd like to detect OS shutdown in Python script and raise Python exception on the main thread.
I know it is possible to call the SetConsoleCtrlHandler function from WinAPI and install my own handler for situations like Ctrl+C, Ctrl+Break, Shutdown, etc. However, this handler seems to be executed in an additional thread, and raising an exception in it does not achieve anything. On the other hand, Python itself supposedly uses the same WinAPI feature to raise KeyboardInterrupt on the main thread on Ctrl+C, so it should be doable.
This is not a serious automation script, so I don't mind if a solution is hacky or not 100% reliable.
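One hacky sketch of that SetConsoleCtrlHandler idea, assuming Windows and ctypes (the handler body is my own untested guess, not a confirmed solution):
import ctypes
from ctypes import wintypes
import thread  # _thread on Python 3

CTRL_CLOSE_EVENT = 2
CTRL_LOGOFF_EVENT = 5
CTRL_SHUTDOWN_EVENT = 6

HandlerRoutine = ctypes.WINFUNCTYPE(wintypes.BOOL, wintypes.DWORD)

def console_handler(event):
    if event in (CTRL_CLOSE_EVENT, CTRL_LOGOFF_EVENT, CTRL_SHUTDOWN_EVENT):
        thread.interrupt_main()  # schedule KeyboardInterrupt in the main thread
        return True              # tell Windows the event was handled
    return False                 # otherwise fall through to the default handler

_handler = HandlerRoutine(console_handler)  # keep a reference so it is not GC'd
ctypes.windll.kernel32.SetConsoleCtrlHandler(_handler, True)
Two caveats: interrupt_main() only takes effect once the main thread runs Python bytecode again (a blocking sleep(60) is not interrupted immediately), and Windows gives the handler only a few seconds during shutdown, so this is best-effort at most.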

How can I sleep forever in Python?

I know that I can block the current thread with time.sleep(0xFFFFFFFFF), but is there another way?
I know that this may seem silly, but there are use cases.
For example, this could be used inside a try/except to catch a KeyboardInterrupt exception.
See this: https://stackoverflow.com/a/69744286/1951448
Or if there are daemonic threads running and there is nothing more to do, but you don't want the threads to be killed, then the main thread has to be suspended.
To clarify, I don't want to kill the thread, I want to suspend it.
It's unusual to want to block a thread indefinitely, so AFAIK there isn't an API designed specifically for that use case.
The simplest way to achieve that goal would be to call time.sleep() with a large value, as you suggested; wrap it in a while True: loop so that even if the specified time-period expires, your thread will wake up and then go immediately back to sleep again.
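A minimal sketch of that loop:
import time

while True:
    time.sleep(3600)  # wakes up once an hour only to go straight back to sleep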
OTOH if for some reason you want to guarantee that your thread never wakes up at all, no matter how much time passes, you could have it call recv() on a socket that you are certain will never actually receive any data, e.g.:
import socket
print("About to sleep forever")
[sockA, sockB] = socket.socketpair()
junk = sockA.recv(1) # will never return since sockA will never receive any data
print("This should never get printed")

python function not running as thread

This is done in Python 2.7.12.
serialHelper is a class module around Python serial, and this code does work nicely:
#!/usr/bin/env python
import threading
from time import sleep
import serialHelper

sh = serialHelper.SerialHelper()

def serialGetter():
    h = 0
    while True:
        h = h + 1
        s_resp = sh.getResponse()
        print ('response ' + s_resp)
        sleep(3)

if __name__ == '__main__':
    try:
        t = threading.Thread(target=sh.serialReader)
        t.setDaemon(True)
        t.start()
        serialGetter()
        #tSR = threading.Thread(target=serialGetter)
        #tSR.setDaemon(True)
        #tSR.start()
    except Exception as e:
        print (e)
However, the attempt to run serialGetter as a thread (as in the commented-out lines) just dies.
Any reason why that function cannot run as a thread?
Quoting from the Python documentation:
The entire Python program exits when no alive non-daemon threads are left.
So if you setDaemon(True) every new thread and then exit the main thread (by falling off the end of the script), the whole program will exit immediately. This kills all of the threads. Either don't use setDaemon(True), or don't exit the main thread without first calling join() on all of the threads you want to wait for.
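Applied to the code from the question, a minimal sketch (reusing sh and serialGetter from above, and leaving both threads non-daemonic) might look like this:
if __name__ == '__main__':
    t = threading.Thread(target=sh.serialReader)
    t.start()
    tSR = threading.Thread(target=serialGetter)
    tSR.start()
    t.join()    # the main thread now waits here instead of exiting
    tSR.join()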
Stepping back for a moment, it may help to think about the intended use case of a daemon thread. In Unix, a daemon is a process that runs in the background and (typically) serves requests or performs operations, either on behalf of remote clients over the network or local processes. The same basic idea applies to daemon threads:
1. You launch the daemon thread with some kind of work queue.
2. When you need some work done on the thread, you hand it a work object.
3. When you want the result of that work, you use an event or a future to wait for it to complete.
4. After requesting some work, you always eventually wait for it to complete, or perhaps cancel it (if your worker protocol supports cancellation).
5. You don't have to clean up the daemon thread at program termination. It just quietly goes away when there are no other threads left.
The problem is step (4). If you forget about some work object, and exit the app without waiting for it to complete, the work may get interrupted. Daemon threads don't gracefully shut down, so you could leave the outside world in an inconsistent state (e.g. an incomplete database transaction, a file that never got closed, etc.). It's often better to use a regular thread, and replace step (5) with an explicit "Finish up your work and shut down" work object that the main thread hands to the worker thread before exiting. The worker thread then recognizes this object, stops waiting on the work queue, and terminates itself once it's no longer doing anything else. This is slightly more up-front work, but is much safer in the event that a work object is inadvertently abandoned.
Because of all of the above, I recommend not using daemon threads unless you have a strong reason for them.
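A minimal sketch of that shutdown-object pattern; the SHUTDOWN sentinel, work_queue, and handle() are made-up names, not from the question:
import threading
import Queue   # 'queue' on Python 3

SHUTDOWN = object()          # the explicit "finish up and shut down" work object
work_queue = Queue.Queue()

def worker():
    while True:
        item = work_queue.get()
        if item is SHUTDOWN:
            break            # stop waiting on the queue; the thread ends cleanly
        handle(item)         # hypothetical function that performs the real work

t = threading.Thread(target=worker)   # a regular, non-daemonic thread
t.start()
# ... hand work objects to the worker with work_queue.put(...) ...
work_queue.put(SHUTDOWN)     # before exiting, ask the worker to finish up
t.join()                     # the interpreter exits only after the worker is done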

Graceful python joblib kill

Is it possible to gracefully kill a joblib process (threading backend), and still return the results computed so far?
parallel = Parallel(n_jobs=4, backend="threading")
result = parallel(delayed(dummy_f)(x) for x in range(100))
For the moment I have come up with two solutions:
parallel._aborted = True, which waits for the started jobs to finish (in my case that can take very long)
parallel._terminate_backend(), which hangs if jobs are still in the pipe (parallel._jobs not empty)
Is there a way to workaround the lib to do this ?
As far as I know, Joblib does not provide methods to kill spawned threads.
As each child thread runs in its own context, it's actually difficult to perform graceful killing or termination.
That being said, there is a workaround that could be adopted.
Mimic the .join() functionality of threading (sort of):
1. Create a shared dict shared_dict with one key per thread id; each value will hold either the thread's output or the Exception it raised, e.g.:
shared_dict = {i: None for i in range(num_workers)}
2. Whenever an error is raised in any thread, catch it in a handler and, instead of re-raising it immediately, store it in shared_dict.
3. Create an exception handler that waits for all(shared_dict.values()).
4. Once every slot holds either a result or an error, exit the program by raising the error, logging it, or whatever suits you.
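A minimal sketch of that workaround, assuming the threading backend (dummy_f is the hypothetical task from the question, and the error-handling policy at the end is up to you):
from joblib import Parallel, delayed

def dummy_f(x):                               # hypothetical task from the question
    return x * x

shared_dict = {i: None for i in range(100)}   # one slot per task

def safe_f(i):
    # Store either the result or the exception instead of letting it propagate,
    # so the remaining tasks keep running and their results are preserved.
    try:
        shared_dict[i] = dummy_f(i)
    except Exception as e:
        shared_dict[i] = e

Parallel(n_jobs=4, backend="threading")(delayed(safe_f)(i) for i in range(100))

errors = [v for v in shared_dict.values() if isinstance(v, Exception)]
results = [v for v in shared_dict.values() if not isinstance(v, Exception)]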

Unlabeled exception in threading

I have a chunk of code like this
def f(x):
    try:
        g(x)
    except Exception, e:
        print "Exception %s: %d" % (x, e)

def h(x):
    thread.start_new_thread(f, (x,))
Once in a while, I get this:
Unhandled exception in thread started by
Error in sys.excepthook:
Original exception was:
Unlike the code sample, that's the complete text. I assume after the "by" there's supposed to be a thread ID and after the colon there are supposed to be stack traces, but nope, nothing. I don't know how to even start to debug this.
The error you're seeing means the interpreter was exiting (because the main thread exited) while another thread was still executing Python code. Python will clean up its environment, cleaning out and throwing away all of the loaded modules (to make sure as many finalizers as possible execute) but unfortunately that means the still-running thread will start raising exceptions when it tries to use something that was already destroyed. And then that exception propagates up to the start_new_thread function that started the thread, and it will try to report the exception -- only to find that what it tries to use to report the exception is also gone, which causes the confusing empty error messages.
In your specific example, this is all caused by your thread being started and your main thread exiting right away. Whether the newly started thread gets a chance to run before, during or after the interpreter exits (and thus whether you see it run as normal, run partially and report an error or never see it run) is entirely up to the OS thread scheduler.
If you're using threads (and avoiding them is not a bad thing), you probably don't want threads still running while the interpreter is exiting. The threading.Thread class is a better interface for starting new threads, and by default it will make the interpreter wait for all threads on exit. If you really don't want to wait for a thread to end, you can set its 'daemonic' flag in the Thread object to get the old behaviour -- including the problem you see here.
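For the code in the question, a sketch of the threading.Thread version (g is whatever function the question is actually calling) could look like this:
import threading

def f(x):
    try:
        g(x)
    except Exception, e:
        print "Exception %s: %s" % (x, e)

def h(x):
    t = threading.Thread(target=f, args=(x,))
    t.start()
    return t    # non-daemonic by default, so the interpreter waits for it on exit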
