I've got the following code which uses a concurrent.futures.ThreadPoolExecutor to launch processes of another program in a metered way (no more than 30 at a time). I additionally want the ability to stop all work if I ctrl-C the python process. This code works with one caveat: I have to ctrl-C twice. The first time I send the SIGINT, nothing happens; the second time, I see the "sending SIGKILL to processes" message, the processes die, and it works. What is happening to my first SIGINT?
import concurrent.futures
import signal
import subprocess

execution_list = [['prog', 'arg1'], ['prog', 'arg2']]  # ... etc

processes = []

def launch_instance(args):
    process = subprocess.Popen(args)
    processes.append(process)
    process.wait()

try:
    with concurrent.futures.ThreadPoolExecutor(max_workers=30) as executor:
        results = list(executor.map(launch_instance, execution_list))
except KeyboardInterrupt:
    print('sending SIGKILL to processes')
    for p in processes:
        if p.poll() is None:  # if process is still alive
            p.send_signal(signal.SIGKILL)
I stumbled upon your question while trying to solve something similar. Not 100% sure that it will solve your use case (I'm not using subprocesses), but I think it will.
Your code will stay within the context manager of the executor as long as the jobs are still running. My educated guess is that the first KeyboardInterrupt will be caught by the ThreadPoolExecutor, whose default behaviour is to start no new jobs, wait until the current ones are finished, and then clean up (and probably re-raise the KeyboardInterrupt). But the processes are probably long-running, so you wouldn't notice. The second KeyboardInterrupt then interrupts this error handling.
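You can see where that wait happens: Executor.__exit__ is, roughly paraphrasing the CPython source, just a blocking shutdown, so the with block is not left until every running job has finished:

def __exit__(self, exc_type, exc_val, exc_tb):
    self.shutdown(wait=True)  # blocks until all running jobs have finished
    return False              # exceptions (like KeyboardInterrupt) propagate afterwards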
How I solved my problem (infinite background processes in separate threads) is with the following code:
from concurrent.futures import ThreadPoolExecutor
import signal
import threading
from time import sleep

def loop_worker(exiting):
    while not exiting.is_set():
        try:
            print("started work")
            sleep(10)
            print("finished work")
        except KeyboardInterrupt:
            print("caught keyboardinterrupt")  # never caught here; just for demonstration purposes

def loop_in_worker():
    exiting = threading.Event()

    def signal_handler(signum, frame):
        print("Setting exiting event")
        exiting.set()

    signal.signal(signal.SIGTERM, signal_handler)

    with ThreadPoolExecutor(max_workers=1) as executor:
        executor.submit(loop_worker, exiting)
        try:
            while not exiting.is_set():
                sleep(1)
                print('waiting')
        except KeyboardInterrupt:
            print('Caught keyboardinterrupt')
            exiting.set()
    print("Main thread finished (and thus all others)")

if __name__ == '__main__':
    loop_in_worker()
It uses an Event to signal to the threads that they should stop what they are doing. In the main thread, there is a loop that just keeps busy and checks for exceptions. Note that this loop is within the context of the ThreadPoolExecutor.
As a bonus it also handles the SIGTERM signal by using the same exiting Event.
If you add a loop between processes.append(process) and process.wait() that checks for a signal, then it will probably solve your use case as well. What actions you take there depends on what you want to do with the running processes.
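For example, a minimal sketch of that adaptation (reusing the exiting event idea from my code above, and your original SIGKILL choice; the 0.5 s poll interval is arbitrary):

import signal
import subprocess
import threading
from time import sleep

exiting = threading.Event()
processes = []

def launch_instance(args):
    process = subprocess.Popen(args)
    processes.append(process)
    # poll instead of blocking in process.wait(), so the event is noticed
    while process.poll() is None:
        if exiting.is_set():
            process.send_signal(signal.SIGKILL)
            break
        sleep(0.5)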
If you run my script from the command line and press ctrl-C you should see something like:
started work
waiting
waiting
^CCaught keyboardinterrupt
# some time passes here
finished work
Main thread finished (and thus all others)
Inspiration for my solution came from this blog post.
Is there a way in Python to interrupt a thread when it's sleeping (as we can do in Java)? I am looking for something like this:
import threading
from time import sleep

def f():
    print('started')
    try:
        sleep(100)
        print('finished')
    except SleepInterruptedException:  # hypothetical exception, like Java's InterruptedException
        print('interrupted')

t = threading.Thread(target=f)
t.start()

if input() == 'stop':
    t.interrupt()  # hypothetical method; threading.Thread has no interrupt()
The thread sleeps for 100 seconds, and if I type 'stop', it should be interrupted.
The correct approach is to use threading.Event. For example:
import threading
e = threading.Event()
e.wait(timeout=100) # instead of time.sleep(100)
In the other thread, you need to have access to e. You can interrupt the sleep by issuing:
e.set()
This will immediately interrupt the sleep. You can check the return value of e.wait to determine whether it timed out or was interrupted (it returns True if the event was set, and False if the timeout elapsed). For more information refer to the documentation: https://docs.python.org/3/library/threading.html#event-objects
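Putting this together with the question's example, a minimal sketch:

import threading

e = threading.Event()

def f():
    print('started')
    if e.wait(timeout=100):  # True: the event was set, i.e. we were interrupted
        print('interrupted')
    else:                    # False: the full 100-second timeout elapsed
        print('finished')

t = threading.Thread(target=f)
t.start()

if input() == 'stop':
    e.set()  # wakes the sleeping thread immediately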
How about using condition objects: https://docs.python.org/2/library/threading.html#condition-objects
Instead of sleep() you use wait(timeout). To "interrupt" you call notify().
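For example, a minimal sketch along those lines (Condition.wait must be called with the lock held; it returns False if the timeout expired):

import threading

cond = threading.Condition()

def f():
    print('started')
    with cond:
        interrupted = cond.wait(timeout=100)  # instead of sleep(100)
    print('interrupted' if interrupted else 'finished')

t = threading.Thread(target=f)
t.start()

if input() == 'stop':
    with cond:
        cond.notify()  # wakes the waiting thread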
If you, for whatever reason, need to use the time.sleep function, expect it to throw an exception, and simply want to test what happens with large sleep values without having to wait out the whole timeout...
Firstly, sleeping threads are lightweight and there's no problem just letting them run in daemon mode with threading.Thread(target=f, daemon=True) (so that they exit when the program does). You can check the result of the thread without waiting for the whole execution with t.join(0.5).
But if you absolutely need to halt the execution of the function, you could use multiprocessing.Process, and call .terminate() on the spawned process. This does not give the process time to clean up (e.g. except and finally blocks aren't run), so use it with care.
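A minimal sketch of that second option (assuming nothing beyond the standard library):

import multiprocessing as mp
import time

def f():
    print('started')
    time.sleep(100)
    print('finished')  # never reached if the process is terminated

if __name__ == '__main__':
    p = mp.Process(target=f)
    p.start()
    if input() == 'stop':
        p.terminate()  # SIGTERM on POSIX; except/finally blocks in f() are not run
    p.join()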
I am trying to find a way to handle SIGTERM nicely and have my subprocesses terminate when the main process receives a SIGTERM.
Basically, I am creating processes manually (but I believe the issue is the same with mp.Pool, for example)
import multiprocessing as mp
...
workers = [
    mp.Process(
        target=worker,
        args=(...,)
    ) for _ in range(nb_workers)
]
and I am catching signals
signal.signal(signal.SIGTERM, term)
signal.signal(signal.SIGINT, term)
signal.signal(signal.SIGQUIT, term)
signal.signal(signal.SIGABRT, term)
When a signal is caught, I want to terminate all my subprocesses and exit. I do not want to wait for them to finish running, as their individual run times can be pretty long (think a few minutes).
In the same way, I cannot really set a threading.Event() that all processes would check periodically, as each is basically doing one huge but slow operation (depending on a few libraries).
My idea was to set a flag when a signal is caught, and then have a watchdog terminate all subprocesses when the flag is set. But using .terminate() also uses SIGTERM, which is caught again by my signal handlers.
For example, simplified code:
import multiprocessing as mp
import signal
import time

FLAG = False

def f(x):
    time.sleep(5)
    print(x)
    return x * x

def term(signum, frame):
    print(f'Received Signal {signum}')
    global FLAG
    FLAG = True

def terminate(w):
    for process in w:
        print('Terminating worker {}'.format(process.pid))
        process.terminate()
        process.join()
        process.close()

signal.signal(signal.SIGTERM, term)
signal.signal(signal.SIGINT, term)
signal.signal(signal.SIGQUIT, term)
signal.signal(signal.SIGABRT, term)

if __name__ == '__main__':
    workers = [
        mp.Process(
            target=f,
            args=(i,)
        ) for i in range(4)
    ]
    for process in workers:
        process.start()

    while not FLAG:
        time.sleep(0.1)

    print('flag set')
    terminate(workers)
    print('Done')
If I interrupt the code before the processes are done (with ctrl-c):
Received Signal 2
Received Signal 2
Received Signal 2
Received Signal 2
Received Signal 2
flag set
Terminating worker 27742
Received Signal 15
0
Terminating worker 27743
Received Signal 15
1
3
2
Terminating worker 27744
Terminating worker 27745
Done
As you can see, it seems that .terminate() does not terminate the subprocesses, as they keep running to the end, and it appears we catch the resulting SIGTERM (15) in our handler too.
So far, my solutions are:
somehow manage to have the processes periodically check a threading.Event(). This means completely rethinking what our current processes are doing.
use .kill() instead of .terminate(). This works on Linux, but it is a less clean exit. Not sure about Windows, but I was under the impression that on Windows .kill == .terminate.
do not catch SIGTERM anymore, assuming the program will never get killed this way (unlikely)
Is there any clean way to handle this?
The solution very much depends on what platform you are running on, as is often the case for Python questions tagged with [multiprocessing], and it is for that reason one is supposed to also tag such questions with the specific platform, such as [linux]. I am inferring that your platform is not Windows, since signal.SIGQUIT is not defined for that platform. So I will go with Linux.
For Linux you do not want your subprocesses to handle the signals at all (and it's sort of nonsensical for them to be calling function term on a Ctrl-C interrupt, for example). For Windows, however, you want your subprocesses to ignore these interrupts. That means you want your main process to call signal only after it has created the subprocesses.
Instead of using FLAG to indicate that the main process should terminate and having the main process loop test this value periodically, it is simpler, cleaner and more efficient to have the main process just wait on a threading.Event instance, done_event. Although, for some reason, this does not seem to work on Windows; the main process's wait call does not get satisfied immediately.
You would like some provision to terminate gracefully if and when your processes complete normally and no signal has been triggered. The easiest way to accomplish all your goals, including this one, is to make your subprocesses daemon processes that will terminate when the main process terminates. Then create a daemon thread that simply waits for the subprocesses to terminate normally and sets done_event when that occurs. So the main process will fall through the call to done_event.wait() on either an interrupt of some sort or normal completion. All it has to do now is just end normally; there is no need to call terminate against the subprocesses, since they will end when the main process ends.
import multiprocessing as mp
from threading import Thread, Event
import signal
import time
import sys

IS_WINDOWS = sys.platform == 'win32'

def f(x):
    if IS_WINDOWS:
        signal.signal(signal.SIGTERM, signal.SIG_IGN)
        signal.signal(signal.SIGINT, signal.SIG_IGN)
        signal.signal(signal.SIGABRT, signal.SIG_IGN)
    time.sleep(5)
    print(x)
    return x * x

def term(signum, frame):
    print(f'Received Signal {signum}')
    if IS_WINDOWS:
        globals()['FLAG'] = True
    else:
        done_event.set()

def process_wait_thread():
    """
    Wait for processes to finish normally and set done_event.
    """
    for process in workers:
        process.join()
    if IS_WINDOWS:
        globals()['FLAG'] = True
    else:
        done_event.set()

if __name__ == '__main__':
    if IS_WINDOWS:
        globals()['FLAG'] = False
    else:
        done_event = Event()
    workers = [
        mp.Process(
            target=f,
            args=(i,),
            daemon=True
        ) for i in range(4)
    ]
    for process in workers:
        process.start()
    # We don't want subprocesses to inherit these, so
    # call signal() after we start the processes:
    signal.signal(signal.SIGTERM, term)
    signal.signal(signal.SIGINT, term)
    if not IS_WINDOWS:
        signal.signal(signal.SIGQUIT, term)  # not supported by Windows at all
    signal.signal(signal.SIGABRT, term)
    Thread(target=process_wait_thread, daemon=True).start()
    if IS_WINDOWS:
        while not globals()['FLAG']:
            time.sleep(0.1)
    else:
        done_event.wait()
    print('Done')
I am new to Python multithreading and trying to understand the basic difference between joining multiple worker threads and calling abort on them after I am done processing with them. Can somebody please explain with an example?
.join() and setting an abort flag are two different steps in cleanly shutting down a thread.
join() just waits for a thread that is going to terminate anyway to be finished. Thus:
import threading
import time

def thread_main():
    time.sleep(10)

t = threading.Thread(target=thread_main)
t.start()
t.join()
This is a reasonable program. The join just waits until the thread is finished. It doesn't do anything to make that happen, but the thread will terminate anyway, because it is just a 10 second sleep.
In contrast
import threading
import time

def thread_main():
    while True:
        time.sleep(10)

t = threading.Thread(target=thread_main)
t.start()
t.join()
is not a good idea, because join will still wait for the thread to terminate on its own. But the thread will never do that, because it loops forever. Thus the whole program can't terminate.
That's the point where you want some kind of signaling to the thread so it stops itself:
import threading
import time

stop_thread = False

def thread_main():
    while not stop_thread:
        time.sleep(10)

t = threading.Thread(target=thread_main)
t.start()

stop_thread = True
t.join()
Here stop_thread takes the role of your __abort flag and signals the thread to stop after it has finished its latest piece of work (the sleep(10) in this case).
Thus this program again is reasonable and terminates when asked to.
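An equivalent, arguably more idiomatic variant (a sketch along the same lines) replaces the bare flag with threading.Event, whose wait() doubles as an interruptible sleep:

import threading

stop_event = threading.Event()

def thread_main():
    # wait(10) returns False on each timeout and True once the event is set,
    # so this both sleeps for 10 seconds and checks the stop condition
    while not stop_event.wait(10):
        print('still working')  # periodic work goes here

t = threading.Thread(target=thread_main)
t.start()

stop_event.set()  # signal the thread to stop; its wait() returns immediately
t.join()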
Another popular way to signal a thread to stop when the thread uses a consumer pattern (i.e. gets its work from a queue) is to post a special 'terminate now' work item as alternative to setting a flag variable:
def thread_main():
    while True:
        (quit, data) = work_queue.get()  # work items are (quit, data) tuples
        if quit:
            break
        do_work(data)
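A minimal sketch of the producer side under that pattern, assuming work_queue is a queue.Queue shared with the thread (the do_work body here is purely illustrative):

import queue
import threading

work_queue = queue.Queue()

def do_work(data):
    print('working on', data)

def thread_main():
    while True:
        (quit, data) = work_queue.get()
        if quit:
            break
        do_work(data)

t = threading.Thread(target=thread_main)
t.start()

work_queue.put((False, 'some data'))
work_queue.put((True, None))  # the special 'terminate now' work item
t.join()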
I'm writing a multithreaded Python app on Windows.
I used to terminate the app using ctrl-c, but once I added threading.Timer instances ctrl-c stopped working (or sometimes takes a very long time).
How could this be?
What's the relation between having Timer threads and ctrl-c?
UPDATE:
I found the following in Python's thread documentation:
Threads interact strangely with interrupts: the KeyboardInterrupt exception will be received by an arbitrary thread. (When the signal module is available, interrupts always go to the main thread.)
The way threading.Thread (and thus threading.Timer) works is that each thread registers itself with the threading module, and upon interpreter exit the interpreter will wait for all registered threads to exit before terminating the interpreter proper. This is done so threads actually finish execution, instead of having the interpreter brutally removed from under them. So when you hit ^C, the main thread receives the signal, decides to terminate and waits for the timers to finish.
You can set threads daemonic (with the setDaemon method) to make the threading module not wait for these threads, but if they happen to be executing Python code while the interpreter exits, you get confusing errors during exit. Even if you cancel the threading.Timer (and set it daemonic) it can still wake up while the interpreter is being destroyed -- because threading.Timer's cancel method just tells the threading.Timer not to execute anything when it wakes up, but it has to actually execute Python code to make that determination.
There is no graceful way to terminate threads (other than the current one), and no reliable way to interrupt a thread that's blocked. A more manageable approach to timers is usually an event loop, like the ones GUIs and other event-driven systems offer you. What to use depends entirely on what else your program will be doing.
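A minimal sketch of the behaviour described above (standard library only): without the daemon flag, ^C appears to hang until the timer fires, because the interpreter waits for the non-daemon timer thread:

import threading
import time

def fire():
    print('timer fired')

t = threading.Timer(60.0, fire)
t.daemon = True  # comment this out to reproduce the apparent hang on ^C
t.start()

try:
    while True:
        time.sleep(1)
except KeyboardInterrupt:
    t.cancel()  # tells the timer not to run fire() when it wakes up
    print('exiting')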
There is a presentation by David Beazley that sheds some light on the topic. The PDF is available here. Look around pages 22--25 ("Interlude: Signals" to "Frozen Signals").
This is a possible workaround: using time.sleep() instead of Timer means a "graceful shutdown" mechanism can be implemented, at least for Python 3, where, it appears, KeyboardInterrupt is only raised in user code for the main thread. Otherwise, it appears, the exception is "ignored", as per here: in fact it results in the thread where it occurs dying immediately, but not in any ancestor threads, where, problematically, it can't be caught.
Let's say you want Ctrl-C responsiveness to be 0.5 seconds, but you only want to repeat some actual work every 5 seconds (work is of random duration as below):
import threading, sys, time, random

blip_counter = 0
work_threads = []

def repeat_every_5():
    global blip_counter
    print(f'counter: {blip_counter}')

    def real_work():
        real_work_duration_s = random.randrange(10)
        print(f'do some real work every 5 seconds, lasting {real_work_duration_s} s: starting...')
        # in a real-world situation stop_event.is_set() can be tested anywhere in the code
        for interval_500ms in range(real_work_duration_s * 2):
            if threading.current_thread().stop_event.is_set():
                print('stop_event SET!')
                return
            time.sleep(0.5)
        print('...real work ends')

    # clean up work_threads as appropriate (iterate over a copy, since we remove items)
    for work_thread in work_threads[:]:
        if not work_thread.is_alive():
            print(f'work thread {work_thread} dead, removing from list')
            work_threads.remove(work_thread)

    new_work_thread = threading.Thread(target=real_work)
    # stop event for graceful shutdown
    new_work_thread.stop_event = threading.Event()
    work_threads.append(new_work_thread)
    # in fact, because a graceful shutdown is now implemented, new_work_thread doesn't have to be a daemon
    # new_work_thread.daemon = True
    new_work_thread.start()

    blip_counter += 1
    time.sleep(5)
    # reschedule in a new daemon timer thread
    timer_thread = threading.Thread(target=repeat_every_5)
    timer_thread.daemon = True
    timer_thread.start()

repeat_every_5()
while True:
    try:
        time.sleep(0.5)
    except KeyboardInterrupt:
        print(f'shutting down due to Ctrl-C..., work threads left: {len(work_threads)}')
        # trigger stop event for graceful shutdown
        for work_thread in work_threads:
            if work_thread.is_alive():
                print(f'work_thread {work_thread}: setting STOP event')
                work_thread.stop_event.set()
                print(f'work_thread {work_thread}: joining to main...')
                work_thread.join()
                print(f'work_thread {work_thread}: ...joined to main')
            else:
                print(f'work_thread {work_thread} has died')
        sys.exit(1)
This while True: mechanism looks a bit clunky. But I think, as I say, that currently (Python 3.8.x) KeyboardInterrupt can only be caught on the main thread.
PS: according to my experiments, handling child processes may be easier, in the sense that Ctrl-C will, it seems, in a simple case at least, cause a KeyboardInterrupt to occur simultaneously in all running processes.
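A quick sketch to observe this (on POSIX, Ctrl-C delivers SIGINT to every process in the foreground process group, so the child raises its own KeyboardInterrupt):

import multiprocessing as mp
import time

def child():
    try:
        time.sleep(60)
    except KeyboardInterrupt:
        print('child: KeyboardInterrupt')

if __name__ == '__main__':
    p = mp.Process(target=child)
    p.start()
    try:
        time.sleep(60)
    except KeyboardInterrupt:
        print('main: KeyboardInterrupt')
    p.join()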
Wrap your main while loop in a try/except:
from threading import Timer
import time

def randomfn():
    print("Heartbeat sent!")

class RepeatingTimer(Timer):
    def run(self):
        # run the callback repeatedly until cancel() sets self.finished
        while not self.finished.is_set():
            self.function(*self.args, **self.kwargs)
            self.finished.wait(self.interval)

t = RepeatingTimer(10.0, function=randomfn)

print("Starting...")
t.start()

while True:
    try:
        print("Hello")
        time.sleep(1)
    except KeyboardInterrupt:  # a bare except would also swallow unrelated errors
        print("Cancelled timer...")
        t.cancel()
        print("Cancelled loop...")
        break

print("End")
Results:
Heartbeat sent!
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Hello
Cancelled timer...
Cancelled loop...
End