I'm playing around with threads on python 3.7.4, and I want to use atexit to register a cleanup function that will (cleanly) terminate the threads.
For example:
# example.py
import threading
import queue
import atexit
import sys

Terminate = object()

class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        self.queue = queue.Queue()

    def send_message(self, m):
        self.queue.put_nowait(m)

    def run(self):
        while True:
            m = self.queue.get()
            if m is Terminate:
                break
            else:
                print("Received message: ", m)

def shutdown_threads(threads):
    for t in threads:
        print(f"Terminating thread {t}")
        t.send_message(Terminate)
    for t in threads:
        print(f"Joining on thread {t}")
        t.join()
    else:
        print("All threads terminated")

if __name__ == "__main__":
    threads = [
        Worker()
        for _ in range(5)
    ]
    atexit.register(shutdown_threads, threads)
    for t in threads:
        t.start()
    for t in threads:
        t.send_message("Hello")
        #t.send_message(Terminate)
    sys.exit(0)
However, it seems interacting with the threads and queues in the atexit callback creates a deadlock with some internal shutdown routine:
$ python example.py
Received message: Hello
Received message: Hello
Received message: Hello
Received message: Hello
Received message: Hello
^CException ignored in: <module 'threading' from '/usr/lib64/python3.7/threading.py'>
Traceback (most recent call last):
File "/usr/lib64/python3.7/threading.py", line 1308, in _shutdown
lock.acquire()
KeyboardInterrupt
Terminating thread <Worker(Thread-1, started 140612492904192)>
Terminating thread <Worker(Thread-2, started 140612484511488)>
Terminating thread <Worker(Thread-3, started 140612476118784)>
Terminating thread <Worker(Thread-4, started 140612263212800)>
Terminating thread <Worker(Thread-5, started 140612254820096)>
Joining on thread <Worker(Thread-1, stopped 140612492904192)>
Joining on thread <Worker(Thread-2, stopped 140612484511488)>
Joining on thread <Worker(Thread-3, stopped 140612476118784)>
Joining on thread <Worker(Thread-4, stopped 140612263212800)>
Joining on thread <Worker(Thread-5, stopped 140612254820096)>
All threads terminated
(the KeyboardInterrupt is me pressing ctrl-c, since the process seems to hang indefinitely).
However, if I send the Terminate message before exiting (uncomment the line after t.send_message("Hello")), the program doesn't hang and terminates gracefully:
$ python example.py
Received message: Hello
Received message: Hello
Received message: Hello
Received message: Hello
Received message: Hello
Terminating thread <Worker(Thread-1, stopped 140516051592960)>
Terminating thread <Worker(Thread-2, stopped 140516043200256)>
Terminating thread <Worker(Thread-3, stopped 140515961992960)>
Terminating thread <Worker(Thread-4, stopped 140515953600256)>
Terminating thread <Worker(Thread-5, stopped 140515945207552)>
Joining on thread <Worker(Thread-1, stopped 140516051592960)>
Joining on thread <Worker(Thread-2, stopped 140516043200256)>
Joining on thread <Worker(Thread-3, stopped 140515961992960)>
Joining on thread <Worker(Thread-4, stopped 140515953600256)>
Joining on thread <Worker(Thread-5, stopped 140515945207552)>
All threads terminated
This raises the question: when does this threading._shutdown routine get executed, relative to atexit handlers?
Does it make sense to interact with threads in atexit handlers?
You can use one daemon thread to ask your non-daemon threads to clean up gracefully. For an example where this is necessary, if you are using a third-party library that starts a non-daemon thread, you'd either have to change that library or do something like:
import threading

def monitor_thread():
    main_thread = threading.main_thread()
    main_thread.join()
    send_signal_to_non_daemon_thread_to_gracefully_shutdown()

monitor = threading.Thread(target=monitor_thread)
monitor.daemon = True
monitor.start()

start_non_daemon_thread()
To put this in the context of the original poster's code (note that we don't need the atexit function, since atexit handlers won't be called until all non-daemon threads have stopped):
if __name__ == "__main__":
    threads = [
        Worker()
        for _ in range(5)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.send_message("Hello")
        #t.send_message(Terminate)

    def monitor_thread():
        main_thread = threading.main_thread()
        main_thread.join()
        shutdown_threads(threads)

    monitor = threading.Thread(target=monitor_thread)
    monitor.daemon = True
    monitor.start()
atexit.register(func) registers func as a function to be executed at termination.
After the main thread executes its last line of code (sys.exit(0) in the example above), the interpreter invokes threading._shutdown to wait for all non-daemon threads (the Workers created above) to exit:
The entire Python program exits when no alive non-daemon threads are left.
So after you type CTRL+C, the main thread is terminated by the SIGINT signal, and only then does the interpreter call the atexit-registered functions.
By the way, if you pass daemon=True to Thread.__init__, the program runs to completion without any human interaction.
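To illustrate that last point, here is a sketch (not from the original post) of the example rewritten with daemon workers. Note the trade-off: any messages still queued at exit are silently dropped, because daemon threads are killed abruptly rather than drained.

```python
import threading
import queue

Terminate = object()

class Worker(threading.Thread):
    def __init__(self):
        # daemon=True: threading._shutdown will not wait for this
        # thread, so the program exits as soon as main returns.
        super().__init__(daemon=True)
        self.queue = queue.Queue()

    def send_message(self, m):
        self.queue.put_nowait(m)

    def run(self):
        while True:
            m = self.queue.get()
            if m is Terminate:
                break
            print("Received message: ", m)

if __name__ == "__main__":
    threads = [Worker() for _ in range(5)]
    for t in threads:
        t.start()
    for t in threads:
        t.send_message("Hello")
    # No atexit handler and no join: daemon threads are terminated
    # when the main thread exits, at the cost of losing any messages
    # still sitting in their queues.
```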
I have come across a Python sub-process issue that I have replicated on Python 3.6 and 3.7 that I do not understand. I have a program, call it Main, that launches an external process using subprocess.Popen(), call it "Slave". The Main program registers a SIGTERM signal handler. Main waits on the Slave process to complete using either proc.wait(None) or proc.wait(timeout). The Slave process can be interrupted by sending a SIGTERM signal to Main. The sigterm handler will send a SIGINT signal to the Slave and wait(30) for it to terminate. If Main is using wait(None), then the sigterm handler's wait(30) will wait the full 30 seconds even though the slave process has terminated. If Main is using the wait(timeout) version, then the sigterm handler's wait(30) will return as soon as the Slave terminates.
Here is a small test app that demonstrates the issue. Run it via python wait_test.py to use the non-timeout wait(None). Run it via python wait_test.py <timeout value> to provide a specific timeout to the Main wait.
Once the program is running, execute kill -15 <pid> and see how the app reacts.
#
# Save this to a file called wait_test.py
#
import signal
import subprocess
import sys
from datetime import datetime

slave_proc = None

def sigterm_handler(signum, stack):
    print("Process received SIGTERM signal {} while processing job!".format(signum))
    print("slave_proc is {}".format(slave_proc))
    if slave_proc is not None:
        try:
            print("{}: Sending SIGINT to slave.".format(datetime.now()))
            slave_proc.send_signal(signal.SIGINT)
            slave_proc.wait(30)
            print("{}: Handler wait completed.".format(datetime.now()))
        except subprocess.TimeoutExpired:
            slave_proc.terminate()
        except Exception as exception:
            print('Sigterm Exception: {}'.format(exception))
            slave_proc.terminate()
            slave_proc.send_signal(signal.SIGKILL)

def main(wait_val=None):
    with open("stdout.txt", 'w+') as stdout:
        with open("stderr.txt", 'w+') as stderr:
            proc = subprocess.Popen(["python", "wait_test.py", "slave"],
                                    stdout=stdout,
                                    stderr=stderr,
                                    universal_newlines=True)
            print('Slave Started')
            global slave_proc
            slave_proc = proc
            try:
                proc.wait(wait_val)  # If this is a no-timeout wait, i.e. wait(None), it will hang in sigterm_handler.
                print('Slave Finished by itself.')
            except subprocess.TimeoutExpired as te:
                print(te)
                print('Slave finished by timeout')
                proc.send_signal(signal.SIGINT)
                proc.wait()
            print("Job completed")

if __name__ == '__main__':
    if len(sys.argv) > 1 and sys.argv[1] == 'slave':
        while True:
            pass
    signal.signal(signal.SIGTERM, sigterm_handler)
    main(int(sys.argv[1]) if len(sys.argv) > 1 else None)
    print("{}: Exiting main.".format(datetime.now()))
Here is an example of the two runs:
Note here the 30 second delay
--------------------------------
[mkurtz@localhost testing]$ python wait_test.py
Slave Started
Process received SIGTERM signal 15 while processing job!
slave_proc is <subprocess.Popen object at 0x7f79b50e8d90>
2022-03-30 11:08:15.526319: Sending SIGINT to slave. <--- 11:08:15
Slave Finished by itself.
Job completed
2022-03-30 11:08:45.526942: Exiting main. <--- 11:08:45
Note here the instantaneous shutdown
-------------------------------------
[mkurtz@localhost testing]$ python wait_test.py 100
Slave Started
Process received SIGTERM signal 15 while processing job!
slave_proc is <subprocess.Popen object at 0x7fa2412a2dd0>
2022-03-30 11:10:03.649931: Sending SIGINT to slave. <--- 11:10:03.649
2022-03-30 11:10:03.653170: Handler wait completed. <--- 11:10:03.653
Slave Finished by itself.
Job completed
2022-03-30 11:10:03.673234: Exiting main. <--- 11:10:03.673
These specific tests were run using Python 3.7.9 on CentOS 7.
Can someone explain this behavior?
The Popen class has an internal lock for wait operations:
# Held while anything is calling waitpid before returncode has been
# updated to prevent clobbering returncode if wait() or poll() are
# called from multiple threads at once. After acquiring the lock,
# code must re-check self.returncode to see if another thread just
# finished a waitpid() call.
self._waitpid_lock = threading.Lock()
The major difference between wait() and wait(timeout=...) is that the former waits indefinitely while holding the lock, whereas the latter is a busy loop that releases the lock on each iteration.
if timeout is not None:
    ...
    while True:
        if self._waitpid_lock.acquire(False):
            try:
                ...
                # wait without any delay
                (pid, sts) = self._try_wait(os.WNOHANG)
                ...
            finally:
                self._waitpid_lock.release()
        ...
        time.sleep(delay)
else:
    while self.returncode is None:
        with self._waitpid_lock:  # acquire lock unconditionally
            ...
            # wait indefinitely
            (pid, sts) = self._try_wait(0)
This isn't a problem for regular concurrent code – i.e. threading – since the thread running wait() and holding the lock will be woken up as soon as the subprocess finishes. This in turn allows all other threads waiting on the lock/subprocess to proceed promptly.
However, things look different when a) the main thread holds the lock in wait() and b) a signal handler attempts to wait. A subtle point of signal handlers is that they interrupt the main thread:
signal: Signals and Threads
Python signal handlers are always executed in the main Python thread of the main interpreter, even if the signal was received in another thread. […]
Since the signal handler runs in the main thread, the main thread's regular code execution is paused until the signal handler finishes!
By running wait in the signal handler, a) the signal handler blocks waiting for the lock, and b) the lock holder blocks waiting for the signal handler to finish. Only once the signal handler's wait times out does the "main thread" resume, receive confirmation that the subprocess finished, set the return code, and release the lock.
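One way around this (a sketch, not from the original post; the flag name terminate_requested and the inline child command are made up for illustration) is to keep the signal handler trivial: record the request in a flag and let the main-line code, which waits with a finite timeout, act on it between attempts:

```python
import signal
import subprocess
import sys

terminate_requested = False

def sigterm_handler(signum, stack):
    # Do no waiting here: the handler interrupts the main thread,
    # so blocking would deadlock against the lock held by Popen.wait().
    global terminate_requested
    terminate_requested = True

signal.signal(signal.SIGTERM, sigterm_handler)

# Hypothetical slave: just sleeps for two seconds.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(2)"])
while True:
    try:
        # A finite timeout makes wait() release its internal lock
        # periodically, so the flag can be checked between attempts.
        proc.wait(timeout=1)
        break
    except subprocess.TimeoutExpired:
        if terminate_requested:
            proc.send_signal(signal.SIGINT)
            proc.wait(30)
            break
```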
import threading
import time

def worker(i):
    while True:
        try:
            print(i)
            time.sleep(10)
            break
        except Exception as msg:
            print(msg)

threads = []
for i in range(10):
    t1 = threading.Thread(target=worker, args=(i,))
    threads.append(t1)

for t in threads:
    t.start()

print("started all threads... waiting to be finished")

for t in threads:
    t.join()
If I press ^C while the threads are running, does each thread get the SIGINT?
If so, what can I do from the caller thread to stop it from propagating SIGINT to the running threads?
Would a signal handler in the caller thread prevent it?
Or do I need a signal handler for each thread?
If I press ^C while the threads are running, does each thread get the SIGINT?
No. As it says in the documentation:
Python signal handlers are always executed in the main Python thread of the main interpreter, even if the signal was received in another thread.
You can see that this is true with a simple test:
import threading
import time

def worker():
    while True:
        print('Worker working')
        time.sleep(0.5)

worker_thread = threading.Thread(target=worker)
worker_thread.start()

while True:
    print('Parent parenting')
    time.sleep(0.5)
After you send SIGINT with ^C, you will see that the main thread exits (no more 'Parent parenting' logs) while the child thread continues to run.
In your example, your child threads exit because they break out of their while loops after 10 seconds.
As described in Python's docs, you should use the daemon attribute:
daemon: A boolean value indicating whether this thread is a daemon
thread (True) or not (False). This must be set before start() is
called, otherwise RuntimeError is raised. Its initial value is
inherited from the creating thread; the main thread is not a daemon
thread and therefore all threads created in the main thread default to
daemon = False.
The entire Python program exits when no alive non-daemon threads are
left.
New in version 2.6.
To control the CTRL+C signal, capture it by installing a handler with the signal.signal(signal_number, handler) function. The handler is process-wide: signals are always delivered to the main thread, so child threads do not need (and cannot have) their own handlers.
import threading
import time
import signal

def worker(i):
    while True:
        try:
            print(i)
            time.sleep(10)
            break
        except Exception as msg:
            print(msg)

def signal_handler(signal, frame):
    print('You pressed Ctrl+C!')
    print("I will wait for all threads... waiting to be finished")
    for t in threads:
        t.join()

signal.signal(signal.SIGINT, signal_handler)

threads = []
for i in range(10):
    t1 = threading.Thread(target=worker, args=(i,))
    threads.append(t1)

for t in threads:
    t.start()

print("started all threads... waiting to be finished")

for t in threads:
    t.join()
I want to use threads to do some blocking work. What should I do to:
Spawn a thread safely
Do useful work
Wait until the thread finishes
Continue with the function
Here is my code:
import threading

def my_thread():
    # Wait for the server to respond..
    pass

def main():
    a = threading.Thread(target=my_thread)
    a.start()
    # Do other stuff here
You can use Thread.join. A few lines from the docs:
Wait until the thread terminates. This blocks the calling thread until the thread whose join() method is called terminates – either normally or through an unhandled exception – or until the optional timeout occurs.
For your example it will look like this:
def main():
    a = threading.Thread(target=my_thread)
    a.start()
    a.join()
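join() only waits; it doesn't return the thread's result. If the blocking work is supposed to produce a value (the question mentions waiting for a server to respond), one common pattern, sketched here with a made-up placeholder for the real work, is to hand results back through a queue.Queue:

```python
import threading
import queue

def my_thread(result_queue):
    # Placeholder for the blocking work (e.g. waiting for a server);
    # the real code would put the server's response here.
    result_queue.put("response")

def main():
    results = queue.Queue()
    a = threading.Thread(target=my_thread, args=(results,))
    a.start()
    a.join()  # block until the worker finishes
    # Safe after join(): the worker has already put its result.
    return results.get_nowait()
```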
I installed the signal handlers in the main method,
but when I press ctrl+c while it is running, the process isn't stopped:
exceptions.SystemExit: 0
^CKilled by user
Unhandled Error
EventTrigger and MemoryInfo are classes that inherit from threading.Thread,
and HttpStreamClient is a class built on twisted's reactor.
How can I kill my process with ctrl+c? Thanks.
Code
def signal_handler(*args):
    print("Killed by user")
    # teardown()
    sys.exit(0)

def install_signal():
    for sig in (SIGABRT, SIGILL, SIGINT, SIGSEGV, SIGTERM):
        signal(sig, signal_handler)

def main():
    try:
        global cgi, config
        install_signal()
        config = Config().read_file(sys.argv[1])[0]
        init_export_folder()
        setup_logging()
        threads = [
            EventTrigger(config),
            MemoryInfo(config),
        ]
        for thr in threads:
            thr.setDaemon(True)
            thr.start()
        HttpStreamClient(config).run()
        for thr in threads:
            thr.join()
    except BaseException as e:
        traceback.print_exc(file=sys.stdout)
        raise e
I think your problem might be the forceful way you are terminating the process.
When using twisted you should call reactor.stop() to get the initial run() call to stop blocking.
So change your signal_handler to shut down the reactor:
def signal_handler(*args):
    print("Killed by user")
    reactor.stop()
Your threads could still keep the process alive. Thread.join doesn't forcefully stop a thread, and forcefully killing threads is in general never a good idea. If EventTrigger or MemoryInfo are still running, thr.join will block; you will need a mechanism to ask the threads to stop. Maybe take a look here.
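For instance, one common mechanism, sketched here under the assumption that you can edit the threads' run loops (this EventTrigger is a hypothetical stand-in, not the real class), is a shared threading.Event that each loop iteration checks:

```python
import threading

stop_event = threading.Event()

class EventTrigger(threading.Thread):
    # Hypothetical reimplementation: the real class's run loop would
    # need the same periodic check of stop_event.
    def run(self):
        while not stop_event.is_set():
            # Do one unit of work here, then sleep in a way that
            # wakes up immediately once stop_event is set.
            stop_event.wait(0.1)

def shutdown():
    # Each thread notices the event on its next check and exits its
    # loop, so a subsequent join() returns promptly.
    stop_event.set()
```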
sys.exit() raises a Python exception; I'm pretty sure raising an exception in a signal handler does not do much. Either call reactor.stop() as Alex says or use os._exit(0). Be aware that using os._exit(0) will terminate the process without further ado.
I would like to stop the execution of a process with Ctrl+C in Python. But I have read somewhere that KeyboardInterrupt exceptions are only raised in the main thread. I have also read that the main thread is blocked while the child thread executes. So how can I kill the child thread?
For instance, Ctrl+C has no effect with the following code:
import sys
import threading

def main():
    try:
        thread = threading.Thread(target=f)
        thread.start()  # thread is totally blocking (e.g. while True)
        thread.join()
    except KeyboardInterrupt:
        print("Ctrl+C pressed...")
        sys.exit(1)

def f():
    while True:
        pass  # do the actual work
If you want the main thread to receive the CTRL+C signal while joining, it can be done by adding a timeout to the join() call.
The following seems to work (don't forget to add daemon=True if you want main to actually end):
thread1.start()
while True:
    thread1.join(600)
    if not thread1.is_alive():
        break
The problem there is that you are using thread1.join(), which makes your program wait until that thread finishes before continuing.
The signals will always be caught by the main process, because it is the process that receives the signals; it is the process that has the threads.
Doing it as you show, you are basically running a 'normal' single-threaded application: you start one thread and then wait until it finishes before continuing.
KeyboardInterrupt exceptions are raised only in the main thread of each process. But the method Thread.join blocks the calling thread, including KeyboardInterrupt exceptions. That is why Ctrl+C seems to have no effect.
A simple solution to your problem is to make the method Thread.join time out to unblock KeyboardInterrupt exceptions, and make the child thread daemonic to let the parent thread kill it at exit (non-daemonic child threads are not killed but joined by their parent at exit):
def main():
    try:
        thread = threading.Thread(target=f)
        thread.daemon = True  # let the parent kill the child thread at exit
        thread.start()
        while thread.is_alive():
            thread.join(1)  # time out not to block KeyboardInterrupt
    except KeyboardInterrupt:
        print("Ctrl+C pressed...")
        sys.exit(1)

def f():
    while True:
        pass  # do the actual work
A better solution if you control the code of the child thread is to notify the child thread to exit gracefully (instead of abruptly like with the simple solution), for instance using a threading.Event:
def main():
    try:
        event = threading.Event()
        thread = threading.Thread(target=f, args=(event,))
        thread.start()
        event.wait()  # wait without blocking KeyboardInterrupt
    except KeyboardInterrupt:
        print("Ctrl+C pressed...")
        event.set()  # notify the child thread to exit
        sys.exit(1)

def f(event):
    while not event.is_set():
        pass  # do the actual work