Python daemon threads and the "with" statement

Python daemon threads and the "with" statement - python

If I have the following code in a daemon thread and the main thread does not do invoke a join on the daemon. Will the file close safely since it is used inside "with" once the main thread exits or no? Anyway to make it safe? Thanks :D
while True:
with open('file.txt', 'r') as f:
cfg = f.readlines()
time.sleep(60)

From the docs:
Note: Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.
This suggests, but does not outright state, that daemon threads are terminated without a chance for __exit__ methods and finally blocks to run. We can run an experiment to verify that this is the case:
import contextlib
import threading
import time
#contextlib.contextmanager
def cm():
try:
yield
finally:
print 'in __exit__'
def f():
with cm():
print 'in with block'
event.set()
time.sleep(10)
event = threading.Event()
t = threading.Thread(target=f)
t.daemon = True
t.start()
event.wait()
where we start a daemon thread and leave it sleeping in a with block when the main thread exits. When we run the experiment, we get an output of
in with block
but no in __exit__, so the __exit__ method never gets a chance to run.
If you want cleanup, don't use a daemon thread. Use a regular thread, and tell it to shut down at the end of the main thread through the regular inter-thread communication channels.

Related

ThreadPoolExecutor KeyboardInterrupt

I've got the following code which uses a concurrent.futures.ThreadPoolExecutor to launch processes of another program in a metered way (no more than 30 at a time). I additionally want the ability to stop all work if I ctrl-C the python process. This code works with one caveat: I have to ctrl-C twice. The first time I send the SIGINT, nothing happens; the second time, I see the "sending SIGKILL to processes", the processes die, and it works. What is happening to my first SIGINT?
execution_list = [['prog', 'arg1'], ['prog', 'arg2']] ... etc
processes = []
def launch_instance(args):
process = subprocess.Popen(args)
processes.append(process)
process.wait()
try:
with concurrent.futures.ThreadPoolExecutor(max_workers=30) as executor:
results = list(executor.map(launch_instance, execution_list))
except KeyboardInterrupt:
print('sending SIGKILL to processes')
for p in processes:
if p.poll() is None: #If process is still alive
p.send_signal(signal.SIGKILL)

I stumbled upon your question while trying to solve something similar. Not 100% sure that it will solve your use case (I'm not using subprocesses), but I think it will.
Your code will stay within the context manager of the executor as long as the jobs are still running. My educated guess is that the first KeyboardInterrupt will be caught by the ThreadPoolExecutor, whose default behaviour would be to not start any new jobs, wait until the current ones are finished, and then clean up (and probably reraise the KeyboardInterrupt). But the processes are probably long running, so you wouldn't notice. The second KeyboardInterrupt then interrupts this error handling.
How I solved my problem (inifinite background processes in separate threads) is with the following code:
from concurrent.futures import ThreadPoolExecutor
import signal
import threading
from time import sleep
def loop_worker(exiting):
while not exiting.is_set():
try:
print("started work")
sleep(10)
print("finished work")
except KeyboardInterrupt:
print("caught keyboardinterrupt") # never caught here. just for demonstration purposes
def loop_in_worker():
exiting = threading.Event()
def signal_handler(signum, frame):
print("Setting exiting event")
exiting.set()
signal.signal(signal.SIGTERM, signal_handler)
with ThreadPoolExecutor(max_workers=1) as executor:
executor.submit(loop_worker, exiting)
try:
while not exiting.is_set():
sleep(1)
print('waiting')
except KeyboardInterrupt:
print('Caught keyboardinterrupt')
exiting.set()
print("Main thread finished (and thus all others)")
if __name__ == '__main__':
loop_in_worker()
It uses an Event to signal to the threads that they should stop what they are doing. In the main loop, there is a loop just to keep busy and check for any exceptions. Note that this loop is within the context of the ThreadPoolExecutor.
As a bonus it also handles the SIGTERM signal by using the same exiting Event.
If you add a loop in between processes.append(process) and process.wait() that checks for a signal, then it will probably solve your use case as well. It depends on what you want to do with the running processes what actions you should take there.
If you run my script from the command line and press ctrl-C you should see something like:
started work
waiting
waiting
^CCaught keyboardinterrupt
# some time passes here
finished work
Main thread finished (and thus all others)
Inspiration for my solution came from this blog post

How to exit python deamon thread gracefully

I have code like below
def run():
While True:
doSomething()
def main():
thread = threading.thread(target = run)
thread.setDaemon(True)
thread.start()
doSomethingElse()
if I Write code like above, when the main thread exits, the Deamon thread will exit, but maybe still in the process of doSomething.
The main function will be called outside, I am not allowed to use join in the main thread,
is there any way I can do to make the Daemon thread exit gracefully upon the main thread completion.

You can use thread threading.Event to signal child thread when to exit from main thread.
Example:
class DemonThead(threading.Thread):
def __init__(self):
self.shutdown_flag = threading.Event()
def run(self):
while not self.shutdown_flag:
# Run your code here
pass
def main_thread():
demon_thread = DemonThead()
demon_thread.setDaemon(True)
demon_thread.start()
# Stop your thread
demon_thread.shutdown_flag.set()
demon_thread.join()

You are not allowed to use join, but you can set an Event and do not use daemonic flag. Official doc is below:
Note: Daemon threads are abruptly stopped at shutdown. Their resources (such as open files, database transactions, etc.) may not be released properly. If you want your threads to stop gracefully, make them non-daemonic and use a suitable signalling mechanism such as an Event.

The workers in ThreadPoolExecutor is not really daemon

The thing I cannot figure out is that although ThreadPoolExecutor uses daemon workers, they will still run even if main thread exit.
I can provide a minimal example in python3.6.4:
import concurrent.futures
import time
def fn():
while True:
time.sleep(5)
print("Hello")
thread_pool = concurrent.futures.ThreadPoolExecutor()
thread_pool.submit(fn)
while True:
time.sleep(1)
print("Wow")
Both main thread and the worker thread are infinite loops. So if I use KeyboardInterrupt to terminate main thread, I expect that the whole program will terminate too. But actually the worker thread is still running even though it is a daemon thread.
The source code of ThreadPoolExecutor confirms that worker threads are daemon thread:
t = threading.Thread(target=_worker,
args=(weakref.ref(self, weakref_cb),
self._work_queue))
t.daemon = True
t.start()
self._threads.add(t)
Further, if I manually create a daemon thread, it works like a charm:
from threading import Thread
import time
def fn():
while True:
time.sleep(5)
print("Hello")
thread = Thread(target=fn)
thread.daemon = True
thread.start()
while True:
time.sleep(1)
print("Wow")
So I really cannot figure out this strange behavior.

Suddenly... I found why. According to much more source code of ThreadPoolExecutor:
# Workers are created as daemon threads. This is done to allow the interpreter
# to exit when there are still idle threads in a ThreadPoolExecutor's thread
# pool (i.e. shutdown() was not called). However, allowing workers to die with
# the interpreter has two undesirable properties:
# - The workers would still be running during interpreter shutdown,
# meaning that they would fail in unpredictable ways.
# - The workers could be killed while evaluating a work item, which could
# be bad if the callable being evaluated has external side-effects e.g.
# writing to a file.
#
# To work around this problem, an exit handler is installed which tells the
# workers to exit when their work queues are empty and then waits until the
# threads finish.
_threads_queues = weakref.WeakKeyDictionary()
_shutdown = False
def _python_exit():
global _shutdown
_shutdown = True
items = list(_threads_queues.items())
for t, q in items:
q.put(None)
for t, q in items:
t.join()
atexit.register(_python_exit)
There is an exit handler which will join all unfinished worker...

Here's the way to avoid this problem. Bad design can be beaten by another bad design. People write daemon=True only if they really know that the worker won't damage any objects or files.
In my case, I created TreadPoolExecutor with a single worker and after a single submit I just deleted the newly created thread from the queue so the interpreter won't wait till this thread stops on its own. Notice that worker threads are created after submit, not after the initialization of TreadPoolExecutor.
import concurrent.futures.thread
from concurrent.futures import ThreadPoolExecutor
...
executor = ThreadPoolExecutor(max_workers=1)
future = executor.submit(lambda: self._exec_file(args))
del concurrent.futures.thread._threads_queues[list(executor._threads)[0]]
It works in Python 3.8 but may not work in 3.9+ since this code is accessing private variables.
See the working piece of code on github

Python threading - blocking operation - terminate execution

I have a python program like this:
from threading import Thread
def foo():
while True:
blocking_function() #Actually waiting for a message on a socket
def run():
Thread(target=foo).start()
run()
This program does not terminate with KeyboardInterrupt, due to the main Thread exiting before a Thread running foo() has a chance to terminate. I tried keeping the main thread alive with just running while True loop after calling run() but that also doesn't exit the program (blocking_function() just blocks the thread from running I guess, waits for the message). Also tried catching KeyboardInterrupt exception in main thread and call sys.exit(0) - same outcome (I would actually expect it to kill the thread running foo(), but apparently it doesn't)
Now, I could simply timeout the execution of blocking_function() but that's no fun. Can I unblock it on KeyboardInterrupt or anything similar?
Main goal: Terminate the program with blocked thread on Ctrl+C

Maybe a little bit of a workaround, but you could use thread instead of threading. This is not really advised, but if it suits you and your program, why not.
You will need to keep your program running, otherwise the thread exits right after run()
import thread, time
def foo():
while True:
blocking_function() #Actually waiting for a message on a socket
def run():
thread.start_new_thread(foo, ())
run()
while True:
#Keep the main thread alive
time.sleep(1)

setDaemon() method of threading.Thread

I am a newbie in python programming, what I understand is that a process can be a daemon, but a thread in a daemon mode, I couldn't understand the usecase of this, I would request the python gurus to help me in understanding this.

Here is some basic code using threading:
import Queue
import threading
def basic_worker(queue):
while True:
item = queue.get()
# do_work(item)
print(item)
queue.task_done()
def basic():
# http://docs.python.org/library/queue.html
queue = Queue.Queue()
for i in range(3):
t = threading.Thread(target=basic_worker,args=(queue,))
t.daemon = True
t.start()
for item in range(4):
queue.put(item)
queue.join() # block until all tasks are done
print('got here')
basic()
When you run it, you get
% test.py
0
1
2
3
got here
Now comment out the line:
t.daemon = True
Run it again, and you'll see that the script prints the same result, but hangs.
The main thread ends (note that got here was printed), but the second thread never finishes.
In contrast, when t.daemon is set to True, the thread t is terminated when the main thread ends.
Note that "daemon threads" has little to do with daemon processes.

It looks like people intend to use Queue to explain threading, but I think there should be a much simpler way, by using time.sleep(), to demo a daemon thread.
Create daemon thread by setting the daemon parameter (default as None):
from threading import Thread
import time
def worker():
time.sleep(3)
print('daemon done')
thread = Thread(target=worker, daemon=True)
thread.start()
print('main done')
Output:
main done
Process finished with exit code 0
Remove the daemon argument, like:
thread = Thread(target=worker)
Re-run and see the output:
main done
daemon done
Process finished with exit code 0
Here we already see the difference of a daemon thread:
The entire Python program can exit if only daemon thread is left.
isDaemon() and setDaemon() are old getter/setter API. Using constructor argument, as above, or daemon property is recommended.

Module Queue has been renamed queue starting with Python3 to better reflect the fact that there are several queue classes (lifo, fifo, priority) in the module.
so please make the changes while using this example

In simple words...
What is a Daemon thread?
daemon threads can shut down any time in between their flow whereas non-daemon (i.e. user threads) execute completely.
daemon threads run intermittently in the background as long as other non-daemon threads are running.
When all of the non-daemon threads are complete, daemon threads terminate automatically (no matter whether they got fully executed or not).
daemon threads are service providers for user threads running in the same process.
python does not care about daemon threads to complete when in running state, NOT EVEN the finally block but python does give preference to non-daemon threads that are created by us.
daemon threads act as services in operating systems.
python stops the daemon threads when all user threads (in contrast to the daemon threads) are terminated. Hence daemon threads can be used to implement, for example, a monitoring functionality as the thread is stopped by the python as soon as all user threads have stopped.
In a nutshell
If you do something like this
thread = Thread(target=worker_method, daemon=True)
there is NO guarantee that worker_method will get executed completely.
Where does this behaviour be useful?
Consider two threads t1 (parent thread) and t2 (child thread). Let t2 be daemon. Now, you want to analyze the working of t1 while it is in running state; you can write the code to do this in t2.
Reference:
StackOverflow - What is a daemon thread in Java?
GeeksForGeeks - Python daemon threads
TutotrialsPoint - Concurrency in Python - Threads
Official Python Documentation

I've adapted #unutbu's answer for python 3. Make sure that you run this script from the command line and not some interactive environment like jupyter notebook.
import queue
import threading
def basic_worker(q):
while True:
item = q.get()
# do_work(item)
print(item)
q.task_done()
def basic():
q = queue.Queue()
for item in range(4):
q.put(item)
for i in range(3):
t = threading.Thread(target=basic_worker,args=(q,))
t.daemon = True
t.start()
q.join() # block until all tasks are done
print('got here')
basic()
So when you comment out the daemon line, you'll notice that the program does not finish, you'll have to interrupt it manually.
Setting the threads to daemon threads makes sure that they are killed once they have finished.
Note: you could achieve the same thing here without daemon threads, if you would replace the infinite while loop with another condition:
def basic_worker(q):
while not q.empty():
item = q.get()
# do_work(item)
print(item)
q.task_done()

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.