We were hit by this bug:
http://bugs.python.org/issue1856 (daemon threads segfault during interpreter shutdown).
Now I'm looking for a way to code around this bug.
At the moment the code looks like this:
while True:
    do_something()
    time.sleep(interval)
Is there a way to check whether the interpreter is still usable before calling do_something()?
Or is it better not to call mythread.setDaemon(True) and instead check whether the main thread has exited?
Answer to my own question:
I use this pattern now: don't setDaemon(True), don't use sleep(), use parent_thread.join() instead:
while True:
    parent_thread.join(interval)
    if not parent_thread.is_alive():
        break
    do_something()
Related: http://docs.python.org/2/library/threading.html#threading.Thread.join
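Put together as a self-contained script, this is a minimal sketch of the pattern (assuming Python 3, where threading.main_thread() hands you the parent to watch; do_something() and the 2-second interval are placeholders):

import threading
import time

def do_something():
    print("tick")

def worker(parent_thread, interval):
    # Wake up every `interval` seconds; exit cleanly as soon as the
    # parent thread has finished, instead of being killed mid-call
    # at interpreter shutdown.
    while True:
        parent_thread.join(interval)
        if not parent_thread.is_alive():
            break
        do_something()

t = threading.Thread(target=worker, args=(threading.main_thread(), 2.0))
t.start()  # deliberately not a daemon thread
time.sleep(7)  # the main thread's real work would go here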
This is code from the threading.py module:
import sys as _sys

class Thread(_Verbose):

    def _bootstrap_inner(self):
        # some code

        # If sys.stderr is no more (most likely from interpreter
        # shutdown) use self._stderr. Otherwise still use sys (as in
        # _sys) in case sys.stderr was redefined since the creation of
        # self.
        if _sys:
            _sys.stderr.write("Exception in thread %s:\n%s\n" %
                              (self.name, _format_exc()))
        else:
            # some code
might be helpful. The error you see comes from the else branch. So in your case:
import sys as _sys

while True:
    if not _sys:
        break  # or return, or whatever cleanup fits your thread
    do_something()
    time.sleep(interval)
I'm not sure whether it works, though (note that interpreter shutdown may happen inside do_something(), so you should probably wrap everything in a try/except, as sketched below).
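A sketch of that combination (hedged: the `if not _sys` test relies on the old CPython behavior of clearing module globals to None during shutdown, and do_something() is the question's placeholder):

import sys as _sys
import time

def loop(interval):
    try:
        while True:
            if not _sys:  # module global was cleared: interpreter is shutting down
                return
            do_something()
            time.sleep(interval)
    except Exception:
        # Shutdown may also surface as arbitrary exceptions raised inside
        # do_something(); if sys is gone, just let the thread end.
        if not _sys:
            return
        raise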
Daemon threads are not necessarily bad; they can definitely speed up the development process. You just have to be careful with them.
Related
How would I unit test the following?
def sigterm_handler(signum, frame):
    pid = os.getpid()  # type: int
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)
Should I mock it and ensure the mock is called?
I would write a test that runs your code in a subprocess and checks whether it terminated successfully.
For example, let's say your question code lives in a module called signals.py. You can write a test wrapper module that looks like this:
test_signals_wrapper.py
from time import sleep
from sys import exit

# import this last to ensure it overrides any prior settings
import signals

while True:
    sleep(1)

exit(1)  # just in case the loop ends for other reasons
Now you can write a unit test that looks like this:
test_signals.py
from subprocess import Popen, TimeoutExpired
from sys import executable
from time import sleep
import signal

def test_sigterm_handler():
    proc = Popen([executable, '-m', 'test_signals_wrapper'])
    sleep(1)  # give the wrapper time to install the handler
    proc.send_signal(signal.SIGTERM)
    try:
        returncode = proc.wait(timeout=30)
    except TimeoutExpired:
        proc.kill()
        assert False, 'Did not terminate within 30 seconds'
    assert returncode == 0, f'Wrong return code: {returncode}'
This requires a bit of extra infrastructure for your test, but it solves all the problems with testing this code. By running in a subprocess, you can freely execute sys.exit and get the return value. By having a wrapper script, you can control how the code is loaded and run. You don't need to mock anything, just make sure that your packages are set up correctly, and that your test runner doesn't attempt to pick up the wrapper script as a test.
The code lines you have shown are not suited to be unit-tested, but should rather be integration-tested. The reason is that your code consists only of interactions with other components (in this case the signal, sys and os modules).
Therefore, the bugs you can expect to encounter lie in the interactions with these other components: are you calling the right functions in the right components, with the right values for the arguments, in the right order, and are the results/reactions as you expect them to be?
None of these questions can be answered by a unit test, which is meant to find bugs that can be found in isolated units: if you mock the signal, sys and/or os dependencies, you will write your mocks to reflect your (potentially wrong) understanding of these components. The unit tests will then succeed even though the code may fail in the integrated system. If the code is meant to work on different systems, you might even find that it works in one integration (say, Linux) but fails in another (say, Windows).
Therefore, for code like yours, unit testing, and thus mocking for unit testing, does not have much value.
Monkey patch the handler and send the signal when testing?
import os
import signal
import sys

# your handler
def sigterm_handler(signum, frame):
    print("Handled")
    pid = os.getpid()  # type: int FIXME: what's this for?
    sys.exit(0)

signal.signal(signal.SIGTERM, sigterm_handler)

# Mock out the existing sigterm_handler
_handled = False

def mocked_sigterm_handler(signum, frame):
    global _handled  # without this, the assignment below would only create a local
    print("Mocked")
    _handled = True

# register the mock handler
signal.signal(signal.SIGTERM, mocked_sigterm_handler)

# test sending the signal
os.kill(os.getpid(), signal.SIGTERM)
print(f"done ({_handled})")

# reset your handler?
signal.signal(signal.SIGTERM, sigterm_handler)
If you want to test the handler itself, you'll probably have to put some code like this in the handler, which is not beautiful:
if _unittesting_sigterm_handler:
    _handled = True
else:
    sys.exit(0)
and then you can just call the handler directly (or pass the test flag in the call):
_unittesting_sigterm_handler = True
sigterm_handler(0, None)
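If you use pytest, there is a third option: since sys.exit(0) just raises SystemExit(0), you can call the handler directly and catch the exception, with no flag and no real signal. A sketch, assuming the handler lives in a signals.py module as in the earlier answer:

import signal

import pytest

import signals  # hypothetical module containing sigterm_handler

def test_sigterm_handler_exits_cleanly():
    with pytest.raises(SystemExit) as excinfo:
        signals.sigterm_handler(signal.SIGTERM, None)
    # sys.exit(0) raises SystemExit with code 0
    assert excinfo.value.code == 0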
In Python 2 there is a function thread.interrupt_main(), which raises a KeyboardInterrupt exception in the main thread when called from a subthread.
This is also available through _thread.interrupt_main() in Python 3, but it's a low-level "support module", mostly for use within other standard modules.
What is the modern way of doing this in Python 3, presumably through the threading module, if there is one?
Well, raising an exception manually is somewhat low-level, so if you think you have to do that, just use _thread.interrupt_main(), since that's the equivalent you asked for (the threading module itself doesn't provide this).
It could be that there is a more elegant way to achieve your ultimate goal, though. Maybe setting and checking a flag would already be enough, or using a threading.Event as @RFmyD already suggested, or passing messages over a queue.Queue. It depends on your specific setup.
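For completeness, a minimal sketch of the _thread.interrupt_main() route (the watchdog delay and the messages are made up):

import _thread
import threading
import time

def watchdog():
    time.sleep(2)
    # Raises KeyboardInterrupt in the main thread; note that it is only
    # delivered between bytecode instructions, not inside blocking C calls.
    _thread.interrupt_main()

threading.Thread(target=watchdog, daemon=True).start()
try:
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    print("interrupted by the watchdog thread")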
If you need a way for a thread to stop execution of the whole program, this is how I did it with a threading.Event:
import logging
import sys
import threading
from time import sleep

def start():
    """
    This runs in the main thread and starts a sub thread.
    """
    stop_event = threading.Event()
    check_stop_thread = threading.Thread(
        # note the trailing comma: args must be a tuple, not a bare Event
        target=check_stop_signal, args=(stop_event,), daemon=True
    )
    check_stop_thread.start()

    # If check_stop_thread sets the stop event, sys.exit() is executed
    # here in the main thread. Since the sub thread is a daemon, it
    # will be terminated as well.
    stop_event.wait()
    logging.debug("Threading stop event set, calling sys.exit()...")
    sys.exit()

def check_stop_signal(stop_event):
    """
    Checks continuously (every 0.1 s) if a "stop" flag has been set in
    the database. Needs to run in its own thread.
    """
    while True:
        if io.check_stop():  # io and logger come from the author's own codebase
            logger.info("Program was aborted by user.")
            logging.debug("Setting threading stop event...")
            stop_event.set()
            break
        sleep(0.1)
You might want to look into the threading.Event class.
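A minimal sketch of that idea (the names and the one-second poll are illustrative):

import threading

stop = threading.Event()

def worker():
    # wait(timeout) returns False on timeout and True once set() is called
    while not stop.wait(timeout=1.0):
        print("still running...")
    print("stop event set, exiting")

t = threading.Thread(target=worker)
t.start()
stop.set()  # from the main thread (or any other thread)
t.join()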
This has been answered for Android, Objective C and C++ before, but apparently not for Python. How do I reliably determine whether the current thread is the main thread? I can think of a few approaches, none of which really satisfy me, considering it could be as easy as comparing to threading.MainThread if it existed.
Check the thread name
The main thread is instantiated in threading.py like this:
Thread.__init__(self, name="MainThread")
so one could do
if threading.current_thread().name == 'MainThread'
but is this name fixed? Other code I have seen checks whether 'MainThread' is contained anywhere in the thread's name.
Store the starting thread
I could store a reference to the starting thread the moment the program starts up, i.e. while there are no other threads yet. This would be absolutely reliable, but isn't it way too cumbersome for such a simple query?
Is there a more concise way of doing this?
The problem with threading.current_thread().name == 'MainThread' is that one can always do:
threading.current_thread().name = 'MyName'
assert threading.current_thread().name == 'MainThread' # will fail
Perhaps the following is more solid:
threading.current_thread().__class__.__name__ == '_MainThread'
Having said that, one may still cunningly do:
threading.current_thread().__class__.__name__ = 'Grrrr'
assert threading.current_thread().__class__.__name__ == '_MainThread' # will fail
But this option still seems better; "after all, we're all consenting adults here."
UPDATE:
Python 3.4 introduced threading.main_thread() which is much better than the above:
assert threading.current_thread() is threading.main_thread()
UPDATE 2:
For Python < 3.4, perhaps the best option is:
isinstance(threading.current_thread(), threading._MainThread)
The answers here are old and/or bad, so here's a current solution:
if threading.current_thread() is threading.main_thread():
...
This method has been available since Python 3.4.
If, like me, accessing protected attributes gives you the heebie-jeebies, you may want an alternative to using threading._MainThread, as suggested above. In that case, you may exploit the fact that only the main thread can handle signals, so the following does the job:
import signal

def is_main_thread():
    try:
        # Back up the current signal handler
        back_up = signal.signal(signal.SIGINT, signal.SIG_DFL)
    except ValueError:
        # Only the main thread is allowed to set signal handlers
        return False
    # Restore the signal handler
    signal.signal(signal.SIGINT, back_up)
    return True
Updated to address a potential issue pointed out by @user4815162342.
It seems that asynchronous signals in multithreaded programs are not correctly handled by Python. But I thought I would check here to see if anyone can spot a place where I am violating some principle or misunderstanding some concept.
There are similar threads I've found here on SO, but none that seem to be quite the same.
The scenario is: I have two threads, reader thread and writer thread (main thread). The writer thread writes to a pipe that the reader thread polls. The two threads are coordinated using a threading.Event() primitive (which I assume is implemented using pthread_cond_wait). The main thread waits on the Event while the reader thread eventually sets it.
But, if I want to interrupt my program while the main thread is waiting on the Event, the KeyboardInterrupt is not handled asynchronously.
Here is a small program to illustrate my point:
#!/usr/bin/python

import os
import sys
import select
import time
import threading

pfd_r = -1
pfd_w = -1
reader_ready = threading.Event()

class Reader(threading.Thread):
    """Read data from pipe and echo to stdout."""

    def run(self):
        global pfd_r
        while True:
            if select.select([pfd_r], [], [], 1)[0] == [pfd_r]:
                output = os.read(pfd_r, 1000)
                sys.stdout.write("R> '%s'\n" % output)
                sys.stdout.flush()
                # Suppose there is some long-running processing happening:
                time.sleep(10)
                reader_ready.set()

# Set up pipe.
(pfd_r, pfd_w) = os.pipe()

rt = Reader()
rt.daemon = True
rt.start()

while True:
    reader_ready.clear()
    user_input = raw_input("> ").strip()
    written = os.write(pfd_w, user_input)
    assert written == len(user_input)
    # Wait for reply -- try to ^C here and it won't work immediately.
    reader_ready.wait()
Start the program with './bug.py' and enter some input at the prompt. Once you see the reader reply with the prefix 'R>', try to interrupt using ^C.
What I see (Ubuntu Linux 10.10, Python 2.6.6) is that the ^C is not handled until after the blocking reader_ready.wait() returns. What I expected to see is that the ^C is raised asynchronously, resulting in the program terminating (because I do not catch KeyboardInterrupt).
This may seem like a contrived example, but I'm running into this in a real-world program where the time.sleep(10) is replaced by actual computation.
Am I doing something obviously wrong, like misunderstanding what the expected result would be?
Edit: I've also just tested with Python 3.1.1 and the same problem exists.
The wait() method of a threading._Event object actually relies on a thread.lock's acquire() method. However, the thread documentation states that a lock's acquire() method cannot be interrupted, and that any KeyboardInterrupt exception will be handled after the lock is released.
So basically, this is working as intended. Threading objects that implement this behavior rely on a lock at some point (including queues), so you might want to choose another path.
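One common workaround along those lines is to wait with a timeout in a loop, so that the main thread regains control periodically and the pending KeyboardInterrupt is delivered between the wait() calls. A sketch, using the reader_ready event from the question:

# instead of: reader_ready.wait()
while not reader_ready.is_set():
    reader_ready.wait(0.25)  # short timed waits keep ^C responsive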
Alternatively, you could use the pause() function of the signal module instead of reader_ready.wait(). signal.pause() is a blocking function that gets unblocked when a signal is received by the process. In your case, when ^C is pressed, the SIGINT signal unblocks the function.
According to the documentation, the function is not available for Windows. I've tested it on Linux and it works. I think this is better than using wait() with a timeout.
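A sketch of the signal.pause() variant (the SIGUSR1 wake-up is my addition; the answer itself only proposes pause()):

import os
import signal

# A no-op handler: its only job is to let SIGUSR1 unblock pause().
signal.signal(signal.SIGUSR1, lambda signum, frame: None)

# The main thread blocks here instead of in reader_ready.wait(). ^C
# delivers SIGINT and raises KeyboardInterrupt immediately, while the
# reader thread can wake us with os.kill(os.getpid(), signal.SIGUSR1)
# after calling reader_ready.set().
signal.pause()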
I looked online and found some SO discussions and ActiveState recipes for running code with a timeout. There are a few common approaches:
Use a thread that runs the code, and join it with the timeout. If the timeout elapses, kill the thread. This is not directly supported in Python (the recipes use the private _Thread__stop function), so it is bad practice.
Use signal.SIGALRM, but this approach does not work on Windows!
Use a subprocess with a timeout, but this is too heavyweight: if I want to start interruptible tasks often, I don't want to fire up a process for each one!
So, what is the right way? I'm not asking about workarounds (e.g. using Twisted and async IO), but about the actual way to solve the actual problem: I have some function and I want to run it only with some timeout. If the timeout elapses, I want control back. And I want it to work on Linux and Windows.
A completely general solution to this really, honestly does not exist. You have to use the right solution for a given domain.
If you want timeouts for code you fully control, you have to write it to cooperate. Such code has to be able to break up into little chunks in some way, as in an event-driven system. You can also do this by threading if you can ensure nothing will hold a lock too long, but handling locks right is actually pretty hard.
If you want timeouts because you're afraid code is out of control (for example, if you're afraid the user will ask your calculator to compute 9**(9**9)), you need to run it in another process. This is the only easy way to sufficiently isolate it. Running it in your event system or even a different thread will not be enough. It is also possible to break things up into little chunks similar to the other solution, but requires very careful handling and usually isn't worth it; in any event, that doesn't allow you to do the same exact thing as just running the Python code.
What you might be looking for is the multiprocessing module. If subprocess is too heavy, then this may not suit your needs either.
import time
import multiprocessing

def do_this_other_thing_that_may_take_too_long(duration):
    time.sleep(duration)
    return 'done after sleeping {0} seconds.'.format(duration)

pool = multiprocessing.Pool(1)
print 'starting....'
res = pool.apply_async(do_this_other_thing_that_may_take_too_long, [8])
for timeout in range(1, 10):
    try:
        # the loop variable, not the function's `duration`, is in scope here
        print '{0}: {1}'.format(timeout, res.get(timeout))
        break
    except multiprocessing.TimeoutError:
        print '{0}: timed out'.format(timeout)
print 'end'
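On Python 3, a similar sketch with concurrent.futures; note that, just as with Pool, a call that times out keeps running in the worker process, and the timeout only hands control back to you:

import time
from concurrent.futures import ProcessPoolExecutor, TimeoutError

def slow(duration):
    time.sleep(duration)
    return 'done after sleeping {0} seconds.'.format(duration)

if __name__ == '__main__':
    with ProcessPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow, 8)
        try:
            print(future.result(timeout=3))
        except TimeoutError:
            print('timed out')
        # leaving the with-block still waits for the worker process
        # to finish the abandoned call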
If it's network-related, you could try:
import socket
socket.setdefaulttimeout(number)
I found this in the eventlet library:
http://eventlet.net/doc/modules/timeout.html
from eventlet.timeout import Timeout

timeout = Timeout(seconds, exception)
try:
    ...  # execution here is limited by timeout
finally:
    timeout.cancel()
For "normal" Python code, that doesn't linger prolongued times in C extensions or I/O waits, you can achieve your goal by setting a trace function with sys.settrace() that aborts the running code when the timeout is reached.
Whether that is sufficient or not depends on how co-operating or malicious the code you run is. If it's well-behaved, a tracing function is sufficient.
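A sketch of that approach (my own illustration, not code from the answer): the trace callback fires on calls and lines of pure-Python code, so it can raise once a deadline has passed, but it never fires inside a C extension or a blocking system call:

import sys
import time

class CodeTimeout(Exception):
    pass

def run_with_timeout(func, seconds):
    deadline = time.monotonic() + seconds

    def tracer(frame, event, arg):
        if time.monotonic() > deadline:
            raise CodeTimeout('timed out after %s seconds' % seconds)
        return tracer  # keep tracing inside nested calls

    sys.settrace(tracer)
    try:
        return func()
    finally:
        sys.settrace(None)

def busy():
    while True:  # "normal" Python code: the tracer sees every line
        pass

try:
    run_with_timeout(busy, 1.0)
except CodeTimeout as exc:
    print(exc)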
Another way is to use faulthandler:

import time
import faulthandler

faulthandler.enable()
try:
    # note: the stdlib spells these with a singular "traceback"
    faulthandler.dump_traceback_later(3)
    time.sleep(10)
finally:
    faulthandler.cancel_dump_traceback_later()

N.B.: The faulthandler module is part of the stdlib since Python 3.3; pass exit=True to dump_traceback_later() if you want the process terminated when the timeout fires, rather than just getting the tracebacks dumped.
If you're running code that you expect to die after a set time, then you should write it properly so that there aren't any negative effects on shutdown, no matter if it's a thread or a subprocess. A command pattern with undo would be useful here.
So it really depends on what the thread is doing when you kill it. If it's just crunching numbers, who cares if you kill it? If it's interacting with the filesystem and you kill it, then maybe you should really rethink your strategy.
What is supported in Python when it comes to threads? Daemon threads and joins. Why does Python let the main thread exit if a daemon thread you've joined is still active? Because it's understood that someone using daemon threads will (hopefully) write the code in a way that it won't matter when that thread dies. Giving a timeout to a join and then letting main die, and thus taking any daemon threads with it, is perfectly acceptable in this context; a sketch follows.
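A sketch of that last pattern, a join with a timeout after which the main thread simply exits (the 3-second budget and the worker are placeholders):

import threading
import time

def crunch():
    # pure computation with no external side effects, so it is safe
    # to abandon when the process exits
    while True:
        time.sleep(0.5)

t = threading.Thread(target=crunch, daemon=True)
t.start()
t.join(timeout=3)  # give the work a 3-second budget
if t.is_alive():
    print("timed out; exiting and taking the daemon thread with us")
# the main thread ends here; daemon threads die with the process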
I've solved it this way:
For me it worked great (on Windows, and not heavy at all). I hope it's useful for someone.
import threading
import time

class LongFunctionInside(object):
    lock_state = threading.Lock()
    working = False

    def long_function(self, timeout):
        self.working = True
        timeout_work = threading.Thread(name="thread_name",
                                        target=self.work_time, args=(timeout,))
        timeout_work.setDaemon(True)
        timeout_work.start()
        while True:  # endless/long work
            time.sleep(0.1)  # at this rate the CPU is almost not used
            if not self.working:  # working stays True until the timer thread flips it
                break
        self.set_state(True)

    def work_time(self, sleep_time):
        # Thread function that just sleeps for the specified time; on
        # wake-up it checks whether the long function is still working,
        # and if so sets the lock-protected variable `working` to False.
        time.sleep(sleep_time)
        if self.working:
            self.set_state(False)

    def set_state(self, state):
        # lock-protected state change
        while True:
            self.lock_state.acquire()
            try:
                self.working = state
                break
            finally:
                self.lock_state.release()

lw = LongFunctionInside()
lw.long_function(10)
The main idea is to create a thread that just sleeps in parallel with the "long work" and, on wake-up (after the timeout), changes the lock-protected state variable; the long function checks that variable during its work.
I'm pretty new to Python programming, so if this solution has fundamental errors (resource, timing, or deadlock problems), please respond.
Solving it with the 'with' construct, merging the solution from Timeout function if it takes too long to finish and this thread, which works better:
import threading, time

class Exception_TIMEOUT(Exception):
    pass

class linwintimeout:

    def __init__(self, f, seconds=1.0, error_message='Timeout'):
        self.seconds = seconds
        self.thread = threading.Thread(target=f)
        self.thread.daemon = True
        self.error_message = error_message

    def handle_timeout(self):
        raise Exception_TIMEOUT(self.error_message)

    def __enter__(self):
        try:
            self.thread.start()
            self.thread.join(self.seconds)
        except Exception, te:
            raise te

    def __exit__(self, type, value, traceback):
        if self.thread.is_alive():
            return self.handle_timeout()

def function():
    while True:
        print "keep printing ...",
        time.sleep(1)

try:
    with linwintimeout(function, seconds=5.0, error_message='exceeded timeout of %s seconds' % 5.0):
        pass
except Exception_TIMEOUT, e:
    print " attention !! exceeded timeout, giving up ... %s " % e