Python - Properly Kill/Exit Futures Thread?

I was previously using the threading.Thread module. Now I'm using concurrent.futures -> ThreadPoolExecutor. Previously, I was using the following code to exit/kill/finish a thread:
import ctypes

def terminate_thread(thread):
    """Terminates a python thread from another thread.

    :param thread: a threading.Thread instance
    """
    if not thread.is_alive():
        return

    exc = ctypes.py_object(SystemExit)
    res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
        ctypes.c_long(thread.ident), exc)
    if res == 0:
        raise ValueError("nonexistent thread id")
    elif res > 1:
        # "if it returns a number greater than one, you're in trouble,
        # and you should call it again with exc=NULL to revert the effect"
        ctypes.pythonapi.PyThreadState_SetAsyncExc(thread.ident, None)
        raise SystemError("PyThreadState_SetAsyncExc failed")
This doesn't appear to work with the futures interface. What's the best practice here? Just return? My threads are controlling Selenium instances. I need to make sure that when I kill a thread, the Selenium instance is torn down.
Edit: I had already seen the post referenced as a duplicate. It's insufficient because when you move to something like futures, behavior can be radically different. With the plain threading module, my terminate_thread function is acceptable and not subject to the criticism in the other Q&A. It's not the same as "killing". Please take a look at the code I posted to see that.
I don't want to kill. I want to check whether it's still alive and gracefully exit the thread in the most proper way. How do I do that with futures?

If you want to let the threads finish their current work, use:

thread_executor.shutdown(wait=True)

If you want to bash the current futures being run on the head and stop all ...future... (heh) futures, use:

thread_executor.shutdown(wait=False)
for t in thread_executor._threads:
    terminate_thread(t)

This uses your terminate_thread function to raise an exception in each thread of the thread pool executor (note that _threads is a private attribute, so this may break between versions). Those futures that were disrupted will return with the exception set.

How about .cancel() on the future returned by submit()?
cancel() Attempt to cancel the call. If the call is currently being
executed and cannot be cancelled then the method will return False,
otherwise the call will be cancelled and the method will return True.
https://docs.python.org/3/library/concurrent.futures.html
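Since the threads here drive Selenium instances, the cooperative version of "gracefully exit" is usually a stop event that each worker polls, with the driver torn down in a finally block. A minimal sketch; make_driver and do_one_step are hypothetical placeholders for your own setup, while driver.get and driver.quit are the usual Selenium WebDriver calls:

import concurrent.futures
import threading

stop_event = threading.Event()

def worker(url):
    driver = make_driver()  # hypothetical: returns a Selenium WebDriver
    try:
        driver.get(url)
        # wait() returns True once stop_event is set, ending the loop.
        while not stop_event.wait(timeout=1.0):
            do_one_step(driver)  # hypothetical: one small unit of work
    finally:
        driver.quit()  # the browser is torn down on every exit path

executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)
futures = [executor.submit(worker, url) for url in ["https://example.com"]]

# Later, when it's time to stop:
stop_event.set()              # ask every worker to finish its current step
executor.shutdown(wait=True)  # wait for the finally blocks to run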

Related

How do I know if a thread is a dummy thread in python?

My basic question is: how do I detect whether the current thread is a dummy thread? I am new to threading, and I recently was debugging some code in my Apache2/Flask app and thought this might be useful. I was getting a flip-flopping error where a request was processed successfully on the main thread, unsuccessfully on a dummy thread, then successfully on the main thread again, and so on.
Like I said, I am using Apache2 and Flask, and it seems that combination creates these dummy threads. I would also be interested in knowing more about that if anyone can teach me.
My code is meant to print information about the threads running on the service and looks something like this:
import logging
import threading
from queue import Queue

LOGGER = logging.getLogger(__name__)

def allthr_info(self):
    """Returns info in JSON form of all threads."""
    all_thread_infos = Queue()
    for thread_x in threading.enumerate():
        if thread_x is threading.current_thread() or thread_x is threading.main_thread():
            continue
        info = self._thr_info(thread_x)
        all_thread_infos.put(info)
    return list(all_thread_infos.queue)

def _thr_info(self, thr):
    """Consolidation of the thread info that can be obtained from the threading module."""
    thread_info = {}
    try:
        thread_info = {
            'name': thr.name,
            'ident': thr.ident,
            'daemon': thr.daemon,
            'is_alive': thr.is_alive(),
        }
    except Exception as e:
        LOGGER.error(e)
    return thread_info
You can check if the current thread is an instance of threading._DummyThread.
isinstance(threading.current_thread(), threading._DummyThread)
threading.py itself can teach you what dummy-threads are about:
Dummy thread class to represent threads not started here.
These aren't garbage collected when they die, nor can they be waited for.
If they invoke anything in threading.py that calls current_thread(), they
leave an entry in the _active dict forever after.
Their purpose is to return something from current_thread().
They are marked as daemon threads so we won't wait for them
when we exit (conform previous semantics).
def current_thread():
    """Return the current Thread object, corresponding to the caller's thread of control.

    If the caller's thread of control was not created through the threading
    module, a dummy thread object with limited functionality is returned.
    """
    try:
        return _active[get_ident()]
    except KeyError:
        return _DummyThread()
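To see a dummy thread in the wild, start a thread with the low-level _thread module and ask threading who the current thread is; a rough sketch:

import _thread
import threading
import time

def report():
    me = threading.current_thread()
    # Started outside the threading module, so a _DummyThread is returned.
    print(type(me).__name__, isinstance(me, threading._DummyThread))

_thread.start_new_thread(report, ())
time.sleep(0.5)  # give the raw thread a moment to run; prints "_DummyThread True"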

Equivalent of thread.interrupt_main() in Python 3

In Python 2 there is a function thread.interrupt_main(), which raises a KeyboardInterrupt exception in the main thread when called from a subthread.
This is also available through _thread.interrupt_main() in Python 3, but it's a low-level "support module", mostly for use within other standard modules.
What is the modern way of doing this in Python 3, presumably through the threading module, if there is one?
Well, raising an exception manually is kinda low-level, so if you think you have to do that, just use _thread.interrupt_main(), since that's the equivalent you asked for (the threading module itself doesn't provide it).
It could be that there is a more elegant way to achieve your ultimate goal, though. Maybe setting and checking a flag would already be enough, or using a threading.Event like @RFmyD already suggested, or passing messages over a queue.Queue. It depends on your specific setup.
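For comparison, a minimal sketch of the _thread.interrupt_main() route: a daemon watchdog thread that interrupts the main thread after one second.

import _thread
import threading
import time

def watchdog():
    time.sleep(1)
    _thread.interrupt_main()  # raises KeyboardInterrupt in the main thread

threading.Thread(target=watchdog, daemon=True).start()
try:
    while True:
        time.sleep(0.1)  # stand-in for the main thread's real work
except KeyboardInterrupt:
    print("interrupted by the watchdog thread")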
If you need a way for a thread to stop execution of the whole program, this is how I did it with a threading.Event:
import logging
import sys
import threading
from time import sleep

def start():
    """
    This runs in the main thread and starts a sub thread.
    """
    stop_event = threading.Event()
    check_stop_thread = threading.Thread(
        target=check_stop_signal, args=(stop_event,), daemon=True  # note the comma: args must be a tuple
    )
    check_stop_thread.start()

    # If check_stop_thread sets the stop event, sys.exit() is executed here
    # in the main thread. Since the sub thread is a daemon, it will be
    # terminated as well.
    stop_event.wait()
    logging.debug("Threading stop event set, calling sys.exit()...")
    sys.exit()

def check_stop_signal(stop_event):
    """
    Checks continuously (every 0.1 s) whether a "stop" flag has been set in
    the database. Needs to run in its own thread.
    """
    while True:
        if io.check_stop():  # application-specific check against the database
            logging.info("Program was aborted by user.")
            logging.debug("Setting threading stop event...")
            stop_event.set()
            break
        sleep(0.1)
You might want to look into the threading.Event class.

When is the right time to call loop.close()?

I have been experimenting with asyncio for a little while and have read the PEPs, a few tutorials, and even the O'Reilly book.
I think I got the hang of it, but I'm still puzzled by the behavior of loop.close(): I can't quite figure out when it is "safe" to invoke it.
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding, print a sensible log and then (hopefully, cleanly) get out of the way.
Say, something like this:
import asyncio
import time

def blocking(num):
    time.sleep(num)
    if num == 2:
        raise ValueError("don't like 2")
    return num

async def my_coro(loop, num):
    try:
        result = await loop.run_in_executor(None, blocking, num)
        print(f"Coro {num} done")
        return result
    except asyncio.CancelledError:
        # Do some cleanup here.
        print(f"man, I was canceled: {num}")

def main():
    loop = asyncio.get_event_loop()
    tasks = []
    for num in range(5):
        tasks.append(loop.create_task(my_coro(loop, num)))
    try:
        # No point in waiting; if any of the tasks go wrong, I
        # just want to abandon everything. The ALL_DONE is not
        # a good solution here.
        future = asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
        done, pending = loop.run_until_complete(future)
        if pending:
            print(f"Still {len(pending)} tasks pending")
            # I tried putting a stop() - with/without a run_forever()
            # after the for - same exception raised.
            # loop.stop()
            for future in pending:
                future.cancel()
        for task in done:
            res = task.result()
            print("Task returned", res)
    except ValueError as error:
        print("Outer except --", error)
    finally:
        # I also tried placing the run_forever() here,
        # before the stop() - no dice.
        loop.stop()
        if pending:
            print("Waiting for pending futures to finish...")
            loop.run_forever()
        loop.close()
I tried several variants of the stop() and run_forever() calls; the "run_forever first, then stop" order seems to be the one to use according to the pydoc and, without the call to close(), yields a satisfying:
Coro 0 done
Coro 1 done
Still 2 tasks pending
Task returned 1
Task returned 0
Outer except -- don't like 2
Waiting for pending futures to finish...
man, I was canceled: 4
man, I was canceled: 3
Process finished with exit code 0
However, when the call to close() is added (as shown above) I get two exceptions:
exception calling callback for <Future at 0x104f21438 state=finished returned int>
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
    callback(self)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/futures.py", line 414, in _call_set_state
    dest_loop.call_soon_threadsafe(_set_state, destination, source)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 620, in call_soon_threadsafe
    self._check_closed()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 357, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
which is at best annoying and, to me, totally puzzling; to make matters worse, I've been unable to figure out what the Right Way of handling such a situation would be.
Thus, two questions:
What am I missing? How should I modify the code above so that, with the call to close() included, it does not raise?
What actually happens if I don't call close()? In this trivial case I presume it's largely redundant, but what might the consequences be in "real" production code?
For my own personal satisfaction, also:
Why does it raise at all? What more does the loop want from the coros/tasks? They either exited, raised, or were canceled: isn't that enough to keep it happy?
Many thanks in advance for any suggestions you may have!
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding
This can't work as envisioned because run_in_executor submits the function to a thread pool, and OS threads can't be cancelled in Python (or in other languages that expose them). Canceling the future returned by run_in_executor will attempt to cancel the underlying concurrent.futures.Future, but that will only have effect if the blocking function is not yet running, e.g. because the thread pool is busy. Once it starts to execute, it cannot be safely cancelled. Support for safe and reliable cancellation is one of the benefits of using asyncio compared to threads.
If you are dealing with synchronous code, be it a legacy blocking call or longer-running CPU-bound code, you should run it with run_in_executor and incorporate a way to interrupt it. For example, the code could occasionally check a stop_requested flag and exit if that is true, perhaps by raising an exception. Then you can "cancel" those tasks by setting the appropriate flag or flags.
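As a sketch of that idea applied to the blocking function from the question (the stop_requested event is an addition, not part of the original code): work in small slices and check the flag between slices.

import threading
import time

stop_requested = threading.Event()  # assumed to be shared with whoever "cancels"

def blocking(num):
    deadline = time.monotonic() + num
    while time.monotonic() < deadline:
        if stop_requested.is_set():
            raise RuntimeError(f"task {num} aborted by stop request")
        time.sleep(0.1)  # one small slice of the original time.sleep(num)
    if num == 2:
        raise ValueError("don't like 2")
    return num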
how should I modify the code above in a way that with the call to close() included does not raise?
As far as I can tell, there is currently no way to do so without modifications to blocking and the top-level code. run_in_executor will insist on informing the event loop of the result, and this fails when the event loop is closed. It doesn't help that the asyncio future is cancelled, because the cancellation check is performed in the event loop thread, and the error occurs before that, when call_soon_threadsafe is called by the worker thread. (It might be possible to move the check to the worker thread, but it should be carefully analyzed whether that leads to a race condition between the call to cancel() and the actual check.)
why does it raise at all? what more does the loop want from the coros/tasks: they either exited; raised; or were canceled: isn't this enough to keep it happy?
It wants the blocking functions passed to run_in_executor (literally called blocking in the question) that have already been started to finish running before the event loop is closed. You cancelled the asyncio future, but the underlying concurrent future still wants to "phone home", finding the loop closed.
It is not obvious whether this is a bug in asyncio, or if you are simply not supposed to close an event loop until you somehow ensure that all work submitted to run_in_executor is done. Doing so requires the following changes:
Don't attempt to cancel the pending futures. Canceling them looks correct superficially, but it prevents you from being able to wait() for those futures, as asyncio will consider them complete.
Instead, send an application-specific event to your background tasks informing them that they need to abort.
Call loop.run_until_complete(asyncio.wait(pending)) before loop.close().
With these modifications (except for the application-specific event - I simply let the sleep()s finish their course), the exception did not appear.
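Roughly, the question's main() with those modifications applied (kept in the question's 3.6-era style; the cancellation handler and progress prints are trimmed for brevity):

import asyncio
import time

def blocking(num):
    time.sleep(num)
    if num == 2:
        raise ValueError("don't like 2")
    return num

async def my_coro(loop, num):
    return await loop.run_in_executor(None, blocking, num)

def main():
    loop = asyncio.get_event_loop()
    tasks = [loop.create_task(my_coro(loop, num)) for num in range(5)]
    pending = []
    try:
        done, pending = loop.run_until_complete(
            asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION))
        for task in done:
            print("Task returned", task.result())
    except ValueError as error:
        print("Outer except --", error)
    finally:
        # Key change: the pending futures are NOT cancelled; instead the
        # executor-backed work is allowed to finish before the loop closes.
        if pending:
            loop.run_until_complete(asyncio.wait(pending))
        loop.close()

main()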
what actually happens if I don't call close() - in this trivial case, I presume it's largely redundant; but what might the consequences be in a "real" production code?
Since a typical event loop runs as long as the application, there should be no issue in not calling close() at the very end of the program. The operating system will clean up the resources on program exit anyway.
Calling loop.close() is important for event loops that have a clear lifetime. For example, a library might create a fresh event loop for a specific task, run it in a dedicated thread, and dispose of it. Failing to close such a loop could leak its internal resources (such as the pipe it uses for inter-thread wakeup) and cause the program to fail. Another example is test suites, which often start a new event loop for each unit test to ensure separation of test environments.
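A minimal sketch of such a clear-lifetime loop, created and disposed of inside one worker thread:

import asyncio
import threading

def worker():
    # The loop exists only for this thread's task, so close() matters here:
    # it releases the loop's internal wakeup pipe and selector.
    loop = asyncio.new_event_loop()
    try:
        print(loop.run_until_complete(asyncio.sleep(0.1, result=42)))
    finally:
        loop.close()

t = threading.Thread(target=worker)
t.start()
t.join()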
EDIT: I filed a bug for this issue.
EDIT 2: The bug was fixed by devs.
Until the upstream issue is fixed, another way to work around the problem is to replace the use of run_in_executor with a custom version without the flaw. While rolling one's own run_in_executor sounds like a bad idea at first, it is in fact only a small piece of glue between a concurrent.futures future and an asyncio future.
A simple version of run_in_executor can be cleanly implemented using the public API of those two classes:
import asyncio
import functools

def run_in_executor(executor, fn, *args):
    """Submit FN to EXECUTOR and return an asyncio future."""
    loop = asyncio.get_event_loop()
    if args:
        fn = functools.partial(fn, *args)
    work_future = executor.submit(fn)
    aio_future = loop.create_future()
    aio_cancelled = False

    def work_done(_f):
        if not aio_cancelled:
            loop.call_soon_threadsafe(set_result)

    def check_cancel(_f):
        nonlocal aio_cancelled
        if aio_future.cancelled():
            work_future.cancel()
            aio_cancelled = True

    def set_result():
        if work_future.cancelled():
            aio_future.cancel()
        elif work_future.exception() is not None:
            aio_future.set_exception(work_future.exception())
        else:
            aio_future.set_result(work_future.result())

    work_future.add_done_callback(work_done)
    aio_future.add_done_callback(check_cancel)
    return aio_future
When loop.run_in_executor(None, blocking, num) is replaced with run_in_executor(executor, blocking, num), executor being a ThreadPoolExecutor created in main(), the code works without other modifications.
Of course, in this variant the synchronous functions will continue running in the other thread to completion despite being canceled -- but that is unavoidable without modifying them to support explicit interruption.

Timeout child thread for python3

I am quite new to programming, and I am running Linux with python3.5.
There are a few similar questions on Stack Overflow, but most of them have no responses, like "[Python 2.7 multi-thread] In Python, how to timeout a function call in sub-thread?" and "Python, Timeout on a function on child thread without using signal and thread.join".
I am able to use signal when running in the main thread, and I can time out a child process. However, the function I am currently running is in a child thread started via apscheduler (or it can be started directly):
schedule.add_job(test_upload.run, 'interval', seconds=10, start_date='2016-01-01 00:00:05',
                 args=['instant'])
and I can't convert it to child process because I am sharing database connection.
I have also tried https://stackoverflow.com/a/36904264/2823816, but the terminal said

    result = await future.result(timeout = timeout)
             ^
SyntaxError: invalid syntax

in

import concurrent

def run():
    return 1

timeout = 10
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(run)  # get a future object
    try:
        result = await future.result(timeout = timeout)
    except concurrent.futures.TimeOutError:
        result = None
I am now not sure how to solve it :( Thanks for any help.
I gave up timing out the thread from within my child thread.
So I used a sub-process within the child thread and killed that instead. I could not find any other solution.
https://github.com/dozysun/timeout-timer works fine in a sub-thread, using another sub-thread as the timer.
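For what it's worth, the SyntaxError in the snippet quoted above comes from using await in a plain (non-async) function; Future.result(timeout=...) already blocks, so no await is needed, and the exception is spelled TimeoutError. A corrected sketch; note that this only times out the wait, while the worker thread itself keeps running, which is why the workarounds above resort to other means:

import concurrent.futures

def run():
    return 1

timeout = 10
with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(run)  # get a future object
    try:
        result = future.result(timeout=timeout)  # blocks up to `timeout` seconds
    except concurrent.futures.TimeoutError:
        result = None  # the wait timed out, but run() may still be executing
print(result)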

python ThreadPoolExecutor: after submitting a task, will the thread die if the task fails?

I have the following code; my question is: if check_running_job raises an exception and doesn't catch it internally, will that cause the thread running check_running_job to die? If I have max_workers set to 3, then after one dies, will only 2 threads be left to serve future requests?
with futures.ThreadPoolExecutor(max_workers=setting.parallelism_of_job_checking) as te:
    while True:
        cursor.execute(sql)
        result = fetch_rows_as_dict(cursor)
        for x in result:
            id = x["id"]
            te.submit(check_running_job, id)
        time.sleep(10)
ThreadPoolExecutor threads complete cleanly either by finishing their task or by raising an exception; a raised exception won't kill the worker thread or prevent it from being assigned another task, and it will cleanly set .done() to True just as if the task had finished correctly.
(You're probably aware of this, but if you access the .result() of a task that has failed, its exception will be re-raised, so retrieving the result should always be done in a try ... except structure. If your code needs to know whether a task completed successfully or failed, this is one way of doing so.)
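A quick sketch to convince yourself: a pool with a single worker keeps serving tasks after one fails.

import concurrent.futures

def boom():
    raise ValueError("task failed")

with concurrent.futures.ThreadPoolExecutor(max_workers=1) as ex:
    failed = ex.submit(boom)
    print(failed.exception())   # prints "task failed"; the worker thread survives
    ok = ex.submit(lambda: 42)  # the same (only) worker runs this task
    print(ok.result())          # 42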
