I have been experimenting with asyncio for a little while and read the PEPs; a few tutorials; and even the O'Reilly book.
I think I got the hang of it, but I'm still puzzled by the behavior of loop.close() which I can't quite figure out when it is "safe" to invoke.
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding, print a sensible log and then (hopefully, cleanly) get out of the way.
Say, something like this:
import asyncio
import time
def blocking(num):
time.sleep(num)
if num == 2:
raise ValueError("don't like 2")
return num
async def my_coro(loop, num):
try:
result = await loop.run_in_executor(None, blocking, num)
print(f"Coro {num} done")
return result
except asyncio.CancelledError:
# Do some cleanup here.
print(f"man, I was canceled: {num}")
def main():
loop = asyncio.get_event_loop()
tasks = []
for num in range(5):
tasks.append(loop.create_task(my_coro(loop, num)))
try:
# No point in waiting; if any of the tasks go wrong, I
# just want to abandon everything. The ALL_DONE is not
# a good solution here.
future = asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
done, pending = loop.run_until_complete(future)
if pending:
print(f"Still {len(pending)} tasks pending")
# I tried putting a stop() - with/without a run_forever()
# after the for - same exception raised.
# loop.stop()
for future in pending:
future.cancel()
for task in done:
res = task.result()
print("Task returned", res)
except ValueError as error:
print("Outer except --", error)
finally:
# I also tried placing the run_forever() here,
# before the stop() - no dice.
loop.stop()
if pending:
print("Waiting for pending futures to finish...")
loop.run_forever()
loop.close()
I tried several variants of the stop() and run_forever() calls, the "run_forever first, then stop" seems to be the one to use according to the pydoc and, without the call to close() yields a satisfying:
Coro 0 done
Coro 1 done
Still 2 tasks pending
Task returned 1
Task returned 0
Outer except -- don't like 2
Waiting for pending futures to finish...
man, I was canceled: 4
man, I was canceled: 3
Process finished with exit code 0
However, when the call to close() is added (as shown above) I get two exceptions:
exception calling callback for <Future at 0x104f21438 state=finished returned int>
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/concurrent/futures/_base.py", line 324, in _invoke_callbacks
callback(self)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/futures.py", line 414, in _call_set_state
dest_loop.call_soon_threadsafe(_set_state, destination, source)
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 620, in call_soon_threadsafe
self._check_closed()
File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/asyncio/base_events.py", line 357, in _check_closed
raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed
which is at best annoying, but to me, totally puzzling: and, to make matter worse, I've been unable to figure out what would The Right Way of handling such a situation.
Thus, two questions:
what am I missing? how should I modify the code above in a way that with the call to close() included does not raise?
what actually happens if I don't call close() - in this trivial case, I presume it's largely redundant; but what might the consequences be in a "real" production code?
For my own personal satisfaction, also:
why does it raise at all? what more does the loop want from the coros/tasks: they either exited; raised; or were canceled: isn't this enough to keep it happy?
Many thanks in advance for any suggestions you may have!
Distilled to its simplest, my use case is a bunch of blocking "old school" calls, which I wrap in the run_in_executor() and an outer coroutine; if any of those calls goes wrong, I want to stop progress, cancel the ones still outstanding
This can't work as envisioned because run_in_executor submits the function to a thread pool, and OS threads can't be cancelled in Python (or in other languages that expose them). Canceling the future returned by run_in_executor will attempt to cancel the underlying concurrent.futures.Future, but that will only have effect if the blocking function is not yet running, e.g. because the thread pool is busy. Once it starts to execute, it cannot be safely cancelled. Support for safe and reliable cancellation is one of the benefits of using asyncio compared to threads.
If you are dealing with synchronous code, be it a legacy blocking call or longer-running CPU-bound code, you should run it with run_in_executor and incorporate a way to interrupt it. For example, the code could occasionally check a stop_requested flag and exit if that is true, perhaps by raising an exception. Then you can "cancel" those tasks by setting the appropriate flag or flags.
how should I modify the code above in a way that with the call to close() included does not raise?
As far as I can tell, there is currently no way to do so without modifications to blocking and the top-level code. run_in_executor will insist on informing the event loop of the result, and this fails when the event loop is closed. It doesn't help that the asyncio future is cancelled, because the cancellation check is performed in the event loop thread, and the error occurs before that, when call_soon_threadsafe is called by the worker thread. (It might be possible to move the check to the worker thread, but it should be carefully analyzed whether it leads a race condition between the call to cancel() and the actual check.)
why does it raise at all? what more does the loop want from the coros/tasks: they either exited; raised; or were canceled: isn't this enough to keep it happy?
It wants the blocking functions passed to run_in_executor (literally called blocking in the question) that have already been started to finish running before the event loop is closed. You cancelled the asyncio future, but the underlying concurrent future still wants to "phone home", finding the loop closed.
It is not obvious whether this is a bug in asyncio, or if you are simply not supposed to close an event loop until you somehow ensure that all work submitted to run_in_executor is done. Doing so requires the following changes:
Don't attempt to cancel the pending futures. Canceling them looks correct superficially, but it prevents you from being able to wait() for those futures, as asyncio will consider them complete.
Instead, send an application-specific event to your background tasks informing them that they need to abort.
Call loop.run_until_complete(asyncio.wait(pending)) before loop.close().
With these modifications (except for the application-specific event - I simply let the sleep()s finish their course), the exception did not appear.
what actually happens if I don't call close() - in this trivial case, I presume it's largely redundant; but what might the consequences be in a "real" production code?
Since a typical event loop runs as long as the application, there should be no issue in not call close() at the very end of the program. The operating system will clean up the resources on program exit anyway.
Calling loop.close() is important for event loops that have a clear lifetime. For example, a library might create a fresh event loop for a specific task, run it in a dedicated thread, and dispose of it. Failing to close such a loop could leak its internal resources (such as the pipe it uses for inter-thread wakeup) and cause the program to fail. Another example are test suites, which often start a new event loop for each unit test to ensure separation of test environments.
EDIT: I filed a bug for this issue.
EDIT 2: The bug was fixed by devs.
Until the upstream issue is fixed, another way to work around the problem is by replacing the use of run_in_executor with a custom version without the flaw. While rolling one's own run_in_executor sounds like a bad idea at first, it is in fact only a small glue between a concurrent.futures and an asyncio future.
A simple version of run_in_executor can be cleanly implemented using the public API of those two classes:
def run_in_executor(executor, fn, *args):
"""Submit FN to EXECUTOR and return an asyncio future."""
loop = asyncio.get_event_loop()
if args:
fn = functools.partial(fn, *args)
work_future = executor.submit(fn)
aio_future = loop.create_future()
aio_cancelled = False
def work_done(_f):
if not aio_cancelled:
loop.call_soon_threadsafe(set_result)
def check_cancel(_f):
nonlocal aio_cancelled
if aio_future.cancelled():
work_future.cancel()
aio_cancelled = True
def set_result():
if work_future.cancelled():
aio_future.cancel()
elif work_future.exception() is not None:
aio_future.set_exception(work_future.exception())
else:
aio_future.set_result(work_future.result())
work_future.add_done_callback(work_done)
aio_future.add_done_callback(check_cancel)
return aio_future
When loop.run_in_executor(blocking) is replaced with run_in_executor(executor, blocking), executor being a ThreadPoolExecutor created in main(), the code works without other modifications.
Of course, in this variant the synchronous functions will continue running in the other thread to completion despite being canceled -- but that is unavoidable without modifying them to support explicit interruption.
Related
Imagine we're writing an application which allows a user to run an application (let's say it's a series of important operations against an API) continuously, and can run multiple applications concurrently. Requirements include:
the user can control the number of concurrent applications (which may limit concurrent load against an API, which is often important)
if the OS tries to close the Python program running this thing, it should gracefully terminate, allowing any in-progress applications to complete their run before closing
The question here is specifically about the task manager we've coded, so let's stub out some code that illustrates this problem:
import asyncio
import signal
async def work_chunk():
"""Simulates a chunk of work that can possibly fail"""
await asyncio.sleep(1)
async def protected_work():
"""All steps of this function MUST complete, the caller should shield it from cancelation."""
print("protected_work start")
for i in range(3):
await work_chunk()
print(f"protected_work working... {i+1} out of 3 steps complete")
print("protected_work done... ")
async def subtask():
print("subtask: starting loop of protected work...")
cancelled = False
while not cancelled:
protected_coro = asyncio.create_task(protected_work())
try:
await asyncio.shield(protected_coro)
except asyncio.CancelledError:
cancelled = True
await protected_coro
print("subtask: cancelation complete")
async def subtask_manager():
"""
Manage a pool of subtask workers.
(In the real world, the user can dynamically change the concurrency, but here we'll
hard code it at 3.)
"""
tasks = {}
while True:
for i in range(3):
task = tasks.get(i)
if not task or task.done():
tasks[i] = asyncio.create_task(subtask())
await asyncio.sleep(5)
def shutdown(signal, main_task):
"""Cleanup tasks tied to the service's shutdown."""
print(f"Received exit signal {signal.name}. Scheduling cancelation:")
main_task.cancel()
async def main():
print("main... start")
coro = asyncio.ensure_future(subtask_manager())
loop = asyncio.get_running_loop()
loop.add_signal_handler(signal.SIGINT, lambda: shutdown(signal.SIGINT, coro))
loop.add_signal_handler(signal.SIGTERM, lambda: shutdown(signal.SIGTERM, coro))
await coro
print("main... done")
def run():
asyncio.run(main())
run()
subtask_manager manages a pool of workers, periodically looking up what the present concurrency requirement is and updating the number of active workers appropriately (note that the code above cuts out most of that, and just hard codes a number, since it isn't important to the question).
subtask is the worker loop itself, which continuously runs protected_work() until someone cancels it.
But this code is broken. When you give it a SIGINT, the whole thing immediately crashes.
Before I explain further, let me point you at a critical bit of code:
1 protected_coro = asyncio.create_task(protected_work())
2 try:
3 await asyncio.shield(protected_coro)
4 except asyncio.CancelledError:
5 cancelled = True
6 await protected_coro # <-- This will raise CancelledError too!
After some debugging, we find that our try/except block isn't working. We find that both line 3 AND line 6 raise CancelledError.
When we dig in further, we find that ALL "await" calls throw CancelledError after the subtask manager is canceled, not just the line noted above. (i.e., the second line of work_chunk(), await asyncio.sleep(1), and the 4th line of protected_work(), await work_chunk(), also raise CancelledError.)
What's going on here?
It would seem that Python, for some reason, isn't propagating cancelation as you would expect, and just throws up its hands and says "I'm canceling everything now".
Why?
Clearly, I don't understand how cancelation propagation works in Python. I've struggled to find documentation on how it works. Can someone describe to me how cancelation is propagated in a clear-minded way that explains the behavior found in the example above?
After looking at this problem for a long time, and experimenting with other code snippets (where cancelation propagation works as expected), I started to wonder if the problem is Python doesn't know the order of propagation here, in this case.
But why?
Well, subtask_manager creates tasks, but doesn't await them.
Could it be that Python doesn't assume that the coroutine that created that task (with create_task) owns that task? I think Python uses the await keyword exclusively to know in what order to propagate cancelation, and if after traversing the whole tree of tasks it finds tasks that still haven't been canceled, it just destroys them all.
Therefore, it's up to us to manage Task cancelation propagation ourselves, in any place where we know we haven't awaited an async task. So, we need to refactor subtask_manager to catch its own cancelation, and explicitly cancel and then await all its child tasks:
async def subtask_manager():
"""
Manage a pool of subtask workers.
(In the real world, the user can dynamically change the concurrency, but here we'll
hard code it at 3.)
"""
tasks = {}
while True:
for i in range(3):
task = tasks.get(i)
if not task or task.done():
tasks[i] = asyncio.create_task(subtask())
try:
await asyncio.sleep(5)
except asyncio.CancelledError:
print("cancelation detected, canceling children")
[t.cancel() for t in tasks.values()]
await asyncio.gather(*[t for t in tasks.values()])
return
Now our code works as expected:
Note: I've answered my own question Q&A style, but I still feel unsatisfied with my textual answer about how cancelation propagation works. If anyone has a better explanation of how cancelation propagation works, I would love to read it.
What's going on here? It would seem that Python, for some reason, isn't propagating cancelation as you would expect, and just throws up its hands and says "I'm canceling everything now".
TL;DR Canceling everything is precisely what's happening, simply because the event loop is exiting.
To investigate this, I changed the invocation of add_signal_handler() to loop.call_later(.5, lambda: shutdown(signal.SIGINT, coro)). Python's Ctrl+C handling has odd corners, and I wanted to check whether the strange behavior is the result of that. But the bug was perfectly reproducible without signals, so it wasn't that.
And yet, asyncio cancellation really shouldn't work like your code shows. Canceling a task propagates to the future (or another task) it awaits, but shield is specifically implemented to circumvent that. It creates and returns a fresh future, and connects the result of the original (shielded) future to the new one in a way that cancel() doesn't know how to follow.
It took me some time to unearth what really happens, and that is:
await coro at the end of main awaits the task that gets cancelled, so it gets a CancelledError as soon as shutdown cancels it;
the exception causes main to exit and enters the cleanup sequence at the end of asyncio.run(). This cleanup sequence cancels all tasks, including the ones you've shielded.
You can test it by changing await coro at the end of main() to:
try:
await coro
finally:
print('main... done')
And you will see that "main... done" is printed prior to all the mysterious cancellations you've been witnessing.
So that clears the mystery and to fix the issue, you should postpone exiting main until everything is done. For example, you can create the tasks dict in main, pass it to subtask_manager(), and then await those critical tasks when the main task gets cancelled:
async def subtask_manager(tasks):
while True:
for i in range(3):
task = tasks.get(i)
if not task or task.done():
tasks[i] = asyncio.create_task(subtask())
try:
await asyncio.sleep(5)
except asyncio.CancelledError:
for t in tasks.values():
t.cancel()
raise
# ... shutdown unchanged
async def main():
print("main... start")
tasks = {}
main_task = asyncio.ensure_future(subtask_manager(tasks))
loop = asyncio.get_running_loop()
loop.add_signal_handler(signal.SIGINT, lambda: shutdown(signal.SIGINT, main_task))
loop.add_signal_handler(signal.SIGTERM, lambda: shutdown(signal.SIGTERM, main_task))
try:
await main_task
except asyncio.CancelledError:
await asyncio.gather(*tasks.values())
finally:
print("main... done")
Note that the main task must explicitly cancel its subtasks because that actually wouldn't happen automatically. Cancellation is propagated through a chain of awaits, and subtask_manager doesn't explicitly awaits its subtasks, it just spawns them and awaits something else, effectively shielding them.
I'm reading the following code with Visual Studio Code (VSCode) Debugger.
import asyncio
async def main():
print("OK Google. Wake me up in 1 seconds.")
await asyncio.sleep(1)
print("Wake up!")
if __name__ == "__main__":
asyncio.run(main(), debug=False)
I could understand the main flow of the program that schedules a callback to sleep the process for a second, but it was difficult to follow when or where _run_until_complete_cb is called?
After the main coroutine is executed, this function/callback is called to stop an event loop by setting a _stopping flag to be True. It is, however, originally appended to a _callbacks internal property in the Future class through add_done_callback or called soon if the future is done.
def add_done_callback(self, fn, *, context=None):
"""Add a callback to be run when the future becomes done.
The callback is called with a single argument - the future object. If
the future is already done when this is called, the callback is
scheduled with call_soon.
"""
if self._state != _PENDING:
self._loop.call_soon(fn, self, context=context)
else:
if context is None:
context = contextvars.copy_context()
self._callbacks.append((fn, context))
Either case, it's registered with the event loop by a call_soon method and called at the next iteration at the end. But the future haven't done yet at the moment added the else clause above.
My question is where or when the future is done to proceed with _run_until_complete_cb to the else clause? Since the VSCode debugger just skips or ignores the line of code that calls the method on Future and Task instances somehow, so the flow jumps right into the call_soon in _run_until_complete_cb.
What exactly happened after finishing the main coroutine? Does someone have any ideas or hints about a clean-up process of the asyncio module to stop the event loop or a way to look into the methods on Future or Task by the VSCode debugger?
Thanks a lot in advance!
If you're trying to debug into asyncio then you want to set "justMyCode": true in your launch.json. That will have the debugger trace into 3rd-party code, including the stdlib.
I was previously using the threading.Thread module. Now I'm using concurrent.futures -> ThreadPoolExecutor. Previously, I was using the following code to exit/kill/finish a thread:
def terminate_thread(thread):
"""Terminates a python thread from another thread.
:param thread: a threading.Thread instance
"""
if not thread.isAlive():
return
exc = ctypes.py_object(SystemExit)
res = ctypes.pythonapi.PyThreadState_SetAsyncExc(
ctypes.c_long(thread.ident), exc)
if res == 0:
raise ValueError("nonexistent thread id")
elif res > 1:
# """if it returns a number greater than one, you're in trouble,
# and you should call it again with exc=NULL to revert the effect"""
ctypes.pythonapi.PyThreadState_SetAsyncExc(thread.ident, None)
raise SystemError("PyThreadState_SetAsyncExc failed")
This doesn't appear to be working with futures interface. What's the best practice here? Just return? My threads are controlling Selenium instances. I need to make sure that when I kill a thread, the Selenium instance is torn down.
Edit: I had already seen the post that is referenced as duplicate. It's insufficient because when you venture into something like futures, behaviors can be radically different. In the case of the previous threading module, my terminate_thread function is acceptable and not applicable to the criticism of the other q/a. It's not the same as "killing". Please take a look at the code I posted to see that.
I don't want to kill. I want to check if its still alive and gracefully exit the thread in the most proper way. How to do with futures?
If you want to let the threads finish their current work use:
thread_executor.shutdown(wait=True)
If you want to bash the current futures being run on the head and stop all ...future...(heh) futures use:
thread_executor.shutdown(wait=False)
for t in thread_executor._threads:
terminate_thread(t)
This uses your terminate_thread function to call an exception in the threads in the thread pool executor. Those futures that were disrupted will return with the exception set.
How about .cancel() on the thread result?
cancel() Attempt to cancel the call. If the call is currently being
executed and cannot be cancelled then the method will return False,
otherwise the call will be cancelled and the method will return True.
https://docs.python.org/3/library/concurrent.futures.html
I have an app which adds coroutines to an already-running event loop.
The arguments for these coroutines depend on I/O and are not available when I initially start the event loop - with loop.run_forever(), so I add the tasks later. To demonstrate the phenomenon, here is some example code:
import asyncio
from threading import Thread
from time import sleep
loop = asyncio.new_event_loop()
def foo():
loop.run_forever()
async def bar(s):
while True:
await asyncio.sleep(1)
print("s")
#loop.create_task(bar("A task created before thread created & before loop started"))
t = Thread(target=foo)
t.start()
sleep(1)
loop.create_task(bar("secondary task"))
The strange behaviour is that everything works as expected when there is at least one task in the loop when invoking loop.run_forever(). i.e. when the commented line is not commented out.
But when it is commented out, as shown above, nothing is printed and it appears I am unable to add a task to the event_loop. Should I avoid invoking run_forever() without adding a single task? I don't see why this should be a problem. Adding tasks to an event_loop after it is running is standard, why should the empty case be an issue?
Adding tasks to an event_loop after it is running is standard, why should the empty case be an issue?
Because you're supposed to add tasks from the thread running the event loop. In general one should not mix threads and asyncio, except through APIs designed for that purpose, such as loop.run_in_executor.
If you understand this and still have good reason to add tasks from a separate thread, use asyncio.run_coroutine_threadsafe. Change loop.create_task(bar(...)) to:
asyncio.run_coroutine_threadsafe(bar("in loop"), loop=loop)
run_coroutine_threadsafe accesses the event loop in a thread-safe manner, and also ensures that the event loop wakes up to notice the new task, even if it otherwise has nothing to do and is just waiting for IO/timeouts.
Adding another task beforehand only appeared to work because bar happens to be an infinite coroutine that makes the event loop wake up every second. Once the event loop wakes up for any reason, it executes all runnable tasks regardless of which thread added them. It would be a really bad idea to rely on this, though, because loop.create_task is not thread-safe, so there could be any number of race conditions if it executed in parallel with a running event loop.
Because loop.create_task is not thread safe, and if you set loop._debug = True, you should see the error as
Traceback (most recent call last):
File "test.py", line 23, in <module>
loop.create_task(bar("secondary task"))
File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 284, in create_task
task = tasks.Task(coro, loop=self)
File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 576, in call_soon
self._check_thread()
File "/Users/soulomoon/.pyenv/versions/3.6.3/lib/python3.6/asyncio/base_events.py", line 615, in _check_thread
"Non-thread-safe operation invoked on an event loop other "
RuntimeError: Non-thread-safe operation invoked on an event loop other than the current one
I have a problem to understand how asyncio works
if I create a future future = asyncio.Future() then add add_done_callback(done_callback) and after that cancel the future
future.cancel() the done_callback not suppose to get fired?
I tried to use the loop.run_forever() but I end up with infinite loop.
I have a small example code:
_future_set = asyncio.Future()
def done_callback(f):
if f.exception():
_future_set.set_exception(f.exception)
elif f.cancelled():
_future_set.cancel()
else:
_future_set.set_result(None)
async def check_stats(future):
while not future.cancelled():
print("not done")
continue
loop.stop()
def set(future):
if not _future_set.done():
future.add_done_callback(done_callback)
loop = asyncio.new_event_loop()
future = loop.create_future()
asyncio.ensure_future(check_stats(future), loop=loop)
set(future)
future.cancel()
loop.run_forever() # infinite
print(_future_set.cancelled())
I know something is missing and maybe this is not the behavior but I will be happy for a little help here.
I am using python 3.6
**update
After set is fire and I bind the add_done_callback to the future
when I cancel the future and the state of the future change to cancelled and done then I expect that _future_set will be cancelled too.
and print(_future_set.cancelled()) will be True
From the docstring (on my unix system) help(loop.run_forever):
run_forever() method of asyncio.unix_events._UnixSelectorEventLoop instance
Run until stop() is called.
When you call loop.run_forever() the program will not progress beyond that line until stop() is called on the ioloop instance, and there's nothing in your code doing so.
loop.run_forever() is essentially doing:
def run_forever(self):
while not self.stopped:
self.check_for_things_to_do()
Without knowing a little more as to what you're trying to achieve, it's hard to help you further. However it seems that you're expecting loop.run_forever() to be asynchronous in the execution of the python code, however this is not the case. The IOLoop will keep looping and check filevents and fire callbacks on futures, and will only return back to the point it's called if it is told to stop looping.
Ah, I realise now what you're expecting to happen. You need to register the futures with the ioloop, either by doing future = loop.create_future() or future = asyncio.Future(loop=loop). The former is the preferred method for creating futures. N.B. the code will still run forever at the loop.run_forever() call unless it is stopped, so your print statement will still never be reached.
Further addendum: If you actually run the code you have in your question, there is an exception being raised at f.exception(), which as as per the docs:
exception()
Return the exception that was set on this future.
The exception (or None if no exception was set) is returned only if the future is done. If the future has been cancelled, raises CancelledError. If the future isn’t done yet, raises InvalidStateError.
This means that the invocation of done_callback() is being stopped at the first if f.exception(). So if you switch done_callback() around to read:
def done_callback(f):
if f.cancelled():
_future_set.cancel()
elif f.exception():
_future_set.set_exception(f.exception)
else:
_future_set.set_result(None)
Then you get the expected output.