Confusing asyncio task cancellation behavior - python

I'm confused by the behavior of the asyncio code below:
import time
import asyncio
from threading import Thread
import logging

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.INFO)

event_loop = None
q = None

# queue items processing
async def _main():
    global event_loop, q
    q = asyncio.Queue(maxsize=5)
    event_loop = asyncio.get_running_loop()
    try:
        while True:
            try:
                new_data = await asyncio.wait_for(q.get(), timeout=1)
                logger.info(new_data)
                q.task_done()
            except asyncio.TimeoutError:
                logger.warning(f'timeout - main cancelled? {asyncio.current_task().cancelled()}')
    except asyncio.CancelledError:
        logger.warning('cancelled')
        raise

def _event_loop_thread():
    try:
        asyncio.run(_main(), debug=True)
    except asyncio.CancelledError:
        logger.warning('main was cancelled')

thread = Thread(target=_event_loop_thread)
thread.start()

# wait for the event loop to start
while not event_loop:
    time.sleep(0.1)

async def _push(a):
    try:
        try:
            await q.put(a)
            await asyncio.sleep(0.1)
        except asyncio.QueueFull:
            logger.warning('q full')
    except asyncio.CancelledError:
        logger.warning('push cancelled')
        raise

# push some stuff to the queue
for i in range(10):
    future = asyncio.run_coroutine_threadsafe(_push(f'processed {i}'), event_loop)

pending_tasks = asyncio.all_tasks(loop=event_loop)
# cancel each pending task
for task in pending_tasks:
    logger.info(f'killing task {task.get_coro()}')
    event_loop.call_soon_threadsafe(task.cancel)

logger.info('finished')
Which produces the following output:
INFO:__main__:killing task <coroutine object _main at 0x7f7ff05d6a40>
INFO:__main__:killing task <coroutine object _push at 0x7f7fefd17140>
INFO:__main__:killing task <coroutine object _push at 0x7f7fefd0fbc0>
INFO:__main__:killing task <coroutine object Queue.get at 0x7f7fefd7dd40>
INFO:__main__:killing task <coroutine object _push at 0x7f7fefd170c0>
INFO:__main__:finished
INFO:__main__:processed 0
WARNING:__main__:push cancelled
WARNING:__main__:push cancelled
WARNING:__main__:push cancelled
INFO:__main__:processed 1
INFO:__main__:processed 2
INFO:__main__:processed 3
INFO:__main__:processed 4
INFO:__main__:processed 5
INFO:__main__:processed 6
INFO:__main__:processed 7
INFO:__main__:processed 8
INFO:__main__:processed 9
WARNING:__main__:timeout - main cancelled? False
WARNING:__main__:timeout - main cancelled? False
WARNING:__main__:timeout - main cancelled? False
WARNING:__main__:timeout - main cancelled? False
WARNING:__main__:timeout - main cancelled? False
Why does the _main() coro never get cancelled? I've looked through the asyncio documentation and haven't found anything that hints at what might be going on.
Furthermore, if you replace the line:
new_data = await asyncio.wait_for(q.get(), timeout=1)
With:
new_data = await q.get()
Things behave as expected: _main() and all the other tasks get properly cancelled. So the problem seems to lie with asyncio.wait_for().
What I'm trying to do here is have a producer / consumer model where the consumer is the _main() task in the asyncio event loop (running in a separate thread) and the main thread is the producer (using _push()).
Thanks

Unfortunately you have stumbled on an outstanding bug in the asyncio package: https://bugs.python.org/issue42130. As you observe, asyncio.wait_for can suppress a CancelledError under some circumstances. This occurs when the awaitable passed to wait_for has actually finished when the cancellation occurs; wait_for then returns the awaitable's result without propagating the cancellation. (I also learned about this the hard way.)
The only available fix at the moment (as far as I know) is to avoid using wait_for in any coroutine that can be cancelled. Perhaps in your case you can simply await q.get() and not worry about the possibility of a timeout.
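If you must keep a timeout on an affected Python version, one possible workaround is to build the timeout out of asyncio.wait instead of wait_for, since asyncio.wait does not swallow a cancellation of the waiting coroutine. A minimal sketch (get_with_timeout is a hypothetical helper name, not part of asyncio):

async def get_with_timeout(queue, timeout):
    # run the get() as a real task so a timeout can be applied to it
    getter = asyncio.ensure_future(queue.get())
    try:
        done, _pending = await asyncio.wait({getter}, timeout=timeout)
    except asyncio.CancelledError:
        # a cancellation of *this* coroutine always propagates out of wait()
        getter.cancel()
        raise
    if getter in done:
        return getter.result()
    getter.cancel()
    raise asyncio.TimeoutError()

The consumer loop would then use new_data = await get_with_timeout(q, 1). Note that this sketch can still lose a queue item in the rare case where cancellation and a successful get() land on the same iteration, so it trades the original bug for a much narrower race.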
I would like to point out, in passing, that your program is seriously non-deterministic. What I mean is that you are not synchronizing the activity between the two threads - and that has some strange consequences. Did you notice, for example, that you created 10 tasks based on the _push coroutine, yet you only cancelled 3 of them? That happened because you fired off 10 task creations to the second thread:
# push some stuff to the queue
for i in range(10):
    future = asyncio.run_coroutine_threadsafe(_push(f'processed {i}'), event_loop)
but without waiting on any of the returned futures, you immediately started to cancel tasks:
pending_tasks = asyncio.all_tasks(loop=event_loop)
# cancel each pending task
for task in pending_tasks:
    logger.info(f'killing task {task.get_coro()}')
    event_loop.call_soon_threadsafe(task.cancel)
Apparently the second thread hadn't finished creating all the tasks yet, so your task cancellation logic was hit-and-miss.
Allocating CPU time slices between two threads is an OS function, and if you want things in different threads to happen in a specific order you must write explicit logic. When I ran your exact code on my machine (python3.10, Windows 10) I got significantly different behavior from what you reported.
This wasn't the real problem, as it turns out, but it's hard to troubleshoot a program that doesn't do the same thing every time.
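For illustration, here is a minimal sketch of one way to make the producer side deterministic, assuming the rest of the question's code stays the same: block on the concurrent.futures.Future objects returned by run_coroutine_threadsafe before starting the cancellation pass.

# collect the futures so the main thread can wait for every _push to finish
futures = [asyncio.run_coroutine_threadsafe(_push(f'processed {i}'), event_loop)
           for i in range(10)]
for f in futures:
    f.result()  # blocks this thread until the corresponding task completes

# only now is the set of pending tasks well-defined
for task in asyncio.all_tasks(loop=event_loop):
    event_loop.call_soon_threadsafe(task.cancel)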

Related

Correctly adding a signal handler to Asyncio code

I'm trying to modify the graceful shutdown example from RogueLynn to cancel running processes that were spawned by the tasks.
Below is a minimal example to demonstrate the issue I'm facing. With this example, I get a warning message that the callback function isn't awaited and when I do try to terminate the script, the asyncio.gather call doesn't seem to complete. Any idea how to resolve this such that the shutdown callback executes completely?
import asyncio
import functools
import signal

async def run_process(time):
    try:
        print(f"Starting to sleep for {time} seconds")
        await asyncio.sleep(time)
        print(f"Completed sleep of {time} seconds")
    except asyncio.CancelledError:
        print("Received cancellation terminating process")
        raise

async def main():
    tasks = [run_process(10), run_process(5), run_process(2)]
    for future in asyncio.as_completed(tasks):
        try:
            await future
        except Exception as e:
            print(f"Caught exception: {e}")

async def shutdown(signal, loop):
    # Cancel running tasks on keyboard interrupt
    print("Running shutdown")
    tasks = [t for t in asyncio.all_tasks() if t is not asyncio.current_task()]
    [task.cancel() for task in tasks]
    await asyncio.gather(*tasks, return_exceptions=True)
    print("Finished waiting for cancelled tasks")
    loop.stop()

try:
    loop = asyncio.get_event_loop()
    signals = (signal.SIGINT,)
    for sig in signals:
        loop.add_signal_handler(sig, functools.partial(asyncio.create_task, shutdown(sig, loop)))
    loop.run_until_complete(main())
finally:
    loop.close()
Output when run to completion:
Starting to sleep for 2 seconds
Starting to sleep for 10 seconds
Starting to sleep for 5 seconds
Completed sleep of 2 seconds
Completed sleep of 5 seconds
Completed sleep of 10 seconds
/home/git/envs/lib/python3.8/asyncio/unix_events.py:140: RuntimeWarning: coroutine 'shutdown' was never awaited
del self._signal_handlers[sig]
And output when script is interrupted:
Starting to sleep for 2 seconds
Starting to sleep for 10 seconds
Starting to sleep for 5 seconds
Completed sleep of 2 seconds
^CRunning shutdown
Received cancellation terminating process
Received cancellation terminating process
Task was destroyed but it is pending!
task: <Task pending name='Task-5' coro=<shutdown() running at ./test.py:54> wait_for=<_GatheringFuture finished result=[CancelledError(), CancelledError(), CancelledError()]>>
Traceback (most recent call last):
File "./test.py", line 65, in <module>
loop.run_until_complete(main())
File "/home/git/envs/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
return future.result()
asyncio.exceptions.CancelledError
The CancelledError you see results from your as_completed for loop.
If you just want to fix it, you could add exception handling for it, e.g.
[...]
    try:
        await future
    except Exception as e:
        print(f"Caught exception: {e}")
    except asyncio.CancelledError:
        print("task was cancelled")
[...]
Note that you will still see a warning telling you Task was destroyed but it is pending! for your shutdown task, which you can just ignore; it presumably appears because you stop the loop from within that task.
Still, I would like to point out the difference between coroutines, tasks, and futures; see https://docs.python.org/3/library/asyncio-task.html.
What you call tasks and future in your code are actually coroutines.
For the type of problem you are trying to solve, I would advise having a look at Asynchronous Context Managers. Graceful shutdown sounds to me like you want to close some database connections or dump some process variables... Here, you could have a look at https://docs.python.org/3/library/contextlib.html#contextlib.asynccontextmanager
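A rough sketch of that idea (connect() and close() are hypothetical stand-ins for your own resource handling):

import contextlib

@contextlib.asynccontextmanager
async def managed_db():
    db = await connect()  # hypothetical coroutine acquiring the resource
    try:
        yield db
    finally:
        await db.close()  # cleanup runs on normal exit and on cancellation alike

# usage: async with managed_db() as db: ...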
However, if things become more complex, you may want to write such a signal handler that adds its own task to the loop. In that case, I would advise creating the relevant tasks explicitly with asyncio.create_task(coro, name="my-task-name"), so you can select exactly the tasks you want to cancel first by name, e.g.
tasks = [
    task for task in asyncio.all_tasks()
    if task.get_name().startswith("my-task")
]
Otherwise, you may accidentally cancel a cleanup-task.
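For example, the creation side might look like this (worker and cleanup are hypothetical coroutines; only the first two tasks match the name filter above):

asyncio.create_task(worker(1), name="my-task-1")  # matched by the filter, cancelled
asyncio.create_task(worker(2), name="my-task-2")  # matched by the filter, cancelled
asyncio.create_task(cleanup(), name="cleanup")    # not matched, left to finish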

Unable to cancel future - asyncio.sleep()

I have a signal handler defined that cancels all the tasks in the currently running asyncio event loop when the SIGINT signal is raised. In main, I have defined a new loop and the loop runs until the sleep function completes. I have used print statements inside signal_handler for better understanding as to what happens when an asyncio task is cancelled.
Below is my implementation,
import asyncio
import signal

class temp:
    def signal_handler(self, sig, frame):
        loop = asyncio.get_event_loop()
        tasks = asyncio.all_tasks(loop=loop)
        for task in tasks:
            print(task.get_name())  # returns the name of the task
            ret_val = asyncio.Future.cancel(task)  # returns True if task was just cancelled
            print(f"Return value : {ret_val}")
            print(f"Task Cancelled : {task.cancelled()}")  # returns True if task is cancelled
        return

    def main(self):
        try:
            signal.signal(signal.SIGINT, self.signal_handler)
            loop = asyncio.new_event_loop()
            asyncio.set_event_loop(loop=loop)
            loop.run_until_complete(asyncio.sleep(20))
        except asyncio.CancelledError as err:
            print("Cancellation error raised")
        finally:
            if not loop.is_closed():
                loop.close()

if __name__ == "__main__":
    test = temp()
    test.main()
Expected Behaviour:
When I raise a SIGINT at any time using Ctrl+C, the task (asyncio.sleep()) gets cancelled instantaneously, an asyncio.CancelledError is raised, and there is a graceful exit.
Actual Behaviour:
The CancelledError is raised only after the time t (in seconds) passed to asyncio.sleep(t) has elapsed. For example, the CancelledError is raised after 20 seconds for the above code.
Unusual Observation:
The code behaves in line with the Expected Behaviour when executed on Windows; the issue described above only happens on Linux.
What could be the reason for this platform-dependent behaviour?

How to wait for all tasks to finish before terminating the event loop?

What's the standard way in Python to ensure that all concurrent tasks are completed before the event loop ends? Here's a simplified example:
import asyncio

async def foo(delay):
    print("Start foo.")  # Eg: Send message
    asyncio.create_task(bar(delay))
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")  # Eg: Delete message after delay

def main():
    asyncio.run(foo(2))

if __name__ == "__main__":
    main()
Current output:
Start foo. # Eg: Send message
End foo.
Start bar.
Desired output:
Start foo. # Eg: Send message
End foo.
Start bar.
End bar. # Eg: Delete message after delay
I've tried to run all outstanding tasks after loop.run_until_complete(), but that doesn't work since the loop will have been terminated by then. I've also tried modifying the main function to the following:
async def main():
    await foo(2)
    tasks = asyncio.all_tasks()
    if len(tasks) > 0:
        await asyncio.wait(tasks)

if __name__ == "__main__":
    asyncio.run(main())
The output is correct, but it never terminates, since the coroutine main() is itself one of the tasks. The setup above is also how discord.py sends a message and deletes it after a period of time, except that it uses loop.run_forever() instead, so it does not encounter the problem.
There is no standard way to wait for all tasks in asyncio (and similar frameworks), and in fact one should not try to. Speaking in terms of threads, a Task expresses both regular and daemon activities. Waiting for all tasks indiscriminately may cause an application to stall indefinitely.
A task that is created but never awaited is a de facto background/daemon task. In contrast, if a task should not be treated as a background/daemon task, then it is the caller's responsibility to ensure it is awaited.
The simplest solution is for every coroutine to await and/or cancel all tasks it spawns.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    await task  # foo is done here; this ensures the other task finishes as well
Since the entire point of async/tasks is to have cheap task switching, this is a cheap operation. It should also not affect any well-designed applications:
If the purpose of a function is to produce a value, any child tasks should be part of producing that value.
If the purpose of a function is some side-effect, any child tasks should be parts of that side-effect.
For more complex situations, it can be worthwhile to return any outstanding tasks.
async def foo(delay):
    print("Start foo.")
    task = asyncio.create_task(bar(delay))
    print("End foo.")
    return task  # allow the caller to wait for our child tasks
This requires the caller to explicitly handle outstanding tasks, but gives prompt replies and the most control. The top-level task is then responsible for handling any orphan tasks.
For async programming in general, the structured concurrency paradigm encodes the idea of "handling outstanding tasks" in a managing object. In Python, this pattern has been encoded by the trio library as so-called Nursery objects.
import trio

async def foo(delay, nursery):
    print("Start foo.")
    # spawning a task via a nursery means *someone* awaits it
    nursery.start_soon(bar, delay)
    print("End foo.")

async def bar(delay):
    print("Start bar.")
    await trio.sleep(delay)
    print("End bar.")

async def main():
    # a task may spawn a nursery and pass it to child tasks
    async with trio.open_nursery() as nursery:
        await foo(2, nursery)

if __name__ == "__main__":
    trio.run(main)
This pattern was also suggested for asyncio as TaskGroups; after being deferred for some time, it was eventually added in Python 3.11 as asyncio.TaskGroup.
For older versions, ports of the pattern are available via third-party libraries.
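For completeness, a sketch of the original example rewritten with asyncio.TaskGroup on Python 3.11+:

import asyncio

async def bar(delay):
    print("Start bar.")
    await asyncio.sleep(delay)
    print("End bar.")

async def foo(delay, tg):
    print("Start foo.")
    tg.create_task(bar(delay))  # the group, not foo, awaits this task
    print("End foo.")

async def main():
    async with asyncio.TaskGroup() as tg:
        await foo(2, tg)
    # the async with block exits only once every task in the group is done

if __name__ == "__main__":
    asyncio.run(main())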

How to terminate long-running computation (CPU bound task) in Python using asyncio and concurrent.futures.ProcessPoolExecutor?

Similar Question (but answer does not work for me): How to cancel long-running subprocesses running using concurrent.futures.ProcessPoolExecutor?
Unlike the question linked above and the solution provided, in my case the computation itself is rather long (CPU bound) and cannot be run in a loop to check if some event has happened.
Reduced version of the code below:
import asyncio
import concurrent.futures as futures
import time

class Simulator:
    def __init__(self):
        self._loop = None
        self._lmz_executor = None
        self._tasks = []
        self._max_execution_time = time.monotonic() + 60
        self._long_running_tasks = []

    def initialise(self):
        # Initialise the main asyncio loop
        self._loop = asyncio.get_event_loop()
        self._loop.set_default_executor(
            futures.ThreadPoolExecutor(max_workers=3))
        # Run separate processes of long computation task
        self._lmz_executor = futures.ProcessPoolExecutor(max_workers=3)

    def run(self):
        self._tasks.extend(
            [self.bot_reasoning_loop(bot_id) for bot_id in [1, 2, 3]]
        )
        try:
            # Gather bot reasoner tasks
            _reasoner_tasks = asyncio.gather(*self._tasks)
            # Send the reasoner tasks to main monitor task
            asyncio.gather(self.sample_main_loop(_reasoner_tasks))
            self._loop.run_forever()
        except KeyboardInterrupt:
            pass
        finally:
            self._loop.close()

    async def sample_main_loop(self, reasoner_tasks):
        """This is the main monitor task"""
        await asyncio.wait_for(reasoner_tasks, None)
        for task in self._long_running_tasks:
            try:
                await asyncio.wait_for(task, 10)
            except asyncio.TimeoutError:
                print("Oops. Some long operation timed out.")
                task.cancel()  # Doesn't cancel and has no effect
                task.set_result(None)  # Doesn't seem to have an effect
        self._lmz_executor.shutdown()
        self._loop.stop()
        print('And now I am done. Yay!')

    async def bot_reasoning_loop(self, bot):
        import math

        _exec_count = 0
        _sleepy_time = 15
        _max_runs = math.floor(self._max_execution_time / _sleepy_time)

        self._long_running_tasks.append(
            self._loop.run_in_executor(
                self._lmz_executor, really_long_process, _sleepy_time))

        while time.monotonic() < self._max_execution_time:
            print("Bot#{}: thinking for {}s. Run {}/{}".format(
                bot, _sleepy_time, _exec_count, _max_runs))
            await asyncio.sleep(_sleepy_time)
            _exec_count += 1

        print("Bot#{} Finished Thinking".format(bot))

def really_long_process(sleepy_time):
    print("I am a really long computation.....")
    _large_val = 9729379273492397293479237492734 ** 344323
    print("I finally computed this large value: {}".format(_large_val))

if __name__ == "__main__":
    sim = Simulator()
    sim.initialise()
    sim.run()
The idea is that there is a main simulation loop that runs and monitors three bot threads. Each of these bot threads performs some reasoning, but also starts a really long background process using ProcessPoolExecutor, which may end up running longer than its own threshold/max execution time for reasoning on things.
As you can see in the code above, I attempted to .cancel() these tasks when a timeout occurs. However, this does not really cancel the actual computation, which keeps happening in the background, and the asyncio loop doesn't terminate until all the long-running computations have finished.
How do I terminate such long running CPU-bound computations within a method?
Other similar SO questions, but not necessarily related or helpful:
asyncio: Is it possible to cancel a future been run by an Executor?
How to terminate a single async task in multiprocessing if that single async task exceeds a threshold time in Python
Asynchronous multiprocessing with a worker pool in Python: how to keep going after timeout?
How do I terminate such long running CPU-bound computations within a method?
The approach you tried doesn't work because the futures returned by ProcessPoolExecutor are not cancellable. Although asyncio's run_in_executor tries to propagate the cancellation, it is simply ignored by Future.cancel once the task starts executing.
There is no fundamental reason for that. Unlike threads, processes can be safely terminated, so it would be perfectly possible for ProcessPoolExecutor.submit to return a future whose cancel terminated the corresponding process. Asyncio coroutines have well-defined cancellation semantics and could automatically make use of it. Unfortunately, ProcessPoolExecutor.submit returns a regular concurrent.futures.Future, which assumes the lowest common denominator of the underlying executors, and treats a running future as untouchable.
As a result, to cancel tasks executed in subprocesses, one must circumvent the ProcessPoolExecutor altogether and manage one's own processes. The challenge is how to do this without reimplementing half of multiprocessing. One option offered by the standard library is to (ab)use multiprocessing.Pool for this purpose, because it supports reliable shutdown of worker processes. A CancellablePool could work as follows:
Instead of spawning a fixed number of processes, spawn a fixed number of 1-worker pools.
Assign tasks to pools from an asyncio coroutine. If the coroutine is canceled while waiting for the task to finish in the other process, terminate the single-process pool and create a new one.
Since everything is coordinated from the single asyncio thread, don't worry about race conditions such as accidentally killing a process which has already started executing another task. (This would need to be prevented if one were to support cancellation in ProcessPoolExecutor.)
Here is a sample implementation of that idea:
import asyncio
import multiprocessing

class CancellablePool:
    def __init__(self, max_workers=3):
        self._free = {self._new_pool() for _ in range(max_workers)}
        self._working = set()
        self._change = asyncio.Event()

    def _new_pool(self):
        return multiprocessing.Pool(1)

    async def apply(self, fn, *args):
        """
        Like multiprocessing.Pool.apply_async, but:
        * is an asyncio coroutine
        * terminates the process if cancelled
        """
        while not self._free:
            await self._change.wait()
            self._change.clear()
        pool = usable_pool = self._free.pop()
        self._working.add(pool)

        loop = asyncio.get_event_loop()
        fut = loop.create_future()

        def _on_done(obj):
            loop.call_soon_threadsafe(fut.set_result, obj)

        def _on_err(err):
            loop.call_soon_threadsafe(fut.set_exception, err)

        pool.apply_async(fn, args, callback=_on_done, error_callback=_on_err)

        try:
            return await fut
        except asyncio.CancelledError:
            pool.terminate()
            usable_pool = self._new_pool()
            raise  # propagate the cancellation after replacing the killed pool
        finally:
            self._working.remove(pool)
            self._free.add(usable_pool)
            self._change.set()

    def shutdown(self):
        for p in self._working | self._free:
            p.terminate()
        self._free.clear()
A minimalistic test case showing cancellation:
def really_long_process():
    print("I am a really long computation.....")
    large_val = 9729379273492397293479237492734 ** 344323
    print("I finally computed this large value: {}".format(large_val))

async def main():
    loop = asyncio.get_event_loop()
    pool = CancellablePool()
    tasks = [loop.create_task(pool.apply(really_long_process))
             for _ in range(5)]
    for t in tasks:
        try:
            await asyncio.wait_for(t, 1)
        except asyncio.TimeoutError:
            print('task timed out and cancelled')
    pool.shutdown()

asyncio.get_event_loop().run_until_complete(main())
Note how the CPU usage never exceeds 3 cores, and how it starts dropping near the end of the test, indicating that the processes are being terminated as expected.
To apply it to the code from the question, make self._lmz_executor an instance of CancellablePool and change self._loop.run_in_executor(...) to self._loop.create_task(self._lmz_executor.apply(...)).
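A hedged sketch of what that substitution might look like inside the Simulator class from the question (all other lines unchanged):

# in Simulator.initialise
self._lmz_executor = CancellablePool(max_workers=3)

# in Simulator.bot_reasoning_loop
self._long_running_tasks.append(
    self._loop.create_task(
        self._lmz_executor.apply(really_long_process, _sleepy_time)))

With the long computation wrapped in a real asyncio task, cancelling it from sample_main_loop now terminates the worker process instead of being ignored.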

How to schedule and cancel tasks with asyncio

I am writing a client-server application. While connected, the client sends the server a "heartbeat" signal, for example every second.
On the server side I need a mechanism where I can add tasks (or coroutines or something else) to be executed asynchronously. Moreover, I want to cancel tasks from a client when it stops sending that "heartbeat" signal.
In other words, when the server starts a task it has a kind of timeout or TTL, for example 3 seconds. When the server receives the "heartbeat" signal it resets the timer for another 3 seconds, until the task is done or the client disconnects (stops sending the signal).
Here is an example of cancelling a task from the asyncio tutorial on pymotw.com. But here the task is cancelled before the event loop starts, which is not suitable for me.
import asyncio

async def task_func():
    print('in task_func')
    return 'the result'

event_loop = asyncio.get_event_loop()
try:
    print('creating task')
    task = event_loop.create_task(task_func())

    print('canceling task')
    task.cancel()

    print('entering event loop')
    event_loop.run_until_complete(task)
    print('task: {!r}'.format(task))
except asyncio.CancelledError:
    print('caught error from cancelled task')
else:
    print('task result: {!r}'.format(task.result()))
finally:
    event_loop.close()
You can wrap your coroutine in an asyncio Task to execute it, using the ensure_future() function.
ensure_future will automatically wrap your coroutine in a Task and attach it to your event loop. The Task wrapper will then also ensure that the coroutine 'cranks over' from await statement to await statement (or until the coroutine finishes).
In other words, just pass a regular coroutine to ensure_future and assign the resulting Task object to a variable. You can then call Task.cancel() when you need to stop it.
import asyncio

async def task_func():
    print('in task_func')
    # if the task needs to run for a while you'll need an await statement
    # to provide a pause point so that other coroutines can run in the meantime
    await some_db_or_long_running_background_coroutine()
    # or if this is a once-off thing, then return the result,
    # but then you don't really need a Task wrapper...
    # return 'the result'

async def my_app():
    my_task = None
    while True:
        await asyncio.sleep(0)

        # listen for trigger / heartbeat
        if heartbeat and my_task is None:
            my_task = asyncio.ensure_future(task_func())

        # also listen for termination of heartbeat / connection
        elif not heartbeat and my_task:
            if not my_task.cancelled():
                my_task.cancel()
            else:
                my_task = None

run_app = asyncio.ensure_future(my_app())
event_loop = asyncio.get_event_loop()
event_loop.run_forever()
Note that tasks are meant for long-running tasks that need to keep working in the background without interrupting the main flow. If all you need is a quick once-off method, then just call the function directly instead.
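To connect this back to the timeout/TTL requirement in the question, one possible sketch is a watchdog task that re-arms on every heartbeat and cancels the work when the TTL elapses (watchdog, the Event-based heartbeat, and the 3-second default are illustrative assumptions, not an established recipe):

import asyncio

async def watchdog(work: asyncio.Task, heartbeat: asyncio.Event, ttl: float = 3.0):
    # cancel `work` unless a heartbeat arrives at least every `ttl` seconds
    while not work.done():
        try:
            await asyncio.wait_for(heartbeat.wait(), timeout=ttl)
            heartbeat.clear()  # heartbeat arrived in time; re-arm the timer
        except asyncio.TimeoutError:
            work.cancel()  # no heartbeat within the TTL: cancel the task
            break

The server would call heartbeat.set() whenever it receives the signal from that client, and run one watchdog per client task.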
