Shutdown infinite async generator - python

Reproducible error
I tried to reproduce the error in an online REPL here. However, it is not exactly the same implementation (and hence behavior) as my real code (where I do async for response in position_stream(), instead of for position in count() in the REPL).
More details on my actual implementation
I define somewhere a coroutine like so:
async def position(self):
    request = telemetry_pb2.SubscribePositionRequest()
    position_stream = self._stub.SubscribePosition(request)
    try:
        async for response in position_stream:
            yield Position.translate_from_rpc(response)
    finally:
        position_stream.cancel()
where position_stream is infinite (or possibly very long lasting). I use it from an example code like this:
async def print_altitude():
    async for position in drone.telemetry.position():
        print(f"Altitude: {position.relative_altitude_m}")
and print_altitude() is run on the loop with:
asyncio.ensure_future(print_altitude())
asyncio.get_event_loop().run_forever()
That works well. Now, at some point, I'd like to close the stream from the caller. I thought that I could just run asyncio.ensure_future(loop.shutdown_asyncgens()) and wait for my finally clause above to get called, but it doesn't happen.
Instead, I receive a warning on an unretrieved exception:
Task exception was never retrieved
future: <Task finished coro=<print_altitude() done, defined at [...]
Why is that, and how can I make it such that all my async generators actually get closed (and run their finally clause)?

First of all, if you stop a loop, none of your coroutines will have a chance to shut down properly. Calling close basically means irreversibly destroying the loop.
If you do not care what happens to those running tasks, you can simply cancel them all; this will stop the asynchronous generators as well:
import asyncio
from contextlib import suppress


async def position_stream():
    while True:
        await asyncio.sleep(1)
        yield 0


async def print_position():
    async for position in position_stream():
        print(f'position: {position}')


async def cleanup_awaiter():
    await asyncio.sleep(3)
    print('cleanup!')


if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        asyncio.ensure_future(print_position())
        asyncio.ensure_future(print_position())
        loop.run_until_complete(cleanup_awaiter())

        # get all running tasks
        # (spelled asyncio.Task.all_tasks() before Python 3.7; that form was removed in 3.9):
        tasks = asyncio.gather(*asyncio.all_tasks(loop))
        # schedule throwing CancelledError into them:
        tasks.cancel()
        # allow them to process the exception and be cancelled:
        with suppress(asyncio.CancelledError):
            loop.run_until_complete(tasks)
    finally:
        print('closing loop')
        loop.close()
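If you do want the generators' finally clauses to run as part of an orderly shutdown (which is what loop.shutdown_asyncgens() is for), the asyncio documentation recommends calling it after cancelling the tasks and before closing the loop. A minimal sketch, reusing the coroutines from the snippet above; note that shutdown_asyncgens() only finalizes generators that were started but never exhausted or explicitly closed:

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        asyncio.ensure_future(print_position())
        loop.run_until_complete(cleanup_awaiter())
        # cancel the tasks first so their finally clauses run...
        tasks = asyncio.gather(*asyncio.all_tasks(loop))
        tasks.cancel()
        with suppress(asyncio.CancelledError):
            loop.run_until_complete(tasks)
    finally:
        # ...then finalize any async generators that are still suspended
        loop.run_until_complete(loop.shutdown_asyncgens())
        print('closing loop')
        loop.close()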

Related

Asyncio: cancelling tasks and starting new ones when a signal flag is raised

My program is supposed to read data forever from provider classes stored in PROVIDERS, defined in the config. Every second, it should check whether the config has changed and if so, stop all tasks, reload the config and create new tasks.
The below code raises CancelledError because I'm cancelling my tasks. Should I really try/catch each of those to achieve my goals or is there a better pattern?
async def main(config_file):
    load_config(config_file)
    tasks = []
    config_task = asyncio.create_task(watch_config(config_file))  # checks every 1s if config changed and raises ConfigChangedSignal if so
    tasks.append(config_task)
    for asset_name, provider in PROVIDERS.items():
        task = asyncio.create_task(provider.read_forever())
        tasks.append(task)
    try:
        await asyncio.gather(*tasks, return_exceptions=False)
    except ConfigChangedSignal:
        # Restarting
        for task in asyncio.tasks.all_tasks():
            task.cancel()  # raises CancelledError
        await main(config_file)


try:
    asyncio.run(main(config_file))
except KeyboardInterrupt:
    logger.debug("Ctrl-C pressed. Aborting")
If you are on Python 3.11, your pattern maps directly to asyncio.TaskGroup, the "successor" to asyncio.gather, which makes use of the new ExceptionGroups. By default, if any task in the group raises an exception, all tasks in the group are cancelled:
I played around with this snippet in the IPython console, running asyncio.run(main(False)) for no exception and asyncio.run(main(True)) to induce an exception, just to check the results:
import asyncio


async def doit(i, n, cancel=False):
    await asyncio.sleep(n)
    if cancel:
        raise RuntimeError()
    print(i, "done")


async def main(cancel):
    try:
        async with asyncio.TaskGroup() as group:
            tasks = [group.create_task(doit(i, 2)) for i in range(10)]
            group.create_task(doit(42, 1, cancel=cancel))
            group.create_task(doit(11, .5))
    except Exception:
        pass
    await asyncio.sleep(3)
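With cancel=True, the expected result is that only task 11 prints (it finishes after 0.5 s): the RuntimeError raised at the 1-second mark cancels the ten 2-second tasks, and the resulting ExceptionGroup is swallowed by the except Exception block (ExceptionGroup is a subclass of Exception). With cancel=False, all twelve tasks print.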
Your code can accommodate that.
Apart from the best practice for cancelling tasks, though, you are doing a recursive call to your main which, although it will work for most practical purposes, can make seasoned developers go "sigh" - and it can also break in edge cases (it will fail after roughly 1000 reload cycles, for example, when Python's default recursion limit is hit) and leak resources.
The correct way to do that is to assemble a while loop, since Python function calls, even tail calls, won't clean up the resources in the calling scope:
import asyncio

...

async def main(config_file):
    while True:
        load_config(config_file)
        try:
            async with asyncio.TaskGroup() as tasks:
                tasks.create_task(watch_config(config_file))  # checks every 1s if config changed and raises ConfigChangedSignal if so
                for asset_name, provider in PROVIDERS.items():
                    tasks.create_task(provider.read_forever())
                # all tasks are awaited at the end of the with block
        except* ConfigChangedSignal:  # <- the new syntax in Python 3.11
            # Restarting is just a matter of re-doing the while loop
            # ... log.info("config changed")
            pass
        # any other exception won't be caught and will error out, allowing one
        # to review what went wrong

...
For Python 3.10, looping over the tasks and cancelling each seems alright, but you should look at that recursive call. If you don't want a while loop inside your current main, refactor the code so that main itself is called from an outer while loop:
async def main(config_file):
    while True:
        await inner_main(config_file)


async def inner_main(config_file):
    load_config(config_file)
    # ... keep the existing body ...
    try:
        ...
    except ConfigChangedSignal:
        # Restarting
        for task in asyncio.tasks.all_tasks():
            task.cancel()  # raises CancelledError
        # the "await main(config_file)" call is dropped from here
jsbueno’s answer is appropriate.
An easy alternative is to enclose the entire event loop in an outer “while”:
async def main(config_file):
    load_config(config_file)
    tasks = []
    for asset_name, provider in PROVIDERS.items():
        task = asyncio.create_task(provider.read_forever())
        tasks.append(task)
    try:
        await watch_config(config_file)
    except ConfigChangedSignal:
        pass


try:
    while True:
        asyncio.run(main(config_file))
except KeyboardInterrupt:
    logger.debug("Ctrl-C pressed. Aborting")

Why does 'await' break from the local function when called from main()?

I am new to asynchronous programming, and while I understand most concepts, there is one relating to the inner workings of 'await' that I don't quite understand.
Consider the following:
import asyncio

async def foo():
    print('start fetching')
    await asyncio.sleep(2)
    print('done fetching')

async def main():
    task1 = asyncio.create_task(foo())

asyncio.run(main())
Output: start fetching
vs.
async def foo():
    print('start fetching')
    print('done fetching')

async def main():
    task1 = asyncio.create_task(foo())

asyncio.run(main())
Output: start fetching followed by done fetching
Perhaps it is my understanding of await: as far as I understand, we can use it to pause (for 2 seconds in the case above), or to wait for a function to fully finish running before any further code is run.
But for the first example above, why does await cause 'done fetching' not to run?
asyncio.create_task schedules an awaitable on the event loop and returns immediately, so you are actually exiting the main function (and closing the event loop) before the task is able to finish.
You need to change main to either:
async def main():
    task1 = asyncio.create_task(foo())
    await task1
or
async def main():
    await foo()
Creating a task first (the former) is useful in many cases, but they all involve situations where the event loop will outlast the task, e.g. a long-running server; otherwise you should just await the coroutine directly, like the latter.
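Putting the fix together, a runnable version of the first example (the trailing comments show the output I'd expect):

import asyncio

async def foo():
    print('start fetching')
    await asyncio.sleep(2)
    print('done fetching')

async def main():
    task1 = asyncio.create_task(foo())  # schedules foo on the event loop
    await task1  # keeps main (and the loop) alive until foo finishes

asyncio.run(main())
# start fetching
# done fetching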

outer async context manager finalized before inner async generator

Given the following minimal example:
import asyncio
from contextlib import asynccontextmanager


@asynccontextmanager
async def async_context():
    try:
        yield
    finally:
        await asyncio.sleep(1)
        print('finalize context')


async def async_gen():
    try:
        yield
    finally:
        await asyncio.sleep(2)
        # will never be called if timeout is larger than in async_context
        print('finalize gen')


async def main():
    async with async_context():
        async for _ in async_gen():
            break


if __name__ == "__main__":
    asyncio.run(main())
I'm breaking while iterating over the async generator, and I want its finally block to complete before my async context manager's finally block runs. In this example "finalize gen" will never be printed because the program exits before that happens.
Note that I intentionally chose a timeout of 2 in the generator's finally block so the context manager's finally has a chance to run first. If I chose 1 for both timeouts, both messages would be printed.
Is this a kind of race condition? I expected all finally blocks to complete before the program finishes.
How can I prevent the context manager's finally block from running before the generator's finally block has completed?
For context:
I use playwright to control a chromium browser. The outer context manager provides a page that it closes in the finally block.
I'm using python 3.9.0.
Try this example: https://repl.it/#trixn86/AsyncGeneratorRaceCondition
The async context manager doesn't know anything about the asynchronous generator. In fact, nothing in main knows about the asynchronous generator after you break; you've given yourself no way to wait for the generator's finalization. (When the abandoned generator is garbage-collected, the event loop schedules its aclose() as a background task, per PEP 525; nothing awaits that task, and asyncio.run cancels whatever is still pending once main returns, which is why "finalize gen" may or may not appear depending on the timeouts.)
If you want to wait for the generator to close, you need to handle closure explicitly:
async def main():
    async with async_context():
        gen = async_gen()
        try:
            async for _ in gen:
                break
        finally:
            await gen.aclose()
In Python 3.10, you'll be able to use contextlib.aclosing instead of the try/finally:
async def main():
    async with async_context():
        gen = async_gen()
        async with contextlib.aclosing(gen):
            async for _ in gen:
                break

Forcing an asyncio coroutine to start

I'm currently writing some unit tests for a system that uses asyncio so I'd like to be able to force an asyncio coroutine to run to an await point. As an example, consider the following:
import asyncio

event = asyncio.Event()

async def test_func():
    print('Test func')
    await event.wait()

async def main():
    w = test_func()
    await asyncio.sleep(0)
    print('Post func')
    event.set()
    print('Post set')
    await w
    print('Post wait')

asyncio.run(main())
If I run this program with Python 3.7 I see the following output
Post func
Post set
Test func
Post wait
I'd like to be able to test the case where the event isn't set before the coroutine starts running - i.e. have the output
Test func
Post func
Post set
Post wait
Is there a way to force the coroutine to start running until it reaches the await point? I've tried using an asyncio.sleep(0) statement, but even if I sleep for a number of seconds, the test_func coroutine doesn't start until await w is hit in main.
If this isn't possible, is there another option for creating this test case?
I needed to create a task to execute the coroutine, so that there is another task for asyncio to schedule. If instead of calling w = test_func() I use w = asyncio.create_task(test_func()) and follow that with await asyncio.sleep(0), I get the behaviour I desire. I'm not sure how deterministic the asyncio event loop is at scheduling tasks, but it seems to work reliably for this example.
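For reference, a sketch of the modified example (the Event is created inside main here, which avoids binding it to the wrong loop on Python versions before 3.10):

import asyncio

async def test_func(event):
    print('Test func')
    await event.wait()

async def main():
    event = asyncio.Event()
    w = asyncio.create_task(test_func(event))  # schedule the coroutine as a task
    await asyncio.sleep(0)  # yield to the loop so the task runs up to its await point
    print('Post func')
    event.set()
    print('Post set')
    await w
    print('Post wait')

asyncio.run(main())
# Test func
# Post func
# Post set
# Post wait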

How to forcefully close an async generator?

Let's say I have an async generator like this:
async def event_publisher(connection, queue):
    while True:
        if not await connection.is_disconnected():
            event = await queue.get()
            yield event
        else:
            return
I consume it like this:
published_events = event_publisher(connection, queue)

async for event in published_events:
    ...  # do event processing here
It works just fine; however, when the connection is disconnected and no new event is published, the async for will just wait forever, so ideally I would like to close the generator forcefully, like this:
if connection.is_disconnected():
    await published_events.aclose()
But I get the following error:
RuntimeError: aclose(): asynchronous generator is already running
Is there a way to stop processing of an already running generator?
It seems to be related to this issue. Notably:
As shown in
https://gist.github.com/1st1/d9860cbf6fe2e5d243e695809aea674c, it's an
error to close a synchronous generator while it is being iterated.
...
In 3.8, calling "aclose()" can crash with a RuntimeError. It's no
longer possible to reliably cancel a running asynchrounous
generator.
Well, since we can't close a running asynchronous generator directly, let's try cancelling the step it is currently running instead:
import asyncio
from contextlib import suppress


async def cancel_gen(agen):
    task = asyncio.create_task(agen.__anext__())
    task.cancel()
    with suppress(asyncio.CancelledError):
        await task
    await agen.aclose()  # probably a good idea,
    # but if you'll be getting errors, try to comment this line

...

if connection.is_disconnected():
    await cancel_gen(published_events)
I can't test whether it'll work, since you didn't provide a reproducible example.
You can use a timeout on the queue so that is_disconnected() is polled regularly when there is no item to pop:
async def event_publisher(connection, queue):
    while True:
        if not await connection.is_disconnected():
            try:
                event = await asyncio.wait_for(queue.get(), timeout=10.0)
            except asyncio.TimeoutError:
                continue
            yield event
        else:
            return
Alternatively, it is possible to use Queue.get_nowait().
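A minimal sketch of that variant, assuming the same connection and queue objects as above (the 0.1-second sleep is an arbitrary poll interval, not something from the original code):

async def event_publisher(connection, queue):
    while not await connection.is_disconnected():
        try:
            event = queue.get_nowait()
        except asyncio.QueueEmpty:
            await asyncio.sleep(0.1)  # hypothetical poll interval before re-checking
            continue
        yield event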
