Python asyncio run in thread causing Future error

I am running asyncio.run(func) in separate callback threads, and am receiving this error:
RuntimeError: Task <Task pending coro=<StreamReader.read() running at /usr/lib/python3.7/asyncio/streams.py:640> cb=[_release_waiter(<Future pendi...ab3a77710>()]>)() at /usr/lib/python3.7/asyncio/tasks.py:392]> got Future <Future pending> attached to a different loop
If I catch the exception and proceed, everything works fine.
One thing to note is that my threads are started from an async function, but the callbacks themselves are synchronous. So the structure is something like: async --> sync threads --> asyncio.run
Unfortunately, I cannot change the behavior of the callbacks, as they are part of a 3rd-party library (boto3), which is why I am resorting to asyncio.run. It works totally fine, but causes these Future errors.
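For context, here is a minimal sketch of that structure (hedged; do_async_work and the thread handling are hypothetical stand-ins for what the real callbacks do). Each callback thread has no running event loop of its own, so asyncio.run starts a fresh one; awaitables that were created on the outer loop must not be touched there, or the "attached to a different loop" error appears.

import asyncio
import threading

# Hypothetical stand-in for the async work done inside the callback.
async def do_async_work():
    await asyncio.sleep(0.1)

# Synchronous 3rd-party callback: no loop is running in this thread,
# so asyncio.run starts (and later closes) a fresh one.  Awaitables that
# belong to the outer loop must not be awaited here.
def sync_callback():
    asyncio.run(do_async_work())

# async --> sync threads --> asyncio.run
async def main():
    thread = threading.Thread(target=sync_callback)
    thread.start()
    # join without blocking the outer event loop
    await asyncio.get_event_loop().run_in_executor(None, thread.join)

asyncio.run(main())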

Related

got Future <Future pending> attached to a different loop when doing Lock.acquire()

Got a problem with asyncio.
I have a function that splits a dataset, iterates over the parts, and creates a separate asyncio task for each of them using asyncio.create_task(do_operation(...)), saving each task into a list, jobs.
Inside, do_operation does a select + update in the database. I don't want to run into any race conditions, so I am using asyncio.Lock().
After the loop, another task is created with asyncio.create_task(wait_for_all_to_finish_and_then_complete()), in which we wait for all jobs to finish using await asyncio.gather(*jobs, return_exceptions=True).
The problem is, sometimes, when doing Lock.acquire(), I get this exception:
RuntimeError: Task <Task pending name='Task-69' coro=<run_scoring_operation() running at ***> cb=[gather.<locals>._done_callback() at /Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/asyncio/tasks.py:769]> got Future <Future pending> attached to a different loop
Why does it happen and how to solve it?
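For reference, a minimal sketch of the pattern described (hedged; the names and the splitting logic are hypothetical reconstructions):

import asyncio

# Note: before Python 3.10, asyncio.Lock() binds to whatever event loop is
# current when the lock is created.  Creating the lock outside the loop that
# later runs the tasks is a common way to end up with
# "got Future <Future pending> attached to a different loop" on acquire().
lock = asyncio.Lock()

async def do_operation(part):
    async with lock:              # Lock.acquire() happens here
        ...                       # select + update in the database

async def wait_for_all_to_finish_and_then_complete(jobs):
    await asyncio.gather(*jobs, return_exceptions=True)

async def run_scoring(dataset, n_parts=4):
    parts = [dataset[i::n_parts] for i in range(n_parts)]
    jobs = [asyncio.create_task(do_operation(p)) for p in parts]
    asyncio.create_task(wait_for_all_to_finish_and_then_complete(jobs))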

How can a Python future get a "was never awaited" error?

I have some code which is part of a larger application. It sets up a future like so:
self.__task = asyncio.ensure_future(self.__func())
My understanding is that I never need to explicitly await that task; that will be done automatically by the event loop. But at some point in working with the application I get the following warning:
RuntimeWarning: coroutine 'MyClass.__func' was never awaited
How can that happen? Does it mean that the run loop exited before it could run the task?
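One way it can happen is exactly that: the loop goes away before the scheduled task ever gets a turn, so the wrapped coroutine is never started, and the warning fires when the coroutine object is garbage-collected. A hedged sketch (func and the explicit loop are illustrative, not the application's actual code):

import asyncio

async def func():
    await asyncio.sleep(1)

loop = asyncio.new_event_loop()
# Schedule the coroutine, just like asyncio.ensure_future(self.__func()).
task = asyncio.ensure_future(func(), loop=loop)
# The loop is closed before it ever runs the task, so func() never starts.
loop.close()
del task
# When the never-started coroutine is garbage-collected, Python reports:
#   RuntimeWarning: coroutine 'func' was never awaited
# (alongside "Task was destroyed but it is pending!" from the task)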

Python ThreadPoolExecutor Suppress Exceptions

from concurrent.futures import ThreadPoolExecutor, wait, ALL_COMPLETED

def div_zero(x):
    print('In div_zero')
    return x / 0

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = executor.submit(div_zero, 1)
    done, _ = wait([futures], return_when=ALL_COMPLETED)
    # print(done.pop().result())
    print('Done')
The program above will run to completion without any error message.
You only get the exception if you explicitly call future.result() or future.exception(), as in the commented-out line.
I wonder why this Python module chose this kind of behavior, even though it hides problems. Because of it, I spent hours debugging a programming error (referencing a non-existent attribute in a class) that would otherwise have been obvious if the program had simply crashed with an exception, as it would in Java, for instance.
I suspect the reason is so that the entire pool does not crash because of one thread raising an exception. This way, the pool processes all the tasks, and you can check which ones raised exceptions separately if you need to.
Each thread is (mostly) isolated from the other threads, including the primary thread. The primary thread does not communicate with the other threads until you ask it to do so.
This includes errors. The result is what you are seeing: errors occurring in other threads do not interfere with the primary thread. You only have to handle them when you ask for the results.
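A short sketch of that retrieval step (reusing div_zero from the question): submit several tasks, then inspect each future explicitly so worker exceptions surface in the primary thread.

from concurrent.futures import ThreadPoolExecutor, as_completed

def div_zero(x):
    return x / 0

with ThreadPoolExecutor(max_workers=4) as executor:
    futures = [executor.submit(div_zero, i) for i in range(4)]
    for future in as_completed(futures):
        exc = future.exception()      # returns the exception, or None
        if exc is not None:
            print(f'task failed: {exc!r}')
        else:
            print(f'result: {future.result()}')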

What API should an asyncio based library provide for critical error handling?

I'm writing a library using asyncio. Under some circumstances it detects a critical error and cannot continue. It is not a bug; the problem has an external cause.
Regular library code would raise an exception. The program would terminate by default, but the caller would also have a chance to catch the exception and perform a cleanup or some kind of reset. That is what I want, but unfortunately exceptions do not work that way in asyncio. More about that:
https://docs.python.org/3/library/asyncio-dev.html#detect-never-retrieved-exceptions
What is a reasonable way to notify the library's user about the problem? Probably some callback activated when the error occurs; the user could do the necessary cleanup in the callback and then exit.
But what should the default action be? Cancel the current task? Stop the entire event loop? Call sys.exit()?
In general, there should be no need for error-specific callbacks. Asyncio fully supports propagating exceptions across await boundaries inside coroutines, as well as across calls like run_until_complete where sync and async code meet. When someone awaits your coroutine, you can just raise an exception in the usual way.
One pitfall is with coroutines that run as "background tasks". When such a coroutine fails, potentially rendering the library unusable, no one gets notified automatically. This is a known deficiency in asyncio (see here for a detailed discussion), which is currently being addressed. In the meantime, you can achieve equivalent functionality with code like this:
class Library:
    async def work_forever(self):
        loop = asyncio.get_event_loop()
        self._exit_future = loop.create_future()
        await self._exit_future

    async def stop_working(self):
        self._cleanup()
        self._exit_future.set_result(None)

    async def _failed(self):
        self._cleanup()
        self._exit_future.set_exception(YourExceptionType())

    def _cleanup(self):
        # cancel the worker tasks
        ...
work_forever is analogous to serve_forever, a call that can be awaited by a main() coroutine, or even directly passed to asyncio.run(). In this design the library may detect an erroneous state and propagate the exception, or the main program (presumably through a separately spawned coroutine) can request it to exit cleanly.
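For illustration, a hedged usage sketch of that design (YourExceptionType is the placeholder from the snippet above): the main coroutine awaits work_forever(), so an error the library reports through _failed() surfaces as an ordinary exception at the await point.

import asyncio

async def main():
    library = Library()
    try:
        # blocks until stop_working() or _failed() resolves the exit future
        await library.work_forever()
    except YourExceptionType:
        ...  # application-level cleanup or reset, then exit or restart

asyncio.run(main())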

asyncio: Why is awaiting a cancelled Future not showing CancelledError?

Given the following program:
import asyncio

async def coro():
    future = asyncio.Future()
    future.cancel()
    print(future)  # <Future cancelled>
    await future   # no output

loop = asyncio.get_event_loop()
loop.create_task(coro())
loop.run_forever()
Why is the CancelledError thrown by await future not shown? Explicitly wrapping await future in try/except shows that it occurs. Other unhandled errors are shown with Task exception was never retrieved, like this:
async def coro2():
    raise Exception('foo')

loop.create_task(coro2())
Why is this not the case for awaiting cancelled Futures?
Additional question: What happens internally if a coroutine awaits a cancelled Future? Does it wait forever? Do I have to do any "cleanup" work?
Why is the CancelledError thrown by await future not shown?
The exception is not shown because you never actually retrieve the result of coro. If you retrieved it in any way, e.g. by calling the result() method on the task or simply by awaiting it, you would get the expected error at the point of retrieval. The easiest way to observe the resulting traceback is to change run_forever() to run_until_complete(coro()).
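For example (a minimal, self-contained variation of the question's code):

import asyncio

async def coro():
    future = asyncio.Future()
    future.cancel()
    await future

loop = asyncio.get_event_loop()
# Retrieving the result re-raises the error at the point of retrieval,
# so the CancelledError is shown with a traceback.
loop.run_until_complete(coro())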
What happens internally if a coroutine awaits a cancelled Future? Does it wait forever? Do I have to do any "cleanup" work?
It doesn't wait forever; it receives the CancelledError at the point of the await. You have discovered this already by adding a try/except around await future. The cleanup you need to do is the same as for any other exception: either nothing at all, or using with and finally to make sure that the resources you acquired get released on exit.
Other unhandled errors are shown with Task exception was never retrieved [...] Why is this not the case for awaiting cancelled Futures?
Because Future.cancel intentionally disables logging of the traceback. This is to avoid flooding the output with "exception was never retrieved" messages whenever a future is cancelled. Since CancelledError is normally injected from the outside and can happen at (almost) any moment, very little value is derived from retrieving it.
If it sounds strange to show the traceback in case of one exception but not in case of another, be aware that the tracebacks of failed tasks are displayed on a best-effort basis to begin with. Tasks created with create_task and not awaited effectively run "in the background", much like a thread that is not join()ed.

But unlike threads, coroutines have the concept of a "result": either an object returned from the coroutine or an exception raised by it. The return value of the coroutine is provided by the result of its task. When the coroutine exits with an exception, the result encapsulates the exception, to be automatically raised when the result is retrieved.

This is why Python cannot immediately print the traceback like it does when a thread terminates due to an unhandled exception - it has to wait for someone to actually retrieve the result. It is only when a Future whose result holds an exception is about to get garbage-collected that Python can tell that the result would never be retrieved. It then displays the warning and the traceback to avoid the exception passing silently.
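To illustrate the "best-effort" reporting, a hedged sketch (fails is a hypothetical coroutine): the traceback of a failed background task appears only once the task is garbage-collected without its result ever having been retrieved.

import asyncio
import gc

async def fails():
    raise Exception('foo')

async def main():
    task = asyncio.create_task(fails())
    await asyncio.sleep(0)   # give the task a turn; it raises immediately
    del task                 # drop the last reference without retrieving the result
    gc.collect()             # now: "Task exception was never retrieved" + traceback

asyncio.run(main())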
