How to wrap a custom future for use with asyncio in Python?

There are a lot of libraries that use their own custom version of Future. kafka and s3transfer are just two examples: all of their future-like classes have object as the superclass.
Not surprisingly, you cannot call asyncio.wrap_future() directly on such objects, and you can't use await with them.
What is the proper way to wrap such futures for use with asyncio?

If the future class supports standard future features such as done callbacks and the result method, just use something like this:
    import asyncio

    def wrap_future(f):
        loop = asyncio.get_event_loop()
        aio_future = loop.create_future()

        def on_done(*_):
            # May run on a foreign thread, so hand the outcome to the loop thread-safely.
            try:
                result = f.result()
            except Exception as e:
                loop.call_soon_threadsafe(aio_future.set_exception, e)
            else:
                loop.call_soon_threadsafe(aio_future.set_result, result)

        f.add_done_callback(on_done)
        return aio_future
Consider that code a template which you can customize to match the specifics of the future you are dealing with.
Intended usage is to call it from the thread that runs the asyncio event loop:
    value = await wrap_future(some_foreign_future)
If you are calling it from a different thread, add a loop parameter and pass the loop explicitly, because asyncio.get_event_loop raises RuntimeError when invoked from a thread that has no event loop set.
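As a rough sketch of the explicit-loop variant, the version below takes the loop as a parameter and is exercised against a made-up ForeignFuture class that a worker thread completes. ForeignFuture, its methods, and the loop parameter are assumptions for illustration only; they are not the API of kafka or s3transfer.

```python
import asyncio
import threading


class ForeignFuture:
    """Hypothetical object-based future, standing in for a library's class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._callbacks = []
        self._done = False
        self._result = None

    def add_done_callback(self, fn):
        with self._lock:
            if not self._done:
                self._callbacks.append(fn)
                return
        fn(self)  # already finished: call back immediately

    def set_result(self, value):
        with self._lock:
            self._result, self._done = value, True
            callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(self)

    def result(self):
        return self._result


def wrap_future(f, loop=None):
    # Assumption: when no loop is passed, we are inside a running coroutine.
    if loop is None:
        loop = asyncio.get_running_loop()
    aio_future = loop.create_future()

    def on_done(*_):
        # Runs on whichever thread completes the foreign future.
        try:
            result = f.result()
        except Exception as e:
            loop.call_soon_threadsafe(aio_future.set_exception, e)
        else:
            loop.call_soon_threadsafe(aio_future.set_result, result)

    f.add_done_callback(on_done)
    return aio_future


async def main():
    loop = asyncio.get_running_loop()
    foreign = ForeignFuture()
    # A timer thread plays the role of the library's background worker.
    threading.Timer(0.05, foreign.set_result, args=(42,)).start()
    return await wrap_future(foreign, loop)


print(asyncio.run(main()))
```

Passing the loop in keeps the helper usable from code that does not run on the event loop's thread, at the cost of one extra argument.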

Related

How can you use async apis with ndb hooks?

I have some hooks in place, and I thought I could decorate them with @ndb.tasklet in order to use async APIs inside the hooks.
e.g.
    @classmethod
    @ndb.tasklet
    def _post_delete_hook(cls, key, future):
        yield do_something_async()
This seemed to work, but every now and then I see a "suspended generator" error for the code inside those hooks.
Should I be using @ndb.synctasklet instead?
An example of the error:
suspended generator _post_put_hook(data_field.py:112) raised TypeError(Expected Future, received <class 'google.appengine.api.apiproxy_stub_map.UserRPC'>: <google.appengine.api.apiproxy_stub_map.UserRPC object at 0x09AA00B0>)
The code that occasionally caused the error was:
    t, d = yield (queue.add_async(task), queue.delete_tasks_async(taskqueue.Task(name=existing_task_name)))
Now that I've switched to @ndb.synctasklet, it raises an actual exception.
An ndb tasklet returns a future. If calling the tasklet results in an exception, the exception will only be raised if the future's get_result method is called.
ndb.synctasklet automatically calls get_result on the future returned by the tasklet, causing exceptions to be raised if they occurred, rather than just logged.
For the error that you are seeing, you may be able to fix it by converting the UserRPCs returned by the taskqueue async methods to tasklets.
This untested code is based on ndb.context.urlfetch, which converts the UserRPC produced by urlfetch.create_rpc into a Future.
    @ndb.tasklet
    def add_async(queue, **taskqueue_kwargs):
        rpc = queue.add_async(**taskqueue_kwargs)
        result = yield rpc
        raise ndb.Return(result)
You would need to create a tasklet for each async method that you want to use, or you could extend the taskqueue class and make the async methods tasklets.

How can I raise an exception through Tornado coroutines incorrectly called?

I have a scenario with Tornado where I have a coroutine that is called from a non-coroutine or without yielding, yet I need to propagate the exception back.
Imagine the following methods:
    from tornado import gen

    @gen.coroutine
    def create_exception(with_yield):
        if with_yield:
            yield exception_coroutine()
        else:
            exception_coroutine()

    @gen.coroutine
    def exception_coroutine():
        raise RuntimeError('boom')

    def no_coroutine_create_exception(with_yield):
        if with_yield:
            yield create_exception(with_yield)
        else:
            create_exception(with_yield)
Calling:
    try:
        # Throws exception
        yield create_exception(True)
    except Exception as e:
        print(e)
will properly raise the exception. However, none of the following raises the exception:
    try:
        # none of these throw the exception at this level
        yield create_exception(False)
        no_coroutine_create_exception(True)
        no_coroutine_create_exception(False)
    except Exception as e:
        print('This is never hit')
The latter are variants similar to my actual problem: I have code outside my control calling coroutines without using yield. In some cases, the callers are not coroutines themselves. Either way, any exceptions they generate are swallowed until Tornado reports them as "Future exception was never retrieved".
This runs contrary to Tornado's intent: the documentation basically states that you need to use yield and coroutines through the entire stack for this to work the way I want without hackery or trickery.
I can change the way the exception is raised (ie modify exception_coroutine). But I cannot change several of the intermediate methods.
Is there something I can do in order to force the exception to be raised throughout the Tornado stack, even if it is not properly yielded? Basically to properly raise the exception in all of the last three situations?
This is complicated because I cannot change the code that is causing this situation. I can only change exception_coroutine for example in the above.
What you're asking for is impossible in Python because the decision to yield or not is made by the calling function after the coroutine has finished. The coroutine must return without raising an exception so it can be yielded, and after that it is no longer possible for it to raise an exception into the caller's context in the event that the Future is not yielded.
The best you can do is detect the garbage collection of a Future, but that can't do anything except log (this is how the "Future exception was never retrieved" message works).
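The same principle can be seen with plain asyncio, used here only as a stand-in for Tornado's machinery (an assumption for illustration; Tornado's own Future handling differs in detail). The exception is stored on the task once the coroutine finishes, and nothing reaches the caller unless the caller asks for it.

```python
import asyncio


async def exception_coroutine():
    raise RuntimeError("boom")


async def main():
    # Schedule the coroutine without awaiting it: no exception is raised here.
    task = asyncio.ensure_future(exception_coroutine())
    await asyncio.sleep(0)  # give the task a chance to run to completion

    # The error now lives on the task; reading it marks it as retrieved.
    stored = task.exception()
    return task.done(), type(stored).__name__, str(stored)


outcome = asyncio.run(main())
print(outcome)
```

If the sketch never called task.exception() or awaited the task, the only trace of the error would be a log message emitted when the task is garbage collected.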
If you're curious why this isn't working, it's because no_coroutine_create_exception contains a yield statement. That makes it a generator function, so calling it does not execute its code; it only creates a generator object:
>>> no_coroutine_create_exception(True)
<generator object no_coroutine_create_exception at 0x101651678>
>>> no_coroutine_create_exception(False)
<generator object no_coroutine_create_exception at 0x1016516d0>
Neither of the calls above executes any Python code; each only creates a generator that must be iterated.
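A small pure-Python sketch makes this visible. The calls list and the simplified function body are assumptions added for illustration; the real function calls Tornado coroutines instead.

```python
import inspect

calls = []


def no_coroutine_create_exception(with_yield):
    calls.append("body ran")
    if with_yield:
        yield "value"
    raise RuntimeError("boom")


# Calling the function only builds a generator object; the body has not run.
gen_obj = no_coroutine_create_exception(False)
ran_before_iteration = list(calls)

# Only iterating the generator executes the body and surfaces the error.
try:
    next(gen_obj)
except RuntimeError as exc:
    message = str(exc)

print(inspect.isgenerator(gen_obj), ran_before_iteration, calls, message)
```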
You'd have to make a blocking function that starts the IOLoop and runs it until your coroutine finishes:
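    from tornado import ioloop

    def exception_blocking():
        # run_sync starts the IOLoop, runs the coroutine to completion, then stops the loop
        return ioloop.IOLoop.current().run_sync(exception_coroutine)

    exception_blocking()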
(The IOLoop acts as a scheduler for multiple non-blocking tasks, and the gen.coroutine decorator is responsible for iterating the coroutine until completion.)
However, while this answers your immediate question, I suspect it merely enables you to proceed down an unproductive path. You're almost certainly better off using async code or blocking code throughout instead of trying to mix them.

Is this style of using Thread pool with tornado ok?

So I create a class variable called executor in my class:
    executor = ThreadPoolExecutor(100)
and instead of writing separate functions and methods with decorators, I simply use the following line to handle my blocking tasks (such as IO and hash creation) in my async methods:
    result = await to_tornado_future(self.executor.submit(blocking_method, param1, param2))
I decided to use this style because:
1- decorators are slower by nature
2- there is no need for extra methods and functions
3- it works as expected and creates no threads before they are needed
Am I right? Please give reasons (I want to know whether the approach I use is slower, uses more resources, and so on).
Update
Based on Ben's answer, my approach above was not correct,
so I ended up using the following function as needed. I think it's the best way to go:
    def pool(pool_executor, fn, *args, **kwargs):
        new_future = Future()
        result_future = pool_executor.submit(fn, *args, **kwargs)
        # Caveat: the callback runs on the worker thread, and if fn raised,
        # f.result() raises inside the callback and new_future is never resolved.
        result_future.add_done_callback(lambda f: new_future.set_result(f.result()))
        return new_future
usage:
    result = await pool(self.executor, time.sleep, 3)
This is safe as long as all your blocking methods are thread-safe. Since you mentioned doing IO in these threads, I'll point out that doing file IO here is fine but all network IO in Tornado must occur on the IOLoop's thread.
Why do you say "decorators are slower by nature"? Which decorators are slower than what? Some decorators have no performance overhead at all (although most do have some runtime cost). to_tornado_future(executor.submit()) isn't free either. (BTW, I think you want tornado.gen.convert_yielded instead of tornado.platform.asyncio.to_tornado_future. executor.submit doesn't return an asyncio.Future).
As a general rule, running blocking_method on a thread pool is going to be slower than just calling it directly. You should do this only when blocking_method is likely to block for long enough that you want the main thread free to do other things in the meantime.
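On an asyncio-based event loop (which, as far as I know, is what Tornado 5 and later run on), a common alternative to a hand-written helper is the loop's run_in_executor method, which returns an awaitable and passes exceptions through to the awaiting coroutine. The sketch below uses plain asyncio; blocking_method, failing_method and handler are hypothetical names for illustration.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(4)


def blocking_method(a, b):
    # Stand-in for blocking work such as file IO or hashing.
    return a + b


def failing_method():
    raise ValueError("bad input")


async def handler():
    loop = asyncio.get_running_loop()

    # The blocking call runs on a pool thread; the loop stays free meanwhile.
    total = await loop.run_in_executor(executor, blocking_method, 1, 2)

    # An exception raised on the worker thread is re-raised at the await.
    try:
        await loop.run_in_executor(executor, failing_method)
    except ValueError as exc:
        error = str(exc)
    else:
        error = None
    return total, error


outcome = asyncio.run(handler())
print(outcome)
```

run_in_executor accepts only positional arguments for the callable, so keyword arguments need functools.partial or a small wrapper function.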

Is this the right way to call a coroutine method in the Tornado framework?

I have a WebSocketHandler in my Tornado application.
I am not sure whether this is the right way to make the code asynchronous.
    class MyHandler(WebSocketHandler):
        def open(self):
            # do something ...
            self.my_coroutine_method()

        @gen.coroutine
        def my_coroutine_method(self):
            user = yield db.user.find_one()  # call the Motor asynchronous driver
            self.write_message(user)
Yes, this is correct. However, in some cases simply calling a coroutine without yielding can cause exceptions to be handled in unexpected ways, so I recommend using IOLoop.current().spawn_callback(self.my_coroutine_method) when calling a coroutine from a non-coroutine like this.

Using Twisted's @inlineCallbacks with Tornado's @gen.engine

Tornado/Twisted newb here. First I just want to confirm what I know (please correct and elaborate if I am wrong):
In order to use @gen.engine and gen.Task in Tornado, I need to feed gen.Task() functions that:
are asynchronous to begin with
have the keyword argument "callback"
call the callback function at the very end
In other words the function should look something like this:
    def function(arg1, arg2, ... , callback=None):
        # asynchronous stuff here ...
        callback()
And I would call it like this (trivial example):
    @gen.engine
    def coroutine_call():
        yield gen.Task(function, arg1, arg2)
Now I am in a weird situation where I have to use Twisted in a Tornado system for asynchronous client calls to a server (since Tornado apparently does not support it).
So I wrote a function in Twisted (e.g. connects to the server):
    import tornado.platform.twisted
    tornado.platform.twisted.install()
    from twisted.internet import defer
    from twisted.web.xmlrpc import Proxy

    class AsyncConnection():
        def __init__(self, hostname):
            self.proxy = Proxy(hostname)
            self.token = False

        @defer.inlineCallbacks
        def login(self, user, passwd, callback=None):
            """Login to server using given username and password"""
            self.token = yield self.proxy.callRemote('login', user, passwd)  # twisted function
            callback()
And if I run it like so:
    @gen.engine
    def test():
        conn = AsyncConnection("192.168.11.11")
        yield gen.Task(conn.login, "user", "pwd")
        print(conn.token)

    if __name__ == '__main__':
        test()
        tornado.ioloop.IOLoop.instance().start()
And I DO get the token as I want. But my question is:
I know that Twisted and Tornado can share the same IOLoop. But am I allowed to do this (i.e. use a @defer.inlineCallbacks function in gen.Task simply by giving it the callback keyword argument)? I seem to get the right result, but is my way really running things asynchronously? Any complications/problems with the IOLoop this way?
I actually posted somewhat related questions on other threads
Is it possible to use tornado's gen.engine and gen.Task with twisted?
Using Tornado and Twisted at the same time
and the answers told me that I should "wrap" the inlineCallback function. I was wondering if adding the callback keyword is enough to "wrap" the twisted function to be suitable for Tornado.
Thanks in advance
What you're doing is mostly fine: adding a callback argument is enough to make a function usable with gen.Task. The only tricky part is exception handling: you'll need to run the callback from an except or finally block to ensure it always happens, and you should probably return some sort of value to indicate whether the operation succeeded or not (exceptions do not reliably pass through a gen.Task when you're working with non-Tornado code).
The wrapper approach (which I posted in Is it possible to use tornado's gen.engine and gen.Task with twisted?) has two advantages: it can be used with most Twisted code directly (since Twisted functions usually don't have a callback argument), and exceptions work more like you'd expect (an exception raised in the inner function will be propagated to the outer function).
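The advice about always running the callback can be sketched without Twisted or Tornado. In the simplified, synchronous example below, fake_remote_call is a hypothetical stand-in for proxy.callRemote, and the (ok, token) callback signature is an assumption chosen for illustration; it only shows the pattern of invoking the callback from a finally block with a success flag.

```python
def fake_remote_call(user, passwd):
    """Hypothetical stand-in for the remote login; fails on a bad password."""
    if passwd != "pwd":
        raise ValueError("bad credentials")
    return "token-for-" + user


def login(user, passwd, callback=None):
    """Callback-style function whose callback fires on success and on failure."""
    token, ok = None, False
    try:
        token = fake_remote_call(user, passwd)
        ok = True
    except Exception:
        # Report the failure through the status flag instead of raising,
        # since the exception may not reach whoever is waiting on the callback.
        pass
    finally:
        if callback is not None:
            callback(ok, token)


results = []
login("user", "pwd", callback=lambda ok, token: results.append((ok, token)))
login("user", "wrong", callback=lambda ok, token: results.append((ok, token)))
print(results)
```

Without the finally block, a failed call would never invoke the callback, and whatever is waiting on it would wait forever.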
