Why can't I have `await` outside a function? [duplicate] - python

Suppose I have code like this:

async def fetch_text() -> str:
    return "text "

async def show_something():
    something = await fetch_text()
    print(something)
Which is fine. But then I want to clean the data, so I do:

async def fetch_text() -> str:
    return "text "

def fetch_clean_text(text: str) -> str:
    text = await fetch_text()
    return text.strip(text)

async def show_something():
    something = fetch_clean_text()
    print(something)
(I could clean text inside show_something(), but let's assume that show_something() can print many things and doesn't or shouldn't know the proper way of cleaning them.)
This is of course a SyntaxError: 'await' outside async function. But if this code could run, then although the await expression is not placed inside a coroutine function, it would be executed in the context of one. Why is this behavior not allowed?
I see one pro in this design: in my latter example, you can't see that show_something()'s body is doing something that can result in its suspension. But if I were to make fetch_clean_text() a coroutine, it would not only complicate things but probably also reduce performance. It makes little sense to have another coroutine that doesn't perform any I/O by itself. Is there a better way?

You can only use await in an async environment. Try changing the function to async:

import asyncio

whatever = ...

async def function(param) -> asyncio.coroutine:
    await param

asyncio.run(function(whatever))
Simple and easy.

I see one pro in this design; in my latter example, you can't see that show_something()'s body is doing something that can result in its suspension.
That's exactly why it was designed this way. Writing concurrent code can be very tricky, and the asyncio authors decided that it is critically important to always explicitly mark suspension points in the code.
This article explains it in detail (you can start from the "Get To The Point Already" paragraph).
But if I were to make fetch_clean_text() a coroutine, not only would it complicate things but would probably also reduce performance.
You need coroutines almost exclusively when you deal with I/O. I/O always takes much, much more time than the overhead of using coroutines. So it can be said that, compared to the I/O you already deal with, you won't lose any significant amount of execution time by using coroutines.
Is there a better way?
The only way I can suggest is to separate, as much as possible, the logic that deals with I/O (the async part) from the rest of the code (the sync part):

def clean_text(text: str) -> str:
    return text.strip()

async def fetch_text() -> str:
    return "text "

async def fetch_clean_text() -> str:
    text = await fetch_text()
    return clean_text(text)

async def show_something():
    something = await fetch_clean_text()
    print(something)
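
To run the example end to end, a minimal entry point might look like this (a sketch assuming Python 3.7+, where asyncio.run is available):

import asyncio

asyncio.run(show_something())  # prints "text"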

Related

Minimize repetition between the same function definition sync and async [duplicate]

This question already has an answer here:
Make function asynchronous depending on parameter
(1 answer)
Closed 2 years ago.
I have 2 code paths, 1 sync and 1 async. I want them to have the same behavior except for their synchronicity flavor.
To do this, I'm trying to keep as much code as possible in common between them, as DRY as I can.
Here is my problem:
def my_func(conn, several, keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit):
    query = template(keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit)
    res = find_all_sync(conn, query)
    return res

async def my_func(conn, several, keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit):
    query = template(keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit)
    res = await find_all_async(conn, query)
    return res
As you can see, I have embedded logic to call find_all_sync vs. find_all_async and to await the latter, so I can't simply write a sync version and wrap it in an async wrapper; I internally call a slightly different function.
Most of the rest of my logic is contained in the templating step, but I can't find any way to further abstract the repetition of the tedious arguments while still having them listed explicitly. What I'd imagine is something like....
# how can I make this /definition/ choose correctly sync or async?
def _my_func(conn, several, keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit, is_async=False):
    query = template(keyword, arguments, that, are, tedious, to, refactor, but, must, be, explicit)
    if is_async:
        # this doesn't work: the outer function is sync, so it can't await
        res = await find_all_async(conn, query)
    else:
        res = find_all_sync(conn, query)
    return res
Is there an easier way to have one code path for sync + async? Am I overlooking something simple? (I'm also open to slightly more complicated options like inspecting the arguments inside the function.)
There's a decorator in the curio package that allows a function to have an async and sync implementation, which can be found here:
https://github.com/dabeaz/curio/blob/master/curio/meta.py#L118
Here's the decorator implementation:
def awaitable(syncfunc):
    '''
    Decorator that allows an asynchronous function to be paired with a
    synchronous function in a single function call. The selection of
    which function executes depends on the calling context. For example:

        def spam(sock, maxbytes):                       (A)
            return sock.recv(maxbytes)

        @awaitable(spam)                                (B)
        async def spam(sock, maxbytes):
            return await sock.recv(maxbytes)

    In later code, you could use the spam() function in either a synchronous
    or asynchronous context. For example:

        def foo():
            ...
            r = spam(s, 1024)          # Calls synchronous function (A) above
            ...

        async def bar():
            ...
            r = await spam(s, 1024)    # Calls async function (B) above
            ...
    '''
    def decorate(asyncfunc):
        if inspect.signature(syncfunc) != inspect.signature(asyncfunc):
            raise TypeError(f'{syncfunc.__name__} and async {asyncfunc.__name__} have different signatures')

        @wraps(asyncfunc)
        def wrapper(*args, **kwargs):
            if from_coroutine():
                return asyncfunc(*args, **kwargs)
            else:
                return syncfunc(*args, **kwargs)
        wrapper._syncfunc = syncfunc
        wrapper._asyncfunc = asyncfunc
        wrapper._awaitable = True
        wrapper.__doc__ = syncfunc.__doc__ or asyncfunc.__doc__
        return wrapper
    return decorate
(Note the use of inspect.signature to quickly check that the two functions are compatible.)
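
As a quick standalone illustration of that signature check (not from curio): inspect.signature compares parameter names, order, defaults, and annotations, regardless of whether a function is async:

import inspect

def spam(sock, maxbytes):
    ...

async def spam_async(sock, maxbytes):
    ...

# Same parameters, so the signatures compare equal even though
# one function is a coroutine function:
assert inspect.signature(spam) == inspect.signature(spam_async)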
The method it uses to determine the calling context is tied up in this function:
def from_coroutine(level=2, _cache={}):
    f_code = _getframe(level).f_code
    if f_code in _cache:
        return _cache[f_code]
    if f_code.co_flags & _CO_FROM_COROUTINE:
        _cache[f_code] = True
        return True
    else:
        # Comment: It's possible that we could end up here if one calls a function
        # from the context of a list comprehension or a generator expression. For
        # example:
        #
        #     async def coro():
        #          ...
        #          a = [ func() for x in s ]
        #          ...
        #
        # Where func() is some function that we've wrapped with one of the decorators
        # below. If so, the code object is nested and has a name such as <listcomp> or <genexpr>
        if (f_code.co_flags & _CO_NESTED and f_code.co_name[0] == '<'):
            return from_coroutine(level + 2)
        else:
            _cache[f_code] = False
            return False
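
For reference, the two snippets above rely on a few names defined near the top of curio/meta.py. Roughly (paraphrased; see the linked source for the exact definitions):

import inspect
from sys import _getframe
from functools import wraps

# Code-object flags marking coroutine-flavoured code:
_CO_NESTED = inspect.CO_NESTED
_CO_FROM_COROUTINE = (inspect.CO_COROUTINE |
                      inspect.CO_ITERABLE_COROUTINE |
                      inspect.CO_ASYNC_GENERATOR)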
This involves some pretty gnarly frame hacks, but it should get you where you want to go. If you're only interested in determining whether a function is a coroutine or not, there's inspect.iscoroutinefunction:
>>> import inspect
>>> async def f():
... ...
...
>>> inspect.iscoroutinefunction(f)
True
>>> def g():
... ...
...
>>> inspect.iscoroutinefunction(g)
False
I think you should write either an async or a regular function and then use a library like unsync, so that it is executed in the event loop or in a thread/process executor according to its type.
Some examples from the unsync docs:
import asyncio
import time
from unsync import unsync

@unsync
async def unsync_async():
    await asyncio.sleep(1)
    return 'I like decorators'

@unsync
def non_async_function(seconds):
    time.sleep(seconds)
    return 'Run concurrently!'
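
Calling an @unsync-decorated function starts it immediately and returns a future-like object; blocking on .result() retrieves the value. A usage sketch based on the unsync README:

# Both calls start running as soon as they are made;
# .result() blocks the caller until the value is ready.
print(unsync_async().result())         # 'I like decorators'
print(non_async_function(1).result())  # 'Run concurrently!'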

How do yield-based coroutines in Python differ from coroutines with @asyncio.coroutine and @types.coroutine decorators?

I have been trying to understand asynchronous programming, particularly in Python. I understand that asyncio is built off of an event loop which schedules the execution of coroutines, but I have read about several different ways to define coroutines, and I am confused how they all relate to each other.
I read this article for more background information on the topic. Although it covers each of the four types of coroutines I have mentioned, it does not entirely describe how they differ. Without any external modules, a coroutine can be created by using yield as an expression on the right side of an assignment, and data can then be fed in through .send(). However, the code examples using the @asyncio.coroutine and @types.coroutine decorators never use .send(), from what I've found. Code examples from the article are below:
import asyncio
import datetime
import random
import types

# Coroutine using yield as an expression
def coro():
    hello = yield "Hello"
    yield hello

c = coro()
print(next(c), end=" ")
print(c.send("World"))  # Outputs Hello World

# Asyncio generator-based coroutine
@asyncio.coroutine
def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        yield from asyncio.sleep(random.randint(0, 5))

# Types generator-based coroutine
@types.coroutine
def my_sleep_func():
    yield from asyncio.sleep(random.randint(0, 5))

# Native coroutine in Python 3.5+
async def display_date(num, loop):
    end_time = loop.time() + 50.0
    while True:
        print("Loop: {} Time: {}".format(num, datetime.datetime.now()))
        if (loop.time() + 1.0) >= end_time:
            break
        await asyncio.sleep(random.randint(0, 5))
My questions are:
How do the yield coroutines relate to the types or asyncio decorated coroutines, and where is the .send() functionality utilized?
What functionality do the decorators add to the undecorated generator-based coroutine?
How do the @asyncio.coroutine and @types.coroutine decorators differ? I read this answer to try to understand this, but the only difference mentioned there is that the types coroutine executes like a subroutine if it has no yield statement. Is there anything more to it?
How do these generator-based coroutines differ in functionality and in implementation from the latest native async/await coroutines?
You will likely laugh: I took a look at the source code for asyncio.coroutine and found that it uses types.coroutine (any comment starting with #!> is added by me):
def coroutine(func):
    """Decorator to mark coroutines...."""
    #!> so clearly the async def is preferred.
    warnings.warn('"@coroutine" decorator is deprecated since Python 3.8, use "async def" instead',
                  DeprecationWarning,
                  stacklevel=2)
    if inspect.iscoroutinefunction(func):
        #!> since 3.5 clearly this is returning something functionally identical to async def.
        # In Python 3.5 that's all we need to do for coroutines
        # defined with "async def".
        return func

    if inspect.isgeneratorfunction(func):
        coro = func
    else:
        #!> omitted, makes a wrapper around a non generator function.

    #!> USES types.coroutine !!!!
    coro = types.coroutine(coro)
    if not _DEBUG:
        wrapper = coro
    else:
        #!> omitted, another wrapper for better error logging.

    wrapper._is_coroutine = _is_coroutine  # For iscoroutinefunction().
    return wrapper
So I think this comes down to history only: asyncio existed before types.coroutine, so the original handling was done in asyncio; when types.coroutine came along, the real wrapper moved there and asyncio kept only some extra wrapping on top. At the end of the day, both exist to mimic the behaviour of async def.
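
To make the relationship to .send() concrete, here is a small self-contained sketch (names are mine): a @types.coroutine generator is the one place a bare yield can still appear, and .send() is how the whole await chain is driven from the outside, which is essentially what the event loop does for you:

import types

@types.coroutine
def suspend():
    # The bare yield is the actual suspension point;
    # .send() delivers its argument right here.
    value = yield "paused"
    return value

async def native():
    return await suspend()  # await delegates to the generator, like yield from

c = native()
print(c.send(None))   # runs to the first yield and prints 'paused'
try:
    c.send("World")   # resumes the coroutine, which then finishes...
except StopIteration as e:
    print(e.value)    # ...delivering its return value: 'World'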

How to await in cdef?

I have this Cython code (simplified):
class Callback:
    async def foo(self):
        print('called')

cdef void call_foo(void* callback):
    print('call_foo')
    asyncio.wait_for(<object>callback.foo())

async def py_call_foo():
    call_foo(Callback())

async def example():
    loop.run_until_complete(py_call_foo())
What happens though: I get RuntimeWarning: coroutine Callback.foo was never awaited. And, in fact, it is never called. However, call_foo is called.
Any idea what's going on / how to get it to actually wait for Callback.foo to complete?
Extended version
In the example above some important details are missing; in particular, it is really difficult to get hold of the return value from call_foo. The real project setup has this:
A Bison parser that has rules. The rules are given a reference to a specially crafted struct, let's call it ParserState. This struct contains references to callbacks, which are called by the parser when rules match.
In the Cython code, there's a class, let's call it Parser, that users of the package are supposed to extend to make their custom parsers. This class has methods which then need to be called from the callbacks of ParserState.
Parsing is supposed to happen like this:
async def parse_file(file, parser):
    cdef ParserState state = allocate_parser_state(
        rule_callbacks,
        parser,
        file,
    )
    parse_with_bison(state)
The callbacks are of a general shape:
ctypedef void (*callback)(char* text, void* parser)
I have to admit I don't know how exactly asyncio implements await, and so I don't know if it is in general possible to do this with the setup that I have. My ultimate goal though is that multiple Python functions be able to iteratively parse different files, all at the same time more or less.
TLDR:
Coroutines must be await'ed or run by an event loop. A cdef function cannot await, but it can construct and return a coroutine.
Your actual problem is mixing synchronous with asynchronous code. Case in point:
async def example():
    loop.run_until_complete(py_call_foo())
This is similar to putting a subroutine in a Thread, but never starting it.
Even when started, this is a deadlock: the synchronous part would prevent the asynchronous part from running.
Asynchronous code must be awaited
An async def coroutine is similar to a def ...: yield generator: calling it only instantiates it. You must interact with it to actually run it:
def foo():
    print('running!')
    yield 1

bar = foo()        # no output!
print(next(bar))   # prints `running!` followed by `1`
Similarly, when you have an async def coroutine, you must either await it or schedule it in an event loop. Since asyncio.wait_for produces a coroutine, and you never await or schedule it, it is not run. This is the cause of the RuntimeWarning.
Note that the purpose of putting a coroutine into asyncio.wait_for is purely to add a timeout. It produces an asynchronous wrapper which must be await'ed.
async def call_foo(callback):
    print('call_foo')
    await asyncio.wait_for(callback.foo(), timeout=2)

asyncio.get_event_loop().run_until_complete(call_foo(Callback()))
Asynchronous functions need asynchronous instructions
The key for asynchronous programming is that it is cooperative: Only one coroutine executes until it yields control. Afterwards, another coroutine executes until it yields control. This means that any coroutine blocking without yielding control blocks all other coroutines as well.
In general, if something performs work without an await context, it is blocking. Notably, loop.run_until_complete is blocking. You have to call it from a synchronous function:
loop = asyncio.get_event_loop()

# async def function uses await
async def py_call_foo():
    await call_foo(Callback())

# non-await function is not async
def example():
    loop.run_until_complete(py_call_foo())

example()
Return values from coroutines
A coroutine can return results like a regular function.
async def make_result():
    await asyncio.sleep(0)
    return 1
If you await it from another coroutine, you directly get the return value:
async def print_result():
    result = await make_result()
    print(result)  # prints 1

asyncio.get_event_loop().run_until_complete(print_result())
To get the value from a coroutine inside a regular subroutine, use run_until_complete to run the coroutine:
def print_result():
    result = asyncio.get_event_loop().run_until_complete(make_result())
    print(result)

print_result()
A cdef/cpdef function cannot be a coroutine
Cython supports coroutines via yield from and await only for Python functions. Even for a classical coroutine, a cdef is not possible:
Error compiling Cython file:
------------------------------------------------------------
cdef call_foo(callback):
    print('call_foo')
    yield from asyncio.wait_for(callback.foo(), timeout=2)
   ^
------------------------------------------------------------
testbed.pyx:10:4: 'yield from' not supported here
You are perfectly fine calling a synchronous cdef function from a coroutine. You are perfectly fine scheduling a coroutine from a cdef function.
But you cannot await from inside a cdef function, nor await a cdef function. If you need to do that, as in your example, use a regular def function.
You can however construct and return a coroutine in a cdef function. This allows you to await the result in an outer coroutine:
# inner coroutine
async def pingpong(what):
    print('pingpong', what)
    await asyncio.sleep(0)
    return what

# cdef layer to instantiate and return coroutine
cdef make_pingpong():
    print('make_pingpong')
    return pingpong('nananana')

# outer coroutine
async def play():
    for i in range(3):
        result = await make_pingpong()
        print(i, '=>', result)

asyncio.get_event_loop().run_until_complete(play())
Note that despite the await, make_pingpong is not a coroutine. It is merely a factory for coroutines.
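
The same distinction can be seen in pure Python by replacing cdef with a plain def factory (a sketch, not the Cython original):

import asyncio
import inspect

async def pingpong(what):
    await asyncio.sleep(0)
    return what

def make_pingpong():             # plain factory, not a coroutine function
    return pingpong('nananana')  # constructs, but does not run, the coroutine

print(inspect.iscoroutinefunction(make_pingpong))  # False: just a factory
coro = make_pingpong()
print(inspect.iscoroutine(coro))                   # True: it returned a coroutine
print(asyncio.get_event_loop().run_until_complete(coro))  # nananana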

Not possible to chain native asyncio coroutines by simply returning them

I've been using py3.4's generator-based coroutines, and in several places I've chained them by simply having one coroutine return inner_coroutine() (like in the example below). However, I'm now converting them to py3.5's native coroutines and have found that this no longer works, as the inner coroutine never gets to run (see the output from running the example below). In order for the native inner coroutine to run, I need to use return await inner_coroutine() instead of the original return inner_coroutine().
I expected chaining of native coroutines to work in the same way as the generator-based ones, and can't find any documentation stating otherwise. Am I missing something or is this an actual limitation of native coroutines?
import asyncio

@asyncio.coroutine
def coro():
    print("Inside coro")

@asyncio.coroutine
def outer_coro():
    print("Inside outer_coro")
    return coro()

async def native_coro():
    print("Inside native_coro")

async def native_outer_coro():
    print("Inside native_outer_coro")
    # return await native_coro()  # this works!
    return native_coro()

loop = asyncio.get_event_loop()
loop.run_until_complete(outer_coro())
loop.run_until_complete(native_outer_coro())
And the output from running that example:
Inside outer_coro
Inside coro
Inside native_outer_coro
foo.py:26: RuntimeWarning: coroutine 'native_coro' was never awaited
loop.run_until_complete(native_outer_coro())
This is the same content as another answer, but stated in a way that I think will be easier to understand as a response to the question.
The way python determines whether something is a generator or a normal function is whether it contains a yield statement.
This creates an ambiguity with #asyncio.coroutine.
Whether your coroutine executes immediately or whether it waits until the caller calls next on the resulting generator object depends on whether your code actually happens to include a yield statement.
The native coroutines are by design unambiguously generators even if they do not happen to include any await statements.
This provides predictable behavior, but does not permit the form of chaining you are using.
You can, as you pointed out, do

return await inner_coroutine()

Note, however, that with the await syntax the inner coroutine is created while the outer coroutine is already executing in the event loop, whereas with the generator-based approach and no yield, the inner coroutine is constructed while the outer coroutine is being submitted to the event loop.
In most circumstances this difference does not matter.
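
For completeness, a minimal sketch of the working native pattern (inner/outer are stand-in names, not from the question):

import asyncio

async def inner():
    return 42

async def outer():
    # inner() is constructed here, while outer is already running
    # inside the event loop, and it is awaited immediately:
    return await inner()

print(asyncio.get_event_loop().run_until_complete(outer()))  # 42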
Your old version had the wrong logic and worked only due to the imperfect generator-based implementation. The new syntax made it possible to close this loophole and make asyncio more consistent.
The idea of coroutines is that they work like this:
c = coro_func() # create coroutine object
coro_res = await c # await this object to get result
In this example...

@asyncio.coroutine
def outer():
    return inner()

...awaiting outer() should return the inner() coroutine object, not that object's result. But due to the imperfect implementation, it awaits inner() (as if yield from inner() had been written).
With the new syntax, asyncio works exactly as it should: it returns the coroutine object instead of its result. And since this coroutine object is never awaited (which usually indicates a mistake), you get the warning.
You can change your code like this to see it all clearly:
loop = asyncio.get_event_loop()
print('old res:', loop.run_until_complete(outer_coro()))
print('new res:', loop.run_until_complete(native_outer_coro()))

Optional Synchronous Interface to Asynchronous Functions

I'm writing a library which uses Tornado Web's tornado.httpclient.AsyncHTTPClient to make requests, which gives my code an async interface of:
async def my_library_function():
    return await ...
I want to make this interface optionally serial if the user provides a kwarg, something like serial=True. But you obviously can't call a function defined with the async keyword from a normal function without await. This would be ideal, though it is almost certainly impossible in the language at the moment:

async def here_we_go():
    result = await my_library_function()

result = my_library_function(serial=True)
I haven't been able to find anything online where someone has come up with a nice solution to this. I don't want to have to reimplement basically the same code without the awaits splattered throughout.
Is this something that can be solved or would it need support from the language?
Solution (though use Jesse's instead - explained below)
Jesse's solution below is pretty much what I'm going to go with. I did end up getting the interface I originally wanted by using a decorator. Something like this:
import asyncio
from functools import wraps

def serializable(f):
    @wraps(f)
    def wrapper(*args, asynchronous=False, **kwargs):
        if asynchronous:
            return f(*args, **kwargs)
        else:
            # Get the current thread's event loop and use that
            loop = asyncio.get_event_loop()
            return loop.run_until_complete(f(*args, **kwargs))
    return wrapper
This gives you this interface:
result = await my_library_function(asynchronous=True)
result = my_library_function(asynchronous=False)
I sanity checked this on python's async mailing list and I was lucky enough to have Guido respond and he politely shot it down for this reason:
Code smell -- being able to call the same function both asynchronously and synchronously is highly surprising. Also it violates the rule of thumb that the value of an argument shouldn't affect the return type.
Nice to know it's possible though if not considered a great interface. Guido essentially suggested Jesse's answer and introducing the wrapping function as a helper util in the library instead of hiding it in a decorator.
When you want to call such a function synchronously, use run_until_complete:
asyncio.get_event_loop().run_until_complete(here_we_go())
Of course, if you do this often in your code, you should come up with an abbreviation for this statement, perhaps just:
def sync(fn, *args, **kwargs):
    return asyncio.get_event_loop().run_until_complete(fn(*args, **kwargs))
Then you could do:
result = sync(here_we_go)
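
The same helper works for coroutine functions that take arguments; for example (add is a stand-in coroutine, not from the question):

import asyncio

async def add(a, b):
    await asyncio.sleep(0)
    return a + b

print(sync(add, 1, b=2))  # 3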
