If I use async functions, then all the functions above them in the call stack must also be async, and each call must be preceded by the await keyword. This example emulates a modern program with several architectural layers:
import asyncio

async def func1():
    await asyncio.sleep(1)

async def func2():
    await func1()

async def func3():
    await func2()

async def func4():
    await func3()

async def func5():
    await func4()
When the executing thread meets an await, it can switch to another coroutine, and that context switch costs resources. With a large number of competing coroutines and several levels of abstraction, this overhead may begin to limit the performance of the entire system. In the example above, however, it makes sense to switch context in only one place, on the line:
    await asyncio.sleep(1)
How can I prevent context switching for certain asynchronous functions?
First of all, by default the context won't be switched in your example. In other words, until a coroutine hits something actually blocking (like a Future), it won't return control to the event loop; execution passes directly into the inner coroutine.
I don't know an easier way to demonstrate this than subclassing the default event loop implementation:
import asyncio

class TestEventLoop(asyncio.SelectorEventLoop):
    def _run_once(self):
        print('control inside event loop')
        super()._run_once()

async def func1():
    await asyncio.sleep(1)

async def func2():
    print('before func1')
    await func1()
    print('after func1')

async def main():
    print('before func2')
    await func2()
    print('after func2')

loop = TestEventLoop()
asyncio.set_event_loop(loop)
try:
    loop.run_until_complete(main())
finally:
    loop.close()
In the output you'll see:
control inside event loop
before func2
before func1
control inside event loop
control inside event loop
after func1
after func2
control inside event loop
func2 passed execution flow directly to func1, bypassing the event loop's _run_once, which could have switched to another coroutine. The event loop got control only when the blocking asyncio.sleep was reached.
This is, however, an implementation detail of the default event loop.
Second, and probably much more important, switching between coroutines is extremely cheap compared to the benefit we get from using asyncio for I/O.
It's also much cheaper than async alternatives such as switching between OS threads.
A situation where your code is slow because of too many coroutines is highly unlikely, but even if it happens, you should probably take a look at more efficient event loop implementations such as uvloop.
I would like to point out that if you ever run a sufficiently large number of coroutines that the overhead of switching context becomes an issue, you can ensure reduced concurrency using a Semaphore. I recently received a ~2x performance increase by reducing concurrency from 1000 to 50 for coroutines running HTTP requests.
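As a rough illustration of the Semaphore idea, here is a minimal sketch. The names fetch and run_limited, the state dictionary, and the 0.01 second sleep standing in for an HTTP request are all assumptions made for the example, not part of the original setup.

```python
import asyncio

async def fetch(i, sem, state):
    # Only `limit` coroutines can be inside this block at the same time.
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # stand-in for an HTTP request (assumption)
        state["active"] -= 1
        return i

async def run_limited(n_jobs, limit):
    sem = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(fetch(i, sem, state) for i in range(n_jobs)))
    return results, state["peak"]

results, peak = asyncio.run(run_limited(n_jobs=20, limit=5))
print(len(results), "jobs done, peak concurrency", peak)
```

All 20 coroutines are still created up front; the semaphore only bounds how many are doing work at once, so the right limit is something to measure for a given workload.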
I have a running asyncio loop, and from a coroutine I'm calling a sync function. Is there any way to call an async function from that sync function and get its result?
I tried the code below, but it is not working.
I want to print the output of hel() in i() without changing i() to an async function.
Is it possible, and if yes, how?
import asyncio

async def hel():
    return 4

def i():
    loop = asyncio.get_running_loop()
    x = asyncio.run_coroutine_threadsafe(hel(), loop)  ## need to change
    y = x.result()  ## this lines
    print(y)

async def h():
    i()

asyncio.run(h())
This is one of the most commonly asked types of question here. The tools to do this are in the standard library and require only a few lines of setup code. However, the result is not 100% robust and needs to be used with care. This is probably why it's not already a high-level function.
The basic problem with running an async function from a sync function is that async functions contain await expressions. Await expressions pause the execution of the current task and allow the event loop to run other tasks. Therefore async functions (coroutines) have special properties that allow them to yield control and resume again where they left off. Sync functions cannot do this. So when your sync function calls an async function and that function encounters an await expression, what is supposed to happen? The sync function has no ability to yield and resume.
A simple solution is to run the async function in another thread, with its own event loop. The calling thread blocks until the result is available. The async function behaves like a normal function, returning a value. The downside is that the async function now runs in another thread, which can cause all the well-known problems that come with threaded programming. For many cases this may not be an issue.
This can be set up as follows. This is a complete script that can be imported anywhere in an application. The test code that runs in the if __name__ == "__main__" block is almost the same as the code in the original question.
The thread is lazily initialized so it doesn't get created until it's used. It's a daemon thread so it will not keep your program from exiting.
The solution doesn't care if there is a running event loop in the main thread.
import asyncio
import threading

_loop = asyncio.new_event_loop()

_thr = threading.Thread(target=_loop.run_forever, name="Async Runner",
                        daemon=True)

# This will block the calling thread until the coroutine is finished.
# Any exception that occurs in the coroutine is raised in the caller
def run_async(coro):  # coro is a coroutine, see example
    if not _thr.is_alive():
        _thr.start()
    future = asyncio.run_coroutine_threadsafe(coro, _loop)
    return future.result()

if __name__ == "__main__":
    async def hel():
        await asyncio.sleep(0.1)
        print("Running in thread", threading.current_thread())
        return 4

    def i():
        y = run_async(hel())
        print("Answer", y, threading.current_thread())

    async def h():
        i()

    asyncio.run(h())
Output:
Running in thread <Thread(Async Runner, started daemon 28816)>
Answer 4 <_MainThread(MainThread, started 22100)>
To call an async function from a sync function you would normally use asyncio.run. However, it is meant to be the single entry point of an async program, and asyncio refuses to run it while another event loop is already running in the same thread, so you can't do that here.
That being said, the project https://github.com/erdewit/nest_asyncio patches the asyncio event loop to allow this, so after applying it you should be able to just call asyncio.run in your sync function.
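The restriction can be observed directly with the standard library alone. The sketch below reuses the names hel, i and h from the question; catching the error and returning it as a string is only done here to make the failure visible.

```python
import asyncio

async def hel():
    return 4

def i():
    coro = hel()
    try:
        # h() is already running on an event loop in this thread,
        # so asyncio.run() refuses to start a second one.
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"

async def h():
    return i()

outcome = asyncio.run(h())
print(outcome)
```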
This is an interesting situation I have come across. I wrote some code a while back that was synchronous but then I switched to async. The following code resembles the async code.
async def some_coroutine(args, obj):
    genarg1, genarg2 = args
    for (param1, param2) in some_generator(genarg1, genarg2):
        await asyncio.sleep(0)  # do this to prevent blocking and give control to event loop
        # do work
        for i in enumerate(param2):
            await asyncio.sleep(0)  # do this to prevent blocking and give control to event loop
            obj.somefunc(i, param1)
            pass
I want to refactor the above so that it is compatible with some of my non-async code. I used to have the for loops in their own functions, but that would block the event loop. I don't want them to take over the event loop without giving control back from time to time. I'd like to refactor to something like this, but can't figure out how to avoid the blocking:
async def some_coroutine(args, obj):
    genarg1, genarg2 = args
    somefunc(genarg1, genarg2, obj)

def somefunc(genarg1, genarg2, obj):
    for (param1, param2) in some_generator(genarg1, genarg2):
        # do work
        for i in enumerate(param2):
            obj.somefunc(i, param1)
            pass
Clearly, the first code block attempts not to block the event loop, because the code is in one coroutine and has await asyncio.sleep(0). But the refactored code splits the for loops out into a blocking function, and I'm not able to place await asyncio.sleep(0) in somefunc. I'd like to refactor the code this way so I can call it from other functions that don't use event loops (e.g., test cases), but when an event loop is used, I'd prefer it to be versatile enough not to block it.
Is this possible or am I just thinking about it wrong (i.e., refactor the code differently)?
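One possible direction, offered only as a sketch and not as the way this must be done: put the loops in a plain generator that yields after each unit of work. Synchronous callers exhaust the generator, while async callers await asyncio.sleep(0) at every yield. The names work_steps, run_sync and run_cooperative are hypothetical, and appending to a list stands in for obj.somefunc(i, param1).

```python
import asyncio

def work_steps(pairs, sink):
    """Plain generator: do one unit of work, then yield a checkpoint."""
    for param1, param2 in pairs:
        for i in enumerate(param2):
            sink.append((i, param1))  # stand-in for obj.somefunc(i, param1)
            yield

def run_sync(pairs, sink):
    # Synchronous callers (e.g. test cases) simply exhaust the generator.
    for _ in work_steps(pairs, sink):
        pass

async def run_cooperative(pairs, sink):
    # Async callers hand control back to the event loop at each checkpoint.
    for _ in work_steps(pairs, sink):
        await asyncio.sleep(0)

pairs = [("a", "xy"), ("b", "z")]
sync_out, async_out = [], []
run_sync(pairs, sync_out)
asyncio.run(run_cooperative(pairs, async_out))
print(sync_out == async_out)
```

The work itself lives in one place, and only the thin drivers differ between the blocking and the cooperative use.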
Having read the documentation and watched a number of videos, I am testing asyncio as an alternative to threading.
The docs are here:
https://docs.python.org/3/library/asyncio.html
I have constructed the following code with the expectation that it would produce the following.
before the sleep
hello
world
But in fact it produces this (world comes before hello):
before the sleep
world
hello
Here is the code:
import asyncio
import time

def main():
    ''' main entry point for the program '''
    # create the event loop and add to the loop
    # or run directly.
    asyncio.run(main_async())
    return

async def main_async():
    ''' the main async function '''
    await foo()
    await bar()
    return

async def foo():
    print('before the sleep')
    await asyncio.sleep(2)
    # time.sleep(0)
    print('world')
    return

async def bar():
    print('hello')
    await asyncio.sleep(0)
    return

if __name__=='__main__':
    ''' This is executed when run from the command line '''
    main()
The main() function calls the async main_async() function, which in turn calls both the foo and bar async functions, and both of those run await asyncio.sleep(x).
So my question is: why are hello and world coming out in the wrong (unexpected) order, given that I was expecting world to be printed approximately 2 seconds after hello?
You awaited foo() immediately, so bar() was never scheduled until foo() had run to completion; the execution of main_async will never do things after an await until the await has completed. If you want to schedule them both and let them interleave, replace:
await foo()
await bar()
with something like:
await asyncio.gather(foo(), bar())
which will convert both awaitables to tasks, scheduling both on the running asyncio event loop, then wait for both tasks to run to completion. With both scheduled at once, when one blocks on an await (and only await-based blocks, because only await yields control back to the event loop), the other will be allowed to run (and control can only return to the other task when the now running task awaits or finishes).
Basically, you have to remember that asyncio is cooperative multitasking. If you're only executing one task, and that task performs an await, there is nothing else to schedule, so nothing else runs until that await completes. If you block by any means other than an await, you still hold the event loop, and nothing else will get a chance to run, even if it's ready to go. So to gain any benefit from asyncio you need to be careful to:
1. Ensure other tasks are launched in time to occupy the event loop while the original task(s) are blocking on await.
2. Ensure you only block via await, so you don't monopolize the event loop unnecessarily.
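To make the difference concrete, here is a small sketch that records the order of events in a list instead of printing. The log parameter and the shortened 0.05 second sleep are assumptions made so the example runs quickly; foo and bar otherwise mirror the question.

```python
import asyncio

async def foo(log):
    log.append("before the sleep")
    await asyncio.sleep(0.05)  # shortened from 2 seconds
    log.append("world")

async def bar(log):
    log.append("hello")
    await asyncio.sleep(0)

async def sequential():
    log = []
    await foo(log)  # bar() is not even created until foo() has finished
    await bar(log)
    return log

async def concurrent():
    log = []
    await asyncio.gather(foo(log), bar(log))  # both are scheduled up front
    return log

seq_log = asyncio.run(sequential())
con_log = asyncio.run(concurrent())
print(seq_log)  # ['before the sleep', 'world', 'hello']
print(con_log)  # ['before the sleep', 'hello', 'world']
```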
Imagine I have a set of functions like this:
import time

def func1():
    func2()

def func2():
    time.sleep(1)  # simulate I/O operation
    print('done')
I want these to be usable synchronously:
# this would take two seconds to complete
func1()
func1()
as well as asynchronously, for example like this:
# this would take 1 second to complete
future = asyncio.gather(func1.run_async(), func1.run_async())
loop = asyncio.get_event_loop()
loop.run_until_complete(future)
The problem is, of course, that func1 somehow has to propagate the "context" it's running in (synchronously vs. asynchronously) to func2.
I want to avoid writing an asynchronous variant of each of my functions because that would result in a lot of duplicate code:
def func1():
    func2()

def func2():
    time.sleep(1)  # simulate I/O operation
    print('done')

# duplicate code below...
async def func1_async():
    await func2_async()

async def func2_async():
    await asyncio.sleep(1)  # simulate I/O operation
    print('done')
Is there any way to do this without having to implement an asynchronous copy of all my functions?
Here's my "not-an-answer-answer," which I know that Stack Overflow loves...
Is there any way to do this without having to implement an asynchronous copy of all my functions?
I don't think that there is. Making a "blanket translator" to convert functions to native coroutines seems next-to-impossible. That's because making a synchronous function asynchronous is about more than throwing an async keyword in front of it and a couple of await statements within it. Keep in mind that anything that you await must be awaitable.
Your def func2(): time.sleep(1) illustrates that point. Synchronous functions will make blocking calls, such as time.sleep(); asynchronous (native coroutines) will await non-blocking coroutines. Making this function asynchronous, as you point out, requires not just using async def func(), but awaiting asyncio.sleep(). Now let's say instead of time.sleep(), you're calling a more complex, blocking function. You build some sort of fancy decorator that slaps a function attribute called run_async, which is a callable, onto the decorated function. But how does that decorator know how to "translate" the blocking calls within func2() into their coroutine equivalents, if those are even defined? I can't think of any magic that would be smart enough to convert all of the calls in a synchronous function to their awaitable counterparts.
In your comments, you mention that this is for HTTP requests. For a real-world example, consider the differences in call signatures and APIs between the requests and aiohttp packages. In aiohttp, .text() is an instance method; in requests, .text is a property. How could you build something smart enough to know differences such as that?
I don't mean to be discouraging--but I think that using threading would be more realistic.
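A minimal sketch of what the threading route could look like, assuming the blocking functions stay exactly as they are and only the async caller changes. run_in_thread is a hypothetical helper built on loop.run_in_executor with the default thread pool; the sleep is shortened to 0.3 seconds and func2 returns a value instead of printing so the result can be checked.

```python
import asyncio
import time

def func2():
    time.sleep(0.3)  # simulate blocking I/O (shortened from 1 second)
    return "done"

def func1():
    return func2()

async def run_in_thread(func, *args):
    # Hypothetical helper: run a blocking callable in the default thread pool
    # so the event loop stays free while it waits.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, func, *args)

async def main():
    start = time.perf_counter()
    results = await asyncio.gather(run_in_thread(func1), run_in_thread(func1))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

The synchronous call func1() still works unchanged, and no async copy of func1 or func2 is needed; the price is that the functions now run in worker threads.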
So I found a way to achieve this, but since this is literally the first time I've done anything with async I can't guarantee that this doesn't have any bugs or that it's not a terrible idea.
The concept is actually pretty simple: Define your functions like normal asynchronous functions using async def and await where necessary, and then add a wrapper around them that automatically awaits the function if no event loop is running. Proof of concept:
import asyncio
import functools
import time

class Hybrid:
    def __init__(self, func):
        self._func = func
        functools.update_wrapper(self, func)

    def __call__(self, *args, **kwargs):
        coro = self._func(*args, **kwargs)
        loop = asyncio.get_event_loop()
        if loop.is_running():
            # if the loop is running, we must've been called from a
            # coroutine - so we'll return a future
            return loop.create_task(coro)
        else:
            # if the loop isn't running, we must've been called synchronously,
            # so we'll start the loop and let it execute the coroutine
            return loop.run_until_complete(coro)

    def run_async(self, *args, **kwargs):
        return self._func(*args, **kwargs)

@Hybrid
async def func1():
    await func2()

@Hybrid
async def func2():
    await asyncio.sleep(0.1)

def twice_sync():
    func1()
    func1()

def twice_async():
    future = asyncio.gather(func1.run_async(), func1.run_async())
    loop = asyncio.get_event_loop()
    loop.run_until_complete(future)

for func in [twice_sync, twice_async]:
    start = time.time()
    func()
    end = time.time()
    print('{:>11}: {} sec'.format(func.__name__, end-start))

# output:
# twice_sync: 0.20142340660095215 sec
# twice_async: 0.10088586807250977 sec
However, this approach does have its limitations. If you have a synchronous function calling a hybrid function, calling the synchronous function from an asynchronous function will change its behavior:
@Hybrid
async def hybrid_function():
    return "Success!"

def sync_function():
    print('hybrid returned:', hybrid_function())

async def async_function():
    sync_function()

sync_function()  # this prints "Success!" as expected

loop = asyncio.get_event_loop()
loop.run_until_complete(async_function())  # but this prints a pending Task instead of "Success!"
Take care to account for this!
Let's assume I'm new to asyncio. I'm using async/await to parallelize my current project, and I've found myself passing all of my coroutines to asyncio.ensure_future. Lots of stuff like this:
coroutine = my_async_fn(*args, **kwargs)
task = asyncio.ensure_future(coroutine)
What I'd really like is for a call to an async function to return an executing task instead of an idle coroutine. I created a decorator to accomplish what I'm trying to do.
import asyncio

def make_task(fn):
    def wrapper(*args, **kwargs):
        return asyncio.ensure_future(fn(*args, **kwargs))
    return wrapper

@make_task
async def my_async_func(*args, **kwargs):
    # usually making a request of some sort
    pass
Does asyncio have a built-in way of doing this that I haven't been able to find? Am I using asyncio wrong if I'm led to this problem to begin with?
asyncio had a @task decorator in very early pre-release versions, but we removed it.
The reason is that a decorator has no knowledge of which loop to use.
asyncio doesn't instantiate a loop on import; moreover, a test suite usually creates a new loop per test for the sake of test isolation.
Does asyncio have a built-in way of doing this I haven't been able to find?
No, asyncio doesn't have a decorator to turn coroutine functions into tasks.
Am I using asyncio wrong if I'm led to this problem to begin with?
It's hard to say without seeing what you're doing, but I think that may be the case. While creating tasks is a usual operation in asyncio programs, I doubt you have that many coroutines that should always be tasks.
Awaiting a coroutine is a way to "call some function asynchronously" while blocking the current execution flow until it finishes:
await some()
# you'll reach this line *only* when some() done
A task, on the other hand, is a way to "run a function in the background"; it won't block the current execution flow:
task = asyncio.ensure_future(some())
# you'll reach this line immediately
When we write asyncio programs we usually need the first way, since we usually need the result of one operation before starting the next:
text = await request(url)
links = parse_links(text) # we need to reach this line only when we got 'text'
Creating a task, on the other hand, usually means that the code that follows doesn't depend on the task's result. But again, that isn't always the case.
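The difference between the two can be seen by recording the order of events. This is a small sketch; the log list and the name argument are additions made for the example, and some is just a placeholder coroutine.

```python
import asyncio

async def some(log, name):
    log.append(f"{name} started")
    await asyncio.sleep(0.01)
    log.append(f"{name} finished")

async def with_await():
    log = []
    await some(log, "some")  # blocks this coroutine until some() is done
    log.append("next line")
    return log

async def with_task():
    log = []
    task = asyncio.ensure_future(some(log, "some"))  # only schedules some()
    log.append("next line")  # reached before some() has even started
    await task  # wait for it so the log is complete
    return log

await_log = asyncio.run(with_await())
task_log = asyncio.run(with_task())
print(await_log)  # ['some started', 'some finished', 'next line']
print(task_log)   # ['next line', 'some started', 'some finished']
```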
Since ensure_future returns immediately, some people try to use it as a way to run coroutines concurrently:
# wrong way to run concurrently:
asyncio.ensure_future(request(url1))
asyncio.ensure_future(request(url2))
asyncio.ensure_future(request(url3))
The correct way to achieve this is to use asyncio.gather:
# correct way to run concurrently:
await asyncio.gather(
request(url1),
request(url2),
request(url3),
)
Maybe this is what you want?
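One way to read the warning about bare ensure_future calls is that nothing waits for the tasks, so the caller can move on (or even finish) before any request completes. The sketch below illustrates that reading; request here is a stand-in that sleeps briefly, and the done list is an assumption added to observe completion.

```python
import asyncio

async def request(n, done):
    await asyncio.sleep(0.01)  # stand-in for network I/O
    done.append(n)

async def fire_and_forget():
    done = []
    for n in range(3):
        asyncio.ensure_future(request(n, done))
    # Nothing waits for the tasks, so this line runs before any of them finishes.
    return list(done)

async def gathered():
    done = []
    await asyncio.gather(*(request(n, done) for n in range(3)))
    return sorted(done)

unfinished = asyncio.run(fire_and_forget())
finished = asyncio.run(gathered())
print(unfinished)  # []
print(finished)    # [0, 1, 2]
```

In the first case asyncio.run cancels the still-pending tasks when the main coroutine returns, so the requests never complete at all.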
Update:
I think using tasks in your case is a good idea. But I don't think you should use a decorator: the coroutine's functionality (making a request) is still separate from the detail of how it is used (as a task). If controlling the synchronization of requests is separate from their main functionality, it also makes sense to move the synchronization into a separate function. I would do something like this:
import asyncio

async def request(i):
    print(f'{i} started')
    await asyncio.sleep(i)
    print(f'{i} finished')
    return i

async def when_ready(conditions, coro_to_start):
    await asyncio.gather(*conditions, return_exceptions=True)
    return await coro_to_start

async def main():
    t = asyncio.ensure_future
    t1 = t(request(1))
    t2 = t(request(2))
    t3 = t(request(3))
    t4 = t(when_ready([t1, t2], request(4)))
    t5 = t(when_ready([t2, t3], request(5)))
    await asyncio.gather(t1, t2, t3, t4, t5)

if __name__ == '__main__':
    loop = asyncio.get_event_loop()
    try:
        loop.run_until_complete(main())
    finally:
        loop.run_until_complete(loop.shutdown_asyncgens())
        loop.close()