Learning asyncio: "coroutine was never awaited" warning error

I am trying to learn to use asyncio in Python to optimize scripts.
My example produces a "coroutine was never awaited" warning. Can you help me understand it and find out how to solve it?
import time
import datetime
import random
import asyncio
import aiohttp
import requests
def requete_bloquante(num):
    print(f'Get {num}')
    uid = requests.get("https://httpbin.org/uuid").json()['uuid']
    print(f"Res {num}: {uid}")
def faire_toutes_les_requetes():
    for x in range(10):
        requete_bloquante(x)
print("Bloquant : ")  # "Blocking:"
start = datetime.datetime.now()
faire_toutes_les_requetes()
exec_time = (datetime.datetime.now() - start).seconds
print(f"Pour faire 10 requêtes, ça prend {exec_time}s\n")  # "Making 10 requests takes {exec_time}s"
async def requete_sans_bloquer(num, session):
    print(f'Get {num}')
    async with session.get("https://httpbin.org/uuid") as response:
        uid = (await response.json()['uuid'])
    print(f"Res {num}: {uid}")
async def faire_toutes_les_requetes_sans_bloquer():
    loop = asyncio.get_event_loop()
    with aiohttp.ClientSession() as session:
        futures = [requete_sans_bloquer(x, session) for x in range(10)]
        loop.run_until_complete(asyncio.gather(*futures))
    loop.close()
    print("Fin de la boucle !")  # "End of the loop!"
print("Non bloquant : ")  # "Non-blocking:"
start = datetime.datetime.now()
faire_toutes_les_requetes_sans_bloquer()
exec_time = (datetime.datetime.now() - start).seconds
print(f"Pour faire 10 requêtes, ça prend {exec_time}s\n")
The first classic part of the code runs correctly, but the second half only produces:
synchronicite.py:43: RuntimeWarning: coroutine 'faire_toutes_les_requetes_sans_bloquer' was never awaited

You made faire_toutes_les_requetes_sans_bloquer an awaitable function, a coroutine, by using async def.
When you call an awaitable function, you create a new coroutine object. The code inside the function won't run until you then await on the function or run it as a task:
>>> async def foo():
...     print("Running the foo coroutine")
...
>>> foo()
<coroutine object foo at 0x10b186348>
>>> import asyncio
>>> asyncio.run(foo())
Running the foo coroutine
You want to keep that function synchronous, because you don't start the loop until inside that function:
def faire_toutes_les_requetes_sans_bloquer():
    loop = asyncio.get_event_loop()
    # ...
    loop.close()
    print("Fin de la boucle !")
However, you are also trying to use an aiohttp.ClientSession() object, and that's an asynchronous context manager: you are expected to use it with async with, not just with, so it has to be run inside an awaitable task. If you use with instead of async with, a TypeError("Use async with instead") exception will be raised.
That all means you need to move the loop.run_until_complete() call out of your faire_toutes_les_requetes_sans_bloquer() function, so you can keep that as the main task to be run; you can then call and await asyncio.gather() directly:
async def faire_toutes_les_requetes_sans_bloquer():
    async with aiohttp.ClientSession() as session:
        futures = [requete_sans_bloquer(x, session) for x in range(10)]
        await asyncio.gather(*futures)
    print("Fin de la boucle !")
print("Non bloquant : ")
start = datetime.datetime.now()
asyncio.run(faire_toutes_les_requetes_sans_bloquer())
exec_time = (datetime.datetime.now() - start).seconds
print(f"Pour faire 10 requêtes, ça prend {exec_time}s\n")
I used the new asyncio.run() function (Python 3.7 and up) to run the single main task. This creates a dedicated loop for that top-level coroutine and runs it until complete.
Next, you need to move the closing ) parenthesis on the await response.json() expression:
uid = (await response.json())['uuid']
You want to access the 'uuid' key on the result of the await, not the coroutine that response.json() produces.
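The precedence issue can be reproduced without any network access. The sketch below is only an illustration: fetch_json() is a hypothetical stand-in for response.json(), not part of aiohttp. Subscripting the coroutine object fails, while awaiting first and indexing the result works.

```python
import asyncio


async def fetch_json():
    # Hypothetical stand-in for aiohttp's `response.json()`: an awaitable
    # that eventually produces a dict.
    await asyncio.sleep(0)
    return {"uuid": "1234"}


async def main():
    # `await fetch_json()["uuid"]` would subscript the coroutine object
    # before awaiting it; coroutine objects are not subscriptable.
    coro = fetch_json()
    try:
        coro["uuid"]
    except TypeError as exc:
        error = str(exc)
    finally:
        coro.close()  # avoid a "never awaited" warning for this demo object

    # Await first, then index the resulting dict.
    uid = (await fetch_json())["uuid"]
    return error, uid


error, uid = asyncio.run(main())
print(error)
print(uid)
```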
With those changes your code works, but the asyncio version finishes in sub-second time; you may want to print fractional seconds:
exec_time = (datetime.datetime.now() - start).total_seconds()
print(f"Pour faire 10 requêtes, ça prend {exec_time:.3f}s\n")
On my machine, the synchronous requests code takes about 4-5 seconds, and the asyncio code completes in under .5 seconds.
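The same kind of speed-up can be observed offline. This is a minimal sketch that assumes asyncio.sleep() is an acceptable stand-in for network latency; fake_request() and the 0.1 second delay are made up for illustration and are not part of the original script.

```python
import asyncio
import time


async def fake_request(num, delay=0.1):
    # asyncio.sleep() stands in for network latency; it yields to the event loop.
    await asyncio.sleep(delay)
    return num


async def sequential(n):
    # Each request is awaited before the next one starts.
    return [await fake_request(x) for x in range(n)]


async def concurrent(n):
    # All requests are scheduled together and awaited as a group.
    return await asyncio.gather(*(fake_request(x) for x in range(n)))


def timed(coro_func, n):
    start = time.perf_counter()
    results = asyncio.run(coro_func(n))
    return results, time.perf_counter() - start


seq_results, seq_time = timed(sequential, 10)
con_results, con_time = timed(concurrent, 10)
print(f"sequential: {seq_time:.2f}s, concurrent: {con_time:.2f}s")
```

The sequential version waits for the sum of the delays, while the gather() version waits roughly for the longest single delay.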

Do not call loop.run_until_complete() inside an async function. The purpose of that method is to run an async function from a synchronous context. Here is how you should change the code:
async def faire_toutes_les_requetes_sans_bloquer():
    async with aiohttp.ClientSession() as session:
        futures = [requete_sans_bloquer(x, session) for x in range(10)]
        await asyncio.gather(*futures)
    print("Fin de la boucle !")
loop = asyncio.get_event_loop()
loop.run_until_complete(faire_toutes_les_requetes_sans_bloquer())
Note that a bare faire_toutes_les_requetes_sans_bloquer() call only creates a coroutine object, which has to be either awaited explicitly (for that you have to be inside an async context) or passed to an event loop. When it is left alone, Python complains about it. Your original code does neither.
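To illustrate why run_until_complete() does not belong inside a coroutine, here is a small sketch with made-up function names (inner, outer_wrong, outer_right). Inside a running loop the call is rejected with a RuntimeError, whereas a plain await works.

```python
import asyncio


async def inner():
    return "done"


async def outer_wrong():
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        # The loop is already running this coroutine, so it refuses to be
        # started a second time.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        return str(exc)
    finally:
        coro.close()  # tidy up the coroutine object that never ran


async def outer_right():
    # Inside a coroutine, simply await the other coroutine.
    return await inner()


message = asyncio.run(outer_wrong())
result = asyncio.run(outer_right())
print(message)
print(result)
```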

Not sure if this was the issue for you, but for me the result returned by the coroutine was itself another coroutine, so Python started warning me (note: not actually crashing) that I had created coroutines that were never awaited. After I actually awaited them (although I didn't really use the result), the warning went away.
Note that the main code I added was:
content_from_url_as_str: list[str] = await asyncio.gather(*content_from_url, return_exceptions=True)
which was inspired by seeing this:
response: str = await content_from_url[0]
Full code:
"""
-- Notes from [1]
Threading and asyncio both run on a single processor and therefore only run one at a time [1]. It's cooperative concurrency.
Note: threads.py has a very good block with good definitions for io-bound, cpu-bound if you need to recall it.
Note: coroutine is an important definition to understand before proceeding. Definition provided at the end of this tutorial.
General idea for asyncio is that there is a general event loop that controls how and when each task gets run.
The event loop is aware of each task and knows what states they are in.
For simplicity of exposition assume there are only two states:
a) Ready state
b) Waiting state
a) indicates that a task has work to do and can be run - while b) indicates that a task is waiting for a response from an
external thing (e.g. io, printer, disk, network, coq, etc). This simplified event loop has two lists of tasks
(ready_to_run_lst, waiting_lst) and runs things from the ready to run list. Once a task runs it is in complete control
until it cooperatively hands back control to the event loop.
The way it works is that the task that was run does what it needs to do (usually an io operation, or an interleaved op
or something like that) but crucially it gives control back to the event loop when the running task (with control) thinks is best.
(Note that this means the task might not have fully completed getting what it "fully needs".
This is probably useful when the user wants to implement the interleaving himself.)
Once the task cooperatively gives back control to the event loop it is placed by the event loop in either the
ready to run list or waiting list (depending how fast the io ran, etc). Then the event loop goes through the waiting
list to see if anything waiting has "returned".
Once all the tasks have been sorted into the right list the event loop is able to choose what to run next (e.g. by
choosing the one that has been waiting to be run the longest). This repeats until the event loop code you wrote is done.
The crucial point (and distinction with threads) that we want to emphasize is that in asyncio, an operation is never
interrupted in the middle and every switching/interleaving is done deliberately by the programmer.
In a way you don't have to worry about making your code thread safe.
For more details see [2], [3].
Asyncio syntax:
i) await = this is where the code you wrote calls an expensive function (e.g. an io) and thus hands back control to the
event loop. Then the event loop will likely put it in the waiting list and run some other task. Likely eventually
the event loop comes back to this function and runs the remaining code given that we have the value from the io now.
await = the key word that does (mainly) two things 1) gives control back to the event loop to see if there is something
else to run if we called it on a real expensive io operation (e.g. calling network, printer, etc) 2) gives control to
the new coroutine (code that might give up control cooperatively) that it is awaiting. If this is your own code with async
then it means it will go into this new async function (coroutine) you defined.
No real async benefits are being experienced until you call (await) a real io e.g. asyncio.sleep is the typical debug example.
todo: clarify, I think await doesn't actually give control back to the event loop but instead runs the "coroutine" this
await is pointing to. This means that if it's a real IO then it will actually give it back to the event loop
to do something else. In this case it is actually doing something "in parallel" in the async way.
Otherwise, it is your own python coroutine and thus gives it the control but "no true async parallelism" happens.
iii) async = approximately a flag that tells python the defined function might use await. This is not strictly true but
it gives you a simple model while you're getting started. todo - clarify async.
async = defines a coroutine. This doesn't define a real io, it only defines a function that can give up and give the
execution power to other coroutines or the (asyncio) event loop.
todo - context manager with async
ii) awaiting = when you call something (e.g. a function) that usually requires waiting for the io response/return/value.
todo: though it seems it's also the python keyword to give control to a coroutine you wrote in python or give
control to the event loop assuming you're awaiting an actual io call.
iv) async with = this creates a context manager from an object you would normally await - i.e. an object you would
wait to get the return value from an io. So usually we swap out (switch) from this object.
todo - e.g.
Note: - any function that calls await needs to be marked with async or you’ll get a syntax error otherwise.
- a task never gives up control without intentionally doing so e.g. never in the middle of an op.
Cons: - note how this also requires more careful thinking (but feels less dangerous than threading due to no pre-emptive
switching) due to the concurrency. Another disadvantage is again the idiosyncrasies of using this in python + learning
new syntax and details for it to actually work.
- understanding the semantics of new syntax + learning where to really put the syntax to avoid semantic errors.
- we needed a special asyncio compatible lib for requests, since the normal requests is not designed to inform
the event loop that it's blocked (or done blocking)
- if one of the tasks doesn't cooperate properly then the whole code can be a mess and slow it down.
- not all libraries support the async IO paradigm in python (e.g. asyncio, trio, etc).
Pro: + although learning where to put await and async might be annoying, it forces you to think carefully about your code
which in itself can be an advantage (e.g. better, faster, fewer bugs due to thinking carefully)
+ often faster...? (skeptical)
1. https://realpython.com/python-concurrency/
2. https://realpython.com/async-io-python/
3. https://stackoverflow.com/a/51116910/6843734
todo - read [2] later (or [3] but thats not a tutorial and its more details so perhaps not a priority).
asynchronous = 1) dictionary def: not happening at the same time
e.g. happening independently 2) computing def: happening independently of the main program flow
coroutine = computer program components that generalize subroutines for non-preemptive multitasking, by allowing execution to be suspended and resumed.
So basically it's a routine/"function" that can give up control in "a controlled way" (i.e. not randomly like with threads).
Usually they are associated with a single process -- so it's concurrent but not parallel.
Interesting note: Coroutines are well-suited for implementing familiar program components such as cooperative tasks, exceptions, event loops, iterators, infinite lists and pipes.
Likely we have an event loop in this document as an example. I guess yield and operators too are good examples!
Interesting contrast with subroutines: Subroutines are special cases of coroutines.[3] When subroutines are invoked, execution begins at the start,
and once a subroutine exits, it is finished; an instance of a subroutine only returns once, and does not hold state between invocations.
By contrast, coroutines can exit by calling other coroutines, which may later return to the point where they were invoked in the original coroutine;
from the coroutine's point of view, it is not exiting but calling another coroutine.
Coroutines are very similar to threads. However, coroutines are cooperatively multitasked, whereas threads are typically preemptively multitasked.
event loop = event loop is a programming construct or design pattern that waits for and dispatches events or messages in a program.
Appendix:
For I/O-bound problems, there’s a general rule of thumb in the Python community:
“Use asyncio when you can, threading when you must.”
asyncio can provide the best speed up for this type of program, but sometimes you will require critical libraries that
have not been ported to take advantage of asyncio.
Remember that any task that doesn’t give up control to the event loop will block all of the other tasks
-- Notes from [2]
see asyncio_example2.py file.
The sync file should have taken longer e.g. in one run the async file took:
Downloaded 160 sites in 0.4063692092895508 seconds
While the sync option took:
Downloaded 160 in 3.351937770843506 seconds
"""
import asyncio
from asyncio import Task
from asyncio.events import AbstractEventLoop
import aiohttp
from aiohttp import ClientResponse
from aiohttp.client import ClientSession
from typing import Coroutine
import time
async def download_site(session: ClientSession, url: str) -> str:
    async with session.get(url) as response:
        print(f"Read {response.content_length} from {url}")
        # note: text() is not awaited here, so the caller receives a coroutine object
        return response.text()
async def download_all_sites(sites: list[str]) -> list[str]:
    # async with = this creates a context manager from an object you would normally await - i.e. an object you would wait to get the return value from an io. So usually we swap out (switch) from this object.
    async with aiohttp.ClientSession() as session:  # we will usually await session.FUNCS
        # create all the download code as coroutines/tasks to be later managed/run by the event loop
        tasks: list[Task] = []
        for url in sites:
            # creates a task from a coroutine todo: basically it seems it creates a callable coroutine? (i.e. function that is able to give up control cooperatively or runs an external io and also thus gives back control cooperatively to the event loop). read more? https://stackoverflow.com/questions/36342899/asyncio-ensure-future-vs-baseeventloop-create-task-vs-simple-coroutine
            task: Task = asyncio.ensure_future(download_site(session, url))
            tasks.append(task)
        # runs tasks/coroutines in the event loop and aggregates the results. todo: does this halt until all coroutines have returned? I think so due to the paradigm of how async code works.
        content_from_url: list[ClientResponse.text] = await asyncio.gather(*tasks, return_exceptions=True)
        assert isinstance(content_from_url[0], Coroutine)  # note all responses are coroutines
        print(f'result after aggregating/doing all coroutine tasks/jobs = {content_from_url=}')
        # this is needed since the response is in a coroutine object for some reason
        content_from_url_as_str: list[str] = await asyncio.gather(*content_from_url, return_exceptions=True)
        print(f'result after getting response from coroutines that hold the text = {content_from_url_as_str=}')
        return content_from_url_as_str
if __name__ == "__main__":
    # - args
    num_sites: int = 80
    sites: list[str] = ["https://www.jython.org", "http://olympus.realpython.org/dice"] * num_sites
    start_time: float = time.time()
    # - run the same 160 tasks but without async paradigm, should be slower!
    # note: you can't actually do this here because you have the async definitions to your functions.
    # to test the synchronous version see the synchronous.py file. Then compare the two run times.
    # await download_all_sites(sites)
    # download_all_sites(sites)
    # - Execute the coroutine coro and return the result.
    asyncio.run(download_all_sites(sites))
    # - run event loop manager and run all tasks with cooperative concurrency
    # asyncio.get_event_loop().run_until_complete(download_all_sites(sites))
    # makes explicit the creation of the event loop that manages the coroutines & external ios
    # event_loop: AbstractEventLoop = asyncio.get_event_loop()
    # asyncio.run(download_all_sites(sites))
    # making creating the coroutine that hasn't been run yet with its args explicit
    # event_loop: AbstractEventLoop = asyncio.get_event_loop()
    # download_all_sites_coroutine: Coroutine = download_all_sites(sites)
    # asyncio.run(download_all_sites_coroutine)
    # - print stats about the content download and duration
    duration = time.time() - start_time
    print(f"Downloaded {len(sites)} sites in {duration} seconds")
    print('Success.\a')
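A likely explanation for the "coroutine inside a coroutine" is that download_site() returns response.text() without awaiting it. The sketch below uses a hypothetical FakeResponse class as a stand-in for aiohttp's response object (so that it runs without network access) and compares the two variants. With a real aiohttp response it is probably also safer to read the body before leaving the async with block, since the connection may no longer be readable afterwards.

```python
import asyncio


class FakeResponse:
    """Hypothetical stand-in for aiohttp's ClientResponse (no network needed)."""

    async def text(self):
        await asyncio.sleep(0)
        return "<html>hello</html>"


async def download_returning_coroutine(response):
    # Calling an async method without `await` hands back a coroutine object.
    return response.text()


async def download_returning_text(response):
    # Awaiting here means the caller receives the final string.
    return await response.text()


async def main():
    response = FakeResponse()
    pending = await download_returning_coroutine(response)
    was_coroutine = asyncio.iscoroutine(pending)
    first = await pending  # the extra await the code above needed
    second = await download_returning_text(response)
    return was_coroutine, first, second


was_coroutine, first, second = asyncio.run(main())
print(was_coroutine, first == second)
```

With the awaited variant, a single asyncio.gather() over the tasks would already yield strings, so the second gather() would not be needed.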

Related

How to perform a synchronous task in an asynchronous FastAPI REST endpoint without blocking the event loop? [duplicate]

I have the following code:
import time
from fastapi import FastAPI, Request
app = FastAPI()
@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}
If I run my code on localhost - e.g., http://localhost:8501/ping - in different tabs of the same browser window, I get:
Hello
bye
Hello
bye
instead of:
Hello
Hello
bye
bye
I have read about using httpx, but still, I cannot have a true parallelization. What's the problem?

As per FastAPI's documentation:
When you declare a path operation function with normal def instead
of async def, it is run in an external threadpool that is then
awaited, instead of being called directly (as it would block the
server).
also, as described here:
If you are using a third party library that communicates with
something (a database, an API, the file system, etc.) and doesn't have
support for using await, (this is currently the case for most
database libraries), then declare your path operation functions as
normally, with just def.
If your application (somehow) doesn't have to communicate with
anything else and wait for it to respond, use async def.
If you just don't know, use normal def.
Note: You can mix def and async def in your path operation functions as much as you need and define each one using the best
option for you. FastAPI will do the right thing with them.
Anyway, in any of the cases above, FastAPI will still work
asynchronously and be extremely fast.
But by following the steps above, it will be able to do some
performance optimizations.
Thus, def endpoints (in the context of asynchronous programming, a function defined with just def is called a synchronous function) run in a separate thread from an external threadpool (that is then awaited, and hence, FastAPI will still work asynchronously); in other words, the server processes the requests concurrently. In contrast, async def endpoints run in the event loop, on the main (single) thread; that is, the server processes the requests sequentially, as long as there is no await call to (normally) non-blocking I/O-bound operations inside such endpoints/routes, such as waiting for (1) data from the client to be sent through the network, (2) contents of a file in the disk to be read, (3) a database operation to finish, etc. (have a look here). In those cases, the server will process the requests concurrently/asynchronously. (Note that the same concept applies not only to FastAPI endpoints, but to Background Tasks as well (see Starlette's BackgroundTask class implementation); hence, after reading this answer to the end, you should be able to decide whether you should define a FastAPI endpoint or background task function with def or async def.)
The keyword await (which works only within an async def function) passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine (a coroutine object is the result of calling an async def function), and tells the event loop to let something else run, until the awaited task completes. Note that just because you may define a custom function with async def and then await it inside your endpoint, it doesn't mean that your code will work asynchronously, if that custom function contains, for example, calls to time.sleep(), CPU-bound tasks, non-async I/O libraries, or any other blocking call that is incompatible with asynchronous Python code.
In FastAPI, for example, when using the async methods of UploadFile, such as await file.read() and await file.write(), FastAPI/Starlette, behind the scenes, actually runs such methods of File objects in an external threadpool (using the async run_in_threadpool() function) and awaits it, otherwise, such methods/operations would block the event loop. You can find out more by having a look at the implementation of the UploadFile class.
Asynchronous code with async and await is many times summarised as using coroutines. Coroutines are collaborative (or cooperatively multitasked), meaning that "at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended" (see here and here for more info on coroutines). As described in this article:
Specifically, whenever execution of a currently-running coroutine
reaches an await expression, the coroutine may be suspended, and
another previously-suspended coroutine may resume execution if what it
was suspended on has since returned a value. Suspension can also
happen when an async for block requests the next value from an
asynchronous iterator or when an async with block is entered or
exited, as these operations use await under the hood.
If, however, a blocking I/O-bound or CPU-bound operation was directly executed/called inside an async def function/endpoint, it would block the main thread (i.e., the event loop). Hence, a blocking operation such as time.sleep() in an async def endpoint would block the entire server (as in the example provided in your question). Thus, if your endpoint is not going to make any async calls, you could declare it with just def instead, which would be run in an external threadpool that would then be awaited, as explained earlier (more solutions are given in the following sections). Example:
@app.get("/ping")
def ping(request: Request):
    #print(request.client)
    print("Hello")
    time.sleep(5)
    print("bye")
    return "pong"
Otherwise, if the functions that you had to execute inside the endpoint are async functions that you had to await, you should define your endpoint with async def. To demonstrate this, the example below uses the asyncio.sleep() function (from the asyncio library), which provides a non-blocking sleep operation. The await asyncio.sleep() method will suspend the execution of the surrounding coroutine (until the sleep operation completes), thus allowing other tasks in the event loop to run. Similar examples are given here and here as well.
import asyncio
@app.get("/ping")
async def ping(request: Request):
    #print(request.client)
    print("Hello")
    await asyncio.sleep(5)
    print("bye")
    return "pong"
Both the path operation functions above will print out the specified messages to the screen in the same order as mentioned in your question—if two requests arrived at around the same time—that is:
Hello
Hello
bye
bye
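The effect of the two kinds of sleep can be reproduced without FastAPI. The following sketch is only an approximation of what the server does: the handler names are made up, and asyncio.gather() stands in for two requests arriving at about the same time.

```python
import asyncio
import time

events = []


async def blocking_handler(name):
    events.append(f"{name}: Hello")
    time.sleep(0.1)  # blocks the whole event loop, like in the question
    events.append(f"{name}: bye")


async def cooperative_handler(name):
    events.append(f"{name}: Hello")
    await asyncio.sleep(0.1)  # suspends only this coroutine
    events.append(f"{name}: bye")


async def run_two(handler):
    # Simulate two requests that arrive at roughly the same time.
    events.clear()
    await asyncio.gather(handler("A"), handler("B"))
    return list(events)


blocking_order = asyncio.run(run_two(blocking_handler))
cooperative_order = asyncio.run(run_two(cooperative_handler))
print(blocking_order)
print(cooperative_order)
```

With the blocking handler, the second "Hello" only appears after the first "bye"; with the cooperative handler, both "Hello" entries come first.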
Important Note
When you call your endpoint for the second (third, and so on) time, please remember to do that from a tab that is isolated from the browser's main session; otherwise, succeeding requests (i.e., coming after the first one) will be blocked by the browser (on client side), as the browser will be waiting for response from the server for the previous request before sending the next one. You can confirm that by using print(request.client) inside the endpoint, where you would see the hostname and port number being the same for all incoming requests—if requests were initiated from tabs opened in the same browser window/session)—and hence, those requests would be processed sequentially, because of the browser sending them sequentially in the first place. To solve this, you could either:
Reload the same tab (as is running), or
Open a new tab in an Incognito Window, or
Use a different browser/client to send the request, or
Use the httpx library to make asynchronous HTTP requests, along with the awaitable asyncio.gather(), which allows executing multiple asynchronous operations concurrently and then returns a list of results in the same order the awaitables (tasks) were passed to that function (have a look at this answer for more details).
Example:
import httpx
import asyncio
URLS = ['http://127.0.0.1:8000/ping'] * 2
async def send(url, client):
    return await client.get(url, timeout=10)
async def main():
    async with httpx.AsyncClient() as client:
        tasks = [send(url, client) for url in URLS]
        responses = await asyncio.gather(*tasks)
        print(*[r.json() for r in responses], sep='\n')
asyncio.run(main())
In case you had to call different endpoints that may take different time to process a request, and you would like to print the response out on client side as soon as it is returned from the server—instead of waiting for asyncio.gather() to gather the results of all tasks and print them out in the same order the tasks were passed to the send() function—you could replace the send() function of the example above with the one shown below:
async def send(url, client):
    res = await client.get(url, timeout=10)
    print(res.json())
    return res
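A different way to handle results as they arrive, not the approach used in the answer above, is asyncio.as_completed(), which yields awaitables in completion order rather than submission order. A minimal offline sketch, with fake_endpoint() and its delays invented for illustration:

```python
import asyncio


async def fake_endpoint(name, delay):
    # Stand-in for an HTTP call that takes `delay` seconds to answer.
    await asyncio.sleep(delay)
    return name


async def main():
    tasks = [
        asyncio.create_task(fake_endpoint("slow", 0.2)),
        asyncio.create_task(fake_endpoint("fast", 0.05)),
    ]
    order = []
    # Results are handled in the order the tasks finish.
    for finished in asyncio.as_completed(tasks):
        order.append(await finished)
    return order


order = asyncio.run(main())
print(order)
```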
Async/await and Blocking I/O-bound or CPU-bound Operations
If you are required to use async def (as you might need to await for coroutines inside your endpoint), but also have some synchronous I/O-bound or CPU-bound operation (long-running computation task) that will block the event loop (essentially, the entire server) and won't let other requests to go through, for example:
@app.post("/ping")
async def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = await file.read()
        res = cpu_bound_task(contents)  # this will block the event loop
    finally:
        await file.close()
    print("bye")
    return "pong"
then:
You should check whether you could change your endpoint's definition to normal def instead of async def. For example, if the only method in your endpoint that has to be awaited is the one reading the file contents (as you mentioned in the comments section below), you could instead declare the type of the endpoint's parameter as bytes (i.e., file: bytes = File()) and thus, FastAPI would read the file for you and you would receive the contents as bytes. Hence, there would be no need to use await file.read(). Please note that the above approach should work for small files, as the entire file contents would be stored in memory (see the documentation on File Parameters); hence, if your system does not have enough RAM available to accommodate the accumulated data (if, for example, you have 8GB of RAM, you can't load a 50GB file), your application may end up crashing. Alternatively, you could call the .read() method of the SpooledTemporaryFile directly (which can be accessed through the .file attribute of the UploadFile object), so that again you don't have to await the .read() method; and as you can now declare your endpoint with normal def, each request will run in a separate thread (example is given below). For more details on how to upload a File, as well as how Starlette/FastAPI uses SpooledTemporaryFile behind the scenes, please have a look at this answer and this answer.
@app.post("/ping")
def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = file.file.read()
        res = cpu_bound_task(contents)
    finally:
        file.file.close()
    print("bye")
    return "pong"
Use FastAPI's (Starlette's) run_in_threadpool() function from the concurrency module, as @tiangolo suggested here, which "will run the function in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked" (see here). As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports both sequence arguments and keyword arguments".
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, contents)
Alternatively, use asyncio's loop.run_in_executor()—after obtaining the running event loop using asyncio.get_running_loop()—to run the task, which, in this case, you can await for it to complete and return the result(s), before moving on to the next line of code. Passing None as the executor argument, the default executor will be used; that is ThreadPoolExecutor:
import asyncio
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, cpu_bound_task, contents)
or, if you would like to pass keyword arguments instead, you could use a lambda expression, or, preferably, functools.partial(), which is specifically recommended in the documentation for loop.run_in_executor():
import asyncio
from functools import partial
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, partial(cpu_bound_task, some_arg=contents))
You could also run your task in a custom ThreadPoolExecutor. For instance:
import asyncio
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ThreadPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
In Python 3.9+, you could also use asyncio.to_thread() to asynchronously run a synchronous function in a separate thread, which essentially uses await loop.run_in_executor(None, func_call) under the hood, as can be seen in the implementation of asyncio.to_thread(). The to_thread() function takes the blocking function to execute, as well as any arguments (*args and/or **kwargs) to the function, and then returns a coroutine that can be awaited. Example:
import asyncio
res = await asyncio.to_thread(cpu_bound_task, contents)
ThreadPoolExecutor will successfully prevent the event loop from being blocked, but won't give you the performance improvement you would expect from running code in parallel; especially, when one needs to perform CPU-bound operations, such as the ones described here (e.g., audio or image processing, machine learning, and so on). It is thus preferable to run CPU-bound tasks in a separate process—using ProcessPoolExecutor, as shown below—which, again, you can integrate with asyncio, in order to await it to finish its work and return the result(s). As described here, on Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc. Basically, your code must be under if __name__ == '__main__':.
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
Use more workers. For example, uvicorn main:app --workers 4 (if you are using Gunicorn as a process manager with Uvicorn workers, please have a look at this answer). Note: Each worker "has its own things, variables and memory". This means that global variables/objects, etc., won't be shared across the processes/workers. In this case, you should consider using a database storage, or Key-Value stores (Caches), as described here and here. Additionally, note that "if you are consuming a large amount of memory in your code, each process will consume an equivalent amount of memory".
If you need to perform heavy background computation and you don't necessarily need it to be run by the same process (for example, you don't need to share memory, variables, etc), you might benefit from using other bigger tools like Celery, as described in FastAPI's documentation.
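As a rough, FastAPI-free illustration of the thread-offloading options above, the sketch below runs a blocking function through asyncio.to_thread() (Python 3.9+) while a second coroutine keeps ticking. blocking_task() and ticker() are invented names, and time.sleep() merely simulates a blocking call; it does not model a truly CPU-bound workload, for which a process pool may be the better fit, as noted above.

```python
import asyncio
import time


def blocking_task(seconds):
    # Stand-in for a synchronous call such as cpu_bound_task(contents).
    time.sleep(seconds)
    return "finished"


async def ticker(ticks, interval=0.05, count=4):
    # Records a timestamp each time the event loop gets to run this coroutine.
    for _ in range(count):
        await asyncio.sleep(interval)
        ticks.append(time.perf_counter())


async def main():
    ticks = []
    start = time.perf_counter()
    # Run the blocking function in a worker thread while the ticker runs.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_task, 0.3),
        ticker(ticks),
    )
    elapsed = time.perf_counter() - start
    ticks_during_block = [t for t in ticks if t - start < 0.3]
    return result, len(ticks_during_block), elapsed


result, ticks_during_block, elapsed = asyncio.run(main())
print(result, ticks_during_block, f"{elapsed:.2f}s")
```

If blocking_task() were called directly inside main(), the ticker would not get a chance to run until the blocking call returned.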

Q :" ... What's the problem? "
A :The FastAPI documentation is explicit to say the framework uses in-process tasks ( as inherited from Starlette ).
That, by itself, means, that all such task compete to receive ( from time to time ) the Python Interpreter GIL-lock - being efficiently a MUTEX-terrorising Global Interpreter Lock, which in effect re-[SERIAL]-ises any and all amounts of Python Interpreter in-process threads to work as one-and-only-one-WORKS-while-all-others-stay-waiting...
On fine-grain scale, you see the result -- if spawning another handler for the second ( manually initiated from a second FireFox-tab ) arriving http-request actually takes longer than a sleep has taken, the result of GIL-lock interleaved ~ 100 [ms] time-quanta round-robin ( all-wait-one-can-work ~ 100 [ms] before each next round of GIL-lock release-acquire-roulette takes place ) Python Interpreter internal work does not show more details, you may use more details ( depending on O/S type or version ) from here to see more in-thread LoD, like this inside the async-decorated code being performed :
import time
import threading
from fastapi import FastAPI, Request
TEMPLATE = "INF[{0:_>20d}]: t_id( {1: >20d} ):: {2:}"
print( TEMPLATE.format( time.perf_counter_ns(),
threading.get_ident(),
"Python Interpreter __main__ was started ..."
)
...
#app.get("/ping")
async def ping( request: Request ):
""" __doc__
[DOC-ME]
ping( Request ): a mock-up AS-IS function to yield
a CLI/GUI self-evidence of the order-of-execution
RETURNS: a JSON-alike decorated dict
[TEST-ME] ...
"""
print( TEMPLATE.format( time.perf_counter_ns(),
threading.get_ident(),
"Hello..."
)
#------------------------------------------------- actual blocking work
time.sleep( 5 )
#------------------------------------------------- actual blocking work
print( TEMPLATE.format( time.perf_counter_ns(),
threading.get_ident(),
"...bye"
)
return { "ping": "pong!" }
Last, but not least, do not hesitate to read more about all other sharks threads-based code may suffer from ... or even cause ... behind the curtains ...
Ad Memorandum
A mixture of GIL-lock, thread-based pools, asynchronous decorators, blocking and event-handling -- a sure mix to uncertainties & HWY2HELL ;o)

FastAPI runs api-calls in serial instead of parallel fashion

I have the following code:
import time
from fastapi import FastAPI, Request
app = FastAPI()
@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}
If I run my code on localhost - e.g., http://localhost:8501/ping - in different tabs of the same browser window, I get:
Hello
bye
Hello
bye
instead of:
Hello
Hello
bye
bye
I have read about using httpx, but still, I cannot have a true parallelization. What's the problem?

As per FastAPI's documentation:
When you declare a path operation function with normal def instead
of async def, it is run in an external threadpool that is then
awaited, instead of being called directly (as it would block the
server).
also, as described here:
If you are using a third party library that communicates with
something (a database, an API, the file system, etc.) and doesn't have
support for using await, (this is currently the case for most
database libraries), then declare your path operation functions as
normally, with just def.
If your application (somehow) doesn't have to communicate with
anything else and wait for it to respond, use async def.
If you just don't know, use normal def.
Note: You can mix def and async def in your path operation functions as much as you need and define each one using the best
option for you. FastAPI will do the right thing with them.
Anyway, in any of the cases above, FastAPI will still work
asynchronously and be extremely fast.
But by following the steps above, it will be able to do some
performance optimizations.
Thus, def endpoints (in the context of asynchronous programming, a function defined with just def is called a synchronous function) run in a separate thread from an external threadpool that is then awaited, and hence FastAPI will still work asynchronously. In other words, the server processes such requests concurrently. async def endpoints, on the other hand, run directly in the event loop, on the main (single) thread. That is, the server processes those requests sequentially, as long as there is no await call to (normally) non-blocking I/O-bound operations inside such endpoints/routes, such as waiting for (1) data from the client to be sent through the network, (2) contents of a file on disk to be read, (3) a database operation to finish, etc. (have a look here). When such await calls are present, the server will process the requests concurrently/asynchronously. Note that the same concept applies not only to FastAPI endpoints, but to Background Tasks as well (see Starlette's BackgroundTask class implementation); hence, after reading this answer to the end, you should be able to decide whether you should define a FastAPI endpoint or background task function with def or async def.
The keyword await (which works only within an async def function) passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine (a coroutine object is the result of calling an async def function) and tells the event loop to let something else run until the awaited task completes. Note that just because you may define a custom function with async def and then await it inside your endpoint, it doesn't mean that your code will work asynchronously if that custom function contains, for example, calls to time.sleep(), CPU-bound tasks, non-async I/O libraries, or any other blocking call that is incompatible with asynchronous Python code.
In FastAPI, for example, when using the async methods of UploadFile, such as await file.read() and await file.write(), FastAPI/Starlette, behind the scenes, actually runs such methods of File objects in an external threadpool (using the async run_in_threadpool() function) and awaits it, otherwise, such methods/operations would block the event loop. You can find out more by having a look at the implementation of the UploadFile class.
Asynchronous code with async and await is many times summarised as using coroutines. Coroutines are collaborative (or cooperatively multitasked), meaning that "at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended" (see here and here for more info on coroutines). As described in this article:
Specifically, whenever execution of a currently-running coroutine
reaches an await expression, the coroutine may be suspended, and
another previously-suspended coroutine may resume execution if what it
was suspended on has since returned a value. Suspension can also
happen when an async for block requests the next value from an
asynchronous iterator or when an async with block is entered or
exited, as these operations use await under the hood.
If, however, a blocking I/O-bound or CPU-bound operation was directly executed/called inside an async def function/endpoint, it would block the main thread (i.e., the event loop). Hence, a blocking operation such as time.sleep() in an async def endpoint would block the entire server (as in the example provided in your question). Thus, if your endpoint is not going to make any async calls, you could declare it with just def instead, which would be run in an external threadpool that would then be awaited, as explained earlier (more solutions are given in the following sections). Example:
@app.get("/ping")
def ping(request: Request):
    #print(request.client)
    print("Hello")
    time.sleep(5)
    print("bye")
    return "pong"
Otherwise, if the functions that you had to execute inside the endpoint are async functions that you had to await, you should define your endpoint with async def. To demonstrate this, the example below uses the asyncio.sleep() function (from the asyncio library), which provides a non-blocking sleep operation. The await asyncio.sleep() method will suspend the execution of the surrounding coroutine (until the sleep operation completes), thus allowing other tasks in the event loop to run. Similar examples are given here and here as well.
import asyncio
@app.get("/ping")
async def ping(request: Request):
    #print(request.client)
    print("Hello")
    await asyncio.sleep(5)
    print("bye")
    return "pong"
Both the path operation functions above will print out the specified messages to the screen in the same order as mentioned in your question—if two requests arrived at around the same time—that is:
Hello
Hello
bye
bye
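To see the two behaviours side by side without starting a server, here is a minimal sketch that uses plain asyncio in place of FastAPI. The handler names and the 0.2-second delay are made up for illustration; the two coroutines passed to asyncio.gather() stand in for two requests arriving at around the same time:

```python
import asyncio
import time


async def blocking_handler(name, log):
    # time.sleep() never yields to the event loop, so nothing else can run.
    log.append(f"Hello {name}")
    time.sleep(0.2)
    log.append(f"bye {name}")


async def cooperative_handler(name, log):
    # await asyncio.sleep() suspends this coroutine and lets the other one run.
    log.append(f"Hello {name}")
    await asyncio.sleep(0.2)
    log.append(f"bye {name}")


async def run_two(handler):
    # Run two "requests" on the same event loop and record the order of events.
    log = []
    await asyncio.gather(handler("A", log), handler("B", log))
    return log


blocking_log = asyncio.run(run_two(blocking_handler))
cooperative_log = asyncio.run(run_two(cooperative_handler))
print(blocking_log)      # ['Hello A', 'bye A', 'Hello B', 'bye B']
print(cooperative_log)   # ['Hello A', 'Hello B', 'bye A', 'bye B']
```

The first list is the serial order from the question; the second is the interleaved order you get once the handler awaits a non-blocking call.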
Important Note
When you call your endpoint for the second (third, and so on) time, please remember to do that from a tab that is isolated from the browser's main session; otherwise, succeeding requests (i.e., coming after the first one) will be blocked by the browser (on client side), as the browser will be waiting for the response from the server to the previous request before sending the next one. You can confirm that by using print(request.client) inside the endpoint, where you would see the hostname and port number being the same for all incoming requests (if the requests were initiated from tabs opened in the same browser window/session), and hence those requests would be processed sequentially, because the browser sends them sequentially in the first place. To solve this, you could either:
Reload the same tab (as is running), or
Open a new tab in an Incognito Window, or
Use a different browser/client to send the request, or
Use the httpx library to make asynchronous HTTP requests, along with the awaitable asyncio.gather(), which allows executing multiple asynchronous operations concurrently and then returns a list of results in the same order the awaitables (tasks) were passed to that function (have a look at this answer for more details).
Example:
import httpx
import asyncio
URLS = ['http://127.0.0.1:8000/ping'] * 2
async def send(url, client):
    return await client.get(url, timeout=10)
async def main():
    async with httpx.AsyncClient() as client:
        tasks = [send(url, client) for url in URLS]
        responses = await asyncio.gather(*tasks)
        print(*[r.json() for r in responses], sep='\n')
asyncio.run(main())
In case you had to call different endpoints that may take different time to process a request, and you would like to print the response out on client side as soon as it is returned from the server—instead of waiting for asyncio.gather() to gather the results of all tasks and print them out in the same order the tasks were passed to the send() function—you could replace the send() function of the example above with the one shown below:
async def send(url, client):
    res = await client.get(url, timeout=10)
    print(res.json())
    return res
Async/await and Blocking I/O-bound or CPU-bound Operations
If you are required to use async def (as you might need to await coroutines inside your endpoint), but also have some synchronous I/O-bound or CPU-bound operation (long-running computation task) that will block the event loop (essentially, the entire server) and won't let other requests go through, for example:
@app.post("/ping")
async def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = await file.read()
        res = cpu_bound_task(contents)  # this will block the event loop
    finally:
        await file.close()
    print("bye")
    return "pong"
then:
You should check whether you could change your endpoint's definition to normal def instead of async def. For example, if the only method in your endpoint that has to be awaited is the one reading the file contents (as you mentioned in the comments section below), you could instead declare the type of the endpoint's parameter as bytes (i.e., file: bytes = File()), and thus FastAPI would read the file for you and you would receive the contents as bytes. Hence, there would be no need to use await file.read(). Please note that the above approach should work for small files, as the entire file contents would be stored in memory (see the documentation on File Parameters); hence, if your system does not have enough RAM available to accommodate the accumulated data (if, for example, you have 8GB of RAM, you can't load a 50GB file), your application may end up crashing. Alternatively, you could call the .read() method of the SpooledTemporaryFile directly (which can be accessed through the .file attribute of the UploadFile object), so that again you don't have to await the .read() method. As you can now declare your endpoint with normal def, each request will run in a separate thread (example is given below). For more details on how to upload a File, as well as how Starlette/FastAPI uses SpooledTemporaryFile behind the scenes, please have a look at this answer and this answer.
@app.post("/ping")
def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = file.file.read()
        res = cpu_bound_task(contents)
    finally:
        file.file.close()
    print("bye")
    return "pong"
Use FastAPI's (Starlette's) run_in_threadpool() function from the concurrency module, as @tiangolo suggested here, which "will run the function in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked" (see here). As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports both sequence arguments and keyword arguments".
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, contents)
Alternatively, use asyncio's loop.run_in_executor()—after obtaining the running event loop using asyncio.get_running_loop()—to run the task, which, in this case, you can await for it to complete and return the result(s), before moving on to the next line of code. Passing None as the executor argument, the default executor will be used; that is ThreadPoolExecutor:
import asyncio
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, cpu_bound_task, contents)
or, if you would like to pass keyword arguments instead, you could use a lambda expression, or, preferably, functools.partial(), which is specifically recommended in the documentation for loop.run_in_executor():
import asyncio
from functools import partial
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, partial(cpu_bound_task, some_arg=contents))
You could also run your task in a custom ThreadPoolExecutor. For instance:
import asyncio
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ThreadPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
In Python 3.9+, you could also use asyncio.to_thread() to asynchronously run a synchronous function in a separate thread, which essentially uses await loop.run_in_executor(None, func_call) under the hood, as can be seen in the implementation of asyncio.to_thread(). The to_thread() function takes a blocking function to execute, as well as any arguments (*args and/or **kwargs) to pass to that function, and returns a coroutine that can be awaited. Example:
import asyncio
res = await asyncio.to_thread(cpu_bound_task, contents)
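As a rough way to check that the event loop really stays free while the blocking call runs in a worker thread, the sketch below (Python 3.9+, with a made-up blocking_task standing in for cpu_bound_task) counts how many times the loop gets to run while the thread is busy:

```python
import asyncio
import time


def blocking_task(seconds):
    # Hypothetical stand-in for a synchronous call; it blocks only its own thread.
    time.sleep(seconds)
    return "done"


async def main():
    ticks = 0
    # Start the blocking call in the default thread pool without waiting for it yet.
    worker = asyncio.create_task(asyncio.to_thread(blocking_task, 0.2))
    while not worker.done():
        ticks += 1                   # the loop is free to run this in the meantime
        await asyncio.sleep(0.01)
    return worker.result(), ticks


result, ticks = asyncio.run(main())
print(result, ticks)
```

If blocking_task were called directly inside main(), the while loop would not get a single turn until the call had returned.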
ThreadPoolExecutor will successfully prevent the event loop from being blocked, but won't give you the performance improvement you would expect from running code in parallel; especially, when one needs to perform CPU-bound operations, such as the ones described here (e.g., audio or image processing, machine learning, and so on). It is thus preferable to run CPU-bound tasks in a separate process—using ProcessPoolExecutor, as shown below—which, again, you can integrate with asyncio, in order to await it to finish its work and return the result(s). As described here, on Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc. Basically, your code must be under if __name__ == '__main__':.
import asyncio
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
Use more workers. For example, uvicorn main:app --workers 4 (if you are using Gunicorn as a process manager with Uvicorn workers, please have a look at this answer). Note: Each worker "has its own things, variables and memory". This means that global variables/objects, etc., won't be shared across the processes/workers. In this case, you should consider using a database storage, or Key-Value stores (Caches), as described here and here. Additionally, note that "if you are consuming a large amount of memory in your code, each process will consume an equivalent amount of memory".
If you need to perform heavy background computation and you don't necessarily need it to be run by the same process (for example, you don't need to share memory, variables, etc), you might benefit from using other bigger tools like Celery, as described in FastAPI's documentation.

Q :" ... What's the problem? "
A :The FastAPI documentation is explicit in saying that the framework uses in-process tasks ( as inherited from Starlette ).
That, by itself, means that all such tasks compete ( from time to time ) for the Python Interpreter GIL-lock, the Global Interpreter Lock, a mutex that in effect re-[SERIAL]-ises any number of in-process Python threads, so that one and only one works while all the others stay waiting.
On a fine-grained scale you can see the result: the handler for the second HTTP request ( manually initiated from a second Firefox tab ) does not start until the sleep in the first one has finished. The interpreter hands the GIL from thread to thread in round-robin time slices ( the interval is reported by sys.getswitchinterval(), 5 ms by default in CPython 3 ), and its internal work shows no more detail than that. Depending on your O/S type and version, you can get a finer in-thread level of detail from here, for example by logging like this inside the async-decorated code being performed ( with the async def handler shown, both requests report the same thread ident, since they run on the event loop's single thread ):
import time
import threading
from fastapi import FastAPI, Request
TEMPLATE = "INF[{0:_>20d}]: t_id( {1: >20d} ):: {2:}"
print( TEMPLATE.format( time.perf_counter_ns(),
                        threading.get_ident(),
                        "Python Interpreter __main__ was started ..."
                        )
       )
...
@app.get("/ping")
async def ping( request: Request ):
    """                                 __doc__
    [DOC-ME]
    ping( Request ):  a mock-up AS-IS function to yield
                      a CLI/GUI self-evidence of the order-of-execution
    RETURNS:          a JSON-alike decorated dict
    [TEST-ME]         ...
    """
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "Hello..."
                            )
           )
    #------------------------------------------------- actual blocking work
    time.sleep( 5 )
    #------------------------------------------------- actual blocking work
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "...bye"
                            )
           )
    return { "ping": "pong!" }
Last, but not least, do not hesitate to read more about all other sharks threads-based code may suffer from ... or even cause ... behind the curtains ...
Ad Memorandum
A mixture of GIL-lock, thread-based pools, asynchronous decorators, blocking and event-handling -- a sure mix to uncertainties & HWY2HELL ;o)

the part of asyncio that checks if the response is ready [duplicate]

This question is motivated by another question of mine: How to await in cdef?
There are tons of articles and blog posts on the web about asyncio, but they are all very superficial. I couldn't find any information about how asyncio is actually implemented, and what makes I/O asynchronous. I was trying to read the source code, but it's thousands of lines of not the highest grade C code, a lot of which deals with auxiliary objects, but most crucially, it is hard to connect between Python syntax and what C code it would translate into.
Asyncio's own documentation is even less helpful. There's no information there about how it works, only some guidelines about how to use it, which are also sometimes misleading / very poorly written.
I'm familiar with Go's implementation of coroutines, and was kind of hoping that Python did the same thing. If that was the case, the code I came up in the post linked above would have worked. Since it didn't, I'm now trying to figure out why. My best guess so far is as follows, please correct me where I'm wrong:
Procedure definitions of the form async def foo(): ... are actually interpreted as methods of a class inheriting coroutine.
Perhaps, async def is actually split into multiple methods by await statements, where the object, on which these methods are called is able to keep track of the progress it made through the execution so far.
If the above is true, then, essentially, execution of a coroutine boils down to calling methods of coroutine object by some global manager (loop?).
The global manager is somehow (how?) aware of when I/O operations are performed by Python (only?) code and is able to choose one of the pending coroutine methods to execute after the current executing method relinquished control (hit on the await statement).
In other words, here's my attempt at "desugaring" of some asyncio syntax into something more understandable:
async def coro(name):
    print('before', name)
    await asyncio.sleep()
    print('after', name)
asyncio.gather(coro('first'), coro('second'))
# translated from async def coro(name)
class Coro(coroutine):
    def before(self, name):
        print('before', name)
    def after(self, name):
        print('after', name)
    def __init__(self, name):
        self.name = name
        self.parts = self.before, self.after
        self.pos = 0
    def __call__(self):
        self.parts[self.pos](self.name)
        self.pos += 1
    def done(self):
        return self.pos == len(self.parts)
# translated from asyncio.gather()
class AsyncIOManager:
    def gather(*coros):
        while not all(c.done() for c in coros):
            coro = random.choice(coros)
            coro()
Should my guess prove correct: then I have a problem. How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter? What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?

How does asyncio work?
Before answering this question we need to understand a few base terms, skip these if you already know any of them.
Generators
Generators are objects that allow us to suspend the execution of a Python function. User-defined generators are implemented using the keyword yield. By creating a normal function containing the yield keyword, we turn that function into a generator:
>>> def test():
...     yield 1
...     yield 2
...
>>> gen = test()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
As you can see, calling next() on the generator causes the interpreter to load the test's frame, and return the yielded value. Calling next() again, causes the frame to load again into the interpreter stack, and continues on yielding another value.
By the third time next() is called, our generator was finished, and StopIteration was thrown.
Communicating with a generator
A less-known feature of generators is the fact that you can communicate with them using two methods: send() and throw().
>>> def test():
...     val = yield 1
...     print(val)
...     yield 2
...     yield 3
...
>>> gen = test()
>>> next(gen)
1
>>> gen.send("abc")
abc
2
>>> gen.throw(Exception())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
Exception
Upon calling gen.send(), the value is passed as a return value from the yield keyword.
gen.throw() on the other hand, allows throwing Exceptions inside generators, with the exception raised at the same spot yield was called.
Returning values from generators
Returning a value from a generator, results in the value being put inside the StopIteration exception. We can later on recover the value from the exception and use it to our needs.
>>> def test():
...     yield 1
...     return "abc"
...
>>> gen = test()
>>> next(gen)
1
>>> try:
...     next(gen)
... except StopIteration as exc:
...     print(exc.value)
...
abc
Behold, a new keyword: yield from
Python 3.3 came with the addition of a new keyword: yield from (PEP 380). What that keyword allows us to do is pass on any next(), send() and throw() into an inner-most nested generator. If the inner generator returns a value, it is also the return value of yield from:
>>> def inner():
...     inner_result = yield 2
...     print('inner', inner_result)
...     return 3
...
>>> def outer():
...     yield 1
...     val = yield from inner()
...     print('outer', val)
...     yield 4
...
>>> gen = outer()
>>> next(gen)
1
>>> next(gen) # Goes inside inner() automatically
2
>>> gen.send("abc")
inner abc
outer 3
4
I've written an article to further elaborate on this topic.
Putting it all together
Upon introducing the new keyword yield from in Python 3.3, we were now able to create generators inside generators that, just like a tunnel, pass the data back and forth from the inner-most to the outer-most generators. This has spawned a new meaning for generators - coroutines.
Coroutines are functions that can be stopped and resumed while being run. In Python, they are defined using the async def keyword. Much like generators, they too use their own form of yield from which is await. Before async and await were introduced in Python 3.5, we created coroutines in the exact same way generators were created (with yield from instead of await).
async def inner():
    return 1
async def outer():
    await inner()
Just like all iterators and generators implement the __iter__() method, all coroutines implement __await__() which allows them to continue on every time await coro is called.
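To make the __await__() mechanics concrete, here is a small sketch with a made-up awaitable class called Suspend. No event loop is involved; the coroutine is driven by hand with send(), the same way a generator would be:

```python
class Suspend:
    """A minimal awaitable: suspends once, then resumes with whatever is sent in."""

    def __await__(self):
        received = yield "suspended"   # hand control back to whoever drives us
        return received


async def inner():
    value = await Suspend()
    return value * 2


async def outer():
    return await inner() + 1


coro = outer()
first = coro.send(None)     # runs outer -> inner -> Suspend until the bare yield
print(first)                # suspended
try:
    coro.send(20)           # resume; 20 becomes the result of `await Suspend()`
except StopIteration as exc:
    result = exc.value      # the return value of outer()
print(result)               # 41
```

The value yielded inside __await__() travels through both await expressions to the caller of send(), and the value sent in travels the same way back down.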
There's a nice sequence diagram inside the Python docs that you should check out.
In asyncio, apart from coroutine functions, we have 2 important objects: tasks and futures.
Futures
Futures are objects that have the __await__() method implemented, and their job is to hold a certain state and result. The state can be one of the following:
PENDING - future does not have any result or exception set.
CANCELLED - future was cancelled using fut.cancel()
FINISHED - future was finished, either by a result set using fut.set_result() or by an exception set using fut.set_exception()
The result, just like you have guessed, can either be a Python object, that will be returned, or an exception which may be raised.
Another important feature of future objects is that they contain a method called add_done_callback(). This method allows functions to be called as soon as the future is done - whether it raised an exception or finished.
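A short sketch of these pieces using asyncio's own future objects. The 0.01-second delay and the "payload" value are arbitrary choices for the demonstration:

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    events = [fut.done()]                      # False: the future is still pending

    # Called by the loop once the future is done.
    fut.add_done_callback(lambda f: events.append(("callback", f.result())))
    # Something else (here: a timer) sets the result later.
    loop.call_later(0.01, fut.set_result, "payload")

    value = await fut                          # suspends until set_result() runs
    await asyncio.sleep(0)                     # give the loop one more turn
    events.append(fut.done())
    return value, events


value, events = asyncio.run(main())
print(value, events)
```

Awaiting the future returns the object passed to set_result(); had set_exception() been used instead, the await would raise that exception.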
Tasks
Task objects are special futures, which wrap around coroutines, and communicate with the inner-most and outer-most coroutines. Every time a coroutine awaits a future, the future is passed all the way back to the task (just like in yield from), and the task receives it.
Next, the task binds itself to the future. It does so by calling add_done_callback() on the future. From now on, if the future will ever be done, by either being cancelled, passed an exception or passed a Python object as a result, the task's callback will be called, and it will rise back up to existence.
Asyncio
The final burning question we must answer is - how is the IO implemented?
Deep inside asyncio, we have an event loop. An event loop of tasks. The event loop's job is to call tasks every time they are ready and coordinate all that effort into one single working machine.
The IO part of the event loop is built upon a single crucial function called select. Select is a blocking function, implemented by the operating system underneath, that allows waiting on sockets for incoming or outgoing data. Upon receiving data it wakes up, and returns the sockets which received data, or the sockets which are ready for writing.
When you try to receive or send data over a socket through asyncio, what actually happens below is that the socket is first checked if it has any data that can be immediately read or sent. If its .send() buffer is full, or the .recv() buffer is empty, the socket is registered to the select function (by simply adding it to one of the lists, rlist for recv and wlist for send) and the appropriate function awaits a newly created future object, tied to that socket.
When all available tasks are waiting for futures, the event loop calls select and waits. When one of the sockets has incoming data, or its send buffer has drained, asyncio checks for the future object tied to that socket, and sets it to done.
Now all the magic happens. The future is set to done, the task that added itself before with add_done_callback() rises up back to life, and calls .send() on the coroutine which resumes the inner-most coroutine (because of the await chain) and you read the newly received data from a nearby buffer it was spilled onto.
Method chain again, in case of recv():
select.select waits.
A ready socket, with data is returned.
Data from the socket is moved into a buffer.
future.set_result() is called.
Task that added itself with add_done_callback() is now woken up.
Task calls .send() on the coroutine which goes all the way into the inner-most coroutine and wakes it up.
Data is being read from the buffer and returned to our humble user.
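The chain above can be sketched as a toy event loop. This is not asyncio's implementation: Future, Task, recv() and run() below are simplified stand-ins written for this example, it uses the standard selectors module as the select wrapper, and it always registers the socket instead of first trying to read from it.

```python
import selectors
import socket
from collections import deque


class Future:
    """Toy future: holds a result and the callbacks to run once it is set."""

    def __init__(self):
        self.result = None
        self.callbacks = []

    def add_done_callback(self, fn):
        self.callbacks.append(fn)

    def set_result(self, result):
        self.result = result
        for fn in self.callbacks:
            fn(self)

    def __await__(self):
        yield self            # travels up the await chain to the Task
        return self.result


class Task:
    """Toy task: drives one coroutine and re-schedules itself via futures."""

    def __init__(self, coro, ready):
        self.coro, self.ready, self.result = coro, ready, None
        ready.append(self)

    def step(self):
        try:
            fut = self.coro.send(None)          # run until the next awaited Future
        except StopIteration as exc:
            self.result = exc.value             # the coroutine has finished
        else:
            fut.add_done_callback(lambda _: self.ready.append(self))


selector = selectors.DefaultSelector()


async def recv(sock, nbytes):
    fut = Future()
    selector.register(sock, selectors.EVENT_READ, fut)
    await fut                 # suspended until the loop sees the socket readable
    selector.unregister(sock)
    return sock.recv(nbytes)


def run(*coros):
    ready = deque()
    tasks = [Task(c, ready) for c in coros]
    while ready or selector.get_map():
        while ready:
            ready.popleft().step()
        if selector.get_map():
            for key, _ in selector.select():    # blocks until a socket is ready
                key.data.set_result(None)       # wakes the task waiting on it
    return [t.result for t in tasks]


async def reader(sock):
    data = await recv(sock, 1024)
    return data.decode()


async def writer(sock):
    sock.sendall(b"ping")
    return "sent"


left, right = socket.socketpair()
results = run(reader(left), writer(right))
left.close()
right.close()
print(results)   # ['ping', 'sent']
```

The reader task suspends first, the writer runs while it waits, and the select call is what brings the reader back once its socket has data.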
In summary, asyncio uses generator capabilities, that allow pausing and resuming functions. It uses yield from capabilities that allow passing data back and forth from the inner-most generator to the outer-most. It uses all of those in order to halt function execution while it's waiting for IO to complete (by using the OS select function).
And the best of all? While one function is paused, another may run and interleave with the delicate fabric, which is asyncio.

Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Consequently, there is no single ultimate answer.
The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are...) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.
1. Coroutines versus subroutines in a nutshell
Just like subroutines (functions, procedures, ...), coroutines (generators, ...) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.
The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await or yield from take the difference from individual calls to entire stacks.
1.1. Subroutines
A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:
def subfoo(bar):
    qux = 3
    return qux * bar
When you run it, that means
1. allocate stack space for bar and qux
2. recursively execute the first statement and jump to the next statement
3. once at a return, push its value to the calling stack
4. clear the stack (1.) and instruction pointer (2.)
Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.
root -\
: \- subfoo --\
:/--<---return --/
|
V
1.2. Coroutines as persistent subroutines
A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:
def cofoo(bar):
    qux = yield bar  # yield marks a break point
    return qux
When you run it, that means
1. allocate stack space for bar and qux
2. recursively execute the first statement and jump to the next statement
   2.1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
   2.2. once calling into yield, restore stack and instruction pointer and push arguments to qux
3. once at a return, push its value to the calling stack
4. clear the stack (1.) and instruction pointer (2.)
Note the addition of 2.1 and 2.2 - a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.
root -\
: \- cofoo --\
:/--<+--yield --/
| :
V :
This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
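To make steps 2.1 and 2.2 concrete, here is a minimal sketch that drives cofoo by hand. The string arguments are placeholders chosen for illustration; next() runs up to the yield, and send() resumes while supplying the value for qux.

```python
def cofoo(bar):
    qux = yield bar  # suspend here, handing ``bar`` to whoever resumed us
    return qux       # whatever was passed to ``send`` ends up in ``qux``


coroutine = cofoo("first")    # creates the coroutine object; no code runs yet
yielded = next(coroutine)     # 2.1: run up to the ``yield`` and suspend
try:
    coroutine.send("second")  # 2.2: resume; ``qux`` becomes "second"
except StopIteration as stop:
    returned = stop.value     # the ``return`` value travels on StopIteration

print(yielded, returned)
```

Between next() and send() the coroutine object just sits there with its locals and instruction pointer intact, which is what allows it to be stored or handed to another caller.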
1.3. Traversing the call stack
So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:
def wrap():
    yield 'before'
    yield from cofoo('during')  # cofoo requires its ``bar`` argument
    yield 'after'
When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.
However, yield from does both. It suspends stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or something is sent, cofoo is directly connected to the calling stack.
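The following sketch shows that connection in action. It is a slightly modified wrap that also captures the return value of cofoo; the argument 'during' and the sent value are placeholders for illustration.

```python
def cofoo(bar):
    qux = yield bar
    return qux


def wrap():
    yield 'before'
    result = yield from cofoo('during')  # wrap is suspended while cofoo runs
    yield 'after ' + str(result)


w = wrap()
steps = [next(w)]             # 'before', yielded by wrap itself
steps.append(next(w))         # 'during', yielded by cofoo *through* wrap
steps.append(w.send('sent'))  # the value goes straight to cofoo, which returns it

print(steps)
```

The caller only ever talks to w, yet the second value and the send() are handled by cofoo. wrap resumes only once cofoo has returned.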
1.4. Coroutines all the way down
As established, yield from allows to connect two scopes across another intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.
root -\
: \-> coro_a -yield-from-> coro_b --\
:/ <-+------------------------yield ---/
| :
:\ --+-- coro_a.send----------yield ---\
: coro_b <-/
Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines still built on a 1:1 relation like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.
Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!
1.5. Python's async and await
The explanation has so far explicitly used the yield and yield from vocabulary of generators - the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.
def foo():  # subroutine?
    return None

def foo():  # coroutine?
    yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
    await foofoo()  # coroutine!
    return None
The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.
2. Anatomy of a simple event loop
By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.
This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants to resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how to wait before resuming.
Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.
loop -\
: \-> coroutine --await--> event --\
:/ <-+----------------------- yield --/
| :
| : # loop waits for event to happen
| :
:\ --+-- send(reply) -------- yield --\
: coroutine <--yield-- event <-/
The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.
2.1.1. Events in time
The simplest event to handle is reaching a point in time. This is a fundamental block of threaded code as well: a thread repeatedly sleeps until a condition is true.
However, a regular sleep blocks execution by itself - we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.
2.1.2. Defining an Event
An event is simply a value we can identify - be it via an enum, a type or other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow to await a class directly.
class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self

    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)
This class only stores the event - it does not say how to actually handle it.
The only special feature is __await__ - it is what the await keyword looks for. Practically, it is an iterator but not available for the regular iteration machinery.
2.2.1. Awaiting an event
Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:
import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)
We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.
coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)
This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.
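As a small variation, the driver can catch the StopIteration instead of letting it propagate. This sketch repeats the AsyncSleep and asleep definitions (without the repr) so that it runs on its own, and drops the time.sleep to show that awaiting the event causes no delay by itself.

```python
import time


class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until):
        self.until = until

    def __await__(self):
        yield self  # hand the event to whoever drives the coroutine


async def asleep(duration):
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)


coroutine = asleep(100)
events = []
while True:
    try:
        events.append(coroutine.send(None))  # run until the next yield
    except StopIteration:
        break  # the coroutine ran to completion

print(len(events), "events collected without sleeping")
```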
2.2.2. Event + Sleep
At this point, we have two separate mechanisms at our disposal:
AsyncSleep Events that can be yielded from inside a coroutine
time.sleep that can wait without impacting coroutines
Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.
2.3. A naive event loop
If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.
This makes for a straightforward scheduling:
1. sort coroutines by their desired wake up time
2. pick the first that wants to wake up
3. wait until this point in time
4. run this coroutine
5. repeat from 1.
A trivial implementation does not need any advanced concepts. A list allows to sort coroutines by date. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.
def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
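As a rough sketch of two of these improvements, the variant below keeps the wait queue in a heap and collects return values from StopIteration. The results list, the tie-breaking counter and the work helper are assumptions made for this illustration, not part of the loop above.

```python
import heapq
import itertools
import time


class AsyncSleep:
    def __init__(self, until):
        self.until = until

    def __await__(self):
        yield self


async def asleep(duration):
    await AsyncSleep(time.time() + duration)


def run(*coroutines):
    """Run all coroutines; return their results in argument order"""
    counter = itertools.count()  # tie-breaker: the heap never compares coroutines
    waiting = [(0.0, next(counter), index, coroutine)
               for index, coroutine in enumerate(coroutines)]
    heapq.heapify(waiting)
    results = [None] * len(coroutines)
    while waiting:
        # the heap root is always the coroutine that wants to wake up first
        until, _, index, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration as stop:
            results[index] = stop.value  # the coroutine's return value
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(waiting, (command.until, next(counter), index, coroutine))
    return results


async def work(name, delay):
    await asleep(delay)
    return name


results = run(work("a", 0.02), work("b", 0.01))
print(results)
```

Pushing and popping on the heap costs O(log n) per operation, compared with re-sorting the whole list after every suspension.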
2.4. Cooperative Waiting
The AsyncSleep event and run event loop are a fully working implementation of timed events.
async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))
This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.
3. I/O event loop
An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit "ready for I/O" event.
3.1. The select call
Python already has an interface to query the OS for read I/O handles. When called with handles to read or write, it returns the handles ready to read or write:
readable, writable, _ = select.select(rlist, wlist, xlist, timeout)
For example, we can open a file for writing and wait for it to be ready:
write_target = open('/tmp/foo', 'wb')  # 'wb' so the file is actually opened for writing
readable, writable, _ = select.select([], [write_target], [])
Once select returns, writable contains our open file.
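For a case where readiness actually changes over time, here is a small sketch using a connected socket pair instead of a file. The names left and right are arbitrary. Note that on Windows select only accepts sockets, not regular files.

```python
import select
import socket

left, right = socket.socketpair()

# nothing has been sent yet: with a zero timeout, ``left`` is not readable
readable_before, _, _ = select.select([left], [], [], 0)

right.sendall(b"ping")

# now the OS reports ``left`` as ready, so ``recv`` will not block
readable_after, _, _ = select.select([left], [], [], 1.0)
data = left.recv(4)

left.close()
right.close()
print(readable_before, data)
```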
3.2. Basic I/O event
Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object - say an open file. In addition, we store how much data to read.
class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = b'' if 'b' in file.mode else ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )
As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times - until our desired amount has been read. In addition, we return the I/O result instead of just resuming.
3.3. Augmenting an event loop with read I/O
The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule, we only map read requests to coroutines.
# new
waiting_read = {} # type: Dict[file, coroutine]
Since select.select takes a timeout parameter, we can use it in place of time.sleep.
# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
This gives us all readable files - if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.
# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    waiting.sort(key=lambda item: item[0])  # sort by time only; coroutines are not comparable
    coroutine = waiting_read[readable[0]]
Finally, we have to actually listen for read requests.
# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...
3.4. Putting it together
The above was a bit of a simplification. We need to do some switching to not starve sleeping coroutines if we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.
def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                # sort by time only; coroutines themselves are not comparable
                waiting.sort(key=lambda item: item[0])
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine
3.5. Cooperative I/O
The AsyncSleep, AsyncRead and run implementations are now fully functional to sleep and/or read.
Same as for sleepy, we can define a helper to test reading:
async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))
Running this, we can see that our I/O is interleaved with the waiting task:
id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B
4. Non-Blocking I/O
While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop - which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on files.
However, sockets do allow for non-blocking I/O - and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.
4.1. Non-Blocking I/O event
Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket - which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.
class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )
In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.
4.2. Un-Blocking the event loop
As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files - a file descriptor marked ready by select.
# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine
At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.
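One possible shape for such a merged event is sketched below. The class name AsyncReadable and its read_once callable are hypothetical and not part of the code above. The event stores the handle that select should watch and delegates the actual read to the callable.

```python
import select
import socket


class AsyncReadable:
    """Hypothetical merged event: ``handle`` is what select() watches,
    ``read_once`` returns the next chunk or raises BlockingIOError"""
    def __init__(self, handle, read_once, amount=1):
        self.handle = handle
        self.read_once = read_once
        self.amount = amount
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                chunk = self.read_once()
            except BlockingIOError:
                yield self  # nothing available: suspend until the loop resumes us
                continue
            if not chunk:   # end of stream: stop instead of looping forever
                break
            self._buffer += chunk
        return self._buffer


left, right = socket.socketpair()
left.setblocking(False)


async def receive():
    return await AsyncReadable(left, lambda: left.recv(1024), amount=4)


coroutine = receive()
event = coroutine.send(None)  # no data yet, so the event is yielded to us
right.sendall(b"ping")
select.select([event.handle], [], [], 1.0)  # what an event loop would wait on
try:
    coroutine.send(None)
except StopIteration as stop:
    received = stop.value

left.close()
right.close()
print(received)
```

With this shape, the event loop would need a single isinstance branch that registers command.handle, regardless of whether the data comes from a file or a socket.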
4.3. The ugly side of non-blocking I/O
In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much more ugly now - you have to handle early returns when functions block inside the kernel, but yield control to you. For example, opening a connection versus opening a file is much longer:
# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass
Long story short, what remains is a few dozen lines of Exception handling. The events and event loop already work at this point.
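To give an impression of that handling, here is a minimal sketch of finishing a non-blocking connect. It assumes a throwaway listening socket on localhost as the peer and uses connect_ex, which reports the "in progress" condition as a return code instead of raising. Waiting for writability and then checking SO_ERROR is the usual way to learn whether the connection succeeded.

```python
import select
import socket

# throwaway local server so the sketch is self-contained (an assumption)
server = socket.socket()
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

connection = socket.socket()
connection.setblocking(False)
# returns 0 if connected at once, otherwise an errno such as EINPROGRESS
status = connection.connect_ex((host, port))

# the socket becomes writable once the connection attempt has finished
_, writable, _ = select.select([], [connection], [], 1.0)
# SO_ERROR is 0 on success, otherwise the errno of the failed attempt
error = connection.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
connected = bool(writable) and error == 0

connection.close()
server.close()
print("connected:", connected)
```

In an event loop, the select call would be replaced by yielding a "ready for write" event, analogous to AsyncRecv for reading.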
id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5
Addendum
Example code at github

What is asyncio?
Asyncio stands for asynchronous input output and refers to a programming paradigm which achieves high concurrency using a single thread or event loop.
Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread.
Let's understand asyncio with an example:
To understand the concept behind asyncio, let’s consider a restaurant with a single waiter. Suddenly, three customers, A, B and C show up. The three of them take a varying amount of time to decide what to eat once they receive the menu from the waiter.
Let’s assume A takes 5 minutes, B 10 minutes and C 1 minute to decide. If the single waiter starts with B first and takes B's order in 10 minutes, next he serves A and spends 5 minutes on noting down his order and finally spends 1 minute to know what C wants to eat.
So, in total, the waiter spends 10 + 5 + 1 = 16 minutes to take down their orders. However, notice in this sequence of events, C ends up waiting 15 minutes before the waiter gets to him, A waits 10 minutes and B waits 0 minutes.
Now consider if the waiter knew the time each customer would take to decide. He can start with C first, then go to A and finally to B. This way each customer would experience a 0 minute wait.
An illusion of three waiters, one dedicated to each customer is created even though there’s only one.
Lastly, the total time it takes for the waiter to take all three orders is 10 minutes, much less than the 16 minutes in the other scenario.
Let's go through another example:
Suppose, Chess master Magnus Carlsen hosts a chess exhibition in which he plays with multiple amateur players. He has two ways of conducting the exhibition: synchronously and asynchronously.
Assumptions:
24 opponents
Magnus Carlsen makes each chess move in 5 seconds
Opponents each take 55 seconds to make a move
Games average 30 pair-moves (60 moves total)
Synchronously: Magnus Carlsen plays one game at a time, never two at the same time, until the game is complete. Each game takes (55 + 5) * 30 == 1800 seconds, or 30 minutes. The entire exhibition takes 24 * 30 == 720 minutes, or 12 hours.
Asynchronously: Magnus Carlsen moves from table to table, making one move at each table. He leaves the table and lets the opponent make their next move during the wait time. One move on all 24 games takes Magnus 24 * 5 == 120 seconds, or 2 minutes. The entire exhibition is now cut down to 120 * 30 == 3600 seconds, or just 1 hour.
There is only one Magnus Carlsen, who has only two hands and makes only one move at a time by himself. But playing asynchronously cuts the exhibition time down from 12 hours to one.
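The arithmetic can be checked with a few lines of Python. The variable names are made up for this sketch; the numbers are the assumptions listed above.

```python
opponents = 24
master_move_s = 5      # seconds per move by the master
opponent_move_s = 55   # seconds per move by an opponent
pair_moves = 30        # pair-moves per game

# synchronous: one complete game after another
sync_game_s = (master_move_s + opponent_move_s) * pair_moves
sync_total_h = opponents * sync_game_s / 3600

# asynchronous: one move at every table per round
async_round_s = opponents * master_move_s
async_total_h = async_round_s * pair_moves / 3600

print(sync_total_h, "hours vs", async_total_h, "hour")
```

The asynchronous estimate works because a round of 120 seconds leaves each opponent more than the 55 seconds they need before the master returns to their table.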
Coding Example:
Let's try to demonstrate synchronous and asynchronous execution time using a code snippet.
Asynchronous - async_count.py
import asyncio
import time

async def count():
    print("One", end=" ")
    await asyncio.sleep(1)
    print("Two", end=" ")
    await asyncio.sleep(2)
    print("Three", end=" ")

async def main():
    await asyncio.gather(count(), count(), count(), count(), count())

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(f"\nExecuting - {__file__}\nExecution Starts: {start_time}\nExecutions Ends: {end_time}\nTotals Execution Time:{execution_time:0.2f} seconds.")
Asynchronous - Output:
One One One One One Two Two Two Two Two Three Three Three Three Three
Executing - async_count.py
Execution Starts: 18453.442160108
Executions Ends: 18456.444719712
Totals Execution Time:3.00 seconds.
Synchronous - sync_count.py
import time

def count():
    print("One", end=" ")
    time.sleep(1)
    print("Two", end=" ")
    time.sleep(2)
    print("Three", end=" ")

def main():
    for _ in range(5):
        count()

if __name__ == "__main__":
    start_time = time.perf_counter()
    main()
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(f"\nExecuting - {__file__}\nExecution Starts: {start_time}\nExecutions Ends: {end_time}\nTotals Execution Time:{execution_time:0.2f} seconds.")
Synchronous - Output:
One Two Three One Two Three One Two Three One Two Three One Two Three
Executing - sync_count.py
Execution Starts: 18875.175965998
Executions Ends: 18890.189930292
Totals Execution Time:15.01 seconds.
Why use asyncio instead of multithreading in Python?
It’s very difficult to write code that is thread safe. With asynchronous code, you know exactly where the code will shift from one task to the next and race conditions are much harder to come by.
Threads consume a fair amount of memory since each thread needs to have its own stack. With async code, all the code shares the same stack and the stack is kept small due to continuously unwinding the stack between tasks.
Threads are OS structures and therefore require more memory for the platform to support. There is no such problem with asynchronous tasks.
How does asyncio work?
Before going deep let's recall Python Generator
Python Generator:
Functions containing a yield statement are compiled as generators. Using a yield expression in a function’s body causes that function to be a generator. These functions return an object which supports the iteration protocol methods. The generator object created automatically receives a __next__() method. We can invoke __next__ directly on the generator object instead of using next():
def asynchronous():
    yield "Educative"

if __name__ == "__main__":
    gen = asynchronous()
    value = gen.__next__()  # equivalent to next(gen); avoids shadowing the built-in str
    print(value)
Remember the following about generators:
Generator functions allow you to procrastinate computing expensive values. You only compute the next value when required. This makes generators memory and compute efficient; they refrain from saving long sequences in memory or doing all expensive computations upfront.
Generators, when suspended, retain the code location, which is the last yield statement executed, and their entire local scope. This allows them to resume execution from where they left off.
Generator objects are nothing more than iterators.
Remember to make a distinction between a generator function and the associated generator object which are often used interchangeably. A generator function when invoked returns a generator object and next() is invoked on the generator object to run the code within the generator function.
States of a generator:
A generator goes through the following states:
GEN_CREATED when a generator object has been returned for the first time from a generator function and iteration hasn’t started.
GEN_RUNNING when next has been invoked on the generator object and is being executed by the python interpreter.
GEN_SUSPENDED when a generator is suspended at a yield
GEN_CLOSED when a generator has completed execution or has been closed.
Methods on generator objects:
A generator object exposes different methods that can be invoked to manipulate the generator. These are:
throw()
send()
close()
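The states and methods above can be observed with inspect.getgeneratorstate. The echo generator below is a made-up example for this sketch.

```python
import inspect


def echo():
    try:
        while True:
            received = yield           # wait for a value from send()
            print("got", received)
    except ValueError:
        yield "recovered"              # reply to throw()


gen = echo()
states = [inspect.getgeneratorstate(gen)]      # created, not started
next(gen)                                      # advance to the first yield
states.append(inspect.getgeneratorstate(gen))  # suspended at a yield
gen.send("hello")                              # resumes and prints "got hello"
reply = gen.throw(ValueError("boom"))          # raised inside at the paused yield
gen.close()                                    # raises GeneratorExit inside
states.append(inspect.getgeneratorstate(gen))  # closed

print(states, reply)
```

GEN_RUNNING is only visible from inside the generator while it executes, which is why it does not show up in this list.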
Let's dive into a more detailed explanation.
The rules of asyncio:
The syntax async def introduces either a native coroutine or an asynchronous generator. The expressions async with and async for are also valid.
The keyword await passes function control back to the event loop. (It suspends the execution of the surrounding coroutine.) If Python encounters an await f() expression in the scope of g(), this is how await tells the event loop, "Suspend execution of g() until whatever I’m waiting on—the result of f()—is returned. In the meantime, go let something else run."
In code, that second bullet point looks roughly like this:
async def g():
    # Pause here and come back to g() when f() is ready
    r = await f()
    return r
There's also a strict set of rules around when and how you can and cannot use async/await. These can be handy whether you are still picking up the syntax or already have exposure to using async/await:
A function that you introduce with async def is a coroutine. It may use await, return, or yield, but all of these are optional. Declaring async def noop(): pass is valid:
Using await and/or return creates a coroutine function. To call a coroutine function, you must await it to get its results.
It is less common to use yield in an async def block. This creates an asynchronous generator, which you iterate over with async for. Forget about async generators for the time being and focus on getting down the syntax for coroutine functions, which use await and/or return.
Anything defined with async def may not use yield from, which will raise a SyntaxError.
Just like it’s a SyntaxError to use yield outside of a def function, it is a SyntaxError to use await outside of an async def coroutine. You can only use await in the body of coroutines.
Here are some terse examples meant to summarize the above few rules:
async def f(x):
    y = await z(x)  # OK - `await` and `return` allowed in coroutines
    return y

async def g(x):
    yield x  # OK - this is an async generator

async def m(x):
    yield from gen(x)  # NO - SyntaxError

def m(x):
    y = await z(x)  # NO - SyntaxError (no `async def` here)
    return y
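For completeness, here is a minimal sketch of the async generator case, consumed with async for inside a comprehension. The countdown function is a made-up example.

```python
import asyncio


async def countdown(start):
    """Async generator: uses both ``await`` and ``yield``"""
    for number in range(start, 0, -1):
        await asyncio.sleep(0)  # give other tasks a chance to run
        yield number


async def main():
    # ``async for`` is required to iterate over an async generator
    return [number async for number in countdown(3)]


collected = asyncio.run(main())
print(collected)
```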
Generator Based Coroutine
Python created a distinction between Python generators and generators that were meant to be used as coroutines. These coroutines are called generator-based coroutines and require the decorator @asyncio.coroutine to be added to the function definition, though this isn’t strictly enforced. Note that @asyncio.coroutine was deprecated in Python 3.8 and removed in Python 3.11, so this style only runs on older interpreters.
Generator based coroutines use yield from syntax instead of yield. A coroutine can:
yield from another coroutine
yield from a future
return an expression
raise exception
Coroutines in Python make cooperative multitasking possible.
Cooperative multitasking is the approach in which the running process voluntarily gives up the CPU to other processes. A process may do so when it is logically blocked, say while waiting for user input or when it has initiated a network request and will be idle for a while.
A coroutine can be defined as a special function that can give up control to its caller without losing its state.
So what’s the difference between coroutines and generators?
Generators are essentially iterators though they look like functions. The distinction between generators and coroutines, in general, is that:
Generators yield back a value to the invoker whereas a coroutine yields control to another coroutine and can resume execution from the point it gives up control.
A generator can’t accept arguments once started whereas a coroutine can.
Generators are primarily used to simplify writing iterators. They are a type of coroutine and are sometimes also called semicoroutines.
Generator Based Coroutine Example
The simplest generator based coroutine we can write is as follows:
@asyncio.coroutine
def do_something_important():
    yield from asyncio.sleep(1)
The coroutine sleeps for one second. Note the decorator and the use of yield from.
Native Based Coroutine Example
By native it is meant that the language introduced syntax to specifically define coroutines, making them first class citizens in the language. Native coroutines can be defined using the async/await syntax.
The simplest native based coroutine we can write is as follows:
async def do_something_important():
    await asyncio.sleep(1)
AsyncIO Design Patterns
AsyncIO comes with its own set of possible script designs, which we will discuss in this section.
1. Event loops
The event loop is a programming construct that waits for events to happen and then dispatches them to an event handler. An event can be a user clicking on a UI button or a process initiating a file download. At the core of asynchronous programming, sits the event loop.
Example Code:
import asyncio
import random
import time
from threading import Thread
from threading import current_thread

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[34m",  # Blue
)

async def do_something_important(sleep_for):
    # current_thread().name replaces the deprecated getName()
    print(colors[1] + f"Is event loop running in thread {current_thread().name} = {asyncio.get_event_loop().is_running()}" + colors[0])
    await asyncio.sleep(sleep_for)

def launch_event_loops():
    # get a new event loop
    loop = asyncio.new_event_loop()
    # set the event loop for the current thread
    asyncio.set_event_loop(loop)
    # run a coroutine on the event loop
    loop.run_until_complete(do_something_important(random.randint(1, 5)))
    # remember to close the loop
    loop.close()

if __name__ == "__main__":
    thread_1 = Thread(target=launch_event_loops)
    thread_2 = Thread(target=launch_event_loops)
    start_time = time.perf_counter()
    thread_1.start()
    thread_2.start()
    print(colors[2] + f"Is event loop running in thread {current_thread().name} = {asyncio.get_event_loop().is_running()}" + colors[0])
    thread_1.join()
    thread_2.join()
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Event Loop Start Time: {start_time}\nEvent Loop End Time: {end_time}\nEvent Loop Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_event_loop.py
Output:
Try it out yourself and examine the output and you’ll realize that each spawned thread is running its own event loop.
Types of event loops
There are two types of event loops:
SelectorEventLoop: SelectorEventLoop is based on the selectors module and is the default loop on Unix.
ProactorEventLoop: ProactorEventLoop is based on Windows’ I/O Completion Ports and is only supported on Windows, where it has been the default loop since Python 3.8.
2. Futures
Future represents a computation that is either in progress or will get scheduled in the future. It is a special low-level awaitable object that represents an eventual result of an asynchronous operation. Don’t confuse concurrent.futures.Future and asyncio.Future.
Example Code:
import time
import asyncio
from asyncio import Future

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[34m",  # Blue
)

async def bar(future):
    print(colors[1] + "bar will sleep for 3 seconds" + colors[0])
    await asyncio.sleep(3)
    print(colors[1] + "bar resolving the future" + colors[0])
    # set_result() is what resolves the future; done() only queries its state
    future.set_result("future is resolved")

async def foo(future):
    print(colors[2] + "foo will await the future" + colors[0])
    await future
    print(colors[2] + "foo finds the future resolved" + colors[0])

async def main():
    future = Future()
    await asyncio.gather(foo(future), bar(future))

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Future Start Time: {start_time}\nFuture End Time: {end_time}\nFuture Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_futures.py
Output:
Both the coroutines are passed a future. The foo() coroutine awaits for the future to get resolved, while the bar() coroutine resolves the future after three seconds.
3. Tasks
Tasks are like futures, in fact, Task is a subclass of Future and can be created using the following methods:
asyncio.create_task() accepts coroutines and wraps them as tasks.
loop.create_task() only accepts coroutines.
asyncio.ensure_future() accepts futures, coroutines and any awaitable objects.
Tasks wrap coroutines and run them in event loops. If a coroutine awaits on a Future, the Task suspends the execution of the coroutine and waits for the Future to complete. When the Future is done, the execution of the wrapped coroutine resumes.
Example Code:
import time
import asyncio
from asyncio import Future

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[34m",  # Blue
)

async def bar(future):
    print(colors[1] + "bar will sleep for 3 seconds" + colors[0])
    await asyncio.sleep(3)
    print(colors[1] + "bar resolving the future" + colors[0])
    # set_result() marks the future as done and wakes up whoever awaits it
    future.set_result("future is resolved")

async def foo(future):
    print(colors[2] + "foo will await the future" + colors[0])
    await future
    print(colors[2] + "foo finds the future resolved" + colors[0])

async def main():
    future = Future()
    loop = asyncio.get_running_loop()
    t1 = loop.create_task(bar(future))
    t2 = loop.create_task(foo(future))
    # "await t2, t1" would only await t2 and build a tuple, so await each task
    await t2
    await t1

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Future Start Time: {start_time}\nFuture End Time: {end_time}\nFuture Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_tasks.py
4. Chaining Coroutines:
A key feature of coroutines is that they can be chained together. A coroutine object is awaitable, so another coroutine can await it. This allows you to break programs into smaller, manageable, recyclable coroutines:
Example Code:
import sys
import asyncio
import random
import time

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[36m",  # Cyan
    "\033[34m",  # Blue
)

async def function1(n: int) -> str:
    i = random.randint(0, 10)
    print(colors[1] + f"function1({n}) is sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)
    result = f"result{n}-1"
    print(colors[1] + f"Returning function1({n}) == {result}." + colors[0])
    return result

async def function2(n: int, arg: str) -> str:
    i = random.randint(0, 10)
    print(colors[2] + f"function2{n, arg} is sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)
    result = f"result{n}-2 derived from {arg}"
    print(colors[2] + f"Returning function2{n, arg} == {result}." + colors[0])
    return result

async def chain(n: int) -> None:
    start = time.perf_counter()
    p1 = await function1(n)
    p2 = await function2(n, p1)
    end = time.perf_counter() - start
    print(colors[3] + f"--> Chained result{n} => {p2} (took {end:0.2f} seconds)." + colors[0])

async def main(*args):
    await asyncio.gather(*(chain(n) for n in args))

if __name__ == "__main__":
    random.seed(444)
    args = [1, 2, 3] if len(sys.argv) == 1 else map(int, sys.argv[1:])
    start_time = time.perf_counter()
    asyncio.run(main(*args))
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[4] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Pay careful attention to the output, where function1() sleeps for a variable amount of time, and function2() begins working with the results as they become available:
Execution Command: python async_chained.py 11 8 5
5. Using a Queue:
In this design, there is no chaining of any individual consumer to a producer. The consumers don’t know the number of producers, or even the cumulative number of items that will be added to the queue, in advance.
It takes an individual producer or consumer a variable amount of time to put and extract items from the queue, respectively. The queue serves as a throughput that can communicate with the producers and consumers without them talking to each other directly.
Example Code:
import asyncio
import argparse
import itertools as it
import os
import random
import time

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[36m",  # Cyan
    "\033[34m",  # Blue
)

async def generate_item(size: int = 5) -> str:
    return os.urandom(size).hex()

async def random_sleep(caller=None) -> None:
    i = random.randint(0, 10)
    if caller:
        print(colors[1] + f"{caller} sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)

async def produce(name: int, producer_queue: asyncio.Queue) -> None:
    n = random.randint(0, 10)
    for _ in it.repeat(None, n):  # Synchronous loop for each single producer
        await random_sleep(caller=f"Producer {name}")
        i = await generate_item()
        t = time.perf_counter()
        await producer_queue.put((i, t))
        print(colors[2] + f"Producer {name} added <{i}> to queue." + colors[0])

async def consume(name: int, consumer_queue: asyncio.Queue) -> None:
    while True:
        await random_sleep(caller=f"Consumer {name}")
        i, t = await consumer_queue.get()
        now = time.perf_counter()
        print(colors[3] + f"Consumer {name} got element <{i}>" f" in {now - t:0.5f} seconds." + colors[0])
        consumer_queue.task_done()

async def main(no_producer: int, no_consumer: int):
    q = asyncio.Queue()
    producers = [asyncio.create_task(produce(n, q)) for n in range(no_producer)]
    consumers = [asyncio.create_task(consume(n, q)) for n in range(no_consumer)]
    await asyncio.gather(*producers)
    await q.join()  # Implicitly awaits consumers, too
    for consumer in consumers:
        consumer.cancel()

if __name__ == "__main__":
    random.seed(444)
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--no_producer", type=int, default=10)
    parser.add_argument("-c", "--no_consumer", type=int, default=15)
    ns = parser.parse_args()
    start_time = time.perf_counter()
    asyncio.run(main(**ns.__dict__))
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[4] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_queue.py -p 2 -c 4
Lastly, here is an example of how asyncio cuts down on wait time. Given a coroutine generate_random_int() that keeps producing random integers in the range [0, 10] until one of them exceeds a threshold, you want multiple calls of this coroutine to run without waiting for each other to complete in succession.
Example Code:
import time
import asyncio
import random

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[36m",  # Cyan
    "\033[35m",  # Magenta
    "\033[34m",  # Blue
)

async def generate_random_int(indx: int, threshold: int = 5) -> int:
    print(colors[indx + 1] + f"Initiated generate_random_int({indx}).")
    i = random.randint(0, 10)
    while i <= threshold:
        print(colors[indx + 1] + f"generate_random_int({indx}) == {i} too low; retrying.")
        await asyncio.sleep(indx + 1)
        i = random.randint(0, 10)
    print(colors[indx + 1] + f"---> Finished: generate_random_int({indx}) == {i}" + colors[0])
    return i

async def main():
    res = await asyncio.gather(*(generate_random_int(i, 10 - i - 1) for i in range(3)))
    return res

if __name__ == "__main__":
    random.seed(444)
    start_time = time.perf_counter()
    r1, r2, r3 = asyncio.run(main())
    print(colors[4] + f"\nRandom INT 1: {r1}, Random INT 2: {r2}, Random INT 3: {r3}\n" + colors[0])
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[5] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_random.py
Note: If you’re writing any code yourself, prefer native coroutines for the sake of being explicit rather than implicit. Generator-based coroutines were deprecated in Python 3.8 and removed in Python 3.11.
GitHub Repo: https://github.com/tssovi/asynchronous-in-python

Your coro desugaring is conceptually correct, but slightly incomplete.
await doesn't suspend unconditionally, but only if it encounters a blocking call. How does it know that a call is blocking? This is decided by the code being awaited. For example, an awaitable implementation of socket read could be desugared to:
def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except BlockingIOError:  # EWOULDBLOCK: no data available yet
        event_loop.add_reader(sock.fileno(), current_task())
        return SUSPEND
In real asyncio the equivalent code modifies the state of a Future instead of returning magic values, but the concept is the same. When appropriately adapted to a generator-like object, the above code can be awaited.
On the caller side, when your coroutine contains:
data = await read(sock, 1024)
It desugars into something close to:
data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)
People familiar with generators tend to describe the above in terms of yield from which does the suspension automatically.
The suspension chain continues all the way up to the event loop, which notices that the coroutine is suspended, removes it from the runnable set, and goes on to execute coroutines that are runnable, if any. If no coroutines are runnable, the loop waits in select() until either a file descriptor a coroutine is interested in becomes ready for IO or a timeout expires. (The event loop maintains a file-descriptor-to-coroutine mapping.)
In the above example, once select() tells the event loop that sock is readable, it will re-add coro to the runnable set, so it will be continued from the point of suspension.
In other words:
Everything happens in the same thread by default.
The event loop is responsible for scheduling the coroutines and waking them up when whatever they were waiting for (typically an IO call that would normally block, or a timeout) becomes ready.
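The reader registration described above can be tried out with real asyncio calls. The following is a minimal sketch, not how asyncio's own transports are written: it assumes a Unix-like system with a selector-based loop (loop.add_reader() is not available on the Windows proactor loop), and the helper names read, on_readable and demo are made up for illustration.

```python
import asyncio
import socket


async def read(sock: socket.socket, n: int) -> bytes:
    """Return up to n bytes, suspending only if the socket has no data yet."""
    loop = asyncio.get_running_loop()
    try:
        return sock.recv(n)  # fast path: data is already buffered
    except BlockingIOError:
        pass  # the call would block, so wait for readability instead

    fut = loop.create_future()

    def on_readable() -> None:
        # Called by the event loop once the selector reports the socket readable.
        loop.remove_reader(sock.fileno())
        fut.set_result(sock.recv(n))

    loop.add_reader(sock.fileno(), on_readable)
    return await fut  # suspends the task that is running this coroutine


async def demo() -> bytes:
    left, right = socket.socketpair()
    left.setblocking(False)
    right.setblocking(False)
    try:
        # Nothing has been written yet, so read() has to suspend first.
        asyncio.get_running_loop().call_later(0.05, right.send, b"hello")
        return await read(left, 1024)
    finally:
        left.close()
        right.close()


data = asyncio.run(demo())
print(data)
```

While read() is suspended, the loop is free to run other tasks. Here it only runs the timer callback that writes to the other end of the socket pair.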
For insight on coroutine-driving event loops, I recommend this talk by Dave Beazley, where he demonstrates coding an event loop from scratch in front of a live audience.

It all boils down to the two main challenges that asyncio is addressing:
How to perform multiple I/O in a single thread?
How to implement cooperative multitasking?
The answer to the first point has been around for a long while and is called a select loop. In Python, it is implemented in the selectors module.
The second question is related to the concept of coroutine, i.e. functions that can stop their execution and be restored later on. In Python, coroutines are implemented using generators and the yield from statement. That's what is hiding behind the async/await syntax.
More resources in this answer.
EDIT: Addressing your comment about goroutines:
The closest equivalent to a goroutine in asyncio is actually not a coroutine but a task (see the difference in the documentation). In Python, a coroutine (or a generator) knows nothing about the concepts of event loop or I/O. It is simply a function that can stop its execution using yield while keeping its current state, so it can be restored later on. The yield from syntax allows for chaining them in a transparent way.
Now, within an asyncio task, the coroutine at the very bottom of the chain always ends up yielding a future. This future then bubbles up to the event loop, and gets integrated into the inner machinery. When the future is set to done by some other inner callback, the event loop can restore the task by sending the future back into the coroutine chain.
EDIT: Addressing some of the questions in your post:
How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter?
No, nothing happens in a thread. I/O is always managed by the event loop, mostly through file descriptors. However, the registration of those file descriptors is usually hidden by high-level coroutines, which do the dirty work for you.
What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?
An I/O is any blocking call. In asyncio, all the I/O operations should go through the event loop, because as you said, the event loop has no way to be aware that a blocking call is being performed in some synchronous code. That means you're not supposed to use a synchronous open within the context of a coroutine. Instead, use a dedicated library such as aiofiles, which provides an asynchronous version of open.
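If no asynchronous library is at hand, one standard-library option is to hand the blocking call to a worker thread with loop.run_in_executor(). Note that this is a different technique from the file-descriptor registration described above: the blocking call really does run in a thread, and the event loop only awaits its result. In this sketch, blocking_read is a hypothetical stand-in for a synchronous call such as open().read(), and heartbeat exists only to show that the loop keeps running in the meantime.

```python
import asyncio
import time


def blocking_read() -> str:
    """Stand-in for a synchronous, blocking call."""
    time.sleep(0.2)
    return "file contents"


async def heartbeat(ticks: list) -> None:
    # Keeps getting scheduled while the blocking call runs in a worker thread.
    for i in range(3):
        await asyncio.sleep(0.05)
        ticks.append(i)


async def main():
    loop = asyncio.get_running_loop()
    ticks = []
    # Passing None selects the loop's default thread pool executor.
    contents, _ = await asyncio.gather(
        loop.run_in_executor(None, blocking_read),
        heartbeat(ticks),
    )
    return contents, ticks


contents, ticks = asyncio.run(main())
print(contents, ticks)
```

Had blocking_read() been called directly inside a coroutine, heartbeat() could not have run until it returned.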

It allows you to write single-threaded asynchronous code and implement concurrency in Python. Basically, asyncio provides an event loop for asynchronous programming. For example, if we need to make requests without blocking the main thread, we can use the asyncio library.
The asyncio module allows for the implementation of asynchronous programming using a combination of the following elements:
Event loop: The asyncio module allows one event loop per thread.
Coroutines: A coroutine is a generator that follows certain conventions. Its most interesting feature is that it can be suspended during execution to wait for external processing (for example, some routine in I/O) and resume from the point where it stopped once the external processing is done.
Futures: Futures represent a process that has still not finished. A future is an object that is supposed to have a result in the future and represents uncompleted tasks.
Tasks: This is a subclass of asyncio.Future that encapsulates and manages coroutines. We can use the asyncio.Task object to encapsulate a coroutine.
The most important concept within asyncio is the event loop. An event loop allows you to write asynchronous code using either callbacks or coroutines.
The keys to understanding asyncio are the concepts of coroutines and the event loop. Coroutines are stateful functions whose execution can be suspended while an I/O operation is being executed. An event loop is used to orchestrate the execution of the coroutines.
To run any coroutine function, we need to get an event loop. We can do this with
loop = asyncio.get_event_loop()
This gives us a BaseEventLoop object. It has a run_until_complete method that takes a coroutine, runs it until completion, and returns the coroutine's result. At a low level, an event loop executes the BaseEventLoop.run_until_complete(future) method.
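As a small illustration of run_until_complete, the sketch below runs a trivial coroutine on an explicitly created loop and then does the same with asyncio.run(). It uses asyncio.new_event_loop() instead of get_event_loop(), because calling get_event_loop() when no loop is running is deprecated in recent Python versions. The coroutine name compute is made up for the example.

```python
import asyncio


async def compute() -> int:
    await asyncio.sleep(0)  # give control back to the loop once
    return 42


# Explicit loop management: create, run, close.
loop = asyncio.new_event_loop()
try:
    explicit_result = loop.run_until_complete(compute())
finally:
    loop.close()

# asyncio.run() (Python 3.7+) creates, runs and closes a loop in one call.
run_result = asyncio.run(compute())

print(explicit_result, run_result)
```

For most scripts asyncio.run() is the simpler choice. Explicit loop handling is mainly useful when you need to keep the loop object around.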

Picture an airport control tower with many planes waiting to land on the same runway. The control tower is the event loop and the runway is the thread. Each plane is a separate function waiting to execute. In reality, only one plane can use the runway at a time. What asyncio does is let many planes share the same runway: the event loop suspends one function and lets other functions run. When you use the await syntax, it means that the plane (function) can be suspended so that other functions can proceed.

How does asyncio actually work?

This question is motivated by another question of mine: How to await in cdef?
There are tons of articles and blog posts on the web about asyncio, but they are all very superficial. I couldn't find any information about how asyncio is actually implemented, and what makes I/O asynchronous. I was trying to read the source code, but it's thousands of lines of not the highest grade C code, a lot of which deals with auxiliary objects, but most crucially, it is hard to connect between Python syntax and what C code it would translate into.
Asyncio's own documentation is even less helpful. There's no information there about how it works, only some guidelines about how to use it, which are also sometimes misleading / very poorly written.
I'm familiar with Go's implementation of coroutines, and was kind of hoping that Python did the same thing. If that was the case, the code I came up with in the post linked above would have worked. Since it didn't, I'm now trying to figure out why. My best guess so far is as follows, please correct me where I'm wrong:
Procedure definitions of the form async def foo(): ... are actually interpreted as methods of a class inheriting coroutine.
Perhaps, async def is actually split into multiple methods by await statements, where the object, on which these methods are called is able to keep track of the progress it made through the execution so far.
If the above is true, then, essentially, execution of a coroutine boils down to calling methods of coroutine object by some global manager (loop?).
The global manager is somehow (how?) aware of when I/O operations are performed by Python (only?) code and is able to choose one of the pending coroutine methods to execute after the current executing method relinquished control (hit on the await statement).
In other words, here's my attempt at "desugaring" of some asyncio syntax into something more understandable:
async def coro(name):
    print('before', name)
    await asyncio.sleep()
    print('after', name)

asyncio.gather(coro('first'), coro('second'))

# translated from async def coro(name)
class Coro(coroutine):
    def before(self, name):
        print('before', name)

    def after(self, name):
        print('after', name)

    def __init__(self, name):
        self.name = name
        self.parts = self.before, self.after
        self.pos = 0

    def __call__():
        self.parts[self.pos](self.name)
        self.pos += 1

    def done(self):
        return self.pos == len(self.parts)

# translated from asyncio.gather()
class AsyncIOManager:
    def gather(*coros):
        while not every(c.done() for c in coros):
            coro = random.choice(coros)
            coro()
Should my guess prove correct: then I have a problem. How does I/O actually happen in this scenario? In a separate thread? Is the whole interpreter suspended and I/O happens outside the interpreter? What exactly is meant by I/O? If my python procedure called C open() procedure, and it in turn sent interrupt to kernel, relinquishing control to it, how does Python interpreter know about this and is able to continue running some other code, while kernel code does the actual I/O and until it wakes up the Python procedure which sent the interrupt originally? How can Python interpreter in principle, be aware of this happening?

How does asyncio work?
Before answering this question we need to understand a few base terms, skip these if you already know any of them.
Generators
Generators are objects that allow us to suspend the execution of a Python function. User-defined generators are implemented using the keyword yield. By creating a normal function containing the yield keyword, we turn that function into a generator:
>>> def test():
...     yield 1
...     yield 2
...
>>> gen = test()
>>> next(gen)
1
>>> next(gen)
2
>>> next(gen)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
As you can see, calling next() on the generator causes the interpreter to load the test's frame, and return the yielded value. Calling next() again, causes the frame to load again into the interpreter stack, and continues on yielding another value.
By the third time next() is called, our generator was finished, and StopIteration was thrown.
Communicating with a generator
A less-known feature of generators is the fact that you can communicate with them using two methods: send() and throw().
>>> def test():
...     val = yield 1
...     print(val)
...     yield 2
...     yield 3
...
>>> gen = test()
>>> next(gen)
1
>>> gen.send("abc")
abc
2
>>> gen.throw(Exception())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in test
Exception
Upon calling gen.send(), the value is passed as a return value from the yield keyword.
gen.throw() on the other hand, allows throwing Exceptions inside generators, with the exception raised at the same spot yield was called.
Returning values from generators
Returning a value from a generator, results in the value being put inside the StopIteration exception. We can later on recover the value from the exception and use it to our needs.
>>> def test():
...     yield 1
...     return "abc"
...
>>> gen = test()
>>> next(gen)
1
>>> try:
...     next(gen)
... except StopIteration as exc:
...     print(exc.value)
...
abc
Behold, a new keyword: yield from
Python 3.3 came with the addition of a new keyword: yield from. What that keyword allows us to do is pass on any next(), send() and throw() into the inner-most nested generator. If the inner generator returns a value, it is also the return value of yield from:
>>> def inner():
...     inner_result = yield 2
...     print('inner', inner_result)
...     return 3
...
>>> def outer():
...     yield 1
...     val = yield from inner()
...     print('outer', val)
...     yield 4
...
>>> gen = outer()
>>> next(gen)
1
>>> next(gen)  # Goes inside inner() automatically
2
>>> gen.send("abc")
inner abc
outer 3
4
I've written an article to further elaborate on this topic.
Putting it all together
With the introduction of the new keyword yield from in Python 3.3, we were able to create generators inside generators that, just like a tunnel, pass the data back and forth between the inner-most and the outer-most generators. This has spawned a new meaning for generators - coroutines.
Coroutines are functions that can be stopped and resumed while being run. In Python, they are defined using the async def keyword. Much like generators, they too use their own form of yield from which is await. Before async and await were introduced in Python 3.5, we created coroutines in the exact same way generators were created (with yield from instead of await).
async def inner():
    return 1

async def outer():
    await inner()
Just like all iterators and generators implement the __iter__() method, all coroutines implement __await__() which allows them to continue on every time await coro is called.
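To make the link to generators concrete, a coroutine object can be driven by hand with send(), exactly like a generator. This is a sketch for illustration only; in practice an event loop does this for you. Here outer is changed slightly from the snippet above so that it returns a value.

```python
async def inner():
    return 1


async def outer():
    value = await inner()
    return value + 1


coro = outer()
has_await = hasattr(coro, "__await__")

try:
    # Nothing in the chain ever suspends, so the first send() runs to the end.
    coro.send(None)
except StopIteration as exc:
    # As with generators, the return value travels inside StopIteration.
    result = exc.value

print(has_await, result)
```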
There's a nice sequence diagram inside the Python docs that you should check out.
In asyncio, apart from coroutine functions, we have 2 important objects: tasks and futures.
Futures
Futures are objects that have the __await__() method implemented, and their job is to hold a certain state and result. The state can be one of the following:
PENDING - future does not have any result or exception set.
CANCELLED - future was cancelled using fut.cancel()
FINISHED - future was finished, either by a result set using fut.set_result() or by an exception set using fut.set_exception()
The result, just like you have guessed, can either be a Python object, that will be returned, or an exception which may be raised.
Another important feature of future objects is that they contain a method called add_done_callback(). This method allows functions to be called as soon as the future is done - whether it raised an exception or finished.
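The state change and the callback can be observed with a bare future. A minimal sketch, assuming nothing beyond the standard library; the names seen, before and after are illustrative only:

```python
import asyncio


async def main():
    loop = asyncio.get_running_loop()
    fut = loop.create_future()  # a new future starts out pending
    seen = []
    # The callback receives the future itself once it is done.
    fut.add_done_callback(lambda f: seen.append(f.result()))

    before = fut.done()
    # Resolve the future on a later loop iteration, as some I/O handler would.
    loop.call_soon(fut.set_result, "payload")
    value = await fut
    await asyncio.sleep(0)  # let any remaining callbacks run
    return before, fut.done(), value, seen


before, after, value, seen = asyncio.run(main())
print(before, after, value, seen)
```

The coroutine awaiting the future and the callback are both notified through the same mechanism, which is what tasks rely on.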
Tasks
Task objects are special futures, which wrap around coroutines, and communicate with the inner-most and outer-most coroutines. Every time a coroutine awaits a future, the future is passed all the way back to the task (just like in yield from), and the task receives it.
Next, the task binds itself to the future. It does so by calling add_done_callback() on the future. From now on, if the future will ever be done, by either being cancelled, passed an exception or passed a Python object as a result, the task's callback will be called, and it will rise back up to existence.
Asyncio
The final burning question we must answer is - how is the IO implemented?
Deep inside asyncio, we have an event loop. An event loop of tasks. The event loop's job is to call tasks every time they are ready and coordinate all that effort into one single working machine.
The IO part of the event loop is built upon a single crucial function called select. Select is a blocking function, implemented by the operating system underneath, that allows waiting on sockets for incoming or outgoing data. Upon receiving data it wakes up, and returns the sockets which received data, or the sockets which are ready for writing.
When you try to receive or send data over a socket through asyncio, what actually happens below is that the socket is first checked if it has any data that can be immediately read or sent. If its .send() buffer is full, or the .recv() buffer is empty, the socket is registered to the select function (by simply adding it to one of the lists, rlist for recv and wlist for send) and the appropriate function awaits a newly created future object, tied to that socket.
When all available tasks are waiting for futures, the event loop calls select and waits. When one of the sockets has incoming data, or its send buffer has drained, asyncio checks for the future object tied to that socket, and sets it to done.
Now all the magic happens. The future is set to done, the task that added itself before with add_done_callback() rises back to life, and calls .send() on the coroutine, which resumes the inner-most coroutine (because of the await chain), and you read the newly received data from a nearby buffer it was spilled into.
Method chain again, in case of recv():
select.select waits.
A ready socket, with data is returned.
Data from the socket is moved into a buffer.
future.set_result() is called.
Task that added itself with add_done_callback() is now woken up.
Task calls .send() on the coroutine which goes all the way into the inner-most coroutine and wakes it up.
Data is being read from the buffer and returned to our humble user.
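The chain above can be imitated in a few dozen lines. The following toy loop is a sketch under simplifying assumptions, not asyncio's actual implementation: it has no futures or tasks, it parks the coroutine itself in the selector, and it only handles reads. WaitReadable, recv, reader and run are names invented for this example. It uses the standard selectors module, which wraps select and its platform-specific relatives.

```python
import selectors
import socket


class WaitReadable:
    """Awaitable that asks the loop to resume us once sock is readable."""

    def __init__(self, sock: socket.socket):
        self.sock = sock

    def __await__(self):
        yield self  # travels up the await chain to whoever called send()


async def recv(sock: socket.socket, n: int) -> bytes:
    await WaitReadable(sock)  # suspend until the loop reports "readable"
    return sock.recv(n)


async def reader(name: str, sock: socket.socket, log: list) -> None:
    log.append((name, await recv(sock, 1024)))


def run(coros) -> None:
    """Toy event loop: run coroutines, parking them in a selector on I/O."""
    selector = selectors.DefaultSelector()
    ready = list(coros)
    parked = 0
    while ready or parked:
        while ready:
            coro = ready.pop(0)
            try:
                request = coro.send(None)  # run until the next suspension
            except StopIteration:
                continue  # this coroutine has finished
            selector.register(request.sock, selectors.EVENT_READ, data=coro)
            parked += 1
        if parked:
            for key, _events in selector.select():  # blocks until I/O is ready
                selector.unregister(key.fileobj)
                parked -= 1
                ready.append(key.data)
    selector.close()


a1, b1 = socket.socketpair()
a2, b2 = socket.socketpair()
b2.send(b"second")
b1.send(b"first")
log = []
run([reader("r1", a1, log), reader("r2", a2, log)])
for s in (a1, b1, a2, b2):
    s.close()
print(sorted(log))
```

The order in which the two readers finish depends on what the selector reports first, which is why the result is sorted before printing.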
In summary, asyncio uses generator capabilities, that allow pausing and resuming functions. It uses yield from capabilities that allow passing data back and forth from the inner-most generator to the outer-most. It uses all of those in order to halt function execution while it's waiting for IO to complete (by using the OS select function).
And the best of all? While one function is paused, another may run and interleave with the delicate fabric, which is asyncio.

Talking about async/await and asyncio is not the same thing. The first is a fundamental, low-level construct (coroutines) while the latter is a library using these constructs. Conversely, there is no single ultimate answer.
The following is a general description of how async/await and asyncio-like libraries work. That is, there may be other tricks on top (there are...) but they are inconsequential unless you build them yourself. The difference should be negligible unless you already know enough to not have to ask such a question.
1. Coroutines versus subroutines in a nutshell
Just like subroutines (functions, procedures, ...), coroutines (generators, ...) are an abstraction of call stack and instruction pointer: there is a stack of executing code pieces, and each is at a specific instruction.
The distinction of def versus async def is merely for clarity. The actual difference is return versus yield. From this, await or yield from take the difference from individual calls to entire stacks.
1.1. Subroutines
A subroutine represents a new stack level to hold local variables, and a single traversal of its instructions to reach an end. Consider a subroutine like this:
def subfoo(bar):
    qux = 3
    return qux * bar
When you run it, that means
1. allocate stack space for bar and qux
2. recursively execute the first statement and jump to the next statement
3. once at a return, push its value to the calling stack
4. clear the stack (1.) and instruction pointer (2.)
Notably, 4. means that a subroutine always starts at the same state. Everything exclusive to the function itself is lost upon completion. A function cannot be resumed, even if there are instructions after return.
root -\
  :    \- subfoo --\
  :/--<---return --/
  |
  V
1.2. Coroutines as persistent subroutines
A coroutine is like a subroutine, but can exit without destroying its state. Consider a coroutine like this:
def cofoo(bar):
    qux = yield bar  # yield marks a break point
    return qux
When you run it, that means
1. allocate stack space for bar and qux
2. recursively execute the first statement and jump to the next statement
   2.1. once at a yield, push its value to the calling stack but store the stack and instruction pointer
   2.2. once calling into yield, restore stack and instruction pointer and push arguments to qux
3. once at a return, push its value to the calling stack
4. clear the stack (1.) and instruction pointer (2.)
Note the addition of 2.1 and 2.2 - a coroutine can be suspended and resumed at predefined points. This is similar to how a subroutine is suspended during calling another subroutine. The difference is that the active coroutine is not strictly bound to its calling stack. Instead, a suspended coroutine is part of a separate, isolated stack.
root -\
  :    \- cofoo --\
  :/--<+--yield --/
  |    :
  V    :
This means that suspended coroutines can be freely stored or moved between stacks. Any call stack that has access to a coroutine can decide to resume it.
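The suspend and resume steps can be followed by driving cofoo by hand. A short sketch; the argument values are arbitrary:

```python
def cofoo(bar):
    qux = yield bar  # yield marks a break point
    return qux


co = cofoo("first")

# Run up to the yield: the coroutine hands out bar and keeps its state.
yielded = next(co)

try:
    # Resume at the yield: the sent value becomes qux, then cofoo returns.
    co.send("second")
except StopIteration as exc:
    returned = exc.value

print(yielded, returned)
```

Between next() and send(), the suspended coroutine is an ordinary object. It could be stored in a list or handed to other code before being resumed.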
1.3. Traversing the call stack
So far, our coroutine only goes down the call stack with yield. A subroutine can go down and up the call stack with return and (). For completeness, coroutines also need a mechanism to go up the call stack. Consider a coroutine like this:
def wrap():
    yield 'before'
    yield from cofoo('bar')  # cofoo needs its bar argument
    yield 'after'
When you run it, that means it still allocates the stack and instruction pointer like a subroutine. When it suspends, that still is like storing a subroutine.
However, yield from does both. It suspends stack and instruction pointer of wrap and runs cofoo. Note that wrap stays suspended until cofoo finishes completely. Whenever cofoo suspends or something is sent, cofoo is directly connected to the calling stack.
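A variant of wrap shows this direct connection. It is a sketch that differs from the snippet above in two labeled ways: wrap takes a bar argument to pass on, and it captures the result of yield from so the final step can report it.

```python
def cofoo(bar):
    qux = yield bar
    return qux


def wrap(bar):
    yield "before"
    # While cofoo runs, next()/send() on wrap go straight through to it.
    result = yield from cofoo(bar)
    yield "after: " + str(result)


gen = wrap("inner")
steps = [
    next(gen),          # wrap's own first yield
    next(gen),          # cofoo's yield, passed through wrap
    gen.send("reply"),  # received by cofoo, which returns it to wrap
]
print(steps)
```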
1.4. Coroutines all the way down
As established, yield from allows to connect two scopes across another intermediate one. When applied recursively, that means the top of the stack can be connected to the bottom of the stack.
root -\
  :    \-> coro_a -yield-from-> coro_b --\
  :/ <-+------------------------yield ---/
  |    :
  :\ --+-- coro_a.send----------yield ---\
  :                             coro_b <-/
Note that root and coro_b do not know about each other. This makes coroutines much cleaner than callbacks: coroutines still built on a 1:1 relation like subroutines. Coroutines suspend and resume their entire existing execution stack up until a regular call point.
Notably, root could have an arbitrary number of coroutines to resume. Yet, it can never resume more than one at the same time. Coroutines of the same root are concurrent but not parallel!
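A root that resumes several coroutines in turn can be sketched as a round-robin scheduler. This is an illustration only; worker and round_robin are hypothetical names, and the coroutines are plain generators that yield nothing useful.

```python
def worker(name, steps, trace):
    for i in range(steps):
        trace.append(f"{name}{i}")
        yield  # hand control back to the root


def round_robin(coros):
    """Resume each coroutine in turn until all of them have finished."""
    queue = list(coros)
    while queue:
        co = queue.pop(0)
        try:
            next(co)  # only one coroutine ever runs at a time
        except StopIteration:
            continue  # finished: do not re-queue
        queue.append(co)


trace = []
round_robin([worker("a", 2, trace), worker("b", 3, trace)])
print(trace)
```

The trace interleaves the two workers, which is the concurrency. At no point do two of them execute simultaneously, which is the lack of parallelism.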
1.5. Python's async and await
The explanation has so far explicitly used the yield and yield from vocabulary of generators - the underlying functionality is the same. The new Python 3.5 syntax async and await exists mainly for clarity.
def foo():  # subroutine?
    return None

def foo():  # coroutine?
    yield from foofoo()  # generator? coroutine?

async def foo():  # coroutine!
    await foofoo()  # coroutine!
    return None
The async for and async with statements are needed because you would break the yield from/await chain with the bare for and with statements.
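One way to see why: a plain for loop calls __next__ synchronously, so the iteration step has no way to suspend. With async for, each step is a coroutine (__anext__) that may await. A small sketch; the Countdown class is made up for this example:

```python
import asyncio


class Countdown:
    """Async iterator: __anext__ is a coroutine, so each step may suspend."""

    def __init__(self, start: int):
        self.current = start

    def __aiter__(self):
        return self

    async def __anext__(self) -> int:
        if self.current <= 0:
            raise StopAsyncIteration
        await asyncio.sleep(0)  # a suspension point inside the iteration step
        self.current -= 1
        return self.current + 1


async def main():
    return [n async for n in Countdown(3)]


values = asyncio.run(main())
print(values)
```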
2. Anatomy of a simple event loop
By itself, a coroutine has no concept of yielding control to another coroutine. It can only yield control to the caller at the bottom of a coroutine stack. This caller can then switch to another coroutine and run it.
This root node of several coroutines is commonly an event loop: on suspension, a coroutine yields an event on which it wants to resume. In turn, the event loop is capable of efficiently waiting for these events to occur. This allows it to decide which coroutine to run next, or how to wait before resuming.
Such a design implies that there is a set of pre-defined events that the loop understands. Several coroutines await each other, until finally an event is awaited. This event can communicate directly with the event loop by yielding control.
loop -\
  :    \-> coroutine --await--> event --\
  :/ <-+----------------------- yield --/
  |    :
  |    :  # loop waits for event to happen
  |    :
  :\ --+-- send(reply) -------- yield --\
  :        coroutine <--yield-- event <-/
The key is that coroutine suspension allows the event loop and events to directly communicate. The intermediate coroutine stack does not require any knowledge about which loop is running it, nor how events work.
2.1.1. Events in time
The simplest event to handle is reaching a point in time. This is a fundamental block of threaded code as well: a thread repeatedly sleeps until a condition is true.
However, a regular sleep blocks execution by itself - we want other coroutines to not be blocked. Instead, we want to tell the event loop when it should resume the current coroutine stack.
2.1.2. Defining an Event
An event is simply a value we can identify - be it via an enum, a type or other identity. We can define this with a simple class that stores our target time. In addition to storing the event information, we can allow instances of the class to be awaited directly.
class AsyncSleep:
    """Event to sleep until a point in time"""
    def __init__(self, until: float):
        self.until = until

    # used whenever someone ``await``s an instance of this Event
    def __await__(self):
        # yield this Event to the loop
        yield self

    def __repr__(self):
        return '%s(until=%.1f)' % (self.__class__.__name__, self.until)
This class only stores the event - it does not say how to actually handle it.
The only special feature is __await__ - it is what the await keyword looks for. Practically, it is an iterator but not available for the regular iteration machinery.
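A small sketch of what that means in practice, using a hypothetical Ping event with the same shape as AsyncSleep: calling __await__ by hand gives an iterator that yields the event, while the regular iter() machinery refuses the object.

```python
class Ping:
    """Hypothetical event, mirroring the shape of AsyncSleep."""
    def __await__(self):
        yield self


ping = Ping()
awaiter = ping.__await__()   # what ``await ping`` looks up
first = next(awaiter)        # the event is handed to the caller

try:
    iter(ping)               # regular iteration does not look at __await__
except TypeError:
    iterable = False
else:
    iterable = True

print(first is ping, iterable)  # True False
```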
2.2.1. Awaiting an event
Now that we have an event, how do coroutines react to it? We should be able to express the equivalent of sleep by awaiting our event. To better see what is going on, we wait twice for half the time:
import time

async def asleep(duration: float):
    """await that ``duration`` seconds pass"""
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)
We can directly instantiate and run this coroutine. Similar to a generator, using coroutine.send runs the coroutine until it yields a result.
coroutine = asleep(100)
while True:
    print(coroutine.send(None))
    time.sleep(0.1)
This gives us two AsyncSleep events and then a StopIteration when the coroutine is done. Notice that the only delay is from time.sleep in the loop! Each AsyncSleep only stores an offset from the current time.
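If you would rather collect the events than let StopIteration escape, the loop can be wrapped as in the sketch below. The drain helper and the added return value "done" are assumptions for illustration; AsyncSleep and asleep are repeated (without the repr) so that the snippet runs on its own.

```python
import time


class AsyncSleep:
    def __init__(self, until):
        self.until = until

    def __await__(self):
        yield self


async def asleep(duration):
    await AsyncSleep(time.time() + duration / 2)
    await AsyncSleep(time.time() + duration / 2)
    return "done"  # added here only to show where return values end up


def drain(coroutine):
    """Hypothetical helper: collect every yielded event and the return value."""
    events = []
    while True:
        try:
            events.append(coroutine.send(None))
        except StopIteration as stop:
            # the coroutine's return value travels on the exception
            return events, stop.value


events, result = drain(asleep(100))
print(len(events), result)  # 2 done
```

Note that draining finishes immediately even though the coroutine asked to sleep for 100 seconds: nothing here honours the requested wake-up time yet.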
2.2.2. Event + Sleep
At this point, we have two separate mechanisms at our disposal:
AsyncSleep Events that can be yielded from inside a coroutine
time.sleep that can wait without impacting coroutines
Notably, these two are orthogonal: neither one affects or triggers the other. As a result, we can come up with our own strategy to sleep to meet the delay of an AsyncSleep.
2.3. A naive event loop
If we have several coroutines, each can tell us when it wants to be woken up. We can then wait until the first of them wants to be resumed, then for the one after, and so on. Notably, at each point we only care about which one is next.
This makes for a straightforward scheduling:
1. sort coroutines by their desired wake up time
2. pick the first that wants to wake up
3. wait until this point in time
4. run this coroutine
5. repeat from 1.
A trivial implementation does not need any advanced concepts. A list lets us sort coroutines by wake-up time. Waiting is a regular time.sleep. Running coroutines works just like before with coroutine.send.
def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    # store wake-up-time and coroutines
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting:
        # 2. pick the first coroutine that wants to wake up
        until, coroutine = waiting.pop(0)
        # 3. wait until this point in time
        time.sleep(max(0.0, until - time.time()))
        # 4. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
Of course, this has ample room for improvement. We can use a heap for the wait queue or a dispatch table for events. We could also fetch return values from the StopIteration and assign them to the coroutine. However, the fundamental principle remains the same.
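As a rough sketch of two of these improvements, the variant below keeps the wait queue in a heap and collects return values from StopIteration. The counter tie-breaker, the results list and the nap coroutine are assumptions made for this example, not part of the loop above.

```python
import heapq
import itertools
import time


class AsyncSleep:
    def __init__(self, until):
        self.until = until

    def __await__(self):
        yield self


def run(*coroutines):
    """Like the list-based loop, but with a heap and collected results."""
    counter = itertools.count()  # tie-breaker so coroutines are never compared
    waiting = [
        (0.0, next(counter), index, coroutine)
        for index, coroutine in enumerate(coroutines)
    ]
    heapq.heapify(waiting)
    results = [None] * len(coroutines)
    while waiting:
        # the heap always hands out the earliest wake-up time first
        until, _, index, coroutine = heapq.heappop(waiting)
        time.sleep(max(0.0, until - time.time()))
        try:
            command = coroutine.send(None)
        except StopIteration as stop:
            results[index] = stop.value  # the coroutine's return value
            continue
        if isinstance(command, AsyncSleep):
            heapq.heappush(
                waiting, (command.until, next(counter), index, coroutine)
            )
    return results


async def nap(duration, value):
    await AsyncSleep(time.time() + duration)
    return value


print(run(nap(0.02, "slow"), nap(0.01, "fast")))  # ['slow', 'fast']
```

Results come back in argument order, regardless of which coroutine finished first.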
2.4. Cooperative Waiting
The AsyncSleep event and run event loop are a fully working implementation of timed events.
async def sleepy(identifier: str = "coroutine", count=5):
    for i in range(count):
        print(identifier, 'step', i + 1, 'at %.2f' % time.time())
        await asleep(0.1)

run(*(sleepy("coroutine %d" % j) for j in range(5)))
This cooperatively switches between each of the five coroutines, suspending each for 0.1 seconds. Even though the event loop is synchronous, it still executes the work in 0.5 seconds instead of 2.5 seconds. Each coroutine holds state and acts independently.
3. I/O event loop
An event loop that supports sleep is suitable for polling. However, waiting for I/O on a file handle can be done more efficiently: the operating system implements I/O and thus knows which handles are ready. Ideally, an event loop should support an explicit "ready for I/O" event.
3.1. The select call
Python already has an interface to query the OS for read I/O handles. When called with handles to read or write, it returns the handles ready to read or write:
readable, writable, _ = select.select(rlist, wlist, xlist, timeout)
For example, we can open a file for writing and wait for it to be ready:
write_target = open('/tmp/foo', 'w')
readable, writable, _ = select.select([], [write_target], [])
Once select returns, writable contains our open file.
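The same call can be tried on a pair of connected sockets, where readiness actually changes over time. This is only a sketch: socket.socketpair is used here as a convenient stand-in for a real connection, and the timeouts are arbitrary.

```python
import select
import socket

left, right = socket.socketpair()
try:
    # nothing has been sent yet, so ``right`` is not reported as readable
    readable, _, _ = select.select([right], [], [], 0)
    before = list(readable)

    left.sendall(b"ping")
    # wait up to one second for the data to become readable
    readable, _, _ = select.select([right], [], [], 1.0)
    after = list(readable)
    data = right.recv(4)
finally:
    left.close()
    right.close()

print(len(before), len(after), data)
```

A timeout of 0 turns select into a pure poll, while a positive timeout lets it double as a sleep that ends early when I/O becomes possible.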
3.2. Basic I/O event
Similar to the AsyncSleep request, we need to define an event for I/O. With the underlying select logic, the event must refer to a readable object - say an open file. In addition, we store how much data to read.
class AsyncRead:
    def __init__(self, file, amount=1):
        self.file = file
        self.amount = amount
        self._buffer = b'' if 'b' in file.mode else ''

    def __await__(self):
        while len(self._buffer) < self.amount:
            yield self
            # we only get here if ``read`` should not block
            self._buffer += self.file.read(1)
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.file, self.amount, len(self._buffer)
        )
As with AsyncSleep we mostly just store the data required for the underlying system call. This time, __await__ is capable of being resumed multiple times - until our desired amount has been read. In addition, we return the I/O result instead of just resuming.
3.3. Augmenting an event loop with read I/O
The basis for our event loop is still the run defined previously. First, we need to track the read requests. This is no longer a sorted schedule, we only map read requests to coroutines.
# new
waiting_read = {} # type: Dict[file, coroutine]
Since select.select takes a timeout parameter, we can use it in place of time.sleep.
# old
time.sleep(max(0.0, until - time.time()))
# new
readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
This gives us all readable files - if there are any, we run the corresponding coroutine. If there are none, we have waited long enough for our current coroutine to run.
# new - reschedule waiting coroutine, run readable coroutine
if readable:
    waiting.append((until, coroutine))
    # sort on the wake-up time only, so that coroutines are never compared
    waiting.sort(key=lambda item: item[0])
    coroutine = waiting_read[readable[0]]
Finally, we have to actually listen for read requests.
# new
if isinstance(command, AsyncSleep):
    ...
elif isinstance(command, AsyncRead):
    ...
3.4. Putting it together
The above was a bit of a simplification. We need to do some switching to not starve sleeping coroutines if we can always read. We need to handle having nothing to read or nothing to wait for. However, the end result still fits into 30 LOC.
import select
import time

def run(*coroutines):
    """Cooperatively run all ``coroutines`` until completion"""
    waiting_read = {}  # type: Dict[file, coroutine]
    waiting = [(0, coroutine) for coroutine in coroutines]
    while waiting or waiting_read:
        # 2. wait until the next coroutine may run or read ...
        try:
            until, coroutine = waiting.pop(0)
        except IndexError:
            until, coroutine = float('inf'), None
            readable, _, _ = select.select(list(waiting_read), [], [])
        else:
            readable, _, _ = select.select(list(waiting_read), [], [], max(0.0, until - time.time()))
        # ... and select the appropriate one
        if readable and time.time() < until:
            if until and coroutine:
                waiting.append((until, coroutine))
                # sort on the wake-up time only, never on the coroutines
                waiting.sort(key=lambda item: item[0])
            coroutine = waiting_read.pop(readable[0])
        # 3. run this coroutine
        try:
            command = coroutine.send(None)
        except StopIteration:
            continue
        # 1. sort coroutines by their desired suspension ...
        if isinstance(command, AsyncSleep):
            waiting.append((command.until, coroutine))
            waiting.sort(key=lambda item: item[0])
        # ... or register reads
        elif isinstance(command, AsyncRead):
            waiting_read[command.file] = coroutine
3.5. Cooperative I/O
The AsyncSleep, AsyncRead and run implementations are now fully functional to sleep and/or read.
Same as for sleepy, we can define a helper to test reading:
async def ready(path, amount=1024*32):
    print('read', path, 'at', '%d' % time.time())
    with open(path, 'rb') as file:
        result = await AsyncRead(file, amount)
    print('done', path, 'at', '%d' % time.time())
    print('got', len(result), 'B')

run(sleepy('background', 5), ready('/dev/urandom'))
Running this, we can see that our I/O is interleaved with the waiting task:
id background round 1
read /dev/urandom at 1530721148
id background round 2
id background round 3
id background round 4
id background round 5
done /dev/urandom at 1530721148
got 1024 B
4. Non-Blocking I/O
While I/O on files gets the concept across, it is not really suitable for a library like asyncio: the select call always returns for files, and both open and read may block indefinitely. This blocks all coroutines of an event loop - which is bad. Libraries like aiofiles use threads and synchronization to fake non-blocking I/O and events on file.
However, sockets do allow for non-blocking I/O - and their inherent latency makes it much more critical. When used in an event loop, waiting for data and retrying can be wrapped without blocking anything.
4.1. Non-Blocking I/O event
Similar to our AsyncRead, we can define a suspend-and-read event for sockets. Instead of taking a file, we take a socket - which must be non-blocking. Also, our __await__ uses socket.recv instead of file.read.
class AsyncRecv:
    def __init__(self, connection, amount=1, read_buffer=1024):
        assert not connection.getblocking(), 'connection must be non-blocking for async recv'
        self.connection = connection
        self.amount = amount
        self.read_buffer = read_buffer
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                self._buffer += self.connection.recv(self.read_buffer)
            except BlockingIOError:
                yield self
        return self._buffer

    def __repr__(self):
        return '%s(file=%s, amount=%d, progress=%d)' % (
            self.__class__.__name__, self.connection, self.amount, len(self._buffer)
        )
In contrast to AsyncRead, __await__ performs truly non-blocking I/O. When data is available, it always reads. When no data is available, it always suspends. That means the event loop is only blocked while we perform useful work.
4.2. Un-Blocking the event loop
As far as the event loop is concerned, nothing changes much. The event to listen for is still the same as for files - a file descriptor marked ready by select.
# old
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
# new
elif isinstance(command, AsyncRead):
    waiting_read[command.file] = coroutine
elif isinstance(command, AsyncRecv):
    waiting_read[command.connection] = coroutine
At this point, it should be obvious that AsyncRead and AsyncRecv are the same kind of event. We could easily refactor them to be one event with an exchangeable I/O component. In effect, the event loop, coroutines and events cleanly separate a scheduler, arbitrary intermediate code and the actual I/O.
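One possible shape for such a refactoring is sketched below. The AsyncIORead class, its source/read parameters and the recv_event helper are hypothetical names chosen for this example. The sketch follows the AsyncRecv strategy of reading first and suspending only on BlockingIOError, and it additionally stops at end of stream.

```python
import select
import socket


class AsyncIORead:
    """Hypothetical merged event: select watches ``source``, ``read`` fetches data."""
    def __init__(self, source, read, amount=1):
        self.source = source
        self.read = read
        self.amount = amount
        self._buffer = b''

    def __await__(self):
        while len(self._buffer) < self.amount:
            try:
                chunk = self.read()
            except BlockingIOError:
                yield self  # nothing available: suspend until ``source`` is ready
                continue
            if not chunk:
                break  # end of stream: stop instead of looping forever
            self._buffer += chunk
        return self._buffer


def recv_event(connection, amount, read_buffer=1024):
    """Build the socket flavour of the event."""
    return AsyncIORead(connection, lambda: connection.recv(read_buffer), amount)


async def fetch(connection):
    return await recv_event(connection, amount=4)


left, right = socket.socketpair()
right.setblocking(False)
coroutine = fetch(right)
event = coroutine.send(None)  # no data yet, so the event is yielded
left.sendall(b"ping")

received = None
while received is None:
    # this is the part an event loop would do for us
    select.select([event.source], [], [], 1.0)
    try:
        coroutine.send(None)
    except StopIteration as stop:
        received = stop.value

left.close()
right.close()
print(received)  # b'ping'
```

With this shape, the loop only needs a single isinstance check and registers command.source, whatever kind of object is behind it.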
4.3. The ugly side of non-blocking I/O
In principle, what you should do at this point is replicate the logic of read as a recv for AsyncRecv. However, this is much more ugly now - you have to handle early returns when functions block inside the kernel, but yield control to you. For example, opening a connection versus opening a file is much longer:
# file
file = open(path, 'rb')
# non-blocking socket
connection = socket.socket()
connection.setblocking(False)
# open without blocking - retry on failure
try:
    connection.connect((url, port))
except BlockingIOError:
    pass
Long story short, what remains is a few dozen lines of Exception handling. The events and event loop already work at this point.
id background round 1
read localhost:25000 at 1530783569
read /dev/urandom at 1530783569
done localhost:25000 at 1530783569 got 32768 B
id background round 2
id background round 3
id background round 4
done /dev/urandom at 1530783569 got 4096 B
id background round 5
Addendum
Example code at github

What is asyncio?
Asyncio stands for asynchronous input output and refers to a programming paradigm which achieves high concurrency using a single thread or event loop.
Asynchronous programming is a type of parallel programming in which a unit of work is allowed to run separately from the primary application thread. When the work is complete, it notifies the main thread about completion or failure of the worker thread.
Let's understand asyncio with an example:
To understand the concept behind asyncio, let’s consider a restaurant with a single waiter. Suddenly, three customers, A, B and C show up. The three of them take a varying amount of time to decide what to eat once they receive the menu from the waiter.
Let’s assume A takes 5 minutes, B 10 minutes and C 1 minute to decide. If the single waiter starts with B first and takes B's order in 10 minutes, next he serves A and spends 5 minutes on noting down his order and finally spends 1 minute to know what C wants to eat.
So, in total, the waiter spends 10 + 5 + 1 = 16 minutes taking their orders. However, notice that in this sequence of events, C ends up waiting 15 minutes before the waiter gets to him, A waits 10 minutes and B waits 0 minutes.
Now consider if the waiter knew the time each customer would take to decide. He can hand out the menus, then return to C first, then to A and finally to B, just as each of them finishes deciding. This way each customer would experience a 0 minute wait.
An illusion of three waiters, one dedicated to each customer is created even though there’s only one.
Lastly, the total time it takes for the waiter to take all three orders is 10 minutes, much less than the 16 minutes in the other scenario.
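The restaurant can be simulated in a few lines. This is only a sketch: the customer and waiter coroutines are made-up names, and one "minute" is scaled down to 10 milliseconds so that the script finishes quickly.

```python
import asyncio


async def customer(name, minutes, orders):
    # one simulated "minute" is 10 ms here, an arbitrary scale
    await asyncio.sleep(minutes * 0.01)
    orders.append(name)  # the waiter notes the order once the customer decided


async def waiter():
    orders = []
    # all three customers decide at the same time
    await asyncio.gather(
        customer("A", 5, orders),
        customer("B", 10, orders),
        customer("C", 1, orders),
    )
    return orders


orders = asyncio.run(waiter())
print(orders)  # ['C', 'A', 'B']
```

The orders arrive as C, A, B, and the whole run takes roughly as long as the slowest customer rather than the sum of all three.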
Let's go through another example:
Suppose, Chess master Magnus Carlsen hosts a chess exhibition in which he plays with multiple amateur players. He has two ways of conducting the exhibition: synchronously and asynchronously.
Assumptions:
24 opponents
Magnus Carlsen makes each chess move in 5 seconds
Opponents each take 55 seconds to make a move
Games average 30 pair-moves (60 moves total)
Synchronously: Magnus Carlsen plays one game at a time, never two at the same time, until the game is complete. Each game takes (55 + 5) * 30 == 1800 seconds, or 30 minutes. The entire exhibition takes 24 * 30 == 720 minutes, or 12 hours.
Asynchronously: Magnus Carlsen moves from table to table, making one move at each table. He leaves the table and lets the opponent make their next move during the wait time. One move on all 24 games takes him 24 * 5 == 120 seconds, or 2 minutes. The entire exhibition is now cut down to 120 * 30 == 3600 seconds, or just 1 hour.
There is only one Magnus Carlsen, who has only two hands and makes only one move at a time by himself. But playing asynchronously cuts the exhibition time down from 12 hours to one.
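The arithmetic can be checked directly. The variable names below are just labels for the assumptions listed above; the asynchronous figure relies on one full round (120 seconds) being longer than an opponent's 55 seconds of thinking, so nobody keeps the master waiting.

```python
opponents = 24
master_move = 5      # seconds per move by the master
opponent_move = 55   # seconds per move by an opponent
pair_moves = 30      # pair-moves per game

# synchronous: every game is played to the end before the next one starts
sync_seconds = opponents * (master_move + opponent_move) * pair_moves

# asynchronous: one round over all boards, opponents think in the meantime
round_seconds = opponents * master_move
async_seconds = round_seconds * pair_moves

print(sync_seconds / 3600, "hours vs", async_seconds / 3600, "hours")
```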
Coding Example:
Let's try to demonstrate synchronous and asynchronous execution time using code snippets.
Asynchronous - async_count.py
import asyncio
import time

async def count():
    print("One", end=" ")
    await asyncio.sleep(1)
    print("Two", end=" ")
    await asyncio.sleep(2)
    print("Three", end=" ")

async def main():
    await asyncio.gather(count(), count(), count(), count(), count())

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(f"\nExecuting - {__file__}\nExecution Starts: {start_time}\nExecutions Ends: {end_time}\nTotals Execution Time:{execution_time:0.2f} seconds.")
Asynchronous - Output:
One One One One One Two Two Two Two Two Three Three Three Three Three
Executing - async_count.py
Execution Starts: 18453.442160108
Executions Ends: 18456.444719712
Totals Execution Time:3.00 seconds.
Synchronous - sync_count.py
import time

def count():
    print("One", end=" ")
    time.sleep(1)
    print("Two", end=" ")
    time.sleep(2)
    print("Three", end=" ")

def main():
    for _ in range(5):
        count()

if __name__ == "__main__":
    start_time = time.perf_counter()
    main()
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(f"\nExecuting - {__file__}\nExecution Starts: {start_time}\nExecutions Ends: {end_time}\nTotals Execution Time:{execution_time:0.2f} seconds.")
Synchronous - Output:
One Two Three One Two Three One Two Three One Two Three One Two Three
Executing - sync_count.py
Execution Starts: 18875.175965998
Executions Ends: 18890.189930292
Totals Execution Time:15.01 seconds.
Why use asyncio instead of multithreading in Python?
It’s very difficult to write code that is thread safe. With asynchronous code, you know exactly where the code will shift from one task to the next and race conditions are much harder to come by.
Threads consume a fair amount of memory since each thread needs to have its own stack. With async code, all the code shares the same stack and the stack is kept small due to continuously unwinding the stack between tasks.
Threads are OS structures and therefore require more memory for the platform to support. There is no such problem with asynchronous tasks.
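A rough illustration of these points, not a benchmark: the sketch below starts ten thousand tasks that all update a shared counter without any lock. The tick and main names and the task count are arbitrary choices for this example.

```python
import asyncio


async def tick(counter):
    await asyncio.sleep(0.01)  # the only point where another task may run
    # no lock needed: nothing can interleave between reading and writing here
    counter[0] += 1


async def main(n=10_000):
    counter = [0]
    await asyncio.gather(*(tick(counter) for _ in range(n)))
    return counter[0]


total = asyncio.run(main())
print(total)  # 10000
```

Because a task can only be switched out at an await, the increment is safe without synchronization, and all of the tasks run on a single thread.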
How does asyncio work?
Before going deep let's recall Python Generator
Python Generator:
Functions containing a yield statement are compiled as generators. Using a yield expression in a function’s body causes that function to be a generator. These functions return an object which supports the iteration protocol methods. The generator object created automatically receives a __next__() method. We can invoke __next__ directly on the generator object instead of using next():
def asynchronous():
    yield "Educative"

if __name__ == "__main__":
    gen = asynchronous()
    # named ``value`` rather than ``str`` to avoid shadowing the built-in
    value = gen.__next__()
    print(value)
Remember the following about generators:
Generator functions allow you to procrastinate computing expensive values. You only compute the next value when required. This makes generators memory and compute efficient; they refrain from saving long sequences in memory or doing all expensive computations upfront.
Generators, when suspended, retain the code location, which is the last yield statement executed, and their entire local scope. This allows them to resume execution from where they left off.
Generator objects are nothing more than iterators.
Remember to make a distinction between a generator function and the associated generator object which are often used interchangeably. A generator function when invoked returns a generator object and next() is invoked on the generator object to run the code within the generator function.
States of a generator:
A generator goes through the following states:
GEN_CREATED when a generator object has been returned for the first time from a generator function and iteration hasn’t started.
GEN_RUNNING when next has been invoked on the generator object and is being executed by the python interpreter.
GEN_SUSPENDED when a generator is suspended at a yield
GEN_CLOSED when a generator has completed execution or has been closed.
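These states can be observed with inspect.getgeneratorstate from the standard library. In this small sketch, the numbers generator and the states list are illustrative names; the generator records its own state while it is executing.

```python
import inspect


def numbers():
    # while this body executes, the generator reports GEN_RUNNING
    states.append(inspect.getgeneratorstate(gen))
    yield 1


states = []
gen = numbers()
states.append(inspect.getgeneratorstate(gen))  # before the first next()
next(gen)
states.append(inspect.getgeneratorstate(gen))  # paused at the yield
gen.close()
states.append(inspect.getgeneratorstate(gen))  # after close()

print(states)
# ['GEN_CREATED', 'GEN_RUNNING', 'GEN_SUSPENDED', 'GEN_CLOSED']
```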
Methods on generator objects:
A generator object exposes different methods that can be invoked to manipulate the generator. These are:
throw()
send()
close()
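A short sketch of the three methods in action, using a made-up echo generator: send() delivers a value to the paused yield, throw() raises an exception at that point, and close() raises GeneratorExit inside the generator so that cleanup code can run.

```python
def echo():
    received = []
    try:
        while True:
            try:
                value = yield len(received)
            except ValueError:
                received.append("error handled")  # raised via throw()
            else:
                received.append(value)            # delivered via send()
    finally:
        # runs when close() raises GeneratorExit at the yield
        log.extend(received)


log = []
gen = echo()
next(gen)                          # advance to the first yield
a = gen.send("hello")              # resumes, loops, yields 1
b = gen.throw(ValueError("boom"))  # handled inside, yields 2
gen.close()

print(a, b, log)  # 1 2 ['hello', 'error handled']
```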
Let's dive into a more detailed explanation.
The rules of asyncio:
The syntax async def introduces either a native coroutine or an asynchronous generator. The expressions async with and async for are also valid.
The keyword await passes function control back to the event loop. (It suspends the execution of the surrounding coroutine.) If Python encounters an await f() expression in the scope of g(), this is how await tells the event loop, "Suspend execution of g() until whatever I’m waiting on—the result of f()—is returned. In the meantime, go let something else run."
In code, that second bullet point looks roughly like this:
async def g():
    # Pause here and come back to g() when f() is ready
    r = await f()
    return r
There's also a strict set of rules around when and how you can and cannot use async/await. These can be handy whether you are still picking up the syntax or already have exposure to using async/await:
A function that you introduce with async def is a coroutine. It may use await, return, or yield, but all of these are optional. Declaring async def noop(): pass is valid:
Using await and/or return creates a coroutine function. To call a coroutine function, you must await it to get its results.
It is less common to use yield in an async def block. This creates an asynchronous generator, which you iterate over with async for. Forget about async generators for the time being and focus on getting down the syntax for coroutine functions, which use await and/or return.
Anything defined with async def may not use yield from, which will raise a SyntaxError.
Just like it’s a SyntaxError to use yield outside of a def function, it is a SyntaxError to use await outside of an async def coroutine. You can only use await in the body of coroutines.
Here are some terse examples meant to summarize the above few rules:
async def f(x):
    y = await z(x)  # OK - `await` and `return` allowed in coroutines
    return y

async def g(x):
    yield x  # OK - this is an async generator

async def m(x):
    yield from gen(x)  # NO - SyntaxError

def m(x):
    y = await z(x)  # NO - SyntaxError (no `async def` here)
    return y
Generator Based Coroutine
Python created a distinction between Python generators and generators that were meant to be used as coroutines. These coroutines are called generator-based coroutines and require the decorator @asyncio.coroutine to be added to the function definition, though this isn’t strictly enforced. Note that this decorator was deprecated in Python 3.8 and removed in Python 3.11, so this style only applies to older code.
Generator based coroutines use yield from syntax instead of yield. A coroutine can:
yield from another coroutine
yield from a future
return an expression
raise exception
Coroutines in Python make cooperative multitasking possible.
Cooperative multitasking is the approach in which the running process voluntarily gives up the CPU to other processes. A process may do so when it is logically blocked, say while waiting for user input or when it has initiated a network request and will be idle for a while.
A coroutine can be defined as a special function that can give up control to its caller without losing its state.
So what’s the difference between coroutines and generators?
Generators are essentially iterators though they look like functions. The distinction between generators and coroutines, in general, is that:
Generators yield back a value to the invoker whereas a coroutine yields control to another coroutine and can resume execution from the point it gives up control.
A generator can’t accept arguments once started whereas a coroutine can.
Generators are primarily used to simplify writing iterators. They are a type of coroutine and are sometimes also called semicoroutines.
Generator Based Coroutine Example
The simplest generator based coroutine we can write is as follows:
import asyncio

@asyncio.coroutine
def do_something_important():
    yield from asyncio.sleep(1)
The coroutine sleeps for one second. Note the decorator and the use of yield from.
Native Based Coroutine Example
By native it is meant that the language introduced syntax to specifically define coroutines, making them first class citizens in the language. Native coroutines can be defined using the async/await syntax.
The simplest native based coroutine we can write is as follows:
async def do_something_important():
    await asyncio.sleep(1)
AsyncIO Design Patterns
AsyncIO comes with its own set of possible script designs, which we will discuss in this section.
1. Event loops
The event loop is a programming construct that waits for events to happen and then dispatches them to an event handler. An event can be a user clicking on a UI button or a process initiating a file download. At the core of asynchronous programming, sits the event loop.
Example Code:
import asyncio
import random
import time
from threading import Thread
from threading import current_thread
# ANSI colors
colors = (
"\033[0m", # End of color
"\033[31m", # Red
"\033[32m", # Green
"\033[34m", # Blue
)
async def do_something_important(sleep_for):
    print(colors[1] + f"Is event loop running in thread {current_thread().name} = {asyncio.get_event_loop().is_running()}" + colors[0])
    await asyncio.sleep(sleep_for)

def launch_event_loops():
    # get a new event loop
    loop = asyncio.new_event_loop()
    # set the event loop for the current thread
    asyncio.set_event_loop(loop)
    # run a coroutine on the event loop
    loop.run_until_complete(do_something_important(random.randint(1, 5)))
    # remember to close the loop
    loop.close()

if __name__ == "__main__":
    thread_1 = Thread(target=launch_event_loops)
    thread_2 = Thread(target=launch_event_loops)
    start_time = time.perf_counter()
    thread_1.start()
    thread_2.start()
    print(colors[2] + f"Is event loop running in thread {current_thread().name} = {asyncio.get_event_loop().is_running()}" + colors[0])
    thread_1.join()
    thread_2.join()
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Event Loop Start Time: {start_time}\nEvent Loop End Time: {end_time}\nEvent Loop Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_event_loop.py
Try it out yourself and examine the output and you’ll realize that each spawned thread is running its own event loop.
Types of event loops
There are two types of event loops:
SelectorEventLoop: SelectorEventLoop is based on the selectors module and is the default loop on Unix.
ProactorEventLoop: ProactorEventLoop is based on Windows’ I/O Completion Ports and is only supported on Windows, where it has been the default loop since Python 3.8.
2. Futures
Future represents a computation that is either in progress or will get scheduled in the future. It is a special low-level awaitable object that represents an eventual result of an asynchronous operation. Don’t confuse concurrent.futures.Future and asyncio.Future.
Example Code:
import time
import asyncio
from asyncio import Future
# ANSI colors
colors = (
"\033[0m", # End of color
"\033[31m", # Red
"\033[32m", # Green
"\033[34m", # Blue
)
async def bar(future):
    print(colors[1] + "bar will sleep for 3 seconds" + colors[0])
    await asyncio.sleep(3)
    print(colors[1] + "bar resolving the future" + colors[0])
    # set_result() is what resolves the future; done() only queries its state
    future.set_result("future is resolved")

async def foo(future):
    print(colors[2] + "foo will await the future" + colors[0])
    await future
    print(colors[2] + "foo finds the future resolved" + colors[0])

async def main():
    future = Future()
    await asyncio.gather(foo(future), bar(future))

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Future Start Time: {start_time}\nFuture End Time: {end_time}\nFuture Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_futures.py
Both coroutines are passed a future. The foo() coroutine waits for the future to be resolved, while the bar() coroutine resolves the future after three seconds.
3. Tasks
Tasks are like futures, in fact, Task is a subclass of Future and can be created using the following methods:
asyncio.create_task() accepts coroutines and wraps them as tasks.
loop.create_task() only accepts coroutines.
asyncio.ensure_future() accepts futures, coroutines and any awaitable objects.
Tasks wrap coroutines and run them in event loops. If a coroutine awaits on a Future, the Task suspends the execution of the coroutine and waits for the Future to complete. When the Future is done, the execution of the wrapped coroutine resumes.
Example Code:
import time
import asyncio
from asyncio import Future
# ANSI colors
colors = (
"\033[0m", # End of color
"\033[31m", # Red
"\033[32m", # Green
"\033[34m", # Blue
)
async def bar(future):
    print(colors[1] + "bar will sleep for 3 seconds" + colors[0])
    await asyncio.sleep(3)
    print(colors[1] + "bar resolving the future" + colors[0])
    # set_result() is what resolves the future; done() only queries its state
    future.set_result("future is resolved")

async def foo(future):
    print(colors[2] + "foo will await the future" + colors[0])
    await future
    print(colors[2] + "foo finds the future resolved" + colors[0])

async def main():
    future = Future()
    loop = asyncio.get_running_loop()
    t1 = loop.create_task(bar(future))
    t2 = loop.create_task(foo(future))
    # await both tasks; ``await t2, t1`` would await t2 only
    await t2
    await t1

if __name__ == "__main__":
    start_time = time.perf_counter()
    asyncio.run(main())
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[3] + f"Future Start Time: {start_time}\nFuture End Time: {end_time}\nFuture Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_tasks.py
4. Chaining Coroutines:
A key feature of coroutines is that they can be chained together. A coroutine object is awaitable, so another coroutine can await it. This allows you to break programs into smaller, manageable, recyclable coroutines:
Example Code:
import sys
import asyncio
import random
import time
# ANSI colors
colors = (
"\033[0m", # End of color
"\033[31m", # Red
"\033[32m", # Green
"\033[36m", # Cyan
"\033[34m", # Blue
)
async def function1(n: int) -> str:
    i = random.randint(0, 10)
    print(colors[1] + f"function1({n}) is sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)
    result = f"result{n}-1"
    print(colors[1] + f"Returning function1({n}) == {result}." + colors[0])
    return result

async def function2(n: int, arg: str) -> str:
    i = random.randint(0, 10)
    print(colors[2] + f"function2{n, arg} is sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)
    result = f"result{n}-2 derived from {arg}"
    print(colors[2] + f"Returning function2{n, arg} == {result}." + colors[0])
    return result

async def chain(n: int) -> None:
    start = time.perf_counter()
    p1 = await function1(n)
    p2 = await function2(n, p1)
    end = time.perf_counter() - start
    print(colors[3] + f"--> Chained result{n} => {p2} (took {end:0.2f} seconds)." + colors[0])

async def main(*args):
    await asyncio.gather(*(chain(n) for n in args))

if __name__ == "__main__":
    random.seed(444)
    args = [1, 2, 3] if len(sys.argv) == 1 else map(int, sys.argv[1:])
    start_time = time.perf_counter()
    asyncio.run(main(*args))
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[4] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Pay careful attention to the output, where function1() sleeps for a variable amount of time, and function2() begins working with the results as they become available:
Execution Command: python async_chained.py 11 8 5
5. Using a Queue:
In this design, there is no chaining of any individual consumer to a producer. The consumers don’t know the number of producers, or even the cumulative number of items that will be added to the queue, in advance.
It takes an individual producer or consumer a variable amount of time to put and extract items from the queue, respectively. The queue serves as a throughput that can communicate with the producers and consumers without them talking to each other directly.
Example Code:
import asyncio
import argparse
import itertools as it
import os
import random
import time
# ANSI colors
colors = (
"\033[0m", # End of color
"\033[31m", # Red
"\033[32m", # Green
"\033[36m", # Cyan
"\033[34m", # Blue
)
async def generate_item(size: int = 5) -> str:
    return os.urandom(size).hex()

async def random_sleep(caller=None) -> None:
    i = random.randint(0, 10)
    if caller:
        print(colors[1] + f"{caller} sleeping for {i} seconds." + colors[0])
    await asyncio.sleep(i)

async def produce(name: int, producer_queue: asyncio.Queue) -> None:
    n = random.randint(0, 10)
    for _ in it.repeat(None, n):  # Synchronous loop for each single producer
        await random_sleep(caller=f"Producer {name}")
        i = await generate_item()
        t = time.perf_counter()
        await producer_queue.put((i, t))
        print(colors[2] + f"Producer {name} added <{i}> to queue." + colors[0])

async def consume(name: int, consumer_queue: asyncio.Queue) -> None:
    while True:
        await random_sleep(caller=f"Consumer {name}")
        i, t = await consumer_queue.get()
        now = time.perf_counter()
        print(colors[3] + f"Consumer {name} got element <{i}>" f" in {now - t:0.5f} seconds." + colors[0])
        consumer_queue.task_done()

async def main(no_producer: int, no_consumer: int):
    q = asyncio.Queue()
    producers = [asyncio.create_task(produce(n, q)) for n in range(no_producer)]
    consumers = [asyncio.create_task(consume(n, q)) for n in range(no_consumer)]
    await asyncio.gather(*producers)
    await q.join()  # Implicitly awaits consumers, too
    for consumer in consumers:
        consumer.cancel()

if __name__ == "__main__":
    random.seed(444)
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--no_producer", type=int, default=10)
    parser.add_argument("-c", "--no_consumer", type=int, default=15)
    ns = parser.parse_args()
    start_time = time.perf_counter()
    asyncio.run(main(**ns.__dict__))
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[4] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_queue.py -p 2 -c 4
Output:
Lastly, here is an example of how asyncio cuts down on wait time. Given a coroutine generate_random_int() that keeps producing random integers in the range [0, 10] until one of them exceeds a threshold, you want multiple calls of this coroutine to run without waiting for each other to complete in succession.
Example Code:
import time
import asyncio
import random

# ANSI colors
colors = (
    "\033[0m",   # End of color
    "\033[31m",  # Red
    "\033[32m",  # Green
    "\033[36m",  # Cyan
    "\033[35m",  # Magenta
    "\033[34m",  # Blue
)

async def generate_random_int(indx: int, threshold: int = 5) -> int:
    print(colors[indx + 1] + f"Initiated generate_random_int({indx}).")
    i = random.randint(0, 10)
    while i <= threshold:
        print(colors[indx + 1] + f"generate_random_int({indx}) == {i} too low; retrying.")
        await asyncio.sleep(indx + 1)
        i = random.randint(0, 10)
    print(colors[indx + 1] + f"---> Finished: generate_random_int({indx}) == {i}" + colors[0])
    return i

async def main():
    res = await asyncio.gather(*(generate_random_int(i, 10 - i - 1) for i in range(3)))
    return res

if __name__ == "__main__":
    random.seed(444)
    start_time = time.perf_counter()
    r1, r2, r3 = asyncio.run(main())
    print(colors[4] + f"\nRandom INT 1: {r1}, Random INT 2: {r2}, Random INT 3: {r3}\n" + colors[0])
    end_time = time.perf_counter()
    execution_time = end_time - start_time
    print(colors[5] + f"Program Start Time: {start_time}\nProgram End Time: {end_time}\nProgram Execution Time: {execution_time:0.2f} seconds." + colors[0])
Execution Command: python async_random.py
Output:
Note: If you’re writing any code yourself, prefer native coroutines for the sake of being explicit rather than implicit. Generator-based coroutines were deprecated in Python 3.8, and the @asyncio.coroutine decorator was removed in Python 3.11.
GitHub Repo: https://github.com/tssovi/asynchronous-in-python

Your coro desugaring is conceptually correct, but slightly incomplete.
await doesn't suspend unconditionally, but only if it encounters a blocking call. How does it know that a call is blocking? This is decided by the code being awaited. For example, an awaitable implementation of socket read could be desugared to:
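def read(sock, n):
    # sock must be in non-blocking mode
    try:
        return sock.recv(n)
    except BlockingIOError:  # recv() would block (EWOULDBLOCK / EAGAIN)
        # Pseudocode: ask the loop to wake the current task once sock is readable
        event_loop.add_reader(sock.fileno(), current_task())
        return SUSPEND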
In real asyncio the equivalent code modifies the state of a Future instead of returning magic values, but the concept is the same. When appropriately adapted to a generator-like object, the above code can be awaited.
On the caller side, when your coroutine contains:
data = await read(sock, 1024)
It desugars into something close to:
data = read(sock, 1024)
if data is SUSPEND:
    return SUSPEND
self.pos += 1
self.parts[self.pos](...)
People familiar with generators tend to describe the above in terms of yield from which does the suspension automatically.
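That await suspends only when the awaited code actually has to wait can be checked with a small experiment. This is a sketch with made-up names (no_suspend, other, order); it relies only on asyncio.create_task, asyncio.sleep and asyncio.run.

```python
import asyncio

async def no_suspend():
    # Nothing in here has to wait, so awaiting it never returns control to the loop
    return 42

async def main():
    order = []

    async def other():
        order.append("other ran")

    task = asyncio.create_task(other())  # scheduled, but not started yet
    value = await no_suspend()           # completes without suspending main()
    order.append("after no_suspend")
    await asyncio.sleep(0)               # a real suspension point: the loop runs other()
    order.append("after sleep(0)")
    await task
    return value, order

value, order = asyncio.run(main())
print(order)  # ['after no_suspend', 'other ran', 'after sleep(0)']
```

If await suspended unconditionally, other() would have had a chance to run during await no_suspend(); it only runs once main() reaches asyncio.sleep(0).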
The suspension chain continues all the way up to the event loop, which notices that the coroutine is suspended, removes it from the runnable set, and goes on to execute coroutines that are runnable, if any. If no coroutines are runnable, the loop waits in select() until either a file descriptor a coroutine is interested in becomes ready for IO or a timeout expires. (The event loop maintains a file-descriptor-to-coroutine mapping.)
In the above example, once select() tells the event loop that sock is readable, it will re-add coro to the runnable set, so it will be continued from the point of suspension.
In other words:
Everything happens in the same thread by default.
The event loop is responsible for scheduling the coroutines and waking them up when whatever they were waiting for (typically an IO call that would normally block, or a timeout) becomes ready.
For insight on coroutine-driving event loops, I recommend this talk by Dave Beazley, where he demonstrates coding an event loop from scratch in front of a live audience.
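To make the "runnable set" idea concrete, here is a toy round-robin scheduler that drives native coroutines by hand. It is only a sketch: there is no select(), no IO and no timers, and the names suspend, worker and run_all are invented for this illustration. It is not how asyncio is implemented, but the send()/StopIteration mechanics are the ones a real loop builds on.

```python
import types
from collections import deque

@types.coroutine
def suspend():
    # A bare yield in a generator-based awaitable hands control
    # back to whoever is driving the coroutine (our toy loop)
    yield

async def worker(name, steps, log):
    for i in range(steps):
        log.append(f"{name}:{i}")
        await suspend()  # suspension point: the loop may run someone else
    return name

def run_all(coros):
    runnable = deque(coros)  # the "runnable set"
    finished = []
    while runnable:
        coro = runnable.popleft()
        try:
            coro.send(None)  # run until the next suspension point
        except StopIteration as stop:
            finished.append(stop.value)  # coroutine returned
        else:
            runnable.append(coro)  # still alive: schedule it again
    return finished

log = []
finished = run_all([worker("a", 2, log), worker("b", 3, log)])
print(log)       # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
print(finished)  # ['a', 'b']
```

A real event loop differs mainly in that a suspended coroutine is not put straight back into the runnable set; it is parked until the file descriptor or timeout it is waiting for becomes ready.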

Picture an airport control tower with many planes waiting to land on the same runway. The control tower is the event loop and the runway is the thread. Each plane is a separate function waiting to execute. Only one plane can actually be on the runway at any moment. What asyncio does is let many planes share that one runway: when a function reaches an await, that plane (function) can be suspended, and the event loop lets other functions run in the meantime.

Python asyncio, possible to await / yield entire myFunction()

I've written a library of objects, many which make HTTP / IO calls. I've been looking at moving over to asyncio due to the mounting overheads, but I don't want to rewrite the underlying code.
I've been hoping to wrap asyncio around my code in order to perform functions asynchronously without replacing all of my deep / low level code with await / yield.
I began by attempting the following:
async def my_function1(some_object, some_params):
    # Lots of existing code which uses existing objects
    # No await statements
    return output_data

async def my_function2():
    # Does more stuff

while True:
    loop = asyncio.get_event_loop()
    tasks = my_function1(some_object, some_params), my_function2()
    output_data = loop.run_until_complete(asyncio.gather(*tasks))
    print(output_data)
I quickly realised that while this code runs, nothing actually happens asynchronously; the functions complete synchronously. I'm very new to asynchronous programming, but I think this is because neither of my functions uses the keyword await or yield, and thus these functions are not coroutines, do not yield, and so do not provide an opportunity to move to a different coroutine. Please correct me if I am wrong.
My question is, is it possible to wrap complex functions (where deep within they make HTTP / IO calls ) in an asyncio await keyword, e.g.
async def my_function():
    print("Welcome to my function")
    data = await bigSlowFunction()
UPDATE - Following Karlson's Answer
Thanks to Karlson's accepted answer, I used the following code, which works nicely:
from concurrent.futures import ThreadPoolExecutor
import time

# Some vars
a_var_1 = 0
a_var_2 = 10

pool = ThreadPoolExecutor(3)
future = pool.submit(my_big_function, object, a_var_1, a_var_2)
while not future.done():
    print("Waiting for future...")
    time.sleep(0.01)
print("Future done")
print(future.result())
This works really nicely, and the future.done() / sleep loop gives you an idea of how many CPU cycles you get to use by going async.

The short answer is, you can't have the benefits of asyncio without explicitly marking the points in your code where control may be passed back to the event loop. This is done by turning your IO heavy functions into coroutines, just like you assumed.
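A small experiment makes the difference visible. This is only a sketch: blocking and cooperative are made-up stand-ins for your own functions, with time.sleep() playing the role of existing blocking IO code and asyncio.sleep() playing the role of code that has been converted to await.

```python
import asyncio
import time

async def blocking(name, log):
    log.append(f"{name} start")
    time.sleep(0.05)           # blocks the whole thread; the loop never gets control
    log.append(f"{name} end")

async def cooperative(name, log):
    log.append(f"{name} start")
    await asyncio.sleep(0.05)  # marks a point where the loop may switch tasks
    log.append(f"{name} end")

async def run_pair(worker):
    log = []
    await asyncio.gather(worker("a", log), worker("b", log))
    return log

blocking_log = asyncio.run(run_pair(blocking))
cooperative_log = asyncio.run(run_pair(cooperative))
print(blocking_log)     # a runs start to end before b even starts
print(cooperative_log)  # b starts while a is still waiting
```

Both versions are declared with async def, yet only the second one interleaves, because only it contains an await that actually yields to the event loop.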
Without changing existing code you might achieve your goal with greenlets (have a look at eventlet or gevent).
Another possibility is to use Python's Future implementation: hand calls to your already written functions to a ThreadPoolExecutor and await the resulting future through the event loop. Be aware that this comes with all the caveats of multi-threaded programming, though.
Something along the lines of
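import asyncio
from concurrent.futures import ThreadPoolExecutor
from thinair import big_slow_function  # placeholder for your existing blocking function

executor = ThreadPoolExecutor(max_workers=5)

async def big_slow_coroutine():
    # A concurrent.futures.Future is not awaitable by itself;
    # run_in_executor() returns a future the event loop can await
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, big_slow_function)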

As of Python 3.9 you can wrap a blocking (non-async) function in a coroutine to make it awaitable using asyncio.to_thread(). The example given in the official documentation is:
import asyncio
import time

def blocking_io():
    print(f"start blocking_io at {time.strftime('%X')}")
    # Note that time.sleep() can be replaced with any blocking
    # IO-bound operation, such as file operations.
    time.sleep(1)
    print(f"blocking_io complete at {time.strftime('%X')}")

async def main():
    print(f"started main at {time.strftime('%X')}")
    await asyncio.gather(
        asyncio.to_thread(blocking_io),
        asyncio.sleep(1))
    print(f"finished main at {time.strftime('%X')}")

asyncio.run(main())
# Expected output:
#
# started main at 19:50:53
# start blocking_io at 19:50:53
# blocking_io complete at 19:50:54
# finished main at 19:50:54
This seems like a more joined up approach than using concurrent.futures to make a coroutine, but I haven't tested it extensively.
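For the situation in the question (existing functions that take arguments and return data), the same call also forwards positional and keyword arguments and hands back the return value. A minimal sketch, where blocking_fetch is a hypothetical stand-in for one of your existing blocking functions (requires Python 3.9+):

```python
import asyncio
import time

def blocking_fetch(item_id, delay=0.05):
    # Stand-in for existing blocking code (HTTP call, file IO, ...)
    time.sleep(delay)
    return item_id * 2

async def main():
    # Each call runs in a worker thread; gather() returns results in call order
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_fetch, i, delay=0.05) for i in range(3))
    )

results = asyncio.run(main())
print(results)  # [0, 2, 4]
```

As with the ThreadPoolExecutor approach, the wrapped functions run in real threads, so the usual thread-safety caveats apply to any state they share.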
