How to make Python async functions callable in Jupyter without await

I've written a Python library that currently makes several independent HTTP requests in serial. I'd like to parallelize these requests without altering how the library is called by users, or requiring users to be aware that calls are being made asynchronously under the hood. The library is meant for novice/intermediate Python users mostly using Jupyter, and I'd like it to work without introducing them to unfamiliar async/await semantics.
The following example, which works in Jupyter, illustrates what I'd like to achieve but requires use of await to invoke the code on the final line:
import asyncio
async def first_request():
    await asyncio.sleep(2)  # Simulate request time
    return "First request response"
async def second_request():
    await asyncio.sleep(2)
    return "Second request response"
async def make_requests_in_parallel():
    """Make requests in parallel and return the responses."""
    return await asyncio.gather(first_request(), second_request())
results = await make_requests_in_parallel()  # Undesirable use of `await`
I've found previous answers describing how to call async code from synchronous code using asyncio.run(). In the Jupyter example above, I can replace the final line with the following to create a working, importable Python module:
def main():
    """Make results available to async-naive users"""
    return asyncio.run(make_requests_in_parallel())
results = main()  # No `await` needed to get results -- good!
This seems to be what I want. However, in Jupyter, the code will produce an error:
RuntimeError: asyncio.run() cannot be called from a running event loop
A comment on the same answer above explains that because Jupyter runs its own async event loop, there is no need (or, apparently, option) to start another one, so async code can "simply" be called using await. In my situation, though, avoiding await is why I wanted to use asyncio.run() in the first place.
This seems to suggest that existing synchronous libraries cannot, by any means, internally parallelize any operation using asyncio without altering their public API to require use of await. Is this true?
If so, are there more practical alternatives to asyncio that would let me parallelize a group of requests in an internal function without educating my users about async/await?

I found a great solution for this: nest_asyncio.
Once installed, the working solution in Jupyter is as follows:
import asyncio
import nest_asyncio
nest_asyncio.apply()
async def first_request():
    await asyncio.sleep(2)  # Simulate request time
    return "First request response"
async def second_request():
    await asyncio.sleep(2)
    return "Second request response"
async def make_requests_in_parallel():
    """Make requests in parallel and return the responses."""
    return await asyncio.gather(first_request(), second_request())
def main():
    """Make results available to async-naive users"""
    return asyncio.run(make_requests_in_parallel())
results = main()  # No `await` needed to get results
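If patching the running loop with nest_asyncio is not an option, another approach that is sometimes used is to run the coroutine in a short-lived worker thread, which gets an event loop of its own. The sketch below is only an illustration under stated assumptions, not part of the answer above: run_sync is a hypothetical helper name, and the sleeps are shortened to keep the demo quick. Note that it blocks the calling thread (and therefore Jupyter's own loop) until the result is ready.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


async def first_request():
    await asyncio.sleep(0.1)  # Simulate request time (shortened for the demo)
    return "First request response"


async def second_request():
    await asyncio.sleep(0.1)
    return "Second request response"


async def make_requests_in_parallel():
    """Make requests in parallel and return the responses."""
    return await asyncio.gather(first_request(), second_request())


def run_sync(coro):
    """Hypothetical helper: run `coro` to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread (plain script): asyncio.run() works.
        return asyncio.run(coro)
    # A loop is already running here (e.g. Jupyter): hand the coroutine to a
    # worker thread, which has no running loop, and block until it finishes.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


def main():
    """Make results available to async-naive users."""
    return run_sync(make_requests_in_parallel())


results = main()  # No `await` needed to get results
print(results)
```

The trade-off is that the caller's event loop cannot do anything else while main() is waiting, which is usually acceptable for a notebook cell but not inside a server.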

Related

How to perform a synchronous task in an asynchronous FastAPI REST endpoint without blocking the event loop? [duplicate]

I have the following code:
import time
from fastapi import FastAPI, Request
app = FastAPI()
@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}
If I run my code on localhost - e.g., http://localhost:8501/ping - in different tabs of the same browser window, I get:
Hello
bye
Hello
bye
instead of:
Hello
Hello
bye
bye
I have read about using httpx, but I still cannot get true parallelization. What's the problem?
As per FastAPI's documentation:
When you declare a path operation function with normal def instead
of async def, it is run in an external threadpool that is then
awaited, instead of being called directly (as it would block the
server).
also, as described here:
If you are using a third party library that communicates with
something (a database, an API, the file system, etc.) and doesn't have
support for using await, (this is currently the case for most
database libraries), then declare your path operation functions as
normally, with just def.
If your application (somehow) doesn't have to communicate with
anything else and wait for it to respond, use async def.
If you just don't know, use normal def.
Note: You can mix def and async def in your path operation functions as much as you need and define each one using the best
option for you. FastAPI will do the right thing with them.
Anyway, in any of the cases above, FastAPI will still work
asynchronously and be extremely fast.
But by following the steps above, it will be able to do some
performance optimizations.
Thus, def endpoints (in the context of asynchronous programming, a function defined with just def is called a synchronous function) run in a separate thread from an external threadpool, which is then awaited; hence, FastAPI still works asynchronously. In other words, the server processes requests to such endpoints concurrently. async def endpoints, on the other hand, run directly in the event loop, on the main (single) thread. The server therefore processes requests to them sequentially, unless the endpoint awaits a non-blocking I/O-bound operation, such as waiting for (1) data from the client to be sent through the network, (2) contents of a file on disk to be read, or (3) a database operation to finish (have a look here). While such an operation is being awaited, the server can process other requests concurrently/asynchronously. Note that the same concept applies not only to FastAPI endpoints, but to Background Tasks as well (see Starlette's BackgroundTask class implementation); hence, after reading this answer to the end, you should be able to decide whether to define a FastAPI endpoint or background task function with def or async def.
The keyword await (which works only within an async def function) passes control back to the event loop. In other words, it suspends the execution of the surrounding coroutine (a coroutine object is the result of calling an async def function) and tells the event loop to let something else run until the awaited task completes. Note that defining a custom function with async def and awaiting it inside your endpoint does not, by itself, make your code asynchronous: if that custom function contains, for example, calls to time.sleep(), CPU-bound tasks, non-async I/O libraries, or any other blocking call that is incompatible with asynchronous Python code, it will still block the event loop.
In FastAPI, for example, when using the async methods of UploadFile, such as await file.read() and await file.write(), FastAPI/Starlette, behind the scenes, actually runs such methods of File objects in an external threadpool (using the async run_in_threadpool() function) and awaits it, otherwise, such methods/operations would block the event loop. You can find out more by having a look at the implementation of the UploadFile class.
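The difference between a blocking call and an awaited one can be observed without a web server. The following is a minimal sketch using only the standard library; blocking_handler, cooperative_handler and time_two_calls are hypothetical names standing in for two concurrent requests to an async def endpoint.

```python
import asyncio
import time


async def blocking_handler(delay):
    time.sleep(delay)  # blocks the event loop: nothing else can run meanwhile


async def cooperative_handler(delay):
    await asyncio.sleep(delay)  # yields to the event loop while waiting


async def time_two_calls(handler, delay=0.2):
    """Run two 'requests' concurrently and return the total elapsed time."""
    start = time.perf_counter()
    await asyncio.gather(handler(delay), handler(delay))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(time_two_calls(blocking_handler))
cooperative_elapsed = asyncio.run(time_two_calls(cooperative_handler))
print(f"time.sleep():    {blocking_elapsed:.2f}s")
print(f"asyncio.sleep(): {cooperative_elapsed:.2f}s")
```

With time.sleep() the two calls run one after the other, so the total is roughly twice the delay; with asyncio.sleep() they overlap and the total stays close to a single delay.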
Asynchronous code with async and await is often summarised as using coroutines. Coroutines are collaborative (or cooperatively multitasked), meaning that "at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended" (see here and here for more info on coroutines). As described in this article:
Specifically, whenever execution of a currently-running coroutine
reaches an await expression, the coroutine may be suspended, and
another previously-suspended coroutine may resume execution if what it
was suspended on has since returned a value. Suspension can also
happen when an async for block requests the next value from an
asynchronous iterator or when an async with block is entered or
exited, as these operations use await under the hood.
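The suspension behaviour described in the quote can be made visible with a small standard-library sketch. The worker coroutine and the events list are hypothetical names used only for this illustration; await asyncio.sleep(0) serves as an explicit suspension point.

```python
import asyncio

events = []


async def worker(name):
    events.append(f"{name}: start")
    await asyncio.sleep(0)  # explicit suspension point: lets other coroutines run
    events.append(f"{name}: end")


async def main():
    # Both workers are scheduled on the same event loop (and the same thread).
    await asyncio.gather(worker("A"), worker("B"))


asyncio.run(main())
print(events)
# → ['A: start', 'B: start', 'A: end', 'B: end']
```

Worker A only gives up control at its await, which is when B gets to start; without that await, A would run to completion before B printed anything.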
If, however, a blocking I/O-bound or CPU-bound operation was directly executed/called inside an async def function/endpoint, it would block the main thread (i.e., the event loop). Hence, a blocking operation such as time.sleep() in an async def endpoint would block the entire server (as in the example provided in your question). Thus, if your endpoint is not going to make any async calls, you could declare it with just def instead, which would be run in an external threadpool that would then be awaited, as explained earlier (more solutions are given in the following sections). Example:
@app.get("/ping")
def ping(request: Request):
    #print(request.client)
    print("Hello")
    time.sleep(5)
    print("bye")
    return "pong"
Otherwise, if the functions that you had to execute inside the endpoint are async functions that you had to await, you should define your endpoint with async def. To demonstrate this, the example below uses the asyncio.sleep() function (from the asyncio library), which provides a non-blocking sleep operation. The await asyncio.sleep() method will suspend the execution of the surrounding coroutine (until the sleep operation completes), thus allowing other tasks in the event loop to run. Similar examples are given here and here as well.
import asyncio
@app.get("/ping")
async def ping(request: Request):
    #print(request.client)
    print("Hello")
    await asyncio.sleep(5)
    print("bye")
    return "pong"
Both the path operation functions above will print out the specified messages to the screen in the same order as mentioned in your question—if two requests arrived at around the same time—that is:
Hello
Hello
bye
bye
Important Note
When you call your endpoint for the second (third, and so on) time, please remember to do that from a tab that is isolated from the browser's main session; otherwise, succeeding requests (i.e., those coming after the first one) will be blocked by the browser (on the client side), as the browser will wait for the server's response to the previous request before sending the next one. You can confirm this by using print(request.client) inside the endpoint: the hostname and port number will be the same for all incoming requests if they were initiated from tabs opened in the same browser window/session. Those requests are processed sequentially because the browser sends them sequentially in the first place. To solve this, you could either:
Reload the same tab (while the request is still running), or
Open a new tab in an Incognito Window, or
Use a different browser/client to send the request, or
Use the httpx library to make asynchronous HTTP requests, along with the awaitable asyncio.gather(), which allows executing multiple asynchronous operations concurrently and then returns a list of results in the same order the awaitables (tasks) were passed to that function (have a look at this answer for more details).
Example:
import httpx
import asyncio
URLS = ['http://127.0.0.1:8000/ping'] * 2
async def send(url, client):
    return await client.get(url, timeout=10)
async def main():
    async with httpx.AsyncClient() as client:
        tasks = [send(url, client) for url in URLS]
        responses = await asyncio.gather(*tasks)
        print(*[r.json() for r in responses], sep='\n')
asyncio.run(main())
In case you had to call different endpoints that may take different time to process a request, and you would like to print the response out on client side as soon as it is returned from the server—instead of waiting for asyncio.gather() to gather the results of all tasks and print them out in the same order the tasks were passed to the send() function—you could replace the send() function of the example above with the one shown below:
async def send(url, client):
    res = await client.get(url, timeout=10)
    print(res.json())
    return res
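A different way to handle each response as soon as it is ready, instead of printing inside send(), is the standard library's asyncio.as_completed(). This is a swapped-in technique rather than what the answer above uses. The sketch below simulates the requests with asyncio.sleep() so that it runs without a server; fake_request and its delays are hypothetical.

```python
import asyncio


async def fake_request(name, delay):
    """Hypothetical stand-in for `client.get(...)`: just waits, then returns."""
    await asyncio.sleep(delay)
    return f"{name} finished after {delay}s"


async def main():
    coros = [fake_request("slow", 0.3), fake_request("fast", 0.1)]
    finished = []
    # as_completed() yields awaitables in the order in which they finish,
    # not in the order in which they were passed in.
    for next_done in asyncio.as_completed(coros):
        result = await next_done
        print(result)
        finished.append(result)
    return finished


completion_order = asyncio.run(main())
```

Here the "fast" result is printed first even though it was submitted second, whereas asyncio.gather() would have returned the results in submission order.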
Async/await and Blocking I/O-bound or CPU-bound Operations
If you are required to use async def (as you might need to await coroutines inside your endpoint), but also have some synchronous I/O-bound or CPU-bound operation (a long-running computation task) that would block the event loop (essentially, the entire server) and prevent other requests from going through, for example:
from fastapi import File, UploadFile
@app.post("/ping")
async def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = await file.read()
        res = cpu_bound_task(contents)  # this will block the event loop
    finally:
        await file.close()
    print("bye")
    return "pong"
then:
You should check whether you could change your endpoint's definition to normal def instead of async def. For example, if the only method in your endpoint that has to be awaited is the one reading the file contents (as you mentioned in the comments section below), you could instead declare the type of the endpoint's parameter as bytes (i.e., file: bytes = File()); FastAPI would then read the file for you and you would receive the contents as bytes. Hence, there would be no need to use await file.read(). Please note that this approach should work for small files, as the entire file contents would be stored in memory (see the documentation on File Parameters); if your system does not have enough RAM available to accommodate the accumulated data (if, for example, you have 8GB of RAM, you can't load a 50GB file), your application may end up crashing. Alternatively, you could call the .read() method of the SpooledTemporaryFile directly (which can be accessed through the .file attribute of the UploadFile object), so that again you don't have to await the .read() method. As you can now declare your endpoint with normal def, each request will run in a separate thread (an example is given below). For more details on how to upload a File, as well as how Starlette/FastAPI uses SpooledTemporaryFile behind the scenes, please have a look at this answer and this answer.
@app.post("/ping")
def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = file.file.read()
        res = cpu_bound_task(contents)
    finally:
        file.file.close()
    print("bye")
    return "pong"
Use FastAPI's (Starlette's) run_in_threadpool() function from the concurrency module, as @tiangolo suggested here, which "will run the function in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked" (see here). As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports both sequence arguments and keyword arguments".
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, contents)
Alternatively, use asyncio's loop.run_in_executor(), after obtaining the running event loop using asyncio.get_running_loop(), to run the task; you can then await it to complete and return the result(s) before moving on to the next line of code. If you pass None as the executor argument, the default executor is used, which is a ThreadPoolExecutor:
import asyncio
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, cpu_bound_task, contents)
or, if you would like to pass keyword arguments instead, you could use a lambda expression, or, preferably, functools.partial(), which is specifically recommended in the documentation for loop.run_in_executor():
import asyncio
from functools import partial
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, partial(cpu_bound_task, some_arg=contents))
You could also run your task in a custom ThreadPoolExecutor. For instance:
import asyncio
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ThreadPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
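The run_in_executor() fragments above need a running event loop around them. A self-contained sketch, using only the standard library, might look as follows. Here cpu_bound_task is a hypothetical stand-in that merely sleeps, and heartbeat is a hypothetical helper that records ticks to show that the event loop stays responsive while the blocking function runs in a worker thread.

```python
import asyncio
import time


def cpu_bound_task(contents):
    """Hypothetical stand-in for the blocking function used above."""
    time.sleep(0.3)  # pretend to work; this blocks only the worker thread
    return len(contents)


async def heartbeat(ticks):
    """Record a tick every 50 ms for as long as the event loop is free to run us."""
    while True:
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    loop = asyncio.get_running_loop()
    # None selects the loop's default executor (a ThreadPoolExecutor).
    res = await loop.run_in_executor(None, cpu_bound_task, b"some bytes")
    beat.cancel()
    return res, len(ticks)


res, tick_count = asyncio.run(main())
print(res, tick_count)
```

If cpu_bound_task were called directly inside main(), the heartbeat would record at most one tick, because the loop would be blocked for the whole duration of the call.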
In Python 3.9+, you could also use asyncio.to_thread() to asynchronously run a synchronous function in a separate thread. It essentially uses await loop.run_in_executor(None, func_call) under the hood, as can be seen in the implementation of asyncio.to_thread(). The to_thread() function takes the blocking function to execute, as well as any arguments (*args and/or **kwargs) to pass to that function, and returns a coroutine that can be awaited. Example:
import asyncio
res = await asyncio.to_thread(cpu_bound_task, contents)
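As a runnable illustration of the same idea (assuming Python 3.9 or later), the sketch below starts two blocking calls with asyncio.to_thread() and waits for both with asyncio.gather(). blocking_io is a hypothetical function; note that keyword arguments can be passed straight through.

```python
import asyncio
import time


def blocking_io(name, delay=0.2):
    """Hypothetical blocking function, e.g. a call into a non-async library."""
    time.sleep(delay)
    return f"{name} done"


async def main():
    start = time.perf_counter()
    results = await asyncio.gather(
        asyncio.to_thread(blocking_io, "first"),
        asyncio.to_thread(blocking_io, "second", delay=0.2),  # kwargs are forwarded
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

Because each call runs in its own worker thread, the two sleeps overlap and the total time stays close to a single delay rather than the sum of both.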
ThreadPoolExecutor will successfully prevent the event loop from being blocked, but won't give you the performance improvement you would expect from running code in parallel; especially, when one needs to perform CPU-bound operations, such as the ones described here (e.g., audio or image processing, machine learning, and so on). It is thus preferable to run CPU-bound tasks in a separate process—using ProcessPoolExecutor, as shown below—which, again, you can integrate with asyncio, in order to await it to finish its work and return the result(s). As described here, on Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc. Basically, your code must be under if __name__ == '__main__':.
import asyncio
import concurrent.futures
loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
Use more workers. For example, uvicorn main:app --workers 4 (if you are using Gunicorn as a process manager with Uvicorn workers, please have a look at this answer). Note: Each worker "has its own things, variables and memory". This means that global variables/objects, etc., won't be shared across the processes/workers. In this case, you should consider using a database storage, or Key-Value stores (Caches), as described here and here. Additionally, note that "if you are consuming a large amount of memory in your code, each process will consume an equivalent amount of memory".
If you need to perform heavy background computation and you don't necessarily need it to be run by the same process (for example, you don't need to share memory, variables, etc), you might benefit from using other bigger tools like Celery, as described in FastAPI's documentation.
Q: "... What's the problem?"
A: The FastAPI documentation says explicitly that the framework uses in-process tasks (as inherited from Starlette).
That, by itself, means that all such tasks compete to acquire (from time to time) the Python interpreter's GIL, the Global Interpreter Lock. It is effectively a mutex that re-serialises any number of in-process Python threads, so that only one of them works while all the others wait.
On a fine-grained scale you can see the result yourself. If spawning another handler for the second HTTP request (initiated manually from a second Firefox tab) actually takes longer than the sleep has taken, the plain printed output will not reveal how the GIL interleaves the threads in round-robin time quanta (all wait while one works, until the next round of GIL release and acquire takes place; the default switch interval in CPython 3 is 5 ms, see sys.getswitchinterval()). To see more in-thread detail, you may use more details (depending on the O/S type or version) from here, or log a timestamp and the thread id, like this, inside the async-decorated code being executed:
import time
import threading
from fastapi import FastAPI, Request
TEMPLATE = "INF[{0:_>20d}]: t_id( {1: >20d} ):: {2:}"
print( TEMPLATE.format( time.perf_counter_ns(),
                        threading.get_ident(),
                        "Python Interpreter __main__ was started ..."
                        )
       )
...
@app.get("/ping")
async def ping( request: Request ):
    """                                __doc__
    [DOC-ME]
    ping( Request ):  a mock-up AS-IS function to yield
                      a CLI/GUI self-evidence of the order-of-execution
    RETURNS:          a JSON-alike decorated dict
    [TEST-ME]         ...
    """
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "Hello..."
                            )
           )
    #------------------------------------------------- actual blocking work
    time.sleep( 5 )
    #------------------------------------------------- actual blocking work
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "...bye"
                            )
           )
    return { "ping": "pong!" }
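The thread id logged by the template above is mainly useful for telling where a piece of code runs. A minimal standard-library sketch of that idea (handler and blocking_work are hypothetical names, and no FastAPI is involved) compares the thread of two coroutines with the thread used by the default executor:

```python
import asyncio
import threading


def blocking_work():
    # Runs in a worker thread of the default ThreadPoolExecutor.
    return threading.get_ident()


async def handler():
    # Runs inside the event loop, i.e. on the thread that called asyncio.run().
    return threading.get_ident()


async def main():
    loop_thread = threading.get_ident()
    handler_threads = await asyncio.gather(handler(), handler())
    loop = asyncio.get_running_loop()
    worker_thread = await loop.run_in_executor(None, blocking_work)
    return loop_thread, handler_threads, worker_thread


loop_thread, handler_threads, worker_thread = asyncio.run(main())
print("event loop thread:", loop_thread)
print("handler threads:  ", handler_threads)
print("worker thread:    ", worker_thread)
```

Both coroutines report the event loop's thread id, so a blocking call inside either of them stalls the other; only the function handed to the executor runs on a different thread.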
Last, but not least, do not hesitate to read more about all the other sharks that thread-based code may suffer from ... or even cause ... behind the curtains ...
Ad Memorandum
A mixture of GIL-lock, thread-based pools, asynchronous decorators, blocking and event-handling -- a sure recipe for uncertainties & HWY2HELL ;o)

jupyter notebooks-safe asyncio run wrapper method for a library

I'm building a library that leverages asyncio internally.
While the user shouldn't be aware of it, the internal implementation currently wraps the async code with the asyncio.run() porcelain wrapper.
However, some users will be executing this library code from a jupyter notebook, and I'm struggling to replace the asyncio.run() with a wrapper that's safe for either environment.
Here's what I've tried:
import asyncio
ASYNC_IO_NO_RUNNING_LOOP_MSG = 'no running event loop'
def jupyter_safe_run_coroutine(async_coroutine, _test_mode: bool = False):
    try:
        loop = asyncio.get_running_loop()
        task = loop.create_task(async_coroutine)
        result = loop.run_until_complete(task)  # <- fails as loop is already running
        # OR
        asyncio.wait_for(task, timeout=None, loop=loop)  # <- fails as this is an async method
        result = task.result()
    except RuntimeError as e:
        if _test_mode:
            raise e
        if ASYNC_IO_NO_RUNNING_LOOP_MSG in str(e):
            return asyncio.run(async_coroutine)
    except Exception as e:
        raise e
Requirements
We use Python 3.8, so we can't use the asyncio.Runner context manager (added in Python 3.11)
We can't use threading, so the solution suggested here would not work
Problem:
How can I wait/await for the async_coroutine, or the task/future provided by loop.create_task(async_coroutine) to be completed?
None of the methods above actually does the waiting, for the reasons stated in the comments.
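The first failure noted in the comments can be reproduced in isolation with the standard library only. In this sketch, inner and outer are hypothetical names; outer plays the role of code that already runs inside an event loop (as a notebook cell does) and tries to block on that same loop.

```python
import asyncio


async def inner():
    return 42


async def outer():
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        # The loop is busy running `outer`, so it cannot be re-entered.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # never started; close it to avoid a "never awaited" warning
        return str(exc)
    return "no error"


message = asyncio.run(outer())
print(message)
```

This is why a synchronous function cannot simply wait on the loop that is currently executing it: without patching the loop or using another thread, the only way to wait is to await.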
Update
I've found this nest_asyncio library that's built to solve this problem exactly:
import asyncio
import logging
logger = logging.getLogger(__name__)  # assumed: the logger setup was not shown in the original snippet
ASYNC_IO_NO_RUNNING_LOOP_MSG = 'no running event loop'
HAS_BEEN_RUN = False
def jupyter_safe_run_coroutine(async_coroutine, _test_mode: bool = False):
    global HAS_BEEN_RUN
    if not HAS_BEEN_RUN:
        _apply_nested_asyncio_patch()
        HAS_BEEN_RUN = True
    return asyncio.run(async_coroutine)
def _apply_nested_asyncio_patch():
    try:
        loop = asyncio.get_running_loop()
        logger.info(f'as get_running_loop() returned {loop}, this environment has its own event loop.\n'
                    f'Patching with nest_asyncio')
        import nest_asyncio
        nest_asyncio.apply()
    except RuntimeError as e:
        if ASYNC_IO_NO_RUNNING_LOOP_MSG in str(e):
            logger.info(f'as get_running_loop() raised {e}, this environment does not have its own event loop.\n'
                        f'No patching necessary')
        else:
            raise e
Still, there are some issues I'm facing with it:
As per this SO answer, there might be starvation issues
Any logs written in the async_coroutine are not printed in the jupyter notebook
The jupyter notebook kernel occasionally crashes upon completion of the task
Edit
For context, the library internally calls external APIs for data enrichment of a user-provided dataframe:
# user code using the library
import pandas as pd
import my_lib
df = pd.DataFrame(data=['some data'])
enriched_df = my_lib.enrich(df)
It's usually a good idea to expose the asynchronous function. This way you will give your users more flexibility.
If some of your users can't (or don't want to) use asynchronous calls to your functions, they will be able to call the async function using asyncio.run(your_function()). Or, in the rare situation where they have an event loop running but can't make async calls, they could use the create_task + add_done_callback method described here. (I really have no idea why such a use case may happen, but for the sake of the argument I included it.)
Hiding the asynchronous interface from your users is not the best idea because it limits their capabilities. They will probably fork your package to patch it and make the exposed function async, or call the hidden async function directly. Neither is good news for you (harder to document / track bugs). I would really suggest sticking to the simplest solution and providing the async functions as the main entry points.
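One way to follow this advice while still serving users who never write async code is to expose the async function as the main entry point and add a thin synchronous convenience wrapper next to it. The sketch below is only an illustration: enrich_async and enrich are hypothetical names, and the external API call is simulated with asyncio.sleep().

```python
import asyncio


async def enrich_async(values):
    """Hypothetical async entry point: 'enrich' each value concurrently."""

    async def enrich_one(value):
        await asyncio.sleep(0.01)  # stands in for an external API call
        return f"{value} (enriched)"

    return await asyncio.gather(*(enrich_one(v) for v in values))


def enrich(values):
    """Hypothetical sync wrapper for plain scripts that have no event loop."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run() is safe.
        return asyncio.run(enrich_async(values))
    # A loop is already running (e.g. Jupyter): point the user to the async API.
    raise RuntimeError(
        "An event loop is already running; use `await enrich_async(...)` instead."
    )


print(enrich(["a", "b"]))
# → ['a (enriched)', 'b (enriched)']
```

In a notebook the wrapper fails with an explicit message instead of the less obvious asyncio.run() error, and the user can fall back to await enrich_async(...).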
Suppose the following package code, followed by three different usages of it:
async def package_code():
    return "package"
Client code
Typical clients will probably just use it this way:
async def client_code_a():
    print(await package_code())
# asyncio.run(client_code_a())
For some people, the following might make sense, for example if your package is the only asynchronous thing they will ever use. Or maybe they are not yet comfortable using async code (these you can probably convince to try client_code_a instead):
def client_code_b():
    print(asyncio.run(package_code()))
# client_code_b()
The very few (I'm tempted to say none):
async def client_code_c():
    # asyncio.run() cannot be called from a running event loop:
    # print(asyncio.run(package_code()))
    loop = asyncio.get_running_loop()
    task = loop.create_task(package_code())
    task.add_done_callback(lambda t: print(t.result()))
# asyncio.run(client_code_c())
I'm still not sure I understand what your goal is, but I'll describe with code what I tried to explain in my comment, so you can tell me where your issue lies.
If your package requires the user to call some functions (your_package_function in the example) that take coroutines as arguments, then you shouldn't worry about the event loop.
That means the package shouldn't call asyncio.run or loop.run_until_complete. The client should (in almost all cases) be responsible for starting the event loop.
Your package code should assume there is an event loop running. Since I don't know your package's goal I just made a function that feeds a "test" argument to any coroutine the client is passing:
import asyncio
async def your_package_function(coroutine):
    print("- Package internals start")
    task = asyncio.create_task(coroutine("test"))
    await asyncio.sleep(.5)  # Simulates slow tasks within your package
    print("- Package internals completed other task")
    x = await task
    print("- Package internals end")
    return x
The client (package user) should then call the following:
async def main():
    x = await your_package_function(return_with_delay)
    print(f"Computed value = {x}")
async def return_with_delay(value):
    print("+ User function start")
    await asyncio.sleep(.2)
    print("+ User function end")
    return value
await main()
# or asyncio.run(main()) if needed
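This would print (the user function starts as soon as the package code reaches its first await, and its 0.2 s sleep finishes before the package's 0.5 s sleep does):
- Package internals start
+ User function start
+ User function end
- Package internals completed other task
- Package internals end
Computed value = test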

FastAPI runs api-calls in serial instead of parallel fashion

I have the following code:
import time
from fastapi import FastAPI, Request
app = FastAPI()
#app.get("/ping")
async def ping(request: Request):
print("Hello")
time.sleep(5)
print("bye")
return {"ping": "pong!"}
If I run my code on localhost - e.g., http://localhost:8501/ping - in different tabs of the same browser window, I get:
Hello
bye
Hello
bye
instead of:
Hello
Hello
bye
bye
I have read about using httpx, but still, I cannot have a true parallelization. What's the problem?
As per FastAPI's documentation:
When you declare a path operation function with normal def instead
of async def, it is run in an external threadpool that is then
awaited, instead of being called directly (as it would block the
server).
also, as described here:
If you are using a third party library that communicates with
something (a database, an API, the file system, etc.) and doesn't have
support for using await, (this is currently the case for most
database libraries), then declare your path operation functions as
normally, with just def.
If your application (somehow) doesn't have to communicate with
anything else and wait for it to respond, use async def.
If you just don't know, use normal def.
Note: You can mix def and async def in your path operation functions as much as you need and define each one using the best
option for you. FastAPI will do the right thing with them.
Anyway, in any of the cases above, FastAPI will still work
asynchronously and be extremely fast.
But by following the steps above, it will be able to do some
performance optimizations.
Thus, def endpoints (in the context of asynchronous programming, a function defined with just def is called synchronous function) run in a separate thread from an external threadpool (that is then awaited, and hence, FastAPI will still work asynchronously), or, in other words, the server processes the requests concurrently, whereas async def endpoints run in the event loop—on the main (single) thread—that is, the server processes the requests sequentially, as long as there is no await call to (normally) non-blocking I/O-bound operations inside such endpoints/routes, such as waiting for (1) data from the client to be sent through the network, (2) contents of a file in the disk to be read, (3) a database operation to finish, etc., (have a look here), in which cases, the server will process the requests concurrently/asynchronously (Note that the same concept not only applies to FastAPI endpoints, but to Background Tasks as well—see Starlette's BackgroundTask class implementation—hence, after reading this answer to the end, you should be able to decide whether you should define a FastAPI endpoint or background task function with def or async def). The keyword await (which works only within an async def function) passes function control back to the event loop. In other words, it suspends the execution of the surrounding coroutine (i.e., a coroutine object is the result of calling an async def function), and tells the event loop to let something else run, until that awaited task completes. Note that just because you may define a custom function with async def and then await it inside your endpoint, it doesn't mean that your code will work asynchronously, if that custom function contains, for example, calls to time.sleep(), CPU-bound tasks, non-async I/O libraries, or any other blocking call that is incompatible with asynchronous Python code. 
In FastAPI, for example, when using the async methods of UploadFile, such as await file.read() and await file.write(), FastAPI/Starlette, behind the scenes, actually runs such methods of File objects in an external threadpool (using the async run_in_threadpool() function) and awaits it, otherwise, such methods/operations would block the event loop. You can find out more by having a look at the implementation of the UploadFile class.
Asynchronous code with async and await is many times summarised as using coroutines. Coroutines are collaborative (or cooperatively multitasked), meaning that "at any given time, a program with coroutines is running only one of its coroutines, and this running coroutine suspends its execution only when it explicitly requests to be suspended" (see here and here for more info on coroutines). As described in this article:
Specifically, whenever execution of a currently-running coroutine
reaches an await expression, the coroutine may be suspended, and
another previously-suspended coroutine may resume execution if what it
was suspended on has since returned a value. Suspension can also
happen when an async for block requests the next value from an
asynchronous iterator or when an async with block is entered or
exited, as these operations use await under the hood.
If, however, a blocking I/O-bound or CPU-bound operation was directly executed/called inside an async def function/endpoint, it would block the main thread (i.e., the event loop). Hence, a blocking operation such as time.sleep() in an async def endpoint would block the entire server (as in the example provided in your question). Thus, if your endpoint is not going to make any async calls, you could declare it with just def instead, which would be run in an external threadpool that would then be awaited, as explained earlier (more solutions are given in the following sections). Example:
#app.get("/ping")
def ping(request: Request):
#print(request.client)
print("Hello")
time.sleep(5)
print("bye")
return "pong"
Otherwise, if the functions that you had to execute inside the endpoint are async functions that you had to await, you should define your endpoint with async def. To demonstrate this, the example below uses the asyncio.sleep() function (from the asyncio library), which provides a non-blocking sleep operation. The await asyncio.sleep() method will suspend the execution of the surrounding coroutine (until the sleep operation completes), thus allowing other tasks in the event loop to run. Similar examples are given here and here as well.
import asyncio

@app.get("/ping")
async def ping(request: Request):
    #print(request.client)
    print("Hello")
    await asyncio.sleep(5)
    print("bye")
    return "pong"
Both the path operation functions above will print out the specified messages to the screen in the same order as mentioned in your question—if two requests arrived at around the same time—that is:
Hello
Hello
bye
bye
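The difference between a blocking and a non-blocking sleep inside an async def function can also be observed without a web server. The following is a minimal sketch, not FastAPI code: the handler names are hypothetical, and asyncio.gather() stands in for two requests arriving at around the same time.

```python
import asyncio
import time


async def blocking_handler(log, i):
    log.append(f"Hello {i}")
    time.sleep(0.1)           # blocks the event loop; nothing else can run
    log.append(f"bye {i}")


async def non_blocking_handler(log, i):
    log.append(f"Hello {i}")
    await asyncio.sleep(0.1)  # suspends this coroutine; the loop stays free
    log.append(f"bye {i}")


async def serve_two(handler):
    """Run two 'requests' concurrently and return the order of messages."""
    log = []
    await asyncio.gather(handler(log, 1), handler(log, 2))
    return log


blocking_log = asyncio.run(serve_two(blocking_handler))
non_blocking_log = asyncio.run(serve_two(non_blocking_handler))
print(blocking_log)      # starts with 'Hello 1', 'bye 1': handled one by one
print(non_blocking_log)  # starts with 'Hello 1', 'Hello 2': interleaved
```

With the blocking sleep, the second handler cannot even start until the first one has finished, which is the "entire server is blocked" situation described above.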
Important Note
When you call your endpoint for the second (third, and so on) time, please remember to do that from a tab that is isolated from the browser's main session; otherwise, succeeding requests (i.e., coming after the first one) will be blocked by the browser (on client side), as the browser will be waiting for the response from the server for the previous request before sending the next one. You can confirm that by using print(request.client) inside the endpoint, where you would see the hostname and port number being the same for all incoming requests (if the requests were initiated from tabs opened in the same browser window/session), and hence, those requests would be processed sequentially, because the browser sends them sequentially in the first place. To solve this, you could either:
Reload the same tab (as is running), or
Open a new tab in an Incognito Window, or
Use a different browser/client to send the request, or
Use the httpx library to make asynchronous HTTP requests, along with the awaitable asyncio.gather(), which allows executing multiple asynchronous operations concurrently and then returns a list of results in the same order the awaitables (tasks) were passed to that function (have a look at this answer for more details).
Example:
import httpx
import asyncio

URLS = ['http://127.0.0.1:8000/ping'] * 2

async def send(url, client):
    return await client.get(url, timeout=10)

async def main():
    async with httpx.AsyncClient() as client:
        tasks = [send(url, client) for url in URLS]
        responses = await asyncio.gather(*tasks)
        print(*[r.json() for r in responses], sep='\n')

asyncio.run(main())
In case you had to call different endpoints that may take different time to process a request, and you would like to print the response out on client side as soon as it is returned from the server—instead of waiting for asyncio.gather() to gather the results of all tasks and print them out in the same order the tasks were passed to the send() function—you could replace the send() function of the example above with the one shown below:
async def send(url, client):
    res = await client.get(url, timeout=10)
    print(res.json())
    return res
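The ordering behaviour of asyncio.gather() can be checked without a running server. In the sketch below, fake_request is a hypothetical stand-in for an HTTP call (it only sleeps), so the delays and names are assumptions made for illustration. The results come back in the order the awaitables were passed, even though the faster "request" finishes first.

```python
import asyncio


async def fake_request(name, delay, completed):
    """Pretend to be an HTTP call that takes `delay` seconds."""
    await asyncio.sleep(delay)
    completed.append(name)  # records the real completion order
    return name


async def main():
    completed = []
    results = await asyncio.gather(
        fake_request("slow", 0.2, completed),
        fake_request("fast", 0.05, completed),
    )
    return results, completed


results, completed = asyncio.run(main())
print(results)    # ['slow', 'fast']  (order in which they were passed)
print(completed)  # ['fast', 'slow']  (order in which they finished)
```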
Async/await and Blocking I/O-bound or CPU-bound Operations
If you are required to use async def (as you might need to await coroutines inside your endpoint), but also have some synchronous I/O-bound or CPU-bound operation (long-running computation task) that will block the event loop (essentially, the entire server) and won't let other requests go through, for example:
@app.post("/ping")
async def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = await file.read()
        res = cpu_bound_task(contents)  # this will block the event loop
    finally:
        await file.close()
    print("bye")
    return "pong"
then:
You should check whether you could change your endpoint's definition to normal def instead of async def. For example, if the only method in your endpoint that has to be awaited is the one reading the file contents (as you mentioned in the comments section below), you could instead declare the type of the endpoint's parameter as bytes (i.e., file: bytes = File()) and thus, FastAPI would read the file for you and you would receive the contents as bytes. Hence, there would be no need to use await file.read(). Please note that the above approach should work for small files, as the entire file contents would be stored in memory (see the documentation on File Parameters); hence, if your system does not have enough RAM available to accommodate the accumulated data (if, for example, you have 8GB of RAM, you can't load a 50GB file), your application may end up crashing. Alternatively, you could call the .read() method of the SpooledTemporaryFile directly (which can be accessed through the .file attribute of the UploadFile object), so that again you don't have to await the .read() method. As you can now declare your endpoint with normal def, each request will run in a separate thread (example is given below). For more details on how to upload a File, as well as how Starlette/FastAPI uses SpooledTemporaryFile behind the scenes, please have a look at this answer and this answer.
@app.post("/ping")
def ping(file: UploadFile = File(...)):
    print("Hello")
    try:
        contents = file.file.read()
        res = cpu_bound_task(contents)
    finally:
        file.file.close()
    print("bye")
    return "pong"
Use FastAPI's (Starlette's) run_in_threadpool() function from the concurrency module (as @tiangolo suggested here), which "will run the function in a separate thread to ensure that the main thread (where coroutines are run) does not get blocked" (see here). As described by @tiangolo here, "run_in_threadpool is an awaitable function, the first parameter is a normal function, the next parameters are passed to that function directly. It supports both sequence arguments and keyword arguments".
from fastapi.concurrency import run_in_threadpool
res = await run_in_threadpool(cpu_bound_task, contents)
Alternatively, use asyncio's loop.run_in_executor() (after obtaining the running event loop using asyncio.get_running_loop()) to run the task. In this case, you can await it to complete and return the result(s) before moving on to the next line of code. If None is passed as the executor argument, the default executor is used, which is a ThreadPoolExecutor:
import asyncio
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, cpu_bound_task, contents)
or, if you would like to pass keyword arguments instead, you could use a lambda expression, or, preferably, functools.partial(), which is specifically recommended in the documentation for loop.run_in_executor():
import asyncio
from functools import partial
loop = asyncio.get_running_loop()
res = await loop.run_in_executor(None, partial(cpu_bound_task, some_arg=contents))
You could also run your task in a custom ThreadPoolExecutor. For instance:
import asyncio
import concurrent.futures

loop = asyncio.get_running_loop()
with concurrent.futures.ThreadPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
In Python 3.9+, you could also use asyncio.to_thread() to asynchronously run a synchronous function in a separate thread, which, essentially, uses await loop.run_in_executor(None, func_call) under the hood, as can be seen in the implementation of asyncio.to_thread(). The to_thread() function takes a blocking function to execute, as well as any arguments (*args and/or **kwargs) to the function, and then returns a coroutine that can be awaited. Example:
import asyncio
res = await asyncio.to_thread(cpu_bound_task, contents)
ThreadPoolExecutor will successfully prevent the event loop from being blocked, but won't give you the performance improvement you would expect from running code in parallel; especially, when one needs to perform CPU-bound operations, such as the ones described here (e.g., audio or image processing, machine learning, and so on). It is thus preferable to run CPU-bound tasks in a separate process—using ProcessPoolExecutor, as shown below—which, again, you can integrate with asyncio, in order to await it to finish its work and return the result(s). As described here, on Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses, etc. Basically, your code must be under if __name__ == '__main__':.
import asyncio
import concurrent.futures

loop = asyncio.get_running_loop()
with concurrent.futures.ProcessPoolExecutor() as pool:
    res = await loop.run_in_executor(pool, cpu_bound_task, contents)
Use more workers. For example, uvicorn main:app --workers 4 (if you are using Gunicorn as a process manager with Uvicorn workers, please have a look at this answer). Note: Each worker "has its own things, variables and memory". This means that global variables/objects, etc., won't be shared across the processes/workers. In this case, you should consider using a database storage, or Key-Value stores (Caches), as described here and here. Additionally, note that "if you are consuming a large amount of memory in your code, each process will consume an equivalent amount of memory".
If you need to perform heavy background computation and you don't necessarily need it to be run by the same process (for example, you don't need to share memory, variables, etc), you might benefit from using other bigger tools like Celery, as described in FastAPI's documentation.
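The thread-offloading options from the list above can be tried outside FastAPI. The following is a minimal sketch under stated assumptions: blocking_task and heartbeat are hypothetical names, and time.sleep() stands in for the real blocking work. It uses loop.run_in_executor() with functools.partial() to pass a keyword argument, and a small heartbeat coroutine to show that the event loop keeps running while the worker thread is busy.

```python
import asyncio
import time
from functools import partial


def blocking_task(payload, repeat=1):
    """Stand-in for blocking work (assumption: real code would do I/O here)."""
    time.sleep(0.2)
    return payload * repeat


async def heartbeat(ticks):
    """Record three timestamps; this only progresses if the loop is free."""
    for _ in range(3):
        ticks.append(time.perf_counter())
        await asyncio.sleep(0.05)


async def main():
    ticks = []
    loop = asyncio.get_running_loop()
    # partial() bundles the keyword argument; None selects the default executor.
    call = partial(blocking_task, b"ab", repeat=2)
    result, _ = await asyncio.gather(
        loop.run_in_executor(None, call),
        heartbeat(ticks),
    )
    return result, ticks


result, ticks = asyncio.run(main())
print(result)      # b'abab'
print(len(ticks))  # 3
```

On Python 3.9+, the run_in_executor() line could be swapped for asyncio.to_thread(blocking_task, b"ab", repeat=2), which accepts keyword arguments directly.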
Q :" ... What's the problem? "
A :The FastAPI documentation is explicit in saying that the framework uses in-process tasks ( as inherited from Starlette ).
That, by itself, means that all such tasks compete to receive ( from time to time ) the Python Interpreter GIL-lock - being effectively a MUTEX-terrorising Global Interpreter Lock, which in effect re-[SERIAL]-ises any and all amounts of Python Interpreter in-process threads to work as one-and-only-one-WORKS-while-all-others-stay-waiting...
On a fine-grained scale, you see the result -- if spawning another handler for the second ( manually initiated from a second Firefox tab ) arriving http-request actually takes longer than the sleep has taken, that is the result of the GIL-lock interleaved ~ 100 [ms] time-quanta round-robin ( all-wait-one-can-work ~ 100 [ms] before each next round of GIL-lock release-acquire-roulette takes place ). As the Python Interpreter internal work does not show more details, you may use more details ( depending on O/S type or version ) from here to see more in-thread LoD, like this inside the async-decorated code being performed :
import time
import threading
from fastapi import FastAPI, Request

TEMPLATE = "INF[{0:_>20d}]: t_id( {1: >20d} ):: {2:}"

print( TEMPLATE.format( time.perf_counter_ns(),
                        threading.get_ident(),
                        "Python Interpreter __main__ was started ..."
                        )
       )
...
@app.get("/ping")
async def ping( request: Request ):
    """                                __doc__
    [DOC-ME]
    ping( Request ):  a mock-up AS-IS function to yield
                      a CLI/GUI self-evidence of the order-of-execution
    RETURNS:          a JSON-alike decorated dict
    [TEST-ME]         ...
    """
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "Hello..."
                            )
           )
    #------------------------------------------------- actual blocking work
    time.sleep( 5 )
    #------------------------------------------------- actual blocking work
    print( TEMPLATE.format( time.perf_counter_ns(),
                            threading.get_ident(),
                            "...bye"
                            )
           )
    return { "ping": "pong!" }
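The same thread-identity check can be run without FastAPI. The sketch below is an illustration only (the function names are hypothetical, and asyncio.to_thread() requires Python 3.9+): coroutines report the event loop's thread id, while a function handed to a worker thread reports a different one.

```python
import asyncio
import threading


def sync_job():
    """Runs in a worker thread when called via asyncio.to_thread()."""
    return threading.get_ident()


async def coro_job():
    """Runs on the event loop's own thread."""
    return threading.get_ident()


async def main():
    loop_thread = threading.get_ident()
    first, second = await asyncio.gather(coro_job(), coro_job())
    worker = await asyncio.to_thread(sync_job)
    return loop_thread, first, second, worker


loop_thread, first, second, worker = asyncio.run(main())
print(first == second == loop_thread)  # True: coroutines share one thread
print(worker != loop_thread)           # True: offloaded work runs elsewhere
```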
Last, but not least, do not hesitate to read more about all other sharks threads-based code may suffer from ... or even cause ... behind the curtains ...
Ad Memorandum
A mixture of GIL-lock, thread-based pools, asynchronous decorators, blocking and event-handling -- a sure mix to uncertainties & HWY2HELL ;o)

Python asyncio within Multiprocessing. One event loop per process

I am writing a function for my team that will download some data from the cloud. The function itself is a regular Python function, but under the hood it uses asyncio. So, I create an event loop within my function and have async coroutines do the downloading concurrently. After the data is downloaded, I process it and return the results.
My function works as expected when I call it from any other Python function. But, when I try to parallelize it using multiprocessing, I am occasionally seeing some IOErrors.
I tried to search for an example of how to achieve this but I couldn't find any. I only see recommendations to use concurrent.futures and have the event loop's run_in_executor do the parallelization. That is not an option for me, because I want to hide all the async stuff from my team and just provide them with this simple Python function that they can call from their code (possibly in multiprocessing). I have seen arguments online for why this is a bad idea, and why I shouldn't conceal the async stuff, but in my case, my team aren't savvy programmers. They would never use (or bother to understand) asyncio, and so a simple Python function is what works best for us.
Lastly, here is a pseudo example that shows what I am trying to do:
import asyncio
import aiohttp
from typing import List

async def _async_fetch_data(symbol: str) -> bytes:
    '''
    Download stock data for given symbol from yahoo finance.
    '''
    async with asyncio.BoundedSemaphore(50), aiohttp.ClientSession() as session:
        try:
            url = f'https://query1.finance.yahoo.com/v8/finance/chart/{symbol}?symbol={symbol}&period1=0&period2=9999999999&interval=1d'
            async with session.get(url) as response:
                return await response.read()
        except:
            return None

def fetch_data(symbols: List[str]) -> List[bytes]:
    '''
    Gateway function that wraps the under the hood async stuff
    '''
    coroutine_list = [_async_fetch_data(x) for x in symbols]
    if len(coroutine_list) == 0:
        return []
    loop = asyncio.new_event_loop()
    asyncio.set_event_loop(loop)
    data = loop.run_until_complete(asyncio.wait(coroutine_list))[0]
    loop.close()
    return [d.result() for d in data if d.result() is not None]
This works alright if I run it as
>>> data = fetch_data(['AAPL', 'GOOG'])
But I am not sure whether it will run correctly when I do
>>> from multiprocessing import Pool as ProcessPool
>>> with ProcessPool(2) as pool:
...     data = [j for i in pool.map(fetch_data, [['AAPL', 'GOOG'], ['AMZN', 'MSFT']]) for j in i]
I am seeing occasional IOErrors but I cannot reproduce them and I am not sure if it is because I am mixing asyncio with multiprocessing or because of something else.
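For reference, here is one possible shape for such a gateway function, as a sketch under stated assumptions rather than a diagnosis of the IOErrors. The download is simulated with asyncio.sleep(), and _fake_fetch, _fetch_all and _run_sync are hypothetical names. It uses asyncio.run(), which creates and closes a fresh event loop on every call (so each process gets its own loop), shares one semaphore across all downloads, and falls back to a helper thread when an event loop is already running in the calling thread, as is the case in Jupyter.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List, Optional


async def _fake_fetch(symbol: str, limit: asyncio.Semaphore) -> Optional[bytes]:
    """Simulated download (assumption: real code would use an HTTP client)."""
    async with limit:
        await asyncio.sleep(0.01)
        return None if symbol == "BAD" else symbol.encode()


async def _fetch_all(symbols: List[str]) -> list:
    limit = asyncio.Semaphore(50)  # one semaphore shared by all downloads
    return await asyncio.gather(*(_fake_fetch(s, limit) for s in symbols))


def _run_sync(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)  # no loop running in this thread
    # A loop is already running here, so use a helper thread with its own loop.
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


def fetch_data(symbols: List[str]) -> List[bytes]:
    """Plain function: callers never see async/await."""
    if not symbols:
        return []
    results = _run_sync(_fetch_all(symbols))
    return [r for r in results if r is not None]


data = fetch_data(["AAPL", "BAD", "GOOG"])
print(data)  # [b'AAPL', b'GOOG']
```

Note that the helper-thread branch blocks the calling thread until the downloads finish, which is the intended behaviour for a synchronous API but does stall an already running loop for that time.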

Python asyncio, possible to await / yield entire myFunction()

I've written a library of objects, many which make HTTP / IO calls. I've been looking at moving over to asyncio due to the mounting overheads, but I don't want to rewrite the underlying code.
I've been hoping to wrap asyncio around my code in order to perform functions asynchronously without replacing all of my deep / low level code with await / yield.
I began by attempting the following:
import asyncio

async def my_function1(some_object, some_params):
    #Lots of existing code which uses existing objects
    #No await statements
    return output_data

async def my_function2():
    #Does more stuff
    pass

while True:
    loop = asyncio.get_event_loop()
    tasks = my_function1(some_object, some_params), my_function2()
    output_data = loop.run_until_complete(asyncio.gather(*tasks))
    print(output_data)
I quickly realised that while this code runs, nothing actually happens asynchronously; the functions complete synchronously. I'm very new to asynchronous programming, but I think this is because neither of my functions uses the keyword await or yield, and thus these functions are not coroutines and do not yield, so they never provide an opportunity to move to a different coroutine. Please correct me if I am wrong.
My question is, is it possible to wrap complex functions (where deep within they make HTTP / IO calls ) in an asyncio await keyword, e.g.
async def my_function():
    print("Welcome to my function")
    data = await bigSlowFunction()
UPDATE - Following Karlson's Answer
Thanks to Karlson's accepted answer, I used the following code, which works nicely:
from concurrent.futures import ThreadPoolExecutor
import time

#Some vars
a_var_1 = 0
a_var_2 = 10

pool = ThreadPoolExecutor(3)
future = pool.submit(my_big_function, object, a_var_1, a_var_2)
while not future.done() :
    print("Waiting for future...")
    time.sleep(0.01)
print("Future done")
print(future.result())
This works really nicely, and the future.done() / sleep loop gives you an idea of how many CPU cycles you get to use by going async.
The short answer is, you can't have the benefits of asyncio without explicitly marking the points in your code where control may be passed back to the event loop. This is done by turning your IO heavy functions into coroutines, just like you assumed.
Without changing existing code you might achieve your goal with greenlets (have a look at eventlet or gevent).
Another possibility would be to make use of Python's Future implementation: wrap the calls to your already written functions by passing them to a ThreadPoolExecutor, and await the resulting Future (wrapped for asyncio, for example with asyncio.wrap_future()). Be aware that this comes with all the caveats of multi-threaded programming, though.
Something along the lines of
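import asyncio
from concurrent.futures import ThreadPoolExecutor
from thinair import big_slow_function

executor = ThreadPoolExecutor(max_workers=5)

async def big_slow_coroutine():
    # A concurrent.futures.Future is not awaitable by itself,
    # so it is wrapped into an asyncio future first.
    return await asyncio.wrap_future(executor.submit(big_slow_function))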
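Since thinair is only a placeholder module, here is a self-contained variant that can actually be run. It is a sketch: big_slow_function is a hypothetical stand-in that just sleeps, and loop.run_in_executor() is used to hand it to the thread pool. Two calls started together should take roughly as long as one, which suggests they overlap instead of running back to back.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def big_slow_function(n):
    """Stand-in for existing blocking code (assumption: sleeps instead of I/O)."""
    time.sleep(0.3)
    return n * 2


executor = ThreadPoolExecutor(max_workers=5)


async def big_slow_coroutine(n):
    loop = asyncio.get_running_loop()
    # run_in_executor returns an awaitable tied to the running loop.
    return await loop.run_in_executor(executor, big_slow_function, n)


async def main():
    start = time.perf_counter()
    results = await asyncio.gather(big_slow_coroutine(1), big_slow_coroutine(2))
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
executor.shutdown()
print(results)                  # [2, 4]
print(f"{elapsed:.2f} seconds")  # close to 0.3, not 0.6
```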
As of Python 3.9, you can wrap a blocking (non-async) function in a coroutine to make it awaitable using asyncio.to_thread(). The example given in the official documentation is:
import asyncio
import time

def blocking_io():
    print(f"start blocking_io at {time.strftime('%X')}")
    # Note that time.sleep() can be replaced with any blocking
    # IO-bound operation, such as file operations.
    time.sleep(1)
    print(f"blocking_io complete at {time.strftime('%X')}")

async def main():
    print(f"started main at {time.strftime('%X')}")
    await asyncio.gather(
        asyncio.to_thread(blocking_io),
        asyncio.sleep(1))
    print(f"finished main at {time.strftime('%X')}")

asyncio.run(main())
# Expected output:
#
# started main at 19:50:53
# start blocking_io at 19:50:53
# blocking_io complete at 19:50:54
# finished main at 19:50:54
This seems like a more joined up approach than using concurrent.futures to make a coroutine, but I haven't tested it extensively.
