I need to return a response from my FastAPI path operation, but before that I want to send a slow request. I don't need to wait for the result of that request, only to log errors if there are any. Can I do this with just Python and FastAPI? I would rather not add Celery to the project.
Here is what I have so far, but it runs synchronously:
import asyncio
import requests

async def slow_request(data):
    url = 'https://external.service'
    # requests.post is blocking: it stalls the event loop while it runs
    response = requests.post(
        url,
        json=data,  # payload argument assumed; this line is missing from the original
        headers={'Auth-Header': settings.API_TOKEN}
    )
    if response.status_code != 200:
        logger.error('response: %s', response.status_code)
        logger.error('data: %s', data)

async def handle_order(order: Order):
    json_data = {
        'order': order
    }
    task = asyncio.create_task(
        slow_request(json_data)
    )
    await task
    return {'body': {'message': 'success'}}
OK, if nobody wants to post an answer here are the solutions:
Solution #1
We can just remove the await task line, as alex_noname suggested. This works because create_task schedules the task and we no longer await its completion.
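async def handle_order(order: Order):
    json_data = {
        'order': order
    }
    # Schedule the coroutine; it runs once the handler yields to the event loop.
    task = asyncio.create_task(
        slow_request(json_data)
    )
    return {'body': {'message': 'success'}}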
Solution #2
I ended up with BackgroundTasks, as HTF suggested, since I'm already using FastAPI anyway and this solution seems neater to me.
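from fastapi import BackgroundTasks

async def handle_order(order: Order, background_tasks: BackgroundTasks):
    json_data = {
        'order': order
    }
    # slow_request runs after the response has been sent
    background_tasks.add_task(slow_request, json_data)
    return {'body': {'message': 'success'}}

This even works without async before def slow_request(data). In that case the function runs in a thread pool, so the blocking requests call does not stall the event loop.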
The problem really has two parts:
1. The requests library is synchronous, so requests.post(...) blocks the event loop until it completes.
2. You don't need the result of the web request to respond to the client, but your current handler cannot respond until the request has completed (even if it were async).
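The first point can be observed without any web framework. In the sketch below, time.sleep stands in for requests.post, and a heartbeat coroutine counts how often the event loop gets a chance to run. The names heartbeat, blocking_call and count_ticks, and the timings, are illustrative assumptions.

```python
import asyncio
import time


def blocking_call() -> None:
    # Stand-in for requests.post(...): blocks the calling thread.
    time.sleep(0.3)


async def heartbeat(ticks: list) -> None:
    # Appends a tick every 20 ms for as long as the loop is free to run it.
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(0.02)


async def count_ticks(offload: bool) -> int:
    ticks = []
    beat = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat start
    before = len(ticks)
    if offload:
        # Runs in the default thread pool; the loop stays responsive.
        await asyncio.get_running_loop().run_in_executor(None, blocking_call)
    else:
        # Runs on the loop thread; nothing else is scheduled meanwhile.
        blocking_call()
    during = len(ticks) - before
    beat.cancel()
    return during


async def main() -> tuple:
    blocked = await count_ticks(offload=False)
    offloaded = await count_ticks(offload=True)
    return blocked, offloaded


if __name__ == "__main__":
    blocked, offloaded = asyncio.run(main())
    print(f"ticks while blocked: {blocked}, ticks while offloaded: {offloaded}")
```

While the blocking call runs directly inside a coroutine the heartbeat never fires. Once the same call is handed to a thread, the loop keeps ticking.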
Consider moving the request logic into another process, so it can proceed at its own speed.
The key point is that you can put the work into a queue of some kind to be completed eventually, without needing its result to respond to the client.
You could use an async HTTP request library with a collection of callbacks, multiprocessing to spawn one or more new processes, or something more exotic such as an independent program (perhaps communicating over a pipe or sockets).
Maybe something of this form will work for you:
import base64
import json
import multiprocessing

import requests

URL_EXTERNAL_SERVICE = "https://example.com"
TIMEOUT_REQUESTS = (2, 10)  # always set a timeout for requests
SHARED_QUEUE = multiprocessing.Queue()  # may leak as unbounded

async def slow_request(data):
    SHARED_QUEUE.put(data)
    # now returns on successful queue put, rather than request completion

def requesting_loop(logger, Q, url, token):
    while True:  # expects to be a daemon
        data = json.dumps(Q.get())  # block until retrieval (non-daemon can use sentinel here)
        response = requests.post(
            url,
            data=data,
            headers={'Auth-Header': token},
            timeout=TIMEOUT_REQUESTS,
        )
        # raise_for_status() --> try/except --> log + continue
        if response.status_code != 200:
            logger.error('call to {} failed (code={}) with data: {}'.format(
                url, response.status_code,
                # b64encode returns bytes, so decode before concatenating with str
                "base64:" + base64.b64encode(data.encode()).decode()
            ))

def startup():  # run me when starting
    # do whatever is needed for logger
    # create a pool instead if you may need to process a lot of requests
    p = multiprocessing.Process(
        target=requesting_loop,
        kwargs={"logger": logger, "Q": SHARED_QUEUE, "url": URL_EXTERNAL_SERVICE, "token": settings.API_TOKEN},
        daemon=True,
    )
    p.start()
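The comment about a sentinel can be made concrete. The sketch below keeps the same queue-and-worker shape but uses a thread and queue.Queue instead of a separate process, purely so that it is easy to run. The names send, STOP and run_demo, and the failure condition, are hypothetical stand-ins for the real HTTP call.

```python
import logging
import queue
import threading

logger = logging.getLogger("worker")
STOP = object()  # sentinel: tells the worker to exit cleanly


def send(item: dict) -> int:
    # Stand-in for requests.post(...); returns a fake status code.
    return 500 if item.get("bad") else 200


def worker(q: queue.Queue, failures: list) -> None:
    while True:
        item = q.get()  # blocks until something is available
        if item is STOP:
            break
        try:
            status = send(item)
            if status != 200:
                logger.error("request failed (code=%s) with data: %r", status, item)
                failures.append(item)
        except Exception:
            # Log and keep going; one bad item must not kill the worker.
            logger.exception("unexpected error for %r", item)


def run_demo() -> list:
    q = queue.Queue(maxsize=100)  # bounded, so producers cannot grow it forever
    failures = []
    t = threading.Thread(target=worker, args=(q, failures))
    t.start()
    for item in ({"id": 1}, {"id": 2, "bad": True}, {"id": 3}):
        q.put(item)  # a handler would return right after this call
    q.put(STOP)
    t.join()
    return failures
```

With multiprocessing the identity check on object() would not survive pickling, so a process-based version would more likely use None or another value compared by equality as the sentinel.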
I am trying to execute a long-running task (the_longer_function) in the background and return the response immediately, without waiting for that task to complete.
Here's the basic code that I have written:
@app.post("/something/something_test", response_class=JSONResponse)
async def home(request: Request, background_tasks: BackgroundTasks):
    try:
        request_params = await request.json()
        background_tasks.add_task(the_longer_function, request_params)
        return JSONResponse({
            "Result": "Execution Started!"
        }, status_code=200)
    except Exception as ex:
        return {
            "Result": f"Error in starting execution. Error {ex}"
        }
Here's the definition of the_longer_function:
def the_longer_function(request_params):
    variable = None
    try:
        variable = request_params.get('variable', None)
        executionId = str(uuid.uuid4())
        bot_message['ExecutionId'] = executionId  # bot_message is not shown in the question
        publish_bot_scan_data(request_params, variable)
    except (JSONDecodeError, Exception) as ex:
        log.error(f"Error {ex}")
I want the API to respond as soon as it has added the task for the new request to the background.
But I have observed that, if a prior request is still running, the API holds the new call back, waits for the previous one to complete, and only then returns to the caller.
I have tried async, trio, parallelism and concurrency, but I don't think these are the solution I am looking for.
I would appreciate any help and input.
I have just run into a funny situation while testing my FastAPI Python application, and thought it might be useful for people who reuse sessions in their apps and want to test requests against the same app, but get stuck on weird errors like the one in the title.
I would also like to know what is happening here.
I have an async FastAPI application that schedules multiple requests based on a configuration whose details are unimportant. After the list of request definitions is prepared, a session is created and the requests are sent, possibly with delays so I can spread them out in time.
To test whether the requests are getting through, I have created routes in my own app so I can send the test requests back to my own application. The application basically talks to itself.
It was listening on localhost at the time of testing.
I have the following functions defined for building async tasks:
def optional_session(func):
    async def wrapper(*args, **kwargs):
        if 'session' not in kwargs or kwargs['session'] is None:
            # No session supplied: open a temporary one for this call only.
            async with ClientSession() as session:
                kwargs['session'] = session
                return await func(*args, **kwargs)
        else:
            return await func(*args, **kwargs)
    return wrapper
@optional_session
async def post_json_with_time_from_url(url: str, data: dict, session: ClientSession = None) -> Tuple[Union[dict, None], float]:
    """
    A method that performs a request to a specified URL and reads the response as JSON data.
    If the request is successful the data is returned. If an error occurs it is logged and the returned data is None.
    :param data: data to send in the request
    :param url: The URL to send the request to
    :param session: the session to use; a temporary one is created if omitted
    :return: A valid response or None, together with a timestamp
    """
    result = None, time.time()
    # The handler bodies are missing from the original; the logging calls
    # below are assumed from the docstring ("if an error occurs it is logged").
    try:
        async with session.post(url, data=data) as response:  # type: ClientResponse
            # check if the response is valid
            if response.status == 200:
                # we have to read the response before leaving the response context manager
                try:
                    result = await response.json(), time.time()
                except Exception as e:
                    logger.error('could not read JSON from %s: %s', url, e)
    except InvalidURL as e:
        logger.error('invalid URL %s: %s', url, e)
    except Exception as e:
        logger.error('request to %s failed: %s', url, e)
    return result
def delay(func, seconds: int):
    """
    This decorator adds a time delay to an async function.
    """
    if seconds is None:
        seconds = 0

    async def wrapper(*args, **kwargs):
        await asyncio.sleep(seconds)
        return await func(*args, **kwargs)
    return wrapper
def parse_get_post_request(config: ConfigContext, session: aiohttp.ClientSession = None) -> asyncio.Task:
    """
    Parses the get/post request from the configuration dictionary and creates an async task for it.
    """
    request_type = config.extract_key('request_type', True).lower()
    delay_ = config.extract_key('delay')
    url_base_ = config.extract_key('request_url_base', True)
    url_suffix_ = config.extract_key('request_url_suffix', True)
    url_ = urljoin(base=url_base_, url=url_suffix_)
    if request_type == 'get':
        return asyncio.ensure_future(
            delay(get_json_with_time_from_url, delay_)(url=url_, session=session)
        )
    elif request_type == 'post':
        return asyncio.ensure_future(
            delay(post_json_with_time_from_url, delay_)(url=url_, session=session, data=config.extract_key('request_data'))
        )
    else:
        raise ValueError(f"Unsupported request type: {request_type}")
I am creating an aiohttp session like this:
async with aiohttp.ClientSession() as session:
and then reusing it throughout the context block, something like this:
    single_request_tasks = []
    for config in configs:
        single_request_tasks.append(parse_get_post_request(config=plan_config, session=session))
    responses = await asyncio.gather(*single_request_tasks)
Somehow, when I send the requests all together and one of them arrives back at the app at the same time as another, an exception is thrown:
ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine
It turns out that, for some reason, the session I share across all the requests is terminated when multiple requests arrive at the same time using the same ClientSession instance.
I am not really sure why this happens, apart from suspecting some port-clash shenanigans,
but it is resolved when I use a separate session for each request, or when I spread the requests out in time with an interval of one second (for example).
I have used separate sessions for each request when looping back to localhost.
I also avoided the issue by spreading the requests out in time, so that each one has time to complete before the next is sent, but timing is not a reliable mechanism (because of the OS task scheduler, concurrency in asyncio, network latency, etc.).
This problem does not occur when sharing a session against a different host (for example, when scraping images from imgur.com), so I believe it is related to the fact that I am looping back to localhost.
Why exactly does this happen? Why is the session closed by the software in the situation I described?
Is there anything I am doing wrong with the session? How does Starlette handle loopback connections? Is this case-dependent, requiring more detective work on my part, or is it a generally recognized, platform-independent behaviour?
I am working with an API that limits clients to 4 requests/s. I am using asyncio and aiohttp to make asynchronous HTTP requests. I am using Windows.
When working with the API I most commonly receive three status codes: 200, 400 and 429.
The 4/s limit is respected without problems when I see many 400s, but as soon as I receive a 200 and try to read the JSON using response.json, I begin to receive 429s for too many requests. I am trying to understand why this would occur.
EDIT: After doing some logging, I see that the 429s appear to creep in after a response that takes longer than a second to complete (in the case of a 200, a large JSON response might take a while to read). It appears that when a request taking more than 1 s is followed by a fast one (taking a few ms), the requests "jump" ahead too quickly and overwhelm the API with more than 4 requests resolving in a second.
I am using a semaphore of size 3 (a size of 4 hits 429 far more often). The workflow is generally:
1. Create event loop
2. Gather tasks
3. Create http session and begin async requests with our Semaphore.
4. _fetch() is handling the specific asynchronous requests.
I am trying to understand why I hit rate limits when I receive 200s (which require some JSON deserialization and likely add some latency). If I am always awaiting a sleep call of 1.5 seconds per call, why am I still able to hit rate limits? Is this the fault of the API I am hitting, or is there something intrinsically wrong with my async-await calls?
Below is my code:
import asyncio
import aiohttp
import time

class Request:
    def __init__(self, url: str, method: str="get", payload: str=None):
        self.url: str = url
        self.method: str = method
        self.payload: str or dict = payload or dict()

class Response:
    def __init__(self, url: str, status: int, payload: dict=None, error: bool=False, text: str=None):
        self.url: str = url
        self.status: int = status
        self.payload: dict = payload or dict()
        self.error: bool = error
        self.text: str = text or ''

def make_requests(headers: dict, requests: list[Request]) -> asyncio.AbstractEventLoop:
    """
    requests is a list with data necessary to make requests
    """
    loop: asyncio.AbstractEventLoop = asyncio.get_event_loop()
    responses: asyncio.AbstractEventLoop = loop.run_until_complete(_run(headers, requests))
    return responses

async def _run(headers: dict, requests: list[Request]) -> "list[Response]":
    # Create a semaphore to limit how many concurrent thread processes we can run (MAXIMUM) at a time.
    semaphore: asyncio.Semaphore = asyncio.Semaphore(3)
    time.sleep(10)  # wait 10 seconds before beginning our async requests
    async with aiohttp.ClientSession(headers=headers) as session:
        tasks: list[asyncio.Task] = [asyncio.create_task(_iterate(semaphore, session, request)) for request in requests]
        responses: list[Response] = await asyncio.gather(*tasks)
    return responses

async def _iterate(semaphore: asyncio.Semaphore, session: aiohttp.ClientSession, request: Request) -> Response:
    async with semaphore:
        return await _fetch(session, request)

async def _fetch(session: aiohttp.ClientSession, request: Request) -> Response:
    try:
        async with session.request(request.method, request.url, params=request.payload) as response:
            print(f"NOW: {time.time()}")
            print(f"Response Status: {response.status}.")
            content: dict = await response.json()
            await asyncio.sleep(1.5)
            return Response(request.url, response.status, payload=content, error=False)
    except aiohttp.ClientResponseError:
        if response.status == 429:
            await asyncio.sleep(12)  # Back off before proceeding with more requests
            return await _fetch(session, request)
        await asyncio.sleep(1.5)
        return Response(request.url, response.status, error=True)
The 4/s issue works entirely fine when seeing many 400s, but soon as I
receive a 200 and try to resolve the JSON using response.json, I begin
to receive 429s for too many requests. I am trying to understand why
something like this would occur.
The response status does not depend on how often you call the .json method on responses. The cause may be a protection mechanism on the server the API runs on. While debugging, I restructured make_requests to make it more readable.
import asyncio
import aiohttp

class Request:
    def __init__(self, url: str, method: str = "get", payload: str = None):
        self.url: str = url
        self.method: str = method
        self.payload: str or dict = payload or dict()

class Response:
    def __init__(self, url: str, status: int, payload: dict = None, error: bool = False, text: str = None):
        self.url: str = url
        self.status: int = status
        self.payload: dict = payload or dict()
        self.error: bool = error
        self.text: str = text or ''

async def make_requests(headers: dict, requests: "list[Request]"):
    """
    This function makes concurrent requests with a semaphore.
    :param headers: Main HTTP headers to use in the session.
    :param requests: A list of Request objects.
    :return: List of responses converted to Response objects.
    """
    async def make_request(request: Request) -> Response:
        """
        This closure makes a limited number of requests at a time.
        :param request: An instance of Request that describes HTTP request.
        :return: A processed response.
        """
        try:
            async with semaphore:
                response = await session.request(request.method, request.url, params=request.payload)
                content = await response.json()
                return Response(request.url, response.status, payload=content, error=False)
        except (aiohttp.ClientResponseError, aiohttp.ContentTypeError, aiohttp.ClientError):
            if response.status == 429:
                return await make_request(request)
            return Response(request.url, response.status, error=True)

    semaphore = asyncio.Semaphore(3)
    curr_loop = asyncio.get_running_loop()
    async with aiohttp.ClientSession(headers=headers) as session:
        return await asyncio.gather(*[curr_loop.create_task(make_request(request)) for request in requests])

if __name__ == "__main__":
    HEADERS = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0"
    }
    # REQUESTS (a list of Request objects) is not shown in the original
    loop = asyncio.get_event_loop()
    responses = loop.run_until_complete(make_requests(HEADERS, REQUESTS))
    print(responses)  # [<__main__.Response object at 0x7f4f73e5da30>, <__main__.Response object at 0x7f4f73e734f0>, <__main__.Response object at 0x7f4f73e73790>, <__main__.Response object at 0x7f4f73e5d9d0>, <__main__.Response object at 0x7f4f73e73490>]
If you start getting 400s after a certain number of requests, check which headers the browser sends that are missing from your request.
I am trying to understand why when I receive 200s (which requires some
JSON serialization and it likely adds some latency). If I am always
awaiting a sleep call of 1.5 seconds per call, why am I still able to
hit rate limits? Is this fault of the API I am hitting, or is there
something intrinsically wrong with my async-await calls?
I'm not sure what you mean by "to be able to hit rate limits", but asyncio.sleep works properly. The script starts the first batch of concurrent requests (here the semaphore allows three concurrent tasks) almost at the same time. After a response is received, each task waits 1.5 s concurrently and then returns its result. The key is concurrency: if you wait with asyncio.sleep for 1.5 s in three different tasks, the total wait is 1.5 s, not 4.5 s. If you want delays between requests, you could wait before or after calling create_task.
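Put differently, the semaphore bounds how many requests are in flight, not how many start per second, so a run of fast responses can still push the start rate over the API's limit. A minimal sketch of spacing out request starts follows. RateLimiter, fake_request and the rate used are illustrative assumptions, and asyncio.sleep stands in for the HTTP call.

```python
import asyncio
import time


class RateLimiter:
    """Allow at most `rate` acquisitions per second by spacing them evenly."""

    def __init__(self, rate: float) -> None:
        self._interval = 1.0 / rate
        self._lock = asyncio.Lock()
        self._next_start = 0.0

    async def wait(self) -> None:
        # The lock makes callers queue up, so each one gets its own slot.
        async with self._lock:
            now = time.monotonic()
            delay = self._next_start - now
            if delay > 0:
                await asyncio.sleep(delay)
                now = time.monotonic()
            self._next_start = now + self._interval


async def fake_request(limiter: RateLimiter, starts: list) -> None:
    await limiter.wait()
    starts.append(time.monotonic())  # the real request would be sent here
    await asyncio.sleep(0.01)  # stand-in for a fast response


async def main(n: int = 6, rate: float = 20.0) -> list:
    limiter = RateLimiter(rate)
    starts = []
    await asyncio.gather(*(fake_request(limiter, starts) for _ in range(n)))
    return starts
```

In the question's code, a call such as await limiter.wait() would go immediately before session.request(...). The semaphore can stay if a cap on in-flight requests is also wanted.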
Using Tornado, I have a POST request that takes a long time because it makes many requests to another API service and processes the data. It can take minutes to complete. I don't want this to block the entire web server from responding to other requests, which it currently does.
I looked at multiple threads here on SO, but they are often 8 years old, and the code no longer works because Tornado removed the "engine" component from tornado.gen.
Is there an easy way to kick off this long call and not have it block the entire web server in the process? Is there anything I can put in the code to say "submit the POST response and work on this one function without blocking any concurrent server requests from getting an immediate response"?
def make_app():
    return tornado.web.Application([
        (r"/v1", MainHandler),
        (r"/v1/addfile", AddHandler, dict(folderpaths = folderpaths)),
        (r"/v1/getfiles", GetHandler, dict(folderpaths = folderpaths)),
        (r"/v1/getfile", GetFileHandler, dict(folderpaths = folderpaths)),
    ])

if __name__ == "__main__":
    app = make_app()
    sockets = tornado.netutil.bind_sockets(8888)
    server = tornado.httpserver.HTTPServer(app)
    # (the rest of the server start-up is not shown in the original)

class AddHandler(tornado.web.RequestHandler):
    def initialize(self, folderpaths):
        self.folderpaths = folderpaths

    def blockingFunction(self):
        ...  # long-running work (body not shown in the original)

    def post(self):
        user = self.get_argument('user')
        folderpath = self.get_argument('inpath')
        outpath = self.get_argument('outpath')
        workflow_value = self.get_argument('workflow')
        status_code, status_text = validateInFolder(folderpath)
        if (status_code == 200):
            logging.info("Status Code 200")
            result = self.folderpaths.add_file(user, folderpath, outpath, workflow_value)
            #At this point the path is validated.
            #POST response should be sent out. Internal process should continue, new
            #requests should not be blocked
The idea is that once the input parameters are validated, the POST response should be sent out.
Then the internal process (blockingFunction()) should start, without blocking the Tornado server from processing another API POST request.
I tried defining blockingFunction() as async, which lets me process multiple concurrent user requests; however, there was a warning about a missing "await" with the async method.
Any help is welcome. Thank you.
class AddHandler(tornado.web.RequestHandler):
    def initialize(self, folderpaths):
        self.folderpaths = folderpaths

    def blockingFunction(self):
        ...  # long-running work (body not shown in the original)

    async def post(self):
        user = self.get_argument('user')
        folderpath = self.get_argument('inpath')
        outpath = self.get_argument('outpath')
        workflow_value = self.get_argument('workflow')
        status_code, status_text = validateInFolder(folderpath)
        if (status_code == 200):
            logging.info("Status Code 200")
            result = self.folderpaths.add_file(user, folderpath, outpath, workflow_value)
            #At this point the path is validated.
            #POST response should be sent out. Internal process should continue, new
            #requests should not be blocked
            # loop is not defined in the original snippet; the running asyncio loop is assumed
            loop = asyncio.get_running_loop()
            await loop.run_in_executor(None, self.blockingFunction)
            #if this had multiple parameters it would be
            #await loop.run_in_executor(None, self.blockingFunction, param1, param2)
Thank you @xyres
Further read: https://www.tornadoweb.org/en/stable/faq.html
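Note that awaiting run_in_executor keeps the server responsive, but the handler itself still finishes only after the blocking function returns. If the response should go out first, one option is to schedule the work without awaiting it. The sketch below uses plain asyncio (on which Tornado 5 and later runs) rather than a real RequestHandler. The names handle_post and blocking_function, and the events list, are hypothetical.

```python
import asyncio
import logging
import time

logger = logging.getLogger("jobs")


def blocking_function(events: list) -> None:
    # Stand-in for the long-running work; runs in a worker thread.
    time.sleep(0.2)
    events.append("work finished")


def _log_failure(future) -> None:
    # Runs on the event loop once the worker thread is done.
    if future.cancelled():
        return
    exc = future.exception()
    if exc is not None:
        logger.error("background job failed: %r", exc)


async def handle_post(events: list):
    loop = asyncio.get_running_loop()
    # Schedule the blocking call in the default thread pool; do not await it.
    future = loop.run_in_executor(None, blocking_function, events)
    future.add_done_callback(_log_failure)
    events.append("response sent")  # the handler can reply right away
    return future


async def main() -> list:
    events = []
    future = await handle_post(events)
    await future  # only so the demo waits before the loop shuts down
    return events
```

The done callback is what replaces the await for error reporting: without it, an exception raised in the worker thread would go unnoticed.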
I need to make about 10,000 requests to a web service, and I expect JSON in response. Since the requests are independent of each other, I want to run them in parallel. I think aiohttp can help me with that. I wrote the following code:
import asyncio
import aiohttp

async def execute_module(session: aiohttp.ClientSession, module_id: str,
                         post_body: dict) -> dict:
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    async with session.post(
        # (the URL and other arguments are not shown in the original)
    ) as response:
        return await response.json()

async def execute_all(campaign_ids, post_body):
    async with aiohttp.ClientSession() as session:
        return await asyncio.gather(*[
            execute_module(session, campaign_id, post_body)
            for campaign_id in campaign_ids
        ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))
P.S. I make 1,200 requests for testing.
Another way to solve it is to wrap requests.post in run_in_executor. I know it's wrong to use blocking code in an asynchronous function, but it runs faster (~7 seconds vs. ~10 seconds for aiohttp).
import requests
import asyncio

def execute_module(module_id, post_body):
    headers = {
        'Content-Type': r'application/json',
        'Authorization': fr'Bearer {TOKEN}',
    }
    return requests.post(
        # (the URL and other arguments are not shown in the original)
    )

async def execute_all(campaign_ids, post_body):
    loop = asyncio.get_running_loop()
    return await asyncio.gather(*[
        loop.run_in_executor(None, execute_module, campaign_id, post_body)
        for campaign_id in campaign_ids
    ])

campaign_ids = ['101', '102', '103'] * 400
post_body = {'inputs': [{"name": "one", "value": 1}]}
print(asyncio.run(execute_all(campaign_ids, post_body)))
What am I doing wrong?
Have you tried uvloop (https://github.com/MagicStack/uvloop)? It should increase the speed of aiohttp requests.
loop.run_in_executor(None, ...) runs synchronous code in a thread pool (multiple threads), whereas the event loop runs code in a single thread.
My guess is that waiting for I/O shouldn't make much of a difference, but handling the responses (i.e. JSON decoding) does.
It is probably because the def execute_module calls don't share a requests.Session, i.e. each call has its own connection pool.
On the other hand, async def execute_module runs with a shared aiohttp.ClientSession, which has a limit of 100 connections: https://docs.aiohttp.org/en/latest/http_request_lifecycle.html#how-to-use-the-clientsession
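The effect of such a cap can be illustrated with a toy model that does not involve aiohttp at all: an asyncio.Semaphore plays the role of the connection limit, and asyncio.sleep plays the role of one request. The names fake_request and run_batch, the request count and the latency are made-up values for illustration only.

```python
import asyncio
import time


async def fake_request(slots: asyncio.Semaphore, latency: float) -> None:
    # A request must hold one of the limited "connections" while it runs.
    async with slots:
        await asyncio.sleep(latency)


async def run_batch(n_requests: int, limit: int, latency: float = 0.02) -> float:
    """Return the wall-clock time needed to finish n_requests under a cap."""
    slots = asyncio.Semaphore(limit)
    started = time.perf_counter()
    await asyncio.gather(*(fake_request(slots, latency) for _ in range(n_requests)))
    return time.perf_counter() - started


async def main() -> tuple:
    capped = await run_batch(n_requests=120, limit=10)     # 12 waves of 10
    uncapped = await run_batch(n_requests=120, limit=120)  # a single wave
    return capped, uncapped


if __name__ == "__main__":
    capped, uncapped = asyncio.run(main())
    print(f"limit=10: {capped:.2f}s, limit=120: {uncapped:.2f}s")
```

With 1,200 real requests and a cap of 100, the same arithmetic gives roughly twelve waves, which is one plausible reason the shared session is slower than the thread-pool version.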
To check that, you can pass a customised aiohttp.TCPConnector with a larger limit to aiohttp.ClientSession: