We are writing a web service using Python FastAPI that is going to be hosted in Kubernetes. For auditing purposes, we need to save the raw JSON body of the request/response for specific routes. The body size of both request and response JSON is about 1MB, and preferably, this should not impact the response time.
How can we do that?
Option 1 - Using Middleware
You could use a middleware. A middleware handles every request that comes to your application, which lets you process the request before it reaches any specific endpoint, as well as the response before it is returned to the client. To create a middleware, you use the decorator @app.middleware("http") on top of a function, as shown below. Since you need to consume the request body from the stream inside the middleware, using either request.body() or request.stream(), as shown in this answer (behind the scenes, the former method actually calls the latter, see here), the body won't be available when you later pass the request to the corresponding endpoint. Thus, you can follow the approach described in this post to make the request body available down the line (i.e., using the set_body function below). As for the response body, you can use the same approach as described in this answer to consume the body and then return the response to the client. Either option described in the aforementioned linked answer would work; the example below, however, uses Option 2, which stores the body in a bytes object and returns a custom Response directly (along with the status_code, headers and media_type of the original response).
To log the data, you could use a BackgroundTask, as described in this answer and this answer. A BackgroundTask will run only once the response has been sent (see Starlette documentation as well); thus, the client won't have to be waiting for the logging to complete before receiving the response (and hence, the response time won't be noticeably impacted).
Note
If you had a streaming request or response with a body that wouldn't fit into your server's RAM (for example, imagine a body of 100GB on a machine running 8GB RAM), it would become problematic, as you are storing the data in RAM, which wouldn't have enough space available to accommodate the accumulated data. Also, in case of a large response (e.g., a large FileResponse or StreamingResponse), you may be faced with timeout errors on the client side (or on the reverse proxy side, if you are using one), as you would not be able to respond to the client until you have read the entire response body (as you are looping over response.body_iterator). You mentioned that "the body size of both request and response JSON is about 1MB"; hence, that should normally be fine (however, it is always good practice to consider beforehand matters such as how many requests your API is expected to serve concurrently, what other applications might be using the RAM, etc., to judge whether this is an issue or not). If you needed to, you could limit the number of requests to your API endpoints using, for example, SlowAPI (as shown in this answer).
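As a rough back-of-the-envelope check of the memory point above, the sketch below estimates how much RAM the buffered bodies alone might occupy at one moment. The function name and the load figure (200 concurrent requests) are assumptions made up for illustration; actual usage would be higher because of parsing, intermediate copies and framework overhead.

```python
def estimate_buffered_bytes(concurrent_requests: int,
                            request_body_bytes: int,
                            response_body_bytes: int) -> int:
    """Lower bound on the RAM held by buffered request/response bodies."""
    return concurrent_requests * (request_body_bytes + response_body_bytes)


MB = 1024 * 1024
# Hypothetical load: 200 in-flight requests with 1 MB bodies each way
estimate = estimate_buffered_bytes(200, 1 * MB, 1 * MB)
print(f"{estimate / MB:.0f} MB")  # prints "400 MB"
```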
Limiting the usage of the middleware to specific routes only
You could limit the usage of the middleware to specific endpoints by:
checking the request.url.path inside the middleware against a pre-defined list of routes for which you would like to log the request and response, as described in this answer (see "Update" section),
or using a sub application, as demonstrated in this answer,
or using a custom APIRoute class, as demonstrated in Option 2 below.
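The first of these approaches (checking the path) could look roughly like the sketch below. It only shows the decision logic, so it runs without FastAPI; the names should_log, LOGGED_PATHS and LOGGED_PREFIXES, as well as the example routes, are hypothetical.

```python
from typing import Iterable, Tuple

# Hypothetical routes to audit; adjust to your own application
LOGGED_PATHS = {"/", "/orders"}
LOGGED_PREFIXES = ("/audit/",)


def should_log(path: str,
               exact: Iterable[str] = LOGGED_PATHS,
               prefixes: Tuple[str, ...] = LOGGED_PREFIXES) -> bool:
    """Return True if the request path is one we want to audit."""
    return path in set(exact) or path.startswith(prefixes)


# Inside the middleware one would call, for example:
#     if not should_log(request.url.path):
#         return await call_next(request)
print(should_log("/orders"), should_log("/video"))  # prints "True False"
```

Requests whose path does not match are passed straight through, so their bodies are never buffered.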
Working Example
from fastapi import FastAPI, APIRouter, Response, Request
from starlette.background import BackgroundTask
from fastapi.routing import APIRoute
from starlette.types import Message
from typing import Dict, Any
import logging

app = FastAPI()
logging.basicConfig(filename='info.log', level=logging.DEBUG)


def log_info(req_body, res_body):
    logging.info(req_body)
    logging.info(res_body)


async def set_body(request: Request, body: bytes):
    # Replace the receive callable so that the endpoint can read the body again
    async def receive() -> Message:
        return {'type': 'http.request', 'body': body}
    request._receive = receive


@app.middleware('http')
async def some_middleware(request: Request, call_next):
    req_body = await request.body()
    await set_body(request, req_body)
    response = await call_next(request)

    # Consume the response stream so that the body can be logged
    res_body = b''
    async for chunk in response.body_iterator:
        res_body += chunk

    # The background task runs only after the response has been sent
    task = BackgroundTask(log_info, req_body, res_body)
    return Response(content=res_body, status_code=response.status_code,
                    headers=dict(response.headers), media_type=response.media_type, background=task)


@app.post('/')
def main(payload: Dict[Any, Any]):
    return payload
In case you would like to perform some validation on the request body (for example, ensuring that the request body size does not exceed a certain value), instead of using request.body(), you can process the body one chunk at a time using the .stream() method, as shown below (similar to this answer).
@app.middleware('http')
async def some_middleware(request: Request, call_next):
    req_body = b''
    async for chunk in request.stream():
        req_body += chunk
    ...
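To make the size check concrete, here is a minimal sketch of accumulating chunks while enforcing a limit. It is written against a generic async iterator so that it runs on its own; read_limited, BodyTooLarge and fake_stream are hypothetical names, and fake_stream merely stands in for request.stream().

```python
import asyncio
from typing import AsyncIterator


class BodyTooLarge(Exception):
    """Raised when the accumulated body exceeds the allowed size."""


async def read_limited(stream: AsyncIterator[bytes], max_bytes: int) -> bytes:
    """Accumulate chunks from an async stream, aborting once max_bytes is exceeded."""
    chunks = []
    total = 0
    async for chunk in stream:
        total += len(chunk)
        if total > max_bytes:
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)


async def fake_stream():
    # Stand-in for request.stream(): yields the body in pieces
    for part in (b'{"a": ', b'1}'):
        yield part


body = asyncio.run(read_limited(fake_stream(), max_bytes=1024))
print(body)  # prints b'{"a": 1}'
```

In the middleware, one would presumably catch BodyTooLarge and return a 413 response instead of calling call_next.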
Option 2 - Using custom APIRoute class
You can alternatively use a custom APIRoute class (similar to here and here), which, among other things, would allow you to manipulate the request body before it is processed by your application, as well as the response body before it is returned to the client. This option also allows you to limit the usage of this class to the routes you wish, as only the endpoints under the APIRouter (i.e., router in the example below) will use the custom APIRoute class.
Note that the comments in the "Note" section of Option 1 above apply to this option as well. For example, if your API returns a StreamingResponse, such as in the /video route of the example below, which streams a video file from an online source (public videos to test this can be found here, and you can even use a longer video than the one used below to see the effect more clearly), you may come across issues on the server side, if your server's RAM can't handle it, as well as delays on the client side (and on the reverse proxy server, if using one), because the whole (streaming) response is read and stored in RAM before it is returned to the client (as explained earlier). In such cases, you could exclude endpoints that return a StreamingResponse from the custom APIRoute class and limit its usage to the desired routes only (especially if it is a large video file, or even live video, which would unlikely make sense to store in the logs), simply by not using the @<name_of_router> decorator (i.e., @router in the example below) for such endpoints, but rather the @<name_of_app> decorator (i.e., @app in the example below), or some other APIRouter or sub application.
Working Example
from fastapi import FastAPI, APIRouter, Response, Request
from starlette.background import BackgroundTask
from starlette.responses import StreamingResponse
from fastapi.routing import APIRoute
from starlette.types import Message
from typing import Callable, Dict, Any
import logging
import httpx


def log_info(req_body, res_body):
    logging.info(req_body)
    logging.info(res_body)


class LoggingRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            req_body = await request.body()
            response = await original_route_handler(request)

            if isinstance(response, StreamingResponse):
                # Read the whole stream into memory so that it can be logged
                res_body = b''
                async for item in response.body_iterator:
                    res_body += item

                task = BackgroundTask(log_info, req_body, res_body)
                return Response(content=res_body, status_code=response.status_code,
                                headers=dict(response.headers), media_type=response.media_type, background=task)
            else:
                # Non-streaming responses already hold the body in memory
                res_body = response.body
                response.background = BackgroundTask(log_info, req_body, res_body)
                return response

        return custom_route_handler


app = FastAPI()
router = APIRouter(route_class=LoggingRoute)
logging.basicConfig(filename='info.log', level=logging.DEBUG)


@router.post('/')
def main(payload: Dict[Any, Any]):
    return payload


@router.get('/video')
def get_video():
    url = 'https://storage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4'

    def gen():
        with httpx.stream('GET', url) as r:
            for chunk in r.iter_raw():
                yield chunk

    return StreamingResponse(gen(), media_type='video/mp4')


app.include_router(router)
You may try a custom APIRoute class, as in the official FastAPI documentation:
import time
from typing import Callable

from fastapi import APIRouter, FastAPI, Request, Response
from fastapi.routing import APIRoute


class TimedRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            before = time.time()
            response: Response = await original_route_handler(request)
            duration = time.time() - before
            response.headers["X-Response-Time"] = str(duration)
            print(f"route duration: {duration}")
            print(f"route response: {response}")
            print(f"route response headers: {response.headers}")
            return response

        return custom_route_handler


app = FastAPI()
router = APIRouter(route_class=TimedRoute)


@app.get("/")
async def not_timed():
    return {"message": "Not timed"}


@router.get("/timed")
async def timed():
    return {"message": "It's the time of my life"}


app.include_router(router)
As the other answers did not work for me and I searched quite extensively on Stack Overflow to fix this problem, I will show my solution below.
The main issue is that many of the approaches/solutions offered online simply do not work when you need the request body or response body, because the body is consumed when it is read from the stream.
To solve this issue I adapted an approach that basically reconstructs the request and response after reading them. This is heavily based on the comment by user 'kovalevvlad' on https://github.com/encode/starlette/issues/495.
Custom middleware is created that is later added to the app to log all requests and responses. Note that you need some kind of logger to store your logs.
from json import JSONDecodeError
import json
import logging
from typing import Callable, Awaitable, Tuple, Dict, List

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response, StreamingResponse
from starlette.types import Scope, Message

# Set up your custom logger here (a standard library logger is used as a placeholder)
logger = logging.getLogger(__name__)


class RequestWithBody(Request):
    """Creation of new request with body"""

    def __init__(self, scope: Scope, body: bytes) -> None:
        super().__init__(scope, self._receive)
        self._body = body
        self._body_returned = False

    async def _receive(self) -> Message:
        if self._body_returned:
            return {"type": "http.disconnect"}
        else:
            self._body_returned = True
            return {"type": "http.request", "body": self._body, "more_body": False}


class CustomLoggingMiddleware(BaseHTTPMiddleware):
    """
    Use of custom middleware since reading the request body and the response consumes the bytestream.
    Hence this approach to basically generate a new request/response when we read the attributes for logging.
    """

    async def dispatch(  # type: ignore
        self, request: Request, call_next: Callable[[Request], Awaitable[StreamingResponse]]
    ) -> Response:
        # Store request body in a variable and generate new request as it is consumed.
        request_body_bytes = await request.body()
        request_with_body = RequestWithBody(request.scope, request_body_bytes)

        # Store response body in a variable and generate new response as it is consumed.
        response = await call_next(request_with_body)
        response_content_bytes, response_headers, response_status = await self._get_response_params(response)

        # Logging
        # If there is no request body handle exception, otherwise convert bytes to JSON.
        try:
            req_body = json.loads(request_body_bytes)
        except JSONDecodeError:
            req_body = ""

        # Logging of relevant variables.
        logger.info(
            f"{request.method} request to {request.url} metadata\n"
            f"\tStatus_code: {response.status_code}\n"
            f"\tRequest_Body: {req_body}\n"
        )

        # Finally, return the newly instantiated response values
        return Response(response_content_bytes, response_status, response_headers)

    async def _get_response_params(self, response: StreamingResponse) -> Tuple[bytes, Dict[str, str], int]:
        """Getting the response parameters of a response and create a new response."""
        response_byte_chunks: List[bytes] = []
        response_status: List[int] = []
        response_headers: List[Dict[str, str]] = []

        async def send(message: Message) -> None:
            if message["type"] == "http.response.start":
                response_status.append(message["status"])
                response_headers.append({k.decode("utf8"): v.decode("utf8") for k, v in message["headers"]})
            else:
                response_byte_chunks.append(message["body"])

        await response.stream_response(send)
        content = b"".join(response_byte_chunks)
        return content, response_headers[0], response_status[0]
I have just run into a funny situation when testing my FastAPI Python application and thought it might be useful for people who reuse sessions in their apps and want to test requests against the same app, but get stuck on weird errors like the one in the title.
I would also like to understand what is happening here.
Context
I have an async FastAPI application that schedules multiple requests based on a configuration (its details are unimportant here). After the list of request definitions is prepared, a session is created and the requests are sent, possibly with delays so I can spread them in time.
To test if the requests are getting through, I have created routes in my own app so I can send the testing requests back to my own application. The application basically talks to itself.
It was listening on 127.0.0.1:8000 at the time of testing.
I have the following functions defined for building async tasks:
import asyncio
import time
from typing import Tuple, Union
from urllib.parse import urljoin

import aiohttp
from aiohttp import ClientResponse, ClientSession, InvalidURL

# Note: logger, ConfigContext and get_json_with_time_from_url are defined elsewhere in the application.


def optional_session(func):
    async def wrapper(*args, **kwargs):
        # Create a temporary session only if the caller did not provide one
        if 'session' not in kwargs or kwargs['session'] is None:
            async with ClientSession() as session:
                kwargs['session'] = session
                return await func(*args, **kwargs)
        else:
            return await func(*args, **kwargs)
    return wrapper


@optional_session
async def post_json_with_time_from_url(url: str, data: dict, session: ClientSession = None) -> Tuple[Union[dict, None], float]:
    """
    A method that performs a request to a specified URL and reads the response as JSON data.
    If the request is successful the data is returned. If an error occurs it is logged and the returned data is None.
    :param url: The URL to send the request to
    :param data: data to send in the request
    :param session: the session to use for the request
    :return: A valid response or None
    """
    result = None, time.time()
    try:
        async with session.post(url, data=data) as response:  # type: ClientResponse
            # check if the response is valid
            if response.status == 200:
                try:
                    # we have to read the response before leaving the response context manager
                    result = await response.json(), time.time()
                except Exception as e:
                    logger.error("...")
            else:
                logger.error(
                    "...")
    except InvalidURL as e:
        logger.error("...")
    except Exception as e:
        logger.error("...")
    return result


def delay(func, seconds: int):
    """
    This decorator adds a time delay to an async function.
    """
    if seconds is None:
        seconds = 0

    async def wrapper(*args, **kwargs):
        await asyncio.sleep(seconds)
        return await func(*args, **kwargs)

    return wrapper


def parse_get_post_request(config: ConfigContext, session: aiohttp.ClientSession = None) -> asyncio.Task:
    """
    Parses the get/post request from the configuration dictionary and creates an async task for it.
    """
    request_type = config.extract_key('request_type', True).lower()
    delay_ = config.extract_key('delay')
    url_base_ = config.extract_key('request_url_base', True)
    url_suffix_ = config.extract_key('request_url_suffix', True)
    url_ = urljoin(base=url_base_, url=url_suffix_)
    if request_type == 'get':
        return asyncio.ensure_future(
            delay(get_json_with_time_from_url, delay_)(url=url_, session=session)
        )
    elif request_type == 'post':
        return asyncio.ensure_future(
            delay(post_json_with_time_from_url, delay_)(url=url_, session=session, data=config.extract_key('request_data'))
        )
    else:
        raise ValueError(f"Unsupported request type: {request_type}")
I am creating an aiohttp session like this:
async with aiohttp.ClientSession() as session:
    ...
and then reusing it throughout the context code block, something like this:
single_request_tasks = []
...
for config in configs:
    single_request_tasks.append(parse_get_post_request(config=plan_config, session=session))
...
responses = await asyncio.gather(*single_request_tasks)
...
Problem
Somehow, when I send the requests all together, and one of the requests arrives back at the app at the same time as another one, an exception is thrown:
ConnectionAbortedError: [WinError 10053] An established connection was aborted by the software in your host machine
It turns out that, for some reason, the session I share for all the requests is terminated when multiple requests arrive at the same time using the same ClientSession instance.
I am not really sure why this happens exactly, apart from suspecting some port clash shenanigans, but it is resolved when I use a separate session for each request or when I spread the requests in time with an interval of one second (for example).
Workaround
I have used separate sessions for each request when looping back to localhost.
I also avoided the issue when I spread the requests in time, so each one has time to complete before the next one is sent, but timing is not a reliable mechanism (because of the OS task scheduler, concurrency in asyncio, network latency, etc.).
This problem does not occur when sharing a session with a different host (for example, when scraping images from imgur.com), so I believe the problem is related to the fact that I am looping back to localhost.
Question
Why exactly does this happen? Why is the session closed by the software in the situation I described?
Is there anything I am doing wrong with the session? How does Starlette handle loopback connections? Is this case-dependent, requiring more detective work on my side, or is this a generally recognized, platform-independent behaviour?
I am working with an API that limits to 4 requests/s. I am using asyncio and aiohttp to make asynchronous http requests. I am using Windows.
When working with the API I receive 3 status codes most commonly, 200, 400 & 429.
The 4/s issue works entirely fine when seeing many 400s, but as soon as I receive a 200 and try to resolve the JSON using response.json, I begin to receive 429s for too many requests. I am trying to understand why something like this would occur.
EDIT: After doing some logging, I am seeing that the 429s appear to creep up after I have a response that takes longer than a second to complete (in the case of a 200, a large JSON response might take a bit of time to resolve). It appears that after a > 1s request elapsed time request occurs, followed by a fast one (which takes a few ms) the requests sort of "jump" ahead too quickly and overwhelm the API with more than 4 requests resolving in a second.
I am utilizing a semaphore of size 3 (trying 4 hits 429 way more often). The workflow is generally:
1. Create event loop
2. Gather tasks
3. Create http session and begin async requests with our Semaphore.
4. _fetch() is handling the specific asynchronous requests.
I am trying to understand why this happens when I receive 200s (which requires some JSON serialization and likely adds some latency). If I am always awaiting a sleep call of 1.5 seconds per call, why am I still able to hit rate limits? Is this the fault of the API I am hitting, or is there something intrinsically wrong with my async-await calls?
Below is my code:
import asyncio
import aiohttp
import time


class Request:
    def __init__(self, url: str, method: str="get", payload: str=None):
        self.url: str = url
        self.method: str = method
        self.payload: str or dict = payload or dict()


class Response:
    def __init__(self, url: str, status: int, payload: dict=None, error: bool=False, text: str=None):
        self.url: str = url
        self.status: int = status
        self.payload: dict = payload or dict()
        self.error: bool = error
        self.text: str = text or ''


def make_requests(headers: dict, requests: list[Request]) -> asyncio.AbstractEventLoop:
    """
    requests is a list with data necessary to make requests
    """
    loop: asyncio.AbstractEventLoop = asyncio.get_event_loop()
    responses: asyncio.AbstractEventLoop = loop.run_until_complete(_run(headers, requests))
    return responses


async def _run(headers: dict, requests: list[Request]) -> "list[Response]":
    # Create a semaphore to limit how many concurrent thread processes we can run (MAXIMUM) at a time.
    semaphore: asyncio.Semaphore = asyncio.Semaphore(3)
    time.sleep(10)  # wait 10 seconds before beginning our async requests
    async with aiohttp.ClientSession(headers=headers) as session:
        tasks: list[asyncio.Task] = [asyncio.create_task(_iterate(semaphore, session, request)) for request in requests]
        responses: list[Response] = await asyncio.gather(*tasks)
    return responses


async def _iterate(semaphore: asyncio.Semaphore, session: aiohttp.ClientSession, request: Request) -> Response:
    async with semaphore:
        return await _fetch(session, request)


async def _fetch(session: aiohttp.ClientSession, request: Request) -> Response:
    try:
        async with session.request(request.method, request.url, params=request.payload) as response:
            print(f"NOW: {time.time()}")
            print(f"Response Status: {response.status}.")
            content: dict = await response.json()
            response.raise_for_status()
            await asyncio.sleep(1.5)
            return Response(request.url, response.status, payload=content, error=False)
    except aiohttp.ClientResponseError:
        if response.status == 429:
            await asyncio.sleep(12)  # Back off before proceeding with more requests
            return await _fetch(session, request)
        else:
            await asyncio.sleep(1.5)
            return Response(request.url, response.status, error=True)
The 4/s issue works entirely fine when seeing many 400s, but soon as I
receive a 200 and try to resolve the JSON using response.json, I begin
to receive 429s for too many requests. I am trying to understand why
something like this would occur.
The response status does not depend on how often you call the .json method on responses. The cause may be protection mechanisms on the server the API is running on. While debugging, I refactored make_requests to make it more readable.
import asyncio
import aiohttp


class Request:
    def __init__(self, url: str, method: str = "get", payload: str = None):
        self.url: str = url
        self.method: str = method
        self.payload: str or dict = payload or dict()


class Response:
    def __init__(self, url: str, status: int, payload: dict = None, error: bool = False, text: str = None):
        self.url: str = url
        self.status: int = status
        self.payload: dict = payload or dict()
        self.error: bool = error
        self.text: str = text or ''


async def make_requests(headers: dict, requests: "list[Request]"):
    """
    This function makes concurrent requests with a semaphore.
    :param headers: Main HTTP headers to use in the session.
    :param requests: A list of Request objects.
    :return: List of responses converted to Response objects.
    """
    async def make_request(request: Request) -> Response:
        """
        This closure makes limited requests at the time.
        :param request: An instance of Request that describes HTTP request.
        :return: A processed response.
        """
        async with semaphore:
            try:
                response = await session.request(request.method, request.url, params=request.payload)
                content = await response.json()
                response.raise_for_status()
                return Response(request.url, response.status, payload=content, error=False)
            except (aiohttp.ClientResponseError, aiohttp.ContentTypeError, aiohttp.ClientError):
                if response.status == 429:
                    return await make_request(request)
                return Response(request.url, response.status, error=True)

    semaphore = asyncio.Semaphore(3)
    curr_loop = asyncio.get_running_loop()
    async with aiohttp.ClientSession(headers=headers) as session:
        return await asyncio.gather(*[curr_loop.create_task(make_request(request)) for request in requests])


if __name__ == "__main__":
    HEADERS = {
        "User-Agent": "Mozilla/5.0 (X11; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0"
    }
    REQUESTS = [
        Request("https://www.google.com/search?q=query1"),
        Request("https://www.google.com/search?q=query2"),
        Request("https://www.google.com/search?q=query3"),
        Request("https://www.google.com/search?q=query4"),
        Request("https://www.google.com/search?q=query5"),
    ]
    loop = asyncio.get_event_loop()
    responses = loop.run_until_complete(make_requests(HEADERS, REQUESTS))
    print(responses)  # [<__main__.Response object at 0x7f4f73e5da30>, <__main__.Response object at 0x7f4f73e734f0>, <__main__.Response object at 0x7f4f73e73790>, <__main__.Response object at 0x7f4f73e5d9d0>, <__main__.Response object at 0x7f4f73e73490>]
    loop.close()
If you get 400s after a certain number of requests, check which headers the browser sends that are missing from your request.
I am trying to understand why when I receive 200s (which requires some
JSON serialization and it likely adds some latency). If I am always
awaiting a sleep call of 1.5 seconds per call, why am I still able to
hit rate limits? Is this fault of the API I am hitting, or is there
something intrinsically wrong with my async-await calls?
I'm not sure what you meant by "to be able to hit rate limits", but asyncio.sleep works as expected. The script starts the first batch of concurrent requests (in this case, the semaphore allows three concurrent tasks) at almost the same time. After a response is received, each task waits for 1.5 s concurrently and then returns its result. The key is concurrency: if you wait with asyncio.sleep for 1.5 s in 3 different tasks, the total wait is 1.5 s, not 4.5 s. If you want delays between requests, you could wait before or after calling create_task.
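The two points above can be checked with a small, self-contained sketch. It uses only asyncio (no HTTP calls), and the function names and the short delays are chosen for illustration: run_concurrently shows that three sleeps awaited in separate tasks overlap, and run_staggered shows one possible way of waiting before each create_task call so that task starts are spaced out.

```python
import asyncio
import time


async def worker(delay: float) -> None:
    """Stand-in for a request followed by a sleep of `delay` seconds."""
    await asyncio.sleep(delay)


async def run_concurrently(n: int, delay: float) -> float:
    """Run n workers at once; the total time is about `delay`, not n * delay."""
    start = time.monotonic()
    await asyncio.gather(*(worker(delay) for _ in range(n)))
    return time.monotonic() - start


async def run_staggered(n: int, interval: float) -> list:
    """Start one task every `interval` seconds and return each task's start offset."""
    start = time.monotonic()
    offsets = []
    tasks = []

    async def job() -> None:
        offsets.append(time.monotonic() - start)
        await asyncio.sleep(0)  # placeholder for the real request

    for i in range(n):
        if i:
            await asyncio.sleep(interval)  # wait before creating the next task
        tasks.append(asyncio.create_task(job()))
    await asyncio.gather(*tasks)
    return offsets


elapsed = asyncio.run(run_concurrently(3, 0.2))
offsets = asyncio.run(run_staggered(3, 0.1))
print(f"3 concurrent sleeps of 0.2 s took about {elapsed:.1f} s")
print([round(o, 1) for o in offsets])
```

A semaphore only bounds how many requests are in flight at once; spacing the starts, as in run_staggered, is what bounds how many begin per second.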
I have implemented a test function in pytest which loads data from files, casts it into Python objects and provides a new object for each test.
Each one of these objects contains a request I need to make to the server and the expected responses, the function looks like this:
@pytest.mark.asyncio
@pytest.mark.parametrize('test', TestLoader.load(JSONTest, 'json_tests'))
async def test_json(test: JSONTest, groups: Set[TestGroup], client: httpx.AsyncClient):
    skip_if_not_in_groups(test, groups)
    request = Request(url=test.url, body=test.body.dict())
    response = await client.post(request.url, json=request.body)
    # Assertions down here...
I often send many requests to the same HTTP endpoint with the same body, so the response is the same, but I'm testing for different things in the response.
Because of that, I thought of implementing an in-memory cache so that the same request won't be sent twice within a test run.
What I've tried is to create a request object with its own __hash__ implementation and use the @asyncstdlib.lru_cache decorator on the function, but it didn't seem to work.
# Does not work...
@asyncstdlib.lru_cache
async def send_request(request: Request, client: httpx.AsyncClient):
    return await client.post(request.url, json=request.body)


@pytest.mark.asyncio
@pytest.mark.parametrize('test', TestLoader.load(JSONTest, 'json_tests'))
async def test_json(test: JSONTest, groups: Set[TestGroup], client: httpx.AsyncClient):
    skip_if_not_in_groups(test, groups)
    request = Request(url=test.url, body=test.body.dict())
    response = await send_request(request)
The client I'm using: httpx.AsyncClient also implements __hash__, it's coming from a pytest.fixture in conftest.py and it has a scope of 'session':
# conftest.py
@pytest.fixture(scope='session')
def event_loop(request):
    loop = asyncio.get_event_loop_policy().new_event_loop()
    yield loop
    loop.close()


@pytest.fixture(scope='session')
async def client() -> httpx.AsyncClient:
    async with httpx.AsyncClient() as client:
        yield client
Just let go of the opaque 3rd party cache, and cache yourself.
Since you don't require cleaning-up the cache during a single execution, a plain dictionary will work:
_cache = {}


async def send_request(request: Request, client: httpx.AsyncClient):
    if request.url not in _cache:
        _cache[request.url] = await client.post(request.url, json=request.body)
    return _cache[request.url]
We are writing a web service using Python FastAPI that is going to be hosted in Kubernetes. For auditing purposes, we need to save the raw JSON body of the request/response for specific routes. The body size of both request and response JSON is about 1MB, and preferably, this should not impact the response time.
How can we do that?
Option 1 - Using Middleware
You could use a Middleware. A middleware takes each request that comes to your application, and hence, allows you to handle the request before it is processed by any specific endpoint, as well as the response, before it is returned to the client. To create a middleware, you use the decorator #app.middleware("http") on top of a function, as shown below. As you need to consume the request body from the stream inside the middleware—using either request.body() or request.stream(), as shown in this answer (behind the scenes, the former method actually calls the latter, see here)—then it won't be available when you later pass the request to the corresponding endpoint. Thus, you can follow the approach described in this post to make the request body available down the line (i.e., using the set_body function below). As for the response body, you can use the same approach as described in this answer to consume the body and then return the response to the client. Either option described in the aforementioned linked answer would work; the below, however, uses Option 2, which stores the body in a bytes object and returns a custom Response directly (along with the status_code, headers and media_type of the original response).
To log the data, you could use a BackgroundTask, as described in this answer and this answer. A BackgroundTask will run only once the response has been sent (see Starlette documentation as well); thus, the client won't have to be waiting for the logging to complete before receiving the response (and hence, the response time won't be noticeably impacted).
Note
If you had a streaming request or response with a body that wouldn't fit into your server's RAM (for example, imagine a body of 100GB on a machine running 8GB RAM), it would become problematic, as you are storing the data to RAM, which wouldn't have enough space available to accommodate the accumulated data. Also, in case of a large response (e.g., a large FileResponse or StreamingResponse), you may be faced with Timeout errors on client side (or on reverse proxy side, if you are using one), as you would not be able to respond back to the client, until you have read the entire response body (as you are looping over response.body_iterator). You mentioned that "the body size of both request and response JSON is about 1MB"; hence, that should normally be fine (however, it is always a good practice to consider beforehand matters, such as how many requests your API is expected to be serving concurrently, what other applications might be using the RAM, etc., in order to rule whether this is an issue or not). If you needed to, you could limit the number of requests to your API endpoints using, for example, SlowAPI (as shown in this answer).
Limiting the usage of the middleware to specific routes only
You could limit the usage of the middleware to specific endpoints by:
checking the request.url.path inside the middleware against a
pre-defined list of routes for which you would like to log the
request and response, as described in this answer (see
"Update" section),
or using a sub application, as demonstrated in this
answer
or using a custom APIRoute class, as demonstrated in Option 2
below.
Working Example
from fastapi import FastAPI, APIRouter, Response, Request
from starlette.background import BackgroundTask
from fastapi.routing import APIRoute
from starlette.types import Message
from typing import Dict, Any
import logging

app = FastAPI()
logging.basicConfig(filename='info.log', level=logging.DEBUG)


def log_info(req_body, res_body):
    logging.info(req_body)
    logging.info(res_body)


async def set_body(request: Request, body: bytes):
    async def receive() -> Message:
        return {'type': 'http.request', 'body': body}
    request._receive = receive


@app.middleware('http')
async def some_middleware(request: Request, call_next):
    req_body = await request.body()
    await set_body(request, req_body)
    response = await call_next(request)

    res_body = b''
    async for chunk in response.body_iterator:
        res_body += chunk

    task = BackgroundTask(log_info, req_body, res_body)
    return Response(content=res_body, status_code=response.status_code,
        headers=dict(response.headers), media_type=response.media_type, background=task)


@app.post('/')
def main(payload: Dict[Any, Any]):
    return payload
In case you would like to perform some validation on the request body (for example, ensuring that the request body size does not exceed a certain value), instead of using request.body(), you can process the body one chunk at a time using the .stream() method, as shown below (similar to this answer).
@app.middleware('http')
async def some_middleware(request: Request, call_next):
    req_body = b''
    async for chunk in request.stream():
        req_body += chunk
    ...
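A minimal sketch of such a size check is given below, written against a generic async iterator of byte chunks so that it can be run without a server. The names read_limited, BodyTooLarge and fake_stream are hypothetical, and the 1024-byte limit is an arbitrary value chosen for the demonstration.

```python
import asyncio
from typing import AsyncIterator, List


class BodyTooLarge(Exception):
    """Raised when the accumulated body exceeds the configured limit."""


async def read_limited(chunks: AsyncIterator[bytes], max_bytes: int) -> bytes:
    """Accumulate chunks, aborting as soon as the running total passes max_bytes."""
    parts: List[bytes] = []
    total = 0
    async for chunk in chunks:
        total += len(chunk)
        if total > max_bytes:
            raise BodyTooLarge(f"body exceeds {max_bytes} bytes")
        parts.append(chunk)
    return b"".join(parts)


async def fake_stream() -> AsyncIterator[bytes]:
    # Stand-in for request.stream(): yields the body in pieces.
    for chunk in (b'{"a": ', b'1}'):
        yield chunk


body = asyncio.run(read_limited(fake_stream(), max_bytes=1024))
print(body)  # b'{"a": 1}'
```

In the middleware, this would presumably be used as req_body = await read_limited(request.stream(), max_bytes), returning an error response (for instance, with status code 413) when BodyTooLarge is raised. Collecting the chunks in a list and joining once also avoids the repeated copying caused by += on bytes.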
Option 2 - Using custom APIRoute class
You can alternatively use a custom APIRoute class (similar to here and here), which, among other things, would allow you to manipulate the request body before it is processed by your application, as well as the response body before it is returned to the client. This option also allows you to limit the usage of this class to the routes you wish, as only the endpoints under the APIRouter (i.e., router in the example below) will use the custom APIRoute class.
It should be noted that the comments in the "Note" section of Option 1 above apply to this option as well. For example, if your API returns a StreamingResponse, such as the /video route of the example below, which streams a video file from an online source (public videos to test this can be found here, and you can even use a longer video than the one used below to see the effect more clearly), you may come across issues on the server side, if your server's RAM can't handle it, as well as delays on the client side (and on the reverse proxy server, if using one), because the whole (streaming) response is read and stored in RAM before it is returned to the client (as explained earlier). In such cases, you could exclude the endpoints that return a StreamingResponse from the custom APIRoute class and limit its usage to the desired routes only. This makes sense especially for a large video file, or even live video, which would be unlikely to be worth storing in the logs. To do so, simply don't use the @<name_of_router> decorator (i.e., @router in the example below) for such endpoints, but rather the @<name_of_app> decorator (i.e., @app in the example below), or some other APIRouter or sub application.
Working Example
from fastapi import FastAPI, APIRouter, Response, Request
from starlette.background import BackgroundTask
from starlette.responses import StreamingResponse
from fastapi.routing import APIRoute
from starlette.types import Message
from typing import Callable, Dict, Any
import logging
import httpx


def log_info(req_body, res_body):
    logging.info(req_body)
    logging.info(res_body)


class LoggingRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            req_body = await request.body()
            response = await original_route_handler(request)

            if isinstance(response, StreamingResponse):
                res_body = b''
                async for item in response.body_iterator:
                    res_body += item

                task = BackgroundTask(log_info, req_body, res_body)
                return Response(content=res_body, status_code=response.status_code,
                        headers=dict(response.headers), media_type=response.media_type, background=task)
            else:
                res_body = response.body
                response.background = BackgroundTask(log_info, req_body, res_body)
                return response

        return custom_route_handler


app = FastAPI()
router = APIRouter(route_class=LoggingRoute)
logging.basicConfig(filename='info.log', level=logging.DEBUG)


@router.post('/')
def main(payload: Dict[Any, Any]):
    return payload


@router.get('/video')
def get_video():
    url = 'https://storage.googleapis.com/gtv-videos-bucket/sample/ForBiggerBlazes.mp4'

    def gen():
        with httpx.stream('GET', url) as r:
            for chunk in r.iter_raw():
                yield chunk

    return StreamingResponse(gen(), media_type='video/mp4')


app.include_router(router)
You could also customize the APIRoute class used by an APIRouter, as shown in the official FastAPI documentation:
import time
from typing import Callable

from fastapi import APIRouter, FastAPI, Request, Response
from fastapi.routing import APIRoute


class TimedRoute(APIRoute):
    def get_route_handler(self) -> Callable:
        original_route_handler = super().get_route_handler()

        async def custom_route_handler(request: Request) -> Response:
            before = time.time()
            response: Response = await original_route_handler(request)
            duration = time.time() - before
            response.headers["X-Response-Time"] = str(duration)
            print(f"route duration: {duration}")
            print(f"route response: {response}")
            print(f"route response headers: {response.headers}")
            return response

        return custom_route_handler


app = FastAPI()
router = APIRouter(route_class=TimedRoute)


@app.get("/")
async def not_timed():
    return {"message": "Not timed"}


@router.get("/timed")
async def timed():
    return {"message": "It's the time of my life"}


app.include_router(router)
As the other answers did not work for me, and I searched quite extensively on Stack Overflow to fix this problem, I will show my solution below.
The main issue is that many of the approaches/solutions offered online simply do not work when using the request or response body, because the body is consumed when it is read from the stream.
To solve this issue, I adapted an approach that basically reconstructs the request and response after reading them. This is heavily based on the comment by user 'kovalevvlad' on https://github.com/encode/starlette/issues/495.
A custom middleware is created and later added to the app to log all requests and responses. Note that you need some kind of logger to store your logs.
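The "consumed on read" problem and the reconstruction idea can be demonstrated without a server, using plain ASGI-style message dictionaries. The sketch below is a simulation under stated assumptions: one_shot_receive and replaying_receive are hypothetical helpers, and a real server would typically wait for the client to disconnect rather than answer the second call immediately.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Message = Dict[str, Any]
Receive = Callable[[], Awaitable[Message]]


def one_shot_receive(body: bytes) -> Receive:
    """Simulates a server-provided receive(): the body is delivered exactly once."""
    pending: List[Message] = [
        {"type": "http.request", "body": body, "more_body": False}
    ]

    async def receive() -> Message:
        if pending:
            return pending.pop(0)
        return {"type": "http.disconnect"}  # nothing left to read

    return receive


def replaying_receive(body: bytes) -> Receive:
    """Builds a fresh receive() that hands a stored body to the next consumer."""
    return one_shot_receive(body)


async def demo() -> List[str]:
    receive = one_shot_receive(b'{"id": 1}')
    first = await receive()   # the logging code reads the body...
    second = await receive()  # ...so the endpoint would find nothing left
    replay = replaying_receive(first["body"])
    third = await replay()    # a rebuilt receive() delivers the body again
    return [first["type"], second["type"], third["type"]]


result = asyncio.run(demo())
print(result)
```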
from json import JSONDecodeError
import json
import logging
from typing import Callable, Awaitable, Tuple, Dict, List

from starlette.middleware.base import BaseHTTPMiddleware
from starlette.requests import Request
from starlette.responses import Response, StreamingResponse
from starlette.types import Scope, Message

# Set up your custom logger here (a standard-library logger is used as a placeholder)
logger = logging.getLogger(__name__)


class RequestWithBody(Request):
    """Creation of new request with body"""

    def __init__(self, scope: Scope, body: bytes) -> None:
        super().__init__(scope, self._receive)
        self._body = body
        self._body_returned = False

    async def _receive(self) -> Message:
        if self._body_returned:
            return {"type": "http.disconnect"}
        else:
            self._body_returned = True
            return {"type": "http.request", "body": self._body, "more_body": False}


class CustomLoggingMiddleware(BaseHTTPMiddleware):
    """
    Use of custom middleware since reading the request body and the response consumes the bytestream.
    Hence this approach to basically generate a new request/response when we read the attributes for logging.
    """

    async def dispatch(  # type: ignore
        self, request: Request, call_next: Callable[[Request], Awaitable[StreamingResponse]]
    ) -> Response:
        # Store request body in a variable and generate new request as it is consumed.
        request_body_bytes = await request.body()
        request_with_body = RequestWithBody(request.scope, request_body_bytes)

        # Store response body in a variable and generate new response as it is consumed.
        response = await call_next(request_with_body)
        response_content_bytes, response_headers, response_status = await self._get_response_params(response)

        # Logging
        # If there is no request body handle exception, otherwise convert bytes to JSON.
        try:
            req_body = json.loads(request_body_bytes)
        except JSONDecodeError:
            req_body = ""

        # Logging of relevant variables.
        logger.info(
            f"{request.method} request to {request.url} metadata\n"
            f"\tStatus_code: {response.status_code}\n"
            f"\tRequest_Body: {req_body}\n"
        )

        # Finally, return the newly instantiated response values
        return Response(response_content_bytes, response_status, response_headers)

    async def _get_response_params(self, response: StreamingResponse) -> Tuple[bytes, Dict[str, str], int]:
        """Getting the response parameters of a response and create a new response."""
        response_byte_chunks: List[bytes] = []
        response_status: List[int] = []
        response_headers: List[Dict[str, str]] = []

        async def send(message: Message) -> None:
            if message["type"] == "http.response.start":
                response_status.append(message["status"])
                response_headers.append({k.decode("utf8"): v.decode("utf8") for k, v in message["headers"]})
            else:
                response_byte_chunks.append(message["body"])

        await response.stream_response(send)
        content = b"".join(response_byte_chunks)
        return content, response_headers[0], response_status[0]
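The middleware above replaces a body that is not valid JSON with an empty string before logging. If the audit log should keep such bodies as well, a small variation could fall back to the decoded text instead. The helper below is a sketch of that idea and not part of the original solution; body_for_log is a hypothetical name.

```python
import json
from typing import Any


def body_for_log(raw: bytes) -> Any:
    """Return the parsed JSON if possible, otherwise the body decoded as text."""
    try:
        return json.loads(raw)
    except ValueError:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        return raw.decode("utf-8", errors="replace")


print(body_for_log(b'{"a": 1}'))  # {'a': 1}
print(body_for_log(b'not json'))  # not json
```

The same helper could be applied to the response bytes, so that both sides of the exchange end up in the log in a readable form.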