FastAPI database dependency setup for connection pooling - python

Consider the following FastAPI setup:
application.add_event_handler(
    "startup",
    create_start_app_handler(application, settings),
)

def create_start_app_handler(
    app: FastAPI,
    settings: AppSettings,
) -> Callable:
    async def start_app() -> None:
        await connect_to_db(app, settings)

    return start_app

async def connect_to_db(app: FastAPI, settings: AppSettings) -> None:
    db_url = settings.DATABASE_URL
    engine = create_engine(db_url, pool_size=settings.POOL_SIZE, max_overflow=settings.MAX_OVERFLOW)
    SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
    db = SessionLocal()

    def close_db():
        db.close()
        engine.dispose()

    app.state.db = db
    app.state.close_db = close_db
Here, close_db is used to close the database connection on app shutdown.
I have the following dependencies defined:
def _get_db(request: Request) -> Generator:
    yield request.app.state.db

def get_repository(
    repo_type: Type[BaseRepository],
) -> Callable[[Session], BaseRepository]:
    def _get_repo(
        sess: Session = Depends(_get_db),
    ) -> BaseRepository:
        return repo_type(sess)

    return _get_repo
Would this still allow me to take advantage of connection pooling?
Also, this feels a little hacky and I could use some feedback if there's anything in particular that I should not be doing.

To be blunt: it seems overly complicated for something that is pretty well documented in the FastAPI docs.
In your case, you create only one SessionLocal() instance and share it across all your requests (because you store it in app.state). In other words: no, this will not take advantage of connection pooling; it will use only one connection.
A better approach is to yield an instance per request, either via middleware or via a dependency. That way, the session is closed (and its connection returned to the pool) when the incoming request has been fully handled. For example, like this:
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

@app.get("/")
def root(db: Session = Depends(get_db)):
    return "hello world"
I am not sure how you ended up where you ended up, but I would recommend to refactor a bunch.
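For reference, here is a minimal sketch of the suggested refactor: the engine (and its connection pool) is created once at import time, while each request gets its own short-lived session. The DATABASE_URL value and the BaseRepository stub are placeholders standing in for the question's settings and repository class.

# sketch.py -- a minimal sketch of the refactor; DATABASE_URL and BaseRepository
# are stand-ins for the question's settings object and repository base class.
from typing import Callable, Generator, Type

from fastapi import Depends, FastAPI
from sqlalchemy import create_engine
from sqlalchemy.orm import Session, sessionmaker

DATABASE_URL = "postgresql://user:password@localhost/app"  # placeholder

# The engine (and its pool) is created once, at module import time.
engine = create_engine(DATABASE_URL, pool_size=5, max_overflow=10)
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)

app = FastAPI()

class BaseRepository:
    def __init__(self, db: Session) -> None:
        self.db = db

def get_db() -> Generator[Session, None, None]:
    # One session per request; closing it returns its connection to the pool.
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

def get_repository(repo_type: Type[BaseRepository]) -> Callable[..., BaseRepository]:
    def _get_repo(db: Session = Depends(get_db)) -> BaseRepository:
        return repo_type(db)
    return _get_repo

An endpoint would then depend on get_repository(SomeRepository), and every request gets a fresh session drawn from the engine's pool.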

Related

What is the recommended way to instantiate and pass around a redis client with FastAPI

I'm using FastAPI with Redis. My app looks something like this
from fastapi import FastAPI
import redis

# Instantiate redis client
r = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

# Instantiate fastapi app
app = FastAPI()

@app.get("/foo/")
async def foo():
    x = r.get("foo")
    return {"message": x}

@app.get("/bar/")
async def bar():
    x = r.get("bar")
    return {"message": x}
Is it bad practice to create r as a module-scoped variable like this? If so, what are the drawbacks?
In Tiangolo's tutorial on setting up a SQL database connection, he uses a dependency, which I guess in my case would look something like this:
from fastapi import Depends, FastAPI
import redis

# Instantiate fastapi app
app = FastAPI()

# Dependency
def get_redis():
    return redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

@app.get("/foo/")
async def foo(r = Depends(get_redis)):
    x = r.get("foo")
    return {"message": x}

@app.get("/bar/")
async def bar(r = Depends(get_redis)):
    x = r.get("bar")
    return {"message": x}
I'm a bit confused as to which of these methods (or something else) would be preferred and why.
Depends is evaluated every time your function receives a request, so your second example will create a new connection for each request. As @JarroVGIT said, we can use connection pooling to maintain the connection from FastAPI to Redis and reduce the cost of opening and closing connections.
Usually, I define the connection in a separate file. Let's say we have config/db.py:
import redis

def create_redis():
    return redis.ConnectionPool(
        host='localhost',
        port=6379,
        db=0,
        decode_responses=True
    )

pool = create_redis()
Then in main.py:
from fastapi import Depends, FastAPI
import redis

from config.db import pool

app = FastAPI()

def get_redis():
    # Here, we re-use our connection pool
    # not creating a new one
    return redis.Redis(connection_pool=pool)

@app.get("/items/{item_id}")
def read_item(item_id: int, cache = Depends(get_redis)):
    status = cache.get(item_id)
    return {"item_name": status}

@app.put("/items/{item_id}")
def update_item(item_id: int, cache = Depends(get_redis)):
    cache.set(item_id, "available")
    return {"status": "available", "item_id": item_id}
Usually, I also split the dependencies into their own file, as the docs do, so they can be called from the routing modules, but for simplicity I will leave it like this (a rough sketch of that split is shown below).
You can check this repo to experiment by yourself. It has more comprehensive code, and I have already created several scenarios that might help you understand the difference, including how your first example may block other endpoints.
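For reference, a minimal sketch of what splitting the dependency out could look like (the deps.py and routers/items.py filenames are assumptions, loosely following the FastAPI docs' bigger-applications layout), so routers only import get_redis:

# deps.py
import redis

from config.db import pool

def get_redis():
    # Same module-level pool as before, shared by every router that imports this dependency
    return redis.Redis(connection_pool=pool)

# routers/items.py
from fastapi import APIRouter, Depends

from deps import get_redis

router = APIRouter()

@router.get("/items/{item_id}")
def read_item(item_id: int, cache = Depends(get_redis)):
    return {"item_name": cache.get(item_id)}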
In your second example, you create a new Redis instance on every request, and at some point you will hit the maximum connection limit. Code structured like this is much cleaner and more reusable:
from fastapi import FastAPI
import redis

class AppAPI(FastAPI):
    def __init__(self):
        super().__init__()
        # One client (and its underlying connection pool) shared by all routes
        self.redis_client = redis.Redis(host='localhost', port=6379, db=0, decode_responses=True)

        @self.get("/foo/")
        async def foo():
            x = self.redis_client.get("foo")
            return {"message": x}

app = AppAPI()
See https://github.com/redis/redis-py#connection-pools. You can define the pool at module level and import it wherever needed; all Redis connections will then be created out of the pool.
pool = redis.ConnectionPool(host='localhost', port=6379, db=0)
r = redis.Redis(connection_pool=pool)

Testing asynchronous FastAPI endpoints with dependencies

I've encountered this problem, and I can't see any solution, though it must be a common one. So, maybe I'm missing something here.
I'm working on a FastAPI app with asynchronous endpoints and an asynchronous database connection. The database connection is passed as a dependency. I want to write some asynchronous tests for said app.
engine = create_async_engine(connection_string, echo=True)

def get_session():
    return sessionmaker(engine, class_=AsyncSession, expire_on_commit=False)

@router.post("/register")
async def register(
    user_data: UserRequest,
    authorize: AuthJWT = Depends(),
    async_session: sessionmaker = Depends(get_session),
):
    """Register new user."""
    if authorize.get_jwt_subject():
        raise LogicException("already authorized")

    session: AsyncSession
    async with async_session() as session:
        query = await session.execute(
            select(UserModel).where(UserModel.name == user_data.name)
        )
        ...
I'm using AsyncSession to work with the database, so in my tests the db connection also has to be asynchronous.
engine = create_async_engine(
    SQLALCHEMY_DATABASE_URL, connect_args={"check_same_thread": False}
)

app.dependency_overrides[get_session] = lambda: sessionmaker(
    engine, class_=AsyncSession, expire_on_commit=False
)

@pytest.mark.asyncio
async def test_create_user():
    async with engine.begin() as conn:
        await conn.run_sync(Base.metadata.create_all)

    async with AsyncClient(app=app, base_url="http://test") as ac:
        response = await ac.post(
            "/register",
            json={"name": "TestGuy", "password": "TestPass"},
        )
    assert response.status_code == 200, response.text
When running the test, I get the following error:
...
coin_venv\lib\site-packages\fastapi\routing.py:217: in app
solved_result = await solve_dependencies(
coin_venv\lib\site-packages\fastapi\dependencies\utils.py:529: in solve_dependencies
solved = await run_in_threadpool(call, **sub_values)
AttributeError: module 'anyio' has no attribute 'to_thread'
I concluded that the error appears only when there is a dependency in an endpoint. The weird part is that I don't even have anyio in my environment.
So, is there a way to test asynchronous FastAPI endpoints with dependencies and asynchronous db connection? Surely, there must be something, it's not like this situation is something unique...
UPD: I tried using the @pytest.mark.anyio decorator and also installed trio and anyio. Now pytest seems to discover two distinct tests in this one:
login_test.py::test_create_user[asyncio]
login_test.py::test_create_user[trio]
Both fail, the first one with what seems to be a valid error in my code, and the second one with:
RuntimeError: There is no current event loop in thread 'MainThread'.
I guess that is true, though I don't really know whether pytest creates an event loop to test async code. Anyway, I don't need the second test; why is it there, and how can I get rid of it?
It turned out I can specify the backend to run the tests on, like this:
@pytest.fixture
def anyio_backend():
    return 'asyncio'
So now only the right tests are running.
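For reference, a minimal sketch of how that fixture and the @pytest.mark.anyio marker fit together, assuming the pytest plugin that ships with anyio is active (module names and import paths are illustrative, not from the original post):

# conftest.py
import pytest

@pytest.fixture
def anyio_backend():
    # Parametrize async tests with the asyncio backend only, so no trio variant is generated.
    return "asyncio"

# login_test.py
import pytest
from httpx import AsyncClient

from myapp.main import app  # hypothetical import path for the FastAPI app

@pytest.mark.anyio
async def test_create_user():
    # Same request as in the question, now collected only for the asyncio backend.
    async with AsyncClient(app=app, base_url="http://test") as ac:
        response = await ac.post(
            "/register",
            json={"name": "TestGuy", "password": "TestPass"},
        )
    assert response.status_code == 200, response.text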
pytest runs on a different event loop (not the one returned by get_running_loop), so when you try to run your code in the same context, it raises an exception. I suggest you consider nest_asyncio (https://pypi.org/project/nest-asyncio/), so that pytest can run in the same event loop:
import nest_asyncio
nest_asyncio.apply()

Why do I get different sessions when opening the same database with sqlalchemy, and how can I update it?

I'm writing some tests with pytest; I want to test creating a user and sending the email via a POST method.
With some debugging, I know the issue is that I open two databases in memory, but they come from the same SessionLocal().
So how can I fix this? I tried db.flush(), but it doesn't work.
This is the POST method code:
@router.post("/", response_model=schemas.User)
def create_user(
    *,
    db: Session = Depends(deps.get_db),  # the get_db is SessionLocal()
    user_in: schemas.UserCreate,
    current_user: models.User = Depends(deps.get_current_active_superuser),
) -> Any:
    """
    Create new user.
    """
    user = crud.user.get_by_email(db, email=user_in.email)
    if user:
        raise HTTPException(
            status_code=400,
            detail="The user with this username already exists in the system.",
        )
    user = crud.user.create(db, obj_in=user_in)
    print("====post====")
    print(db.query(models.User).count())
    print(db)
    if settings.EMAILS_ENABLED and user_in.email:
        send_new_account_email(
            email_to=user_in.email, username=user_in.email, password=user_in.password
        )
    return user
and the test code is:
def test_create_user_new_email(
    client: TestClient, superuser_token_headers: dict, db: Session  # db is SessionLocal()
) -> None:
    username = random_email()
    password = random_lower_string()
    data = {"email": username, "password": password}
    r = client.post(
        f"{settings.API_V1_STR}/users/", headers=superuser_token_headers, json=data,
    )
    assert 200 <= r.status_code < 300
    created_user = r.json()
    print("====test====")
    print(db.query(User).count())
    print(db)
    user = crud.user.get_by_email(db, email=username)
    assert user
    assert user.email == created_user["email"]
and the test result is
> assert user
E assert None
====post====
320
<sqlalchemy.orm.session.Session object at 0x7f0a9f660910>
====test====
319
<sqlalchemy.orm.session.Session object at 0x7f0aa09c4d60>
Your code does not provide enough information to fully help you; the key issues are probably in what is hidden behind your comments.
It also seems like you are confusing SQLAlchemy sessions and databases. If you are not familiar with these concepts, I highly recommend having a look at the SQLAlchemy documentation.
But, looking at your code structure, it seems like you are using FastAPI.
If you want to test SQLAlchemy with pytest, I recommend using pytest fixtures with SQL transactions.
Here is my suggestion on how to implement such a test. I'll assume that you want to run the tests against your actual database and not create a new database especially for the tests. This implementation is heavily based on this GitHub gist (the author made a "feel free to use" statement, so I suppose he is okay with me copying his code here):
# test.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from fastapi.testclient import TestClient

from myapp.models import BaseModel
from myapp.main import app  # import your fastapi app
from myapp.database import get_db  # import the dependency

client = TestClient(app)

# scope="session" means that the engine will last for the whole test session
@pytest.fixture(scope="session")
def engine():
    return create_engine("postgresql://localhost/test_database")

# at the end of the test session, drop the created metadata using a fixture with yield
@pytest.fixture(scope="session")
def tables(engine):
    BaseModel.metadata.create_all(engine)
    yield
    BaseModel.metadata.drop_all(engine)

# here scope="function" (the default), so each time a test finishes, the database is cleaned
@pytest.fixture
def dbsession(engine, tables):
    """Returns an sqlalchemy session, and after the test tears down everything properly."""
    connection = engine.connect()
    # begin the nested transaction
    transaction = connection.begin()
    # use the connection with the already started transaction
    session = Session(bind=connection)

    yield session

    session.close()
    # roll back the broader transaction
    transaction.rollback()
    # put back the connection to the connection pool
    connection.close()

## end of the gist.github code
@pytest.fixture
def db_fastapi(dbsession):
    def override_get_db():
        db = dbsession
        try:
            yield db
        finally:
            db.close()

    client.app.dependency_overrides[get_db] = override_get_db
    yield dbsession
# Now you can run your test
def test_create_user_new_email(db_fastapi):
    username = random_email()
    # ...
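For completeness, a hedged sketch of how such a test could continue, reusing the client and db_fastapi defined above and assuming the User model and the random_email()/random_lower_string() helpers from the question (the auth headers used in the original test are omitted here for brevity):

def test_create_user_new_email(db_fastapi):
    username = random_email()
    password = random_lower_string()
    r = client.post("/users/", json={"email": username, "password": password})
    assert 200 <= r.status_code < 300
    # The override makes the endpoint use the same transaction-bound session,
    # so the created row is visible here and rolled back after the test.
    user = db_fastapi.query(User).filter(User.email == username).first()
    assert user is not None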

DB changes made in fixture don't seem to persist to test

I'm writing some pytest code using a sqlite db to test some logic. I set up a root-level fixture to instantiate a db engine:
class SqliteEngine:
    def __init__(self):
        self._conn_engine = create_engine("sqlite://")
        self._conn_engine.execute("pragma foreign_keys=ON")

    def get_engine(self):
        return self._conn_engine

    def get_session(self):
        Session = sessionmaker(bind=self._conn_engine, autoflush=True)
        return Session()

@pytest.fixture(scope="session")
def sqlite_engine():
    sqlite_engine = SqliteEngine()
    return sqlite_engine
Then in my test class, I have:
class TestRbac:
    @pytest.fixture(scope="class")
    def setup_rbac_tables(self, sqlite_engine):
        conn_engine = sqlite_engine.get_engine()
        conn_engine.execute("attach ':memory:' as rbac")
        Application.__table__.create(conn_engine)
        Client.__table__.create(conn_engine)
        Role.__table__.create(conn_engine)

        session = sqlite_engine.get_session()
        application = Application(id=1, name="test-application")
        session.add(application)
        session.flush()

        client = Client(id=0, name="Test", email_pattern="")
        session.add(client)
        session.flush()
Finally, in the test in that class, I tried:
    def test_query_config_data_default(self, sqlite_engine, setup_rbac_tables, rbac):
        conn_engine = sqlite_engine.get_engine()
        session = sqlite_engine.get_session()

        client = Client(id=1, name=factory.Faker("name").generate(), email_pattern="")
        session.add(client)
        session.flush()

        clients = sqlite_engine.get_session().query(Client).all()
        for client in clients:
            print(client.id, client.name)
However, only one client prints (and if I try for Application, none print), and I can't figure out why. Is this a problem with the fixture scopes? Or the engine? Or how sqlite works in pytest?
I'm not an expert on this, but I think you need to define the fixtures in such a way that the session is shared, unless you plan to commit in each fixture. In setup_rbac_tables the session is discarded when the fixture function ends, and when get_session is called again a new session is created.
In my pytest sqlalchemy tests I do something like this, where the db fixture is a db session that is reused between fixtures and in the test:
@pytest.fixture
def customer_user(db):
    from ..model.user import User
    from ..model.auth import Group

    group = db.query(Group).filter(
        Group.name == 'customer').first()
    if not group:
        group = Group(name='customer', label='customer')
    user = User(email=test_email_fmt.format(uuid4().hex), group=group)
    db.add(user)
    return user
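The answer above does not show that shared db fixture; here is a minimal sketch of what it could look like (an assumption on my part), reusing the connection-level transaction pattern from the previous answer so every fixture and the test body share one session:

import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker

@pytest.fixture(scope="session")
def engine():
    # In-memory sqlite engine shared across the whole test session.
    return create_engine("sqlite://")

@pytest.fixture
def db(engine):
    # One connection-level transaction per test; everything is rolled back afterwards,
    # so fixtures and the test body all see the same uncommitted data.
    connection = engine.connect()
    transaction = connection.begin()
    session = sessionmaker(bind=connection)()
    yield session
    session.close()
    transaction.rollback()
    connection.close()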

Handle multiple connections in Flask API

I'm writing a simple internal REST API for our solution using Flask, serving JSON objects through GET calls (including authentication). We have multiple backends to fetch data from. From what I understand, these should be connected to in a function decorated with @app.before_request and assigned to the g global for use in the specific route being requested. It's not a pattern I'm used to.
Here is a toy example of what I'm doing:
@app.before_request
def before_request():
    g.some_conn_a = create_connection('a')
    g.some_conn_b = create_connection('b')
    g.some_client = create_client()

@app.route('/get_some_data')
@requires_auth
def get_some_data():
    # Fetch something from all connections in g
    payload = ...  # Construct payload using above connections
    return jsonify(payload)

@app.route('/get_some_other_data')
@requires_auth
def get_some_other_data():
    # Fetch something from maybe just g.some_conn_b
    payload = ...  # Construct payload using g.some_conn_b
    return jsonify(payload)
This seems wasteful to me if the user makes a request for data residing in only one or two of these connections/clients, like in the get_some_other_data route example.
I'm considering just making the connections/clients in the route functions instead, or loading them lazily. What's the "correct" way? I hope it isn't to make a new module; that seems extreme for what I'm doing.
Riffing on the Flask docs' Database Connections example, you could modify get_db() to accept an argument for each of your multiple connections.
def get_db(conn):
    """Open specified connection if none yet for the current app context."""
    if conn == 'some_conn_a':
        if not hasattr(g, 'some_conn_a'):
            g.some_conn_a = create_connection('a')
        db = g.some_conn_a
    elif conn == 'some_conn_b':
        if not hasattr(g, 'some_conn_b'):
            g.some_conn_b = create_connection('b')
        db = g.some_conn_b
    elif conn == 'some_client':
        if not hasattr(g, 'some_client'):
            g.some_client = create_client()
        db = g.some_client
    else:
        raise Exception("Unknown connection: %s" % conn)
    return db

@app.teardown_appcontext
def close_db(error):
    """Closes the db connections."""
    if hasattr(g, 'some_conn_a'):
        g.some_conn_a.close()
    if hasattr(g, 'some_conn_b'):
        g.some_conn_b.close()
    if hasattr(g, 'some_client'):
        g.some_client.close()
Then you could query each connection as needed:
@app.route('/get_some_data')
def get_some_data():
    data_a = get_db('some_conn_a').query().something()
    data_b = get_db('some_conn_b').query().something()
    data_c = get_db('some_client').query().something()
    payload = {'a': data_a, 'b': data_b, 'c': data_c}
    return jsonify(payload)
The get_db() pattern is preferred over the before_request pattern for lazy loading database connections. The docs examples for Flask 0.11 and up utilize the get_db() pattern to a larger extent.
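If the if/elif chain keeps growing, one way to keep get_db() flat is to drive it from a dict of factory callables. Below is a sketch of that alternative (not from the original answer), assuming the create_connection()/create_client() helpers from the question, which are not defined here:

from flask import Flask, g

app = Flask(__name__)

# Map connection names to factory callables; the factories are the question's helpers.
FACTORIES = {
    'some_conn_a': lambda: create_connection('a'),
    'some_conn_b': lambda: create_connection('b'),
    'some_client': lambda: create_client(),
}

def get_db(conn):
    """Lazily open the named connection and cache it on g for this app context."""
    if conn not in FACTORIES:
        raise KeyError("Unknown connection: %s" % conn)
    if not hasattr(g, conn):
        setattr(g, conn, FACTORIES[conn]())
    return getattr(g, conn)

@app.teardown_appcontext
def close_db(error):
    """Close whichever connections were actually opened during the request."""
    for conn in FACTORIES:
        if hasattr(g, conn):
            getattr(g, conn).close()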
