Why does opening the same database with SQLAlchemy give different sessions, and how can I update it? - python

I am writing some tests with pytest; I want to test creating a user and sending an email with a POST method.
After some debugging, I know the issue is that I open two in-memory databases, but they should be the same database, SessionLocal().
So how can I fix this? I tried db.flush(), but it doesn't work.
This is the POST method code:
@router.post("/", response_model=schemas.User)
def create_user(
    *,
    db: Session = Depends(deps.get_db),  # get_db yields SessionLocal()
    user_in: schemas.UserCreate,
    current_user: models.User = Depends(deps.get_current_active_superuser),
) -> Any:
    """
    Create new user.
    """
    user = crud.user.get_by_email(db, email=user_in.email)
    if user:
        raise HTTPException(
            status_code=400,
            detail="The user with this username already exists in the system.",
        )
    user = crud.user.create(db, obj_in=user_in)
    print("====post====")
    print(db.query(models.User).count())
    print(db)
    if settings.EMAILS_ENABLED and user_in.email:
        send_new_account_email(
            email_to=user_in.email, username=user_in.email, password=user_in.password
        )
    return user
and the test code is:
def test_create_user_new_email(
    client: TestClient, superuser_token_headers: dict, db: Session  # db is SessionLocal()
) -> None:
    username = random_email()
    password = random_lower_string()
    data = {"email": username, "password": password}
    r = client.post(
        f"{settings.API_V1_STR}/users/", headers=superuser_token_headers, json=data,
    )
    assert 200 <= r.status_code < 300
    created_user = r.json()
    print("====test====")
    print(db.query(User).count())
    print(db)
    user = crud.user.get_by_email(db, email=username)
    assert user
    assert user.email == created_user["email"]
and the test result is:
> assert user
E assert None
====post====
320
<sqlalchemy.orm.session.Session object at 0x7f0a9f660910>
====test====
319
<sqlalchemy.orm.session.Session object at 0x7f0aa09c4d60>
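Whatever the backend, the two different Session objects and diverging counts mean the test session and the request session are not looking at the same database state. With SQLite specifically, the classic cause of exactly this symptom is that every separate connection to ":memory:" is its own private database. A minimal illustration with the stdlib sqlite3 module (illustration only, not the app's code):

```python
import sqlite3

# Each connection to ":memory:" creates a brand-new, private database,
# so data written through one connection is invisible to the other.
conn_a = sqlite3.connect(":memory:")
conn_b = sqlite3.connect(":memory:")

conn_a.execute("CREATE TABLE users (email TEXT)")
conn_a.execute("INSERT INTO users VALUES ('a@example.com')")
conn_a.commit()

count_a = conn_a.execute("SELECT COUNT(*) FROM users").fetchone()[0]
try:
    conn_b.execute("SELECT COUNT(*) FROM users")
    table_in_b = True
except sqlite3.OperationalError:  # "no such table: users" in conn_b
    table_in_b = False

print(count_a, table_in_b)  # 1 False
```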

Your code does not provide enough information to help you; the key issues are probably hidden behind your comments.
It also seems like you are confusing SQLAlchemy sessions and databases. If you are not familiar with these concepts, I highly recommend you have a look at the SQLAlchemy documentation.
But, looking at your code structure, it seems like you are using FastAPI.
If you want to test SQLAlchemy with pytest, I recommend using a pytest fixture with SQL transactions.
Here is my suggestion on how to implement such a test. I'll assume that you want to run the tests on your actual database and not create a new database especially for the tests. This implementation is heavily based on this GitHub gist (the author made a "feel free to use" statement, so I suppose he is ok with me copying his code here):
# test.py
import pytest
from sqlalchemy import create_engine
from sqlalchemy.orm import Session
from fastapi.testclient import TestClient

from myapp.models import BaseModel
from myapp.main import app  # import your fastapi app
from myapp.database import get_db  # import the dependency

client = TestClient(app)

# scope="session" means that the engine will last for the whole test session
@pytest.fixture(scope="session")
def engine():
    return create_engine("postgresql://localhost/test_database")

# at the end of the test session, drop the created metadata using a fixture with yield
@pytest.fixture(scope="session")
def tables(engine):
    BaseModel.metadata.create_all(engine)
    yield
    BaseModel.metadata.drop_all(engine)

# here scope="function" (the default), so the database is cleaned each time a test finishes
@pytest.fixture
def dbsession(engine, tables):
    """Returns an sqlalchemy session, and after the test tears down everything properly."""
    connection = engine.connect()
    # begin the nested transaction
    transaction = connection.begin()
    # use the connection with the already started transaction
    session = Session(bind=connection)

    yield session

    session.close()
    # roll back the broader transaction
    transaction.rollback()
    # put back the connection to the connection pool
    connection.close()

## end of the gist.github code

@pytest.fixture
def db_fastapi(dbsession):
    def override_get_db():
        db = dbsession
        try:
            yield db
        finally:
            db.close()

    client.app.dependency_overrides[get_db] = override_get_db
    yield dbsession

# Now you can run your test
def test_create_user_new_email(db_fastapi):
    username = random_email()
    # ...
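The isolation trick in the dbsession fixture above is that each test runs inside a transaction that is rolled back afterwards. The same idea, sketched with plain stdlib sqlite3 instead of SQLAlchemy so it is runnable on its own:

```python
import sqlite3

# Sketch of the rollback-per-test pattern: the "test's" writes happen
# inside an open transaction and vanish when it is rolled back.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (email TEXT)")
conn.commit()  # schema setup is committed, like metadata.create_all

# "the test runs": the INSERT implicitly opens a transaction
conn.execute("INSERT INTO users VALUES ('tmp@example.com')")
during = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

# "the test ends": the fixture rolls everything back
conn.rollback()
after = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]

print(during, after)  # 1 0
```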

Related

FastAPI database dependency setup for connection pooling

Consider the following fastapi setup:
application.add_event_handler(
    "startup",
    create_start_app_handler(application, settings),
)

def create_start_app_handler(
    app: FastAPI,
    settings: AppSettings,
) -> Callable:
    async def start_app() -> None:
        await connect_to_db(app, settings)

    return start_app

async def connect_to_db(app: FastAPI, settings: AppSettings) -> None:
    db_url = settings.DATABASE_URL
    engine = create_engine(db_url, pool_size=settings.POOL_SIZE, max_overflow=settings.MAX_OVERFLOW)
    SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
    db = SessionLocal()

    def close_db():
        db.close()
        engine.dispose()

    app.state.db = db
    app.state.close_db = close_db
close_db is used to close the database connection on app shutdown.
I have the following dependencies defined:
def _get_db(request: Request) -> Generator:
    yield request.app.state.db

def get_repository(
    repo_type: Type[BaseRepository],
) -> Callable[[Session], BaseRepository]:
    def _get_repo(
        sess: Session = Depends(_get_db),
    ) -> BaseRepository:
        return repo_type(sess)

    return _get_repo
Would this still allow me to take advantage of connection pooling?
Also, this feels a little hacky and I could use some feedback if there's anything in particular that I should not be doing.
To be blunt: it seems overly complicated for something that is pretty well covered in the docs.
In your case, you create only one instance of SessionLocal() and share it across all your requests (because you store it in app.state). In other words: no, this will not use connection pooling; it will use only one connection.
A better approach is to yield an instance per request, either via middleware or via a dependency. That way, the connection is actually closed when the incoming request has been fully handled. For example, like this:
def get_db():
    db = SessionLocal()
    try:
        yield db
    finally:
        db.close()

@app.get("/")
def root(db: Session = Depends(get_db)):
    return "hello world"
I am not sure how you ended up where you ended up, but I would recommend refactoring a bunch.
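The yield-style get_db dependency works because FastAPI drives it as a generator: advance to the yield to obtain the session, run the request handler, then finish the generator so the finally block closes the session. A FastAPI-free sketch of that per-request lifecycle (FakeSession and handle_request are made-up names for illustration):

```python
class FakeSession:
    """Stand-in for SessionLocal(); tracks whether it was closed."""
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

def get_db():
    db = FakeSession()
    try:
        yield db
    finally:
        db.close()

def handle_request():
    # Simulate what FastAPI does for each request.
    dep = get_db()
    db = next(dep)        # dependency resolved: session is open
    assert not db.closed  # ...while the request is being handled
    dep.close()           # request finished -> finally block runs
    return db

s1 = handle_request()
s2 = handle_request()
print(s1.closed, s1 is s2)  # True False  (fresh session per request, closed after use)
```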

Pytest - Writing to real database while test script runs in test client with @pytest.fixture

I have a test client which is implemented using a @pytest.fixture named client. In the test client I have all my database tables/models. Inside the test script block, I want to be able to write log records to the real database tables, not to the test client tables. The test script works fine; however, there is no change in the real database tables.
Here is the simplified example:
@pytest.fixture
def client():
    tables = [
        User,
        Profile
    ]
    with db.atomic():
        db.drop_tables(tables)
        db.create_tables(tables)
    with app.test_client() as client:
        yield client
    with db.atomic():
        db.drop_tables(tables)

def test_script(client):
    with db.atomic():
        User.create(
            name = "example",
            surname = "example"
        )
    assert True == True
Additionally, I don't know if it matters, but I am using PeeWee as the ORM and SQLite for storage.

DB changes made in fixture don't seem to persist to test

I'm writing some pytest code using a sqlite db, to test some logic. I set up a root-level fixture to instantiate a db engine:
class SqliteEngine:
    def __init__(self):
        self._conn_engine = create_engine("sqlite://")
        self._conn_engine.execute("pragma foreign_keys=ON")

    def get_engine(self):
        return self._conn_engine

    def get_session(self):
        Session = sessionmaker(bind=self._conn_engine, autoflush=True)
        return Session()

@pytest.fixture(scope="session")
def sqlite_engine():
    sqlite_engine = SqliteEngine()
    return sqlite_engine
Then in my test class, I have:
class TestRbac:
    @pytest.fixture(scope="class")
    def setup_rbac_tables(self, sqlite_engine):
        conn_engine = sqlite_engine.get_engine()
        conn_engine.execute("attach ':memory:' as rbac")
        Application.__table__.create(conn_engine)
        Client.__table__.create(conn_engine)
        Role.__table__.create(conn_engine)
        session = sqlite_engine.get_session()
        application = Application(id=1, name="test-application")
        session.add(application)
        session.flush()
        client = Client(id=0, name="Test", email_pattern="")
        session.add(client)
        session.flush()
Finally, in the test in that class, I tried:
def test_query_config_data_default(self, sqlite_engine, setup_rbac_tables, rbac):
    conn_engine = sqlite_engine.get_engine()
    session = sqlite_engine.get_session()
    client = Client(id=1, name=factory.Faker("name").generate(), email_pattern="")
    session.add(client)
    session.flush()
    clients = sqlite_engine.get_session().query(Client).all()
    for client in clients:
        print(client.id, client.name)
However, only one client prints (and if I try for Application, none print), and I can't figure out why. Is this a problem with the fixture scopes? Or the engine? Or how sqlite works in pytest?
I'm not an expert on this, but I think you need to define the fixture in such a way that the session is shared, unless you plan to commit in each fixture. In setup_rbac_tables the session is discarded when the fixture finishes, and when get_session is called again a new session is created.
In my pytest SQLAlchemy tests I do something like this, where the db fixture is a db session that is reused between fixtures and in the test:
@pytest.fixture
def customer_user(db):
    from ..model.user import User
    from ..model.auth import Group

    group = db.query(Group).filter(
        Group.name == 'customer').first()
    if not group:
        group = Group(name='customer', label='customer')
    user = User(email=test_email_fmt.format(uuid4().hex), group=group)
    db.add(user)
    return user
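The underlying visibility rule is worth spelling out: a flush only writes rows into the flushing session's open transaction; another connection cannot see them until that transaction commits. A sketch of this with plain stdlib sqlite3 and two connections to the same database file (file path is a throwaway temp file):

```python
import os
import sqlite3
import tempfile

# Uncommitted writes live in the writer's open transaction and are not
# visible from a second connection until commit() is called.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE client (id INTEGER, name TEXT)")
writer.commit()

writer.execute("INSERT INTO client VALUES (1, 'Test')")  # flushed, not committed
before = reader.execute("SELECT COUNT(*) FROM client").fetchone()[0]

writer.commit()
after = reader.execute("SELECT COUNT(*) FROM client").fetchone()[0]

print(before, after)  # 0 1
```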

Handle multiple connections in Flask API

I'm writing a simple internal REST API for our solution using Flask, serving JSON objects through GET calls (including authentication). We have multiple backends to fetch data from. From what I understand, these should be connected to in a function decorated with @app.before_request and assigned to the g global for use in the specific route being requested. It's not a pattern I'm used to.
Here is a toy example of what I'm doing:
@app.before_request
def before_request():
    g.some_conn_a = create_connection('a')
    g.some_conn_b = create_connection('b')
    g.some_client = create_client()

@app.route('/get_some_data')
@requires_auth
def get_some_data():
    # Fetch something from all connections in g
    payload = ...  # Construct payload using above connections
    return jsonify(payload)

@app.route('/get_some_other_data')
@requires_auth
def get_some_other_data():
    # Fetch something from maybe just g.some_conn_b
    payload = ...  # Construct payload using g.some_conn_b
    return jsonify(payload)
This seems wasteful to me if the user makes a request for data residing in only one or two of these connections/clients, as in the get_some_other_data route example.
I'm considering just making the connections/clients in the route functions instead, or loading them lazily. What's the "correct" way? I hope it isn't to make a new module; that seems extreme for what I'm doing.
Riffing on the Flask docs Database Connections example, you could modify get_db() to accept an argument for each of your multiple connections.
def get_db(conn):
    """Open the specified connection if none yet for the current app context."""
    if conn == 'some_conn_a':
        if not hasattr(g, 'some_conn_a'):
            g.some_conn_a = create_connection('a')
        db = g.some_conn_a
    elif conn == 'some_conn_b':
        if not hasattr(g, 'some_conn_b'):
            g.some_conn_b = create_connection('b')
        db = g.some_conn_b
    elif conn == 'some_client':
        if not hasattr(g, 'some_client'):
            g.some_client = create_client()
        db = g.some_client
    else:
        raise Exception("Unknown connection: %s" % conn)
    return db
@app.teardown_appcontext
def close_db(error):
    """Closes the db connections."""
    if hasattr(g, 'some_conn_a'):
        g.some_conn_a.close()
    if hasattr(g, 'some_conn_b'):
        g.some_conn_b.close()
    if hasattr(g, 'some_client'):
        g.some_client.close()
Then you could query each connection as needed:
@app.route('/get_some_data')
def get_some_data():
    data_a = get_db('some_conn_a').query().something()
    data_b = get_db('some_conn_b').query().something()
    data_c = get_db('some_client').query().something()
    payload = {'a': data_a, 'b': data_b, 'c': data_c}
    return jsonify(payload)
The get_db() pattern is preferred over the before_request pattern for lazy loading database connections. The docs examples for Flask 0.11 and up utilize the get_db() pattern to a larger extent.
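The if/elif ladder in get_db() can also be table-driven, which keeps the lazy once-per-context behavior while making new connections a one-line addition. A Flask-free sketch using a plain namespace in place of flask.g (the factory lambdas stand in for create_connection/create_client):

```python
from types import SimpleNamespace

g = SimpleNamespace()  # stand-in for flask.g; one per app context

FACTORIES = {
    "some_conn_a": lambda: object(),  # create_connection('a') in the real app
    "some_conn_b": lambda: object(),  # create_connection('b')
    "some_client": lambda: object(),  # create_client()
}

def get_db(conn):
    """Lazily open the named connection, at most once per app context."""
    if conn not in FACTORIES:
        raise KeyError("Unknown connection: %s" % conn)
    if not hasattr(g, conn):
        setattr(g, conn, FACTORIES[conn]())
    return getattr(g, conn)

a1 = get_db("some_conn_a")
a2 = get_db("some_conn_a")  # cached: same object within this context
print(a1 is a2)             # True
```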

Flask RESTful API: Issue with db connection pool

I'm trying to create REST API endpoints using flask framework. This is my fully working script:
from flask import Flask, jsonify
from flask_restful import Resource, Api
from flask_restful import reqparse
from sqlalchemy import create_engine
from flask.ext.httpauth import HTTPBasicAuth
from flask.ext.cors import CORS

conn_string = "mssql+pyodbc://x:x@x:1433/x?driver=SQL Server"
auth = HTTPBasicAuth()

@auth.get_password
def get_password(username):
    if username == 'x':
        return 'x'
    return None

app = Flask(__name__)
cors = CORS(app)
api = Api(app)

class Report(Resource):
    decorators = [auth.login_required]

    def get(self):
        parser = reqparse.RequestParser()
        parser.add_argument('start', type = str)
        parser.add_argument('end', type = str)
        args = parser.parse_args()
        e = create_engine(conn_string)
        conn = e.connect()
        stat = """
            select x from report
        """
        query = conn.execute(stat)
        json_dict = []
        for i in query.cursor.fetchall():
            res = {'x': i[0], 'xx': i[1]}
            json_dict.append(res)
        conn.close()
        e.dispose()
        return jsonify(results=json_dict)

api.add_resource(Report, '/report')

if __name__ == '__main__':
    app.run(host='0.0.0.0')
The issue is that I get results when I call this API only for a day or so, after which I stop getting results unless I restart my script (or sometimes even my VM), after which I get results again. I reckon there is some issue with the database connection pool, but I'm closing the connection and disposing of it as well. I have no idea why the API gives me results only for a while, because of which I have to restart my VM every single day. Any ideas?
In my experience, the issue is caused by calling create_engine(conn_string) inside the Report class, so that a db pool is created and destroyed for every REST request. That is not the correct way to use SQLAlchemy, and it causes I/O resource clashes around the DB connections; see the engine.dispose() function description at http://docs.sqlalchemy.org/en/rel_1_0/core/connections.html#sqlalchemy.engine.Engine.
To resolve the issue, you just need to move e = create_engine(conn_string) to just below the line conn_string = "mssql+pyodbc://x:x@x:1433/x?driver=SQL Server" and remove the e.dispose() call from the Report class, see below.
conn_string = "mssql+pyodbc://x:x@x:1433/x?driver=SQL Server"
e = create_engine(conn_string)  # moved to module level
In the def get(self) function:
args = parser.parse_args()
# Moved out: e = create_engine(conn_string)
conn = e.connect()
and
conn.close()
# Removed: e.dispose()
return jsonify(results=json_dict)
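The point of moving create_engine() to module level is that the engine owns the connection pool; building and disposing it per request throws the pool away every time. A toy stand-in pool (TinyPool is a made-up illustration, not SQLAlchemy's implementation) makes the difference in connection churn visible:

```python
import queue

class TinyPool:
    """Toy stand-in for an engine's connection pool (illustration only)."""
    def __init__(self, factory, size):
        self.created = 0  # how many "real" connections were opened
        self._idle = queue.Queue()
        for _ in range(size):
            self.created += 1
            self._idle.put(factory())

    def connect(self):
        return self._idle.get()

    def release(self, conn):
        self._idle.put(conn)

# Engine (and pool) created once at module level: connections are reused.
pool = TinyPool(factory=lambda: object(), size=2)
for _ in range(10):  # ten "requests"
    conn = pool.connect()
    pool.release(conn)
print(pool.created)  # 2

# Engine created inside every request, as in the original get(): a fresh
# pool of connections is opened and thrown away each time.
per_request_created = 0
for _ in range(10):
    per_request_created += TinyPool(factory=lambda: object(), size=2).created
print(per_request_created)  # 20
```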
