Since I cannot use Flask-SQLAlchemy (my model definitions and the database part of the app are used in contexts other than Flask), I have found several ways to manage sessions and I am not sure which to use.
One thing that everyone seems to agree on (including me) is that a new session should be created at the beginning of each request, then committed and closed once the request has been processed and the response is ready to be sent back to the client.
Currently, I implemented the session management that way:
I have a database initialization Python script which creates the engine (engine = create_engine(app.config["MYSQL_DATABASE_URI"])) and defines the session maker Session = sessionmaker(bind=engine, expire_on_commit=False).
In another file I defined two functions decorated with Flask's before_request and teardown_request application decorators:
@app.before_request
def create_db_session():
    g.db_session = Session()

@app.teardown_request
def close_db_session(exception):
    try:
        g.db_session.commit()
    except Exception:
        g.db_session.rollback()
    finally:
        g.db_session.close()
I then use g.db_session when I need to perform queries: g.db_session.query(models.User.user_id).filter_by(username=username)
Is this a correct way to manage sessions?
I also took a look at the scoped sessions provided by SQLAlchemy, which might be another way of doing things, but I am not sure how to change my system to use scoped sessions.
If I understood it correctly, I would not use the g variable; instead I would always refer to the Session registry declared by Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False)), and I would not need to initialize a new session explicitly when a request arrives.
I could just perform my queries as usual with Session.query(models.User.user_id).filter_by(username=username) and I would just need to remove the session when the request ends:
@app.teardown_request
def close_db_session(exception):
    Session.commit()
    Session.remove()
I am a bit lost with this session management topic and I need help understanding how to manage sessions. Is there a real difference between the two approaches above?
Your approach of managing the session via flask.g is completely acceptable from my point of view. Whatever we are trying to do with SQLAlchemy, we must remember the basic principles:
Always clean up after yourself. At web application runtime, if you spawn a lot of sessions without .close()ing them, this will eventually exhaust the connections available on your DB instance. You are handling this by calling finally: session.close().
Maintain session independence. It's not good if various application contexts (requests, threads, etc.) share the same session instance, because that is not deterministic. You are doing this by ensuring that exactly one session runs per request.
The scoped_session can be considered an alternative to flask.g: it ensures that within one thread, each call to the Session() constructor returns the same object (https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual).
It is SQLAlchemy's batteries-included version of your session management code.
As long as you are using Flask, which is a synchronous framework, I don't think you will have any issues with either setup.
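For concreteness, here is a minimal sketch of the scoped_session variant, assuming the same engine, app, and models objects as in the question:

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine(app.config["MYSQL_DATABASE_URI"])
Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False))

@app.teardown_request
def remove_db_session(exception):
    try:
        if exception is None:
            Session.commit()
        else:
            Session.rollback()
    finally:
        Session.remove()  # hand the session back to the registry

# In view code, no flask.g is needed:
# Session.query(models.User.user_id).filter_by(username=username)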
I have seen in many places that using thread-local storage to store data in a Django application is not good practice.
But this is the only way I could store my request object. I need to store it because my application has a complex structure, and I can't keep passing the request object to each function call or class initialization.
I need the cookies and headers from my request object to be passed to some API calls I'm making in different places in the application.
I'm using this for reference:
https://blndxp.wordpress.com/2016/03/04/django-get-current-user-anywhere-in-your-code-using-a-middleware/
So I'm using a middleware, as mentioned in the reference.
And this is how the request is stored:
from threading import local

_thread_locals = local()           # module-level thread-local storage
_thread_locals.request = request   # set inside the middleware for each request
And this is how the data is fetched:
getattr(_thread_locals, "request", None)
So is the data stored in the thread-locals local to that particular request? Or, if another request takes place at the same time, do both requests use the same data? (That is certainly not what I want.)
Or is there a newer way of dealing with this old problem (storing the request object globally)?
Note: I'm also using async in places in my Django application (if that matters).
Yes, using thread-local storage in Django is safe.
Django uses one thread to handle each request. Django also uses thread-local data itself, for instance for storing the currently activated locale. While app servers such as Gunicorn and uWSGI can be configured to use multiple threads, each request will still be handled by a single thread.
However, there have been conflicting opinions on whether using thread-locals is an elegant and well-designed solution. The reasons against using thread-locals boil down to the same reasons why global variables are considered bad practice. This answer discusses a number of them.
Still, storing the request object in thread-local data has become a widely used pattern in the Django community. There is even an app Django-CRUM that contains a CurrentRequestUserMiddleware class and the functions get_current_user() and get_current_request().
Note that as of version 3.0, Django has started to implement asynchronous support. I'm not sure what its implications are for apps like Django-CRUM. For the foreseeable future, however, thread-locals can safely be used with Django.
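For reference, here is a minimal sketch of such a middleware (the class and function names are illustrative, not from django-crum), with explicit cleanup so a reused app-server thread never sees a stale request:

from threading import local

_thread_locals = local()

def get_current_request():
    return getattr(_thread_locals, "request", None)

class ThreadLocalRequestMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        _thread_locals.request = request
        try:
            return self.get_response(request)
        finally:
            # App-server threads are reused; don't leak the old request.
            _thread_locals.request = None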
I am trying to use SQLAlchemy with Flask (without the Flask-SQLAlchemy extension). I am struggling with how to integrate SQLAlchemy's session object with Flask.
I have seen several approaches:
1) Create a new session every time I want to query the database and close it at the end of the request (in the request function itself rather than with help from Flask). This does not scale well, as it quickly exhausts the SQLAlchemy connection pool as the number of requests grows.
2) Use a scoped session, query the database, and remove the scoped session at the end of the request using Flask's app.after_request decorator. This approach works, but I am confused about how to make use of it, as some places recommend using Flask's global variable g to keep track of which session is allocated to which request, while others do not use it at all.
I have decided to use the second option, as the first option is not scalable at all for large applications.
I would like to know if there are any pitfalls with the second approach, like concurrent requests handling the same objects, with or without g. Also, does not using the g variable have any effect on SQLAlchemy's scoped_session?
Any help would be appreciated.
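As a partial answer to the concurrency worry, a small experiment (names are illustrative) shows that scoped_session hands each thread its own session, so flask.g is not required for isolation:

import threading
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")
Session = scoped_session(sessionmaker(bind=engine))

def report():
    # Each thread gets a distinct session object from the registry.
    print(threading.current_thread().name, id(Session()))
    Session.remove()

threading.Thread(target=report).start()
threading.Thread(target=report).start()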
I've created a Django REST framework app. It exposes some APIs which do some get/set operations on a MySQL DB.
I have a requirement to make an HTTP request to another server and piggyback that response along with the usual response. I'm trying to use a self-made HTTP connection pool to make HTTP requests instead of opening a new connection on each request.
What is the most appropriate place to keep this app level HTTP connection pool object?
I've looked around for it & there are multiple solutions each with some cons. Here are some:
To make a singleton class for the pool in a different file, but this is not a Pythonic way to do things. There are various discussions on why not to use the singleton design pattern.
Also, I don't know how intelligent it would be to pool a pooler? (:P)
To keep it in __init__.py of the app dir. The issues with that are as follows:
It should only contain imports and things related to them.
It would be difficult to unit test the code, because the import would happen before mocking and it would actually try to hit the API.
To use sessions, but I guess that makes more sense for something user-session specific, like a per-user number, etc.
Also, the object needs to be serializable, and I don't know how an HTTP connection pool could be serialized.
To keep it global in views.py, but that is also discouraged.
What is the best place to store such app/global level variables?
This thread is a bit old but can still be googled. Generally, if you want a component to be accessible among several apps in your Django project, you can put it in a general or core app as a util or whatever.
In terms of reusability and app-specific pools, you can use a factory with a cache mechanism, something like:
from dataclasses import dataclass, field

class ConnectionPool:
    pass

@dataclass
class ConnectionPoolFactory:
    connection_pool_cache: dict[str, ConnectionPool] = field(default_factory=dict)

    def get_connection(self, app_name: str) -> ConnectionPool:
        if self.connection_pool_cache.get(app_name) is None:
            self.connection_pool_cache[app_name] = ConnectionPool()
        return self.connection_pool_cache[app_name]
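Usage might look like this (the app name is just an example); repeated calls with the same name return the cached pool:

factory = ConnectionPoolFactory()
pool_a = factory.get_connection("users_app")  # creates a new pool
pool_b = factory.get_connection("users_app")  # returns the cached pool
assert pool_a is pool_b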
A possible solution is to implement a custom Django middleware, as described in https://docs.djangoproject.com/ja/1.9/topics/http/middleware/.
You could initialize the HTTP connection pool in the middleware's __init__ method, which is only called once, at the first request. Then start the HTTP request in process_request, and in process_response check that it has finished (or wait for it) and append that response to the internal one.
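A minimal sketch of that idea using a new-style middleware class; requests.Session provides the connection pooling, and the URL, header name, and worker count are illustrative assumptions:

import requests
from concurrent.futures import ThreadPoolExecutor

UPSTREAM_URL = "https://example.com/api"  # hypothetical endpoint

class PiggybackMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response
        self.http = requests.Session()  # holds the connection pool
        self.executor = ThreadPoolExecutor(max_workers=4)

    def __call__(self, request):
        # Start the upstream call early so it overlaps with view processing.
        future = self.executor.submit(self.http.get, UPSTREAM_URL, timeout=5)
        response = self.get_response(request)
        # Wait for the upstream response and piggyback it on ours.
        upstream = future.result()
        response["X-Upstream-Status"] = str(upstream.status_code)
        return response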
I have an MS-SQL database deployed on AWS RDS that I'm writing a Flask front end for.
I've been following some intro Flask tutorials, all of which seem to pass the DB credentials in the connection string URI. I'm following the tutorial here:
https://medium.com/@rodkey/deploying-a-flask-application-on-aws-a72daba6bb80#.e6b4mzs1l
For deployment, do I prompt for the DB login info and add it to the connection string? If so, where? Using SQLAlchemy, I don't see any calls to create_engine (using the code in the tutorial); I just see an initialization using config.from_object, referencing the config.py where SQLALCHEMY_DATABASE_URI is stored, which points to the DB location. Calling config.update(dict(UID='****', PASSWORD='******')) from my application has no effect, and the config dict doesn't seem to have any applicable entries for this purpose. What am I doing wrong?
Or should I be authenticating using Flask-User, and then get rid of the DB level authentication? I'd prefer authenticating at the DB layer, for ease of use.
The tutorial you are using relies on Flask-SQLAlchemy to abstract the database setup, which is why you don't see engine.connect().
Extensions like Flask-SQLAlchemy are designed around the idea that you create a connection pool to the database at launch and share that pool among your various worker threads. You will not be able to use that for what you are doing: it takes care of initializing the session and related machinery early in the process.
Because of your requirements, I don't know that you'll be able to make any use of things like connection pooling. Instead, you'll have to handle that yourself. The actual connection isn't too hard...
from sqlalchemy import create_engine

engine = create_engine('dialect://username:password@host/db')
connection = engine.connect()
result = connection.execute("SOME SQL QUERY")
for row in result:
    pass  # do something with each row
connection.close()
The issue is that you're going to have to do that in every endpoint. A database connection isn't something you can store in the session; you'll have to store the credentials there and do a connect/disconnect cycle in every endpoint you write. Worse, you'll have to figure out either encrypted sessions or server-side sessions (without a DB connection!) to prevent those credentials in the session from becoming a horrible security leak.
I promise you, it will be easier both now and in the long run to figure out a simple way to authenticate users so that they can share a connection pool that is abstracted out of your app endpoints. But if you HAVE to do it this way, this is how you will do it. (make sure you are closing those connections every time!)
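If you do have to go that route, a small helper can at least keep the connect/close discipline in one place. A sketch, assuming the credentials are kept in Flask's (encrypted or server-side) session under hypothetical keys:

from contextlib import contextmanager
from flask import session
from sqlalchemy import create_engine

@contextmanager
def user_connection():
    # "db_user"/"db_password" are illustrative session keys, not a standard.
    engine = create_engine("mssql+pyodbc://{u}:{p}@host/db".format(
        u=session["db_user"], p=session["db_password"]))
    conn = engine.connect()
    try:
        yield conn
    finally:
        conn.close()
        engine.dispose()  # don't keep a per-user pool around

# In an endpoint:
# with user_connection() as conn:
#     result = conn.execute("SOME SQL QUERY")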
I have a Pyramid web application with some form pages that reads data from database and write to it as well.
The application uses SQLAlchemy with a PostgreSQL database and here is how I setup the SQLAlchemy session:
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from zope.sqlalchemy import ZopeTransactionExtension
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
When I process a form, I need to perform an explicit commit wrapped in a try block to see whether the commit worked. I need this explicit commit because I have deferrable triggers in the PostgreSQL database (checks that are performed at commit time), and there are cases where the absence of errors is not predictable.
Once I have successfully committed a transaction, for example adding an instance of MyClass, I would like to get some attributes on this instance and also some attributes on linked instances. Indeed, I cannot get that data before committing because it is computed by the database itself.
My problem is that when I use transaction.commit() (from the transaction package), the session is automatically closed and I cannot use the instance anymore because it is in the Detached state. The documentation confirms this point.
So, as mentioned in the documentation, I tried to use the following session setup instead:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension(keep_session=True)))
However, now the session scope is no longer the same as the HTTP request scope: no ROLLBACK is sent at the end of HTTP requests that only perform read queries.
So, is there a way to have a session that has the same scope as the HTTP request but does not close automatically on commit?
You can detach the session from the request via keep_session=True on the ZTE. However, you probably also want to use the objects after the session is committed; otherwise you'd be happy with a new session. Thus, you'll also want expire_on_commit=False on your session. After that, you've successfully detached the session from the lifecycle of pyramid_tm and you can commit/abort as you please.

So how do we reattach it to the request lifecycle without using pyramid_tm? Well, if you wrap your DBSession in something a little less global, it becomes more manageable as a request-scoped variable. From there, it's obvious when the session is created, and when it should be destroyed via a request-finished callback. Here's a summary of my prose:
import transaction
from pyramid.config import Configurator
from sqlalchemy.orm import sessionmaker
from zope.sqlalchemy import ZopeTransactionExtension

def get_db(request):
    session = request.registry['db_session_factory']()

    def _closer(request):
        session.close()

    request.add_finished_callback(_closer)
    return session

def main(global_conf, **settings):
    config = Configurator()
    DBSession = sessionmaker(
        expire_on_commit=False,
        extension=ZopeTransactionExtension(keep_session=True))
    # Look, we don't even need a scoped_session anymore, because it's not
    # global, it's local to the request!
    config.registry['db_session_factory'] = DBSession
    config.add_request_method(get_db, 'db', reify=True)

def myview(request):
    db = request.db  # creates, or reuses, the session created for this request
    model = db.query(MyModel).first()
    transaction.commit()
    # model is still valid here
    return {}
Of course, if we're doing all this, the ZTE may not be helping you at all, and you may just want to use db.commit() and handle things yourself. The finished callbacks will still be invoked if an exception occurs, so you don't need pyramid_tm to clean up after you.
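For the deferrable-trigger case from the question, the manual variant might look like this sketch (DBAPIError is one plausible exception class to catch; adjust to your driver):

from sqlalchemy.exc import DBAPIError

def myview(request):
    db = request.db
    obj = MyModel()  # hypothetical model instance
    db.add(obj)
    try:
        db.commit()  # deferrable triggers fire at commit time
    except DBAPIError:
        db.rollback()
        return {"error": "constraint violated at commit time"}
    # expire_on_commit=False keeps DB-computed attributes usable here
    return {"id": obj.id}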