Understanding SQLAlchemy Sessions - How volatile are they? - python

I'm experimenting with SQLAlchemy and trying to get a grasp of how I should treat connection objects.
So sessionmaker returns a session factory (confusingly also called Session throughout the documentation), and you use that factory to create session objects, which sound a lot like database cursors to me.
What is a session object, specifically? Is it as ephemeral as a db cursor, or is it more material (does a session bind exclusively to one of the underlying connections in the engine's connection pool, for example)?

The Session object is not a database cursor; while using the Session you may open and close any number of individual cursors. Within a single session's lifespan you may insert some records, run queries, issue updates, and delete.
There's a FAQ on the session where this topic is addressed; in short, the Session is an in-memory object implementing an identity map pattern which will sync the state of objects as they exist in your application with the database upon commit.
# User here is some SQLAlchemy model
user = session.query(User).filter(User.name == 'John').one()
user.name = 'John Smith'
At this stage, the database still thinks this user's name is John, and it will continue to until the session is flushed or committed. Note that under most configurations, any query you run from a session automatically flushes it first, so you usually don't need to worry about this.
Now let's inspect our user to better understand what the session is keeping track of:
>>> from sqlalchemy import orm
>>> orm.attributes.get_history(user, 'name')
History(added=['John Smith'], unchanged=(), deleted=['John'])
Watch what happens once we've flushed the session:
>>> session.flush()
>>> orm.attributes.get_history(user, 'name')
History(added=(), unchanged=['John Smith'], deleted=())
However, if we do not commit the session but instead roll it back, our change will not stick:
>>> session.rollback()
>>> orm.attributes.get_history(user, 'name')
History(added=(), unchanged=['John'], deleted=())
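The identity map can be observed directly as well: fetching the same row twice within one session hands back the same Python object, not a second copy. A minimal sketch, assuming SQLAlchemy 1.4+ and an in-memory SQLite database (the User model here is illustrative):

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('sqlite://')      # in-memory database
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(User(name='John'))
session.commit()

a = session.query(User).filter(User.name == 'John').one()
b = session.get(User, a.id)   # served from the identity map
assert a is b                 # the same Python object, not merely equal rows
```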
The Session object is a public API for the underlying connection and transaction objects. To understand how connections and transactions work in SQLAlchemy, take a look at the core documentation's section on the topic.
UPDATE: Session persistence
The Session stays open until explicitly closed via Session.close(). In a web application, transaction managers often handle this for you automatically, but, for instance, failing to close the sessions you open in a test suite can cause problems due to the many transactions left open.
The Session holds your changes entirely in Python until it is flushed, either explicitly via Session.flush() or, if autoflush is on, automatically when a query is run. Once flushed, the session emits SQL within a transaction to the database. Repeated flushes simply emit more SQL within that transaction. Appropriate calls to Session.begin and Session.begin_nested can create sub-transactions (savepoints) if your underlying engine/database supports them.
Calls to Session.commit and Session.rollback end the currently active transaction, emitting COMMIT or ROLLBACK.
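Session.begin_nested emits a SAVEPOINT, so an inner block of work can be rolled back without losing the outer transaction's changes. A sketch, assuming SQLAlchemy 1.4+ and a backend that supports savepoints (SQLite and PostgreSQL both do); the model is illustrative:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.add(User(name='kept'))        # part of the outer transaction
try:
    with session.begin_nested():      # emits SAVEPOINT
        session.add(User(name='discarded'))
        raise RuntimeError('oops')    # triggers ROLLBACK TO SAVEPOINT
except RuntimeError:
    pass

session.commit()                      # the outer work is still intact
names = [u.name for u in session.query(User)]
assert names == ['kept']              # only the inner insert was rolled back
```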
Turn on echo=True when you initialize your engine and watch the SQL emitted by various Session methods to better understand what's happening.

Related

SQLAlchemy used in Flask, Session management implementation

As I cannot use Flask-SQLAlchemy (my models are defined, and the database part of the app is used, in contexts other than Flask), I found several ways to manage sessions and I am not sure what to do.
One thing that everyone seems to agree (including me) is that a new session should be created at the beginning of each request and be committed + closed when the request has been processed and the response is ready to be sent back to the client.
Currently, I implemented the session management that way:
I have a database initialization Python script which creates the engine (engine = create_engine(app.config["MYSQL_DATABASE_URI"])) and defines the session maker Session = sessionmaker(bind=engine, expire_on_commit=False).
In another file I defined two functions decorated with Flask's before_request and teardown_request application decorators.
@app.before_request
def create_db_session():
    g.db_session = Session()

@app.teardown_request
def close_db_session(exception):
    try:
        g.db_session.commit()
    except:
        g.db_session.rollback()
    finally:
        g.db_session.close()
I then use the g.db_session when I need to perform queries: g.db_session.query(models.User.user_id).filter_by(username=username)
Is this a correct way to manage sessions?
I also took a look at the scoped sessions proposed by SQLAlchemy, and this might be another way of doing things, but I am not sure how to change my system to use scoped sessions.
If I understand it correctly, I would not use the g variable; instead I would always refer to the Session registry declared by Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False)), and I would not need to initialize a new session explicitly when a request arrives.
I could just perform my queries as usual with Session.query(models.User.user_id).filter_by(username=username) and I would just need to remove the session when the request ends:
@app.teardown_request
def close_db_session(exception):
    Session.commit()
    Session.remove()
I am a bit lost with this session management topic and I would need help to understand how to manage sessions. Is there a real difference between the two approaches above?
Your approach of managing the session via flask.g is completely acceptable from my point of view. Whatever we are trying to do with SQLAlchemy, we must remember the basic principles:
Always clean up after yourself. At web application runtime, if you spawn a lot of sessions without .close()ing them, you will eventually exhaust the connections available to your DB instance. You are handling this by calling session.close() in a finally block.
Maintain session independence. It's not good if different application contexts (requests, threads, etc.) share the same session instance, because the results are not deterministic. You are handling this by ensuring only one session runs per request.
The scoped_session can be considered an alternative to flask.g: it ensures that within one thread, each call to the Session() constructor returns the same object - https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual
It's SQLAlchemy's batteries-included version of your session-management code.
As long as you are using Flask, which is a synchronous framework, I don't think you will have any issues with this setup.
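The per-thread behavior of scoped_session can be sketched without Flask at all; the `Session.remove()` call is what you would put in a teardown handler. A minimal sketch, assuming SQLAlchemy 1.4+ and an in-memory SQLite database:

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine('sqlite://')
Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False))

# Anywhere in request-handling code, Session() yields the one session
# registered for the current thread (the "scope").
s1 = Session()
s2 = Session()
assert s1 is s2          # same session within the same scope

Session.remove()         # end of request: discard the thread's session
s3 = Session()
assert s3 is not s1      # the next access starts a fresh session
```

This is why, with scoped_session, you can query through the registry itself (Session.query(...)) instead of passing a session object around in flask.g.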

Do I authenticate at database level, at Flask User level, or both?

I have an MS-SQL deployed on AWS RDS, that I'm writing a Flask front end for.
I've been following some intro Flask tutorials, all of which seem to pass the DB credentials in the connection string URI. I'm following the tutorial here:
https://medium.com/@rodkey/deploying-a-flask-application-on-aws-a72daba6bb80#.e6b4mzs1l
For deployment, do I prompt for the DB login info and add it to the connection string? If so, where? Using SQLAlchemy, I don't see any calls to create_engine in the tutorial's code; I just see an initialization using config.from_object, referencing the config.py where the SQLALCHEMY_DATABASE_URI is stored, which points to the DB location. Calling config.update(dict(UID='****', PASSWORD='******')) from my application has no effect, and the config dict doesn't seem to have any applicable entries to set for this purpose. What am I doing wrong?
Or should I be authenticating using Flask-User, and then get rid of the DB level authentication? I'd prefer authenticating at the DB layer, for ease of use.
The tutorial you are following uses Flask-SQLAlchemy to abstract away the database setup; that's why you don't see engine.connect().
Frameworks like Flask-SQLAlchemy are designed around the idea that you create a connection pool to the database on launch and share that pool among your various worker threads. You will not be able to use that for what you are doing: it takes care of initializing the session and engine early in the process.
Because of your requirements, I don't know that you'll be able to make any use of things like connection pooling. Instead, you'll have to handle that yourself. The actual connection isn't too hard...
engine = create_engine('dialect://username:password@host/db')
connection = engine.connect()
result = connection.execute("SOME SQL QUERY")
for row in result:
    pass  # do something with each row
connection.close()
The issue is that you're going to have to do that in every endpoint. A database connection isn't something you can store in the session; you'll have to store the credentials there and do a connect/disconnect cycle in every endpoint you write. Worse, you'll have to figure out either encrypted sessions or server-side sessions (without a DB connection!) to keep those credentials in the session from becoming a horrible security leak.
I promise you, it will be easier both now and in the long run to figure out a simple way to authenticate users so that they can share a connection pool that is abstracted out of your app endpoints. But if you HAVE to do it this way, this is how you will do it. (Make sure you close those connections every time!)
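If you do decide to build the connection string from credentials collected at runtime, sqlalchemy.engine.URL.create (available in SQLAlchemy 1.4+) assembles and escapes the URL for you, which is safer than string formatting. All field values below are placeholders, not from the tutorial:

```python
from sqlalchemy.engine import URL

# Placeholder values: in practice, username/password would come from
# whatever login prompt or form you put in front of the user.
url = URL.create(
    drivername='mssql+pyodbc',
    username='some_user',
    password='s3cret!',   # special characters are percent-encoded for you
    host='example.rds.amazonaws.com',
    database='mydb',
)
```

The resulting URL object can be passed directly to create_engine(url).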

In SQLAlchemy, why can I still commit to the database after closing a session?

I'm using SQLAlchemy with Postgres, for the first time. After doing the dance of
engine = create_engine('postgresql://localhost/test', convert_unicode=True)
db_session = scoped_session(sessionmaker(
    autocommit=False,
    autoflush=False,
    bind=engine,
))
I ran code to add some stuff to the test database, and after that, I issued a db_session.remove() command (and .close() for good measure). Yet, I was still able to query and modify the tables in the test database. If removing the session doesn't affect its functionality, what's the purpose of removing it? Am I doing something wrong?
scoped_session is a special factory object. Accessing methods such as query() creates a session. One session exists per scope, so the same session will be used each time query (for example) is accessed. Calling remove() will remove the session and the next attribute access will create a new session.
Calling close() on a session will release the connection resources associated with it. They will be re-acquired when the session is used again.
close() and remove() are "temporary", they don't make the session unusable in the future. Typically, you do not need to close or remove a scoped_session, it will be managed automatically.
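That "temporary" quality is easy to demonstrate: a closed session simply re-acquires a connection on its next use, which is exactly why the asker could keep querying. A minimal sketch, assuming SQLAlchemy 1.4+ and an in-memory SQLite database:

```python
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

engine = create_engine('sqlite://')
session = sessionmaker(bind=engine)()

assert session.execute(text('SELECT 1')).scalar() == 1
session.close()   # releases the connection back to the pool

# The session is still usable: the next operation begins a new
# transaction and checks a connection back out of the pool.
assert session.execute(text('SELECT 1')).scalar() == 1
```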

Using SQLAlchemy sessions with flask & concurrency problems

I'm working on an API with Flask and SQLAlchemy, and here's what I would like to do :
I have a client application, running on multiple tablets, that has to send several requests to add content to the server.
But I don't want to use auto-rollback at the end of each API request (the default behavior with Flask-SQLAlchemy), because the data is sent with multiple requests, as in this very simplified example:
1. beginTransaction/?id=transactionId -> opens a new session for the client making that request. SessionManager.new_session() in the code below.
2. addObject/?id=objectAid -> adds an object to the PostGreSQL database and flush
3. addObject/?id=objectBid -> adds an object to the PostGreSQL database and flush
4. commitTransaction/?id=transactionId -> commits what happened since the beginTransaction. SessionManager.commit() in the code below.
The point here is to not add the data to the server if the client app crashed or lost its connection before the « commitTransaction » was sent, thus preventing incomplete data on the server.
Since I don't want auto-rollback, I can't really use Flask-SQLAlchemy, so I'm wiring up SQLAlchemy myself in my Flask application, but I'm not sure how to use the sessions.
Here's the implementation I did in __init__.py:
db = create_engine('postgresql+psycopg2://admin:pwd@localhost/postgresqlddb',
                   pool_reset_on_return=False,
                   echo=True, pool_size=20, max_overflow=5)
Base = declarative_base()
metadata = Base.metadata
metadata.bind = db

# create a configured "Session" class
Session = scoped_session(sessionmaker(bind=db, autoflush=False))
class SessionManager(object):
    currentSession = Session()

    @staticmethod
    def new_session():
        # if a session is already opened by the client, close it,
        # then create a new session
        try:
            SessionManager.currentSession.rollback()
            SessionManager.currentSession.close()
        except Exception as e:
            print(e)
        SessionManager.currentSession = Session()
        return SessionManager.currentSession

    @staticmethod
    def flush():
        try:
            SessionManager.currentSession.flush()
            return True
        except Exception as e:
            print(e)
            SessionManager.currentSession.rollback()
            return False

    @staticmethod
    def commit():
        # commit and close the session; create a new session in case the
        # client makes a single request without using beginTransaction/
        try:
            SessionManager.currentSession.commit()
            SessionManager.currentSession.close()
            SessionManager.currentSession = Session()
            return True
        except Exception as e:
            print(e)
            SessionManager.currentSession.rollback()
            SessionManager.currentSession.close()
            SessionManager.currentSession = Session()
            return False
But now the API doesn't work when several clients make requests; it seems like every client shares the same session.
How should I implement the sessions so that each client has a different session and can make requests concurrently ?
Thank you.
You seem to want several HTTP requests to share one transaction. It's impossible; it is incompatible with the stateless nature of HTTP.
Please consider, for example, that one client would open a transaction and fail to close it because it lost connectivity. The server has no way of knowing this and would leave the transaction open forever, possibly blocking other clients.
Using transactions to bundle database requests is reasonable, for example, for performance reasons when there's more than one write operation, or for keeping the database consistent. But a transaction always has to be committed or rolled back within the same HTTP request in which it was opened.
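In other words, the batch should arrive in a single request. A sketch of that alternative, with Flask stripped out for brevity; the Item model and add_objects function are illustrative, not from the original code:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Item(Base):
    __tablename__ = 'items'
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine('sqlite://')
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

def add_objects(names):
    """One HTTP request body carries the whole batch; it is committed
    atomically, so a client crash mid-upload leaves nothing behind."""
    session = Session()
    try:
        session.add_all([Item(name=n) for n in names])
        session.commit()
        return True
    except Exception:
        session.rollback()
        return False
    finally:
        session.close()

ok = add_objects(['a', 'b'])
assert ok
```

The transaction now lives and dies inside one request, so a lost connection simply means the batch never happened.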
I know this is an old thread, but you can achieve this with djondb (a NoSQL database).
With djondb you can create transactions, and if something goes wrong, i.e. you lose the connection, it does not matter: the transaction can sit there forever without affecting performance or creating locks. djondb was built to support long-term transactions, so you can open a transaction, use it, commit it, roll it back, or just discard it (close the connection and forget it was there), and it won't leave the database in an inconsistent state.
I know this may sound weird to relational folks, but that's the beauty of NoSQL: it creates new paradigms supporting what SQL people say is impossible.
Hope this helps.

How to get a SQLAlchemy session managed by zope.transaction that has the same scope as a http request but that does not close automatically on commit?

I have a Pyramid web application with some form pages that reads data from database and write to it as well.
The application uses SQLAlchemy with a PostgreSQL database and here is how I setup the SQLAlchemy session:
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from zope.sqlalchemy import ZopeTransactionExtension
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
When I process a form, I need to perform an explicit commit surrounded in a try and see if the commit worked. I need this explicit commit because I have deferrable triggers in the PostgreSQL database (checks that are performed at commit time) and there are some cases where the absence of error is not predictable.
Once I successfully committed a transaction, for example adding an instance of MyClass, I would like to get some attributes on this instance and also some attributes on linked instances. Indeed, I cannot get those data before committing because they are computed by the database itself.
My problem is that when I use transaction.commit() (from the transaction package) the session is automatically closed and I cannot use the instance anymore because it is in the Detached state. The documentation confirms this point.
So, as mentioned in the documentation, I tried to use the following session setup instead:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension(keep_session=True)))
However, now the session scope is not the same as the HTTP request scope anymore: no ROLLBACK is emitted at the end of HTTP requests that only perform read queries.
So, is there a way to have a session that have the same scope as the http request but that does not close automatically on commit?
You can detach the session from the request via the keep_session=True on the ZTE. However, you probably also want to use the objects after the session is committed if this is the case, otherwise you'd be happy with a new session. Thus, you'll also want expire_on_commit=False on your session. After that, you've successfully detached the session from the lifecycle of pyramid_tm and you can commit/abort as you please. So how do we reattach it back to the request lifecycle without using pyramid_tm? Well, if you wrap your DBSession in something that's a little less global which will make it more manageable as a request-scoped variable, that'd help. From there, it's obvious when the session is created, and when it should be destroyed via a request-finished callback. Here's a summary of my prose:
def get_db(request):
    session = request.registry['db_session_factory']()
    def _closer(request):
        session.close()
    request.add_finished_callback(_closer)
    return session

def main(global_conf, **settings):
    config = Configurator()
    DBSession = sessionmaker(expire_on_commit=False,
                             extension=ZopeTransactionExtension(keep_session=True))
    # look, we don't even need a scoped_session anymore,
    # because it's not global; it's local to the request!
    config.registry['db_session_factory'] = DBSession
    config.add_request_method('db', get_db, reify=True)

def myview(request):
    db = request.db  # creates, or reuses, the session created for this request
    model = db.query(MyModel).first()
    transaction.commit()
    # model is still valid here
    return {}
Of course, if you're doing all this, the ZTE may not be helping you at all, and you may just want to use db.commit() and handle things yourself. The finished callbacks will still be invoked if an exception occurs, so you don't need pyramid_tm to clean up after you.
