Using SQLAlchemy sessions with flask & concurrency problems

I'm working on an API with Flask and SQLAlchemy, and here's what I would like to do :
I have a client application, working on multiple tablets, that have to send several requests to add content to the server.
But I don't want to use auto rollback at the end of each API request (default behavior with flask-sqlalchemy), because the sending of data is done with multiple requests, like in this very simplified example :
1. beginTransaction/?id=transactionId -> opens a new session for the client making that request. SessionManager.new_session() in the code below.
2. addObject/?id=objectAid -> adds an object to the PostgreSQL database and flushes
3. addObject/?id=objectBid -> adds an object to the PostgreSQL database and flushes
4. commitTransaction/?id=transactionId -> commits what happened since the beginTransaction. SessionManager.commit() in the code below.
The point here is to not add the data to the server if the client app crashed or lost its connection before the « commitTransaction » was sent, thus preventing incomplete data on the server.
Since I don't want to use auto rollback, I can't really use flask-SQLAlchemy, so I'm implementing SQLAlchemy by myself into my flask application, but I'm not sure how to use the sessions.
Here's my implementation in __init__.py:
from sqlalchemy import create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session, sessionmaker

db = create_engine('postgresql+psycopg2://admin:pwd@localhost/postgresqlddb',
                   pool_reset_on_return=False,
                   echo=True, pool_size=20, max_overflow=5)
Base = declarative_base()
metadata = Base.metadata
metadata.bind = db

# create a configured "Session" class
Session = scoped_session(sessionmaker(bind=db, autoflush=False))

class SessionManager(object):
    # class-level attribute: one session shared by everything that uses SessionManager
    currentSession = Session()

    @staticmethod
    def new_session():
        # if a session is already opened by the client, close it
        # create a new session
        try:
            SessionManager.currentSession.rollback()
            SessionManager.currentSession.close()
        except Exception as e:
            print(e)
        SessionManager.currentSession = Session()
        return SessionManager.currentSession

    @staticmethod
    def flush():
        try:
            SessionManager.currentSession.flush()
            return True
        except Exception as e:
            print(e)
            SessionManager.currentSession.rollback()
            return False

    @staticmethod
    def commit():
        # commit and close the session
        # create a new session in case the client makes a single request without using beginTransaction/
        try:
            SessionManager.currentSession.commit()
            SessionManager.currentSession.close()
            SessionManager.currentSession = Session()
            return True
        except Exception as e:
            print(e)
            SessionManager.currentSession.rollback()
            SessionManager.currentSession.close()
            SessionManager.currentSession = Session()
            return False
But now the API doesn't work when several clients make requests: it seems that every client shares the same session.
How should I implement the sessions so that each client has a different session and can make requests concurrently?
Thank you.

You seem to want several HTTP requests to share one transaction. This is not possible: it is incompatible with the stateless nature of HTTP.
Consider, for example, a client that opens a transaction and fails to close it because it has lost connectivity. The server has no way of knowing this and would leave the transaction open forever, possibly blocking other clients.
Using a transaction to bundle database requests is reasonable, for example for performance when there is more than one write operation, or to keep the database consistent. But it always has to be committed or rolled back within the same HTTP request in which it was opened.
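One way to follow this advice is to have the client collect its objects and send them in a single request, so that the whole batch is committed or rolled back together. This is only a minimal sketch, not the asker's code: the /addObjects endpoint and the Item model are hypothetical, an in-memory SQLite database stands in for PostgreSQL, and SQLAlchemy 1.4 or later is assumed.

```python
from flask import Flask, jsonify, request
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker
from sqlalchemy.pool import StaticPool

# In-memory SQLite stands in for PostgreSQL; StaticPool keeps one shared connection
# so the tables created below stay visible to every request.
engine = create_engine(
    "sqlite://",
    connect_args={"check_same_thread": False},
    poolclass=StaticPool,
)
Base = declarative_base()
Session = sessionmaker(bind=engine)


class Item(Base):  # hypothetical model
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String, nullable=False)


Base.metadata.create_all(engine)
app = Flask(__name__)


@app.route("/addObjects", methods=["POST"])  # hypothetical endpoint
def add_objects():
    names = request.get_json()["names"]
    session = Session()  # one session per request
    try:
        for name in names:
            session.add(Item(name=name))
        session.commit()  # all objects are stored, or none of them
    except Exception:
        session.rollback()
        return jsonify(ok=False), 400
    finally:
        session.close()
    return jsonify(ok=True, count=len(names))
```

If the client crashes before sending the request, nothing reaches the server at all, and if any object in the batch is invalid, the rollback discards the whole batch.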

I know this is an old thread, but you can achieve this with djondb (a NoSQL database).
With djondb you can create transactions, and if something goes wrong (for example, you lose the connection) it does not matter: the transaction can stay there forever without affecting performance or creating locks. djondb was made to support long-term transactions, so you can open a transaction, use it, commit it, roll it back or just discard it (close the connection and forget it was there), and it won't leave the database in an inconsistent state.
I know this may sound weird to relational guys, but that's the beauty of NoSQL: it creates new paradigms supporting what SQL guys say is impossible.
Hope this helps,

Related

SQLAlchemy used in Flask, Session management implementation

As I cannot use the Flask-SQLAlchemy due to models definitions and use of the database part of the app in other contexts than Flask, I found several ways to manage sessions and I am not sure what to do.
One thing that everyone seems to agree (including me) is that a new session should be created at the beginning of each request and be committed + closed when the request has been processed and the response is ready to be sent back to the client.
Currently, I implemented the session management that way:
I have a database initialization python script which creates the engine (engine = create_engine(app.config["MYSQL_DATABASE_URI"])) and defines the session maker Session = sessionmaker(bind=engine, expire_on_commit=False).
In another file I defined two functions decorated with Flask's before_request and teardown_request application decorators.
@app.before_request
def create_db_session():
    g.db_session = Session()

@app.teardown_request
def close_db_session(exception):
    try:
        g.db_session.commit()
    except:
        g.db_session.rollback()
    finally:
        g.db_session.close()
I then use the g.db_session when I need to perform queries: g.db_session.query(models.User.user_id).filter_by(username=username)
Is this a correct way to manage sessions ?
I also took a look at the scoped sessions proposed by SQLAlchemy and this might be another way of doing things, but I am not sure how to change my system to use scoped sessions...
If I understood it well, I would not use the g variable, but I would instead always refer to the Session definition declared by Session = scoped_session(sessionmaker(bind=engine, expire_on_commit=False)) and I would not need to initialize a new session explicitly when a request arrives.
I could just perform my queries as usual with Session.query(models.User.user_id).filter_by(username=username) and I would just need to remove the session when the request ends:
@app.teardown_request
def close_db_session(exception):
    Session.commit()
    Session.remove()
I am a bit lost with this session management topic and I would need help to understand how to manage sessions. Is there a real difference between the two approaches above?

Your approach of managing the session via flask.g is completely acceptable from my point of view. Whatever we are trying to do with SQLAlchemy, we must remember the basic principles:
Always clean up after yourself. At web application runtime, if you spawn a lot of sessions without .close()ing them, you will eventually exhaust the connections available at your DB instance. You are handling this by calling finally: session.close()
Maintain session independence. It's not good if various application contexts (requests, threads, etc.) share the same session instance, because the behavior is not deterministic. You are doing this by ensuring that each request gets its own session.
The scoped_session can be considered just an alternative to flask.g: it ensures that within one thread, each call to the Session() constructor returns the same object - https://docs.sqlalchemy.org/en/13/orm/contextual.html#unitofwork-contextual
It's a SQLAlchemy batteries-included version of your session management code.
So, if you are using Flask, which is a synchronous framework, I don't think you will have any issues with this setup.
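The thread-local behaviour of scoped_session can be checked directly. The following is a small sketch against an in-memory SQLite engine (chosen here only for illustration); the variable names are mine and no query is actually run.

```python
import threading

from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Session = scoped_session(sessionmaker(bind=engine))

# Within one thread, every call to the registry returns the same session.
first = Session()
second = Session()
same_in_thread = first is second

# Another thread gets its own, separate session.
seen = {}


def worker():
    seen["other"] = Session()
    Session.remove()  # close and discard this thread's session


t = threading.Thread(target=worker)
t.start()
t.join()
different_across_threads = seen["other"] is not first

# After remove(), the next call in this thread creates a fresh session.
Session.remove()
fresh_after_remove = Session() is not first
Session.remove()
```

In a Flask application, the call to Session.remove() would typically live in a teardown_request handler, so that each request starts with a fresh session.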

Do I authenticate at database level, at Flask User level, or both?

I have an MS-SQL deployed on AWS RDS, that I'm writing a Flask front end for.
I've been following some intro Flask tutorials, all of which seem to pass the DB credentials in the connection string URI. I'm following the tutorial here:
https://medium.com/@rodkey/deploying-a-flask-application-on-aws-a72daba6bb80#.e6b4mzs1l
For deployment, do I prompt for the DB login info and add to the connection string? If so, where? Using SQLAlchemy, I don't see any calls to create_engine (using the code in the tutorial), I just see an initialization using config.from_object, referencing the config.py where the SQLALCHEMY_DATABASE_URI is stored, which points to the DB location. Trying to call config.update(dict(UID='****', PASSWORD='******')) from my application has no effect, and looking in the config dict doesn't seem to have any applicable entries to set for this purpose. What am I doing wrong?
Or should I be authenticating using Flask-User, and then get rid of the DB level authentication? I'd prefer authenticating at the DB layer, for ease of use.

The tutorial you are using relies on Flask-SQLAlchemy to abstract the database setup, which is why you don't see create_engine() or engine.connect().
Frameworks like Flask-SQLAlchemy are designed around the idea that you create a connection pool to the database on launch, and share that pool amongst your various worker threads. You will not be able to use that for what you are doing... it takes care of initializing the session and related setup early in the process.
Because of your requirements, I don't know that you'll be able to make any use of things like connection pooling. Instead, you'll have to handle that yourself. The actual connection isn't too hard...
from sqlalchemy import create_engine, text

engine = create_engine('dialect://username:password@host/db')
connection = engine.connect()
result = connection.execute(text("SOME SQL QUERY"))
for row in result:
    pass  # Do something with each row
connection.close()
The issue is that you're going to have to do that in every endpoint. A database connection isn't something you can store in the session- you'll have to store the credentials there and do a connect/disconnect loop in every endpoint you write. Worse, you'll have to either figure out encrypted sessions or server side sessions (without a db connection!) to prevent keeping those credentials in the session from becoming a horrible security leak.
I promise you, it will be easier both now and in the long run to figure out a simple way to authenticate users so that they can share a connection pool that is abstracted out of your app endpoints. But if you HAVE to do it this way, this is how you will do it. (make sure you are closing those connections every time!)
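If you do go down the connect/disconnect route, a small helper can make sure the connection is always released. This is a sketch only: run_query is a hypothetical helper, the URL would be built from whatever credentials the current user supplied, and in-memory SQLite is used purely so the example runs on its own.

```python
from sqlalchemy import create_engine, text


def run_query(url, sql, params=None):
    """Connect, run one statement, and always release the connection."""
    # Hypothetical helper: url would be built from the current user's credentials.
    engine = create_engine(url)
    try:
        with engine.connect() as connection:  # closed on exit, even on error
            result = connection.execute(text(sql), params or {})
            return [tuple(row) for row in result]
    finally:
        engine.dispose()  # drop the pool so no connection outlives the call


rows = run_query("sqlite://", "SELECT :a + :b", {"a": 1, "b": 2})
```

Creating and disposing an engine on every call gives up connection pooling entirely, which is the cost the answer warns about.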

Understanding SqlAlchemy Sessions - How volatile are they?

I'm experimenting with SqlAlchemy, and trying to get a grasp of how I should treat connection objects.
So the sessionmaker returns a sessionFactory (confusingly also called Session in all their documentation), and you use that to create session objects that sound a lot like a database cursor to me.
What is a session object, specifically? Is it as ephemeral as a db cursor, or is it more material (does a session bind exclusively to one of the underlying connections in the engine's connection pool, for example)?

The Session object is not a database cursor; while using the Session you may open and close any number of individual cursors. Within a single session's lifespan you may insert some records, run queries, issue updates, and delete.
There's a FAQ on the session where this topic is addressed; in short, the Session is an in-memory object implementing an identity map pattern which will sync the state of objects as they exist in your application with the database upon commit.
# User here is some SQLAlchemy model
user = session.query(User).filter(User.name == 'John').one()
user.name = 'John Smith'
At this stage, the database still thinks this user's name is John. It will continue to until the session is flushed or committed. Note that under most configurations, any query you run from a session automatically flushes the session so you don't need to worry about this.
Now let's inspect our user to better understand what the session is keeping track of:
> from sqlalchemy import orm
> orm.attributes.get_history(user, 'name')
History(added=['John Smith'], unchanged=(), deleted=['John'])
Watch once we've flushed the session:
> session.flush()
> orm.attributes.get_history(user, 'name')
History(added=(), unchanged=['John Smith'], deleted=())
However, if we do not commit the session but instead roll it back, our change will not stick:
> session.rollback()
> orm.attributes.get_history(user, 'name')
History(added=(), unchanged=['John'], deleted=())
The Session object is a public API for the underlying connection and transaction objects. To understand how connections and transactions work in SQLAlchemy, take a look at the core documentation's section on the topic.
UPDATE: Session persistence
The Session stays open until explicitly closed via Session.close(). Often transaction managers handle this for you automatically in a web application implementation, but, for instance, failure to close sessions you open in a test suite can cause problems due to many open transactions.
The Session holds your changes entirely in Python until it is flushed, either via Session.flush() or, if autoflush is on, when a query is run. Once flushed, the session emits SQL within a transaction to the database. Repeated flushes simply emit more SQL within that transaction. Appropriate calls to Session.begin and Session.begin_nested can create sub-transactions if your underlying engine/db supports it.
Calls to Session.commit and Session.rollback execute SQL within the currently active transaction.
Turn on echo=True when you initialize your engine and watch the SQL emitted by various Session methods to better understand what's happening.
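The difference between flushing, rolling back and committing can be tried out end to end. Here is a minimal sketch, assuming SQLAlchemy 1.4 or later, an in-memory SQLite database and a made-up User model with only a name column:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base = declarative_base()


class User(Base):  # made-up model for this sketch
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)


Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine, autoflush=False)

session = Session()
user = User(name="John")
session.add(user)

id_before_flush = user.id  # None: nothing has been sent to the database yet
session.flush()            # emits the INSERT inside the open transaction
id_after_flush = user.id   # primary key assigned by the database

session.rollback()         # discards the flushed INSERT
count_after_rollback = session.query(User).count()

session.add(User(name="John"))
session.commit()           # flushes, then makes the change permanent
count_after_commit = session.query(User).count()
session.close()
```

Passing echo=True to create_engine in this sketch prints the INSERT, ROLLBACK and COMMIT statements as they are emitted.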

How to get a SQLAlchemy session managed by zope.transaction that has the same scope as a http request but that does not close automatically on commit?

I have a Pyramid web application with some form pages that read data from the database and write to it as well.
The application uses SQLAlchemy with a PostgreSQL database and here is how I setup the SQLAlchemy session:
from sqlalchemy.orm import scoped_session
from sqlalchemy.orm import sessionmaker
from zope.sqlalchemy import ZopeTransactionExtension
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
When I process a form, I need to perform an explicit commit surrounded in a try and see if the commit worked. I need this explicit commit because I have deferrable triggers in the PostgreSQL database (checks that are performed at commit time) and there are some cases where the absence of error is not predictable.
Once I successfully committed a transaction, for example adding an instance of MyClass, I would like to get some attributes on this instance and also some attributes on linked instances. Indeed, I cannot get those data before committing because they are computed by the database itself.
My problem is that, when I use transaction.commit() (in transaction package) the session is automatically closed and I cannot use the instance anymore because it is in Detached state. Documentation confirms this point.
So, as mentioned in the documentation, I tried to use the following session setup instead:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension(keep_session=True)))
However, now the session scope is no longer the same as the HTTP request scope: no ROLLBACK is sent at the end of my HTTP requests that just perform read queries.
So, is there a way to have a session that has the same scope as the HTTP request but that does not close automatically on commit?

You can detach the session from the request via the keep_session=True on the ZTE. However, you probably also want to use the objects after the session is committed if this is the case, otherwise you'd be happy with a new session. Thus, you'll also want expire_on_commit=False on your session. After that, you've successfully detached the session from the lifecycle of pyramid_tm and you can commit/abort as you please. So how do we reattach it back to the request lifecycle without using pyramid_tm? Well, if you wrap your DBSession in something that's a little less global which will make it more manageable as a request-scoped variable, that'd help. From there, it's obvious when the session is created, and when it should be destroyed via a request-finished callback. Here's a summary of my prose:
def get_db(request):
    session = request.registry['db_session_factory']()
    def _closer(request):
        session.close()
    request.add_finished_callback(_closer)
    return session

def main(global_conf, **settings):
    config = Configurator()
    DBSession = sessionmaker(expire_on_commit=False, extension=ZopeTransactionExtension(keep_session=True))
    # look we don't even need a scoped_session anymore because it's not global, it's local to the request!
    config.registry['db_session_factory'] = DBSession
    config.add_request_method('db', get_db, reify=True)

def myview(request):
    db = request.db # creates, or reuses the session created for this request
    model = db.query(MyModel).first()
    transaction.commit()
    # model is still valid here
    return {}
Of course, if we're doing all this, the ZTE may not be helping you at all and you just want to use db.commit() and handle things yourself. The finished callbacks will still be invoked if an exception occurs, so you don't need pyramid_tm to cleanup after you.

Pyramid exception logging with SQLAlchemy - commands not committing

I am using the Pyramid web framework with SQLAlchemy, connected to a MySQL backend. The app I've put together works, but I'm trying to add some polish by way of some enhanced logging and exception handling.
I based everything off of the basic SQLAlchemy tutorial on the Pyramid site, using the session like so:
DBSession = scoped_session(sessionmaker(extension=ZopeTransactionExtension()))
Using DBSession to query works great, and if I need to add and commit something to the database I'll do something like
DBSession.add(myobject)
DBSession.flush()
So I get my new ID.
Then I wanted to add logging to the database, so I followed this tutorial. That seemed to work great. I did initially run into some weirdness with things getting committed and I wasn't sure how SQLAlchemy was working so I had changed "transaction.commit()" to "DBSession.flush()" to force the logs to commit (this is addressed below!).
Next I wanted to add custom exception handling with the intent that I could put a friendly error page for anything that wasn't explicitly caught and still log things. So based on this documentation I created error handlers like so:
from pyramid.view import (
    view_config,
    forbidden_view_config,
    notfound_view_config
    )
from pyramid.httpexceptions import (
    HTTPFound,
    HTTPNotFound,
    HTTPForbidden,
    HTTPBadRequest,
    HTTPInternalServerError
    )
from models import DBSession
import transaction
import logging

log = logging.getLogger(__name__)

#region Custom HTTP Errors and Exceptions
@view_config(context=HTTPNotFound, renderer='HTTPNotFound.mako')
def notfound(request):
    log.exception('404 not found: {0}'.format(str(request.url)))
    request.response.status_int = 404
    return {}

@view_config(context=HTTPInternalServerError, renderer='HTTPInternalServerError.mako')
def internalerror(request):
    log.exception('HTTPInternalServerError: {0}'.format(str(request.url)))
    request.response.status_int = 500
    return {}

@view_config(context=Exception, renderer="HTTPExceptionCaught.mako")
def error_view(exc, request):
    log.exception('HTTPException: {0}'.format(str(request.url)))
    log.exception(exc.message)
    return {}
#endregion
So now my problem is, exceptions are caught and my custom exception view comes up as expected. But the exceptions aren't logged to the database. It appears this is because the DBSession transaction is rolled back on any exception. So I changed the logging handler back to "transaction.commit". This had the effect of actually committing my exception logs to the database, BUT now any DBSession action after any log statement throws an "Instance not bound to a session" error...which makes sense because from what I understand after a transaction.commit() the session is cleared out. The console log always shows exactly what I want logged, including the SQL statements to write the log info to the database. But it's not committing on exception unless I use transaction.commit(), but if I do that then I kill any DBSession statements after the transaction.commit()!.
Sooooo....how might I set things up so that I can log to the database, but also catch and successfully log exceptions to the database, too? I feel like I want the logging handler to use some sort of separate database session/connection/instance/something so that it is self-contained but I'm unclear on how that might work.
Or should I architect what I want to do completely different?
EDIT:
I did end up going with a separate, log-specific session dedicated only to adding and committing log info to the database. This seemed to work well until I started integrating a Pyramid console script into the mix, at which point I ran into problems with sessions and database commits within the script not necessarily working like they do in the actual Pyramid web application.
In hindsight (and what I'm doing now) instead of logging to a database I use the standard logging and FileHandlers (TimedRotatingFileHandlers specifically) and log to the file system.

Using transaction.commit() has an unintended side-effect of the changes to other models being committed too, which is not too cool - the idea behind the "normal" Pyramid session setup with ZopeTransactionExtension is that a single session starts at the beginning of the request, then if everything succeeds the session is committed, if there's an exception then everything is rolled back. It would be better to keep this logic and avoid committing things manually in the middle of request.
(as a side note - DBSession.flush() does not commit the transaction, it emits the SQL statements but the transaction can be rolled back later)
For things like exception logs, I would look at setting up a separate Session which is not bound to Pyramid's request/response cycle (without ZopeTransactionExtension) and then using it to create log records. You'd need to commit the transaction manually after adding a log record:
record = Log("blah")
log_session.add(record)
log_session.commit()
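To see that a separate session keeps its log record even when the main transaction is rolled back, here is a self-contained sketch. It leaves Pyramid and ZopeTransactionExtension out entirely: the Item and Log models and the app_session name are assumptions of mine, and a temporary file-based SQLite database is used so that the two sessions get separate connections. The rollback comes before the log write here because SQLite allows only one writer at a time.

```python
import os
import tempfile

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

# A file-based SQLite database so that each session gets its own connection.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
engine = create_engine("sqlite:///" + db_path)
Base = declarative_base()


class Item(Base):  # hypothetical application model
    __tablename__ = "items"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Log(Base):  # hypothetical log model
    __tablename__ = "logs"
    id = Column(Integer, primary_key=True)
    message = Column(String)


Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

app_session = Session()  # stands in for the request-bound DBSession
log_session = Session()  # independent session used only for log records

try:
    app_session.add(Item(name="widget"))
    app_session.flush()
    raise RuntimeError("something failed in the view")
except RuntimeError as exc:
    app_session.rollback()  # the request's work is discarded
    log_session.add(Log(message=str(exc)))
    log_session.commit()    # the log record is kept regardless

item_count = app_session.query(Item).count()
log_count = log_session.query(Log).count()
app_session.close()
log_session.close()
engine.dispose()
```

After this runs, the items table is empty while the logs table holds one record.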
